My Master of Science thesis involved the use of a so-called "omnidirectional camera". There are various ways of achieving a 180° or even 360° view, each with its distinct pros and cons. The general benefit of these alternative camera systems is that objects don't need to be tracked by turning the camera, because they generally stay within its extremely broad Field of View (FoV). This is also very beneficial in visual odometry tasks, because landmarks can be tracked for longer periods of time.
Some design options are shown in Figure 1. The simplest solution is to use a fish-eye lens, which is fairly straightforward to calibrate. However, the image resolution is typically somewhat limited, and it cannot cover much more than 180°. An alternative approach is to use multiple normal cameras and combine their images in software. This can produce very high-resolution images, but requires a lot more effort and parameters in calibration, especially since vignetting also needs to be corrected before the images can be joined together seamlessly. There are commercial products such as the Ladybug from PointGrey. The third option is to use a combination of a traditional lens and a convex mirror to achieve up to a 360° × 90° FoV. These are relatively compact and mid-priced, with somewhat limited resolution but fairly simple calibration models and procedures. This is the type of camera system I used in my Master's thesis, which was titled Real Time Feature Tracking on Omnidirectional Camera Image in Forest Environment.
The camera system I used in my thesis is presented in Figure 2. In addition to the omnidirectional camera itself, it has an Inertial Measurement Unit (IMU) and a supporting structure for attaching a compact 2D laser scanner (40 × 40 mm) such as the Hokuyo URG. This leaves a FoV of about 290°, which is not a problem because part of the FoV would typically be blocked by the user or the robot anyway. This dark area can simply be ignored during image rectification.
One intuitive image rectification model is to make each panorama pixel correspond to an equal angular step. For example, if the camera has a FoV of 360° × 90°, suitable panorama resolutions would be 360 × 90 or 720 × 180 pixels. To determine the actual mapping, the camera must be calibrated.
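As a small illustration, here is a minimal sketch of such an equal-angle mapping for a hypothetical 720 × 180 panorama; the ±45° vertical range and the function name are placeholders, not my actual calibration.

```python
def pano_pixel_to_angles(col, row, width=720, height=180,
                         elev_top_deg=45.0, elev_bottom_deg=-45.0):
    """Map a panorama pixel to (azimuth, elevation) in degrees,
    assuming each pixel covers an equal angular step."""
    azimuth = 360.0 * (col + 0.5) / width                       # 0..360 deg around the camera axis
    elev_span = elev_top_deg - elev_bottom_deg
    elevation = elev_top_deg - elev_span * (row + 0.5) / height # top row is the highest elevation
    return azimuth, elevation

print(pano_pixel_to_angles(0, 0))     # top-left pixel: (0.25, 44.75)
print(pano_pixel_to_angles(359, 89))  # near the middle: (179.75, 0.25)
```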
An example captured frame can be seen in Figure 3. The raw captured image is on the left, the converted panorama is on the top right, and the bottom-right image is overlaid with interest points and their tracks, plus some metadata such as the program's frame rate. These were generated by the program I developed for my Master's thesis, which I'll describe in another article.
One simple model was presented in [1]; it is based on a kind of radial basis function that determines the z-component of the light ray (when the camera is looking in the z-direction). This is illustrated in Figure 4, and further illustrations can be seen in the original paper. In this model it is very easy to determine the light ray's direction for each pixel, but it is more difficult to calculate an object's location in the image based on its position relative to the camera.
Let's first revise how a typical perspective camera is parametrized. The linear pinhole camera model is well described on Wikipedia. It is important to differentiate between intrinsic and extrinsic parameters. Basically, intrinsic parameters describe the physical aspects of the camera and lens, such as focal length and pixel density. Extrinsic parameters describe the location and orientation of the camera in world coordinates. To determine an object's location in the image, the extrinsic parameters are first used to determine the object's location in the camera's coordinate system. Then the intrinsic parameters are used to project the object onto the camera's sensor plane. To determine which direction a given pixel is observing, these steps are applied in reverse order. This model cannot account for lens distortion, nor does it model vignetting.
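To make the distinction concrete, here is a minimal sketch of the linear pinhole projection; the intrinsic matrix, pose and test point are made-up example values, not from any real calibration.

```python
import numpy as np

# Intrinsic matrix K: focal length and principal point in pixel units (example values).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Extrinsic parameters: rotation R and translation t from world to camera coordinates.
R = np.eye(3)                    # camera aligned with the world axes
t = np.array([0.0, 0.0, 0.0])    # camera at the world origin

def project(point_world):
    """Project a 3D world point to pixel coordinates with the linear pinhole model."""
    p_cam = R @ point_world + t  # extrinsics: world -> camera coordinates
    uvw = K @ p_cam              # intrinsics: camera -> homogeneous image coordinates
    return uvw[:2] / uvw[2]      # perspective division

print(project(np.array([0.5, -0.2, 4.0])))   # [420. 200.]
```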
This model cannot be applied to omnidirectional cameras as such, because they operate on a fundamentally different principle. Interestingly, the more generic model of [1] can be used to describe any camera with the single-viewpoint property, including perspective cameras. For these it is also able to account for lens distortion.
Starting from a point on the image, it is first transformed via an affine transformation into a new coordinate system whose origin is close to the image center. This is essentially the same parametrization as in the pinhole camera model. Then the pixel's distance from the new origin is determined and used to evaluate a low-degree polynomial, which gives the z-component of the ray. Usually the polynomial needs a degree higher than two, but rarely greater than four.
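A minimal sketch of this back-projection step could look like the following; the image center, affine matrix and polynomial coefficients are made-up stand-ins for real calibration results.

```python
import numpy as np

# Affine transform from raw pixel coordinates to the sensor-centred frame (example values).
center = np.array([640.0, 512.0])   # approximate image center in pixels
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])          # stretch/skew matrix, here just the identity

# Polynomial coefficients for f(r) = a0 + a1*r + a2*r^2 + ... (example values).
poly = [400.0, -0.2, -0.002]

def pixel_to_ray(pixel):
    """Return the unit viewing ray for a raw image pixel."""
    u, v = A @ (np.asarray(pixel, dtype=float) - center)  # affine step
    r = np.hypot(u, v)                                    # distance from the new origin
    z = sum(a * r**i for i, a in enumerate(poly))         # z-component from the polynomial
    ray = np.array([u, v, z])
    return ray / np.linalg.norm(ray)

print(pixel_to_ray((940.0, 512.0)))   # a pixel 300 px to the right of the center
```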
If the camera is also able to see "behind" itself, this polynomial will also take negative values. For an ideal pinhole camera this function would simply be constant, with a value corresponding to the focal length of the lens (in pixel units). When lens distortion is present, it will either grow (for barrel distortion) or decrease (for pincushion distortion) as pixels get further away from the center. Nevertheless, for normal lenses it does not deviate much from its value at the origin; the changes can be greater for fisheye lenses.
For the omnidirectional system I used, this function started from a large negative value, reached zero at a distance of ~500 pixels, and grew positive after that. This means that pixels near the image center are actually gathering light from behind the camera; when the camera is pointed upwards, these pixels are seeing the ground. Correspondingly, pixels near the image border are gathering light from above the horizon.
To use this calibration data to generate a panoramic image in which each pixel covers an equal angle, this mapping function needs to be used to determine the elevation angle corresponding to each pixel distance. This is calculated as α(r) = arctan(f(r)⁄r), which clearly does not have a closed-form inverse. The relationship between f(r) and α(r) can be seen in Figure 5 for f(r) = 400 − 0.2 r − 0.002 r².
To sample elevation angles from 65° to -40° at 15° intervals, the corresponding distances from the image center are roughly 150, 220, 280, 330, 380, 440, 520 and 630 pixels. The resulting pixel sampling pattern is shown in Figure 6, with 24 samples along the horizontal (azimuth) angle.
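These distances can be reproduced numerically, since α(r) is monotonically decreasing here; the sketch below inverts it with a simple bisection, assuming the example polynomial f(r) from above rather than my camera's actual calibration.

```python
import numpy as np

def f(r):
    return 400.0 - 0.2 * r - 0.002 * r**2

def alpha_deg(r):
    """Elevation angle (degrees) seen at distance r from the image center."""
    return np.degrees(np.arctan2(f(r), r))

def radius_for_angle(target_deg, r_lo=1.0, r_hi=1000.0, iters=60):
    """Invert alpha(r) by bisection; alpha decreases monotonically with r."""
    for _ in range(iters):
        r_mid = 0.5 * (r_lo + r_hi)
        if alpha_deg(r_mid) > target_deg:
            r_lo = r_mid   # angle still too high -> move further out
        else:
            r_hi = r_mid
    return 0.5 * (r_lo + r_hi)

for angle in range(65, -41, -15):
    print(angle, round(radius_for_angle(angle)))
# prints roughly 151, 219, 276, 328, 381, 441, 519, 635 pixels
```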
[1] D. Scaramuzza, A. Martinelli, R. Siegwart, "A Flexible Technique for Accurate Omnidirectional Camera Calibration and Structure from Motion," Proceedings of IEEE ICVS, 2006.