Self calibration of stereo cameras with lens distortion
André Redert and Emile Hendriks
Information and Communication Theory Group, Department of Electrical Engineering
Delft University of Technology
Mekelweg 4, 2628 CD Delft, The Netherlands
email: {andre,emile}@it.et.tudelft.nl
phone: +31 15 278 6269, fax: +31 15 278 1843
Abstract
In this paper we present a new method for the
self calibration of a stereo camera pair
including lens distortion. A single stereo image
pair is used without calibration object or prior
knowledge of the scene.
The method is based on the estimation of a
stereo image correspondence field followed by
extraction of the calibration parameters. The
correspondence field is obtained by a Markov
Random Field motion estimator with a
calibration invariant epipolar constraint. A
new probabilistic model for local lens
distortion serves as basis for the constraint. A
probabilistic stereo camera model is fitted to
the correspondence field using a MAP
criterion. The optimum is found by the
stochastic minimisation procedure simulated
annealing.
We are currently investigating the new
technique. We expect to improve on the results
reported in [8], since in our approach the
number of corresponding points used for
parameter extraction is several orders higher.
1. Introduction
In the area of 3D measurements with stereo imaging, accurate calibration of the camera pair is crucial. Scene points are reconstructed by estimating corresponding pixels in the left and right images and subsequently triangulating. Accurate triangulation can only be performed if the external, internal and lens/ccd distortion parameters of both cameras are known. As an important byproduct, calibration reduces the correspondence problem from 2D motion estimation to the more efficient and reliable 1D disparity estimation.
There are two different techniques for calibration: fixed and self calibration. In fixed calibration, all parameters are extracted off-line by placing a special object with known geometry in front of the cameras and processing the camera images. In the past 20 years, little change has been observed in fixed calibration techniques [2,7].
Although fixed calibration procedures give the most accurate results, they suffer from a number of disadvantages. The procedure requires a special calibration object and user interaction. Most of all, the parameters become useless after a change in the camera parameters due to e.g. zooming.
These disadvantages can be circumvented by self calibration techniques. In [4] distortion parameters are found on the basis of a stereo pair or triplet of images. In [8] all calibration parameters are found using a stereo pair of images. The images contain some calibration pattern with unknown geometry. Lens distortion is assumed to be equal for both cameras and to have a radial component only. A sparse correspondence field is obtained by feature extraction, from which the calibration parameters are then determined.
In this paper, we consider self calibration of
all camera parameters on the basis of a single
image pair. The image pair contains the actual
3D scene of interest. A dense correspondence
field is estimated, without need for feature
extraction. Subsequently the calibration
parameters are extracted from the field, see
Figure 1.
Figure 1: The self calibration system
The correspondence field is obtained by motion estimation with an epipolar constraint that is invariant to the internal and external camera parameters. It utilizes a probabilistic Markov Random Field model for lens distortion. The distortion MRF model is then easily combined with an MRF model for motion estimation; such models are known for their highly accurate results [5]. Finally a probabilistic stereo camera model is fitted to the correspondence field obtained. Due to the a priori parameter probabilities the solution is regularized. This enables the existence of a single global optimum and makes its localisation easier. We use the stochastic minimisation procedure simulated annealing to find the solution.
The paper is organised as follows. In the next
section we describe our correspondence
estimator based on motion estimation and a new
probabilistic distortion model. Section 3 gives
the probabilistic stereo camera model and
section 4 describes the extraction of camera
parameters from the correspondence field.
Finally section 5 gives our preliminary results.
2. Correspondence estimation
In this section the goal is to find the corresponding pixels in the left and right images, in the absence of calibration knowledge.
In a stereo image pair, the corresponding pixel pairs obey the so-called epipolar constraint. Without lens distortion, all correspondences in the stereo pair lie on straight lines in the left and right images, the epipolar lines, regardless of all other parameters. With lens distortion, the epipolar lines become curved.
We use a Markov Random Field motion
estimator with a constraint on the curvature of
the epipolar lines. This is invariant to all
camera parameters except lens distortion.
We estimate the correspondence field $m$ and simultaneously a field $\alpha_L$ that describes the orientation of the epipolar curve at each point in the left image. The field $m$ defines the corresponding left and right pixels:

$$\begin{pmatrix} x_R \\ y_R \end{pmatrix} = m \begin{pmatrix} x_L \\ y_L \end{pmatrix} \qquad (1)$$

The scalar field $0 \le \alpha_L < \pi$ models the direction of the epipolar lines in the left image, defined as the angle between the $x_L$ axis and the line tangent to the epipolar curve. Figure 3 shows the situation in which there is no lens distortion and the epipolar curves are straight lines.

Figure 3: The epipolar direction field $\alpha_L$
At each point $P$ in the left image, we construct a line $l$ tangent to the epipolar line and parametrize it by $\lambda_L$, with $\lambda_L = 0$ at $P$.
For a distortionless left camera the following holds:

$$\frac{\partial \alpha_L}{\partial \lambda_L} = \frac{\partial \alpha_L}{\partial x_L} \cos \alpha_L + \frac{\partial \alpha_L}{\partial y_L} \sin \alpha_L = 0 \qquad (2)$$
Now we project line $l$ onto the right image plane using the correspondence field $m$. As depicted in Figure 3, the uniform parametrisation by $\lambda_L$ is not preserved by the projection. It is affected by both the calibration parameters and the disparity, which is a function of the particular 3D scene. When traveling along line $l$ with constant velocity, the corresponding walk along the projected line in the right image will thus exhibit unknown accelerations.
If the right camera is also distortionless, the projection of line $l$ into the right image is a straight line. To remain on a straight line, the acceleration should always be parallel to the velocity. Formulated mathematically, we have:

$$\frac{\partial^2}{\partial \lambda_L^2} \begin{pmatrix} x_R \\ y_R \end{pmatrix} = K \, \frac{\partial}{\partial \lambda_L} \begin{pmatrix} x_R \\ y_R \end{pmatrix} \qquad (3)$$
The value of $K$ depends on the calibration parameters and the particular 3D scene. It changes with image position.
For cameras with distortion, (2) and (3) do not hold. We now model any deviation from (2) and (3) by Gaussian probability densities, which result in quadratic energy functions in the MRF model. After rewriting (2) and (3) into:

$$\alpha_L' = 0, \qquad \mathbf{a} = K \mathbf{v} \qquad (4)$$

we construct the energy terms by:

$$E_1 = (\alpha_L')^2, \qquad E_2 = \frac{|\mathbf{v} \times \mathbf{a}|^2}{|\mathbf{v}|^2 \, |\mathbf{a}|^2} \qquad (5)$$
The normalisation in $E_2$ ensures that the energy is invariant to $K$ and to the absolute magnitudes of both $\mathbf{a}$ and $\mathbf{v}$, which all depend on the particular 3D scene. Appropriate weight factors for the two terms in an MRF motion estimator will be determined by experiment.
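To make the constraint terms concrete, the following is a minimal NumPy sketch of how (2)-(5) could be evaluated on discrete fields. It is an illustration under our own assumptions (central-difference derivatives; arrays mx, my holding the right-image coordinates of the field $m$, and alpha holding the epipolar direction), not the authors' implementation.

```python
import numpy as np

def d_dlambda(f, alpha):
    """Directional derivative of field f along the epipolar direction
    alpha (i.e. d f / d lambda_L), via central differences."""
    fy, fx = np.gradient(f)                  # d/dy (rows), d/dx (cols)
    return fx * np.cos(alpha) + fy * np.sin(alpha)

def energy_terms(mx, my, alpha):
    """Per-pixel energies of eq. (5): E1 penalises variation of the
    epipolar direction along the epipolar curve (eq. 2), E2 penalises
    non-collinear velocity and acceleration in the right image (eq. 3)."""
    # E1 = (alpha')^2
    e1 = d_dlambda(alpha, alpha) ** 2

    # Velocity: first derivative of the projected point (x_R, y_R),
    # stored in the correspondence fields mx, my, along lambda_L.
    vx, vy = d_dlambda(mx, alpha), d_dlambda(my, alpha)
    # Acceleration: second derivative along lambda_L.
    ax, ay = d_dlambda(vx, alpha), d_dlambda(vy, alpha)

    # E2 = |v x a|^2 / (|v|^2 |a|^2): zero iff a is parallel to v,
    # invariant to K and to the magnitudes of v and a.
    cross = vx * ay - vy * ax
    e2 = cross**2 / ((vx**2 + vy**2) * (ax**2 + ay**2) + 1e-12)
    return e1, e2
```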
We have now defined a correspondence estimator with three unknown fields (the motion components $m_x$ and $m_y$, and the epipolar angle $\alpha_L$) and two field equations, (2) and (3). Together with a luminance difference term [5] they form the total MRF model.
We use a hierarchical gradient descent algorithm to compute $m$ and $\alpha_L$ given the left and right image luminance, as sketched below.
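The following is a structural sketch of such a coarse-to-fine estimator, under our own assumptions. For brevity the per-level refinement descends only the luminance difference term; the full model would add the weighted terms $E_1$ and $E_2$ of (5) and update alpha as well.

```python
import numpy as np
from scipy.ndimage import map_coordinates, zoom

def refine(L, R, mx, my, alpha, iters=50, step=0.5):
    """Single-level gradient descent on the squared luminance
    difference only (the full MRF adds weighted E1/E2 terms)."""
    gy, gx = np.gradient(R)                   # right-image gradients
    for _ in range(iters):
        # Sample the right image and its gradients at the current
        # correspondences m = (mx, my); (row, col) order.
        Rw = map_coordinates(R,  [my, mx], order=1, mode='nearest')
        Gx = map_coordinates(gx, [my, mx], order=1, mode='nearest')
        Gy = map_coordinates(gy, [my, mx], order=1, mode='nearest')
        diff = Rw - L                         # luminance difference
        mx = mx - step * diff * Gx            # descend d(diff^2)/d(mx)
        my = my - step * diff * Gy
    return mx, my, alpha

def estimate_fields(left, right, levels=3):
    """Coarse-to-fine wrapper; image sizes are assumed divisible by
    2**(levels-1)."""
    s = 2 ** (levels - 1)
    h, w = left.shape[0] // s, left.shape[1] // s
    mx = np.tile(np.arange(w, dtype=float), (h, 1))          # identity init
    my = np.tile(np.arange(h, dtype=float)[:, None], (1, w))
    alpha = np.zeros((h, w))                  # start: horizontal epipolar lines
    for lev in range(levels - 1, -1, -1):
        f = 2 ** lev
        Ld, Rd = zoom(left, 1.0 / f), zoom(right, 1.0 / f)
        mx, my, alpha = refine(Ld, Rd, mx, my, alpha)
        if lev > 0:                           # move to the next finer level
            mx, my, alpha = (zoom(a, 2.0) for a in (mx, my, alpha))
            mx, my = mx * 2.0, my * 2.0       # coordinates scale with resolution
    return mx, my, alpha
```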
3. Stereo camera model
In this section we describe our probabilistic stereo camera model. Figure 2 shows the complete model. Five reference frames are defined: the stereo frame, the left/right lens frames and the left/right projection frames. Gaussian probability densities are assigned to all parameters. We will now describe the parameters and their means and variances.
Figure 2: The stereo camera model
Without a calibration object of known length in meters, it is impossible in a stereo system to obtain measurements in meters. We therefore select the camera baseline as our unit of length. The frame SF is defined to be a right-handed frame in which the two optical centres lie on the x-axis symmetrically around the origin. The optical centres therefore have fixed coordinates in SF: (-½,0,0) for the left camera and (+½,0,0) for the right camera.
The orientations of the left and right lens frames are defined by two sets of Euler angles $(\theta_x, \theta_y, \theta_z)$. The lens sits at the origin of its lens frame, oriented in the xy plane. We assume radial symmetry in the lenses and can thus take $\theta_z = 0$. The other two angles are modelled by $\mu = 0$ and $\sigma = 2$ rad. This introduces a small bias towards cameras that are aimed at the same object. The reference frame SF is defined up to a rotation around the x-axis. We can therefore introduce an arbitrary equation that eliminates either $\theta_{x,L}$ or $\theta_{x,R}$, such as e.g. $\theta_{x,L} + \theta_{x,R} = 0$.
We assume the ccd to be perfectly flat and to have perfectly perpendicular image axes. The image formation is invariant to a common scaling of the triplet focal length, horizontal pixel size and vertical pixel size. Therefore we choose, without loss of generality, the horizontal size of the pixels equal to 1 (for both cameras) and the vertical size equal to $R_{L/R}$, the pixel ratio. The ratio is modelled by $\mu = 1$, $\sigma = 0.3$.
The positions of the projection frames $PF_{L/R}$ (the ccd chips) relative to the lens frames $LF_{L/R}$ are defined by two vectors $(O^{LF}_{PF,x}, O^{LF}_{PF,y}, O^{LF}_{PF,z})$. The first two components define the intersection of the lens optical axis with the ccd (mis-positioning) and are modelled by $\mu = 0$, $\sigma = 10$ pixels. The third component is the focal length, which is likewise given a Gaussian prior.
The orientation of the projection frames $PF_{L/R}$ relative to the lens frames $LF_{L/R}$ is defined by two sets of Euler angles $(\theta_x, \theta_y, \theta_z)$. Here $\theta_z$ defines the rotation of the projection frame and is likewise given a Gaussian prior. The other two angles define non-perpendicular ccd placement and are modelled by $\mu = 0$, $\sigma = 5°$.
Since ccd misplacement is incorporated in several of the previous parameters, lens distortion can be modelled more simply than in [6]. We use only the radial distortion parameter $K_3$, modelled by $\mu = 0$, $\sigma = 0.3$.
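Collecting the priors stated in this section, a sketch of how they could be tabulated and turned into a prior energy (the first term of eq. (7) in section 4) might look as follows. The parameter names are illustrative, and parameters whose prior values are not stated above are omitted.

```python
import numpy as np

# Prior means and standard deviations from section 3 (lengths in units
# of the camera baseline); names are our own, not the authors'.
PRIORS = {
    'lens_theta_x':  (0.0, 2.0),             # lens frame tilt [rad]
    'lens_theta_y':  (0.0, 2.0),             # lens frame pan [rad]
    'pixel_ratio':   (1.0, 0.3),             # vertical/horizontal pixel size
    'ccd_offset_x':  (0.0, 10.0),            # optical axis vs. ccd [pixels]
    'ccd_offset_y':  (0.0, 10.0),            # [pixels]
    'ccd_theta_x':   (0.0, np.deg2rad(5.0)), # non-perpendicular ccd [rad]
    'ccd_theta_y':   (0.0, np.deg2rad(5.0)),
    'K3':            (0.0, 0.3),             # radial lens distortion
}

def prior_energy(params):
    """Squared, sigma-normalised deviation of each parameter from its
    prior mean, summed over all parameters."""
    return sum(((params[k] - mu) / sigma) ** 2
               for k, (mu, sigma) in PRIORS.items())
```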
4. Extraction of calibration parameters
Once we have the correspondence field $m$, we can extract the camera parameters. We use the following procedure.
In the stereo frame SF we introduce a new coordinate, the epipolar coordinate $e$:

$$e_{SF} = \arctan \frac{y_{SF}}{z_{SF}} \qquad (6)$$
Given a point $P_{PL}$ in the left image, we use the stereo camera model to construct the line in space on which all possible corresponding scene points $P_{SL}$ lie. All these points have the same epipolar coordinate $e_{SF}$. Using the field $m$ we do the same with the right image point. The difference in epipolar coordinate between the left and right points is due to errors in the correspondence field $m$ and to mismatch between the assumed camera model and the actual cameras. The difference is modelled by a Gaussian probability density function with $\mu = 0$ and $\sigma_{epi}$ to be determined by experiment.
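A sketch of the epipolar coordinate (6) and the resulting per-pair penalty is given below. Back-projecting an image point to its ray in stereo-frame coordinates requires the full camera model of section 3 and is omitted here; ray_left_sf and ray_right_sf are assumed inputs.

```python
import numpy as np

def epipolar_coordinate(p_sf):
    """Eq. (6): epipolar coordinate of a point (or ray direction) given
    in stereo frame coordinates; constant on every plane through the
    baseline (the x-axis)."""
    _, y, z = p_sf
    return np.arctan2(y, z)        # quadrant-safe arctan(y/z)

def epipolar_energy(ray_left_sf, ray_right_sf, sigma_epi):
    """One pixel pair's contribution to the second term of eq. (7):
    squared, normalised difference of epipolar coordinates."""
    de = epipolar_coordinate(ray_left_sf) - epipolar_coordinate(ray_right_sf)
    return (de / sigma_epi) ** 2
```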
The total probability for the parameters, given the prior knowledge defined in section 3 and the deviations from (6), is defined as $P_{TOT} = e^{-E_{TOT}}$, with $E_{TOT}$ equal to:

$$E_{TOT} = \sum_{i \,\in\, \text{parameters}} \left( \frac{Q_i - \mu_i}{\sigma_i} \right)^2 + \sum_{\text{pixel pairs}} \left( \frac{e_{SF}(P_{SL}) - e_{SF}(P_{SR})}{\sigma_{epi}} \right)^2 \qquad (7)$$
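Continuing the sketches above, the two terms of (7) could be assembled as follows; prior_energy and epipolar_energy are the illustrative helpers defined earlier, not the authors' code.

```python
def total_energy(params, pixel_pair_rays, sigma_epi):
    """Eq. (7): prior term plus the epipolar consistency term summed
    over all corresponding pixel pairs (rays in stereo frame
    coordinates, obtained from the camera model and the field m)."""
    e = prior_energy(params)
    for ray_l, ray_r in pixel_pair_rays:
        e += epipolar_energy(ray_l, ray_r, sigma_epi)
    return e
```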
We use a MAP criterion to define the best set of parameters. A simulated annealing procedure is used to maximize $P_{TOT}$. This procedure does not require the computation of complex derivatives of $P_{TOT}$ with respect to the parameters. The search space is small enough to allow a slow cooling procedure that finds the global optimum, as in the sketch below.
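As an illustration of this optimisation step, here is a generic simulated annealing loop over a parameter vector; the proposal distribution and the geometric cooling schedule are our assumptions, not the authors' settings. Note that, as stated above, only evaluations of $E_{TOT}$ are needed, never its derivatives.

```python
import numpy as np

def simulated_annealing(energy, q0, sigmas, t0=1.0, t_end=1e-3,
                        cooling=0.999, rng=None):
    """Minimise E_TOT (i.e. maximise P_TOT = exp(-E_TOT)) by simulated
    annealing; `energy` maps a parameter vector to E_TOT and `sigmas`
    scales the random proposals per parameter."""
    rng = rng or np.random.default_rng(0)
    q = np.asarray(q0, dtype=float).copy()
    e, t = energy(q), t0
    while t > t_end:
        i = rng.integers(len(q))          # perturb one random parameter
        q_new = q.copy()
        q_new[i] += rng.normal(0.0, sigmas[i] * t)
        e_new = energy(q_new)
        # Metropolis rule: always accept downhill moves, accept uphill
        # moves with probability exp(-dE / t).
        if e_new < e or rng.random() < np.exp(-(e_new - e) / t):
            q, e = q_new, e_new
        t *= cooling                      # slow geometric cooling
    return q, e
```

With q0 set to the prior means and sigmas to the prior standard deviations of section 3, a slow cooling rate lets the walk settle into the single regularised optimum.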
5. Preliminary results
We are currently working on the motion estimator with energy terms (5). We are considering an extra smoothness term, which is necessary even when complete calibration data is available [3].
We expect to improve on the results reported in [8], since in our approach a priori calibration information is used and the number of corresponding points used for parameter extraction is several orders of magnitude higher.
Acknowledgement
This work is done in the framework of the European ACTS project PANORAMA. One of the major goals of this project is a hardware realisation of a real-time multi-viewpoint autostereoscopic video system.
References
[1] D.V. Papadimitriou and T.J. Dennis, "Epipolar line estimation and rectification for stereo image pairs", IEEE Transactions on Image Processing, Vol. 5, No. 4, pp. 672-676, 1996.
[2] F. Pedersini, D. Pele, A. Sarti and S. Tubaro, "Calibration and self-calibration of multi-ocular camera systems", Proceedings of the International Workshop on Synthetic-Natural Hybrid Coding and Three Dimensional Imaging (IWSNHC3DI97), Rhodes, Greece, pp. 81-84, 1997.
[3] P.A. Redert, C.J. Tsai, E.A. Hendriks and A.K. Katsaggelos, "Disparity estimation with modeling of occlusion and object orientation", Proceedings of SPIE VCIP98, Vol. 3309, pp. 798-808, 1998.
[4] G.P. Stein, "Lens distortion calibration using point correspondences", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 602-609, 1997.
[5] C. Stiller, "Object-based estimation of dense motion fields", IEEE Transactions on Image Processing, Vol. 6, No. 2, pp. 234-250, 1997.
[6] J. Weng, P. Cohen and M. Herniou, "Camera calibration with distortion models and accuracy evaluation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 10, pp. 965-980, 1992.
[7] H.J. Woltring, "Simultaneous multiframe analytical calibration (SMAC) by recourse to oblique observations of planar control distributions", SPIE Vol. 166, Applications of Human Biostereometrics, pp. 124-135, 1978.
[8] Z. Zhang, "On the epipolar geometry between two images with lens distortion", Proceedings of the International Conference on Pattern Recognition (ICPR), Vol. 1, pp. 407-411, 1996.