Published in IET Computer Vision
Received on 12th July 2010
Revised on 17th February 2011
doi: 10.1049/iet-cvi.2010.0098
ISSN 1751-9632
Contour-based iterative pose estimation of
3D rigid object
D.W. Leng, W.D. Sun
Department of Electronic Engineering, Tsinghua University, Beijing 100084, People’s Republic of China
E-mail: jimdavid126@gmail.com
Abstract: Estimating pose parameters of a 3D rigid object based on a 2D monocular image is a fundamental problem in computer
vision. State-of-the-art methods usually assume that certain feature correspondences are available a priori between the input image
and object’s 3D model. This presumption makes the problem more algebraically tractable. However, when there is no feature
correspondence available a priori, how to estimate the pose of a truly 3D object using just one 2D monocular image is still
not well solved. In this article, a new contour-based method which solves both the pose estimation problem and the feature
correspondence problem simultaneously and iteratively is proposed. The outer contour of the object is firstly extracted from
the input 2D grey-level image, then a tentative point correspondence relationship is established between the extracted contour
and object’s 3D model, based on which object’s pose parameters will be estimated; the newly estimated pose parameters are
then used to revise the tentative point correspondence relationship, and the process is iterated until convergence. Experimental results are promising, showing that the authors' method has fast convergence speed and a good convergence radius.
1 Introduction
Pose estimation of a 3D rigid object based on monocular
observation is a fundamental problem in computer vision,
computer graphics, photogrammetry, robotics etc.
Conventional methods often assume that a certain feature correspondence relationship is available a priori between the input 2D grey-level image and the 3D model of the object: for example, point correspondence [1–5], which is most commonly utilised; line correspondence [6–8]; plane correspondence [9, 10]; and other feature correspondences [11–13]. These presumed feature correspondences greatly ease the algebraic treatment of the problem.
However, in actual applications, the presumption that a certain feature correspondence is given a priori is not always tenable. In fact, determining the 2D–3D feature correspondence is an even harder problem than pose estimation itself (when a feature correspondence is given a priori). To handle this problem, different methods have been proposed, which mainly fall into three categories: (i) The idea of the first category's methods is direct, that is, to determine the feature correspondence first and then estimate the pose parameters based on it. These methods rely heavily on the extraction of certain features, for example, point features such as SIFT [14], SURF [15] and SUSAN [16], line features [6–8] and stable regions [12]. Viksten et al. [17] and Shan et al. [18] give reviews of this category's methods. However, as pointed out in [19–21], no feature is stable and reliable enough in general 3D situations. This greatly restricts the application range of these methods; besides, how to extract the corresponding features on the object's 3D model is still an open problem. (ii) The second category's methods resort to image recognition techniques to bypass the problem of determining the 2D–3D feature correspondence [20, 22]. A gallery of profile images of the object's 3D model under different view angles is created beforehand, the input image is then compared against the gallery, and the pose parameters of the most similar profile image are taken as the object's pose. The major disadvantage of these methods is that a finer parameter approximation requires a much larger profile image gallery, whose size grows exponentially. (iii) The third category's methods adopt an iterative mechanism [21, 23, 24]: a holistic cost function is defined from the input image and the object's 3D model, and the pose parameters are estimated through the iterative optimisation of this cost function. The major strengths of these methods are that no feature correspondence is required a priori, no profile image gallery is needed, and they are more stable than the methods of the first category. The major weakness is that these methods often have a limited convergence radius, so a relatively good initialisation is necessary for them to converge successfully.
The method proposed in this article generally belongs to
this third category, that is, no feature correspondence
between the input 2D image and object’s 3D model is
required a priori and object’s pose is estimated iteratively.
This method is motivated by the work presented in [23] by Iwashita et al., which uses only the object's contour to estimate pose parameters. Iwashita et al. [23] demonstrated the feasibility of estimating an object's pose using only the image contour, but the
pose estimation algorithm proposed in [23] still needs improvement. To bypass the problem of determining the feature correspondence between the input 2D image and the object's 3D model, Iwashita et al. [23] resorted to defining a holistic ensemble cost function over all contour pixels: complicated forces and torsion moments are introduced to drive the object's model towards alignment with the input image. The ensemble cost function is highly non-linear, which places a heavy burden on the subsequent optimisation process and makes the algorithm prone to being trapped in local minima. Besides non-linearity, the ensemble cost function of [23] did not differentiate between the pose parameters but treated them equally. However, the six pose parameters (three rotation angles and three translation components) are heterogeneous and affect the object's imaging process in different ways; to achieve good pose estimation results, the algorithm should embody these differences [1, 25]. The third problem is that [23] used the gradient descent method to optimise its cost function, but owing to the high non-linearity of the cost function, explicit partial derivatives cannot be derived, so numerical approximation had to be used, which is computationally expensive and error prone.
In this article, we propose a new iterative pose estimation scheme for 3D rigid objects based on the object's contour. As in [23], we also use the object's contour as the algorithm's starting point, but instead of defining an ensemble cost function and estimating the pose parameters by optimising it, we exploit the information provided by the object's contour in a different way. After extracting the object's outer contour from the input image, we extract contour pixel features and try to establish a tentative 2D–3D point correspondence relationship between the input 2D grey-level image and the object's 3D model. This tentative point correspondence relationship is then used to estimate the object's pose parameters with the well-developed point-based pose estimation algorithm of [1], and the newly estimated pose parameters are then fed back to revise the previously established tentative point correspondence. This process is iterated until the correct point correspondence relationship is established and the pose parameters of the 3D object are successfully retrieved. The main feature of our method is that both the pose estimation problem and the feature correspondence problem are solved simultaneously and iteratively. None of the three problems of [23] mentioned above exists in our method, making it computationally more efficient, faster and more stable. Experiments show that our method has a faster convergence speed and a wider convergence radius.
In a very recently published conference paper [26], Cui and Hildenbrand proposed an iterative method with an idea similar to ours, that is, iteratively establishing the point correspondence relationship between the image contour and the object's 3D model and estimating the pose parameters based on it. The major difference between our work and [26] lies in the way the contour point correspondence is established. The way of [26] is 3D–3D-wise: directly retrieve the vertex of the object's 3D model nearest to the 3D sight line of a contour pixel. For a complex object whose 3D model contains thousands of vertices, the computational burden of this 3D–3D-wise way is rather heavy. To decrease the computational complexity, [26] proposed to simplify the object's 3D model with a 3D Fourier transform, but this in turn lowers the accuracy of the established point correspondence relationship. Instead, we adopt a 2D–2D–3D-wise way: first establish the 2D–2D point correspondence between the input image contour and the projection image contour (see Section 2.2), then back-project the 2D–2D correspondence onto the surface of the object's 3D model to obtain the final 2D–3D point correspondence relationship. The advantage of this indirect way is that both the 2D–2D point correspondence establishing step and the back-projecting step can be accelerated by fast algorithms, which are described in this paper. Besides, the point correspondence established by Cui and Hildenbrand [26] can never be exact, since only the vertex information of the object's 3D model is used there, causing poor convergence performance for unevenly triangulated object models. This problem is also solved by our 2D–2D–3D correspondence establishing method.
The remainder of this article is organised as follows: in
Section 2 the details of our method are described and
analysed theoretically; in Section 3 the performance of our
method is studied experimentally, and as will be shown, the
results of convergence speed and convergence radius of our
method are very promising; Section 4 concludes the article.
2 Iterative pose estimation based on the
object’s contour
2.1 Method description
In this section, we address the details of our method and
analyse it theoretically. Before entering into algorithm
specifics, we first describe the outline of the processing
flow of our method, as illustrated in Fig. 1. The whole processing flow divides into two major stages: the preprocessing stage and the iterative stage. The preprocessing stage (Figs. 1a–c) receives a monocular 2D grey-level image as input, then the outer contour of the object and contour pixel features are extracted. The iterative stage (Figs. 1d–f) first establishes a tentative 2D–3D point correspondence relationship between the extracted contour and the object's 3D model using the pose parameters estimated at the last iteration. New values of the object's pose parameters
are then re-estimated based on the established tentative
2D–3D point correspondence. This process is iterated until
correct 2D–3D point correspondence is obtained and the
object’s 3D pose parameters are successfully retrieved.
There are two presumptions made by our iterative pose
estimation method which readers should keep in mind:
1. The object’s 3D pose and its outer contour (on the image
plane) are in one-to-one correspondence. This means, for a
specific contour, there exists only one determinate
corresponding pose of the object;
2. When the object’s pose varies continuously, its
corresponding contour on the image plane also changes
continuously.
Presumption (1) legitimates the object's outer contour as a viable feature for pose estimation, and presumption (2) ensures that the iterative mechanism possesses a reasonable convergence radius. Note that presumption (1) is an ideal condition; for artificial 3D objects that are self-symmetric in certain ways, it cannot be satisfied strictly. For example, the contours in the top view and the bottom view of an aircraft are identical, and thus a contour-based pose estimation algorithm cannot differentiate between these two cases. Fortunately, such cases are mathematically rare (their integral measure equals zero in the possible 3D pose space), so we can still apply the contour-based pose estimation algorithm to these objects;
and for the odd cases in which the object's contours are identical, since the possible solutions are known a priori, they would not be a big problem.
In this article, a perspective camera model is used to describe the camera's imaging process. The camera model and the corresponding coordinate frame configuration are shown in Fig. 2, in which the subscripts u, v and p indicate the camera coordinate frame, the object self-centred coordinate frame and the image plane coordinate frame, respectively. The object self-centred coordinate frame and the camera coordinate frame are related by the rigid transformation

$$\mathbf{x}_u = \mathbf{R}\mathbf{x}_v + \mathbf{t} \qquad (1)$$

or, in matrix form,

$$\mathbf{X}_u = \mathbf{R}\mathbf{X}_v + \mathbf{t}\mathbf{h}^{\mathrm{T}} \qquad (2)$$

in which h is an all-ones vector, R is the rotation matrix and t is the translation vector. The image plane coordinate frame and the object self-centred coordinate frame are related by the imaging equation

$$\begin{pmatrix}\mathbf{x}_p \\ 1\end{pmatrix} \simeq \mathbf{K}\,(\mathbf{R}\,|\,\mathbf{t})\begin{pmatrix}\mathbf{x}_v \\ 1\end{pmatrix} \qquad (3)$$

The symbol '≃' means equal in the homogeneous sense (i.e. up to scale), and K is the intrinsic camera parameter matrix.

In the remainder of this article, a calibrated camera is assumed, that is, the intrinsic camera parameter matrix K is known. This provides us the convenience of leaving K out of consideration and makes the subsequent derivations more compact.
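To make the camera model concrete, the following Python/NumPy sketch projects object-frame points through (1)–(3). It is an illustration only; the values of R, t and K in the example are arbitrary and do not come from the paper.

```python
# Minimal sketch (not the authors' code): projecting 3D object-frame points
# with a calibrated perspective camera, following (1)-(3).
import numpy as np

def project_points(X_v, R, t, K):
    """Project (N, 3) object-frame points X_v onto the image plane.

    R : (3, 3) rotation matrix, t : (3,) translation vector, K : (3, 3)
    intrinsic camera matrix.  Returns (N, 2) pixel coordinates.
    """
    X_u = X_v @ R.T + t              # eq. (1): camera-frame coordinates
    x_h = X_u @ K.T                  # eq. (3): homogeneous image coordinates
    return x_h[:, :2] / x_h[:, 2:3]  # perspective division

# Example usage with arbitrary (made-up) parameters
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
X_v = np.array([[0.1, -0.2, 0.3], [0.0, 0.0, 0.0]])
print(project_points(X_v, R, t, K))
```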
2.2 Contour extraction and tentative point
correspondence establishing
Our method is based on the outer contour of the object. A clean, continuous, single-pixel-wide contour is necessary for the method to produce good performance. To extract the object's contour from the input image, various methods are available. In our current implementation, we use a modified active contour model proposed by the authors, which produces a closed and noise-robust contour and is still very fast. The contour of the projection image, which is generated by projecting the object's 3D model onto the image plane, can be extracted by simple binary operations, which can be implemented efficiently on a modern central processing unit–graphics processing unit (CPU–GPU) computing architecture. For ease of statement, we will refer to the contour extracted from the input 2D grey-level image as the 'extracted contour', and to the contour obtained from the projection image as the 'projected contour' in the remainder of this article.
After extracting the object's contour from the input image, the next step is to extract contour pixel features and establish a tentative point correspondence relationship between the extracted contour and the object's 3D model. How to determine the point correspondence between a 2D image and a 3D model directly is still an open problem in computer vision. Here we solve this problem indirectly, that is, firstly project
Fig. 2 Camera model and coordinate frames' configuration
Fig. 1 Algorithm flowchart of the proposed method
a Input 2D grey-level image
b Extracted contour from the input image
c Extract contour pixel features
d Establish tentative point correspondence between the extracted contour and the object's 3D model
e Model wireframe overlaid on the 2D grey-level image with the estimated pose
f Final pose estimation result
a–c belong to the preprocessing stage, and d–f belong to the iterative stage
the object’s 3D model onto the image plane to obtain the
projected contour, secondly determine the 2D–2D point
correspondence between the extracted contour and the
projected contour and thirdly points on the projected contour
which have corresponding point on the extracted contour are
back-projected onto the surface of object’s 3D model,
carrying the established 2D–2D point correspondence
relationship forward to the final 2D–3D point correspondence.
The contour pixel feature extracted in our current
implementation is simply the pixel’s position in image. This
is the simplest feature ever possible for contour pixels. As
will be shown in the experiment section, even this simple
feature can provide good convergence performance. More
complex features may improve our method’s convergence
performance, but this is not the focus of the current article.
In this article, we will spare our attention on how far the
proposed scheme can go.
2.2.1 Establishing 2D–2D point correspondence: To determine the 2D–2D point correspondence between the contour extracted from the input image and the projected contour from the projection image, we resort to a very intuitive observation: when two poses of an object are close to each other, the correctly corresponding points on the two contours should also be geometrically near each other. Thus, we can first establish the point correspondence tentatively by searching for geometrically nearest point pairs, and revise it iteratively with the updated pose parameter estimates. A geometrically nearest point pair is not a difficult mathematical concept; however, finding all nearest point pairs between two contours, which may contain hundreds of points, could be rather computationally expensive if not implemented properly. Here we propose to use a distance map to establish the 2D–2D point correspondence between the two contours efficiently. Note that a distance map is also used in [23], but in a very different way: there, the distance map lies at the kernel of the method and is used to define the ensemble cost function; in our method, a distance map is used only to accelerate the 2D–2D point correspondence establishment.
A distance map describes the shortest distance from image pixels to the given contour. More precisely, given the extracted contour C, the distance map value of an image pixel x is given as

$$D(\mathbf{x}) = \min_{\mathbf{y} \in C} \|\mathbf{x} - \mathbf{y}\| \qquad (4)$$
This definition is very similar to the signed distance function commonly used in level set theory, except that no sign determination is necessary here. So the fast marching method [27], which is often used to compute the signed distance function, can be used to build the distance map [23]. However, the fast marching method is designed for general distance measures, and its O(N log N) computational complexity (N is the total number of image pixels) is still rather high even for a medium-sized image. In our implementation, since the distance map is built only with the Euclidean distance measure, we adopt the distance transform method proposed in [28], whose computational complexity is only O(N) and which is much faster than the fast marching method for large N. It is worth emphasising that the distance map needs to be computed only once during the whole pose estimation process.
As per its definition (4), the distance map value of a given image pixel is the shortest distance from that position to the given contour; thus, similarly to the signed distance function of level sets, pixels that have the same distance map value form a closed and continuous isocontour, see Fig. 3. To retrieve the nearest contour pixel for a given position, we can simply trace down the negative gradient direction from that position, and the first contour pixel met along this path is the required nearest contour pixel. Fig. 3 illustrates how the nearest contour pixel can be retrieved efficiently by taking advantage of the built distance map. The greatest advantage of utilising the distance map for nearest point retrieval is that the distance map needs to be built only once, and once it is built, the computational complexity of retrieving the geometrically nearest point is reduced to only O(1).
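For illustration, the following Python/SciPy sketch builds such a distance map with scipy.ndimage.distance_transform_edt, a linear-time Euclidean distance transform in the spirit of [28] (though not the authors' implementation). Rather than tracing the negative gradient, it uses the transform's return_indices option, which records the nearest contour pixel for every image position and therefore gives the same O(1) lookup.

```python
# Sketch of the distance-map acceleration described above (assumption: the
# extracted contour is given as a boolean image).
import numpy as np
from scipy.ndimage import distance_transform_edt

def build_distance_map(contour_mask):
    """contour_mask: 2D boolean array, True on extracted-contour pixels.
    Returns (D, idx): D[y, x] is the distance to the nearest contour pixel
    (eq. (4)); idx[:, y, x] are that pixel's (row, col) coordinates."""
    D, idx = distance_transform_edt(~contour_mask, return_indices=True)
    return D, idx

def nearest_contour_pixel(idx, y, x):
    """O(1) lookup of the extracted-contour pixel nearest to (y, x)."""
    return idx[0, y, x], idx[1, y, x]

# Example usage on a toy contour
contour = np.zeros((10, 10), dtype=bool)
contour[5, 2:8] = True                   # a short horizontal contour
D, idx = build_distance_map(contour)
print(nearest_contour_pixel(idx, 1, 4))  # -> (5, 4)
```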
2.2.2 Establishing 2D–3D point correspondence: After the 2D–2D point correspondence relationship between the extracted contour and the projected contour is established, the next step is to carry this relationship forward to the final 2D–3D point correspondence required by the subsequent point-based pose estimation sub-procedure. What we need to do is to back-project the points of the projected contour which have a corresponding point on the extracted contour onto the surface of the object's 3D model, to obtain the corresponding 3D points' coordinates. To accomplish this task, we propose the following two-step method.
The first step is to retrieve from the 3D model the triangular patches which correspond to the projected contour. To fully utilise the power of a modern GPU, we dye each triangular patch of the 3D model with a different colour; then, using this colour attribute as a hash index, we can efficiently retrieve the required triangular patches from the object's 3D model, which usually consists of thousands of triangular patches.
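A hypothetical sketch of this colour-as-hash-index idea is given below: each patch index is packed into a unique 24-bit RGB value for rendering and decoded back after rasterisation. The exact encoding is an assumption; the paper only states that a distinct colour per patch is used as the hash index.

```python
# Hypothetical encoding of a triangular-patch index into a render colour
# and back; any injective mapping would serve the same purpose.

def face_id_to_colour(face_id):
    """Pack a patch index (< 2**24) into an (R, G, B) triple for rendering."""
    return ((face_id >> 16) & 0xFF, (face_id >> 8) & 0xFF, face_id & 0xFF)

def colour_to_face_id(rgb):
    """Recover the patch index from the rendered pixel colour."""
    r, g, b = rgb
    return (int(r) << 16) | (int(g) << 8) | int(b)

# Round-trip example
print(colour_to_face_id(face_id_to_colour(123456)))   # -> 123456
```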
Since the retrieved triangular patch could be relatively large with respect to the scale of the whole 3D model, it would be too coarse to be used directly in the subsequent pose estimation sub-procedure. The next step is therefore to obtain the precise 3D points' coordinates from the retrieved triangular patches. Assume that there is no rotation or translation transform between the camera coordinate frame and the object self-centred coordinate frame, and let the three vertices of a triangular patch be represented as X_v = {x_v1, x_v2, x_v3}; then the plane defined
Fig. 3 Retrieving the nearest contour pixel for a given position
The black dot is the start position; with the help of the distance map, the required nearest contour pixel (shown as a grey dot) can be found very efficiently
by this triangular patch will be

$$\mathbf{P} = \begin{pmatrix} (\mathbf{x}_{v1} - \mathbf{x}_{v3}) \times (\mathbf{x}_{v2} - \mathbf{x}_{v3}) \\ \mathbf{x}_{v3} \cdot (\mathbf{x}_{v1} \times \mathbf{x}_{v2}) \end{pmatrix} \qquad (5)$$

Let x_vg represent the corresponding 3D point's coordinates; with x_g denoting the homogeneous coordinates (sight-line direction) of the projected-contour pixel, x_vg is given as

$$\mathbf{x}_{vg} = \frac{\mathbf{P}(4)}{\mathbf{x}_g \cdot \mathbf{P}(1{:}3)}\,\mathbf{x}_g \qquad (6)$$
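The back-projection step (5)–(6) amounts to intersecting the sight line of a projected-contour pixel with the plane of its triangular patch. The sketch below illustrates this under the same no-rotation/no-translation assumption; taking x_g as the sight-line direction obtained from the pixel's normalised homogeneous coordinates is an interpretation made here.

```python
# Sketch of the back-projection step (5)-(6): intersect the sight line
# lambda * x_g with the plane of a triangular patch.
import numpy as np

def triangle_plane(xv1, xv2, xv3):
    """Plane P = (n, d) of a triangular patch, eq. (5):
    n = (xv1 - xv3) x (xv2 - xv3),  d = xv3 . (xv1 x xv2)."""
    n = np.cross(xv1 - xv3, xv2 - xv3)
    d = xv3 @ np.cross(xv1, xv2)
    return np.concatenate([n, [d]])

def back_project(P, x_g):
    """3D point on the plane P along the sight line through x_g, eq. (6)."""
    return (P[3] / (x_g @ P[:3])) * x_g

# Example: a triangle in the plane z = 2, sight line through its interior
xv1, xv2, xv3 = np.array([1.0, 0, 2]), np.array([0, 1.0, 2]), np.array([0, 0, 2.0])
P = triangle_plane(xv1, xv2, xv3)
print(back_project(P, np.array([0.1, 0.1, 1.0])))   # -> [0.2, 0.2, 2.0]
```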
2.3 Iterative pose estimation and convergence
specifics
After the tentative 2D–3D point correspondence relationship is established between the input image and the object's 3D model, the next step is to execute the point-based pose estimation sub-procedure. In our current implementation, we adopt the orthogonal iteration (OI) algorithm proposed in [1], which is fast and numerically precise. With the tentative 2D–3D point correspondence and the OI algorithm, new values can be estimated for the 3D object's pose parameters. These updated pose parameters are then fed back to generate a new projection image, establish a new tentative point correspondence and then estimate new values for the object's 3D pose parameters. This process is iterated until the correct point correspondence relationship is established and the correct pose estimation result is retrieved; if the process does not converge within a preassigned number of iterations, we abort it and return a failure.
To measure the fitness of the pose estimation result returned by the iterative method, we perform the 'XOR' operation on the binary image extracted from the input grey-level image and the binary projection image. If the pose estimation result is close to the object's actual pose, then the area of the regions remaining after the XOR operation will be small. So we define a ratio

$$A_{\mathrm{ratio}} = \frac{\mathrm{area}(\mathrm{BI}_{ob} \oplus \mathrm{BI}_{pr})}{\mathrm{area}(\mathrm{BI}_{pr})} \qquad (7)$$

in which BI_ob represents the binary image extracted from the input grey-level image, BI_pr represents the binary projection image obtained by projecting the object's 3D model onto the image plane, and ⊕ denotes the pixel-wise XOR. For a good pose estimation result, the value of A_ratio will be small. This measure will be used for convergence determination in the experiment section.
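A minimal sketch of the measure (7), assuming both silhouettes are given as boolean images of the same size:

```python
# Sketch of the fitness measure (7): XOR of the two silhouettes,
# normalised by the area of the projected silhouette.
import numpy as np

def a_ratio(BI_ob, BI_pr):
    """BI_ob: binary image segmented from the input grey-level image.
    BI_pr: binary projection of the 3D model under the estimated pose."""
    return np.logical_xor(BI_ob, BI_pr).sum() / BI_pr.sum()

# Toy example: two slightly offset squares
BI_ob = np.zeros((20, 20), dtype=bool); BI_ob[5:15, 5:15] = True
BI_pr = np.zeros((20, 20), dtype=bool); BI_pr[6:16, 5:15] = True
print(a_ratio(BI_ob, BI_pr))   # -> 0.2
```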
The last problem left undiscussed so far is how to obtain a good initialisation for the iterative method, to ensure satisfactory pose estimation results. Since it is not the focus of this article, we give only a brief discussion of this problem. To fulfil the initialisation task, many template-based methods can be used [29–31]. Our method is similar to the work of [20]. A small gallery consisting of normalised profile images is built beforehand, with a rather coarse sampling of the possible poses. Normalisation is necessary to remove the effect of the translation parameters. Then, after the input grey-level image is segmented, the result image is normalised and compared against the profile image gallery, and the pose of the best-matching profile image is chosen to initialise our iterative pose estimation method.
Table 1 summarises the algorithm flow of the proposed
contour-based iterative pose estimation method for a 3D
rigid object.
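The following sketch summarises the iterative stage in code form; it is not the authors' implementation. render_silhouette, extract_contour, back_project_contour and oi_pose (a point-based solver such as the OI algorithm of [1]) are hypothetical placeholders, while build_distance_map, nearest_contour_pixel and a_ratio refer to the sketches given earlier.

```python
# High-level sketch of the iterative stage (Fig. 1 / Table 1); the helper
# functions named below are placeholders that a full implementation would
# have to supply.
def iterative_pose_estimation(BI_ob, image_contour, model, R, t,
                              max_iter=100, tol=4e-2):
    # Preprocessing stage: the distance map is built only once.
    _, idx = build_distance_map(image_contour)
    for _ in range(max_iter):
        BI_pr = render_silhouette(model, R, t)          # project the 3D model
        proj_contour = extract_contour(BI_pr)           # projected contour pixels
        # Tentative 2D-2D correspondences via O(1) distance-map lookups ...
        pairs_2d = [((y, x), nearest_contour_pixel(idx, y, x))
                    for (y, x) in proj_contour]
        # ... carried forward to 2D-3D correspondences by back-projection.
        pairs_2d3d = back_project_contour(pairs_2d, model, R, t)
        R, t = oi_pose(pairs_2d3d)                      # point-based pose [1]
        if a_ratio(BI_ob, BI_pr) < tol:                 # convergence check (7)
            break
    return R, t
```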
3 Experiment results
In this section, we test the performance of our iterative pose
estimation method in convergence speed, convergence
radius and robustness to noise with various-shaped 3D rigid
objects. As a comparison, the pose estimation results
returned by Iwashita’s method [23] and Cui’s method [26]
are also presented. All the code involved is implemented in Matlab scripts and runs on a PC with a 1.8 GHz CPU and 1 GB of RAM.
3.1 Convergence speed performance
In this subsection, the convergence speed performance of our
new method is tested with a small model gallery which
consists of an aircraft, a racecar, a house, a desk lamp and
a grand piano. These models are of various shapes, size
scales and detail complexity, providing good sample
diversity for the following experiments.
We run the pose estimation methods five times for each model, with a different pose initialisation each time. For the pose initialisation required by the iterative methods, we pollute the true pose values with Gaussian noise, which produces in total about a 30° deviation for the 3D rotation angles. For Iwashita's method [23], to guarantee the best convergence radius, at each iteration the Armijo rule [32] is used to search for an optimal time step for each of the six pose parameters. The time cost by this optimal-step searching procedure varies dramatically with different parameter tunings, so for fairness we do not count the absolute time in the convergence speed comparison for Iwashita's method. Likewise, for Cui's method [26], to guarantee the best convergence radius, the model simplification procedure proposed in [26] is not utilised here; for fairness, the absolute convergence time is also not counted for Cui's method.
Table 1 Contour-based iterative pose estimation of a 3D rigid object

Preprocessing stage:
1. receive a monocular 2D grey-level image as input
2. extract the object's outer contour and contour pixel features
3. build the distance map based on the contour pixel features extracted in step 2

Iterative stage:
1. generate the projection image using the last estimated pose parameters and extract the object's outer contour from the projection image
2. establish the tentative 2D–2D point correspondence relationship between the extracted contour and the projected contour
3. back-project the projected contour onto the 3D model's surface, establishing the tentative 2D–3D point correspondence relationship between the extracted contour and the object's 3D model
4. solve the point-based pose estimation problem using the OI algorithm [1]
5. check the convergence condition (7); if not satisfied, go to step 1
To measure the fitness of the results returned by the methods, we adopt A_ratio defined in Section 2. If A_ratio is smaller than a given threshold, the pose estimation task is claimed to be successfully accomplished; otherwise it is claimed as a failure.
Fig. 4 illustrates the pose estimation results returned by our method for the house, desk lamp and piano models. The left-column images show the initial pose configurations and the right-column images present the final pose estimation results. For ease of inspection, the effect of translation is removed for all the images. For all the 5 × 5 test runs, our method registers the object's 3D model (represented by a wire mesh in the image) precisely with the input 2D grey-level image. Even though some parts of the initially posed model are far from their counterparts in the 2D grey-level image, a good estimation result can still be achieved. This also reveals that our method has good error-tolerance performance.
Table 2 summarises the convergence speed statistics of the three methods for the test gallery. The threshold of A_ratio is set to a rule-of-thumb value of 4e-2. For our method, it usually takes tens of iterations to converge successfully, and the time cost per iteration is less than 1/3 s in most cases; whereas Iwashita's method usually takes hundreds of iterations to converge, which is much slower than ours, and Cui's method takes fewer iterations than Iwashita's method but is still slower than ours.
Apart from convergence speed, Iwashita's method failed in several cases for the aircraft, house and desk lamp models. It appears to be trapped in local optimal solutions because of its complicated, highly non-linear cost function and its error-prone gradient-descent-based optimisation. Cui's method also failed in several cases for the house, desk lamp and piano models, because it failed to establish the correct 2D–3D point correspondence for these unevenly triangulated test models. In contrast, the performance of our method is more consistent and stable. These statistics also indicate that our method has a wider convergence radius, which will be examined further in the following subsection.
In Fig. 5, six frames from one pose estimation run are presented to give a clearer view of the convergence process of our method. These six frames are extracted from a total of 72 iterations (to make the iteration last longer, we intentionally initialised the method poorly and set a much smaller threshold value for A_ratio than before). Fig. 6 presents two curves: (i) the evolution curve of A_ratio and (ii) the evolution curve of the residual error of the point-based pose estimation sub-procedure. As we can see, these two curves are consistent with each other, and they also demonstrate how well the object's 3D model is registered to the input 2D grey-level image along the iterations.
3.2 Convergence radius performance
For iterative methods, their convergence radii are always
expected to be as wide as possible. The wider the
convergence radius is, the easier it is to find a plausible initial
solution. There are two main factors influencing the
convergence radius of our method: the model's shape and the camera's view angle. For different model shapes and camera
view angles, the actual convergence radius is usually different.
In this subsection, the convergence radius performance of our
method against these two factors will be examined. For
comparison, results obtained with Iwashita’s method and
Cui’s method are also presented.
Fig. 4 Illustration of pose estimation results returned by our method for the house, desk lamp and piano models
The image in the left column shows the initial pose configuration, and the image in the right column shows the final pose estimation result. The projection of the object's 3D model is represented by a wire mesh. The effect of translation is removed for ease of inspection
a, c and e Initial pose configurations
b, d and f Final pose estimation results
Table 2 Convergence speed statistics for the methods of [23, 26] and our method

             Method of [23]                          Method of [26]                          Our method
Model        avg. iter.  time/iter., s^a  failed^b   avg. iter.  time/iter., s^a  failed^b   avg. iter.  time/iter., s  failed^b
aircraft     192.6       n/a              1          53.6        n/a              0          26.4        0.0786         0
race car     65.8        n/a              0          16.0        n/a              0          8.4         0.2246         0
house        124.4       n/a              2          38.6        n/a              1          12.6        0.2165         0
desk lamp    221.2       n/a              2          51.6        n/a              1          22.4        0.1710         0
piano        126.8       n/a              0          24.2        n/a              1          12.8        0.2291         0

^a For fairness, the time cost per iteration is not counted for the methods of [23, 26]
^b Number of failed cases; the threshold of A_ratio is set to 4e-2
The biggest problem in executing this subsection's experiments is that the number of possible poses for a 3D object is extremely large; for example, with a 1° sample interval for each of the three pose angles, the total number of pose samples would be 360 × 360 × 180 ≈ 2.3 × 10^7.
Owing to computing power limitations, we compromised on the number of test models and the sample density of possible poses as follows: for the test models we choose two highly different objects, an aircraft and a dolphin toy, which are shown in Figs. 7a and b; for the pose sample density, we set the sample intervals for the first two pose angles (yaw angle α and pitch angle β) both to 20°; for the third pose angle (roll angle), since its effect is an in-plane rotation, we omit it because it can be recovered simply by a 2D rotation. The initial pose solution required by the iterative methods is obtained by offsetting the true pose within the range

$$\Delta\alpha \times \Delta\beta = [-45^\circ, 45^\circ] \times [-45^\circ, 45^\circ] \qquad (8)$$
with a 5° interval for each rotation angle. Thus, for one test model, the number of sampled poses (i.e. different camera view angles) is 18 × 9, and for each sampled pose there are 19 × 19 test runs to explore its neighbourhood. This results in a total of 1.17 × 10^5 runs for each method.
To display the experiment results, we calculate the number of successful runs for each sampled pose and map it onto a unit sphere, see Fig. 7. Each point on the sphere indicates the width of the convergence radius for the corresponding view angle: the brighter the colour, the wider the convergence radius. Owing to the left–right symmetry of the test models, the front view and the back view of the result sphere are approximately identical, so only the front view of the sphere is shown.
To quantify the convergence radius results, we introduce the following equation to convert the number of successful runs into an equivalent radius value

$$\mathrm{Radius} = \sqrt{\frac{N_{\mathrm{success}}}{N_{\mathrm{total}}}} \cdot R_{\mathrm{total}} \qquad (9)$$

in which N_success is the number of successful runs for a given sampled pose, N_total is the total number of test runs for that sampled pose and R_total is the radius of the offset range. This equation is derived from the fact that the area of a circle is proportional to the square of its radius. In our case, we have

$$N_{\mathrm{total}} = 361, \qquad R_{\mathrm{total}} = 45^\circ \qquad (10)$$
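As a quick illustration of (9) with arbitrarily chosen numbers: a sampled pose for which 100 of the 361 perturbed initialisations converge would be assigned an equivalent radius of sqrt(100/361) × 45° ≈ 23.7°.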
Table 3 summarises the convergence radius statistics of the three methods. Several conclusions can be drawn from Fig. 7 and Table 3:
1. For different-shaped models and camera view angles, the convergence radius performance can vary dramatically for all three methods;
2. All three methods exhibit similar convergence radius variation trends;
3. Our method has a much wider convergence radius than Iwashita's method and Cui's method, both for the complex aircraft model and for the relatively simple dolphin toy model.
3.3 Noise robustness
Experiments in the previous two subsections postulate ideal contour extraction results. In actual applications, the contour extracted from the input grey-level image is seldom ideal because of noise, complex image backgrounds, lighting variation etc. This imperfection will surely degrade the pose estimation method's performance. In this subsection, we study the noise robustness of our method under various signal-to-noise ratio (SNR) conditions.
The test models used in this subsection are the same as in Subsection 3.2, see Figs. 7a and b. To generate simulated test images, we set the sample intervals of the yaw angle α and
Fig. 5 Convergence illustration of our method
a–f are the 1st, 2nd, 4th, 10th, 20th and 72nd frames out of a total of 72 iterations
Fig. 6 Convergence curves
a Evolution curve of A_ratio
b Evolution curve of the residual error of the point-based pose estimation sub-procedure
the pitch angle β both to 10°. Compared with total randomisation, this setting ensures more accurate and comprehensive performance evaluation results. The extracted contour is then contaminated with different levels of Gaussian noise to simulate real-world defects, see Fig. 8. The initial pose required by the method is obtained by polluting the true value with Gaussian noise at 20 dB SNR. The iteration number is fixed to 100 for each run. To evaluate the
Fig. 7 Convergence radius experiment results mapped onto a unit sphere
a and b Test models: an aircraft and a dolphin toy
c and d Front views of the result spheres for Iwashita's method
e and f Front views of the result spheres for Cui's method
g and h Front views of the result spheres for our method
Table 3 Convergence radius statistics for the methods of [23, 26] and our method (offset range (8))

             Method of [23]                       Method of [26]                       Our method
Model        min    max    mean   std var         min    max    mean   std var         min    max    mean   std var
aircraft     10.86  28     18.9   3.61            12.7   37.1   23.6   4.34            22.8   44.5   33.3   4.71
dolphin toy  13.5   31.6   23.7   3.71            17.2   36.2   28.6   4.09            27.1   45     41.1   3.87
noise-robustness performance, we analyse the statistics of the final A_ratio returned by our method.
Results are shown in Fig. 9. As we can see, for both test models, the inflection point of the mean A_ratio curve occurs at 30 dB. When SNR > 30 dB, the mean values of A_ratio are both lower than 1e-1, an empirical value indicating that an acceptable pose estimation result has been obtained; however, when SNR < 30 dB, both curves rise rapidly. This indicates that for our method to work properly, an SNR above 30 dB is required. Referring to Fig. 8b, which illustrates what a noise-contaminated contour looks like at 30 dB SNR, we can conclude that the noise robustness performance of our method is satisfactory.
3.4 Experiments with real-world image data
In this subsection, the effectiveness of our new method is verified with experiments using real-world aircraft images. Aircraft are highly manoeuvrable and can be arbitrarily posed, and thus fit well the 'general 3D rigid object' concept that this article concentrates on. Experiment data and results are presented in Fig. 10. The 1st and 3rd columns of Fig. 10 are the input monocular images, and the 2nd and 4th are the corresponding pose estimation results returned by our method. The result image is rendered with the estimated pose parameters and the object's 3D model. The aircraft in Figs. 10a and b is the F16 Fighting Falcon, and that in Figs. 10c and d is the F22 Raptor. From Fig. 10, we can see that our new method successfully accomplished the pose estimation tasks with various object poses, sizes and image qualities.
4 Conclusions
In this article, we focus on the pose estimation of a general 3D rigid object when no feature correspondence between the input monocular image and the object's 3D model is available a priori, and a new contour-based iterative method is proposed which is fast and has a wide convergence radius. Our new method solves the feature correspondence problem and the pose estimation problem simultaneously and iteratively, that is, not only the pose parameters of the 3D object, but also the 2D–3D point correspondence between the input grey-level image and the object's 3D model can be retrieved. The tentative point correspondence establishing scheme keeps our method free from defining highly non-linear ensemble cost functions, making it computationally more efficient and stable. Experimental results show that the performance of our method is promising in convergence speed, convergence radius and noise robustness.
In the present work, we only adopt 2D geometrical properties to build the tentative 2D–3D point correspondence between the input grey-level image and the object's 3D model. It would be sensible to incorporate other image properties to improve the accuracy of the tentative
Fig. 8 Noise-contaminated contour extraction results
a SNR = 60 dB
b SNR = 30 dB
c SNR = 10 dB
Fig. 9 Results of noise robustness experiments
Fig. 10 Pose estimation results for real-world image data by our method
The 1st and 3rd columns are the input monocular images, and the 2nd and 4th are the corresponding pose estimation results. The result image is rendered with the estimated pose parameters and the object's 3D model
a and b Pose estimation results for the F16 Fighting Falcon
c and d Pose estimation results for the F22 Raptor
point correspondence establishing procedure, provided this does not incur too much extra computation cost. This will be future work.
5 Acknowledgment
We would like to acknowledge Prof. Iwashita for her help in
understanding their algorithm.
6 References
1 Lu, C.P., Hager, G.D., Mjolsness, E.: ‘Fast and globally convergent pose
estimation from video images’, IEEE Trans. Pattern Anal. Mach. Intell.,
2000, 22, (6), pp. 610– 622
2 Burschka, D., Mair, E.: ‘Direct pose estimation with a monocular
camera’, Robot Vis., 2008, (LNCS,4931), pp. 440– 453
3 Haralick, R.M., Lee, C.N., Ottenberg, K., Nölle, M.: 'Review and
analysis of solution of the three point perspective pose estimation
problem’, IJCV, 1994, 13, (3), pp. 331– 356
4 Moreno-Noguer, F., Lepetit, V., Fua, P.: ‘Accurate non-iterative O(n)
solution to the PnP problem’. IEEE ICCV’07, Rio de Janeiro,
pp. 2168– 2175
5 Leng, D.W., Sun, W.D.: ‘Finding all the solutions of PnP problem’.
IEEE IST’09, Shenzhen, pp. 348– 352
6 Ansar, A., Daniilidis, K.: ‘Linear pose estimation from points or lines’,
IEEE Trans. Pattern Anal. Mach. Intell., 2003, 25, (5), pp. 578– 589
7 David, P., DeMenthon, D., Duraiswami, R., Samet, H.: ‘Simultaneous
pose and correspondence determination using line features’.
CVPR’03, 2003, vol. 2, pp. 424– 431
8 Christy, S., Horaud, R.: ‘Iterative pose computation from line
correspondences’, CVIU, 1999, 73, (1), pp. 137– 144
9 Hanning, T., Schoene, R., Graf, S.: ‘A closed form solution for
monocular re-projective 3D pose estimation of regular planar
patterns’, ICIP, 2006, 1–7, pp. 2197– 2200
10 Jacobs, D., Basri, R.: ‘3D to 2D pose determination with regions’, IJCV,
1999, 34, (2– 3), pp. 123 –145
11 Tahri, O., Chaumette, F.: ‘Complex objects pose estimation based on
image moment invariants’. Proc. IEEE Int. Conf. on Robotics and
Automation, Barcelona, Spain, April 2005, pp. 436–441
12 Donoser, M., Bischof, H.: ‘Efficient maximally stable extremal region
(MSER) tracking’. CVPR’06, 2006, vol. 1, pp. 553– 560
13 Kyriakoulis, N., Gasteratos, A.: ‘Color-based monocular visuoinertial
3-D pose estimation of a Volant robot’, IEEE Trans. Instrum. Meas.,
2010, 59, (10), pp. 2706–2715
14 Lowe, D.G.: ‘Distinctive image features from scale-invariant keypoints’,
IJCV, 2004, 60, (2), pp. 91– 110
15 Bay, H., Tuytelaars, T., Gool, L.V., Zurich, E.: ‘SURF: speeded up
robust features’, CVIU, 2008, 110, (3), pp. 346– 359
16 Smith, S.M., Brady, J.M.: 'SUSAN: a new approach to low level image processing', IJCV, 1997, 23, (1), pp. 45–78
17 Viksten, F., Forssén, P.E., Johansson, B., Moe, A.: 'Comparison of local
image descriptors for full 6 degree-of-freedom pose estimation’. IEEE
Int. Conf. on Robotics and Automation, Kobe, Japan, 2009,
pp. 1139– 1146
18 Shan, G.L., Ji, B., Zhou, Y.F.: ‘A review of 3D pose estimation from a
monocular image sequence’. CISP’09, 2009, Tianjin, pp. 1 –5
19 Lee, T.K., Drew, M.S.: ‘3D object recognition by eigen-scale-space of
contours’. SSVM ’07, 2007, vol. 4485, pp. 883–894
20 Dunker, J., Hartmann, G., Stöhr, M.: 'Single view recognition and
pose estimation of 3D objects using sets of prototypical views and
spatially tolerant contour representations’. ICPR’96, 1996, vol. 4,
pp. 14–18
21 Dambreville, S., Sandhu, R., Yezzi, A., Tannenbaum, A.: 'A geometric
approach to joint 2D region-based segmentation and 3D pose estimation
using a 3D shape prior’, SIAM J. Imaging Sci., 2010, 3, (1),
pp. 110–132
22 Poggio, T., Edelman, S.: ‘A network that learns to recognize three-
dimensional objects’, Nature, 1990, 343, pp. 263– 266
23 Iwashita, Y., Kurazume, R., Konishi, K., Nakamoto, M., Hashizume,
M., Hasegawa, T.: ‘Fast alignment of 3D geometrical models and 2D
grayscale images using 2D distance maps’, Syst. Comput. Jpn., 2007,
38, (14), pp. 1889–1899
24 Chetverikov, D., Stepanov, D., Krsek, P.: ‘Robust Euclidean alignment
of 3D point sets: the trimmed iterative closest point algorithm’, Image
Vis. Comput., 2005, 23, (3), pp. 299– 309
25 DeMenthon, D.F., Davis, L.S.: ‘Model-based object pose in 25 lines of
code’, IJCV, 1995, 15, (1– 2), pp. 123–141
26 Cui, Y., Hildenbrand, D.: ‘Pose estimation based on geometric algebra’.
GraVisMa, 2009, pp. 17– 24
27 Sethian, J.A.: ‘A fast marching level set method for monotonically
advancing fronts’, Proc. Natl. Acad. Sci. USA, 1996, 93, pp. 1591– 1595
28 Felzenszwalb, P.F., Huttenlocher, D.P.: ‘Distance transforms of sampled
functions’, Cornell Computing and Information Science TR2004-1963,
available at: http://ecommons.library.cornell.edu/handle/1813/5663
29 Horaud, R.: ‘New methods for matching 3D objects with single
perspective views’, IEEE Trans. Pattern Anal. Mach. Intell., 1987, 9,
(3), pp. 401–412
30 Dhome, M., Richetin, M., Lapresté, J.T., Rives, G.: 'Determination
of the attitude of 3D objects from a single perspective view’,
IEEE Trans. Pattern Anal. Mach. Intell., 1989, 11,(12),
pp. 1265– 1278
31 González, J.M., Sebastián, J.M., García, D., Sánchez, F., Angel, L.:
‘Recognition of 3D object from one image based on projective
and permutative invariants’. ICIAR’04, 2004, vol. 3211,
pp. 705–712
32 Bertsekas, D.P.: ‘Constrained optimization and Lagrange multiplier
methods’ (Academic Press, 1982)