
IPSJ Transactions on Computer Vision and Applications Vol. 3 80–94 (Oct. 2011)

Regular Paper

HyperLS for Parameter Estimation in Geometric Fitting

Kenichi Kanatani,†1 Prasanna Rangarajan,†2 Yasuyuki Sugaya†3 and Hirotaka Niitsuma†1

We present a general framework of a special type of least squares (LS) estimator, which we call "HyperLS," for parameter estimation problems that frequently arise in computer vision applications. It minimizes the algebraic distance under a special scale normalization, which is derived by a detailed error analysis in such a way that statistical bias is removed up to second order noise terms. We discuss in detail many theoretical issues involved in its derivation. By numerical experiments, we show that HyperLS is far superior to the standard LS and comparable in accuracy to maximum likelihood (ML), which is known to produce highly accurate results but may fail to converge if poorly initialized. We conclude that HyperLS is a perfect candidate for ML initialization.

1. Introduction

An important task in computer vision is the extraction of 2-D/3-D geometric information from image data7),8), for which we often need to estimate parameters from observations that should satisfy implicit polynomials in the absence of noise. For such a problem, maximum likelihood (ML) is known to produce highly accurate solutions, achieving the theoretical accuracy limit to a first approximation in the noise level3),8),10). However, ML requires iterative search, which does not always converge unless started from a value sufficiently close to the solution. For this reason, various numerical schemes that can produce reasonably accurate approximations have been extensively studied7). The simplest of such schemes is algebraic distance minimization, or simply least squares (LS), which minimizes the sum of the squares of polynomials that should be zero in the absence of noise. However, the accuracy of LS is very much limited. Recently, a new approach for increasing the accuracy of LS has been proposed in several applications1),12),22)–24). In this paper, we call it HyperLS and present a unified formulation and clarify various theoretical issues that have not been fully studied so far.

†1 Okayama University
†2 Southern Methodist University
†3 Toyohashi University of Technology

Section 2 defines the mathematical framework of the problem with illustrating examples. Section 3 introduces a statistical model of observation. In Section 4, we discuss various issues of ML. Section 5 describes a general framework of algebraic fitting. In Sections 6 and 7, we do a detailed error analysis of algebraic fitting in general, and in Section 8 we derive expressions for the covariance and bias of the solution. In Section 9, we define HyperLS by choosing the scale normalization that eliminates the bias up to second order noise terms. In Section 10, we do numerical experiments to show that HyperLS is far superior to the standard LS and is comparable in accuracy to ML, which implies that HyperLS is a perfect candidate for initializing the ML iterations. In Section 11, we conclude.

2. Geometric Fitting

The term "image data" in this paper refers to values extracted from images by image processing operations such as edge filters and interest point detectors. Examples of image data include the locations of points that have special characteristics in the images or the lines that separate image regions having different properties. We say that image data are "noisy" in the sense that the image processing operations for detecting them entail uncertainty to some extent. Let x_1, ..., x_N be noisy image data, which we regard as perturbations of their true values x̄_1, ..., x̄_N that satisfy implicit geometric constraints of the form

\[ F^{(k)}(x; \theta) = 0, \qquad k = 1, \dots, L. \tag{1} \]

The unknown parameter θ allows us to infer the 2-D/3-D shape and motion of the objects observed in the images7),8). We call this type of problem geometric fitting8). In many vision applications, we can reparameterize the problem to make the functions F^(k)(x; θ) linear in θ (but generally nonlinear in x), allowing us to write Eq. (1) as

\[ (\xi^{(k)}(x), \theta) = 0, \qquad k = 1, \dots, L, \tag{2} \]

where and hereafter (a, b) denotes the inner product of vectors a and b.

Fig. 1 (a) Fitting an ellipse to a point sequence. (b) Computing the fundamental matrix from corresponding points between two images. (c) Computing a homography between two images.

The vector ξ^(k)(x) represents a nonlinear mapping of x.

Example 1 (Ellipse fitting). Given a point sequence (x_α, y_α), α = 1, ..., N, we wish to fit an ellipse of the form

\[ Ax^2 + 2Bxy + Cy^2 + 2(Dx + Ey) + F = 0 \tag{3} \]

(Fig. 1(a)). If we let

\[ \xi = (x^2, 2xy, y^2, 2x, 2y, 1)^\top, \qquad \theta = (A, B, C, D, E, F)^\top, \tag{4} \]

the constraint in Eq. (3) has the form of Eq. (2) with L = 1.
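For illustration (a sketch of ours, not part of the paper), the mapping of Eq. (4) in Python; points on the unit circle (A = C = 1, F = −1, other coefficients 0) satisfy (ξ, θ) = 0 exactly:

```python
import numpy as np

def xi_ellipse(x, y):
    """The nonlinear mapping xi(x) of Eq. (4) for ellipse fitting."""
    return np.array([x**2, 2*x*y, y**2, 2*x, 2*y, 1.0])

# The unit circle corresponds to theta = (1, 0, 1, 0, 0, -1)
theta = np.array([1.0, 0.0, 1.0, 0.0, 0.0, -1.0])
for t in np.linspace(0.0, np.pi/2, 5):
    x, y = np.cos(t), np.sin(t)
    assert abs(xi_ellipse(x, y) @ theta) < 1e-12  # (xi, theta) = 0 holds
```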

Example 2 (Fundamental matrix computation). Corresponding points (x, y) and (x′, y′) in two images of the same 3-D scene taken from different positions satisfy the epipolar equation7)

\[ (x, Fx') = 0, \qquad x \equiv (x, y, 1)^\top, \quad x' \equiv (x', y', 1)^\top, \tag{5} \]

where F is called the fundamental matrix, from which we can compute the camera positions and the 3-D structure of the scene7),8) (Fig. 1(b)). If we let

\[ \xi = (xx', xy', x, yx', yy', y, x', y', 1)^\top, \qquad \theta = (F_{11}, F_{12}, F_{13}, F_{21}, F_{22}, F_{23}, F_{31}, F_{32}, F_{33})^\top, \tag{6} \]

the constraint in Eq. (5) has the form of Eq. (2) with L = 1.

Example 3 (Homography computation). Two images of a planar or infinitely far away scene are related by a homography of the form

\[ x' \simeq Hx, \qquad x \equiv (x, y, 1)^\top, \quad x' \equiv (x', y', 1)^\top, \tag{7} \]

where H is a nonsingular matrix, and ≃ denotes equality up to a nonzero multiplier7),8) (Fig. 1(c)). We can alternatively express Eq. (7) as the vector product equality

\[ x' \times Hx = 0. \tag{8} \]

If we let

\[ \begin{aligned} \xi^{(1)} &= (0, 0, 0, -x, -y, -1, xy', yy', y')^\top, \\ \xi^{(2)} &= (x, y, 1, 0, 0, 0, -xx', -yx', -x')^\top, \\ \xi^{(3)} &= (-xy', -yy', -y', xx', yx', x', 0, 0, 0)^\top, \end{aligned} \tag{9} \]

\[ \theta = (H_{11}, H_{12}, H_{13}, H_{21}, H_{22}, H_{23}, H_{31}, H_{32}, H_{33})^\top, \tag{10} \]

the three components of Eq. (8) have the form of Eq. (2) with L = 3. Note that ξ^(1), ξ^(2), and ξ^(3) in Eq. (9) are linearly dependent; only two of them are independent.
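As a sanity check (ours, not the paper's), the following Python sketch builds the three vectors of Eq. (9) and verifies the linear dependence ξ^(3) = −(x′ξ^(1) + y′ξ^(2)):

```python
import numpy as np

def xi_homography(x, y, xp, yp):
    """The three vectors xi^(1), xi^(2), xi^(3) of Eq. (9);
    (xp, yp) stands for (x', y')."""
    xi1 = np.array([0, 0, 0, -x, -y, -1, x*yp, y*yp, yp], float)
    xi2 = np.array([x, y, 1, 0, 0, 0, -x*xp, -y*xp, -xp], float)
    xi3 = np.array([-x*yp, -y*yp, -yp, x*xp, y*xp, xp, 0, 0, 0], float)
    return xi1, xi2, xi3

xi1, xi2, xi3 = xi_homography(1.0, 2.0, 3.0, 4.0)
assert np.allclose(xi3, -(3.0*xi1 + 4.0*xi2))  # only two are independent
```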

3. Statistical Model of Observation

Before proceeding to the error analysis of the above problems, we need to introduce a statistical model of observation. We regard each datum x_α as perturbed from its true value x̄_α by ∆x_α, which we assume to be independent Gaussian noise of mean 0 and covariance matrix V[x_α]. We do not impose any restrictions on the true values x̄_α except that they should satisfy Eq. (1). This is known as a functional model. We could alternatively introduce some statistical model according to which the true values x̄_α are sampled. Then, the model is called structural. This distinction is crucial when we consider limiting processes in the following sense10).

Conventional statistical analysis mainly focuses on the asymptotic behavior as the number of observations increases to ∞. This is based on the reasoning that the mechanism underlying noisy observations would better reveal itself as the number of observations increases (the law of large numbers), while the number of available data is limited in practice. So, the estimation accuracy vs. the number of data is a major concern. In this light, efforts have been made to obtain a consistent estimator for fitting an ellipse to noisy data or computing the fundamental matrix from noisy point correspondences, such that the solution approaches its true value in the limit N → ∞ of the number N of the data17),18).

In image processing applications, in contrast, one cannot "repeat" observations. One makes an inference given a single set of images, and no matter how many times one applies image processing operations, the result is always the same, because standard image processing algorithms are deterministic; no randomness is involved. This is in stark contrast to conventional statistical problems, where we view observations as "samples" from potentially infinitely many possibilities and could obtain, by repeating observations, different values originating from unknown, uncontrollable, or unmodeled causes, which we call "noise" as a whole.

In image-based applications, the accuracy of inference deteriorates as the uncertainty of image processing operations increases. Thus, the inference accuracy vs. the uncertainty of image operations, which we call "noise" for simplicity, is a major concern. Usually, the noise is very small, often at the subpixel level. In light of this observation, it has been pointed out that in image domains the "consistency" of estimators should more appropriately be defined by the behavior in the limit σ → 0 of the noise level σ3),10).

In this paper, we are interested in image processing applications and focus on the perturbation analysis around σ = 0 with the number N of data fixed. Thus, the functional model suits our purpose. If we want to analyze the error behavior in the limit N → ∞, we need to assume some structural model that specifies how the statistical characteristics of the data depend on N. The derivation of consistent estimators for N → ∞ is based on such an assumption17),18). However, it is difficult to predict the noise characteristics for different N. Image processing filters usually output a list of points or lines or their correspondences along with their confidence values, from which we use only those with high confidence. If we want to collect a lot of data, we necessarily need to include those with low confidence, but their statistical properties are hard to estimate, since such data are possibly misdetections. This is where image processing differs most from laboratory experiments, in which any number of data can be collected by repeated trials.

4. Maximum Likelihood for Geometric Fitting

Under the Gaussian noise model, maximum likelihood (ML) for our problem can be written as the minimization of the Mahalanobis distance

\[ I = \sum_{\alpha=1}^{N} (\bar{x}_\alpha - x_\alpha, V[x_\alpha]^{-1}(\bar{x}_\alpha - x_\alpha)), \tag{11} \]

with respect to x̄_α subject to the constraint that

\[ (\xi^{(k)}(\bar{x}_\alpha), \theta) = 0, \qquad k = 1, \dots, L, \tag{12} \]

for some θ. If the noise is homogeneous and isotropic, Eq. (11) is the sum of the squares of the geometric distances between the observations x_α and their true values x̄_α, often referred to as the reprojection error7). That name originates from the following intuition: we infer the 3-D structure of the scene from its projected images, and when the inferred 3-D structure is "reprojected" onto the images, Eq. (11) measures the discrepancy between the "reprojections" of our solution and the actual observations.

In statistics, ML is criticized for its lack of consistency17),18). In fact, estimation of the true values x̄_α, called nuisance parameters when viewed as parameters, is not consistent as N → ∞ in the ML framework, as pointed out by Neyman and Scott21) as early as 1948. As discussed in the preceding section, however, the lack of consistency has no realistic meaning in vision applications. On the contrary, ML has very desirable properties in the limit σ → 0 of the noise level σ: the solution is "consistent" in the sense that it converges to the true value as σ → 0 and "efficient" in the sense that its covariance matrix approaches a theoretical lower bound as σ → 03),8),10).

According to the experience of many vision researchers, ML is known to produce highly accurate solutions7), and little need is felt for further accuracy improvement. Rather, a major concern is its computational burden, because ML usually requires complicated nonlinear optimization.


The standard approach is to introduce some auxiliary parameters to express each of the x̄_α explicitly in terms of θ and the auxiliary parameters. After they are substituted back into Eq. (11), the Mahalanobis distance I becomes a function of θ and the auxiliary parameters. Then, this joint parameter space, which usually has very high dimensions, is searched for the minimum. This approach is called bundle adjustment7),27), a term originally used by photogrammetrists. It is very time consuming, in particular if one seeks a globally optimal solution by searching the entire parameter space exhaustively6).

A popular alternative to bundle adjustment is minimization of a function of θ alone, called the Sampson error7), which approximates the minimum of Eq. (11) for a given θ (the actual expression is shown in Section 6). Kanatani and Sugaya16) showed that the exact ML solution can be obtained by repeating Sampson error minimization, each time modifying the Sampson error so that in the end the modified Sampson error coincides with the Mahalanobis distance. It turns out that in many practical applications the solution that minimizes the Sampson error coincides with the exact ML solution up to several significant digits; usually, two or three rounds of Sampson error modification are sufficient11),14),15).

However, minimizing the Sampson error is not straightforward. Many numerical schemes have been proposed, including the FNS (Fundamental Numerical Scheme) of Chojnacki et al.4), the HEIV (Heteroscedastic Errors-in-Variables) of Leedan and Meer19) and Matei and Meer20), and the projective Gauss-Newton iterations of Kanatani and Sugaya13). All these rely on local search, but the iterations do not always converge if not started from a value sufficiently close to the solution. Hence, accurate approximation schemes that do not require iterations are very much desired, even though the solution may not be optimal, and various algebraic methods have been studied in the past.

5. Algebraic Fitting

For the sake of brevity, we abbreviate ξ^(k)(x_α) as ξ^(k)_α. Algebraic fitting refers to minimizing the algebraic distance

\[ J = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} (\xi^{(k)}_\alpha, \theta)^2 = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \theta^\top \xi^{(k)}_\alpha \xi^{(k)\top}_\alpha \theta = (\theta, M\theta), \tag{13} \]

where we define

\[ M = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \xi^{(k)}_\alpha \xi^{(k)\top}_\alpha. \tag{14} \]

Equation (13) is trivially minimized by θ = 0 unless some scale normalization is imposed on θ. The most common normalization is ‖θ‖ = 1, which we call the standard LS. However, the solution depends on the normalization. So, we naturally ask: what normalization will maximize the accuracy of the solution? This question was raised first by Al-Sharadqah and Chernov1) and Rangarajan and Kanatani23) for circle fitting, then by Kanatani and Rangarajan12) for ellipse fitting and by Niitsuma et al.22) for homography computation. In this paper, we generalize these results to an arbitrary number of constraints. Following these authors1),12),22),23), we consider the class of normalizations

\[ (\theta, N\theta) = c, \tag{15} \]

with some symmetric matrix N and a nonzero constant c. In Eq. (15), θ is the optimization parameter and N is an unknown matrix to be determined, while the constant c is fixed for the problem. We need not specify the value of c, because N is unknown. Since Eq. (15) can be written as (θ, (N/c)θ) = 1, we may determine N′ = N/c instead of N, but the form of Eq. (15) with c unspecified is more convenient in our analysis.

Traditionally, the matrix N is positive definite or semidefinite, but in the following we allow N to be nondefinite (i.e., neither positive nor negative definite), so the constant c in Eq. (15) is not necessarily positive. The standard treatment of algebraic fitting goes as follows. Given the matrix N, the solution θ that minimizes Eq. (13) subject to Eq. (15), if it exists, is given by the solution of the generalized eigenvalue problem

\[ M\theta = \lambda N\theta. \tag{16} \]

Note that M is always positive semidefinite from its definition in Eq. (14). If there is no noise in the data, we have (θ, ξ^(k)_α) = 0 for all k and α. Hence, Eq. (14) implies Mθ = 0, so λ = 0. In the presence of noise, M is positive definite, so λ is positive whether N is positive definite or semidefinite. The corresponding solution is obtained as the eigenvector θ for the smallest λ. For the standard LS, for which N = I, Eq. (16) becomes an ordinary eigenvalue problem

\[ M\theta = \lambda\theta, \tag{17} \]

and the solution is the unit eigenvector θ of M for the smallest eigenvalue λ.
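For concreteness, here is a minimal numpy sketch of the standard LS of Eq. (17) (our illustration; the paper gives no code). It assumes the vectors ξ^(k)_α have been stacked into an array:

```python
import numpy as np

def standard_ls(xis):
    """Standard LS: unit eigenvector of M for the smallest eigenvalue (Eq. (17)).

    xis: array of shape (N, L, n) holding the vectors xi^(k)_alpha.
    """
    N, L, n = xis.shape
    M = np.zeros((n, n))
    for alpha in range(N):
        for k in range(L):
            M += np.outer(xis[alpha, k], xis[alpha, k])
    M /= N                      # M of Eq. (14)
    w, V = np.linalg.eigh(M)    # eigenvalues in ascending order
    return V[:, 0]              # eigenvector for the smallest eigenvalue
```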

This is the traditional treatment of algebraic fitting, but the situation is slightly different here: N is not yet given and can be nondefinite, and the eigenvalues of Eq. (16) may not all be positive. So, we face the problem of which eigenvalues and eigenvectors of Eq. (16) to choose as a solution. In the following, we do a perturbation analysis10) of Eq. (16) by assuming that λ ≈ 0 and choose the solution to be the eigenvector θ for the λ with the smallest absolute value, although in theory there remains a possibility that another choice happens to produce a better result in some cases. We also regard Eq. (16) as the definition of "algebraic fitting," rather than Eq. (13) and Eq. (15). This is because, while Eq. (16) always has a solution, Eq. (13) may not be minimized subject to Eq. (15) by a finite θ. This can occur, for example, when the contour of (θ, Mθ), which is a hyperellipsoid in the space of θ, happens to be elongated in a direction in the null space of N. Then, the minimum of (θ, Mθ) could be reached only in the limit ‖θ‖ → ∞. Note that Eq. (15) is unable to normalize the norm ‖θ‖ to a finite value if N has a null space. Theoretically, such an anomaly can always occur because M is a random variable defined by noisy data, and even if the probability of such an occurrence is nearly 0, it may still lead to E[‖θ̂‖] = ∞2). Once the problem is converted to Eq. (16), for which eigenvectors θ have scale indeterminacy, we can adopt the normalization ‖θ‖ = 1 rather than Eq. (15). Then, the solution θ is always a unit vector.

6. Error Analysis

We can expand each ξ^(k)_α in the form

\[ \xi^{(k)}_\alpha = \bar{\xi}^{(k)}_\alpha + \Delta_1\xi^{(k)}_\alpha + \Delta_2\xi^{(k)}_\alpha + \cdots, \tag{18} \]

where ξ̄^(k)_α is the noiseless value, and ∆_iξ^(k)_α is the ith order term in ∆x_α. The first order term is written as

\[ \Delta_1\xi^{(k)}_\alpha = T^{(k)}_\alpha \Delta x_\alpha, \qquad T^{(k)}_\alpha \equiv \left.\frac{\partial \xi^{(k)}(x)}{\partial x}\right|_{x=\bar{x}_\alpha}. \tag{19} \]

We define the covariance matrices of ξ^(k)_α, k = 1, ..., L, by

\[ V^{(kl)}[\xi_\alpha] \equiv E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\alpha] = T^{(k)}_\alpha E[\Delta x_\alpha \Delta x_\alpha^\top] T^{(l)\top}_\alpha = T^{(k)}_\alpha V[x_\alpha] T^{(l)\top}_\alpha, \tag{20} \]

where E[·] denotes expectation.

The Sampson error that we mentioned in Section 4, which approximates the minimum of the Mahalanobis distance in Eq. (11) subject to the constraints in Eq. (12), has the following form7),8):

\[ K(\theta) = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} W^{(kl)}_\alpha (\xi^{(k)}_\alpha, \theta)(\xi^{(l)}_\alpha, \theta). \tag{21} \]

Here, W^(kl)_α is the (kl) element of (V_α)⁻_r, where V_α is the matrix whose (kl) element is

\[ (\theta, V^{(kl)}[\xi_\alpha]\theta), \tag{22} \]

with the true data values x̄_α in the definition of V^(kl)[ξ_α] replaced by their observations x_α. The operation (·)⁻_r denotes the pseudoinverse of truncated rank r (i.e., with all eigenvalues except the largest r replaced by 0 in the spectral decomposition), and r is the rank (the number of independent equations) of the constraint in Eq. (12). The name "Sampson error" stems from the classical ellipse fitting scheme of Sampson25). For given ξ^(k)_α, Eq. (21) can be minimized by various means including the FNS4), HEIV19),20), and the projective Gauss-Newton iterations13).
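For the single-constraint case (L = 1, r = 1), Eq. (21) reduces to K(θ) = (1/N)Σ_α (ξ_α, θ)²/(θ, V^(11)[ξ_α]θ). The following numpy sketch of this reduction is ours; it assumes homogeneous isotropic noise, V[x_α] = I (the noise need be known only up to scale), and user-supplied functions xi(x) and jacobian_xi(x) implementing the mapping and its Jacobian T_α of Eq. (19):

```python
import numpy as np

def sampson_error(theta, xs, xi, jacobian_xi):
    """Sampson error of Eq. (21) for a single constraint (L = 1, r = 1).

    xs: (N, d) array of observations x_alpha;
    xi(x): mapping R^d -> R^n; jacobian_xi(x): its (n, d) Jacobian T_alpha.
    """
    K = 0.0
    for x in xs:
        T = jacobian_xi(x)        # T_alpha of Eq. (19)
        V_xi = T @ T.T            # Eq. (20) with V[x_alpha] = I
        K += (xi(x) @ theta) ** 2 / (theta @ V_xi @ theta)
    return K / len(xs)
```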

Example 4 (Ellipse fitting). For the ellipse fitting in Example 1, the first order error ∆₁ξ is written as

\[ \Delta_1\xi_\alpha = 2\begin{pmatrix} \bar{x}_\alpha & \bar{y}_\alpha & 0 & 1 & 0 & 0 \\ 0 & \bar{x}_\alpha & \bar{y}_\alpha & 0 & 1 & 0 \end{pmatrix}^\top \begin{pmatrix} \Delta x_\alpha \\ \Delta y_\alpha \end{pmatrix}. \tag{23} \]

The second order error ∆₂ξ_α has the following form:

\[ \Delta_2\xi_\alpha = (\Delta x_\alpha^2,\; 2\Delta x_\alpha \Delta y_\alpha,\; \Delta y_\alpha^2,\; 0,\; 0,\; 0)^\top. \tag{24} \]
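In code, the Jacobian implied by Eq. (23) could look as follows (our sketch, pairing with the Sampson error function above; the argument is a length-2 point array):

```python
import numpy as np

def jacobian_xi_ellipse(p):
    """T_alpha of Eq. (23): the (6, 2) Jacobian of xi at (x, y)."""
    x, y = p
    return 2.0 * np.array([[x, 0.0],
                           [y, x],
                           [0.0, y],
                           [1.0, 0.0],
                           [0.0, 1.0],
                           [0.0, 0.0]])
```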

Example 5 (Fundamental matrix computation). For the fundamental matrix computation in Example 2, the first order error ∆₁ξ is written as

\[ \Delta_1\xi_\alpha = \begin{pmatrix} \bar{x}'_\alpha & \bar{y}'_\alpha & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \bar{x}'_\alpha & \bar{y}'_\alpha & 1 & 0 & 0 & 0 \\ \bar{x}_\alpha & 0 & 0 & \bar{y}_\alpha & 0 & 0 & 1 & 0 & 0 \\ 0 & \bar{x}_\alpha & 0 & 0 & \bar{y}_\alpha & 0 & 0 & 1 & 0 \end{pmatrix}^\top \begin{pmatrix} \Delta x_\alpha \\ \Delta y_\alpha \\ \Delta x'_\alpha \\ \Delta y'_\alpha \end{pmatrix}. \tag{25} \]

The second order error ∆₂ξ_α has the following form:

\[ \Delta_2\xi_\alpha = (\Delta x_\alpha \Delta x'_\alpha,\; \Delta x_\alpha \Delta y'_\alpha,\; 0,\; \Delta y_\alpha \Delta x'_\alpha,\; \Delta y_\alpha \Delta y'_\alpha,\; 0,\; 0,\; 0,\; 0)^\top. \tag{26} \]

Example 6 (Homography computation). For the homography computation in Example 3, the first order errors ∆₁ξ^(k) are written as

\[ \Delta_1\xi^{(1)}_\alpha = \begin{pmatrix} 0 & 0 & 0 & -1 & 0 & 0 & \bar{y}'_\alpha & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & 0 & \bar{y}'_\alpha & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \bar{x}_\alpha & \bar{y}_\alpha & 1 \end{pmatrix}^\top \begin{pmatrix} \Delta x_\alpha \\ \Delta y_\alpha \\ \Delta x'_\alpha \\ \Delta y'_\alpha \end{pmatrix}, \]

\[ \Delta_1\xi^{(2)}_\alpha = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & -\bar{x}'_\alpha & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & -\bar{x}'_\alpha & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -\bar{x}_\alpha & -\bar{y}_\alpha & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}^\top \begin{pmatrix} \Delta x_\alpha \\ \Delta y_\alpha \\ \Delta x'_\alpha \\ \Delta y'_\alpha \end{pmatrix}, \]

\[ \Delta_1\xi^{(3)}_\alpha = \begin{pmatrix} -\bar{y}'_\alpha & 0 & 0 & \bar{x}'_\alpha & 0 & 0 & 0 & 0 & 0 \\ 0 & -\bar{y}'_\alpha & 0 & 0 & \bar{x}'_\alpha & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \bar{x}_\alpha & \bar{y}_\alpha & 1 & 0 & 0 & 0 \\ -\bar{x}_\alpha & -\bar{y}_\alpha & -1 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}^\top \begin{pmatrix} \Delta x_\alpha \\ \Delta y_\alpha \\ \Delta x'_\alpha \\ \Delta y'_\alpha \end{pmatrix}. \tag{27} \]

The second order errors ∆₂ξ^(k)_α have the following form:

\[ \begin{aligned} \Delta_2\xi^{(1)}_\alpha &= (0, 0, 0, 0, 0, 0, \Delta x_\alpha \Delta y'_\alpha, \Delta y_\alpha \Delta y'_\alpha, 0)^\top, \\ \Delta_2\xi^{(2)}_\alpha &= (0, 0, 0, 0, 0, 0, -\Delta x'_\alpha \Delta x_\alpha, -\Delta x'_\alpha \Delta y_\alpha, 0)^\top, \\ \Delta_2\xi^{(3)}_\alpha &= (-\Delta y'_\alpha \Delta x_\alpha, -\Delta y'_\alpha \Delta y_\alpha, 0, \Delta x'_\alpha \Delta x_\alpha, \Delta x'_\alpha \Delta y_\alpha, 0, 0, 0, 0)^\top. \end{aligned} \tag{28} \]

7. Perturbation Analysis

Substituting Eq. (18) into Eq. (14), we obtain

\[ M = \bar{M} + \Delta_1 M + \Delta_2 M + \cdots, \tag{29} \]

where

\[ \bar{M} = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha, \tag{30} \]

\[ \Delta_1 M = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \left( \bar{\xi}^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha + \Delta_1\xi^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha \right), \tag{31} \]

\[ \Delta_2 M = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \left( \bar{\xi}^{(k)}_\alpha \Delta_2\xi^{(k)\top}_\alpha + \Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha + \Delta_2\xi^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha \right). \tag{32} \]

We also expand the solution θ and λ of Eq. (16) in the form

\[ \theta = \bar{\theta} + \Delta_1\theta + \Delta_2\theta + \cdots, \qquad \lambda = \bar{\lambda} + \Delta_1\lambda + \Delta_2\lambda + \cdots. \tag{33} \]

Substituting Eq. (29) and Eq. (33) into Eq. (16), we have

\[ (\bar{M} + \Delta_1 M + \Delta_2 M + \cdots)(\bar{\theta} + \Delta_1\theta + \Delta_2\theta + \cdots) = (\bar{\lambda} + \Delta_1\lambda + \Delta_2\lambda + \cdots)\, N\, (\bar{\theta} + \Delta_1\theta + \Delta_2\theta + \cdots). \tag{34} \]

Note that N is a variable to be determined, not a given function of observations, so it is not expanded. Since we consider perturbations near the true values, the resulting matrix N may be a function of the true data values. In that event, we replace the true data values by their observations and do an a posteriori analysis to see how this affects the accuracy. For the moment, we regard N as an unknown variable. From a strictly mathematical point of view, the two sides of Eq. (34) may not define an absolutely convergent series expansion. Here, we do not go into such a theoretical question; we simply test the usefulness of the final results by experiments a posteriori, as commonly done in physics and engineering. At any rate, we are concerned with only up to the second order terms in the subsequent analysis.

Equating terms of the same order in Eq. (34), we obtain

\[ \bar{M}\bar{\theta} = \bar{\lambda} N \bar{\theta}, \tag{35} \]

\[ \bar{M}\Delta_1\theta + \Delta_1 M \bar{\theta} = \bar{\lambda} N \Delta_1\theta + \Delta_1\lambda\, N \bar{\theta}, \tag{36} \]

\[ \bar{M}\Delta_2\theta + \Delta_1 M \Delta_1\theta + \Delta_2 M \bar{\theta} = \bar{\lambda} N \Delta_2\theta + \Delta_1\lambda\, N \Delta_1\theta + \Delta_2\lambda\, N \bar{\theta}. \tag{37} \]

We have M̄θ̄ = 0 for the true values, so λ̄ = 0. From Eq. (31), we have (θ̄, ∆₁Mθ̄) = 0. Computing the inner product of Eq. (36) with θ̄ on both sides, we see that ∆₁λ = 0. Multiplying Eq. (36) by the pseudoinverse M̄⁻ of M̄ from the left, we obtain

\[ \Delta_1\theta = -\bar{M}^- \Delta_1 M \bar{\theta}. \tag{38} \]

Note that since M̄θ̄ = 0, the matrix M̄⁻M̄ (≡ P_θ̄) is the projection operator in the direction orthogonal to θ̄. Also, equating the first order terms in the expansion ‖θ̄ + ∆₁θ + ∆₂θ + ···‖² = 1 shows (θ̄, ∆₁θ) = 010), hence P_θ̄∆₁θ = ∆₁θ. Substituting Eq. (38) into Eq. (37) and computing its inner product with θ̄ on both sides, we obtain

\[ \Delta_2\lambda = \frac{(\bar{\theta}, \Delta_2 M \bar{\theta}) - (\bar{\theta}, \Delta_1 M \bar{M}^- \Delta_1 M \bar{\theta})}{(\bar{\theta}, N\bar{\theta})} = \frac{(\bar{\theta}, T\bar{\theta})}{(\bar{\theta}, N\bar{\theta})}, \tag{39} \]

where we put

\[ T = \Delta_2 M - \Delta_1 M \bar{M}^- \Delta_1 M. \tag{40} \]

Next, we consider the second order error ∆₂θ. Since θ is normalized to unit norm, we are interested in the error component orthogonal to θ̄. So, we consider

\[ \Delta_2^\perp\theta \equiv P_{\bar{\theta}}\Delta_2\theta \;(= \bar{M}^-\bar{M}\Delta_2\theta). \tag{41} \]

Multiplying Eq. (37) by M̄⁻ from the left and substituting Eq. (38), we obtain

\[ \Delta_2^\perp\theta = \Delta_2\lambda\, \bar{M}^- N\bar{\theta} + \bar{M}^-\Delta_1 M \bar{M}^-\Delta_1 M \bar{\theta} - \bar{M}^-\Delta_2 M \bar{\theta} = \frac{(\bar{\theta}, T\bar{\theta})}{(\bar{\theta}, N\bar{\theta})}\,\bar{M}^- N\bar{\theta} - \bar{M}^- T\bar{\theta}. \tag{42} \]

8. Covariance and Bias

8.1 Covariance Analysis

From Eq. (38), the covariance matrix V[θ] of the solution θ has the leading term

\[ \begin{aligned} V[\theta] &= E[\Delta_1\theta\,\Delta_1\theta^\top] = \bar{M}^- E[(\Delta_1 M\bar{\theta})(\Delta_1 M\bar{\theta})^\top]\,\bar{M}^- \\ &= \frac{1}{N^2}\bar{M}^- E\Big[\sum_{\alpha=1}^{N}\sum_{k=1}^{L}(\Delta_1\xi^{(k)}_\alpha, \bar{\theta})\,\bar{\xi}^{(k)}_\alpha \sum_{\beta=1}^{N}\sum_{l=1}^{L}(\Delta_1\xi^{(l)}_\beta, \bar{\theta})\,\bar{\xi}^{(l)\top}_\beta\Big]\bar{M}^- \\ &= \frac{1}{N^2}\bar{M}^- \sum_{\alpha,\beta=1}^{N}\sum_{k,l=1}^{L} (\bar{\theta}, E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta]\,\bar{\theta})\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta\; \bar{M}^- \\ &= \frac{1}{N^2}\bar{M}^- \Big(\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} (\bar{\theta}, V^{(kl)}[\xi_\alpha]\bar{\theta})\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha\Big) \bar{M}^- \\ &= \frac{1}{N}\bar{M}^- \bar{M}' \bar{M}^-, \end{aligned} \tag{43} \]

where we define

\[ \bar{M}' = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} (\bar{\theta}, V^{(kl)}[\xi_\alpha]\bar{\theta})\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha. \tag{44} \]

In the above derivation, we have noted that ∆₁Mθ̄ = (1/N)Σ_α Σ_k (∆₁ξ^(k)_α, θ̄)ξ̄^(k)_α, since (ξ̄^(k)_α, θ̄) = 0, and that from our noise assumption we have E[∆₁ξ^(k)_α ∆₁ξ^(l)⊤_β] = δ_αβ V^(kl)[ξ_α], where δ_αβ is the Kronecker delta.

8.2 Bias Analysis

The important observation is that the covariance matrix V[θ] does not contain N. Thus, all algebraic methods have the same covariance matrix in the leading order, as pointed out by Al-Sharadqah and Chernov1) for circle fitting. This observation leads us to focus on the bias. We now seek an N that reduces the bias as much as possible. It would be desirable if we could find an N that minimizes the total mean square error E[‖∆₁θ + ∆₂θ + ···‖²], but at the moment this seems to be an intractable problem; minimizing the bias alone is a practical compromise, whose effectiveness is tested by experiments a posteriori.

From Eq. (38), we see that the first order bias E[∆₁θ] is 0, hence the leading bias is E[∆₂⊥θ]. From Eq. (42), we have

\[ E[\Delta_2^\perp\theta] = \frac{(\bar{\theta}, E[T]\bar{\theta})}{(\bar{\theta}, N\bar{\theta})}\,\bar{M}^- N\bar{\theta} - \bar{M}^- E[T]\bar{\theta}. \tag{45} \]

We now evaluate the expectation E[T] of T in Eq. (40). From Eq. (32), we see that E[∆₂M] is given by

\[ \begin{aligned} E[\Delta_2 M] &= \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \left( \bar{\xi}^{(k)}_\alpha E[\Delta_2\xi^{(k)}_\alpha]^\top + E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha] + E[\Delta_2\xi^{(k)}_\alpha]\,\bar{\xi}^{(k)\top}_\alpha \right) \\ &= \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \left( V^{(kk)}[\xi_\alpha] + 2S[\bar{\xi}^{(k)}_\alpha e^{(k)\top}_\alpha] \right), \end{aligned} \tag{46} \]


where we have used Eq. (20) and defined

\[ e^{(k)}_\alpha \equiv E[\Delta_2\xi^{(k)}_\alpha]. \tag{47} \]

The operator S[·] denotes symmetrization (S[A] = (A + A⊤)/2). The expectation E[∆₁M M̄⁻∆₁M] has the following form (see Appendix):

\[ E[\Delta_1 M \bar{M}^- \Delta_1 M] = \frac{1}{N^2}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \Big( \mathrm{tr}[\bar{M}^- V^{(kl)}[\xi_\alpha]]\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\alpha)\; V^{(kl)}[\xi_\alpha] + 2S[V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha] \Big). \tag{48} \]

From Eq. (46) and Eq. (48), the expectation of T is

\[ E[T] = N_T - \frac{1}{N^2}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \Big( \mathrm{tr}[\bar{M}^- V^{(kl)}[\xi_\alpha]]\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\alpha)\; V^{(kl)}[\xi_\alpha] + 2S[V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha] \Big), \tag{49} \]

where we put

\[ N_T = \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L} \left( V^{(kk)}[\xi_\alpha] + 2S[\bar{\xi}^{(k)}_\alpha e^{(k)\top}_\alpha] \right). \tag{50} \]

9. HyperLS

Now, let us choose N to be the expression E[T] in Eq. (49) itself, namely,

\[ N = N_T - \frac{1}{N^2}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \Big( \mathrm{tr}[\bar{M}^- V^{(kl)}[\xi_\alpha]]\; \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\alpha)\; V^{(kl)}[\xi_\alpha] + 2S[V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha] \Big). \tag{51} \]

Letting N = E[T] in Eq. (45), we see that

\[ E[\Delta_2^\perp\theta] = \bar{M}^- \Big( \frac{(\bar{\theta}, N\bar{\theta})}{(\bar{\theta}, N\bar{\theta})}\, N - N \Big)\bar{\theta} = 0. \tag{52} \]

Since the right-hand side of Eq. (49) contains the true values ξ̄_α and M̄, we replace x̄_α in their definitions by the observations x_α. This does not affect the result, since the odd order noise terms have expectation 0, and hence the resulting error in E[∆₂⊥θ] is of the fourth order. Thus, the second order bias is exactly 0.

In fitting a circle to a point sequence, Al-Sharadqah and Chernov1) proposed to choose N = N_T and showed that the second order bias E[∆₂⊥θ] is zero up to O(1/N²). They called their method Hyper. What we have shown here is that the second order bias is completely removed by including the second term on the right-hand side of Eq. (51). We call our scheme HyperLS.

Note that N has scale indeterminacy: if N is multiplied by c (≠ 0), Eq. (16) has the same solution θ; only λ is divided by c. Thus, the noise characteristics V^(kl)[ξ_α] in Eq. (20), and hence V[x_α], need to be known only up to scale; we need not know the absolute magnitude of the noise.
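To make Eq. (51) concrete, the following numpy sketch (ours, not from the paper) assembles N with the true values replaced by observations; the rank-(n − 1) truncated pseudoinverse mirrors the rank deficiency of M̄:

```python
import numpy as np

def hyperls_N(xis, V, e):
    """The HyperLS matrix N of Eq. (51), true values replaced by observations.

    xis: (N, L, n) vectors xi^(k)_alpha
    V:   (N, L, L, n, n) covariances V^(kl)[xi_alpha] (Eq. (20), sigma = 1)
    e:   (N, L, n) second order expectations e^(k)_alpha (Eq. (47))
    """
    Nd, L, n = xis.shape
    M = np.einsum('aki,akj->ij', xis, xis) / Nd        # M of Eq. (14)
    w, U = np.linalg.eigh(M)
    Minv = (U[:, 1:] / w[1:]) @ U[:, 1:].T             # pseudoinverse of rank n - 1
    N_mat = np.zeros((n, n))
    for a in range(Nd):
        for k in range(L):
            # N_T of Eq. (50): V^(kk) + 2 S[xi e^T]
            N_mat += (V[a, k, k] + np.outer(xis[a, k], e[a, k])
                      + np.outer(e[a, k], xis[a, k])) / Nd
            for l in range(L):
                xkl = np.outer(xis[a, k], xis[a, l])
                A = V[a, k, l] @ Minv @ xkl
                # second order correction term of Eq. (51)
                N_mat -= (np.trace(Minv @ V[a, k, l]) * xkl
                          + (xis[a, k] @ Minv @ xis[a, l]) * V[a, k, l]
                          + A + A.T) / Nd**2
    return N_mat
```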

For numerical computation, standard linear algebra routines for solving the generalized eigenvalue problem of Eq. (16) assume that N is positive definite, but here N is nondefinite. This causes no problem, because Eq. (16) can be written as

\[ N\theta = \frac{1}{\lambda} M\theta. \tag{53} \]

As mentioned earlier, the matrix M in Eq. (14) is positive definite for noisy data, so we can solve Eq. (53) instead, using a standard routine. If the smallest eigenvalue of M happens to be 0, it indicates that the data are all exact, so any method, e.g., the standard LS, gives an exact solution. For noisy data, the solution θ is given by the eigenvector of Eq. (53) for the eigenvalue 1/λ with the largest absolute value.
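In code, this amounts to a single call to a symmetric-definite generalized eigensolver. A minimal sketch of ours using scipy (N symmetric and possibly indefinite, M positive definite):

```python
import numpy as np
from scipy.linalg import eigh

def solve_hyperls(M, N_mat):
    """Solve N theta = (1/lambda) M theta (Eq. (53)).

    eigh accepts an indefinite left-hand matrix as long as the right-hand
    matrix M is positive definite; the solution is the unit eigenvector for
    the eigenvalue 1/lambda of largest absolute value.
    """
    mu, V = eigh(N_mat, M)                 # real eigenvalues mu = 1/lambda
    theta = V[:, np.argmax(np.abs(mu))]
    return theta / np.linalg.norm(theta)
```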

Example 7 (Ellipse fitting). If the noise in (x_α, y_α) is independent and Gaussian with mean 0 and standard deviation σ, the vector e_α (= e^(1)_α) in Eq. (47) is given by

\[ e_\alpha = \sigma^2 (1, 0, 1, 0, 0, 0)^\top. \tag{54} \]

Hence, the matrix N_T in Eq. (50) is given by

\[ N_T = \frac{\sigma^2}{N}\sum_{\alpha=1}^{N} \begin{pmatrix} 6x_\alpha^2 & 6x_\alpha y_\alpha & x_\alpha^2 + y_\alpha^2 & 6x_\alpha & 2y_\alpha & 1 \\ 6x_\alpha y_\alpha & 4(x_\alpha^2 + y_\alpha^2) & 6x_\alpha y_\alpha & 4y_\alpha & 4x_\alpha & 0 \\ x_\alpha^2 + y_\alpha^2 & 6x_\alpha y_\alpha & 6y_\alpha^2 & 2x_\alpha & 6y_\alpha & 1 \\ 6x_\alpha & 4y_\alpha & 2x_\alpha & 4 & 0 & 0 \\ 2y_\alpha & 4x_\alpha & 6y_\alpha & 0 & 4 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 \end{pmatrix}. \tag{55} \]
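A direct transcription of Eq. (55) with σ = 1 (our sketch), usable as the first term of Eq. (51) or as the Taubin approximation discussed below:

```python
import numpy as np

def N_T_ellipse(points):
    """N_T of Eq. (55) for ellipse fitting, with sigma = 1."""
    NT = np.zeros((6, 6))
    for x, y in points:
        NT += np.array([
            [6*x*x,     6*x*y,         x*x + y*y, 6*x, 2*y, 1],
            [6*x*y,     4*(x*x + y*y), 6*x*y,     4*y, 4*x, 0],
            [x*x + y*y, 6*x*y,         6*y*y,     2*x, 6*y, 1],
            [6*x,       4*y,           2*x,       4,   0,   0],
            [2*y,       4*x,           6*y,       0,   4,   0],
            [1,         0,             1,         0,   0,   0]], float)
    return NT / len(points)
```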

The Taubin method26) uses as N

\[ N_{\mathrm{Taubin}} = \frac{4\sigma^2}{N}\sum_{\alpha=1}^{N} \begin{pmatrix} x_\alpha^2 & x_\alpha y_\alpha & 0 & x_\alpha & 0 & 0 \\ x_\alpha y_\alpha & x_\alpha^2 + y_\alpha^2 & x_\alpha y_\alpha & y_\alpha & x_\alpha & 0 \\ 0 & x_\alpha y_\alpha & y_\alpha^2 & 0 & y_\alpha & 0 \\ x_\alpha & y_\alpha & 0 & 1 & 0 & 0 \\ 0 & x_\alpha & y_\alpha & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \tag{56} \]

which we see is obtained by letting e_α = 0 in Eq. (50). As pointed out earlier, the value of σ in Eq. (55) and Eq. (56) need not be known. Hence, we can simply let σ = 1 in Eq. (16) and Eq. (53) in actual computation.

Example 8 (Fundamental matrix computation). If the noise in (x_α, y_α) and (x′_α, y′_α) is independent and Gaussian with mean 0 and standard deviation σ, the vector e_α (= e^(1)_α) in Eq. (47) is 0, so the N_T in Eq. (50) becomes

\[ N_T = \frac{\sigma^2}{N}\sum_{\alpha=1}^{N} \begin{pmatrix} x_\alpha^2 + x'^2_\alpha & x'_\alpha y'_\alpha & x'_\alpha & x_\alpha y_\alpha & 0 & 0 & x_\alpha & 0 & 0 \\ x'_\alpha y'_\alpha & x_\alpha^2 + y'^2_\alpha & y'_\alpha & 0 & x_\alpha y_\alpha & 0 & 0 & x_\alpha & 0 \\ x'_\alpha & y'_\alpha & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ x_\alpha y_\alpha & 0 & 0 & y_\alpha^2 + x'^2_\alpha & x'_\alpha y'_\alpha & x'_\alpha & y_\alpha & 0 & 0 \\ 0 & x_\alpha y_\alpha & 0 & x'_\alpha y'_\alpha & y_\alpha^2 + y'^2_\alpha & y'_\alpha & 0 & y_\alpha & 0 \\ 0 & 0 & 0 & x'_\alpha & y'_\alpha & 1 & 0 & 0 & 0 \\ x_\alpha & 0 & 0 & y_\alpha & 0 & 0 & 1 & 0 & 0 \\ 0 & x_\alpha & 0 & 0 & y_\alpha & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \tag{57} \]

It turns out that the use of this matrix N_T coincides with the well known Taubin method26). As in ellipse fitting, we can let σ = 1 in Eq. (57) in actual computation.

Example 9 (Homography computation). If the noise in (x_α, y_α) and (x′_α, y′_α) is independent and Gaussian with mean 0 and standard deviation σ, the vectors e^(k)_α in Eq. (47) are all 0, so the N_T in Eq. (50) becomes

\[ N_T = \frac{\sigma^2}{N}\sum_{\alpha=1}^{N} \begin{pmatrix} x_\alpha^2 + y'^2_\alpha + 1 & x_\alpha y_\alpha & x_\alpha & -x'_\alpha y'_\alpha & 0 & 0 & -x'_\alpha & 0 & 0 \\ x_\alpha y_\alpha & y_\alpha^2 + y'^2_\alpha + 1 & y_\alpha & 0 & -x'_\alpha y'_\alpha & 0 & 0 & -x'_\alpha & 0 \\ x_\alpha & y_\alpha & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ -x'_\alpha y'_\alpha & 0 & 0 & x_\alpha^2 + x'^2_\alpha + 1 & x_\alpha y_\alpha & x_\alpha & -y'_\alpha & 0 & 0 \\ 0 & -x'_\alpha y'_\alpha & 0 & x_\alpha y_\alpha & y_\alpha^2 + x'^2_\alpha + 1 & y_\alpha & 0 & -y'_\alpha & 0 \\ 0 & 0 & 0 & x_\alpha & y_\alpha & 1 & 0 & 0 & 0 \\ -x'_\alpha & 0 & 0 & -y'_\alpha & 0 & 0 & 2x_\alpha^2 + x'^2_\alpha + y'^2_\alpha & 2x_\alpha y_\alpha & 2x_\alpha \\ 0 & -x'_\alpha & 0 & 0 & -y'_\alpha & 0 & 2x_\alpha y_\alpha & 2y_\alpha^2 + x'^2_\alpha + y'^2_\alpha & 2y_\alpha \\ 0 & 0 & 0 & 0 & 0 & 0 & 2x_\alpha & 2y_\alpha & 2 \end{pmatrix}. \tag{58} \]

For homography computation, the constraint is the vector equation of Eq. (8). Hence, the Taubin method26), which is defined for a single constraint equation, cannot be applied. However, the use of the above N_T as N plays the same role as the Taubin method26) for ellipse fitting and fundamental matrix computation, as first pointed out by Rangarajan and Papamichalis24). As before, we can let σ = 1 in the matrix N_T in actual computation.

In the following, we call the use of N_T as N the Taubin approximation. For fundamental matrix computation, it coincides with the Taubin method26), but for homography computation the Taubin method was not defined. For ellipse fitting, the Taubin method and the Taubin approximation are slightly different; the Taubin method is equivalent to using only the first term on the right-hand side of Eq. (50). For circle fitting, the Taubin approximation is the same as the "Hyper" of Al-Sharadqah and Chernov1).

10. Numerical Experiments

We did the following three experiments:

Ellipse fitting: We fit an ellipse to the point sequence shown in Fig. 2(a). We took 31 equidistant points on the first quadrant of an ellipse with major and minor axes 100 and 50 pixels, respectively.

Fundamental matrix computation: We compute the fundamental matrix between the two images shown in Fig. 2(b), which view a cylindrical grid surface from two directions. The image size is assumed to be 600 × 600 (pixels) with focal lengths 600 pixels for both. The 91 grid points are used as corresponding points.

Homography computation: We compute the homography relating the two images shown in Fig. 2(c), which view a planar grid surface from two directions. The image size is assumed to be 800 × 800 (pixels) with focal lengths 600 pixels for both. The 45 grid points are used as corresponding points.

In all experiments, we divided the data coordinate values by 600 (pixels) (i.e., we used 600 pixels as the unit of length) to make all the data values fall within the range of about ±1. This is for stabilizing numerical computation with finite precision; without this data scale normalization, serious accuracy loss is incurred, as pointed out by Hartley5) for fundamental matrix computation.

Fig. 2 (a) 31 points on an ellipse. (b) Two views of a curved grid. (c) Two views of a planar grid.

Fig. 3 The true value θ̄, the computed value θ̂, and its component ∆θ orthogonal to θ̄.

For each example, we compared the standard LS, HyperLS, its Taubin approximation, and ML, for which we used the FNS of Chojnacki et al.4) for ellipse fitting and fundamental matrix computation and the multiconstraint FNS of Niitsuma et al.22) for homography computation. As mentioned in Section 4, FNS minimizes not Eq. (11) directly but the Sampson error in Eq. (21), and the exact ML solution can be obtained by repeated Sampson error minimization16). The solution that minimizes the Sampson error usually agrees with the ML solution up to several significant digits11),14),15). Hence, FNS can safely be regarded as minimizing Eq. (11).

Let θ̄ be the true value of the parameter θ, and θ̂ its computed value. We consider the following error:

\[ \Delta^\perp\theta = P_{\bar{\theta}}\hat{\theta}, \qquad P_{\bar{\theta}} \equiv I - \bar{\theta}\bar{\theta}^\top. \tag{59} \]

The matrix P_θ̄ represents the orthogonal projection onto the space orthogonal to θ̄. Since the computed value θ̂ is normalized to a unit vector, it is distributed around θ̄ on the unit sphere. Hence, the meaningful deviation is its component orthogonal to θ̄, so we measure the error component in the tangent space to the unit sphere at θ̄ (Fig. 3).
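A sketch of this error measure (ours; the sign normalization is an added practical detail, since a unit eigenvector is determined only up to sign):

```python
import numpy as np

def orthogonal_error(theta_hat, theta_true):
    """The error component of Eq. (59), orthogonal to the true unit vector."""
    if theta_hat @ theta_true < 0:     # resolve the sign ambiguity
        theta_hat = -theta_hat
    P = np.eye(len(theta_true)) - np.outer(theta_true, theta_true)
    return P @ theta_hat
```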

We added independent Gaussian noise of mean 0 and standard deviation σ to the x and y coordinates of each data point and repeated the fitting M times for each σ, using different noise. We let M = 10000 for ellipse fitting and fundamental matrix computation and M = 1000 for homography computation.

Fig. 4 RMS error vs. the standard deviation σ of the noise added to each point: 1. standard LS, 2. Taubin approximation, 3. HyperLS, 4. ML. The dotted lines indicate the KCR lower bound. (a) Ellipse fitting. (b) Fundamental matrix computation. (c) Homography computation.

Then, we evaluated the root-mean-square (RMS) error

\[ E = \sqrt{\frac{1}{M}\sum_{a=1}^{M} \|\Delta^\perp\theta^{(a)}\|^2}, \tag{60} \]

where ∆⊥θ^(a) is the value of ∆⊥θ in the ath trial. The theoretical accuracy limit, called the KCR lower bound3),8),10), is given by

\[ E[\Delta^\perp\theta\,\Delta^\perp\theta^\top] \succ \frac{\sigma^2}{N}\Big( \frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \bar{W}^{(kl)}_\alpha \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha \Big)^- \equiv V_{\mathrm{KCR}}[\theta], \tag{61} \]

where W̄^(kl)_α is the value of W^(kl)_α in Eq. (21) evaluated by assuming σ = 1 and using the true values θ̄ and ξ̄^(k)_α. The relation ≻ means that the left-hand side minus the right-hand side is a positive semidefinite symmetric matrix, and the operation (·)⁻ denotes the pseudoinverse. We compared the RMS error in Eq. (60) with the trace of Eq. (61):

\[ \sqrt{E[\|\Delta^\perp\theta\|^2]} \ge \sqrt{\mathrm{tr}\, V_{\mathrm{KCR}}[\theta]}. \tag{62} \]

Figure 4 plots, for each σ, the RMS error of Eq. (60) for each method and the KCR lower bound of Eq. (62).

We also compared the reprojection error for the different methods. According to statistics, the reprojection error I in Eq. (11) for ML is subject to a χ² distribution with rN − d degrees of freedom, where r is the codimension of the constraint and d is the dimension of the parameters8). Hence, if ML is computed by assuming σ = 1, the square root of the average reprojection error per datum is expected to be σ√(r − d/N). Figure 5 plots the square root of the average, per datum, of the computed reprojection error, which was approximated by the Sampson error K(θ) in Eq. (21), along with the theoretical expectation.

We observe the following:

Ellipse fitting: The standard LS performs poorly, while ML exhibits the highest accuracy, almost reaching the KCR lower bound. However, the ML computation fails to converge above a certain noise level. In contrast, HyperLS produces, without iterations, an accurate solution close to ML. The accuracy of its Taubin approximation is practically the same as the traditional Taubin method and is slightly lower than HyperLS. Since r = 1, d = 5, the square root of the average reprojection error per datum has theoretical expectation σ√(1 − 5/N) to a first approximation. We see that the computed value almost coincides with the expected value except for the standard LS.

Fundamental matrix computation: Again, the standard LS is poor, while ML has the highest accuracy, almost reaching the KCR lower bound. The accuracy of HyperLS is very close to ML. Its Taubin approximation (= the traditional Taubin method) has practically the same accuracy as HyperLS.

Fig. 5 Square root of the average reprojection error per datum vs. the standard deviation σ of the noise added to each point: 1. standard LS, 2. Taubin approximation, 3. HyperLS, 4. ML. The dotted lines indicate the theoretical expectation. (a) Ellipse fitting. (b) Fundamental matrix computation. (c) Homography computation.

The fundamental matrix has the constraint that its rank be 2. The comparison here is done before the rank constraint is imposed. Hence, r = 1, d = 8, and the square root of the average reprojection error per datum is expected to be σ√(1 − 8/N). We see that the computed value almost coincides with the expected value except for the standard LS.

Homography computation: In this case, too, the standard LS is poor, while ML has the highest accuracy, almost reaching the KCR lower bound. However, the ML computation fails to converge above a certain noise level. The accuracy of HyperLS is very close to ML. Its Taubin approximation has practically the same accuracy as HyperLS. Since r = 2, d = 8, the square root of the average reprojection error per datum is expected to be σ√(2(1 − 4/N)). We see that the computed value almost coincides with the expected value except for the standard LS.

In all examples, the standard LS performs poorly, while ML provides the highest accuracy. Note that the differences among the methods are more marked when measured in the RMS error than in the reprojection error. This is because the RMS error measures the error of the parameters of the equation, while the reprojection error measures the closeness of the fit to the data. For ellipse fitting, for example, the RMS error compares the fitted ellipse equation with the true ellipse equation, while the reprojection error measures how close the fitted ellipse is to the data. As a result, even if two ellipses have nearly the same distances to the data, their shapes can be very different. This difference becomes more conspicuous as the data cover a shorter segment of the ellipse. The same observation applies to fundamental matrix computation and homography computation.

We also see from our experiments that the ML computation may fail in the presence of large noise. The convergence of ML critically depends on the accuracy of the initialization. In the above experiments, we used the standard LS to start the FNS iterations. We confirmed that using HyperLS to start the iterations significantly extends the noise range of convergence, though the computation still fails for sufficiently large noise. On the other hand, HyperLS is algebraic and hence immune to the convergence problem, producing a solution close in accuracy to ML at any noise level.

The Taubin approximation is clearly inferior to HyperLS for ellipse fitting but is almost equivalent to HyperLS for fundamental matrices and homographies. This reflects the fact that while ξ is quadratic in x and y for ellipses (see Eq. (4)), the corresponding ξ and ξ^(k) are bilinear in x, y, x′, and y′ for fundamental matrices (see Eq. (6)) and homographies (see Eq. (9)), so e^(k)_α in Eq. (47) is 0. In structure-from-motion applications, we frequently do inference from multiple images based on "multilinear" constraints involving homographies, fundamental matrices, trifocal tensors, and other geometric quantities7). For such problems, the constraint itself is nonlinear but is linear in the observations of each image. Then, e^(k)_α = 0, because the noise in different images is assumed to be independent. In such a problem, the accuracy of HyperLS is nearly the same as that of its Taubin approximation. However, HyperLS is clearly superior in situations where the constraint involves nonlinear terms in observations of the same image, e.g., ellipse fitting.

11. Concluding Remarks

We have presented a general formulation for a special type of least squares (LS) estimator, which we call "HyperLS," for geometric problems that frequently appear in vision applications. We described the problem in the most general terms and discussed various theoretical issues that have not been fully studied so far. In particular, we pointed out that the characteristics of image-based inference are very different from those of conventional statistical domains and discussed in detail various issues related to ML and algebraic fitting. Then, we derived HyperLS by introducing a normalization that eliminates the statistical bias of LS up to second order noise terms.

It would be ideal if we could minimize the total mean square error by taking all higher order terms into account. Due to technical difficulties, we limited our attention to the bias up to the second order. Also, we introduced in our derivation several assumptions about the choice of the eigenvalues and the convergence of the series expansion. However, the purpose of this paper is not to establish mathematical theorems with formal proofs. Our aim is to derive techniques that are useful in practical problems; their usefulness is to be tested by experiments.

Our numerical experiments for computing ellipses, fundamental matrices, and homographies showed that HyperLS yields a solution far superior to the standard LS and comparable in accuracy to ML, which is known to produce highly accurate solutions but may fail to converge if poorly initialized. Thus, HyperLS is a perfect candidate for ML initialization. We compared the performance of HyperLS and its Taubin approximation and attributed the performance differences to the structure of the problem. In this paper, we did not show real image demos, concentrating on the general mathematical framework, because particular applications have been shown elsewhere1),12),22),23).

Acknowledgments The authors thank Ali Al-Sharadqah and Nikolai Chernov of the University of Alabama at Birmingham, U.S.A., Wolfgang Förstner of the University of Bonn, Germany, and Alexander Kukush of National Taras Shevchenko University of Kyiv, Ukraine, for helpful discussions. This work was supported in part by the Ministry of Education, Culture, Sports, Science, and Technology, Japan, under a Grant in Aid for Scientific Research (C 21500172).

References

1) Al-Sharadqah, A. and Chernov, N.: Error analysis for circle fitting algorithms, Elec. J. Stat., Vol.3, pp.886–911 (2009).

2) Cheng, C.-L. and Kukush, A.: Non-existence of the ﬁrst moment of the adjusted

least squares estimator in multivariate errors-in-variables model, Metrika, Vol.64,

No.1, pp.41–46 (2006).

3) Chernov, N. and Lesort, C.: Statistical efficiency of curve fitting algorithms, Comput. Stat. Data Anal., Vol.47, No.4, pp.713–728 (2004).

4) Chojnacki, W., Brooks, M.J., van den Hengel, A. and Gawley, D.: On the fitting of surfaces to data with covariances, IEEE Trans. Patt. Anal. Mach. Intell., Vol.22, No.11, pp.1294–1303 (2000).

5) Hartley, R.I.: In defense of the eight-point algorithm, IEEE Trans. Patt. Anal.

Mach. Intell., Vol.19, No.6, pp.580–593 (1997).

6) Hartley, R. and Kahl, F.: Optimal algorithms in multiview geometry, Proc. 8th

Asian Conf. Computer Vision, Tokyo, Japan, November 2007, Vol.1, pp.13–34

(2007).

7) Hartley, R. and Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd

ed., Cambridge University Press, Cambridge, U.K. (2004).

8) Kanatani, K.: Statistical Optimization for Geometric Computation: Theory and

Practice, Elsevier Science, Amsterdam, The Netherlands (1996); reprinted, Dover,

New York, U.S.A. (2005).

9) Kanatani, K.: Ellipse ﬁtting with hyperaccuracy, IEICE Trans. Inf. & Syst.,

Vol.E89-D, No.10, pp.2653–2660 (2006).

10) Kanatani, K.: Statistical optimization for geometric ﬁtting: Theoretical accuracy

analysis and high order error analysis, Int. J. Comput. Vis., Vol.80, No.2, pp.167–

188 (2006).

11) Kanatani, K. and Niitsuma, H.: Optimal two-view planar scene triangulation, IPSJ

Trans. Comput. Vis. Appl., Vol.4 (2011), to appear.

12) Kanatani, K. and Rangarajan, P.: Hyper least squares fitting of circles and ellipses, Comput. Stat. Data Anal., Vol.55, No.6, pp.2197–2208 (2011).

13) Kanatani, K. and Sugaya, Y.: Performance evaluation of iterative geometric ﬁtting

algorithms, Comp. Stat. Data Anal., Vol.52, No.2, pp.1208–1222 (2007).


14) Kanatani, K. and Sugaya, Y.: Compact algorithm for strictly ML ellipse ﬁtting,

Proc. 19th Int. Conf. Pattern Recog., Tampa, FL, U.S.A. (2008).

15) Kanatani, K. and Sugaya, Y.: Compact fundamental matrix computation, IPSJ

Trans. Comput. Vis. Appl., Vol.2, pp.59–70 (2010).

16) Kanatani, K. and Sugaya, Y.: Uniﬁed computation of strict maximum likelihood

for geometric ﬁtting, J. Math. Imaging Vis., Vol.38, pp.1–13 (2010).

17) Kukush, A., Markovsky, I. and Van Huffel, S.: Consistent fundamental matrix estimation in a quadratic measurement error model arising in motion analysis, Comp. Stat. Data Anal., Vol.41, No.1, pp.3–18 (2002).

18) Kukush, A., Markovsky, I. and Van Huffel, S.: Consistent estimation in an implicit quadratic measurement error model, Comp. Stat. Data Anal., Vol.47, No.1, pp.123–147 (2004).

19) Leedan, Y. and Meer, P.: Heteroscedastic regression in computer vision: Problems

with bilinear constraint, Int. J. Comput. Vis., Vol.37, No.2, pp.127–150 (2000).

20) Matei, B.C. and Meer, P.: Estimation of nonlinear errors-in-variables models for

computer vision applications, IEEE Trans. Patt. Anal. Mach. Intell., Vol.28, No.10,

pp.1537–1552 (2006).

21) Neyman, J. and Scott, E.L.: Consistent estimates based on partially consistent

observations, Econometrica, Vol.16, No.1, pp.1–32 (1948).

22) Niitsuma, H., Rangarajan, P. and Kanatani, K.: High accuracy homography computation without iterations, Proc. 16th Symp. Sensing Imaging Inf., Yokohama, Japan (2010).

23) Rangarajan, P. and Kanatani, K.: Improved algebraic methods for circle fitting, Elec. J. Stat., Vol.3, pp.1075–1082 (2009).

24) Rangarajan, P. and Papamichalis, P.: Estimating homographies without normalization, Proc. Int. Conf. Image Process., Cairo, Egypt, pp.3517–3520 (2009).

25) Sampson, P.D.: Fitting conic sections to “very scattered” data: An iterative reﬁne-

ment of the Bookstein algorithm, Comput. Graphics Image Process., Vol.18, No.1,

pp.97–108 (1982).

26) Taubin, G.: Estimation of planar curves, surfaces, and non-planar space curves de-

ﬁned by implicit equations with applications to edge and range image segmentation,

IEEE Trans. Patt. Anal. Mach. Intell., Vol.13, No.11, pp.1115–1138 (1991).

27) Triggs, B., McLauchlan, P.F., Hartley, R.I. and Fitzgibbon, A.: Bundle

adjustment—A modern synthesis, Vision Algorithms: Theory and Practice, Triggs,

B, Zisserman, A. and Szeliski, R. (Eds.), pp.298–375, Springer (2000).

Appendix

The term E[∆₁M M̄⁻∆₁M] is computed as follows:

\[ \begin{aligned} &E[\Delta_1 M \bar{M}^- \Delta_1 M] \\ &= E\Big[\frac{1}{N}\sum_{\alpha=1}^{N}\sum_{k=1}^{L}\big(\bar{\xi}^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha + \Delta_1\xi^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha\big)\;\bar{M}^-\;\frac{1}{N}\sum_{\beta=1}^{N}\sum_{l=1}^{L}\big(\bar{\xi}^{(l)}_\beta \Delta_1\xi^{(l)\top}_\beta + \Delta_1\xi^{(l)}_\beta \bar{\xi}^{(l)\top}_\beta\big)\Big] \\ &= \frac{1}{N^2}\sum_{\alpha,\beta=1}^{N}\sum_{k,l=1}^{L} E\big[\bar{\xi}^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha \bar{M}^- \bar{\xi}^{(l)}_\beta \Delta_1\xi^{(l)\top}_\beta + \bar{\xi}^{(k)}_\alpha \Delta_1\xi^{(k)\top}_\alpha \bar{M}^- \Delta_1\xi^{(l)}_\beta \bar{\xi}^{(l)\top}_\beta \\ &\qquad\qquad + \Delta_1\xi^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha \bar{M}^- \bar{\xi}^{(l)}_\beta \Delta_1\xi^{(l)\top}_\beta + \Delta_1\xi^{(k)}_\alpha \bar{\xi}^{(k)\top}_\alpha \bar{M}^- \Delta_1\xi^{(l)}_\beta \bar{\xi}^{(l)\top}_\beta\big] \\ &= \frac{1}{N^2}\sum_{\alpha,\beta=1}^{N}\sum_{k,l=1}^{L} E\big[(\Delta_1\xi^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\beta)\, \bar{\xi}^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta + (\Delta_1\xi^{(k)}_\alpha, \bar{M}^- \Delta_1\xi^{(l)}_\beta)\, \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \\ &\qquad\qquad + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\beta)\, \Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \Delta_1\xi^{(l)}_\beta)\, \Delta_1\xi^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta\big] \\ &= \frac{1}{N^2}\sum_{\alpha,\beta=1}^{N}\sum_{k,l=1}^{L} \big( \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \bar{M}^- E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta] + \mathrm{tr}[\bar{M}^- E[\Delta_1\xi^{(l)}_\beta \Delta_1\xi^{(k)\top}_\alpha]]\, \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \\ &\qquad\qquad + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\beta)\, E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta] + E[\Delta_1\xi^{(k)}_\alpha \Delta_1\xi^{(l)\top}_\beta]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \big) \\ &= \frac{1}{N^2}\sum_{\alpha,\beta=1}^{N}\sum_{k,l=1}^{L} \big( \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \bar{M}^- \delta_{\alpha\beta} V^{(kl)}[\xi_\alpha] + \mathrm{tr}[\bar{M}^- \delta_{\alpha\beta} V^{(kl)}[\xi_\alpha]]\, \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \\ &\qquad\qquad + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\beta)\, \delta_{\alpha\beta} V^{(kl)}[\xi_\alpha] + \delta_{\alpha\beta} V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\beta \big) \\ &= \frac{1}{N^2}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \big( \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha \bar{M}^- V^{(kl)}[\xi_\alpha] + \mathrm{tr}[\bar{M}^- V^{(kl)}[\xi_\alpha]]\, \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha \\ &\qquad\qquad + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\alpha)\, V^{(kl)}[\xi_\alpha] + V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha \big) \\ &= \frac{1}{N^2}\sum_{\alpha=1}^{N}\sum_{k,l=1}^{L} \big( \mathrm{tr}[\bar{M}^- V^{(kl)}[\xi_\alpha]]\, \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha + (\bar{\xi}^{(k)}_\alpha, \bar{M}^- \bar{\xi}^{(l)}_\alpha)\, V^{(kl)}[\xi_\alpha] + 2S[V^{(kl)}[\xi_\alpha]\, \bar{M}^- \bar{\xi}^{(k)}_\alpha \bar{\xi}^{(l)\top}_\alpha] \big). \end{aligned} \tag{63} \]

Thus, Eq. (48) is obtained.

(Received January 1, 2011)

(Revised May 16, 2011)

(Accepted August 1, 2011)

(Released October 17, 2011)

(Communicated by Peter Sturm)

Kenichi Kanatani received his B.E., M.S., and Ph.D. in applied

mathematics from the University of Tokyo in 1972, 1974 and 1979,

respectively. After serving as Professor of computer science at

Gunma University, Gunma, Japan, he is currently Professor of

computer science at Okayama University, Okayama, Japan. He is

the author of many books on computer vision and received many

awards including the best paper awards from IPSJ (1987) and

IEICE (2005). He is an IEEE Fellow.

Prasanna Rangarajan received his B.E. in electronics and communication engineering from Bangalore University, Bangalore, India, in 2000 and his M.S. in electrical engineering from Columbia University, New York, NY, U.S.A., in 2003. He is currently a Ph.D. candidate in electrical engineering at Southern Methodist University, Dallas, TX, U.S.A. His research interests include image processing, structured illumination and parameter estimation for computer vision.

Yasuyuki Sugaya received his B.E., M.S., and Ph.D. in computer science from the University of Tsukuba, Ibaraki, Japan, in 1996, 1998, and 2001, respectively. From 2001 to 2006, he was Assistant Professor of computer science at Okayama University, Okayama, Japan. Currently, he is Associate Professor of information and computer sciences at Toyohashi University of Technology, Toyohashi, Aichi, Japan. His research interests include image processing and computer vision. He received the IEICE best paper award in 2005.

Hirotaka Niitsuma received his B.E. and M.S. in applied physics from Osaka University, Japan, in 1993 and 1995, respectively, and his Ph.D. in information science from NAIST, Japan, in 1999. He was a researcher at TOSHIBA, at JST Corporation, at Denso IT Laboratory, Inc., at Kwansei Gakuin University, Japan, at Kyungpook National University, Korea, and at AIST, Japan. Since April 2007, he has been Assistant Professor of computer science at Okayama University, Japan. His research interests include computer vision, machine learning, and neural networks.
