
COMPUTER GRAPHICS AND IMAGE PROCESSING 18, 97-108 (1982)

NOTE

Fitting Conic Sections to “Very Scattered” Data: An Iterative Refinement of the Bookstein Algorithm*

PAUL D. SAMPSON†

Department of Statistics, The University of Chicago, Chicago, Illinois 60637

Received March 25, 1980

An examination of the geometric interpretation of the error-of-fit measure of the Bookstein

algorithm for fitting conic sections shows why it may not be entirely satisfactory when the data

are “very scattered” in the sense that the data points are distributed rather widely about an

underlying smooth curve. A simple iterative refinement of the Bookstein algorithm, similar in

spirit to iterative weighted least-squares methods in regression analysis, results in a fitted conic

section that approximates the conic that would minimize the sum of squared orthogonal

distances of data points from the fitted conic. The usefulness and limitations of the refined

algorithm are demonstrated on two different types of “very scattered” data.

1. INTRODUCTION

Bookstein [1] recently presented us with a valuable algorithm for fitting conic

sections to scattered data. It is generally suited to ellipses and hyperbolas; it is easy

to compute, requiring only an eigenanalysis; and perhaps most important, it fits

conics to data in a manner invariant under rotations, translations, and changes of

scale of the data. These last two properties make the algorithm nearly analogous to

the principal components method of fitting a straight line to a data scatter. However,

in contrast with the principal components fit of a line, the Bookstein algorithm

minimizes a sum of squared “errors” that are not the orthogonal Euclidean distances

from data points to the fitted conics. This fact explains why the fit of the Bookstein

algorithm may not be entirely satisfactory on data scatters that are not accurately

represented by conic sections (either because there is considerable “noise” in the

data, or because a functional form other than a conic would be better suited to the

data).

Depending on the distribution of errors about the “true” conic, a number of

different preferred criteria for the goodness-of-fit of a conic section might be

defined. In many situations an intuitively appealing goal is, indeed, the simple

minimization of the sum of squared errors of the data points measured in the

direction normal to the fitted conic (see, for example, the discussion of Gnanadesikan [2, Sect. 2.4]). While achieving this goal poses an intractable computational

problem, a reexamination of the definition of error-of-fit for the Bookstein algorithm

suggests a simple iterative refinement of it which approximates the least sum of

squared orthogonal distances (LSOD) fit.

*Support for this research was provided in part by National Science Foundation Grants SOC76-80389 and MCS76-81435.

†Current address: Department of Statistics, GN-22, University of Washington, Seattle, Washington 98195.

0146-664X/82/010097-12$02.00/0
Copyright © 1982 by Academic Press, Inc.
All rights of reproduction in any form reserved.


We examine the Bookstein algorithm and propose its iterative refinement in

Sections 2 and 3. In Section 4 the new algorithm is demonstrated on two types of

“very scattered” or noisy data which do not precisely define the underlying smooth

curves to be modeled by conic sections.

2. THE BOOKSTEIN ALGORITHM AND ITS ERROR-OF-FIT

Choosing notation consistent with [1] we write the equation of a conic as

VZ' = Ax² + Bxy + Cy² + Dx + Ey + F = 0,    (1)

where V = (A, B, C, D, E, F) and Z = (x², xy, y², x, y, 1). For an arbitrary point (x, y) the error-of-fit is measured by the equation of the conic and is denoted by Q(x, y) = VZ'. For a set of data (x_i, y_i), i = 1, 2, ..., n, the algorithm consists of minimizing the sum of squared errors

Σ_{i=1}^n (Q(x_i, y_i))² = Σ_{i=1}^n (VZ_i')² = VSV',

where Z_i = (x_i², x_i y_i, y_i², x_i, y_i, 1) and S = Σ_{i=1}^n Z_i'Z_i, subject to the quadratic normalization VDV' = constant, D = diag(1, ½, 1, 0, 0, 0). Bookstein shows that this

particular normalization provides the algorithm with its invariance under rotation, translation, and change of scale. We can also place arbitrary linear constraints on the coefficient vector V by requiring VM = 0 with M a 6 × p matrix of rank ≤ 5. The vector V minimizing VSV' subject to VDV' = constant and VM = 0 is the eigenvector of largest eigenvalue of the system

V[I - M(M'S^{-1}M)^- M'S^{-1}]D - λVS = 0,    (2)

where ^- denotes any generalized inverse.

Although the solution to the problem can be expressed in terms of the generalized

eigensystem (2), for computational purposes it is preferable to implement the

method of Golub and Underwood [3], which reduces the problem to the solution of

a smaller, symmetric eigensystem.
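In the unconstrained case (M = 0), the minimization of VSV' subject to VDV' = constant can be sketched directly as a generalized eigenproblem. The following Python sketch uses a dense QZ solve via SciPy rather than the Golub and Underwood reduction recommended above; the function name and tolerances are illustrative, not from the paper.

```python
import numpy as np
from scipy.linalg import eig

def bookstein_fit(x, y):
    """Fit Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 by minimizing sum Q(x_i, y_i)^2
    subject to the Bookstein normalization A^2 + B^2/2 + C^2 = constant."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    Z = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    S = Z.T @ Z                                      # cross-product matrix
    Dmat = np.diag([1.0, 0.5, 1.0, 0.0, 0.0, 0.0])   # normalization matrix D
    w, vecs = eig(S, Dmat)                           # pencil S v = lambda D v
    best, best_val = None, np.inf
    for i in range(len(w)):
        if not np.isfinite(w[i]):                    # D is singular: skip infinite pairs
            continue
        v = np.real(vecs[:, i])
        denom = v @ Dmat @ v
        if denom <= 1e-12:                           # degenerate: no quadratic part
            continue
        val = (v @ S @ v) / denom                    # Rayleigh quotient VSV'/VDV'
        if val < best_val:
            best, best_val = v, val
    return best / np.linalg.norm(best)               # unit-norm coefficient vector
```

With noise-free data lying exactly on a conic, the minimized quotient is zero and the sketch reproduces that conic up to sign and scale.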

A geometric interpretation of the error-of-fit for this algorithm is given by Fig. 1 (taken from Bookstein [1, Fig. 1]). The error-of-fit for a point (x, y) relative to a given conic is proportional to 1 - [(d + d₁)²/d²], where (d + d₁) is the distance from the point to the center of the conic, and d is the distance from the conic to its center along the ray from center to point.

For completeness we note that the error-of-fit for a point (x₀, y₀) separated from a hyperbola by its asymptotes can be pictured in the same way by making reference to the conjugate of the given hyperbola. This is illustrated in Fig. 2. The error of a point relative to the conic Q(x, y) = 0 is proportional to 1 + [(d₁ + d₂)²/d₁²], where (d₁ + d₂) and d₁ are the distances center to point and center to conjugate conic Q₂, respectively. Furthermore, since the proportionality factor is the same for Figs. 1 and 2, it is easy to show that Q(x, y) = Q₂(x, y) for all points (x, y) on the asymptotes to these hyperbolas. Thus, a hyperbola and its conjugate are equidistant from their asymptotes by the measure Q.

More generally, note that contours of equal error, Q(x, y) = c, are themselves

conic sections concentric with (i.e., having the same center as) Q(x, y) = 0. For a

FIG. 1. Error-of-fit for the Bookstein algorithm (Bookstein [1], Fig. 1); the panels show a parabola and a hyperbola. Q(x, y) ∝ 1 - (d + d₁)²/d².

hyperbola, the contours represent a family of hyperbolas all with the same asymptotes. From Fig. 3a it is clear that observations (x_i, y_i) located where a hyperbola approaches its asymptotes are more heavily weighted by Q than observations near the vertex. Similarly, Fig. 3b shows that deviations from an ellipse along the “flat” side are more heavily weighted than those at the ends, where the radius of curvature is smallest. For this reason, when there is a considerable amount of “noise” in the data the Bookstein algorithm, based on the measure Q, may result in an unsatisfactory fitted conic because of over-fitting the “flat” parts of conic sections relative to the “curved” parts around vertices.

FIG. 2. Error-of-fit for the Bookstein algorithm with reference to conjugate hyperbolas. Q(x₀, y₀) ∝ 1 + (d₁ + d₂)²/d₁².

FIG. 3. Families of concentric conic sections. (a) Concentric hyperbolas. (b) Concentric ellipses.

3. ITERATIVE REFINEMENT OF THE BOOKSTEIN ALGORITHM

To see how one might compensate for the inherent weighting of errors-of-fit by the measure Q as illustrated in Section 2, consider the concentric hyperbolas Q(x, y) = 0 and Q(x, y) = c_i, i = 1, 2, drawn in Fig. 4. The points a and b are the same distance from the central hyperbola according to the measure Q. For both points, d' labels the orthogonal distance to the hyperbola and d'' labels the distance to the hyperbola measured in a direction orthogonal to the contour Q(x, y) = c_i. In most situations d'' will be a good approximation to the orthogonal distance d'. (Point a is almost a worst case for the hyperbola in Fig. 4, while at point b the two distances are clearly almost the same.)

FIG. 4. Approximation to orthogonal distance from a conic section.

The rate of change of Q at the point a in the direction orthogonal to Q(x, y) = c₁ is just the magnitude or norm of the gradient of Q(x, y) at the point a. Let a have coordinates (x_a, y_a). The gradient is

∇Q(x_a, y_a) = (2Ax_a + By_a + D)i_x + (2Cy_a + Bx_a + E)i_y,

where i_x and i_y are unit vectors parallel to the x and y axes. Its squared norm is therefore

|∇Q(x_a, y_a)|² = (2Ax_a + By_a + D)² + (2Cy_a + Bx_a + E)².

Since a linear approximation gives us

Q(x_a'', y_a'') = Q(x_a, y_a) - d''|∇Q(x_a, y_a)|,

where d'' is distance measured from a'' to a, a first order approximation to d'' is

d'' = Q(x_a, y_a)/|∇Q(x_a, y_a)|.
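This first-order approximation (often called the Sampson distance in later literature) is easy to compute directly from the conic coefficients. A minimal Python sketch, with illustrative names:

```python
import numpy as np

def sampson_distance(v, x, y):
    """First-order approximation d'' = Q/|grad Q| to the orthogonal distance
    from (x, y) to the conic VZ' = 0, with v = (A, B, C, D, E, F)."""
    A, B, C, D, E, F = v
    Q = A*x*x + B*x*y + C*y*y + D*x + E*y + F      # error-of-fit Q(x, y)
    gx = 2*A*x + B*y + D                           # dQ/dx
    gy = 2*C*y + B*x + E                           # dQ/dy
    return Q / np.hypot(gx, gy)                    # Q / |grad Q|
```

For the unit circle x² + y² - 1 = 0 and the point (2, 0), Q = 3 and |∇Q| = 4, so the approximation gives 0.75 against a true orthogonal distance of 1, illustrating that d'' is only first-order accurate far from the curve.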


This argument suggests that a good algorithm would minimize

Σ_{i=1}^n [Q(x_i, y_i)/|∇Q(x_i, y_i)|]².    (3)

In other words, we should weight the error Q by the inverse of the norm of its

gradient. Unfortunately, (3) is not a simple quadratic form in the coefficient vector

V and we cannot express the solution to the minimization problem analytically in

closed form.

To avoid the difficulty of numerically minimizing (3), I suggest the following two-step procedure. First, fit a conic section by the Bookstein algorithm of Section 2. Let this first-stage solution be V^(1) = (A₁, B₁, C₁, D₁, E₁, F₁). Use V^(1) to estimate |∇Q(x_i, y_i)| at each point (x_i, y_i) with

q_i = [(2A₁x_i + B₁y_i + D₁)² + (2C₁y_i + B₁x_i + E₁)²]^{1/2}.    (4)

Next, fit a second conic by the Bookstein algorithm, but using the weighted cross-product matrix

S^(2) = Σ_{i=1}^n Z_i'Z_i/q_i²

so that one minimizes

VS^(2)V' = Σ_{i=1}^n [VZ_i'/q_i]² = Σ_{i=1}^n [Q(x_i, y_i)/q_i]².    (5)

Note that since the computed gradient is invariant under translations, rotations, and

changes of scale, this procedure remains invariant in the manner described by

Bookstein.

If the initial Bookstein fit is in a neighborhood of the LSOD solution, (4) should provide a good approximation to the desired weights, and the vector V solving (2) (with weighted cross-product matrix S^(2)) will represent a conic that closely approximates the LSOD conic. In general it may be necessary to compute further iterations with reweighted cross-product matrices S^(k) to refine the solution. This represents an “iterative weighted least squares” algorithm similar in spirit to those that appear in the regression literature (cf. Ramsay [4]).
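The full reweighting loop can be sketched as follows. This is a self-contained Python illustration of Eqs. (4)-(5) in the unconstrained case (M = 0), with a dense generalized eigensolve standing in for the Golub and Underwood reduction; names and the fixed iteration count are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eig

def iterated_bookstein(x, y, n_iter=3):
    """Iteratively reweighted Bookstein fit: each pass minimizes
    sum [Q(x_i, y_i)/q_i]^2 subject to VDV' = constant, with weights
    q_i = |grad Q| taken from the previous pass (Eq. (4))."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    Z = np.column_stack([x*x, x*y, y*y, x, y, np.ones_like(x)])
    Dmat = np.diag([1.0, 0.5, 1.0, 0.0, 0.0, 0.0])
    q = np.ones(len(x))                              # first pass: unweighted
    best = None
    for _ in range(n_iter):
        Zw = Z / q[:, None]
        S = Zw.T @ Zw                                # weighted cross-product S^(k)
        w, vecs = eig(S, Dmat)                       # pencil S v = lambda D v
        best, best_val = None, np.inf
        for i in range(len(w)):
            if not np.isfinite(w[i]):                # D singular: skip infinite pairs
                continue
            v = np.real(vecs[:, i])
            denom = v @ Dmat @ v
            if denom <= 1e-12:
                continue
            val = (v @ S @ v) / denom
            if val < best_val:
                best, best_val = v, val
        A, B, C, D, E, F = best
        q = np.hypot(2*A*x + B*y + D, 2*C*y + B*x + E)   # Eq. (4) weights
        q = np.maximum(q, 1e-12)                     # guard against zero gradients
    return best / np.linalg.norm(best)
```

Each pass selects one of the finite generalized eigenpairs; as Section 4 discusses, for very noisy data the choice of eigenvector at each pass may need supervision.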

4. APPLICATIONS OF THE ITERATIVE ALGORITHM

Figure 5 shows two scatters of data with the best-fitting conics (constrained to pass through two end points) resulting from (A) the Bookstein algorithm and (B) the two-step iterative refinement. Though the fitted conics are very similar, clearly conic B fits the scatter in the vicinity of the vertex better than conic A.

The behavior of solutions resulting from further iterations of the algorithm is of some interest. For data sets like those illustrated in Fig. 5 (see Sampson [5]), further iterations display no appreciable differences in the fitted conics. The rapid convergence of the coefficient vectors is shown in Table 1, where all coefficients A = -D


FIG. 5. Conics fitted to data scatters: (A) Bookstein algorithm, (B) two-step iterative procedure.

TABLE 1
Conic Coefficients from Iterative Refinement of Bookstein Algorithm Applied to Data of Fig. 5

ID No.    A         B          C          D          E         F
2279.2    0.684893  0.081016  -0.052821  -0.684893  0.229104  0
          0.690010  0.088135  -0.001655  -0.690010  0.200007  0
          0.689505  0.086223  -0.007828  -0.689505  0.204133  0
          0.689571  0.086336  -0.007037  -0.689571  0.203664  0
          0.689563  0.086313  -0.007132  -0.689563  0.203725  0
2142.2    0.696735  -0.029327  0.132226  -0.696735  0.103812  0
          0.694276  -0.027705  0.169476  -0.694276  0.080445  0
          0.694754  -0.028022  0.163335  -0.694754  0.084669  0
          0.694681  -0.027976  0.164306  -0.694681  0.084004  0
          0.694693  -0.027984  0.164153  -0.694693  0.084110  0


TABLE 2
Eigenvalues, Approximate Error Sums of Squares, and Eigenvectors from Bookstein Algorithm Applied to Gnanadesikan Data

            Approx.
Eigenvalue  Error S.S.   A        B        C        D        E        F
91.99       0.3875      -0.6736  -0.1100   0.1956   0.6072   0.3565  -0.0075
349.39      0.7088      -0.2458   0.0486  -0.8328   0.2523  -0.4221  -0.0433
15.14       0.7163      -0.0592   0.8790   0.0432   0.2215  -0.4066  -0.0875

and F = 0 since the conics are constrained to pass through (0,0) and (1,0). In fact, the iteration of this algorithm implicitly represents a fixed-point iteration procedure for the solution of a nonlinear system of equations V = g(V) implied by the system (2) with weighted cross-product matrix S = S(V) as in (5). However, the implicit system of equations is so complex that it would be extremely difficult to determine the conditions under which such a scheme converges (cf. Isaacson and Keller [6, Sect. 3.3]).
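The point constraints used in this example fit the VM = 0 formulation directly: forcing the conic through a point (x₀, y₀) means V·Z(x₀, y₀)' = 0, so M can be built from the Z rows of the required points. A small Python check, using the converged coefficients for scatter 2279.2 from Table 1:

```python
import numpy as np

def conic_row(x, y):
    """Z(x, y) = (x^2, xy, y^2, x, y, 1)."""
    return np.array([x*x, x*y, y*y, x, y, 1.0])

# Constraint matrix M for conics through (0, 0) and (1, 0); VM = 0 forces
# F = 0 and A + D + F = 0, i.e. A = -D, exactly the pattern seen in Table 1.
M = np.column_stack([conic_row(0.0, 0.0), conic_row(1.0, 0.0)])

V = np.array([0.689563, 0.086313, -0.007132, -0.689563, 0.203725, 0.0])
print(np.allclose(V @ M, 0.0))   # True: both point constraints are satisfied
```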

For very noisy data the convergence of this procedure is much less direct. An analysis of some data generated by Gnanadesikan [2, Exhibit 9b] clearly illustrates the complications that arise when a data scatter does not precisely define a smooth curve. The data were generated by adding random normal disturbances to the x and y coordinates of a set of points located exactly on a parabola.¹

Since the constraint matrix D in (2) has rank 3 there are three nontrivial solutions to the eigensystem. It can occur that the two largest eigenvalues for this system are nearly equal, indicating that two distinct conics (orthogonal in the coefficient space) fit the data almost equally well with respect to the measure Q. However, because of the implicit weighting of errors by Q, it can also occur that the conic corresponding to the eigenvector of largest eigenvalue of (2) simply does not fit the data in the expected manner and is not the LSOD solution to (2). This is the case for the Gnanadesikan data. Table 2 gives the three nontrivial eigenvectors and eigenvalues of (2) (with no constraints: M = 0). It also lists the approximate sum of squared orthogonal errors for each of the three conics using the quadratic form (5), but with weights q_i recomputed for each of the conics. The eigenvector of second largest eigenvalue corresponds to the conic with the least sum of squared orthogonal errors.
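The selection criterion just described, the approximate sum of squared orthogonal errors with weights recomputed per conic, is straightforward to compute. A hypothetical helper (the name is illustrative, not from the paper):

```python
import numpy as np

def approx_orthogonal_ss(v, x, y):
    """Approximate orthogonal error sum of squares, sum (Q_i/q_i)^2,
    with the weights q_i = |grad Q| recomputed from the conic v itself."""
    A, B, C, D, E, F = v
    Q = A*x*x + B*x*y + C*y*y + D*x + E*y + F
    q = np.hypot(2*A*x + B*y + D, 2*C*y + B*x + E)
    return float(np.sum((Q / q) ** 2))

# Among candidate eigenvectors one would keep the conic of least error:
# best = min(candidates, key=lambda v: approx_orthogonal_ss(v, x, y))
```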

The three conic sections are plotted with the original data scatter in Fig. 6. None

of the three represents an adequate model for the data. We can see that the conic of

largest eigenvalue fits the concentration of points at the bottom of Fig. 6 with the

flat side of an ellipse, somewhat at the expense of the rest of the data scatter.

For cases like this it is unclear how, or whether, one should proceed without recourse to some subjective or a priori knowledge about the problem at hand: that is, which conic is “closest” to the neighborhood of the LSOD conic. The iterative algorithm proposed here cannot be expected to succeed if it is not based on an initial conic in this neighborhood. One might, for example, be able to specify the approximate direction of the major axis desired of the fitted conic and proceed by choosing that conic (eigenvector) most in agreement with this axis as a basis for the

¹The precise coordinates used in our analysis were digitized by hand from Gnanadesikan’s exhibit.


FIG. 6. Three conic solutions to Eq. (2) for the Gnanadesikan data (labeled in the order of Table 2).

computation of the weights q_i for the next iteration. Or, one could simply choose the conic with the least approximate sum of squared orthogonal errors as the basis for the next iteration. For the Gnanadesikan data these two approaches coincide if one is within a “reasonable neighborhood” of the LSOD conic. However, in this case it is necessary to rely on subjective or a priori information about the major axis to guide the early iterations beginning with conics that, as shown in Fig. 6, are not too similar to intuitive expectations about what the LSOD conic should be.

For the data at hand the major axis appears to be roughly vertical rather than horizontal, and consequently it appears that the second conic, a hyperbola, is more in agreement with the expected LSOD fit in this sense. Beginning with the second eigenvector as a basis for the computation of the weights q_i in (4), and choosing the solutions to (2) most in agreement with our subjective assessment of the true major axis as bases for further iterations, we do converge to a satisfactory fitted conic. The sequence of conic coefficients, along with the approximate sums of squared orthogonal errors, are listed in Table 3. Selected conics in this sequence are plotted in Fig. 7.

A number of remarks about this example are in order. For every iteration except the third, the conic chosen for continued iterations was the solution to (2) (with weighted cross-product matrix S^(k) and M = 0) of least approximate sum of squared orthogonal errors and second largest eigenvalue. As the sequence of conics switches from hyperbolas to ellipses at the third iteration, the discrepancy is not surprising. We should note that for a given set of q_i, the Q(x_i, y_i)/q_i approximate orthogonal distances to a conic only for conics like the one upon which the q_i were based. For other conics these numbers have no geometric interpretation, so it is possible to find conics minimizing (5), or solving (2) with weighted cross-product matrices, that do not fit the data at all. On the other hand, if there exists a unique LSOD solution, then a sequence of solutions to (2) can be expected to converge to it. It is therefore not unreasonable that our sequence of conics almost always corresponds to the second largest eigenvalue.

FIG. 7. Sequence of conic sections from iterative refinement of the Bookstein algorithm applied to the Gnanadesikan data (labeled in the order of Table 3).

Finally, note that the sequence of conics of Table 3 and Fig. 7 is comprised of two alternating subsequences converging together, but the limit of this sequence is not exactly the LSOD conic. The approximate sums of squared orthogonal errors begin to increase slightly after the sixth iteration. This too is reasonable since the solution to (2) with S = S(V) does not represent the minimization of VS(V)V' subject to VDV' = constant; rather it represents a heuristically appealing and computationally feasible approximation to the minimization problem. For this reason we should take the conic at the sixth iteration as our best approximation to the LSOD conic.

TABLE 3
Approximate Error Sums of Squares and Conic Coefficients from Iterative Refinement of Bookstein Algorithm Applied to Gnanadesikan Data

Approx.
Error S.S.   A        B        C        D        E        F
0.38751     -0.6736  -0.1100   0.1956   0.6072   0.3565  -0.0075
0.32714     -0.5640  -0.2058   0.3816   0.4741   0.5179   0.0217
0.41891     -0.5767  -0.1321  -0.5767   0.4994  -0.2424  -0.0957
0.21886     -0.7154  -0.2061  -0.2407   0.6073   0.1334  -0.0332
0.23739     -0.6691  -0.1380  -0.4294   0.5883  -0.0253  -0.0470
0.20732     -0.7121  -0.1429  -0.2535   0.6290   0.1055  -0.0375
0.22554     -0.6813  -0.1154  -0.3892   0.6073  -0.0036  -0.0469
0.20735     -0.7106  -0.1316  -0.2639   0.6308   0.0936  -0.0390
0.22253     -0.6855  -0.1143  -0.3749   0.6117   0.0068  -0.0463
0.20821     -0.7091  -0.1298  -0.2728   0.6299   0.0868  -0.0396
0.22091     -0.6882  -0.1160  -0.3657   0.6139   0.0144  -0.0456
0.20905     -0.7078  -0.1292  -0.2803   0.6289   0.0814  -0.0400
0.21971     -0.6903  -0.1176  -0.3584   0.6154   0.0206  -0.0451
0.20979     -0.7067  -0.1287  -0.2866   0.6280

5. CONCLUSIONS

The examples of Section 4 demonstrate how a simple iterative refinement of the

Bookstein algorithm can be used to approximate the least sum of squared orthogonal

errors fit of any conic section to a scatter of data. If a data scatter is obtained by

digitizing the coordinates of points located along a clearly defined outline, as in the

examples considered by Bookstein [1], the original algorithm will need no refinement

unless a conic section model is inappropriate. However, when the data are “very

scattered” in the sense that data points are distributed rather widely about an

underlying outline, the iterative refinement results in a much improved fit.

The examples in this study also demonstrate the limitations of the iterative

procedure proposed here. If the fitted conics are constrained in the manner of the

first example of Section 4, our experience suggests that a small number of iterations

will result in an adequate solution. However, for unconstrained conics and noisy

data scatters it may not always be possible to run the iterative procedure without

intervention guided by a priori information about the orientation of the “true”

conic. This problem reflects the fact that we do not have analytical results stating the

conditions under which the fixed-point iteration scheme converges. Once a conic in

an undetermined neighborhood of the LSOD conic is obtained, the iterative proce-

dure converges reasonably well if the sequence of solutions to (2) of least approxi-

mate sum of squared orthogonal errors is followed.

REFERENCES

1. F. L. Bookstein, Fitting conic sections to scattered data, Computer Graphics and Image Processing 9, 1979, 56-91.
2. R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations, Wiley, New York, 1977.
3. G. H. Golub and R. Underwood, Stationary values of the ratio of quadratic forms subject to linear constraints, Z. Angew. Math. Phys. 21, 1970, 318-326.
4. J. O. Ramsay, A comparative study of several robust estimates of slope, intercept, and scale in linear regression, J. Amer. Statist. Assoc. 72, 1977, 608-615.
5. P. D. Sampson, Statistical Analysis of Shape with Conic Sections and Elliptical Distributions on a Hypersphere, Technical Report No. 110, Dept. of Statistics, University of Chicago, 1980.
6. E. Isaacson and H. B. Keller, Analysis of Numerical Methods, Wiley, New York, 1966.