COMPUTER GRAPHICS AND IMAGE PROCESSING 18, 97-108 (1982)
NOTE
Fitting Conic Sections to "Very Scattered" Data: An
Iterative Refinement of the Bookstein Algorithm*
PAUL D. SAMPSON†
Department of Statistics, The University of Chicago, Chicago, Illinois 60637
Received March 25, 1980
An examination of the geometric interpretation of the error-of-fit measure of the Bookstein
algorithm for fitting conic sections shows why it may not be entirely satisfactory when the data
are “very scattered” in the sense that the data points are distributed rather widely about an
underlying smooth curve. A simple iterative refinement of the Bookstein algorithm, similar in
spirit to iterative weighted least-squares methods in regression analysis, results in a fitted conic
section that approximates the conic that would minimize the sum of squared orthogonal
distances of data points from the fitted conic. The usefulness and limitations of the refined
algorithm are demonstrated on two different types of “very scattered” data.
1. INTRODUCTION
Bookstein [1] recently presented us with a valuable algorithm for fitting conic
sections to scattered data. It is generally suited to ellipses and hyperbolas; it is easy
to compute, requiring only an eigenanalysis; and perhaps most important, it fits
conics to data in a manner invariant under rotations, translations, and changes of
scale of the data. These last two properties make the algorithm nearly analogous to
the principal components method of fitting a straight line to a data scatter. However,
in contrast with the principal components fit of a line, the Bookstein algorithm
minimizes a sum of squared "errors" that are not the orthogonal Euclidean distances
from data points to the fitted conics. This fact explains why the fit of the Bookstein
algorithm may not be entirely satisfactory on data scatters that are not accurately
represented by conic sections (either because there is considerable "noise" in the
data, or because a functional form other than a conic would be better suited to the
data).
Depending on the distribution of errors about the “true” conic, a number of
different preferred criteria for the goodness-of-fit of a conic section might be
defined. In many situations an intuitively appealing goal is, indeed, the simple
minimization of the sum of squared errors of the data points measured in the
direction normal to the fitted conic (see, for example, the discussion of Gnanadesikan
[2, Sect. 2.4]). While achieving this goal poses an intractable computational
problem, a reexamination of the definition of error-of-fit for the Bookstein algorithm
suggests a simple iterative refinement of it which approximates the least sum of
squared orthogonal distances (LSOD) fit.
*Support for this research was provided in part by National Science Foundation Grants SOC76-80389
and MCS76-81435.
†Current address: Department of Statistics, GN-22, University of Washington, Seattle, Washington 98195.
We examine the Bookstein algorithm and propose its iterative refinement in
Sections 2 and 3. In Section 4 the new algorithm is demonstrated on two types of
“very scattered” or noisy data which do not precisely define the underlying smooth
curves to be modeled by conic sections.
2. THE BOOKSTEIN ALGORITHM AND ITS ERROR-OF-FIT
Choosing notation consistent with [1] we write the equation of a conic as
\[ VZ' = Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, \tag{1} \]
where \(V = (A, B, C, D, E, F)\) and \(Z = (x^2, xy, y^2, x, y, 1)\). For an arbitrary point
\((x, y)\) the error-of-fit is measured by the equation of the conic and is denoted by
\(Q(x, y) = VZ'\). For a set of data \((x_i, y_i)\), \(i = 1, 2, \ldots, n\), the algorithm consists of
minimizing the sum of squared errors
\[ \sum_{i=1}^{n} \bigl(Q(x_i, y_i)\bigr)^2 = \sum_{i=1}^{n} (VZ_i')^2 = VSV', \]
where \(Z_i = (x_i^2, x_iy_i, y_i^2, x_i, y_i, 1)\) and \(S = \sum_{i=1}^{n} Z_i'Z_i\), subject to the quadratic
normalization \(VDV' = \text{constant}\), \(D = \mathrm{diag}(1, \tfrac{1}{2}, 1, 0, 0, 0)\). Bookstein shows that this
particular normalization provides the algorithm with its invariance under rotation,
translation, and change of scale. We can also place arbitrary linear constraints on
the coefficient vector \(V\) by requiring \(VM = 0\) with \(M\) a \(6 \times p\) matrix of rank \(\le 5\).
The vector \(V\) minimizing \(VSV'\) subject to \(VDV' = \text{constant}\) and \(VM = 0\) is the
eigenvector of largest eigenvalue of the system
\[ V\bigl[I - M(M'S^{-1}M)^{-}M'S^{-1}\bigr]D - \lambda VS = 0, \tag{2} \]
where \(^{-}\) denotes any generalized inverse.
Although the solution to the problem can be expressed in terms of the generalized
eigensystem (2), for computational purposes it is preferable to implement the
method of Golub and Underwood [3], which reduces the problem to the solution of
a smaller, symmetric eigensystem.
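By way of concrete illustration, in the unconstrained case (\(M = 0\)) the system (2) reduces to the generalized eigenproblem \(DV' = \lambda SV'\), which a standard generalized eigensolver handles directly. The following Python sketch is ours, not code from the paper; the function name bookstein_fit, the use of scipy.linalg.eig, and the optional weights argument (which anticipates the refinement of Section 3) are all our own conventions.

```python
import numpy as np
from scipy.linalg import eig

def bookstein_fit(x, y, weights=None):
    """Fit a conic V Z' = 0 by the Bookstein algorithm (unconstrained case, M = 0).

    Returns V = (A, B, C, D, E, F) normalized so that A^2 + B^2/2 + C^2 = 1,
    i.e., Bookstein's invariant normalization V D V' = constant.
    """
    # Design matrix: row i is Z_i = (x_i^2, x_i*y_i, y_i^2, x_i, y_i, 1).
    Z = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    if weights is not None:
        # Scale row i by 1/q_i, so that S becomes sum_i Z_i' Z_i / q_i^2 (Section 3).
        Z = Z / weights[:, None]
    S = Z.T @ Z                                   # cross-product matrix
    D = np.diag([1.0, 0.5, 1.0, 0.0, 0.0, 0.0])   # Bookstein's constraint matrix
    # Minimizing V S V' subject to V D V' = constant leads to D v = lambda S v;
    # the eigenvector of largest (real) eigenvalue gives the fitted conic.
    evals, evecs = eig(D, S)
    V = evecs[:, np.argmax(evals.real)].real
    # Rescale to the normalization A^2 + B^2/2 + C^2 = 1.
    return V / np.sqrt(V @ D @ V)
```

For constrained fits (\(M \ne 0\)) the Golub-Underwood reduction mentioned above would take the place of the plain call to eig; the sketch handles only the unconstrained case.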
A geometric interpretation of the error-of-fit for this algorithm is given by Fig. 1
(taken from Bookstein [1, Fig. 1]). The error-of-fit for a point \((x, y)\) relative to a
given conic is proportional to \(1 - (d + d_1)^2/d^2\), where \(d + d_1\) is the distance
from the point to the center of the conic, and \(d\) is the distance from the conic to its
center along the ray from center to point.

FIG. 1. Error-of-fit for the Bookstein algorithm (Bookstein [1], Fig. 1): \(Q(x, y) \propto 1 - (d + d_1)^2/d^2\). (Panels: parabola, hyperbola.)

For completeness we note that the error-of-fit for a point \((x_0, y_0)\) separated from
a hyperbola by its asymptotes can be pictured in the same way by making reference
to the conjugate of the given hyperbola. This is illustrated in Fig. 2. The error of a
point relative to the conic \(Q(x, y) = 0\) is proportional to \(1 + (d_1 + d_2)^2/d_1^2\),
where \(d_1 + d_2\) and \(d_1\) are the distances center to point and center to conjugate
conic \(Q_2\), respectively. Furthermore, since the proportionality factor is the same for
Figs. 1 and 2, it is easy to show that \(Q(x, y) = Q_2(x, y)\) for all points \((x, y)\) on the
asymptotes to these hyperbolas. Thus, a hyperbola and its conjugate are equidistant
from their asymptotes by the measure \(Q\).

FIG. 2. Error-of-fit for the Bookstein algorithm with reference to conjugate hyperbolas: \(Q(x_0, y_0) \propto 1 + (d_1 + d_2)^2/d_1^2\).

More generally, note that contours of equal error, \(Q(x, y) = c\), are themselves
conic sections concentric with (i.e., having the same center as) \(Q(x, y) = 0\). For a
hyperbola, the contours represent a family of hyperbolas all with the same asymptotes.
From Fig. 3a it is clear that observations \((x_i, y_i)\) located where a hyperbola
approaches its asymptotes are more heavily weighted by \(Q\) than observations near
the vertex. Similarly, Fig. 3b shows that deviations from an ellipse along the "flat"
side are more heavily weighted than those at the ends, where the radius of curvature
is smallest. For this reason, when there is a considerable amount of "noise" in the
data the Bookstein algorithm, based on the measure \(Q\), may result in an unsatisfactory
fitted conic because of over-fitting the "flat" parts of conic sections relative to
the "curved" parts around vertices.

FIG. 3. Families of concentric conic sections. (a) Concentric hyperbolas. (b) Concentric ellipses.
3. ITERATIVE REFINEMENT OF THE BOOKSTEIN ALGORITHM
To see how one might compensate for the inherent weighting of errors-of-fit by
the measure \(Q\) as illustrated in Section 2, consider the concentric hyperbolas
\(Q(x, y) = 0\) and \(Q(x, y) = c_i\), \(i = 1, 2\), drawn in Fig. 4. The points \(a\) and \(b\) are the
same distance from the central hyperbola according to the measure \(Q\). For both
points, \(d'\) labels the orthogonal distance to the hyperbola and \(d''\) labels the distance
to the hyperbola measured in a direction orthogonal to the contour \(Q(x, y) = c_i\). In
most situations \(d''\) will be a good approximation to the orthogonal distance \(d'\).
(Point \(a\) is almost a worst case for the hyperbola in Fig. 4, while at point \(b\) the two
distances are clearly almost the same.)

FIG. 4. Approximation to orthogonal distance from a conic section.
The rate of change of \(Q\) at the point \(a\) in the direction orthogonal to \(Q(x, y) = c_i\)
is just the magnitude or norm of the gradient of \(Q(x, y)\) at the point \(a\). Let \(a\) have
coordinates \((x_a, y_a)\). The gradient is
\[ \nabla Q(x_a, y_a) = (2Ax_a + By_a + D)\,\mathbf{i}_x + (2Cy_a + Bx_a + E)\,\mathbf{i}_y, \]
where \(\mathbf{i}_x\) and \(\mathbf{i}_y\) are unit vectors parallel to the \(x\) and \(y\) axes. Its squared norm is
therefore
\[ |\nabla Q(x_a, y_a)|^2 = (2Ax_a + By_a + D)^2 + (2Cy_a + Bx_a + E)^2. \]
Since a linear approximation gives us
\[ Q(x_{a''}, y_{a''}) = Q(x_a, y_a) - d''\,|\nabla Q(x_a, y_a)|, \]
where \(d''\) is distance measured from \(a''\) to \(a\), a first-order approximation to \(d''\) is
\[ d'' = \frac{Q(x_a, y_a)}{|\nabla Q(x_a, y_a)|}. \]
This argument suggests that a good algorithm would minimize
\[ \sum_{i=1}^{n} \left[ \frac{Q(x_i, y_i)}{|\nabla Q(x_i, y_i)|} \right]^2. \tag{3} \]
In other words, we should weight the error \(Q\) by the inverse of the norm of its
gradient. Unfortunately, (3) is not a simple quadratic form in the coefficient vector
\(V\) and we cannot express the solution to the minimization problem analytically in
closed form.
To avoid the difficulty of numerically minimizing (3), I suggest the following
two-step procedure. First, fit a conic section by the Bookstein algorithm of Section 2.
Let this first-stage solution be \(V^{(1)} = (A_1, B_1, C_1, D_1, E_1, F_1)\). Use \(V^{(1)}\) to estimate
\(|\nabla Q(x_i, y_i)|\) at each point \((x_i, y_i)\) with
\[ q_i = \left[ (2A_1x_i + B_1y_i + D_1)^2 + (2C_1y_i + B_1x_i + E_1)^2 \right]^{1/2}. \tag{4} \]
Next, fit a second conic by the Bookstein algorithm, but using the weighted
cross-product matrix
\[ S^{(2)} = \sum_{i=1}^{n} Z_i'Z_i/q_i^2 \]
so that one minimizes
\[ VS^{(2)}V' = \sum_{i=1}^{n} \left[VZ_i'/q_i\right]^2 = \sum_{i=1}^{n} \left[Q(x_i, y_i)/q_i\right]^2. \tag{5} \]
Note that since the computed gradient is invariant under translations, rotations, and
changes of scale, this procedure remains invariant in the manner described by
Bookstein.
If the initial Bookstein fit is in a neighborhood of the LSOD solution, (4) should
provide a good approximation to the desired weights, and the vector \(V\) solving (2)
(with weighted cross-product matrix \(S^{(2)}\)) will represent a conic that closely
approximates the LSOD conic. In general it may be necessary to compute further
iterations with reweighted cross-product matrices \(S^{(k)}\) to refine the solution. This
represents an "iterative weighted least squares" algorithm similar in spirit to those
that appear in the regression literature (cf. Ramsay [4]).
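As a sketch of how the two-step procedure and its continued iteration might look in code, here is our illustration reusing the hypothetical bookstein_fit routine above; the weights implement Eq. (4), each refit minimizes (5), and the fixed iteration count is an arbitrary choice since the paper gives no general stopping rule.

```python
def gradient_weights(V, x, y):
    """Weights q_i = |grad Q(x_i, y_i)| of Eq. (4), computed from the conic V."""
    A, B, C, D, E, F = V
    return np.sqrt((2*A*x + B*y + D)**2 + (2*C*y + B*x + E)**2)

def refined_fit(x, y, n_iter=4):
    """Iteratively reweighted Bookstein fit approximating the LSOD conic."""
    V = bookstein_fit(x, y)                  # first-stage solution V^(1)
    for _ in range(n_iter):
        q = gradient_weights(V, x, y)        # estimate |grad Q| at each data point
        V = bookstein_fit(x, y, weights=q)   # refit, minimizing sum [Q(x_i,y_i)/q_i]^2
    return V
```

Each pass simply rebuilds the weighted cross-product matrix \(S^{(k)}\) from the previous conic and solves the same eigenproblem, which is exactly the fixed-point character of the scheme discussed in Section 4.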
4. APPLICATIONS OF THE ITERATIVE ALGORITHM
Figure 5 shows two scatters of data with the best-fitting conics (constrained to
pass through two end points) resulting from (A) the Bookstein algorithm and (B) the
two-step iterative refinement. Though the fitted conics are very similar, clearly conic
B fits the scatter in the vicinity of the vertex better than conic A.
The behavior of solutions resulting from further iterations of the algorithm is of
some interest. For data sets like those illustrated in Fig. 5 (see Sampson [5]), further
iterations display no appreciable differences in the fitted conics. The rapid convergence
of the coefficient vectors is shown in Table 1, where all coefficients satisfy
\(A = -D\) and \(F = 0\) since the conics are constrained to pass through \((0, 0)\) and \((1, 0)\).
FIG. 5. Conics fitted to data scatters: (A) Bookstein algorithm, (B) two-step iterative procedure.
TABLE 1
Conic Coefficients from Iterative Refinement of Bookstein Algorithm Applied to Data of Fig. 5

ID. No.   A          B          C          D          E         F
2279.2    0.684893   0.081016  -0.052821  -0.684893   0.229104  0
          0.690010   0.088135  -0.001655  -0.690010   0.200007  0
          0.689505   0.086223  -0.007828  -0.689505   0.204133  0
          0.689571   0.086336  -0.007037  -0.689571   0.203664  0
          0.689563   0.086313  -0.007132  -0.689563   0.203725  0
2142.2    0.696735  -0.029327   0.132226  -0.696735   0.103812  0
          0.694276  -0.027705   0.169476  -0.694276   0.080445  0
          0.694754  -0.028022   0.163335  -0.694754   0.084669  0
          0.694681  -0.027976   0.164306  -0.694681   0.084004  0
          0.694693  -0.027984   0.164153  -0.694693   0.084110  0
TABLE 2
Eigenvalues, Approximate Error Sums of Squares, and Eigenvectors from Bookstein Algorithm Applied to Gnanadesikan Data

Eigenvalue   Approx. Error S.S.   A        B        C        D        E        F
 91.99       0.3875              -0.6736  -0.1100   0.1956   0.6072   0.3565  -0.0075
349.39       0.7088              -0.2458   0.0486  -0.8328   0.2523  -0.4221  -0.0433
 15.14       0.7163              -0.0592   0.8790   0.0432   0.2215  -0.4066  -0.0875
In fact, the iteration of this algorithm implicitly represents a fixed-point iteration
procedure for the solution of a nonlinear system of equations \(V = g(V)\) implied by
the system (2) with weighted cross-product matrix \(S = S(V)\) as in (5). However, the
implicit system of equations is so complex that it would be extremely difficult to
determine the conditions under which such a scheme converges (cf. Isaacson and
Keller [6, Sect. 3.3]).
For very noisy data the convergence of this procedure is much less direct. An
analysis of some data generated by Gnanadesikan [2, Exhibit 9b] clearly illustrates
the complications that arise when a data scatter does not precisely define a smooth
curve. The data was generated by adding random normal disturbances to the x and y
coordinates of a set of points located exactly on a parabola.¹
Since the constraint matrix \(D\) in (2) has rank 3 there are three nontrivial solutions
to the eigensystem. It can occur that the two largest eigenvalues for this system are
nearly equal, indicating that two distinct conics (orthogonal in the coefficient space)
fit the data almost equally well with respect to the measure \(Q\). However, because of
the implicit weighting of errors by \(Q\), it can also occur that the conic corresponding
to the eigenvector of largest eigenvalue of (2) simply does not fit the data in the
expected manner and is not the LSOD solution to (2). This is the case for the
Gnanadesikan data. Table 2 gives the three nontrivial eigenvectors and eigenvalues
of (2) (with no constraints: \(M = 0\)). It also lists the approximate sum of squared
orthogonal errors for each of the three conics using the quadratic form (5), but with
weights \(q_i\) recomputed for each of the conics. The eigenvector of second largest
eigenvalue corresponds to the conic with the least sum of squared orthogonal errors.
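For reference, the approximate error sums of squares reported in Table 2 can be computed for any candidate coefficient vector by evaluating (5) with the weights recomputed from that same conic. A small sketch under the same assumptions as the code above:

```python
def approx_orthogonal_ss(V, x, y):
    """Approximate sum of squared orthogonal errors, sum_i [Q(x_i, y_i)/q_i]^2,
    with the weights q_i recomputed from the conic V itself (cf. Eq. (5))."""
    Z = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    Q = Z @ V                          # residuals Q(x_i, y_i) = V Z_i'
    q = gradient_weights(V, x, y)      # gradient norms for this same conic
    return float(np.sum((Q / q)**2))
```

Comparing this quantity across the three nontrivial eigenvectors is what identifies the conic of second largest eigenvalue as the best candidate here.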
The three conic sections are plotted with the original data scatter in Fig. 6. None
of the three represents an adequate model for the data. We can see that the conic of
largest eigenvalue fits the concentration of points at the bottom of Fig. 6 with the
flat side of an ellipse, somewhat at the expense of the rest of the data scatter.
FIG. 6. Three conic solutions to Eq. (2) for the Gnanadesikan data (labeled in the order of Table 2).

¹The precise coordinates used in our analysis were digitized by hand from Gnanadesikan's exhibit.

For cases like this it is unclear how, or whether, one should proceed without
recourse to some subjective or a priori knowledge about the problem at hand: that
is, which conic is "closest" to the neighborhood of the LSOD conic. The iterative
algorithm proposed here cannot be expected to succeed if it is not based on an initial
conic in this neighborhood. One might, for example, be able to specify the
approximate direction of the major axis desired of the fitted conic and proceed by
choosing that conic (eigenvector) most in agreement with this axis as a basis for the
computation of the weights \(q_i\) for the next iteration. Or, one could simply choose the
conic with the least approximate sum of squared orthogonal errors as the basis for
the next iteration. For the Gnanadesikan data these two approaches coincide if one
is within a "reasonable neighborhood" of the LSOD conic. However, in this case it is
necessary to rely on subjective or a priori information about the major axis to guide
the early iterations, beginning with conics that, as shown in Fig. 6, are not too similar
to intuitive expectations about what the LSOD conic should be.
For the data at hand the major axis appears to be roughly vertical rather than
horizontal, and consequently it appears that the second conic, a hyperbola, is more
in agreement with the expected LSOD fit in this sense. Beginning with the second
eigenvector as a basis for the computation of the weights \(q_i\) in (4), and choosing the
solutions to (2) most in agreement with our subjective assessment of the true major
axis as bases for further iterations, we do converge to a satisfactory fitted conic. The
sequence of conic coefficients, along with the approximate sums of squared orthogonal
errors, is listed in Table 3. Selected conics in this sequence are plotted in Fig. 7.
A number of remarks about this example are in order. For every iteration except
the third, the conic chosen for continued iterations was the solution to (2) (with
weighted cross-product matrix \(S^{(k)}\) and \(M = 0\)) of least approximate sum of squared
orthogonal errors and second largest eigenvalue. As the sequence of conics switches
from hyperbolas to ellipses at the third iteration, the discrepancy is not surprising.
We should note that for a given set of \(q_i\), the \(Q(x_i, y_i)/q_i\) approximate orthogonal
distances to a conic only for conics like the one upon which the \(q_i\) were based.
For other conics these numbers have no geometric interpretation, so it is possible to
find conics minimizing (5), or solving (2) with weighted cross-product matrices, that
do not fit the data at all. On the other hand, if there exists a unique LSOD solution,
then a sequence of solutions to (2) can be expected to converge to it. It is therefore
not unreasonable that our sequence of conics almost always corresponds to the
second largest eigenvalue.

FIG. 7. Sequence of conic sections from iterative refinement of the Bookstein algorithm applied to the Gnanadesikan data (labeled in the order of Table 3).
Finally, note that the sequence of conics of Table 3 and Fig. 7 is comprised of two
alternating subsequences converging together, but the limit of this sequence is not
exactly the LSOD conic. The approximate sums of squared orthogonal errors begin
to increase slightly after the sixth iteration. This too is reasonable, since the solution
to (2) with \(S = S(V)\) does not represent the minimization of \(VS(V)V'\) subject to
\(VDV' = \text{constant}\); rather it represents a heuristically appealing and computationally
feasible approximation to the minimization problem. For this reason we should take
the conic at the sixth iteration as our best approximation to the LSOD conic.
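A practical consequence, sketched below under the same assumptions as the earlier code (the tracking rule is our suggestion, not a procedure from the paper), is to monitor the approximate error sum of squares at each iteration and keep the iterate that minimizes it:

```python
def refined_fit_tracked(x, y, n_iter=12):
    """Iteratively reweighted fit returning the iterate of least approximate
    orthogonal error, mirroring the choice of the sixth conic in Table 3."""
    V = bookstein_fit(x, y)
    best_V, best_ss = V, approx_orthogonal_ss(V, x, y)
    for _ in range(n_iter):
        q = gradient_weights(V, x, y)
        V = bookstein_fit(x, y, weights=q)
        ss = approx_orthogonal_ss(V, x, y)
        if ss < best_ss:                  # keep the best iterate seen so far
            best_V, best_ss = V, ss
    return best_V
```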
TABLE 3
Approximate Error Sums of Squares and Conic Coefficients from Iterative Refinement of Bookstein Algorithm Applied to Gnanadesikan Data

Approx. Error S.S.   A        B        C        D        E        F
0.38751             -0.6736  -0.1100   0.1956   0.6072   0.3565  -0.0075
0.32714             -0.5640  -0.2058   0.3816   0.4741   0.5179   0.0217
0.41891             -0.5767  -0.1321  -0.5767   0.4994  -0.2424  -0.0957
0.21886             -0.7154  -0.2061  -0.2407   0.6073   0.1334  -0.0332
0.23739             -0.6691  -0.1380  -0.4294   0.5883  -0.0253  -0.0470
0.20732             -0.7121  -0.1429  -0.2535   0.6290   0.1055  -0.0375
0.22554             -0.6813  -0.1154  -0.3892   0.6073  -0.0036  -0.0469
0.20735             -0.7106  -0.1316  -0.2639   0.6308   0.0936  -0.0390
0.22253             -0.6855  -0.1143  -0.3749   0.6117   0.0068  -0.0463
0.20821             -0.7091  -0.1298  -0.2728   0.6299   0.0868  -0.0396
0.22091             -0.6882  -0.1160  -0.3657   0.6139   0.0144  -0.0456
0.20905             -0.7078  -0.1292  -0.2803   0.6289   0.0814  -0.0400
0.21971             -0.6903  -0.1176  -0.3584   0.6154   0.0206  -0.0451
0.20979             -0.7067  -0.1287  -0.2866   0.6280
5. CONCLUSIONS
The examples of Section 4 demonstrate how a simple iterative refinement of the
Bookstein algorithm can be used to approximate the least sum of squared orthogonal
errors fit of any conic section to a scatter of data. If a data scatter is obtained by
digitizing the coordinates of points located along a clearly defined outline, as in the
examples considered by Bookstein [1], the original algorithm will need no refinement
unless a conic section model is inappropriate. However, when the data are “very
scattered” in the sense that data points are distributed rather widely about an
underlying outline, the iterative refinement results in a much improved fit.
The examples in this study also demonstrate the limitations of the iterative
procedure proposed here. If the fitted conics are constrained in the manner of the
first example of Section 4, our experience suggests that a small number of iterations
will result in an adequate solution. However, for unconstrained conics and noisy
data scatters it may not always be possible to run the iterative procedure without
intervention guided by a priori information about the orientation of the "true"
conic. This problem reflects the fact that we do not have analytical results stating the
conditions under which the fixed-point iteration scheme converges. Once a conic in
an undetermined neighborhood of the LSOD conic is obtained, the iterative procedure
converges reasonably well if the sequence of solutions to (2) of least approximate
sum of squared orthogonal errors is followed.
REFERENCES
1. F. L. Bookstein, Fitting conic sections to scattered data, Computer Graphics and Image Processing 9, 1979, 56-91.
2. R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations, Wiley, New York, 1977.
3. G. H. Golub and R. Underwood, Stationary values of the ratio of quadratic forms subject to linear constraints, Z. Angew. Math. Phys. 21, 1970, 318-326.
4. J. O. Ramsay, A comparative study of several robust estimates of slope, intercept, and scale in linear regression, J. Amer. Statist. Assoc. 72, 1977, 608-615.
5. P. D. Sampson, Statistical Analysis of Shape with Conic Sections and Elliptical Distributions on a Hypersphere, Technical Report No. 110, Dept. of Statistics, University of Chicago, 1980.
6. E. Isaacson and H. B. Keller, Analysis of Numerical Methods, Wiley, New York, 1966.