
Automatic Localization of Anatomical Point Landmarks for Brain

Image Processing Algorithms

Scott C. Neu and Arthur W. Toga

Department of Neurology, UCLA Laboratory of Neuro Imaging, David Geffen School of Medicine,

Suite 225, 635 Charles Young Drive South, Los Angeles, CA 90095-7334, USA

Arthur W. Toga: toga@loni.ucla.edu

Abstract

Many brain image processing algorithms require one or more well-chosen seed points because

they need to be initialized close to an optimal solution. Anatomical point landmarks are useful for

constructing initial conditions for these algorithms because they tend to be highly-visible and

predictably-located points in brain image scans. We introduce an empirical training procedure that

locates user-selected anatomical point landmarks within well-defined precisions using image data

with different resolutions and MRI weightings. Our approach makes no assumptions on the

structural or intensity characteristics of the images and produces results that have no tunable run-

time parameters. We demonstrate the procedure using a Java GUI application (LONI ICE) to

determine the MRI weighting of brain scans and to locate features in T1-weighted and T2-

weighted scans.

Keywords

Anatomical point landmark; Automation; Singular value decomposition; Least-squares; Neural

network; Multi-resolution; Seed points

Introduction

For many brain image processing algorithms, determining a good seed point is crucial. Such

algorithms include snakes and active contours (Kass et al. 1987; Daneels et al. 1993;

McInerney and Dehmeshki 2003), level sets (Sethian 1999), and others (Cox et al. 1996)

which often produce different results when started with different initial conditions. This

occurs because these algorithms typically search out and find local-minima solutions; that is,

in their solution-spaces they evolve towards the minimum that is closest to their starting

point. Software programs that utilize these algorithms must then either require users to

manually select starting points (Meyer and Beucher 1990; MacDonald et al. 1994;

Yushkevich et al. 2006) or incorporate manual interaction into their implementations

(Barrett and Mortensen 1997; Mortensen and Barrett 1995; Falco et al. 1998; Boykov and

Jolly 2001). There are currently many software packages available for brain image

processing that require user intervention to perform image registration,1 fiber tracking in

diffusion tensor images,2 interactive labeling of sulci,3 image segmentation,4 and visual

© Humana Press 2008

Correspondence to: Arthur W. Toga, toga@loni.ucla.edu.

1http://www.imaging.robarts.ca/Software/register_tutorial/register.html.

2http://www.ia.unc.edu/dev/download/fibertracking/index.htm

3http://www.bic.mni.mcgill.ca/~georges/Seal_Reference/seal_howto.html

NIH Public Access

Author Manuscript

Neuroinformatics. Author manuscript; available in PMC 2011 June 13.

Published in final edited form as:

Neuroinformatics. 2008 ; 6(2): 135–148. doi:10.1007/s12021-008-9018-x.

NIH-PA Author Manuscript


image processing tasks.5 Unfortunately, for large-scale brain studies these semi-automatic

approaches become extremely time-consuming.

Anatomical point landmarks (Rohr 2001) are useful for constructing initial conditions

because they tend to be highly-visible and predictably-located points in image scans. This is

particularly true for brain structures, which appear roughly regular in shape, size, and

location. Current approaches that report on the detection of anatomical point landmarks are

limited to a few well-defined brain landmarks or a specific modality, make restrictive

assumptions about image intensity distributions or image structures, and/or require intensive

computational time to execute. In this paper, we describe a method that requires no prior

knowledge of brain structures or assumptions on intensity distributions, and uses an example

set of manually selected anatomical point landmarks to calculate fitting parameters using

singular value decomposition (Kalman 1996). Singular value decomposition is a widely-

used technique in image analysis and has been used in areas such as image compression

(O’Leary and Peleg 1983), digital signal processing (Hansen and Jensen 1998;

Konstantinides and Yovanof 1995; Konstantinides et al. 1997), facial recognition (Sirovich

and Kirby 1987; Turk and Pentland 1991), texture analysis (Chen and Pavlidis 1983; Davis

et al. 1979), image restoration (Huang and Narendra 1974), and watermarking (Gorodetski

et al. 2001; Chandra 2002).

In previous work, differential operators have been applied (Le Briquer et al. 1993; Rohr

1997; Hartkens et al. 2002) to images of the brain to locate prominent points where

anatomical structures are strongly curved. Because this produces multiple points, a

manually-adjusted threshold is usually required to reduce the number of points that do not

coincide with neurologically-recognized point landmarks. The computation of partial

derivatives makes this approach relatively sensitive to image noise and small intensity

variations. Deformable and parametric intensity models have been used to detect tip- and

saddle-like structures in images of the head (Frantz et al. 2000; Alker et al. 2001; Wörz et al.

2003) by fitting model parameters to the image data through optimization of edge-based

measures. But these approaches have not been extended to other brain structure types. Other

methods rely more heavily on assumptions of brain geometry and intensity characteristics

(Han and Park 2004) and apply image intensity models to compute probability maps

(Fitzpatrick and Reinhardt 2005) that must be reduced for computational reasons. Linear

registration approaches (Woods et al. 1998) and algorithms (Pohl et al. 2002; Fischl et al.

2002) that aim to label entire anatomical structures in the brain can be used to produce point

landmarks, but their execution times are typically longer (20–30 min) than those of algorithms

specifically designed to locate anatomical point landmarks (10–20 s).

In the field of image registration (Toga 1999), there are landmark-based methods that bring

image volumes from different imaging devices into spatial alignment for image-guided

neurosurgery (Peters et al. 1996) and therapy planning, as well as for visualization and

quantitative analysis. The anatomical point landmarks are usually identified by the user

interactively (Hill et al. 1991; Ende et al. 1992; Fang et al. 1996; Strasters et al. 1997), but

there are approaches that create surfaces from brain image volumes and then use differential

geometry to automatically extract extremal points (Thirion 1994; Pennec et al. 2000). These

methods usually minimize the distances between corresponding point landmarks with a

least-squares approach (Arun et al. 1987) or an iterative transformation application (Besl and

McKay 1992).

4http://www.itk.org/ItkSoftwareGuide.pdf

5http://rsb.info.nih.gov/ij/docs/intro.html


The detection of anatomical point landmarks can be of particular use in the removal of facial

features from brain images. A subject’s identity can sometimes be discovered from a three-

dimensional surface reconstruction of the subject’s head, which may lead to violations of

HIPAA (Health Insurance Portability and Accountability Act) regulations and IRB

(Institutional Review Board) requirements that protect subject confidentiality. To address

these concerns, applications have been developed that black-out (i.e., replace with black

voxels) (Bischoff-Grethe et al. 2004) or warp (Shattuck et al. 2003) facial features in MRI

image volumes. Currently these applications require significant processing time, can only be

directly applied to T1-weighted images, and run the risk (Shattuck et al. 2003) of altering the

imaged brain anatomy.

Methods

Our methodology consists of constructing a series of functions that sequentially locates a

user-selected anatomical point landmark on a two-dimensional image slice or three-

dimensional image volume. Starting at the image/volume center, each function in the series

takes as input the output point of the previous function. The output point is closer to the

anatomical point landmark than the input point and is known with finer precision. The last

function in the series is constructed when the addition of another function yields no

improvement in the precision of locating the anatomical point landmark.
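
The chaining of functions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LONI ICE implementation: `locate_landmark` and `make_toy_stage` are hypothetical names, and each toy stage simply moves a fixed fraction of the way toward a known target, whereas a trained function would compute its output from image intensities.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Stage = Callable[[Point], Point]  # hypothetical type: one function in the series


def locate_landmark(center: Point, stages: Sequence[Stage]) -> Point:
    """Feed the output point of each stage into the next, starting at the image center."""
    point = center
    for stage in stages:
        point = stage(point)
    return point


def make_toy_stage(target: Point, fraction: float) -> Stage:
    """Toy stand-in for a trained function: move a fixed fraction of the way to `target`."""
    def stage(point: Point) -> Point:
        return tuple(p + fraction * (t - p) for p, t in zip(point, target))
    return stage


# Three toy stages, each halving the remaining distance to an assumed landmark.
stages = [make_toy_stage((180.0, 60.0), 0.5) for _ in range(3)]
estimate = locate_landmark((128.0, 128.0), stages)
```

In this toy setting the remaining distance shrinks by the same factor at every stage; in the procedure described here, the series ends once an additional function no longer improves the precision.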

Data

Data used in this paper were obtained from the Alzheimer’s Disease Neuroimaging Initiative

(ADNI) database. ADNI is the result of efforts of many co-investigators from a broad range

of academic institutions and private corporations, and subjects have been recruited from

over 50 sites across the U.S. and Canada. The initial goal of ADNI was to recruit 800 adults,

ages 55 to 90, to participate in the research: approximately 200 cognitively normal older

individuals to be followed for 3 years, 400 people with MCI to be followed for 3 years, and

200 people with early AD to be followed for 2 years.

The different scanner manufacturers and models, pulse sequences, and spatial sizes and

resolutions of the MRI (magnetic resonance imaging) data we obtained from the ADNI

database are shown in Table 1. Because of the heterogeneity of these acquisition

properties, we list only the most prevalent combinations. The subjects that were scanned

were between the ages of 60 and 90 (average age 73) and about 60% were female. Image

slices were acquired in both the axial (ax) and sagittal (sag) planes using different pulse

sequences [gradient echo (GR), spin echo (SE), inversion recovery (IR), and propeller MRI

(RM)] and different magnetic field strengths (1.5 and 3 Tesla). No structural abnormalities

(e.g., brain tumors) were present. We reoriented the image volumes so that the subjects are

facing the same direction and spatially normalized them into volumes of 256 × 256 × 256

voxels with 1 × 1 × 1 mm3 voxel spacings.

Terminology

We empirically determine the values of unknown constants in our functions using a training

set of image data. A testing set is then used to evaluate the trained functions using image

data that was not present during training. We typically use training sets that are two to ten

times larger than the testing sets.

As is illustrated in Fig. 1, a lattice and grid are used to extract data from each image slice/

volume in the training set. The grid contains grid cells and the intensities of all the image

pixels/voxels inside each grid cell are averaged together and normalized across all the cells.

Training data is extracted by placing the center of the grid at each equally-spaced lattice


point and then averaging the image intensities inside the grid cells. We typically use a lattice

with a 5–7 pixel/voxel spacing and have not observed better results using smaller spacings.

As long as the size of each grid cell is much larger than the lattice spacing, we expect results

close to those that use all the points between the lattice points.
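
As a rough illustration of the lattice and grid extraction, the NumPy sketch below places a grid at each lattice point of a two-dimensional image and averages the pixels inside each grid cell. The function names, the 8 × 8 pixel cell size, the zero padding outside the image, and the normalization (dividing the cell averages by their sum) are all assumptions made for this example; the text does not specify them.

```python
import numpy as np


def lattice_points(center, extent, spacing=5):
    """Equally spaced (row, col) lattice points covering `extent` pixels around `center`."""
    rows = range(center[0] - extent[0] // 2, center[0] + extent[0] // 2 + 1, spacing)
    cols = range(center[1] - extent[1] // 2, center[1] + extent[1] // 2 + 1, spacing)
    return [(r, c) for r in rows for c in cols]


def grid_features(image, center, grid_shape=(5, 5), cell_size=(8, 8)):
    """Normalized mean intensity of each grid cell for a grid centered at `center`.

    Assumes `center` lies inside the image; grid cells that extend past the
    image border see zero padding (an assumption of this sketch).
    """
    rows, cols = grid_shape
    ch, cw = cell_size
    half_h, half_w = rows * ch // 2, cols * cw // 2
    padded = np.pad(image.astype(float), ((half_h, half_h), (half_w, half_w)))
    # In padded coordinates the grid's top-left corner sits at `center`.
    r0, c0 = center
    patch = padded[r0:r0 + rows * ch, c0:c0 + cols * cw]
    # Split the patch into cells and average the pixels inside each cell.
    means = patch.reshape(rows, ch, cols, cw).mean(axis=(1, 3)).ravel()
    total = means.sum()
    return means / total if total > 0 else means


# Synthetic image standing in for one slice of the training set.
image = np.random.default_rng(0).random((64, 64))
points = lattice_points((32, 32), extent=(20, 20), spacing=5)
features = np.array([grid_features(image, p) for p in points])  # one row per lattice point
```

Each row of `features` holds the normalized cell averages for one lattice point, so stacking the rows over all training images gives one matrix with a row per lattice point and a column per grid cell.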

Although a subject’s head is normally well-centered in a brain scan, it is usually not true that

the same image voxel from two different scans will correspond to the same anatomical

point. However, in general the image voxels will be close to some common anatomical

point. If one draws a box large enough around each image voxel, that common anatomical

point will be contained by both boxes. The boxes in this case represent the initial precision;

i.e., how well the anatomical point is known in both scans. The initial precision of the input

point is a required input parameter for the first function in the series and is dependent upon

the scale on which dissimilarities occur between images. The final precision of the output

point of the last function is always limited by structural variations between subjects and

noise artifacts in the image data.

Mathematical Model

We train each function in the series by defining a lattice that fills the space that represents

the precision of the input to that function. For the first function, the center of the lattice is

placed at the center of each image/volume, and is large enough to cover the initial precision.

For the next functions in the series, the center of the lattice is placed at the user-selected

anatomical point landmark (for example, the center of the subject’s right eye) and the lattice

is given a size equal to the precision of the output from the previous function.

Our objective is to train the function to move every lattice point closer to the user-selected

anatomical point landmark on each image/volume of the training set. The space into which

the lattice points are moved represents the output precision of the function. The grid is

centered at each lattice point and the image intensities inside each grid cell are computed for

training. We assume that the horizontal and vertical distances from each lattice point to the

anatomical point landmark are linearly correlated with the image intensities (see the

Appendix) through

$$\sum_{j=1}^{n} I_{ij}\, a_j^k = D_i^k, \qquad i = 1, \ldots, m \tag{1}$$

where m is the total number of lattice points in the training set, n is the number of grid cells,

$I_{ij}$ is the average of the image intensities in the jth grid cell at the ith lattice point, and the

$a_j^k$ are unknown coefficients that are to be solved for. Each $D_i^k$ represents the distance along the

kth dimension from the ith lattice point to the anatomical point landmark. In Fig. 2, there is

one set of Eq. 1 for the horizontal distances $D_i^1$ and one set for the vertical distances $D_i^2$.

Equations 1 can also be written in matrix form as

$$I \cdot A^k = D^k \tag{2}$$

where I is an m × n matrix and $A^k$ and $D^k$ are vectors of size n and m, respectively. Since we

choose the total number of lattice points to be greater than the number of grid cells (m > n),


Eq. 1 is an overdetermined set of linear equations. This being the case, we can use singular

value decomposition (Press et al. 1995) to obtain a least-squares solution to Eq. 2. The

singular value decomposition of the matrix I yields6

$$I = U \cdot W \cdot V^T \tag{3}$$

where U is an m × n column-orthogonal matrix with $U^T \cdot U = 1$, V is an n × n orthogonal

matrix with $V \cdot V^T = 1$, and W is an n × n matrix whose off-diagonal elements are zero and

whose diagonal elements (the singular values $w_j$) are non-negative:

$$W = \mathrm{diag}(w_1, w_2, \ldots, w_n), \qquad w_j \geq 0 \tag{4}$$

The inverse of W is simply a matrix with the reciprocals of the singular values,

$$W^{-1} = \mathrm{diag}\left(1/w_1, 1/w_2, \ldots, 1/w_n\right) \tag{5}$$

for which any diagonal element may be set to zero if it is numerically ill-conditioned (Press

et al. 1995). Substituting Eq. 3 into Eq. 2 and using the orthogonality relations for U and V

and Eqs. 4 and 5 gives the least-squares solution for the unknown coefficients

$$A^k = V \cdot W^{-1} \cdot U^T \cdot D^k \equiv I^{-1} \cdot D^k \tag{6}$$

It is only necessary to calculate the generalized inverse matrix $I^{-1}$ once; the same matrix is

used to compute the coefficients for each dimension. Note that Eq. 6 is not an exact solution

of Eq. 2 but is a solution only in the least-squares sense; I is a non-square matrix and

although $I^{-1} \cdot I = 1$, in general $I \cdot I^{-1} \neq 1$.
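
The least-squares solution can be checked numerically with a short NumPy sketch. It builds the generalized inverse from the singular value decomposition, sets the reciprocals of ill-conditioned singular values to zero, and reuses the same inverse for every dimension. The function name `generalized_inverse`, the relative cutoff `rcond`, and the synthetic matrix sizes are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def generalized_inverse(I, rcond=1e-10):
    """Generalized inverse V @ W^-1 @ U^T of an (m, n) matrix I with m > n."""
    # Thin SVD: U is (m, n), w holds the n singular values, Vt is V transposed.
    U, w, Vt = np.linalg.svd(I, full_matrices=False)
    # Reciprocals of the singular values; ill-conditioned ones are set to zero.
    # The relative cutoff `rcond` is an assumed choice for this sketch.
    w_inv = np.zeros_like(w)
    keep = w > rcond * w.max()
    w_inv[keep] = 1.0 / w[keep]
    return Vt.T @ np.diag(w_inv) @ U.T


# Synthetic stand-in for training data (assumed sizes: m = 200, n = 25).
rng = np.random.default_rng(0)
I = rng.random((200, 25))          # grid-cell intensities, one row per lattice point
A_true = rng.normal(size=(25, 2))  # hypothetical coefficients, one column per dimension
D = I @ A_true                     # distances constructed to satisfy the linear model

I_inv = generalized_inverse(I)     # computed once
A = I_inv @ D                      # reused for every dimension (each column of D)

left_is_identity = np.allclose(I_inv @ I, np.eye(25))    # n x n product
right_is_identity = np.allclose(I @ I_inv, np.eye(200))  # m x m product
```

Because the synthetic distances are built from the linear model without noise, `A` should agree with `A_true` up to rounding; with real image data the fit holds only in the least-squares sense. The two flags at the end compare the n × n and m × m products of `I` and its generalized inverse with the identity.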

Two-Dimensional Results

We illustrate our approach in two dimensions by training a sequence of functions to locate

the center of each subject’s right eye in a two-dimensional image set. Each image slice in the

two-dimensional set is a T1-weighted MRI image that shows the subject’s right eye. We

used a training set of 64 image slices and a testing set of 32 image slices (each image slice

from a different subject). The center of each subject’s right eye (the white square in the

lower-right corner of Fig. 1) was manually located in each image in the training and testing

sets.

Figure 3 and Figure 4 illustrate the new locations of the lattice points on an image in the

testing set when the lattice points are moved in accordance with Eqs. 1 and 6. A 5 × 5 grid

of cells spanning an area equal to the image size was used to compute the intensities $I_{ij}$. Most of the points in

Fig. 4 fit inside a rectangle that is about one half of the lattice size. Many of the points that

fall outside the rectangle were originally located near the edges of the lattice. We can

6The superscript T denotes the transpose of a matrix.
