
International Journal of Computer Vision 55(2/3), 85–106, 2003

© 2003 Kluwer Academic Publishers. Manufactured in The Netherlands.

Deformable M-Reps for 3D Medical Image Segmentation

STEPHEN M. PIZER, P. THOMAS FLETCHER, SARANG JOSHI, ANDREW THALL,

JAMES Z. CHEN, YONATAN FRIDMAN, DANIEL S. FRITSCH, A. GRAHAM GASH,

JOHN M. GLOTZER, MICHAEL R. JIROUTEK, CONGLIN LU, KEITH E. MULLER,

GREGG TRACTON, PAUL YUSHKEVICH AND EDWARD L. CHANEY

Medical Image Display & Analysis Group, University of North Carolina, Chapel Hill

Received October 26, 2001; Revised September 13, 2002; Accepted June 27, 2003

Abstract. M-reps (formerly called DSLs) are a multiscale medial means for modeling and rendering 3D solid geometry. They are particularly well suited to model anatomic objects and in particular to capture prior geometric information effectively in deformable models segmentation approaches. The representation is based on figural models, which define objects at coarse scale by a hierarchy of figures, each figure generally a slab representing a solid region and its boundary simultaneously. This paper focuses on the use of single figure models to segment objects of relatively simple structure.

A single figure is a sheet of medial atoms, which is interpolated from the model formed by a net, i.e., a mesh or chain, of medial atoms (hence the name m-reps), each atom modeling a solid region via not only a position and a width but also a local figural frame giving figural directions and an object angle between opposing, corresponding positions on the boundary implied by the m-rep. The special capability of an m-rep is to provide spatial and orientational correspondence between an object in two different states of deformation. This ability is central to effective measurement of both geometric typicality and geometry to image match, the two terms of the objective function optimized in segmentation by deformable models. The other ability of m-reps central to effective segmentation is their ability to support segmentation at multiple levels of scale, with successively finer precision. Objects modeled by single figures are segmented first by a similarity transform augmented by object elongation, then by adjustment of each medial atom, and finally by displacing a dense sampling of the m-rep implied boundary. While these models and approaches also exist in 2D, we focus on 3D objects.

The segmentation of the kidney from CT and the hippocampus from MRI serve as the major examples in this paper. The accuracy of segmentation as compared to manual, slice-by-slice segmentation is reported.

Keywords: segmentation, medial, deformable model, object, shape, medical image

1. Introduction

Segmentation via deformable models has shown the

advantage of allowing the expected geometric con-

formation of objects to be expressed (Cootes, 1993;

Staib, 1996; Delingette, 1999; among others, also see

McInerny, 1996 for a survey of active surfaces methods). The basic formulation is to represent an object by

a set of geometric primitives and to deform the object

by changing the values of the primitives to optimize an

objective function including a match of the deformed

object to the image data. Either the objective function

also includes a term reflecting the geometric typical-

ity of the deformed object, or the deformation is con-

strained to objects with adequate geometric typicality.

In our work the objective function is the sum of a geo-

metric typicality term and a geometry to image match

term.

The most common geometric representation in the literature of segmentation by deformable models has been a mesh of boundary locations. The hypothesis described and tested in this paper is that improved segmentations will result from using a representation that is at multiple levels of scale and that at all but the finest levels of scale is made from meshes of medial atoms. We will show that this geometric representation, which we call m-reps, has advantages in measuring both the geometric typicality and the geometry to image match, in providing the efficiency advantages of segmentation at multiple scales, and in characterizing the object as an easily deformable solid.

Figure 1. A 2D illustration of (left) the traditional view of the medial locus of an object as a sheet of disks (spheres in 3D) bitangent to the object boundary and our equivalent view (right) as an m-rep: a curve (sheet in 3D) of hubs at the sphere center and equal length spokes normal to object boundary. The locus of the spoke ends forms the medially implied boundary.

Many authors, in image analysis, geometry, human

vision, computer graphics, and mechanical modeling,

have come to the understanding, originally promulgated by Blum (1967), that the medial relationship¹

between points on opposite sides of a figure (Fig. 1) is

an important factor in the object’s geometric descrip-

tion. Biederman (1987), Marr (1978), Burbeck (1996),

Leyton (1992), Lee (1995), and others have produced

psychophysical and neurophysiological evidence for

the importance of medial relationships (in 2D projec-

tion) in human vision. The relation has also been ex-

plored in 3D by Nackman (1985), Vermeer (1994),

and Siddiqi (1999), and medial axis modeling tech-

niques have been applied by many researchers, includ-

ing Bloomenthal (1991), Wyvill (1986), Singh (1998),

Amenta (1998), Bittar (1995), Igarashi (1999) and

Markosian (1999). Of these, Bloomenthal and Wyvill

provided skeletal-based soft-objects; Singh provided

medial (wire-based) deformations; Amenta and Bittar

worked on medially based reconstruction; Igarashi

used a medial spine in 2D to generate 3D surfaces

from sketched outlines; and Markosian used implicit

surfaces generated by skeletal polyhedra.

One of the advantages of a medial representation

is that it allows one to distinguish object deforma-

tions into along-object deviations, namely elongations

and bendings, and across-object deviations, namely

bulgings and attachment of protrusions or indentations. An additional advantage is that distances, and thus spatial tolerances, can be expressed as a fraction of medial

width. These properties allow positions and orienta-

tions to be followed through deformations of elonga-

tion, widening, or bending. Because geometric typi-

cality requires comparison of corresponding positions

of an object before and after deformation and because

geometry to image match requires comparison of intensities at corresponding positions, this ability to provide

what we call a figural coordinate system is advanta-

geous in segmentation by deformable models.

Medial representations divide a multi-object complex into objects and objects into figures, i.e., slabs with

an unbranching medial locus (see Fig. 1). In the fol-

lowing we also show how they naturally divide figures

into figural sections, and how by implying a bound-

ary they aid in dividing the boundaries of these fig-

ural sections into smaller boundary tiles. This natu-

ral subdivision into the very units of medical interest

provides the opportunity for segmentation at multiple

levels of scale, from large scale to small, that pro-

vides at each scale a segmentation that is of smaller

tolerance than the previous, just larger scale. Such a

hierarchical approach was promulgated by Grenander

(1981). Such a multi-scale-level approach is required

for a segmentation that operates in time linear in the

number of the smallest scale geometric elements, here

the boundary tiles. The fact that at each level the units

are geometrically related to the units of relatively uni-

form tissue properties yields effective and efficient

segmentations.

Our m-reps representation described in Pizer (1999)

and Joshi (2001) (in the first reference called DSLs)

reverses the notion of medial relations descended from

Blum (1967) from a boundary implying a medial de-

scription to a mesh of medial atoms implying bound-

aries, i.e., from an unstable to a stable relation. The

radius-proportional ruler and the need to have locality

at the scale of the figural section require it to use a

width-proportional sampling of the medial surface in

place of a continuous medial sheet.² These latter properties follow from the desire directly to represent shape, i.e., object geometry of some locality that is similarity transform invariant. The specifics are given later in this section.

M-reps also extend the medial description to the inclusion of a width-proportional tolerance, providing opportunities for stages of the representation with successively smaller tolerances. Representations with large tolerance can ignore detail and focus on gross shape, and in these large-tolerance stages discrete samplings can be coarse, resulting in considerable efficiency of manipulation and presentation. Smaller-tolerance stages can focus on refinements of the larger-tolerance stages and thus more local aspects.

Figure 2. M-reps: In the 2D example (left) there are 4 figures: a main figure, a protrusion, an indentation, and a separate object. Each figure is represented by a chain of medial atoms. Certain medial atoms in a subfigure are interfigurally linked (dashed lines on the left) to their parent figures. In the 3D example of a hippocampus (middle) there is one figure, represented by a mesh of medial atoms. Each hub with two line segment spokes forms a medial atom (Fig. 3). The mesh is viewed from two directions, and the renderings below show the boundary implied by the mesh. The example on the right shows a 4-figure m-rep for a cerebral ventricle.

As described in Pizer (1999) and Joshi (2001), as

a result of the aforementioned requirements an m-rep

model of an object is a representation (data structure)

consisting of a hierarchy of linked m-rep models for

single figures (Fig. 2). A model for a single figure is

made from a net (mesh or chain) of medial atoms (hence the name m-reps), each atom (Fig. 3) designating not only a medial position x and width r, but also a local figural frame F implying figural directions, and the

object’s local narrowing rate, given by an object angle

θ between opposing, corresponding positions on the

implied boundary. In addition, width proportionality

constants indicate net link length, boundary tolerance,

boundary curvature limits, and, for measuring the fit of

the atom to a 3D image, an image interrogation aper-

ture. As detailed in later papers, a multifigure model

of an object consists of a directed acyclic graph of fig-

ure nets, with interfigural links capturing information

about subfigural location along the parent figure’s me-

dially implied boundary, figural width relative to the

parent figure, and subfigural orientation relative to the

parent figure. The elements of the figural graph also

contain boundary displacement maps that can be used

to give fine scale to the model.

Sometimes one wishes to represent and then segment multiple disconnected objects at the same time. An example is the cerebral ventricles, hippocampus, and caudate in which the structures are related but one is not a protrusion or an indentation on another. Another example is the pair of kidneys and the liver. In our system these can be connected by one or more connections between the representations of the respective objects, allowing the position of one figure to predict boundary positions of the other. This matter is left to a paper covering the segmentation of multi-object complexes (Fletcher, 2002).

Figure 3. Medial atoms, made from a position x and two equal length boundary-pointing arrows $\vec{p}$ and $\vec{s}$ (for “port” and “starboard”), which we call “spokes”. The atom on the left is for an internal mesh position, implying two boundary sections. The atom on the right is for a mesh edge position, implying a section of boundary crest. The atoms are shown in the “atom-plane” containing x, $\vec{p}$ and $\vec{s}$. An atom is represented by the medial hub position x; the length r of the boundary-pointing arrows; a frame made from the unit-length bisector $\vec{b}$ of $\vec{p}$ and $\vec{s}$, the $\vec{b}$-orthogonal unit vector $\vec{n}$ in the atom plane, and the complementary unit vector $\vec{b}^{\perp}$; and the “object angle” θ between $\vec{b}$ and each spoke. For a slab-like section of figure, $\vec{p}$ and $\vec{s}$ provide links between the medial point and the implied boundary (shown as a narrow curve), giving approximations, with tolerance, to both its position and its normal. The implied figure section is slab-like and centered on the head of the atom’s spokes, i.e., it is extended in the $\vec{b}^{\perp}$ direction just as it is illustrated to do in the atom-plane directions perpendicular to its spoke.

In the remainder of this paper we first (Section 2.1)

detail the m-reps data structure and geometry, then

(Section 2.2) describe how a continuous boundary

and medial sheets are interpolated from the sampled

sheet directly represented in an m-reps figure, and then

(Section 2.3) detail the way in which the m-rep provides positional and orientational correspondences between models and deformed models. After a brief discussion in Section 2.4 of model building from segmented

objects that serve for model training, in Section 3 we

discuss the method of segmentation by deformable

m-reps. In Section 4 we give results of segmentation of

kidneys, hippocampi, and a horn of the cerebral ven-

tricle by these methods, and in Section 5 we conclude

with a comparative discussion of our method and indi-

cations of future directions in which our segmentation

method is being developed.

2. Representation of Objects by M-reps

2.1. M-reps Geometry

Intuitively a figure is a main component of an object

or a protrusion, an indentation, a hole, or an associated

nearby or internally contained object. In Pizer (1999)

we carefully define a figure, making it clear that the notion is centered on the association of opposing points on the figure called by Blum (1967) “medial involutes.”

Whereas Blum conceived of starting from a boundary

representation and deriving the medial involutes, our

idea is to start with a representation giving medial in-

formation and thus widths, and imply sections of fig-

ure bounded by involutional regions. As illustrated in

Fig. 3, in order for a medial atom m by itself to imply

two opposing sections of boundary, as well as the solid

region between them, we define m={x, r, F, θ} to

consist of

(1) a position, x, the skeletal, or “hub,” position (this requires 3 scalars for a 3D atom). x gives the central location of the solid section of figure that is being represented by the atom.

(2) a width, r, the distance from the skeletal position to two or more implied boundary positions (1 scalar) and thus the length of both $\vec{p}$ and $\vec{s}$. r gives the scale of the solid section of figure that is being represented by the atom. That is, it provides a local ruler for the object.

(3) a frame $F = (\vec{n}, \vec{b}, \vec{b}^{\perp})$, implying the tangent plane to the skeleton, via its normal $\vec{n}$, and $\vec{b}$, the particular unit vector in that tangent plane that is along the direction of fastest narrowing between the implied boundary sections. The medial position x and the two boundary-pointing arrows, as illustrated in Fig. 3, are in the $(\vec{n}, \vec{b})$ plane. The continuous medial surface implied by the set of m must pass through each x and have a normal of $\vec{n}$ there. F requires 3 scalars for a 3D atom. F gives the orientation of the solid section of figure that is being represented by the atom. That is, it provides a local figural compass for the object. The frame is given by first derivatives of x and r with respect to distance along the tangent plane to x.

(4) an “object angle” θ that determines the angulation of the implied sections of boundary relative to $\vec{b}$: $\vec{b}$ is rotated by ±θ towards $\vec{n}$ to produce normals $\vec{p}/r$ and $\vec{s}/r$ to the implied boundary. θ is normally between π/3 and π/2, the latter being the angle corresponding to parallel implied boundaries.
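The atom definition above can be illustrated with a minimal sketch. It assumes that the spoke directions are $\vec{p}/r = \cos\theta\,\vec{b} + \sin\theta\,\vec{n}$ and $\vec{s}/r = \cos\theta\,\vec{b} - \sin\theta\,\vec{n}$, which is one reading of "rotated by ±θ towards $\vec{n}$"; which sign belongs to port and which to starboard is an assumption, and the class and helper names are illustrative only.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _combine(a: float, u: Vec3, b: float, v: Vec3) -> Vec3:
    """Return the linear combination a*u + b*v."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))


@dataclass
class MedialAtom:
    x: Vec3       # hub position
    r: float      # width: common length of both spokes
    n: Vec3       # unit normal to the medial sheet at x
    b: Vec3       # unit bisector of the spokes, orthogonal to n
    theta: float  # object angle in radians

    def spokes(self) -> Tuple[Vec3, Vec3]:
        """Rotate b by +theta and -theta towards n, then scale by r."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        port = _combine(self.r * c, self.b, self.r * s, self.n)        # sign choice is an assumption
        starboard = _combine(self.r * c, self.b, -self.r * s, self.n)
        return port, starboard

    def boundary_points(self) -> Tuple[Vec3, Vec3]:
        """Implied boundary positions x + p and x + s."""
        port, starboard = self.spokes()
        return (_combine(1.0, self.x, 1.0, port),
                _combine(1.0, self.x, 1.0, starboard))


atom = MedialAtom(x=(0.0, 0.0, 0.0), r=2.0, n=(0.0, 0.0, 1.0),
                  b=(1.0, 0.0, 0.0), theta=math.pi / 3)
port, starboard = atom.spokes()
```

With θ = π/2 the two spokes point in opposite directions along $\vec{n}$, which corresponds to the parallel implied boundaries mentioned in item (4).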

In 3D, figures are generically slabs, though m-reps can

also represent tubes. A slab figure is a 2-sheet of me-

dial atoms satisfying the constraint that the implied

boundary folds nowhere. The constraint expresses the

relation that the Jacobian of the mapping between the

medial surface and the boundary is everywhere pos-

itive. A more complete, mathematical presentation of

the constraints of representations of which legal m-reps

are a subset can be found in Damon (2002).

In our representation a discrete m-rep for a slab is

a mesh {m_ij, 1 ≤ i ≤ m, 1 ≤ j ≤ n} of medial

atoms that sample a 2-sheet of medial atoms (Fig. 2).

We presently use rectangular meshes of atoms (quad-

meshes), even though creating and displaying m-reps

based on meshes of triangles (tri-meshes) have advan-

tages because of their relation to simplices. In slab fig-

ures the net is a two-dimensional mesh and internal

nodes of the mesh have a pair of boundary-pointing

vectors pointing from x to x + $\vec{p}$ and x + $\vec{s}$.

Implied by the atoms in the mesh are

(1) a tolerance τ = βλr of boundary position normal to the boundary,

(2) the length of links to other primitives, approximately of length γλr, with γλ a significant fraction of 1.0,

(3) a constraint δλr on the radius of curvature of the boundary, and

(4) a size ιλr of the image region whose intensity values directly affect the atom when measuring the match of the geometric model to the image.

The constant λ specifies the scale of the figural repre-

sentation, which will vary from stage to stage in an

algorithm working among coarse and fine levels of

representation. The proportionality constants β, γ, δ,

and ι are presently set by experience (Burbeck, 1996;

Fritsch, 1997; McAuliffe, 1996), and, to maintain magnification invariance, the constants βλ, γλ, δλ, and ιλ

decrease in proportion as the scale decreases.
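The width-proportional quantities listed above are simple products of a proportionality constant, the scale constant λ, and the local width r. The sketch below makes that arithmetic explicit; the numeric values of β, γ, δ, and ι are hypothetical placeholders (the text only says they are set by experience), and the names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proportions:
    beta: float   # boundary tolerance factor
    gamma: float  # link length factor
    delta: float  # curvature-radius constraint factor
    iota: float   # image interrogation aperture factor


def implied_quantities(r, lam, k):
    """Quantities implied by an atom of width r at scale constant lam."""
    return {
        "tolerance": k.beta * lam * r,
        "link_length": k.gamma * lam * r,
        "curvature_radius_limit": k.delta * lam * r,
        "aperture": k.iota * lam * r,
    }


K = Proportions(beta=0.25, gamma=0.5, delta=0.5, iota=0.25)  # hypothetical values
coarse = implied_quantities(r=8.0, lam=1.0, k=K)
fine = implied_quantities(r=8.0, lam=0.5, k=K)       # finer stage: smaller lam
magnified = implied_quantities(r=16.0, lam=1.0, k=K)  # same object, doubled in size
```

Halving λ halves every quantity, and doubling r doubles every quantity, which is the magnification invariance the text refers to.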

The successive refinement, coarse-to-fine, of a me-

dial mesh can provide a successive correction to the

medially implied object by interpolating atoms at the

finer spacing from those at the coarser spacing and

then optimizing the finer mesh (Yushkevich, 2001), al-

though we have not implemented this feature yet in the

m-rep models used in segmentation. This refinement

brings with it a decrease in tolerance of the implied

boundary and radius of curvature constraint, the ad-

dition of patches of medial deformations relative to a

figural (u, v) space (see Section 2.2) to handle heavily

bent sections of slab, as well as a proportionately smaller constant of radius proportionality, λ.

We call a figure represented via m-reps an m-figure.

The net of medial atoms contains internal nodes and

end nodes, as well. The end nodes for a slab are linked

together to form the boundary of the mesh. For an ob-

ject made from a single figure, the end nodes need

to capture how the boundary of the slab or tube is

closed by what is called a crest in differential geometry

(Koenderink, 1990). For example, a pancake is closed

at its sides by such a crest. Whereas the internal nodes

for a slab-like segment have two boundary-pointing

vectors, end nodes for slab-like segments have three

boundary-pointing vectors, with the additional vector

pointing from x in the $\vec{b}$ direction to the crest. Thus $\vec{b}$ must cycle as one moves around the figural crest. Internal nodes for tubes have a circle of boundary-pointing vectors, obtained by adding to x the full circle of rotations of $\vec{p}$ about $\vec{b}$.
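For the tube case, the circle of boundary-pointing vectors can be sketched by rotating one spoke about the bisector direction. The sketch uses Rodrigues' rotation formula and samples the circle at a finite number of angles; the sampling count and the function names are assumptions made for illustration.

```python
import math


def rotate_about_axis(v, k, phi):
    """Rodrigues' rotation of vector v by angle phi about the unit axis k."""
    c, s = math.cos(phi), math.sin(phi)
    k_cross_v = (k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0])
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    return tuple(v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c)
                 for i in range(3))


def tube_boundary_points(x, p, b, count=8):
    """Sample the circle of implied boundary points x + R(phi) p about axis b."""
    points = []
    for i in range(count):
        spoke = rotate_about_axis(p, b, 2.0 * math.pi * i / count)
        points.append(tuple(xi + si for xi, si in zip(x, spoke)))
    return points


hub = (0.0, 0.0, 0.0)
ring = tube_boundary_points(hub, p=(1.0, 0.0, 1.0), b=(1.0, 0.0, 0.0), count=4)
```

Every sampled spoke keeps the same length r and the same angle to the axis, so the sampled points lie on a circle around the bisector.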

For slabs a sequence of edge atoms forms a curve of a crest or a curve of a corner closing the slab. As illustrated in Fig. 3, these segment closed ends may be rounded with any level of elongation η: the vertex is taken at x + ηr$\vec{b}$, and the end section in the principal direction across the implied crest is described by an interpolation using the position and tangent there and at the two points x + $\vec{p}$ and x + $\vec{s}$ and applying an interpolating function to produce a boundary crest with the desired extent, tangency, and curvature properties. We use this formulation for ends of end atoms represented as {x, r, F, θ, η}, instead of the Blum formulation, in which η = 1, in order to stabilize the image match at ends as well as to allow corners, i.e., crests of infinite curvature, to have a finitely sampled representation. Although corners do not normally appear in medical images, they are needed to model manufactured objects. Corner atoms have their vertex at x + r(1/cos(θ))$\vec{b}$.
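The two vertex formulas above differ only in the elongation factor: a corner atom behaves like an end atom with η = 1/cos(θ). A small sketch of that arithmetic follows; the function names and the guard for θ near π/2 (where the implied boundaries are parallel and no finite corner exists) are illustrative additions.

```python
import math


def end_vertex(x, r, b, eta):
    """Crest vertex of an end atom: x + eta * r * b."""
    return tuple(xi + eta * r * bi for xi, bi in zip(x, b))


def corner_elongation(theta):
    """Elongation that places the vertex at x + r (1 / cos(theta)) b."""
    c = math.cos(theta)
    if c <= 1e-12:
        raise ValueError("no finite corner when theta is pi/2 or larger")
    return 1.0 / c


x, r, b = (0.0, 0.0, 0.0), 2.0, (1.0, 0.0, 0.0)
blum_vertex = end_vertex(x, r, b, eta=1.0)                           # Blum case, eta = 1
corner_vertex = end_vertex(x, r, b, corner_elongation(math.pi / 3))  # eta = 2 here
```

The value 1/cos(θ) is where the two tangent planes at the spoke ends meet the line through x along $\vec{b}$, which is why it yields a crest of infinite curvature.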

2.2. Interpolated Medial Sheets and Figural Rendering

As stated above, an m-rep mesh of medial atoms for

a single figure should be thought of as a represen-

tation for a continuous sheet of atoms and a cor-

responding continuous implied boundary. The sheet

extends to the space curve of points osculating the

crest of the implied figural boundary. We interpo-

late the m-rep mesh into this sheet parameterized by

(u, v) ∈ [j, j + 1] × [k, k + 1] for the mesh element

with the jkth atom at its lower left corner. The inter-

polation is obtained by a process that has local support

on these mesh elements and on the edge elements that

are bounded by edge atoms only. If the medial atoms

are separated by constant r-proportional distances, this

parametrization satisfies the objective of representing

shape locally.
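The (u, v) parameterization can be illustrated by locating the mesh element that contains a given parameter pair. This is only a sketch of the indexing, assuming 1-based atom indices on an m × n mesh with at least two atoms in each direction; it does not reproduce the interpolation itself, and the function name is hypothetical.

```python
import math


def locate_element(u, v, m, n):
    """Return (j, k, local_u, local_v) for the element with atom (j, k) at its lower left."""
    if not (1 <= u <= m and 1 <= v <= n):
        raise ValueError("(u, v) lies outside the mesh")
    # Clamp so that the upper edge belongs to the last element.
    j = min(int(math.floor(u)), m - 1)
    k = min(int(math.floor(v)), n - 1)
    return j, k, u - j, v - k


inside = locate_element(2.25, 3.5, m=4, n=5)
upper_edge = locate_element(4.0, 5.0, m=4, n=5)
```

The local coordinates run from 0 to 1 within each element, so interpolation with local support only needs the atoms bounding that element.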

The interpolation is achieved by applying a variant

of subdivision surface methods (Catmull, 1978) to the

mesh of implied boundary positions and normals given at the spoke ends (including the crest spokes). The variant, described in detail in Thall (2003), makes the subdivision surface match the position and the normal of the spoke ends to within their tolerance. This boundary surface (see Fig. 4) is C² smooth everywhere but

the isolated points corresponding to the atoms at the

corners of the mesh. From this surface Thall’s method

allows the calculation of interpolated medial atoms.

As it stands, Thall’s method is unlikely to produce

folded boundaries but is not guaranteed against folds.

In our segmentation we penalize against the high cur-

vatures that are precursors to a deformation that yields

folds. The full avoidance of medial atoms that imply a folded boundary will be achievable using the mathematical results found in Damon (2002).

Figure 4. Left: A single-figure m-rep. Left middle: Coarse mesh of atom boundary positions for a figure. Right middle: Atom ends vs. interpolated boundary. Right: interpolated boundary mesh at voxel spacing.

The result of Thall’s interpolation is that with each

boundary position we can associate a boundary figural

coordinate (Pizer, 2002) (Fig. 2), the figure number

together with a side parameter t (= −1 for port, = +1

for starboard, with t ∈ (−1, 1) around the crest) and the parameters (u, v), describing which interpolated atom’s spoke touches the boundary there. For each figure we interpolate the atoms sufficiently finely that a set of voxel-size triangular tiles represent the boundary. The method computes the boundary position and associated

normal and r value for an arbitrary boundary figural

coordinate (u,v,t).

Points (x, y,z) in space can also be given a figural

coordinate by appending an r-proportional distance τ

(Fig. 1) to the figural coordinates of the closest me-

dially implied boundary point. To allow the distinc-

tion by the sign of the distance of the inside and the

outside of the figure, we take the distance to be rela-

tive to the medially implied boundary and to be neg-

ative on the interior of the figure. A procedure map-

ping arbitrary spatial positions (x, y,z) into figural

coordinates (u,v,t,τ) has been written. Also, an ar-

bitrary fine triangular tiling of the medially implied

boundary can be computed. Rendering can be based

on these triangular tiles or on implicit rendering using the τ function. As well, the correspondence under figural deformation given by figural coordinates is critically useful in computing the objective function used in segmentation.

Figure 5. Correspondence over deformation via figural correspondence.
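A minimal sketch of mapping a spatial position to a figural coordinate (u, v, t, τ) is given below. It assumes the implied boundary is available as a finite list of samples, each carrying its boundary figural coordinate, position, outward normal, and r value, and it finds the closest sample by brute force. The sample layout and function name are assumptions, not the procedure referred to in the text.

```python
import math


def figural_coordinate(q, boundary_samples):
    """Return (u, v, t, tau) for point q; tau is negative inside the figure."""
    closest = min(boundary_samples, key=lambda s: math.dist(q, s["y"]))
    offset = [qi - yi for qi, yi in zip(q, closest["y"])]
    # The sign of the offset along the outward normal separates inside from outside.
    side = sum(o * ni for o, ni in zip(offset, closest["normal"]))
    distance = math.dist(q, closest["y"])
    tau = math.copysign(distance, side) / closest["r"]  # r-proportional distance
    u, v, t = closest["uvt"]
    return u, v, t, tau


# Two samples of a toy slab: port side (t = -1) at z = +1, starboard (t = +1) at z = -1.
samples = [
    {"uvt": (1.5, 1.5, -1.0), "y": (0.0, 0.0, 1.0), "normal": (0.0, 0.0, 1.0), "r": 1.0},
    {"uvt": (1.5, 1.5, 1.0), "y": (0.0, 0.0, -1.0), "normal": (0.0, 0.0, -1.0), "r": 1.0},
]
outside = figural_coordinate((0.0, 0.0, 1.5), samples)
inside = figural_coordinate((0.0, 0.0, 0.5), samples)
```

In practice the boundary would be sampled far more densely (for example at the voxel-size tiles mentioned above), and a spatial index would replace the linear search.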

2.3. Correspondence Through Deformation

As previously described, figures are designed to provide a natural coordinate system, giving first a position on the medial sheet, second a figural side, and finally a figural distance in figural width relative terms along the appropriate medial spoke from a specified position. As detailed in Section 3.3, this figural coordinate system is used in our segmentation method to follow boundary locations and other spatial locations through the model deformation process.

In particular, as illustrated in Fig. 5, a boundary point after deformation, identified by its figural coordinate, can be compared to the corresponding point before deformation, and the magnitude of the r-proportional distance between these points can be used to measure the local deformation. Also, the intensity in the target image at a figural coordinate relative to a putatively deformed model can be compared to the intensity in a training image (or training images) at the corresponding figural coordinate relative to the undeformed model.
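The geometric half of this correspondence can be sketched as follows. The sketch assumes each boundary is stored as a mapping from a boundary figural coordinate (u, v, t) to a position and an r value, and that the distance is normalized by the r of the undeformed model; both choices are assumptions made for illustration.

```python
import math


def local_deformation(model_boundary, candidate_boundary):
    """Map each figural coordinate to the r-proportional displacement of its boundary point."""
    result = {}
    for coord, (y_model, r_model) in model_boundary.items():
        y_candidate, _ = candidate_boundary[coord]  # same figural coordinate after deformation
        result[coord] = math.dist(y_model, y_candidate) / r_model
    return result


model_boundary = {
    (1.0, 1.0, -1.0): ((0.0, 0.0, 2.0), 2.0),
    (1.0, 1.0, 1.0): ((0.0, 0.0, -2.0), 2.0),
}
candidate_boundary = {
    (1.0, 1.0, -1.0): ((0.0, 0.0, 3.0), 3.0),   # port side bulged outward
    (1.0, 1.0, 1.0): ((0.0, 0.0, -2.0), 2.0),   # starboard side unchanged
}
deformation = local_deformation(model_boundary, candidate_boundary)
```

Because points are matched by figural coordinate rather than by spatial proximity, the comparison stays meaningful under elongation, widening, or bending.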

Figure 6. M-reps models. Heavy dots show hubs of medial atoms. Lines are atoms’ spokes. The mesh connecting the medial atoms is shown as dotted curves. Implied boundaries are rendered with shading. Hippocampus: see Fig. 2. Left: kidney parenchyma + renal pelvis. Middle: lateral horn of cerebral ventricle. Right: multiple single-figure objects in male pelvis: rectum, prostate, bladder, and pubic bones (one bone is occluded in this view).

2.4. M-reps Model Building for Anatomic Objects from Training Images

Model-building must specify of which figures an object or multi-object complex is made up, the size of the mesh

of each figure, and the way the figures are related, and

it must also specify each medial atom. In this paper,

focused on single-figure objects, only the mesh size

and its medial atoms must be specified. Illustrated in

the panels of Fig. 6 are single-figure m-rep models of

a variety of anatomic structures that we have built.

Because an m-rep is intended to allow the repre-

sentation of a whole population of an anatomic object

across patients, it is best to build it based on a significant

sample of instances of the segmented object. Styner

(2001) describes a tool for stably producing models

from such samples. The hippocampus model shown in

Fig. 2 was built using this tool. We also have a design

tool for building m-rep models from a single training

3D-intensity data set and a b-rep from a previous seg-

mentation, e.g., a manual segmentation, of the object

in that image. The kidney model shown in Fig. 6 was

built using this tool.

It is obvious that effective segmentation depends on

building a model that can easily deform into any in-

stance of the object that can appear in a target image.

However, the production of models is not the subject

of this paper, so we assume in the following that a sat-

isfactory model can be produced and test this fact via

the success of segmentations.

3. Segmentation by Deformable M-reps

3.1. Visualizations

3.1.1. Viewing an M-rep. To allow appreciation of the object represented by an m-rep, the m-rep and the implied boundary must be viewable. To judge if an m-rep adequately matches the associated image, capabilities described in Sections 3.2 and 3.3 are needed

to visualize the m-rep in 3D relative to the associated

image. If the match is not good, the user needs a tool to

modify the m-rep, either as a whole or atom by chosen

atom, while in real time seeing the change in the im-

plied boundary and the relation of the whole m-rep or

modified atom to the image. After this seldom required

manual modification the m-rep or atom may then be

attracted by the image data.

As seen in Figs. 2, 4, and 6, we view an m-rep as a

connected mesh of balls, with each ball attached to a

pair of spokes and, for end atoms, to the crest-pointing

ηr$\vec{b}$ vector. The inter-atom lines, the spokes, and the

crest-pointing vectors can be optionally turned off.

In addition, the implied boundary can be viewed as a

dot cloud, a mesh of choosable density, or a rendered

surface.

3.1.2. Visualization of the M-rep vs. a Target Image.

Visualization of greyscale image data must be in 2D;

only in 2D image cuts can the human understand the interaction of a geometric entity with the image data. The implied boundary of an m-rep can be visualized in 3D versus one or more of the cardinal tri-orthogonal planes

(x, y; y,z; or x,z), with the choice of these planes dy-

namically changeable (Fig. 11). Other image planes in

which it is useful to visualize the fit of the m-rep to the

image data are the atom-based planes described in the

next paragraph. Showing the curve of intersection of

the m-rep implied boundary with any of these image

planes is useful.

The desired relationship of a single medial atom to

the image data is that at each spoke end there is in the

image a boundary orthogonal to the spoke. This rela-

tion is normally not viewable in a cardinal plane. In-

stead one needs to visualize and edit the atoms in cross

sections of the object that are normal to the boundary.


Figure 7. The viewing planes of interest for a medial atom: Top: 3D views. Bottom: 2D views.

The spokes of a medial atom, if they have been

correctly placed relative to the image, are normal to

the perceived boundary, so they should be contained in

the cross section. We conclude that a natural view plane should contain both of the boundary pointing arrows of the medial atom, that is the plane passing through x of the atom and spanned by $(\vec{n}, \vec{b})$ of the atom (if the object

angle is other than π/2). We call this the atom plane.

We can superimpose the medial atom, as well as the

medially implied boundary slice on the image in this

plane (Figs. 7(a), (b), and (c)) and visualize in-plane

changes in the atom and the boundary there.
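The atom plane can be made concrete by expressing a 3D point in the atom's own frame. The sketch below assumes the complementary unit vector is obtained as the cross product of the normal and the bisector (the handedness is an assumption) and returns two in-plane coordinates plus the out-of-plane offset; names are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def atom_plane_coordinates(q, x, n, b):
    """Return (along b, along n, out of plane) for point q relative to hub x."""
    d = tuple(qi - xi for qi, xi in zip(q, x))
    b_perp = cross(n, b)  # complementary unit vector; orientation is an assumption
    return dot(d, b), dot(d, n), dot(d, b_perp)


hub, n, b = (1.0, 1.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
in_plane = atom_plane_coordinates((3.0, 1.0, 2.0), hub, n, b)
off_plane = atom_plane_coordinates((1.0, 4.0, 1.0), hub, n, b)
```

A point lies in the atom plane exactly when its third coordinate is zero, so the first two coordinates can be used directly to draw the atom and the boundary slice in a 2D window.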

On the other hand, the atom may be misplaced orthogonal to the atom plane or misrotated out of the atom plane, so it needs to be viewed in an image plane orthogonal to the atom plane. There is no such plane that contains both spokes, but if we could live with having only one spoke in the atom-orthogonal plane, we could view across the boundary to which the spoke should be orthogonal. The desired plane is that spanned by the atom’s $\vec{b}^{\perp}$ and the chosen spoke of the atom. But we would only need the half plane ending at x, the “port” or “starboard half-plane.” Thus, we could draw simultaneously the two adjoining half planes for the respective spokes (Fig. 7).

When an atom is selected, it implies visualization

planes, which can be chosen from among the atom plane, the port spoke half-plane, and the starboard spoke half-plane. The image data in the chosen plane(s)

can be viewed in the 3D viewing window, with the im-

age data texture rendered onto these planes (Fig. 7(a)

and (b)), and in respective 2D windows for the chosen

plane(s) (Figs. 7(c), (d) and (e)).

Manual editing of m-reps, should it be necessary,

can be done using these visualizations. The viewer can

point to positions in the three 2D viewing windows

where the spoke ends should be moved, and the atom

can be modified to meet these requirements as closely as possible, based on the object being locally a linear slab. Also, tools for manually rotating, translating,

scaling, and changing the object angle and elongation

(for end atoms) are easily provided.

3.2. Multi-Scale-Level Model Deformation Strategy for Segmentation from Target Images

Our method for deforming m-reps into image data allows model-directed segmentation of objects in volume data. The deformation begins with a manually chosen

initial similarity transform of the model. To meet the

efficiency requirements of accurate segmentation, the

segmentation process then follows a number of stages

of segmentation at successively smaller levels of scale

(see Table 1). At each scale level the model is the

result of the next larger scale level and we optimize

Page 9

Deformable M-Reps for 3D Medical Image Segmentation 93

Table 1.Geometry by scale level.

Scale

level k

Transformation

parameters ωk

Geometric entityTransformation Sk

Primitive zk

ii

Neighbors N(zk

i)

1 Object ensemble SimilarityObject ensemble pose 7: 3D sim transf paramsNone

2ObjectSimilarity Object pose7: 3D sim transf paramsAdjacent objects

3 Main figureSimilarity plus elongationFigure pose 8: 3D sim transf params,

1 elongation param

Adjacent figures

3Subfigure (attached

to a host to

represent a

protrusion or

indentation)

Similarity in figural

coordinates of its host’s

figural boundary, plus

hinging and elongation

Figural pose in host’s

cords and elongation

6: 4 2D sim transf

params, 1 hinging

param, 1 elongation

param

Adjacent figures,

possibly

attached to

same host

4 Through section of

figure (medial

atom)

Medial atom change Medial atom value8 (or 9): medial atom

params (+η for

external atoms)

2–4 adjacent

medial atoms

5 Boundary vertex Displacement along

medially implied normal

Boundary vertex

position

1: displacement param Adjacent

boundary

vertices

an objective function of the same form: the sum of a

geometric typicality metric (detailed later in this sec-

tion)andageometrytoimagematchmetric(detailedin

Section 3.3). At each scale level there is a type of geo-

metric transformation chosen appropriate to that scale

and having only at most 9 parameters.

The deformation strategy, from a model to a candidate obtained by geometrically transforming the model, follows two basic geometric principles, according to the conceptual structure presented in Sections 1 and 2.

(1) In both the geometric typicality and the model to image match metrics all geometry is in figurally related terms. Thus

• model-relative and candidate-relative positions correspond when they have common figural coordinates, and

• all distances are r-proportional.

(2) Calculating geometric typicality at any scale level is done in terms of the relations relevant to that scale, i.e., relative to its values predicted by the previous, next larger, scale and by its neighbors at its scale. The neighborhood of a medial atom is made up of its immediately adjacent atoms, and the neighborhood of a boundary tile vertex is made up of the adjacent boundary tile vertices.

To describe the algorithm in detail, we make a number

of definitions.

The process begins with a model $z$ that is manually translated, rotated, and uniformly scaled into the image data by the user to produce an initialized model $z^0$. $z^0$ is successively transformed through a number of scale levels into deformed models $z^k$ until $z^5$ is the final segmentation. The details and descriptions of the primitives, their neighbor relations, and the associated transformations at each scale level are given in Table 1.

Let $z^k$ be the geometric representation at scale level $k$. Let $z^k_i$ be the representation of the $i$th primitive at scale level $k$. At all scale levels $k \le 4$, each $z^k_i$ is represented as a collection of medial atoms, and a geometric transformation on $z^k_i$ is computed by applying that transformation to each medial atom in its representation. Each primitive $z^k_i$ for $k > 1$ has a small set of neighbors $N(z^k_i)$ at scale level $k$ and a geometric entity at the next larger scale ($k - 1$) that contains $z^k_i$. We call this containing entity the parent primitive $P(z^k_i)$. While $P(z^k_i)$ is at scale level $k - 1$, it is of the same type as $z^k_i$. That is, for $k \le 4$ $P(z^k_i)$ is represented as a superset of the set representing $z^k_i$, and for $k = 5$ the parent of a boundary vertex is the corresponding vertex on the medially implied surface with zero displacement. Also associated with scale level $k$ is a type of transformation $S^k$ such that $z^k_i = S^k P(z^k_i)$. Let the parameters $\omega^k_i$ be the parameters of the particular transformation $S^k$ applied to $P(z^k_i)$ at scale level $k - 1$ to produce $z^k_i$ at scale level $k$.

The similarity transform $S$ consisting of translation by $t$, rotation $O$ and uniform scaling $\alpha$ applied to a medial atom $m = \{x, r, F, \theta\}$ produces $S \circ m = \{\alpha O x + t, \alpha r, O \circ F, \theta\}$. Figural elongation by $\nu$ leaves fixed the medial atoms at a specified atom row $i$ (the hinge end for subfigures) and successively produces translations and rotations of the remaining atoms in terms of the atoms in the previously treated, adjacent row $i^-$, as follows:

$$S^3(\nu) \circ m_{ij} = \left\{ x_{i^- j} + \nu\,(x_{ij} - x_{i^- j}),\; r_{ij},\; \left(F_{ij} F_{i^- j}^{-1}\right)^{\nu} \circ F_{i^- j},\; \theta_{ij} \right\}.$$

The subfigure transformation applies a similarity transform to each of the atoms in the hinge. This transformation, however, is not in Euclidean coordinates but in the figural coordinates of the boundary of the parent. That transformation is not used in this paper, so its details are left to Liu (2002). The medial atom transformation $S^4$, consisting of translation by $t$, rotation $O$, $r$ scaling $\alpha$, and object angle change $\Delta\theta$, applied to a medial atom $m = \{x, r, F, \theta\}$ produces $S^4(t, O, \alpha, \Delta\theta) \circ m = \{x + t, \alpha r, O \circ F, \theta + \Delta\theta\}$. The boundary displacement transformation $\tau$ applied to a boundary vertex with position $y$, medial radial width $r$, and medially implied normal $\vec{n}$ yields the position $y + \tau r \vec{n}$.
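As an illustration only, the similarity transform, the medial atom transformation and the boundary displacement defined above can be sketched in a few lines of Python. The class name `MedialAtom`, the function names, and the choice to store the figural frame F as a 3x3 rotation matrix are assumptions made for this sketch and are not taken from the authors' software; elongation and the subfigure transformation are omitted.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MedialAtom:
    """Medial atom m = {x, r, F, theta} (hypothetical container for this sketch)."""
    x: np.ndarray   # hub position, shape (3,)
    r: float        # medial radius (width)
    F: np.ndarray   # figural frame, assumed here to be a 3x3 rotation matrix
    theta: float    # object angle, in radians


def similarity(atom, t, O, alpha):
    """S o m = {alpha*O*x + t, alpha*r, O o F, theta}."""
    return MedialAtom(alpha * (O @ atom.x) + t, alpha * atom.r, O @ atom.F, atom.theta)


def atom_transform(atom, t, O, alpha, dtheta):
    """S^4(t, O, alpha, dtheta) o m = {x + t, alpha*r, O o F, theta + dtheta}."""
    return MedialAtom(atom.x + t, alpha * atom.r, O @ atom.F, atom.theta + dtheta)


def displace_vertex(y, r, n, tau):
    """Boundary displacement: y + tau*r*n, with n the medially implied unit normal."""
    return y + tau * r * n
```

Note that the similarity transform rotates and scales the hub position about the origin, whereas the medial atom transformation only translates the hub; both scale r and rotate the frame in the same way.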

The algorithm for segmentation successively modifies $z^{k-1}$ to produce $z^k$. In doing so it passes through the various primitives $z^k_i$ in $z^k$ and for each $i$ optimizes an objective function $H(z^k, z^{k-1}, I) = w^k(-\mathrm{Geomdiff}(z^k, z^{k-1})) + \mathrm{Match}(z^k, I)$. $\mathrm{Geomdiff}(z^k, z^{k-1})$ measures the geometric difference between $z^k$ and $z^{k-1}$, and thereby $-\mathrm{Geomdiff}(z^k, z^{k-1})$ measures the geometric typicality of $z^k$ at scale level $k$. $\mathrm{Match}(z^k, I)$ measures the match between the geometric description $z^k$ and the target image $I$. Both $\mathrm{Geomdiff}(z^k, z^{k-1})$ and $\mathrm{Match}(z^k, I)$ are measured in reference to the object boundaries $B^k$ and $B^{k-1}$, respectively implied by $z^k$ and $z^{k-1}$. The weight $w^k$ of the geometric typicality is chosen by the user.

For any medial representation $z$, the boundary $B$ is computed as a mesh of quadrilateral tiles as follows, with each boundary tile vertex being known both with regard to its figural coordinates $\mathbf{u}$ and its Euclidean coordinates $y$. For a particular figure, $\mathbf{u} = (u, v, t)$, as described in Section 2.2. When one figure is an attached subfigure of a host figure, with the attachment along the $v$ coordinate of the subfigure, there is a blend region whose boundary has coordinates $\mathbf{u} = (u, w, t)$, where $u$ and $t$ are the figural coordinates of the subfigure and $w \in [-1, 1]$ moves along the blend from the curve on the subfigure terminating the blend ($w = -1$) to the curve on the host figure terminating the blend ($w = +1$). This blending procedure is detailed in Liu (2002).

As mentioned in Section 2.2, the computation of B is accomplished by a variation of Catmull-Clark subdivision (Catmull, 1978) of the mesh of quadrilateral tiles (or, in general, tiles formed by any polygon) formed from the two (or three) spoke ends of the medial atoms in z. Thall's variation (2003) produces a limit surface that iteratively approaches a surface interpolating in position to spoke ends and with a normal interpolating the respective spoke vectors. That surface is a B-spline at all but finitely many points on the surface. The program gives control of the number of iterations and of a tolerance on the normal and thus of the closeness of the interpolations. A method for extending this approach to the blend region between two subfigures is presently under evaluation.

$\mathrm{Geomdiff}(z^k, z^{k-1})$ is computed as the sum of two terms, one term measuring the difference between the boundary implied by $z^k$ and the boundary implied by $z^{k-1}$, and, in situations when $N(z^k_i)$ is not empty, another term measuring the difference between the boundary implied by $z^k$ and that implied by $z^k$ with $z^k_i$ replaced by its prediction from its neighbors, with the prediction based on neighbor relations in $P(z^k_i)$. The second term enforces a local shape consistency with the model and depends on the fact that figural geometry allows a geometric primitive to be known in the coordinate system of a neighboring primitive. The weight between the neighbor term and the parent term in the geometrical typicality measure is set by the user. In the tests described in Section 4, the neighbor term weight was 0.0 in the medial atom stage and 1.0 in the boundary displacement stage.

The prediction of the value of one geometric primitive $z^k_j$ in an m-rep from another $z^k_i$ at the same scale level using the transformation $S^k$ is defined as follows. Choose the parameters of $S^k$ such that $S^k$ applied to the $z^k$ subset of $z^{k-1}$ is as close as possible to $z^k$ in the vicinity of $z^k_j$. Apply that $S^k$ to $z^k$ to give predictions $(S^k z^k)_j$. Those predictions depend on the prediction of one medial atom by another. Medial atom $z^4_j = \{x_j, r_j, F_j, \theta_j\}$ predicts medial atom $z^4_i = \{x_i, r_i, F_i, \theta_i\}$ by recording $T = \{(x_j - x_i)/r_j,\ (r_j - r_i)/r_j,\ F_j F_i^{-1},\ \theta_j - \theta_i\}$, where $F_j F_i^{-1}$ is the rotation that takes frame $F_i$ into $F_j$. $T$ takes $z^4_i$ into $z^4_j$ and when applied to a modified $z^4_i$ produces a predicted $z^4_j$.

The boundary difference $\mathrm{Bdiff}(z_1, z_2)$ between two m-reps $z_1$ and $z_2$ is given by the following average r-proportional distance between boundary points that correspond according to their figural coordinates, although it could involve points with common figural coordinates other than at the boundary and it will in the future involve probabilistic rather than geometric distance measures.

$$\mathrm{Bdiff}(z_1, z_2) = \left[ -\int_{B_2} \frac{\| y_1 - y_2 \|^2}{2\,(\sigma\, r(y_2))^2}\, dy \right] \Big/ \operatorname{area}(B_2).$$

The r-value is that given by the model at the present scale level, i.e., the parent of the primitive being transformed. The normalization of distance by medial radius r makes the comparison invariant to uniform scaling of both the model and the deformed model for the local geometric component being adjusted at that scale level.
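A discrete version of the boundary difference can make the scale invariance concrete. The sketch below assumes the boundary is sampled at corresponding figural coordinates, that each sample carries a tile area used as a quadrature weight, and that σ is a user-supplied tolerance constant; the function name `bdiff` and its argument layout are hypothetical.

```python
import numpy as np


def bdiff(y1, y2, r2, areas, sigma=1.0):
    """Area-weighted average of r-normalized squared distances, negated.

    y1, y2 : (N, 3) boundary points that share figural coordinates
    r2     : (N,) medial radius r(y2) at each point of the second boundary
    areas  : (N,) boundary area attributed to each sample of B2
    sigma  : tolerance constant (assumed to be supplied by the user)
    """
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    areas = np.asarray(areas, dtype=float)
    squared_dist = np.sum((y1 - y2) ** 2, axis=1)
    integrand = squared_dist / (2.0 * (sigma * r2) ** 2)
    return -np.sum(areas * integrand) / np.sum(areas)
```

Scaling both boundaries and the radii by a common factor (with tile areas scaling by its square) leaves the returned value unchanged, which is the invariance the r-normalization is meant to provide.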

Finally, the geometry to image match measure $\mathrm{Match}(z^k, I)$ between the geometric description $z^k$ and the target image $I$ is given by

$$\mathrm{Match}(z^k, I) = \int_{-\tau_{\max}}^{\tau_{\max}} \int_{B^k} G(\tau)\, I_{\mathrm{template}}(y, \tau)\, \hat{I}(y', \tau)\, dy\, d\tau,$$

where $y$ and $y'$ are boundary points in $B(z^k)$ and $B(z^k_{\mathrm{template}})$ that agree in figural coordinates, $G(\tau)$ is a Gaussian in $\tau$, $\hat{I}$ is the target image $I$ rms-normalized with Gaussian weighting in the boundary-centered collar $\tau \in [-\tau_{\max}, \tau_{\max}]$ for the deformed model candidate (see Fig. 8), and the template image $I_{\mathrm{template}}$ and the associated model $z_{\mathrm{template}}$ are discussed in Section 3.3.1.

Figure 8. The collar forming the mask for measuring geometry to image match. Left: in 2D, both before and after deformation. Right: in 3D, showing the boundary as a mesh and showing three cross-sections of the collar.

In summary, for a full segmentation of a multi-object complex, there is first a similarity transformation of the whole complex, then a similarity transform of each object, then for each of the figures in turn (with parent figures optimized before subfigures) first a similarity-like transform that for protrusion and indentation figures respects their being on the surface of their parent, then modification of all parameters of each medial atom. After all of these transformations are complete, there is finally the optimization of the dense boundary vertices implied by the medial stages. Since in this paper we describe only the segmentation of single figure objects, there are three stages beyond the initialization: the figural stage, the medial atom (figural section) stage, and the boundary displacement stage.

For all of the stages with multiple primitives (in the

case tested in this paper, the medial atom stage and

the boundary stage), we follow the strategy of iterative

conditional modes, so the algorithm cycles among the

atoms in the figure or boundary in random order until

the group converges. The geometric transformation of

a boundary vertex modifies only its position along its

normal [1 parameter]; the normal direction changes as

a result of the shift, thus affecting the next iteration of

the boundary transformation.
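The cycling strategy just described can be sketched as a coordinate-wise search. This is a toy version under stated assumptions: each primitive is reduced to a single scalar parameter, the objective is maximized, and each conditional update tries a small fixed set of steps rather than running the conjugate gradient optimizer discussed in Section 3.3. The name `icm` and its arguments are hypothetical.

```python
import random


def icm(params, objective, steps=(-0.1, 0.1), max_sweeps=100, tol=1e-9, seed=0):
    """Iterated conditional modes: update one primitive at a time, holding the rest fixed.

    params    : initial list of scalar parameters, one per primitive
    objective : callable mapping the full parameter list to a value to maximize
    steps     : candidate changes tried for each primitive (an assumption of this sketch)
    """
    rng = random.Random(seed)
    params = list(params)
    best = objective(params)
    for _ in range(max_sweeps):
        value_at_start = best
        order = list(range(len(params)))
        rng.shuffle(order)                 # visit the primitives in random order
        for i in order:
            current = params[i]
            best_value = current
            for step in steps:
                params[i] = current + step
                value = objective(params)
                if value > best:           # keep the best conditional update
                    best, best_value = value, params[i]
            params[i] = best_value
        if best - value_at_start <= tol:   # a full sweep gave no improvement
            break
    return params, best
```

For the boundary stage, each scalar would play the role of a vertex displacement τ, and the objective would have to recompute the normals that change as neighboring vertices move.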

3.3. The Optimization Method and Objective Function

Multiscale segmentation by deformable models requires many applications of optimization of the objective function. The optimization must be done at many scale levels and for increasingly many geometric primitives as the scale becomes smaller. Efficient optimization is thus necessary. We have tried both evolutionary approaches and a conjugate gradient approach to optimization. The significant speed advantages of the conjugate gradient method are utilizable if one can make the objective function void of nonglobal optima for the range of the parameters being adjusted that is guaranteed by the previous scale level. We have thus designed our objective functions to have as broad optima as possible and chosen the fineness of our scale levels and intra-level stages to guarantee that each stage or level produces a result within the bump-free breadth of the main optimum of the next stage or level.

When the target image is noisy and the object con-

trast is low, the interstep fineness requirement just

laid out requires multiple substages of image blurring

within a scale level. That is, at the first substage the

target image must be first blurred before being used in

the geometry to image match term. At later substages

the blurring that is used decreases.

At present the largest scale level involved in a seg-

mentation requires a single user-selected weight, be-

tween the geometric typicality term and the geometry

to image match term. All smaller scale stages require

two user-selected weights, the one just mentioned plus

a weight between the parent-to-candidate distance and

the neighbor-predictions-to-candidate distance. How-

ever, we intend in the future that our objective func-

tion be a log posterior probability. When this comes

to pass, both terms in the objective function will be

probabilistic, as determined by a set of training im-

ages. These terms then would be a log prior for the

geometric typicality term and a log likelihood for the

geometry to image match term. In this situation there is no issue of weighting the geometric typicality and geometry to image match terms. However, at present our geometric typicality term is measured in r-proportional

squared distances from model-predicted positions and

the geometry to image match term is measured in rms-

proportional intensity squared units resulting from the

correlation of a template image and the target image,

normalized by local variability in these image intensi-

ties. While this strategy allows the objective function

to change little with image intensity scaling or with

geometric scaling, it leaves the necessity of setting the

relative weight between the geometric typicality term

and the geometry to image match term.

The remainder of this section consists of a subsection detailing the geometry-to-image match term of the objective function, followed by a subsection detailing the boundary displacement stage of the optimization.

3.3.1. The Geometry-to-Image Match Measure. It is useful to compute the match between geometry and the image based on a model template. Such a match is enabled by comparing the template image $I_{\mathrm{template}}$ and the target image data $I$ at corresponding positions in figural coordinates, at figural coordinates determined in the model. The template is presently determined from a single training image $I_{\mathrm{template}}$, in which the model $z$ has been deformed to produce $z_{\mathrm{template}}$ by applying the m-reps deformation method through the medial atom scale level (level 4) on the characteristic image corresponding to a user-approved segmentation. In our implementation the template is defined only in a mask region defined by a set of figural coordinates, each with a weight of a Gaussian in its figural distance-to-boundary, τ, about the model-implied boundary. The standard deviation of the Gaussian used for the results in this paper is 1/2 of the half-width of the collar. The mask is choosable as a collar symmetrically placed about the boundary up to a user-chosen multiple of r from the boundary (Fig. 8) or as the union of the object interior with the collar, a possibility especially easily allowed by a medial representation. In the results reported here we use a boundary collar mask. The mask is chosen by subdividing the boundary positions affected by the transformation with a fixed mesh of figural coordinates (u,v) and then choosing spatial positions to be spaced along each medial spoke (implied boundary normal) at that (u,v). These along-spoke positions are equally spaced in the figural distance coordinate τ up to plus or minus a fixed cutoff value $\tau_{\max}$ chosen at modeling time. For the kidney results reported in Section 4, this cutoff value was 0.3, so the standard deviation of the weighting Gaussian in the intensity correlation is 0.15.
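The collar sampling described above can be sketched for a single boundary point as follows. The sketch assumes that a sample at figural distance τ lies at Euclidean position y + τ r n, in line with the boundary displacement formula of Section 3.2; the function name `collar_samples` and the number of samples per spoke are hypothetical.

```python
import numpy as np


def collar_samples(y, n, r, tau_max=0.3, num=7):
    """Sample positions along the implied boundary normal, with Gaussian weights.

    y       : (3,) boundary point
    n       : (3,) medially implied normal (normalized inside the function)
    r       : medial radius at this boundary point
    tau_max : collar half-width in figural distance (0.3 for the kidney results)
    num     : number of equally spaced samples (an assumption of this sketch)
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    taus = np.linspace(-tau_max, tau_max, num)
    positions = np.asarray(y, dtype=float) + np.outer(taus * r, n)
    sd = tau_max / 2.0                          # half of the collar half-width
    weights = np.exp(-0.5 * (taus / sd) ** 2)   # Gaussian weight in tau
    return taus, positions, weights
```

Because the spacing is fixed in τ rather than in millimeters, the collar is automatically wider where the object is wider.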

The template to image match measure is choosable

in our tool from among a normalized correlation mea-

sure, with weights, and a mutual information measure,

with weights, but for all the examples here the cor-

relation measure has been used and the weight in all

mask voxels is unity. The correlation measure that we

use is an average, over the boundary sample points, of

the along spoke intensity profile correlations at these

sample points. For the geometry to correspond to the

volume integral of these point-to-corresponding-point

correlations, each profile must be weighted by the

boundary surface area between it and its neighboring

sample points, and the profile must be weighted by its

r-proportional length. In addition, as indicated above,

we weight each product in the correlation by a Gaussian in τ from the boundary. Also, to make the intensity profiles insensitive to offsets and linear compression in the intensity scale, the template is offset to a mean of zero and both the template and the target image are rms-normalized. The template's rms value is computed within the mask in the training image, and the target image's rms value is computed for a region corresponding to a blurred version of the mask after the manual

placement of the model.
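A minimal sketch of this weighted profile correlation follows. It simplifies in two labeled ways: both rms values are computed over the supplied samples (rather than over the training-image mask and a blurred mask, respectively), and the per-profile weight is taken to be the product of boundary area and medial radius. The function names and array layout are hypothetical.

```python
import numpy as np


def rms_normalize(values, weights, subtract_mean=False):
    """Divide by the weighted rms; optionally offset to a weighted mean of zero first."""
    values = np.asarray(values, dtype=float)
    if subtract_mean:
        values = values - np.average(values, weights=weights)
    rms = np.sqrt(np.average(values ** 2, weights=weights))
    return values / rms


def match(template, target, tau_weights, areas, radii):
    """Average of Gaussian-weighted along-spoke profile correlations.

    template, target : (num_points, num_taus) intensities at common figural coordinates
    tau_weights      : (num_taus,) Gaussian weights in tau
    areas, radii     : (num_points,) boundary area and medial radius per profile
    """
    template = np.asarray(template, dtype=float)
    target = np.asarray(target, dtype=float)
    tau_w = np.broadcast_to(np.asarray(tau_weights, dtype=float), template.shape)
    t_hat = rms_normalize(template, tau_w, subtract_mean=True)
    i_hat = rms_normalize(target, tau_w)
    per_profile = np.sum(tau_w * t_hat * i_hat, axis=1)
    profile_w = np.asarray(areas, dtype=float) * np.asarray(radii, dtype=float)
    return np.sum(profile_w * per_profile) / np.sum(profile_w)
```

With these normalizations the result does not change when the target intensities are multiplied by a positive constant, or when the template is offset and rescaled.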


In our segmentation program the template is choos-

able from among a derivative of Gaussian, described

more precisely below, and the intensity values in the

training image in the region, described in more detail

in Section 4.2. In each case the template is normalized

by being offset by the mean intensity in the mask and

normalized in rms value.

The derivative of Gaussian template for model to image match is built in figural coordinates in the space of the model, i.e., the space of the training image. That is, each along-spoke template profile, after the Gaussian mask weighting, is a derivative of a Gaussian with a fixed standard deviation in the figural coordinate τ, or equivalently an r-proportional standard deviation in Euclidean distance. We choose 0.1 as the value of the standard deviation in τ. Since this template is associated with the target image via common figural coordinates, in effect the template in the target image space is not a derivative of 3D Gaussian but a warped derivative of 3D Gaussian, with the template's standard deviation in spatial terms increasing with the figural width.
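For concreteness, an along-spoke derivative-of-Gaussian profile can be written as below. The sign convention (positive inside the object, negative outside) is an assumption of this sketch, since the appropriate polarity depends on whether the object is lighter or darker than its background; the function names are hypothetical.

```python
import numpy as np


def dog_profile(taus, sd=0.1):
    """Derivative with respect to tau of exp(-tau^2 / (2 sd^2)), evaluated at taus."""
    taus = np.asarray(taus, dtype=float)
    return -taus / sd ** 2 * np.exp(-0.5 * (taus / sd) ** 2)


def spatial_sd(r, sd=0.1):
    """Euclidean standard deviation implied by a fixed sd in tau at medial radius r."""
    return sd * r
```

The second function states the warping in one line: a fixed standard deviation of 0.1 in τ corresponds to 0.1 r in Euclidean distance, so the template broadens where the figure is wider.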

3.3.2. Boundary Displacement Optimization. The boundary deformation stage is much like active surfaces, except that the geometric typicality term consists not only of a term measuring the closeness of each boundary displacement to that at each of the neighboring boundary positions but also a term measuring the log probability of these displacements in the medially based prior. Since the tolerance of the medially implied boundary is r-proportional, the log Gaussian medially based prior, conditional on the medial estimate, is proportional to the negative square of the r-normalized distance to the medially implied boundary (Chen, 1999). The method of Joshi (2001), with which we complete the segmentation, uses this combined geometric typicality measure, and its boundary to image match measure is a log probability based on the object and its background each having normal intensity distributions.

Figure 9. Segmentation results of the lateral horn of a cerebral ventricle at the m-rep level of scale (i.e., before boundary displacement) from MRI using a single figure model.

4. Segmentation Results: Deformed Models

4.1. Segmenting the Kidney from CT; Segmentation Accuracy

We have tested this method for the extraction of

three anatomic objects well modeled by a sin-

gle figure: the lateral cerebral ventricle, the kidney

parenchyma+pelvis, and the hippocampus. Extract-

ing the lateral ventricle from MR images is not very

challenging because the ventricle appears with high

contrast, but a single result using a Gaussian derivative

template is shown in Fig. 9.

Extracting the kidney from CT images is challenging

under the conditions of the work reported here for ra-

diation therapy treatment planning (RTP). The kidney

sits in a crowded soft tissue environment where parts

of its boundary have good contrast resolution against

surrounding structures but other parts have poor con-

trast resolution. Also, not too far away are ribs and

vertebrae, appearing very light with very high contrast

(Fig. 10). The typical CT protocol for RTP involves

non-gated slice-based imaging, without breath hold-

ing and without injecting the patient with a “dye” to

enhance the contrast of the kidney. During the time

interval between slice acquisition the kidneys are dis-

placed by respiratory motion, resulting in significantly

jagged contours in planes tilted relative to the slice

plane (Fig. 10). A combination of partial volume and

motion artifacts causes the poles to be poorly visual-

ized or spuriously extended to adjacent slices. Motion

and partial volume artifacts degrade the already poor

contrast between the kidney and adrenal gland, which

sits on top of the kidney.

The single figure m-rep used here includes part of the pelvis along with the kidney parenchyma, mimicking segmentation as performed for RTP. The complex architecture of the renal pelvis acts as structure noise for this m-rep. When using a Gaussian template that is designed to give increased response at boundaries next to which are non-narrow strips of object with intensity lighter than its background, the following behaviors are noted. (1) If the geometric penalty weight is low, the kidney m-rep can move inside a sequence of vertebral bodies because the high contrast on only a portion of the model results in a high model to image match value. This is easily prevented by an adequately high weight for geometric typicality. (2) A portion of the implied boundary of the kidney m-rep can move to include a rib, as a result of the high contrast of the rib. (3) A portion of the implied boundary of the kidney m-rep can move to include part of the muscle. (4) The boundary at the liver, appearing with at most texture contrast, does not attract the implied m-rep boundary, with the result that the geometric typicality term makes the kidney not quite follow the kidney/liver boundary. In other organs, parts of the edge with high contrast of opposite polarity to other parts would repel the m-rep boundary. Avoiding some of these difficulties necessitated replacing the first post-initialization similarity transform by a similarity transform augmented by an elongation for the main figure. Despite these challenges, segmentation using a Gaussian derivative template, built on a single right kidney, is successful when compared against human performance.

Figure 10. Sagittal plane through a CT of the kidney, used in this study, demonstrating significant partial volume and breathing artifacts. A human segmentation is shown as a green tint. Note the scalloped boundary and spurious sections of the kidney, which were segmented by one of two human raters but excluded by m-rep segmentation. Note also the nearby high-contrast rib that can create a repulsive force when a Gaussian derivative template is used.

Table 2. Comparison of m-reps segmentation to manual segmentation for six examples selected from 24 kidneys (twelve kidney pairs). Distances are in cm. The examples span the range, from best to worst, of human-m-rep volume overlap.

| Kidney code | Volumes compared | Vol. overlap | 1st-Qtl∗ | 2nd-Qtl∗ | 3rd-Qtl∗ | ⟨Dist⟩ | Hausdorff distance |
|---|---|---|---|---|---|---|---|
| 639 R | A–B | 0.947 | 0.000 | 0.200 | 0.200 | 0.117 | 0.693 |
| | A–C | 0.940 | 0.000 | 0.000 | 0.200 | 0.124 | 1.342 |
| | B–C | 0.940 | 0.000 | 0.200 | 0.200 | 0.123 | 1.183 |
| 646 L | A–B | 0.935 | 0.000 | 0.200 | 0.200 | 0.133 | 1.149 |
| | A–C | 0.939 | 0.000 | 0.000 | 0.200 | 0.108 | 0.917 |
| | B–C | 0.940 | 0.000 | 0.000 | 0.200 | 0.103 | 1.077 |
| 634 L | A–B | 0.953 | 0.000 | 0.000 | 0.200 | 0.103 | 0.600 |
| | A–C | 0.903 | 0.000 | 0.200 | 0.200 | 0.182 | 0.894 |
| | B–C | 0.899 | 0.200 | 0.200 | 0.283 | 0.191 | 0.849 |
| 633 L | A–B | 0.951 | 0.000 | 0.000 | 0.200 | 0.106 | 0.721 |
| | A–C | 0.909 | 0.000 | 0.200 | 0.200 | 0.162 | 1.217 |
| | B–C | 0.896 | 0.000 | 0.200 | 0.283 | 0.189 | 1.131 |
| 637 R | A–B | 0.957 | 0.000 | 0.000 | 0.200 | 0.079 | 0.566 |
| | A–C | 0.843 | 0.000 | 0.200 | 0.400 | 0.272 | 1.789 |
| | B–C | 0.840 | 0.000 | 0.200 | 0.400 | 0.280 | 1.897 |
| 635 L | A–B | 0.950 | 0.000 | 0.000 | 0.200 | 0.097 | 0.800 |
| | A–C | 0.807 | 0.000 | 0.200 | 0.400 | 0.307 | 2.209 |
| | B–C | 0.808 | 0.000 | 0.200 | 0.400 | 0.314 | 2.272 |

∗Quartile columns give the surface separation associated with each quartile, e.g., an entry of .200 in the 2nd-Qtl column means that 50% of all voxels on the surfaces of the compared segmentations are separated by no more than .200 cm (.200 cm is the smallest unit of measurement). ⟨Dist⟩ is the median distance between the surfaces of the compared segmentations.

An example deformation sequence is shown in Fig. 11, showing the improved segmentation at each stage. Results of a typical kidney segmentation are visualized in Fig. 12. Comparisons between m-rep segmentation (rater C) and human segmentation (raters A and B) using our evaluation system Valmet (Gerig, 2001) are given in Table 2 and Figs. 13–15. Comparisons are given for 12 kidney pairs (12 right kidneys and corresponding left kidneys). Manual segmentation by A and B was performed slice-by-slice using the program Mask (Tracton, 1994). Within-slice pixel size was approximately 1 mm, and slice thickness varied image to image between 3 mm and 8 mm. Images were resampled for m-rep segmentation to yield isotropic 2 mm voxels. At the comparison stage using Valmet the segmented volumes, originally represented by sets of contours for humans and as 3D surfaces for m-reps, were scan converted to produce voxelized (2 mm voxels) representations.

Figure 11. Stage by stage progress: all rows, from left to right, show results on Coronal, Sagittal and Axial CT slices. Each row compares progress through consecutive stages via overlaid grey curves to show the kidney segmentation after stage N vs. white curves after stage N + 1. Top row: stages are the initial position of the kidney model vs. the figural similarity transform plus elongation. Middle: the similarity transform plus elongation vs. medial atom transformations. Bottom: medial atom transformations vs. 3D boundary displacements.

The median volume overlap for human segmentations, as measured by the overlap volume divided by the union of the two volumes being compared, is 94% (σ = 1.7%, min = 90%, max = 96%). The mean surface separation, averaged over all kidneys, is 1.1 mm (σ = 0.3 mm); the mean surface separation for a given kidney is defined in terms of closest points, i.e., as

$$\frac{1}{2}\left[ \frac{1}{N_1} \sum_{i=1}^{N_1} \min_{j=1,\ldots,N_2} \left\| y^1_i - y^2_j \right\| + \frac{1}{N_2} \sum_{j=1}^{N_2} \min_{i=1,\ldots,N_1} \left\| y^1_i - y^2_j \right\| \right],$$