
Joint Prior Models of Neighboring Objects for 3D Image

Segmentation

Jing Yang and James S. Duncan

Departments of Electrical Engineering and Diagnostic Radiology, Yale University P.O. Box 208042,

New Haven CT 06520-8042, USA

Abstract

This paper presents a novel method for 3D image segmentation that employs a Bayesian formulation based on joint prior knowledge of multiple objects, combined with information derived from the input image. Our method is motivated by the observation that neighboring structures have consistent

locations and shapes that provide configurations and context that aid in segmentation. In contrast to

the work presented earlier in [1], we define a Maximum A Posteriori (MAP) estimation model using

the joint prior information of the multiple objects to realize image segmentation, which allows objects with clearer boundaries to serve as reference objects providing constraints in the segmentation of difficult objects. To achieve this, multiple signed distance functions are employed

as representations of the objects in the image. We introduce a representation for the joint density

function of the neighboring objects, and define a joint probability distribution over the variations of

objects contained in a set of training images. By estimating the MAP shapes of the objects, we

formulate the joint shape prior models in terms of level set functions. We found the algorithm to be

robust to noise and able to handle multidimensional data. Furthermore, it avoids the need for point

correspondences during the training phase. Results and validation from various experiments on 2D/3D medical images are presented.

I. Introduction

Image segmentation remains an important and challenging task due to poor image contrast,

noise, and missing or diffuse boundaries. To address these problems, Snakes or Active Contour

Models (ACM) (Kass et al. (1987)) [2] have been widely used for segmenting non-rigid objects

in a wide range of applications, where an initial contour is deformed towards the boundary of

the object to be detected by minimizing an energy functional. These methods may be sensitive

to the starting position and may “leak” through the boundary of the object if the edge feature

is not salient enough.

In more sophisticated deformable models, the incorporation of more specific prior information

into deformable models has received a large amount of attention. Cootes et al. [3] find

corresponding points across a set of training images and construct a statistical model of shape

variation from the point positions. The best match of the model to the image is found by

searching over the model parameters. Staib and Duncan [4] incorporate global shape

information into the segmentation process by using an elliptic Fourier decomposition of the

boundary and placing a Gaussian prior on the Fourier coefficients. Zeng et al. [5] develop a

coupled surfaces algorithm to segment the cortex by using a thickness prior constraint.

© 2004 IEEE

j.yang@yale.edu

NIH Public Access

Author Manuscript

Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. Author manuscript; available in PMC

2010 May 5.

Published in final edited form as:

Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 2004 June 27; 1: I314–I319. doi:10.1109/CVPR.2004.1315048.


Leventon et al. [6] extend Caselles' [7] geodesic active contours by incorporating shape

information into the evolution process.

Our work is also a prior-information-based approach to image segmentation. As an extension

of the neighbor-constrained deformable model presented earlier in [1], our work shares the

observation that neighboring structures have consistent locations and shapes that provide

configurations and context that aid in segmentation. In contrast to the work presented in [1],

the MAP segmentation framework that we present in this paper is based on joint prior

information of the multiple objects in the image (instead of using the conditional local neighbor

prior information). The objects with clearer boundaries in the image can be used as reference

objects to provide constraints in the segmentation of difficult objects. Our work also shares common aspects with a number of coupled active contour models [1][5][8], where multiple

level set functions are employed as the representations of the multiple objects within the image.

By using this level-set-based numerical algorithm, several objects can be segmented

simultaneously.

The strength of our approach is the incorporation of joint prior information of multiple objects

into image segmentation to improve the segmentation results as well as reduce the complexity

of the segmentation process by providing prior constraints from multiple neighboring objects.

Our model is based on a MAP framework using the joint prior information of neighboring

objects within the image. We introduce a representation for the joint density function of the

neighboring objects and define the corresponding probability distributions. Formulating the

segmentation as a MAP estimation of the shapes of the objects and modeling in terms of level

set functions, we compute the associated Euler-Lagrange equations. The contours evolve while

attempting to adhere to the neighbor prior information and the image gray level information.

II. Description of the Model

A. MAP Framework with Joint Prior Information Of Multiple Objects

As presented in our previous work in [1], probabilistic formulations are powerful approaches

to deformable models. Deformable models can be fit to the image data by finding the model

shape parameters that maximize the posterior probability. Consider an image I that has M

shapes of interest; a MAP framework can be used to realize image segmentation combining

joint prior information of the neighboring objects and image information:

$$\{\hat{S}_1, \ldots, \hat{S}_M\} = \arg\max_{S_1, \ldots, S_M} p(S_1, \ldots, S_M \mid I) = \arg\max_{S_1, \ldots, S_M} p(I \mid S_1, \ldots, S_M)\, p(S_1, \ldots, S_M) \tag{1}$$

where S1, S2, …, SM are the evolving surfaces of all the shapes of interest.

p(I|S1, S2, …, SM) is the probability of producing an image I given S1, S2, …, SM. In 3D,

assuming gray level homogeneity within each object, we use the following imaging model

[9]:

$$p(I \mid S_1, \ldots, S_M) = \prod_{i=1}^{M} \left[ \prod_{\text{inside } S_i} \frac{1}{\sqrt{2\pi}\,\sigma_{1i}}\, e^{-\frac{(I - c_{1i})^2}{2\sigma_{1i}^2}} \prod_{\substack{\text{outside } S_i \\ \text{inside } \Omega_i}} \frac{1}{\sqrt{2\pi}\,\sigma_{2i}}\, e^{-\frac{(I - c_{2i})^2}{2\sigma_{2i}^2}} \right] \tag{2}$$


where c1i and σ1i² are the mean and variance of I inside Si, and c2i and σ2i² are the mean and variance of I outside Si but inside a domain Ωi that contains Si.
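These region statistics can be sketched in Python with NumPy; the function names, and the use of a caller-supplied mask for the domain Ωi, are illustrative assumptions rather than details from the paper:

```python
import numpy as np

def gaussian_fit_loglik(vals):
    """Log-likelihood of vals under a Gaussian fitted to them (constants dropped)."""
    c, sigma = vals.mean(), max(vals.std(), 1e-6)  # guard against zero variance
    return -0.5 * np.sum(((vals - c) / sigma) ** 2) - vals.size * np.log(sigma)

def region_log_likelihood(img, inside, omega):
    """log p(I | S_i) for one object under the homogeneity model:
    Gaussian statistics (c_1i, sigma_1i^2) inside S_i and
    (c_2i, sigma_2i^2) outside S_i but inside the domain Omega_i."""
    return (gaussian_fit_loglik(img[inside])
            + gaussian_fit_loglik(img[omega & ~inside]))
```

A mask aligned with the true object yields nearly homogeneous regions and hence a higher likelihood than a misplaced mask.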

p(S1, S2, …, SM) is the joint density function of all the M objects. It contains the neighbor prior

information such as the relative position and shape among the objects. In many cases, objects

to be detected have one or more neighboring structures with known or clearer boundaries in

the image. These known or easily segmented objects can be used as reference objects to provide

constraints in the segmentation of those difficult objects. Assume S1 = ξ1, S2 = ξ2, …, Sl = ξl

(1 ≤ l < M) are the known shapes in the image, Sl+1, Sl+2, …, SM are the difficult shapes to be

segmented. Then the MAP framework in equation (1) can be written in this form:

$$\{\hat{S}_{l+1}, \ldots, \hat{S}_M\} = \arg\max_{S_{l+1}, \ldots, S_M} p(I \mid \xi_1, \ldots, \xi_l, S_{l+1}, \ldots, S_M)\, p(\xi_1, \ldots, \xi_l, S_{l+1}, \ldots, S_M) \tag{3}$$

B. Joint Prior Model Of Multiple Objects

To build a model for the joint prior of the neighboring objects, we choose level sets as the

representation of the shapes [1][6][8], and then define the joint probability density function p(S1, S2, …, SM) in equation (1).

Consider a training set of n aligned images {I1, I2, …, In}, with M objects or structures of

interest in each image. Each of the Mn surfaces in the training set is embedded as the zero level set of a separate higher-dimensional signed distance function, with negative distances inside and positive distances outside the object, as shown below:

$$\Psi_{ij}(x, y, z) = \begin{cases} -d\big((x, y, z), S_{ij}\big), & (x, y, z) \text{ inside } S_{ij} \\ +d\big((x, y, z), S_{ij}\big), & (x, y, z) \text{ outside } S_{ij} \end{cases} \tag{4}$$

where d(·, Sij) denotes the distance to the surface Sij of object i in training image Ij.
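A brute-force 2D construction of such a signed distance map (negative inside, positive outside) might look like the following; this helper is an illustrative sketch suited to small training masks, not the paper's implementation:

```python
import numpy as np

def signed_distance(mask):
    """Signed distance level set for a binary object mask:
    negative inside the object, positive outside, zero on the boundary."""
    padded = np.pad(mask, 1)  # pad so every pixel has 4 neighbours
    ys, xs = np.nonzero(mask)
    boundary = []
    for y, x in zip(ys, xs):
        # object pixels with at least one background 4-neighbour
        nb = padded[y:y + 3, x:x + 3][[0, 1, 1, 2], [1, 0, 2, 1]]
        if not nb.all():
            boundary.append((y, x))
    by, bx = np.array(boundary).T
    yy, xx = np.mgrid[:mask.shape[0], :mask.shape[1]]
    d = np.sqrt((yy[..., None] - by) ** 2 + (xx[..., None] - bx) ** 2).min(-1)
    return np.where(mask, -d, d)
```

In practice a fast distance transform (e.g. scipy.ndimage.distance_transform_edt) would replace the brute-force loop over boundary pixels.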

Using the technique developed in [6], each of the Ψij (i = 1, 2, …, M; j = 1, 2, …, n) is sampled and placed as a column vector with N^d elements, where d is the number of spatial dimensions and N^d is the number of samples of each level set function. We can use the stacked vector χj = [Ψ1j; Ψ2j; …; ΨMj] as the representation of the M objects of interest in image Ij. Thus, the corresponding training set is {χ1, χ2, …, χn}. Our goal is to build a joint model of the multiple objects over the distribution of the level sets vector χ.

Following the lead of [6][8], the mean and variance of the level sets vector χ can be computed using Principal Component Analysis (PCA). The mean level sets vector, χ̄, is calculated using

$$\bar{\chi} = \frac{1}{n} \sum_{j=1}^{n} \chi_j \tag{5}$$

For each level sets vector χj in the training set we calculate its deviation from the mean, dχj,

where

$$d\chi_j = \chi_j - \bar{\chi} \tag{6}$$


Then each such deviation is placed as a column vector in an MN^d × n dimensional matrix Q.

Using Singular Value Decomposition (SVD), Q = UΣVT. U is a matrix whose column vectors

represent the set of orthogonal modes of joint variation of the M objects and Σ is a diagonal

matrix of corresponding singular values. An estimate of the joint level sets of the M objects

χ can be represented by k principal components and a k dimensional vector of coefficients α

(where k < n)[3]:

$$\hat{\chi} = \bar{\chi} + U_k \alpha \tag{7}$$

where U_k is the matrix of the first k columns of U.

To write equation (7) in the level sets form, we have:

$$\hat{\Psi}_i = \bar{\Psi}_i + \sum_{m=1}^{k} \alpha_m u_{i,m}, \qquad i = 1, \ldots, M \tag{8}$$

where Ψ̄i and u_{i,m} denote the portions of χ̄ and of the m-th column of U_k corresponding to object i.
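The PCA construction of equations (5)-(8) — mean, deviation matrix Q, SVD, truncation to k modes — can be sketched in NumPy as follows (the function names are illustrative):

```python
import numpy as np

def fit_joint_shape_model(chi, k):
    """PCA of stacked joint level-set vectors.

    chi: (M*N**d, n) array, one column per training image; each column is
    the concatenation of the M level set functions of that image.
    Returns the mean vector, the first k modes U_k, and their singular values.
    """
    mean = chi.mean(axis=1, keepdims=True)   # mean level sets vector, eq. (5)
    Q = chi - mean                           # stacked deviations, eq. (6)
    U, s, _ = np.linalg.svd(Q, full_matrices=False)
    return mean, U[:, :k], s[:k]

def project(chi_new, mean, Uk):
    """Coefficients alpha of a joint shape in the learned subspace."""
    return Uk.T @ (chi_new - mean)

def reconstruct(alpha, mean, Uk):
    """Estimate of the joint level sets: mean + U_k alpha, eq. (7)."""
    return mean + Uk @ alpha
```

Truncating to k < n modes keeps the dominant joint variations of the M objects while discarding the remainder as noise.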

Under the assumption of a Gaussian distribution of joint level sets represented by α, the joint

probability density function of neighboring objects, p(S1, S2, …, SM), can be approximated by:

$$p(S_1, \ldots, S_M) \approx p(\alpha) = \frac{1}{\sqrt{(2\pi)^k |\Sigma_k|}} \exp\!\left(-\frac{1}{2}\, \alpha^{T} \Sigma_k^{-1} \alpha\right) \tag{9}$$

where Σ_k is the diagonal matrix of the first k eigenvalues.
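Evaluating the log of this Gaussian density over α is then straightforward; taking the mode variances as s_i²/n from the singular values s_i of Q is a common PCA convention assumed here, not stated in the text:

```python
import numpy as np

def shape_log_prior(alpha, singular_values, n):
    """log p(alpha) up to an additive constant, with independent Gaussian
    modes whose variances are s_i**2 / n (eigenvalues of the sample covariance)."""
    lam = singular_values ** 2 / n
    return -0.5 * np.sum(alpha ** 2 / lam)
```

Shapes far from the training distribution (large |α| relative to the mode variances) receive a strongly negative log-prior and are penalized during segmentation.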

Figure 1 shows a few of the 16 MR cardiac training images used to define the level set based

shape model of the endocardial boundary of the left and right ventricles. Before computing

and combining the level sets of these training shapes, the curves were rigidly aligned. By using

PCA of the joint level sets of the two structures, we can build a model of the joint shapes of

left and right ventricles. Figure 2 illustrates zero level sets corresponding to the mean and three

primary modes of variance of the distribution of the two ventricles jointly.

We also show a 3D training set of two rigidly aligned subcortical structures, the left amygdala and left hippocampus, in Figure 3. Figure 4 shows the three primary modes of variance of the left amygdala and left hippocampus. Note that the zero level sets of the mean joint level sets

and primary modes appear to be reasonable representative shapes of the classes of objects being

learned. This shows that our joint prior model of multiple objects successfully incorporates the

neighbor prior information such as the relative position and shape among the objects and unifies

them under one framework.

In our active contour model, we also add some regularizing terms [1]: a general smoothness

Gibbs prior for the region boundaries pB(S1, S2, …, SM) and a model for the size of the region

pA(S1, S2, …, SM).

$$p_B(S_1, \ldots, S_M) \propto \prod_{i=1}^{M} e^{-\mu_i \oint_{S_i} ds} \tag{10}$$


$$p_A(S_1, \ldots, S_M) \propto \prod_{i=1}^{M} c\, e^{-\nu_i A_i} \tag{11}$$

where Ai is the size of the region of shape i, c is a constant, and μi and νi are scalar factors. Here

we assume the boundary smoothness and the region size of all the objects are independent.

Thus, the joint prior probability p(S1, S2, …, SM) can be approximated by a product of the

following probabilities:

$$p(S_1, \ldots, S_M) \approx p(\alpha)\, p_B(S_1, \ldots, S_M)\, p_A(S_1, \ldots, S_M) \tag{12}$$

Therefore, equation (1) can be approximated by:

$$\{\hat{S}_1, \ldots, \hat{S}_M\} = \arg\max_{S_1, \ldots, S_M} p(I \mid S_1, \ldots, S_M)\, p(\alpha)\, p_B(S_1, \ldots, S_M)\, p_A(S_1, \ldots, S_M) \tag{13}$$

Since:

$$\arg\max_{S_1, \ldots, S_M} p(I \mid S_1, \ldots, S_M)\, p(S_1, \ldots, S_M) = \arg\min_{S_1, \ldots, S_M} \big[-\log p(I \mid S_1, \ldots, S_M) - \log p(S_1, \ldots, S_M)\big] \tag{14}$$

Let

$$E(S_{l+1}, \ldots, S_M) = -\log p(I \mid \xi_1, \ldots, \xi_l, S_{l+1}, \ldots, S_M) - \log p(\xi_1, \ldots, \xi_l, S_{l+1}, \ldots, S_M) \tag{15}$$

Given the first l objects in the image, the MAP estimation of the other shapes of interest in equation (3), {Ŝl+1, …, ŜM}, is also the minimizer of the above energy functional E.

This minimization problem can be formulated and solved using the level set method and we

can realize the segmentation of multiple objects simultaneously.
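As a simplified, single-object sketch of such an evolution, one explicit gradient-descent step with a curvature smoothing force and two-region gray level forces (using the negative-inside convention from the training-shape representation) might look like this; the paper's full coupled evolution also carries the joint shape prior force, omitted here:

```python
import numpy as np

def evolve_step(psi, img, c1, c2, dt=0.1, mu=0.2):
    """One explicit step of level set evolution.

    psi: level set (negative inside); img: image; c1/c2: mean intensities
    inside/outside; mu: weight of the curvature (smoothness) force.
    """
    gy, gx = np.gradient(psi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + 1e-8
    kyy, _ = np.gradient(gy / norm)   # d/dy of unit normal's y-component
    _, kxx = np.gradient(gx / norm)   # d/dx of unit normal's x-component
    curvature = kxx + kyy             # div(grad psi / |grad psi|)
    # region force: push psi up where the pixel looks like background,
    # down where it looks like the object (negative-inside convention)
    region = (img - c1) ** 2 - (img - c2) ** 2
    return psi + dt * (mu * curvature + region)
```

Iterating this step moves the zero level set toward the boundary between the two intensity regions; coupling several such equations, one per ψi, segments the objects simultaneously.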

C. Level Set Formulation of the Model

In the level set method, Si is the zero level set of a higher-dimensional level set function ψi corresponding to the i-th object being segmented, i.e., Si = {(x, y, z) | ψi(x, y, z) = 0}. The evolution of surface

Si is given by the zero-level surface at time t of the function ψi(t, x, y, z). We define ψi to be
