3D characterization of brain atrophy in Alzheimer's disease and mild
cognitive impairment using tensor-based morphometry
Xue Hua,a Alex D. Leow,a Suh Lee,a Andrea D. Klunder,a Arthur W. Toga,a Natasha Lepore,a
Yi-Yu Chou,a Caroline Brun,a Ming-Chang Chiang,a Marina Barysheva,a Clifford R. Jack, Jr.,b
Matt A. Bernstein,b Paula J. Britson,b Chadwick P. Ward,b Jennifer L. Whitwell,b
Bret Borowski,b Adam S. Fleisher,c Nick C. Fox,d Richard G. Boyes,d Josephine Barnes,d
Danielle Harvey,e John Kornak,f Norbert Schuff,g,i Lauren Boreta,g Gene E. Alexander,j
Michael W. Weiner,g,h,i and Paul M. Thompson,a,⁎
the Alzheimer's Disease Neuroimaging Initiative
a Laboratory of Neuro Imaging, Department of Neurology, UCLA School of Medicine, Neuroscience Research Building 225E,
635 Charles Young Drive, Los Angeles, CA 90095-1769, USA
b Mayo Clinic College of Medicine, Rochester, MN, USA
c Department of Neurosciences, UC San Diego, La Jolla, CA, USA
d Dementia Research Centre, University College London, Institute of Neurology, London, UK
e Department of Public Health Sciences, UC Davis School of Medicine, Davis, CA, USA
f Department of Radiology and Department of Epidemiology and Biostatistics, UC San Francisco, San Francisco, CA, USA
g Department of Radiology, UC San Francisco, San Francisco, CA, USA
h Department of Medicine and Psychiatry, UC San Francisco, San Francisco, CA, USA
i Department of Veterans Affairs Medical Center, San Francisco
j Department of Psychology and Evelyn F. McKnight Brain Institute, University of Arizona, Tucson, AZ, USA
Received 26 September 2007; revised 6 February 2008; accepted 11 February 2008
Available online 21 February 2008
Tensor-based morphometry (TBM) creates three-dimensional maps
of disease-related differences in brain structure, based on nonlinearly
registering brain MRI scans to a common image template. Using two
different TBM designs (averaging individual differences versus align-
ing group average templates), we compared the anatomical distribu-
tion of brain atrophy in 40 patients with Alzheimer's disease (AD),
40 healthy elderly controls, and 40 individuals with amnestic mild
cognitive impairment (aMCI), a condition conferring increased risk
for AD. We created an unbiased geometrical average image template
for each of the three groups, which were matched for sex and age
(mean age: 76.1 years ± 7.7 SD). We warped each individual brain
image (N=120) to the control group average template to create
Jacobian maps, which show the local expansion or compression factor
at each point in the image, reflecting individual volumetric differences.
Statistical maps of group differences revealed widespread medial
temporal and limbic atrophy in AD, with a lesser, more restricted
distribution in MCI. Atrophy and CSF space expansion both correlated
strongly with Mini-Mental State Exam (MMSE) scores and Clinical
Dementia Rating (CDR). Using cumulative p-value plots, we investi-
gated how detection sensitivity was influenced by the sample size, the
choice of search region (whole brain, temporal lobe, hippocampus), the
initial linear registration (9- versus 12-parameter), and the type
of TBM design. In the future, TBM may help to (1) identify factors that
resist or accelerate the disease process, and (2) measure disease burden
in treatment trials.
© 2008 Elsevier Inc. All rights reserved.
Alzheimer's disease (AD) is the commonest form of dementia
worldwide, afflicting over 5 million people in the United States
alone. In early AD, memory is typically among the first functions
to be impaired, followed by a progressive decline in executive
function, language, affect, and other cognitive and behavioral
domains. It would be beneficial to prevent AD progression before
widespread neurodegeneration has occurred, so recent therapeutic
efforts have also focused on individuals with mild cognitive
impairment (MCI), a transitional state between normal aging and
dementia that carries a 4–6-fold increased risk, relative to the
general population, of future diagnosis of dementia (Petersen
et al., 1999; Petersen, 2000; Petersen et al., 2001). Early detection
NeuroImage 41 (2008) 19–34
⁎Corresponding author. Fax: +1 310 206 5518.
E-mail address: email@example.com (P.M. Thompson).
1053-8119/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
requires innovations in tracking disease burden in vivo (Fleisher
et al., 2007). Magnetic resonance imaging (MRI) and MRI-based
image analysis methods have the potential to track brain atrophy
automatically at multiple time-points. MRI has revealed fine-scale
anatomical changes which are associated with cognitive decline
and which occur in a spreading pattern that mirrors the advance of
pathology (Thompson and Apostolova, in press). MRI-based maps
of brain degeneration are beginning to reveal the distribution and
evolution of cerebral volume losses, how brain changes in AD and
other dementias relate to behavior, and which brain changes
predict imminent decline (Scahill et al., 2003; Apostolova et al.,
2006; Apostolova and Thompson, 2007).
Tensor-based morphometry (TBM) is a relatively new image
analysis technique that identifies regional structural differences
from the gradients of the deformation fields that align, or ‘warp’,
images to a common anatomical template (reviewed in (Ashburner
and Friston, 2003)). Highly automated methods such as TBM are
being tested to examine their utility in large-scale clinical trials,
and in studies to identify factors that influence disease onset,
progression (Leow et al., 2005b; Cardenas et al., 2007), or normal
development (Thompson et al., 2000a; Chung et al., 2001; Hua
et al., in press). In TBM, a nonlinear registration algorithm reshapes
each 3D structural image to match a target brain image – either
based on an individual subject, or specially constructed to reflect
the mean anatomy of a population (Kochunov et al., 2001, 2002;
Lepore et al., 2007). Color-coded Jacobian maps – which show the
local expansion or compression factor at each point in the image –
indicate local volume loss or gain relative to a reference image
(Freeborough and Fox, 1998; Chung et al., 2001; Fox et al., 2001;
Ashburner and Friston, 2003; Riddle et al., 2004). TBM may also
be used to map systematic anatomic differences between different
patient groups using cross-sectional data (Davatzikos et al., 2003;
Shen and Davatzikos, 2003; Studholme et al., 2004; Dubb et al.,
2005; Brun et al., 2007; Chiang et al., 2007a,b; Lee et al., 2007;
Lepore et al., 2008).
The traditional TBM design (Ashburner, 2007; Chiang et al.,
2007a,b) computes individual Jacobian maps, i.e. “expansion
factor maps”, from the non-linear registrations that align each
subject's MRI image to a reference brain. Distinguishing features
of group morphometry emerge after the maps of individual
anatomical differences from the template are compared statisti-
cally across groups, or correlated with relevant clinical measures.
This scheme may be called ‘averaging individual differences’ in
the sense that the signal analyzed is based on maps of anatomical
differences computed for every individual separately (Rohlfing et
al., 2005). We use this term to distinguish it from an approach
that directly aligns mean anatomical templates representing each
group (Rohlfing et al., 2005; Aljabar et al., 2008). By contrast,
when a Jacobian map is created for each subject – which is the
standard TBM approach that we use to report findings in this
paper – correlations may be assessed between the detected
individual differences and individual factors such as age, sex and
clinical scores. We compare the standard and direct approaches
later in this paper.
3D maps that define the level of atrophy (relative to appropriate
controls) at a certain disease stage (Jack et al., 2005), may have
value in staging the degenerative process, predicting outcomes, and
understanding atrophic patterns characteristic of different dementia
subtypes or stages, e.g. when individuals transition from MCI to
AD. In this study, we examined the level of atrophy in AD and
MCI relative to controls; we studied how specific methodological
choices (e.g., sample size, initial linear registration) affected the
statistical power to detect these differences; and we also inves-
tigated, at a voxelwise level, how brain atrophy correlated with
clinical measures such as MMSE, and global Clinical Dementia
Rating (CDR). Finally, we compared our results using the tradi-
tional TBM design with ones from directly aligning group average
images – a relatively new concept in deformation-based group
morphometry, which has been advocated recently in the literature
(Rohlfing et al., 2005; Aljabar et al., 2006, 2008).
Materials and methods
The Alzheimer's Disease Neuroimaging Initiative (ADNI)
(Mueller et al., 2005a,b) is a large multi-site longitudinal MRI
and FDG-PET (fluorodeoxyglucose positron emission tomogra-
phy) study of 800 adults, ages 55 to 90, including 200 elderly
controls, 400 subjects with mild cognitive impairment, and 200
patients with AD. The ADNI was launched in 2003 by the
National Institute on Aging (NIA), the National Institute of
Biomedical Imaging and Bioengineering (NIBIB), the Food and
Drug Administration (FDA), private pharmaceutical companies
and non-profit organizations, as a $60 million, 5-year public-
private partnership. The primary goal of ADNI has been to test
whether serial MRI, PET, other biological markers, and clinical
and neuropsychological assessment can be combined to measure
the progression of MCI and early AD. Determination of sensitive
and specific markers of very early AD progression is intended to
aid researchers and clinicians to develop new treatments and
monitor their effectiveness, as well as lessen the time and cost of
clinical trials. The Principal Investigator of this initiative is
Michael W. Weiner, M.D., VA Medical Center and University of
California – San Francisco.
At the time of writing this report, data collection for the ADNI
project is in progress. Here we performed an initial analysis of the
screening MRI scans of 120 subjects, divided into 3 groups: 40
healthy elderly individuals, 40 individuals with amnestic MCI, and
40 individuals with probable AD. Each group of 40 subjects was
well matched in terms of gender and age: each group included 21
males and 19 females; mean ages for the control, MCI and AD
groups were, respectively, 76.2 years (standard deviation (SD)=
6.9 years), 75.9 years (SD=8.3), and 76.0 years (SD=8.5), with no
To test whether each type of TBM design correctly detects no
differences when no true differences are present, we selected an
independent (second) group of normal subjects (N=40, mean age=
76.0 years, SD=4.5 years), age- and gender-matched to the first
group of controls. There was no overlap between this group and the
initial normal group described above.
All subjects underwent thorough clinical/cognitive assessment
at the time of scan acquisition. As part of each subject's cognitive
evaluation, the Mini-Mental State Examination (MMSE) was ad-
ministered to provide a global measure of mental status based on
evaluation of five cognitive domains (Folstein et al., 1975; Cock-
rell and Folstein, 1988); scores of 24 or less (out of a maximum of
30) are generally consistent with dementia. The Clinical Dementia
Rating (CDR) was also assessed as a measure of dementia severity
(Hughes et al., 1982; Morris, 1993). Global CDR scores of 0, 0.5, 1, 2,
and 3 indicate, respectively, no dementia, very mild, mild, moderate,
and severe dementia. The elderly normal subjects had MMSE
scores between 28 and 30 (inclusive), a global CDR of 0, and no
subjects had MMSE scores in the range of 24 to 30, a global CDR
of 0.5, and mild memory complaints, with memory impairment
assessed via education-adjusted scores on the Wechsler Memory
Scale - Logical Memory II (Wechsler, 1987). All AD patients met
NINCDS/ADRDA criteria for probable AD (McKhann et al., 1984)
with an MMSE score between 20 and 23. As such, these subjects
Overall, ADNI included AD subjects with MMSE scores as high as
consistent level of atrophy might be identified. 16 AD patients had a
CDR of 0.5, and the rest had a CDR of 1. Detailed exclusion criteria,
e.g., regarding concurrent use of psychoactive medications, may be
found in the ADNI protocol (Mueller et al., 2005a,b). Briefly,
subjects were excluded if they had any serious neurological disease
other than incipient AD, any history of brain lesions or head trauma,
or psychoactive medication use (including antidepressants, neuro-
leptics, chronic anxiolytics or sedative hypnotics, etc.).
The study was conducted according to Good Clinical Practice,
the Declaration of Helsinki and U.S. 21 CFR Part 50-Protection of
Human Subjects, and Part 56-Institutional Review Boards. Written
informed consent for the study was obtained from all participants
MRI acquisition and image correction
All subjects were scanned with a standardized MRI protocol,
developed after a major effort evaluating and comparing 3D T1-
weighted sequences for morphometric analyses (Leow et al., 2006;
Jack et al., in press).
High-resolution structural brain MRI scans were acquired at
multiple ADNI sites using 1.5 Tesla MRI scanners from General
Electric Healthcare and Siemens Medical Solutions. All scans were
collected according to the standard ADNI MRI protocol. For each
subject, two T1-weighted MRI scans were collected using a sagittal
1.5T acquisition parameters are a repetition time (TR) of 2400 ms,
minimum full TE, an inversion time (TI) of 1000 ms, a flip angle of 8°,
a 24 cm field of view, and an acquisition matrix of 192×192×166 in the x-,
y-, and z-dimensions, yielding a voxel size of 1.25×1.25×1.2 mm³.
In plane, zero-filled reconstruction (i.e., sinc interpolation) yielded a
256×256 matrix for a reconstructed voxel size of 0.9375×0.9375×
corrections to ensure consistency among scans acquired at different
sites (Gunter et al., 2006).
Additional image corrections were also applied, using a proces-
sing pipeline at the Mayo Clinic, consisting of: (1) a procedure
termed GradWarp for correction of geometric distortion due to
gradient non-linearity (Jovicich et al., 2006), (2) a “B1-correction”,
to adjust for image intensity non-uniformity using B1 calibration
scans (Jack et al., in press), (3) “N3” bias field correction, for
reducing intensity inhomogeneity caused by non-uniformities in
the radio frequency (RF) receiver coils (Sled et al., 1998), and (4)
geometrical scaling, according to a phantom scan acquired for each
subject (Jack et al., in press), to adjust for scanner- and session-
specific calibration errors. In addition to the original uncorrected
image files, images with all of these corrections already applied
(GradWarp, B1, phantom scaling, and N3) are available to the
general scientific community.
To adjust for global differences in brain positioning and scale
across individuals, all scans were linearly registered to the stereo-
tactic space defined by the International Consortium for Brain
Mapping (ICBM-53) (Mazziotta et al., 2001) with a 9-parameter
(9P) transformation (3 translations, 3 rotations, 3 scales) using the
Minctracc algorithm (Collins et al., 1994). The Results section
reports separate tests based on using 12-parameter affine registra-
tions for the initial (global) component of the registration (which
also allows shearing along x, y, and z axes). Globally aligned images
were resampled in an isotropic space of 220 voxels along each axis
(x, y, and z) with a final voxel size of 1 mm³.
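As a rough illustration of the 9-parameter transformation described above, the sketch below composes three scales, three rotations, and three translations into a single 4×4 homogeneous matrix. The function name `nine_param_matrix`, the rotation order, and the choice to scale before rotating are assumptions made for this example only; they are not taken from Minctracc or from the ADNI processing pipeline.

```python
import numpy as np

def nine_param_matrix(translation, rotation_deg, scale):
    """Compose a 4x4 matrix from 3 translations, 3 rotations, and 3 scales.

    Assumed convention (hypothetical, for illustration): scale first,
    then rotate about x, y, z in that order, then translate.
    """
    rx, ry, rz = np.deg2rad(rotation_deg)
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    M = np.eye(4)
    M[:3, :3] = Rz @ Ry @ Rx @ np.diag(scale)  # linear part: rotation * scale
    M[:3, 3] = translation                     # translation column
    return M

# Example: a mild global rescaling with a small rotation and shift.
M = nine_param_matrix([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [1.1, 0.9, 1.2])
global_volume_factor = np.linalg.det(M[:3, :3])
```

Under these assumptions, the determinant of the upper-left 3×3 block equals the product of the three scale factors, i.e. the global volume scaling that the linear step accounts for before any nonlinear warping. A 12-parameter affine transform would add three shear terms to the linear part.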
Unbiased group average template - Minimal Deformation Target
A minimal deformation target (MDT) is an unbiased average
a group of subjects, typically with a mathematically-defined mean
Joshi et al., 2004; Studholme and Cardenas, 2004; Kovacevic et al.,
2005; Christensen et al., 2006; Lorenzen et al., 2006; Lepore et al.,
The motivation for constructing a mean geometric template, or
‘customized template’, based on subjects in the study is to make it
easier to automatically register new scans to the template, to reduce
bias in the registrations (using a template that deviates least from the
been shown to be slightly higher if a customized template is used
(Lepore et al., 2007). To construct an MDT for the normal subject
group, the 9-parameter globally aligned brain scans (N=40) were
averaged voxel-by-voxel after intensity normalization to create an
initial affine average template. Next, the aligned individual scans
were non-linearly registered to the affine average template using a
non-linear inverse consistent elastic intensity-based registration
algorithm (Leow et al., 2005a,b). Satisfactory registration was
achieved when a joint cost function was optimized, based on a linear
combination of the mutual information (MI) between the deforming
image and the target (affine average template) and the elastic energy
of the deformation, which quantifies the irregularity of the defor-
mation field. The deformation field was computed using a spectral
method to implement the Cauchy-Navier elasticity operator
(Marsden and Hughes, 1983; Thompson et al., 2000b) using a Fast
Fourier Transform (FFT) resolution of 32×32×32. This corre-
sponds to an effective voxel size of 6.875 mm in the x, y, and z
dimensions (220 mm/32=6.875 mm). The non-linear average
were non-linearly registered to the affine average template. Finally,
we created the MDT for the normal group by applying inverse
geometric centering of the displacement fields to the non-linear
constructed a separate MDT for the MCI and AD groups. These
and 9P linear registration, we also investigated the effects of re-
ducing the sample size on the statistical maps of group differences
(N=10, 20, 30 subjects per group) using either 9-parameter (9P) or
12-parameter (12P) affine registration. For comparisons in reduced
samples, the same MDT was still used (based on 40 subjects) to
make sure that the results were spatially registered with each other.
Three-dimensional Jacobian maps
To quantify 3D patterns of volumetric brain atrophy in MCI and
AD based on the method of “averaging individual differences”
(Fig. 1a), all individual brains (N=120) were non-linearly aligned
to the MDT for the normal group (Leow et al., 2005a). Subse-
quently, a separate Jacobian map was created for each subject to
characterize the local volume differences between that individual
and the normal group anatomical mean template. The determinant
(local expansion factor) of the local Jacobian matrix was derived
from the forward deformation field (see (Lepore et al., 2008), for a
more complex approach analyzing the full tensor). Color-coded
Jacobian determinants were used to illustrate regions of volume
expansion, i.e. those with det J(r)>1, or contraction, i.e., det J(r)<1
(Freeborough and Fox, 1998; Toga, 1999; Thompson et al., 2000a;
Chung et al., 2001; Ashburner and Friston, 2003; Riddle et al.,
2004) relative to the normal group template. Negative or zero-
valued Jacobians are not obtained using this method, as the
inverse-consistent implementation regularizes the inverse deforma-
tion mapping and causes the resulting Jacobian determinants to
cluster quite tightly around zero after log transformation, as well as
removing the skew and bias from their distribution (see Leow
et al., 2005a,b, 2007 for examination of the Jacobian distributions).
As all images were registered to the same template, these Jacobian
maps share a common anatomical coordinate system defined by the
normal template. Individual Jacobian maps within each group were
averaged across subjects and compared statistically at each voxel to
assess the magnitude and significance of deficits in MCI and AD
versus the healthy controls.
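The Jacobian determinant described above can be illustrated with a short numerical sketch. The example assumes the deformation is stored as a displacement field u(r) on a regular grid, so that the mapping is r → r + u(r), and it approximates the derivatives with finite differences. The function name `jacobian_determinant` and the array layout `(3, X, Y, Z)` are assumptions for this illustration; the registration algorithm used in the paper computes its deformation with a spectral method, which is not reproduced here.

```python
import numpy as np

def jacobian_determinant(disp, spacing=(1.0, 1.0, 1.0)):
    """Local expansion factor det J(r) of the mapping r -> r + u(r).

    disp    : array of shape (3, X, Y, Z), displacement components in mm
              (hypothetical layout chosen for this sketch).
    spacing : voxel size in mm along each axis.
    """
    J = np.empty(disp.shape[1:] + (3, 3))
    for i in range(3):
        # grads[j] approximates d u_i / d r_j by finite differences
        grads = np.gradient(disp[i], *spacing)
        for j in range(3):
            J[..., i, j] = grads[j] + (1.0 if i == j else 0.0)
    # det > 1 indicates local expansion, det < 1 local contraction
    return np.linalg.det(J)

# Example: a uniform 10% stretch along every axis.
x, y, z = np.meshgrid(np.arange(8.0), np.arange(8.0), np.arange(8.0), indexing="ij")
disp = 0.1 * np.stack([x, y, z])
det_map = jacobian_determinant(disp)
log_det_map = np.log(det_map)  # log transform; zero means no volume change
```

For this synthetic stretch the determinant is 1.1³ at every voxel. In a group analysis of the kind described above, one such map per subject would be stacked and compared voxel by voxel across groups.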
We also examined group differences by directly aligning group
average images (Fig. 1b), a concept first introduced by Rohlfing
(Rohlfing et al., 2005). In this approach, an unbiased geometrical
average template was created for each of the three groups, and each
disease group average template was directly aligned to the control
group average template to create single Jacobian maps from which
to quantify inter-group local volume differences.
Subject-template alignment methods follow a similar pattern
to MDT construction, and for elastic registration we used a FFT
resolution of 32×32×32; this corresponds to an effective size
of 6.875 mm (220 mm/32=6.875 mm) in each of the x-, y-, and
z- dimensions; for template-template registrations, we ran the
deformations at a higher FFT resolution of 64, since the MDTs share
anatomical features with very similar resolution and contrast. The
choice of FFT grid size depends on the expected spatial coherence
a more complex approach, by empirical estimation of the bivariate
Green’s function or 6D Lambda-tensor ((Fillard et al., 2005); this
approach will be tested once these covariance functions are
estimable from a very large database of images).
The first approach generated 120 Jacobian maps, which encode
individual differences with respect to the normal template. This
enabled us to carry out voxel-wise statistical tests between the
system. The Jacobian maps in MCI and AD were compared to those
from normal controls. At each voxel, we evaluated the significance
level of group differences using a two-sample t test with unequal
variance. The resulting p-values were displayed as maps to allow
visualization of the patterns of significant differences throughout the
In addition, we used permutation testing to assess the overall
significance of group differences, corrected for multiple compari-
sons [see, e.g., Bullmore et al., 1999; Nichols and Holmes 2002;
Thompson et al., 2003; Chiang et al., 2007a,b]. A null distribution
for the group differences in Jacobian at each voxel was constructed
using 10,000 random permutations of the data. For each test, the
subjects’ diagnosis was randomly permuted and voxel-wise t tests
were conducted to identify voxels more significant than p=0.05.
The volume of voxels in the brain more significant than p=0.05 was
Fig. 1. Two Alternative TBM Designs. Inter-group differences in brain structure may be assessed with TBM, using two alternative designs, which differ in terms
of which images are registered to each other. In the first approach, termed “Averaging Individual Differences” (a), a mean template is created for the control
subjects. Then, every image in the study is nonlinearly aligned to the control average template and the set of individual differences between each subject and the
template is analyzed statistically. In panel a, the mean template was built from controls only; arguably, it could instead be re-built each time based on all subjects
relevant to a specific hypothesis (e.g., Controls and MCI only, for an MCI-control comparison), but this would mean that the results of different contrasts would
not be spatially registered with each other. The second approach, termed “Aligning Group Averages”, reflects a relatively new concept in TBM design (Rohlfing
et al., 2005; Aljabar et al., 2006, 2008), (b), and creates minimal deformation targets (MDTs) for each diagnostic group separately using nonlinear registrations of
subjects within the group (A1, A2, …; N1, N2, …; etc.) to create a template reflecting the group's mean geometry. Systematic anatomical differences between
groups may then be assessed using direct alignment of these group-specific templates. In panel b, J_AD and J_MCI denote the Jacobians of these mappings, which
contain information on the level of atrophy in the MCI and AD groups versus controls. We compare these two methods at the end of the Results section; for all
statistical tests, the standard method of “averaging individual differences” is used.
computed for the real experiment and for the random assignments.
Finally, a ratio, describing the fraction of the time the suprathreshold
volume was greater in the randomized maps than the real effect (the
original labeling), was calculated to give an overall P-value for the
significance of the map (corrected for multiple comparisons by
permutation). The correction is for the number of tests, so it quan-
tifies the level of surprise in seeing the overall map. The number of
permutations N was chosen to be 10,000, to control the standard
error SEp of the omnibus probability p, which follows a binomial
distribution B(N, p) with known standard error (Edgington, 1995).
When N=10,000, the approximate margin of error (95% confidence
interval) for p is around 5% of p.
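The permutation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each map is flattened to a row of voxels, and, to keep the example dependent on NumPy only, voxels are thresholded on the absolute Welch t statistic at a fixed cutoff `t_crit` rather than on a p-value of 0.05. The sketch also counts permutations whose suprathreshold volume is greater than or equal to the observed one.

```python
import numpy as np

def welch_t(a, b):
    """Voxel-wise two-sample t statistic with unequal variances.

    a, b : arrays of shape (subjects, voxels), e.g. Jacobian values.
    """
    va = a.var(axis=0, ddof=1) / a.shape[0]
    vb = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)

def suprathreshold_count(a, b, t_crit):
    """Number of voxels whose |t| exceeds the cutoff (the 'volume')."""
    return int(np.sum(np.abs(welch_t(a, b)) > t_crit))

def permutation_p(a, b, t_crit=2.0, n_perm=1000, seed=0):
    """Fraction of random relabelings whose suprathreshold volume is at
    least as large as the volume under the true group labels."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([a, b], axis=0)
    n_a = a.shape[0]
    observed = suprathreshold_count(a, b, t_crit)
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(pooled.shape[0])  # shuffle diagnosis labels
        null = suprathreshold_count(pooled[order[:n_a]], pooled[order[n_a:]], t_crit)
        hits += null >= observed
    return hits / n_perm

# Synthetic example: 20 "controls" and 20 "patients" with a mean shift.
rng = np.random.default_rng(1)
controls = rng.normal(0.0, 1.0, size=(20, 200))
patients = rng.normal(-1.0, 1.0, size=(20, 200))
p_corrected = permutation_p(controls, patients, n_perm=200)
```

The default of 1000 permutations is chosen only to keep the example fast; the analysis described above used 10,000. The cutoff of 2.0 is an approximate stand-in for a two-sided p=0.05 threshold at these sample sizes.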
Cumulative distribution function (CDF) plots were used to com-
pare the power of detecting significant effects when using the TBM
design of averaging individual differences, with sample sizes vary-
ing from 10 to 40 per group, and the two different linear registration
schemes. These CDF plots are commonly generated when using
false discovery rate methods to assign overall significance values to
statistical maps (Benjamini and Hochberg, 1995; Genovese et al.,
different methods, subject to certain caveats (Lepore et al., 2007), as
they show the proportion of supra-threshold voxels in a statistical
map, for a range of thresholds. A cumulative plot of p-values in a
statistical map, after the p-values have been sorted into numerical
order, can compare the proportion of suprathreshold statistics with
null data, or between one method and another, to assess their power
and strict thresholds (in fact at any threshold in the range 0 to 1). In
p-values observed for the statistical comparison of patients versus
controls is plotted against the corresponding p-value that would be
expected, under the null hypothesis of no group difference. For null
distributions (comparing two independent normal groups), the cu-
mulative distribution of p-values is expected to fall approximately
along the diagonal line y=x, because a proportion y of voxels in a
null p-value map will, on average, fall below the threshold y; large
upswings of the CDF from that diagonal line are associated with
significant signal. Greater effect sizes are represented by larger
gives formulae for thresholds that control false positives at a known
Using the results of the above two-sample t-tests, Fig. 3 shows
the cumulative histograms (CDF plots) of the probability maps for
voxel-wise differences in mean Jacobian between the MCI and AD
groups and normal controls. Within each CDF plot, the curves show
increasing effect sizes, in rank order from bottom to top, for de-
tecting voxels with statistical differences between groups.
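A cumulative p-value plot of the kind described above can be computed with a few lines of code. The sketch below takes the p-values from a statistical map and returns, for each threshold, the proportion of voxels at or below it; under the null hypothesis this curve should lie close to the diagonal y=x. The function name `pvalue_cdf` and the default grid of thresholds are assumptions for this illustration.

```python
import numpy as np

def pvalue_cdf(p_values, thresholds=None):
    """Empirical cumulative distribution of the p-values in a map.

    Returns the thresholds and, for each one, the fraction of voxels
    with p <= threshold.
    """
    p = np.sort(np.ravel(p_values))
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)
    thresholds = np.asarray(thresholds, dtype=float)
    frac = np.searchsorted(p, thresholds, side="right") / p.size
    return thresholds, frac

# Null example: uniformly distributed p-values follow the diagonal.
rng = np.random.default_rng(0)
null_p = rng.uniform(0.0, 1.0, size=100_000)
t, null_cdf = pvalue_cdf(null_p)
max_deviation = np.max(np.abs(null_cdf - t))  # small for null data
```

Plotting `frac` against `thresholds` for several experiments on the same axes gives curves that can be ranked by how far they rise above the diagonal, which is how the plots above are read.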
Regions of interest (ROIs)
Regions of interest, including frontal, parietal, temporal, and
occipital lobes, were defined by manually labeling the normal group
masks for each lobe, which were subsequently used to summarize
brain atrophy at a regional level in each group. Within each lobe,
tissue types were distinguished by creating maps of gray and white
matter, CSF, and non-brain tissues using the partial volume classi-
fication (PVC) algorithm from the BrainSuite software package
(Shattuck and Leahy, 2002). CSF was excluded from the masks as
the trend for CSF differences is typically opposite to cerebral dif-
ferences in subjects with varying levels of brain atrophy, i.e., greater
CSF space expansions are typically associated with greater atrophy.
While these CSF signals are potentially of diagnostic interest
(Carmichael et al., 2006, 2007), they were excluded to avoid con-
founding the average values in regions where tissue atrophy was
The hippocampus was delineated on the control (N=40) average
template by investigators at the University College London (J.B.).
The ROI tracing was performed using MIDAS (Medical Image
Display and Analysis System) software (Freeborough et al., 1997).
This delineation included the hippocampus proper, dentate gyrus,
subiculum, and alveus (Fox et al., 1996; Scahill et al., 2003).
Correlations of structural brain differences (Jacobian Values) with
clinical measurements and genetic variants
At each voxel, correlations were assessed, using the general
linear model, between the Jacobian value and several clinical mea-
sures - the MMSE, and Clinical Dementia Rating summary scores
(Morris, 1993). The CDR assesses a patient's cognitive and func-
tional performance in six areas on a scale of 0 (no impairment) to 3
(impaired): memory, orientation, judgment & problem solving,
community affairs, home & hobbies, and personal care. As there
is a significant range restriction with global CDR scores, we also
assessed correlations with the CDR ‘sum-of-boxes’ scores, which
have a greater dynamic range (0–18) and arguably provide more
useful information than the CDR global score, especially in mild
cases (Lynch et al., 2006). In the Jacobian maps, CSF regions
typically show ‘expansion’ as AD progresses (for example, due to
lateral ventricle enlargement), so we performed separate evaluations
of the positive, negative and two-sided associations between the
Jacobian and diagnostic group. The results of voxel-wise correla-
tions were corrected for multiple comparisons by permutation
testing. Clinical scores were randomly assigned to each subject and
the number of voxels with significant correlations (p≤0.01) was
recorded. After 10,000 permutations, a ratio was calculated de-
scribing the fraction of the null simulations in which a statistical
effect (defined here in advance as the total supra-threshold volume)
had occurred with similar or greater magnitude than the real effects.
The primary threshold of 0.01 has been used in our past studies and
is based on setting a moderately strong threshold at the voxel level
(alternatively, FDR could be used); the total supra-threshold volume
is often used to assess the magnitude of an anatomically distributed
spatially localize the signal than tests based on cluster extent or peak
height (Frackowiak et al., 2003). This ratio served as an estimate of
the overall significance of the correlations, corrected for multiple
comparisons, as performed in many prior studies (Nichols and
3D maps of brain atrophy in MCI and AD
We first examined the level of brain atrophy using the method
of averaging individual differences. The resulting statistical maps
(Fig. 2) detected the known characteristic patterns of atrophy in
AD, revealing profound tissue loss in the temporal lobes bilaterally,
the hippocampus, thalamus, widening of the bodies of the lateral
ventricles and expansion of the circular sulcus of the insula.
Permutation tests were conducted to assess the overall signi-
ficance of the maps, corrected for multiple comparisons. The per-
the MCI (two tailed: P=0.04; negative one tail: P=0.02, ROI: left
temporal lobe) and AD (two tailed: P=0.002, ROI: whole brain)
when compared to the normal group respectively, corrected for
Power to detect brain atrophy in MCI and AD
The cumulative distribution function (CDF) curves (Fig. 3)
using the method of averaging individual differences. Eight different
experiments are shown, comparing various sample sizes (N=10, 20,
30, or 40 per group) and different linear registration schemes (9P vs.
the second group of normal individuals to the initial control template.
Fig. 2. 3D Maps of Brain Atrophy (based on the method of averaging individual differences). The top rows of panels a and b show the level of atrophy in 40 AD
patients and 40 MCI subjects as a percentage reduction in volume relative to controls, respectively. The bottom rows show the significance of these reductions,
revealing highly significant atrophy in AD but a more anatomically restricted pattern of atrophy in MCI. The high Jacobian values immediately adjacent to the
ventricle result from the limited spatial resolution of the deformation fields, which are computed via a Fourier transform on a 32³ grid. The ventricles expand
presumably due to cell or myelin loss in broad areas overlying the ventricular surface; not in the narrow band of tissue immediately adjacent to the ventricular
surface, in which voxels are partial volumed with voxels where expansion is detected. This is also the case with PET and fMRI imaging, where imaging signals
‘bleed’ into the CSF space where the signal differs (and typically there is no CSF signal). As with those modalities, sharp boundaries are not found in group
averaged deformation maps, so the image resolution should be interpreted with this in mind. Higher resolution morphometry is possible in longitudinal studies,
where it is feasible (and makes sense) to perform image registration at a finer spatial scale.
have a CDF that is a diagonal line (see Fig. 3), we also confirmed that
this is indeed the case using empirical data from two groups of
controls. The black line in Fig. 3 falls almost exactly on the diagonal,
confirming that this TBM design controls for false positives at the
appropriate rate for all thresholds (it is not exactly diagonal). If CDFs
from many independent samples were averaged, the population mean
CDF should tend towards a diagonal line. As expected, regardless of
the method used, there are more significant voxels (at any given
threshold such as p<0.01) detected in the AD versus Control
comparison, relative to the MCI versus Control comparison. More
linear registration is used (solid lines) are mostly situated above the
registration scheme may have superior power for detecting atrophy in
MCI and AD and differentiating these groups from normal subjects.
As might be expected, sample size greatly influences the power to
detect brain atrophy in MCI and AD, with effect sizes increasing
monotonically with sample size.
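As a minimal sketch of how such CDF curves can be read, the following Python fragment computes the empirical CDF of a set of voxel-wise p-values and finds the largest threshold at which the curve lies on or above the line y = x/q (the Benjamini-Hochberg construction). The function names and the synthetic p-values are illustrative assumptions only; this study reports q-values from the positive FDR method, which this fragment does not reproduce.

```python
import random

def empirical_cdf(pvals, t):
    """Fraction of p-values less than or equal to threshold t."""
    return sum(p <= t for p in pvals) / len(pvals)

def bh_threshold(pvals, q=0.05):
    """Largest p-value threshold t with CDF(t) >= t / q (Benjamini-Hochberg).

    Returns 0.0 if no p-value satisfies the criterion.
    """
    m = len(pvals)
    best = 0.0
    for k, p in enumerate(sorted(pvals), start=1):
        # k / m is the empirical CDF evaluated at the k-th smallest p-value
        if k / m >= p / q:
            best = p
    return best

rng = random.Random(0)
# Hypothetical null map: p-values uniform on [0, 1], so the CDF is near the diagonal.
null_p = [rng.random() for _ in range(20000)]
# Hypothetical map with signal: about 30% of voxels have p-values pushed towards 0.
signal_p = [rng.random() ** 4 if rng.random() < 0.3 else rng.random()
            for _ in range(20000)]

for t in (0.01, 0.05, 0.10):
    print(t, round(empirical_cdf(null_p, t), 3), round(empirical_cdf(signal_p, t), 3))
print("BH threshold, signal map:", bh_threshold(signal_p))
```

A curve that stays close to the diagonal, as for the synthetic null map, corresponds to a false positive rate close to the nominal rate at every threshold; a curve that rises steeply near the origin indicates an excess of small p-values.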
Correlations with clinical measurements
Any quantitative measure of brain atrophy has greater value if
it can be shown to correlate with established measures of cognitive
or clinical decline, or with future outcome measures, such as immi-
nent conversion to AD. We found strong correlations between the
Jacobian values derived from the standard method (N=120) and the
clinical measures (MMSE, CDR summary and sum-of-boxes
scores; Table 1). This table reports corrected p-values for the
correlations with voxel-level TBM values, rather than with a
global summary value from TBM. To avoid reductions in power
due to restricting the range to the AD or MCI groups separately,
these correlations are reported for the entire sample of 120. As
such, the normal subjects, who tend to score in the normal range on
all the clinical measures, drive these associations to some extent.
Within the whole brain, the P values represent the overall
significance level of correlations between the Jacobian maps and
the clinical measures (corrected for multiple comparisons). The
significance level is based on the number of suprathreshold voxels
in the ROI, rather than their average or maximum. This method
is sometimes known as set-level inference, which generally has
greatest power (relative to other tests such as peak height or
cluster size) for detecting a spatially distributed effect. Since there
are two types of signals in the Jacobian maps: regional expansion
(e.g., in the ventricles) and regional atrophy (e.g., in gray and
white matter), positive and negative correlations are tested sepa-
rately. Two-tailed tests detect any consistent structural differ-
ences without an emphasis on the sign of the changes (gain or
Fig. 3. Comparing Effect Sizes with CDF Plots: Influences of Sample Size and Registration Model (9 vs. 12 parameter linear registration) on the Statistical Effect
Size in Discriminating MCI and AD from Normal Subjects, using the TBM design of averaging individual differences. With some caveats (see Discussion),
higher CDF curves denote greater effect sizes, i.e., a large number of significant voxels detected within the temporal lobes. Three aspects are notable: (1) Atrophy
is detected with a greater effect size in the AD group compared to the MCI group, regardless of the method; (2) in AD, effect sizes are greater when larger samples
are used (e.g., 40 per group versus 30, 20 or 10); (3) Using 9-parameter versus 12-parameter initial registration may provide slight gains in power, but these are no
greater than the improvements gained by adding 10 subjects to the sample. The omnibus (corrected) significance for each experiment is the q-value obtained
from the positive FDR method. 40 subjects per group are needed to detect atrophy significantly in AD and at trend level in MCI, using the entire temporal lobe as
an ROI. Atrophy is detected in MCI with 40 subjects per group, but only in the left temporal lobe. Perhaps surprisingly, the jump in effect size as one goes from a
sample size of 30 to 40 is much greater for normal vs AD than normal vs MCI. This may be because the true effect size in AD is greater, but it may also be because
CDFs rely on thresholding of the statistical maps, leading to many voxels reaching significance within a small range of sample sizes.
Average brain changes within each region of interest (ROI)
detailed 3D maps to simpler numeric summaries that may be more
convenient to use as outcome measures in a clinical trial, especially
when a small number of outcome measures must be agreed in
advance. To summarize group differences or other statistical effects
detected by TBM in a lobe, hemisphere, or in a region of interest
computed from an independent experiment, several different nu-
meric summaries are possible, such as the number of suprathreshold
voxels in an ROI, the maximum statistic within an ROI, or some
weighted average of the Jacobian values within the ROI. For sim-
plicity, we summarize the Jacobian values by averaging them within
several ROIs traced on the control MDT. While not necessarily the
optimal summary in terms of power, these results may at least be
compared with the results of automated volumetric parcellation
methods. This equivalence occurs because the average Jacobian in a
were labeled automatically by transferring atlas labels onto the
individual using the deformation field.
We computed the spatial mean of the Jacobian within each ROI
for every subject from the individual Jacobian maps (N=120;
matter, there is a consistent trend for tissue reduction: AD < MCI <
Normal. The result from a T test (two-tailed, unequal variance)
detects significant atrophy only in AD and only in the temporal lobe
of the last areas to be affected by AD (Delacourte et al., 1999;
Thompson et al., 2003), shows no tissue loss.
As a post hoc test, we investigated whether the use of a hip-
pocampal region of interest would detect group differences better
than using the whole temporal lobe (Fig. 5). This type of test is
exploratory only, with the goal of finding the best region for
averaging the Jacobian values, if a single numerical score is derived
from TBM. The hippocampal ROI was delineated on the Control
N=40 average template by investigators at the University College
hippocampal ROI (p=0.04). There is a visually apparent trend for a
left versus right asymmetry in the degree of atrophy, but it is not
significant in either MCI or AD samples.
The outcome of these analyses suggests that using a hippo-
campal or temporal lobe ROI to summarize the effects in TBM
maps may be inferior to using pFDR to quantify suprathreshold
statistics within the same ROI. This is because the effects within
each ROI are spatially heterogeneous, and numerical averages
across spatial regions necessarily deplete the power of local tests
by averaging all voxels equally. By contrast, pFDR can measure
the quantity of non-null statistical events in an ROI, which may
detect effects that are focused on a relatively small region of an
ROI, or only partially overlapping with it.
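The dilution argument can be illustrated with a small simulation. The sketch below is a toy under stated assumptions: synthetic Jacobian values, a hypothetical ROI of 200 voxels in which only 20 voxels carry a 15% volume reduction, between-subject variability that is largely shared across the ROI (if the noise were independent at every voxel, averaging would instead help), and Welch's t statistic written out by hand. It is not the analysis pipeline of this study.

```python
import math
import random
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances) for mean(a) - mean(b)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def simulate_group(rng, n_subjects, n_voxels, n_affected, deficit,
                   sd_subject=0.10, sd_voxel=0.05):
    """Synthetic Jacobians; the first n_affected voxels are reduced by `deficit`."""
    group = []
    for _ in range(n_subjects):
        offset = rng.gauss(0.0, sd_subject)  # subject effect shared across the ROI
        group.append([1.0 + offset - (deficit if v < n_affected else 0.0)
                      + rng.gauss(0.0, sd_voxel) for v in range(n_voxels)])
    return group

rng = random.Random(1)
controls = simulate_group(rng, 40, 200, 20, deficit=0.0)
patients = simulate_group(rng, 40, 200, 20, deficit=0.15)

# Summary 1: one number per subject (spatial mean of the Jacobian over the ROI).
t_roi = welch_t([mean(s) for s in patients], [mean(s) for s in controls])

# Summary 2: one statistic per voxel, then count voxels beyond a fixed cutoff.
t_vox = [welch_t([s[v] for s in patients], [s[v] for s in controls])
         for v in range(200)]
n_supra = sum(t < -2.64 for t in t_vox)  # roughly p < 0.005, one-tailed

print("ROI-mean t:", round(t_roi, 2))
print("most negative voxel t:", round(min(t_vox), 2))
print("voxels with t < -2.64:", n_supra, "of 200")
```

In this toy, the focal deficit is spread over ten times as many voxels when the ROI mean is taken, so the group difference in the ROI mean is small relative to the shared between-subject variability, whereas the voxel-wise statistics in the affected subregion remain large.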
Correlations with clinical measures of dementia severity
MMSE Score    Global CDR    Sum-of-Boxes
Negative correlations (−1)
Two-tailed correlations (0)
Positive correlations (+1)
P values (corrected for multiple comparisons by permutation testing) for
correlations between Jacobian maps and clinical measures. If negative cor-
relations are significant, there are regions of the image (e.g., the ventricles)
where greater expansion is correlated with lower MMSE scores. If positive
correlations are significant, there are regions of the image (e.g., in gray and
white matter) where volume reductions are correlated with lower MMSE
scores. If two-tailed correlations are significant, there is evidence that
structural differences in the group, whether they are contractions or ex-
pansions, are linked with cognition. For MMSE, the positive or two-tailed
correlations – which are sensitive to atrophy – are more robust. Higher
global CDR and sum-of-boxes scores denote greater impairment; in that
case, the negative or two-tailed correlations are more reliable. Put simply, the
atrophy (volume contraction) detected by TBM links better with cognition
than volume expansions do (in the CSF spaces), although each is sig-
that the CSF expansion signal has less signal to noise than the atrophic signal
as we are using statistical tests that depend on the total volume of regions
from FDR). It may be that, if the statistical tests had been formulated
differently, e.g., as strict voxel-level comparisons (e.g. maximal t-statistics),
they would detect CSF differences with greater effect sizes than atrophic
Fig. 4. Mean Jacobian of Normal, MCI and AD in each lobar ROI (N=40 for each group, 9P linear registration). This plot shows TBM-based volume estimates
(relative to the normal subjects) for each tissue type in each lobe. To ease comparison of volumes across lobes, values are expressed as a proportion of the control
average volume. The ⁎ indicates that there is significant atrophy relative to normal subjects.
Comparison with the method of directly aligning mean anatomical
Fig. 6 shows the mean level of volumetric atrophy, in AD and
MCI, relative to controls, as a percentage, using the method of
et al., 2005; Aljabar et al., 2006, 2008). This method suggests that
there is widespread atrophy in AD, in agreement with both the
standard method and with visual inspection of the MDT templates.
Atrophy of 20–30% was detected throughout the temporal lobe in
AD, with moderate atrophy (10–20%) in the superior and middle
frontal gyri, superior frontal sulcus, and corona radiata. The MCI
pattern suggests atrophy of around 5% throughout the white matter,
with deficits reaching 10–15% in the temporal lobes and
hippocampus. A mean Jacobian was calculated within each ROI
to show the computed overall volume differences for each lobe
the greatest volumetric deficit loss in the white matter, a reduction of
6.62%, and in temporal lobe gray matter, a volume deficit of 5.79%.
In line with the literature, frontal and parietal gray matter show
smaller proportional deficits, and tissue loss is not detected in the
However, the direct alignment method has a serious limitation.
When computing a group difference based on aligning group aver-
age images, there is no convenient way to conduct voxel-wise
statistical tests to establish the significance of the observed differ-
ences (as noted in (Rohlfing et al., 2005)) since only one Jacobian
map is derived to identify differences between the two group tem-
plates. In principle, a null distribution for the group-to-group defor-
mation may be computed by permuting the assignment of subjects
to groups, constructing mean anatomical templates for each per-
mutation, and assessing the statistical distribution of deformation
maps that would arise between these templates. As thousands of
independent MDTs would be required to assemble this reference
distribution, and each would require two rounds of nonlinear re-
gistration in groups of 40 subjects, this is computationally pro-
hibitive (requiring around 80,000 CPU hours). If an omnibus
probability (i.e., corrected for multiple comparisons) is determined
by comparing the number of suprathreshold voxels in the true
labeling to the permutation distribution, the number of permutations
N must be chosen to control the standard error $SE_p$ of omnibus
probability p, which follows a binomial distribution B(N, p) with
$SE_p = \sqrt{p(1 - p)/N}$ (Edgington, 1995). To adequately control the
standard error of the resulting p-values derived from the permuta-
tion distribution, N=8,000 randomizations are required to ensure
that approximate margin of error (95% confidence interval) for p is
around 5% of p, when 0.05 is chosen as the significance level.
As an approximation, we conducted voxel-wise two-sample
t tests using the variance term obtained from the first approach as
an estimate of the group variance between MCI/AD and control,
subject to checking (below) that this did not inflate Type I error
when truly null groups were compared. Using the estimated
Fig. 5. Mean Jacobian Values within the Hippocampus, for Normal, MCI
and AD Groups, using Manually-Defined Traces of the Hippocampal
Formation Delineated on the Control Average Template (N=40 for each
group, 9P linear registration). This plot shows TBM-based volume estimates
(relative to the normal subjects) for each tissue type in ROIs of the left and
right hippocampi. The ⁎ indicates that there is significant atrophy relative to
Fig. 6. 3D maps of brain atrophy (based on aligning group averages). These
maps show the level of volumetric atrophy in AD and MCI relative to
controls. In the top left panel, AD patients show prominent atrophy of up to
30% regionally in the temporal lobes, widespread reductions in the white
matter, and notable expansion of the interhemispheric fissure (coded in red
colors). MCI subjects show atrophy in the same regions but to a lesser
degree. These maps are computed after spatial normalization of all brains to
the same global scale, so regions with apparent excess tissue indicate regions
with either absolute volumetric gains, or relatively greater tissue in pro-
portion to overall brain scale. The bottom row shows the significance of
these changes, computed using the variance of the individual deformation
mappings, and color-coded according to the scale at the bottom.
Mean Jacobian within each ROI (based on aligning group averages)
When aligning group averages, the tissue reductions, as a percentage relative
to controls, are shown here for the AD and MCI groups.
variance from the individual Jacobian maps, this alternative TBM
design appeared to detect substantial atrophy in regions
degenerating both early and late in AD. However, when applied
to compare two different groups of normal subjects, the direct
method did not control for false positives at the conventional rate,
showing widespread “differences” even after multiple compar-
isons correction (Fig. 7; FDR q-value=0.0001). This problem
occurs because the variance in the template-to-template registration
is not simply related to the variance in the individual-to-template
case; it depends on the geometry of the registration algorithm's cost
function landscape with respect to the transformation parameters.
One might expect the averaging of individual differences to be a
slightly conservative approach as the variance in individual-to-
template registrations is typically much higher than the variance in
template-to-template registrations, as the cost function landscape is
much smoother with respect to the alignment parameters when
aligning two template images of very similar contrast and geometry.
The registration error in individual registrations may be greater than
that observed in template-to-template registrations, and this source
of variance works against finding systematic group differences in
volume, and may therefore underestimate the true reduction in
volume in AD and MCI. This seems to be supported by the finding
that the estimated volume differences for each tissue type in each
lobe are around 50% greater for the TBM method based on directly
aligning group averages, than for the TBM method based on
recent study (Chou et al., in press), in which anatomical labeling of
the ventricles based on a single registration was more error-prone
than combining multiple images to derive a segmentation, which led
to better effect sizes in discriminating AD from controls (see
(Twining et al., 2005), for related work). Even so, the lack of a
computable null distribution for the direct method means that
differences it detects cannot be regarded as statistically established.
Using the variance of the individual mappings is not appropriate, as
it leads to false positives.
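The number of permutations needed for such a null distribution, discussed earlier in this section, rests on the binomial standard error of an estimated p-value, $SE_p = \sqrt{p(1 - p)/N}$ (Edgington, 1995). The short calculation below is offered only as a sketch, with hypothetical function names, to show how the precision of a permutation p-value scales with N at p = 0.05.

```python
import math

def permutation_se(p, n_perm):
    """Binomial standard error of a p-value estimated from n_perm permutations."""
    return math.sqrt(p * (1.0 - p) / n_perm)

def permutations_needed(p, relative_se):
    """Smallest N for which SE_p / p does not exceed relative_se."""
    return math.ceil((1.0 - p) / (p * relative_se ** 2))

p = 0.05
for n_perm in (1000, 8000, 30000):
    se = permutation_se(p, n_perm)
    print(n_perm, "SE/p =", round(se / p, 3), " 1.96*SE/p =", round(1.96 * se / p, 3))
print("N for SE/p <= 5%:", permutations_needed(p, 0.05))
```

Under this formula, N = 8,000 gives a standard error of about 4.9% of p at p = 0.05, and a half-width of 1.96 standard errors corresponds to about 9.6% of p. Because each permutation of the direct method requires rebuilding two group templates, the cost grows linearly with N.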
A second argument may also be made that the direct method is
inherently more prone to registration error than the averaging
of results from many registrations. Regardless of the algorithm
used, both linear and nonlinear registration are imperfect and
registration errors are not simply Gaussian at each voxel. When
each subject is registered individually to a template, these errors are
not likely to be compounded, as each subject has slightly different
error maps that are likely to cancel out to some degree. However,
when the non-linear averages are directly registered to each other,
the registration errors will be compounded (as the same registration
error is found in all subjects of the group after they have been
aligned to the group template). This is likely to induce “spatial
shifts” that may appear as (false) group differences.
Finally, some comment is necessary regarding the discounting
of global anatomical differences in TBM. The maps reported here
assessed residual anatomical differences after an initial 9-parameter
global scaling of all AD, MCI, and control subjects to match an
anatomical template. This scaling was performed in the automated
registration step, and, in our cohort, the degrees of scaling (mean
global expansion factors) for groups of controls, MCI and AD
patients were 1.35 (SD=0.14), 1.35 (0.14) and 1.32 (0.15) respec-
tively, and there was no significant difference among the three
groups (single factor ANOVA p-value=0.62). As such, we did not
adjust for group differences in overall brain scaling in our analyses,
as no such differences were detected.
Fig. 7. CDF Plots of Group Differences based on Aligning Group Average
Images (N=40; 9P linear registration). Voxel-wise two-sample t tests were
conducted using the Jacobians derived from direct alignment of group
average templates and the variance term obtained from the method of
averaging individual differences. The green curve (MCI vs. Normal) over-
laps with the black curve which represents the empirically confirmed null
distribution of statistics (registering one normal group template to the other).
These lines are far from diagonal, and the method of aligning group
averages, when used with a variance term from individual registrations, does
not control false positives properly (FDR q-value for controls: 0.0001).
Fig. 8. Single-subject analysis. Here regional brain volumes in a single
subject with AD are locally 20–30% lower than the control group average
(top panel, blue colors) and 20–30% larger in some of the CSF spaces (red
colors). Using the variance in the control group to assess the percentile, for
regional volumes, at which this subject would fall relative to normal con-
trols, most regions are well outside the confidence limits for normal volumes
(lower panel, red colors). For caveats regarding the significance and inter-
pretation of single-subject TBM maps, see the main text.
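The percentile computation described in the Fig. 8 caption can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the study: Jacobian values in the control group are treated as Gaussian at each voxel, the control values are invented for the example, and no correction is made for the number of voxels tested.

```python
from statistics import NormalDist, mean, stdev

def subject_percentile(subject_value, control_values):
    """z-score and percentile of one subject's regional volume relative to controls."""
    mu, sigma = mean(control_values), stdev(control_values)
    z = (subject_value - mu) / sigma
    return z, 100.0 * NormalDist().cdf(z)

# Hypothetical Jacobian values at one voxel (1.0 = same volume as the template).
controls = [1.02, 0.97, 1.05, 0.99, 1.01, 0.95, 1.03, 0.98, 1.00, 1.04]

# A hypothetical subject whose local volume is 25% below the template.
z, pct = subject_percentile(0.75, controls)
print("z =", round(z, 2), " percentile =", pct)
```

Because a single subject's registration error is not averaged out as it is in a group map, an extreme percentile at one voxel is weaker evidence than the same value in a group comparison.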
Some comment is warranted regarding the possible value of
TBM to assess atrophy in individual subjects, which is closer to
the problem faced in a clinical setting when evaluating disease
burden. While we do not attempt a comprehensive analysis of this
question here, Fig. 8 shows a map comparing brain structure in a
single subject against a group mean. Relative to the mean template
from the control subjects, this individual has 30% lower regional
volumes throughout much of the white matter (blue colors), clear
CSF space expansion in the Sylvian fissures (red colors) and in
with the standard deviation of the normal group, the significance
map shows widespread regions with abnormally low tissue volumes
(in the white matter) or abnormal expansions (in the perisylvian
CSF). These effects are not focused in the cortex, suggesting that
elastic registration has higher power to resolve white matter atrophy,
perhaps because (1) registration is typically more accurate in the
deep white matter than in the cortical gray matter, and (2) normal
abnormalities are easier to detect.
This study had four main findings. First, a TBM method based
on directly aligning group averaged images was found to be prob-
lematic, as it did not correctly control for false positives. This
problem was solved by aligning each subject to a single template,
and analyzing individual maps. Second, we showed a CDF-based
method that can help to decide which methodological choices affect
power in TBM; linear (9 parameter) initial registration and larger
samples were found to give higher effect sizes, and the dependency
on sample size was explored. Third, analysis of voxels in large
regions such as the temporal lobe was more powerful than using
small regions such as the hippocampus, confirming that TBM is
better for resolving distributed atrophy rather than very small-scale
changes, at least when used in a cross-sectional design. Fourth,
clinical measures of deterioration in brain function (MMSE, CDR
scores) were tightly linked with both atrophy and ventricular
expansion, but the atrophy measures gave higher effect sizes. The
best TBM-based marker of neurodegeneration was temporal lobe
atrophy, as this distinguished AD from controls better than other
In our comparison of two types of TBM design, we first used the
traditional method, which creates individual Jacobian maps for each
subject by non-linearly aligning their MRIs to the normal MDT
template. All the Jacobian maps share a common coordinate system
defined by the normal MDT, so an average map of the group
(normal, MCI or AD) was created by taking the arithmetic mean at
each voxel (other possible approaches include using the geometric
mean, matrix logarithm mean, Fréchet mean, or geodesic metrics on
the deformation velocity (Woods, 2003; Avants and Gee, 2004;
Leow et al., 2006; Aljabar et al., 2008; Lepore et al., 2008)).
Statistical parametric maps may then be computed to associate
regional atrophy with predictors measured in each individual (diag-
nosis, clinical scores, etc.). By contrast, the direct method uses
geometric centering to construct an average template that conforms
to the group mean geometry, and then a single non-rigid transfor-
mation quantifies group differences. The two methods both detect
tissue loss in temporal lobes, hippocampus, the thalamus and wide-
spread widening of sulcal and ventricular CSF spaces, congruent
with prior studies (Baron et al., 2001; Callen et al., 2001; Frisoni
et al., 2002; Busatto et al., 2003; Gee et al., 2003; Thompson et al.,
2003; Karas et al., 2004; Testa et al., 2004; Teipel et al., 2007;
Whitwell et al., 2007).
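The difference between two of the averaging choices mentioned above (the arithmetic mean used here and the geometric mean) can be seen in a small example. The sketch below uses invented Jacobian values and hypothetical function names; it shows only the voxel-wise averaging step, not registration or template construction.

```python
import math

def arithmetic_mean_map(jacobian_maps):
    """Voxel-wise arithmetic mean over subjects."""
    n = len(jacobian_maps)
    return [sum(vals) / n for vals in zip(*jacobian_maps)]

def geometric_mean_map(jacobian_maps):
    """Voxel-wise geometric mean, i.e. the exponential of the mean log-Jacobian."""
    n = len(jacobian_maps)
    return [math.exp(sum(math.log(v) for v in vals) / n)
            for vals in zip(*jacobian_maps)]

# Two hypothetical subjects, three voxels each (Jacobian 1.0 = no volume difference).
maps = [[2.0, 1.0, 0.8],
        [0.5, 1.0, 0.8]]
print(arithmetic_mean_map(maps))
print(geometric_mean_map(maps))
```

At the first voxel, a doubling in one subject and a halving in the other average to a 25% expansion under the arithmetic mean but to no net change under the geometric mean, which treats expansion and contraction symmetrically on a log scale.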
The direct method has several limitations. First, it is difficult to
covary for other variables measured at the individual level, such as
age or sex, although this could be circumvented to some degree by
matching samples for these variables. Second, it is computationally
prohibitive to compute an empirical null distribution for deforma-
tions between group average templates, unless tens of thousands of
templates are generated from permuted datasets. Null distributions
for Jacobian maps based on individual registrations are faster to
compute, but do not adequately control for false positives when null
no true difference). Further study is necessary to clarify how regis-
tration errors compare when registering individuals and templates to
other templates. In a recent study, Aljabar et al. (2008) computed
two years of age, based on creating a mean template for baseline
scans and directly aligning it to a mean template from follow-up
the mapped changes, the overall growth factors for gray and white
matter, computed from this direct registration, agreed with measures
from independent segmentations, and the results were visually rea-
sonable and in line with the neurodevelopmental literature. This
suggests that the change rates observed with the direct method may
be accurate, at least in a longitudinal study, but their significance is
it may be more robust than in a cross-sectional study, as the cohorts
at each time point are by definition matched on all demographic
variables other than time. In a cross-sectional study, any confounds
in demographic matching of the groups may enter the maps of group
differences, without a statistical means to adjust for them or estimate
Any TBM study is limited by the accuracy with which deform-
able registration can match anatomical boundaries between indivi-
dual brains and corresponding regions on the template. Our mean
deformation template (MDT) was created after rigorous nonlinear
registration, and geometric centering. Several studies have
suggested that registration bias can be reduced, and effect sizes
increased, by using an unbiased group-average template of this kind
(Kovacevic et al., 2005; Kochunov et al., 2002; Good et al., 2001;
Lepore et al., 2007). Most anatomical features and boundaries are
well-preserved in the MDT, and the hippocampus is sufficiently
possible to achieve accurate regional measurements of atrophy,
especially in small regions such as the hippocampus, since that
would assume a locally highly accurate registration. TBM is best for
assessing differences at a scale greater than 3–4 mm (the
resolution of the FFT used to compute the deformation field). For
smaller-scale effects, direct modeling of the structure, e.g. using
surface-based geometrical methods, may offer additional statistical
power to detect subregional differences (e.g., Morra et al., submitted
As the ADNI initiative is a study of 200 AD, 400 MCI, and 200
controls, this study focused not just on AD but also on MCI. The
focus in the AD field has shifted to MCI in recent years, in the hope
of tracking disease progression and ultimately resisting it, before
individuals progress to AD. It is useful to know what factors affect
detection power or link with cognition in MCI versus AD, as factors
that can enhance power in MCI may not be so relevant in a study of
AD, and regions in which atrophy correlates with cognition in MCI
study, we therefore included power estimates and measures of effect
sizes for TBM studies of both MCI and AD, revealing that sample
requirements differ greatly for different effects of interest.
In this study, we did not (beyond multiple pair-wise comparisons)
attempt to gain any insight into the shift in morphological changes
from normal controls to MCI to AD. A strength of a TBM analysis
would be to map all subjects to a common template, and then track
the distribution of atrophy as it spreads anatomically over time (e.g.
Thompson et al., 2001) or with clinical progression (Janke et al.,
2001). As ADNI is a longitudinal study, we plan to fit longitudinal
models to detect the shift in the location of greatest atrophy as the
will require repeated-measures methods, which have not yet been
validated for TBM, and specialized methods for creating longitudinal
mean templates, which are emerging in the literature (see Lorenzen
et al., 2004, 2006).
The ROI-based analyses (Figs. 4 and 5) revealed patterns of
atrophy in MCI and AD, but with relatively low significance levels.
In future, we will see if statistical power can be improved by
atrophy, as the effects of CSF expansion partially oppose the con-
one), such as taking the average Jacobian in the contracting regions,
or counting the numbers of contracting voxels. Such an approach
could be biased, in that a group with greater variance in the Jacobian
could have more contracting voxels while having the same mean
level of atrophy. Also, an analysis of contracting voxels could be
which could occur, at least in principle.
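The potential bias in counting contracting voxels can be made concrete under a simple Gaussian assumption. In the sketch below, the cutoff of 0.95, the standard deviations, and the function name are all illustrative choices rather than values from this study.

```python
from statistics import NormalDist

def fraction_contracting(mean_jacobian, sd_jacobian, cutoff=0.95):
    """Expected fraction of voxels with Jacobian below `cutoff`, assuming normality."""
    return NormalDist(mean_jacobian, sd_jacobian).cdf(cutoff)

# Two hypothetical groups with the same mean Jacobian but different spread.
low_var = fraction_contracting(1.0, 0.05)
high_var = fraction_contracting(1.0, 0.10)
print(round(low_var, 3), round(high_var, 3))
```

With identical means, the group with the larger spread has roughly twice as many voxels below the cutoff, so a count of contracting voxels would differ between the groups even though their mean level of atrophy is the same.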
Use of CDF plots
In neuroscientific studies using TBM, it is vital to optimize
statistical power for detecting anatomical differences, especially
when evaluating the power of treatment to counteract degeneration,
as in a drug trial, or in an epidemiological study to identify
neuroprotective factors (Lopez et al., 2007). Comparison of power
across image analysis methods is of great interest, but some caveats
are necessary regarding the use of CDF-based approaches, in
which the ordered p-values are plotted and compared to the
expected 45-degree line under the null hypothesis of “no effect”. In
highly sensitive methods, the departure of the early part of the
curve from a 45-degree line will be large (showing a positive
upswing). This assumption is supported by our plots (Fig. 3), in
which successively larger sample sizes boost the effect size in
statistical maps identifying group differences, for both MCI and
AD. As shown in the CDF plots (Fig. 3), for all significance
thresholds (values on the x axis), the proportion of significant
voxels, detecting group differences, increases dramatically as the
group size is enlarged from N=10 to 40. In prior work (Lepore et
al., 2008), we used this same CDF approach to note that the
deviation of the statistics from the null distribution generally
increases with the number of parameters included in the statistics,
with multivariate TBM statistics on the full tensor typically
outperforming scalar summaries of the deformation based on the
eigenvalues, trace, or the Jacobian determinant. With this approach,
we also found that effect sizes in TBM may be boosted, at least in
some contexts, by using mean anatomical templates based on Lie
group averaging (Lepore et al., 2007) or by using deformation
models based on information-theoretic Kullback-Leibler distances
(Leow et al., 2007), or using Riemannian fluid models, which
regularize the deformation in a log-Euclidean manifold (Brun et al.,
Even so, we do not have ground truth regarding the extent and
degree of atrophy or neurodegeneration in AD or MCI. So, al-
though an approach that finds greater disease effect sizes is likely
to be more accurate than one that fails to detect disease, it would be
better to compare these models in a predictive design where ground
truth regarding the dependent measure is known (i.e., morphometry
predicting cognitive scores or future atrophic change; see e.g.,
(Grundman et al., 2002)). We are collecting this data at present,
and any increase in power for a predictive model may allow a
stronger statement regarding the relative power of different models
in TBM, or the relative power of one image analysis method versus
another for tracking brain disease.
method than another in one experiment, it may not be true of all
experiments. Without confirmation on multiple samples, it may not
reflect a reproducible difference between methods. FDR and its
variants (Storey, 2002; Langers et al., 2007) declare that a CDF
shows evidence of a signal if it rises greater than 20 times more
sharply than a null distribution, so a related criterion could be
developed to compare two empirical mean CDFs after multiple
experiments. As simple numeric summaries sacrifice much of the
power of maps, and provide a rather limited view of the differences
based comparisons of methods seems warranted.
Correlations with clinical measures
The corrected P values signify the overall significance levels
of the correlations between atrophy and clinical scores within the
whole brain. For MMSE, both the positive and two-tailed tests are
significant, suggesting a correlation between the regions of volume
reduction and lower MMSE scores. For global CDR and sum-of-
boxes, we obtain robust results in both negative and two-tailed
correlations. As higher CDR scores denote greater impairment, the
negative correlation links lower brain volume with greater CDR
scores. Based on Table 1, atrophy of brain tissue (gray and white
matter) detected by TBM links better with cognition than volume
expansion (e.g., of the ventricles), although each is significantly
associated with both MMSE and CDR. Strictly speaking, the CSF
expansion signal may offer a lower signal-to-noise ratio than the atrophic
signal, as we are using statistical tests that depend on the total
volume of regions that reach a certain threshold (supra-threshold
volume and corrected q-values from FDR). It may be that, if the
statistical tests had been formulated differently, e.g., as strict voxel-
level comparisons (e.g., maximal t-statistics), they would detect
CSF differences with greater effect sizes than atrophic effects.
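To make the supra-threshold volume idea concrete, the sketch below runs a permutation test on the number of voxels whose correlation with a clinical score exceeds a fixed threshold. It is a minimal illustration under stated assumptions: the array layout (subjects by voxels), the one-tailed positive test, the correlation threshold `r_thresh`, and the function names are all hypothetical, and the study itself thresholded voxel-level significance rather than the correlation coefficient.

```python
import numpy as np

def voxelwise_r(maps, scores):
    """Pearson r between each voxel (column of maps) and the scores."""
    x = maps - maps.mean(axis=0)
    y = scores - scores.mean()
    denom = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    return (x * y[:, None]).sum(axis=0) / denom

def suprathreshold_permutation_p(maps, scores, r_thresh=0.3,
                                 n_perm=1000, seed=0):
    """Corrected p-value for the count of voxels with r > r_thresh.

    The scores are shuffled across subjects to build a null distribution
    of the supra-threshold voxel count (assumed one-tailed, positive r).
    """
    rng = np.random.default_rng(seed)
    observed = int((voxelwise_r(maps, scores) > r_thresh).sum())
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(scores)
        count = int((voxelwise_r(maps, permuted) > r_thresh).sum())
        exceed += count >= observed
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

Because the test statistic is a volume, a spatially extensive but weak effect can score higher than a focal but strong one, which is why a maximum-statistic test could rank the atrophy and CSF signals differently.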
Analysis of group size
It may seem odd to assess effect size in groups as small as 10 to
40 subjects per group when imaging studies such as ADNI now
assess 200 or 400 subjects per group. Here a sample as low as 10 is
merely included to show how power completely breaks down when
the sample is minimal and not sufficiently powered to detect an
effect with reasonable confidence. Although morphometric studies
30X. Hua et al. / NeuroImage 41 (2008) 19–34
designed to contrast patients in several categories (treatment versus
placebo, MCI converters versus non-converters, ApoE4 carriers
versus non-carriers), so it is common to have groups containing
as few as 10 subjects for some statistical contrasts (given the low
annual rate of conversion from MCI to AD, and the low incidence of
certain risk genotypes). As seen with our CDF approach, for con-
trasts that are underpowered, it may have merit to plot the CDFs
based on pilot samples, and assess the rate at which the CDFs are
increasing (or not) with successive increments in the sample size.
Although there is no widely accepted power analysis for morphometric
studies using statistical maps as outcome measures, CDF-based
methods, such as those advocated here, offer a means to study
reject a null hypothesis.
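One simple way to track how the CDFs change with sample size is to record the height of each CDF at a fixed p-value and compare it with the height expected under the null hypothesis. The sketch below assumes that a p-value map has already been computed for each pilot sample size; the dictionary layout, the cut-off `p_cut`, and the function names are hypothetical.

```python
import numpy as np

def cdf_height(p_values, p_cut=0.01):
    """Empirical CDF at p_cut: the fraction of voxels with p <= p_cut."""
    p = np.asarray(p_values, dtype=float).ravel()
    return float((p <= p_cut).mean())

def cdf_growth(p_maps_by_n, p_cut=0.01):
    """List of (n, CDF height, ratio to the null height p_cut) by sample size.

    Under the null hypothesis the p-values are uniform, so the expected
    height is p_cut and the ratio is close to 1.
    """
    rows = []
    for n, p_map in sorted(p_maps_by_n.items()):
        height = cdf_height(p_map, p_cut)
        rows.append((n, height, height / p_cut))
    return rows
```

A ratio that stays near 1 as subjects are added suggests the contrast remains underpowered, whereas a ratio that climbs steadily suggests a larger sample is likely to detect the effect.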
are needed in interpreting them. First, in this case all of the variance
subject with the normal group, so some covariation for age, sex, and
possibly other factors, ideally based on multiple regression in a large
sample, would be more appropriate to calibrate the level of age-
adjusted atrophy. Second, lower tissue volumes in an individual are
not always a sign of disease, so plotting regional volumes as a per-
centile relative to a normative population (which is essentially what
the significance map is) may reflect a combination of disease-related
atrophy, and some natural variation in brain volumes. These factors
could be easier to disentangle in a longitudinal evaluation of the same
patient over time. Finally, as noted by Salmond et al. (2002), if a
Gaussian distribution is assumed for the Jacobian statistics at each
to non-Gaussianity when comparing a single subject to a group. To
as normally distributed, Salmond et al. suggested that the data be first
heavily smoothed (using a 12 mm FWHM kernel); alternatively, a
large control population could be used to establish a non-parametric
reference distribution at each voxel, which is essentially the
permutation approach taken here.
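The non-parametric reference distribution described above can be sketched as a voxelwise percentile map, in which each voxel of a single subject's map is ranked against the control group without assuming a Gaussian distribution. The function name and array layout (controls stacked along the first axis) are assumptions made for this illustration, and no covariate adjustment is included.

```python
import numpy as np

def voxelwise_percentile(subject_map, control_maps):
    """Fraction of controls whose value is <= the subject's, per voxel.

    control_maps has shape (n_controls, ...) and subject_map has the
    shape of one control map. Values near 0 mean the subject's local
    volume is lower than that of almost every control.
    """
    controls = np.asarray(control_maps, dtype=float)
    subject = np.asarray(subject_map, dtype=float)
    return (controls <= subject).mean(axis=0)
```

The resolution of such a map is limited by the number of controls: with 40 controls the smallest nonzero percentile is 1/40, so a large normative sample is needed to resolve the extreme tails.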
Anatomical maps and prior work
The main contribution of this paper, relative to prior work using
voxel-based morphometry (VBM) and tensor-based morphometry
in AD or MCI, is to study the effects of different analysis choices
within the framework of TBM, and how they affect the sensitivity
for detecting disease effects. Our anatomical findings are largely in
line with prior work using automated techniques to map patterns of
brain atrophy at the voxel level. Initial formulations of VBM derived
maps of structural differences by comparing the local composition
of brain tissue types after global position and volumetric differ-
ences had been removed through spatial normalization (Ishii et al.,
2005; Shiino et al., 2006; Davatzikos et al., 2008; Fan et al., 2008;
Karas et al., 2007; Smith et al., 2007; Vemuri et al., 2008). In
contrast, TBM is a method based on high-dimensional image
registration, which derives information on regional volumetric dif-
ferences from the deformation field that aligns the images. Recent
reformulations of VBM, termed ‘optimized VBM’ (Davatzikos
et al., 2001; Good et al., 2001), modulate the voxel intensity of the
spatially normalized gray matter maps by the local expansion
factor of a 3D deformation field that aligns each brain to a standard
brain template. As a result, the final modulated voxel contains the
same amount of gray matter as in the native pre-registered gray
matter map. Chetelat et al. (2002) and Karas et al. (2004) used
VBM to analyze patterns of gray matter loss in MCI and AD.
Relative to normal subjects, Chetelat et al. (2002) found that MCI
subjects showed significant atrophy in the hippocampus, temporal
cortices, and cingulate gyri. Gray matter density in the posterior
association cortex was significantly higher in MCI than AD. Karas
et al. (2004) found similar patterns of parietal atrophy in AD and
MCI, but found active hippocampal atrophy in the transitional
stage from MCI to AD. The authors suggested this discrepancy
could be due to borderline significance or differences in the disease
severity of the MCI populations. A very recent study by Teipel et al.
(2007) used TBM to study brain
degeneration in MCI and AD. They used principal component
analysis to extract spatially distributed anatomical features asso-
ciated with the diagnosis of AD, and they focused on identifying
features that may be useful in predicting the transition from MCI to
AD. Future longitudinal TBM studies with the ADNI data are
likely to reveal which aspects of atrophy are most predictive of
future conversion to AD, and which voxel-based methods are
optimal for detecting progression or correlations with cognition. As
the sample size increases, it may be possible to detect and model
effects of the MRI platform, field strength, or acquisition site, to
determine whether the multi-site and dual MRI platform acquisi-
tion of the data contributed to reduced effect sizes, especially for
the MCI group. Comparisons distinguishing MCI from controls may
be more sensitive to these effects, whereas the AD versus control
group comparison has an effect size so great that it overwhelms
any increased variability due to multicenter acquisition. This
general, will be evaluated in future.
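As noted in this section, TBM derives regional volume differences from the deformation field that aligns the images, and modulated VBM scales gray matter maps by the same local expansion factor. That factor is the determinant of the Jacobian matrix of the mapping. The sketch below computes it with finite differences for a displacement field stored as an array of shape (3, X, Y, Z); the storage convention, the voxel spacing argument, and the function name are assumptions for illustration, and the sketch is not the registration software used in this study.

```python
import numpy as np

def jacobian_determinant(displacement, spacing=(1.0, 1.0, 1.0)):
    """Local volume change of the mapping x -> x + u(x).

    displacement: array of shape (3, X, Y, Z) holding u(x).
    Returns an (X, Y, Z) array; values > 1 indicate local expansion
    and values < 1 indicate local contraction.
    """
    # grads[i][j] approximates the partial derivative of u_i along axis j.
    grads = [np.gradient(displacement[i], *spacing) for i in range(3)]
    jac = np.empty(displacement.shape[1:] + (3, 3))
    for i in range(3):
        for j in range(3):
            # Jacobian of x + u(x) is the identity plus the gradient of u.
            jac[..., i, j] = grads[i][j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)
```

In TBM the resulting map (often after a log transform) is analyzed directly at each voxel, whereas in modulated VBM the spatially normalized gray matter map would be multiplied by it so that the total amount of gray matter is preserved.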
Data used in preparing this article were obtained from the
Alzheimer’s Disease Neuroimaging Initiative database (www.loni.ucla.edu/ADNI).
Many ADNI investigators therefore contributed to
of ADNI investigators is available at www.loni.ucla.edu/ADNI/Collaboration/ADNI_Citation.shtml.
This work was primarily
funded by the ADNI (Principal Investigator: Michael Weiner; NIH
grant number U01 AG024904). ADNI is funded by the National
Institute on Aging, the National Institute of Biomedical Imaging and
Bioengineering (NIBIB), and the Foundation for the National
Institutes of Health, through generous contributions from the fol-
lowing companies and organizations: Pfizer Inc., Wyeth Research,
Bristol-Myers Squibb, Eli Lilly and Company, GlaxoSmithKline,
Merck & Co. Inc., AstraZeneca AB, Novartis Pharmaceuticals
Corporation, the Alzheimer’s Association, Eisai Global Clinical
Development, Elan Corporation plc, Forest Laboratories, and the
Institute for the Study of Aging (ISOA), with participation from the
U.S. Food and Drug Administration. The grantee organization is the
Northern California Institute for Research and Education, and the
study is coordinated by the Alzheimer’s Disease Cooperative Study
at the University of California, San Diego. Algorithm development
for this study was also funded by the NIA, NIBIB, the National
Library of Medicine, and the National Center for Research Resources
(AG016570, EB01651, LM05639, RR019771 to PT).
Author contributions were as follows: XH, AL, SL, AK, AT, NL,
YC, MC, MB, RB, JB, NS, LB, and PT performed the image
GA, and MW contributed substantially to the image acquisition,
study design, quality control, calibration and pre-processing,
databasing and image analysis. We thank Anders Dale for his
contributions to the image pre-processing and the ADNI project.
Part of this work was undertaken at UCLH/UCL, which received a
proportion of funding from the Department of Health’s NIHR
Biomedical Research Centres funding scheme.
Aljabar, P., Bhatia, K.K., Hajnal, J.V., Boardman, J.P., Srinivasan, L.,
Rutherford, M.A., Dyet, L.E., Edwards, A.D., Rueckert, D., 2006.
Analysis of Growth in the Developing Brain Using Non-Rigid Re-
gistration. IEEE Int. Symp. Biomed. Imaging 201–204.
Aljabar, P., Bhatia, K.K., Murgasova, M., Hajnal, J.V., Boardman, J.P.,
Srinivasan, L., Rutherford, M.A., Dyet, L.E., Edwards, A.D., Rueckert,
D., 2008. Assessment of brain growth in early childhood using
deformation-based morphometry. Neuroimage 39, 348–358.
Apostolova, L.G., Thompson, P.M., 2007. Brain mapping as a tool to study
neurodegeneration. Neurotherapeutics 4, 387–400.
Apostolova, L.G., Dutton, R.A., Dinov, I.D., Hayashi, K.M., Toga, A.W.,
Cummings, J.L., Thompson, P.M., 2006. Conversion of mild cognitive
Arch. Neurol. 63, 693–699.
Ashburner, J., 2007. A fast diffeomorphic image registration algorithm.
Neuroimage 38, 95–113.
Ashburner, J., Friston, K.J., 2003. Morphometry. In: Ashburner, J., Friston,
K.J., Penny, W. (Eds.), Human Brain Function. Academic Press.
Avants, B., Gee, J.C., 2004. Geodesic estimation for large deformation
anatomical shape averaging and interpolation. Neuroimage 23 (Suppl 1),
morphometry in mild Alzheimer's disease. Neuroimage 14, 298–309.
Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a
practical and powerful approach to multiple testing. J. R. Stat. Soc., B.
57 (1), 289–300.
Brun, C., Lepore, N., Pennec, X., Chou, Y., Lopez, O., Aizenstein, H.,
Becker, J., Toga, A., Thompson, P., 2007. Comparison of Standard and
Riemannian Elasticity for Tensor-Based Morphometry in HIV/AIDS. In:
Proc. of MICCAI'07 Workshop on Statistical Registration: Pair-wise
and Group-wise Alignment and Atlas Formation. Springer, Berlin,
Bullmore, E.T., Suckling, J., Overmeyer, S., Rabe-Hesketh, S., Taylor, E.,
Brammer, M.J., 1999. Global, voxel, and cluster tests, by theory and
permutation, for a difference between two groups of structural MR
images of the brain. IEEE Trans. Med. Imaging 18 (1), 32–42.
Busatto, G.F., Garrido, G.E., Almeida, O.P., Castro, C.C., Camargo, C.H.,
Cid, C.G., Buchpiguel, C.A., Furuie, S., Bottino, C.M., 2003. A voxel-
based morphometry study of temporal lobe gray matter reductions in
Alzheimer's disease. Neurobiol. Aging 24, 221–231.
Callen, D.J., Black, S.E., Gao, F., Caldwell, C.B., Szalai, J.P., 2001. Beyond
the hippocampus: MRI volumetry confirms widespread limbic atrophy
in AD. Neurology 57, 1669–1674.
Cardenas, V.A., Studholme, C., Gazdzinski, S., Durazzo, T.C., Meyerhoff,
D.J., 2007. Deformation-based morphometry of brain changes in alcohol
dependence and abstinence. Neuroimage 34, 879–887.
Carmichael, O.T., Thompson, P.M., Dutton, R.A., Lu, A., Lee, S.E., Lee,
J.Y., Kuller, L.H., Lopez, O.L., Aizenstein, H.J., Meltzer, C.C., Liu, Y.,
Toga, A.W., Becker, J.T., 2006. Mapping ventricular changes related to
dementia and mild cognitive impairment in a large community-based
cohort. IEEE ISBI 315–318.
Carmichael, O.T., Kuller, L.H., Lopez, O.L., Thompson, P.M., Dutton, R.A.,
A.W., Becker, J.T., 2007. Ventricular volume and dementia progression
in the Cardiovascular Health Study. Neurobiol. Aging 28, 389–397.
Chetelat, G., Desgranges, B., De La Sayette, V., Viader, F., Eustache, F.,
Baron, J.C., 2002. Mapping gray matter loss with voxel-based morpho-
metry in mild cognitive impairment. Neuroreport 13, 1939–1943.
Chiang, M.C., Dutton, R.A., Hayashi, K.M., Lopez, O.L., Aizenstein, H.J.,
Toga, A.W., Becker, J.T., Thompson, P.M., 2007a. 3D pattern of brain
atrophy in HIV/AIDS visualized using tensor-based morphometry. Neu-
roimage 34, 44–60.
Chiang, M.C., Reiss, A.L., Lee, A.D., Bellugi, U., Galaburda, A.M.,
Korenberg, J.R., Mills, D.L., Toga, A.W., Thompson, P.M., 2007b. 3D
pattern of brain abnormalities in Williams syndrome visualized using
tensor-based morphometry. Neuroimage 36, 1096–1109.
Chou, Y.Y., Lepore, N., de Zubicaray, G.I., Carmichael, O.T., Becker, J.T.,
Toga, A.W., Thompson, P.M., in press. Automated ventricular mapping
with multi-atlas fluid image alignment reveals genetic effects in
Alzheimer's disease. Neuroimage.
Christensen, G.E., Johnson, H.J., Vannier, M.W., 2006. Synthesizing av-
erage 3D anatomical shapes. Neuroimage 32, 146–158.
Chung, M.K., Worsley, K.J., Paus, T., Cherif, C., Collins, D.L., Giedd, J.N.,
Rapoport, J.L., Evans, A.C., 2001. A unified statistical approach to
deformation-based morphometry. Neuroimage 14, 595–606.
Psychopharmacol. Bull. 24, 689–692.
Collins, D.L., Neelin, P., Peters, T.M., Evans, A.C., 1994. Automatic 3D
intersubject registration of MR volumetric data in standardized Talairach
space. J. Comput. Assist. Tomogr. 18, 192–205.
Davatzikos, C., Genc, A., Xu, D., Resnick, S.M., 2001. Voxel-based mor-
phometry using the RAVENS maps: methods and validation using
simulated longitudinal atrophy. Neuroimage 14, 1361–1369.
Davatzikos, C., Barzi, A., Lawrie, T., Hoon Jr., A.H., Melhem, E.R., 2003.
Correlation of corpus callosal morphometry with cognitive and motor
function in periventricular leukomalacia. Neuropediatrics 34, 247–252.
Davatzikos, C., Fan, Y., Wu, X., Shen, D., Resnick, S.M., 2008. Detection of
prodromal Alzheimer's disease via pattern classification of magnetic
resonance imaging. Neurobiol. Aging 29, 514–523.
Delacourte, A., David, J.P., Sergeant, N., Buee, L., Wattez, A., Vermersch,
P., Ghozali, F., Fallet-Bianco, C., Pasquier, F., Lebert, F., Petit, H., Di
Menza, C., 1999. The biochemical pathway of neurofibrillary degene-
ration in aging and Alzheimer's disease. Neurology 52, 1158–1165.
Dubb, A., Xie, Z., Gur, R., Gee, J., 2005. Characterization of brain plasticity
in schizophrenia using template deformation. Acad. Radiol. 12, 3–9.
Edgington, E.S., 1995. Randomization tests, 3rd. Marcel Dekker, New York.
Fan, Y., Batmanghelich, N., Clark, C.M., Davatzikos, C., 2008. Spatial
patterns of brain atrophy in MCI patients, identified via high-dimen-
sional pattern classification, predict subsequent cognitive decline. Neu-
roimage 39 (4), 1731–1743.
Fillard, P., Arsigny, V., Pennec, X., Thompson, P., Ayache, N., 2005. Ex-
trapolation of Sparse Tensor Fields: Application to the Modeling of
Brain Variability. Information Processing in Medical Imaging (IPMI).
Springer, Berlin, Glenwood Springs, Colorado, pp. 644–652.
Fleisher, A., Fennema-Notestine, C., Hagler, D., Podraza, K., Wu, E.,
Taylor, C., Karow, D., Dale, A., 2007. Baseline structural MRI correlates
of clinical measures in the Alzheimer's disease neuroimaging initiative.
Alzheimers Dement. 3 (3), S107–S108.
Folstein, M.F., Folstein, S.E., McHugh, P.R., 1975. Mini-mental state. A
practical method for grading the cognitive state of patients for the
clinician. J. Psychiatr. Res. 12, 189–198.
Fox, N.C., Warrington, E.K., Freeborough, P.A., Hartikainen, P., Kennedy,
A.M., Stevens, J.M., Rossor, M.N., 1996. Presymptomatic hippocampal
atrophy in Alzheimer's disease. A longitudinal MRI study. Brain 119
(Pt 6), 2001–2007.
Fox, N.C., Crum, W.R., Scahill, R.I., Stevens, J.M., Janssen, J.C., Rossor,
M.N., 2001. Imaging of onset and progression of Alzheimer's disease
with voxel-compression mapping of serial magnetic resonance images.
Lancet 358, 201–205.
Zeki, Ashburner, J., Penny, W.D. (Eds.), 2003. Human Brain Function,
2nd. Academic Press, USA.
Freeborough, P.A., Fox, N.C., 1998. Modeling brain deformations in Al-
zheimer disease by fluid registration of serial 3D MR images. J. Comput.
Assist. Tomogr. 22, 838–843.
Freeborough, P.A., Fox, N.C., Kitney, R.I., 1997. Interactive algorithms for
the segmentation and quantitation of 3-D MRI brain scans. Comput.
Methods Programs Biomed. 53, 15–25.
Frisoni, G.B., Testa, C., Zorzan, A., Sabattoli, F., Beltramello, A., Soininen,
disease with voxel based morphometry. J. Neurol. Neurosurg. Psychiatry
Gee, J., Ding, L., Xie, Z., Lin, M., DeVita, C., Grossman, M., 2003. Al-
zheimer's disease and frontotemporal dementia exhibit distinct atrophy-
behavior correlates. Acad. Radiol. 10, 1392–1401.
Genovese, C.R., Lazar, N.A., Nichols, T., 2002. Thresholding of statistical
maps in functional neuroimaging using the false discovery rate. Neuro-
image 15, 870–878.
Good, C.D., Johnsrude, I.S., Ashburner, J., Henson, R.N., Friston, K.J.,
Frackowiak, R.S., 2001. A voxel-based morphometric study of ageing in
465 normal adult human brains. Neuroimage 14, 21–36.
Grundman, M., Sencakova, D., Jack Jr., C.R., Petersen, R.C., Kim, H.T.,
Schultz, A., Weiner, M.F., DeCarli, C., DeKosky, S.T., van Dyck, C.,
Thomas, R.G., Thal, L.J., 2002. Brain MRI hippocampal volume and
prediction of clinical status in a mild cognitive impairment trial. J. Mol.
Neurosci. 19, 23–27.
Gunter, J., Bernstein, M., Borowski, B., Felmlee, J., Blezek, D., Mallozzi,
R., 2006. Validation testing of the MRI calibration phantom for the
Alzheimer's Disease Neuroimaging Initiative Study. ISMRM 14th Sci-
entific Meeting and Exhibition.
Hua, X., Leow, A.D., Levitt, J.G., Caplan, R., Thompson, P.M., Toga, A.W.,
based morphometry. Hum. Brain Mapp.
clinical scale for the staging of dementia. Br. J. Psychiatry 140, 566–572.
Ishii, K., Kawachi, T., Sasaki, H., Kono, A.K., Fukuda, T., Kojima, Y., Mori,
E., 2005. Voxel-based morphometric comparison between early- and
late-onset mild Alzheimer's disease and assessment of diagnostic per-
formance of z score images. AJNR Am. J. Neuroradiol. 26, 333–340.
Jack Jr., C.R., Shiung, M.M., Weigand, S.D., O'Brien, P.C., Gunter, J.L.,
Boeve, B.F., Knopman, D.S., Smith, G.E., Ivnik, R.J., Tangalos, E.G.,
Petersen, R.C., 2005. Brain atrophy rates predict subsequent clinical con-
version in normal elderly and amnestic MCI. Neurology 65, 1227–1231.
Jack, C.R., Jr., Bernstein, M.A., Fox, N.C., Thompson, P., Alexander, G.,
Harvey, D., Borowski, B., Britson, P.J., Whitwell, J.L., Ward, C., Dale,
A.M., Felmlee, J.P., Gunter, J.L., Hill, D.L., Killiany, R., Schuff, N.,
Fox-Bosetti, S., Lin, C., Studholme, C., Decarli, C.S., Ward, H.A.,
Fleisher, A.S., Albert, M., Green, R., Bartzokis, G., Glover, G.,
Mugler, J., Weiner, M.W., in press. The Alzheimer's disease neuroima-
ging initiative (ADNI): MRI methods. J. Magn. Reson. Imaging.
Janke, A.L., de Zubicaray, G., Rose, S.E., Griffin, M., Chalk, J.B.,
Galloway, G.J., 2001. 4D deformation modeling of cortical disease
progression in Alzheimer's dementia. Magn. Reson. Med. 46, 661–666.
Joshi, S., Davis, B., Jomier, M., Gerig, G., 2004. Unbiased diffeomorphic
atlas construction for computational anatomy. Neuroimage 23 (Suppl 1),
Jovicich, J., Czanner, S., Greve, D., Haley, E., van der Kouwe, A., Gollub,
R., Kennedy, D., Schmitt, F., Brown, G., Macfall, J., Fischl, B., Dale, A.,
2006. Reliability in multi-site structural MRI studies: effects of gradient
non-linearity correction on phantom and human data. Neuroimage 30,
Karas, G.B., Scheltens, P., Rombouts, S.A., Visser, P.J., van Schijndel,
R.A., Fox, N.C., Barkhof, F., 2004. Global and local gray matter loss
in mild cognitive impairment and Alzheimer's disease. Neuroimage
Karas, G., Scheltens, P., Rombouts, S., van Schijndel, R., Klein, M., Jones,
early-onset Alzheimer's disease: a morphometric structural MRI study.
Neuroradiology 49, 967–976.
Kochunov, P., Lancaster, J.L., Thompson, P., Woods, R., Mazziotta, J.,
Hardies, J., Fox, P., 2001. Regional spatial normalization: toward an
optimal target. J. Comput. Assist. Tomogr. 25, 805–816.
Kochunov, P., Lancaster, J., Thompson, P., Toga, A.W., Brewer, P., Hardies,
J., Fox, P., 2002. An optimized individual target brain in the Talairach
coordinate system. Neuroimage 17, 922–927.
Kochunov, P., Lancaster, J., Hardies, J., Thompson, P.M., Woods, R.P.,
Cody, J.D., Hale, D.E., Laird, A., Fox, P.T., 2005. Mapping structural
differences of the corpus callosum in individuals with 18q deletions
using targetless regional spatial normalization. Hum. Brain Mapp. 24,
Kovacevic, N., Henderson, J.T., Chan, E., Lifshitz, N., Bishop, J., Evans,
A.C., Henkelman, R.M., Chen, X.J., 2005. A three-dimensional MRI
atlas of the mouse brain with estimates of the average and variability.
Cereb. Cortex 15, 639–645.
Langers, D.R., Jansen, J.F., Backes, W.H., 2007. Enhanced signal detection
in neuroimaging by means of regional control of the global false
discovery rate. Neuroimage 38, 43–56.
Lee, A.D., Leow, A.D., Lu, A., Reiss, A.L., Hall, S., Chiang, M.C., Toga,
A.W., Thompson, P.M., 2007. 3D pattern of brain abnormalities in
Fragile X syndrome visualized using tensor-based morphometry. Neu-
roimage 34, 924–938.
Leow, A., Huang, S.C., Geng, A., Becker, J.T., Davis, S., Toga, A.W.,
Thompson, P.M., 2005a. Inverse Consistent Mapping in 3D Deformable
Image Registration: Its Construction and Statistical Properties. Information
Processing in Medical Imaging. Springer, Berlin, Glenwood Springs,
Colorado, USA, pp. 493–503.
Leow, A.D., Thompson, P.M., Hayashi, K.M., Bearden, C., Nicoletti, M.A.,
Monkul, S.E., Brambilla, P., Sassi, R.B., Mallinger, A.G., Soares, J.C.,
2005b. Lithium Effects on Human Brain Structure Mapped Using Longitudinal
MRI. Society for Neuroscience, Washington, DC.
Leow, A.D., Klunder, A.D., Jack Jr., C.R., Toga, A.W., Dale, A.M.,
Bernstein, M.A., Britson, P.J., Gunter, J.L., Ward, C.P., Whitwell, J.L.,
Borowski, B.J., Fleisher, A.S., Fox, N.C., Harvey, D., Kornak, J.,
Schuff, N., Studholme, C., Alexander, G.E., Weiner, M.W., Thompson,
P.M., 2006. Longitudinal stability of MRI for mapping brain change
using tensor-based morphometry. Neuroimage 31, 627–640.
Leow, A., Yanovsky, I., Chiang, M., Lee, A., Klunder, A., Lu, A., Becker, J.,
maps and inverse-consistent deformations in non-linear image registra-
tion. IEEE Trans. Med. Imaging 822–832.
Lepore, N., Brun, C., Pennec, X., Chou, Y., Lopez, O., Aizenstein, H.,
Becker, J., Toga, A., Thompson, P., 2007. Mean Template for Tensor-
Based Morphometry using Deformation Tensors. MICCAI, Springer,
Berlin, Brisbane, Australia.
Lepore, N., Brun, C., Chou, Y.Y., Chiang, M.C., Dutton, R.A., Hayashi, K.M.,
Luders, E., Lopez, O.L., Aizenstein, H.J., Toga, A.W., Becker, J.T.,
Thompson, P.M., 2008. Generalized tensor-based morphometry of HIV/
AIDS using multivariate statistics on deformation tensors. IEEE Trans.
Med. Imag. 27, 129–141.
Lopez, O.L., Kuller, L.H., Becker, J.T., Dulberg, C., Sweet, R.A., Gach, H.M.,
Lorenzen, P., et al., 2004. Multi-class posterior atlas formation via unbiased
Kullback-Leibler template estimation. MICCAI, pp. 95–102.
Lorenzen, P., Prastawa, M., Davis, B., Gerig, G., Bullitt, E., Joshi, S., 2006.
Multi-modal image set registration and atlas formation. Med. Image
Anal. 10, 440–451.
Lynch, C.A., Walsh, C., Blanco, A., Moran, M., Coen, R.F., Walsh, J.B.,
Lawlor, B.A., 2006. The clinical dementia rating sum of box score in
mild dementia. Dement. Geriatr. Cogn. Disord. 21, 40–43.
Marsden, J., Hughes, T., 1983. Mathematical foundations of elasticity.
Mazziotta, J., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., Woods,
R., Paus, T., Simpson, G., Pike, B., Holmes, C., Collins, L., Thompson,
P., MacDonald, D., Iacoboni, M., Schormann, T., Amunts, K., Palomero-
Gallagher, N., Geyer, S., Parsons, L., Narr, K., Kabani, N., Le Goualher,
G., Boomsma, D., Cannon, T., Kawashima, R., Mazoyer, B., 2001. A
probabilistic atlas and reference system for the human brain: Interna-
tional Consortium for Brain Mapping (ICBM). Philos. Trans. R. Soc.
Lond., B Biol. Sci. 356, 1293–1322.
McKhann, G., Drachman, D., Folstein, M., Katzman, R., Price, D., Stadlan,
E.M., 1984. Clinical diagnosis of Alzheimer's disease: report of the
NINCDS-ADRDA Work Group under the auspices of Department of
Health and Human Services Task Force on Alzheimer's Disease. Neu-
rology 34, 939–944.
Morra, J., Tu, Z., Apostolova, L.G., Green, A.E., Avedissian, C., Madsen,
S.K., Parikshak, N., Hua, X., Toga, A.W., Jack, C.R., Schuff, N.,
Weiner, M.W., Thompson, P.M., submitted for publication. Automated
3D Mapping of Hippocampal Atrophy and its Clinical Correlates in 400
Subjects with Alzheimer's Disease, Mild Cognitive Impairment, and
Elderly Controls, ISBI 2008.
Morris, J.C., 1993. The Clinical Dementia Rating (CDR): current version
and scoring rules. Neurology 43, 2412–2414.
Mueller, S.G., Weiner, M.W., Thal, L.J., Petersen, R.C., Jack, C., Jagust, W.,
Trojanowski, J.Q., Toga, A.W., Beckett, L., 2005a. The Alzheimer's
disease neuroimaging initiative. Neuroimaging Clin. North Am. 15,
Mueller, S.G., Weiner, M.W., Thal, L.J., Petersen, R.C., Jack, C.R., Jagust,
W., Trojanowski, J.Q., Toga, A.W., Beckett, L., 2005b. Ways toward an
early diagnosis in Alzheimer's disease: The Alzheimer's Disease Neu-
roimaging Initiative (ADNI). Alzheimers. Dement. 1, 55–66.
Nichols, T.E., Holmes, A.P., 2002. Nonparametric permutation tests for func-
tional neuroimaging: a primer with examples. Hum. Brain Mapp. 15, 1–25.
Petersen, R.C., 2000. Aging, mild cognitive impairment, and Alzheimer's
disease. Neurol. Clin. 18, 789–806.
Petersen, R.C., Smith, G.E., Waring, S.C., Ivnik, R.J., Tangalos, E.G., Kok-
men, E., 1999. Mild cognitive impairment: clinical characterization and
outcome. Arch. Neurol. 56, 303–308.
Petersen, R.C., Doody, R., Kurz, A., Mohs, R.C., Morris, J.C., Rabins, P.V.,
Ritchie, K., Rossor, M., Thal, L., Winblad, B., 2001. Current concepts in
mild cognitive impairment. Arch. Neurol. 58, 1985–1992.
Riddle, W.R., Li, R., Fitzpatrick, J.M., DonLevy, S.C., Dawant, B.M., Price,
R.R., 2004. Characterizing changes in MR images with color-coded
Jacobians. Magn. Reson. Imaging 22, 769–777.
Rohlfing, T., Pfefferbaum, A., Sullivan, E.V., Maurer, C.R., 2005. Infor-
mation fusion in biomedical image analysis: combination of data vs.
combination of interpretations. Inf. Process. Med. Imaging 19, 150–161.
Salmond, C.H., Ashburner, J., Vargha-Khadem, F., Connelly, A., Gadian,
D.G., Friston, K.J., 2002. Distributional assumptions in voxel-based
morphometry. Neuroimage 17, 1027–1030.
Scahill, R.I., Frost, C., Jenkins, R., Whitwell, J.L., Rossor, M.N., Fox, N.C.,
2003. A longitudinal study of brain volume changes in normal aging
using serial registered magnetic resonance imaging. Arch. Neurol. 60,
Shattuck, D.W., Leahy, R.M., 2002. BrainSuite: an automated cortical
surface identification tool. Med. Image Anal. 6, 129–142.
Shen, D., Davatzikos, C., 2003. Very high-resolution morphometry using
mass-preserving deformations and HAMMER elastic registration. Neu-
roimage 18, 28–41.
Shiino, A., Watanabe, T., Maeda, K., Kotani, E., Akiguchi, I., Matsuda, M.,
2006. Four subgroups of Alzheimer's disease based on patterns of at-
rophy using VBM and a unique pattern for early onset disease. Neuro-
image 33, 17–26.
Sled, J.G., Zijdenbos, A.P., Evans, A.C., 1998. A nonparametric method for
automatic correction of intensity nonuniformity in MRI data. IEEE
Trans. Med. Imaging 17, 87–97.
Smith, C.D., Chebrolu, H., Wekstein, D.R., Schmitt, F.A., Jicha, G.A.,
Cooper, G., Markesbery, W.R., 2007. Brain structural alterations before
mild cognitive impairment. Neurology 68, 1268–1273.
Storey, J.D., 2002. A direct approach to false discovery rates. J. R. Statist.
Soc. B 64 (Pt. 3), 479–498.
Studholme, C., Cardenas, V., 2004. A template free approach to volu-
metric spatial normalization of brain anatomy. Pattern Recogn. Lett.
B., Weiner, M., 2004. Deformation tensor morphometry of semantic
dementia with quantitative validation. Neuroimage 21, 1387–1398.
Teipel, S.J., Born, C., Ewers, M., Bokde, A.L., Reiser, M.F., Moller, H.J.,
Hampel, H., 2007. Multivariate deformation-based analysis of brain
atrophy to predict Alzheimer's disease in mild cognitive impairment.
Neuroimage 38, 13–24.
Testa, C., Laakso, M.P., Sabattoli, F., Rossi, R., Beltramello, A., Soininen,
H., Frisoni, G.B., 2004. A comparison between the accuracy of voxel-
based morphometry and hippocampal volumetry in Alzheimer's disease.
J. Magn. Reson. Imaging 19, 274–282.
Thompson, P., Apostolova, L., in press. Computational anatomical methods
as applied to aging and dementia. Br. J. Radiol. (Dec. 2007, Invited
Thompson, P.M., Giedd, J.N., Woods, R.P., MacDonald, D., Evans, A.C.,
Toga, A.W., 2000a. Growth patterns in the developing brain detected by
using continuum mechanical tensor maps. Nature 404, 190–193.
Thompson, P.M., Woods, R.P., Mega, M.S., Toga, A.W., 2000b. Mathe-
matical/computational challenges in creating deformable and probabil-
istic atlases of the human brain. Hum. Brain Mapp. 9, 81–92.
Thompson, P.M., Vidal, C., Giedd, J.N., Gochman, P., Blumenthal, J.,
Nicolson, R., Toga, A.W., Rapoport, J.L., 2001. Mapping adolescent
brain change reveals dynamic wave of accelerated gray matter loss in
very early-onset schizophrenia. Proc. Natl. Acad. Sci. U. S. A. 98,
Thompson, P.M., Hayashi, K.M., de Zubicaray, G., Janke, A.L., Rose, S.E.,
Semple, J., Herman, D., Hong, M.S., Dittmer, S.S., Doddrell, D.M.,
Toga, A.W., 2003. Dynamics of gray matter loss in Alzheimer's disease.
J. Neurosci. 23, 994–1005.
Toga, A.W., 1999. Brain Warping, 1st. Academic Press, San Diego.
Twining, C., Cootes, T., Marsland, S., Petrovic, V., Schestowitz, R., Taylor,
C., 2005. A Unified Information-Theoretic Approach to Groupwise
Non-rigid Registration and Model Building. Information Processing
in Medical Imaging: 19th International Conference, IPMI. Springer,
Berlin, Glenwood Springs, CO., pp. 1–14.
diagnosis in individual subjects using structural MR images: validation
studies. Neuroimage 39, 1186–1197.
Wechsler, D., 1987. Wechsler Memory Scale. Psychological Corp/Harcourt
Brace Jovanovich, New York.
Whitwell, J.L., Przybelski, S.A., Weigand, S.D., Knopman, D.S., Boeve,
B.F., Petersen, R.C., Jack Jr., C.R., 2007. 3D maps from multiple MRI
illustrate changing atrophy patterns as subjects progress from mild
cognitive impairment to Alzheimer's disease. Brain 130, 1777–1786.
Woods, R.P., 2003. Characterizing volume and surface deformations in an
atlas framework: theory, applications, and implementation. Neuroimage