International Journal of Computer Vision
https://doi.org/10.1007/s11263-020-01343-w
Learning the spatiotemporal variability in longitudinal shape data sets
Alexandre Bône^{1,2,3,4,5} · Olivier Colliot^{1,2,3,4,5} · Stanley Durrleman^{1,2,3,4,5} · for the Alzheimer’s Disease Neuroimaging Initiative
Received: 12 April 2019 / Accepted: 19 May 2020
Abstract In this paper, we propose a generative statistical model to learn the spatiotemporal variability in longitudinal shape data sets, which contain repeated observations of a set of objects or individuals over time. From all the short-term sequences of individual data, the method estimates a long-term normative scenario of shape changes and a tubular coordinate system around this trajectory. Each individual data sequence is therefore (i) mapped onto a specific portion of the trajectory accounting for differences in pace of progression across individuals, and (ii) shifted in the shape space to account for intrinsic shape differences across individuals that are independent of the progression of the observed process. The parameters of the model are estimated using a stochastic approximation of the expectation-maximization algorithm. The proposed approach is validated on a simulated data set, illustrated on the analysis of facial expression in video sequences, and applied to the modeling of the progressive atrophy of the hippocampus in Alzheimer’s disease patients. These experiments show that one can use the method to reconstruct data at the precision of the noise, to highlight significant factors that may modulate the progression, and to simulate entirely synthetic longitudinal data sets reproducing the variability of the observed process.
Keywords Longitudinal data · Statistical shape analysis · Large deformation diffeomorphic metric mapping · Medical imaging · Disease progression modeling
1 Institut du Cerveau, ICM, F-75013, Paris, France
2 Inserm, U 1127, F-75013, Paris, France
3 CNRS, UMR 7225, F-75013, Paris, France
4 Sorbonne Université, F-75013, Paris, France
5 Inria, Aramis project-team, F-75013, Paris, France
Data used in preparation of this article were partly obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: adni.loni.usc.edu.
1 Introduction
1.1 Motivation
Video sequences of smiling faces, repeated measurements of growing plants or developing cells, medical images collected at multiple visits from a population of patients affected by a chronic disease: all these examples can be understood as data collections where individual instances of a common underlying process are observed at multiple time-points. Such collections are called longitudinal data sets. The individual processes are thought to result from random variations of a common underlying process (or a few of them). Because of the dynamic nature of the observed processes, one might decompose the variability into two components: the dynamic or temporal variability on the one hand, and the time-independent or spatial variability on the other hand. In our examples, the variability in the pace of growth or in age at disease onset is understood as temporal variability. By contrast, there are also intrinsic inter-individual differences in height, weight or shape that are independent of the pace at which the plant grows or the disease progresses, which we call spatial variability. The main difficulty here is that growth or disease progression also affects height, weight or shape, so that the differences between two observations of two different individuals are due to (i) the fact that the two individuals are observed at different stages of the process, and (ii) the fact that they have different intrinsic characteristics. Disentangling these two sources of variability would not be possible if one had only one observation per individual. Having repeated observations of the individuals over time, as in longitudinal data sets, makes it possible to separate the changes due to the progression of the process from those due to intrinsic differences that are independent of the progression.
The goal of this paper is to propose a statistical learning method which can describe the spatiotemporal variability in a longitudinal data set. We focus here on shape data, where the shape may be encoded by an image, or by geometrical objects extracted from images such as curves, surface meshes or segmented volumes.
One of the main difficulties is that the experimental design often provides little control over the temporal sampling of the observations. We are interested here in processes for which there is no clear marker of progression, such as the progression of neurodegenerative diseases for which the age at disease onset is hard to determine. Therefore there is no easy way to re-align the individual data sequences in time in order to analyze the inter-individual variability at each stage of the process. Instead, the method needs to learn how the individual data sequences position themselves in relation to each other. Furthermore, the follow-up period of the observations rarely covers the whole process, but often just a small part of it. In clinical studies for example, patients may be followed for a few years whereas the disease may progress over decades. Finally, dealing with shape data raises the need for generic representations of such data that can be included in computational approaches.
1.2 Related work
Structured data like shapes can be advantageously represented as elements of curved spaces, such as Riemannian manifolds, in order to account for the prior on their structure. Either defined by invariance [34, 56, 57] or topology-preserving properties [7, 15, 21, 33, 51], shape spaces define distance metrics adapted to the geometry of a well-identified class of objects, such as brain magnetic resonance images or segmented organs. These data representations allow the generalization of the mean-variance analysis [3, 28, 50, 65], which learns the geometrical distribution of a cross-sectional data set in terms of an average shape and variability-encoding parameters. Typical healthy or pathological configurations can be summarized in this manner, thus opening the way to automatic diagnosis at the individual level. Time-series data sets, consisting of repeated observations of the same object at successive time-points, can be described by generalized regression approaches on the same shape spaces [6, 26, 27, 29, 39, 49]. A time-continuous scenario of geometrical transformation is then estimated, offering in turn individualized interpolation and extrapolation methods. The statistical analysis of longitudinal data sets requires extending the concept of generalized mean-variance analysis to such time series. In other words, it requires the definition of a statistical distribution of curves drawn on a shape space.
Shape spaces are usually equipped with a differential structure of infinite dimension. In particular, the large deformation diffeomorphic metric mapping (LDDMM) approach defines shape spaces as orbits of template shapes under the action of an infinite-dimensional parametric group of diffeomorphisms of the 2D/3D ambient space [63]. With this approach, the geometrical differences between two objects are captured by estimating the diffeomorphic transformation that warps one into the other. More recent works propose finite-dimensional approaches built on the same principles: [64] uses truncated Fourier transforms to build a finite-dimensional Lie algebra, and [20] constructs a finite-dimensional Riemannian manifold based on a set of self-interacting particles.
Such structures are favorable to the analysis of longitudinal data sets because they naturally offer the parallel transport operator [40], which allows tangent-space vectors at distant points to be compared in a relevant manner. This operator is key to comparing trajectories on the manifold, and therefore to analyzing longitudinal data. In [56, 57] for instance, trajectories on manifolds are compared by parallel-transporting their initial velocity vectors back to some privileged point of the manifold, thereby handling the spatial variability if a reference configuration and a reference time-point are known. A similar approach is followed in [35] where medical images are analyzed in a voxel-wise fashion, and in [54] with the co-adjoint transport instead of the parallel transport. In [16], the variability of a large longitudinal data set of thalamus shapes is analyzed by transporting individual residual deformations along a common and pre-computed trajectory, back to a baseline point. In [52, 53] the authors define the exp-parallelization operator, which extends the notion of parallel lines to Riemannian manifolds. The works [10, 36] build on this operator to analyze dynamic networks and shape objects respectively. Other approaches propose to work on a space of trajectories, such as in [47] where the Sasaki metric is used to define distances between geodesic curves on a manifold, or in [12] which requires the same number of observations per subject.
While parallel transport allows manifold-valued trajectories to be spatially aligned, a temporal alignment mechanism is also needed for data sets with variability in the individual progression dynamics. For instance, two patients developing the same neurological disease have no reason to reach the same disease stage at the same age, nor to have synchronous progressions. A solution is to use time-warp functions, which define a mapping between an abstract common reference time frame and the individual time lines [9, 10, 21, 35, 36, 45, 52, 53]. In [56, 57], the authors build on the square-root velocity fields framework to quotient the space of spatiotemporal paths by diffeomorphic time-warps. In [48], a monotonic Gaussian process is built from a set of temporal sources.
1.3 Contributions
In this paper, we propose a method that learns an average progression and its spatiotemporal variability from a longitudinal shape data set. The average progression takes the form of a geodesic curve in the finite-dimensional Riemannian approximation of the LDDMM framework of [22]. The concept of exp-parallelization introduced in [52, 53] is then applied in this context to define a tubular coordinate system, also called Fermi coordinates, around the average geodesic. The average trajectory and its coordinate system are automatically learned by the method, so that every individual data sequence is mapped to a specific portion of the average trajectory to account for the temporal variability, and shifted in the shape space to account for the spatial variability. The calibration of the resulting generative statistical model is done by adapting a stochastic approximation EM method. This paper extends the conference paper [10], with finer modeling of the variability in the individual paces of progression, and an original optimization method for accelerated model calibration. The proposed approach is validated on a simulated data set, illustrated on a facial expression recognition task, and applied to hippocampus shape progression modeling in Alzheimer’s disease.
Section 2 defines the concept of shape spatiotemporal coordinate systems, which allows the introduction of the generative statistical model in Section 3. Section 4 details the calibration, personalization and simulation algorithms, which are evaluated and illustrated in Section 5. These experiments evaluate the goodness-of-fit of the model, the relevance of the representation of the spatiotemporal variability for the identification of factors explaining this variability, and the ability of the model to generate synthetic data sets that reproduce the observed variability in the training data set.
2 Shape spatiotemporal reference frame
Within LDDMM frameworks, shapes are positioned with respect to a reference shape, often called atlas or template. A coordinate system is defined in the tangent-space at the template shape. We propose here to replace the template (which is a single shape) by a curve (i.e. a shape trajectory), and the coordinate system by a tubular spatiotemporal coordinate system centered around the template trajectory. We first review the usual construction of a static template shape before extending it to the spatiotemporal case.
2.1 Positioning a shape with respect to a static atlas
Positioning a target shape $y$ with respect to a static reference $y_0$ is called the registration problem. Deformation-based morphometry solves it by estimating a diffeomorphism $\phi_1$ of the ambient space $\mathbb{R}^d$ ($d = 2$ or $3$) that transforms $y_0$ into $y$, which we note $\phi_1 \star y_0 = y$. In the context of LDDMM, diffeomorphisms are constructed by following the streamlines of dynamic vector fields $t \to v_t \in C_0^\infty(\mathbb{R}^d, \mathbb{R}^d)$ over $[0, 1]$:
$$\partial_t \phi_t = v_t \circ \phi_t \quad \text{with} \quad \phi_0 = \mathrm{Id}. \tag{1}$$
Following the approach in finite-dimension of [22], we further assume that any $v_t$ writes as the Gaussian convolution of $p$ momentum vectors $m_t = m_t^{(1)}, \ldots, m_t^{(p)} \in \mathbb{R}^d$ over a corresponding set of control points $c_t = c_t^{(1)}, \ldots, c_t^{(p)} \in \mathbb{R}^d$:
$$v_t : x \in \mathbb{R}^d \to \sum_{k=1}^{p} g\big(c_t^{(k)}, x\big) \cdot m_t^{(k)} \in \mathbb{R}^d \tag{2}$$
with $g : x, x' \in \mathbb{R}^d \to \exp\big(-\|x' - x\|^2_{\ell^2} / \sigma^2\big)$ the Gaussian kernel function of kernel width $\sigma > 0$. Many other diffeomorphisms constructed in this manner might actually transform $y_0$ into $\phi_1 \star y_0$: we call solution of the registration problem the most regular transformation, i.e. the one that minimizes its “kinetic” energy:
$$\frac{1}{2} \int_{t=0}^{1} \|v_t\|^2_{G_{c_t}} = \frac{1}{2} \int_{t=0}^{1} m_t^\top \cdot G_{c_t} \cdot m_t \tag{3}$$
where, for all $t \in [0, 1]$, $G_{c_t}$ is the $p \times p$ “kernel” symmetric positive-definite matrix of general term $g[c_t^{(k)}, c_t^{(l)}]$, and $(.)^\top$ is the matrix transposition. Such energy-minimizing curves, also called geodesics, are such that the trajectories of the control points and of the momentum vectors are fully determined by their initial values and the following Hamiltonian equations [46]:
$$\dot{c}_t = G_{c_t} \cdot m_t \; ; \quad \dot{m}_t = -\frac{1}{2} \nabla_{c_t} \big\{ m_t^\top \cdot G_{c_t} \cdot m_t \big\} \tag{4}$$
where $\nabla_x(.)$ is the gradient operator with respect to $x$. Assuming that there exists a diffeomorphism $\phi_1$ constructed according to equations (1, 2) such that $\phi_1 \star y_0 = y$, this last result allows the positioning of $y$ with respect to $y_0$ to be compactly represented by a set of $p$ control points $c_0$ and attached momenta $m_0$. In other words, $m_0$ is the coordinate of $y$ in the coordinate system defined by $(c_0, y_0)$. In practice, the perfect registration constraint $\phi_1 \star y_0 = y$ is relaxed, and we call solution to the registration problem the extremal-path diffeomorphism $\phi_1$ that warps $y_0$ as close as possible to $y$, for some extrinsic error measure $d_{\mathcal{E}}(y_0, y)$. In this paper, the following choices are considered for $d_{\mathcal{E}}$, depending on the nature of $y_0$ and $y$:
- the $\ell^2$ metric for meshes with point-to-point correspondence (i.e. the sum of squared differences between point positions),
- the current metric [14, 59] for oriented surface meshes without point-to-point correspondence (details are given in appendix for the reader’s convenience).
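To make equations (2) and (3) concrete, the following minimal NumPy sketch evaluates the Gaussian kernel, the convolved velocity field and the kinetic energy integrand for a given set of control points and momenta. The function names are hypothetical and are not taken from Deformetrica. The sketch also assumes that momenta are stored as a $p \times d$ array, so that $m^\top \cdot G_c \cdot m$ is read as a sum over the $d$ coordinates.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Pairwise kernel g(x, y) = exp(-|x - y|^2 / sigma^2).

    x has shape (n, d), y has shape (m, d); the result has shape (n, m).
    """
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / sigma ** 2)

def velocity(points, control_points, momenta, sigma):
    """Equation (2): v(x) = sum_k g(c_k, x) m_k, evaluated at each row of `points`."""
    return gaussian_kernel(points, control_points, sigma) @ momenta

def kinetic_energy(control_points, momenta, sigma):
    """Integrand of equation (3): 0.5 * m^T G_c m (assumed summed over coordinates)."""
    G = gaussian_kernel(control_points, control_points, sigma)
    return 0.5 * np.sum(momenta * (G @ momenta))

# Toy configuration: two control points in the plane with opposite momenta.
c = np.array([[0.0, 0.0], [1.0, 0.0]])
m = np.array([[0.0, 1.0], [0.0, -1.0]])
v_at_control_points = velocity(c, c, m, sigma=1.0)
energy = kinetic_energy(c, m, sigma=1.0)
```

Because the kernel decays with distance, points far from all control points are left almost unmoved by the velocity field, which is what localizes the deformation around the control points.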
Noting $y_0$ as a collection $y_0^{(1)}, \ldots, y_0^{(K)}$ of $K$ points of $\mathbb{R}^d$, $\phi_1$ acts independently and directly on each point $y_0^{(k)}$ according to $\phi_1 \star y_0^{(k)} = \phi_1\big(y_0^{(k)}\big)$. Note that the methodology introduced in this section can be adapted straightforwardly to image data, by defining the action of the diffeomorphisms $\phi$ on the image $I$ as $I \circ \phi^{-1}$ and using the sum of squared differences between image intensities as the error measure.
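As an illustration of how a point-set shape can be deformed in this setting, the sketch below integrates the Hamiltonian equations (4) together with the flow equation (1) using an explicit Euler scheme, then evaluates the $\ell^2$ error measure. It is a simplified sketch under stated assumptions: the function names are hypothetical, the Euler scheme and step count are arbitrary choices, and the gradient in (4) is written out analytically for the Gaussian kernel $\exp(-\|x' - x\|^2/\sigma^2)$.

```python
import numpy as np

def kernel(x, y, sigma):
    # Gaussian kernel exp(-|x - y|^2 / sigma^2) between the rows of x and y.
    d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / sigma ** 2)

def hamiltonian(c, m, sigma):
    # H(c, m) = 0.5 * m^T G_c m, with momenta stored as a (p, d) array.
    return 0.5 * np.sum(m * (kernel(c, c, sigma) @ m))

def hamiltonian_rhs(c, m, sigma):
    # Right-hand side of equation (4): c_dot = G_c m and m_dot = -grad_c H.
    G = kernel(c, c, sigma)
    c_dot = G @ m
    diff = c[:, None, :] - c[None, :, :]          # diff[k, l] = c_k - c_l
    weights = G * (m @ m.T)                       # g(c_k, c_l) * <m_k, m_l>
    m_dot = (2.0 / sigma ** 2) * np.sum(weights[:, :, None] * diff, axis=1)
    return c_dot, m_dot

def shoot_and_flow(c0, m0, y0, sigma, n_steps=200):
    # Explicit Euler integration over [0, 1] of (4) for (c_t, m_t)
    # and of (1) for the shape points, which follow dy/dt = v_t(y).
    c, m, y = c0.astype(float), m0.astype(float), y0.astype(float)
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        y = y + dt * (kernel(y, c, sigma) @ m)
        c_dot, m_dot = hamiltonian_rhs(c, m, sigma)
        c, m = c + dt * c_dot, m + dt * m_dot
    return c, m, y

def l2_error(deformed, target):
    # Sum of squared differences between corresponding point positions.
    return float(np.sum((deformed - target) ** 2))
```

With a single control point the momentum stays constant and the point travels along a straight line, which gives a quick sanity check; with several control points the Hamiltonian should stay approximately constant along the integrated path, up to the discretization error of the Euler scheme.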
2.2 Riemannian structure
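Let $c_0$ be a set of $p$ control points. We define:
$$\mathcal{D}_{c_0} = \Big\{ \phi_1 \;\Big|\; \partial_t \phi_t = v_t \circ \phi_t,\ \phi_0 = \mathrm{Id},\ v_t = \mathrm{Conv}(c_t, m_t),\ (\dot{c}_t, \dot{m}_t) = \mathrm{Ham}(c_t, m_t),\ m_0 \in \mathbb{R}^{p \times d} \Big\} \tag{5}$$
where $\mathrm{Conv}(., .)$ and $\mathrm{Ham}(., .)$ are compact notations for the convolution operator defined by equation (2) and the Hamiltonian equations (4) respectively. Equipped at any $\phi \in \mathcal{D}_{c_0}$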
Fig. 1: [Left] Shape spatiotemporal reference frame $y_0, c_0, m_0, t_0$ with respect to which a shape $y$ admits coordinates $t \in \mathbb{R}$, $v \in v_0^\bot \subset T_{\gamma(t_0)}\mathcal{D}_{c_0}$. Three spaces are involved: the manifold of control points $\mathbb{R}^{p \times d}$ (top), the manifold of diffeomorphisms $\mathcal{D}_{c_0}$ (middle), and the shape submanifold $\mathcal{S}_{y_0, c_0}$ of the extrinsic shape space $\mathcal{E}$ (bottom). The momenta $m_0, m$ and the velocity fields $v_0, v$ are in one-to-one correspondence. The velocity field $v$, also called space-shift, is parallel-transported along the geodesic $\gamma$ by the operator $t \to P_\gamma^v(t)$. Figure 2 illustrates the effect of parallel transport on $\mathcal{D}_{c_0}$.
[Right] Illustrations of the manifolds abstractly depicted on the left side of the figure. Each row displays two example elements of the corresponding geodesic (solid black lines on the left panel). The two columns correspond respectively to the times $t_0$ and $t$.
Fig. 2: [Bottom] Illustration of a shape geodesic $t \to \gamma(t) \star y_0$: the man-like shape (solid black contour) raises his left arm. This geodesic is parametrized by a single set of control points $c_0$ (black dots) and attached momentum vectors $m_0$ (bold blue arrows), to which corresponds the velocity field $v_0$ (light blue arrows). A second set of momentum vectors $m$ (bold red arrows) attached to the same control points $c_0$ parametrizes the exp-parallelization of this shape geodesic.
[Top] Exp-parallel shape curve $t \to \eta(t) \star y_0$ to the shape geodesic $\gamma \star y_0$: the exp-parallelization transfers the arm-raising motion from one man-like shape to another.
with the local metric $G^{-1}_{\phi(c_0)}$, $\mathcal{D}_{c_0}$ has the structure of a Riemannian manifold of dimension $p \times d$. The tangent-space at $\phi$ is the set of velocity fields obtained by convolving any set of momenta on $\phi(c_0)$:
$$T_\phi \mathcal{D}_{c_0} = \big\{ \mathrm{Conv}(\phi(c_0), m) \mid m \in \mathbb{R}^{p \times d} \big\}. \tag{6}$$
The geodesics of $\mathcal{D}_{c_0}$ are the curves $t \to \phi_t$ of constant kinetic energy (see equation (3)) i.e. such that the corresponding control points and momenta trajectories $t \to c_t, m_t$ satisfy the Hamiltonian equations (4). We define the exponential operator on $\mathcal{D}_{c_0}$:
$$\mathrm{Exp}^{t_0, t}_{\phi} : v_0 \in T_\phi \mathcal{D}_{c_0} \to \phi_t \in \mathcal{D}_{c_0} \tag{7}$$
where $\phi_t$ is the diffeomorphism reached at time $t$ by the geodesic path obtained by integration from some reference time $t_0 \in \mathbb{R}$ with initial conditions $\phi(c_0)$, $m_0$ such that $v_0 = \mathrm{Conv}(\phi(c_0), m_0)$, and $\phi_0 = \phi$. The momentum vector $m_0$ is the dual of the velocity field $v_0$. The particular case $\mathrm{Exp}^{0,1}_{\phi}$ corresponds to the usual Riemannian exponential map and will be noted $\mathrm{Exp}_\phi$. Diffeomorphisms $\phi \in \mathcal{D}_{c_0}$ act on shapes $y$ of the ambient space through the action $\star$ previously defined. Let $y_0$ be a reference shape. We define its orbit under the action $\star$:
$$\mathcal{S}_{y_0, c_0} = \mathcal{D}_{c_0} \star y_0 = \{ \phi \star y_0 \mid \phi \in \mathcal{D}_{c_0} \}. \tag{8}$$
$\mathcal{S}_{y_0, c_0}$ is a submanifold of the extrinsic shape space $\mathcal{E}$, on which the distance $d_{\mathcal{E}}$ is defined.
2.3 Positioning a shape with respect to a dynamic atlas
Instead of positioning shapes with respect to a static atlas $y_0$, we now aim to position shapes with respect to a shape geodesic $t \to \gamma(t) \star y_0$, where $\gamma$ is a geodesic of $\mathcal{D}_{c_0}$ of the form $\gamma(t) = \mathrm{Exp}^{t_0, t}_{\mathrm{Id}}(v_0)$ with $v_0 = \mathrm{Conv}(c_0, m_0)$. Similarly to cylindrical coordinates in Euclidean spaces, under some conditions (see [30, 43]) a shape $y \in \mathcal{S}_{y_0, c_0}$ admits a unique spatiotemporal coordinate, also known as Fermi coordinates, $t \in \mathbb{R}$ and $v \in T_{\gamma(t_0)}\mathcal{D}_{c_0}$ such that $v \perp \dot{\gamma}(t_0)$:
$$y = \mathrm{ExpP}_{\gamma}^{v}(t) \star y_0 \quad \text{with} \quad \mathrm{ExpP}_{\gamma}^{v}(t) = \mathrm{Exp}_{\gamma(t)}\big(P_{\gamma}^{v}(t)\big) \tag{9}$$
where $P_{\gamma}^{v}(t)$ denotes the parallel transport of $v$ along $\gamma$ from $t_0$ to $t$. The curves $\gamma$ and $\eta : t \to \mathrm{ExpP}_{\gamma}^{v}(t)$ are said to be exp-parallel, and the mapping $\gamma \to \eta$ is called exp-parallelization along $v$ [52, 53]. In other words, a choice of $y_0, c_0, m_0, t_0$ defines a spatiotemporal reference frame, with respect to which a shape $y$ can be unambiguously positioned in terms of a time $t$ and a velocity field $v$ orthogonal to $v_0 = \dot{\gamma}(t_0)$. The time $t$ is the temporal component of the coordinate, which positions the shape along the reference trajectory given by the direction $v_0$. The velocity $v$ is the spatial component of the coordinate, which positions the shape in the hyperplane that is orthogonal to $v_0$. This decomposition can also be understood as the orthogonal projection of $y$ onto the one-dimensional submanifold $\gamma \star y_0$, hence the condition $v \perp v_0$.
Figures 1 and 2 illustrate this concept of spatiotemporal reference frame, in which any shape $y$ admits the coordinates $t, v$. Note that the time-point $t_0$ does not play any particular role, in the sense that $y$ can be described in the same manner for any other choice $t_0'$; a one-to-one transformation of the spatiotemporal reference frame can actually be derived as:
$$t' = t + t_0' - t_0 \quad \text{and} \quad v' = P_{\gamma}^{v}(t_0'). \tag{10}$$
In general, the target shape $y$ might not exactly belong to $\mathcal{S}_{y_0, c_0}$. Similarly to the static atlas case, equation (9) is relaxed and we call solution to the longitudinal registration problem the pair $t, v$ such that $y_0$ is warped as close as possible to $y$, in the sense of the extrinsic metric $d_{\mathcal{E}}$.
3 Statistical model for longitudinal data sets of shapes
3.1 Hierarchical generative model
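Let $\{y_{i,j}, t_{i,j}\}_{i,j}$ be a longitudinal data set of shapes, which is the collection of repeated individual measurements $y_{i,1}, \ldots, y_{i,n_i}$ for $i = 1, \ldots, n$, where each shape $y_{i,j}$ corresponds to a time $t_{i,j} \in \mathbb{R}$. Measurements are considered as sample points along individual trajectories, which are in turn considered exp-parallel to a reference geodesic curve, therefore having a constant spatial coordinate in the spatiotemporal reference frame centered around this reference geodesic. Noting $y_0$, $c_0$, $m_0$, $t_0$ the parameters of the spatiotemporal coordinate system, $v_0 = \mathrm{Conv}(c_0, m_0)$ and $\gamma : t \to \mathrm{Exp}^{t_0, t}_{c_0}(v_0)$ the reference geodesic, the statistical model writes:
$$\mathrm{ExpP}_{\gamma}^{v_i} \circ \psi_i(t_{i,j}) \star y_0 \overset{\mathrm{iid}}{\sim} \mathcal{N}_{\mathcal{E}}\big(y_{i,j}, \sigma_\epsilon^2\big),$$
where
$$\psi_i : t \to \alpha_i \cdot (t - \tau_i) + t_0, \qquad v_i = \mathrm{Conv}(c_0, m_i), \qquad m_i = A_{0, m_0^\bot} \cdot s_i,$$
and
$$\alpha_i \overset{\mathrm{iid}}{\sim} \mathcal{N}_{[0, +\infty[}(1, \sigma_\alpha^2), \qquad \tau_i \overset{\mathrm{iid}}{\sim} \mathcal{N}(t_0, \sigma_\tau^2), \qquad s_i \overset{\mathrm{iid}}{\sim} \mathcal{N}(0, 1) \tag{11}$$
where the noise distribution $\mathcal{N}_{\mathcal{E}}(\mu, \sigma_\epsilon^2)$ is defined such that the likelihood satisfies $p(y) \propto \exp\big(-d_{\mathcal{E}}(y, \mu)^2 / 2\sigma_\epsilon^2\big)$. Model (11) is hierarchical in the sense that individual trajectories $t \to \mathrm{ExpP}_{\gamma}^{v_i} \circ \psi_i(t)$ are independently defined as spatiotemporal transformations of a common, population-level geodesic $t \to \gamma(t)$.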
The time-warp functions $\psi_i$ encode the temporal variability of the observed individual trajectories in terms of pace of progression $\alpha_i$ and onset time $\tau_i$. They map the index $t_{i,j}$ of the $j$-th shape of the $i$-th individual (e.g. the age of the subject at a given visit), to a time-point $\psi_i(t_{i,j})$ on the reference geodesic (e.g. the disease stage).
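A short numerical example may help to read the affine time-warp $\psi_i(t) = \alpha_i \cdot (t - \tau_i) + t_0$ of model (11). The values below are purely illustrative and the function name is hypothetical: a subject with $\alpha_i = 2$ progresses twice as fast as the reference scenario, and reaches the reference stage $t_0$ at age $\tau_i$.

```python
import numpy as np

def time_warp(t, alpha, tau, t0):
    """Affine time-warp of model (11): psi(t) = alpha * (t - tau) + t0."""
    return alpha * (np.asarray(t, dtype=float) - tau) + t0

# Illustrative values only: visits at ages 70, 71 and 72 for a fast progressor.
ages = [70.0, 71.0, 72.0]
stages = time_warp(ages, alpha=2.0, tau=70.0, t0=72.0)
# One year of follow-up covers two years of the reference scenario:
# stages == [72., 74., 76.]
```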
The spatial variability is encoded by the space-shifts $v_i \in v_0^\bot \subset T_{\gamma(t_0)}\mathcal{D}_{c_0}$ along which $\gamma$ is exp-parallelized. Those space-shifts admit dual representations under the form of the momenta $m_i$, which are assumed to derive from $q$ source parameters $s_i = s_i^{(1)}, \ldots, s_i^{(q)}$, in the spirit of independent component analysis (ICA) [31]. The orthogonality $v_i \perp v_0$, necessary for the identifiability of the model, is ensured by the projection of each column of the $(p \cdot d) \times q$ mixing matrix $A_0$ onto the hyperplane $m_0^\bot$ of $\mathbb{R}^{p \times d}$ for the cometric $G_{c_0}$, noted $A_{0, m_0^\bot}$. The individual parameters are modeled as independent samples from normal distributions:
- a truncated normal distribution with fixed mean for the acceleration factor $\alpha_i$, allowing the identifiability of $m_0$;
- a normal distribution for the onset time $\tau_i$;
- a normal distribution with fixed mean and variance for the sources $s_i$, allowing the identifiability of $y_0$ and $A_{0, m_0^\bot}$ respectively.
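The projection of the mixing matrix onto $m_0^\bot$ can be sketched as a Gram-Schmidt step for the inner product $\langle a, b \rangle = a^\top \cdot G_{c_0} \cdot b$ between momenta. The code below is an illustrative sketch, not the reference implementation: the function names are hypothetical, and it assumes that each column of $A_0$ is a row-major flattening of a $p \times d$ array of momenta.

```python
import numpy as np

def gaussian_kernel_matrix(c, sigma):
    # Kernel matrix G_c of general term exp(-|c_k - c_l|^2 / sigma^2).
    d2 = np.sum((c[:, None, :] - c[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / sigma ** 2)

def cometric_inner(a, b, G):
    # Inner product <a, b> = a^T G b for momenta stored as (p, d) arrays.
    return np.sum(a * (G @ b))

def project_mixing_matrix(A0, m0, G):
    """Project each column of A0 (shape (p*d, q)) onto the hyperplane orthogonal to m0."""
    p, d = m0.shape
    m0_sq = cometric_inner(m0, m0, G)
    A_perp = np.empty_like(A0, dtype=float)
    for j in range(A0.shape[1]):
        a = A0[:, j].reshape(p, d)                      # assumed row-major layout
        a = a - (cometric_inner(a, m0, G) / m0_sq) * m0
        A_perp[:, j] = a.ravel()
    return A_perp
```

By linearity, any momentum $m_i = A_{0, m_0^\bot} \cdot s_i$ built from the projected columns is then orthogonal to $m_0$ for this inner product, whatever the sources $s_i$.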
These parameters define individual trajectories as random spatiotemporal transformations of a common reference trajectory. The spatial and temporal transformations commute, in the sense that $\forall t \in \mathbb{R}$, $\mathrm{ExpP}_{\gamma}^{v_i} \circ \psi_i = \mathrm{ExpP}_{\gamma \circ \psi_i}^{v_i}$. The population trajectory is fully parameterized by the template shape $y_0$, the control points $c_0$, the momenta $m_0$ and the reference time $t_0$. The individual variability is unambiguously represented by two reduced sets of scalar parameters: the acceleration $\alpha_i$ and the onset time $\tau_i$ for the temporal part, and the sources $s_i^{(1)}, \ldots, s_i^{(q)}$ for the spatial part. In practice, it is possible to choose a number of sources $q \ll p \times d$ much lower than the dimension of the tangent-space $T_{\gamma(t_0)}\mathcal{D}_{c_0}$ while still capturing most of the geometrical variability in the data.
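The structure of model (11) is easiest to see in a flat toy setting. The sketch below replaces the manifold of diffeomorphisms with a Euclidean space, which is a strong simplifying assumption: geodesics become straight lines $\gamma(t) = y_0 + (t - t_0) \cdot v_0$, parallel transport is the identity, and the exp-parallel curve is simply $\gamma(t) + v_i$. The extrinsic noise is replaced with i.i.d. Gaussian noise, the truncated normal is sampled by rejection, and all names are hypothetical.

```python
import numpy as np

def sample_individual(rng, t0, sigma_alpha, sigma_tau, q):
    # Random effects of model (11): alpha ~ N(1, sigma_alpha^2) truncated to [0, +inf),
    # tau ~ N(t0, sigma_tau^2), s ~ N(0, I_q).
    alpha = -1.0
    while alpha < 0.0:                      # rejection sampling for the truncation
        alpha = rng.normal(1.0, sigma_alpha)
    tau = rng.normal(t0, sigma_tau)
    s = rng.standard_normal(q)
    return alpha, tau, s

def simulate_flat(rng, y0, v0, A_perp, t0, times, sigma_alpha, sigma_tau, sigma_eps):
    """Simulate one individual sequence in the flat (Euclidean) toy version of model (11)."""
    alpha, tau, s = sample_individual(rng, t0, sigma_alpha, sigma_tau, A_perp.shape[1])
    psi = alpha * (times - tau) + t0        # time-warp psi_i(t_ij)
    space_shift = A_perp @ s                # flat analogue of the space-shift v_i
    clean = y0 + (psi[:, None] - t0) * v0 + space_shift
    noisy = clean + sigma_eps * rng.standard_normal(clean.shape)
    return noisy, (alpha, tau, s)
```

In this flat setting the space-shift is a constant offset added to every observation of an individual, while the time-warp only changes where along the line the observations fall, which mirrors the separation of spatial and temporal variability described above.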
3.2 Mixed-effects and Bayesian modeling
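We further specify the formulation of the model (11) to fit the framework of mixed-effects models. We distinguish:
- the fixed effects $\theta = (\theta_1, \theta_2)$ with $\theta_1 = (t_0, \sigma_\tau, \sigma_\alpha, \sigma_\epsilon)$ and $\theta_2 = (y_0, c_0, m_0, A_0)$, also called the model parameters,
- the random effects $z = (z_i)_i$ where $z_i = (\alpha_i, \tau_i, s_i)$.
We choose to work in a Bayesian framework, in order to theoretically ensure the existence of the maximum a posteriori (MAP) estimate of the parameters $\theta^m$. Priors on the model parameters also regularize and guide the estimation procedure through reasonable and mild assumptions on the optimal fixed-effects values. The following standard conjugate distributions are selected as Bayesian priors on the model parameters:
$$\begin{aligned}
t_0 &\sim \mathcal{N}(\overline{t_0}, \varsigma_t^2), & y_0 &\sim \mathcal{N}(\overline{y_0}, \varsigma_y^2), \\
\sigma_\tau^2 &\sim \mathcal{IG}(m_\tau, \varsigma_\tau^2), & c_0 &\sim \mathcal{N}(\overline{c_0}, \varsigma_c^2), \\
\sigma_\alpha^2 &\sim \mathcal{IG}(m_\alpha, \varsigma_\alpha^2), & m_0 &\sim \mathcal{N}(\overline{m_0}, \varsigma_m^2), \\
\sigma_\epsilon^2 &\sim \mathcal{IG}(m_\epsilon, \varsigma_\epsilon^2), & A_0 &\sim \mathcal{N}(\overline{A_0}, \varsigma_A^2),
\end{aligned}$$
where $\mathcal{IG}(., .)$ denotes the inverse-gamma distribution.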
4 Algorithms: calibration, personalization, simulation
4.1 Objectives
Given a longitudinal data set of shapes $\{y_{i,j}, t_{i,j}\}_{i,j}$ that we may note more compactly $\{y, t\}$, we formulate three algorithmic objectives:
- Calibration, which consists in computing the MAP parameters $\theta^m$, unconditionally to any random effect $z$:
$$\theta^m = \mathrm{argmax}_{\theta} \int p\big(\{y\}, z, \theta ; \{t\}\big) \cdot dz. \tag{12}$$
- Personalization, which consists in computing the MAP random effects $z^m$ that best represent some longitudinal shape data set $\{y, t\}$ (which may or may not be the one used for calibration), given the calibrated model $\theta^m$:
$$z^m = \mathrm{argmax}_{z} \, p\big(\{y\}, z, \theta^m ; \{t\}\big). \tag{13}$$
- Simulation, which consists in generating a new data set $\{y^s\}$ that resembles the original data set $\{y\}$.
We now give the details of the algorithms that address these objectives. Their implementation is freely available in the software Deformetrica (find the install instructions and the documentation at www.deformetrica.org).
4.2 Computation of the complete log-likelihood
Evaluating the joint log-likelihood $\log p(\{y\}, z, \theta ; \{t\}) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \log p(y_{i,j}, z_i, \theta ; t_{i,j})$ for some set of parameters $\theta$ and random effects $z = (z_i)_i$ is central for both the calibration and personalization algorithms. The most computationally intensive part is the computation of the conditional log-likelihood $\log p(y_{i,j} | z_i, \theta ; t_{i,j})$, which amounts to synthesizing the candidate data for the current values of the fixed and random effects $(\theta, z_i)$ and measuring its discrepancy with the true observation $y_{i,j}$. The synthesis of the data follows the generative model introduced in Section 2 and essentially requires the integration of ordinary differential equations. Algorithm 1 details the procedure, where $|E|$ denotes the dimension of the extrinsic shape space $E$, and $F(.)$ the cumulative distribution function of the standard Gaussian. The “source index” refers to the ICA components.
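As a rough illustration of the bookkeeping in Algorithm 1, the sketch below accumulates the noise-model and random-effects terms of $Q$ from squared residuals that are assumed to be already computed. It does not reproduce the geodesic shooting or the parallel transport, and it omits the Bayesian prior terms. All function and argument names are hypothetical, and reading the truncation term as $2 \log(1 - F(-1/\sigma_\alpha))$ is an assumption.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function F of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def time_warp(t, alpha, tau, t0):
    """Time-warped age psi = alpha * (t - tau) + t0."""
    return alpha * (t - tau) + t0


def log_likelihood_terms(sq_residuals, alphas, taus, sources,
                         t0, sigma_tau, sigma_alpha, sigma_eps, dim_e):
    """Noise-model and random-effects contributions to Q (priors omitted).

    sq_residuals[i][j] is the squared residual of visit j of subject i,
    assumed to be computed beforehand by the shape-synthesis pipeline.
    """
    q = 0.0
    # Normalization of the Gaussian on alpha truncated to positive values
    # (assumption: the term enters Q as 2 * log(1 - F(-1 / sigma_alpha))).
    trunc = 1.0 - std_normal_cdf(-1.0 / sigma_alpha)
    for i, residuals_i in enumerate(sq_residuals):
        for r2 in residuals_i:
            # model log-likelihood of one visit
            q -= 0.5 * (dim_e * math.log(sigma_eps ** 2) + r2 / sigma_eps ** 2)
        # random-effects log-likelihood of one subject
        q -= 0.5 * (math.log(sigma_tau ** 2)
                    + (taus[i] - t0) ** 2 / sigma_tau ** 2
                    + sum(s * s for s in sources[i])
                    + math.log(sigma_alpha ** 2)
                    + 2.0 * math.log(trunc)
                    + (alphas[i] - 1.0) ** 2 / sigma_alpha ** 2)
    return q
```

In a full implementation, each squared residual would come from deforming the template with the diffeomorphism evaluated at `time_warp(t, alpha, tau, t0)`.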
4.3 Calibration
4.3.1 Initialization procedure for model calibration
A good choice of initial parameters θ[0] and latent variables
z[0] improves the convergence speed of the calibration al-
gorithm. We propose in this section an initialization proce-
dure that combines several elementary shape analysis tools.
Given a longitudinal data set of shapes {yi,j, ti,j }i,j :
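Algorithm 1: Compute the complete log-likelihood.
input : Longitudinal data set of shapes $\{y, t\} = \{y_{i,j}, t_{i,j}\}_{i,j}$.
        Population parameters $\theta = (y_0, c_0, m_0, A_0, t_0, \sigma_\tau, \sigma_\alpha, \sigma_\epsilon)$.
        Individual parameters $z = (z_i)_i$ with $z_i = (\alpha_i, \tau_i, s_i)$.
output: The complete log-likelihood $Q = \log p(\{y\}, z, \theta ; \{t\})$.
Set $Q = 0$. // initialization
/* compute the squared residuals $\epsilon_{i,j}^2$ for each visit */
Compute the initial velocity field $v_0 = \mathrm{Conv}(c_0, m_0)$.
Compute the geodesic $\gamma : t \to \mathrm{Exp}_{\mathrm{Id}}^{t_0, t}(v_0)$. // see [22]
for the source index $l = 1$ to $q$
    Compute the $l$-th column of $A_{0, m_0^\perp}$, projecting $\mathrm{Col}_l(A_0)$ on $m_0^\perp$.
    Compute the initial velocity field $w_l = \mathrm{Conv}\big(c_0, \mathrm{Col}_l(A_{0, m_0^\perp})\big)$.
    Compute the parallel transport $w_l : t \to P_\gamma^{w_l}(t)$. // see [41]
end
for the individual index $i = 1$ to $n$
    for the visit index $j = 1$ to $n_i$
        Compute the time-warped age $\psi_{i,j} = \alpha_i \cdot (t_{i,j} - \tau_i) + t_0$.
        Compute the initial velocity field $v_{i,j} = \sum_{l=1}^{q} s_i^{(l)} \cdot w_l(\psi_{i,j})$.
        Compute $\phi_{i,j} = \mathrm{Exp}_{\gamma(\psi_{i,j})}(v_{i,j}) \circ \gamma(\psi_{i,j})$. // see [22]
        Compute the squared residual $\epsilon_{i,j}^2 = d_E(y_{i,j}, \phi_{i,j} \star y_0)^2$.
        /* add the model log-likelihood $\log p(y_{i,j} | z_i, \theta ; t_{i,j})$ */
        Update $Q \leftarrow Q - \frac{1}{2} \big[ |E| \cdot \log \sigma_\epsilon^2 + \epsilon_{i,j}^2 / \sigma_\epsilon^2 \big]$.
    end
    /* add the random effects log-likelihood $\log p(z_i | \theta)$ */
    Update $Q \leftarrow Q - \frac{1}{2} \big[ \log \sigma_\tau^2 + (\tau_i - t_0)^2 / \sigma_\tau^2 + \|s_i\|_{\ell^2}^2 + \log \sigma_\alpha^2 + \log\big(1 - F(-1/\sigma_\alpha)\big)^2 + (\alpha_i - 1)^2 / \sigma_\alpha^2 \big]$.
end
/* add the Bayesian prior log-likelihood $\log p(\theta)$ */
Update $Q \leftarrow Q - \frac{1}{2} \big[ (t_0 - \overline{t_0})^2 / \varsigma_t^2 + m_\tau (\log \sigma_\tau^2 + \varsigma_\tau^2 / \sigma_\tau^2) + m_\alpha (\log \sigma_\alpha^2 + \varsigma_\alpha^2 / \sigma_\alpha^2) + m_\epsilon (\log \sigma_\epsilon^2 + \varsigma_\epsilon^2 / \sigma_\epsilon^2) \big]$. // $\log p(\theta_1)$
Update $Q \leftarrow Q - \frac{1}{2} \big[ \|y_0 - \overline{y_0}\|_{\ell^2}^2 / \varsigma_y^2 + \|c_0 - \overline{c_0}\|_{\ell^2}^2 / \varsigma_c^2 + \|m_0 - \overline{m_0}\|_{\ell^2}^2 / \varsigma_m^2 + \|A_0 - \overline{A_0}\|_{\ell^2}^2 / \varsigma_A^2 \big]$. // $\log p(\theta_2)$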
1. estimate a Bayesian atlas model (see [28]) from the baseline shapes $\{y_{i,1}\}_i$, to get an approximate population-level average geometry $(y_0', c_0')$ as well as $n$ space-shift momenta $m_i'$ mapping this geometry to the baseline observations, and an estimate of the noise level $\sigma_\epsilon'$;
2. for $i = 1, ..., n$, estimate a geodesic regression model (see [26]) from the individual time-series $\{y_{i,j}\}_j$, then parallel transport (see [41]) the computed individual initial momenta back to the mean geometry $(y_0', c_0')$ along the corresponding space-shift $m_i'$, and finally compute the Euclidean average of those $w_i'$ to get an approximate population-level mean momenta $m_0' = \langle w_i' \rangle_i$;
3. for $i = 1, ..., n$, initialize the individual temporal parameters with $\tau_i = \langle t_{i,j} \rangle_j$ and $\alpha_i^2 = \frac{w_i' \cdot G_{c_0'} \cdot m_0'}{m_0' \cdot G_{c_0'} \cdot m_0'}$ if this value is positive and $\alpha_i = 1$ otherwise, then compute $\sigma_\tau'$ and $\sigma_\alpha'$ according to equations (16) and (17) respectively;
4. solve a standard ICA problem with $q$ components from the collection of space-shift momenta $w'_{i, m_0'^{\perp}}$ preliminarily projected on the orthogonal space to $m_0'$, and set $A_0'$ as the estimated mixing matrix;
8 Alexandre Bône, Olivier Colliot, and Stanley Durrleman
5. shoot forward the mean geometry $(y_0', c_0')$ in the direction $m_0'$ with length $t_0'' - t_0'$, where $t_0' = \langle t_{i,1} \rangle_i$ and $t_0'' = \langle t_{i,j} \rangle_{i,j}$, to get longitudinally centered estimates $c_0''$, $y_0''$, $m_0''$, and parallel-transport the $q$ columns of $A_0'$ along the same geodesic to obtain $A_0''$;
6. personalize the model given by the initial parameters $\theta^{[0]} = (y_0'', c_0'', m_0'', A_0'', t_0'', \sigma_\tau', \sigma_\alpha', \sigma_\epsilon')$ to obtain $z^{[0]}$.
4.3.2 The MCMC-SAEM-GD algorithm
Calibration is a computationally intensive task for two main reasons. First, the optimized variable $\theta$ is of high dimension $|\theta| = 4 + |y_0| + d \cdot p \cdot (2 + q)$, where $d$ is the dimension of the ambient space, $p$ the number of control points, $q$ the number of sources, and $|y_0|$ the number of vertices necessary to describe the template mesh. Second, the optimized function requires the computation of the integral over the latent variables. The term $p(\{y\}, z, \theta ; \{t\})$ can only be evaluated for some given random-effect values $z$, by solving sets of ordinary differential equations (see Algorithm 1). In this paper, we propose to address this computational challenge by combining the MCMC-SAEM algorithm with gradient descent (GD). The backbone of this algorithm is the SAEM algorithm [18], which is a stochastic approximation (SA) of the classical expectation-maximization (EM) algorithm [19]: it alternates a stochastic simulation step $z^{[k]} \sim p(z | \{y\}, \theta^{[k-1]} ; \{t\})$ of the latent variables with a deterministic update of the model parameters $\theta^{[k]} \leftarrow \theta^\star(z^{[k]})$.
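Algorithm 2: Calibration with MCMC-SAEM-GD.
input : Dataset $y$. Initial parameters $\theta^{[0]}$ and $z^{[0]}$.
        Sequence of step-sizes $(\rho^{[k]})_k$. Sampling variances $(\sigma_{(b)})_b$.
output: Estimation of $\theta^m \approx \theta^{[k]}$.
Set $k = 0$ and $S_1^{[0]} = S_1(z^{[0]})$. // initialization, eq. (14)
repeat
    Set $k \leftarrow k + 1$.
    /* block Gibbs symmetric random walk sampling */
    foreach random variable $z_{(b)}$ in $(z_{(1)}, z_{(2)}, z_{(3)}) = (\alpha, \tau, s)$ do
        Draw a candidate $\tilde{z}_{(b)} \sim \mathcal{N}(z^{[k-1]}_{(b)}, \sigma_b^2)$.
        Let $\tilde{z} = (z^{[k]}_{(1)}, ..., z^{[k]}_{(b-1)}, \tilde{z}_{(b)}, z^{[k-1]}_{(b+1)}, ...)$.
        Compute the ratio $\omega = \log \frac{p(\{y\}, \tilde{z}, \theta^{[k-1]} ; \{t\})}{p(\{y\}, z^{[k-1]}, \theta^{[k-1]} ; \{t\})}$. // alg. 1
        Draw $u$ according to the uniform distribution $u \sim \mathcal{U}(0, 1)$.
        if $\log u < \omega$ then $z^{[k]}_{(b)} \leftarrow \tilde{z}_{(b)}$ else $z^{[k]}_{(b)} \leftarrow z^{[k-1]}_{(b)}$.
    end
    Adapt the proposal variances $(\sigma_{(b)})_b$. // see [5]
    /* analytical update rule for $\theta_1$ (classical SAEM) */
    Set $S_1^{[k]} \leftarrow S_1^{[k-1]} + \rho^{[k]} \cdot \big[ S_1(z^{[k]}) - S_1^{[k-1]} \big]$. // eq. (14)
    Set $\theta_1^{[k]} \leftarrow \theta_1^\star(S^{[k]})$. // eqs. (15)-(18)
    /* gradient-descent-based update heuristic for $\theta_2$ */
    Solve $\theta_2^\star = \mathrm{argmax}_{\theta_2} \, p(\{y\}, z^{[k]}, \theta_1^{[k]}, \theta_2 ; \{t\})$ by GD. // alg. 1
    Set $\theta_2^{[k]} \leftarrow \theta_2^{[k-1]} + \rho^{[k]} \cdot \big[ \theta_2^\star - \theta_2^{[k-1]} \big]$. // heuristic
until convergence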
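The sampling and stochastic-approximation steps of Algorithm 2 can be sketched on a toy problem. The code below is a minimal illustration, not the Deformetrica implementation: `log_p` stands for any callable returning the complete log-likelihood, each block is a single scalar rather than the full vector of $\alpha_i$, $\tau_i$ or $s_i$, and all names are hypothetical.

```python
import math
import random


def mh_within_gibbs_step(z, log_p, proposal_std, rng):
    """One sweep of block-wise symmetric random-walk Metropolis-Hastings.

    z            : list of block values (one scalar per block in this toy).
    log_p        : callable returning the log-density of a full state.
    proposal_std : one proposal standard deviation per block.
    """
    z = list(z)
    for b in range(len(z)):
        candidate = list(z)
        candidate[b] = rng.gauss(z[b], proposal_std[b])
        log_ratio = log_p(candidate) - log_p(z)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            z = candidate
    return z


def sa_update(previous, new_value, rho):
    """Stochastic-approximation update S <- S + rho * (S_new - S)."""
    return previous + rho * (new_value - previous)
```

With `rho = 1` the update simply replaces the previous statistic, which corresponds to the burn-in regime; smaller step-sizes progressively average over iterations.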
In [37], the authors introduce the MCMC-SAEM algorithm, where the simulation step is replaced by a Markov chain Monte-Carlo (MCMC) step while still preserving the theoretical convergence properties. In this paper, an analytical update rule $\theta^\star$ cannot be found for all the parameters $\theta$: we use a gradient descent approach to overcome this difficulty, and we call the resulting algorithm MCMC-SAEM-GD. Algorithm 2 gives a high-level pseudo-code of the proposed procedure. The sufficient statistics are:
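$$
S_t = \frac{1}{n} \sum_{i=1}^{n} \tau_i, \quad
S_\alpha = \frac{1}{n} \sum_{i=1}^{n} (\alpha_i - 1)^2, \quad
S_\tau = \frac{1}{n} \sum_{i=1}^{n} \tau_i^2, \quad
S_\epsilon = \frac{1}{|E| \cdot n \cdot \langle n_i \rangle_i} \sum_{i=1}^{n} \sum_{j=1}^{n_i} \epsilon_{i,j}^2, \tag{14}
$$
where $\epsilon_{i,j}^2 = d_E\big\{ y_{i,j}, \mathrm{ExpP}_\gamma^{v_i} \circ \psi_i(t_{i,j}) \star y_0 \big\}^2$ and $\langle n_i \rangle_i$ is the average number of longitudinal observations per subject.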
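The update rules are:
$$t_0^\star = \Big[ \varsigma_t^2 S_t + \frac{{\sigma_\tau^\star}^2}{n} \overline{t_0} \Big] \cdot \Big[ \varsigma_t^2 + \frac{{\sigma_\tau^\star}^2}{n} \Big]^{-1} \tag{15}$$
$$\sigma_\tau^\star = \Big[ S_\tau - 2 t_0^\star S_t + {t_0^\star}^2 + \frac{m_\tau}{n} \varsigma_\tau^2 \Big]^{\frac{1}{2}} \cdot \Big[ 1 + \frac{m_\tau}{n} \Big]^{-\frac{1}{2}} \tag{16}$$
$$\sigma_\alpha^\star = \Big[ S_\alpha + \frac{m_\alpha}{n} \varsigma_\alpha^2 \Big]^{\frac{1}{2}} \cdot \Big[ 1 - \frac{f(-1/\sigma_\alpha^\star) / \sigma_\alpha^\star}{1 - F(-1/\sigma_\alpha^\star)} + \frac{m_\alpha}{n} \Big]^{-\frac{1}{2}} \tag{17}$$
$$\sigma_\epsilon^\star = \Big[ S_\epsilon + \frac{m_\epsilon}{|E| \, n \, \langle n_i \rangle_i} \varsigma_\epsilon^2 \Big]^{\frac{1}{2}} \cdot \Big[ 1 + \frac{m_\epsilon}{|E| \, n \, \langle n_i \rangle_i} \Big]^{-\frac{1}{2}} \tag{18}$$
where $f(.)$ is the probability density function of the standard normal distribution. Both the coupled set of equations (15)-(16) and the implicit equation (17) can easily be solved by iterative update. Equation (18) is closed-form.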
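The iterative resolution mentioned above can be sketched as a plain fixed-point iteration. This is a minimal sketch under assumptions: the function names, the fixed number of iterations and the starting points are illustrative choices and are not specified by the text.

```python
import math


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def update_t0_sigma_tau(s_t, s_tau, n, t0_bar, varsigma_t2,
                        m_tau, varsigma_tau2, n_iter=50):
    """Alternate equations (15) and (16) until (approximately) stationary."""
    sigma_tau2 = max(s_tau - s_t ** 2, 1e-12)  # start from the empirical variance
    t0 = s_t
    for _ in range(n_iter):
        w = sigma_tau2 / n
        t0 = (varsigma_t2 * s_t + w * t0_bar) / (varsigma_t2 + w)        # eq. (15)
        sigma_tau2 = ((s_tau - 2.0 * t0 * s_t + t0 ** 2
                       + (m_tau / n) * varsigma_tau2)
                      / (1.0 + m_tau / n))                               # eq. (16)
    return t0, math.sqrt(sigma_tau2)


def update_sigma_alpha(s_alpha, n, m_alpha, varsigma_alpha2, n_iter=50):
    """Fixed-point iteration on the implicit equation (17)."""
    sigma = math.sqrt(max(s_alpha, 1e-12))
    for _ in range(n_iter):
        trunc = ((std_normal_pdf(-1.0 / sigma) / sigma)
                 / (1.0 - std_normal_cdf(-1.0 / sigma)))
        sigma = math.sqrt((s_alpha + (m_alpha / n) * varsigma_alpha2)
                          / (1.0 - trunc + m_alpha / n))
    return sigma


def update_sigma_eps(s_eps, n_scalar_obs, m_eps, varsigma_eps2):
    """Closed-form equation (18); n_scalar_obs = |E| * n * <n_i>."""
    r = m_eps / n_scalar_obs
    return math.sqrt((s_eps + r * varsigma_eps2) / (1.0 + r))
```

When the priors are made uninformative (vanishing $m$ values and a very large $\varsigma_t^2$), these updates reduce to the empirical mean and standard deviations, which gives a quick sanity check.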
4.3.3 Implementation details
The sequence of step-sizes $\rho^{[k]}$ required by Algorithm 2 is chosen to be constantly equal to 1 in a preliminary “burn-in” phase of the calibration procedure, and then decreases with the iterations with an exponential decay. The fanning numerical scheme is used to compute the parallel transport along geodesics in a scalable manner [41, 42, 62]. A block Metropolis-Hastings-within-Gibbs approach is used for the MCMC sampling step, where the variables $\alpha_i$, $\tau_i$ and $s_i$ are sampled successively. Several transition kernels can be chained in order to decrease the correlation between $z^{[k-1]}$ and $z^{[k]}$. Proposal variances are dynamically adapted during the iterations to ensure that the acceptance rates remain close to 30% [5]. The optimization problem for the update of $\theta_2$ is solved by steepest gradient descent. The gradients of the complete log-likelihood are obtained by autodifferentiation using the PyTorch library. The PyKeOps library [13] implements smart autodifferentiation methods for convolution-intensive computations to avoid memory overflows.
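The step-size schedule and the adaptation of the proposal variances can be illustrated as follows. This is only a sketch: the decay rate, the multiplicative adaptation rule and its gain are assumptions made for illustration, since the text only specifies a burn-in phase followed by an exponential decay and a target acceptance rate of about 30%.

```python
def step_size(k, burn_in, decay=0.95):
    """rho[k] = 1 during burn-in, then exponential decay (rate is hypothetical)."""
    if k <= burn_in:
        return 1.0
    return decay ** (k - burn_in)


def adapt_proposal_std(std, acceptance_rate, target=0.30, gain=0.1):
    """Widen the proposal when too many candidates are accepted, shrink it otherwise.

    Simple multiplicative heuristic, not necessarily the rule of reference [5].
    """
    return std * (1.0 + gain * (acceptance_rate - target))
```

For instance, with `burn_in=100` the first hundred iterations use `rho = 1`, after which each iteration multiplies the step-size by `decay`.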
4.4 Personalization
Once the model is calibrated using a training data set, any individual data sequence $\{y_{i,j}, t_{i,j}\}_j$ can be reconstructed by fitting the model, whether it was part of the training set or not (i.e. $i \leq n$ or $i > n$ respectively). This procedure, called here personalization, consists in solving the optimization problem defined by equation (13). Note that all individuals can be treated independently, i.e. equation (13) is equivalent to solving several sub-problems of the form $z_i^m = \mathrm{argmax}_{z_i} \log p(\{y_{i,j}\}_j, z_i, \theta^m ; \{t_{i,j}\}_j)$. The computed optimal latent variables $z_i^m$ give in turn the spatiotemporal coordinates of the individual trajectory in the reference frame of the calibrated model $\theta^m$. We use the L-BFGS optimization method [38], where gradients are automatically computed using the PyTorch autodifferentiation library.
4.5 Simulation
The purpose of the simulation is to take advantage of the
generative nature of the model to generate an entirely syn-
thetic data set that reproduces the characteristics of the orig-
inal training data set.
Given a longitudinal data set, the calibration followed by the personalization to the training data yields a normative model of progression, a spatiotemporal coordinate system (both being encoded by the parameters $\theta^m$) and the coordinates of each individual in this reference frame (i.e. $z^m = (z_i^m)_i$). We denote $\widetilde{p}(z^m, \{t\})$ the empirical joint distribution of those individual parameters and of the corresponding time-indices $\{t\}$. We simulate synthetic data $\{y^s\}$ by sampling random variables from this empirical distribution, and generate data following the generative model (11).
We use statistic functions $\zeta$ (most often not sufficient, similarly to [44]) to evaluate to what extent the simulated data resemble the training data:
$$
\zeta(\{y^s\}) \approx \zeta(\{y\}), \quad \text{with} \quad
\begin{cases}
\{y^s\} \overset{iid}{\sim} p(\{y\} \mid z^s, \theta^m ; \{t^s\}) \\
(z^s, \{t^s\}) \overset{iid}{\sim} \widetilde{p}(z^m, \{t\}).
\end{cases}
\tag{19}
$$
For visualization purposes, we may choose to ignore the calibrated noise standard deviation $\sigma_\epsilon^m$ and replace it with the degenerate value $\sigma_\epsilon = 0$. This choice generates smoother shapes, and we call such simulations “without noise”.
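Sampling from the empirical joint distribution can be read as drawing fitted individuals with replacement, keeping each set of random effects paired with its own visit times. The sketch below illustrates this reading only; the function name and the data layout are hypothetical, and the shape synthesis through the generative model (11) is not reproduced.

```python
import random


def sample_individuals(z_fitted, t_observed, n_new, rng):
    """Draw n_new (z, t) pairs from the empirical joint distribution.

    z_fitted[i] and t_observed[i] belong to the same training subject, so
    resampling indices preserves the coupling between random effects and ages.
    """
    indices = [rng.randrange(len(z_fitted)) for _ in range(n_new)]
    return [(z_fitted[i], t_observed[i]) for i in indices]
```

Each drawn pair would then be passed to the generative model to synthesize a shape sequence, with or without the additive noise.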
5 Experiments
5.1 Validation on synthetic shape data
In this section, the calibration, personalization and simulation algorithms are validated on a synthetic data set in 2D. The simulation algorithm is first used in Section 5.1.1 to generate a synthetic shape data set from a chosen ground truth model. We then use the calibration method to infer the model parameters from the synthetic data set. The performance and the stability of the calibration algorithm are evaluated in various settings in Section 5.1.2. The calibrated model is personalized in Section 5.1.3, and the learned individual parameters are compared to the true values. Finally, we re-simulate a synthetic data set from the calibrated model, and assess in Section 5.1.4 to what extent this new synthetic data set has statistics similar to those of the original data set.
5.1.1 Simulating synthetic data from a ground truth model
We choose values of fixed effects $\theta = (y_0, c_0, m_0, A_0, t_0, \sigma_\tau, \sigma_\alpha, \sigma_\epsilon)$, which specify a normal distribution of shape trajectories. The chosen geometrical parameters $y_0, c_0, m_0, A_0$ are shown in Figure 3. In addition, we choose $t_0 = 0$, $\sigma_\tau = 2$, $\sigma_\alpha = 0.2$ and $\sigma_\epsilon \in \{0.00, 0.01, 0.02, 0.03, 0.05\}$.
Fig. 3: Visualization of the parameters $\theta_2 = (y_0, c_0, m_0, A_0)$. The template shape $y_0$ is in solid black, the control points $c_0$ are the five dots in either blue or red, the momenta $m_0$ is the bold blue arrow, and the four columns of $A_0$ are the bold red arrows. The velocity fields corresponding to the momenta $m_0$ or the geometrical components of $A_0$ are respectively represented with light blue or light red arrows. The green dots and arrows on the top figure mark the four landmark positions that will be considered for the statistic $\zeta$, in order to validate the simulation algorithm in Section 5.1.4.
We use the generative model to simulate a total of $n \in \{50, 100, 200\}$ individual trajectories and to sample them at several time-points $\{t_{i,j}\}_{j=1}^{n_i}$. We draw the number of observations for each individual $n_i \geq 2$ according to a shifted Poisson distribution with parameter $E(n_i) \in \{3, 5, 7, 9\}$. We finally impose that the individual time-points $\{t_{i,j}\}_j$ are uniformly distributed in the observation interval $[t_{i,1}, t_{i,n_i}] = [t_{i,0} - \Delta t_i/2, \, t_{i,0} + \Delta t_i/2]$, where both the observation time window $\Delta t_i = t_{i,n_i} - t_{i,1}$ and the mid-point $t_{i,0}$ are drawn according to normal distributions: $\Delta t_i \overset{iid}{\sim} \mathcal{N}(E(n_i) - 2, \sigma_\tau^2)$ and $t_{i,0} \overset{iid}{\sim} \mathcal{N}(t_0, \sigma_\tau^2)$. Figure 4 displays some generated data in the reference case where $\sigma_\epsilon = 0.02$ and $E(n_i) = 7$.
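The sampling of visit times described above can be sketched with the standard library only. Several choices below are assumptions rather than the authors' procedure: the shifted Poisson is taken as $2 + \mathrm{Poisson}(E(n_i) - 2)$, "uniformly distributed" is read as evenly spaced between the first and last visit, and the absolute value of the drawn window is used to keep the times ordered. All names are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method, adequate for small rates."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def sample_visit_times(mean_visits, t0, sigma_tau, rng):
    """Draw the visit times of one synthetic subject."""
    n_i = 2 + sample_poisson(mean_visits - 2, rng)          # at least two visits
    window = abs(rng.gauss(mean_visits - 2, sigma_tau))     # observation window
    mid_point = rng.gauss(t0, sigma_tau)
    start = mid_point - window / 2.0
    # evenly spaced visits between the first and the last one (assumption)
    return [start + window * j / (n_i - 1) for j in range(n_i)]
```

In the reference configuration this would be called as `sample_visit_times(7, 0.0, 2.0, random.Random(0))`, giving on average seven visits spanning about five time units.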
5.1.2 Model calibration
The model calibration outputs are the estimated population parameters $\theta = (\theta_1, \theta_2)$ with $\theta_1 = (t_0, \sigma_\tau, \sigma_\alpha, \sigma_\epsilon)$ and $\theta_2 = (y_0, c_0, m_0, A_0)$. They are expected to be close to the MAP $\theta^m$ defined by equation (12).
Computing the MAP. Because only a finite number of data points are available and a Bayesian prior $p(\theta)$ is assumed on the parameters $\theta$, the MAP $\theta^m$ does not correspond exactly to the ground truth parameters $\theta^t$. The calibrated parameters $\theta$ are expected to converge towards the corresponding MAP parameters $\theta^m$ when the number of iterations goes to infinity (which this section verifies experimentally), whereas the MAP parameters $\theta^m$ are known to converge towards the ground truth parameters $\theta^t$ when the number of observations goes to infinity. For each configuration of ground truth parameters $\theta^t$ and particular random sampling $z^t$ of the generative model they define, $\theta_1^m$ can be analytically computed with equations (15)-(18) and $\theta_2^m$ approximated by a steepest gradient descent approach initialized with $\theta^t$ and $z^t$ (see Algorithm 2). The calibration error between $\theta$ and $\theta^m$ is analyzed and discussed in detail in the rest of this section, whereas the statistical error between $\theta^m$ and $\theta^t$ is only computed in the reference configuration. Using the performance metrics introduced in the following paragraph, the second line of Table 1 (in italicized text) gives the corresponding quantitative normalized distances, which remain below 6% in all cases.
Table 1: Final average normalized performance metrics and associated standard deviations, obtained after 10 independent runs of the MCMC-SAEM algorithm in varied configurations. The algorithm is run for 200 iterations in all configurations. The reference configuration corresponds to a noise level $\sigma_\epsilon = 0.02$, an average number of visits per subject $E(n_i) = 7$, $q = 4$ allowed components of geometrical variability, and $n = 100$ input subjects. The second line gives the discrepancy between the ground truth (used for generating the data) and the MAP (used for evaluating the calibration performance), in the reference case. We call this discrepancy the statistical error, as opposed to the calibration error.

configuration              $y_0$ (%)     $c_0, m_0$ (%)  $c_0, A_0$ (%)  $t_0$ (%)     $\sigma_\tau$ (%)  $\sigma_\alpha$ (%)  $\sigma_\epsilon$ (%)
reference                  2.5 ± 0.01    6.2 ± 0.10      2.1 ± 0.02      8.8 ± 0.06    1.7 ± 0.28         7.0 ± 2.73           7.7 ± 0.01
statistical error          0.0           0.4             0.1             2.5           5.6                3.0                  0.0
$\sigma_\epsilon = 0.00$   2.7 ± 0.01    5.9 ± 0.17      2.5 ± 0.05      7.4 ± 0.24    2.6 ± 0.60         8.1 ± 5.24           36.4 ± 0.03
$\sigma_\epsilon = 0.01$   6.0 ± 0.01    2.5 ± 0.09      1.7 ± 0.01      2.7 ± 0.07    0.7 ± 0.21         0.8 ± 0.44           15.1 ± 0.03
$\sigma_\epsilon = 0.02$   2.5 ± 0.01    6.2 ± 0.10      2.1 ± 0.02      8.8 ± 0.06    1.7 ± 0.28         7.0 ± 2.73           7.7 ± 0.01
$\sigma_\epsilon = 0.03$   1.4 ± 0.01    1.9 ± 0.09      1.5 ± 0.03      4.8 ± 0.08    2.3 ± 0.38         3.8 ± 1.23           5.1 ± 0.01
$\sigma_\epsilon = 0.05$   1.8 ± 0.02    4.1 ± 0.21      1.9 ± 0.04      1.5 ± 0.10    1.1 ± 0.34         0.9 ± 0.33           1.7 ± 0.01
$E(n_i) = 3$               1.6 ± 0.02    5.7 ± 0.29      2.1 ± 0.05      5.2 ± 0.22    3.9 ± 1.12         7.1 ± 2.03           6.5 ± 0.03
$E(n_i) = 5$               4.4 ± 0.01    5.9 ± 0.32      2.7 ± 0.03      1.1 ± 0.15    5.8 ± 0.56         0.7 ± 0.36           6.7 ± 0.02
$E(n_i) = 7$               2.5 ± 0.01    6.2 ± 0.10      2.1 ± 0.02      8.8 ± 0.06    1.7 ± 0.28         7.0 ± 2.73           7.7 ± 0.01
$E(n_i) = 9$               2.4 ± 0.01    3.2 ± 0.05      1.7 ± 0.04      7.2 ± 0.03    1.8 ± 0.11         2.7 ± 0.17           6.3 ± 0.01
$q = 2$                    2.5 ± 0.04    18.0 ± 0.28     51.3 ± 0.06     10.6 ± 0.16   9.3 ± 0.52         3.1 ± 0.84           172.4 ± 0.07
$q = 4$                    2.5 ± 0.01    6.2 ± 0.10      2.1 ± 0.02      8.8 ± 0.06    1.7 ± 0.28         7.0 ± 2.73           7.7 ± 0.01
$q = 6$                    2.5 ± 0.01    6.8 ± 0.10      1.9 ± 0.03      8.9 ± 0.07    1.7 ± 0.20         9.3 ± 1.80           7.0 ± 0.02
$n = 50$                   3.4 ± 0.01    3.6 ± 0.16      2.2 ± 0.03      4.7 ± 0.04    0.2 ± 0.16         2.0 ± 0.17           7.6 ± 0.02
$n = 100$                  2.5 ± 0.01    6.2 ± 0.10      2.1 ± 0.02      8.8 ± 0.06    1.7 ± 0.28         7.0 ± 2.73           7.7 ± 0.01
$n = 200$                  1.8 ± 0.00    3.9 ± 0.08      1.5 ± 0.02      2.6 ± 0.04    0.8 ± 0.12         3.7 ± 0.19           6.9 ± 0.01

[Figure 4: diagram linking the true model, the learning data, the learned model, the reconstructed data and the simulated data through the operations "simulate (with noise)", "calibrate", "personalize" and "simulate (without noise)".]
Fig. 4: Illustration of the evaluation procedure for model calibration, and subsequent personalization or simulation from the learned model. The ground truth population geodesic is plotted in black on the central line. From this model are simulated $n = 100$ individual spatiotemporal trajectories: three randomly-picked samples are plotted in black on the top lines. The population geodesic of the calibrated model is plotted in green on the central line, superimposed with the ground truth geodesic. This calibrated model can then be personalized to the training observations as plotted in red, or leveraged to simulate new spatiotemporal trajectories that resemble the original data set as plotted in blue.

Normalized error metrics. The error for the scalar parameters $\theta_1 = (t_0, \sigma_\tau, \sigma_\alpha, \sigma_\epsilon)$ is measured by the absolute difference between estimated and MAP values. The error for $t_0$ is normalized by the characteristic population observation window, which we define as $2 \cdot (1 + \sigma_\alpha) \cdot [E(\Delta t_i)/2 + \sigma_\tau]$. The remaining errors for $\sigma_\tau, \sigma_\alpha, \sigma_\epsilon$ are respectively normalized by the true standard deviations $\sigma_\tau = 2$, $\sigma_\alpha = 0.2$, and by the noise level estimated by the Bayesian atlas model [28] computed during the initialization pipeline described in Section 4.3.1. The error on the template shape $y_0$ is assessed as the maximum point-to-point residual distance, and normalized by the conservative value of 3 spatial units for the characteristic size of the considered shape (see Figure 3). The control points $c_0$ and the momenta $m_0$ are jointly evaluated through the $\ell^2$ distance of the estimated velocity field $v_0 = \mathrm{Conv}(c_0, m_0)$ to the MAP value, normalized by the $\ell^2$ norm of this MAP velocity field. Finally, the convergence of the modulation matrix $A_0$ is assessed by measuring the mismatch between the sets of space-shifts that can be generated from the pair of parameters $(c_0, A_0)$: (i) the estimated modulation matrix is first re-projected on the MAP control points, (ii) the linear subspaces generated by the columns of the MAP and estimated modulation matrices are defined, (iii) the matrix representations of the projectors over those subspaces are computed, (iv) the average of the four greatest eigenvalues of their difference captures the mismatch between those projectors, (v) the result is normalized by the largest eigenvalue of the MAP projector.
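As a small worked example of the normalization used for $t_0$, the sketch below evaluates the characteristic observation window and the resulting percentage error. The function names are hypothetical, and the reference value $E(\Delta t_i) = 5$ assumes that the mean window equals $E(n_i) - 2$ with $E(n_i) = 7$.

```python
def observation_window(sigma_alpha, mean_window, sigma_tau):
    """Characteristic window 2 * (1 + sigma_alpha) * (E(dt) / 2 + sigma_tau)."""
    return 2.0 * (1.0 + sigma_alpha) * (mean_window / 2.0 + sigma_tau)


def normalized_t0_error(t0_estimated, t0_map, window):
    """Absolute error on t0, in percent of the characteristic window."""
    return 100.0 * abs(t0_estimated - t0_map) / window


# Reference configuration: sigma_alpha = 0.2, E(dt) = 5, sigma_tau = 2.
window = observation_window(0.2, 5.0, 2.0)
```

With these values the window is 10.8 time units, so an absolute error of 0.5 time units on $t_0$ would correspond to roughly 4.6%.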
Evaluation setups and results. In addition to the previously introduced setups, configurations with a varying allowed number of sources $q \in \{2, 4, 6\}$ are also evaluated. We call the configuration with $\sigma_\epsilon = 0.02$, $q = 4$, $n = 100$, and $E(n_i) = 7$ the reference one. Together with the 11 configurations differing from this reference by a single parameter, a total of 12 calibration problems are defined. Each is solved 10 times by running the stochastic MCMC-SAEM-GD algorithm.
Figure 5 plots the evolution of the error metrics across the allowed 200 iterations for the reference configuration: the black lines correspond to the 10 different runs, and their mean and standard deviation are represented in green. The algorithm is stable, i.e. it converges to similar results at each run: the final standard deviation of the error is smaller than 10% of the maximal (initial) error, for all parameters. The two regimes of the algorithm can be identified: the burn-in phase for the first half of the iterations, where the step-sizes $\rho^{[k]}$ remain fixed to 1, followed by the concentration phase, where the step-sizes decrease geometrically. We can finally notice that $\sigma_\alpha$ is estimated with more variance than the other parameters, and than $\sigma_\tau$ in particular. This suggests that adding higher-order components to the time-warp functions $\psi_i$ would come with estimation challenges.
Table 1 gives the average final error metrics and the associated standard deviations. Those standard deviations remain below 3% in all but one case, underlining the stability of the estimation algorithm. In most cases, the parameters are estimated with less than 10% of error: exceptions only appear in the configurations with very low noise levels $\sigma_\epsilon \leq 0.01$ or an underestimated number of geometrical sources $q = 2$. Interestingly, a higher noise level $\sigma_\epsilon$ does not necessarily correlate with a degraded estimation of the parameters. The presence of noise can actually help the algorithm to better explore the space of parameters: it seems that local minima may be harder to escape for very low levels of noise. In particular, the estimation performance of the noise variance $\sigma_\epsilon^2$ improves when the true value increases. Table 1 also shows the impact of the length of the observation period, which is controlled by $E(n_i)$. Long periods generally favour a more accurate estimation of the parameter $v_0 = \mathrm{Conv}(c_0, m_0)$, which encodes the direction of the progression, and of $(\sigma_\tau, \sigma_\alpha)$, which capture its dynamical variability. However, because of compensation mechanisms that may take place at the individual level between $\alpha_i$ and $\tau_i$, it is the joint quantity $\sigma_\tau + \sigma_\alpha$, rather than $\sigma_\tau$ and $\sigma_\alpha$ taken independently, that is clearly better estimated when $E(n_i)$ increases.
[Figure 5: seven panels plotting, against the iteration number (0 to 200), the error (%) on $y_0$, on $c_0, m_0$, on $c_0, A_0$, on $t_0$, on $\sigma_\tau$, on $\sigma_\alpha$ and on $\sigma_\epsilon$.]
Fig. 5: Evolution of the error metrics across the 200 allowed iterations of the MCMC-SAEM algorithm, for the reference configuration: noise standard deviation $\sigma_\epsilon = 0.02$, $q = 4$ estimated components of geometrical variability, learning on a data set composed of $n = 100$ subjects with on average $E(n_i) = 7$ visits per subject, spanning 5 time units. The 10 solid black curves correspond to 10 independent runs of the same (stochastic) MCMC-SAEM algorithm; the bold green curve is their average and the light green region indicates the associated standard deviation. The algorithm consistently converges towards similar parameters at each run, and those estimated parameters are satisfactorily close to the MAP estimate.

The same table also compares the estimation quality when the true number of sources is underestimated ($q = 2$), perfectly chosen ($q = 4$) or overestimated ($q = 6$). The reconstruction ability of our model, measured by $\sigma_\epsilon$, increases with $q$, and seems to saturate once the optimal number of sources has been reached. The large estimation error made on $\sigma_\epsilon$ in the case $q = 2$ comes from the fact that the data were simulated from exactly four geometrical components of comparable importance (see Figure 3), thus creating a strong thresholding effect on the reconstruction performance when choosing $q < 4$. One can expect smoother variations of the reconstructive performance on real data sets, which do not result from the exact simulation of the generative model. Most parameters are less well estimated in the $q = 2$ configuration, and lie at a comparable distance to the MAP in the two remaining ones. Finally, Table 1 shows that the number of training subjects $n$ has a major influence over the quality of the estimation. Almost all metrics are improved in the configuration with $n = 200$ subjects.
In conclusion, the proposed MCMC-SAEM-GD algorithm successfully solves our model calibration problem in varied configurations. The stochastic procedure is stable across independent repetitions. The presence of noise in the training data is well handled, and actually seems to act as a good regularizer for the estimation procedure. An underestimated number of sources does not harm the convergence of the procedure, but mostly impairs the reconstruction ability of the learned model. This number should therefore be gradually increased to meet the reconstruction goals of the experimenter, keeping in mind that an intrinsic optimal performance will be reached when $q$ is large enough. Finally, increasing the number of subjects and increasing the number of visits are both beneficial for model calibration.
5.1.3 Personalization after calibration
Once calibrated, the longitudinal shape models are personalized to the training data. The estimated individual parameters $\alpha_i, \tau_i, s_i$ are compared to their true values. In order to be comparable with the true sources, the estimated sources are first brought back to the cotangent space defined by the true control points $c_0^t$ by solving $\mathrm{Conv}(c_0^t, m_i^t) = \mathrm{Conv}(c_0, m_i)$.
Figure 6 plots the estimated $z_i$ against the corresponding true values. The acceleration factors are well aligned on the bisector. The onset ages and sources are also estimated with a low variance, but with a non-negligible bias.
[Figure 6: three scatter plots of estimated against true individual parameters: $(\alpha_i)_i$ with $R^2 = 0.920$, $(\tau_i)_i$ with $R^2 = 0.965$, and $(s_i)_i$ with $R^2 = 0.999$, the latter with one series per geometrical component (1 to 4).]
Fig. 6: Comparison of the estimated individual parameters $z_i = (\alpha_i, \tau_i, s_i)$ after personalization of the mean calibrated model to the simulated observations, in the reference scenario. In each scatter plot, the identity is represented by the solid black line. The $R^2$ value for the sources is an average over the four geometrical components.
This effect is due to the fact that the estimation of individual parameters during personalization may compensate for some error made during the estimation of population parameters during calibration. Time-shifts $\tau_i$ may compensate for an error on the reference time $t_0$. Acceleration factors $\alpha_i$ may compensate for an error in the norm of $v_0$. Sources $s_i$ may compensate for an error in the norm of the columns of $A_0$.
These effects do not question the identifiability of the model, but rather suggest that, for a finite number of observations, the likelihood may have a rather flat maximum, for which a range of parameter values may reconstruct data almost equally well. Finally, two outliers can be noticed in Figure 6 for the pace of progression $\alpha_i$, as well as for the onset age $\tau_i$. These outliers correspond to extremely reduced windows of observation $t_{i,n_i} - t_{i,1}$, respectively equal to 0.07 and 0.20, whereas the theoretical mean is equal to 5.
Table 2 summarizes the results in all configurations, giving for each of the twelve considered setups the median error and associated median absolute deviation on $z_i = (\alpha_i, \tau_i, s_i)$ when personalizing the average calibrated model. The median is reported instead of the mean because it is more robust to outliers. Focusing on the estimation variability, it appears that the sources $s_i$ are the best estimated parameters, followed by the pace of progression $\alpha_i$ and the onset ages $\tau_i$.
configuration              $\Delta\alpha_i$ (%)   $\Delta\tau_i$ (%)   $\Delta s_i$ (%)
reference                  4.4 ± 11.9             44.2 ± 15.3          11.7 ± 3.3
$\sigma_\epsilon = 0.00$   3.8 ± 4.3              42.4 ± 24.8          5.4 ± 6.8
$\sigma_\epsilon = 0.01$   2.1 ± 6.4              11.8 ± 9.4           11.1 ± 12.0
$\sigma_\epsilon = 0.02$   4.4 ± 11.9             44.2 ± 15.3          11.7 ± 3.3
$\sigma_\epsilon = 0.03$   7.6 ± 15.7             25.5 ± 15.0          6.7 ± 2.5
$\sigma_\epsilon = 0.05$   21.8 ± 23.4            4.4 ± 14.7           9.1 ± 3.0
$E(n_i) = 3$               9.3 ± 32.7             12.1 ± 19.3          8.8 ± 3.2
$E(n_i) = 5$               2.9 ± 13.0             3.2 ± 20.8           3.6 ± 7.5
$E(n_i) = 7$               4.4 ± 11.9             44.2 ± 15.3          11.7 ± 3.3
$E(n_i) = 9$               8.2 ± 7.3              46.0 ± 16.1          7.6 ± 2.3
$q = 2$                    1.6 ± 18.0             48.3 ± 50.7          2.4 ± 69.4
$q = 4$                    4.4 ± 11.9             44.2 ± 15.3          11.7 ± 3.3
$q = 6$                    9.8 ± 12.3             45.3 ± 15.8          11.7 ± 3.3
$n = 50$                   12.9 ± 8.1             24.8 ± 18.3          6.9 ± 6.1
$n = 100$                  4.4 ± 11.9             44.2 ± 15.3          11.7 ± 3.3
$n = 200$                  9.8 ± 10.5             13.2 ± 10.9          13.1 ± 2.4

Table 2: Median of the residual errors and associated median absolute deviation times 1.4826 for the estimated individual parameters, expressed in percentage of the corresponding ground-truth standard deviations $\sigma_\alpha = 0.2$, $\sigma_\tau = 2$ and $\sigma_s = 1$. The results are given for the reference scenario plus eleven perturbed scenarios, where either the noise level $\sigma_\epsilon$, the average number of visits per subject $E(n_i)$, the allowed number of geometrical components $q$ or the number of subjects $n$ is varied.
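The robust summary reported in Table 2 can be computed as follows. The factor 1.4826 makes the median absolute deviation a consistent estimate of the standard deviation for Gaussian data; the function name and the example values are hypothetical.

```python
import statistics


def median_and_scaled_mad(values):
    """Return the median and 1.4826 times the median absolute deviation."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, 1.4826 * mad


# A single large outlier barely affects either statistic.
center, spread = median_and_scaled_mad([1.0, 2.0, 3.0, 4.0, 100.0])
```

Here `center` is 3.0 and `spread` is 1.4826, whereas the mean of the same values would be 22.0.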
Fig. 7: Distribution of the position of landmarks of interest in the raw (i.e. original), reconstructed (by personalization of the
calibrated model) and simulated data sets, for the reference configuration. Those landmarks of interest are indicated by green
dots on Figure 3. The simulated distributions are similar to the corresponding raw ones, suggesting that the spatiotemporal
variability of the original data set has been successfully captured.
The estimation of the pace of progression $\alpha_i$ quickly deteriorates with increasing levels of noise $\sigma_\epsilon$, reaching almost 25% of the true standard deviation $\sigma_\alpha = 0.2$ in the noisiest configuration. The estimation of the onset ages $\tau_i$ and sources seems more robust, with no clear tendency. The estimation of the pace $\alpha_i$ improves when the number of visits per subject $E(n_i)$ increases. The same trend can be noticed for the onset age $\tau_i$, although with a reduced amplitude. The sources $s_i$ remain well estimated in all scenarios. No clear difference can be noticed between the reference $q = 4$ and the case $q = 6$ with an overestimated number of geometrical components, suggesting that adding components does not hamper the personalization of a calibrated model. However, underestimating this number of components with $q = 2$ deteriorates the estimation of the sources $s_i$ and, to a lesser extent, of the dynamical parameters $\alpha_i$ and $\tau_i$. As in the previous section, we attribute this large performance drop to the fact that the data were simulated according to exactly four geometrical sources of similar magnitude (see Figure 3): in real data sets, one may expect the estimation performance to change more smoothly with $q$. Finally, an increased number of subjects $n$ allows a better performance of the personalization algorithm, especially for the onset age $\tau_i$ and source $s_i$ parameters.
5.1.4 Simulation after calibration and personalization
After calibration and personalization, the learned model and empirical distribution of the random effects can be used to simulate entirely synthetic shape trajectories. Figure 4 gives some randomly selected samples from such simulated trajectories for the reference scenario, where (see equation (19)):
- the fixed effects θ_m are averages over the 10 calibrations;
- the random effects z_s are drawn according to independent normal distributions with means and standard deviations equal to the values given by Table 2;
- the visit ages t_s are drawn according to the true procedure based on the average calibrated values for t_0, σ_τ, and the empirical average ⟨n_i⟩_i for E(n_i) (see Section 5.1.1).
Figure 7 compares the distribution of vertical or horizontal positions of the tips of the original (see Section 5.1.1), reconstructed (see Section 5.1.3) and simulated observations. Those landmarks of interest are indicated by green dots and arrows on Figure 3, and form the statistic ζ introduced in equation (19). A total of 1,000 subjects are simulated, whereas only 100 were available for model calibration. The three distributions largely overlap, indicating that the learned distribution of shape trajectories reproduces the true distribution.
Fig. 8: Learned emotion spatiotemporal models. Panels, each shown for male and female subjects: happiness, sadness, surprise, fear, disgust, anger. The population geodesic is plotted in green, and the shifted progressions along the gender mode of geometrical variability are plotted in black.
5.2 Dynamic facial expression
5.2.1 Data and preprocessing
The Binghamton University 3D dynamic facial expression database [61] gathers short video sequences from 101 subjects (58 female and 43 male). Each subject mimics six basic emotions in six distinct sequences: Anger, Disgust, Fear, Happiness, Sadness and Surprise. For each of those 606 sequences, we uniformly extract 8 frames spanning from the first to the 36th one, which corresponds to a subsampling of the first 1.4 seconds of each video. We do not work directly with the images, but with a set of 75 semi-automatically extracted landmarks, which come with this data set. Every set of 3D landmarks is registered to a reference one by similarity-based Procrustes alignment.
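The two preprocessing steps above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the landmarks are random placeholders, and the alignment is a standard least-squares similarity Procrustes fit (translation, rotation, isotropic scaling) computed with an SVD.

```python
import numpy as np

def uniform_frame_indices(first=1, last=36, count=8):
    """Return `count` frame numbers evenly spread between `first` and `last` (inclusive)."""
    return np.round(np.linspace(first, last, count)).astype(int)

def procrustes_align(points, reference):
    """Align `points` (k x 3) onto `reference` (k x 3) with the least-squares
    similarity transform: translation, rotation and isotropic scaling."""
    mu_p, mu_r = points.mean(axis=0), reference.mean(axis=0)
    p, r = points - mu_p, reference - mu_r
    # Optimal rotation from the SVD of the cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ r)
    d = np.sign(np.linalg.det(u @ vt))
    correction = np.diag([1.0, 1.0, d])  # forbids reflections
    rotation = u @ correction @ vt
    scale = (s * np.diag(correction)).sum() / (p ** 2).sum()
    return scale * p @ rotation + mu_r

# Placeholder landmarks (NOT real data): 75 points in 3D.
rng = np.random.default_rng(0)
reference = rng.normal(size=(75, 3))
angle = 0.3
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
moved = 2.5 * reference @ rot_z + np.array([1.0, -2.0, 0.5])

print(uniform_frame_indices())
print(np.abs(procrustes_align(moved, reference) - reference).max())
```

With 8 frames between the first and the 36th, consecutive selected frames are 5 frames apart.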
5.2.2 Model calibration: learned emotion models
We learn 6 distinct longitudinal atlas models: one per emotion, calibrated on the n = 101 sequences of n_i = 8 frames for all subjects i. We choose q = 10 sources. Figure 8 shows in green the estimated average scenario for each emotion. Qualitatively, those average scenarios show a typical pattern of facial expression. The Disgust, Fear, Happiness and Surprise models feature large displacements, in particular in the area of the mouth. The Sadness expression is more subdued, with a subtle displacement of the eyebrows. The Anger model shows a combined displacement of both eyes and eyebrows.
5.2.3 Gender-specific emotion patterns
The estimated models are personalized to the corresponding training data sets, giving for each sequence an optimal z_i = (α_i, τ_i, s_i). In this section, we focus only on the individual source parameters s_i ∈ R^q = R^10. For each model, we fit a 1D partial least squares regression model [1] that predicts the gender from a linear combination of the source variables s_i. We then test whether this linear combination of the sources differs significantly between men and women using a Student t-test. All p-values are smaller than 10^-5, thus showing significant differences in the geometry of the face between genders that are independent of the pattern of expression.
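As an illustration of this analysis, the sketch below projects placeholder sources onto the first partial least squares direction for a scalar response, then compares the two groups of scores with a Student t-test. It is a minimal sketch under stated assumptions: the data are synthetic, `pls1_scores` is a hypothetical helper that computes only the first PLS weight vector (proportional to the covariance between centered sources and centered labels), and the authors' exact implementation is not specified in the text.

```python
import numpy as np
from scipy import stats

def pls1_scores(sources, labels):
    """Project sources (n x q) onto the first PLS direction for a scalar response.

    Returns the one-dimensional scores and the unit-norm weight vector."""
    x = sources - sources.mean(axis=0)
    y = labels - labels.mean()
    weights = x.T @ y
    weights /= np.linalg.norm(weights)
    return x @ weights, weights

# Placeholder data (NOT real estimates): 101 subjects, q = 10 sources.
rng = np.random.default_rng(0)
n, q = 101, 10
is_female = np.arange(n) < 58          # 58 female and 43 male subjects
sources = rng.normal(size=(n, q))
sources[is_female, 0] += 1.0           # artificial group effect on one source

scores, direction = pls1_scores(sources, is_female.astype(float))
# Student t-test (equal variances assumed) between the two groups of scores.
t_stat, p_value = stats.ttest_ind(scores[is_female], scores[~is_female])
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")
```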
Figure 8 shows the typical scenarios for men and women (in black), which are built by translating the mean scenario in the direction of the average of the sources for each gender. For all emotion models, male subjects tend to have wider faces than females, as can be clearly seen in the area of the cheeks or of the nose for the Anger and Surprise models.
           Angry  Disgust  Fear  Happy   Sad  Surprise
Angry       64.3      7.0   8.1    4.0  16.6         -
Disgust     13.7     55.1  12.4   14.8   1.9       2.0
Fear         1.0     16.6  58.6   13.9   7.0       3.0
Happy        1.9      6.0  13.0   79.1     -         -
Sad         16.5      2.0  14.2    1.1  66.2         -
Surprise     1.0      3.0  16.0      -   1.0      79.1
Table 3: Average confusion matrix (in %) across 5-fold linear discriminant classification. The sequence features consist of a 12-scalar vector that stacks the 6 pairs of dynamical parameters (α_i, τ_i) obtained by personalizing the 6 emotion models. The average accuracy is 67.08%.
5.2.4 Application to classification
We propose to automatically recognize the emotion from a sequence based on the personalization of each facial expression model to the sequence. We use here the dynamic variables α_i, τ_i for classification.
More precisely, we perform a 5-fold cross-validation ensuring that each group is gender-balanced. For each split:
- six longitudinal shape models are learned on the training sequences, one for each emotion;
- these models are personalized to all the 606 sequences: for each sequence, a total of 6 z_i vectors are therefore estimated;
- for each sequence, the estimated temporal parameters α_i, τ_i are stacked into vectors of 6 × 2 = 12 scalars;
- these feature vectors are used to train and test a simple linear discriminant classifier on the corresponding train and test sequences.
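The last step of this procedure can be sketched with scikit-learn as follows. This is only an illustration under explicit assumptions: the 12-scalar features are random placeholders with an artificial class effect, the models are not re-learned for each split, and the folds are grouped by subject with `GroupKFold` without enforcing the gender balance described above.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_subjects, n_emotions = 101, 6

# One sequence per (subject, emotion) pair: 606 sequences in total.
subjects = np.repeat(np.arange(n_subjects), n_emotions)
labels = np.tile(np.arange(n_emotions), n_subjects)

# Placeholder features (NOT real data): one (alpha, tau) pair per emotion
# model, i.e. 12 scalars per sequence, with an artificial class-dependent shift.
features = rng.normal(size=(labels.size, 2 * n_emotions))
features[np.arange(labels.size), 2 * labels] += 2.0

fold_confusions, fold_accuracies = [], []
splitter = GroupKFold(n_splits=5)  # all sequences of a subject share a fold
for train, test in splitter.split(features, labels, groups=subjects):
    classifier = LinearDiscriminantAnalysis()  # default settings, no tuning
    classifier.fit(features[train], labels[train])
    predicted = classifier.predict(features[test])
    fold_confusions.append(confusion_matrix(
        labels[test], predicted,
        labels=np.arange(n_emotions), normalize="true"))
    fold_accuracies.append(np.mean(predicted == labels[test]))

mean_confusion = 100.0 * np.mean(fold_confusions, axis=0)
mean_accuracy = 100.0 * np.mean(fold_accuracies)
chance_level = 100.0 / n_emotions
print(np.round(mean_confusion, 1))
print(f"accuracy: {mean_accuracy:.2f} % (chance level: {chance_level:.2f} %)")
```

Each row of `mean_confusion` sums to 100 because every test fold contains all six emotions for each held-out subject.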
Table 3 gives the confusion matrix obtained with this procedure, averaged over the 5 folds. The average classification accuracy is 67.08%, above the chance level which amounts to 16.67%. For comparison, [4] reported an average accuracy of almost 100%, [58] of 90.44%, and [23] of 74.63%. We emphasize however that our performance is achieved:
- using the default linear discriminant analysis from the sklearn library, without any hyperparameter tuning as in [4] with random forests, in [58] with hidden Markov models or in [23] with radial support vector machines;
- on all the 606 available sequences, without any manual selection of a subset of 60/101 subjects as done in [4, 58] or of 507/606 sequences as done in [23];
- based only on 12 intuitive scalar features per sequence, which encode how an individual emotional pattern dynamically compares to population models of basic emotions.
Fig. 9: Typical model of hippocampus atrophy from MCI to Alzheimer’s disease stage: (a) left hippocampus mean progression, (b) right hippocampus mean progression. Physiological ages (from left to right, in years): 58.6, 63.0, 67.4, 71.8, 76.2, 80.7, 85.1, 89.5, 93.9.
From this experiment, which has not been particularly tuned to achieve the best classification performance, we conclude that our model captured shape characteristics that are specific to each emotion. It is worth noting that we used here only dynamic parameters that capture how fast or slow the face is changing in the sequence, or with which delay.
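To make the role of these two dynamic parameters concrete, the sketch below assumes an affine time warp of the form α (t − τ) + t_0. This expression is an assumption made for illustration (the exact definition is the one given in the model section of the paper), and `time_warp` is a hypothetical helper name.

```python
import numpy as np

def time_warp(t, alpha, tau, t0):
    """Map individual times t to the population timeline.

    Assumed affine form: alpha * (t - tau) + t0, where alpha acts as a pace
    factor and tau as an individual time shift."""
    return alpha * (np.asarray(t, dtype=float) - tau) + t0

times = np.array([70.0, 71.0, 72.0])
same_pace = time_warp(times, alpha=1.0, tau=72.0, t0=72.0)  # identity warp
faster = time_warp(times, alpha=2.0, tau=72.0, t0=72.0)     # twice the pace
earlier = time_warp(times, alpha=1.0, tau=70.0, t0=72.0)    # two units ahead
print(same_pace, faster, earlier)
```

Under this assumed form, α > 1 stretches the individual timeline (faster progression) and a smaller τ places the individual further along the population trajectory at the same time t.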
5.3 Hippocampal atrophy in Alzheimer’s disease
5.3.1 Data and preprocessing
Data used in the preparation of this section were obtained
from the Alzheimer’s Disease Neuroimaging Initiative (ADNI)
database (adni.loni.usc.edu).
We select all the T1-weighted MRIs of subjects who were diagnosed with mild cognitive impairment at some visit, and diagnosed as converted to Alzheimer’s disease at some later visit. See Table 4 for summary statistics. This data set amounts to a total of 1993 visits from n = 322 subjects. Second-take “re-test” MR images are available for 1838 of those visits and will be used to estimate the noise in the data. All those 1993 + 1838 = 3831 images are pre-processed in exactly the same manner, starting with the longitudinal pipeline of FreeSurfer¹ (version 5.3.0) [24, 25]. The skull-stripped brains are then aligned with an affine 12-degrees-of-freedom transformation onto the Colin27 average brain² with FSL 5.0³ [60]. Meshes of the left and right hippocampus are obtained from the original images as follows:
- the volumetric segmentations of the hippocampus computed with FreeSurfer are transformed into meshes using the aseg2srf script of July 2009⁴;
- the resulting meshes are decimated by a factor of 88% using Paraview 5.4.1⁵ [2];
- they are aligned using the previously computed global affine transformation estimated with the FSL software;
- residual pose differences among subjects are removed by rigidly aligning the meshes from the baseline image of each subject to the corresponding hippocampus mesh in the Colin27 atlas image, this transformation with 6 degrees of freedom being computed with the GMMReg script of June 2008⁶ [32];
- the same transformation is finally used to align the meshes from the follow-up images of the same subject.

Number of subjects                              322
Number of visits                                1993
Average number of visits per subject (± std)    5.8 (± 2.4)
Average age (± std)                             74.0 (± 6.7)
Sex ratio (F/M in %)                            41.2 / 58.8
Amyloid status (+/-/unknown in %)               73.2 / 7.1 / 19.7
APOE carriership (%)                            65.2
Education (mean ± std, in years)                15.9 (± 2.8)
Marital status (married/not married in %)       80.9 / 19.1
Table 4: Summary statistics of the medical data set of Alzheimer’s disease patients.
¹ available at: https://surfer.nmr.mgh.harvard.edu
² available at: http://www.bic.mni.mcgill.ca/ServicesAtlases/Colin27
³ available at: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/
⁴ available at: https://brainder.org
⁵ available at: www.paraview.org
⁶ available at: https://github.com/bing-jian/gmmreg
5.3.2 Models of atrophy of the hippocampus
We calibrate two longitudinal shape models on all the 1993 meshes of the left and right hippocampus respectively, choosing in both cases q = 8 sources. The deformation kernel width is set to σ = 10 mm. The current distance is used to compute distances between meshes without point correspondence, with a kernel width of σ_E = 5 mm [14, 59].
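For readers unfamiliar with the current distance, the sketch below shows a common discretization for triangle meshes: each triangle is summarized by its center and its area-weighted normal, and two meshes are compared through a Gaussian kernel of width σ_E. The kernel convention exp(−|x − y|² / σ²), the function names and the toy mesh are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def centers_and_normals(vertices, triangles):
    """Triangle centers and area-weighted normals of a mesh."""
    a, b, c = (vertices[triangles[:, k]] for k in range(3))
    centers = (a + b + c) / 3.0
    normals = 0.5 * np.cross(b - a, c - a)  # norm equals the triangle area
    return centers, normals

def current_inner_product(mesh_a, mesh_b, sigma):
    """Kernel inner product between two meshes seen as currents."""
    ca, na = centers_and_normals(*mesh_a)
    cb, nb = centers_and_normals(*mesh_b)
    sq_dists = ((ca[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dists / sigma ** 2)  # assumed Gaussian convention
    return float((kernel * (na @ nb.T)).sum())

def current_squared_distance(mesh_a, mesh_b, sigma=5.0):
    """Squared current distance; no point-to-point correspondence is needed."""
    return (current_inner_product(mesh_a, mesh_a, sigma)
            - 2.0 * current_inner_product(mesh_a, mesh_b, sigma)
            + current_inner_product(mesh_b, mesh_b, sigma))

# Toy example: a unit square made of two triangles, and a shifted copy.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                     [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])
square = (vertices, triangles)
shifted = (vertices + np.array([0.0, 0.0, 2.0]), triangles)

d_same = current_squared_distance(square, square)
d_shift = current_squared_distance(square, shifted)
print(d_same, d_shift)
```

The two meshes may have different numbers of vertices and triangles, since only kernel-weighted sums over pairs of triangles are involved.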
Figure 9 shows the estimated average progression, which consists of an overall atrophy of both the left and right hippocampus with a specific deformation of their shape. It is worth noting that we reconstruct here the progressive atrophy of the hippocampus over more than 30 years of disease progression, although patients have never been observed for more than a few years. This can be achieved because the method automatically re-aligns in time the data of patients who are at different, but unknown, disease stages.
5.3.3 Personalization to unseen data
We assess the reconstruction performance of the calibrated models using a 5-fold cross-validation. The n = 322 subjects are split into 5 groups; 2 × 5 distinct shape models are calibrated on the training sets for the left and right hippocampus. Those models are then personalized to the unseen test subjects. To assess the goodness of fit, we measure the residual errors and compare the distribution of such errors with the noise distribution. This noise distribution is determined by measuring the distance between the two meshes extracted from the “test” and “re-test” images acquired from the same patient on the same day, thus capturing all the variability due to varying image quality and its consequences on the processing. Figure 10 superimposes the distribution of the residual errors and the distribution of the differences between the meshes of the test and re-test images. The reconstruction errors are on average smaller than the intrinsic uncertainty on the data, and have a lower variance as well. The model therefore allows individual data to be reconstructed at the precision of the noise. It is worth noting that this is achieved using a reduced set of 2 × 10 scalars, which are for each hippocampus the pace of progression α_i, the onset age τ_i, and the eight sources s_i.

Fig. 10: Comparison of the generalization error to unseen data of the learned shape models and the intrinsic measurement error. Each panel plots the cumulative and probability distribution functions of the current-metric absolute residual (mm²) for the measurement errors and the reconstruction errors. The discrepancies between meshes are computed with the current metric with σ_E = 5 mm, without assuming any point-to-point correspondence.
(a) Left hippocampus. The mean error is 68.5 ± 15.9 mm² for the shape model, and 83.2 ± 36.0 mm² for the re-test measurement.
(b) Right hippocampus. The mean error is 69.8 ± 15.0 mm² for the shape model, and 85.2 ± 40.1 mm² for the re-test measurement.
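The evaluation protocol of this section, a subject-level 5-fold split followed by a comparison of two error distributions through their cumulative distribution functions, can be sketched as follows. The helper names are hypothetical and the residuals are random placeholders drawn from arbitrary gamma distributions, not the values reported in Figure 10.

```python
import numpy as np

def subject_folds(n_subjects, n_folds=5, seed=0):
    """Randomly partition subject indices into n_folds disjoint test groups."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_subjects), n_folds)

def empirical_cdf(samples, grid):
    """Fraction of samples lower than or equal to each value of grid."""
    ordered = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(ordered, grid, side="right") / ordered.size

# Splitting subjects (not visits) keeps all visits of a test subject unseen.
folds = subject_folds(322)
print([len(fold) for fold in folds])

# Placeholder residuals (NOT the real ones), only to exercise the comparison.
rng = np.random.default_rng(1)
reconstruction_errors = rng.gamma(shape=20.0, scale=3.5, size=1993)
measurement_errors = rng.gamma(shape=5.0, scale=17.0, size=1838)
grid = np.linspace(40.0, 200.0, 9)
print(empirical_cdf(reconstruction_errors, grid))
print(empirical_cdf(measurement_errors, grid))
```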
5.3.4 Association with co-factors
We calibrate and personalize the models on the whole data set, and aim to study how some genetic, biological and environmental co-factors may modulate the progression of Alzheimer’s disease in patients. We therefore aim to find correlations between individual variables z_i = (α_i, τ_i, s_i) and the following factors: gender, APOE-ε4 carriership, presence of amyloid plaques, education level and marital status.

Category       Factor                                  Param.  Left hippocampus     Right hippocampus
genetic        gender (female vs. male)                α_i     ×1.23 [**]           ×1.21 [**]
                                                       τ_i     12.4 months [**]     8.7 months [*]
                                                       s_i     ±0.54 [***]          ±0.57 [****]
genetic        APOE-ε4 (carrier vs. non-carrier)       α_i     ×1.22 [*]            -
                                                       τ_i     35.8 months [***]    32.5 months [**]
                                                       s_i     -                    -
biological     amyloid (positive vs. negative)         α_i     ×1.52 [**]           ×1.67 [*]
                                                       τ_i     -                    -
                                                       s_i     -                    -
environmental  marital (married vs. non-married)       α_i     ×1.14 [*]            -
                                                       τ_i     42.5 months [***]    36.3 months [**]
                                                       s_i     -                    -
environmental  education (nb. of years of education)   α_i     -                    -
                                                       τ_i     3.7 months/y [**]    5.1 months/y [***]
                                                       s_i     -                    -
Table 5: Significant associations of individual parameters with genetic, biological and environmental factors: effect sizes and significance levels of the adjusted p-values (thresholds 5%, 1%, 0.1%, 0.01%). Time-shifts τ_i are in months, others have no units. Directions of space-shifts are not signed. The 23 subjects (out of n = 322) without amyloid information have been discarded.
To this end, the parameters α_i and τ_i are regressed against the five considered cofactors, and two-tailed t-tests are performed on the coefficients. A 2-block partial least squares regression model [1] is used to regress the eight sources s_i against the five cofactors in a one-dimensional projection space. A two-tailed t-test is then performed on the weights of the multivariate regression of the linear combination of sources against the cofactors. In each case, the five obtained p-values are corrected with the Benjamini-Hochberg false discovery rate procedure [8].
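The correction step can be sketched as below: a standard Benjamini-Hochberg adjustment of a small family of p-values, followed by a mapping to the star levels used in Table 5 (thresholds 5%, 1%, 0.1%, 0.01%). The input p-values are made up for the example, the function names are hypothetical, and the use of strict inequalities at the thresholds is an assumption.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

def significance_stars(adjusted_p):
    """One star per threshold passed among 5%, 1%, 0.1% and 0.01%."""
    thresholds = (0.05, 0.01, 0.001, 0.0001)
    return "*" * sum(adjusted_p < threshold for threshold in thresholds)

# Made-up raw p-values for five cofactors (illustration only).
raw = [0.01, 0.04, 0.03, 0.005, 0.5]
adjusted = benjamini_hochberg(raw)
for p_raw, p_adj in zip(raw, adjusted):
    print(f"raw {p_raw:.3f} -> adjusted {p_adj:.3f} [{significance_stars(p_adj)}]")
```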
The obtained correlations for both the left and right hippocampus are summarized in Table 5. The first two rows indicate that the atrophy of the hippocampus develops faster and starts earlier in female subjects. Male and female subjects present significantly different hippocampus shapes, regardless of the atrophy due to aging or disease progression. Figure 11 presents the corresponding mode of geometrical variability. Hippocampal atrophy also starts earlier in carriers of at least one ε4 allele of the APOE gene, with an effect size of almost three years. The atrophy occurs at an accelerated pace in amyloid-positive subjects, as well as in APOE-ε4 allele carriers and married subjects, although for the latter two groups the effect is significant only in the left hemisphere of the brain. Finally, the atrophy occurs earlier in married subjects, as well as in more educated subjects.
The results obtained by correlating the estimated individual parameters z_i with the genetic and biological factors are in line with current knowledge. The results obtained with respect to the marital status are more surprising, and should probably be taken with care, as the non-married group, which represents less than 20% of the considered 299 subjects (see Table 5), is very heterogeneous: it gathers widowed, divorced, or never married subjects. Finally, we show that the atrophy also starts earlier in subjects with a higher level of education. This fact is not as counter-intuitive as it appears, and is actually in line with the cognitive reserve theory [55], which supports the idea that education can help to compensate for damaged brain anatomy at the clinical level, maintaining unaltered cognitive capacities for a period of time. In other words, cognitive decline would be delayed with respect to the onset of brain atrophy in educated subjects. Since, in addition, the age at diagnosis is not correlated with the number of years of education in our data set (r = 0.02 and p = 0.70 according to a two-tailed test based on Pearson’s correlation coefficient), this explains why the subjects present an increased atrophy of their hippocampi for an increased education: they enrolled at a more advanced stage of anatomical pathology, after some years of compensation.
Fig. 11: Superposition of the male-like (in blue) and the female-like (in pink) hippocampus geometries, in two standard views: (a) coronal view, (b) axial view. The letters L, R, A, P respectively indicate the left, right, anterior and posterior directions.
5.3.5 Simulation of hippocampus atrophy due to AD
The calibrated models and the empirical distribution of random effects z_i estimated by their personalization to the training data are used to simulate synthetic progressions of the hippocampus. In order to validate such a simulation method, the simulated trajectories are sampled at several ages, and the empirical distribution of the volumes of the simulated hippocampi is compared to the distribution of the original hippocampi. The volume is commonly used as a biomarker in clinical studies, and we aim to assess whether the simulated cohort could be used instead of the original one.
To do so, we simulate the same number of subjects as in the training cohort (n = 322) with the same number of time-points and the same time intervals between visits. Note that we do not use the age at baseline, so that the sequence of observation time-points in the synthetic subjects may be shifted in time compared to the real ones. We simulate according to the empirical distribution of the individual parameters z_i and the age at baseline. There exists indeed a correlation between the estimated time-shift τ_i and the baseline age t_{i,1} of the enrolled subjects, as they tend to be included in the study at a similar disease stage. To be more precise:
- the empirical joint distribution of the time-related parameters α_i and τ_i augmented with the age at baseline t_{i,1} is computed using a kernel density estimation method;
- the empirical joint distribution of the time-related parameters augmented with the sources s_i is captured by fitting a multivariate Gaussian distribution.
A simulated data set is then created by applying 322 times the following procedure:
- draw the acceleration factor α_i, the onset age τ_i and the baseline age t_{i,1} from the corresponding kernel density;
- draw the sources s_i from the multivariate Gaussian conditional distribution with respect to the already-drawn time-related parameters;
- draw without replacement the sequence of visits of one subject, i.e. the number of visits and the time intervals between them;
- sample the individual hippocampus trajectory defined by z_i = (α_i, τ_i, s_i) at the baseline age t_{i,1} and the follow-up visits.

Fig. 12: Distribution of the left and right hippocampal volume in the raw, reconstructed and simulated data sets. Each panel plots the probability and cumulative distribution functions of the hippocampal volume (cm³). The simulated volume distribution is very close to the volume distribution of the reconstructed data set. The remaining bias between those two distributions and the one corresponding to the raw data comes from the smoothing behavior of the current noise model, leveraged to deal with noisy meshes without point correspondence. See Figure 13.
(a) Left hippocampus: the mean volume is 2958 ± 779 mm³ for raw data, 2863 ± 693 mm³ for reconstructed data, and 2865 ± 746 mm³ for simulated data.
(b) Right hippocampus: the mean volume is 3081 ± 862 mm³ for raw data, 3014 ± 754 mm³ for reconstructed data, and 3063 ± 763 mm³ for simulated data.
Fig. 13: Several views of a single example of the reconstruction of a right hippocampus structure by the longitudinal shape model. The reconstruction is the smooth white structure, and the raw data point is plotted in red.
This protocol is repeated for both the left and right hippocampus, and for men and women (meaning that the estimation of the empirical distributions is done for each gender separately).
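The sampling part of this protocol can be sketched as follows, for one hippocampus and one gender. Everything numerical here is a placeholder: the "personalized" parameters are random draws, the kernel density uses SciPy's default bandwidth, and the joint Gaussian is fitted on (α_i, τ_i, s_i), which is one possible reading of the description above. The helper `conditional_gaussian` is a hypothetical name implementing the standard Gaussian conditioning formulas.

```python
import numpy as np
from scipy import stats

def conditional_gaussian(mean, cov, n_given, given_value):
    """Law of the trailing coordinates of a Gaussian vector, given the
    values of its first n_given coordinates."""
    mu_a, mu_b = mean[:n_given], mean[n_given:]
    c_aa = cov[:n_given, :n_given]
    c_ba = cov[n_given:, :n_given]
    c_bb = cov[n_given:, n_given:]
    gain = np.linalg.solve(c_aa, c_ba.T).T  # equals c_ba @ inv(c_aa)
    cond_mean = mu_b + gain @ (np.asarray(given_value, dtype=float) - mu_a)
    cond_cov = c_bb - gain @ c_ba.T
    return cond_mean, 0.5 * (cond_cov + cond_cov.T)  # symmetrized

rng = np.random.default_rng(0)
n, q = 322, 8

# Placeholder "personalized" parameters (NOT real estimates).
alpha = rng.lognormal(mean=0.0, sigma=0.3, size=n)
tau = rng.normal(75.0, 6.0, size=n)
baseline_age = tau + rng.normal(0.0, 3.0, size=n)
sources = rng.normal(size=(n, q)) + 0.1 * (tau - 75.0)[:, None]

# Kernel density estimate of (alpha, tau, baseline age); rows are variables.
kde = stats.gaussian_kde(np.vstack([alpha, tau, baseline_age]))
drawn_alpha, drawn_tau, drawn_age = kde.resample(size=n, seed=0)

# Joint Gaussian fit of (alpha, tau, sources), then conditional source draws.
joint = np.column_stack([alpha, tau, sources])
mean, cov = joint.mean(axis=0), np.cov(joint, rowvar=False)
drawn_sources = np.empty((n, q))
for i in range(n):
    m, c = conditional_gaussian(mean, cov, 2, [drawn_alpha[i], drawn_tau[i]])
    drawn_sources[i] = rng.multivariate_normal(m, c)

# Visit schedules drawn without replacement: each synthetic subject reuses
# the number of visits and time intervals of a distinct real subject.
visit_schedule_of = rng.permutation(n)
print(drawn_sources.shape, drawn_age[:3])
```

The last step of the protocol, sampling the shape trajectory at the drawn ages, requires the calibrated shape model and is not covered by this sketch.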
Figure 12 shows the volume distributions of the raw, reconstructed and simulated data. The cumulative distribution functions associated with the simulated and reconstructed distributions of hippocampal volumes are superimposed. This result suggests that, for this volume statistic, the simulated and true data sets could be used interchangeably. The raw and reconstructed distributions do not superimpose so well, because the model reconstructs smooth shapes whereas raw meshes often have small protrusions pointing outward from the surface, which tend to bias the volume computation (see Figure 13). This volume difference between the raw and reconstructed meshes amounts on average to 84.5 mm³ for the left hippocampus and 67.3 mm³ for the right hippocampus.
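The text does not state how mesh volumes are computed. One common choice for a closed, consistently oriented triangle mesh is to sum the signed volumes of the tetrahedra formed by each triangle and the origin, as in the sketch below. The function name and the toy tetrahedron are assumptions made for illustration.

```python
import numpy as np

def mesh_volume(vertices, triangles):
    """Volume enclosed by a closed, consistently oriented triangle mesh,
    computed as a sum of signed tetrahedron volumes."""
    a, b, c = (vertices[triangles[:, k]] for k in range(3))
    signed = np.einsum("ij,ij->i", a, np.cross(b, c)) / 6.0
    return abs(signed.sum())

# Toy closed mesh: a tetrahedron with outward-pointing faces.
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
triangles = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])

volume = mesh_volume(vertices, triangles)
volume_translated = mesh_volume(vertices + 10.0, triangles)
print(volume, volume_translated)

# With coordinates in mm the result is in mm³; dividing by 1000 gives cm³.
volume_cm3 = volume / 1000.0
```

Small outward protrusions add positive tetrahedron contributions, which is consistent with raw meshes yielding slightly larger volumes than their smooth reconstructions.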
Now validated, the simulation algorithm could be used to synthesize a data set of left and right hippocampi for any number of subjects, with any desired visit sampling. The proposed gender-wise split further makes it possible to achieve any desired male-female balance.
6 Conclusion
We proposed a statistical modeling approach that represents individual data sequences as samples along continuous trajectories, these trajectories being considered as spatiotemporal perturbations of a population-average progression. The spatial warp is defined thanks to the exp-parallelization operator on manifolds. The time warps are affine time-reparameterizing functions. The spatial and temporal individual parameters position the progression of each subject in a spatiotemporal reference frame centered around the average trajectory of the population.
We proposed calibration, personalization and simulation algorithms to address different statistical questions. The calibration algorithm combines the MCMC-SAEM stochastic approach with gradient descent to estimate the underlying common process and its spatiotemporal variability from a longitudinal data set of shapes. It does not require a common time reference to be available across individual processes, each of which may furthermore be observed for only a short period of time. Personalizing such calibrated models to new individual data yields quantitative, low-dimensional and interpretable measures of how the progression of an individual deviates from a normative scenario. These parameters include an acceleration factor and a time-shift on the one hand, and geometrical sources of variability on the other hand. Such individual parameters offer relevant features for classification or correlation tasks, in a post-processing step. The generative nature of the proposed model naturally offers a simulation algorithm, which can generate entirely synthetic data sets. Such data sets may be sampled at any desired temporal frequency, for any number of subjects and with full control over the population characteristics, for instance in terms of gender balance.
We emphasize that the proposed modeling approach is
able to deal with meshes without any assumption on their
topology, in particular without assuming point-to-point cor-
respondence. It may be extended easily to deal with images
or other geometric primitives, provided that one can define
a metric between such objects.
The three proposed algorithms were validated in varied simulated configurations, demonstrating their ability to retrieve the true parameters or reproduce the original data distribution. They were illustrated on a data set of facial expressions, showing the relevance of the learned normative scenarios and the potential of the spatiotemporal parameters for classification. We also applied the method to a large medical data set of patients who develop Alzheimer’s disease. The average scenarios of atrophy for the hippocampus subcortical structures are in line with current medical knowledge. Individual sequences are successfully parametrized by 10 scalar spatiotemporal coordinates in the calibrated reference frames. Correlating these coordinates with genetic, biological and environmental factors gives valuable insights into protective factors influencing age at onset or pace of progression. We also evidence typical shape differences across sub-groups, which are independent of the shape changes due to ageing or disease progression.
The calibration algorithm is computationally intensive: estimating a model of hippocampus progression took around a day. Our code is already parallel, combines CPU and GPU together, and offers a fine-grained initialization pipeline. Further optimization of our code (including multi-GPU support and fast Fourier transforms for convolutions) is planned, as well as an evaluation of the performance of variational methods for calibration, which are not trivial to implement in a longitudinal context without a fixed number of observations per individual.
As for any modeling approach, our model relies on some assumptions. For instance, subjects are considered to follow trajectories that are parallel to the population average. This hypothesis may be relaxed by introducing drift parameters to model a progressive deviation from the average scenario. Such a development would add to the complexity of the model, which may then require even more data to be calibrated. Further extensions could also estimate not only one representative trajectory at the population level but several of them, for instance by estimating a mixture model along the lines of [17]. Nonetheless, it is worth noting that in its current form the model is able to reconstruct data at the precision of the noise.
The model also builds on the LDDMM framework for modeling shape variability. This framework also relies on some assumptions on the geometry of the shape space. Future work will consider learning such geometry from the data instead of relying on prior assumptions, along the lines of [11] for instance. Learning other parameters such as the number of sources, using automatic model selection methods for instance, would also add to the usability of the method.
Acknowledgements The research leading to this publication has been funded in part by the European Research Council (ERC) under grant agreement No 678304 (LEASP), European Union’s Horizon 2020 research and innovation programme under grant agreement No 666992 (EuroPOND) and No 826421 (TVB-Cloud), and the program “Investissements d’avenir” ANR-10-IAIHU-06 (IHU ICM) and ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).
The facial expression data set at the basis of Section 5.2 was built and
shared by the Binghamton University. The authors warmly thank Pr.
Lijun Yin for granting data access, and Peng Liu for his help in down-
loading the data set.
Regarding Section 5.3, data collection and sharing was funded by the
Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Insti-
tutes of Health Grant U01 AG024904) and DOD ADNI (Department of
Defense award number W81XWH-12-2-0012). ADNI is funded by the
National Institute on Aging, the National Institute of Biomedical Imag-
ing and Bioengineering, and through generous contributions from the
following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Dis-
covery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-
Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Phar-
maceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-
La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE
Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research
& Development, LLC.; Johnson & Johnson Pharmaceutical Research
& Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso
Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technolo-
gies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imag-
ing; Servier; Takeda Pharmaceutical Company; and Transition Thera-
peutics. The Canadian Institutes of Health Research is providing funds
to support ADNI clinical sites in Canada. Private sector contributions
are facilitated by the Foundation for the National Institutes of Health
(www.fnih.org). The grantee organization is the Northern California
Institute for Research and Education, and the study is coordinated by
the Alzheimer’s Therapeutic Research Institute at the University of
Southern California. ADNI data are disseminated by the Laboratory
for Neuro Imaging at the University of Southern California.
References
1. Abdi, H.: Partial least square regression (pls regression). Encyclo-
pedia for research methods for the social sciences 6(4), 792–795
(2003)
2. Ahrens, J., Geveci, B., Law, C.: Paraview: An end-user tool for
large data visualization. The visualization handbook 717 (2005)
3. Allassonnière, S., Durrleman, S., Kuhn, E.: Bayesian mixed effect atlas estimation with a diffeomorphic deformation model. SIAM Journal on Imaging Science 8, 1367–1395 (2015)
4. Amor, B.B., Drira, H., Berretti, S., Daoudi, M., Srivastava, A.: 4-d
facial expression recognition by learning geometric deformations.
IEEE Trans. Cybernetics 44(12), 2443–2457 (2014)
5. Atchade, Y.F.: An adaptive version for the metropolis adjusted
langevin algorithm with a truncated drift. Methodology and Com-
puting in applied Probability 8(2), 235–254 (2006)
6. Banerjee, M., Chakraborty, R., Ofori, E., Okun, M.S., Viallan-
court, D.E., Vemuri, B.C.: A nonlinear regression technique for
manifold valued data with applications to medical image analysis.
In: Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, pp. 4424–4432 (2016)
7. Beg, M., Miller, M., Trouvé, A., Younes, L.: Computing large deformation metric mappings via geodesic flows of diffeomorphisms. IJCV 61(2), 139–157 (2005)
8. Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate:
a practical and powerful approach to multiple testing. Journal
of the Royal statistical society: series B (Methodological) 57(1),
289–300 (1995)
9. Bilgel, M., Prince, J.L., Wong, D.F., Resnick, S.M., Jedynak,
B.M.: A multivariate nonlinear mixed effects model for longitudi-
nal image analysis: Application to amyloid imaging. Neuroimage
134, 658–670 (2016)
10. Bône, A., Colliot, O., Durrleman, S.: Learning distributions of shape trajectories from longitudinal datasets: a hierarchical model on a manifold of diffeomorphisms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9271–9280 (2018)
11. Bône, A., Louis, M., Colliot, O., Durrleman, S., Initiative, A.D.N., et al.: Learning low-dimensional representations of shape data sets with diffeomorphic autoencoders. In: International Conference on Information Processing in Medical Imaging, pp. 195–207. Springer (2019)
12. Chakraborty, R., Banerjee, M., Vemuri, B.C.: Statistics on the
space of trajectories for longitudinal data analysis. In: Biomedical
Imaging (ISBI 2017), 2017 IEEE 14th International Symposium
on, pp. 999–1002. IEEE (2017)
13. Charlier, B., Feydy, J., Glaunès, J.A., Trouvé, A.: An efficient kernel product for automatic differentiation libraries, with applications to measure transport (2017)
14. Charon, N., Charlier, B., Glaunès, J., Gori, P., Roussillon, P.: Fidelity metrics between curves and surfaces: currents, varifolds, and normal cycles. In: Riemannian Geometric Statistics in Medical Image Analysis, pp. 441–477. Elsevier (2020)
15. Christensen, G.E., Rabbitt, R.D., Miller, M.I.: Deformable templates
using large deformation kinematics. IEEE Transactions on
Image Processing 5(10), 1435–1447 (1996)
16. Cury, C., Durrleman, S., Cash, D.M., Lorenzi, M., Nicholas, J.M.,
Bocchetta, M., van Swieten, J.C., Borroni, B., Galimberti, D.,
Masellis, M., et al.: Spatiotemporal analysis for detection of pre-
symptomatic shape changes in neurodegenerative diseases: Initial
application to the GENFI cohort. NeuroImage 188, 282–290 (2019)
17. Debavelaere, V., Bône, A., Durrleman, S., Allassonnière, S.,
Initiative, A.D.N., et al.: Clustering of longitudinal shape data sets
using mixture of separate or branching trajectories. In: Interna-
tional Conference on Medical Image Computing and Computer-
Assisted Intervention, pp. 66–74. Springer (2019)
18. Delyon, B., Lavielle, M., Moulines, E.: Convergence of a stochastic
approximation version of the EM algorithm. Annals of Statistics
pp. 94–128 (1999)
19. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood
from incomplete data via the EM algorithm. Journal of the Royal
Statistical Society. Series B (Methodological) pp. 1–38 (1977)
20. Durrleman, S., Allassonnière, S., Joshi, S.: Sparse adaptive
parameterization of variability in image ensembles. IJCV 101(1),
161–183 (2013)
21. Durrleman, S., Pennec, X., Trouvé, A., Braga, J., Gerig, G., Ayache, N.:
Toward a comprehensive framework for the spatiotemporal
statistical analysis of longitudinal shape data. International
Journal of Computer Vision 103(1), 22–59 (2013). DOI
10.1007/s11263-012-0592-x. URL https://doi.org/10.1007/s11263-012-0592-x
22. Durrleman, S., Prastawa, M., Charon, N., Korenberg, J.R., Joshi,
S., Gerig, G., Trouvé, A.: Morphometry of anatomical shape complexes
with dense deformations and sparse parameters. NeuroImage
(2014)
23. Fang, T., Zhao, X., Shah, S.K., Kakadiaris, I.A.: 4D facial expression
recognition. In: Computer Vision Workshops (ICCV Workshops),
2011 IEEE International Conference on, pp. 1594–1601.
IEEE (2011)
24. Fischl, B., Dale, A.M.: Measuring the thickness of the human
cerebral cortex from magnetic resonance images. Proceedings of
the National Academy of Sciences 97(20), 11050–11055 (2000)
25. Fischl, B., Salat, D.H., Busa, E., Albert, M., Dieterich, M., Hasel-
grove, C., Van Der Kouwe, A., Killiany, R., Kennedy, D., Klave-
ness, S., et al.: Whole brain segmentation: automated labeling of
neuroanatomical structures in the human brain. Neuron 33(3),
341–355 (2002)
26. Fishbaugh, J., Prastawa, M., Gerig, G., Durrleman, S.: Geodesic
regression of image and shape data for improved modeling of 4D
trajectories. In: ISBI 2014 - 11th International Symposium on
Biomedical Imaging, pp. 385 – 388 (2014)
27. Fletcher, T.: Geodesic regression and the theory of least squares
on Riemannian manifolds. IJCV 105(2), 171–185 (2013)
28. Gori, P., Colliot, O., Marrakchi-Kacem, L., Worbe, Y., Poupon,
C., Hartmann, A., Ayache, N., Durrleman, S.: A Bayesian Frame-
work for Joint Morphometry of Surface and Curve meshes in
Multi-Object Complexes. Medical Image Analysis 35, 458–474
(2017). DOI 10.1016/j.media.2016.08.011. URL
https://hal.inria.fr/hal-01359423
29. Hinkle, J., Muralidharan, P., Fletcher, P.T., Joshi, S.: Polynomial
regression on Riemannian manifolds. In: European Conference on
Computer Vision, pp. 1–14. Springer (2012)
30. Hirsch, M.W.: Differential topology, vol. 33. Springer Science &
Business Media (2012)
31. Hyvärinen, A., Karhunen, J., Oja, E.: Independent component
analysis, vol. 46. John Wiley & Sons (2004)
32. Jian, B., Vemuri, B.C.: Robust point set registration using Gaussian
mixture models. IEEE Transactions on Pattern Analysis and
Machine Intelligence 33(8), 1633–1645 (2011)
33. Joshi, S.C., Miller, M.I.: Landmark matching via large deforma-
tion diffeomorphisms. IEEE Transactions on Image Processing
9(8), 1357–1370 (2000)
34. Kendall, D.G.: Shape manifolds, procrustean metrics, and com-
plex projective spaces. Bulletin of the London Mathematical So-
ciety 16(2), 81–121 (1984)
35. Kim, H.J., Adluru, N., Suri, H., Vemuri, B.C., Johnson, S.C.,
Singh, V.: Riemannian nonlinear mixed effects models: Analyz-
ing longitudinal deformations in neuroimaging. In: Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition
(CVPR) (2017)
36. Koval, I., Schiratti, J.B., Routier, A., Bacci, M., Colliot, O.,
Allassonnière, S., Durrleman, S., Initiative, A.D.N., et al.: Statistical
learning of spatiotemporal patterns from longitudinal manifold-
valued networks. In: International Conference on Medical Im-
age Computing and Computer-Assisted Intervention, pp. 451–
459. Springer (2017)
37. Kuhn, E., Lavielle, M.: Coupling a stochastic approximation version
of EM with an MCMC procedure. ESAIM: Probability and
Statistics 8, 115–131 (2004)
38. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for
large scale optimization. Mathematical Programming 45(1-3),
503–528 (1989)
39. Lorenzi, M., Ayache, N., Frisoni, G., Pennec, X.: 4D registration
of serial brain’s MR images: a robust measure of changes applied
to Alzheimer’s disease. Spatio Temporal Image Analysis Work-
shop (STIA), MICCAI (2010)
40. Lorenzi, M., Ayache, N., Pennec, X.: Schild’s ladder for the paral-
lel transport of deformations in time series of images. In: Biennial
International Conference on Information Processing in Medical
Imaging, pp. 463–474. Springer (2011)
41. Louis, M., Bône, A., Charlier, B., Durrleman, S.: Parallel transport
in shape analysis: a scalable numerical scheme. In: International
Conference on Geometric Science of Information, pp. 29–37.
Springer (2017)
42. Louis, M., Charlier, B., Jusselin, P., Pal, S., Durrleman, S.: A
fanning scheme for the parallel transport along geodesics on Riemannian
manifolds. SIAM Journal on Numerical Analysis 56(4),
2563–2584 (2018)
43. Manasse, F., Misner, C.W.: Fermi normal coordinates and some
basic concepts in differential geometry. Journal of Mathematical
Physics 4(6), 735–745 (1963)
44. Marin, J.M., Pudlo, P., Robert, C.P., Ryder, R.J.: Approximate
Bayesian computational methods. Statistics and Computing 22(6),
1167–1180 (2012)
45. Marinescu, R.V., Eshaghi, A., Lorenzi, M., Young, A.L., Oxtoby,
N.P., Garbarino, S., Shakespeare, T.J., Crutch, S.J., Alexander,
D.C., Initiative, A.D.N., et al.: A vertex clustering model for dis-
ease progression: application to cortical thickness images. In:
International Conference on Information Processing in Medical
Imaging, pp. 134–145. Springer (2017)
46. Miller, M.I., Trouvé, A., Younes, L.: Geodesic shooting for
computational anatomy. Journal of Mathematical Imaging and Vision
24(2), 209–228 (2006)
47. Muralidharan, P., Fletcher, P.T.: Sasaki metrics for analysis of lon-
gitudinal data on manifolds. In: Computer Vision and Pattern
Recognition (CVPR), 2012 IEEE Conference on, pp. 1027–1034.
IEEE (2012)
48. Nader, C.A., Ayache, N., Robert, P., Lorenzi, M.: Monotonic Gaussian
process for spatio-temporal trajectory separation in brain
imaging data. arXiv preprint arXiv:1902.10952 (2019)
49. Niethammer, M., Huang, Y., Vialard, F.X.: Geodesic regression
for image time-series. In: International Conference on Medical
Image Computing and Computer-Assisted Intervention, pp. 655–
662. Springer (2011)
50. Pennec, X.: Intrinsic statistics on Riemannian manifolds: Basic
tools for geometric measurements. Journal of Mathematical Imag-
ing and Vision 25(1), 127–154 (2006)
51. Pennec, X., Fillard, P., Ayache, N.: A Riemannian framework for
tensor computing. International Journal of Computer Vision 66(1),
41–66 (2006)
52. Schiratti, J.B., Allassonnière, S., Colliot, O., Durrleman, S.:
Learning spatiotemporal trajectories from manifold-valued lon-
gitudinal data. In: C. Cortes, N.D. Lawrence, D.D. Lee,
M. Sugiyama, R. Garnett (eds.) NIPS 28, pp. 2404–2412. Curran
Associates, Inc. (2015)
53. Schiratti, J.B., Allassonnière, S., Colliot, O., Durrleman, S.: A
Bayesian mixed-effects model to learn trajectories of changes from
repeated manifold-valued observations. The Journal of Machine
Learning Research 18(1), 4840–4872 (2017)
54. Singh, N., Hinkle, J., Joshi, S., Fletcher, P.T.: Hierarchical
geodesic models in diffeomorphisms. IJCV 117(1), 70–92 (2016)
55. Stern, Y.: Cognitive reserve and Alzheimer disease. Alzheimer
Disease & Associated Disorders 20(2), 112–117 (2006)
56. Su, J., Kurtek, S., Klassen, E., Srivastava, A., et al.: Statistical
analysis of trajectories on Riemannian manifolds: bird migration,
hurricane tracking and video surveillance. The Annals of Applied
Statistics 8(1), 530–552 (2014)
57. Su, J., Srivastava, A., de Souza, F.D., Sarkar, S.: Rate-invariant
analysis of trajectories on Riemannian manifolds with application
in visual speech recognition. In: Proceedings of the IEEE Confer-
ence on Computer Vision and Pattern Recognition, pp. 620–627
(2014)
58. Sun, Y., Yin, L.: Facial expression recognition based on 3D dynamic
range model sequences. In: European Conference on Computer
Vision, pp. 58–71. Springer (2008)
59. Vaillant, M., Glaunès, J.: Surface matching via currents. In:
Information Processing in Medical Imaging, pp. 1–5. Springer (2005)
60. Woolrich, M.W., Jbabdi, S., Patenaude, B., Chappell, M., Makni,
S., Behrens, T., Beckmann, C., Jenkinson, M., Smith, S.M.:
Bayesian analysis of neuroimaging data in FSL. NeuroImage 45(1),
S173–S186 (2009)
61. Yin, L., Chen, X., Sun, Y., Worm, T., Reale, M.: A high-resolution
3D dynamic facial expression database. In: Automatic Face &
Gesture Recognition, 2008. FG’08. 8th IEEE International Con-
ference on, pp. 1–6. IEEE (2008)
62. Younes, L.: Jacobi fields in groups of diffeomorphisms and appli-
cations. Quarterly of Applied Mathematics 65(1), 113–134 (2007)
63. Younes, L.: Shapes and Diffeomorphisms. Applied Mathematical
Sciences. Springer Berlin Heidelberg (2010). URL
https://books.google.fr/books?id=SdTBtMGgeAUC
64. Zhang, M., Fletcher, P.T.: Finite-dimensional Lie algebras for fast
diffeomorphic image registration. In: International Conference
on Information Processing in Medical Imaging, pp. 249–260.
Springer (2015)
65. Zhang, M., Singh, N., Fletcher, P.T.: Bayesian estimation of reg-
ularization and atlas building in diffeomorphic image registration.
In: IPMI, vol. 23, pp. 37–48 (2013)
A Background: meshes represented as currents
The theory of currents was introduced in [59]. It is
used in this paper to define a distance metric between pairs
of meshes without any assumption on their topology, and in
particular without assuming point-to-point correspondence.
See [14] for more details.
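As an illustration of how such a correspondence-free distance can be evaluated in practice, the following Python sketch computes a discrete currents distance between two triangle meshes. It is a minimal sketch under stated assumptions and not the implementation used in this paper: each triangle is summarized by its center and its area-weighted normal, the kernel is assumed to be Gaussian with a hypothetical width parameter `sigma`, and all function names are illustrative.

```python
import numpy as np


def centers_and_normals(vertices, faces):
    """Return the centers and area-weighted normals of the triangles of a mesh."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    centers = (a + b + c) / 3.0
    # Half the cross product: direction = triangle normal, length = triangle area.
    normals = 0.5 * np.cross(b - a, c - a)
    return centers, normals


def currents_inner_product(c1, n1, c2, n2, sigma):
    """Discrete inner product: sum over k, l of g(c1_k, c2_l) * <n1_k, n2_l>.

    Assumption: g is the Gaussian kernel exp(-|x - x'|^2 / sigma^2).
    """
    sq_dists = ((c1[:, None, :] - c2[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dists / sigma ** 2)
    return float((kernel * (n1 @ n2.T)).sum())


def currents_squared_distance(mesh_a, mesh_b, sigma=1.0):
    """Squared distance |C(a) - C(b)|^2 expanded with the inner product above.

    Each mesh is a (vertices, faces) pair; the two meshes may have different
    numbers of vertices and triangles, since no correspondence is required.
    """
    ca, na = centers_and_normals(*mesh_a)
    cb, nb = centers_and_normals(*mesh_b)
    return (currents_inner_product(ca, na, ca, na, sigma)
            - 2.0 * currents_inner_product(ca, na, cb, nb, sigma)
            + currents_inner_product(cb, nb, cb, nb, sigma))
```

The pairwise kernel matrix makes the cost quadratic in the number of triangles, which is acceptable for small meshes; larger meshes would typically call for a dedicated kernel-product library such as the one described in [13].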
A.1 Continuous theory
Let $y$ be a surface mesh, that we represent as an infinite set
of tuples $(x, n(x))$ where $x$ is a point of $\mathbb{R}^3$, and $n(x)$ the
normal vector of $y$ at this point. Let $g_E : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ be a
positive-definite kernel operator, and $E$ the associated reproducing
kernel Hilbert space. We define the current transform
C(y