
Decentralized Analysis of Brain Imaging Data: Voxel-Based Morphometry and Dynamic Functional Network Connectivity



In the field of neuroimaging, there is growing interest in developing collaborative frameworks that enable researchers to address challenging questions about the human brain by leveraging data across multiple sites all over the world. Efforts are also being directed at developing algorithms that enable collaborative analysis and feature learning from multiple sites without requiring the often large data to be centrally located. In this paper, we propose two new decentralized algorithms: (1) a decentralized regression algorithm for performing a voxel-based morphometry analysis on structural magnetic resonance imaging (MRI) data, and (2) a decentralized dynamic functional network connectivity algorithm, which includes decentralized group ICA and sliding-window analysis of functional MRI data. We compare results against those obtained from their pooled (or centralized) counterparts on the same data, i.e., as if they were at one site. Results produced by the decentralized algorithms are similar to the pooled case and showcase the potential of performing multi-voxel and multivariate analyses of data located at multiple sites. Such approaches enable many more collaborative and comparative analyses in the context of large-scale neuroimaging studies.
published: 27 August 2018
doi: 10.3389/fninf.2018.00055
Frontiers in Neuroinformatics | August 2018 | Volume 12 | Article 55
Edited by:
Sook-Lei Liew,
University of Southern California,
United States
Reviewed by:
Gennady V. Roshchupkin,
Erasmus Medical Center, Erasmus
University Rotterdam, Netherlands
Amir Omidvarnia,
Florey Institute of Neuroscience and
Mental Health, Australia
Harshvardhan Gazula
Bradley T. Baker
Received: 20 March 2018
Accepted: 06 August 2018
Published: 27 August 2018
Gazula H, Baker BT, Damaraju E, Plis SM, Panta SR, Silva RF and Calhoun VD (2018) Decentralized Analysis of Brain Imaging Data: Voxel-Based Morphometry and Dynamic Functional Network Connectivity. Front. Neuroinform. 12:55. doi: 10.3389/fninf.2018.00055
Harshvardhan Gazula 1*, Bradley T. Baker 1,2*, Eswar Damaraju 1,3, Sergey M. Plis 1, Sandeep R. Panta 1, Rogers F. Silva 1 and Vince D. Calhoun 1,3
1 The Mind Research Network, Albuquerque, NM, United States, 2 Department of Computer Science, The University of New Mexico, Albuquerque, NM, United States, 3 Department of Electrical and Computer Engineering, The University of New Mexico, Albuquerque, NM, United States
Keywords: decentralized algorithms, COINSTAC, VBM, dFNC, multi-shot
Innovation and discovery are now often underpinned by the amount of data at one's
disposal, and this has led to a paradigm shift in scientific research, with increasing emphasis on
collaborative data-sharing (Cragin et al., 2010; Tenopir et al., 2011). This growing significance of
data-sharing is more evident in the field of neuroscience where, in the past few years, there has been
a proliferation of efforts (Poldrack et al., 2013) toward enabling researchers to leverage data across
multiple sites. In part, this is due to the fact that collecting neuroimaging data is expensive as well as
time consuming (Landis et al., 2016) and aggregating or sharing data across various sites provides
researchers with an opportunity to uncover important findings that are beyond the scope of the
original study (Poldrack et al., 2013). In addition to making predictions more certain by increasing
the sample size (Button et al., 2013), sharing data ensures reliability and validity of the results, and
safeguards against data fabrication and falsification (Tenopir et al., 2011; Ming et al., 2017).
Gazula et al. Decentralized Analysis of Brain Imaging Data
As mentioned previously, data-specific collaborative efforts
include either aggregating the data via a centralized data-sharing
repository or sharing data via agreement-based collaborations,
i.e., data usage agreements (DUAs) (Thompson
et al., 2014, 2017). However, each methodology has its own set
of barriers. For example, policy or proprietary restrictions or
data re-identification concerns (Sweeney, 2002; Shringarpure and
Bustamante, 2015) might hinder data sharing whereas DUAs
might take months to complete and even if one comes through,
there is no guarantee of the utility of the data until the planned
analysis is performed (Baker et al., 2015; Ming et al., 2017). Other
significant challenges include the storage and computational
resources needed which could prove costly as the volume of the
data shared goes up.
Frameworks such as ENIGMA (Thompson et al., 2014, 2017)
to some extent bypass the need for DUAs by performing a
centrally coordinated analysis at each local site. This enables
potentially large data at each local site to stay put allowing a
greater level of control as well as privacy. Another framework
called ViPAR (Carter et al., 2015) tries to go one step further:
relying on open-source technologies, it completely isolates
the data at the local site and only pools them via transfer to
perform automated statistical analyses. This repeated pooling
of data becomes cumbersome as the number of sites or the
size of the data at each site goes up, and ENIGMA (Thompson
et al., 2014, 2017; Hibar et al., 2015; van Erp et al., 2016)
addresses this issue by pooling local statistical results for
further analysis, also known as meta-analysis (Adams et al.,
2016). However, the heterogeneity among the local analyses
caused by adopting various data collection mechanisms or
preprocessing methods can lead to inaccurate meta-analyses.
Plis et al. (2016) proposed a web-based framework titled
Collaborative Informatics and Neuroimaging Suite Toolkit
for Anonymous Computation (COINSTAC) to address the
aforementioned issues. COINSTAC provides a platform to
analyze data stored locally across multiple organizations without
the need for pooling the data at any point during the analysis.
It is intended to be an ultimate one-stop shop by which
researchers can build any statistical or machine learning model
collaboratively in a decentralized fashion. This framework
implements a message passing infrastructure that will allow large
scale analysis of decentralized data with results on par with those
that would have been obtained if the data were in one place.
Since there is no pooling of data, it also preserves the privacy of
individual datasets.
Some of the decentralized computations discussed in the
literature so far include decentralized regression (Plis et al.,
2016), joint independent component analysis (Baker et al.,
2015), decentralized independent vector analysis (Wojtalewicz
et al., 2017), decentralized neural networks (Lewis et al., 2017),
decentralized stochastic neighbor embedding (Saha et al., 2017)
and many more. To our knowledge, most of these algorithms
have been tested on synthetic data. In this work we present two
new decentralized algorithms that are widely used in a centralized
manner in the imaging community and demonstrate their utility
on real world brain imaging data.
Regression is widely used in neuroimaging studies as it
enables one to regress certain covariates, for example age,
diagnosis, gender or treatment response, to study their effects
on the structure and function of various brain regions. Some
examples of regression related studies in this field include
(Fennema-Notestine et al., 2007) where regression was used
as a validity test in examining the aggregation of structural
imaging across different datasets. In addition, the very successful
ENIGMA studies are mostly using regression analyses for a
small number of variables. Roshchupkin et al. (2016) presented a
framework titled HASE (high-dimensional association analyses)
that is capable of analyzing high-dimensional data at full
resolution, yielding exact association statistics. While single-shot
and multi-shot regression have been presented previously (Plis
et al., 2016), their treatment was cursory in nature, without any
actual consideration of the appropriate gradient descent scheme
or the validity of the methods on real datasets, both of which are
presented in this work.
In this paper, in addition to improving the single-shot
and multi-shot regression, we also present a new variant of
decentralized regression, “decentralized regression with normal
equation,” and extend this work to operate on voxels in an
MRI image, in order to implement a voxel-based morphometry
(VBM) study in a decentralized framework (Ashburner and
Friston, 2000). We implement and evaluate the proposed
decentralized VBM approach on the publicly available MIND
Clinical Imaging Consortium (MCIC) dataset (available via the
COINS data exchange) and contrast the
results obtained with those from pooled/centralized regression to
validate the proof-of-concept.
Another widely utilized method in neuroimaging analysis
is dynamic functional network connectivity (dFNC) (Sakoglu
et al., 2010; Allen et al., 2014). dFNC is an analysis pipeline
for functional magnetic resonance imaging (fMRI) data, which
allows for the identification and analysis of networks of co-
activating brain states. In contrast to static approaches (Smith
et al., 2009), which take the mean connectivity over time-
points, dFNC uses clustering of time varying connectivity
estimates computed from sliding-windows taken over subject
time-courses, thus becoming desirable in experiments where
network connectivity is highly dynamic in the time dimension,
for example in experiments which utilize resting-state fMRI
(Deco et al., 2013; Damaraju et al., 2014).
Importantly, dFNC is focused on time-courses of networks
extracted from a group independent component analysis (ICA),
which is a widely used approach for estimating functional brain
networks (Calhoun and Adali, 2012). As such, to implement
dFNC we also needed to implement a decentralized group ICA.
For collaborative neuroimaging applications, a decentralized
version of dFNC is desirable for many of the same reasons
as regression, and currently, no such decentralized version
exists. Unlike regression, however, the dFNC pipeline consists
of multiple, distinct stages, all of which require decentralization.
In this paper, we present an initial version of decentralized
dFNC by providing decentralized approaches to both the group
spatial independent component analysis (ICA) and K-Means
clustering steps in the pipeline, which, along with additional
preprocessing steps including sliding window correlation, can
be implemented together to perform decentralized dFNC. Our
resulting methods, dgICA, and ddFNC via dK-Means, provide
dynamic connectivity results consistent with established pooled
approaches in the literature, thus representing an important step
toward more exhaustive analysis of the decentralized approaches
to the dFNC pipeline. Our contributions in this paper can thus be
summarized as follows.
1. Development of decentralized regression with normal
equation, improvement of single-shot and multi-shot
regression and their validation on structural MRI data
2. Presentation of a decentralized dynamic functional network
connectivity analysis pipeline and its evaluation on functional
MRI data
2.1. Decentralized VBM (i.e., Voxelwise Decentralized Regression)
Statistical analysis plays a key role in neuroimaging
studies. Researchers often want to characterize the effect of
various factors such as age, gender, disease condition, etc., on the
composition of brain tissue at various regions of the brain.
Voxel-based morphometry (VBM) (Ashburner and Friston, 2000) is
one such approach that facilitates a comprehensive comparison,
via generalized linear modeling, of voxel-wise gray matter
concentration between, for example, different groups. To enable
such statistical assessment on data present at various sites, it is
important to develop decentralized tools. In this section, we first
provide a brief overview of decentralized regression algorithms
(the building blocks of decentralized VBM which is essentially
voxel-wise regression) along with some notation.
The goal of decentralized regression is to fit a linear equation
(given by Equation 1) relating the covariates at $S$ different sites to
the corresponding responses. Assume each site $j$ has data set
$D_j = \{(x_{i,j}, y_i) : i \in \{1, 2, \ldots, s_j\}\}$ where $x_{i,j} \in \mathbb{R}^d$ is a $d$-dimensional
vector of real-valued features, and $y_i$ is a response.

$$y = w^\top x + b \tag{1}$$

We consider fitting the model in Equation 2, where $w$ is given as $[w; b]$ and $x$
as $[x; 1]$:

$$y = w^\top x \tag{2}$$

The vector of regression parameters/weights $w$ is found by
minimizing the sum of the squared error given in Equation (3):

$$F(w) = \sum_{j=1}^{S} \sum_{i=1}^{s_j} \left(y_i - w^\top x_{i,j}\right)^2 \tag{3}$$

The regression objective function is linearly separable, i.e., it
can be written as the sum of local objective functions calculated
at each local site:

$$F(w) = \sum_{j=1}^{S} F_j(w) \tag{4}$$

where

$$F_j(w) = \sum_{i=1}^{s_j} \left(y_i - w^\top x_{i,j}\right)^2 \tag{5}$$

A central aggregator (AGG) is assumed whose role is to compute
the global minimizer $\hat{w}$ of $F(w)$.
2.1.1. Single-Shot Regression
In one approach to solve the decentralized regression problem,
termed the single-shot regression (Plis et al., 2016), each site
jfinds the minimizer ˆwjof the local objective function Fj(w).
This is the same as solving the regression problem at each
local site. Once the regression model at each site is fit, the
weights are sent to the central aggregator (AGG) where they
are aggregated (weighted average) to find the global minimizer
or can be used separately to perform a meta-analysis similar to
those performed in ENIGMA (which, however, uses a manual spreadsheet-based
approach) (Turner et al., 2013; van Erp et al., 2016). The
pseudocode to perform single-shot decentralized regression (Plis
et al., 2016), with a slight modification, is presented here again for
completeness.
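Algorithm 1 Single-shot Regression
Require: Data $D_j$ at site $j$ for sites $j = 1, 2, \ldots, S$, where $|D_j| = s_j$
1: for $j = 1$ to $S$ do
2:   $\hat{w}_j = \arg\min_w F_j(w)$
3:   Node $j$ sends $\hat{w}_j$ to AGG.
4: end for
5: AGG computes $\hat{w} = \frac{1}{\sum_{j=1}^{S} s_j} \sum_{j=1}^{S} s_j \hat{w}_j$ and returns $\hat{w}$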
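The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch on synthetic data, not the authors' implementation: the helper names (`local_fit`, `single_shot`, `make_sites`) and the simulated sites are hypothetical, and each local design matrix is assumed to already carry the trailing column of ones that absorbs the intercept.

```python
import numpy as np

def local_fit(X, y):
    """Ordinary least squares at one site (X already has a column of ones)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def single_shot(sites):
    """Average the local solutions, weighting each site by its sample count."""
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    local_w = np.stack([local_fit(X, y) for X, y in sites])
    return (sizes[:, None] * local_w).sum(axis=0) / sizes.sum()

def make_sites(sizes, w_true, noise=0.1, seed=0):
    """Hypothetical synthetic data: one (X, y) pair per site."""
    rng = np.random.default_rng(seed)
    sites = []
    for s in sizes:
        features = rng.normal(size=(s, len(w_true) - 1))
        X = np.hstack([features, np.ones((s, 1))])  # augmented [x; 1]
        y = X @ w_true + noise * rng.normal(size=s)
        sites.append((X, y))
    return sites

w_true = np.array([2.0, -1.0, 0.5])           # last entry plays the role of b
sites = make_sites([40, 60, 100], w_true)
w_hat = single_shot(sites)                     # only weights and counts leave a site
```

Only the local weight vectors and sample counts are exchanged here, which is why the result is an approximation of the pooled fit rather than an exact match.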
2.1.2. Decentralized Regression With Normal Equation
One limitation of single-shot regression is that the “site” level
covariates cannot be included at each local site as this leads
to collinearity issues. This issue can be offset by utilizing a
decentralized version of the analytical solution to the linear
regression problem. For a standard regression problem of the
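form given in Equation (2), the analytical solution is given as

$$\hat{w} = (x^\top x)^{-1} x^\top y \tag{6}$$

Assuming that the augmented data matrix $x$ is made up of data
from different local sites, i.e.,

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_S \end{bmatrix} \tag{7}$$

it’s easy to see that $\hat{w}$ can be written as

$$\hat{w} = \left( \begin{bmatrix} x_1^\top & \cdots & x_S^\top \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_S \end{bmatrix} \right)^{-1} \begin{bmatrix} x_1^\top & \cdots & x_S^\top \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_S \end{bmatrix} \tag{8}$$

$$\hat{w} = \left( \sum_{j=1}^{S} x_j^\top x_j \right)^{-1} \sum_{j=1}^{S} x_j^\top y_j \tag{9}$$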
The above variant of the analytical solution to a regression model
shows that even if the data reside in different locations, fitting a
global model in the presence of site covariates delivers results that
are identical to the pooled case.
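Algorithm 2 Decentralized Regression with Normal Equation
Require: Data $D_j$ at site $j$ for sites $j = 1, 2, \ldots, S$, where $|D_j| = s_j$
1: for $j = 1$ to $S$ do
2:   Compute $\mathrm{Cov}(X_j) = x_j^\top x_j$
3:   Compute $x_j^\top y_j$
4:   Node $j$ sends $\mathrm{Cov}(X_j)$ and $x_j^\top y_j$ to AGG.
5: end for
6: AGG computes $\hat{w} = \left( \sum_{j=1}^{S} \mathrm{Cov}(X_j) \right)^{-1} \sum_{j=1}^{S} x_j^\top y_j$ and returns $\hat{w}$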
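The claim that this variant reproduces the pooled solution can be checked numerically. The following NumPy sketch uses hypothetical helper names (`local_statistics`, `aggregate`) and synthetic data; it solves the aggregated linear system with `np.linalg.solve` rather than forming an explicit inverse, which is a numerical convenience and not something the text prescribes.

```python
import numpy as np

def local_statistics(X, y):
    """What one site sends: x_j^T x_j and x_j^T y_j (never the raw data)."""
    return X.T @ X, X.T @ y

def aggregate(stats):
    """AGG sums the local statistics and solves the normal equation."""
    gram = sum(g for g, _ in stats)
    moment = sum(m for _, m in stats)
    return np.linalg.solve(gram, moment)

# Hypothetical synthetic sites with 3 covariates plus an intercept column.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5, 3.0])
sites = []
for s in (30, 50, 80):
    X = np.hstack([rng.normal(size=(s, 3)), np.ones((s, 1))])
    y = X @ w_true + 0.1 * rng.normal(size=s)
    sites.append((X, y))

w_dec = aggregate([local_statistics(X, y) for X, y in sites])

# Pooled reference: stack all sites as if the data were in one place.
X_pool = np.vstack([X for X, _ in sites])
y_pool = np.concatenate([y for _, y in sites])
w_pool, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)
```

Up to floating-point error, `w_dec` and `w_pool` agree, which is the equivalence stated above.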
2.1.3. Multi-Shot Regression
Decentralized regression with a normal equation is a convenient
mathematical formulation which produces results that are exactly
the same as those from the pooled regression. However, one
of the biggest drawbacks of the analytical form of regression is that
it becomes computationally expensive to evaluate the inverse
of $x^\top x$ as the number of features in the dataset ($D$) increases.
While in a neuroimaging setting there might not be enough
covariates to make it computationally expensive, it is indeed a
challenge when working with datasets where the cardinality of the
feature set is large (especially in machine learning). One
can overcome this drawback by using an iterative optimization
method in which the local sites and the AGG
communicate repeatedly. This is a type of distributed gradient
descent, and such a regression is termed “multi-shot” regression
(Plis et al., 2016).
For a regression model of the form given in Equation 5, the
gradient update equation (given a learning rate $\eta$) is given as

$$\hat{w}_{t+1} = \hat{w}_t - \eta \cdot \nabla F_j(\hat{w}) \tag{10}$$
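Algorithm 3 Multi-shot Regression
Require: Data $D_j$ at site $j$ for sites $j = 1, 2, \ldots, S$, where $|D_j| = s_j$
Require: Step size $\eta$ (Suggested default: 0.001)
Require: $\beta_1, \beta_2 \in [0, 1)$: Exponential decay rates for the moment
  estimates (Suggested defaults: 0.9 and 0.999, respectively)
Require: Small constant $\delta$ used for numerical stabilization
  (Suggested default: $10^{-8}$)
Require: $\hat{w}_{t-1} \leftarrow 0$ (Initial parameter vector), $m_0 \leftarrow 0$
  (Initialize 1st moment vector), $v_0 \leftarrow 0$ (Initialize 2nd
  moment vector), $t \leftarrow 0$ (Initialize timestep), tolerance Tol
  at AGG
1: while True do
2:   for $j = 1$ to $S$ do
3:     AGG sends $\hat{w}_{t-1}$ to node $j$
4:     Node $j$ computes $\nabla F_j(\hat{w}_{t-1})$
5:     Node $j$ sends $\nabla F_j(\hat{w}_{t-1})$ to AGG.
6:   end for
7:   AGG sets $t \leftarrow t + 1$ and computes $\nabla F_c \leftarrow \sum_{j=1}^{S} \nabla F_j(\hat{w}_{t-1})$
8:   $m_t \leftarrow \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot \nabla F_c$   ▷ update biased first moment estimate
9:   $v_t \leftarrow \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot \nabla F_c^2$   ▷ update biased second moment estimate
10:  $\hat{m}_t \leftarrow m_t / (1 - \beta_1^t)$   ▷ Compute bias-corrected first moment estimate
11:  $\hat{v}_t \leftarrow v_t / (1 - \beta_2^t)$   ▷ Compute bias-corrected second moment estimate
12:  AGG computes $\hat{w}_t \leftarrow \hat{w}_{t-1} - \eta \cdot \hat{m}_t / (\sqrt{\hat{v}_t} + \delta)$   ▷ Update parameters
13:  if $\|\hat{w}_t - \hat{w}_{t-1}\|_2 \le$ Tol then
14:    break
15:  end if
16:  $\hat{w}_{t-1} \leftarrow \hat{w}_t$
17: end while
18: return $\hat{w}_t$ as $\hat{w}$   ▷ Resulting parameters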
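The local gradient $\nabla F_j(\hat{w})$ in Equation (10), obtained by differentiating the local objective in Equation 5, is

$$\nabla F_j(\hat{w}) = -2 \sum_{i=1}^{s_j} \left(y_i - \hat{w}^\top x_{i,j}\right) x_{i,j} \tag{11}$$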
In multi-shot regression, at every time step the AGG sends the
value of $\hat{w}_{t-1}$ to each of the local sites, which then compute their
local gradients $\nabla F_j(\hat{w}_{t-1})$ and send them back to the AGG, which
sums up all the local gradients in order to update the parameter
vector $\hat{w}_t$. The need to sum up all the local gradients is explained
as follows. From Equation (4),

$$F(\hat{w}) = \sum_{j=1}^{S} F_j(\hat{w}) \quad \therefore \quad \nabla F(\hat{w}) = \sum_{j=1}^{S} \nabla F_j(\hat{w}) \tag{12}$$
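To illustrate this using an example, suppose there are 3 sites
($S = 3$) with $s_1$, $s_2$ and $s_3$ samples, respectively, and
$m = s_1 + s_2 + s_3$ samples in total. The global objective function $F(\hat{w})$ can be written
as the sum of the objective functions from each site (because the
objective function is linearly separable) as follows:

$$\begin{aligned}
F(\hat{w}) &= \sum_{j=1}^{m} (y_j - \hat{w}^\top x_j)^2 \\
&= \sum_{j=1}^{s_1} (y_j - \hat{w}^\top x_j)^2 + \sum_{j=s_1+1}^{s_1+s_2} (y_j - \hat{w}^\top x_j)^2 + \sum_{j=s_1+s_2+1}^{m} (y_j - \hat{w}^\top x_j)^2 \\
&= F_1(\hat{w}) + F_2(\hat{w}) + F_3(\hat{w}) \\
\therefore \nabla F(\hat{w}) &= \nabla F_1(\hat{w}) + \nabla F_2(\hat{w}) + \nabla F_3(\hat{w})
\end{aligned} \tag{13}$$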
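From Equation (13), it should be easy to see that the aggregated
gradient is just the sum of the gradients from each site. On the
other hand, if the mean sum of squared errors is preferred, i.e.,
$\frac{1}{m}\sum_{j=1}^{m}(y_j - \hat{w}^\top x_j)^2$, which mathematically has the same
minimizer as $\sum_{j=1}^{m}(y_j - \hat{w}^\top x_j)^2$ since $F(\hat{w})$ is convex, it can be
shown that the aggregated gradient is a weighted average of the
gradients from the local sites, where each $F_j$ now denotes the mean
squared error over the $s_j$ samples at site $j$:

$$\therefore \nabla F(\hat{w}) = \frac{1}{m}\left[ s_1 \nabla F_1(\hat{w}) + s_2 \nabla F_2(\hat{w}) + s_3 \nabla F_3(\hat{w}) \right] \tag{14}$$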
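Both aggregation rules can be verified numerically. The sketch below assumes the sum-of-squares objective of Equation 5 for `grad_sse` and its per-sample mean for `grad_mse`; the function names and the random three-site data are hypothetical and serve only to illustrate the identity.

```python
import numpy as np

def grad_sse(w, X, y):
    """Gradient of sum_i (y_i - w^T x_i)^2 with respect to w."""
    return -2.0 * X.T @ (y - X @ w)

def grad_mse(w, X, y):
    """Gradient of the mean squared error over the rows of X."""
    return grad_sse(w, X, y) / len(y)

# Three hypothetical sites with s_1 = 20, s_2 = 35, s_3 = 45 samples.
rng = np.random.default_rng(2)
sites = [(rng.normal(size=(s, 4)), rng.normal(size=s)) for s in (20, 35, 45)]
w = rng.normal(size=4)

X_pool = np.vstack([X for X, _ in sites])
y_pool = np.concatenate([y for _, y in sites])
m = len(y_pool)

# Sum of local gradients for the summed objective.
sum_of_local = sum(grad_sse(w, X, y) for X, y in sites)

# Sample-size weighted average of local gradients for the mean objective.
weighted_avg = sum(len(y) * grad_mse(w, X, y) for X, y in sites) / m
```

`sum_of_local` matches the pooled gradient of the summed objective, and `weighted_avg` matches the pooled gradient of the mean objective.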
Algorithm 3 shows the steps involved in multi-shot regression.
In order to update the parameters (here, $\hat{w}$), any off-the-shelf
optimization scheme, for example, gradient descent, Adagrad
(Duchi et al., 2011), Adadelta (Zeiler, 2012), momentum gradient
descent (Rumelhart et al., 1986), Nesterov accelerated gradient
descent (Nesterov et al., 1983), or Adam (Kingma and Ba, 2014),
could have been used. The choice of scheme could
depend on the data being analyzed. Moreover, additional
consideration has to be given to the stopping criterion
tolerance, the number of iterations, the choice of learning rate
and any other hyper-parameters depending on the
scheme utilized. In some cases, the choice of optimization scheme
can result in an analysis which could take minutes, days or years
to converge. In our tests, we found that the Adam optimization
scheme performs extremely well on the real dataset, and hence it has
been adopted to perform the multi-shot regression.
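A compact NumPy sketch of multi-shot regression with Adam at the aggregator follows. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the local gradient assumes the sum-of-squares objective, a `max_iter` safeguard is added that the pseudocode does not have, and the demonstration call uses a larger step size (0.01) than the suggested default so that the toy problem converges quickly.

```python
import numpy as np

def local_gradient(w, X, y):
    """Gradient of the local sum of squared errors at one site."""
    return -2.0 * X.T @ (y - X @ w)

def multi_shot_adam(sites, eta=0.001, beta1=0.9, beta2=0.999,
                    delta=1e-8, tol=1e-6, max_iter=20000):
    """AGG-side Adam loop; sites only ever return gradients."""
    dim = sites[0][0].shape[1]
    w_prev = np.zeros(dim)
    m = np.zeros(dim)
    v = np.zeros(dim)
    for t in range(1, max_iter + 1):             # t starts at 1 for bias correction
        grad = sum(local_gradient(w_prev, X, y) for X, y in sites)
        m = beta1 * m + (1 - beta1) * grad       # biased first moment
        v = beta2 * v + (1 - beta2) * grad ** 2  # biased second moment
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w = w_prev - eta * m_hat / (np.sqrt(v_hat) + delta)
        if np.linalg.norm(w - w_prev) <= tol:    # stopping criterion
            return w, t
        w_prev = w
    return w_prev, max_iter

# Hypothetical synthetic sites: two covariates plus an intercept column.
rng = np.random.default_rng(3)
w_true = np.array([1.5, -2.0, 0.5])
sites = []
for s in (40, 60, 80):
    X = np.hstack([rng.normal(size=(s, 2)), np.ones((s, 1))])
    sites.append((X, X @ w_true + 0.1 * rng.normal(size=s)))

w_ms, n_iter = multi_shot_adam(sites, eta=0.01)

X_pool = np.vstack([X for X, _ in sites])
y_pool = np.concatenate([y for _, y in sites])
w_pool, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)
```

The multi-shot estimate approaches the pooled least-squares solution to within a tolerance governed by the step size and stopping criterion, rather than matching it exactly as the normal-equation variant does.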
2.1.4. Other Statistics
In addition to generating the weights of the covariates (regression
parameters), one would also be interested in determining the
overall model performance given by goodness-of-fit or the
coefficient of determination ($R^2$) as well as the statistical
significance of each weight parameter ($t$-value or $p$-value).
As demonstrated in Algorithm 4 (Ming et al., 2017),
determining $R^2$ entails calculating the sum-square-of-errors
(SSE) as well as the total sum of squares (SST), which are evaluated at
each local site and then aggregated at the global site to evaluate $R^2$,
given by $1 - \mathrm{SSE}/\mathrm{SST}$. An intermediary step before the calculation
of SST is the calculation of the global $\bar{y}$, which is determined by
taking an average of the local $\bar{y}_j$ weighted by the size of the
data at each local site.
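Algorithm 4 Decentralized $R^2$ calculation
Require: Data $D_j$ at site $j$ for sites $j = 1, 2, \ldots, S$, where $|D_j| = s_j$
1: AGG sends $\hat{w}$ to each local site.
2: for $j = 1$ to $S$ do
3:   Node $j$ computes $\bar{y}_j = \frac{1}{s_j} \sum_{i=1}^{s_j} y_i$
4:   Node $j$ sends $\bar{y}_j$ and $s_j$ to AGG.
5: end for
6: AGG computes $\bar{y} = \frac{\sum_{j=1}^{S} s_j \bar{y}_j}{\sum_{j=1}^{S} s_j}$   ▷ global mean
7: AGG sends $\bar{y}$ to local sites
8: for $j = 1$ to $S$ do
9:   $\mathrm{SST}_j = \sum_{i=1}^{s_j} (y_i - \bar{y})^2$
10:  $\hat{y}_{i,j} = \hat{w}^\top x_{i,j}$ for $i = 1, \ldots, s_j$
11:  $\mathrm{SSE}_j = \sum_{i=1}^{s_j} (y_i - \hat{y}_{i,j})^2$
12:  Node $j$ sends $\mathrm{SST}_j$ and $\mathrm{SSE}_j$ to AGG
13: end for
14: AGG computes $\mathrm{SST} \leftarrow \sum_{j=1}^{S} \mathrm{SST}_j$ and $\mathrm{SSE} \leftarrow \sum_{j=1}^{S} \mathrm{SSE}_j$
15: return $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$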
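The two communication rounds of the decentralized $R^2$ calculation can be sketched as follows. The function name `decentralized_r2` and the synthetic sites are hypothetical; the weight vector is taken from a pooled fit purely to have something to evaluate.

```python
import numpy as np

def decentralized_r2(sites, w):
    """Two rounds: local means first, then local SST/SSE around the global mean."""
    # Round 1: each site reports its local mean and sample count.
    means = [(y.mean(), len(y)) for _, y in sites]
    total = sum(s for _, s in means)
    y_bar = sum(s * mu for mu, s in means) / total   # global mean at AGG
    # Round 2: each site reports SST_j and SSE_j computed with the global mean.
    sst = sum(np.sum((y - y_bar) ** 2) for _, y in sites)
    sse = sum(np.sum((y - X @ w) ** 2) for X, y in sites)
    return 1.0 - sse / sst

# Hypothetical synthetic sites.
rng = np.random.default_rng(4)
w_true = np.array([1.0, -0.5, 2.0])
sites = []
for s in (25, 40, 70):
    X = np.hstack([rng.normal(size=(s, 2)), np.ones((s, 1))])
    sites.append((X, X @ w_true + 0.5 * rng.normal(size=s)))

X_pool = np.vstack([X for X, _ in sites])
y_pool = np.concatenate([y for _, y in sites])
w_hat, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)

r2_dec = decentralized_r2(sites, w_hat)

# Pooled reference value.
resid = y_pool - X_pool @ w_hat
r2_pool = 1.0 - (resid @ resid) / np.sum((y_pool - y_pool.mean()) ** 2)
```

Because SST is computed around the global mean rather than each local mean, the decentralized value coincides with the pooled one.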
Algorithm 5 (Ming et al., 2017) details the steps involved
in calculating the $t$-values (and therefore $p$-values) of each
regression parameter. Assuming the weight vector has been
calculated using either the single-shot or multi-shot regression,
the global weight vector ($\hat{w}$) is sent to each of the local sites,
where the local covariance matrix as well as the sum-square-of-errors
is calculated and sent back along with the data size to
the aggregator (AGG), which then utilizes that information to
calculate the $t$-values for each parameter (or coefficient). Once
the $t$-values have been calculated, the corresponding two-tailed
$p$-values can be deduced using any publicly available distributions
library.
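Algorithm 5 Decentralized $t$-value calculation
Require: Data $D_j$ at site $j$ for sites $j = 1, 2, \ldots, S$, where $|D_j| = s_j$
1: AGG sends $\hat{w}$ to each local site.
2: for $j = 1$ to $S$ do
3:   $\hat{y}_{i,j} = \hat{w}^\top x_{i,j}$ for $i = 1, \ldots, s_j$
4:   $\mathrm{SSE}_j = \sum_{i=1}^{s_j} (y_i - \hat{y}_{i,j})^2$
5:   $\mathrm{Cov}(x_j) = x_j^\top x_j$
6:   Node $j$ sends $\mathrm{SSE}_j$, $\mathrm{Cov}(x_j)$ and $s_j$ to AGG.
7: end for
8: AGG computes $\mathrm{Cov}(x) \leftarrow \sum_{j=1}^{S} \mathrm{Cov}(x_j)$,
   $\mathrm{MSE} \leftarrow \frac{\sum_{j=1}^{S} \mathrm{SSE}_j}{\sum_{j=1}^{S} s_j - (d+1)}$   ▷ denominator assumed to be the usual residual degrees of freedom
   $\mathrm{SE}(\hat{w}) \leftarrow \sqrt{\mathrm{diag}\left(\mathrm{MSE} \cdot \mathrm{Cov}(x)^{-1}\right)}$,
   $t \leftarrow \hat{w} / \mathrm{SE}(\hat{w})$
9: return $t$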
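A NumPy sketch of the aggregation behind the $t$-values is given below. The helper names are hypothetical, and the degrees of freedom are assumed to be the total sample count minus the number of fitted parameters (the standard ordinary least squares convention); that choice is an assumption of this sketch.

```python
import numpy as np

def local_parts(X, y, w):
    """What one site returns: SSE_j, x_j^T x_j and its sample count s_j."""
    resid = y - X @ w
    return resid @ resid, X.T @ X, len(y)

def decentralized_t(sites, w):
    """AGG combines the local parts into standard errors and t-values."""
    parts = [local_parts(X, y, w) for X, y in sites]
    sse = sum(p[0] for p in parts)
    gram = sum(p[1] for p in parts)
    n = sum(p[2] for p in parts)
    dof = n - len(w)                      # assumed residual degrees of freedom
    mse = sse / dof
    se = np.sqrt(np.diag(mse * np.linalg.inv(gram)))
    return w / se, dof

# Hypothetical synthetic sites.
rng = np.random.default_rng(6)
w_true = np.array([0.8, 0.0, 1.2])
sites = []
for s in (30, 45, 60):
    X = np.hstack([rng.normal(size=(s, 2)), np.ones((s, 1))])
    sites.append((X, X @ w_true + 0.3 * rng.normal(size=s)))

X_pool = np.vstack([X for X, _ in sites])
y_pool = np.concatenate([y for _, y in sites])
w_hat, *_ = np.linalg.lstsq(X_pool, y_pool, rcond=None)

t_dec, dof = decentralized_t(sites, w_hat)

# Pooled reference using the same formula on the stacked data.
resid = y_pool - X_pool @ w_hat
mse_pool = (resid @ resid) / (len(y_pool) - len(w_hat))
t_pool = w_hat / np.sqrt(np.diag(mse_pool * np.linalg.inv(X_pool.T @ X_pool)))
```

Two-tailed $p$-values could then be obtained from a Student $t$ distribution with `dof` degrees of freedom, for instance with `scipy.stats.t.sf` if SciPy is available.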
2.1.5. Bandwidth and Complexity
For single-shot regression, each site communicates a local weight
vector $\hat{w}_j$ of size $(d+1)$ to the aggregator in addition to the
cardinality of the dataset at each site, $|D_j| = s_j$, a scalar. Once
all the information is aggregated, a weighted average of the local
$\hat{w}_j$'s, with the weights being $s_j$, is performed to get the global weight
vector $\hat{w}$. Assuming $s_j > d$ and that the normal equation is used
to get the local weight vectors $\hat{w}_j$, the computational complexity at each site
is $O(d^2 s_j)$, whereas the computational complexity of calculating
the weighted average at the AGG is $O(d)$.
In the case of decentralized regression with normal equation,
the first step (at each site) includes the calculation of $x^\top x$ (at
$O(d^2 s_j)$) and $x^\top y$ (at $O(d s_j)$) with an overall complexity of
$O(d^2 s_j)$. Each of the $S$ sites communicates these two quantities
to the AGG, where they are aggregated (as shown
in Algorithm 2) to obtain the global weight vector $\hat{w}$ at $O(d^3)$.
In contrast to single-shot regression and decentralized regression
with normal equation (DRNE), where the computation starts at the local sites,
the computation/communication starts
from the AGG in multi-shot regression. The AGG initializes
$\hat{w}$ and communicates the $(d+1)$-sized vector to each of the $S$
sites. At every iteration, each site $j$ then calculates the gradient
vector ($O(d)$) and sends it back to the AGG, which again means
a communication of $S \times (d+1)$ accounting for $S$ sites. At the
AGG, steps 7 through 12 (refer to Algorithm 3) are performed at
an order of $O(d)$, and the updated weights are again sent back to each of the local
sites, implying a communication of $S \times d$, for the next iteration
of the gradient descent.
The above treatment of communication bandwidth and
complexity is subject to certain considerations, viz., the number of
covariates, the number of samples at each site, the optimization
scheme used in the calculation of $x^\top x$, the stopping criterion, etc.
2.2. Decentralized dFNC
In this section, we briefly present our initial work toward
performing dynamic functional network connectivity (dFNC)
analysis in a decentralized framework. As mentioned earlier,
dFNC is a multi-step pipeline that finds common states in subject
fMRI time-courses (TCs), and is often done by clustering
sliding windows over subject time-courses (e.g., Allen
et al., 2014; Damaraju et al., 2014). Thus, we present methods
for decentralized spatial ICA along with decentralized K-Means
clustering. Our presentation here is by no means a rigorous take
on dFNC, which we save for future work.
2.2.1. Decentralized Group Spatial ICA
Following preprocessing, the first step in the dFNC pipeline
is group ICA (Calhoun et al., 2001). Since we are dealing
with fMRI data, suppose that we now have data $X \in \mathbb{R}^{d \times N}$, where
$d$ is the voxel-space of the data (in brain voxels), and $N$ is the
total number of time-points across all subjects in the network. In
linear spatial ICA, we model each individual subject as a mixture
of $r$ statistically independent spatial maps, $A \in \mathbb{R}^{d \times r}$, and
their time-courses, $S \in \mathbb{R}^{r \times N_i}$, where $N_i$ is the length of the
time-course belonging to subject $i$. In the decentralized case, we can
model the global data set $X$ as the column-wise concatenation of
$s$ sites in the temporal dimension, where each site is modeled as a
set of subjects concatenated in the temporal dimension:

$$X = [A_1 S_1 \;\; A_2 S_2 \;\; \cdots \;\; A_s S_s] \in \mathbb{R}^{d \times N}.$$

Our goal is to learn a global unmixing matrix, $W$, such
that $XW \approx \hat{A}$, where $\hat{A} \in \mathbb{R}^{d \times r}$ is a set of unmixed
spatially independent components. To this end, we perform a
decentralized group independent component analysis (dgICA).
Our method consists first of the two-stage GlobalPCA procedure
utilized in Baker et al. (2015). In this procedure, each site first
performs subject-specific LocalPCA dimension-reduction and
whitening to a common $k$ principal components in the temporal
dimension. A decentralized second stage then produces a global
set of $r$ spatial eigenvectors, $V \in \mathbb{R}^{r \times d}$. As outlined in Baker
et al. (2015), this second stage has sites pass locally-reduced
eigenvectors to other sites in a peer-to-peer scheme, where
upon receiving a set of eigenvectors, a site stacks them
in the column dimension and performs a further reduction of
the stacked matrix, which is then passed to the next peer in
the network. This process iterates until the global eigenvectors
reach some aggregator (AGG), or otherwise terminal site in the
network.
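The stack-and-reduce idea behind the peer-to-peer second stage can be illustrated with a simplified NumPy sketch. It is not the GlobalPCA procedure of Baker et al. (2015): the subject-level stage and the whitening are omitted, the reduced matrices are assumed to be singular vectors scaled by their singular values, and the function names and synthetic rank-$r$ data are hypothetical.

```python
import numpy as np

def reduce_columns(M, k):
    """Top-k left singular vectors of M, scaled by their singular values."""
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * s[:k]

def chained_global_pca(site_data, k2, r):
    """Each site stacks the incoming reduced matrix with its own and reduces again."""
    running = reduce_columns(site_data[0], k2)
    for X in site_data[1:]:
        stacked = np.hstack([running, reduce_columns(X, k2)])
        running = reduce_columns(stacked, k2)    # passed on to the next peer
    U, _, _ = np.linalg.svd(running, full_matrices=False)
    return U[:, :r].T                             # r x d, rows are eigenvectors

# Synthetic sites sharing r spatial maps (exactly rank-r, noise free).
rng = np.random.default_rng(5)
d, r, k2 = 50, 3, 5
A = rng.normal(size=(d, r))
site_data = [A @ rng.normal(size=(r, n)) for n in (40, 60, 30)]

V = chained_global_pca(site_data, k2, r)

# Pooled reference: PCA of the column-wise concatenation of all sites.
U_pool, _, _ = np.linalg.svd(np.hstack(site_data), full_matrices=False)
V_pool = U_pool[:, :r].T
```

On this noise-free example the chained and pooled eigenvectors span the same subspace, which can be seen by comparing the projectors `V.T @ V` and `V_pool.T @ V_pool`; with noisy data and truncation the agreement is only approximate.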
Algorithm 6 Decentralized group ICA algorithm (dgICA)
Require: $s$ sites with data $\{X_i \in \mathbb{R}^{d \times N_i} : i = 1, 2, \ldots, s\}$, intended
  final rank $r$, local site rank $k_2 \ge r$, local subject rank $k_1$.
1: for all sites $i = 1, 2, \ldots, s$ do
2:   Perform LocalPCA (Baker et al., 2015) on each site to obtain
     $k_1$ eigenvectors for each subject.
3:   Perform LocalPCA (Baker et al., 2015) on concatenated
     subjects to obtain $k_2$ eigenvectors at each site.
4:   Reduce local data set to $X_{i,\mathrm{red}} \in \mathbb{R}^{d \times k_2}$
5: end for
6: Perform GlobalPCA (Baker et al., 2015) to obtain $r$ global
   eigenvectors, $V$, at the aggregator.
7: On the aggregator, perform ICA to obtain global unmixing
   matrix, $W$.
The aggregator site then performs whitening on these
resulting eigenvectors, and runs a local ICA algorithm, such
as infomax ICA (Bell and Sejnowski, 1995), to produce the
spatial unmixing matrix, $W$. The global spatial eigenvectors,
$V$, are then unmixed with $W$ to produce $\hat{A}$,
which is shared across the decentralized network. Each site then
uses this unmixing matrix to produce individual time-courses
for each $i$-th subject from that subject's data $X_i$. Each site can
then perform a spatio-temporal regression back-reconstruction
approach (Calhoun et al., 2001; Erhardt et al., 2011) to produce
subject-specific spatial maps.
2.2.2. Decentralized Clustering
In order to perform dFNC in a decentralized paradigm, we
first require a notion of decentralized clustering. Following
the precedent of previous work in dFNC, we focus first on
decentralized K-Means optimization, for which there exist a
number of pre-established methods for decentralization. A
number of methods utilize some manner of weighted centroid
averaging, where each site in the network broadcasts updated
centroids to an aggregator node which then computes the merged
centroids, and rebroadcasts them to the local sites (Forman and
Zhang, 2000; Dhillon and Modha, 2000; Jagannathan and Wright,
2005), though completely peer-to-peer approaches have also been
proposed (Datta et al., 2006, 2009), as well as methods robust to
asynchronous updates (Di Fatta et al., 2013). Though we have
not found any methods which do this, methods which compute
K-Means via gradient descent (Bottou, 2010) are also amenable
to decentralization (Yuan et al., 2016). For simplicity’s sake, we
take the approach of centroid-averaging outlined in Dhillon and
Modha (2000), and leave rigorous presentation and comparison
of the remaining methods as future work.
To perform clustering for distributed dFNC, we first have
each site separate its subjects into sliding-window time-courses,
where the window length is fixed across the decentralized
network. Additionally, initial clustering was performed on a
subset of windows from each subject, corresponding to windows
of maximal variability in correlation across component pairs. To
obtain these exemplars, each site computes the variance of dynamic
connectivity across all pairs of components at each window.
We then select windows corresponding to local maxima in this
variance time-course. This resulted in an average of 8 exemplar
windows per subject. We then perform decentralized K-Means
on the exemplars to obtain a set of centroids, which are shared
across the decentralized network and fed into a second
stage of K-Means clustering.

Algorithm 7 Decentralized dFNC algorithm (ddFNC)
Require: $s$ sites with data $\{X_i \in \mathbb{R}^{d \times N_i} : i = 1, 2, \ldots, s\}$, window size
  $t$, number of clusters $k$.
1: Perform dgICA to obtain $W$, the global unmixing matrix, broadcast to sites.
2: for all sites $i = 1, 2, \ldots, s$ do
3:   Back-reconstruct subject TCs
4:   Using sliding window of size $t$, obtain $r \times r$ covariance matrices
5:   Obtain exemplar covariance matrices (Damaraju et al., 2014)
6: end for
7: Run K-Means on exemplar covariance matrices to obtain $k$
   initial centroids, $C_0$.
8: Run K-Means with initial clusters $C_0$ to obtain $k$ centroids $C$,
   and clustering assignment for each instance, $L$.
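The local windowing and exemplar-selection steps can be sketched as follows for a single subject. This is a simplified illustration: it uses plain rectangular windows with a step of one time-point and Pearson correlation via `np.corrcoef`, which are assumptions of the sketch, and the function names and random time-courses are hypothetical.

```python
import numpy as np

def sliding_window_fnc(tc, win):
    """tc is an r x T array of component time-courses.

    Returns one row per window holding the upper-triangular
    pairwise correlations between components in that window.
    """
    r, T = tc.shape
    iu = np.triu_indices(r, k=1)
    rows = [np.corrcoef(tc[:, start:start + win])[iu]
            for start in range(T - win + 1)]
    return np.array(rows)

def exemplar_indices(fnc):
    """Windows where the variance across component pairs has a local maximum."""
    var = fnc.var(axis=1)
    is_peak = (var[1:-1] > var[:-2]) & (var[1:-1] > var[2:])
    return np.flatnonzero(is_peak) + 1

# Hypothetical subject: r = 4 components, T = 120 time-points, window of 30.
rng = np.random.default_rng(7)
tc = rng.normal(size=(4, 120))
fnc = sliding_window_fnc(tc, win=30)
idx = exemplar_indices(fnc)
exemplars = fnc[idx]          # rows that would enter the first K-Means stage
```

Each site would run this on its own subjects and contribute only the exemplar rows to the first clustering stage.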
For the second stage of decentralized clustering, at each
iteration, each site computes updated centroids according to
Dhillon and Modha (2000), which corresponds to a local
K-Means update. These local centroids are then sent to the
aggregator node, which computes the weighted average of
these updated centroids, and re-broadcasts the updated global
centroids until convergence.
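The iteration described above can be sketched as follows. This is a minimal illustration, assuming squared Euclidean distance (the experiments in this paper use correlation distance) and assuming that the aggregator weights each local centroid by the number of instances assigned to it at that site; the function names and synthetic data are hypothetical.

```python
import numpy as np

def local_step(X, centroids):
    """Local K-Means update at one site: returns local centroids and cluster counts."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    local = centroids.copy()
    counts = np.zeros(len(centroids))
    for j in range(len(centroids)):
        mask = labels == j
        counts[j] = mask.sum()
        if counts[j] > 0:
            local[j] = X[mask].mean(axis=0)
    return local, counts

def aggregate(local_centroids, local_counts, previous):
    """Aggregator: count-weighted average of local centroids (assumed weighting)."""
    c = np.stack(local_centroids)          # (s, k, p)
    n = np.stack(local_counts)             # (s, k)
    total = n.sum(axis=0)
    weighted = (c * n[:, :, None]).sum(axis=0)
    new = previous.copy()                  # clusters empty at all sites keep their centroid
    nonempty = total > 0
    new[nonempty] = weighted[nonempty] / total[nonempty, None]
    return new

def dkmeans(sites, centroids, max_iter=100, tol=1e-8):
    """Repeat local update, aggregation and re-broadcast until centroids stabilize."""
    for _ in range(max_iter):
        results = [local_step(X, centroids) for X in sites]
        new = aggregate([r[0] for r in results], [r[1] for r in results], centroids)
        if np.abs(new - centroids).max() < tol:
            return new
        centroids = new
    return centroids

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.3, size=(60, 2)) for c in (0.0, 3.0, 6.0)])
rng.shuffle(data)
sites = [data[:90], data[90:]]
init = data[:3].copy()
decentralized = dkmeans(sites, init)
pooled = dkmeans([data], init)             # same routine with all data at one site
```

With count weighting, the averaged centroid equals the mean over all instances assigned to that cluster across sites, which is why the decentralized and pooled runs agree from the same initialization.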
2.2.3. Bandwidth and Complexity
To compute the communication and complexity for ddFNC, we
separately analyse the novel component algorithms of dgICA and dK-Means.
For decentralized group ICA, the communication of the
algorithm is closely related to the communication of GlobalPCA.
In the GlobalPCA algorithm given in Baker et al. (2015),
each site communicates a $d \times r$ matrix of eigenvectors to
the subsequent site until the aggregator is reached. After the
aggregator performs ICA to obtain the global unmixing matrix,
W, this matrix is broadcast to all other sites in the network.
Thus, for a single, non-aggregator site, the total communication
for dgICA is exactly $d \times r + r^2$. At the aggregator, the total
communication is exactly $d \times r + r^2 \times s$ if the unmixing matrix
is broadcast directly to each node. Of course, this cost could be
mitigated by following a peer-to-peer communication scheme,
and having other non-aggregator sites broadcast the unmixing
matrix as well.
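To make these communication counts concrete, the short sketch below evaluates both expressions. The function name and the values of d, r and s are hypothetical and chosen only for illustration; the counts are in matrix entries, not bytes.

```python
def dgica_communication(d, r, s):
    """Return (non-aggregator, aggregator) communication in matrix entries."""
    per_site = d * r + r ** 2          # d x r eigenvectors plus the r x r unmixing matrix
    aggregator = d * r + r ** 2 * s    # unmixing matrix broadcast directly to s sites
    return per_site, aggregator

# Hypothetical sizes: 60,000 voxels, 100 components, 2 sites.
per_site, aggregator = dgica_communication(d=60000, r=100, s=2)
```

In this example the $d \times r$ term dominates, so the direct broadcast of W adds comparatively little to the aggregator's load.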
Next, we can compute the overall complexity of dgICA as the
total complexity of local site operations. Consider an individual
site, i, with m subjects, where the concatenated matrix is given
as $X_i \in \mathbb{R}^{d \times N_i}$. In general, the complexity of SVD on the
$N_i \times N_i$ covariance matrix is $O(N_i^3)$, though this can be improved
upon by using iterative methods, such as the MATLAB svds
function. Thus, the complexity for the two-stage LocalPCA
computation on one site is $O(2N_i^3)$. The per-site complexity for
GlobalPCA is given as the complexity of an SVD computed on a
$d \times d$ covariance matrix, which is created by concatenating the $k_2$
eigenvectors from the previous site; i.e., the per-site complexity
for GlobalPCA is $O(d^3)$. Finally, the complexity of ICA is exactly
equal to the number of ICA iterations, J, which depends heavily
on the choice of ICA algorithm, and hyper-parameter selection
(see Bell and Sejnowski, 1995 for more details on the complexity
of Infomax, for example). Thus, the total per-site complexity for
dgICA is $O(N_i^3)$ for non-aggregator nodes, and $O(N_i^3 + J)$
on the aggregator node. The overall runtime of dgICA is
thus dependent on the computational resources available at each
site, as well as the computational resources and ICA parameters
chosen by the aggregator site.
Prior to performing K-Means, each site i computes $N_{i,j} - w$
windowed time-courses of length w on each subject j, computing
the rank-r covariance matrix for those windows. Thus, if there are
$m_i$ subjects at site i, the local complexity is $O(m_i (N - w) r^3)$ for
this operation. No inter-site communication occurs during this stage.
For decentralized K-Means, the communication between sites
depends on the number of “K-Means iterations,” J, i.e., the
number of iterations required for the centroids to stabilize. J
depends heavily on the initial centroids, the distance metric used,
the distribution of the global data set, and other factors which
make it difficult to compute exactly for arbitrary data. In each
iteration of decentralized K-Means, we communicate k
centroids, each in $\mathbb{R}^{r^2}$, for an average communication of $r^2 \cdot k \cdot J$
from the sites to the aggregator. The aggregator, then, performs
a total of $r^2 \cdot k \cdot J \cdot s$ communication (Dhillon and Modha,
2000), which, again, could be mitigated by passing centroids to
intermediate sites, provided those sites can be trusted with the
centroid information.
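As a worked example of these counts, the sketch below uses r = 47 components and k = 5 clusters, as in the experiments reported later in this paper, together with a hypothetical J = 30 iterations and s = 2 sites. The function name is an assumption, and the counts are in matrix entries.

```python
def dkmeans_communication(r, k, J, s):
    """Return (per-site, aggregator) communication for decentralized K-Means."""
    per_site = r ** 2 * k * J      # k centroids of r^2 entries, sent J times
    aggregator = per_site * s      # the aggregator exchanges centroids with s sites
    return per_site, aggregator

# J = 30 and s = 2 are hypothetical; J is data-dependent in practice.
per_site_km, aggregator_km = dkmeans_communication(r=47, k=5, J=30, s=2)
```

Because J cannot be known in advance, these figures are best read as a per-iteration cost multiplied by an estimate of the iterations needed.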
The time complexity of decentralized K-Means is described
in Dhillon and Modha (2000). At each site, the distance
and centroid recalculation computations come out to a per-site
complexity of $O((3kr^2 + M_i k + M_i r^2 + kr^2) \cdot J)$ (Dhillon and
Modha, 2000), where $M_i$ is the number of instances at site i.
The total number of computations consists of the sum of these
site-wise complexities, and the centroid-averaging step with a
complexity of $O(kr^2)$, for a total of $O((3kr^2 + Mk + Mr^2 + kr^2) \cdot J)$,
where M is the total number of data instances in the decentralized network.
Since dK-Means is computed twice for full ddFNC, once on
the exemplars, and once on the global set of subject windows, the
complete complexity of the clustering stage of the algorithm is
given as the dK-Means complexity for $M = \sum_i E_i$ added to the
dK-Means complexity for $M = \sum_i m_i$, i.e., the expression above
evaluated at each of these two values of M and summed.
The overall site-wise complexity and communication for
ddFNC is just the sum of the site-wise communication and
complexities for each of the stages described here. In the
paradigm described here, the communication and complexity
on the aggregator is generally more demanding than that on
the individual sites, which makes sense for cases where the
aggregator has sufficient and reliable network and hardware
resources. In cases where this is not necessarily true, some
of the aggregation tasks can be distributed to other sites in
the network, thus reducing communication and complexity on
the final aggregator. In the dgICA algorithm, performing ICA
on the aggregator may become a bottleneck if the aggregator
does not have sufficient computational resources to perform a
standard run of ICA; however, this problem could be mitigated
by performing a hardware check on sites in the consortium,
and assigning the role of aggregator dynamically based on
availability of computational resources. For more discussion of
the particularities of network communication and other issues
which may arise in decentralized frameworks like the one used
for ddFNC, see Plis et al. (2016).
3.1. Structural MRI for Decentralized VBM
As part of validating the proof-of-concept, we applied
decentralized VBM to brain structure data collected on
chronic schizophrenic patients and healthy controls. Specifically,
the data comes from the Mind Clinical Imaging Consortium
(MCIC) collection, a publicly accessible, online data repository
containing curated anatomical and functional MRI, in addition
to other data, collected from individuals with and without
a schizophrenia spectrum disorder (Gollub et al., 2013) and
available via the COINS data exchange
(Scott et al., 2011).
Although more information about the MCIC can be found in
Gollub et al. (2013), here we will report numbers for the final
data used in this study as some subjects were excluded during the
preprocessing phase. The final cohort for whom data are available
includes 146 patients and 160 controls with site distribution
as follows: Site B (IA) 40 patients/67 controls; Site D (MGH)
32/23; Site C (UMN) 32/26; Site A (UNM) 42/44, respectively.
All subjects provided informed consent to participate in the study
that was approved by the human research committees at each of
the sites.
Briefly, T1-weighted structural MRI (sMRI) images were
acquired with the following scan parameters: TR = 2,530 ms for
3 T, TR = 12 ms for 1.5 T; TE = 3.79 ms for 3 T, TE = 4.76 ms for
1.5 T; FA = 7° for 3 T, FA = 20° for 1.5 T; TI = 1,100 ms for 3 T;
Bandwidth = 181 for 3 T, Bandwidth = 110 for 1.5 T; voxel size =
0.625 × 0.625 mm; slice thickness 1.5 mm; FOV = 16–18 cm.
The T1-weighted sMRI data were preprocessed using
the Statistical Parametric Mapping software using unified
segmentation (Ashburner and Friston, 2005), in which image
registration, bias correction and tissue classification were
performed using a single integrated algorithm resulting in
individual brains segmented into gray matter, white matter and
cerebrospinal fluid and nonlinearly warped to the Montreal
Neurological Institute (MNI) standard space. The resulting gray
matter concentration (GMC) images were re-sliced to 2 × 2 ×
2 mm, resulting in 91 × 109 × 91 voxels. Although one can
obtain both modulated (Jacobian corrected) and unmodulated
gray matter segmentations, in this study, we use unmodulated
GMC maps to test our regression models.
To test the decentralized regression on the MCIC data
described in the previous paragraph, we regress the voxel
intensities (600,000 voxels) on the age, diagnosis, gender and
site covariates. All the decentralized computations discussed
here were performed on a single machine.
3.2. Functional MRI for dFNC
To evaluate ddFNC, we utilize imaging data from Damaraju
et al. (2014) collected from 163 healthy controls (117 males, 46
females; mean age: 36.9 years) and 151 age- and gender-matched
patients with schizophrenia (114 males, 37 females; mean age:
37.8 years), for a total of 314 subjects.
The scans were collected during an eyes-closed resting-state fMRI
protocol at 7 different sites across the United States and passed data
quality control (see Supplementary Material). Informed and
written consent was obtained from each participant prior to
scanning in accordance with the Institutional Review Boards of
the corresponding institutions (Keator et al., 2016). A total of 162
brain volumes of echo planar imaging BOLD fMRI data were
collected with a temporal resolution of 2 s on 3-Tesla scanners.
Imaging data for six of the seven sites were collected on a 3T
Siemens Tim Trio System, and for the remaining site on a 3T General
Electric Discovery MR750 scanner. Resting state fMRI scans were
acquired using a standard gradient-echo echo planar imaging
paradigm: FOV of 220 × 220 mm (64 × 64 matrix), TR = 2
s, TE = 30 ms, FA = 77°, 162 volumes, 32 sequential ascending
axial slices of 4 mm thickness and 1 mm skip. Subjects had their
eyes closed during the resting state scan. Data preprocessing for
dgICA was performed according to the preprocessing steps in
Damaraju et al. (2014).
3.3. ddFNC Experimental Parameters
We verify that ddFNC can generate sensible dFNC clusters by
replicating the centroids produced in Damaraju et al. (2014). We
run both pooled and decentralized versions of our algorithm,
and compare our results directly with the results provided by
the authors of Damaraju et al. (2014). We thus closely follow the
experimental procedure in Damaraju et al. (2014), with some of
the additional post-processing omitted for simplicity. To evaluate
the success of our pipeline, we run a simple experiment where
we implement the ddFNC pipeline end-to-end on the data,
simulating 314 subjects being evenly shared over 2 decentralized sites.
We set a window-length of 22 time-points (44 s), for a total
of 140 windows per subject. For dgICA, we first estimate 120
subject-specific principal components locally, and reduce each
subject to 120 points in the temporal dimension. Subjects are
then concatenated temporally on each site, and we use the
GlobalPCA algorithm in Baker et al. (2015) to estimate 100
TABLE 1 | Correlation between SSE from pooled, single-shot and multi-shot regression.
              Pooled     Single-shot  Multi-shot
Pooled        1.000000   0.992905     1.000000
Single-shot   0.992905   1.000000     0.992905
Multi-shot    1.000000   0.992905     1.000000
FIGURE 1 | Pairwise plot of Sum Square of Errors (SSE) from pooled, single-shot and multi-shot regression. Although the distribution plot looks similar across the three
regressions, the pooled regression vs. multi-shot regression scatter plot demonstrates how identical they are to each other. The scatter plot of pooled regression vs.
single-shot regression demonstrates that the SSE values obtained from single-shot regression are on the higher side compared to the values from pooled regression.
spatial components, and perform whitening. We then use local
infomax ICA (Bell and Sejnowski, 1995) on the aggregator to
estimate the unmixing matrix W, and estimate 100 spatially
independent components, $\hat{A}$. We then broadcast $\hat{A}$ back to the
local sites, and each site computes subject-specific time-courses.
After spatial ICA, we have each site perform a set of
additional post-processing steps prior to decentralized dFNC.
First, we select 47 components from the initial 100, by
identifying the components which are most highly correlated with
the components from Damaraju et al. (2014). We then have each
site drop the first 2 points from each subject, and regress out subject
head movement parameters using the 6 rigid body estimates, their
derivatives and squares (a total of 24 parameters). Additionally,
any spikes identified are interpolated using 3rd order spline
fits to good neighboring data, where spikes are defined as any
points exceeding mean(FD) + 2.5 × std(FD), where FD is
framewise displacement [interpolating 0 to 9 points (mean, sd:
3, 1.76)].
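The two motion-related steps above can be sketched as follows. This is an illustration under stated assumptions: the 24 regressors are taken to be the 6 rigid-body estimates, their first differences, and the squares of both, which is one common way to reach 24 and is not spelled out in the text; FD is assumed to be already computed; the spline interpolation itself is omitted; and the function names and synthetic data are hypothetical.

```python
import numpy as np

def motion_regressors(rp):
    """rp: (T, 6) rigid-body estimates -> (T, 24) nuisance regressors."""
    # First differences, padded with a zero row so the length stays T.
    deriv = np.vstack([np.zeros((1, rp.shape[1])), np.diff(rp, axis=0)])
    base = np.hstack([rp, deriv])            # 12 columns
    return np.hstack([base, base ** 2])      # plus their squares: 24 columns

def spike_mask(fd, n_std=2.5):
    """True where FD exceeds mean(FD) + n_std * std(FD)."""
    return fd > fd.mean() + n_std * fd.std()

rng = np.random.default_rng(0)
rp = np.cumsum(rng.normal(0, 0.01, size=(160, 6)), axis=0)   # synthetic motion traces
regs = motion_regressors(rp)

fd = np.abs(rng.normal(0.1, 0.02, size=160))                 # synthetic FD trace
fd[50] = 2.0                                                 # one injected spike
mask = spike_mask(fd)
```

Time points flagged by the mask would then be replaced using spline fits to the neighboring unflagged data.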
For clustering, we forgo a separate elbow-criterion estimation,
and use the optimal number of clusters from Damaraju et al.
(2014), setting k = 5. For the exemplar stage of clustering,
we evaluate 200 runs where we initialize centroids uniformly
randomly from local data, and then run dK-Means using the
cluster averaging strategy in Dhillon and Modha (2000). For
our distance measure, we use scikit-learn (Pedregosa et al.,
2011) to compute the correlation distance between covariance
matrices following the methods in Damaraju et al. (2014). To
keep our implementation simple, unlike Damaraju et al. (2014),
we do not utilize graphical LASSO to estimate the covariance
matrix, and thus do not optimize for any regularization
parameters. Additionally, we do not perform additional
Fisher-Z transformations or perform additional regularization using a
previously computed static dFNC result. Future implementations
may also utilize a decentralized static functional network
connectivity (sFNC) algorithm as preprocessing, as is done for
the pooled case in Damaraju et al. (2014). Finally, for the second
stage of dK-Means, we initialize using the centroids from the
run with the highest silhouette score, computed using the
scikit-learn Python toolbox (Pedregosa et al., 2011), again running
dK-Means to convergence. After computing the centroids, we use
the correlation distance and the Hungarian matching algorithm
(Kuhn, 1955) to match both plotted spatial components from
dgICA and the resulting centroids from dK-Means.
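The matching step can be sketched with SciPy's `linear_sum_assignment`, which solves the same assignment problem as the Hungarian algorithm. This is a minimal illustration, assuming each centroid or spatial map is flattened into a row vector and that correlation distance is one minus the Pearson correlation; the function name and synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_by_correlation(reference, estimated):
    """Return `order` such that estimated[order[i]] is matched to reference[i]."""
    k = reference.shape[0]
    # Cross-correlation block between reference rows and estimated rows.
    corr = np.corrcoef(reference, estimated)[:k, k:]
    cost = 1.0 - corr                         # correlation distance
    rows, cols = linear_sum_assignment(cost)  # minimum-cost one-to-one assignment
    return cols, corr[rows, cols]

rng = np.random.default_rng(0)
reference = rng.standard_normal((5, 200))     # e.g., 5 flattened reference centroids
perm = rng.permutation(5)
estimated = reference[perm] + 0.05 * rng.standard_normal((5, 200))
order, matched_corr = match_by_correlation(reference, estimated)
```

Reordering with `estimated[order]` then lines the estimated centroids up with the reference for side-by-side plotting.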
4.1. Decentralized VBM Results
To compare the efficacy of each regression
(single-shot and multi-shot) against the pooled case, we present
a simple pairwise plot of the SSE of the regression performed on
every voxel (Figure 1). In mathematical terms, the SSE represents the
lowest objective function value that could be attained from the
regression model. It can be seen from Figure 1 that the SSE
from multi-shot and pooled/centralized regression lie perfectly
along a diagonal indicating the parameters obtained from them
are identical. This can also be verified from Table 1 showing the
correlation between the different SSEs. Please note that results
from the decentralized regression with normal equation were not
presented as it has been mathematically shown to be equivalent
to that of a pooled regression.
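The equivalence mentioned above can be checked numerically with a short sketch. It assumes that each site shares only the summary matrices $X^T X$ and $X^T y$ and that the aggregator sums them before solving the normal equation; this is one standard construction and may differ in detail from the algorithm presented earlier in the paper. The function names and synthetic data are hypothetical.

```python
import numpy as np

def local_statistics(X, y):
    """Computed at each site; only these summaries leave the site."""
    return X.T @ X, X.T @ y

def aggregate_normal_equation(stats):
    """Aggregator: sum the site summaries and solve the normal equation."""
    xtx = sum(s[0] for s in stats)
    xty = sum(s[1] for s in stats)
    return np.linalg.solve(xtx, xty)

rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.standard_normal((200, 3))])  # intercept + 3 covariates
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.standard_normal(200)

# Unequal split across two simulated sites.
sites = [(X[:70], y[:70]), (X[70:], y[70:])]
beta_dec = aggregate_normal_equation([local_statistics(Xi, yi) for Xi, yi in sites])
beta_pooled = np.linalg.lstsq(X, y, rcond=None)[0]
```

Because the summaries add exactly across sites, the result does not depend on how the subjects are divided among them.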
FIGURE 2 | Violin plot of Sum Square of Error differences between every pair of regressions. The plot of differences in SSE from pooled regression and multi-shot
regression (P-MS) centered around 0 demonstrates how identical the results from the two regressions are. On the other hand, the SSE values from single-shot
regression are higher compared to those from the pooled regression.
It can be seen that the correlation between SSE from the
centralized regression and multi-shot is 1. On the other hand,
it can also be noticed that the SSE correlations between
single-shot and pooled or single-shot and multi-shot are slightly
lower than perfect correlation. The single-shot approach can be
considered to be similar to a meta-analysis, whereas the
multi-shot approach is essentially a mega-analysis (i.e., equivalent to the
pooled analysis).
Figure 2 shows a violin (distribution) plot of the difference
in SSE from every pair of regressions. Evidently, the differences
in SSE between pooled and multi-shot regression are centered
around 0. To reinforce our notion that the multi-shot is superior
to single-shot, we compare the $R^2$ values from the different
regressions. It can be seen from Figure 3 that the
$R^2$ values from multi-shot and pooled regression align perfectly
along a diagonal (correlation = 1, refer to Table 2) and therefore have exactly
the same distribution, whereas those from single-shot deviate
noticeably (correlation = 0.906662 with the pooled values).
As noted earlier, in addition to evaluating the regression
model parameters, researchers will also be interested in
understanding the statistical significance of the various
parameter estimates. Figures 4–6 show the statistical significance
of each covariate (age, diagnosis and gender), from both
TABLE 2 | Correlation between $R^2$ from pooled, single-shot and multi-shot regression.
              Pooled     Single-shot  Multi-shot
Pooled        1.000000   0.906662     1.000000
Single-shot   0.906662   1.000000     0.906662
Multi-shot    1.000000   0.906662     1.000000
FIGURE 3 | Pairwise scatter plots of Coefficient of Determination $R^2$ from the three types of regression. It can be seen again that the $R^2$ values for the regressions
from multi-shot regression and pooled regression are exactly equal. The $R^2$ values from single-shot regression are less than their corresponding values from pooled
regression or multi-shot regression because the model being fit in single-shot has fewer covariates (Note, one of the limitations of the single-shot is that the site
specific covariates could not be included as it introduces collinearity).
FIGURE 4 | Rendered images of voxel-wise significance values (−log10 p-value × sign(t)) for the covariate “Age” from pooled regression (Top) and single-shot
regression (Center), and multi-shot regression (Bottom) overlaid on MNI average template. One can see that the regions with expected gray matter decrease as
age increases are similar from all kinds of regression. Although the single-shot regression uses fewer covariates, the similarity of the rendered images with those of
pooled regression or multi-shot regression indicates that the relative weight or orientation of the corresponding β coefficient will be similar to those from pooled/multi-shot regression.
FIGURE 5 | Rendered images of voxel-wise significance values (−log10 p-value × sign(t)) for the covariate “Diagnosis” from pooled regression (Top) and single-shot
regression (Center) and multi-shot regression (Bottom) overlaid on MNI average template. Regardless of the type of regression performed, the images indicate that
in the medial frontal and bilateral temporal lobe/insula there is a significant gray matter density reduction for schizophrenic patients compared to the same regions of
the healthy subjects.
FIGURE 6 | Rendered images of voxel-wise significance values (−log10 p-value × sign(t)) for the covariate “Gender” from pooled regression (Top) and single-shot
regression (Center) and multi-shot regression (Bottom) overlaid on MNI average template. It can be seen from all the three rendered images that there is a significant
amount of gray matter reduction in the sub-cortical regions for males. Since we are using unmodulated gray matter maps, these sex differences could be due to
changes in brain volumes.
FIGURE 7 | Flowchart of the ddFNC procedure e.g., with 2 sites. To perform dgICA, sites first locally compute subject-specific LocalPCA to reduce the temporal
dimension, and then use the GlobalPCA procedure from Baker et al. (2015) to compute global spatial eigenvectors, which are then sent to the aggregator. The
aggregator then performs ICA on the global spatial eigenvectors, using InfoMax ICA (Bell and Sejnowski, 1995) for example, and passes the resulting spatial
components back to local sites. The dK-Means procedure then iteratively computes global centroids using the procedure outlined in Dhillon and Modha (2000), first
computing centroids from subject exemplar dFNC windows, and then using these centroids to initialize clustering over all subject windows.
centralized and decentralized regressions performed against
each voxel, plotted on an MNI brain template. Figure 4 shows
the brain images with the (−log10 p-value × sign(t)) values for
the weight parameter corresponding to “Age.” Notably,
the results from the multi-shot regression have a
perfect correlation to those from the pooled version. Moreover,
the observations show the expected decrease in gray matter
concentration as age increases. Figures 5 and 6 show the rendered
images of the −log10 p-values for the “Diagnosis” and “Gender”
covariates, respectively.
4.2. ddFNC Results
A summary of the complete steps in the decentralized dFNC
pipeline is given in Figure 7. In Figure 8, we plot some examples
of the components estimated from decentralized spatial ICA
in comparison with the spatial components from Damaraju
et al. (2014), after performing Hungarian matching between
the estimated spatial maps. We also plot the correlation of the
components from our ICA implementation in comparison to the
components from Damaraju et al. (2014). Indeed, the estimated
components are highly correlated with the results from Damaraju
et al. (2014), for all 100 estimated components, as well as for the 47
selected neurological components from Damaraju et al. (2014),
indicating that dgICA is able to produce results comparable to
the pooled case. We include additional spatial maps for all 47
estimated spatial components in the Supplementary Material.
In Figure 9, we plot the centroids from Damaraju et al.
(2014) (Figure 9A), as well as the centroids estimated using
decentralized dFNC (Figure 9B). Indeed, the centroids found
using ddFNC prove similar to the centroids found in Damaraju
et al. (2014), with centroids 2 and 3 being the closest matches
under correlation distance.
The results described in the previous section demonstrate the
fidelity of decentralized regression and decentralized dynamic
functional network connectivity in analyzing neuroimaging data.
FIGURE 8 | (A,B) Illustrate examples of matched spatial maps from dgICA and pooled ICA. (C,D) Show the correlation of the components between pooled spatial
ICA and dgICA after Hungarian matching. (C) Shows correlation between all 100 components, and (D) Shows correlation between the 47 neurological components
selected in Damaraju et al. (2014).
FIGURE 9 | The k = 5 centroids for pooled dFNC from Damaraju et al. (2014) (A), and the Hungarian-matched centroids from ddFNC (B).
Although single-shot regression is simple and easy to
implement, it limits our ability to incorporate site covariates
and thus might not be extremely helpful. The decentralized
regression with normal equation and multi-shot regression are
superior to single-shot regression because they not only
allow incorporating site-related variables but also give the same
results as the pooled regression. The linearity and convexity of
the regression objective function make this possible, and these
approaches are thus an excellent alternative for performing regression on multi-site data.
In terms of the regression objective function, either the sum
of squared errors or mean sum of squared errors can be used in
practice. However, it is mathematically convenient to use the sum of
squared errors, which subsequently entails (at the AGG) a simple
addition of the gradients (O(1)) instead of a weighted average
of the gradients (O(n)). In addition, we also showed how the
sample size at the local sites has no bearing on the final results.
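The gradient-addition argument can be illustrated with a short sketch. It assumes plain gradient descent with a fixed step size on the sum-of-squared-errors objective; the optimizer, step size and iteration count are assumptions for illustration and may differ from those used in the multi-shot algorithm of this paper. The function names and synthetic data are hypothetical.

```python
import numpy as np

def local_gradient(X, y, beta):
    """Gradient of the local SSE, ||y - X beta||^2, computed at one site."""
    return 2.0 * X.T @ (X @ beta - y)

def multishot_regression(sites, n_features, lr=1e-3, n_iter=2000):
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        # AGG step: local SSE gradients are simply added, with no sample-size weights.
        grad = sum(local_gradient(X, y, beta) for X, y in sites)
        beta = beta - lr * grad            # updated beta is sent back to the sites
    return beta

rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.standard_normal((200, 3))])
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.standard_normal(200)

# Deliberately unequal site sizes.
sites = [(X[:30], y[:30]), (X[30:], y[30:])]
beta_ms = multishot_regression(sites, n_features=4)
beta_pooled = np.linalg.lstsq(X, y, rcond=None)[0]
summed = sum(local_gradient(Xi, yi, beta_pooled) for Xi, yi in sites)
```

Since the SSE over all subjects is the sum of the site-level SSEs, the summed gradient equals the pooled gradient at every iterate, regardless of how many subjects each site holds.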
On a more practical note, the need for multi-shot regression
might not arise often in a neuroimaging setting where the
number of covariates is usually small. In such cases, the
decentralized regression with normal equation will suffice.
However, in decentralized settings where the number of
covariates is usually large (machine learning/big data) the
multi-shot regression comes to the fore. From a computational time
standpoint, and as discussed in the computational complexity
section, the multi-shot regression takes
more time to complete than the decentralized regression with
normal equation as it involves iteratively passing the gradients
between the local nodes and the AGG. It is worth mentioning that
although the decentralized regression algorithms demonstrated
here pertain to a simple linear regression model, these algorithms
can easily be extended to more complex models with polynomial
terms or interaction terms as well as to ridge regression, lasso
regression, and elastic net regression.
Regarding ddFNC, we plan to perform a more robust
analysis of it as a stand-alone algorithm in future work,
particularly with respect to different variations on the dK-Means
optimization and initialization, or with differing versions of ICA
on the aggregator (AGG) node, such as fastICA (Koldovský
et al., 2006), Entropy Bound Minimization (Li and Adali,
2010), and others. Additionally, the possibility of performing
a decentralized static FNC either as a preprocessing step to
ddFNC or a separate analysis is attractive. One other avenue
worth exploring with ddFNC is the flow of information across
the decentralized network. In particular, since the GlobalPCA
step in dgICA already makes the procedure partially
peer-to-peer, it makes sense to explore adding this functionality to
the dK-Means methods to preserve this peer-to-peer structure.
Finally, we plan to evaluate privacy-sensitive versions of ddFNC,
utilizing differential-privacy or other privacy measures as a way
to perform these analyses with some assurance of per-subject
privacy in the decentralized network.
Finally, we note that although the decentralization of algorithms in
a neuroimaging setting emphasizes the importance of analysis
on data present at multiple sites, the decentralization discussed
here is no different from other decentralized algorithms
discussed elsewhere in the literature. The AGG is not really a master
node per se but in fact one of the local sites itself. The term AGG
was introduced to distinguish the site
where the results are accumulated from all the other local sites.
In this paper, we presented a simple case study of how
voxel-based morphometry and dynamic functional network
connectivity analysis can be performed on multi-site data without
the need for pooling data at a central site. The study shows
that both the decentralized voxel-based morphometry and
the decentralized dynamic functional network connectivity
yield results that are comparable to their pooled counterparts,
guaranteeing a virtual pooled analysis effect through a chain of
computation and communication processes. Other advantages
of such a decentralized platform include data privacy and
support for large data. In conclusion, the results presented here
strongly encourage the use of decentralized algorithms in large
neuroimaging studies over systems that are optimized for
large-scale centralized data processing.
For the MCIC data, all subjects provided informed consent
to participate in the study that was approved by the human
research committees at each of the sites (UNM HRRC #03-429;
UMinn IRB #0404M59124; MGH IRB# 2004P001360; UIowa
IRB #1998010017). In addition to the informed consent, all
patients successfully completed a questionnaire verifying that
they understood the study procedures.
For fBIRN data, all subjects provided informed consent to
participate in the study that was approved by the human research
committees of each of the participating institutes in the fBIRN
data repository.
HG implemented the decentralized regression algorithms on
structural MRI data and wrote the regression part of the
paper. BB implemented the decentralized dynamic functional
network connectivity pipeline on functional MRI data and
wrote that part of the paper. ED contributed immensely to
the analysis as well as interpretation of the results from both
decentralized regression and decentralized dFNC pipeline. SRP
contributed to the brain imaging data preprocessing pipeline.
SMP proposed the decentralized data analysis system and led
the algorithm development effort. RS helped formulate the
decentralized regression with normal equation and development
of decentralized spatial ICA. VC led the team and formed the
This work was funded by the National Institutes of Health
(grant numbers: P20GM103472/5P20RR021938, R01EB005846,
1R01DA040487) and the National Science Foundation (grant
numbers: 1539067 and 1631819).
The Supplementary Material for this article can be found
online at:
Adams, H. H., Adams, H., Launer, L. J., Seshadri, S., Schmidt, R., Bis,
J. C., et al. (2016). Partial derivatives meta-analysis: pooled analyses when
individual participant data cannot be shared. bioRxiv 038893. doi: 10.1101/
Allen, E. A., Damaraju, E., Plis, S. M., Erhardt, E. B., Eichele, T., and Calhoun, V. D.
(2014). Tracking whole-brain connectivity dynamics in the resting state. Cereb.
Cortex 24, 663–676. doi: 10.1093/cercor/bhs352
Ashburner, J., and Friston, K. J. (2000). Voxel-based morphometry–the methods.
Neuroimage 11, 805–821. doi: 10.1006/nimg.2000.0582
Ashburner, J., and Friston, K. J. (2005). Unified segmentation. NeuroImage 26,
839–851. doi: 10.1016/j.neuroimage.2005.02.018
Baker, B. T., Silva, R. F., Calhoun, V. D., Sarwate, A. D., and Plis, S. M. (2015).
“Large scale collaboration with autonomy: Decentralized data ICA,” in 2015
IEEE 25th International Workshop on Machine Learning for Signal Processing
(MLSP), (Boston, MA: IEEE), 1–6.
Bell, A. J., and Sejnowski, T. J. (1995). An information-maximization approach
to blind separation and blind deconvolution. Neural Comput. 7, 1129–1159.
doi: 10.1162/neco.1995.7.6.1129
Bottou, L. (2010). “Large-scale machine learning with stochastic gradient descent,”
in Proceedings of COMPSTAT’2010 (Paris: Springer), 177–186.
Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S.,
et al. (2013). Power failure: why small sample size undermines the reliability of
neuroscience. Nat. Rev. Neurosci. 14:365. doi: 10.1038/nrn3475
Calhoun, V. D., and Adali, T. (2012). Multisubject independent component
analysis of fMRI: a decade of intrinsic networks, default mode,
and neurodiagnostic discovery. IEEE Rev. Biomed. Eng. 5, 60–73.
doi: 10.1109/RBME.2012.2211076
Calhoun, V. D., Adali, T., Pearlson, G. D., and Pekar, J. (2001). A
method for making group inferences from functional mri data using
independent component analysis. Hum. Brain Mapp. 14, 140–151. doi: 10.1002/
Carter, K. W., Francis, R. W., Carter, K., Francis, R., Bresnahan, M., Gissler, M.,
et al. (2015). Vipar: a software platform for the virtual pooling and analysis of
research data. Int. J. Epidemiol. 45, 408–416. doi: 10.1093/ije/dyv193
Cragin, M. H., Palmer, C. L., Carlson, J. R., and Witt, M. (2010). Data sharing, small
science and institutional repositories. Philos. Trans. R. Soc. Lond. A Math. Phys.
Eng. Sci. 368, 4023–4038. doi: 10.1098/rsta.2010.0165
Damaraju, E., Allen, E. A., Belger, A., Ford, J., McEwen, S., Mathalon, D.,
et al. (2014). Dynamic functional connectivity analysis reveals transient
states of dysconnectivity in schizophrenia. NeuroImage Clin. 5, 298–308.
doi: 10.1016/j.nicl.2014.07.003
Datta, S., Giannella, C., and Kargupta, H. (2006). “K-means clustering over a large,
dynamic network,” in Proceedings of the 2006 SIAM International Conference
on Data Mining (SIAM), 153–164. doi: 10.1137/1.9781611972764.14
Datta, S., Giannella, C., and Kargupta, H. (2009). Approximate distributed k-
means clustering over a peer-to-peer network. IEEE Trans. Knowl. Data Eng.
21, 1372–1388. doi: 10.1109/TKDE.2008.222
Deco, G., Ponce-Alvarez, A., Mantini, D., Romani, G. L., Hagmann, P., and
Corbetta, M. (2013). Resting-state functional connectivity emerges from
structurally and dynamically shaped slow linear fluctuations. J. Neurosci. 33,
11239–11252. doi: 10.1523/JNEUROSCI.1091-13.2013
Dhillon, I. S., and Modha, D. S. (2000). “A data-clustering algorithm on distributed
memory multiprocessors,” in Large-Scale Parallel Data Mining, Workshop on
Large-Scale Parallel KDD Systems, SIGKDD (Berlin; Heidelberg: Springer),
Di Fatta, G., Blasa, F., Cafiero, S., and Fortino, G. (2013). Fault tolerant
decentralised k-means clustering for asynchronous large-scale networks.
J. Paral. Distribut. Comput. 73, 317–329. doi: 10.1016/j.jpdc.2012.
Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive subgradient methods for
online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159.
Erhardt, E. B., Rachakonda, S., Bedrick, E. J., Allen, E. A., Adali, T., and Calhoun,
V. D. (2011). Comparison of multi-subject ica methods for analysis of fMRI
data. Hum. Brain Mapp. 32, 2075–2095. doi: 10.1002/hbm.21170
Fennema-Notestine, C., Gamst, A. C., Quinn, B. T., Pacheco, J., Jernigan,
T. L., Thal, L., et al. (2007). Feasibility of multi-site clinical structural
neuroimaging studies of aging using legacy data. Neuroinformatics 5, 235–245.
doi: 10.1007/s12021-007-9003-9
Forman, G., and Zhang, B. (2000). Distributed data clustering can be efficient and
exact. ACM SIGKDD Explor. Newsl. 2, 34–38. doi: 10.1145/380995.381010
Gollub, R. L., Shoemaker, J. M., King, M. D., White, T., Ehrlich, S., Sponheim,
S. R., et al. (2013). The mcic collection: a shared repository of multi-modal,
multi-site brain image data from a clinical investigation of schizophrenia.
Neuroinformatics 11, 367–388. doi: 10.1007/s12021-013-9184-3
Hibar, D. P., Stein, J. L., Renteria, M. E., Arias-Vasquez, A., Desrivières, S.,
Jahanshad, N., et al. (2015). Common genetic variants influence human
subcortical brain structures. Nature 520, 224. doi: 10.1038/nature14101
Jagannathan, G., and Wright, R. N. (2005). “Privacy-preserving distributed k-
means clustering over arbitrarily partitioned data,” in Proceedings of the
Eleventh ACM SIGKDD International Conference on Knowledge Discovery in
Data Mining, KDD’05 (Chicago, IL: ACM), 593–599.
Keator, D. B., van Erp, T. G., Turner, J. A., Glover, G. H., Mueller,
B. A., Liu, T. T., et al. (2016). The function biomedical informatics
research network data repository. Neuroimage 124, 1074–1079.
doi: 10.1016/j.neuroimage.2015.09.003
Kingma, D. P., and Ba, J. (2014). Adam: a method for stochastic optimization.
Koldovský, Z., Tichavský, P., and Oja, E. (2006). Efficient variant of
algorithm fastica for independent component analysis attaining the
cramér-rao lower bound. IEEE Trans. Neural Netw. 17, 1265–1277.
doi: 10.1109/TNN.2006.875991
Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval
Res. Logist. Q. 2, 83–97. doi: 10.1002/nav.3800020109
Landis, D., Courtney, W., Dieringer, C., Kelly, R., King, M., Miller, B.,
et al. (2016). Coins data exchange: an open platform for compiling,
curating, and disseminating neuroimaging data. NeuroImage 124, 1084–1088.
doi: 10.1016/j.neuroimage.2015.05.049
Lewis, N., Plis, S., and Calhoun, V. (2017). “Cooperative learning: Decentralized
data neural network,” in 2017 International Joint Conference on Neural
Networks (IJCNN) (Anchorage, AK), 324–331.
Li, X.-L., and Adali, T. (2010). Independent component analysis by
entropy bound minimization. IEEE Trans. Signal Process. 58, 5151–5164.
doi: 10.1109/TSP.2010.2055859
Ming, J., Verner, E., Sarwate, A., Kelly, R., Reed, C., Kahleck, T., et al. (2017).
Coinstac: decentralizing the future of brain imaging analysis. F1000Res. 6:1512.
doi: 10.12688/f1000research.12353.1
Nesterov, Y. (1983). A method for unconstrained convex minimization problem
with the rate of convergence O(1/k^2). Dokl. AN USSR 269, 543–547.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O.,
et al. (2011). Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12,
Plis, S. M., Sarwate, A. D., Wood, D., Dieringer, C., Landis, D., Reed, C.,
et al. (2016). Coinstac: a privacy enabled model and prototype for leveraging
and processing decentralized brain imaging data. Front. Neurosci. 10:365.
doi: 10.3389/fnins.2016.00365
Poldrack, R. A., Barch, D. M., Mitchell, J., Wager, T., Wagner, A. D., Devlin,
J. T., et al. (2013). Toward open sharing of task-based fmri data: the openfmri
project. Front. Neuroinform. 7:12. doi: 10.3389/fninf.2013.00012
Roshchupkin, G. V., Adams, H., Vernooij, M. W., Hofman, A., Van Duijn, C.,
Ikram, M. A., et al. (2016). Hase: framework for efficient high-dimensional
association analyses. Sci. Rep. 6:36076. doi: 10.1038/srep36076
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning
representations by back-propagating errors. Nature 323:533.
doi: 10.1038/323533a0
Saha, D. K., Calhoun, V. D., Panta, S. R., and Plis, S. M. (2017). “See without
looking: joint visualization of sensitive multi-site datasets,” in Proceedings
of the Twenty-Sixth International Joint Conference on Artificial Intelligence
(IJCAI’2017) (Melbourne, VIC), 2672–2678.
Sakoglu, Ü., Pearlson, G. D., Kiehl, K. A., Wang, Y. M., Michael, A. M.,
and Calhoun, V. D. (2010). A method for evaluating dynamic functional
network connectivity and task-modulation: application to schizophrenia.
Magn. Reson. Mater. Phys. Biol. Med. 23, 351–366. doi: 10.1007/s10334-010-
Scott, A., Courtney, W., Wood, D., de la Garza, R., Lane, S., Wang, R.,
et al. (2011). Coins: an innovative informatics and neuroimaging tool
suite built for large heterogeneous datasets. Front. Neuroinform. 5:33.
doi: 10.3389/fninf.2011.00033
Shringarpure, S. S., and Bustamante, C. D. (2015). Privacy risks from
genomic data-sharing beacons. Am. J. Hum. Genet. 97, 631–646.
doi: 10.1016/j.ajhg.2015.09.010
Smith, S. M., Fox, P. T., Miller, K. L., Glahn, D. C., Fox, P. M., Mackay,
C. E., et al. (2009). Correspondence of the brain’s functional architecture
during activation and rest. Proc. Natl. Acad. Sci. U.S.A. 106, 13040–13045.
doi: 10.1073/pnas.0905267106
Sweeney, L. (2002). k-anonymity: a model for protecting privacy. Int. J. Uncert.
Fuzziness Knowl. Based Syst. 10, 557–570. doi: 10.1142/S0218488502001648
Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A. U., Wu, L., Read, E., et al.
(2011). Data sharing by scientists: practices and perceptions. PLoS ONE
6:e21101. doi: 10.1371/journal.pone.0021101
Thompson, P. M., Andreassen, O. A., Arias-Vasquez, A., Bearden, C. E., Boedhoe,
P. S., Brouwer, R. M., et al. (2017). Enigma and the individual: predicting factors
that affect the brain in 35 countries worldwide. Neuroimage 145, 389–408.
doi: 10.1016/j.neuroimage.2015.11.057
Thompson, P. M., Stein, J. L., Medland, S. E., Hibar, D. P., Vasquez,
A. A., Renteria, M. E., et al. (2014). The enigma consortium:
large-scale collaborative analyses of neuroimaging and genetic
data. Brain Imaging Behav. 8, 153–182. doi: 10.1007/s11682-013-
Turner, J. A., Damaraju, E., Van Erp, T. G., Mathalon, D. H., Ford, J. M.,
Voyvodic, J., et al. (2013). A multi-site resting state fmri study on the
amplitude of low frequency fluctuations in schizophrenia. Front. Neurosci.
7:137. doi: 10.3389/fnins.2013.00137
van Erp, T. G., Hibar, D. P., Rasmussen, J. M., Glahn, D. C., Pearlson, G. D.,
Andreassen, O. A., et al. (2016). Subcortical brain volume abnormalities in
2028 individuals with schizophrenia and 2540 healthy controls via the enigma
consortium. Mol. Psychiatry 21:547. doi: 10.1038/mp.2015.63
Wojtalewicz, N. P., Silva, R. F., Calhoun, V. D., Sarwate, A. D., and Plis, S. M.
(2017). “Decentralized independent vector analysis,” in 2017 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) (New Orleans,
LA: IEEE), 826–830.
Yuan, K., Ling, Q., and Yin, W. (2016). On the convergence of decentralized
gradient descent. SIAM J. Optim. 26, 1835–1854. doi: 10.1137/130943170
Zeiler, M. D. (2012). Adadelta: an adaptive learning rate method. arXiv [preprint]
Conflict of Interest Statement: The authors declare that the research was
conducted in the absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.
Copyright © 2018 Gazula, Baker, Damaraju, Plis, Panta, Silva and Calhoun. This
is an open-access article distributed under the terms of the Creative Commons
Attribution License (CC BY). The use, distribution or reproduction in other forums
is permitted, provided the original author(s) and the copyright owner(s) are credited
and that the original publication in this journal is cited, in accordance with accepted
academic practice. No use, distribution or reproduction is permitted which does not
comply with these terms.

... One such decentralized analysis available in the COINSTAC framework is the voxel-based morphometry (VBM) (Ashburner and Friston 2000), which was introduced by Gazula et al. (2018) who conceptualized some variants of the decentralized regression and validated them on a publicly available dataset. In this paper, we showcase the power of the COINSTAC framework by conducting a real-world decentralized VBM analysis of MRI data at two different sites to study structural changes in the adolescent brain in relationship to three exemplars of relevant external factors: age, body mass index (BMI), and smoking. ...
... The developing brain shows substantial changes with age (Gogtay et al. 2004). Previous results have generally found that cortical gray matter volume decreases steadily during adolescence to young adulthood, with a deceleration in the third decade (Mills et al. 2016). ...
... The steps involved in calculating the t-values (and therefore p-values) of each regression parameter are explained in Gazula et al. (2018) and Ming et al. (2017). The global weight vector (ŵ) is sent to each of the local sites, where the local covariance matrix and SSE are calculated and returned with the data size to the aggregator (AGG), which then utilizes this information to calculate the t-value for each parameter (or coefficient). ...
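The aggregation step described in the excerpt above can be sketched as follows. This is a minimal illustration in NumPy under stated assumptions, not the COINSTAC implementation: the function names `local_stats` and `aggregate_t_values` are hypothetical, the "local covariance matrix" is assumed to be the Gram matrix X^T X of each site's design matrix, and the global weight vector is assumed to have been estimated already.

```python
import numpy as np

def local_stats(X, y, w_global):
    """Run at each site: return summary statistics only, never raw data."""
    residual = y - X @ w_global
    return {
        "gram": X.T @ X,                    # assumed form of the local covariance matrix
        "sse": float(residual @ residual),  # local sum of squared errors
        "n": X.shape[0],                    # local sample size
    }

def aggregate_t_values(w_global, site_stats):
    """Run at the aggregator: combine site summaries into per-coefficient t-values."""
    gram = sum(s["gram"] for s in site_stats)
    sse = sum(s["sse"] for s in site_stats)
    n = sum(s["n"] for s in site_stats)
    dof = n - len(w_global)                 # residual degrees of freedom
    sigma2 = sse / dof                      # pooled residual variance estimate
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(gram)))
    return w_global / std_err
```

Because the Gram matrices, SSEs, and sample sizes are additive across sites, the t-values computed this way should match those of an ordinary least squares fit on the pooled data, up to floating-point error.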
There has been an upward trend in developing frameworks that enable neuroimaging researchers to address challenging questions by leveraging data across multiple sites all over the world. One such open-source framework is the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC) that works on Windows, macOS, and Linux operating systems and leverages containerized analysis pipelines to analyze neuroimaging data stored locally across multiple physical locations without the need for pooling the data at any point during the analysis. In this paper, the COINSTAC team partnered with a data collection consortium to implement the first-ever decentralized voxelwise analysis of brain imaging data performed outside the COINSTAC development group. Decentralized voxel-based morphometry analysis of over 2000 structural magnetic resonance imaging data sets collected at 14 different sites across two cohorts and co-located in different countries was performed to study the structural changes in brain gray matter which linked to age, body mass index (BMI), and smoking. Results produced by the decentralized analysis were consistent with and extended previous findings in the literature. In particular, a widespread cortical gray matter reduction (resembling a ‘default mode network’ pattern) and hippocampal increase with age, bilateral increases in the hypothalamus and basal ganglia with BMI, and cingulate and thalamic decreases with smoking. This work provides a critical real-world test of the COINSTAC framework in a “Large-N” study. It showcases the potential benefits of performing multivoxel and multivariate analyses of large-scale neuroimaging data located at multiple sites.
... Aggregation of shared iterates between sites allows these decentralized analysis frameworks to converge to solutions which are equivalent to the pooled case. The developers of the COINSTAC decentralized analysis framework (Plis et al., 2016) have successfully amassed a number of decentralized algorithms vital to neuroimaging analysis, including but not limited to independent vector analysis (Wojtalewicz, Silva, Calhoun, Sarwate, & Plis, 2017), deep neural networks (Lewis, Plis, & Calhoun, 2017), and voxel-based morphometry (Gazula et al., 2018). In this work, we further one particular iterative pipeline, decentralized dynamic functional network connectivity (ddFNC), which combines a number of distinct and useful algorithms used primarily in neuroimaging analysis. ...
... In this work, we further one particular iterative pipeline, decentralized dynamic functional network connectivity (ddFNC), which combines a number of distinct and useful algorithms used primarily in neuroimaging analysis. We build on preliminary work introduced elsewhere (Gazula et al., 2018), extending the presentation of ddFNC to include more thorough analysis of the individual algorithms contained within it. ...
... Similarly, Remedios et al. provide a decentralized application of deep learning for neuroimage segmentation (Remedios et al., 2020). Decentralized joint independent component analysis (Baker, Silva, Calhoun, Sarwate, & Plis, 2015), independent vector analysis (Wojtalewicz et al., 2017), decentralized stochastic neighbor embeddings (Saha et al., 2019), and voxel-based morphometry (Gazula et al., 2018) have also been applied to the analysis of decentralized neuroimaging data. In general, many of these frameworks proceed by iteratively computing the statistics used for optimization of a particular algorithm in a decentralized way. ...
As neuroimaging data increase in complexity and related analytical problems follow suit, more researchers are drawn to collaborative frameworks that leverage data sets from multiple data‐collection sites to balance out the complexity with an increased sample size. Although centralized data‐collection approaches have dominated the collaborative scene, a number of decentralized approaches (those that avoid gathering data at a shared central store) have grown in popularity. We expect the prevalence of decentralized approaches to continue as privacy risks and communication overhead become increasingly important for researchers. In this article, we develop, implement and evaluate a decentralized version of one such widely used tool: dynamic functional network connectivity. Our resulting algorithm, decentralized dynamic functional network connectivity (ddFNC), synthesizes a new, decentralized group independent component analysis algorithm (dgICA) with algorithms for decentralized k‐means clustering. We compare both individual decentralized components and the full resulting decentralized analysis pipeline against centralized counterparts on the same data, and show that both provide comparable performance. Additionally, we perform several experiments which evaluate the communication overhead and convergence behavior of various decentralization strategies and decentralized clustering algorithms. Our analysis indicates that ddFNC is a fine candidate for facilitating decentralized collaboration between neuroimaging researchers, and stands ready for the inclusion of privacy‐enabling modifications, such as differential privacy.
... Multiple neuroimaging algorithms have been federated and can be run in COINSTAC. Examples of implemented algorithms include decentralized voxel-based morphometry (Gazula et al., 2018), decentralized t-distributed stochastic neighbor embedding (Saha et al., 2017, 2021), decentralized dynamic functional network connectivity, and decentralized support vector machine with differential privacy (Sarwate et al., 2014). ...
... The goal of this paper was two-fold: to demonstrate the feasibility of COINSTAC (Gazula et al., 2018) for performing decentralized analysis on large datasets present across multiple sites, compare the results of the decentralized ICA and Neuromark, and to use this experiment to evaluate the Starting with tobacco users, there was decreased connectivity in most of the pairings but the most prominent result was hypoconnectivity between the auditory and (Fedota & Stein, 2015). There was an exception with the DMN, that had increased within-network connectivity but also with the cognitive control and cerebellar domains. ...
With the growth of decentralized/federated analysis approaches in neuroimaging, the opportunities to study brain disorders using data from multiple sites has grown multi-fold. One such initiative is the Neuromark, a fully automated spatially constrained independent component analysis (ICA) that is used to link brain network abnormalities among different datasets, studies, and disorders while leveraging subject-specific networks. In this study, we implement the neuromark pipeline in COINSTAC, an open-source neuroimaging framework for collaborative/decentralized analysis. Decentralized exploratory analysis of nearly 2000 resting-state functional magnetic resonance imaging datasets collected at different sites across two cohorts and co-located in different countries was performed to study the resting brain functional network connectivity changes in adolescents who smoke and consume alcohol. Results showed hypoconnectivity across the majority of networks including sensory, default mode, and subcortical domains, more for alcohol than smoking, and decreased low frequency power. These findings suggest that global reduced synchronization is associated with both tobacco and alcohol use. This proof-of-concept work demonstrates the utility and incentives associated with large-scale decentralized collaborations spanning multiple sites.
... But, it is possible to conduct the linear regression analysis in a totally distributed manner. For decentralized regression, we refer the reader to [50]. In summary, we can write the least squares solution for the regression parameter β as: ...
The examination of multivariate brain morphometry patterns has gained attention in recent years, especially for their powerful exploratory capabilities in the study of differences between patients and controls. Among many existing methods and tools for analysis of brain anatomy based on structural magnetic resonance imaging (sMRI) data, data-driven source based morphometry (SBM) focuses on the exploratory detection of such patterns. Constrained source-based morphometry (constrained SBM) is a widely used semi-blind extension of SBM that enables extracting maximally independent reference-alike sources using the constrained independent component analysis (ICA) approach. In order to operate, constrained SBM needs the data to be locally accessible. However, there exist many reasons (e.g., the concerns of revealing identifiable rare disease information, or violating strict IRB policies) that may preclude access to data from different sites. In this scenario, constrained SBM fails to leverage the benefits of decentralized data. To mitigate this problem, we present a novel approach: decentralized constrained source-based morphometry (dcSBM). In dcSBM, the original data never leaves the local site. Each site operates constrained ICA on their private local data while using a common distributed computation platform. Then, an aggregator/master node aggregates the results estimated from each local site and applies statistical analysis to find out the significant sources. In our approach, we first use UK Biobank sMRI data to investigate the reliability of our dcSBM algorithm. Finally, we utilize two additional multi-site patient datasets to validate our model by comparing the resulting group difference estimates from both centralized and decentralized constrained SBM.
... This iterative process has to be automated, so that machine learning and latent variable analyses can be conducted in a decentralized environment. Currently decentralized versions of iterative regression, independent component analysis for static and dynamic network connectivity analyses, support vector machines, and distributed t-stochastic neighbor embedding (t-sne) for visualization are all available through COINSTAC (Gazula et al., 2018;Saha et al., 2017;Plis et al., 2016;Sarwate et al., 2014;Baker et al., in press;Saha et al., 2020). In some cases a shared reference data set is leveraged, or testing/training configurations are incorporated. ...
The FAIR principles, as applied to clinical and neuroimaging data, reflect the goal of making research products Findable, Accessible, Interoperable, and Reusable. The use of the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymized Computation (COINSTAC) platform in the Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) consortium combines the technological approach of decentralized analyses with the sociological approach of sharing data. In addition, ENIGMA + COINSTAC provides a platform to facilitate the use of machine-actionable data objects. We first present how ENIGMA and COINSTAC support the FAIR principles, and then showcase their integration with a decentralized meta-analysis of sex differences in negative symptom severity in schizophrenia, and finally present ongoing activities and plans to advance FAIR principles in ENIGMA + COINSTAC. ENIGMA and COINSTAC currently represent efforts toward improved Access, Interoperability, and Reusability. We highlight additional improvements needed in these areas, as well as future connections to other resources for expanded Findability.
... The very first federated algorithm developed in COINSTAC was federated regression (Ming et al., 2017; Plis et al., 2016). Gazula et al. (2018) presented a simple case study of how voxel-based morphometry (VBM) can be performed on multi-site data without the need for pooling data at a central site. In an extension of the federated VBM, the same authors (Gazula et al., 2021) collaborated with IMAGEN from the UK and cVEDA from India and provided a critical real-world test of the COINSTAC framework in a "Large-N" study. ...
The field of neuroimaging has embraced sharing data to collaboratively advance our understanding of the brain. However, data sharing, especially across sites with large amounts of protected health information (PHI), can be cumbersome and time intensive. Recently, there has been a greater push towards collaborative frameworks that enable large-scale federated analysis of neuroimaging data without the data having to leave its original location. However, there still remains a need for a standardized federated approach that not only allows for data sharing adhering to the FAIR (Findability, Accessibility, Interoperability, Reusability) data principles, but also streamlines analyses and communication while maintaining subject privacy. In this paper, we review a non-exhaustive list of neuroimaging analytic tools and frameworks currently in use. We then provide an update on our federated neuroimaging analysis software system, the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC). In the end, we share insights on future research directions for federated analysis of neuroimaging data.
... COINSTAC is intended to be the ultimate hub by which researchers can build statistical (Ming et al., 2017) or machine learning models (Gazula et al., 2018) collaboratively in a decentralized fashion. This framework implements a message-passing infrastructure that allows large-scale analysis of decentralized data with results on par with those that would have been obtained if the data were centralized. ...
... Limitations Although NEURO-LEARN incorporates data access control and dataset templates for the sake of data administration and desensitization, they are not sufficient for privacy protection according to the concept of differential privacy (Roth and Dwork 2013), since subject-level quantitative features are uploaded for data accumulation, while no further data perturbation is adopted to avoid data reidentification (Chaudhuri et al. 2011). Meanwhile, a number of multi-center neuroimaging researches have been exploring distributed privacy-preserving algorithms with strategies including aggregating group-level variables across multiple sites (Gazula et al. 2018;Plis et al. 2016), and aggregating locally-trained models (Dluhos et al. 2017;Chang et al. 2018). While these approaches are proven to have comparable performance with the pooled-data counterparts, our proposed solution incorporates localized conventional feature extraction methods, and online pattern analysis with generic feature reduction and machine learning algorithms, aiming to facilitate longitudinal data accumulation and horizontal model evaluation, accomplish higher generalizability without developing sophisticated and highly-coupled infrastructures for decentralized algorithms, and adapt to multiple usage scenarios. ...
The development of neuroimaging instrumentation has boosted neuroscience researches. Consequently, both the fineness and the cost of data acquisition have profoundly increased, leading to the main bottleneck of this field: limited sample size and high dimensionality of neuroimaging data. Therefore, the emphasis of ideas of data pooling and research collaboration has increased over the past decade. Collaborative analysis techniques emerge as the idea developed. In this paper, we present NEURO-LEARN, a solution for collaborative pattern analysis of neuroimaging data. Its collaboration scheme consists of four parts: projects, data, analysis, and reports. While data preparation workflows defined in projects reduce the high dimensionality of neuroimaging data by collaborative computation, pooling of derived data and sharing of pattern analysis workflows along with generated reports on the Web enlarge the sample size and ensure the reliability and reproducibility of pattern analysis. Incorporating this scheme, NEURO-LEARN provides an easy-to-use Web application that allows users from different sites to share projects and processed data, perform pattern analysis, and obtain result reports. We anticipate that this solution will help neuroscientists to enlarge sample size, conquer the curse of dimensionality and conduct reproducible studies on neuroimaging data with efficiency and validity.
... 1 COINSTAC: a decentralized and privacy-enabled infrastructure model for brain imaging data (Gazula et al., 2018;Plis et al., 2016). ...
Neuroimaging‐based approaches have been extensively applied to study mental illness in recent years and have deepened our understanding of both cognitively healthy and disordered brain structure and function. Recent advancements in machine learning techniques have shown promising outcomes for individualized prediction and characterization of patients with psychiatric disorders. Studies have utilized features from a variety of neuroimaging modalities, including structural, functional, and diffusion magnetic resonance imaging data, as well as jointly estimated features from multiple modalities, to assess patients with heterogeneous mental disorders, such as schizophrenia and autism. We use the term “predictome” to describe the use of multivariate brain network features from one or more neuroimaging modalities to predict mental illness. In the predictome, multiple brain network‐based features (either from the same modality or multiple modalities) are incorporated into a predictive model to jointly estimate features that are unique to a disorder and predict subjects accordingly. To date, more than 650 studies have been published on subject‐level prediction focusing on psychiatric disorders. We have surveyed about 250 studies including schizophrenia, major depression, bipolar disorder, autism spectrum disorder, attention‐deficit hyperactivity disorder, obsessive–compulsive disorder, social anxiety disorder, posttraumatic stress disorder, and substance dependence. In this review, we present a comprehensive review of recent neuroimaging‐based predictomic approaches, current trends, and common shortcomings and share our vision for future directions.
In the era of Big Data, sharing neuroimaging data across multiple sites has become increasingly important. However, researchers who want to engage in centralized, large-scale data sharing and analysis must often contend with problems such as high database cost, long data transfer time, extensive manual effort, and privacy issues for sensitive data. To remove these barriers to enable easier data sharing and analysis, we introduced a new, decentralized, privacy-enabled infrastructure model for brain imaging data called COINSTAC in 2016. We have continued development of COINSTAC since this model was first introduced. One of the challenges with such a model is adapting the required algorithms to function within a decentralized framework. In this paper, we report on how we are solving this problem, along with our progress on several fronts, including additional decentralized algorithms implementation, user interface enhancement, decentralized regression statistic calculation, and complete pipeline specifications.
Visualization of high dimensional large-scale datasets via an embedding into a 2D map is a powerful exploration tool for assessing latent structure in the data and detecting outliers. There are many methods developed for this task but most assume that all pairs of samples are available for common computation. Specifically, the distances between all pairs of points need to be directly computable. In contrast, we work with sensitive neuroimaging data, when local sites cannot share their samples and the distances cannot be easily computed across the sites. Yet, the desire is to let all the local data participate in collaborative computation without leaving their respective sites. In this scenario, a quality control tool that visualizes decentralized dataset in its entirety via global aggregation of local computations is especially important as it would allow screening of samples that cannot be evaluated otherwise. This paper introduces an algorithm to solve this problem: decentralized data stochastic neighbor embedding (dSNE). Based on the MNIST dataset we introduce metrics for measuring the embedding quality and use them to compare dSNE to its centralized counterpart. We also apply dSNE to a multi-site neuroimaging dataset with encouraging results.
High-throughput technology can now provide rich information on a person’s biological makeup and environmental surroundings. Important discoveries have been made by relating these data to various health outcomes in fields such as genomics, proteomics, and medical imaging. However, cross-investigations between several high-throughput technologies remain impractical due to demanding computational requirements (hundreds of years of computing resources) and unsuitability for collaborative settings (terabytes of data to share). Here we introduce the HASE framework that overcomes both of these issues. Our approach dramatically reduces computational time from years to only hours and requires only several gigabytes to be exchanged between collaborators. We implemented a novel meta-analytical method that yields the same power as pooled analyses without the need to share individual participant data. The efficiency of the framework is illustrated by associating 9 million genetic variants with 1.5 million brain imaging voxels in three cohorts (total N = 4,034) followed by meta-analysis, on a standard computational infrastructure. These experiments indicate that HASE facilitates high-dimensional association studies enabling large multicenter association studies for future discoveries.
The field of neuroimaging has embraced the need for sharing and collaboration. Data sharing mandates from public funding agencies and major journal publishers have spurred the development of data repositories and neuroinformatics consortia. However, efficient and effective data sharing still faces several hurdles. For example, open data sharing is on the rise but is not suitable for sensitive data that are not easily shared, such as genetics. Current approaches can be cumbersome (such as negotiating multiple data sharing agreements). There are also significant data transfer, organization and computational challenges. Centralized repositories only partially address the issues. We propose a dynamic, decentralized platform for large scale analyses called the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC). The COINSTAC solution can include data missing from central repositories, allows pooling of both open and “closed” repositories by developing privacy-preserving versions of widely-used algorithms, and incorporates the tools within an easy-to-use platform enabling distributed computation. We present an initial prototype system which we demonstrate on two multi-site data sets, without aggregating the data. In addition, by iterating across sites, the COINSTAC model enables meta-analytic solutions to converge to “pooled-data” solutions (i.e., as if the entire data were in hand). More advanced approaches such as feature generation, matrix factorization models, and preprocessing can be incorporated into such a model. In sum, COINSTAC enables access to the many currently unavailable data sets, a user friendly privacy enabled interface for decentralized analysis, and a powerful solution that complements existing data sharing solutions.
In this review, we discuss recent work by the ENIGMA Consortium - a global alliance of over 500 scientists spread across 200 institutions in 35 countries collectively analyzing brain imaging, clinical, and genetic data. Initially formed to detect genetic influences on brain measures, ENIGMA has grown to over 30 working groups studying 12 major brain diseases by pooling and comparing brain data. In some of the largest neuroimaging studies to date - of schizophrenia and major depression - ENIGMA has found replicable disease effects on the brain that are consistent worldwide, as well as factors that modulate disease effects. In partnership with other consortia including ADNI, CHARGE, IMAGEN and others, ENIGMA's genomic screens - now numbering over 30,000 MRI scans - have revealed at least 8 genetic loci that affect brain volumes. Downstream of gene findings, ENIGMA has revealed how these individual variants - and genetic variants in general - may affect both the brain and risk for a range of diseases. The ENIGMA consortium is discovering factors that consistently affect brain structure and function that will serve as future predictors linking individual brain scans and genomic data. It is generating vast pools of normative data on brain measures - from tens of thousands of people - that may help detect deviations from normal development or aging in specific groups of subjects. We discuss challenges and opportunities in applying these predictors to individual subjects and new cohorts, as well as lessons we have learned in ENIGMA's efforts so far.
Data sharing for collaborative research systems may not be able to use contemporary architectures that collect and store data in centralized data centers. Research groups often wish to control their data locally but are willing to share access to it for collaborations. This may stem from research culture as well as privacy concerns. To leverage the potential of these aggregated larger data sets, we would like tools that perform joint analyses without transmitting the data. Ideally, these analyses would have similar performance and ease of use as current team-based research structures. In this paper we design, implement, and evaluate a decentralized data independent component analysis (ICA) that meets these criteria. We validate our method on temporal ICA for functional magnetic resonance imaging (fMRI) data; this method shares only intermediate statistics and may be amenable to further privacy protections via differential privacy.