Functional connectivity signatures of major depressive disorder: machine learning analysis of two multicenter neuroimaging studies

Abstract

The promise of machine learning has fueled the hope for developing diagnostic tools for psychiatry. Initial studies showed high accuracy for the identification of major depressive disorder (MDD) with resting-state connectivity, but progress has been hampered by the absence of large datasets. Here we used regular machine learning and advanced deep learning algorithms to differentiate patients with MDD from healthy controls and identify neurophysiological signatures of depression in two of the largest resting-state datasets for MDD. We obtained resting-state functional magnetic resonance imaging data from the REST-meta-MDD (N = 2338) and PsyMRI (N = 1039) consortia. Classification of functional connectivity matrices was done using support vector machines (SVM) and graph convolutional neural networks (GCN), and performance was evaluated using 5-fold cross-validation. Features were visualized using GCN-Explainer, an ablation study and univariate t-testing. The results showed a mean classification accuracy of 61% for MDD versus controls. Mean accuracy for classifying (non-)medicated subgroups was 62%. Sex classification accuracy was substantially better across datasets (73–81%). Visualization of the results showed that classifications were driven by stronger thalamic connections in both datasets, while nearly all other connections were weaker with small univariate effect sizes. These results suggest that whole brain resting-state connectivity is a reliable though poor biomarker for MDD, presumably due to disease heterogeneity as further supported by the higher accuracy for sex classification using the same methods. Deep learning revealed thalamic hyperconnectivity as a prominent neurophysiological signature of depression in both multicenter studies, which may guide the development of biomarkers in future studies.
This content is subject to copyright. Terms and conditions apply.
Selene Gallo 1,2,18; Ahmed El-Gazzar 1,2,18; Paul Zhutovsky 1,2; Rajat M. Thomas 1,2; Nooshin Javaheripour 3; Meng Li 3; Lucie Bartova 4; Deepti Bathula 5; Udo Dannlowski 6; Christopher Davey 7; Thomas Frodl 8,9; Ian Gotlib 10; Simone Grimm 11; Dominik Grotegerd 6; Tim Hahn 6; Paul J. Hamilton 12; Ben J. Harrison 7; Andreas Jansen 13; Tilo Kircher 13; Bernhard Meyer 4; Igor Nenadić 13; Sebastian Olbrich 14; Elisabeth Paul 12; Lukas Pezawas 4; Matthew D. Sacchet 15; Philipp Sämann 16; Gerd Wagner 3; Henrik Walter 17; Martin Walter 8,9; PsyMRI*; and Guido van Wingen 1,2
© The Author(s) 2023
Molecular Psychiatry (2023) 28:3013–3022; https://doi.org/10.1038/s41380-023-01977-5
INTRODUCTION
With more than 163 million people affected [1], major depressive disorder (MDD) is the most common psychiatric disorder in the world. This number keeps increasing every year, adding urgency to the question of how to diagnose, prevent, and treat it [2]. The promise of artificial intelligence for medicine has also sparked interest in using machine learning techniques for the development of biomarkers in psychiatry [3]. A meta-analysis of initial small-scale studies suggested that resting-state functional magnetic resonance imaging (fMRI) may provide highly accurate biomarkers for MDD [4]. However, neuroimaging biomarkers showed lower accuracies for other psychiatric disorders when based on large-scale datasets, presumably due to increased heterogeneity within the patient group [5]. Until now, large-scale resting-state cohorts for MDD have not been available, limiting progress in the development of biomarkers for MDD.
Received: 22 October 2021 Revised: 12 January 2023 Accepted: 19 January 2023
Published online: 15 February 2023
1 Amsterdam UMC location University of Amsterdam, Department of Psychiatry, Meibergdreef 9, Amsterdam, The Netherlands.
2 Amsterdam Neuroscience, Amsterdam, The Netherlands.
3 Department Of Psychiatry and Psychotherapy, Jena University Hospital, Jena, Germany.
4 Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria.
5 Indian Institute of Technology (IIT), Ropar, India.
6 Institute for Translational Psychiatry, University of Münster, Münster, Germany.
7 Department of Psychiatry, The University of Melbourne, Melbourne, VIC, Australia.
8 Department of Psychiatry and Psychotherapy, Otto von Guericke University Magdeburg, Magdeburg, Germany.
9 German center for mental health, CIRC, Magdeburg, Germany.
10 Department of Psychology, Stanford University, Stanford, CA 94305, USA.
11 Department of Psychiatry, Charité Universitätsmedizin Berlin, Berlin, Germany.
12 Center for Social and Affective Neuroscience, Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Sweden.
13 Department Of Psychiatry, University of Marburg, Marburg, Germany.
14 Department of Psychiatry, Psychotherapy and Psychosomatics, University Hospital of Zurich, Zurich, Switzerland.
15 Center for Depression, Anxiety, and Stress Research, McLean Hospital, Harvard Medical School, Belmont, MA, USA.
16 Max Planck Institute of Psychiatry, Munich, Germany.
17 Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Psychiatry and Psychotherapy, Charitéplatz 1, D-10117 Berlin, Germany.
18 These authors contributed equally: Selene Gallo, Ahmed El-Gazzar.
*A list of authors and their affiliations appears at the end of the paper.
email: a.g.elgazzar@amsterdamumc.nl
In this work, we used data from two of the largest consortia (REST-meta-MDD (http://rfmri.org/REST-meta-MDD) [6] and PsyMRI (http://psymri.com), from now on mddrest and psymri) that obtained resting-state fMRI data across different research centers from patients with MDD and matched healthy controls (HC) to evaluate the potential of resting-state functional connectivity (FC) as a biomarker for MDD.
FC between brain regions refers to the statistical dependence of neurophysiological signals [7], typically measured as Pearson correlation [8]. Until recently, the gold standard for exploring brain differences was univariate group analysis, which interrogates one voxel at a time and has revealed consistent FC differences in MDD [9]. However, univariate analysis potentially misses more complex patterns and is only able to detect average group differences. In the last few years, the increasing availability of machine learning (ML) and deep learning (DL) techniques [10] has enabled researchers to look into multivariate patterns. Recent results using the popular ML classifier support vector machine (SVM) obtained up to 95% classification accuracy in small datasets [11, 12]. DL algorithms are advanced ML techniques that learn abstract representations of the input data as an integral part of the training process. DL may have huge potential for high-dimensional data such as neuroimaging [13]. DL has shown convincing early results in many tasks involving image analysis, including classification of psychiatric disorders (see [14] for a review). Specific deep learning models on graphs (i.e., graph convolutional networks; GCN) have recently emerged and demonstrated powerful performance on various tasks. Generally speaking, GCN models are a type of neural network architecture that can specifically leverage the graph structure that is typical for FC [15]. GCNs also enable the visualization of the important features to counter the typical criticism of ML models as "black boxes" [16], and enable their use for uncovering the neural signatures of psychiatric disorders.
In the research reported here, we trained linear and nonlinear (rbf) SVM and spatial GCN classifiers on the mddrest (selected N = 2338) and psymri (selected N = 1039) datasets separately as well as combined. We performed two complementary post-hoc visualization experiments: GCN-Explainer [17], which highlights the important connections between those brain regions that are necessary for the classifier to distinguish between MDD and controls; and an ablation study in which each brain region is systematically excluded (virtually ablated) one by one from the model. The consequent drop in accuracy from the original model accuracy indicates the contribution of the excluded region to the overall performance. To assess whether identified connections were stronger or weaker in MDD, we used group-level t-tests. Furthermore, as clinical heterogeneity is expected to have a large influence on the classification accuracy, we performed additional classifications for medicated and non-medicated patients separately.
METHODS AND MATERIALS
Datasets
The psymri consortium consists of 23 cohorts from across the world, including raw data from 531 patients (60% male, 33.7 ± 11.6 years old) and 508 controls (65% male, 35.1 ± 12.2 years old). The mddrest dataset collected by the REST-meta-MDD Project is currently the largest resting-state fMRI database for MDD, including 1255 patients (57% male, 36.6 ± 15.7 years old) and 1083 HC (62% male, 35.1 ± 14.7 years old) from 25 cohorts in China. Supplementary Fig. S1 in the Supplementary Materials shows the distribution of participants between sites of the datasets. Demographic data are reported in Supplementary Table S1 for each of the classification tasks separately (see Supplementary Information for more details about sample composition).
We also utilized samples from two external rs-fMRI datasets that do not target MDD to benchmark classification performance on an independent task. Abide [18] is a comparable retrospective multicenter neuroimaging consortium but with patients with autism spectrum disorders (ASD) instead of MDD. In this study we used a sample of N = 2000 (1590 M/410 F; 1030 ASD/970 typically developing, TD) from both the first and second releases. The UK Biobank [19] is a prospective population cohort with harmonized data acquisition. We used a randomly sampled subset of the resting-state fMRI dataset with a comparable sample size to our MDD consortia (N = 2000, 1000 M/1000 F).
Anonymized data were made available for these consortia from studies
that were approved by local Institutional Review Boards. All study
participants provided written informed consent at their local institution.
Data processing
Standard preprocessing of the psymri dataset was done in house using FSL and ANTs (see Supplementary Information). Standard preprocessing of the mddrest data was done at each site using the Data Processing Assistant for Resting-State fMRI (DPARSF), which is based on SPM [20, 21] (see Supplementary Information for preprocessing of Abide and UK Biobank). Time courses of cortical and subcortical regions as defined by the Harvard-Oxford atlas [22] were extracted for all datasets (112 regions in total, see Supplementary Information for analyses on a functional atlas). Correlations between all brain regions were estimated and the resulting correlation matrices were used as features to predict class membership (Fig. 1). We used medication status to define more homogeneous groups, and included sex to benchmark classification performance for a task that is not dependent on the psychiatric diagnosis, resulting in the following classification tasks:
I. MDD vs HC
II. Non-medicated MDD vs HC
III. Medicated patients MDD vs HC
IV. Medicated MDD vs non-medicated MDD
V. Male vs female
For each contrast separately, we subsampled the classes so that the
number of participants per class was equal. Supplementary Table S1
describes the sample compositions of the groups, for each contrast
separately.
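The class balancing described above can be illustrated with a minimal sketch. It assumes random undersampling of the larger class, which the text does not specify; the function name `balance_classes` and the seed are hypothetical, and the class sizes are simply those reported for psymri.

```python
import numpy as np

def balance_classes(labels, seed=0):
    """Return sorted indices that keep an equal number of samples per class.

    Assumption: the larger class is randomly undersampled to the size of
    the smaller one; the paper does not state how subsampling was done.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    n_keep = counts.min()
    keep = [
        rng.choice(np.flatnonzero(labels == c), size=n_keep, replace=False)
        for c in classes
    ]
    return np.sort(np.concatenate(keep))

# Class sizes as reported for psymri: 531 patients (1) and 508 controls (0).
labels = np.array([1] * 531 + [0] * 508)
idx = balance_classes(labels)
print(len(idx), int(labels[idx].sum()))
```

With equal class sizes, chance level for accuracy sits at 50%, which makes results easier to compare across contrasts.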
Classifier models
Three classes of models, linear SVM, non-linear rbf SVM, and GCN, were used to evaluate prediction performance for all tests. To assess the generalizability of our results to data that have not been used to train the model, we used a 5-fold cross-validation (CV) scheme. Hyperparameter search for each model was based on best practices from the literature, and hyperparameters were chosen empirically on the basis of relative prediction accuracy on 20% of the training set [10] (see Supplementary Information for details). After the best hyperparameter combination had been determined, the actual performance of the classifiers was assessed on the test set [23]. The overall performance was calculated by averaging balanced accuracy in the five rounds on the test splits. Other evaluation metrics, namely F1-score, specificity and sensitivity, are reported in the Supplementary Information. All performances were compared against chance level using a random permutation test, then Bonferroni correction was used to adjust for the number of comparisons (Supplementary Information). Finally, for the contrast of MDD vs HC, to assess model generalization between datasets, we trained a model on one dataset and evaluated the model on the other dataset. For GCN we used a 5-fold cross-validation (CV) scheme to perform model selection on 20% of the test set. The SVMs do not need model selection and we applied a "one-shot" procedure.
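The evaluation scheme above (balanced accuracy averaged over five folds, compared against chance with a permutation test) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a nearest-centroid rule stands in for the SVM and GCN models, the data are synthetic, and all function names and the number of permutations are hypothetical.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (sensitivity and specificity for two classes)."""
    classes = np.unique(y_true)
    return float(np.mean([np.mean(y_pred[y_true == c] == c) for c in classes]))

def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (not the paper's SVM or GCN): closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(dists, axis=1)]

def cross_validated_score(X, y, n_folds=5, seed=0):
    """Average balanced accuracy over the held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = nearest_centroid_predict(X[train], y[train], X[test])
        scores.append(balanced_accuracy(y[test], pred))
    return float(np.mean(scores))

def permutation_p_value(X, y, n_perm=99, seed=0):
    """Share of label shufflings scoring at least as well as the true labels."""
    rng = np.random.default_rng(seed)
    observed = cross_validated_score(X, y)
    null = [cross_validated_score(X, rng.permutation(y)) for _ in range(n_perm)]
    p = (1 + sum(s >= observed for s in null)) / (1 + n_perm)
    return observed, p

# Synthetic example: two well-separated classes of 50 subjects, 20 features.
rng = np.random.default_rng(1)
y = np.array([0] * 50 + [1] * 50)
X = rng.standard_normal((100, 20))
X[y == 1] += 1.0
observed, p = permutation_p_value(X, y)

# Bonferroni correction: divide alpha by the number of comparisons
# (n_comparisons is a hypothetical value, not taken from the paper).
n_comparisons = 10
significant = p < 0.05 / n_comparisons
```

Note that the smallest attainable p-value is 1 / (n_perm + 1), so the number of permutations has to be large enough for a Bonferroni-corrected threshold to be reachable at all.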
Linear and rbf SVM. We used linear and rbf SVM, popular classifiers that find respectively linear and non-linear combinations of features that best separate classes among the observations [24, 25]. The upper triangular portion of the FC matrix was used as input for both linear and rbf SVM (Fig. 2).
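As a rough illustration of how an FC matrix becomes an SVM input vector, the sketch below computes Pearson correlations between synthetic ROI time courses and flattens the upper triangle. The time series are random numbers, the function names are hypothetical, and excluding the diagonal is an assumption (the diagonal of a correlation matrix is always 1 and carries no information).

```python
import numpy as np

def functional_connectivity(timeseries):
    """Pearson correlation between ROI time courses.

    timeseries: array of shape (n_timepoints, n_rois).
    """
    return np.corrcoef(timeseries, rowvar=False)

def upper_triangle_features(fc):
    """Flatten the strictly upper triangle (diagonal excluded) into a 1D vector."""
    rows, cols = np.triu_indices(fc.shape[0], k=1)
    return fc[rows, cols]

rng = np.random.default_rng(0)
timeseries = rng.standard_normal((200, 112))  # synthetic: 200 volumes, 112 ROIs
fc = functional_connectivity(timeseries)
features = upper_triangle_features(fc)
print(fc.shape, features.shape)  # (112, 112) (6216,)
```

With 112 regions this gives 112 × 111 / 2 = 6216 features per participant, which is large relative to the number of participants at any single site.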
GCN. Spatial GCN, referred to simply as GCN here, is a particular class of GCN. The first step consisted of transforming the FC matrices into graph representations. A graph representation is composed of nodes, nodal features or embeddings, and edges connecting the nodes. In our case, each node represents a region of interest (ROI). To construct the edges of the graphs, we thresholded the FC and binarized it so that the top 50% of connections in terms of connectivity strength were transformed into ones and the rest into zeros, regulating the sparsity of the graph. This threshold was derived from previous studies, including a systematic search for optimal graph sparsity by our own group. The nodal features were defined by the connectivity profile of that region to other regions, meaning the corresponding row in the FC matrix before thresholding (Fig. 1). This allowed the model to abstract information from each group of regions that have high temporal correlation. The GCN architecture used for this work was optimized for each contrast and dataset. See Supplementary Fig. S2 for a visual representation of the model and related concepts and Supplementary Information for details about the architecture. We used a binary cross-entropy loss function and optimized the weights using the Adam optimizer. The model was trained for 100 epochs with an initial learning rate of 0.001, decaying by a factor of 10 every 30 epochs.
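The graph construction and learning-rate schedule described above can be sketched in a few lines. This is a simplified reading rather than the authors' implementation: connections are ranked by absolute correlation (as in the Fig. 1 caption), the diagonal is ignored when ranking, and the names `build_graph` and `step_decay_lr` are hypothetical.

```python
import numpy as np

def build_graph(fc, keep_fraction=0.5):
    """Return (adjacency, node_features) for one participant's FC matrix."""
    n = fc.shape[0]
    node_features = fc.copy()                   # row i = connectivity profile of ROI i
    rows, cols = np.triu_indices(n, k=1)
    strength = np.abs(fc[rows, cols])           # assumption: rank by |correlation|
    n_keep = int(round(keep_fraction * strength.size))
    keep = np.argsort(strength)[::-1][:n_keep]  # strongest connections first
    adjacency = np.zeros((n, n), dtype=int)
    adjacency[rows[keep], cols[keep]] = 1
    adjacency = adjacency + adjacency.T         # undirected graph, empty diagonal
    return adjacency, node_features

def step_decay_lr(epoch, initial_lr=1e-3, factor=10.0, step=30):
    """Learning rate after dividing by `factor` once every `step` epochs."""
    return initial_lr / factor ** (epoch // step)

rng = np.random.default_rng(0)
fc = np.corrcoef(rng.standard_normal((200, 112)), rowvar=False)  # synthetic FC
adjacency, node_features = build_graph(fc)
print(adjacency.sum() // 2, step_decay_lr(0), step_decay_lr(30))
```

Keeping the unthresholded rows as node features means the binarization only controls which nodes exchange information, while the continuous correlation values remain available to the model.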
GCN-Explainer and ablation study
Two complementary experiments were carried out on the main MDD vs HC
contrast. To assess the consistency of the results (e.g., replication), we
performed the experiment on the psymri and the mddrest datasets
independently. Regions highlighted by both datasets are reported in the
results section. We focused on visualization of the GCN results because the
methods allowed us to use complementary visualization techniques and
strategies to enhance reliability of the results.
GCN-Explainer shows the manner in which the GCN classifier made the predictions. These explanations are in the form of a subgraph of the entire graph the GCN was trained on, so that the subgraph maximizes the mutual information with the GCN prediction. This is achieved by formulating a mean field variational approximation and learning a real-valued graph mask which selects the important subgraph of the GCN's computation graph.
We additionally performed an ablation study to identify the regions that influenced the performance of the GCN model in separating HC and MDD patients. This was done by masking the connectivity profile of each region, i.e., deleting the corresponding row from the connectivity matrix of the test set. The resultant drop in accuracy from the performance of the model trained on the full connectivity matrix is attributed to the region. We repeated the train-test process masking each region 10 times and calculated the mean drop in accuracy. The repetition leverages the stochastic nature of the GCN classifier to enhance the replicability of the results.
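The ablation logic can be sketched as below. Several simplifications are assumed and are not taken from the paper: a fixed `predict` callable stands in for the trained GCN (no retraining per repetition), "deleting" a row is implemented as setting it to zero, and plain accuracy is used on a balanced toy test set. All names are hypothetical.

```python
import numpy as np

def ablation_drops(predict, X_test, y_test, n_repeats=10, seed=0):
    """Mean accuracy drop per region when its connectivity profile is masked.

    predict: hypothetical callable (X, rng) -> predicted labels.
    X_test:  FC matrices, shape (n_subjects, n_rois, n_rois).
    """
    rng = np.random.default_rng(seed)

    def accuracy(X):
        # Averaging over repeats only matters for a stochastic model.
        return np.mean([np.mean(predict(X, rng) == y_test) for _ in range(n_repeats)])

    baseline = accuracy(X_test)
    drops = np.zeros(X_test.shape[1])
    for roi in range(X_test.shape[1]):
        masked = X_test.copy()
        masked[:, roi, :] = 0.0  # assumption: masking = zeroing the region's row
        drops[roi] = baseline - accuracy(masked)
    return drops

# Toy data in which only region 2 carries class information.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 20)
X = rng.standard_normal((40, 6, 6)) * 0.1
X[:, 2, :] = np.where(y[:, None] == 1, 1.0, -1.0)

def toy_predict(X, rng):
    """Toy model: classify by the sign of region 2's mean connectivity."""
    return (X[:, 2, :].mean(axis=1) > 0).astype(int)

drops = ablation_drops(toy_predict, X, y)
print(int(drops.argmax()))
```

In this toy case the drop is concentrated entirely in region 2, because the stand-in model uses no other region; with a real model the drops are spread over many regions and are far smaller.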
Univariate group analyses
Univariate independent-sample t-tests were performed on FC for the psymri and mddrest datasets separately. Sex, age, recording site and movement during scanning (average framewise displacement according to Jenkinson) were regressed out before testing. For each contrast and dataset, results were FDR corrected for multiple comparisons (p < 0.05).
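For readers unfamiliar with FDR correction, a minimal sketch follows. It assumes the Benjamini-Hochberg procedure, which is the most common choice but is not named in the text; the p-values are made up for illustration and `fdr_bh` is a hypothetical helper name.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg procedure: mask of tests rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    m = p.size
    thresholds = alpha * np.arange(1, m + 1) / m  # alpha * rank / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.flatnonzero(passed).max()          # largest rank meeting its threshold
        reject[order[: k + 1]] = True             # reject everything up to that rank
    return reject

# Made-up p-values for five connections.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
print(fdr_bh(p_values).tolist())  # [True, True, False, False, False]
```

In the setting above, one p-value per connection (6216 for a 112-region atlas) would be passed in, and only connections flagged `True` would be reported as significant.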
[Figure 1: pipeline diagram. BOLD signal (ROIs × time) → fMRI preprocessing and brain parcellation in ROIs → FC (ROIs × ROIs). FC lower triangle → flatten to 1D → SVM classifier. FC thresholded → edges, with FC rows as node features → GCN classifier.]
Fig. 1 Pipeline from 4D rs-fMRI data to input for the classification task. Visual representation of our pipeline. For the psymri dataset, preprocessing of the raw 4D rs-fMRI and parcellation of the brain in regions of interest (ROIs) according to the Harvard-Oxford atlas was performed in house, while the mddrest consortium provided us directly with the time course of the same ROIs. The functional connectivity (FC) matrix was calculated using Pearson correlation between ROIs. Each entry in the FC represented the strength of functional connectivity between two ROIs, and each row represented the correlation profile between one ROI and other ROIs. Since the FC is symmetrical, only one of the triangles was used as input for the SVM classifiers. From the FC we constructed the graph, which was used as GCN input. The ROIs were used as the nodes of the graphs. To construct the edges between nodes, i.e., the FC between ROIs, we first binarized the FC matrix so that only the 50% highest absolute values of the correlations of the matrix were transformed into ones, while the rest were transformed into zeros. We then drew an edge between ROIs whose correlation survived the binarization process. A feature was assigned to each node. The features were the original (i.e., before binarization) correlation profile of the node itself with the rest of the ROIs in the brain, therefore an entire row of the FC. SVM support vector machine, GCN graph convolutional network.
RESULTS
Classification performance
Classifications for the main comparison between MDD and HC were significantly better than chance level after correction for multiple comparisons (with the exception of linear SVM classification of psymri), though balanced accuracies averaged across folds were low, with a mean of 61% across datasets and classification models (range 57–63%; see Fig. 2 and Supplementary Table S2). Average balanced accuracies for the comparisons between medicated MDD vs HC, non-medicated MDD vs HC, and medicated MDD vs non-medicated MDD were comparable, with a mean of 62% (range 54–67%). At least one classification model was significantly better than chance for each of these three comparisons for mddrest and the combination of mddrest + psymri, while none of the classifications for psymri were significant. The Supplementary Information provides additional evaluation metrics showing that sensitivity and specificity were balanced (Supplementary Tables S3–S5), and that site harmonization using Combat had little influence on the results (Supplementary Table S15). Comparable classification results were obtained when using a fully connected deep learning model or when using a functional instead of structural parcellation atlas (see Supplementary Information).
The cross-dataset training procedure for the contrast MDD vs HC resulted in lower performance. A GCN trained on psymri and tested on mddrest reached a mean accuracy of 54.16% (sd = 0.66), while a GCN trained on mddrest and tested on psymri reached a mean accuracy of 56.38% (sd = 0.84). On the same two contrasts, the linear SVM reached accuracies of 55.7% and 54.8%, and the rbf SVM reached accuracies of 53.1% and 56.1%, respectively.
To investigate the influence of subject and research site characteristics on classification performance, we assessed the accuracy for the different sexes, diagnostic statuses, scanner manufacturers and recording sites for the rbf SVM, which performed best. The variability in accuracy across sites in particular was appreciable (range 48–87%), but was not significantly associated with sample size (r_s = 0.25, p = 0.25). Additional univariate t-testing revealed no significant FC difference between correctly and incorrectly classified participants (see Supplementary Information).
Symptom severity. To evaluate whether symptom severity could be predicted from the FC matrices, we used GCN and support vector regression (SVR) with the rbf kernel to predict Hamilton depression scores (HAM-D) for 1113 patients in mddrest and 333 patients in psymri. SVR could only explain 3.5–7% of the variance and GCN only predicted the training mean, indicating that symptom severity could not be predicted reliably.
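To make "explained variance" concrete, the sketch below computes the coefficient of determination (R²) and shows that a model which always predicts the mean explains none of the variance. The severity scores are invented for illustration, and using R² as the measure is an assumption about how the reported percentages were obtained.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Invented severity scores, for illustration only.
scores = np.array([10.0, 14.0, 18.0, 22.0, 26.0])
mean_only = np.full_like(scores, scores.mean())  # a model that always outputs the mean

print(r_squared(scores, mean_only))  # 0.0
print(r_squared(scores, scores))     # 1.0
```

On held-out data, predicting the training mean gives an R² at or slightly below zero, so a model that collapses to the mean carries no information about individual severity.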
Fig. 2 Performance of each classifier for each comparison, expressed as average balanced accuracy across five folds. Error bars indicate standard deviation across folds; * indicates classification results better than chance level after permutation testing. Significance level was corrected for the number of experiments performed, using the Bonferroni procedure.
GCN-Explainer and ablation study results
To gain insight into the most important connections for the classification of MDD vs HC, we performed two complementary visualization experiments on the mddrest and psymri datasets separately to assess whether results would be consistent. Visualization of the GCN using GCN-Explainer identified the connections between 1) the left and right thalamus, 2) the right lingual gyrus and right supracalcarine cortex, 3) the left and right anterior divisions of the supramarginal gyrus, and 4) the left and right medial frontal cortex. These connections were amongst the ten most influential connections and were present in both datasets (Fig. 3A). An ablation study showed the highest drops in balanced accuracy that were present in both datasets for the thalamus (mean (sd) over 10 repetitions per fold: 6.27 (2.17)% for psymri and 4.62 (1.08)% for mddrest) and Heschl's gyrus (5.99 (3.88)% for psymri, 4.12 (1.47)% for mddrest). The results for psymri and mddrest are presented in Fig. 3B and Supplementary Table S6.
Univariate group analyses
In the mddrest dataset, 28% of the connections showed significant differences between MDD and HC, with predominantly reduced FC in MDD. For example, the amygdala showed reduced connectivity with 154 other regions, the insula showed reduced connectivity with 126 other regions, and the anterior cingulate cortex showed reduced connectivity with 100 other regions, but increased connectivity with the right precentral and postcentral gyri. In contrast, the thalamus showed increased connectivity with 199 other brain regions, primarily with frontal and insular regions, but decreased connectivity between interhemispheric homologs (Fig. 4). Effect sizes were low [26], with an average Cohen's d of −0.14 (range −0.34, −0.08) across significantly reduced connections, and an average Cohen's d of 0.12 (range 0.08, 0.18) for significantly increased thalamus connections.
In the psymri dataset, only the decreased connectivity between the left and right supracalcarine cortices survived correction for multiple comparisons for the comparison between MDD and HC. Inspection of the uncorrected results showed a pattern comparable to that of the mddrest dataset (Fig. 4). Full results for each contrast and dataset (FDR corrected) are presented in Supplementary Tables S8–S13.
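The effect sizes reported above are Cohen's d values. A minimal sketch of the usual pooled-standard-deviation formula is given below; whether the authors used exactly this variant is not stated, so it should be read as an assumption, and the two groups are toy numbers rather than study data.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled standard deviation."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = a.size, b.size
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Toy connectivity values for one connection in patients and controls.
patients = [1.0, 2.0, 3.0, 4.0, 5.0]
controls = [2.0, 3.0, 4.0, 5.0, 6.0]
d = cohens_d(patients, controls)
print(round(d, 4))  # -0.6325
```

A negative value means lower connectivity in the first group. By common convention an absolute d around 0.2 is called small, so the values near 0.1 reported here imply heavily overlapping patient and control distributions, which is consistent with the modest classification accuracies.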
Only two comparisons, MDD vs HC and MDD-med vs HC, showed replicable differences in the two datasets (Table 1). In the first contrast, connectivity between the left and right supracalcarine cortex showed a reduction in patients (psymri t: 4.73, p-corr < 0.05; mddrest t: 3.69, p-corr < 0.0005), out of the 3536 significant connectivity differences in the mddrest dataset. For the second comparison, connectivity between the left thalamus and left prefrontal gyrus was increased in medicated patients (psymri t: 3.56, p-corr < 0.05; mddrest t: 3.03, p-corr < 0.05), while another nine FCs showed decreased FC in medicated patients, out of the 3527 significant connectivity differences in the psymri dataset and 596 in the mddrest dataset.
[Figure 3. Panel A: GCN-Explainer results. Panel B: ablation study results (x-axis: mean acc drop, −10% to 0%).]
Panel A, connections listed for psymri: Thalamus L | Thalamus R; Lingual Gyrus R | Supracalcarine Cortex R; Supramarginal Gyrus, ant, L | Supramarginal Gyrus, ant, R; Frontal Medial Cortex L | Frontal Medial Cortex R; Paracingulate Gyrus L | Paracingulate Gyrus R; Inf Frontal Gyrus, R | Inferior Frontal Gyrus, R; Paracingulate Gyrus R | Cingulate Gyrus, post, R; Cingulate Gyrus, ant, L | Cingulate Gyrus, R; Putamen L | Putamen R; Intracalcarine Cortex L | Intracalcarine Cortex R.
Panel A, connections listed for mddrest: Thalamus L | Thalamus R; Lingual Gyrus R | Supracalcarine Cortex R; Supramarginal Gyrus, ant, L | Supramarginal Gyrus, ant, R; Frontal Medial Cortex L | Frontal Medial Cortex R; Lingual Gyrus L | Lingual Gyrus R; Planum Temporale L | Planum Temporale R; Occipital Pole L | Occipital Pole R; Insular Cortex L | Insular Cortex R; Inf Frontal Gyrus, triangularis L | Inf Frontal Gyrus, opercularis L; Brain-Stem L | Brain-Stem R.
Panel B, regions listed in the first panel: Precentral Gyrus R; Thalamus L; Sup. Parietal Lobule R; Heschl's Gyrus L; Occipital Pole R; Inf. Frontal Gyrus R; Temporal Pole R; Lingual Gyrus R; Pallidum R; Middle Temporal Gyrus, ant. L.
Panel B, regions listed in the second panel: Thalamus R; Middle Frontal Gyrus L; Frontal Medial Cortex R; Postcentral Gyrus R; Heschl's Gyrus L; Insular Cortex L; Frontal Orbital Cortex L; Cuneal Cortex L; Caudate R; Frontal Pole R.
Fig. 3 GCN explainer and ablation results for the classification of MDD and HC. A Results of the GCN explainer experiment obtained using the psymri dataset (left panel) and the mddrest dataset (right panel): on top is the graphic representation of the functional connections between areas identified as necessary to discriminate MDD from HC, which are listed. Connections identified by experiments in both datasets are shaded in gray. B Results of the ablation experiment obtained using the psymri dataset (left panel) and the mddrest dataset (right panel). Regions identified by experiments in both datasets are shaded in gray. L left, R right, ant anterior, inf. inferior, post posterior, acc balanced accuracy.
Sex classification
Classification of sex was beyond chance level for all the classifiers and datasets, with a mean across datasets and models of 68% (range 65–71%). To assess whether sex classification accuracy is comparable in other datasets, we performed similar analyses with comparable sample sizes (N = 2000) in the Abide and UK Biobank datasets. Sex classification accuracy was comparable in the retrospective Abide cohort (73%) and higher in the prospective harmonized UK Biobank cohort (81%).
DISCUSSION
The results showed that ML and DL classifiers were able to distinguish patients from controls beyond chance level, but that classification performance was low. Classification accuracies for (non-)medicated patients separately were comparable, suggesting that medication use had little influence on the results. Visualization of the functional connections that were most influential revealed hyperconnectivity of the thalamus. This was corroborated by two distinct visualization techniques and replicated in two datasets, suggesting that thalamic hyperconnectivity may be the most prominent neurophysiological characteristic of MDD. Interestingly, thalamic hyperconnectivity was rather specific, as MDD was mainly associated with widespread hypoconnectivity.
The 61% accuracy in these two datasets is considerably lower
than the average 84% accuracy across small-scale studies in a
recent meta-analysis [4]. Our results corroborate those from a
recent Japanese multicenter study that reported a balanced
accuracy of 6769% [27]. The lower accuracy with larger sample
sizes is paradoxical as ML and DL models only become better
when trained on larger samples [28]. However, neuroimaging
research has actually shown that prediction accuracy tends to
decline with increasing sample size [29,30]. This is presumably
due to the increase in clinical heterogeneity when recruiting larger
samples, as sample heterogeneity reduces model performance
[5,27]. We used data from two consortia that both consist of small
samples obtained at many different research centers, and
performances across sites ranged considerably (see Supplemen-
tary Information). Accordingly, the large total sample size came
together with large heterogeneity, which is probably responsible
for the poor accuracy of our model. Strategies to mitigate sites
effect were not successful (see Supplementary Information).
Heterogeneity is maximal when training and testing is performed
on two different datasets, and indeed the lowest results obtained
in the cross-datasets experiments conrmed the role of hetero-
geneity in compromising the nal performance, which may be
related to the distinct ancestry of the Chinese and European
[Figure 4: ROI-by-ROI matrices for psymri (uncorrected) and mddrest (FDR corrected). Color scales: t-value (-4 to 4) and Cohen's d (-0.3 to 0.3). ROIs grouped by lobe: Frontal, Limbic, Occipital, Parietal, Sub-cortical, Temporal.]
Fig. 4 Univariate t-test results and Cohen's d. Left: Results of the univariate t-test for the classification task MDD vs HC for the mddrest
dataset (top) and for the psymri (bottom). The mddrest results are corrected for multiple comparison and thresholded using FDR < 0.05. The
red lines correspond to the left and the right thalami. For the psymri dataset, t-tests did not survive correction for multiple comparisons, and
the results are thresholded at p-uncorr < 0.05 to illustrate the comparable pattern as for mddrest. The clustering for lobes is done merely for
illustration purposes. Right: Cohen's d for the classification task MDD vs HC for the mddrest dataset (top) and for the psymri (bottom), calculated
for each voxel group comparison.
S. Gallo et al.
3018
Molecular Psychiatry (2023) 28:3013–3022
Content courtesy of Springer Nature, terms of use apply. Rights reserved
cohorts. Nevertheless, combination of both datasets led to
comparable performance, indicating that it is possible to construct
an MDD model that can generalize across cohorts. Although
sample homogeneity can lead to optimal model performance,
sample heterogeneity ensures optimal generalization of the
model to new data [5], suggesting that our results could form a
lower bound on classification accuracy.
One way to reduce sample heterogeneity is to take clinical
variability into account. We therefore split the sample into
medicated and non-medicated patients. Although antidepressants
are known to affect resting-state connectivity [31], this did not
increase classification performance, suggesting that medication use
had little influence on the classification results. Attempts to classify
patients based on symptom severity or demographics were
unsuccessful (see Supplementary Fig. S3). The diagnosis of MDD
depends on the subjective evaluation of nine different symptoms,
and as few as one symptom may overlap between two patients
[32]. Moreover, comorbidity is common and symptoms may overlap with
other disorders [33], leading to low interrater reliability of the
diagnosis [34]. Such uncertainty associated with the diagnosis can
obscure the relationship between a patient's data and the category
it belongs to [35–38], and thereby decrease accuracy [39]. Data-driven
definition of the disorder and the use of biotypes could help
arrive at more homogeneous psychiatric groups. The search for
MDD-biotypes triggered a flurry of publications [40–45] and
discussions in the last few years, but no consensus has emerged yet.
Another way to reduce heterogeneity of the dataset is to
statistically harmonize data across centers. We performed data
harmonization using ComBat, which had little influence on the
results. While this procedure can increase the power to detect group
differences, it also had little influence on the classification of
Alzheimer's disease in the ADNI dataset [46]. While site harmonization
will have a large effect on the ability to distinguish centers from
the data, we expect that it will not have a large influence on the
classification of interest when the dataset is balanced and site
information is independent of group membership.
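To make the idea of site harmonization concrete, the sketch below removes per-site shifts and scale differences from a single connectivity feature. It is a deliberately simplified stand-in and not ComBat itself, which additionally pools site parameters with empirical Bayes and can preserve covariates of interest. The function name `harmonize_location_scale` and the toy values are assumptions made for illustration.

```python
import math
import statistics
from collections import defaultdict


def harmonize_location_scale(values, sites):
    """Align each site's mean and spread for one feature.

    Simplified illustration only (NOT ComBat): every site is shifted to the
    grand mean and rescaled to the pooled within-site standard deviation.
    """
    grand_mean = statistics.fmean(values)

    # Group the feature values by acquisition site.
    by_site = defaultdict(list)
    for value, site in zip(values, sites):
        by_site[site].append(value)
    site_mean = {s: statistics.fmean(v) for s, v in by_site.items()}
    site_sd = {s: statistics.pstdev(v) for s, v in by_site.items()}

    # Residuals after removing each site's mean, and their pooled spread.
    residuals = [v - site_mean[s] for v, s in zip(values, sites)]
    target_sd = math.sqrt(statistics.fmean([r * r for r in residuals]))

    adjusted = []
    for r, s in zip(residuals, sites):
        scale = target_sd / site_sd[s] if site_sd[s] > 0 else 1.0
        adjusted.append(grand_mean + r * scale)
    return adjusted


# Toy example: site B has a constant offset of +10 relative to site A.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
sites = ["A", "A", "A", "B", "B", "B"]
harmonized = harmonize_location_scale(values, sites)
print([round(v, 3) for v in harmonized])
```

After this adjustment the site means coincide, so site identity can no longer be read from the feature, while differences between subjects within a site are preserved. This is consistent with the expectation stated above that harmonization matters little for the classification of interest when groups are balanced across sites.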
Despite our hypothesis that exploiting the graph-like structure
of FC would be beneficial for the classification tasks, we found no
clear advantage in using GCN over SVM, nor with another tested DL
model (Supplementary Information). DL methods like GCNs
perform especially well when applied to very large samples such
as the 14 million images in ImageNet [47]. However, in
applications like ours, the size of the dataset is limited.
Even this largest sample of MDD fMRI data might simply not be
enough to exploit the potential of GCNs [48].
Although the results did not meet the accuracy criteria for a
clinical diagnostic tool, the insights they provide go beyond
“mere” prediction, and can help connect neurobiological processes
to their psychiatric consequences (but see ref. [49]). In the
classification between MDD and HC, the thalamus stood out as the
only region whose importance is supported both by two different
visualization techniques and the replication across two datasets.
The GCN-Explainer identified the inter-hemispheric connections
between left and right thalamus as most important to the
classification. Of note, the prominent inter-hemispheric connections
in the results (Fig. 3A) may reflect the nature of the way we
constructed the GCN-layers (Supplementary Information). We
therefore assume that the entire FC profile of the thalamus is
driving the result, rather than only the interhemispheric connectivity.
This interpretation is supported by the ablation study
performed for the GCN for which we removed each thalamus and
its FC with the rest of the brain and witnessed a ~5% drop in
accuracy, confirming its role in discriminating between MDD and
HC. Importantly, the relatively low accuracy of the classifiers
influences the reliability of visualization techniques. For the GCN
model, we were able to take advantage of the stochastic nature of
the algorithm to increase the replicability of the results by
repeating the training-test procedure 10 times. SVM algorithms
are deterministic and this augmentation is not possible, limiting
reliability of the results even further (reported in the Supplementary
Information).
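The logic of such a node ablation can be sketched in a few lines: remove one region's row and column from every FC matrix, refit the classifier, and compare accuracies. The example below is a toy illustration under stated assumptions and not the pipeline used in the study. It uses synthetic data in which only one node carries a group difference, and a nearest-centroid classifier stands in for the GCN; all function names are hypothetical.

```python
import random


def ablate_node(fc, node):
    """Return a copy of a square FC matrix without one ROI's row and column."""
    return [[v for j, v in enumerate(row) if j != node]
            for i, row in enumerate(fc) if i != node]


def upper_triangle(fc):
    """Flatten the unique off-diagonal connections into a feature vector."""
    n = len(fc)
    return [fc[i][j] for i in range(n) for j in range(i + 1, n)]


def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest_centroid_accuracy(train, test):
    """train/test: lists of (feature_vector, label) with labels 0 or 1."""
    cents = {lab: centroid([x for x, y in train if y == lab]) for lab in (0, 1)}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    hits = sum(1 for x, y in test
               if min((0, 1), key=lambda lab: dist2(x, cents[lab])) == y)
    return hits / len(test)


def toy_subject(rng, label, n_nodes=6, signal_node=0, shift=1.0, noise=0.5):
    """Synthetic symmetric FC matrix; only `signal_node` differs between groups."""
    fc = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            value = rng.gauss(0.0, noise)
            if label == 1 and signal_node in (i, j):
                value += shift  # group 1 is "hyperconnected" at this node
            fc[i][j] = fc[j][i] = value
    return fc


def accuracy_with_and_without(node, seed=0, n_train=200, n_test=200):
    """Fit and evaluate once on the full matrices and once with `node` removed."""
    rng = random.Random(seed)
    subjects = [(toy_subject(rng, k % 2), k % 2) for k in range(n_train + n_test)]

    def evaluate(transform):
        data = [(upper_triangle(transform(fc)), y) for fc, y in subjects]
        return nearest_centroid_accuracy(data[:n_train], data[n_train:])

    return evaluate(lambda fc: fc), evaluate(lambda fc: ablate_node(fc, node))


full_acc, ablated_acc = accuracy_with_and_without(node=0)
print(f"full: {full_acc:.2f}, node 0 ablated: {ablated_acc:.2f}")
```

In this toy setting the accuracy falls towards chance once the informative node is removed, because no other connection carries signal. In real data the drop is expected to be smaller, as in the ~5% reported above, since information is distributed across many regions.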
Additional univariate t-testing showed thalamic hyperconnectivity
in mddrest, while most other brain regions showed
hypoconnectivity. This pattern was also observed in psymri, but
this did not withstand correction for multiple comparisons.
Univariate effect sizes in the two datasets were comparable,
suggesting that the psymri results could have been penalized by
the smaller sample size. In general, the univariate effect sizes were
negligible to small [26]. This highlights the usefulness of multivariate
analysis, as the obtained ~60% accuracy translates into a
medium effect size [50].
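The quantities discussed in this paragraph can be illustrated with a short sketch: Cohen's d for one connection, Benjamini-Hochberg FDR control across connections, and a conversion from classification accuracy to an effect size. The conversion uses the odds-ratio formula of ref. [50] (d = ln(OR) × √3 / π) under the added assumption that sensitivity and specificity both equal the reported accuracy; the text does not state how the authors performed this conversion, so the function `accuracy_to_d` is a hypothetical reconstruction.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test survives FDR control at q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank k with p_(k) <= q * k / m defines the rejection set.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def accuracy_to_d(accuracy):
    """Convert accuracy to d via the odds ratio.

    Assumption (not from the source): sensitivity == specificity == accuracy,
    so the diagnostic odds ratio is (accuracy / (1 - accuracy)) ** 2.
    """
    odds_ratio = (accuracy / (1.0 - accuracy)) ** 2
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


print(round(accuracy_to_d(0.61), 2))  # about 0.49
```

Under these assumptions an accuracy of 61% corresponds to d of roughly 0.5, which is conventionally labeled a medium effect, whereas the individual connections have |d| values in the negligible-to-small range.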
Initial studies [51,52] as well as recent meta-analyses [53–57]
have already pointed to thalamic hyperactivity, during rest as well
as during cognitive and emotion processing. Other studies have
Table 1. Replicated univariate t-test results in the psymri and mddrest datasets.
Connection between                     and                                 psymri t  psymri p-value  mddrest t  mddrest p-value
MDD vs HC
R Supracalcarine Cortex                L Supracalcarine Cortex             4.73      0.017           ####       ######
MDD med vs HC
L Thalamus                             L Precentral Gyrus                  3.56      0.048           3.03       0.034
L Temporal Occipital Fusiform Cortex   L Lateral Occipital Cortex, infer.  4.52      0.005           ####       0.037
R Supracalcarine Cortex                L Intracalcarine Cortex             4.29      0.009           ####       0.021
R Superior Temporal Gyrus, pos.        R Superior Temporal Gyrus, ant.     4.06      0.018           ####       0.020
R Planum Temporale                     R Precentral Gyrus                  3.93      0.021           ####       0.015
L Frontal Orbital Cortex               R Frontal Medial Cortex             3.75      0.032           ####       0.049
L Heschl's Gyrus (includes H1 and H2)  R Postcentral Gyrus                 3.74      0.032           ####       0.044
R Supracalcarine Cortex                R Intracalcarine Cortex             3.73      0.032           ####       0.016
R Putamen                              R Middle Frontal Gyrus              3.57      0.048           ####       0.046
L Central Opercular Cortex             R Precentral Gyrus                  3.56      0.048           ####       0.012
Functional connectivity results showing differences between the MDD and HC group and the medicated MDD and the HC groups that are replicated in the
two datasets. T-values and p-values, FDR corrected, are reported separately for the psymri and mddrest dataset.
R Right, L Left.
suggested metabolic abnormalities in the thalamus of patients
with depression [58–60], and specifically the mediodorsal
thalamus was implicated in the onset of depression [61,62]. This
nucleus is responsible for integrating sensory, motor, visceral and
olfactory information and subsequently relating it to the
individual's emotional state [63], and its connectivity profile is
congruent with the increased connectivity pattern we find in our
study. This suggests that our results may be driven by
hyperconnectivity of the mediodorsal thalamus. Hypervigilant
brain states in MDD have been observed with electroencephalography
[64,65] and, inversely, thalamic deactivation precedes
sleep onset. EEG-fMRI studies in healthy subjects have reported a
thalamic BOLD signal decrease in lower vigilance states [64] and,
directly relevant to our result, thalamocortical uncoupling as a
general hallmark of (light) sleep [66–68]. Thalamic hyperconnectivity
during MRI scanning (which represents a mild stress
experiment) may well hint towards a general MDD-related
dysfunction within the larger brain network, as recent views on
the thalamus hold that it is not a passive relay station but that it
has a central role in ongoing cortical functioning [69,70]. Overall,
this suggests a hypothesis that corticothalamic hyperconnectivity
may “hijack” the corticocortical connectivity that was reduced
throughout the brain in our study.
This study has to be considered in light of its strengths and
limitations. Its main strength is the use of two of the largest
resting-state fMRI consortia with clinically confirmed MDD that
show converging evidence for poor discrimination of MDD and
the importance of thalamic hyperconnectivity. At the same time,
these large datasets come with the limitation of large clinical (e.g.,
differences in severity and chronicity) and technological (e.g.,
differences in scanners and MRI acquisitions) heterogeneity,
presumably reducing classification accuracy. To evaluate whether
technological heterogeneity could have influenced the results, we
compared sex classification in our MDD cohorts with sex
classification in a comparable cohort for ASD (ABIDE) and a high
quality dataset with prospective data acquisition harmonization
(UK Biobank). The results show that sex classification accuracy can
increase from 71–73% in the MDD and ASD datasets to 81% in
the UK Biobank. This suggests that our MDD classification result is
as good as can be obtained from heterogeneous retrospective
multicenter cohorts. However, the higher sex classification accuracy
in the UK Biobank suggests that the accuracy for MDD may
improve if all data could be collected on the same scanner. A
further limitation is that we analyzed FC as a stationary feature
even though it consists of dynamic changes in neural activity over
time, which may be important for the classification of MDD [71].
And finally, given the domain heterogeneity of psychiatric
disorders, classifying a disease based on one brain imaging
modality only is reductive. Integrative modeling of multimodal
data, such as molecular, genomic, clinical, medical imaging,
physiological signal and behavioral data, means comprehensively
considering different aspects of the disease, thus likely enhancing
the classification performance [13,72].
In conclusion, our study provides a realistic and possibly lower
bound estimate of the classification performance that can be
obtained with the application of FC on a large, ecologically valid,
multi-site sample of MDD patients. Our findings show that FC can
distinguish between MDD patients and HCs, but that it is not
sufficiently accurate for clinical use. Despite the low accuracy,
visualization of the DL classifier enabled important insights into
the neural basis of MDD, and revealed consistent and reproducible
thalamic hyperconnectivity as the most prominent neurophysiological
characteristic of MDD.
DATA AVAILABILITY
Deidentified and anonymized data were contributed from studies approved by local
Institutional Review Boards. All study participants provided written informed consent
at their local institution. Data of the PsyMRI project are available at http://psymri.org/.
Data of the REST-meta-MDD project are available at: http://rfmri.org/REST-meta-MDD.
Data that were generated and the graph convolutional models used for this study are
available on request to the corresponding author.
REFERENCES
1. GBD 2017 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392:1789–858.
2. Coleman JRI, Gaspar HA, Bryois J, Bipolar Disorder Working Group of the Psychiatric Genomics Consortium, Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium, Breen G. The genetics of the mood disorder spectrum: genome-wide association analyses of more than 185,000 cases and 439,000 controls. Biol Psychiatry. 2020;88:169–84.
3. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44–56.
4. Kambeitz J, Cabral C, Sacchet MD, Gotlib IH, Zahn R, Serpa MH, et al. Detecting neuroimaging biomarkers for depression: a meta-analysis of multivariate pattern recognition studies. Biol Psychiatry. 2017;82:330–8.
5. Schnack HG, Kahn RS. Detecting neuroimaging biomarkers for psychiatric disorders: sample size matters. Front Psychiatry. 2016;7:50.
6. Yan C-G, Chen X, Li L, Castellanos FX, Bai T-J, Bo Q-J, et al. Reduced default mode network functional connectivity in patients with recurrent major depressive disorder. Proc Natl Acad Sci USA. 2019;116:9078–83.
7. Friston KJ, Frith CD, Liddle PF, Frackowiak RS. Functional connectivity: the principal-component analysis of large (PET) data sets. J Cereb Blood Flow Metab. 1993;13:5–14.
8. Murrough JW, Abdallah CG, Anticevic A, Collins KA, Geha P, Averill LA, et al. Reduced global functional connectivity of the medial prefrontal cortex in major depressive disorder. Hum Brain Mapp. 2016;37:3214–23.
9. Hamilton JP, Etkin A, Furman DJ, Lemus MG, Johnson RF, Gotlib IH. Functional neuroimaging of major depressive disorder: a meta-analysis and new integration of base line activation and neural response data. Am J Psychiatry. 2012;169:693–703.
10. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44.
11. Zeng L-L, Shen H, Liu L, Wang L, Li B, Fang P, et al. Identifying major depression using whole-brain functional connectivity: a multivariate pattern analysis. Brain. 2012;135(Pt 5):1498–507.
12. Wang X, Ren Y, Zhang W. Depression disorder classification of fMRI data using sparse low-rank functional brain network and graph-based features. Comput Math Methods Med. 2017;2017:3609821.
13. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24:1583–98.
14. Quaak M, van de Mortel L, Thomas RM, van Wingen G. Deep learning applications for the classification of psychiatric disorders using neuroimaging data: Systematic review and meta-analysis. NeuroImage Clin. 2021;30:102584.
15. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv. 2016. https://arxiv.org/abs/1609.02907.
16. Castelvecchi D. Can we open the black box of AI? Nature. 2016;538:20–3. https://doi.org/10.1038/538020a.
17. Ying R, Bourgeois D, You J, Zitnik M, Leskovec J. GNNExplainer: generating explanations for graph neural networks. Adv Neural Inf Process Syst. 2019;32:9240–51.
18. Di Martino A, Yan CG, Li Q, Denio E, Castellanos FX, Alaerts K, et al. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol Psychiatry. 2014;19:659–67.
19. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779.
20. Chao-Gan Y, Yu-Feng Z. DPARSF: a MATLAB toolbox for “pipeline” data analysis of resting-state fMRI. Front Syst Neurosci. 2010;4:13.
21. Penny WD, Friston KJ, Ashburner JT, Kiebel SJ, Nichols TE, editors. Statistical parametric mapping: the analysis of functional brain images. Elsevier; 2011.
22. Makris N, Goldstein JM, Kennedy D, Hodge SM, Caviness VS, Faraone SV, et al. Decreased volume of left and total anterior insular lobule in schizophrenia. Schizophr Res. 2006;83:155–71.
23. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. Vol. 2. New York: Springer; 2009. pp. 1–758.
24. Fisher RA. The use of multiple measurements in taxonomic problems. Ann Eugen. 1936;7:179–88. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x.
25. McLachlan GJ. Discriminant analysis and statistical pattern recognition. In: Wiley Series in Probability and Statistics. 1992. https://doi.org/10.1002/0471725293.
26. Cohen J. Statistical power analysis. Curr Dir Psychol Sci. 1992;1:98–101.
27. Yamashita A, Sakai Y, Yamada T, Yahata N, Kunimatsu A, Okada N, et al. Generalizable brain network markers of major depressive disorder across multiple imaging sites. PLoS Biol. 2020;18:e3000966.
28. Schnack HG, Nieuwenhuis M, van Haren NEM, Abramovic L, Scheewe TW, Brouwer RM, et al. Can structural MRI aid in clinical classification? A machine learning study in two independent samples of patients with schizophrenia, bipolar disorder and healthy subjects. Neuroimage. 2014;84:299–306.
29. Wolfers T, Buitelaar JK, Beckmann CF, Franke B, Marquand AF. From estimating activation locality to predicting disorder: a review of pattern recognition for neuroimaging-based psychiatric diagnostics. Neurosci Biobehav Rev. 2015;57:328–49.
30. Bruin WB, Taylor L, Thomas RM, Shock JP, Zhutovsky P, Abe Y, et al. Structural neuroimaging biomarkers for obsessive-compulsive disorder in the ENIGMA-OCD consortium: medication matters. Transl Psychiatry. 2020;10:342.
31. van Wingen GA, Tendolkar I, Urner M, van Marle HJ, Denys D, Verkes R-J, et al. Short-term antidepressant administration reduces default mode and task-positive network connectivity in healthy individuals during rest. Neuroimage. 2014;88:47–53.
32. American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). Washington, DC: American Psychiatric Pub; 2013. pp. 991.
33. Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, et al. Research Domain Criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167:748–51. https://doi.org/10.1176/appi.ajp.2010.09091379.
34. Regier DA, Narrow WE, Clarke DE, Kraemer HC, Kuramoto SJ, Kuhl EA, et al. DSM-5 field trials in the United States and Canada, Part II: test-retest reliability of selected categorical diagnoses. Am J Psychiatry. 2013;170:59–70.
35. Hickey RJ. Noise modelling and evaluating learning from examples. Artif Intell. 1996;82:157–79. https://doi.org/10.1016/0004-3702(94)00094-8.
36. Nigam N, Dutta T, Gupta HP. Impact of noisy labels in learning techniques: a survey. Adv Data Inf Sci. 2020;403–11. https://doi.org/10.1007/978-981-15-0694-9_38.
37. AbuDahab K, Xu D-L, Keane J. Induction of belief decision trees from data. In: AIP Conference Proceedings. 2012. https://doi.org/10.1063/1.4756644.
38. Lim C, Han S, Lee J. Analyzing deep neural networks with noisy labels. In: 2020 IEEE International Conference on Big Data and Smart Computing (BigComp). 2020. https://doi.org/10.1109/bigcomp48618.2020.00012.
39. Frénay B, Verleysen M. Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst. 2014;25:845–69.
40. Drysdale AT, Grosenick L, Downar J, Dunlop K, Mansouri F, Meng Y, et al. Resting-state connectivity biomarkers define neurophysiological subtypes of depression. Nat Med. 2017;23:28–38.
41. Dinga R, Schmaal L, Penninx BWJH, van Tol MJ, Veltman DJ, van Velzen L, et al. Evaluating the evidence for biotypes of depression: methodological replication and extension of. Neuroimage Clin. 2019;22:101796.
42. Clementz BA, Sweeney JA, Hamm JP, Ivleva EI, Ethridge LE, Pearlson GD, et al. Identification of distinct psychosis biotypes using brain-based biomarkers. Focus. 2018;16:225–36.
43. Grosenick L, Shi TC, Gunning FM, Dubin MJ, Downar J, Liston C. Functional and optogenetic approaches to discovering stable subtype-specific circuit mechanisms in depression. Biol Psychiatry Cogn Neurosci Neuroimaging. 2019;4:554–66.
44. Mihalik A, Ferreira FS, Moutoussis M, Ziegler G, Adams RA, Rosa MJ, et al. Multiple holdouts with stability: improving the generalizability of machine learning analyses of brain–behavior relationships. Biol Psychiatry. 2020;87:368–76. https://doi.org/10.1016/j.biopsych.2019.12.001.
45. Ing A, Sämann PG, Chu C, Tay N, Biondo F, Robert G, et al. Identification of neurobehavioural symptom groups based on shared brain mechanisms. Nat Hum Behav. 2019;3:1306–18.
46. Chen AA, Beer JC, Tustison NJ, Cook PA, Shinohara RT, Shou H, Alzheimer's Disease Neuroimaging Initiative. Mitigating site effects in covariance for machine learning in neuroimaging data. Hum Brain Mapp. 2022;43:1179–95.
47. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, et al. ImageNet large scale visual recognition challenge. Int J Comp Vis. 2015;115:211–52. https://doi.org/10.1007/s11263-015-0816-y.
48. He T, Kong R, Holmes AJ, Sabuncu MR, Eickhoff SB, Bzdok D, et al. Is deep learning better than kernel regression for functional connectivity prediction of fluid intelligence? In: 2018 International Workshop on Pattern Recognition in Neuroimaging (PRNI). 2018. https://doi.org/10.1109/prni.2018.8423958.
49. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1:206–15. https://doi.org/10.1038/s42256-019-0048-x.
50. Chinn S. A simple method for converting an odds ratio to effect size for use in meta-analysis. Stat Med. 2000;19:3127–31. https://doi.org/10.1002/1097-0258(20001130)19:22<3127::aid-sim784>3.0.co;2-m.
51. Drevets WC, Videen TO, Price JL, Preskorn SH, Carmichael ST, Raichle ME. A functional anatomical study of unipolar depression. J Neurosci. 1992;12:3628–41.
52. Greicius MD, Flores BH, Menon V, Glover GH, Solvason HB, Kenna H, et al. Resting-state functional connectivity in major depression: abnormally increased contributions from subgenual cingulate cortex and thalamus. Biol Psychiatry. 2007;62:429–37.
53. Hamilton JP, Farmer M, Fogelman P, Gotlib IH. Depressive rumination, the default-mode network, and the dark matter of clinical neuroscience. Biol Psychiatry. 2015;78:224–30.
54. Palmer SM, Crewther SG, Carey LM, START Project Team. A meta-analysis of changes in brain activity in clinical depression. Front Hum Neurosci. 2014;8:1045.
55. Müller VI, Cieslik EC, Serbanescu I, Laird AR, Fox PT, Eickhoff SB. Altered brain activity in unipolar depression revisited: meta-analyses of neuroimaging studies. JAMA Psychiatry. 2017;74:47–55.
56. Mayberg HS. Limbic-cortical dysregulation: a proposed model of depression. J Neuropsychiatry Clin Neurosci. 1997;9:471–81.
57. Phillips ML, Drevets WC, Rauch SL, Lane R. Neurobiology of emotion perception II: Implications for major psychiatric disorders. Biol Psychiatry. 2003;54:515–28.
58. Holthoff VA, Beuthien-Baumann B, Zündorf G, Triemer A, Lüdecke S, Winiecki P, et al. Changes in brain metabolism associated with remission in unipolar major depression. Acta Psychiatr Scand. 2004;110:184–94.
59. Dougherty DD, Weiss AP, Cosgrove GR, Alpert NM, Cassem EH, Nierenberg AA, et al. Cerebral metabolic correlates as potential predictors of response to anterior cingulotomy for treatment of major depression. J Neurosurg. 2003;99:1010–7.
60. Neumeister A, Nugent AC, Waldeck T, Geraci M, Schwarz M, Bonne O, et al. Neural and behavioral responses to tryptophan depletion in unmedicated patients with remitted major depressive disorder and controls. Arch Gen Psychiatry. 2004;61:765–73.
61. Li W, Liu J, Skidmore F, Liu Y, Tian J, Li K. White matter microstructure changes in the thalamus in Parkinson disease with depression: a diffusion tensor MR imaging study. AJNR Am J Neuroradiol. 2010;31:1861–6.
62. Young KA, Holcomb LA, Yazdani U, Hicks PB, German DC. Elevated neuron number in the limbic thalamus in major depression. Am J Psychiatry. 2004;161:1270–7.
63. Price JL, Drevets WC. Neurocircuitry of mood disorders. Neuropsychopharmacology. 2009;35:192–216.
64. Olbrich S, Mulert C, Karch S, Trenner M, Leicht G, Pogarell O, et al. EEG-vigilance and BOLD effect during simultaneous EEG/fMRI measurement. Neuroimage. 2009;45:319–32.
65. Hegerl U, Wilk K, Olbrich S, Schoenknecht P, Sander C. Hyperstable regulation of vigilance in patients with major depressive disorder. World J Biol Psychiatry. 2012;13:436–46.
66. Spoormaker VI, Sturm A, Andrade K, Schroeter M, Goya-Maldonado R, Holsboer F, et al. The neural correlates and temporal sequence of the relationship between shock exposure, disturbed sleep and impaired consolidation of fear extinction. J Psychiatr Res. 2010;44:1121–8. https://doi.org/10.1016/j.jpsychires.2010.04.017.
67. Sämann PG, Wehrle R, Hoehn D, Spoormaker VI, Peters H, Tully C, et al. Development of the brain's default mode network from wakefulness to slow wave sleep. Cereb Cortex. 2011;21:2082–93.
68. Tagliazucchi E, Laufs H. Decoding wakefulness levels from typical fMRI resting-state data reveals reliable drifts between wakefulness and sleep. Neuron. 2014;82:695–708.
69. Sherman SM. Thalamus plays a central role in ongoing cortical functioning. Nat Neurosci. 2016;19:533–41.
70. Weis S, Patil KR, Hoffstaedter F, Nostro A, Yeo BTT, Eickhoff SB. Sex classification by resting state brain connectivity. Cereb Cortex. 2020;30:824–35.
71. Yao D, Sui J, Yang E, Yap P-T, Shen D, Liu M. Temporal-adaptive graph convolutional network for automated identification of major depressive disorder using resting-state fMRI. Mach Learn Med Imaging. 2020:1–10. https://doi.org/10.1007/978-3-030-59861-7_1.
72. Yang J, Yin Y, Zhang Z, Long J, Dong J, Zhang Y, et al. Predictive brain networks for major depression in a semi-multimodal fusion hierarchical feature reduction framework. Neurosci Lett. 2018;665:163–9. https://doi.org/10.1016/j.neulet.2017.12.009.
ACKNOWLEDGEMENTS
This work was supported by the Netherlands Organization for Scientific Research
(NWO; 628.011.023); Philips Research; ZonMW (Vidi; 016.156.318). Access to the
UK Biobank data was granted under the application number 30091. Data collection
was supported by Swedish Research Council; ALF grant from Region Östergötland;
the Phyllis and Jerome Lyle Rappaport Foundation, Ad Astra Chandaria Foundation,
BIAL Foundation, Brain and Behavior Research Foundation, Anonymous donors, and
the Center for Depression, Anxiety, and Stress Research at McLean Hospital; The
German Research Foundation (DFG, grant FOR2107 DA1151/5-1 and DA1151/5-2 to
UD; SFB-TRR58, Projects C09 and Z02 to UD) and the Interdisciplinary Center for
Clinical Research (IZKF) of the medical faculty of Münster (grant Dan3/012/17 to UD);
European Commission (grant number H2020-634541); German Research Foundation
(GR 4510/2-1); Australian National Health and Medical Research Council of Australia
(NHMRC) Project Grants 1064643 (principal investigator, BJH) and 1024570 (principal
investigator, CGD); Austrian Science Fund (FWF, grant nr. KLI 597-827, KLI-148-B00,
F3514-B1); Science Foundation Ireland (SFI); The German Research Foundation (DFG
WA1539/4-1). This work also acknowledges the DIRECT consortium for providing the
Rest-Meta-MDD dataset.
AUTHOR CONTRIBUTIONS
SG: Conceptualization, Software, Methodology, Writing (original draft). A-EG:
Conceptualization, Software, Methodology, Writing (original draft). PZ: Data curation,
Writing (review & editing). RMT: Methodology, Supervision, Writing (review & editing).
NJ, ML, LB, DB, UD, CD, TF, IG, SGrimm, DG, TH, PJH, BJH, AJ, TK, BM, IN, SO, EP, LP,
MDS, PS, GW, HW, MW, The DIRECT Consortium: Data collection, Writing (review).
GvW: Conceptualization, Writing (review & editing), Supervision.
COMPETING INTERESTS
GvW received research funding from Philips. The other authors declare no competing
interests.
ADDITIONAL INFORMATION
Supplementary information The online version contains supplementary material
available at https://doi.org/10.1038/s41380-023-01977-5.
Correspondence and requests for materials should be addressed to Ahmed
El-Gazzar.
Reprints and permission information is available at http://www.nature.com/
reprints
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article's Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this license, visit http://
creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023
PSYMRI
Nooshin Javaheripour3, Meng Li3, Lucie Bartova 4, Udo Dannlowski 6, Christopher Davey 7, Thomas Frodl 8,9, Ian Gotlib 10,
Simone Grimm11, Dominik Grotegerd6, Tim Hahn 6, Paul J. Hamilton 12, Ben J. Harrison7, Andreas Jansen13, Tilo Kircher13,
Bernhard Meyer4, Igor Nenadić13, Sebastian Olbrich14, Elisabeth Paul 12, Lukas Pezawas 4, Matthew D. Sacchet15,
Philipp Sämann 16, Gerd Wagner 3, Henrik Walter 17, Martin Walter 8,9 and Guido van Wingen1,2
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com
... The functional brain network matrix is constructed by calculating the FC between all pairs of nodes. After FC construction, we applied a regression-out approach to remove any potential factors that might affect the statistical and classification results, such as sites, equipment, acquisition parameters, and the age and gender of participants (Gallo et al., 2023). Regression-out is a commonly used harmonization technique to address site effects and control for covariates across studies (Wang et al., 2023; Dansereau et al., 2017). ...
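The regression-out step described in this excerpt can be sketched as an ordinary least-squares residualisation. This is a minimal illustration, not the cited authors' pipeline: the function name `regress_out`, the toy data, and the choice of covariates (age, sex, one-hot site) are all assumptions made for the example.

```python
import numpy as np

def regress_out(features, covariates):
    """Return residuals of features after OLS regression on covariates.

    features:   (n_subjects, n_features) array, e.g. vectorised FC values
    covariates: (n_subjects, n_covariates) array, e.g. age, sex, site dummies
    """
    features = np.asarray(features, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    # Add an intercept column so the feature means are removed as well.
    design = np.column_stack([np.ones(len(covariates)), covariates])
    # Least-squares fit of every feature on the design matrix at once.
    beta, *_ = np.linalg.lstsq(design, features, rcond=None)
    return features - design @ beta

# Toy data (hypothetical): 60 subjects, 3 sites, 10 FC features.
rng = np.random.default_rng(0)
n = 60
site = np.repeat([0, 1, 2], n // 3)
age = rng.uniform(20, 60, size=n)
sex = rng.integers(0, 2, size=n)
# One-hot encode site and drop the first column to avoid collinearity
# with the intercept.
site_dummies = np.eye(3)[site][:, 1:]
covariates = np.column_stack([age, sex, site_dummies])
# Simulated features with an artificial site offset and age trend.
fc = rng.normal(size=(n, 10)) + 0.5 * site[:, None] + 0.02 * age[:, None]

fc_clean = regress_out(fc, covariates)
```

After this step the residuals are linearly uncorrelated with every covariate, and each site has a mean of zero per feature. When the cleaned features feed a cross-validated classifier, the coefficients would normally be estimated on the training folds only, to avoid leaking information from the test data.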
... Compared to other deep learning models, GNN is preferred for its capability to utilize the inherent graph structure of FC and incorporate graph theory metrics as node features. This combination facilitates the visualization of key features (Gallo et al., 2023). Specifically, the process began with training the GNN to perform binary classification on graph data, capturing high-order nonlinear connectivity relationships between brain regions (Yang et al., 2023). ...
... This is consistent with previous research. Gallo et al. (2023) employed GNN-Explainer to identify biomarkers and found intrinsic FCs within regions such as the thalamus and supramarginal gyrus in their study on depression. Similarly, Lei et al. (2022) used GNN to identify significant brain regions, which were predominantly located in subcortical and frontal regions, in patients with schizophrenia. ...
... Firstly, previous studies only conducted statistical difference analyses on quantitative indicators of functional network topology between MDD and healthy control groups, without quantifying the discriminative ability of these attributes or identifying which indicators differentiate the groups most strongly. Secondly, current research primarily focuses on characterizing the graph topology structures that encode MDD using advanced machine learning methods, to enhance understanding of the pathophysiological mechanisms underlying MDD [8][9][10][11][12]. These studies have framed MDD diagnosis as a graph classification problem and used graph convolutional neural networks, which learn graph topology from node inputs through graph convolution operators and pooling methods and map it into a low-dimensional graph encoding [13]. However, most studies input either functional connections or BOLD signals from brain regions as node representations [8][9][10][11][12]. One study evaluated the performance of five popular GNN architectures in an MDD identification task and showed that a one-dimensional convolutional neural network performed best on flattened functional connection data, possibly because redundant information reduced recognition efficiency within the GNN architectures [10]. Furthermore, another study used functional connectivity, BOLD signals, and functional metrics (including amplitude of low-frequency fluctuations (ALFF) [14], fractional ALFF (fALFF) [15], and regional homogeneity (ReHo) [16]) as node inputs for model learning. Its findings indicated that functional metric information was optimal for identifying MDD, whereas functional connectivity and blood-oxygen-level-dependent (BOLD) signal information were less efficient [9]. ...
... A comparison was made between the direct impact of node information on model learning efficiency and performance in representing graph structure, specifically between node functional metrics and node functional connectivity. The results show that node topological attributes are more effective for identifying MDD than the functional connectivity, BOLD signal, and functional metric inputs used as features of machine learning models in previous studies [8][9][10][11][12], providing empirical evidence for future deep learning on graph network structure to explore brain network topology in MDD patients. ...
Article
Major Depressive Disorder (MDD) is a common mental disorder characterized by cognitive impairment, and its pathophysiology remains to be explored. In this study, we aimed to explore the efficacy of brain network topological properties (TPs) in identifying MDD patients, revealing variational brain regions with efficient TPs. Functional connectivity (FC) networks were constructed from resting-state functional magnetic resonance imaging (rs-fMRI). Small-worldness did not exhibit significant variations in MDD patients. Subsequently, two-sample t-tests were employed to screen FC and reconstruct the network. The discriminative ability of TPs between MDD patients and healthy controls was analyzed using receiver operating characteristic (ROC) analysis. This showed that the small-worldness of the binary reconstructed FC network was reduced in MDD patients (p < 0.05), and that local efficiency (Le) and clustering coefficient (Cp) were the optimal sample features, with areas under the curve (AUC) of 0.6351 and 0.6347, respectively. The AUCs of Le and Cp for the brain regions retained by t-test (p < 0.05) were 0.6795 and 0.6956, respectively. Further, a support vector machine (SVM) model assessed the effectiveness of TPs in identifying MDD patients; using the Le and Cp of brain regions selected by the least absolute shrinkage and selection operator (LASSO), the average accuracies from leave-one-site-out cross-validation were 62.03% and 61.44%. Additionally, Shapley additive explanations (SHAP) were employed to elucidate variations in TPs across brain regions, revealing that predominant variations among MDD patients occurred within the default mode network. These results reveal efficient TPs that can provide empirical evidence for utilizing nodal TPs as effective inputs for deep learning on graph structures, contributing to understanding the pathological mechanisms of MDD.
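The leave-one-site-out cross-validation mentioned in this abstract can be illustrated with a small split generator. This is a sketch under assumptions: the helper name `leave_one_site_out` and the site labels are hypothetical, and the abstract's actual SVM, LASSO selection, and data are not reproduced here.

```python
import numpy as np

def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) for each unique site."""
    sites = np.asarray(sites)
    for held_out in np.unique(sites):
        test_mask = sites == held_out
        # Train on every other site, test on the held-out site only.
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical site label per subject.
sites = np.array(["A", "A", "B", "B", "B", "C"])
folds = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Each site serves as the test set exactly once, so the reported average accuracy reflects generalisation to scanners and cohorts the model never saw during training.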
... Compromised blood-brain barrier integrity, along with dysregulated cytokine and neurotrophic factor signaling, predisposes the central nervous system to an inflammatory state. Inflammation activates microglia and astrocytes, causes neuronal damage and apoptosis, impairs functional connectivity and significantly alters the structure and function of critical brain regions (including the hippocampus, frontal lobe, amygdala and striatum) [138,197,198]. ...
Article
Major depressive disorder is a prevalent mental disorder, yet its pathogenesis remains poorly understood. Accumulating evidence implicates dysregulated immune mechanisms as key contributors to depressive disorders. This review elucidates the complex interplay between peripheral and central immune components underlying depressive disorder pathology. Peripherally, systemic inflammation, gut immune dysregulation, and immune dysfunction in organs including gut, liver, spleen and adipose tissue influence brain function through neural and molecular pathways. Within the central nervous system, aberrant microglial and astrocyte activation, cytokine imbalances, and compromised blood-brain barrier integrity propagate neuroinflammation, disrupting neurotransmission, impairing neuroplasticity, and promoting neuronal injury. The crosstalk between peripheral and central immunity creates a vicious cycle exacerbating depressive neuropathology. Unraveling these multifaceted immune-mediated mechanisms provides insights into major depressive disorder’s pathogenic basis and potential biomarkers and targets. Modulating both peripheral and central immune responses represents a promising multidimensional therapeutic strategy.
... We utilized the 100-ROI and 400-ROI parcellation templates from the Schaefer2018 parcellation [21] to parcellate the cortex at different scales and to further estimate brain network systems. We applied a regression-out approach (one of the most widely used harmonization techniques [63,64]) to the preprocessed data to eliminate potential factors that could influence the results, including sites, equipment, acquisition parameters, and the age and gender of individuals. ...
Preprint
Network control theory (NCT) has recently been utilized in neuroscience to facilitate our understanding of brain stimulation effects. A particularly useful branch of NCT is optimal control, which focuses on applying theoretical and computational principles of control theory to design optimal strategies to achieve specific goals in neural processes. However, most existing research focuses on optimally controlling brain network dynamics from the original state to a target state at a specific time point. In this paper, we present the first investigation of introducing optimal stochastic tracking control strategy to synchronize the dynamics of the brain network to a target dynamics rather than to a target state at a specific time point. We utilized fMRI data from healthy groups, and cases of stroke and post-stroke aphasia. For all participants, we utilized a gradient descent optimization method to estimate the parameters for the brain network dynamic system. We then utilized optimal stochastic tracking control techniques to drive original unhealthy dynamics by controlling a certain number of nodes to synchronize with target healthy dynamics. Results show that the energy associated with optimal stochastic tracking control is negatively correlated with the intrinsic average controllability of the brain network system, while the energy of the optimal state approaching control is significantly related to the target state value. For a 100-dimensional brain network system, controlling the five nodes with the lowest tracking energy can achieve relatively acceptable dynamics control effects. Our results suggest that stochastic tracking control is more aligned with the objective of brain stimulation interventions, and is closely related to the intrinsic characteristics of the brain network system, potentially representing a new direction for future brain network optimal control research.
... The main rationale for implementing automated learning algorithms is in their aptitude for handling intricate and extensive datasets, discerning patterns, and predicting outcomes. The prompt identification of mental diseases relies heavily on these essential qualities [10][11][12][13][14][15]. ...
Article
The combination of extreme gradient boosting (XGBoost) and hippopotamus optimization algorithm (HOA) (XGBoost-HOA) was proposed for the multiclass classification of mental health disorders. The dataset considered for experimentation comprises 5019 records: 2487 from females and 2532 from males. It contains records related to depression, anxiety, and stress disorders. The class imbalance problem was addressed with the synthetic minority over-sampling technique (SMOTE). The XGBoost-HOA algorithm was applied to choose the most suitable hyperparameters and enhance class sensitivity while mitigating overfitting. The performance measures considered are accuracy, precision, recall, and F1-score. These are computed for the binary, three-class, four-class, and five-class classifications of depression, anxiety, and stress. The results demonstrate that the XGBoost-HOA algorithm possesses strong classification abilities, namely in the identification of depression and anxiety. When considering multifactor analysis, it attains accuracy up to 81%, precision up to 100%, recall up to 100%, and F1-scores up to 91%. The classification results for stress are lower than those for depression and anxiety. For depression, the receiver operating characteristic (ROC) curves indicate high area under the ROC curve (AUC) values, particularly for class 1 (AUC = 1.00) and class 5 (AUC = 0.95). Regarding anxiety, the ROC curves exhibit good performance, as indicated by high AUC values of 1.00 for class 1 and 0.93 for class 2. The ROC curves indicate that the performance for stress is moderate, with lower AUCs for certain classes, such as class 4 (AUC = 0.73). Overall, the XGBoost-HOA shows good results, except for occasional misclassification in some classes.
Article
Magnetic resonance imaging (MRI) offers non-invasive assessments of brain structure and function for analyzing brain disorders. With the increasing accumulation of multimodal MRI data in recent years, integrating information from various modalities has become an effective strategy for improving the detection of brain disorders. This study focuses on identifying major depressive disorder (MDD) by using arterial spin labeling (ASL) perfusion MRI in conjunction with structural MRI data. We collected ASL and structural MRI data from 260 participants, including 169 MDD patients and 91 healthy controls. We developed an explainable fusion method to identify MDD, utilizing cerebral blood flow (CBF) data from ASL perfusion MRI and brain tissue volumes from structural MRI. The fusion model, which integrates multimodal data, demonstrated superior predictive performance for MDD. By combining MRI regional volumes with CBF data, we achieved more effective results than using each modality independently. Additionally, we analyzed feature importance and interactions to explain the fusion model. We identified fourteen important features, comprising eight regional volumes and six regional CBF measures, that played a crucial role in the identification of MDD. Furthermore, we found three feature interactions among the important features and seven interactions between structural and functional features, which were particularly prominent in the model. The results of this study suggest that the fusion learning approach, which integrates ASL and structural MRI data, is effective in detecting MDD. Moreover, the study demonstrates that the model explanation method can reveal key features that influence the decisions of models, as well as potential interactions among these key features or between functional and structural features in identifying MDD. Keywords: Major depressive disorder; multimodal learning; structural MRI; Arterial spin labeling perfusion MRI
Chapter
Depression is inseparable from the distortive tendencies of late modernity. Anders Petersen wrote extensively on these tendencies, highlighting the general links between the structural paradoxes of late modernity and specific patterns of social suffering, which often manifest as depression. These phenomena however are always embedded in local contexts as well. This chapter aims at differentiating between the various modalities of depression emerging in two societies characterised by divergent paths of modernization. While the Netherlands was traditionally at the centre of the consecutive waves of modernization – including the birth and expansion of capitalism, industrialization, globalization and political liberalization – Hungary is characterised by a more reluctant relation to these processes. Given these divergent modernization paths the question dealt with in this chapter is in what way the framing of the ‘depression epidemic’ (according to the WHO depression is one of the leading causes of disability worldwide) also differs between the two countries. Focus is on discourses in popular media, internet platforms and online groups. What are the main differences between the Dutch and Hungarian popular discourses on depression as they unfold today, and how may this relate to the socio-cultural contexts at issue?
Article
Depression is vastly heterogeneous in its symptoms, neuroimaging data, and treatment responses. As such, describing how it develops at the network level has been notoriously difficult. In an attempt to overcome this issue, a theoretical “negative prediction mechanism” is proposed. Here, eight key brain regions are connected in a transient, state-dependent, core network of pathological communication that could facilitate the development of depressive cognition. In the context of predictive processing, it is suggested that this mechanism is activated as a response to negative/adverse stimuli in the external and/or internal environment that exceed a vulnerable individual’s capacity for cognitive appraisal. Specifically, repeated activation across this network is proposed to update an individual’s brain so that it increasingly predicts and reinforces negative experiences over time—pushing an individual at-risk for or suffering from depression deeper into mental illness. Within this, the negative prediction mechanism is poised to explain various aspects of prognostic outcome, describing how depression might ebb and flow over multiple timescales in a dynamically changing, complex environment.
Article
To acquire larger samples for answering complex questions in neuroscience, researchers have increasingly turned to multi-site neuroimaging studies. However, these studies are hindered by differences in images acquired across multiple sites. These effects have been shown to bias comparison between sites, mask biologically meaningful associations, and even introduce spurious associations. To address this, the field has focused on harmonizing data by removing site-related effects in the mean and variance of measurements. Contemporaneously with the increase in popularity of multi-center imaging, the use of machine learning (ML) in neuroimaging has also become commonplace. These approaches have been shown to provide improved sensitivity, specificity, and power due to their modeling the joint relationship across measurements in the brain. In this work, we demonstrate that methods for removing site effects in mean and variance may not be sufficient for ML. This stems from the fact that such methods fail to address how correlations between measurements can vary across sites. Data from the Alzheimer's Disease Neuroimaging Initiative is used to show that considerable differences in covariance exist across sites and that popular harmonization techniques do not address this issue. We then propose a novel harmonization method called Correcting Covariance Batch Effects (CovBat) that removes site effects in mean, variance, and covariance. We apply CovBat and show that within-site correlation matrices are successfully harmonized. Furthermore, we find that ML methods are unable to distinguish scanner manufacturer after our proposed harmonization is applied, and that the CovBat-harmonized data retain accurate prediction of disease group.
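The central claim of this abstract, that removing site effects in mean and variance can leave covariance differences untouched, can be illustrated with a toy simulation. This is a sketch under stated assumptions: per-site z-scoring stands in for mean and variance harmonization (it is neither ComBat nor CovBat), and the two simulated sites, their correlations, shifts, and scales are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # subjects per simulated site

def simulate_site(corr, shift, scale):
    """Draw two correlated features with a site-specific mean and scale."""
    cov = np.array([[1.0, corr], [corr, 1.0]])
    x = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
    return x * scale + shift

# Two hypothetical sites that differ in mean, variance, AND correlation.
site_a = simulate_site(corr=0.8, shift=2.0, scale=3.0)
site_b = simulate_site(corr=-0.2, shift=-1.0, scale=0.5)

def standardize(x):
    """Remove the per-feature mean and variance within one site."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

z_a, z_b = standardize(site_a), standardize(site_b)

# Means and variances now match across sites, but the correlation between
# the two features is unchanged by per-feature shifting and scaling.
corr_a = np.corrcoef(z_a, rowvar=False)[0, 1]
corr_b = np.corrcoef(z_b, rowvar=False)[0, 1]
print(f"site A correlation: {corr_a:.2f}, site B correlation: {corr_b:.2f}")
```

Because a multivariate classifier can exploit the remaining difference in correlation structure, it may still be able to tell the sites apart after this kind of harmonization, which is the gap the abstract says CovBat is designed to close.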
Article
Deep learning (DL) methods have been increasingly applied to neuroimaging data to identify patients with psychiatric and neurological disorders. This review provides an overview of the different DL applications within psychiatry and compares DL model accuracy to standard machine learning (SML). Fifty-three articles were included for qualitative analysis, primarily investigating autism spectrum disorder (ASD; n=22), schizophrenia (SZ; n=22) and attention-deficit/hyperactivity disorder (ADHD; n=9). Thirty-two of the thirty-five studies that directly compared DL to SML reported a higher accuracy for DL. Only sixteen studies could be included in a meta-regression to quantitatively compare DL and SML performance. This showed a higher odds ratio for DL models, though the comparison attained significance only for ASD. Our results suggest that deep learning of neuroimaging data is a promising tool for the classification of individual psychiatric patients. However, it is not yet used to its full potential: most studies use pre-engineered features, whereas one of the main advantages of DL is its ability to learn representations of minimally processed data. Our current evaluation is limited by minimal reporting of performance measures to enable quantitative comparisons, and the restriction to ADHD, SZ and ASD as current research focusses on large publicly available datasets. To truly uncover the added value of DL, we need carefully designed comparisons of SML and DL models which are yet rarely performed.
Article
Many studies have highlighted the difficulty inherent to the clinical application of fundamental neuroscience knowledge based on machine learning techniques. It is difficult to generalize machine learning brain markers to the data acquired from independent imaging sites, mainly due to large site differences in functional magnetic resonance imaging. We address the difficulty of finding a generalizable marker of major depressive disorder (MDD) that would distinguish patients from healthy controls based on resting-state functional connectivity patterns. For the discovery dataset with 713 participants from 4 imaging sites, we removed site differences using our recently developed harmonization method and developed a machine learning MDD classifier. The classifier achieved an approximately 70% generalization accuracy for an independent validation dataset with 521 participants from 5 different imaging sites. The successful generalization to a perfectly independent dataset acquired from multiple imaging sites is novel and ensures scientific reproducibility and clinical applicability.
Article
No diagnostic biomarkers are available for obsessive-compulsive disorder (OCD). Here, we aimed to identify magnetic resonance imaging (MRI) biomarkers for OCD, using 46 data sets with 2304 OCD patients and 2068 healthy controls from the ENIGMA consortium. We performed machine learning analysis of regional measures of cortical thickness, surface area and subcortical volume and tested classification performance using cross-validation. Classification performance for OCD vs. controls using the complete sample with different classifiers and cross-validation strategies was poor. When models were validated on data from other sites, model performance did not exceed chance-level. In contrast, fair classification performance was achieved when patients were grouped according to their medication status. These results indicate that medication use is associated with substantial differences in brain anatomy that are widely distributed, and indicate that clinical heterogeneity contributes to the poor performance of structural MRI as a disease marker.
Chapter
Extensive studies focus on analyzing human brain functional connectivity from a network perspective, in which each network contains complex graph structures. Based on resting-state functional MRI (rs-fMRI) data, graph convolutional networks (GCNs) enable comprehensive mapping of brain functional connectivity (FC) patterns to depict brain activities. However, existing studies usually characterize static properties of the FC patterns, ignoring the time-varying dynamic information. In addition, previous GCN methods generally use fixed group-level (e.g., patients or controls) representation of FC networks, and thus, cannot capture subject-level FC specificity. To this end, we propose a Temporal-Adaptive GCN (TAGCN) framework that can not only take advantage of both spatial and temporal information using resting-state FC patterns and time-series but also explicitly characterize subject-level specificity of FC patterns. Specifically, we first segment each ROI-based time-series into multiple overlapping windows, then employ an adaptive GCN to mine topological information. We further model the temporal patterns for each ROI along time to learn the periodic brain status changes. Experimental results on 533 major depressive disorder (MDD) and healthy control (HC) subjects demonstrate that the proposed TAGCN outperforms several state-of-the-art methods in MDD vs. HC classification, and also can be used to capture dynamic FC alterations and learn valid graph representations.
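The first step described in this abstract, segmenting ROI time series into overlapping windows, can be sketched as follows. Only the windowing is shown, with a Pearson correlation matrix computed per window as one plausible per-window FC estimate. The window length, stride, function name `sliding_window_fc`, and random data are assumptions, and the adaptive GCN and temporal modelling of TAGCN are not reproduced.

```python
import numpy as np

def sliding_window_fc(timeseries, window, stride):
    """Split ROI time series into overlapping windows and compute FC per window.

    timeseries: (n_timepoints, n_rois) array
    returns:    (n_windows, n_rois, n_rois) array of Pearson correlation matrices
    """
    timeseries = np.asarray(timeseries, dtype=float)
    n_timepoints = timeseries.shape[0]
    # Window start indices; windows overlap whenever stride < window.
    starts = range(0, n_timepoints - window + 1, stride)
    return np.stack([
        np.corrcoef(timeseries[s:s + window], rowvar=False) for s in starts
    ])

# Hypothetical data: 200 time points for 5 ROIs.
rng = np.random.default_rng(0)
ts = rng.normal(size=(200, 5))
dynamic_fc = sliding_window_fc(ts, window=50, stride=25)
print(dynamic_fc.shape)  # (7, 5, 5)
```

Each window yields one graph per subject, so the sequence of matrices carries the time-varying information that a single static FC matrix averages away.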
Article
Graph Neural Networks (GNNs) are a powerful tool for machine learning on graphs. GNNs combine node feature information with the graph structure by recursively passing neural messages along edges of the input graph. However, incorporating both graph structure and feature information leads to complex models and explaining predictions made by GNNs remains unsolved. Here we propose GnnExplainer, the first general, model-agnostic approach for providing interpretable explanations for predictions of any GNN-based model on any graph-based machine learning task. Given an instance, GnnExplainer identifies a compact subgraph structure and a small subset of node features that have a crucial role in GNN's prediction. Further, GnnExplainer can generate consistent and concise explanations for an entire class of instances. We formulate GnnExplainer as an optimization task that maximizes the mutual information between a GNN's prediction and distribution of possible subgraph structures. Experiments on synthetic and real-world graphs show that our approach can identify important graph structures as well as node features, and outperforms alternative baseline approaches by up to 43.0% in explanation accuracy. GnnExplainer provides a variety of benefits, from the ability to visualize semantically relevant structures to interpretability, to giving insights into errors of faulty GNNs.