Article

Abstract

Treatments for psychiatric disorders are only as effective as the precision with which we administer them. We have treatments that work; we just cannot always accurately predict who they are going to work for and why. In this article, we discuss how big data can help identify robust, reproducible and generalizable predictors of treatment response in psychiatry. Specifically, we focus on how machine-learning approaches can facilitate a move beyond discovery studies and toward model validation. We will highlight some recent exemplary studies in this area, describe how one can assess the merits of studies reporting treatment biomarkers, and discuss what we consider to be best practice for prediction research in psychiatry.


... High comorbidity and unclear diagnostic boundaries impact decisions about classification and treatment at all ages, particularly during childhood [8,9]. This is because children and adolescents with an anxiety disorder often develop other anxiety and/or mood disorders as they mature [10,11]. Hence, the clinical presentation in childhood can represent an early-stage manifestation of persistent mental health difficulties that will unfold across development [12]. ...
... While effective treatments exist for pediatric anxiety disorders, most patients do not achieve lasting remission [10], and longitudinal studies find high relapse rates. In perhaps the most influential pediatric anxiety clinical trial, the recurrence rate was 16.2% at 20 years follow-up, with the rate increasing over time [26]. ...
... Of note, these analyses only utilized clinical variables. Future studies adopting integrative approaches might use pathophysiology data [10]. Fig. 1A illustrates how statistical power (sensitivity) varies across a range of sample sizes at a medium effect size. ...
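The relation between sample size and power mentioned in the excerpt above can be sketched numerically. This is only an illustration, not the cited figure: it assumes a two-sided, two-sample comparison of means, a normal approximation, alpha = 0.05, and Cohen's d = 0.5 as the "medium" effect size. The function names `n_per_group` and `approx_power` are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2.
    This slightly underestimates the exact t-test requirement.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def approx_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of the same test with n participants per group.

    Ignores the negligible probability of rejecting in the wrong tail.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * math.sqrt(n / 2) - z_alpha)
```

Calling `approx_power(0.5, n)` for increasing `n` traces a power curve of the kind the excerpt describes; under these assumptions roughly 63 participants per group are needed for 80% power.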
Article
Full-text available
Purpose of the Review This review describes approaches to research on anxiety that attempt to link neural correlates to treatment response and novel therapies. The review emphasizes pediatric anxiety disorders since most anxiety disorders begin before adulthood. Recent Findings Recent literature illustrates how current treatments for anxiety manifest diverse relations with a range of neural markers. While some studies demonstrate post-treatment normalization of markers in anxious individuals, others find persistence of group differences. For other markers, which show no pretreatment association with anxiety, the markers nevertheless distinguish treatment-responders from non-responders. Heightened error related negativity represents the risk marker discussed in the most depth; however, limitations in measures related to error responding necessitate multimodal and big-data approaches. Summary Single risk markers show limits as correlates of treatment response. Large-scale, multimodal data analyzed with predictive models may illuminate additional risk markers related to anxiety disorder treatment outcomes. Such work may identify novel targets and eventually guide improvements in treatment response/outcomes.
... It is possible that an individual may be diagnosed with more than one disorder (or 2 individuals showing completely different symptoms may be labeled with the same disorder), according to the current Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) manual. Unlike many other medical specializations, psychiatry does not use objective physiological tests in its diagnostic process [4]. Many clinicians, health care providers, and researchers are aware that this diagnostic process needs improvement. ...
... However, there are several challenges, both methodological and statistical, to the development of a model to predict a specific clinical outcome for previously unseen individuals. A group of authors elucidated some of the risks, pitfalls, and recommended techniques to improve model reliability and validity in future research [4,7,[55][56][57][58]. The authors declared that neuroimaging researchers who begin to develop such predictive models are typically unaware of some of the required considerations to accurately assess model performance and avoid inflated predictions (so-called unwarranted optimism) [4,55,56,59]. ...
... The common characteristics of this type of research are as follows: classification accuracy is typically 80% to 90% overall, the sample size tends to be small to modest, and samples are usually gathered from a single site. ...
Article
Background: Machine learning applications in healthcare have become numerous lately, and this work focuses on an important application in psychiatry related to the detection of depression. Since the advent of Computational Psychiatry, valuable research based on fMRI has had phenomenal results, but these tools tend to be simply too expensive for everyday clinical use. Objective: This article focuses on a much more affordable data-driven approach based on electroencephalographic (EEG) recordings. Further online applications via public or private cloud-based platforms would be a logical next step. We aim to compare several different approaches to the detection of depression from EEG recordings utilizing varying features and machine learning models. Methods: We have reviewed published studies that apply machine learning to resting-state EEG to detect depression (detection studies), while also presenting a group of interventional studies that use some form of stimulation in their method and aim to predict therapy outcomes. Results: We have reviewed 14 studies (classified as detection studies) and 12 interventional studies published between 2008 and 2019. Since direct comparison was not possible due to the huge diversity of theoretical approaches and methods used, we compared them regarding the steps in analysis and the accuracies yielded. We also compared possible drawbacks in terms of sample size, feature extraction, feature selection, classification, and internal and external validation, as well as possible unwarranted optimism and reproducibility. In addition, we suggested desirable practices for avoiding misinterpretation of results and optimism. Conclusions: The work concludes with a discussion and review of guidelines to improve the reliability of developed models that may potentially improve diagnostics and offer more accurate treatment of depression in modern psychiatry.
... Patients are grouped in terms of their expected treatment response using diagnostic tests or techniques [2]. However, precision medicine remains a challenge in mental health care because treatments are effective on average, but it is difficult to predict exactly whom they will work for [3,4]. Stepped care principles provide a framework to allocate limited health care resources and have been proven to be cost-effective for depression and anxiety [5,6]. ...
... Techniques from the field of machine learning are aimed at making accurate predictions based on patterns in data. Machine learning can help to identify robust, reproducible, and generalizable predictors of treatment response [3,[9][10][11], and has already been used in health care research, for example, in predicting health care costs and outcomes [12][13][14][15]. By discovering associations and understanding patterns and trends within the data, machine learning has the potential to improve care. ...
Article
Full-text available
Background Predicting which treatment will work for which patient in mental health care remains a challenge. Objective The aim of this multisite study was 2-fold: (1) to predict patients’ response to treatment in Dutch basic mental health care using commonly available data from routine care and (2) to compare the performance of these machine learning models across three different mental health care organizations in the Netherlands by using clinically interpretable models. Methods Using anonymized data sets from three different mental health care organizations in the Netherlands (n=6452), we applied a least absolute shrinkage and selection operator regression 3 times to predict the treatment outcome. The algorithms were internally validated with cross-validation within each site and externally validated on the data from the other sites. Results The performance of the algorithms, measured by the area under the curve of the internal validations as well as the corresponding external validations, ranged from 0.77 to 0.80. Conclusions Machine learning models provide a robust and generalizable approach in automated risk signaling technology to identify cases at risk of poor treatment outcomes. The results of this study hold substantial implications for clinical practice by demonstrating that the performance of a model derived from one site is similar when applied to another site (ie, good external validation).
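The external-validation step described in this abstract (a model fitted at one site, scored by area under the curve at the other sites) can be illustrated with a small sketch. It is not the study's code: the fitted model is stood in for by an arbitrary scoring function, and the names `auc`, `external_auc`, and the site labels are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# One site's data: (feature rows, binary treatment outcomes)
Site = Tuple[List[List[float]], List[int]]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def external_auc(
    model: Callable[[List[float]], float], other_sites: Dict[str, Site]
) -> Dict[str, float]:
    """Score a model that was fitted elsewhere on each unseen site's data."""
    return {
        name: auc([model(row) for row in rows], outcomes)
        for name, (rows, outcomes) in other_sites.items()
    }
```

A model "fitted at site A" could be passed as, for example, `lambda row: 0.8 * row[0] - 0.2 * row[1]` (made-up coefficients), with `other_sites` holding the data from sites B and C. Similar AUCs inside and outside the training site are what the abstract reports as good external validation.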
... Our group focused mainly on data-driven computational psychiatry research (9)(10)(11)(12)(13)(14). We also became aware of so-called unwarranted optimism (15)(16)(17) and reported on it (10,12). The expression 'unwarranted optimism' was coined in the ML community to denote unrealistically inflated model accuracies that arise from unresolved dimensionality of the problem, absent external validation, a disproportionate ratio between the number of variables and the number of subjects in high-dimensional medical datasets, and the existence of unattended blind spots. ...
... Whelan, Garavan, Gillan, and their colleagues explained, in publications before 2017, why computational psychiatry projects are flawed even when they rely on neuroimaging data (16,17), arguing that some basic postulates from information theory and statistical learning theory are ignored despite the wide accessibility of many ML models. The consequence is overly optimistic (and misleading) results that do not lead to clinically useful applications [see also (21)(22)(23)]. ...
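One common route to the inflated accuracies these excerpts describe is selecting features on the full dataset before cross-validating. The sketch below is a toy illustration on pure noise, not taken from any of the cited studies; all names (`make_noise`, `best_rule`, `leaky_loo`, `honest_loo`) are hypothetical, and the single-feature "classifier" is deliberately simplistic.

```python
import random
from typing import List, Tuple

Rule = Tuple[int, bool]  # (feature index, invert the prediction?)


def make_noise(n_subjects: int, n_features: int, seed: int = 0):
    """Random binary features and labels with no true relationship."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_subjects)]
    y = [rng.randint(0, 1) for _ in range(n_subjects)]
    return X, y


def predict(rule: Rule, row: List[int]) -> int:
    j, invert = rule
    return 1 - row[j] if invert else row[j]


def accuracy(rule: Rule, X, y) -> float:
    return sum(predict(rule, r) == t for r, t in zip(X, y)) / len(y)


def best_rule(X, y) -> Rule:
    """Pick the single feature (and polarity) that best matches the labels."""
    candidates = [(j, inv) for j in range(len(X[0])) for inv in (False, True)]
    return max(candidates, key=lambda rule: accuracy(rule, X, y))


def leaky_loo(X, y) -> float:
    """Feature chosen on ALL data, then 'validated' by leave-one-out."""
    rule = best_rule(X, y)  # selection has already seen every held-out case
    return sum(predict(rule, X[i]) == y[i] for i in range(len(y))) / len(y)


def honest_loo(X, y) -> float:
    """Feature re-selected inside each fold, using training cases only."""
    hits = 0
    for i in range(len(y)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += predict(best_rule(X_train, y_train), X[i]) == y[i]
    return hits / len(y)
```

With few subjects and many features, for example `make_noise(20, 100)`, the leaky estimate is typically well above chance even though the data contain no signal, while the honest estimate tends to sit around or below chance. Keeping every data-dependent step inside the cross-validation loop is the usual remedy.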
Article
Full-text available
About the already developed innovations in psychiatry diagnosis and treatment follow-up
... Based on current practice, one can say that when it comes to diagnostics and treatment of depression, psychiatry does not have so much in common with modern medical scientific methods. The absence of objective (evidence-based) biomarkers and reliance solely on DSM/ICD as a tool for classification, and personal/biased impression/experience of the therapist, plus reliance on a self-report from the patient (that might be misleading or omitting important details) yield poor performance of overall treatment of mental disorders [4,5]. ...
... It seems that hippocampus has a critical role in affecting depresotypic neural responses [49] . Of course, we cannot claim that we detected anything below the level of cortex by EEG, but the connections from Cz-Fp1 [4,5,44,63] and generalization unreliable. ...
Preprint
Full-text available
Current diagnostic practice in psychiatry does not rely on objective biophysical evidence. The recent pandemic emphasized the need to address the rising number of mood disorder cases (in particular, depression) more efficiently. We propose several already developed practices that can help improve the diagnostic process: detection based on electrophysiological signals (both electroencephalogram and electrocardiogram based) that were shown to be accurate for clinical practice, and several modalities of electromagnetic stimulation that were proven to ameliorate symptoms of depression. In this work, we connect the two with explanations coming from physiological complexity studies (and our own work) as well as advanced statistical methods such as machine learning and the Bayesian inference approach. It is shown that fractal and nonlinear measures can adequately quantify previously undetected changes in the intrinsic dynamics of physiological systems, providing the basis for early detection of depression. We also advocate early screening for cardiovascular risks in depression, which is connected to the previously described decomplexification of the autonomic nervous system that results in clinically recognized symptoms. All that said, additional information about the level of complexity can help clinicians make better decisions in the therapeutic process, increase the overall effectiveness of the treatment, and ultimately improve the patient's quality of life.
... Many different variable selection approaches have been suggested for treatment selection, all of which attempt to identify which variables, among the putative predictors, contribute significantly to the prediction outcome. Gillan and Whelan [79] presented an outstanding discussion of data-driven versus theory-driven approaches to model specifications. Typical approaches depend on parametric regression models [80] that select only variables with statistically significant contributions to the outcome. ...
... Progress in statistical modeling has led to feature selection methods, which are largely based on machine learning algorithms that can compliantly model and identify predictors, even with higher-order interactions [84]. Gillan and Whelan [79] provided an in-depth discussion of the merits of machine learning in the field of psychiatric disorders. ...
Article
Full-text available
The current polythetic and operational criteria for major depression inevitably contribute to the heterogeneity of depressive syndromes. The heterogeneity of depressive syndrome has been criticized using the concept of language game in Wittgensteinian philosophy. Moreover, “a symptom- or endophenotype-based approach, rather than a diagnosis-based approach, has been proposed” as the “next-generation treatment for mental disorders” by Thomas Insel. Understanding the heterogeneity renders promise for personalized medicine to treat cases of depressive syndrome, in terms of both defining symptom clusters and selecting antidepressants. Machine learning algorithms have emerged as a tool for personalized medicine by handling clinical big data that can be used as predictors for subtype classification and treatment outcome prediction. The large clinical cohort data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D), Combining Medications to Enhance Depression Outcome (CO-MED), and the German Research Network on Depression (GRND) have recently begun to be acknowledged as useful sources for machine learning-based depression research with regard to cost effectiveness and generalizability. In addition, noninvasive biological tools such as functional and resting state magnetic resonance imaging techniques are widely combined with machine learning methods to detect intrinsic endophenotypes of depression. This review highlights recent studies that have used clinical cohort or brain imaging data and have addressed machine learning-based approaches to defining symptom clusters and selecting antidepressants. Potentially applicable suggestions to realize machine learning-based personalized medicine for depressive syndrome are also provided herein.
... It has become increasingly clear that treatment response in psychiatry is highly variable across individuals (Rush et al. 2006). Decades of research investigating potential single-variable markers of treatment response have come up empty-handed, and there is growing consensus that success will likely require complex, multivariate modelling approaches such as machine learning (Gillan & Whelan 2017). Machine learning approaches that rely on self-report data exclusively have shown potential for predicting response to antidepressants in a reanalysis of clinical trial data from over 4,000 patients (Chekroud et al. 2016). ...
... Coupling these sorts of data with rich cognitive and clinical assessment would help further elucidate which elements of treatment (e.g., self-reflection, supporter interaction, behavioral homework, and psychoeducation) and what treatment durations work best for which individual. In sum, there is significant potential for smartphone-based methodologies to assist in a push toward treatment-focused research that translates complex data sets into individualized predictions of real clinical value (Gillan & Whelan 2017). ...
Article
Improvements in understanding the neurobiological basis of mental illness have unfortunately not translated into major advances in treatment. At this point, it is clear that psychiatric disorders are exceedingly complex and that, in order to account for and leverage this complexity, we need to collect longitudinal datasets from much larger and more diverse samples than is practical using traditional methods. We discuss how smartphone-based research methods have the potential to dramatically advance our understanding of the neuroscience of mental health. This, we expect, will take the form of complementing lab-based hard neuroscience research with dense sampling of cognitive tests, clinical questionnaires, passive data from smartphone sensors, and experience-sampling data as people go about their daily lives. Theory- and data-driven approaches can help make sense of these rich data sets, and the combination of computational tools and the big data that smartphones make possible has great potential value for researchers wishing to understand how aspects of brain function give rise to, or emerge from, states of mental health and illness. Expected final online publication date for the Annual Review of Neuroscience, Volume 44 is July 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
... Predicting clinical outcomes or relapses (for example, after incomplete remission in recurrent depression) would be of great significance, especially in clinical psychiatry. A group of authors elucidated risks and pitfalls and recommended techniques to improve model reliability and validity in future research [44,45,46]. These authors described that neuroimaging researchers who start to develop such predictive models are typically unaware of some considerations that are essential to accurately assess model performance and avoid inflated predictions (called 'optimism') [44]. ...
... We must collect more data or establish collaborative projects in which data can be gathered in numbers that are not achievable for a single site; a certain standardization of protocol and a decision to share data can improve the whole endeavor greatly. Examples of such large collaborative projects include RDoC, STAR*D, and IMAGEN. Also, co-recording with fMRI and MEG should be a solution [45]. Another line of research is developing wireless EEG caps (Epoch, ENOBIO Neuroelectrics, iMotions, just to mention some), which can be used for research in the environment without restraining the patient, and even for monitoring recovery from severe episodes. ...
Chapter
In this work, we aimed to compare our findings in a depression detection task with similar methodologies in the present literature. In our project we showed that when an electrophysiological signal (in this case the electroencephalogram, EEG) is characterized by nonlinear measures, any of the seven most popular classifiers yields high accuracy on the task. Following every step in this process, we elaborated on other findings, mainly from analysis of electrical signals or nonlinear analysis, showing what would be optimal for further research. We focused on discussing various possible mistakes and differences that could potentially lead to unwarranted optimism and other misinterpretations of results. We also consider obstacles to this practice being accepted for real-life application in psychiatry and some ideas for how to overcome them. In conclusion, we summarize recommendations for future research so that it can be easily applied in clinical practice.
... The model also needs information on 24 predictors, which may limit its value for clinical practice. Future studies are needed to assess whether the model can be improved by using larger training samples, other types of statistical learning techniques (Chekroud et al., 2016), or other types of data such as neuro-imaging, biomarkers, and molecular genetic data (Gillan and Whelan, 2017). However, the fact that our model exclusively utilizes readily available clinical information also is a strength as this reduces its associated costs and burden to patients. ...
... For instance, low probabilities are generally overrated, whereas high probabilities are underrated (Kahneman and Tversky, 1979). Thus, it should be carefully studied whether the application of these probabilistic decision support tools indeed improves clinical decision making in randomized controlled trials (Gillan and Whelan, 2017). ...
Article
Full-text available
Background Course of illness in major depression (MD) is highly varied, which might lead to both under- and overtreatment if clinicians adhere to a 'one-size-fits-all' approach. Novel opportunities in data mining could lead to prediction models that can assist clinicians in treatment decisions tailored to the individual patient. This study assesses the performance of a previously developed data mining algorithm to predict future episodes of MD based on clinical information in new data. Methods We applied a prediction model utilizing baseline clinical characteristics in subjects who reported lifetime MD to two independent test samples (total n=4,226). We assessed the model's performance to predict future episodes of MD, anxiety disorders, and disability during follow-up (1-9 years after baseline). In addition, we compared its prediction performance with well-known risk factors for a severe course of illness. Results Our model consistently predicted future episodes of MD in both test samples (AUC 0.68-0.73, modest prediction). Equally accurately, it predicted episodes of generalized anxiety disorder, panic disorder and disability (AUC 0.65-0.78). Our model predicted these outcomes more accurately than risk factors for a severe course of illness such as family history of MD and lifetime traumas. Limitations Prediction accuracy might be different for specific subgroups, such as hospitalized patients or patients with a different cultural background. Conclusions Our prediction model consistently predicted a range of adverse outcomes in MD across two independent test samples derived from studies in different subpopulations, countries, using different measurement procedures. This replication study holds promise for application in clinical practice.
... Machine learning can be defined as "a computational strategy that automatically determines ("learns") methods and parameters to reach an optimal solution" (17). Crucially, machine learning techniques take a data-driven approach: algorithms learn from examples without being explicitly programmed, which contrasts with more theory-driven approaches (10). The lack of hypotheses and "preselection" of variables could potentially allow for novel predictive associations that would otherwise go unnoticed. ...
... According to the American Psychiatric Association, a biomarker needs to have 80% accuracy before it has "clinical utility" (48). Gillan & Whelan argue, however, that this threshold eventually comes down to a cost/benefit trade-off: How much do we win and lose when we apply this model (10)? Most studies did not compare the performance of their model to the "performance" of clinicians. ...
Article
Full-text available
Major depressive disorder imposes a substantial disease burden worldwide, ranking as the third leading contributor to global disability. In spite of its ubiquity, classifying and treating depression has proven troublesome. One argument put forward to explain this predicament is the heterogeneity of patients diagnosed with the disorder. Recently, many areas of daily life have witnessed the surge of machine learning techniques, computational approaches to elucidate complex patterns in large datasets, which can be employed to make predictions and detect relevant clusters. Due to the multidimensionality at play in the pathogenesis of depression, it is suggested that machine learning could contribute to improving classification and treatment. In this paper, we investigated literature focusing on the use of machine learning models on datasets with clinical variables of patients diagnosed with depression to predict treatment outcomes or find more homogeneous subgroups. Identified studies are evaluated against best practices in the field. We found 16 studies predicting outcomes (such as remission) and identifying clusters in patients with depression. The identified studies are mostly still in the proof-of-concept phase, with small datasets, a lack of external validation, and only single performance metrics reported. Larger datasets, and models with similar variables present across these datasets, are needed to develop accurate and generalizable models. We hypothesize that harnessing natural language processing to obtain data 'hidden' in clinical texts might prove useful in improving prediction models. In addition, researchers will need to focus on the conditions under which these models can feasibly be implemented to support psychiatrists and patients in their decision-making in practice. Only then can we enter the realm of precision psychiatry.
... Here, we can solely observe 'a syndromic constellation of symptoms that hang together empirically, often for unknown reasons'. 9 This has been demonstrated by Østergaard et al.'s 10 mathematical demonstration of 1497 combinations of depression symptoms. This is not mere theoretical assumption but supported by research in the variability of depression trajectory and treatment variability. ...
Article
Full-text available
Depression is one of the most common and debilitating health problems, however, its heterogeneity makes a diagnosis challenging. Thus far the restriction of depression variables explored within groups, the lack of comparability between groups, and the heterogeneity of depression as a concept limit a meaningful interpretation, especially in terms of predictability. Research established students in late adolescence to be particularly vulnerable, especially those with a natural science or musical study main subject. This study used a predictive design, observing the change in variables between groups as well as predicting which combinations of variables would likely determine depression prevalence. 102 under- and postgraduate students from various higher education institutions participated in an online survey. Students were allocated into three groups according to their main study subject and type of institution: natural science students, music college students and a mix of music and natural science students at university with comparable levels of musical training and professional musical identity. Natural science students showed significantly higher levels of anxiety prevalence and pain catastrophizing prevalence, while music college students showed significantly higher depression prevalence compared to the other groups. A hierarchical regression and a tree analysis found that depression for all groups was best predicted with a combination of variables: high anxiety prevalence and low burnout of students with academic staff. The use of a larger pool of depression variables and the comparison of at-risk groups provide insight into how these groups experience depression and thus allow initial steps towards personalized support structures.
... While historical approaches to psychiatry have often been stuck either focusing on a small subset of factors or falling short of scientific standards of evidence (Ghaemi, 2007), the data-driven approach leverages the availability of data and computation to integrate diverse sources of information in a rigorous way. Unlike traditional statistical methods that focus on detecting group differences, machine learning methods are designed to exploit large volumes of multivariate data to make individual predictions (Bennett et al., 2019;Gillan & Whelan, 2017). This means that clinical decisions, such as treatment recommendations, can be tailored to sophisticated statistical representations of individual patients. ...
Article
Full-text available
The data-driven approach to psychiatric science leverages large volumes of patient data to construct machine learning models with the goal of optimizing clinical decision making. Advocates claim that this methodology is well-placed to deliver transformative improvements to psychiatric science. I argue that talk of a data-driven revolution in psychiatry is premature. Transformative improvements, cashed out in terms of better patient outcomes, cannot be achieved without addressing patient understanding. That is, how patients understand their own mental illnesses. I conceptualize understanding as the possession of adaptive mental constructs through which experience is mediated. I suggest that this notion of understanding serves as a bottleneck which any prospective approach to psychiatry must address to be efficacious. Subsequently I argue that, though the data-driven approach is undoubtedly powerful, it does not have a straightforward means of unblocking the bottleneck of understanding. I suggest that the data-driven approach must be supplemented with significant theoretical progress if it is to transform psychiatry.
... Computational psychiatry uses formal models of brain function to characterize the mechanisms of different psychopathological manifestations by describing them in computational or mathematical terms (20). This facilitates the study and articulation of these data by incorporating knowledge from other sciences such as cognitive science, computational neurosciences, and "machine learning" (20)(21)(22)(23)(24), trying to translate knowledge between different levels of analysis. This review aims to give a comprehensive view of the foundations of computational psychiatry, highlighting its interactions with different approaches like biophysics and evolutionary psychiatry to arrive at precision psychiatry. ...
Article
Full-text available
Computational psychiatry recently established itself as a new tool in the study of mental disorders and problems. Integrating different levels of analysis, creating computational phenotypes with clinical and research value, and constructing a route to precision psychiatry are all part of this new branch. It conceptualizes the brain as a computational organ that receives parameters from the environment and responds to challenges through calculations and algorithms in continuous feedback and feedforward loops with a permanent degree of uncertainty. Through this conception, one can gain an understanding of cerebral and mental processes in the form of theories or hypotheses based on data. Using these approximations, a better understanding of the disorder and its different determinant factors facilitates diagnostics and treatment through an individual, ecologic, and holistic approach. It is a tool that can be used to homologate and integrate multiple sources of information given by several theoretical models. In conclusion, it helps psychiatry achieve precision and reproducibility, which can help the mental health field achieve significant advancement. This article is a narrative review of the basis of the functioning of computational psychiatry with a critical analysis of its concepts.
... Larger datasets and a steady increase in computational power have led to ML techniques being discussed as an option to improve prediction models in psychotherapy (Aafjes-van Doorn et al., 2020; Delgadillo & Lutz, 2020). ML refers to approaches combining statistics and computer science, and these algorithms can be grouped by the similarity of their function (Figure S1; Brownlee, 2019; Gillan & Whelan, 2017; Iniesta et al., 2016). Different ML approaches have been used to predict treatment outcomes, such as elastic net, random forest, k nearest neighbours, ensembles, etc. (Dinga et al., 2018; Lutz et al., 2005; Pearson et al., 2019; Webb et al., 2020). ...
Article
Objective The occurrence of dropout from psychological interventions is associated with poor treatment outcome and high health, societal and economic costs. Recently, machine learning (ML) algorithms have been tested in psychotherapy outcome research. Dropout predictions are usually limited by imbalanced datasets and the size of the sample. This paper aims to improve dropout prediction by comparing ML algorithms, sample sizes and resampling methods. Method Twenty ML algorithms were examined in twelve subsamples (drawn from a sample of N = 49,602) using four resampling methods in comparison to the absence of resampling and to each other. Prediction accuracy was evaluated in an independent holdout dataset using the F1-Measure. Results Resampling methods improved the performance of ML algorithms and down-sampling can be recommended, as it was the fastest method and as accurate as the other methods. For the highest mean F1-Score of .51 a minimum sample size of N = 300 was necessary. No specific algorithm or algorithm group can be recommended. Conclusion Resampling methods could improve the accuracy of predicting dropout in psychological interventions. Down-sampling is recommended as it is the least computationally taxing method. The training sample should contain at least 300 cases.
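The two ingredients named in this abstract, down-sampling and the F1 measure, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function names, the `dropout` label key, and the 0/1 coding are assumptions made for the example. In line with the abstract's use of an independent holdout set, resampling would normally be applied to the training data only.

```python
import random

def down_sample(rows, label_key="dropout", seed=0):
    """Randomly discard majority-class rows until both classes are the same size.

    `rows` is a list of dicts; `label_key` (a hypothetical name) holds 0 or 1.
    """
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    # Sort the two groups by size so the smaller one is the minority class.
    minority, majority = sorted((positives, negatives), key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because F1 ignores true negatives, it is less flattered than raw accuracy when dropout is the rare class, which is presumably why it suits imbalanced data of this kind.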
... A solution to this problem may lie in the development of multivariable models that are informed by data from complementary domains, such as cognitive, (neuro) physiological and molecular data [11,12,13]. Machine learning is one such method that can iteratively and contemporaneously analyse multiple variables and their interaction, aggregating small individual effects into single predictive values [14]. ...
Article
Full-text available
Background Evidence-based treatments for depression exist but not all patients benefit from them. Efforts to develop predictive models that can assist clinicians in allocating treatments are ongoing, but there are major issues with acquiring the volume and breadth of data needed to train these models. We examined the feasibility, tolerability, patient characteristics, and data quality of a novel protocol for internet-based treatment research in psychiatry that may help advance this field. Methods A fully internet-based protocol was used to gather repeated observational data from patient cohorts receiving internet-based cognitive behavioural therapy (iCBT) ( N = 600) or antidepressant medication treatment ( N = 110). At baseline, participants provided > 600 data points of self-report data, spanning socio-demographics, lifestyle, physical health, clinical and other psychological variables and completed 4 cognitive tests. They were followed weekly and completed another detailed clinical and cognitive assessment at week 4. In this paper, we describe our study design, the demographic and clinical characteristics of participants, their treatment adherence, study retention and compliance, the quality of the data gathered, and qualitative feedback from patients on study design and implementation. Results Participant retention was 92% at week 3 and 84% for the final assessment. The relatively short study duration of 4 weeks was sufficient to reveal early treatment effects; there were significant reductions in 11 transdiagnostic psychiatric symptoms assessed, with the largest improvement seen for depression. Most participants (66%) reported being distracted at some point during the study, 11% failed 1 or more attention checks and 3% consumed an intoxicating substance. Data quality was nonetheless high, with near perfect 4-week test retest reliability for self-reported height (ICC = 0.97). 
Conclusions An internet-based methodology can be used efficiently to gather large amounts of detailed patient data during iCBT and antidepressant treatment. Recruitment was rapid, retention was relatively high and data quality was good. This paper provides a template methodology for future internet-based treatment studies, showing that such an approach facilitates data collection at a scale required for machine learning and other data-intensive methods that hope to deliver algorithmic tools that can aid clinical decision-making in psychiatry.
... A solution to this problem may lie in the development of multivariable models that are informed by data from complementary domains, such as cognitive, (neuro)physiological and molecular data (11)(12)(13). Machine learning is one such method that can iteratively and contemporaneously analyse multiple variables and their interaction, aggregating small individual effects into single predictive values (14). ...
Preprint
Full-text available
Background: Evidence-based treatments for depression exist but not all patients benefit from them. Efforts to develop predictive models that can assist clinicians in allocating treatments are ongoing, but there are major issues with acquiring the volume and breadth of data needed to train these models. We developed a protocol for internet-based data collection methods that may help advance this field. Methods: A fully internet-based protocol was used to gather repeated observational data from two transdiagnostic patient cohorts receiving internet-based cognitive behavioural therapy (iCBT) (N = 600) or antidepressant medication treatment (N = 110). At baseline, participants provided >600 data points of self-report data, spanning socio-demographics, lifestyle, physical health, clinical and other psychological variables and completed 4 cognitive tests. They were followed weekly and completed another detailed clinical and cognitive assessment at week 4. In this paper, we describe our study design in detail and evaluate its methodological feasibility. In evaluating, we report on the demographic and clinical characteristics of participants, treatment adherence, study retention and compliance, indicators of data quality, and qualitative feedback from patients on study design and implementation. Results: Participant retention was 92% at week 3 and 84% for the final assessment. The relatively short study duration of 4 weeks was sufficient to reveal early treatment effects; there were significant reductions in 11 transdiagnostic psychiatric symptoms assessed, with the largest improvement seen for depression. Most participants (66%) reported being distracted at some point during the study, 11% failed 1 or more attention checks and 3% consumed an intoxicating substance.
Data quality was nonetheless high, with near perfect 4-week test retest reliability for self-reported height (ICC = 0.97). Conclusions: An internet-based methodology can be used efficiently to gather large amounts of detailed patient data during iCBT and antidepressant treatment. Recruitment was rapid and retention was relatively high. The detailed data we report will help guide the design of future large-scale internet-based studies for predicting treatment outcomes in depression. The adoption of internet-based study designs may improve statistical power in this field and accelerate the search for tools that can aid clinical decision-making.
... This is especially true for the generally under-researched field of forensic psychiatry, where the interplay of psychopathology, offending, and aggression has yet to be comprehensively understood. Consequently, to investigate such phenomena, modern statistical methods such as ML are necessary and already applied in psychiatric research areas regarding pharmaceuticals or neuroimaging [5][6][7][8][9]. The following analysis should serve as an example of the practical use of ML in the field of forensic psychiatry, specifically aggression and schizophrenia spectrum disorders (SSDs). ...
Article
Full-text available
Linear statistical methods may not be suited to the understanding of psychiatric phenomena such as aggression due to their complexity and multifactorial origins. Here, the application of machine learning (ML) algorithms offers the possibility of analyzing a large number of influencing factors and their interactions. This study aimed to explore inpatient aggression in offender patients with schizophrenia spectrum disorders (SSDs) using a suitable ML model on a dataset of 370 patients. With a balanced accuracy of 77.6% and an AUC of 0.87, support vector machines (SVM) outperformed all the other ML algorithms. Negative behavior toward other patients, the breaking of ward rules, the PANSS score at admission as well as poor impulse control and impulsivity emerged as the most predictive variables in distinguishing aggressive from non-aggressive patients. The present study serves as an example of the practical use of ML in forensic psychiatric research regarding the complex interplay between the factors contributing to aggressive behavior in SSD. Through its application, it could be shown that mental illness and the antisocial behavior associated with it outweighed other predictors. The fact that SSD is also highly associated with antisocial behavior emphasizes the importance of early detection and sufficient treatment.
... To the latter, there is increasing awareness that effect sizes in mental health science are generally small, regardless of whether variables are biological (67) or psychosocial (68). Thus, for personalization to occur, studies must move toward integrating multiple variables that have individually low predictive power; such approaches require large samples for accurate model development (69). Absent large datasets, a transdiagnostic and dimensional approach (compared to a categorical one) may do something to resolve both issues; if we can more accurately, validly and reliably capture mental health phenomena and the underlying biosignatures, then the effect sizes we observe will increase (59). ...
Article
Full-text available
Accumulating clinical evidence shows that psychedelic therapy, by synergistically combining psychopharmacology and psychological support, offers a promising transdiagnostic treatment strategy for a range of disorders with restricted and/or maladaptive habitual patterns of emotion, cognition and behavior, notably, depression (MDD), treatment resistant depression (TRD) and addiction disorders, but perhaps also anxiety disorders, obsessive-compulsive disorder (OCD), Post-Traumatic Stress Disorder (PTSD) and eating disorders. Despite the emergent transdiagnostic evidence, the specific clinical dimensions that psychedelics are efficacious for, and associated underlying neurobiological pathways, remain to be well-characterized. To this end, this review focuses on pre-clinical and clinical evidence of the acute and sustained therapeutic potential of psychedelic therapy in the context of a transdiagnostic dimensional systems framework. Focusing on the Research Domain Criteria (RDoC) as a template, we will describe the multimodal mechanisms underlying the transdiagnostic therapeutic effects of psychedelic therapy, traversing molecular, cellular and network levels. These levels will be mapped to the RDoC constructs of negative and positive valence systems, arousal regulation, social processing, cognitive and sensorimotor systems. In summarizing this literature and framing it transdiagnostically, we hope we can assist the field in moving toward a mechanistic understanding of how psychedelics work for patients and eventually toward a precise-personalized psychedelic therapy paradigm.
... Again, nonlinear measures proved to be superior to conventional ones (see also Gottschalk et al., 1995), and the ratings extracted from signals recorded during the whole 4 nights were more accurate than the result of the standard diagnostic procedure performed before sleep (they used ML models to differentiate between BDD and healthy controls). Faurholt-Jepsen et al. (2014) showed that self-reported (labeled "subjective") assessment was more efficient in identifying BDD states, as in using mobile technology (smartphones) or online platforms, such as Mechanical Turk (Gillan and Whelan, 2017). Hence, some electrophysiological recording could significantly improve the chances of BDD mania prediction. ...
Article
Full-text available
Bipolar depression is wrongly treated as unipolar depression for eight years on average. It has been shown that this mismedication affects the occurrence of manic episodes and aggravates the overall condition of bipolar depression patients. Significant effort has been invested in the early detection of depression and in forecasting responses to certain therapeutic approaches using a combination of features extracted from standard and online testing, wearables monitoring, and machine learning. In the case of unipolar depression, this approach yielded evidence that this data-based computational psychiatry approach would be helpful in clinical practice. Following a similar pipeline, we examined the usefulness of this approach to foresee a manic episode in bipolar depression, so that clinicians and the patient’s family can help the patient navigate through the time of crisis. Our projects combined the results from self-reported daily questionnaires, the data obtained from smartwatches, and the data from regular reports from standard psychiatric interviews to feed various machine learning models to predict a crisis in bipolar depression. In contrast to the satisfactory predictions in unipolar depression, we found that bipolar depression, having more complex dynamics, requires a personalized approach. Older work on physiological complexity (complex variability) suggests that the inclusion of electrophysiological data, properly quantified, might lead to better solutions, as shown in other projects of our group concerning unipolar depression. Here we compare previously performed research in a methodological sense, revisiting and additionally interpreting our own results, and show that the methodological approach to mania forecasting may be modified to provide an accurate prediction in bipolar depression.
... Whereas conventional approaches typically characterise group-averaged effects, machine learning models can be trained to provide optimal classification accuracy for the individual patient: for example, determining the likelihood of the patient responding to particular treatments (as for a predictive biomarker) or of having a favourable prognosis irrespective of treatment (for a prognostic biomarker). For neuroimaging to demonstrate practical clinical utility, it is essential that the characteristics of an imaging biomarker are fully described, including details on its sensitivity, specificity, and predictive value (Gillan & Whelan, 2017). ...
Article
Full-text available
Background Suppression of the rostral anterior cingulate cortex (rACC) has shown promise as a prognostic biomarker for depression. We aimed to use machine learning to characterise its ability to predict depression remission. Methods Data were obtained from 81 15- to 25-year-olds with a major depressive disorder who had participated in the YoDA-C trial, in which they had been randomised to receive cognitive behavioural therapy plus either fluoxetine or placebo. Prior to commencing treatment patients performed a functional magnetic resonance imaging (fMRI) task to assess rACC suppression. Support vector machines were trained on the fMRI data using nested cross-validation, and were similarly trained on clinical data. We further tested our fMRI model on data from the YoDA-A trial, in which participants had completed the same fMRI paradigm. Results Thirty-six of 81 (44%) participants in the YoDA-C trial achieved remission. Our fMRI model was able to predict remission status (AUC = 0.777 [95% confidence interval (CI) 0.638–0.916], balanced accuracy = 67%, negative predictive value = 74%, p < 0.0001). Clinical models failed to predict remission status at better than chance levels. Testing the model on the alternative YoDA-A dataset confirmed its ability to predict remission (AUC = 0.776, balanced accuracy = 64%, negative predictive value = 70%, p < 0.0001). Conclusions We confirm that rACC activity acts as a prognostic biomarker for depression. The machine learning model can identify patients who are likely to have difficult-to-treat depression, which might direct the earlier provision of enhanced support and more intensive therapies.
... Reasons for the lack of adoption may be that no single characteristic provides a prediction that is accurate enough to be clinically meaningful or differential prediction of outcomes with alternative treatments (Simon & Perlis, 2010). Since depression is a complex and heterogeneous disorder (Fried, 2017; Wray et al., 2018), multiple features will likely have to be considered in a multivariate model to allow accurate prediction of treatment outcomes (Gillan & Whelan, 2017; Kautzky et al., 2017; Kessler, 2018). ...
Article
Background Multiple treatments are effective for major depressive disorder (MDD), but the outcomes of each treatment vary broadly among individuals. Accurate prediction of outcomes is needed to help select a treatment that is likely to work for a given person. We aim to examine the performance of machine learning methods in delivering replicable predictions of treatment outcomes. Methods Of 7732 non-duplicate records identified through literature search, we retained 59 eligible reports and extracted data on sample, treatment, predictors, machine learning method, and treatment outcome prediction. A minimum sample size of 100 and an adequate validation method were used to identify adequate-quality studies. The effects of study features on prediction accuracy were tested with mixed-effects models. Fifty-four of the studies provided accuracy estimates or other estimates that allowed calculation of balanced accuracy of predicting outcomes of treatment. Results Eight adequate-quality studies reported a mean accuracy of 0.63 [95% confidence interval (CI) 0.56–0.71], which was significantly lower than a mean accuracy of 0.75 (95% CI 0.72–0.78) in the other 46 studies. Among the adequate-quality studies, accuracies were higher when predicting treatment resistance (0.69) and lower when predicting remission (0.60) or response (0.56). The choice of machine learning method, feature selection, and the ratio of features to individuals were not associated with reported accuracy. Conclusions The negative relationship between study quality and prediction accuracy, combined with a lack of independent replication, invites caution when evaluating the potential of machine learning applications for personalizing the treatment of depression.
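Balanced accuracy, the metric pooled in this review, is the mean of sensitivity and specificity. The sketch below shows why it is preferred over raw accuracy when outcome classes are unequal. The function names and the counts in the usage note are invented for illustration and are not taken from any of the reviewed studies.

```python
def accuracy(tp, fn, tn, fp):
    """Share of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

As a hypothetical case, take 30 remitters and 70 non-remitters and a model that labels everyone a non-remitter: `accuracy(0, 30, 70, 0)` is 0.70, while `balanced_accuracy(0, 30, 70, 0)` is 0.50, which is chance level.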
... In order to design a predictive model for mental healthcare resources with a numeric outcome, a possible solution lies in the large amounts of data in electronic health records that are continuously generated and stored within mental healthcare organizations (Gillan & Whelan, 2017; Shatte et al., 2019). The emerging field of machine learning allows the exploitation of large data sets and the modeling of complex underlying non-linear relationships and therefore holds potential to deal with the skewed distribution of healthcare resources (Iniesta et al., 2016). ...
Article
Full-text available
A mental healthcare system in which the scarce resources are equitably and efficiently allocated, benefits from a predictive model about expected service use. The skewness in service use is a challenge for such models. In this study, we applied a machine learning approach to forecast expected service use, as a starting point for agreements between financiers and suppliers of mental healthcare. This study used administrative data from a large mental healthcare organization in the Netherlands. A training set was selected using records from 2017 (N = 10,911), and a test set was selected using records from 2018 (N = 10,201). A baseline model and three random forest models were created from different types of input data to predict (the remainder of) numeric individual treatment hours. A visual analysis was performed on the individual predictions. Patients consumed 62 h of mental healthcare on average in 2018. The model that best predicted service use had a mean error of 21 min at the insurance group level and an average absolute error of 28 h at the patient level. There was a systematic under prediction of service use for high service use patients. The application of machine learning techniques on mental healthcare data is useful for predicting expected service on group level. The results indicate that these models could support financiers and suppliers of healthcare in the planning and allocation of resources. Nevertheless, uncertainty in the prediction of high-cost patients remains a challenge.
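The gap between a small group-level error and a large patient-level error follows from how the two are computed: signed errors cancel when averaged, absolute errors do not. The sketch below illustrates this with four invented patients. The hours are made up for the example and are not data from the study; the function names are likewise assumptions.

```python
def mean_error(actual, predicted):
    """Signed error averaged over patients; over- and under-predictions cancel."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_error(actual, predicted):
    """Average size of the error per patient, ignoring its direction."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical treatment hours: three low-use patients and one high-use patient.
actual_hours = [5, 10, 20, 213]
predicted_hours = [30, 40, 58, 120]

group_error = mean_error(actual_hours, predicted_hours)             # 0.0 hours
patient_error = mean_absolute_error(actual_hours, predicted_hours)  # 46.5 hours
```

In this toy case the group total is predicted exactly, yet the high-use patient is under-predicted by 93 hours, mirroring the pattern of under-prediction for high service users that the abstract describes.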
... This enables patterns in data to be more readily and accurately identified and more accurate predictions to be made from data sources (eg, more accurate diagnosis and prognosis) [17]. ML has been used for prediction in psychiatry [18]. ML methods have been successfully used to predict major depressive disorder persistence, chronicity, severity [19], and treatment response [20]. ...
Article
Full-text available
Background Machine learning (ML) offers vigorous statistical and probabilistic techniques that can successfully predict certain clinical conditions using large volumes of data. A review of ML and big data research analytics in maternal depression is pertinent and timely, given the rapid technological developments in recent years. Objective This study aims to synthesize the literature on ML and big data analytics for maternal mental health, particularly the prediction of postpartum depression (PPD). Methods We used a scoping review methodology using the Arksey and O’Malley framework to rapidly map research activity in ML for predicting PPD. Two independent researchers searched PsycINFO, PubMed, IEEE Xplore, and the ACM Digital Library in September 2020 to identify relevant publications in the past 12 years. Data were extracted from the articles’ ML model, data type, and study results. Results A total of 14 studies were identified. All studies reported the use of supervised learning techniques to predict PPD. Support vector machine and random forest were the most commonly used algorithms in addition to Naive Bayes, regression, artificial neural network, decision trees, and XGBoost (Extreme Gradient Boosting). There was considerable heterogeneity in the best-performing ML algorithm across the selected studies. The area under the receiver operating characteristic curve values reported for different algorithms were support vector machine (range 0.78-0.86), random forest method (0.88), XGBoost (0.80), and logistic regression (0.93). Conclusions ML algorithms can analyze larger data sets and perform more advanced computations, which can significantly improve the detection of PPD at an early stage. Further clinical research collaborations are required to fine-tune ML algorithms for prediction and treatment. ML might become part of evidence-based practice in addition to clinical knowledge and existing research evidence.
... Because of the coordinated actions between key prefrontal and subcortical regions in mediating anxiety, effective treatments likely depend on the relative integrity of the anatomical and functional connections between nodes of the anxiety circuit. This is consistent with studies demonstrating that successful treatments are linked to functional connectivity patterns that are more similar to healthy controls [274][275][276]. From an anatomical perspective, the integrity of various white matter tracts, particularly the uncinate fasciculus, which contains the axons that convey signals between the PFC and the medial temporal lobe [277,278], has been linked to pathological anxiety [279,280]. ...
Article
Full-text available
Anxiety is experienced in response to threats that are distal or uncertain, involving changes in one’s subjective state, autonomic responses, and behavior. Defensive and physiologic responses to threats that involve the amygdala and brainstem are conserved across species. While anxiety responses typically serve an adaptive purpose, when excessive, unregulated, and generalized, they can become maladaptive, leading to distress and avoidance of potentially threatening situations. In primates, anxiety can be regulated by the prefrontal cortex (PFC), which has expanded in evolution. This prefrontal expansion is thought to underlie primates’ increased capacity to engage high-level regulatory strategies aimed at coping with and modifying the experience of anxiety. The specialized primate lateral, medial, and orbital PFC sectors are connected with association and limbic cortices, the latter of which are connected with the amygdala and brainstem autonomic structures that underlie emotional and physiological arousal. PFC pathways that interface with distinct inhibitory systems within the cortex, the amygdala, or the thalamus can regulate responses by modulating neuronal output. Within the PFC, pathways connecting cortical regions are poised to reduce noise and enhance signals for cognitive operations that regulate anxiety processing and autonomic drive. Specialized PFC pathways to the inhibitory thalamic reticular nucleus suggest a mechanism to allow passage of relevant signals from thalamus to cortex, and in the amygdala to modulate the output to autonomic structures. Disruption of specific nodes within the PFC that interface with inhibitory systems can affect the negative bias, failure to regulate autonomic arousal, and avoidance that characterize anxiety disorders.
... Keywords: EEG, EEG biomarker, cognition, gamification, mobile EEG. Introduction: Recent advances in digital technologies provide a wealth of opportunity in the management of health conditions. In neurological disease the heterogeneity and complexity of conditions, along with continuing reliance on traditional subjective measurement tools, have presented a challenge for the development of data-driven biomarkers for diagnosis, monitoring and prediction of therapeutic response (1)(2)(3)(4)(5)(6)(7). The suite of tools described in this paper was designed to enable longitudinal, in-home data collection of brain electrophysiology and cognitive performance. ...
Article
Full-text available
Access to affordable, objective and scalable biomarkers of brain function is needed to transform the healthcare burden of neuropsychiatric and neurodegenerative disease. Electroencephalography (EEG) recordings, both resting and in combination with targeted cognitive tasks, have demonstrated utility in tracking disease state and therapy response in a range of conditions from schizophrenia to Alzheimer's disease. But conventional methods of recording this data involve burdensome clinic visits, and behavioural tasks that are not effective in frequent repeated use. This paper aims to evaluate the technical and human-factors feasibility of gathering large-scale EEG using novel technology in the home environment with healthy adult users. In a large field study, 89 healthy adults aged 40–79 years volunteered to use the system at home for 12 weeks, 5 times/week, for 30 min/session. A 16-channel, dry-sensor, portable wireless headset recorded EEG while users played gamified cognitive and passive tasks through a tablet application, including tests of decision making, executive function and memory. Data was uploaded to cloud servers and remotely monitored via web-based dashboards. Seventy-eight participants completed the study, and high levels of adherence were maintained throughout across all age groups, with mean compliance over the 12-week period of 82% (4.1 sessions per week). Reported ease of use was also high with mean System Usability Scale scores of 78.7. Behavioural response measures (reaction time and accuracy) and EEG components elicited by gamified stimuli (P300, ERN, Pe and changes in power spectral density) were extracted from the data collected in home, across a wide range of ages, including older adult participants. Findings replicated well-known patterns of age-related change and demonstrated the feasibility of using low-burden, large-scale, longitudinal EEG measurement in community-based cohorts. 
This technology enables clinically relevant data to be recorded outside the lab/clinic, from which metrics underlying cognitive ageing could be extracted, opening the door to potential new ways of developing digital cognitive biomarkers for disorders affecting the brain.
... ML is also used "on its own" in CP and applied directly to measured data, e.g., for producing patient-specific predictions or discovering structure in heterogeneous populations (222)(223)(224)(226)(227)(228). In this paper, however, we focus on approaches where ML operates on estimates provided by generative models. ...
Article
Full-text available
Psychiatry faces fundamental challenges with regard to mechanistically guided differential diagnosis, as well as prediction of clinical trajectories and treatment response of individual patients. This has motivated the genesis of two closely intertwined fields: (i) Translational Neuromodeling (TN), which develops “computational assays” for inferring patient-specific disease processes from neuroimaging, electrophysiological, and behavioral data; and (ii) Computational Psychiatry (CP), with the goal of incorporating computational assays into clinical decision making in everyday practice. In order to serve as objective and reliable tools for clinical routine, computational assays require end-to-end pipelines from raw data (input) to clinically useful information (output). While these are yet to be established in clinical practice, individual components of this general end-to-end pipeline are being developed and made openly available for community use. In this paper, we present the Translational Algorithms for Psychiatry-Advancing Science (TAPAS) software package, an open-source collection of building blocks for computational assays in psychiatry. Collectively, the tools in TAPAS presently cover several important aspects of the desired end-to-end pipeline, including: (i) tailored experimental designs and optimization of measurement strategy prior to data acquisition, (ii) quality control during data acquisition, and (iii) artifact correction, statistical inference, and clinical application after data acquisition. Here, we review the different tools within TAPAS and illustrate how these may help provide a deeper understanding of neural and cognitive mechanisms of disease, with the ultimate goal of establishing automatized pipelines for predictions about individual patients. 
We hope that the openly available tools in TAPAS will contribute to the further development of TN/CP and facilitate the translation of advances in computational neuroscience into clinically relevant computational assays.
... In recent explications of the goals of clinical prediction, several authors have reasoned that a single variable will rarely capture enough of the relevant variance to support clinical decision-making in mental health contexts (Gillan & Whelan, 2017; Kessler, 2018; Simon & Perlis, 2010). When multiple relatively independent patient characteristics each account for some of the variance in outcome, an integration of the relevant factors, rather than a focus on each of them one at a time, will yield the most powerful predictions. ...
Article
PTSD treatment guidelines recommend several treatments with extensive empirical support, including Prolonged Exposure (PE), a trauma-focused treatment, and Present-Centered Therapy (PCT), a non-trauma-focused therapy. Research to inform treatment selection has yielded inconsistent findings with single prognostic variables that are difficult to integrate into clinical decision-making. We examined whether a combination of prognostic factors can predict different benefits in a trauma-focused vs. a non-trauma-focused psychotherapy. We applied a multi-method variable selection procedure and developed a prognostic index (PI) with a sample of 267 female veterans and active-duty service members (mean age 45; SD = 9.37; 53% White) with current PTSD who began treatment in a randomized clinical trial comparing PE and PCT. We conducted linear regressions predicting outcomes (Clinician-Administered PTSD Scale score) with treatment condition, the PI, and the interaction between the PI and treatment condition. The PI moderated treatment response: the interaction between treatment type and PI predicted post-treatment symptom severity, b = 0.30, SEb = 0.15 [95% CI: 0.01, 0.60], p = .049. For the 64% of participants with the best prognoses, PE resulted in better post-treatment outcomes; for the remainder, there was no difference. Use of a PI may lead to optimized patient outcomes and greater confidence when selecting trauma-focused treatments.
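As a rough sketch of the moderation analysis described in this abstract, the regression with a treatment-by-PI interaction can be written as follows. The symbols and the coding of the treatment indicator are illustrative assumptions, not taken from the paper.

```latex
\[
  \text{CAPS}_{\text{post},i}
    = \beta_0
    + \beta_1 T_i
    + \beta_2 \text{PI}_i
    + \beta_3 \,(T_i \times \text{PI}_i)
    + \varepsilon_i
\]
```

Here $T_i$ is a treatment indicator for patient $i$ (PE vs. PCT; the coding is assumed), $\text{PI}_i$ is the prognostic index, and $\beta_3$ is the interaction coefficient. A nonzero $\beta_3$ means that the difference between treatments depends on the PI, which is the sense in which the PI moderates treatment response.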
... It is noteworthy that while several machine learning algorithms have been applied in psychotherapy research to classify participants into dichotomous outcome variables (e.g., responders versus non-responders; Gillan & Whelan, 2017) or identify the optimal treatment for individual patients (Cohen & DeRubeis, 2018), fewer efforts have been made to predict continuous outcome variables (e.g., predicting process-outcome associations; e.g., Rubel et al., 2020). The present study represents a step forward in that direction, showing the feasibility of developing algorithms for the prediction of process-outcome associations that might meaningfully inform clinical practice. ...
Article
Objective: We aimed to develop and test an algorithm for individual patient predictions of problem coping experiences (PCE) (i.e., patients' understanding and ability to deal with their problems) effects in cognitive-behavioral therapy. Method: In an outpatient sample with a variety of diagnoses (n = 1010), we conducted Dynamic Structural Equation Modelling to estimate within-patient cross-lagged PCE effects on outcome during the first ten sessions. In a randomly selected training sample (2/3 of the cases), we tried different machine learning algorithms (i.e., ridge regression, LASSO, elastic net, and random forest) to predict PCE effects (i.e., the degree to which PCE was a time-lagged predictor of symptoms), using baseline demographic, diagnostic, and clinically relevant patient features. Then, we validated the best algorithm on a test sample (1/3 of the cases). Results: The random forest algorithm performed best, explaining 14.7% of PCE effects variance in the training set. The results remained stable in the test set, explaining 15.4% of PCE effects variance. Conclusions: The results show that it is feasible to make individual predictions of process effects based on patients' initial information. If the results are replicated, the algorithm might have the potential to be implemented in clinical practice by integrating it into monitoring and therapist feedback systems.
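The hold-out design described in this abstract (fit on a random 2/3 of cases, then report variance explained on the remaining 1/3) can be sketched as below. This is a minimal illustration under stated assumptions: the data are synthetic, a one-predictor least-squares line stands in for the ridge, LASSO, elastic net, and random forest models the authors compared, and all function names are hypothetical.

```python
import random


def train_test_split(rows, train_fraction=2 / 3, seed=0):
    """Randomly split rows into a training list and a test list."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(rows, intercept, slope):
    """Share of outcome variance in rows explained by the fitted line."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1 - ss_res / ss_tot


# Synthetic stand-in data: one baseline feature weakly related to the outcome.
rng = random.Random(42)
data = []
for _ in range(900):
    x = rng.gauss(0, 1)
    data.append((x, 0.4 * x + rng.gauss(0, 1)))

train, test = train_test_split(data)
intercept, slope = fit_line(train)          # model sees training cases only
r2_train = r_squared(train, intercept, slope)
r2_test = r_squared(test, intercept, slope)  # out-of-sample variance explained
print(f"R^2 train = {r2_train:.3f}, R^2 test = {r2_test:.3f}")
```

The point of the design is that the test-set figure is computed on cases the model never saw, so a similar value in both sets (as the abstract reports) argues against overfitting.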
... A solution to this problem may lie in the development of multivariable models that are informed by data from complementary domains, such as cognitive, (neuro)physiological and molecular data (11-13). Machine learning is one such method that can iteratively and contemporaneously analyse multiple variables and their interaction, aggregating small individual effects into single predictive values (14). ...
... In the subfield of forensic psychiatry, its application is still relatively rare. However, in this area, where basic knowledge is somewhat scarce but extensive datasets may already exist, ML represents a promising opportunity to gain new insights, for example regarding the characteristics leading to escape or absconding (for more information on ML see [47-50]). ...
Article
Background: Escape and absconding, especially in forensic settings, can have serious consequences for patients, staff and institutions. Several characteristics of affected patients could be identified so far, albeit based on heterogeneous patient populations, a limited number of possible factors and basic statistical analyses. The aim of this study was to determine the most important characteristics among a large number of possible variables and to describe the best statistical model using machine learning in a homogeneous group of offender patients with schizophrenia spectrum disorder. Methods: A database of 370 offender patients suffering from schizophrenia spectrum disorder and 507 possible predictor variables was explored by machine learning. To counteract overfitting, the database was divided into a training set and a validation set, and a nested validation procedure was used on the training set. The best model was tested on the validation set and the most important variables were extracted. Results: The final model resulted in a balanced accuracy of 71.1% (95% CI = [58.5, 83.1]) and an AUC of 0.75 (95% CI = [0.63, 0.87]). The variables identified as relevant and related to absconding/escape, listed from most important to least important, were: more frequent forbidden intake of drugs during current hospitalization, more index offences, higher neuroleptic medication, more frequent rule breaking behavior during current hospitalization, higher PANSS Score at discharge, lower age at admission, more frequent dissocial behavior during current hospitalization, shorter time spent in current hospitalization and higher PANSS Score at admission. Conclusions: For the first time a detailed statistical model could be built for this topic. The results indicate the presence of a particularly problematic subgroup within the group of offenders with schizophrenic spectrum disorder who also tend to escape or abscond. Early identification and tailored treatment of these patients could be of clinical benefit.
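For readers unfamiliar with the two metrics reported in this abstract, the sketch below shows how balanced accuracy and AUC are defined. It is illustrative only: the labels and scores are invented, the function names are hypothetical, and nothing here reproduces the authors' pipeline.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    sensitivity = tp / positives
    specificity = tn / negatives
    return (sensitivity + specificity) / 2


def auc(y_true, scores):
    """Probability that a random positive case outscores a random negative case.

    Ties count as half a win (the rank-based definition of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 3 events among 10 cases, with model scores in [0, 1].
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

bal_acc = balanced_accuracy(y_true, y_pred)
roc_auc = auc(y_true, scores)
print(f"balanced accuracy = {bal_acc:.3f}, AUC = {roc_auc:.3f}")
```

Balanced accuracy is preferred over plain accuracy when the event is rare, because a model that always predicts "no event" would score well on plain accuracy but only 50% on balanced accuracy.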
... To address this gap, we conducted an experiment with 220 clinical-care providers to assess the impact of ML treatment recommendations on clinician treatment selection. The possibility of improving treatment outcomes in major depressive disorder (MDD) using ML has received increased attention in recent years [14-17]. Identifying optimal treatment in this context is particularly challenging because of heterogeneous symptoms, tolerability concerns, and the prevalence of treatment-resistant depression, which can result in clinicians and patients using trial and error to find an effective treatment [18,19]. ...
Article
Decision support systems embodying machine learning models offer the promise of an improved standard of care for major depressive disorder, but little is known about how clinicians’ treatment decisions will be influenced by machine learning recommendations and explanations. We used a within-subject factorial experiment to present 220 clinicians with patient vignettes, each with or without a machine-learning (ML) recommendation and one of the multiple forms of explanation. We found that interacting with ML recommendations did not significantly improve clinicians’ treatment selection accuracy, assessed as concordance with expert psychopharmacologist consensus, compared to baseline scenarios in which clinicians made treatment decisions independently. Interacting with incorrect recommendations paired with explanations that included limited but easily interpretable information did lead to a significant reduction in treatment selection accuracy compared to baseline questions. These results suggest that incorrect ML recommendations may adversely impact clinician treatment selections and that explanations are insufficient for addressing overreliance on imperfect ML algorithms. More generally, our findings challenge the common assumption that clinicians interacting with ML tools will perform better than either clinicians or ML algorithms individually.
... Nonetheless, the implications of this finding are broader than those of this specific study. At a time when extraordinary amounts of information through digital sources are available to clinicians, a major undetermined question is whether access to this information may actually improve care [13]. Our findings provide an early signal suggesting that using collateral data from more diverse sources may have a positive impact on clinical care. ...
Article
Background: The review of collateral information is an essential component of patient care. Although this is standard practice, minimal research has been done to quantify collateral information collection and to understand how collateral information translates to clinical decision making. To address this, we developed and piloted a novel measure (the McLean Collateral Information and Clinical Actionability Scale [M-CICAS]) to evaluate the types and number of collateral sources viewed and the resulting actions made in a psychiatric setting. Objective: This study aims to test the feasibility of the M-CICAS, validate this measure against clinician notes via medical records, and evaluate whether reviewing a higher volume of collateral sources is associated with more clinical actions taken. Methods: For the M-CICAS, we developed a three-part instrument, focusing on measuring collateral sources reviewed, clinical actions taken, and shared decision making between the clinician and patient. To determine feasibility and preliminary validity, we piloted this measure among clinicians providing psychotherapy at McLean Hospital. These clinicians (n=7) completed the M-CICAS after individual clinical sessions with 89 distinct patient encounters. Scales were completed by clinicians only once for each patient during routine follow-up visits. After clinicians completed these scales, researchers conducted chart reviews by completing the M-CICAS using only the clinician’s corresponding note from that session. For the analyses, we generated summary scores for the number of collateral sources and clinical actions for each encounter. We examined Pearson correlation coefficients to assess interrater reliability between clinicians and chart reviewers, and simple univariate regression modeling followed by multilevel mixed effects regression modeling to test the relationship between collateral information accessed and clinical actions taken. Results: The study staff had high interrater reliability on the M-CICAS for the sources reviewed (r=0.98; P
... big data in psychiatry is another utility of AI-based services [5-10]. In psychiatry, the use of deep learning has also reduced the need for large volumes of data otherwise required for disease prediction [11]. Despite the potential opportunities in using AI-enabled telepsychiatry services, several challenges have been reported. ...
Article
Background: Published literature describes the overall challenges associated with artificial intelligence (AI)-enabled medicine and telepsychiatry mostly from the western perspective, with no specific mention of the perspective of individual stakeholders or Indians. This study was conceptualized to understand the perceived challenges of building, deploying, and using AI-enabled telepsychiatry for clinical practice from the perspectives of psychiatrists, patients, and the technology experts (who build such services) in urban India. Methods: Between February 2020 and April 2020, a semistructured topic guide was drafted for a qualitative exploratory study among psychiatrists (n = 14), their patients (n = 14), technology experts (n = 13), and Chief Executive Officers (CEOs) (n = 5) of health technology incubation centers. Interviews were conducted over the phone, recorded, and analyzed using the grounded theory approach. Results: Almost all respondents cited ethical, legal, accountability, and regulatory implications as challenges. The major issues stated by patients were privacy/confidentiality, ethical violations, security/hacking, and data ownership. Psychiatrists cited lack of clinical validation, lack of established studies or trials, iatrogenic risk, and healthcare infrastructure issues as the main challenges. Technology experts stated data-related issues as the major challenge. The CEOs quoted the lack of interdisciplinary experts as one of the main challenges in building deployable AI-enabled telepsychiatry in India. Conclusions: There are challenges to deploying an AI-enabled telepsychiatry platform in India. There is a need to constitute an interdisciplinary team to systematically address these challenges. Deployment of AI-enabled telepsychiatry is not possible without clinical validation and addressing current challenges.
... To assess overfitting, we included an independent dataset to test final models. Overfitting could be a main culprit in overlooking the possible presence of false positives in prior work (57). A third strength is our use of multiple psychiatric symptoms to attempt to identify e-nose metrics as novel biomarkers in assessing and predicting mental illness. ...
Article
Non-intrusive, easy-to-use and pragmatic collection of biological processes is warranted to evaluate potential biomarkers of psychiatric symptoms. Prior work with relatively modest sample sizes suggests that under highly controlled sampling conditions, volatile organic compounds extracted from the human breath (exhalome), often measured by an electronic nose (“e-nose”), may be related to physical and mental health. The present study utilized a streamlined data collection approach and attempted to replicate and extend prior e-nose links to mental health in a standard research setting within a large transdiagnostic community dataset (N = 1207; 746 females; 18–61 years) of participants who completed a screening visit at the Laureate Institute for Brain Research between 07/2016 and 05/2018. Factor analysis was used to obtain latent exhalome variables, and machine learning approaches were employed using these latent variables to predict three types of symptoms independent of each other (depression, anxiety, and substance use disorder) within separate training and test sets. After adjusting for age, gender, body mass index, and smoking status, the best fitting algorithm produced by the training set accounted for nearly 0% of the test set’s variance. In each case the standard error included the zero line, indicating that models were not predictive of clinical symptoms. Although some sample variance was predicted, findings did not generalize to out-of-sample data. Based on these findings, we conclude that the exhalome, as measured by the e-nose within a less-controlled environment than previously reported, is not able to provide clinically useful assessments of current depression, anxiety or substance use severity.
... However, to determine the generalizability of PAI models in actual clinical practice, the predictive accuracy of these models needs to be externally validated (Bleeker et al., 2003;Gillan & Whelan, 2017). External validation is considered a second phase in multivariable prognostic research, following model development and preceding impact studies (Moons et al., 2009). ...
Article
Objective: Optimizing treatment selection may improve treatment outcomes in depression. A promising approach is the Personalized Advantage Index (PAI), which predicts the optimal treatment for a given individual. To determine the generalizability of the PAI, models need to be externally validated, which has rarely been done. Method: PAI models were developed within each of two independent trials, with substantial between-study differences, that both compared CBT and IPT for depression (STEPd: n = 151 and FreqMech: n = 200). Subsequently, both PAI models were tested in the other dataset. Results: In the STEPd study, post-treatment depression was significantly different between individuals assigned to their PAI-indicated treatment versus those assigned to their non-indicated treatment (d = .57). In the FreqMech study, post-treatment depression was not significantly different between patients receiving their indicated treatment versus those receiving their non-indicated treatment (d = .20). Cross-trial predictions indicated that post-treatment depression was not significantly different between those receiving their indicated treatment and those receiving their non-indicated treatment (d = .16 and d = .27). Sensitivity analyses indicated that cross-trial prediction based on only overlapping variables didn't improve the results. Conclusion: External validation of the PAI has modest results and emphasizes between-study differences and many other challenges.
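The logic of a PAI evaluation, as summarized in this abstract, can be sketched roughly as follows. The sketch assumes that lower post-treatment scores are better, that per-treatment outcome predictions already exist for each patient (in practice they come from a fitted model), and that the comparison is summarized with Cohen's d. All numbers and function names are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def personalized_advantage(pred_cbt, pred_ipt):
    """Return (PAI, indicated treatment) from two predicted severity scores.

    The PAI is the difference between the two predictions; the indicated
    treatment is the one with the lower predicted post-treatment severity.
    """
    pai = pred_cbt - pred_ipt
    return pai, ("IPT" if pai > 0 else "CBT")


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical patients: (treatment received, predicted score under CBT,
# predicted score under IPT, observed post-treatment score).
patients = [
    ("CBT", 11.0, 14.0, 10.0),
    ("IPT", 16.0, 12.0, 12.0),
    ("CBT", 15.0, 13.0, 17.0),
    ("IPT", 10.0, 14.0, 15.0),
    ("CBT", 9.0, 12.0, 14.0),
    ("IPT", 13.0, 17.0, 13.0),
]

indicated, non_indicated = [], []
for received, pred_cbt, pred_ipt, observed in patients:
    _, best = personalized_advantage(pred_cbt, pred_ipt)
    (indicated if received == best else non_indicated).append(observed)

d = cohens_d(non_indicated, indicated)  # positive d favors the indicated treatment
print(f"d = {d:.2f}")
```

External validation, the focus of the abstract, corresponds to producing the predictions with a model developed in one trial and running this comparison on patients from the other trial.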
... Psychiatry. Machine learning is known to help prediction research in psychiatry, by identifying robust, reproducible and generalizable predictors of treatment response in psychiatry (Gillan & Whelan, 2017). In their scoping review of ML applications in mental health, Shatte et al. (2019) highlighted a range of benefits across the areas of diagnosis, treatment and support, research, and clinical administration. ...
Article
Machine learning (ML) offers robust statistical and probabilistic techniques that can help to make sense of large amounts of data. This scoping review paper aims to broadly explore the nature of research activity using ML in the context of psychological talk therapies, highlighting the scope of current methods and considerations for clinical practice and directions for future research. Using a systematic search methodology, fifty-one studies were identified. A narrative synthesis indicates two types of studies: those that developed and tested an ML model (k=44), and those that reported on the feasibility of a particular treatment tool that uses an ML algorithm (k=7). Most model development studies used supervised learning techniques to classify or predict labeled treatment process or outcome data, whereas others used unsupervised techniques to identify clusters in the unlabeled patient or treatment data. Overall, the current applications of ML in psychotherapy research demonstrated a range of possible benefits for indications of treatment process, adherence, therapist skills and treatment response prediction, as well as ways to accelerate research through automated behavioral or linguistic process coding. Given the novelty and potential of this research field, these proof-of-concept studies are encouraging, but they do not necessarily translate to improved clinical practice (yet).
... There are various techniques to avoid or minimize overfitting, such as data splitting, cross-validation, regularization, or reduction of predictors. Nevertheless, the generalizability of ML results from a data set should be treated with caution and may need further confirmation by new data and perhaps more conservative statistical approaches [27-29] ...
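Of the techniques listed in this excerpt, cross-validation is the easiest to show in a few lines. The sketch below only generates the fold indices; the function name is hypothetical, and in practice the cases would usually be shuffled first and a model would be fit on each training part and scored on the matching test part.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Each case lands in exactly one test fold; fold sizes differ by at most one.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Because every case is held out once, the averaged test performance uses all of the data while never scoring a case with a model that was trained on it.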
Article
Purpose: There is a lack of research on predictors of criminal recidivism of offender patients diagnosed with schizophrenia. Methods: 653 potential predictor variables were analyzed in a set of 344 offender patients with a diagnosis of schizophrenia (209 reconvicted) using machine learning algorithms. As a novel methodological approach, null hypothesis significance testing (NHST), backward selection, logistic regression, trees, support vector machines (SVM), and naive Bayes were used for preselecting variables. Subsequently, the variables identified as most influential were used for machine learning algorithm building and evaluation. Results: The two final models (with/without imputation) predicted criminal recidivism with an accuracy of 81.7% and 70.6% and a predictive power (area under the curve, AUC) of 0.89 and 0.76 based on the following predictors: prescription of amisulpride prior to reoffending, suspended sentencing to imprisonment, legal complaints filed by relatives/therapists/public authorities, recent legal issues, number of offences leading to forensic treatment, anxiety upon discharge, being single, violence toward care team and constant breaking of rules during treatment, illegal opioid use, Middle East as place of birth, and time span since the last psychiatric inpatient treatment. Conclusion: Results provide new insight on possible factors influencing persistent offending in a specific subgroup of patients with a schizophrenic spectrum disorder.
... To deal with a potential case in which classes are not separable, SVMs introduce a regularization constant C that penalizes samples that cannot be separated. Formally, learning SVM for a two-class problem can be represented as the following optimization problem [46]: ...
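The excerpt elides the formula itself. For orientation only, the standard textbook soft-margin primal problem is shown below; the notation is an assumption and may differ from the form given in the cited source [46].

```latex
\[
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  & \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad
  & y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \quad i = 1, \dots, n
\end{aligned}
\]
```

Here $\mathbf{x}_i$ are the feature vectors, $y_i \in \{-1, +1\}$ the class labels, $\mathbf{w}$ and $b$ define the separating hyperplane, and the slack variables $\xi_i$ measure how far each sample violates the margin. A larger $C$ penalizes violations more heavily, and a smaller $C$ tolerates more of them in exchange for a wider margin.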
Preprint
After comparing the performance of seven different machine learning models on depression detection tasks to show that the choice of features is essential, we compare our methods and results with the published work of other researchers. Finally, we summarize optimal practices so that this useful classification solution can be translated to clinical practice with high accuracy and better acceptance.
Article
Nonlinear EEG analysis offers the potential for both increased diagnostic accuracy and deeper mechanistic understanding of psychopathology. EEG complexity measures have previously been shown to positively correlate with clinical depression. In this study, resting state EEG recordings were taken across multiple sessions and days with both eyes open and eyes closed conditions from a total of 306 subjects, 62 of whom were in a current depressive episode, and 81 of whom had a history of diagnosed depression but were not currently depressed. Three different EEG montages (mastoids, average, and Laplacian) were also computed. Higuchi fractal dimension (HFD) and sample entropy (SampEn) were calculated for each unique condition. The complexity metrics showed high internal consistency within session and high stability across days. Higher complexity was found in open-eye recordings compared to closed eyes. The predicted correlation between complexity and depression was not found. However, an unexpected sex effect was observed, in which males and females exhibited different topographic patterns of complexity.
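To give a sense of what one of these complexity measures computes, here is a minimal sketch of sample entropy on a one-dimensional series. It assumes the common definition (Chebyshev distance, self-matches excluded, tolerance applied as "less than or equal"); the abstract does not state the parameters used, so `m`, `r`, and the toy signals below are illustrative assumptions.

```python
import random
from math import log


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy: -ln(A / B).

    B counts pairs of length-m templates within tolerance r of each other,
    A counts the same pairs extended to length m + 1. The tolerance r is in
    the signal's own units here; it is often set to 0.2 times the signal SD.
    """
    n = len(signal)

    def count_matches(length):
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):  # j > i, so self-matches are excluded
                distance = max(abs(signal[i + k] - signal[j + k]) for k in range(length))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -log(a / b)


rng = random.Random(0)
regular = [0.0, 1.0] * 25                   # perfectly periodic toy signal
noisy = [rng.random() for _ in range(50)]   # uniform noise in [0, 1)

sampen_regular = sample_entropy(regular)
sampen_noisy = sample_entropy(noisy)
print(sampen_regular, sampen_noisy)
```

A perfectly predictable series yields a value of zero because every matching template pair keeps matching when extended by one sample, while an irregular series loses matches on extension and therefore scores higher.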
Article
As modern medicine becomes increasingly personalized, psychiatry lags behind, using poorly-understood drugs and therapies to treat mental disorders. With the advent of methods that capture large quantities of data, such as genome-wide analyses or fMRI, machine learning (ML) approaches have become prominent in neuroscience. This is promising for studying the brain's function, but perhaps more importantly, these techniques can potentially predict the onset of disorder and treatment response. Experimental approaches that use naive machine learning algorithms have dominated research in computational psychiatry over the past decade. In a critical review and analysis, I argue that biologically realistic approaches will be more effective in clinical practice, and research trends should reflect this. Hybrid models are considered, and a brief case study on major depressive disorder is presented. Finally, I propose a novel four-step approach for the future implementation of computational methods in psychiatric clinics.
Article
There is strong interest in developing a more efficient mental health care system. Digital interventions and predictive models of treatment prognosis will likely play an important role in this endeavor. This article reviews the application of popular machine learning models to the prediction of treatment prognosis, with a particular focus on digital interventions. Assuming that the prediction of treatment prognosis will involve modeling a complex combination of interacting features with measurement error in both the predictors and outcomes, our simulations suggest that to optimize complex prediction models, sample sizes in the thousands will be required. Machine learning methods capable of discovering complex interactions and nonlinear effects (e.g., decision tree ensembles such as gradient boosted machines) perform particularly well in large samples when the predictors and outcomes have virtually no measurement error. However, in the presence of moderate measurement error, these methods provide little or no benefit over regularized linear regression, even with very large sample sizes (N = 100,000) and a non-linear ground truth. Given these sample size requirements, we argue that the scalability of digital interventions, especially when used in combination with optimal measurement practices, provides one of the most effective ways to study treatment prediction models. We conclude with suggestions about how to implement these algorithms into clinical practice.
Article
Here we discuss already demonstrated innovations that can help improve everyday clinical practice in psychiatry, in particular in depression detection and remote patient monitoring.
Article
Major depressive disorder is a heterogeneous diagnostic category with multiple available treatments. With the goal of optimizing treatment selection, researchers are developing computational models which attempt to predict treatment response based on various pre-treatment measures. In this paper, we review studies which use brain activity data to predict treatment response. Our aim is to highlight and clarify important methodological differences between various studies that relate to the incorporation of domain-knowledge, specifically within two approaches delineated as data-driven and theory-driven. We argue that theory-driven generative modelling, which explicitly models information processing in the brain and thus can capture disease mechanisms, is a promising emerging approach that is only beginning to be utilized in treatment response prediction. The predictors extracted via such models could improve interpretability, which is critical for clinical decision-making. We also identify several methodological limitations across the reviewed studies and provide suggestions for addressing them. Namely, we consider problems with dichotomizing treatment outcomes, the importance of investigating more than one treatment in a given study for differential treatment response predictions, the need for a patient-centered approach for defining treatment outcomes, and finally, the use of internal and external validation methods for improving model generalizability.
Article
Depression is one of the significant contributors to the global burden of disease, affecting nearly 264 million people worldwide, along with an increasing rate of suicide deaths. Electroencephalogram (EEG), a non-invasive functional neuroimaging tool, has been widely used to study significant biomarkers for the diagnosis of the disorder. Computational Psychiatry is a novel avenue of research that has shown tremendous success in the automated diagnosis of depression. The present comprehensive review concentrates on two approaches widely adopted for EEG-based automated diagnosis of depression: the Deep Learning (DL) approach and the traditional approach based upon Machine Learning (ML). In this review, we focus on a comparative analysis of the variety of signal processing and classification methods adopted in the existing literature for these approaches. We discuss a variety of EEG-based objective biomarkers and the data acquisition systems adopted for the diagnosis of depression. A few EEG studies focusing on multi-modal fusion of data are also explained. Additionally, research on the analysis and prediction of treatment outcome for depression using EEG signals and machine learning techniques is briefly discussed to make researchers aware of this emerging field. Finally, future opportunities and major issues related to this field are summarized to help researchers develop more reliable and computationally intelligent systems in the field of psychiatry.
Preprint
We applied transfer entropy analysis to samples of electroencephalogram recorded from patients diagnosed with major depressive disorder and matched healthy controls. This is the first graphical representation of aberrant dynamics in terms of connectivity and the direction of information transfer between standard centers in MDD.
Preprint
Depression is a serious world health issue and many avenues of research are aiming at elucidating the mechanisms behind it. Recent findings confirm the importance of disrupted functional connectivity within the fronto-limbic system and other candidate areas important for depression. The question behind our work is whether areas with confirmed aberrant functioning in Major Depressive Disorder (MDD) are actually involved in a network whose dynamics differ from those of a healthy one. On a sample of 21 depressed patients (11 women and 9 men) and 20 age-matched healthy controls (10 women and 10 men), we applied Transfer Entropy (TE) to quantify the directed dynamical interactions in the resting-state electroencephalographic (EEG) data recorded in our previous research, in which we compared physiological complexity features of recurrently depressed patients and healthy controls. The dynamics of healthy resting-state EEG are substantially different from the dynamics of the MDD brain: the interactions (information transfers) in healthy controls are numerous during resting state, in contrast to MDD brains, which repeatedly show "isolated" activity in frontal, parietal and temporal areas. To the best of our knowledge, this is the first time that a graphical representation of information transfer and its directions is presented showing the differences between MDD and healthy controls. The BINNUE approach provided us with both influence and directions of influence between compared time series (epochs extracted from recorded EEG).
Article
Background: There is a lack of neuroscience-based biomarkers for the diagnosis, treatment and monitoring of individuals with substance use disorders (SUD). The resource allocation index (RAI), a measure of the interrelationship between salience, executive control and default-mode brain networks (SN, ECN, and DMN), has been proposed as one such biomarker. However, the RAI has yet to be extensively tested in SUD samples. Methods: The present analysis compared RAI scores between individuals with stimulant and/or opioid use disorders (SUD; n = 139, abstinent 4-365 days) and healthy controls (HC; n = 56) who had completed resting-state functional magnetic resonance imaging (fMRI) scans within the context of the Tulsa 1000 cohort. First, we used independent component analysis (ICA) to identify the SN, ECN, and DMN and extract their time series data. Second, we used multiple permutations of automatically identified networks to compute RAI as reported in the fMRI literature. Results: First, the RAI as a metric depended substantially on the approach that was used to define the network components. Second, regardless of the selection of networks, after controlling for multiple testing there was no difference in RAI scores between SUD and HC. Third, the RAI was not associated with any substance use-related self-report measures. Conclusion: Taken together, these findings do not provide evidence that RAI can be used as an fMRI-derived biomarker for the severity or diagnosis of individuals with SUD.
Article
Prominent theories suggest that compulsive behaviors, characteristic of obsessive-compulsive disorder and addiction, are driven by shared deficits in goal-directed control, which confers vulnerability for developing rigid habits. However, recent studies have shown that deficient goal-directed control accompanies several disorders, including those without an obvious compulsive element. Reasoning that this lack of clinical specificity might reflect broader issues with psychiatric diagnostic categories, we investigated whether a dimensional approach would better delineate the clinical manifestations of goal-directed deficits. Using large-scale online assessment of psychiatric symptoms and neurocognitive performance in two independent general-population samples, we found that deficits in goal-directed control were most strongly associated with a symptom dimension comprising compulsive behavior and intrusive thought. This association was highly specific when compared to other non-compulsive aspects of psychopathology. These data showcase a powerful new methodology and highlight the potential of a dimensional, biologically-grounded approach to psychiatry research.
Article
Objective: To explore the value of machine learning methods for predicting multiple sclerosis disease course. Methods: 1693 CLIMB study patients were classified as increased EDSS≥1.5 (worsening) or not (non-worsening) at up to five years after baseline visit. Support vector machines (SVM) were used to build the classifier, and compared to logistic regression (LR) using demographic, clinical and MRI data obtained at years one and two to predict EDSS at five years follow-up. Results: Baseline data alone provided little predictive value. Clinical observation for one year improved overall SVM sensitivity to 62% and specificity to 65% in predicting worsening cases. The addition of one year MRI data improved sensitivity to 71% and specificity to 68%. Use of non-uniform misclassification costs in the SVM model, weighting towards increased sensitivity, improved predictions (up to 86%). Sensitivity, specificity, and overall accuracy improved minimally with additional follow-up data. Predictions improved within specific groups defined by baseline EDSS. LR performed more poorly than SVM in most cases. Race, family history of MS, and brain parenchymal fraction ranked highly as predictors of the non-worsening group. Brain T2 lesion volume ranked highly as predictive of the worsening group. Interpretation: SVM incorporating short-term clinical and brain MRI data, class imbalance corrective measures, and classification costs may be a promising means to predict MS disease course, and for selection of patients suitable for more aggressive treatment regimens.
Article
Importance: Depressive severity is typically measured according to total scores on questionnaires that include a diverse range of symptoms despite convincing evidence that depression is not a unitary construct. When evaluated according to aggregate measurements, treatment efficacy is generally modest and differences in efficacy between antidepressant therapies are small. Objectives: To determine the efficacy of antidepressant treatments on empirically defined groups of symptoms and examine the replicability of these groups. Design, setting, and participants: Patient-reported data on patients with depression from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial (n = 4039) were used to identify clusters of symptoms in a depressive symptom checklist. The findings were then replicated using the Combining Medications to Enhance Depression Outcomes (CO-MED) trial (n = 640). Mixed-effects regression analysis was then performed to determine whether observed symptom clusters have differential response trajectories using intent-to-treat data from both trials (n = 4706) along with 7 additional placebo and active-comparator phase 3 trials of duloxetine (n = 2515). Finally, outcomes for each cluster were estimated separately using machine-learning approaches. The study was conducted from October 28, 2014, to May 19, 2016. Main outcomes and measures: Twelve items from the self-reported Quick Inventory of Depressive Symptomatology (QIDS-SR) scale and 14 items from the clinician-rated Hamilton Depression (HAM-D) rating scale. Higher scores on the measures indicate greater severity of the symptoms. Results: Of the 4706 patients included in the first analysis, 1722 (36.6%) were male; mean (SD) age was 41.2 (13.3) years. Of the 2515 patients included in the second analysis, 855 (34.0%) were male; mean age was 42.65 (12.17) years. Three symptom clusters in the QIDS-SR scale were identified at baseline in STAR*D. 
This 3-cluster solution was replicated in CO-MED and was similar for the HAM-D scale. Antidepressants in general (8 of 9 treatments) were more effective for core emotional symptoms than for sleep or atypical symptoms. Differences in efficacy between drugs were often greater than the difference in efficacy between treatments and placebo. For example, high-dose duloxetine outperformed escitalopram in treating core emotional symptoms (effect size, 2.3 HAM-D points during 8 weeks, 95% CI, 1.6 to 3.1; P < .001), but escitalopram was not significantly different from placebo (effect size, 0.03 HAM-D points; 95% CI, -0.7 to 0.8; P = .94). Conclusions and relevance: Two common checklists used to measure depressive severity can produce statistically reliable clusters of symptoms. These clusters differ in their responsiveness to treatment both within and across different antidepressant medications. Selecting the best drug for a given cluster may have a bigger benefit than that gained by use of an active compound vs a placebo.
Article
Despite its great promise, neuroimaging has yet to substantially impact clinical practice and public health. However, a developing synergy between emerging analysis techniques and data-sharing initiatives has the potential to transform the role of neuroimaging in clinical applications. We review the state of translational neuroimaging and outline an approach to developing brain signatures that can be shared, tested in multiple contexts and applied in clinical settings. The approach rests on three pillars: (i) the use of multivariate pattern-recognition techniques to develop brain signatures for clinical outcomes and relevant mental processes; (ii) assessment and optimization of their diagnostic value; and (iii) a program of broad exploration followed by increasingly rigorous assessment of generalizability across samples, research contexts and populations. Increasingly sophisticated models based on these principles will help to overcome some of the obstacles on the road from basic neuroscience to better health and will ultimately serve both basic and applied goals.
Article
Neuroimaging increasingly exploits machine learning techniques in an attempt to achieve clinically relevant single-subject predictions. An alternative to machine learning, which tries to establish predictive links between features of the observed data and clinical variables, is the deployment of computational models for inferring on the (patho)physiological and cognitive mechanisms that generate behavioural and neuroimaging responses. This paper discusses the rationale behind a computational approach to neuroimaging-based single-subject inference; focusing on its potential for characterising disease mechanisms in individual subjects and mapping these characterisations to clinical predictions. Following an overview of two main approaches - Bayesian model selection and generative embedding - which can link computational models to individual predictions, we review how these methods accommodate heterogeneity in psychiatric and neurological spectrum disorders, help avoid erroneous interpretations of neuroimaging data, and establish a link between a mechanistic, model-based approach and the statistical perspectives afforded by machine learning.
Article
Objective measures of psychiatric health would be of great benefit in clinical practice. Despite considerable research in the area of psychiatric neuroimaging outcome prediction, translating putative neuroimaging markers (neuromarkers) of a disorder into clinical practice has proven to be challenging. Here, we reviewed studies that used neuroimaging measures to predict treatment response and disease outcomes in major depressive disorder, substance use, autism spectrum disorder, psychosis, and dementia. The majority of studies sought to predict psychiatric outcomes rather than develop a specific biological index of future disease trajectory. Studies varied widely with respect to sample size and quantification of out-of-sample prediction model performance. Many studies were able to predict psychiatric outcomes with moderate accuracy, with neuroimaging data often augmenting the prediction compared to clinical or psychometric data alone. We make recommendations for future research with respect to methods that can increase the generalizability and reproducibility of predictions. Large sample sizes in conjunction with machine learning methods such as feature selection, cross-validation, and random label permutation provide significant improvement to, and quantification of, generalizability. Further refinement of neuroimaging protocols and analysis methods will likely facilitate the clinical applicability of predictive imaging markers in psychiatry. Such clinically relevant neuromarkers need not necessarily be grounded in the pathophysiology of the disease, but identifying these neuromarkers may suggest targets for future research into disease mechanisms. The ability of imaging prediction models to augment clinical judgements will ultimately depend on the personal and economic costs and benefits to the patient.
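The pairing of cross-validation with random label permutation recommended above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the procedure of any reviewed study: it uses a single synthetic feature, a nearest-class-mean classifier and leave-one-out cross-validation, and the helper names `loo_accuracy` and `permutation_p_value` are hypothetical.

```python
import random


def loo_accuracy(x, labels):
    """Leave-one-out accuracy of a nearest-class-mean classifier on one feature."""
    hits = 0
    for i in range(len(x)):
        class_means = {}
        for c in (0, 1):
            # Class means are computed without the held-out observation i.
            values = [x[j] for j in range(len(x)) if j != i and labels[j] == c]
            class_means[c] = sum(values) / len(values)
        guess = min(class_means, key=lambda c: abs(x[i] - class_means[c]))
        hits += int(guess == labels[i])
    return hits / len(x)


def permutation_p_value(x, labels, n_perm=200, seed=0):
    """Compare the observed accuracy with accuracies obtained under shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(x, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        at_least_as_good += int(loo_accuracy(x, shuffled) >= observed)
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_perm + 1)


# Synthetic data (hypothetical): 20 cases per class with a clear group difference.
rng = random.Random(42)
labels = [0] * 20 + [1] * 20
feature = [3.0 * c + rng.gauss(0, 1) for c in labels]

accuracy, p_value = permutation_p_value(feature, labels)
print(accuracy, p_value)
```

The permutation step estimates how often a model of the same form reaches the observed cross-validated accuracy when the link between features and outcome is deliberately broken, which gives the accuracy figure a reference distribution rather than an assumed chance level.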
Article
Introduction: Although psychotherapies for depression produce equivalent outcomes, individual patients respond differently to different therapies. Predictors of outcome have been identified in the context of randomized trials, but this information has not been used to predict which treatment works best for the depressed individual. In this paper, we aim to replicate a recently developed treatment selection method, using data from an RCT comparing the effects of cognitive therapy (CT) and interpersonal psychotherapy (IPT). Methods: 134 depressed patients completed the pre- and post-treatment BDI-II assessment. First, we identified baseline predictors and moderators. Second, individual treatment recommendations were generated by combining the identified predictors and moderators in an algorithm that produces the Personalized Advantage Index (PAI), a measure of the predicted advantage in one therapy compared to the other, using standard regression analyses and the leave-one-out cross-validation approach. Results: We found five predictors (gender, employment status, anxiety, personality disorder and quality of life) and six moderators (somatic complaints, cognitive problems, paranoid symptoms, interpersonal self-sacrificing, attributional style and number of life events) of treatment outcome. The mean average PAI value was 8.9 BDI points, and 63% of the sample was predicted to have a clinically meaningful advantage in one of the therapies. Those who were randomized to their predicted optimal treatment (either CT or IPT) had an observed mean end-BDI of 11.8, while those who received their predicted non-optimal treatment had an end-BDI of 17.8 (effect size for the difference = 0.51). Discussion: Depressed patients who were randomized to their predicted optimal treatment fared much better than those randomized to their predicted non-optimal treatment. The PAI provides a great opportunity for formal decision-making to improve individual patient outcomes in depression. 
Although the utility of the PAI approach will need to be evaluated in prospective research, this study promotes the development of a treatment selection approach that can be used in regular mental health care, advancing the goals of personalized medicine.
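The logic of the Personalized Advantage Index can be sketched as follows: predict each patient's end-of-treatment score under both therapies from a model fitted without that patient, and take the difference. This is a simplified illustration, assuming a single continuous moderator and a separate least-squares line per treatment arm (equivalent to one regression with a treatment-by-moderator interaction); the study itself combined several predictors and moderators. All names and the synthetic data are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_pai(x, t, y):
    """Leave-one-out PAI: predicted score under arm 1 minus predicted score under arm 0."""
    pai = []
    for i in range(len(y)):
        predicted = {}
        for arm in (0, 1):
            # Fit each arm's line without patient i, then predict for patient i.
            idx = [j for j in range(len(y)) if j != i and t[j] == arm]
            intercept, slope = fit_line([x[j] for j in idx], [y[j] for j in idx])
            predicted[arm] = intercept + slope * x[i]
        pai.append(predicted[1] - predicted[0])
    return pai


# Synthetic trial (hypothetical): the moderator x helps in arm 0 and hurts in arm 1.
rng = random.Random(0)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]
t = [rng.randint(0, 1) for _ in range(n)]
y = [15 + 4 * xi * (2 * ti - 1) + rng.gauss(0, 1) for xi, ti in zip(x, t)]

pai = loo_pai(x, t, y)
# Lower scores are better, so a negative PAI favors arm 1.
recommended = [1 if p < 0 else 0 for p in pai]
matched = [yi for yi, ti, ri in zip(y, t, recommended) if ti == ri]
unmatched = [yi for yi, ti, ri in zip(y, t, recommended) if ti != ri]
print(round(mean(matched), 1), round(mean(unmatched), 1))
```

Comparing the observed outcomes of patients who happened to be randomized to their predicted optimal arm with those who were not mirrors the evaluation described in the abstract, although a prospective test remains the stronger check.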
Article
Empirically analyzing empirical evidence. One of the central goals in any scientific endeavor is to understand causality. Experiments that seek to demonstrate a cause/effect relation most often manipulate the postulated causal factor. Aarts et al. describe the replication of 100 experiments reported in papers published in 2008 in three high-ranking psychology journals. Assessing whether the replication and the original experiment yielded the same result according to several criteria, they find that about one-third to one-half of the original findings were also observed in the replication study. Science, this issue 10.1126/science.aac4716
Article
We asked whether brain connectomics can predict response to treatment for a neuropsychiatric disorder better than conventional clinical measures. Pre-treatment resting-state brain functional connectivity and diffusion-weighted structural connectivity were measured in 38 patients with social anxiety disorder (SAD) to predict subsequent treatment response to cognitive behavioral therapy (CBT). We used a priori bilateral anatomical amygdala seed-driven resting connectivity and probabilistic tractography of the right inferior longitudinal fasciculus together with a data-driven multivoxel pattern analysis of whole-brain resting-state connectivity before treatment to predict improvement in social anxiety after CBT. Each connectomic measure improved the prediction of individuals' treatment outcomes significantly better than a clinical measure of initial severity, and combining the multimodal connectomics yielded a fivefold improvement in predicting treatment response. Generalization of the findings was supported by leave-one-out cross-validation. After dividing patients into better or worse responders, logistic regression of connectomic predictors and initial severity combined with leave-one-out cross-validation yielded a categorical prediction of clinical improvement with 81% accuracy, 84% sensitivity and 78% specificity. Connectomics of the human brain, measured by widely available imaging methods, may provide brain-based biomarkers (neuromarkers) supporting precision medicine that better guide patients with neuropsychiatric diseases to optimal available treatments, and thus translate basic neuroimaging into medical practice. Molecular Psychiatry advance online publication, 11 August 2015; doi:10.1038/mp.2015.109.
Article
A comprehensive account of the causes of alcohol misuse must accommodate individual differences in biology, psychology and environment, and must disentangle cause and effect. Animal models(1) can demonstrate the effects of neurotoxic substances; however, they provide limited insight into the psycho-social and higher cognitive factors involved in the initiation of substance use and progression to misuse. One can search for pre-existing risk factors by testing for endophenotypic biomarkers(2) in non-using relatives; however, these relatives may have personality or neural resilience factors that protect them from developing dependence(3). A longitudinal study has potential to identify predictors of adolescent substance misuse, particularly if it can incorporate a wide range of potential causal factors, both proximal and distal, and their influence on numerous social, psychological and biological mechanisms(4). Here we apply machine learning to a wide range of data from a large sample of adolescents (n = 692) to generate models of current and future adolescent alcohol misuse that incorporate brain structure and function, individual personality and cognitive differences, environmental factors (including gestational cigarette and alcohol exposure), life experiences, and candidate genes. These models were accurate and generalized to novel data, and point to life experiences, neurobiological differences and personality as important antecedents of binge drinking. By identifying the vulnerability factors underlying individual differences in alcohol misuse, these models shed light on the aetiology of alcohol misuse and suggest targets for prevention.
Article
Less than 50% of patients with Major Depressive Disorder (MDD) reach symptomatic remission with their initial antidepressant medication (ADM). There are currently no objective measures with which to reliably predict which individuals will achieve remission to ADMs. 157 participants with MDD from the International Study to Predict Optimized Treatment in Depression (iSPOT-D) underwent baseline MRIs and completed eight weeks of treatment with escitalopram, sertraline or venlafaxine-ER. A score at week 8 of 7 or less on the 17 item Hamilton Rating Scale for Depression defined remission. Receiver Operator Characteristics (ROC) analysis using the first 50% participants was performed to define decision trees of baseline MRI volumetric and connectivity (fractional anisotropy) measures that differentiated non-remitters from remitters with maximal sensitivity and specificity. These decision trees were tested for replication in the remaining participants. Overall, 35% of all participants achieved remission. ROC analyses identified two decision trees that predicted a high probability of non-remission and that were replicated: 1. Left middle frontal volume < 14.8 mL & right angular gyrus volume > 6.3 mL identified 55% of non-remitters with 85% accuracy; and 2. Fractional anisotropy values in the left cingulum bundle < 0.63, right superior fronto-occipital fasciculus < 0.54 and right superior longitudinal fasciculus < 0.50 identified 15% of the non-remitters with 84% accuracy. All participants who met criteria for both decision trees were correctly identified as non-remitters. Pretreatment MRI measures seem to reliably identify a subset of patients who do not remit with a first step medication that includes one of these commonly used medications. Findings are consistent with a neuroanatomical basis for non-remission in depressed patients. Brain Resource Ltd is the sponsor for the iSPOT-D study (NCT00693849).
Article
Electroconvulsive therapy (ECT) is effective even in treatment-resistant patients with major depression. Currently, there are no markers available that can assist in identifying those patients most likely to benefit from ECT. In the present study, we investigated whether resting-state network connectivity can predict treatment outcome for individual patients. We included forty-five patients with severe and treatment-resistant unipolar depression and collected functional magnetic resonance imaging scans before the course of ECT. We extracted resting-state networks and used multivariate pattern analysis to discover networks that predicted recovery from depression. Cross-validation revealed two resting-state networks with significant classification accuracy after correction for multiple comparisons. A network centered in the dorsomedial prefrontal cortex (including the dorsolateral prefrontal cortex, orbitofrontal cortex and posterior cingulate cortex) showed a sensitivity of 84% and specificity of 85%. Another network centered in the anterior cingulate cortex (including the dorsolateral prefrontal cortex, sensorimotor cortex, parahippocampal gyrus and midbrain) showed a sensitivity of 80% and a specificity of 75%. These preliminary results demonstrate that resting-state networks may predict treatment outcome for individual patients and suggest that resting-state networks have the potential to serve as prognostic neuroimaging biomarkers to guide personalized treatment decisions.
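For readers less familiar with the metrics quoted above, the short sketch below shows how sensitivity, specificity and accuracy follow from a confusion matrix. The counts are hypothetical values chosen only so the arithmetic is easy to follow; they are not taken from the study, and the function name `diagnostic_summary` is an assumption.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # share of true responders that were detected
        "specificity": tn / (tn + fp),   # share of non-responders correctly ruled out
        "accuracy": (tp + tn) / total,   # share of all patients classified correctly
    }


# Hypothetical counts for 45 patients (25 responders, 20 non-responders).
summary = diagnostic_summary(tp=21, fn=4, tn=17, fp=3)
print(summary)
```

Reporting sensitivity and specificity separately matters when groups are unbalanced, since overall accuracy alone can look favorable while one class is predicted poorly.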
Article
Background: Although variation in the long-term course of major depressive disorder (MDD) is not strongly predicted by existing symptom subtype distinctions, recent research suggests that prediction can be improved by using machine learning methods. However, it is not known whether these distinctions can be refined by added information about co-morbid conditions. The current report presents results on this question. Method: Data came from 8261 respondents with lifetime DSM-IV MDD in the World Health Organization (WHO) World Mental Health (WMH) Surveys. Outcomes included four retrospectively reported measures of persistence/severity of course (years in episode; years in chronic episodes; hospitalization for MDD; disability due to MDD). Machine learning methods (regression tree analysis; lasso, ridge and elastic net penalized regression) followed by k-means cluster analysis were used to augment previously detected subtypes with information about prior co-morbidity to predict these outcomes. Results: Predicted values were strongly correlated across outcomes. Cluster analysis of predicted values found three clusters with consistently high, intermediate or low values. The high-risk cluster (32.4% of cases) accounted for 56.6-72.9% of high persistence, high chronicity, hospitalization and disability. This high-risk cluster had both higher sensitivity and likelihood ratio positive (LR+; relative proportions of cases in the high-risk cluster versus other clusters having the adverse outcomes) than in a parallel analysis that excluded measures of co-morbidity as predictors. Conclusions: Although the results using the retrospective data reported here suggest that useful MDD subtyping distinctions can be made with machine learning and clustering across multiple indicators of illness persistence/severity, replication with prospective data is needed to confirm this preliminary conclusion.
Article
This proof-of-concept study examines the feasibility of defining subgroups in psychiatric spectrum disorders by generative embedding, using dynamical system models which infer neuronal circuit mechanisms from neuroimaging data. To this end, we re-analysed an fMRI dataset of 41 patients diagnosed with schizophrenia and 42 healthy controls performing a numerical n-back working-memory task. In our generative-embedding approach, we used parameter estimates from a dynamic causal model (DCM) of a visual–parietal–prefrontal network to define a model-based feature space for the subsequent application of supervised and unsupervised learning techniques. First, using a linear support vector machine for classification, we were able to predict individual diagnostic labels significantly more accurately (78%) from DCM-based effective connectivity estimates than from functional connectivity between (62%) or local activity within the same regions (55%). Second, an unsupervised approach based on variational Bayesian Gaussian mixture modelling provided evidence for two clusters which mapped onto patients and controls with nearly the same accuracy (71%) as the supervised approach. Finally, when restricting the analysis only to the patients, Gaussian mixture modelling suggested the existence of three patient subgroups, each of which was characterised by a different architecture of the visual–parietal–prefrontal working-memory network. Critically, even though this analysis did not have access to information about the patients' clinical symptoms, the three neurophysiologically defined subgroups mapped onto three clinically distinct subgroups, distinguished by significant differences in negative symptom severity, as assessed on the Positive and Negative Syndrome Scale (PANSS). 
In summary, this study provides a concrete example of how psychiatric spectrum diseases may be split into subgroups that are defined in terms of neurophysiological mechanisms specified by a generative model of network dynamics such as DCM. The results corroborate our previous findings in stroke patients that generative embedding, compared to analyses of more conventional measures such as functional connectivity or regional activity, can significantly enhance both the interpretability and performance of computational approaches to clinical classification.
Article
Objective: Results from structural neuroimaging studies of obsessive-compulsive disorder (OCD) have been only partially consistent. The authors sought to assess regional gray and white matter volume differences between large samples of OCD patients and healthy comparison subjects and their relation with demographic and clinical variables. Method: A multicenter voxel-based morphometry mega-analysis was performed on 1.5-T structural T1-weighted MRI scans derived from the International OCD Brain Imaging Consortium. Regional gray and white matter brain volumes were compared between 412 adult OCD patients and 368 healthy subjects. Results: Relative to healthy comparison subjects, OCD patients had significantly smaller volumes of frontal gray and white matter bilaterally, including the dorsomedial prefrontal cortex, the anterior cingulate cortex, and the inferior frontal gyrus extending to the anterior insula. Patients also showed greater cerebellar gray matter volume bilaterally compared with healthy subjects. Group differences in frontal gray and white matter volume were significant after correction for multiple comparisons. Additionally, group-by-age interactions were observed in the putamen, insula, and orbitofrontal cortex (indicating relative preservation of volume in patients compared with healthy subjects with increasing age) and in the temporal cortex bilaterally (indicating a relative loss of volume in patients compared with healthy subjects with increasing age). Conclusions: These findings partially support the prevailing fronto-striatal models of OCD and offer additional insights into the neuroanatomy of the disorder that were not apparent from previous smaller studies. The group-by-age interaction effects in orbitofrontal-striatal and (para)limbic brain regions may be the result of altered neuroplasticity associated with chronic compulsive behaviors, anxiety, or compensatory processes related to cognitive dysfunction.
Article
Research and clinical investigations in psychiatry largely rely on the de facto assumption that the diagnostic categories identified in the Diagnostic and Statistical Manual (DSM) represent homogeneous syndromes. However, the mechanistic heterogeneity that potentially underlies the existing classification scheme might limit discovery of etiology for most developmental psychiatric disorders. Another, perhaps less palpable, reality may also be interfering with progress: heterogeneity in typically developing populations. In this report we attempt to clarify neuropsychological heterogeneity in a large dataset of typically developing youth and youth with attention deficit/hyperactivity disorder (ADHD), using graph theory and community detection. We sought to determine whether data-driven neuropsychological subtypes could be discerned in children with and without the disorder. Because individual classification is the sine qua non for eventual clinical translation, we also apply support vector machine-based multivariate pattern analysis to identify how well ADHD status in individual children can be identified as defined by the community detection delineated subtypes. The analysis yielded several unique, but similar subtypes across both populations. Just as importantly, comparing typically developing children with ADHD children within each of these distinct subgroups increased diagnostic accuracy. Two important principles were identified that have the potential to advance our understanding of typical development and developmental neuropsychiatric disorders. The first tenet suggests that typically developing children can be classified into distinct neuropsychological subgroups with high precision. The second tenet proposes that some of the heterogeneity in individuals with ADHD might be "nested" in this normal variation.
Article
A fundamental function of the brain is to evaluate the emotional and motivational significance of stimuli and to adapt behaviour accordingly. The IMAGEN study is the first multicentre genetic-neuroimaging study aimed at identifying the genetic and neurobiological basis of individual variability in impulsivity, reinforcer sensitivity and emotional reactivity, and determining their predictive value for the development of frequent psychiatric disorders. Comprehensive behavioural and neuropsychological characterization, functional and structural neuroimaging and genome-wide association analyses of 2000 14-year-old adolescents are combined with functional genetics in animal and human models. Results will be validated in 1000 adolescents from the Canadian Saguenay Youth Study. The sample will be followed up longitudinally at the age of 16 years to investigate the predictive value of genetics and intermediate phenotypes for the development of frequent psychiatric disorders. This review describes the strategies the IMAGEN consortium used to meet the challenges posed by large-scale multicentre imaging-genomics investigations. We provide detailed methods and Standard Operating Procedures that we hope will be helpful for the design of future studies. These include standardization of the clinical, psychometric and neuroimaging-acquisition protocols, development of a central database for efficient analyses of large multimodal data sets and new analytic approaches to large-scale genetic neuroimaging analyses.
Article
Comparative studies have implicated the nucleus accumbens (NAcc) in the anticipation of incentives, but the relative responsiveness of this neural substrate during anticipation of rewards versus punishments remains unclear. Using event-related functional magnetic resonance imaging, we investigated whether the anticipation of increasing monetary rewards and punishments would increase NAcc blood oxygen level-dependent contrast (hereafter, "activation") in eight healthy volunteers. Whereas anticipation of increasing rewards elicited both increasing self-reported happiness and NAcc activation, anticipation of increasing punishment elicited neither. However, anticipation of both rewards and punishments activated a different striatal region (the medial caudate). At the highest reward level ($5.00), NAcc activation was correlated with individual differences in self-reported happiness elicited by the reward cues. These findings suggest that whereas other striatal areas may code for expected incentive magnitude, a region in the NAcc codes for expected positive incentive value.
Article
Psychology has historically been concerned, first and foremost, with explaining the causal mechanisms that give rise to behavior. Randomized, tightly controlled experiments are enshrined as the gold standard of psychological research, and there are endless investigations of the various mediating and moderating variables that govern various behaviors. We argue that psychology’s near-total focus on explaining the causes of behavior has led much of the field to be populated by research programs that provide intricate theories of psychological mechanism but that have little (or unknown) ability to predict future behaviors with any appreciable accuracy. We propose that principles and techniques from the field of machine learning can help psychology become a more predictive science. We review some of the fundamental concepts and tools of machine learning and point out examples where these concepts have been used to conduct interesting and important psychological research that focuses on predictive research questions. We suggest that an increased focus on prediction, rather than explanation, can ultimately lead us to greater understanding of behavior.
Article
Background: Major depressive disorder (MDD) has a high personal and socio-economic burden and >60% of patients fail to achieve remission with the first antidepressant. The biological mechanisms behind antidepressant response are only partially known but genetic factors play a relevant role. A combined predictor across genetic variants may be useful to investigate this complex trait. Methods: Polygenic risk scores (PRS) were used to estimate multi-allelic contribution to: 1) antidepressant efficacy; 2) its overlap with MDD and schizophrenia. We constructed PRS and tested whether these predicted symptom improvement or remission from the GENDEP study (n=736) to the STAR*D study (n=1409) and vice-versa, including the whole sample or only patients treated with escitalopram or citalopram. Using summary statistics from Psychiatric Genomics Consortium for MDD and schizophrenia, we tested whether PRS from these disorders predicted symptom improvement in GENDEP, STAR*D, and five further studies (n=3756). Results: No significant prediction of antidepressant efficacy was obtained from PRS in GENDEP/STAR*D but this analysis might have been underpowered. There was no evidence of overlap in the genetics of antidepressant response with either MDD or schizophrenia, either in individual studies or a meta-analysis. Stratifying by antidepressant did not alter the results. Discussion: We identified no significant predictive effect using PRS between pharmacogenetic studies. The genetic liability to MDD or schizophrenia did not predict response to antidepressants, suggesting differences between the genetic component of depression and treatment response. Larger or more homogeneous studies will be necessary to obtain a polygenic predictor of antidepressant response.
Article
Biomarkers have transformed modern medicine but remain largely elusive in psychiatry, partly because there is a weak correspondence between diagnostic labels and their neurobiological substrates. Like other neuropsychiatric disorders, depression is not a unitary disease, but rather a heterogeneous syndrome that encompasses varied, co-occurring symptoms and divergent responses to treatment. By using functional magnetic resonance imaging (fMRI) in a large multisite sample (n = 1,188), we show here that patients with depression can be subdivided into four neurophysiological subtypes ('biotypes') defined by distinct patterns of dysfunctional connectivity in limbic and frontostriatal networks. Clustering patients on this basis enabled the development of diagnostic classifiers (biomarkers) with high (82–93%) sensitivity and specificity for depression subtypes in multisite validation (n = 711) and out-of-sample replication (n = 477) data sets. These biotypes cannot be differentiated solely on the basis of clinical features, but they are associated with differing clinical-symptom profiles. They also predict responsiveness to transcranial-magnetic-stimulation therapy (n = 154). Our results define novel subtypes of depression that transcend current diagnostic boundaries and may be useful for identifying the individuals who are most likely to benefit from targeted neurostimulation therapies.
Article
Background: At present, no tools exist to estimate objectively the risk of poor treatment outcomes in patients with first-episode psychosis. Such tools could improve treatment by informing clinical decision-making before the commencement of treatment. We tested whether such a tool could be successfully built and validated using routinely available, patient-reportable information. Methods: By applying machine learning to data from 334 patients in the European First Episode Schizophrenia Trial (EUFEST; International Clinical Trials Registry Platform number, ISRCTN68736636), we developed a tool to predict poor versus good treatment outcome (Global Assessment of Functioning [GAF] score <65 vs GAF ≥65, respectively) after 4 weeks and 52 weeks of treatment. To enable the unbiased estimation of the predictive system's generalisability to new patients, we used repeated nested cross-validation to prevent information leaking between patients used for training and validating the models. In pursuit of everyday clinical applicability, we retrained the 4-week outcome predictor with only the top ten predictors of the pooled prediction system and then tested this tool in 108 independent patients with 4-week outcome labels. Discontinuation and readmission to hospital events in patients with predicted poor versus good outcomes were assessed with Kaplan-Meier log-rank analyses, whereas generalised linear mixed-effects models were used to investigate the GAF-based predictions against several clinically meaningful outcome indicators, including treatment adherence, symptom remission, and quality of life. Findings: The generalisability of our outcome predictions was estimated with cross-validation (test-fold balanced accuracy [BAC] of 75·0% for 4-week outcomes and 73·8% for 52-week outcomes), and leave-site-out validation across 44 European sites (BAC of 72·1% for 4-week outcomes and 71·1% for 52-week outcomes).
We identified a smaller group of ten predictors still providing a BAC of 71·7% in 108 patients never used for model discovery. Unemployment, poor education, functional deficits, and unmet psychosocial needs predicted both endpoints, whereas previous depressive episodes, male sex, and suicidality additionally predicted poor 1-year outcomes. 52-week predictions identified patients at risk for symptom persistence, non-adherence to treatment, readmission to hospital and poor quality of life. Specifically among these patients, amisulpride and olanzapine showed superior efficacy versus haloperidol, quetiapine, and ziprasidone. Interpretation: Our results suggest that prognostic models operating on brief, patient-reportable pre-treatment data might provide useful insight into individualised outcome trajectories, optimising treatment selection, and targeted clinical trial designs. To embed these tools into real-world care, replication is needed in external first-episode samples with overlapping variables, which are not available in the field at present. Funding: The European Group for Research in Schizophrenia.
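The leave-site-out validation and balanced accuracy (BAC) reported in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the single-feature threshold model, the function names (`balanced_accuracy`, `leave_site_out_splits`, `fit_threshold`, `leave_site_out_bac`) and the toy data are all hypothetical, chosen only to show how holding out one site at a time keeps test patients out of model fitting.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = poor outcome)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    sensitivity = tp / pos if pos else 0.0
    specificity = tn / neg if neg else 0.0
    return (sensitivity + specificity) / 2


def leave_site_out_splits(sites):
    """Yield (held_out_site, train_idx, test_idx), one split per site."""
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    for site in sorted(by_site):
        test_idx = by_site[site]
        train_idx = [i for i in range(len(sites)) if sites[i] != site]
        yield site, train_idx, test_idx


def fit_threshold(x, y):
    """Toy stand-in for a model: midpoint between the two class means.

    Assumes both classes are present in the training data.
    """
    poor = [v for v, label in zip(x, y) if label == 1]
    good = [v for v, label in zip(x, y) if label == 0]
    return (sum(poor) / len(poor) + sum(good) / len(good)) / 2


def leave_site_out_bac(x, y, sites):
    """Fit on all sites but one, predict the held-out site, pool predictions."""
    y_true, y_pred = [], []
    for _, train_idx, test_idx in leave_site_out_splits(sites):
        threshold = fit_threshold([x[i] for i in train_idx],
                                  [y[i] for i in train_idx])
        for i in test_idx:
            y_true.append(y[i])
            # Hypothetical rule: a low baseline score predicts poor outcome.
            y_pred.append(1 if x[i] < threshold else 0)
    return balanced_accuracy(y_true, y_pred)


if __name__ == "__main__":
    # Invented baseline scores, outcome labels, and site identifiers.
    x = [40, 45, 70, 75, 42, 72, 50, 80]
    y = [1, 1, 0, 0, 1, 0, 1, 0]
    sites = ["A", "A", "A", "A", "B", "B", "C", "C"]
    print(leave_site_out_bac(x, y, sites))
```

Because the threshold is always fitted without the held-out site, the pooled BAC estimates performance at sites the model has never seen. A nested scheme, as described in the abstract, would add an inner cross-validation loop inside each training set for model selection.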
Article
Psychiatry is in need of a major overhaul. In order to improve the precision with which we can treat, classify, and research mental health problems, we need bigger datasets than ever before. Web-based data collection provides a novel solution.