Article

... While surveys often allow for polling representative samples, they provide a single time point of assessment and are resource intensive, both in terms of cost and time [75,81]. First-hand accounts on social media provide an ecological description of the lived experience of individuals and prior studies have shown that language on social media can be used to reliably characterize and study the mental health of individuals [18,36]. ...
... Social media language data has been widely used to understand and predict the well-being and mental health of individuals (see [12,36] for reviews on this topic). Prior studies illustrate the predictive utility of language in identifying mental health conditions such as depression [18,24], anxiety [42], ADHD [35], PTSD [14], and review the application of social data in understanding the manifestation and sequelae of mental health conditions [34,52]. ...
... Research Gap. Prior work studying mental health using social media data has found that the language in posts contains insightful markers associated with individuals' mental health and their sequelae, which are often infeasible or too costly to obtain using traditional methods [19,24,36]. Further, Reddit has been shown to be a particularly promising platform, given platform affordances that encourage self-disclosure and anonymity in the context of mental health discussions [17,28,72]. ...
Conference Paper
Full-text available
The experience of immigrating to a foreign land is associated with exposure to new cultures, changes in social networks, and challenges to prevalent systems of meaning. A body of literature has shown that the immigration experience, while pursued with the hope of a better quality of life, is associated with adverse effects on immigrants' mental health and well-being due to sociopolitical and economic factors. In this paper, we study first-hand accounts of struggles with mental health by individuals who participate in immigration communities (aka subreddits) on Reddit, a popular social media platform. First, we compare and contrast the sentiment and content of posts made by individuals in mental health subreddits who also post in immigration subreddits with those of a matched group that does not. Second, we adopt the case-crossover approach to evaluate the changes in their mental health language before and after the first post in immigration subreddits. We find that mental health concerns among the individuals posting in immigration subreddits are about race, politics, violence, employment, and affordability, whereas, among the matched group, mental health posts are about anger and self-harm, family and relationships, swearing, and introspection. We also find that the language of mental health before and after the first post in immigration subreddits evolves from seeking support and therapy to a more concrete and specific discussion around mental health and a positive outlook towards future goals.
... At least 5% of adults worldwide suffer from depression at some point in their lives [45]. Natural language processing (NLP) of social media data has been utilized to characterize and predict depression [13,19,30]. However, prior research in this area has predominantly treated depression as a single syndrome, for instance, represented by a sum-score [23]. ...
... Among all mental health conditions, depression is most commonly researched using social media [10]. Past studies have successfully used social media language to predict depression (see reviews in [26,30,34,55]). For example, one of the most cited papers extracted linguistic features from Twitter to analyze depression [13] and found signals for characterizing the onset of depression (e.g., increased negative emotions). The most commonly used linguistic features or variables in predictive analyses include the distribution of words and phrases, the syntactic composition of posts (e.g., length of posts), psycholinguistic categories from the Linguistic Inquiry and Word Count (LIWC) dictionaries [48], Latent Dirichlet Allocation (LDA) topics [7], and domain-specific lexicons (see reviews in [10,53]). ...
... The most commonly used linguistic features or variables in predictive analyses include the distribution of words and phrases, the syntactic composition of posts (e.g., length of posts), psycholinguistic categories from the Linguistic Inquiry and Word Count (LIWC) dictionaries [48], Latent Dirichlet Allocation (LDA) topics [7], and domain-specific lexicons (see reviews in [10,53]). These features are typically used to correlate and classify depression status obtained through self-assessments, self-disclosure, or forum membership [30]. Although some studies have evaluated the predictive ability of language posted on Reddit about multiple mental health conditions [12,36], specific symptoms of depression have not been examined. ...
Conference Paper
Full-text available
Depression is known to have heterogeneous symptom manifestations. Investigating various symptoms of depression is essential to understanding underlying mechanisms and personalizing treatments. Reddit, an online peer-to-peer social media platform, contains varied communities (subreddits) where individuals discuss their detailed mental health experiences and seek support. The current paper has two aims. The first is to identify psycho-linguistic and open-vocabulary language markers associated with different symptoms using 1,318,749 posts from 43 subreddit communities (e.g., r/bingeeating) clustered into 13 expert-validated depression symptoms (e.g., disordered eating). The second aim is to develop prediction models based on the above linguistic features and RoBERTa embeddings to detect specific symptom discourse in contrast to control subreddit posts contributed by the same Reddit users. These predictive models are then validated on a second sample of individuals (N = 2,986) who shared their Facebook posts and completed self-report depression (PHQ-9), anxiety (GAD-7), and loneliness (UCLA-3) surveys. Based on the differential linguistic patterns that emerged across the various symptoms in our data, we identified three potential clusters, which could also be mapped to the Research Domain Criteria (RDoC) framework. RoBERTa embeddings demonstrated the highest accuracy at predicting most symptoms and were particularly robust at predicting the severity of suicidal thoughts and attempts, self-loathing, loneliness, and disordered eating. Our study demonstrates the potential of using large, pseudonymous online forums to train language-based symptom-estimation machine-learning models that can be applied to other text sources. Such technologies could be helpful in clinical psychology, population health, and other areas where early mental health monitoring could improve diagnosis, risk reduction, and treatment.
... Research suggests that health digital traces on social media provide a great means for capturing an individual's current states of mind, feelings, behaviors, and activities that often characterize depression (Choudhury et al., 2013; Nadeem, 2016). Existing research shows that social media-based depression screening has the potential to achieve detection results that are comparable to unaided clinician assessment and screening surveys (Guntuku et al., 2017). Meanwhile, unlike traditional survey- or interview-based depression ... (Footnote 2: When employees are present for work but less productive due to their illness.) ...
... Most of these methods use feature engineering for depression detection. However, the adopted features, including LIWC, n-grams, and sentiment analysis (Chau et al., 2020; Guntuku et al., 2017), do not specify clinical depression assessment, therefore deteriorating depression detection performance. With the development of deep learning, an increasing number of researchers turn to end-to-end sequence models for depression detection using raw social media posts (Figuerêdo et al., 2022; Khan et al., 2021; Kim et al., 2020; Lin et al., 2020; Liu et al., 2022; Malviya et al., 2021). ...
... Especially, depressed patients are motivated to share their symptoms, major life events, and treatments for offering or seeking support and fighting the stigma of mental illness, examples of which are reported in Table 1. Research on social media for depression analyses mainly focuses on two categories: (1) the correlations between the use of social media sites and mental illnesses (Aalbers et al., 2019; Keles et al., 2020), and (2) using social media data for mental disorder detection (Guntuku et al., 2017). Our work focuses on the latter. ...
Preprint
Full-text available
Depression is a common disease worldwide. It is difficult to diagnose and continues to be underdiagnosed. Because depressed patients constantly share their symptoms, major life events, and treatments on social media, researchers are turning to user-generated digital traces on social media for depression detection. Such methods have distinct advantages in combating depression because they can facilitate innovative approaches to fight depression and alleviate its social and economic burden. However, most existing studies lack effective means to incorporate established medical domain knowledge in depression detection or suffer from feature extraction difficulties that impede greater performance. Following the design science research paradigm, we propose a Deep Knowledge-aware Depression Detection (DKDD) framework to accurately detect social media users at risk of depression and explain the critical factors that contribute to such detection. Extensive empirical studies with real-world data demonstrate that, by incorporating domain knowledge, our method outperforms existing state-of-the-art methods. Our work has significant implications for IS research in knowledge-aware machine learning, digital traces utilization, and NLP research in IS. Practically, by providing early detection and explaining the critical factors, DKDD can supplement clinical depression screening and enable large-scale evaluations of a population's mental health status.
... In the past decade, there has been a plethora of research in natural language processing (NLP) and computational social science on detecting mental health issues via online text data such as social media (e.g., [18,24,25,30,36]). However, most of these studies have focused on building domain-specific machine learning (ML) models (i.e., one model for one particular task, such as stress detection [35, 58], depression prediction [30, 79, 90], or suicide risk assessment [20,27]). ...
... Online platforms, especially social media platforms, have been acknowledged as a promising lens that is capable of revealing insights into the psychological states, health, and well-being of both individuals and populations [14,22,25,36,63]. In the past decade, there has been extensive research about leveraging content analysis and social interaction patterns to identify and predict risks associated with mental health issues, such as anxiety [6,72,77], major depressive disorder [24,26,61,83], suicide ideation [13,21,27,71,80], and others [17,19,53]. ...
Preprint
Full-text available
The recent technology boost of large language models (LLMs) has empowered a variety of applications. However, there is very little research on understanding and improving LLMs' capability for the mental health domain. In this work, we present the first comprehensive evaluation of multiple LLMs, including Alpaca, Alpaca-LoRA, and GPT-3.5, on various mental health prediction tasks via online text data. We conduct a wide range of experiments, covering zero-shot prompting, few-shot prompting, and instruction finetuning. The results indicate the promising yet limited performance of LLMs with zero-shot and few-shot prompt designs for mental health tasks. More importantly, our experiments show that instruction finetuning can significantly boost the performance of LLMs for all tasks simultaneously. Our best-finetuned model, Mental-Alpaca, outperforms GPT-3.5 (25 times bigger) by 16.7% on balanced accuracy and performs on par with the state-of-the-art task-specific model. We summarize our findings into a set of action guidelines for future researchers, engineers, and practitioners on how to empower LLMs with better mental health domain knowledge and become an expert in mental health prediction tasks.
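The abstract above reports results in balanced accuracy. As a minimal sketch of that metric (not the authors' evaluation code), balanced accuracy can be computed as the unweighted mean of per-class recall; the function name and the toy labels below are illustrative assumptions.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (assumed definition; labels are hypothetical)."""
    correct = defaultdict(int)  # correctly predicted items per true class
    total = defaultdict(int)    # number of items per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Average recall over the classes that occur in y_true.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy, made-up example: a model that always predicts the majority class.
y_true = [1, 1, 1, 1, 0]
y_pred = [1, 1, 1, 1, 1]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

On imbalanced labels, plain accuracy can look high while balanced accuracy stays near chance, which is one plausible reason to prefer it for mental health prediction tasks.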
... The same study has also shown that this FOMO (Fear of missing out) is directly related to the sensitivity of the users to social media addictions. This further increases their need to consume more content from social media, searching for inspiration or anything to boost their self-confidence [11]. Instead, they lose self-esteem, leading to more degradation of mental health. ...
... The metrics used to measure the performance of our models are Precision, Recall, and F-scores [11] and are represented in equations (1-3). The results of these metrics are represented in Table 1. ...
Article
Full-text available
People’s mental conditions are often reflected in their social media activity due to the internet's anonymity. Psychiatric issues are often detected through such activities and can be addressed in their early stages, potentially preventing the consequences of unattended mental disorders like depression and anxiety. In this paper, the authors have implemented machine learning models and used various embedding techniques to classify posts from the famous social media blog site Reddit as stressful and non-stressful. The dataset used contains user posts that can be analyzed to detect patterns in the social media activity of those diagnosed with mental disorders. This paper uses different NLP (Natural Language Processing) tools such as ELMo (Embeddings from Language Models) word embeddings, BERT (Bidirectional Encoder Representations from Transformers) tokenizers, and BoW (Bag of Words) approach to create word/sentence data that can be fed to machine learning models. The results of each method have been discussed. The results achieved a top F1 score of 0.76, a Precision score of 0.71, and a Recall of 0.74 using only the preprocessed texts and machine learning algorithms to classify the posts. The results achieved by this paper are significant and have the potential to be applied in real-world scenarios to analyze mental stress among social media users. Although this paper focuses on data from Reddit, the techniques used can be transferred to similar social media platforms and could help solve the growing mental health crisis.
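The precision, recall, and F1 scores mentioned above can be sketched from raw predictions as follows. This is an illustrative sketch only: the label names ("stress", "none") and the toy data are assumptions, not taken from the paper, and the paper's own averaging scheme is not stated here.

```python
def precision_recall_f1(y_true, y_pred, positive="stress"):
    """Binary precision, recall, and F1 for one positive class.

    The label name "stress" is a hypothetical placeholder.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up labels for six posts.
y_true = ["stress", "stress", "stress", "none", "none", "none"]
y_pred = ["stress", "stress", "none", "stress", "none", "none"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```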
... A lot of work within the field has been focused on developing computational models that can predict mental health outcomes from social media data. For example, it has been shown that it is possible to identify individuals who are at risk for depression or anxiety by analyzing their social media posts (Eichstaedt et al., 2018; Guntuku et al., 2017; Seabrook et al., 2016). Although social media studies have shown great promise, assessment of the present emotions requires current and relevant social media data, which is not always accessible. ...
Preprint
Full-text available
Psychological constructs are commonly quantified with closed-ended rating scales; however, recent advances in natural language processing (NLP) allow for quantification of open-ended language responses with unprecedented accuracy. We demonstrate that a specific open-ended question analyzed by NLP shows higher accuracy in categorizing emotional states than traditional rating scales. One group of participants (N = 297) was asked to generate narratives related to four emotions: depression, anxiety, satisfaction, or harmony. The second group of participants (N = 434) read the narratives produced by the first group. Both groups summarized the narratives in five descriptive words and rated the narratives on four rating scales commonly used to measure these emotions. The descriptive words were quantified by NLP methods, and machine learning was used to categorize the responses into the corresponding emotional categories. The results showed a substantially higher number of accurate categorizations of the narratives based on descriptive words (64%) than on rating scales (44%), indicating that semantic measures have significantly higher predictive accuracy than the corresponding four rating scales. These findings are important, as they contradict the commonly held view that rating scales have higher accuracy in quantifying mental states than language-based measures.
... According to the experts, sadness and other mental diseases can be detected in various online contexts. Technological advancements in natural language processing and machine learning approaches are beneficial [27]. In this study, the author uses four cutting-edge machine learning classifiers (Naive Bayes, J48, BF Tree, and One R) to optimize sentiment analysis. ...
... This implies that diversity of opinions and disagreement is "by design" less likely to be found in the same social media endeavour, creating platform islands, group think and isolation effects. This combination of homophily and lack of content variety has proved to degrade the quality, balance and safety (Golbeck et al., 2017; Guntuku et al., 2017) of online discourse, up to undermining social tolerance (Mutz, 2002). ...
... Such personalized interventions can increase the effectiveness of mental health treatments by tailoring them to each individual's unique needs and circumstances. In the broader context of mental well-being, AI applications extend into three main categories [3]: (1) personal sensing or digital phenotyping, which involves the use of mobile devices and wearables to collect and analyze data on individuals' behavior, physiological responses, and environmental factors, allowing for continuous monitoring and assessment of mental health conditions [4]; (2) natural language processing (NLP) of clinical texts and social media content [5]; (3) chatbots and conversational agents that can engage in interactive conversations with users, providing emotional support, psychoeducation, symptom tracking, and even therapeutic interventions [6]. ...
Article
Full-text available
Bullying and cyberbullying are harmful social phenomena that involve the intentional, repeated use of power to intimidate or harm others. The ramifications of these actions are felt not just at the individual level but also pervasively throughout society, necessitating immediate attention and practical solutions. The BullyBuster project pioneers a multi-disciplinary approach, integrating artificial intelligence (AI) techniques with psychological models to comprehensively understand and combat these issues. In particular, employing AI in the project allows the automatic identification of potentially harmful content by analyzing linguistic patterns and behaviors in various data sources, including photos and videos. This timely detection enables alerts to relevant authorities or moderators, allowing for rapid interventions and potential harm mitigation. This paper, a culmination of previous research and advancements, details the potential for significantly enhancing cyberbullying detection and prevention by focusing on the system’s design and the novel application of AI classifiers within an integrated framework. Our primary aim is to evaluate the feasibility and applicability of such a framework in a real-world application context. The proposed approach is shown to tackle the pervasive issue of cyberbullying effectively.
... On the other hand, a layered and hierarchical model for the transformation of raw sensor data into indicators of behaviors and states related to mental health is provided by Falciani et al. in their assessment of sensing research on mental health [29]. Reference [30] provides a review of the study that uses screening questionnaires, public sharing on Twitter, and participation in an online forum to predict mental illness. They conclude that passive activity on social media can be monitored to identify sad or at-risk persons using automated detection approaches. ...
Article
Full-text available
Modern technology frequently uses wearable sensors to monitor many aspects of human behavior. Since continuous records of heart rate and activity levels are typically gathered, the data generated by these devices have a lot of promise beyond counting the number of daily steps or calories expended. Due to the patient’s inability to obtain the necessary information to understand their conditions and detect illness, such as depression, objectively, methods for evaluating various mental disorders, such as the Montgomery–Asberg depression rating scale (MADRS) and observations, currently require a significant amount of effort on the part of specialists. In this study, a novel dataset was provided, comprising sensor data gathered from depressed patients. The dataset included 32 healthy controls and 23 unipolar and bipolar depressive patients with motor activity recordings. Along with the sensor data collected over several days of continuous measurement for each patient, some demographic information was also offered. The result of the experiment showed that less than 70 of the 100 epochs of the model’s training were completed. The Cohen Kappa score did not even pass 0.1 in the validation set, due to an imbalance in the class distribution, whereas in the second experiment, the majority of scores peaked in about 20 epochs, but because training continued during each epoch, it took much longer for the loss to decline before it fell below 0.1. In the second experiment, the model soon reached an accuracy of 0.991, which is as expected given the outcome of the UMAP dimensionality reduction. In the last experiment, UMAP and neural networks worked together to produce the best outcomes. They used a variety of machine learning classification algorithms, including the nearest neighbors, linear kernel SVM, Gaussian process, and random forest. 
This paper used the UMAP unsupervised machine learning dimensionality reduction without the neural network and showed a slightly lower score (QDA). By considering the ratings of the patient’s depressive symptoms that were completed by medical specialists, it is possible to better understand the relationship between depression and motor activity.
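The abstract above attributes a Cohen's kappa below 0.1 to class imbalance. A minimal sketch of the statistic shows how this can happen; the class sizes (32 controls, 23 patients) come from the abstract, while the always-"control" predictions and the label names are hypothetical, chosen only to illustrate the effect.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # kappa is undefined here; returning 0.0 is this sketch's choice
    return (observed - expected) / (1 - expected)


# Class sizes from the abstract; predictions are made up for illustration.
y_true = ["control"] * 32 + ["depressed"] * 23
y_pred = ["control"] * 55  # a degenerate model that always predicts the majority
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
print(round(accuracy, 3), round(kappa, 3))
```

A majority-class predictor gets more than half of these cases right, yet its kappa is essentially zero because that agreement is what chance alone would produce.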
... They used linguistics of the Twitter users and tried to detect depression, bi-polar disorder, post-traumatic disorder, and seasonal affective disorders. Guntuku et al. [44] reviewed the works that used social media content for identifying mental illness. These studies predicted depression among social media users and determined the risk of other users falling into depression. ...
Article
Full-text available
The novel coronavirus disease (COVID-19) pandemic is having widespread consequences for mental health because of reduced interaction among people, economic collapse, negativity, fear of losing jobs, and the death of near and dear ones. People often use social media as one of the preferred means of expressing their mental state. Due to reduced outdoor activities, people are spending more time on social media than usual and expressing their emotions of anxiety, fear, and depression. On a daily basis, about 2.5 quintillion bytes of data are generated on social media. Analyzing this big data can become an excellent means to evaluate the effect of COVID-19 on mental health. In this work, we have analyzed data from the Twitter microblog (tweets) to find out the effect of COVID-19 on people's mental health with a special focus on depression. We propose a novel pipeline, based on a recurrent neural network (in the form of long short-term memory or LSTM) and a convolutional neural network, capable of identifying depressive tweets with an accuracy of 99.42%. The tweets were preprocessed using various natural language processing techniques, with the aim of identifying depressive emotion in them. Analyzing over 571 thousand tweets posted between October 2019 and May 2020 by 482 users, a significant rise in depressive tweets was observed between February and May of 2020, which indicates an impact of the long ongoing COVID-19 pandemic situation.
... This reminds us of the high demand for an early identification depression detection tool. Since humans are social creatures, cases of human depression can be identified from recent behavioral changes which is more effective than traditional clinical depression detection [37]. In this study, we suggest a novel multi-class classification approach to linguistic text-based depression severity detection, which can be used as an early depression detection screening tool. ...
Conference Paper
Depression-driven suicide is a serious social problem. Early identification of depression is a vital necessity for the well-being of society. Clinical diagnosis of depression takes a significant amount of time and requires highly skilled medical staff, which greatly limits its accessibility. Social media analysis for depression detection is a rapidly growing research area. However, most of the available methods can only detect the presence or absence of depression, not the severity of depression. A few recently developed models for depression severity detection are not validated on large datasets. In this study, we propose a novel method based on a confidence vector for detecting the severity of depression. We evaluated our method using a large dataset consisting of more than 40,000 annotated statements extracted from multiple social network services. Preliminary results showed that our models outperformed the existing state-of-the-art models with a micro-averaged F1 score of 66% (an improvement of 5%) in human depression detection.
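The micro-averaged F1 score reported above pools true positives, false positives, and false negatives over all classes before computing F1. The sketch below illustrates that pooling under assumptions: the severity labels and predictions are made up, and this is not the authors' evaluation code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical severity labels for five statements.
y_true = ["none", "mild", "moderate", "severe", "mild"]
y_pred = ["none", "mild", "mild", "severe", "none"]
score = micro_f1(y_true, y_pred)
print(score)
```

When every item carries exactly one label, each error counts once as a false positive and once as a false negative, so micro-averaged F1 coincides with plain accuracy.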
... Given the popularity of social media and the opportunities for expression for users, some researchers have begun developing methods of detecting mental health issues such as depression using DSM criteria and machine learning (Burdisso et al., 2019; Guntuku et al., 2017; Shen et al., 2017). For instance, Shen et al. (2017) developed a learning model for Twitter and found that users who were depressed expressed more negative emotions and posted more tweets during the late-night hours compared to non-depressed users. ...
Article
Full-text available
Depression is one of the most common mental health concerns in the USA. Critical to the treatment of depression is the identification of depressive symptoms by individuals and the professionals from whom individuals seek treatment. Symptom identification is made challenging by the diversity of depression symptoms experienced by those who struggle with the disease. The purpose of this study was to examine manifestations of depression as presented in a naturalistic textual forum, Twitter. By using the hashtag “#depressionsucks,” the authors examined the things that posters tweeted about that were relevant to experiences of depression in a sample of 169 unique tweets collected over a 4-week period using nCapture. The results of this study demonstrate the nuanced lived experience of Twitter users who experience depression and their public discussion of their depressive symptoms and experiences. The symptoms ranged from acute depression to a mindset of wanting to get better and to support others. These results show the wide range of manifestations of depression and give further insight into how social media can be used to understand lived experiences of those struggling with mental health.
... Studies have explored the use of NLP techniques to model basic ToM skills. For example, in detecting mental states and emotions (Tausczik and Pennebaker, 2010; Guntuku et al., 2017; Gordon and Hobbs, 2017; Rashkin et al., 2018a,b; Shapira et al., 2021) or by generating a humorous response when the interlocutor is in a playful mood (Shani et al., 2022; Shapira et al., 2023a). Recent work is focused around creating datasets testing whether and to what extent models have ToM (see §3). ...
Preprint
The escalating debate on AI's capabilities warrants developing reliable metrics to assess machine "intelligence". Recently, many anecdotal examples were used to suggest that newer large language models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM); however, prior work reached conflicting conclusions regarding those abilities. We investigate the extent of LLMs' N-ToM through an extensive evaluation on 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust. We further examine the factors impacting performance on N-ToM tasks and discover that LLMs struggle with adversarial examples, indicating reliance on shallow heuristics rather than robust ToM abilities. We caution against drawing conclusions from anecdotal examples, limited benchmark testing, and using human-designed psychological tests to evaluate models.
... According to the experts, sadness and other mental diseases can be detected in various online contexts. Technological advancements in natural language processing and machine learning approaches are beneficial [27]. In this study, the author uses four cutting-edge machine learning classifiers (Naive Bayes, J48, BF Tree, and One R) to optimize sentiment analysis. ...
Article
Full-text available
Covid-19 has had a negative impact on people all over the world, affecting health, employment, mental health, education, social isolation, economic inequality, and access to healthcare and essential services. Apart from physical symptoms, it has caused considerable damage to the mental health of individuals. Among all illnesses, depression is identified as one of the most common, and it can lead to early death. People suffering from depression are at a higher risk of developing other health conditions, such as heart disease and stroke, and are also at a higher risk of suicide. The importance of early detection and intervention of depression cannot be overstated. Identifying and treating depression early can prevent the illness from becoming more severe and can also prevent the development of other health conditions. Early detection can also prevent suicide, which is a leading cause of death among people with depression. Millions of people have been affected by this disease. To study depression detection among individuals, we conducted a survey with 21 questions based on the Hamilton tool and the advice of a psychiatrist. Survey results were analysed using Python's scientific programming tools and machine learning methods such as Decision Tree, KNN, and Naive Bayes, and the techniques were then compared. The study concludes that KNN gave better results than the other techniques in terms of accuracy, while the decision tree gave better results in terms of latency in detecting a person's depression. In conclusion, a machine learning-based model is suggested to replace the conventional method of detecting sadness by asking people encouraging questions and getting regular feedback from them.
... A higher topic usage score indicates that the given person has a higher probability of using the words included in a particular topic compared with individuals with a lower score. Topic usage scores can be thought of as analogous to survey scores and have been utilised previously for the automatic assessment of a wide range of traits based on the language expressed on social media such as personality (Park et al., 2015), depression and mental illness (Eichstaedt et al., 2018; Guntuku et al., 2017). In LDA, different but related words are compiled into topics, whereas in the survey measures, different items are used to assess an overarching construct (e.g. ...
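One common way to turn LDA output into a per-person topic usage score is to weight each word's relative frequency in the person's text by the probability of the topic given that word. The sketch below assumes that formulation; the excerpt does not confirm it is the one used in the cited work, and the topic weights and tokens are made up.

```python
from collections import Counter


def topic_usage(tokens, topic_given_word):
    """Topic usage as sum over words of p(topic | word) * p(word | person).

    `topic_given_word` maps a word to an assumed p(topic | word); words
    missing from the mapping contribute nothing to the score.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(
        topic_given_word.get(word, 0.0) * count / total
        for word, count in counts.items()
    )


# Hypothetical weights for a single "positive affect" topic.
weights = {"happy": 0.5, "tired": 0.2}
tokens = ["happy", "happy", "work", "tired"]
usage = topic_usage(tokens, weights)
print(usage)
```

Under this formulation, a person who uses a topic's high-probability words more often receives a higher score, which matches the interpretation given in the excerpt above.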
Article
Full-text available
Wellbeing is predominantly measured through surveys but is increasingly measured by analysing individuals' language on social media platforms using social media text mining (SMTM). To investigate whether the structure of wellbeing is similar across both data collection methods, we compared networks derived from survey items and social media language features collected from the same participants. The dataset was split into an independent exploration (n = 1169) and a final subset (n = 1000). After estimating exploration networks, redundant survey items and language topics were eliminated. Final networks were then estimated using exploratory graph analysis (EGA). The networks of survey items and those from language topics were similar, both consisting of five wellbeing dimensions. The dimensions in the survey- and SMTM-based assessment of wellbeing showed convergent structures congruent with theories of wellbeing. Specific dimensions found in each network reflected the unique aspects of each type of data (survey and social media language). Networks derived from both language features and survey items show similar structures. Survey and SMTM methods may provide complementary methods to understand differences in human wellbeing.
... Efforts to detect mental illness and more specifically depression have increased gradually with the increase in social media usage [9], [10]. Guntuku et al. [11] indicated that tweets containing negative emotional sentiments are posted by depressed Twitter users more than by healthy users. Various studies have used different classifiers to detect depression and other mental illnesses. For clinical outcome prediction using gene expression data, Kong and Yu [8] presented a new classifier, where RF is integrated with a deep neural network, and demonstrated that the accuracy is higher compared to those of the other classification models using simulation experiments. ...
Article
Full-text available
Background: The use of social media data to predict mental health outcomes has the potential to allow for the continuous monitoring of mental health and well-being and provide timely information that can supplement traditional clinical assessments. However, it is crucial that the methodologies used to create models for this purpose are of high quality from both a mental health and machine learning perspective. Twitter has been a popular choice of social media because of the accessibility of its data, but access to big data sets is not a guarantee of robust results. Objective: This study aims to review the current methodologies used in the literature for predicting mental health outcomes from Twitter data, with a focus on the quality of the underlying mental health data and the machine learning methods used. Methods: A systematic search was performed across 6 databases, using keywords related to mental health disorders, algorithms, and social media. In total, 2759 records were screened, of which 164 (5.94%) papers were analyzed. Information about methodologies for data acquisition, preprocessing, model creation, and validation was collected, as well as information about replicability and ethical considerations. Results: The 164 studies reviewed used 119 primary data sets. There were an additional 8 data sets identified that were not described in enough detail to include, and 6.1% (10/164) of the papers did not describe their data sets at all. Of these 119 data sets, only 16 (13.4%) had access to ground truth data (ie, known characteristics) about the mental health disorders of social media users. The other 86.6% (103/119) of data sets collected data by searching keywords or phrases, which may not be representative of patterns of Twitter use for those with mental health disorders. The annotation of mental health disorders for classification labels was variable, and 57.1% (68/119) of the data sets had no ground truth or clinical input on this annotation. 
Despite being a common mental health disorder, anxiety received little attention. Conclusions: The sharing of high-quality ground truth data sets is crucial for the development of trustworthy algorithms that have clinical and research utility. Further collaboration across disciplines and contexts is encouraged to better understand what types of predictions will be useful in supporting the management and identification of mental health disorders. A series of recommendations for researchers in this field and for the wider research community are made, with the aim of enhancing the quality and utility of future outputs.
... However, as depressed people may stop posting on social media, testing for continuing monitoring apps should also be carried out using other uninterrupted data sources, such as smartphone and sensor data. It is also necessary to conduct studies that combine social media data with clinical interviews and other screening techniques in ecologically valid samples to evaluate the incremental value of social media-based screening and differentiating between mental health conditions [61]. ...
Article
Full-text available
Mental illness has recently become a global health issue, causing significant suffering in people’s lives and having a negative impact on productivity. In this study, we analyzed the generalization capacity of machine learning to classify various mental illnesses across multiple social media platforms (Twitter and Reddit). Language samples were gathered from Reddit and Twitter postings in discussion forums devoted to various forms of mental illness (anxiety, autism, schizophrenia, depression, bipolar disorder, and BPD). Following this process, information from 606,208 posts (Reddit) created by a total of 248,537 people and from 23,102,773 tweets was used for the analysis. We initially trained and tested machine learning models (CNN and Word2vec) using labeled Twitter datasets, and then we utilized the dataset from Reddit to assess the effectiveness of our trained models and vice versa. According to the experimental findings, the suggested method successfully classified mental illness in social media texts even when training datasets did not include keywords or when unrelated datasets were utilized for testing.
... This phenomenon provides an opportunity for psychologists to obtain additional data through social media Twitter [2]. Social media automated analysis has the potential to provide a method for the early detection [8]. ...
Article
Full-text available
Mental illness, including depression, is not a mild condition that only some mentally weak people experience. Technology is developing rapidly, especially communication technology through social media. Twitter is a very popular social media platform today. Users can quickly and easily communicate the feelings they are experiencing through tweets, which allows us to find information ranging from emotional feelings to the level of user depression. Automated analysis of social media has the potential to provide a method for early detection. This study aims to predict early signs of depression using data from Twitter. The method used in this research is classification by analyzing social media sentiment using the Hierarchical Attention Network. Classification using the Hierarchical Attention Network method was chosen because the method showed outstanding results for classifying texts in previous studies. The best classification model in this study, obtained by applying the Hierarchical Attention Network, achieved an accuracy of 74%.
... In the field of mental health, ML techniques are now being employed to anticipate the probabilities of mental diseases and, as a result, to execute prospective therapy results. The literature that analyzes the characteristics of depression is increasing [28], [29], [30], [31]. To identify depression, trained machines utilize voice recordings, video interviews, text replies, and so on, in the modern tendency [32]. ...
Article
Full-text available
Users can interact with one another through social networks (SNs) by exchanging information, delivering comments, finding new information, and engaging in discussions that result in the production of vast volumes of data daily. These data, available in various forms, such as images, text, and videos, may be interpreted to reflect the user’s activities, including their mental state regarding depression. For example, depression is a chronic disease from which the vast majority of users suffer, and it has emerged as a significant issue relating to mental health on a global scale. However, because these data are scant, unfinished, and sometimes given inaccurately, it is challenging to make an accurate automated diagnosis from them. Even though several procedures have been utilized over the past few decades to diagnose depression, machine learning (ML) and deep learning (DL) techniques supply superior insights. Thus, in this study, we review several state-of-the-art ML and DL techniques in terms of the systematic literature review (SLR) approach for depression detection. We also highlight some critical challenges from the existing literature that may help to explore for future study. Finally, we believe this survey will help readers and researchers in ML and DL to understand critical solutions in diagnosing depression.
... Ethics and privacy are ongoing concerns and might arise with analyzing social media data, particularly when the data is considered sensitive. Very few users recognize that their mental health information could be extracted from their online activities [11]. It is crucial to ensure that participants clearly understand the nature of their participation and the types of social media data that would be gathered [77]. ...
Preprint
Full-text available
Objective: Social media has become a safe space for discussing sensitive topics such as mental disorders. Depression dominates mental disorders globally, and accordingly, depression detection on social media has witnessed significant research advances. This study aims to review the current state-of-the-art research methods and propose a multidimensional framework to describe the current body of literature relating to detecting depression on social media. Method: A study methodology involved selecting papers published between 2011 and 2022 that focused on detecting depression on social media. Three digital libraries were used to find relevant papers: Google Scholar, ACM digital library, and ResearchGate. In selecting literature, two fundamental elements were considered: identifying papers focusing on depression detection and including papers involving social media use. Results: In total, 46 papers were reviewed. Multiple dimensions were analyzed, including input features, social media platforms, disorder and symptomatology, ground truth, and machine learning. Various types of input features were employed for depression detection, including textual, visual, behavioral, temporal, demographic, and spatial features. Among them, visual and spatial features have not been systematically reviewed to support mental health researchers in depression detection. Despite depression's fine-grained disorders, most studies focus on general depression. Conclusion: Recent studies have shown that social media data can be leveraged to identify depressive symptoms. Nevertheless, further research is needed to address issues like depression validation, generalizability, causes identification, and privacy and ethical considerations. An interdisciplinary collaboration between mental health professionals and computer scientists may help detect depression on social media more effectively.
... Linguistic contents that users post on social media have been proved to be the basis for evaluating a person's mental state [2,3]. However, the majority of research targets are civilians. ...
Conference Paper
Compatriots revere soldiers for their willingness to put their life on the line for serving the nation. They deserve to be admired, given their power to endure the hardships of battlegrounds. Experts say that war veterans often show signs of traumatic disorders. It can be seen in the messages expressed on social media by these individuals. In this research, we leveraged tweets to gain insights into the sentiments of war veterans. We examined the elements of the tweets to study the lexical and non-lexical features. We counted adjectives with top 100 frequencies in soldiers and civilians’ corpora and used EmoLex to perform emotion analysis on 10 emotions. Our objective was to compare tweets of war survivors with the civilians. We performed this comparison to find cues in the tweets of war veterans that indicate psychological distress.
... A large number of studies in recent years have shown that social media data can be used to better understand, identify, and describe mental disorders [23,24] (eg, data from Facebook, Twitter, Instagram, Sina Weibo platforms). Individuals with mental disorders show changes in language and behavior, such as greater negative emotions and heightened self-attentive focus [25][26][27][28]. There is a high degree of similarity between patients with different forms of mental distress. ...
Article
Full-text available
Background Anxiety disorder has become a major clinical and public health problem, causing a significant economic burden worldwide. Public attitudes toward anxiety can impact the psychological state, help-seeking behavior, and social activities of people with anxiety disorder. Objective The purpose of this study was to explore public attitudes toward anxiety disorders and the changing trends of these attitudes by analyzing the posts related to anxiety disorders on Sina Weibo, a Chinese social media platform that has about 582 million users, as well as the psycholinguistic and topical features in the text content of the posts. Methods From April 2018 to March 2022, 325,807 Sina Weibo posts with the keyword “anxiety disorder” were collected and analyzed. First, we analyzed the changing trends in the number and total length of posts every month. Second, a Chinese Linguistic Psychological Text Analysis System (TextMind) was used to analyze the changing trends in the language features of the posts, in which 20 linguistic features were selected and presented. Third, a topic model (biterm topic model) was used for semantic content analysis to identify specific themes in Weibo users’ attitudes toward anxiety. Results The changing trends in the number and the total length of posts indicated that anxiety-related posts significantly increased from April 2018 to March 2022 (R2=0.6512; P
... They are also likely to advance the field of psychiatric therapeutics by supporting modifications to clinical guidelines or the design of randomized controlled trials [22]. A larger body of evidence on this matter could also help identify patients to be targeted for more thorough mental health assessments and provided with further resources, support, and treatment [23]. ...
Article
Full-text available
Background: Major depressive disorder is a common mental disorder affecting 5% of adults worldwide. Early contact with health care services is critical for achieving accurate diagnosis and improving patient outcomes. Key symptoms of major depressive disorder (depression hereafter) such as cognitive distortions are observed in verbal communication, which can also manifest in the structure of written language. Thus, the automatic analysis of text outputs may provide opportunities for early intervention in settings where written communication is rich and regular, such as social media and web-based forums. Objective: The objective of this study was 2-fold. We sought to gauge the effectiveness of different machine learning approaches to identify users of the mass web-based forum Reddit, who eventually disclose a diagnosis of depression. We then aimed to determine whether the time between a forum post and a depression diagnosis date was a relevant factor in performing this detection. Methods: A total of 2 Reddit data sets containing posts belonging to users with and without a history of depression diagnosis were obtained. The intersection of these data sets provided users with an estimated date of depression diagnosis. This derived data set was used as an input for several machine learning classifiers, including transformer-based language models (LMs). Results: Bidirectional Encoder Representations from Transformers (BERT) and MentalBERT transformer-based LMs proved the most effective in distinguishing forum users with a known depression diagnosis from those without. They each obtained a mean F1-score of 0.64 across the experimental setups used for binary classification. The results also suggested that the final 12 to 16 weeks (about 3-4 months) of posts before a depressed user's estimated diagnosis date are the most indicative of their illness, with data before that period not helping the models detect more accurately. 
Furthermore, in the 4- to 8-week period before the user's estimated diagnosis date, their posts exhibited more negative sentiment than any other 4-week period in their post history. Conclusions: Transformer-based LMs may be used on data from web-based social media forums to identify users at risk for psychiatric conditions such as depression. Language features picked up by these classifiers might predate depression onset by weeks to months, enabling proactive mental health care interventions to support those at risk for this condition.
... This has favoured early explorations of Twitter language in relation to a broad range of psychosocial traits, and one study found it was possible to predict Twitter users' Big Five personality traits from their Twitter account (Golbeck et al., 2011). Other researchers have examined Twitter use and how it relates to depression and mental health detection (Guntuku et al., 2017), post-traumatic stress disorder (Coppersmith et al., 2014), schizophrenia (Mitchell et al., 2015), and the impact of strict COVID-19 lockdowns in Wuhan and Lombardy (Su et al., 2020). In one study from the USA, the language used on Twitter with certain psychological characteristics (e.g., anger, negative emotion language, disengagement) was associated with heart disease mortality risk, and their counterparts (engagement, positive emotion language) were protective at a county-wide level (Eichstaedt et al., 2015). ...
Article
Full-text available
Objectives This study aimed to investigate the linguistic markers of an interest in mindfulness. Specifically, it examined whether individuals who follow mindfulness experts on Twitter use different language in their tweets compared to a random sample of Twitter users. This is a first step which may complement commonly used self-report measures of mindfulness with quantifiable behavioural metrics. Method A linguistic analysis examined the association between an interest in mindfulness and linguistic markers in 1.87 million Twitter entries across 19,732 users from two groups, (1) a mindfulness interest group (n = 10,347) comprising followers of five mindfulness experts and (2) a control group (n = 9385) of a random selection of Twitter users. Text analysis software (Linguistic Inquiry and Word Count) was used to analyse linguistic markers associated with the categories and subcategories of mindfulness, affective processes, social orientation, and “being” mode of mind. Results Analyses revealed an association between an interest in mindfulness and lexical choice. Specifically, tweets from the mindfulness interest group contained a significantly higher frequency of markers associated with mindfulness, positive emotion, happiness, and social orientation, and a significantly lower frequency of markers associated with negative emotion, past focus, present focus, future focus, family orientation, and friend orientation. Conclusions Results from this study suggest that an interest in mindfulness is associated with more frequent use of certain language markers on Twitter. The analysis opens possible pathways towards developing more naturalistic methods of understanding and assessing mindfulness which may complement self-reporting methods.
Chapter
Depression-driven suicide is a serious social problem. Early identification of depression is vital for the well-being of society. Clinical diagnosis of depression takes a significant amount of time and requires highly skilled medical staff, which greatly limits its accessibility. Social media analysis for depression detection is therefore a rapidly growing research area. However, most of the available methods can only detect the presence or absence of depression, not the severity of depression. On the other hand, a few recently developed models for depression severity detection have not been validated on large datasets due to fundamental issues such as data sparsity. In this study, we proposed a novel method based on confidence vectors for detecting the severity of depression. We evaluated our method using a large dataset consisting of more than 40,000 annotated statements extracted from multiple social network services. To our knowledge, this is the largest and most well-balanced dataset for depression severity classification to date. Preliminary results showed that our models outperformed the existing state-of-the-art models by 5%, achieving a micro-averaged F1 score of 66% for human depression severity detection. Keywords: Depression Severity Detection; Multi-class Classification; Natural Language Processing; Affective Computing
Article
Full-text available
We live in an age where the use of smart devices and the Internet is redefining our community standards. Additionally, the Covid-19 pandemic forced the community to use applications on smart devices for various activities. Currently, many organizations are developing applications that are accessible through various platforms, including Windows Phone Store, Apple App Store, and Google Play. To serve its customers, the banking sector also provides mobile applications for various online services. Mobile banking applications (mbanking apps) have considerably improved the efficiency of banks and people's living standards. People can easily download applications from app stores and are permitted to leave reviews or comments on the mobile application. Sentiment analysis allows us to examine user opinion in order to improve online services. Therefore, it is of prime importance for any organization to explore and evaluate the weaknesses affecting the delivery of its online services. In this work, sentiment analysis is performed to evaluate ten (10) mbanking apps of Pakistan using the valence aware dictionary for sentiment reasoning and machine learning (ML) based approaches. The performance of three supervised ML classifiers (multinomial Naïve Bayes, logistic regression, and support vector machine) and an ensemble model is compared. Moreover, a thematic analysis of reviews is performed using the Top2Vec model to discover the factors, as themes, that affect the effectiveness of the mbanking apps. The results indicate that the ensemble model is the best-performing model, with an F1-score of 90%. The thematic analysis uncovers 346 positive themes such as ease of use, helpfulness, reliability, user friendliness, good aesthetics, convenience, security, and many more, whereas 441 negative themes comprise performance issues, poor updates or new versions of apps, account registration issues, app crash problems, etc.
Article
Background: Researchers are increasingly interested in better methods for assessing the pace of aging in older adults, including vocal analysis. The present study sought to determine whether paralinguistic vocal attributes improve estimates of the age and risk of mortality in older adults. Methods: To measure vocal age, we curated interviews provided by male US World War II Veterans in the Library of Congress collection. We used diarization to identify speakers and measure vocal features and matched recording data to mortality information. Veterans (N=2,447) were randomly split into testing (n=1,467) and validation (n=980) subsets to generate estimations of vocal age and years of life remaining. Results were replicated to examine out-of-sample utility using Korean War Veterans (N=352). Results: WWII Veterans' average age was 86.08 at the time of recording and 91.28 at the time of death. Overall, 7.4% were prisoners of war, 43.3% were Army Veterans, and 29.3% were drafted. Vocal-age estimates (mean absolute error=3.255) were within five years of chronological age, 78.5% of the time. With chronological age held constant, older vocal-age estimation was correlated with shorter life expectancy (aHR = 1.10, 95% C.I.=[1.06-1.15], P<0.001), even when adjusting for age at vocal assessment. Conclusions: Computational analyses reduced estimation error by 71.94% (approximately eight years) and produced vocal age estimates that were correlated with both age and predicted time until death when age was held constant. Paralinguistic analyses augment other assessments for individuals when oral patient histories are recorded.
Article
Full-text available
Introduction The purpose of this study was to use text-based social media content analysis from cancer-specific subreddits to evaluate depression and anxiety-loaded content. Natural language processing, automatic, and lexicon-based methods were employed to perform sentiment analysis and identify depression and anxiety-loaded content. Methods Data was collected from 187 Reddit users who had received a cancer diagnosis, were currently undergoing treatment, or had completed treatment. Participants were split according to survivorship status into short-term, transition, and long-term cancer survivors. A total of 72524 posts were analyzed across the three cancer survivor groups. Results The results showed that short-term cancer survivors had significantly more depression-loaded posts and more anxiety-loaded words than long-term survivors, with no significant differences relative to the transition period. The topic analysis showed that long-term survivors, more than other stages of survivorship, have resources to share their experiences with suicidal ideation and mental health issues while providing support to their survivor community. Discussion The results indicate that Reddit texts seem to be an indicator of when the stressor is active and mental health issues are triggered. This sets the stage for Reddit to become a platform for screening and first-hand intervention delivery. Special attention should be dedicated to short-term survivors.
Chapter
Our lifestyle is shaped by what we do, our daily habits, how we choose to do mundane activities, and mostly how we participate in living a life. As young people's lives revolve around social media, whether for education, career, or recreational purposes, it has a significant impact on their lifestyle. The research indicates that social media either helps to inculcate healthier habits or hampers existing healthy habits. For example, overuse of social media can hamper sleep routines, healthy habits, or routines set for studies. The research mentioned below offers insight into the subtle impact of social media on youth lifestyle.
Article
Full-text available
The reliance on Online Social Networks (OSN) for both formal and informal social interactions has dramatically changed the way people communicate. In this paper, a novel Social Behavioral Biometric (SBB), human micro-expression, is introduced for person identification. An emotion detection model is developed to extract emotion probability scores from person’s writing samples posted on Twitter. The corresponding emotion-progression features are extracted using an original technique that turns users’ microblogs into emotion-progression signals. Finally, a novel social behavioral biometric system that leverages rank-level weighted majority voting to achieve an accurate person identification is implemented. The proposed system is validated on a proprietary benchmark dataset consisting of 250 Twitter users. The experimental results convincingly demonstrate that the proposed social behavioral biometric, human micro-expression, possesses a strong distinguishable ability and can be used for person identification. The study further reveals that the proposed social behavioral biometric outperforms all the original SBB traits.
Article
Purpose This study aims to investigate how the open discussion of infertility-related topics on public social media platforms contributes to the well-being of individuals affected by infertility. Design/methodology/approach For this study, the authors used a netnographic approach to analyze 69 YouTube videos (>21 h of raw data) produced by infertility vloggers and more than 40,000 user comments. Findings The authors identify two ways in which infertility patients benefit from public discussions of the topic on social media: through watching videos and engaging in discussions, patients satisfy their infertility-related needs (i.e. the need for information, emotional support and experience sharing); and through reaching people who are not affected by infertility, vloggers help to de-taboo the issue as well as sensitize and educate society. Practical implications To providers of tabooed services, this study’s findings emphasize the potential of incorporating social media in the consumer support strategy. Social implications This research highlights the value of the public discussion of infertility-related topics on social media platforms for consumers affected by the issue. Originality/value In this study, the public discussion of infertility-related topics through video blogs is presented as a valuable tool to enhance the well-being of individuals confronted with infertility as these vlogs satisfy related needs of the consumers and contribute to de-tabooing.
Chapter
Depression is a genuine medical condition characterized by lethargy, suicidal thoughts, trouble concentrating, and a general state of disarray. It is a “biological brain disorder” and a psychological state of mind. The World Health Organization (WHO) estimates that over 280 million people worldwide suffer from depression, regardless of their culture, caste, religion, or whereabouts. Depression affects how a person thinks, speaks, or communicates with the outside world. The key objective of this study was to identify and use those linguistic differences in Reddit posts to determine whether a person may suffer from depressive disorders. This paper proposes novel Natural Language Processing (NLP) techniques and Machine Learning approaches to train and evaluate the models. The proposed textual context-aware depression detection methodology consists of a hybrid transformer network combining Bidirectional Encoder Representations from Transformers (BERT) and Bidirectional Long Short-Term Memory (Bi-LSTM), with a Multi Layered Perceptron (MLP) attached at the end to classify depression-indicative texts. It achieves an accuracy of 0.9548, precision of 0.9706, recall of 0.9745, and F1 score of 0.9725. Keywords: BERT; Bi-LSTM; Depression; MLP; Transformers
Article
Background: Relatively little is known about how communication changes as a function of depression severity and interpersonal closeness. We examined the linguistic features of outgoing text messages among individuals with depression and their close- and non-close contacts. Methods: 419 participants were included in this 16-week-long observational study. Participants regularly completed the PHQ-8 and rated subjective closeness to their contacts. Text messages were processed to count frequencies of word usage in the LIWC 2015 libraries. A linear mixed modeling approach was used to estimate linguistic feature scores of outgoing text messages. Results: Regardless of closeness, people with higher PHQ-8 scores tended to use more differentiation words. When texting with close contacts, individuals with higher PHQ-8 scores used more first-person singular, filler, sexual, anger, and negative emotion words. When texting with non-close contacts these participants used more conjunctions, tentative, and sadness-related words and fewer first-person plural words. Conclusion: Word classes used in text messages, when combined with symptom severity and subjective social closeness data, may be indicative of underlying interpersonal processes. These data may hold promise as potential treatment targets to address interpersonal drivers of depression.
Chapter
Various social media platforms like Facebook, Twitter, etc., are powerful tools to express sentiments and emotions across the globe. Sentiment analysis and its evaluation are used to reveal the positive or negative opinions associated with an individual. In this paper, we have tried to study sentiment analysis publications based on their count and on year-wise, country-wise, university-wise, and keyword-wise progression to understand the depth of study in the field of sentiment analysis. Results show that sentiment analysis is not a new field, and authors have been contributing to this field since 2008. Collaboration among different countries and universities was also seen during our study. The maximum number of contributions was received in 2019. Further, this study shows that 200 keywords, with 149 unique and 18 repeated, are used by different authors. Authors from 65 universities, with 40 universities listed in the Times Higher Education ranking 2020, are observed, with the highest number of authors from India. Keywords: Sentiment analysis, Depression, Naïve Bayes, COVID-19
Article
Purpose: The primary objective of this study was to identify patterns in users’ naturalistic expressions on student loans on two social media platforms. The secondary objective was to examine how these patterns, sentiments, and emotions associated with student loans differ in user posts indicating mental illness. Material and Method: Data for this study were collected from Reddit and Twitter (2009–2020, n = 85,664) using certain key terms of student loans along with first-person pronouns as a triangulating measure of posts by individuals. Unsupervised and supervised machine learning models were used to analyze the text data. Results: Results suggested 50 topics in Reddit finance and 40 each in Reddit mental health communities and Twitter. Statistically significant associations were found between mental illness statuses and sentiments and emotions. Posts expressing mental illness showed more negative sentiments and were more likely to express sadness and fear. Discussion and Conclusion: Patterns in social media discussions indicate both academic and non-academic consequences of having student debt, including users’ desire to know more about their debts. Interventions should address the skill and information gaps between what is desired by the borrowers and what is offered to them in understanding and managing their debts. Cognitive burden created by student debts manifests itself on social media and can be used as an important marker to develop a nuanced understanding of people’s expressions on a variety of socioeconomic issues. Higher volumes of negative sentiments and emotions of sadness, fear, and anger warrant immediate attention of policymakers and practitioners to reduce the cognitive burden of student debts.
Article
Mental health problems are one of the various ills that afflict the world’s population. Early diagnosis and medical care are public health problems addressed from various perspectives. Among the mental illnesses that most afflict the population is depression; its early diagnosis is vitally important, as it can trigger more severe illnesses, such as suicidal ideation. Due to the lack of homogeneity in current diagnostic tools, the community has focused on using AI tools for opportune diagnosis. Unfortunately, there is a lack of data that allows the use of AI tools for the Spanish language. Our work has a cross-lingual scheme to address this issue, allowing us to identify Spanish and English texts. The experiments demonstrated the methodology’s effectiveness with an F1-score of 0.95. With this methodology, we propose a method to solve a classification problem for depression tweets (or short texts) by reusing English language databases with insufficient data to generate a classification model, such as in the Spanish language. We also validated the information obtained with public data to analyze the behavior of depression in Mexico during the COVID-19 pandemic. Our results show that the use of these methodologies can serve as support, not only in the diagnosis of depression, but also in the construction of different language databases that allow the creation of more efficient diagnostic tools.
Article
Perseverative thinking (PT), such as rumination or worry, is a transdiagnostic process implicated in the onset and maintenance of emotional disorders. Existing measures of PT are limited by demand and expectancy effects, cognitive biases, and reflexivity, leading to calls for unobtrusive, behavioral measures. In response, we developed a behavioral measure of PT based on language. A mixed sample of 188 participants with major depressive disorder, generalized anxiety disorder, or no psychopathology completed self-report PT measures. Participants were also interviewed, providing a natural language sample. We examined language features associated with PT, then built a language-based PT model and examined its predictive power. PT was associated with multiple language features, most notably I-usage (e.g., "I", "me"; β = 0.25) and negative emotion language (e.g., "anxiety", "difficult"; β = 0.19). In machine learning analyses, language features accounted for 14% of the variance in self-reported PT. Language-based PT predicted the presence and severity of depression and anxiety, psychiatric comorbidity, and treatment seeking, with effects in the r = 0.15-0.41 range. PT has face-valid linguistic correlates and our language-based measure holds promise for assessing PT unobtrusively. With further development, this measure could be used to passively detect PT for deployment of "just-in-time" interventions.
Article
Anxiety and depression negatively impact many. Studies suggest depression is associated with future time horizons, or how “far” into the future people tend to think, and anxiety is associated with temporal discounting, or how much people devalue future rewards. Separate studies from linguistics and economics have shown that how people refer to future time predicts temporal discounting. Yet no one—that we know of—has investigated whether future time reference habits are a marker of anxiety and/or depression. We introduce the FTR classifier, a novel classification system researchers can use to analyse linguistic temporal reference. In Study 1, we used the FTR classifier to analyse data from the social-media website Reddit. Users who had previously posted popular contributions to forums about anxiety and depression referenced the future and past more often than controls, had more proximal future and past time horizons, and significantly differed in their linguistic future time reference patterns: They used fewer future tense constructions (e.g. will), fewer high-certainty constructions (certainly), more low-certainty constructions (could), more bouletic modal constructions (hope), and more deontic modal constructions (must). This motivated Study 2, a survey-based mediation analysis. Self-reported anxious participants represented future events as more temporally distal and therefore temporally discounted to a greater degree. The same was not true of depression. We conclude that methods which combine big-data with experimental paradigms can help identify novel markers of mental illness, which can aid in the development of new therapies and diagnostic criteria.
Article
Machine learning based approaches for automatic disease prediction is a novel research area in healthcare informatics. Electronic Health Records in medical settings improves early-stage illness diagnosis. However, when standard rule-based approaches, like doctor's prescription or laboratory test reports are employed for disease diagnosis, the advantages of EHRs are not accomplished adequately. As a result, there is a requirement of technology based solution which helps in prediction of psychological diseases in a more efficient way. The proposed research work offers a hybrid Hopfield recurrent neural network (H2RN2) approach to predict psychological diseases by using amorphous clinical EHRs taken from Kaggle database. The proposed model automatically learns inherent semantic characteristics from available clinical data items. It uses fivefold cross validation technique within a recurrent neural network which detracts over fitting of the model. In addition to effective learning during training of the model, the hybrid approach also helps in accurate prediction of the disease with improved accuracy. The proposed model is assessed using three measuring parameters, accuracy, recall and F1-score and yields an accuracy of 97.53% in experimental evaluation, which is superior to several existing approaches for psychological disease prediction. The results demonstrate that the proposed model outperforms several other techniques in predicting the risk of psychiatric disorders. In future, the similar approach may be employed to predict gender-based psychological diseases or to anticipate the risk of various physiological diseases.
Article
Background and aims: Many people around the world, especially at the time of the Covid-19 outbreak, are concerned about their e-health data. The aim of this study was to investigate the attitudes of patients with Covid-19 toward sharing their health data for research and their concerns about security and privacy. Methods: This survey is a cross-sectional study conducted through an electronic researcher-made questionnaire from February to May 2021. Convenience sampling was applied to select the participants, and all 475 patients who were referred to two hospitals, Afzalipour and Shahid Bahonar, were invited to the study. According to the inclusion and exclusion criteria, 204 patients were included in the study and completed the questionnaire. Descriptive statistics (frequency, mean, and standard deviation) were used to analyze the questionnaire data. SPSS 23.0 was used for data analysis. Results: Participants tended to share information about "comments provided by individuals on websites" (68.6%), "fitness tracker data" (64.19%), and "online shopping history" (63.21%) before death. Participants also tended to share information about "electronic medical records data" (36.75%), "genetic data" (24.99%), and "Instagram data" (24.99%) after death. "Fraud or misuse of personal information" (4.48 [±1.27]) was the most common concern of participants regarding the virtual world. "Unauthorized access to the account" (4.38 [±0.73]), "violation of the privacy of personal information" (4.26 [±0.85]), and "violation of the patient privacy and personal information confidentiality" (4.26 [±0.85]) were the most common of the unauthorized security incidents that occurred online for participants. Conclusion: Patients with Covid-19 were concerned about releasing information they shared on websites and social networks. Therefore, people should be made aware of the reliability of websites and social media so that their security and privacy are not affected.
Article
Motivation: Social media represent an unrivalled opportunity for epidemiological cohorts to collect large amounts of high-resolution time course data on mental health. Equally, the high-quality data held by epidemiological cohorts could greatly benefit social media research as a source of ground truth for validating digital phenotyping algorithms. However, there is currently a lack of software for doing this in a secure and acceptable manner. We worked with cohort leaders and participants to co-design an open-source, robust and expandable software framework for gathering social media data in epidemiological cohorts. Implementation: Epicosm is implemented as a Python framework that is straightforward to deploy and run inside a cohort's data safe haven. General features: The software regularly gathers Tweets from a list of accounts and stores them in a database for linking to existing cohort data. Availability: This open-source software is freely available at [https://dynamicgenetics.github.io/Epicosm/].
Conference Paper
Psychological distress in the form of depression, anxiety and other mental health challenges among college students is a growing health concern. Dearth of accurate, continuous, and multi-campus data on mental well-being presents significant challenges to intervention and mitigation efforts in college campuses. We examine the potential of social media as a new "barometer" for quantifying the mental well-being of college populations. Utilizing student-contributed data in Reddit communities of over 100 universities, we first build and evaluate a transfer learning based classification approach that can detect mental health expressions with 97% accuracy. Thereafter, we propose a robust campus-specific Mental Well-being Index: MWI. We find that MWI is able to reveal meaningful temporal patterns of mental well-being in campuses, and to assess how their expressions relate to university attributes like size, academic prestige, and student demographics. We discuss the implications of our work for improving counselor efforts, and in the design of tools that can enable better assessment of the mental health climate of college campuses.
Article
Sensors in everyday devices, such as our phones, wearables, and computers, leave a stream of digital traces. Personal sensing refers to collecting and analyzing data from sensors embedded in the context of daily life with the aim of identifying human behaviors, thoughts, feelings, and traits. This article provides a critical review of personal sensing research related to mental health, focused principally on smartphones, but also including studies of wearables, social media, and computers. We provide a layered, hierarchical model for translating raw sensor data into markers of behaviors and states related to mental health. Also discussed are research methods as well as challenges, including privacy and problems of dimensionality. Although personal sensing is still in its infancy, it holds great promise as a method for conducting mental health research and as a clinical tool for monitoring at-risk populations and providing the foundation for the next generation of mobile health (or mHealth) interventions. Expected final online publication date for the Annual Review of Clinical Psychology Volume 13 is May 7, 2017. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Article
Background: Social networking sites (SNSs) have become a pervasive part of modern culture, which may also affect mental health. Objective: The aim of this systematic review was to identify and summarize research examining depression and anxiety in the context of SNSs. It also aimed to identify studies that complement the assessment of mental illness with measures of well-being and examine moderators and mediators that add to the complexity of this environment. Methods: A multidatabase search was performed. Papers published between January 2005 and June 2016 relevant to mental illness (depression and anxiety only) were extracted and reviewed. Results: Positive interactions, social support, and social connectedness on SNSs were consistently related to lower levels of depression and anxiety, whereas negative interaction and social comparisons on SNSs were related to higher levels of depression and anxiety. SNS use related to less loneliness and greater self-esteem and life satisfaction. Findings were mixed for frequency of SNS use and number of SNS friends. Different patterns in the way individuals with depression and individuals with social anxiety engage with SNSs are beginning to emerge. Conclusions: The systematic review revealed many mixed findings between depression, anxiety, and SNS use. Methodology has predominantly focused on self-report cross-sectional approaches; future research will benefit from leveraging real-time SNS data over time. The evidence suggests that SNS use correlates with mental illness and well-being; however, whether this effect is beneficial or detrimental depends at least partly on the quality of social factors in the SNS environment. Understanding these relationships will lead to better utilization of SNSs in their potential to positively influence mental health.
Article
Language data available through social media provide opportunities to study people at an unprecedented scale. However, little guidance is available to psychologists who want to enter this area of research. Drawing on tools and techniques developed in natural language processing, we first introduce psychologists to social media language research, identifying descriptive and predictive analyses that language data allow. Second, we describe how raw language data can be accessed and quantified for inclusion in subsequent analyses, exploring personality as expressed on Facebook to illustrate. Third, we highlight challenges and issues to be considered, including accessing and processing the data, interpreting effects, and ethical issues. Social media has become a valuable part of social life, and there is much we can learn by bringing together the tools of computer science with the theories and insights of psychology.
Article
Social media has recently emerged as a premier method to disseminate information online. Through these online networks, tens of millions of individuals communicate their thoughts, personal experiences, and social ideals. We therefore explore the potential of social media to predict, even prior to onset, Major Depressive Disorder (MDD) in online personas. We employ a crowdsourced method to compile a list of Twitter users who profess to being diagnosed with depression. Using up to a year of prior social media postings, we utilize a Bag of Words approach to quantify each tweet. Lastly, we leverage several statistical classifiers to provide estimates to the risk of depression. Our work posits a new methodology for constructing our classifier by treating social as a text-classification problem, rather than a behavioral one on social media platforms. By using a corpus of 2.5M tweets, we achieved an 81% accuracy rate in classification, with a precision score of .86. We believe that this method may be helpful in developing tools that estimate the risk of an individual being depressed, can be employed by physicians, concerned individuals, and healthcare agencies to aid in diagnosis, even possibly enabling those suffering from depression to be more proactive about recovering from their mental health.
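The Bag of Words step mentioned in this abstract can be illustrated with a short Python sketch. It uses only the standard library; the tokenizer, the helper names (`tokenize`, `build_vocabulary`, `bag_of_words`), and the two example texts are assumptions made for illustration and are not taken from the paper, which does not describe its preprocessing in the abstract.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(docs):
    """Map every distinct token in the corpus to a fixed column index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocab):
    """Return a count vector over vocab; tokens outside vocab are ignored."""
    vector = [0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        index = vocab.get(tok)
        if index is not None:
            vector[index] = count
    return vector


# Made-up example texts, not data from the study.
tweets = [
    "I feel tired and tired of everything",
    "Great day with friends",
]
vocab = build_vocabulary(tweets)
vectors = [bag_of_words(t, vocab) for t in tweets]
print(vectors[0])
```

Each text becomes a fixed-length count vector that discards word order, which is what allows a standard statistical classifier to be trained on the result.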
Article
Social networking sites are a part of everyday life for over a billion people worldwide. They show no sign of declining popularity, with social media use increasing at 3 times the rate of other Internet use. Despite this proliferation, mental healthcare has yet to embrace this unprecedented resource. We argue that social networking site data should become a high priority for psychiatry research and mental healthcare delivery. We illustrate our views using the world’s largest social networking site, Facebook, which currently has over 1 billion daily users (1 in 7 people worldwide). Facebook users can create personal profiles, socialize, express feelings, and share content, which Facebook stores as time-stamped digital records dating back to when the user first joined. Evidence suggests that 92% of adolescents go online daily and disclose considerably more about themselves online than offline. Thus, working with Facebook data could further our understanding of the onset and early years of mental illness, a crucial period of interpersonal development. Furthermore, a diminishing ‘digital divide’ has allowed for a broader sociodemographic to access Facebook, including homeless youth, young veterans, immigrants, patients with mental health problems, and seniors, enabling greater contact with traditionally harder-to-reach populations.
Conference Paper
Online social media, such as Reddit, has become an important resource to share personal experiences and communicate with others. Among other personal information, some social media users communicate about mental health problems they are experiencing, with the intention of getting advice, support or empathy from other users. Here, we investigate the language of Reddit posts specific to mental health, to define linguistic characteristics that could be helpful for further applications. The latter include attempting to identify posts that need urgent attention due to their nature, e.g. when someone announces their intentions of ending their life by suicide or harming others. Our results show that there are a variety of linguistic features that are discriminative across mental health user communities and that can be further exploited in subsequent classification tasks. Furthermore, while negative sentiment is almost uniformly expressed across the entire data set, we demonstrate that there are also condition-specific vocabularies used in social media to communicate about particular disorders. Source code and related materials are available from: https://github.com/gkotsis/reddit-mental-health.
Conference Paper
In this paper, we extensively evaluate the effectiveness of using a user's social media activities for estimating degree of depression. As ground truth data, we use the results of a web-based questionnaire for measuring degree of depression of Twitter users. We extract several features from the activity histories of Twitter users. By leveraging these features, we construct models for estimating the presence of active depression. Through experiments, we show that (1) features obtained from user activities can be used to predict depression of users with an accuracy of 69%, (2) topics of tweets estimated with a topic model are useful features, (3) approximately two months of observation data are necessary for recognizing depression, and longer observation periods do not contribute to improving the accuracy of estimation for current depression; sometimes, longer periods worsen the accuracy.
Article
Amazon.com's Mechanical Turk (MTurk) website provides a data collection platform with quick and inexpensive access to diverse samples. Numerous reports have lauded MTurk as capturing high-quality data with an epidemiological sample that is more representative of the U.S. population than traditional in-person convenience samples (e.g., undergraduate subject pools). This benefit, in combination with the ease and low-cost of data collection, has led to a remarkable increase in studies using MTurk to investigate phenomena across a wide range of psychological disciplines. Multiple reports have now examined the demographic characteristics of MTurk samples. One key gap remains, however, in that relatively little is known about individual differences in clinical symptoms among MTurk participants. This paper discusses the importance of assessing clinical phenomena in MTurk samples and supports its assertions through an empirical investigation of a large sample (N = 1,098) of MTurk participants. Results revealed that MTurk participants endorse clinical symptoms to a substantially greater degree than traditional nonclinical samples. This distinction was most striking for depression and social anxiety symptoms, which were endorsed at levels comparable with individuals with clinically diagnosed mood and anxiety symptoms. Participants' symptoms of physiological anxiety, hoarding, and eating pathology fell within the subclinical range. Overall, the number of individuals exceeding validated clinical cutoffs was between 3 and 19 times the estimated 12-month prevalence rates. Based on the current findings, it is argued that MTurk participants differ from the general population in meaningful ways, and researchers should consider this when referring to this sample as truly representative. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
Article
We conduct a detailed investigation of correlations between real-time expressions of individuals made across the United States and a wide range of emotional, geographic, demographic, and health characteristics. We do so by combining (1) a massive, geo-tagged data set comprising over 80 million words generated in 2011 on the social network service Twitter and (2) annually-surveyed characteristics of all 50 states and close to 400 urban populations. Among many results, we generate taxonomies of states and cities based on their similarities in word use; estimate the happiness levels of states and cities; correlate highly-resolved demographic characteristics with happiness levels; and connect word choice and message length with urban characteristics such as education levels and obesity rates. Our results show how social media may potentially be used to estimate real-time levels and changes in population-scale measures such as obesity rates.
Article
There are international differences in the epidemiology of depression and the performance of primary care physicians but the factors underlying these national differences are uncertain. To examine the international variability in diagnostic performance of primary care physicians when diagnosing depression in primary care. A meta-analysis of unassisted clinical diagnoses against semi-structured interviews. A systematic literature search, critical appraisal, and pooled analysis were conducted and 25 international studies were identified involving 8917 individuals. A minimum of three independent studies per country were required to aid extrapolation. Clinicians in the Netherlands performed best at case finding (the ability to rule in cases of depression with minimal false positives) (AUC+ 0.735) and this was statistically significantly better than the ability of clinicians in Australia (AUC+ 0.622) and the US (AUC+ 0.653), who were the worst performers. Clinicians from Italy had intermediate case-finding abilities. Regarding screening (the ability to rule out cases of no depression with minimal false negatives) there were no strong differences. Looking at overall accuracy, primary care physicians in Italy and the Netherlands were most successful in their diagnoses and physicians from the US and Australia least successful (83.5%, 81.9%, 74.3%, and 67.0%, respectively). GPs in the UK appeared to have the lowest ability to detect depression, as a proportion of all cases of depression (45.6%; 95% CI = 27.7% to 64.2%). Several factors influenced detection accuracy including: collecting data on clinical outcomes; routinely comparing the clinical performance of staff; working in small practices; and having long waits to see a specialist. Assuming these differences are representative, there appear to be international variations in the ability of primary care physicians to diagnose depression, but little differences in screening success. These might be explained by organisational factors.
Article
A 2-stage strategy, combining an assessment of severity with depression criteria, can help a physician focus on the most severe cases without missing less severe ones that still need treatment. Because of its brevity, relatively high positive predictive value, and ability to inform the clinician on both depression severity and diagnostic criteria, the PRIME-MD Patient Health Questionnaire (PHQ-9) is the best available depression screening tool for primary care. One-time screening is cost-effective; physicians may elect to screen more often based on risk factors.
Article
Dramatic changes have occurred in mental health treatments during the past decade. Data on recent treatment patterns are needed to estimate the unmet need for services. To provide data on patterns and predictors of 12-month mental health treatment in the United States from the recently completed National Comorbidity Survey Replication. Nationally representative face-to-face household survey using a fully structured diagnostic interview, the World Health Organization's World Mental Health Survey Initiative version of the Composite International Diagnostic Interview, carried out between February 5, 2001, and April 7, 2003. A total of 9282 English-speaking respondents 18 years and older. Proportions of respondents with 12-month DSM-IV anxiety, mood, impulse control, and substance disorders who received treatment in the 12 months before the interview in any of 4 service sectors (specialty mental health, general medical, human services, and complementary and alternative medicine). Number of visits and proportion of patients who received minimally adequate treatment were also assessed. Of 12-month cases, 41.1% received some treatment in the past 12 months, including 12.3% treated by a psychiatrist, 16.0% treated by a non-psychiatrist mental health specialist, 22.8% treated by a general medical provider, 8.1% treated by a human services provider, and 6.8% treated by a complementary and alternative medical provider (treatment could be received by >1 source). Overall, cases treated in the mental health specialty sector received more visits (median, 7.4) than those treated in the general medical sector (median, 1.7). More patients in specialty than general medical treatment also received treatment that exceeded a minimal threshold of adequacy (48.3% vs 12.7%). Unmet need for treatment is greatest in traditionally underserved groups, including elderly persons, racial-ethnic minorities, those with low incomes, those without insurance, and residents of rural areas. Most people with mental disorders in the United States remain either untreated or poorly treated. Interventions are needed to enhance treatment initiation and quality.
Article
Depression, with up to 11.9% prevalence in the general population, is a common disorder strongly associated with increased morbidity. The accuracy of non-psychiatric physicians in recognizing depression may influence the outcome of the illness, as unrecognized patients are not offered treatment for depression. To describe and quantitatively summarize the existing data on recognition of depression by non-psychiatric physicians. We searched the following databases: MEDLINE (1966-2005), Psych INFO (1967-2005) and CINAHL (1982-2005). To summarize data presented in the papers reviewed, we calculated the Summary receiver operating characteristic (ROC) and the summary sensitivity, specificity and odds ratios (ORs) of recognition, and their 95% confidence intervals using the random effects model. The summary sensitivity, specificity, and OR of recognition using the random effects model were: 36.4% (95% CI: 27.9-44.8), 83.7% (95% CI: 77.5-90.0), and 4.0 (95% CI: 3.2-4.9), respectively. We also calculated the Summary ROC. We performed a metaregression analysis, which showed that the method of documentation of recognition, the age of the sample, and the date of study publication have significant effect on the summary sensitivity and the odds of recognition, in the univariate model. Only the method of documentation had a significant effect on summary sensitivity, when the age of the sample and the date of publication were added to the model. The accuracy of depression recognition by non-psychiatrist physicians is low. Further research should focus on developing standardized methods of documenting non-psychiatric physicians' recognition of depression.
Article
The utility of Twitter data as a medium to support population-level mental health monitoring is not well understood. In an effort to better understand the predictive power of supervised machine learning classifiers and the influence of feature sets for efficiently classifying depression-related tweets on a large-scale, we conducted two feature study experiments. In the first experiment, we assessed the contribution of feature groups such as lexical information (e.g., unigrams) and emotions (e.g., strongly negative) using a feature ablation study. In the second experiment, we determined the percentile of top ranked features that produced the optimal classification performance by applying a three-step feature elimination approach. In the first experiment, we observed that lexical features are critical for identifying depressive symptoms, specifically for depressed mood (-35 points) and for disturbed sleep (-43 points). In the second experiment, we observed that the optimal F1-score performance of top ranked features in percentiles variably ranged across classes e.g., fatigue or loss of energy (5th percentile, 288 features) to depressed mood (55th percentile, 3,168 features) suggesting there is no consistent count of features for predicting depressive-related tweets. We conclude that simple lexical features and reduced feature sets can produce comparable results to larger feature sets.
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
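The idea of a linear decision surface in a non-linearly mapped feature space can be illustrated with a polynomial kernel on the XOR problem, which is not linearly separable in the input space. The sketch below only evaluates a kernel decision function; it does not train anything. The function names are hypothetical, and the coefficients `alphas` and the zero bias are hand-chosen values known to separate this toy data set, not the output of the paper's algorithm.

```python
def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """Polynomial kernel K(u, v) = (u . v + coef0) ** degree."""
    return (sum(a * b for a, b in zip(u, v)) + coef0) ** degree


def decision_value(x, support_vectors, labels, alphas, bias=0.0):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + bias."""
    return sum(
        alpha * y * polynomial_kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    ) + bias


def classify(x, support_vectors, labels, alphas, bias=0.0):
    """Predicted class: the sign of the decision value."""
    return 1 if decision_value(x, support_vectors, labels, alphas, bias) >= 0 else -1


# XOR-style data: the label is +1 when the two coordinates have different signs.
points = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
labels = [-1, 1, 1, -1]
alphas = [0.125] * 4  # hand-chosen for this toy problem, not learned

values = [decision_value(p, points, labels, alphas) for p in points]
predictions = [classify(p, points, labels, alphas) for p in points]
print(values)
print(predictions)
```

With the degree-2 kernel, every training point sits exactly on the margin (decision value of +1 or -1), so all four points act as support vectors.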
Conference Paper
History of mental illness is a major factor behind suicide risk and ideation. However, research efforts toward characterizing and forecasting this risk are limited due to the paucity of information regarding suicide ideation, exacerbated by the stigma of mental illness. This paper fills gaps in the literature by developing a statistical methodology to infer which individuals could undergo transitions from mental health discourse to suicidal ideation. We utilize semi-anonymous support communities on Reddit as unobtrusive data sources to infer the likelihood of these shifts. We develop language and interactional measures for this purpose, as well as a propensity score matching based statistical approach. Our approach allows us to derive distinct markers of shifts to suicidal ideation. These markers can be modeled in a prediction framework to identify individuals likely to engage in suicidal ideation in the future. We discuss societal and ethical implications of this research.
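The abstract does not say which matching procedure was used, so the following is only one common variant of propensity score matching: greedy 1:1 nearest-neighbour matching without replacement, with a caliper. It assumes the propensity scores have already been estimated (for example with a logistic regression). The function name, the user identifiers, and the scores are hypothetical.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping a user id to a precomputed propensity score.
    A treated user stays unmatched if no remaining control user lies within
    `caliper` of their score.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control user by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.60, "c3": 0.33, "c4": 0.10}

pairs = match_on_propensity(treated, control)
print(pairs)
```

Here `t3` remains unmatched because no control user has a score within the caliper. Dropping such users trades sample size for closer comparability between the groups.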
Article
Traditional mental health studies rely on information primarily collected through personal contact with a health care professional. Recent work has shown the utility of social media data for studying depression, but there have been limited evaluations of other mental health conditions. We consider post-traumatic stress disorder (PTSD), a serious condition that affects millions worldwide, with especially high rates in military veterans. We also present a novel method to obtain a PTSD classifier for social media using simple searches of available Twitter data, a significant reduction in training data cost compared to previous work. We demonstrate its utility by examining differences in language use between PTSD and random individuals, building classifiers to separate these two groups, and by detecting elevated rates of PTSD at and around U.S. military bases using our classifiers. Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Article
This study examines depression-related chatter on Twitter to glean insight into social networking about mental health. We assessed themes of a random sample (n=2,000) of depression-related tweets (sent 4-11 to 5-4-14). Tweets were coded for expression of DSM-5 symptoms for Major Depressive Disorder (MDD). Supportive or helpful tweets about depression were the most common theme (n=787, 40%), closely followed by disclosing feelings of depression (n=625; 32%). Two-thirds of tweets revealed one or more symptoms for the diagnosis of MDD and/or communicated thoughts or ideas that were consistent with struggles with depression after accounting for tweets that mentioned depression trivially. Health professionals can use our findings to tailor and target prevention and awareness messages to those Twitter users in need.
Conference Paper
The birth of a child is a major milestone in the life of parents. We leverage Facebook data shared voluntarily by 165 new mothers as streams of evidence for characterizing their postnatal experiences. We consider multiple measures including activity, social capital, emotion, and linguistic style in participants' Facebook data in pre- and postnatal periods. Our study includes detecting and predicting onset of postpartum depression (PPD). The work complements recent work on detecting and predicting significant postpartum changes in behavior, language, and affect from Twitter data. In contrast to prior studies, we gain access to ground truth on postpartum experiences via self-reports and a common psychometric instrument used to evaluate PPD. We develop a series of statistical models to predict, from data available before childbirth, a mother's likelihood of PPD. We corroborate our quantitative findings through interviews with mothers experiencing PPD. We find that increased social isolation and lowered availability of social capital on Facebook are the best predictors of PPD in mothers.
Article
The CES-D scale is a short self-report scale designed to measure depressive symptomatology in the general population. The items of the scale are symptoms associated with depression which have been used in previously validated longer scales. The new scale was tested in household interview surveys and in psychiatric settings. It was found to have very high internal consistency and adequate test-retest repeatability. Validity was established by patterns of correlations with other self-report measures, by correlations with clinical ratings of depression, and by relationships with other variables which support its construct validity. Reliability, validity, and factor structure were similar across a wide variety of demographic characteristics in the general population samples tested. The scale should be a useful tool for epidemiologic studies of depression.
Article
A self-assessment scale has been developed and found to be a reliable instrument for detecting states of depression and anxiety in the setting of a hospital medical outpatient clinic. The anxiety and depressive subscales are also valid measures of severity of the emotional disorder. It is suggested that the introduction of the scales into general hospital practice would facilitate the large task of detection and management of emotional disorder in patients under investigation and treatment in medical and surgical departments.
Article
The aim of this study was to compare the validity of the Hospital Anxiety and Depression Scale (HADS), the WHO (five) Well Being Index (WBI-5), the Patient Health Questionnaire (PHQ), and physicians' recognition of depressive disorders, and to recommend specific cut-off points for clinical decision making. A total of 501 outpatients completed each of the three depression screening questionnaires and received the Structured Clinical Interview for DSM-IV (SCID) as the criterion standard. In addition, treating physicians were asked to give their psychiatric diagnoses. Criterion validity and Receiver Operating Characteristics (ROC) were determined. Areas under the curves (AUCs) were compared statistically. All depression scales showed excellent internal consistencies (Cronbach's alpha: 0.85-0.90). For 'major depressive disorder', the operating characteristics of the PHQ were significantly superior to both the HADS and the WBI-5. For 'any depressive disorder', the PHQ showed again the best operating characteristics but the overall difference did not reach statistical significance at the 5% level. Cut-off points that can be recommended for the screening of 'major depressive disorder' had sensitivities of 98% (PHQ), 94% (WBI-5), and 85% (HADS). Corresponding specificities were 80% (PHQ), 78% (WBI-5), and 76% (HADS). In contrast, physicians' recognition of 'major depressive disorder' was poor (sensitivity, 40%; specificity, 87%). Our sample may not be representative of medical outpatients, but sensitivity and specificity are independent of disorder prevalence. All three questionnaires performed well in depression screening, but significant differences in criterion validity existed. These results may be helpful in the selection of questionnaires and cut-off points.
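The statement that sensitivity and specificity are independent of disorder prevalence can be checked with a small calculation. The sketch below uses hypothetical counts, chosen only so that the rates equal the physician-recognition figures quoted above (sensitivity 40%, specificity 87%); they are not data from the study. The function names are illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def positive_predictive_value(tp, fp):
    """PPV = TP / (TP + FP); unlike sensitivity, it depends on prevalence."""
    return tp / (tp + fp)


# Hypothetical low-prevalence sample: 100 depressed, 1000 non-depressed patients.
low = dict(tp=40, fn=60, tn=870, fp=130)
# Same detection rates, but four times as many depressed patients.
high = dict(tp=160, fn=240, tn=870, fp=130)

sens_low, spec_low = sensitivity_specificity(**low)
sens_high, spec_high = sensitivity_specificity(**high)
ppv_low = positive_predictive_value(low["tp"], low["fp"])
ppv_high = positive_predictive_value(high["tp"], high["fp"])

print(sens_low, spec_low, round(ppv_low, 3))
print(sens_high, spec_high, round(ppv_high, 3))
```

Sensitivity and specificity are identical in both samples, while the positive predictive value rises with prevalence. This is why cut-off points are usually reported in terms of sensitivity and specificity.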
Article
The nine-item depression module from the Patient Health Questionnaire (PHQ-9) is well validated and widely used as a brief diagnostic and severity measure, but its validity as an outcome measure for depression has not yet been established. Therefore, we investigated the sensitivity to change of the PHQ-9 in three groups of patients whose depression status either improved, remained unchanged, or deteriorated over time. From three cohorts of medical outpatients, with an equal distribution of major depressive disorder, other depressive disorders, or no depressive disorder, 167 patients (82.7%) were followed up after a mean of 12.3 +/- 3.0 months. The PHQ-9 and the Structured Clinical Interview for DSM-IV (SCID) were completed at both baseline and follow-up. Depression diagnoses from the SCID were used as the criterion standard to divide patients into subgroups with (a) improved depression status, (b) unchanged depression status, and (c) deteriorated depression status. Effect sizes (ES) of PHQ-9 change scores were ES = -1.33 for the improved depression status subgroup (n = 52), ES = -0.21 for the unchanged status subgroup (n = 91), and ES = 0.47 for the deteriorated status subgroup (n = 24). PHQ-9 change scores differed significantly between the three depression outcome groups. Limitations: The PHQ-9 and the SCID were completed in person at baseline, whereas they were completed in a telephone interview at follow-up. This study demonstrates the ability of the PHQ-9 to detect depression outcome and changes over time. Data from treatment trials will help further establish the sensitivity to change of the PHQ-9 in comparison to other depression severity measures.
Reece AG, Reagan AJ, Lix KLM, Dodds PS, Danforth CM, Langer EJ: Forecasting the Onset and Course of Mental Illness with Twitter Data. 2016 arXiv:1608.07740.
Benton A, Mitchell M, Hovy D: Multi-task learning for mental health using social media text. In Proceedings of European Chapter of the Association for Computational Linguistics. 2017.
Pennebaker JW, Booth RJ, Francis ME: Linguistic Inquiry and Word Count: LIWC [Computer Software]. Austin, TX: liwc.net; 2007.
Löwe B, Spitzer RL, Gräfe K, Kroenke K, Quenter A, Zipfel S, Buchholz C, Witte S, Herzog W: Comparative validity of three screening questionnaires for DSM-IV depressive disorders and physicians' diagnoses. J Affect Diso