... The current literature leverages language semantic network analysis with ML models to capture and detect the depression level; however, this analysis does not capture the underlying rhetorical relationships in the language data of the object [13]-[27]. Failing to identify rhetorical relationships (RRship) related to depression results in limited contextual understanding and a loss of structural insights, both of which are important considerations for depression prediction. ...
... Figure 3 indicates that the majority of the datasets used are from Twitter and Reddit. The sources include social media platforms such as Twitter [15], [16], [18], [19], [24], [25], Reddit [13], [15], [17], [21], [22], [27], and Facebook [15], as well as eRisk [16], [20] and electronic diaries [15]. This section gives an overview of existing work in the domain of RSTFusionX and is divided into two components: ML models and RST models. ...
... It uses accuracy, precision, recall, and F1-Score as evaluation metrics. Lastly, Liang et al. [27] use emotion-cause analysis, work on semantic analysis, and use the N2NCause technique to perform prediction. They use F1-Score and accuracy for evaluation. All of these works on semantic analysis for the detection of depression are based on different datasets. ...
Depression is a common and severe mental disorder that frequently goes undiagnosed and untreated, particularly during its initial phase. However, with the increasing number of people sharing their thoughts and feelings online, social media has become a valuable resource to identify symptoms of mental health problems such as depression. As a result, research on social media-based depression diagnosis has received significant interest, but it mainly focuses on the semantics of posts and cannot detect the implicit ambiguities that lie in the language of a user's online posts. To address this limitation, a syntactical analysis of each user post is required. The goal of this research is to enhance the identification of individuals at risk for depression by analyzing their language and rhetorical relations using rhetorical structure theory. The rhetorical scores are incorporated into an ensemble machine learning model to classify social media posts as depressive or non-depressive. The ensemble combines Multilayer Perceptron, Extreme Gradient Boosting, and Support Vector Machine algorithms. We therefore propose a model named RSTFusionX, which links depression to a social media dataset and combines the rhetorical structure theory framework with an ensemble machine learning algorithm. To attain more precise prediction models, the research aims to better understand the association between social media language trends and depression. RSTFusionX achieves 97.1% accuracy, 97.4% precision, 96.7% recall, and 96.9% F1-score, outperforming the baseline algorithms. With RSTFusionX, true positive and true negative counts increase while false positive and false negative predictions are reduced. The findings suggest that integrating characteristics of the depressive vocabulary improves classification accuracy.
The article contributes to the resources for depression detection and provides a technique for future research to further improve depression detection.
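The accuracy, precision, recall, and F1-score quoted in the abstract above are all functions of the four confusion-matrix counts it mentions (true/false positives and negatives). A minimal sketch of that relationship follows; the function name and the counts are hypothetical illustrations, not figures from the RSTFusionX paper, and the standard binary-classification definitions are assumed.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the four standard scores from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (NOT from the paper): a weaker and a stronger classifier
# evaluated on the same 200 posts, 100 depressive and 100 non-depressive.
baseline = classification_metrics(tp=80, fp=20, tn=80, fn=20)
improved = classification_metrics(tp=95, fp=5, tn=95, fn=5)

for name, scores in (("baseline", baseline), ("improved", improved)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Raising the true positive and true negative counts while lowering the false ones, as the abstract describes, moves all four scores upward together in this toy case.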
... The frequent occurrence of mental diseases, such as depression, makes mental health gradually receive extensive attention from society [72][73][74]. Psychological counseling aims at reducing people's emotional distress and helping them understand and work through the challenges that they face. Relieving the psychological pressure of the persuaded through conversation holds profound significance for the persuasive dialogue system. ...
Persuasion, as one of the crucial abilities in human communication, has garnered extensive attention from researchers within the field of intelligent dialogue systems. Developing dialogue agents that can persuade others to accept certain standpoints is essential to achieving truly intelligent and anthropomorphic dialogue systems. Benefiting from the substantial progress of Large Language Models (LLMs), dialogue agents have acquired an exceptional capability in context understanding and response generation. However, as a typical and complicated cognitive psychological system, persuasive dialogue agents also require knowledge from the domain of cognitive psychology to attain a level of human-like persuasion. Consequently, the cognitive strategy-enhanced persuasive dialogue agent (defined as CogAgent), which incorporates cognitive strategies to achieve persuasive targets through conversation, has become a predominant research paradigm. To depict the research trends of CogAgent, in this paper, we first present several fundamental cognitive psychology theories and give the formalized definition of three typical cognitive strategies, including the persuasion strategy, the topic path planning strategy, and the argument structure prediction strategy. Then we propose a new system architecture by incorporating the formalized definition to lay the foundation of CogAgent. Representative works are detailed and investigated according to the combined cognitive strategy, followed by the summary of authoritative benchmarks and evaluation metrics. Finally, we summarize our insights on open issues and future directions of CogAgent for upcoming researchers.
... The literature evidences the symptoms of depression as unhappiness, loneliness, forgetfulness, low self-esteem, thoughts of self-harm, disrupted sleep, loss of energy, changes in appetite, anxiety, reduced concentration, indecision, feelings of worthlessness, lack of interest, guilt, or hopelessness [2,6]. Previous studies showed that the primary causes of depression are financial issues, workplace issues, family issues, and academic performance [7]. There are traditional approaches to depression detection that require psychometric questions. ...
The World Health Organisation (WHO) revealed approximately 280 million people in the world suffer from depression. Yet, existing studies on early-stage depression detection using machine learning (ML) techniques are limited. Prior studies have applied a single stand-alone algorithm, which is unable to deal with data complexities, prone to overfitting, and limited in generalization. To this end, our paper examined the performance of several ML algorithms for early-stage depression detection using two benchmark social media datasets (D1 and D2). More specifically, we incorporated sentiment indicators to improve our model performance. Our experimental results showed that sentence bidirectional encoder representations from transformers (SBERT) numerical vectors fitted into the stacking ensemble model achieved comparable F1 scores of 69% in the dataset (D1) and 76% in the dataset (D2). Our findings suggest that utilizing sentiment indicators as an additional feature for depression detection yields an improved model performance, and thus, we recommend the development of a depressive term corpus for future work.
... Information and communication technologies (ICTs) have proven to be empowering and beneficial to users in a variety of ways and thus may also help improve mental health (Dasuki et al., 2014; Liang et al., 2023). Social media, for example, has played a critical role in activism, enabling the disadvantaged to express a wide range of ideas and voices and to organize unequally distributed resources (Liu, 2016). Within health communication, social media has shown the potential to facilitate the process of health empowerment by enabling seeking out as well as sharing health-related information (Zamora, 2022). ...
This study proposed and tested a novel theoretical framework of media empowerment regarding the relationship between digital media skills and mental health as well as the complex mechanism linking the two. This study utilized an online survey of a representative sample of Shanghai residents (N = 916) to examine the interconnections among digital media skills, (mis)information sharing, and mental health. The findings revealed that the empowerment mechanisms of digital media skills on depression were contradictory at the individual and community levels. For the two dimensions of digital media skills, information skills directly reduced levels of depression but indirectly aggravated depression by promoting misinformation sharing; in contrast, social skills alleviated depression by mitigating misinformation sharing. Furthermore, risk perception positively moderated the relationship between misinformation sharing and depression. This study contributes to the media empowerment literature by empirically demonstrating a linkage between developed digital media skills and media empowerment in the aspect of mental health in the digital age. This study also innovatively highlights specific psychosocial elements of the empowerment processes from a communication perspective.
... The literature evidences the symptoms of depression as unhappiness, loneliness, forgetfulness, low self-esteem, thoughts of self-harm, disrupted sleep, loss of energy, changes in appetite, anxiety, reduced concentration, indecision, feelings of worthlessness, lack of interest, guilt, or hopelessness [2,6]. Previous studies showed that the primary causes of depression are financial issues, workplace issues, family issues, and academic performance [7]. There are traditional approaches to depression detection that require psychometric questions. ...
The World Health Organisation (WHO) revealed that approximately 280 million people in the world suffer from depression. Yet, existing studies on early-stage depression detection using machine learning (ML) techniques are limited. Prior studies have applied a single stand-alone algorithm, which is unable to deal with data complexities, prone to overfitting, and limited in generalisation. To this end, our paper examined the performance of several ML algorithms for early-stage depression detection using two benchmark social media datasets (D1 and D2). More specifically, we incorporated sentiment indicators to improve our model performance. Our experimental results showed that sentence bidirectional encoder representations from transformers (SBERT) numerical vectors fitted into a stacking ensemble model achieved comparable F1 scores of 69% in dataset D1 and 76% in dataset D2. Our findings suggest that utilising sentiment indicators as an additional feature for depression detection yields improved model performance, and thus we recommend the development of a depressive term corpus for future work.
The aim of this study is to propose an automated system for the early identification of non-clinical depression-related behaviors among social media users. We constructed a novel dataset comprising 22,525 posts labeled by expert annotators. These posts are divided into 10,573 depressive and 11,952 non-depressive entries. Within the depressive category, we delineated two levels: moderate and major depression.
Subsequently, to identify users experiencing depression based on their posted content, we integrated various feature extraction techniques with machine learning and deep learning algorithms, including Support Vector Machine, K-Nearest Neighbors, Decision Tree, Recurrent Neural Network, and Long Short-Term Memory. After detecting depressed users, our system offers tailored recommendations for activities and coping strategies based on factors such as the severity of the case, gender, and age. These personalized recommendations can significantly contribute to the users’ healing journey, overall well-being, and especially in fostering a sense of hope and support. Experimental results indicate that our proposed system achieved an 84% accuracy rate in depression detection.
Learning causality from large-scale text corpora is an important task with numerous applications, for example, in finance, biology, medicine, and scientific discovery. Prior studies have focused mainly on simple causality, which only includes one cause-effect pair. However, causality is notoriously difficult to understand and analyze because of multiple cause spans and their entangled interactions. To detect complex causality, we propose a self-paced contrastive learning model, namely N2NCause, to learn entangled interactions between multiple spans. Specifically, N2NCause introduces data enhancement operations to convert implicit expressions into explicit expressions with the most rational causal connectives for the synthesis of positive samples and to invert the directed connection between a cause-effect pair for the synthesis of negative samples. To learn the semantic dependency and causal direction of positive and negative samples, self-paced contrastive learning is proposed to learn the entangled interactions among spans, including the interaction direction and interaction field. We evaluated the performance of N2NCause in three cause-effect detection tasks. The experimental results show that, with the least data annotation efforts, N2NCause demonstrates competitive performance in detecting simple cause-effect relations, and it is superior to existing solutions for the detection of complex causality.
Stress and depression detection on social media aim at the analysis of stress and identification of depression tendency from social media posts, which provide assistance for the early detection of mental health conditions. Existing methods mainly model the mental states of the post speaker implicitly. They also lack the ability to mentalise for complex mental state reasoning. Besides, they are not designed to explicitly capture class-specific features. To resolve the above issues, we propose a mental state Knowledge–aware and Contrastive Network (KC-Net). In detail, we first extract mental state knowledge from a commonsense knowledge base COMET, and infuse the knowledge using Gated Recurrent Units (GRUs) to explicitly model the mental states of the speaker. Then we propose a knowledge–aware mentalisation module based on dot-product attention to accordingly attend to the most relevant knowledge aspects. A supervised contrastive learning module is also utilised to fully leverage label information for capturing class-specific features. We test the proposed methods on a depression detection dataset Depression_Mixed with 3165 Reddit and blog posts, a stress detection dataset Dreaddit with 3553 Reddit posts, and a stress factors recognition dataset SAD with 6850 SMS-like messages. The experimental results show that our method achieves new state-of-the-art results on all datasets: 95.4% of F1 scores on Depression_Mixed, 83.5% on Dreaddit and 77.8% on SAD, with 2.07% average improvement. Factor-specific analysis and ablation study prove the effectiveness of all proposed modules, while UMAP analysis and case study visualise their mechanisms. We believe our work facilitates detection and analysis of depression and stress on social media data, and shows potential for applications on other mental health conditions.
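The KC-Net abstract above describes a mentalisation module "based on dot-product attention" that attends to the most relevant knowledge aspects. As a rough illustration of the general mechanism only, the sketch below scores each key against a query by dot product, normalises the scores with a softmax, and returns the weighted sum of the values. The toy vectors, the function names, and the absence of any scaling factor are assumptions made here; the abstract does not specify how KC-Net builds its queries, keys, or values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (weights, context) for one query over a set of key/value pairs."""
    # Relevance score of each key = dot product with the query.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # Context vector = attention-weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return weights, context


# Toy data (hypothetical): one post representation and three knowledge aspects.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights, context = dot_product_attention(query, keys, values)
print([round(w, 3) for w in weights], [round(c, 3) for c in context])
```

The key most aligned with the query receives the largest weight, so its value dominates the context vector.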
Background
Health researchers are increasingly using natural language processing (NLP) to study various mental health conditions using both social media and electronic health records (EHRs). There is currently no published synthesis that relates specifically to the use of NLP methods for bipolar disorder, and this scoping review was conducted to synthesize valuable insights that have been presented in the literature.
Objective
This scoping review explored how NLP methods have been used in research to better understand bipolar disorder and identify opportunities for further use of these methods.
Methods
A systematic, computerized search of index and free-text terms related to bipolar disorder and NLP was conducted using 5 databases and 1 anthology: MEDLINE, PsycINFO, Academic Search Ultimate, Scopus, Web of Science Core Collection, and the ACL Anthology.
Results
Of 507 identified studies, a total of 35 (6.9%) studies met the inclusion criteria. A narrative synthesis was used to describe the data, and the studies were grouped into four objectives: prediction and classification (n=25), characterization of the language of bipolar disorder (n=13), use of EHRs to measure health outcomes (n=3), and use of EHRs for phenotyping (n=2). Ethical considerations were reported in 60% (21/35) of the studies.
Conclusions
The current literature demonstrates how language analysis can be used to assist in and improve the provision of care for people living with bipolar disorder. Individuals with bipolar disorder and the medical community could benefit from research that uses NLP to investigate risk-taking, web-based services, social and occupational functioning, and the representation of gender in bipolar disorder populations on the web. Future research that implements NLP methods to study bipolar disorder should be governed by ethical principles, and any decisions regarding the collection and sharing of data sets should ultimately be made on a case-by-case basis, considering the risk to the data participants and whether their privacy can be ensured.
Due to the worldwide accessibility to the Internet along with the continuous advances in mobile technologies, physical and digital worlds have become completely blended, and the proliferation of social media platforms has taken a leading role over this evolution. In this paper, we undertake a thorough analysis towards better visualising and understanding the factors that characterise and differentiate social media users affected by mental disorders. We perform different experiments studying multiple dimensions of language, including vocabulary uniqueness, word usage, linguistic style, psychometric attributes, emotions’ co-occurrence patterns, and online behavioural traits, including social engagement and posting trends.
Our findings reveal significant differences in the use of function words, such as adverbs and verb tense, and topic-specific vocabulary, such as biological processes. As for emotional expression, we observe that affected users tend to share emotions more regularly than control individuals on average. Overall, the monthly posting variance of the affected groups is higher than that of the control groups. Moreover, we found evidence suggesting that language use on micro-blogging platforms is less distinguishable for users who have a mental disorder than on other, less restrictive platforms. In particular, we observe fewer quantifiable differences between affected and control groups on Twitter than on Reddit.
Importance:
Ensuring the well-being of the 73 million children in the United States is critical for improving the nation's health and influencing children's long-term outcomes as they grow into adults.
Objective:
To examine recent trends in children's health-related measures, including significant changes between 2019 and 2020 that might be attributed to the COVID-19 pandemic.
Design, setting, and participants:
Annual data were examined from the National Survey of Children's Health (2016-2020), a population-based, nationally representative survey of randomly selected children. Participants were children from birth to age 17 years living in noninstitution settings in all 50 states and the District of Columbia whose parent or caregiver responded to an address-based survey by mail or web. Weighted prevalence estimates account for probability of selection and nonresponse. Adjusted logistic regression models tested for significant trends over time.
Main outcomes and measures:
Diverse measures pertaining to children's current health conditions, positive health behaviors, health care access and utilization, and family well-being and stressors.
Results:
A total of 174 551 children were included (annual range = 21 599 to 50 212). Between 2016 and 2020, there were increases in anxiety (7.1% [95% CI, 6.6-7.6] to 9.2% [95% CI, 8.6-9.8]; +29%; trend P < .001) and depression (3.1% [95% CI, 2.9-3.5] to 4.0% [95% CI, 3.6-4.5]; +27%; trend P < .001). There were also decreases in daily physical activity (24.2% [95% CI, 23.1-25.3] to 19.8% [95% CI, 18.9-20.8]; -18%; trend P < .001), parent or caregiver mental health (69.8% [95% CI, 68.9-70.8] to 66.3% [95% CI, 65.3-67.3]; -5%; trend P < .001), and coping with parenting demands (67.2% [95% CI, 66.3-68.1] to 59.9% [95% CI, 58.8-60.9]; -11%; trend P < .001). In addition, from 2019 to 2020, there were increases in behavior or conduct problems (6.7% [95% CI, 6.1-7.4] to 8.1% [95% CI, 7.5-8.8]; +21%; P = .001) and child care disruptions affecting parental employment (9.4% [95% CI, 8.0-10.9] to 12.6% [95% CI, 11.2-14.1]; +34%; trend P = .001) as well as decreases in preventive medical visits (81.0% [95% CI, 79.7-82.3] to 74.1% [95% CI, 72.9-75.3]; -9%; trend P < .001).
Conclusions and relevance:
Recent trends point to several areas of concern that can inform future research, clinical care, policy decision making, and programmatic investments to improve the health and well-being of children and their families. More analyses are needed to elucidate varying patterns within subpopulations of interest.
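The signed percentages in the Results above (for example, -18% for daily physical activity) are relative changes between the first and last prevalence estimates. The sketch below recomputes them from the rounded prevalences printed in the abstract; the function name is a hypothetical helper. For anxiety and depression the recomputed values come out one or two points above the reported ones, which may be because the authors worked from unrounded estimates; that explanation is a guess, not something the abstract states.

```python
def relative_change(start, end):
    """Percentage change from start to end, relative to start."""
    return (end - start) / start * 100


# (measure, first prevalence %, last prevalence %, change reported in the abstract %)
measures = [
    ("anxiety", 7.1, 9.2, 29),
    ("depression", 3.1, 4.0, 27),
    ("daily physical activity", 24.2, 19.8, -18),
    ("parent or caregiver mental health", 69.8, 66.3, -5),
    ("coping with parenting demands", 67.2, 59.9, -11),
    ("behavior or conduct problems", 6.7, 8.1, 21),
    ("child care disruptions", 9.4, 12.6, 34),
    ("preventive medical visits", 81.0, 74.1, -9),
]

for name, start, end, reported in measures:
    recomputed = round(relative_change(start, end))
    print(f"{name}: recomputed {recomputed:+d}%, reported {reported:+d}%")
```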
As an essential component of human cognition, cause–effect relations appear frequently in text, and curating cause–effect relations from text helps in building causal networks for predictive tasks. Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches. Each method has its advantages and weaknesses. For example, knowledge-based methods are understandable but require extensive manual domain knowledge and have poor cross-domain applicability. Statistical machine learning methods are more automated because of natural language processing (NLP) toolkits. However, feature engineering is labor-intensive, and toolkits may lead to error propagation. In the past few years, deep learning techniques have attracted substantial attention from NLP researchers because of their powerful representation learning ability and the rapid increase in computational resources. Their limitations include high computational costs and a lack of adequate annotated training data. In this paper, we conduct a comprehensive survey of causality extraction. We first introduce the primary forms of causality considered in causality extraction: explicit intra-sentential causality, implicit causality, and inter-sentential causality. Next, we list benchmark datasets and modeling assessment methods for causal relation extraction. Then, we present a structured overview of the three techniques with their representative systems. Lastly, we highlight existing open challenges with their potential directions.
Background
Current research has found dramatic changes in the lives of those with eating disorders (EDs) during the COVID-19 pandemic. We build on existing research to investigate the long-term effects and adaptations that people with EDs have faced due to COVID-19 related changes.
Method
We collected 234 posts from three separate time periods from the subreddit r/EatingDisorders and analyzed them using thematic analysis. The posts were examined for initial patterns, and then those concepts were grouped into themes to reveal the authentic experiences of people living with EDs during the COVID-19 pandemic.
Results
Initially, we found “lack of control” and “familial influences (loved ones seeking support)” emerge as themes within our broader data set throughout all three timeframes. There were additional themes that were present in only one or two of the collection periods. These themes consisted of “symptom stress,” “technical stresses and concerns,” and “silver linings.”
Conclusions
Our analysis shows that people with EDs have fought significantly during the pandemic. Initially, the (lack of) control and routine in their lives has caused symptoms to become more challenging, while being forced to move back home also caused significant stress. However, concerns transformed as the pandemic progressed, resulting in new pressures causing people to exhibit novel ED symptoms or relapse altogether. Also notable is the relatively few COVID-specific posts as the pandemic progressed, suggesting that people have accepted COVID as their “new normal” and begun to build resilience to the challenges associated. These are vital factors for clinicians to consider as they begin taking existing and new patients, particularly as face-to-face treatment options become a possibility again.
Plain English Summary
Existing research shows that the COVID-19 pandemic has transformed the lives of people who live with eating disorders in various ways. First, the pandemic has placed barriers on the path to recovery by limiting coping mechanisms (and sometimes removing them altogether) and changing their relationships with food and the people in their lives. Second, the pandemic has forced treatment options to change since ED patients can no longer seek treatment face-to-face. Finally, there have been unexpected benefits to the pandemic, such as allowing individuals time to slow down and focus on their mental health. Previous studies examined individuals in clinical contexts rather than in their natural environments. We explored an online forum for people with eating disorders for the various themes that were discussed at three points over the period of March 2020 to December 2020 and found that many people with EDs report worsening symptoms or relapse. However, we also noted that, compared to the beginning of the pandemic, people seemed to be asking for support less frequently during the third data collection period, implying an adaptation to the “new normal” of life in a pandemic. We conclude with a discussion of the findings.
Background:
COVID-19 vaccines are considered one of the most effective preventive strategies for containing the pandemic. Having a better understanding of the public's conceptions of COVID-19 vaccines may aid in the effort to promptly and thoroughly vaccinate the community. However, no known empirical research has yet fully explored the public's vaccine awareness through a sentiment-based topic modeling approach. Therefore, little is known about the evolution of public attitude since the rollout of COVID-19 vaccines.
Objective:
In this study, we specifically focus on tweets regarding COVID-19 vaccines after they have become publicly available. We aim to explore the overall sentiments and topics about COVID-19 vaccines, as well as how such sentiments and main concerns evolve.
Methods:
We collected tweets related to COVID-19 vaccines from December 14th, 2020, to April 30th, 2021, using Twitter Application Programming Interface (API), resulting in over 857,000 tweets after data cleaning. We then applied sentiment-based topic modeling to this dataset. For topic modeling, we used the coherence score to determine the optimal topic number and calculated the topic distribution to illustrate the topic evolution.
Results:
Overall, 46.51% positive, 23.81% negative, 28.70% neutral, 0.80% highly positive, and 0.18% highly negative sentiments were found among these tweets. Our results also showed that the main topics of positive tweets were (a1) recognition of the importance of getting vaccinated (59.47%), (a2) being thankful, with the expectation of receiving the vaccine (9.92%), and (a3) feeling positive about the vaccine administration (7.1%). On the other hand, the main concerns underlying negative tweets were found to include (b1) the extreme side effects of the vaccines (54.92%), (b2) the government's power abuse (10.12%), and (b3) concerns about some vulnerable groups with an underlying health condition (8.12%). Overall, the results showed that the negative sentiments were stable, while the positive sentiments can be easily influenced and are more likely to shift to neutral, but not negative, sentiments, presumably because of the promising effects of the COVID-19 vaccines. We also demonstrated how the main concerns changed (via topic heatmap visualization) during the current widespread vaccination campaign.
Conclusions:
To the best of our knowledge, this is the first study evaluating the public's COVID-19 vaccine awareness on social media through a sentiment-based topic modeling approach since the rollout of the vaccines. This study builds upon a text-mining framework combining sentiment analysis and topic modeling, which automatically captures and analyzes COVID-19 vaccine-related discussions on social media, allowing real-time assessments of the public's vaccine awareness. Our results can help policymakers and research communities track public attitudes toward COVID-19 vaccines and help them make decisions to promote the vaccination campaign.
Background
The study of depression and anxiety using publicly available social media data is a research activity that has grown considerably over the past decade. The discussion platform Reddit has become a popular social media data source in this nascent area of study, in part because of the unique ways in which the platform is facilitative of research. To date, no work has been done to synthesize existing studies on depression and anxiety using Reddit.
Objective
The objective of this review is to understand the scope and nature of research using Reddit as a primary data source for studying depression and anxiety.
Methods
A scoping review was conducted using the Arksey and O’Malley framework. MEDLINE, Embase, CINAHL, PsycINFO, PsycARTICLES, Scopus, ScienceDirect, IEEE Xplore, and ACM academic databases were searched. Inclusion criteria were developed using the participants, concept, and context framework outlined by the Joanna Briggs Institute Scoping Review Methodology Group. Eligible studies featured an analytic focus on depression or anxiety and used naturalistic written expressions from Reddit users as a primary data source.
Results
A total of 54 studies were included in the review. Tables and corresponding analyses delineate the key methodological features, including a comparatively larger focus on depression versus anxiety, an even split of original and premade data sets, a widespread analytic focus on classifying the mental health states of Reddit users, and practical implications that often recommend new methods of professionally delivered monitoring and outreach for Reddit users.
Conclusions
Studies of depression and anxiety using Reddit data are currently driven by a prevailing methodology that favors a technical, solution-based orientation. Researchers interested in advancing this research area will benefit from further consideration of conceptual issues surrounding the interpretation of Reddit data with the medical model of mental health. Further efforts are also needed to locate accountability and autonomy within practice implications, suggesting new forms of engagement with Reddit users.
Background
As a common mental disease, depression seriously affects people’s physical and mental health. According to the statistics of the World Health Organization, depression is one of the main reasons for suicide and self-harm events in the world. Therefore, strengthening depression detection can effectively reduce the occurrence of suicide or self-harm events so as to save more people and families. With the development of computer technology, some researchers are trying to apply natural language processing techniques to detect people who are depressed automatically. Many existing feature engineering methods for depression detection are based on emotional characteristics, but these methods do not consider high-level emotional semantic information. The current deep learning methods for depression detection cannot accurately extract effective emotional semantic information.
Objective
In this paper, we propose an emotion-based attention network, including a semantic understanding network and an emotion understanding network, which can capture the high-level emotional semantic information effectively to improve the depression detection task.
Methods
The semantic understanding network module is used to capture the contextual semantic information. The emotion understanding network module is used to capture the emotional semantic information. There are two units in the emotion understanding network module, including a positive emotion understanding unit and a negative emotion understanding unit, which are used to capture the positive emotional information and the negative emotional information, respectively. We further proposed a dynamic fusion strategy in the emotion understanding network module to fuse the positive emotional information and the negative emotional information.
Results
We evaluated our method on the Reddit data set. The experimental results showed that the proposed emotion-based attention network model achieved an accuracy, precision, recall, and F-measure of 91.30%, 91.91%, 96.15%, and 93.98%, respectively, which are comparable with state-of-the-art methods.
Conclusions
The experimental results showed that our model is competitive with the state-of-the-art models. The semantic understanding network module, the emotion understanding network module, and the dynamic fusion strategy are effective modules for depression detection. In addition, the experimental results verified that the emotional semantic information was effective in depression detection.
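As a quick consistency check on the figures reported above, the F-measure can be recomputed from the stated precision and recall. This is a minimal sketch that assumes the reported F-measure is the standard F1 (harmonic mean); the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract (in percent).
reported_precision = 91.91
reported_recall = 96.15

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 93.98%"
```

Under that assumption, the recomputed value agrees with the 93.98% stated in the abstract.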
Social media platforms continue to evolve as archival platforms, where important milestones in an individual's life are socially disclosed for support, solidarity, maintaining and gaining social capital, or to meet therapeutic needs. However, a limited understanding of how and what life events are disclosed (or not) prevents designing platforms to be sensitive to life events. We ask what life events individuals disclose, comparing a year-long Facebook dataset of 14K posts from 256 participants against their self-reported life events. We contribute a codebook to identify life event disclosures and build regression models on factors explaining life events' disclosures. Positive and anticipated events are more likely, whereas significant, recent, and intimate events are less likely to be disclosed on social media. While all life events may not be disclosed, online disclosures can reflect complementary information to self-reports. Our work bears practical and platform design implications in providing support and sensitivity to life events.
Emotion-cause pair extraction (ECPE), which aims at simultaneously extracting emotion-cause pairs that express emotions and their corresponding causes in a document, plays a vital role in understanding natural languages. Considering that a cause usually appears around its corresponding emotion, we construct a pair graph and a Pair Graph Convolutional Network (PairGCN) to model dependency relations among local neighborhood candidate pairs. Moreover, in our proposed graph, there are three types of dependency relations and each type of dependency relations has its own way to propagate contextual information. Experiments on a benchmark Chinese emotion-cause pair extraction corpus demonstrate the effectiveness of the proposed model.
Background
The social distancing during COVID-19 is likely to cause a feeling of alienation, which may pose a threat to the public's mental health. Our research aims to examine the relationship between negative emotions and Post-Traumatic Stress Disorder (PTSD), considering the mediation effect of alienation and how it is moderated by anxiety and depression.
Methods
The current study conducted a cross-sectional survey of 7145 participants during the outbreak of COVID-19, via online questionnaires comprising a self-designed Negative emotions questionnaire, Symptom Check List 90 (SCL-90), PTSD Checklist-civilian version (PCL-C), and Adolescent Students Alienation Scale (ASAS).
Results
A total of 6666 pieces of data from the general population were included in the statistical analysis. The descriptive statistics showed a relatively mild level of mental disorders. Besides, results of Conditional Process Model analysis supported our hypotheses that negative emotions and alienation were both predictors for PTSD symptoms, and their direct and indirect effects were all moderated by the level of anxiety.
Limitations
This study was limited by the generality and causality of the conclusion. The moderating effect of depression was left for further study due to the collinearity problem of variables.
Conclusions
Social distancing may have an impact on individuals’ mental health through the feeling of alienation, which was moderated by affective disorders. Clinical psychologists should identify individuals’ particular cognition and mental disorders to provide a more accurate and adequate intervention for them.
With the exponential growth of user-generated content, policies and guidelines are not always enforced in social media, resulting in the prevalence of deviant content violating policies and guidelines. The adverse effects of deviant content are devastating and far-reaching. However, the detection of deviant content from sparse and imbalanced textual data is challenging, as a large number of stakeholders are involved with different stands and the subtle linguistic cues are highly dependent on complex context. To address this problem, we propose a multi-view attention-based deep learning system, which combines random subspace and binary particle swarm optimization (RS-BPSO) to distill content of interest (candidates) from imbalanced data, and applies the context and view attention mechanisms in convolutional neural network (dubbed as SSCNN) for the extraction of structural and semantic features. We evaluate the proposed approach on a large-scale dataset collected from Facebook, and find that RS-BPSO is able to detect whether the content is associated with marijuana with an accuracy of 87.55%, and SSCNN outperforms baselines with an accuracy of 94.50%.
Background:
The COVID-19 pandemic is exerting a devastating impact on mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit.
Objective:
We leverage natural language processing (NLP) with the goal of characterizing changes in fifteen of the world's largest mental health support groups (e.g., r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with eleven non-mental health groups (e.g., r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic.
Methods:
We create and release the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyze trends from 90 text-derived features such as sentiment analysis, personal pronouns, and a "guns" semantic category. Using supervised machine learning, we classify posts into their respective support group and interpret important features to understand how different problems manifest in language. We apply unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic.
Results:
We find that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately two months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories "economic stress", "isolation", and "home" while others such as "motion" significantly decreased. We find that support groups related to attention deficit hyperactivity disorder (ADHD), eating disorders (ED), and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses. For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discover that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ = -0.96, P<.001). Using unsupervised clustering, we find the Suicidality and Loneliness clusters more than doubled in amount of posts during the pandemic. Specifically, the support groups for borderline personality disorder and post-traumatic stress disorder became significantly associated with the Suicidality cluster. Furthermore, clusters surrounding Self-Harm and Entertainment emerged.
Conclusions:
By using a broad set of NLP techniques and analyzing a baseline of pre-pandemic posts, we uncover patterns of how specific mental health problems manifest in language, identify at-risk users, and reveal the distribution of concerns across Reddit which could help provide better resources to its millions of users. We then demonstrate that textual analysis is sensitive to uncover mental health complaints as they arise in real time, identifying vulnerable groups and alarming themes during COVID-19, and thus may have utility during the ongoing pandemic and other world-changing events such as elections and protests from the present or the past.
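The correlation reported in the results (ρ = -0.96) is written with the symbol conventionally used for Spearman's rank correlation. Assuming that is the statistic meant, the sketch below shows how such a coefficient is computed. The two input lists are made-up illustrative numbers, not data from the study, and the function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only (NOT from the study): share of posts mentioning
# COVID-19 versus a linguistic distance to r/HealthAnxiety.
covid_post_share = [0.01, 0.03, 0.08, 0.15, 0.22, 0.30]
distance_to_health_anxiety = [0.90, 0.85, 0.70, 0.72, 0.50, 0.41]

rho = spearman_rho(covid_post_share, distance_to_health_anxiety)
print(f"rho = {rho:.3f}")
```

A strongly negative ρ, as in the abstract, means that a higher share of COVID-19 posts goes with a smaller distance, that is, greater linguistic similarity.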
Users of social media often share their feelings or emotional states through their posts. In this study, we developed a deep learning model to identify a user’s mental state based on his/her posting information. To this end, we collected posts from mental health communities in Reddit. By analyzing and learning posting information written by users, our proposed model could accurately identify whether a user’s post belongs to a specific mental disorder, including depression, anxiety, bipolar, borderline personality disorder, schizophrenia, and autism. We believe our model can help identify potential sufferers with mental illness based on their posts. This study further discusses the implication of our proposed model, which can serve as a supplementary tool for monitoring mental health states of individuals who frequently use social media.
Joint extraction of entities and their relations benefits from the close interaction between named entities and their relation information. Therefore, how to effectively model such cross-modal interactions is critical for the final performance. Previous works have used simple methods such as label-feature concatenation to perform coarse-grained semantic fusion among cross-modal instances, but fail to capture fine-grained correlations over token and label spaces, resulting in insufficient interactions. In this paper, we propose a deep Cross-Modal Attention Network (CMAN) for joint entity and relation extraction. The network is carefully constructed by stacking multiple attention units in depth to fully model dense interactions over token-label spaces, in which two basic attention units are proposed to explicitly capture fine-grained correlations across different modalities (e.g., token-to-token and label-to-token). Experimental results on the CoNLL04 dataset show that our model obtains state-of-the-art results by achieving 90.62% F1 on entity recognition and 72.97% F1 on relation classification. On the ADE dataset, our model surpasses existing approaches by more than 1.9% F1 on relation classification. Extensive analyses further confirm the effectiveness of our approach.
The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most of the previous work considered ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We furthermore propose ways to encode the relative position and global predication information into Transformer that can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%.
Causality extraction from natural language texts is a challenging open problem in artificial intelligence. Existing methods utilize patterns, constraints, and machine learning techniques to extract causality, heavily depending on domain knowledge and requiring considerable human effort and time for feature engineering. In this paper, we formulate causality extraction as a sequence labeling problem based on a novel causality tagging scheme. On this basis, we propose a neural causality extractor with the BiLSTM-CRF model as the backbone, named SCITE (Self-attentive BiLSTM-CRF wIth Transferred Embeddings), which can directly extract cause and effect without extracting candidate causal pairs and identifying their relations separately. To address the problem of data insufficiency, we transfer contextual string embeddings, also known as Flair embeddings, which are trained on a large corpus in our task. In addition, to improve the performance of causality extraction, we introduce a multihead self-attention mechanism into SCITE to learn the dependencies between causal words. We evaluate our method on a public dataset, and experimental results demonstrate that our method achieves significant and consistent improvement compared to baselines.
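To illustrate what formulating causality extraction as sequence labeling means in practice, the sketch below decodes cause and effect spans from per-token tags. The abstract does not spell out the tagging scheme, so the BIO-style tags used here (`B-C`/`I-C` for cause, `B-E`/`I-E` for effect, `O` for other) are an assumption, and `decode_spans` is a hypothetical helper rather than part of SCITE.

```python
def decode_spans(tokens, tags):
    """Collect (label, text) spans from BIO-style tags such as B-C, I-C, B-E, I-E, O."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            # A new span starts; close any span that is still open.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I" and label == current_label:
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open span.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hand-written example sentence and tags (assumed scheme: C = cause, E = effect).
tokens = ["Heavy", "rain", "caused", "severe", "flooding", "."]
tags = ["B-C", "I-C", "O", "B-E", "I-E", "O"]

spans = decode_spans(tokens, tags)
print(spans)  # [('C', 'Heavy rain'), ('E', 'severe flooding')]
```

With such a scheme, a single tagging pass yields both the cause and the effect, which is the property the abstract contrasts with extracting candidate pairs and classifying their relation separately.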
In this article, we review recent developments in the study of emotional expression within a basic emotion framework. Dozens of new studies find that upwards of 20 emotions are signaled in multimodal and dynamic patterns of expressive behavior. Moving beyond word to stimulus matching paradigms, new studies are detailing the more nuanced and complex processes involved in emotion recognition and the structure of how people perceive emotional expression. Finally, we consider new studies documenting contextual influences upon emotion recognition. We conclude by extending these recent findings to questions about emotion-related physiology and the mammalian precursors of human emotion.
Purpose of Review
The emotional memory and learning model of PTSD posits maladaptive fear conditioning, extinction learning, extinction recall, and safety learning as central mechanisms to PTSD. There is increasingly convincing support that sleep disturbance plays a mechanistic role in these processes. The current review consolidates the evidence on the relationships between emotional memory and learning, disturbed sleep, and PTSD acquisition, maintenance, and treatment.
Recent Findings
While disrupted sleep prior to trauma predicts PTSD onset, maladaptive fear acquisition does not seem to be the mechanism through which PTSD is acquired. Rather, poor extinction learning/recall and safety learning seem to better account for who maintains acute stress responses from trauma versus who naturally recovers; there is convincing evidence that this process is, at least in part, mediated by REM fragmentation. Individuals with PTSD had higher “fear load” during extinction, worse extinction learning, poorer extinction recall, and worse safety learning. Evidence suggests that these processes are also mediated by fragmented REM. Finally, PTSD treatments that require extinction and safety learning may also be affected by REM fragmentation.
Summary
Addressing fragmented sleep or sleep architecture could be used to increase emotional memory and learning processes and thus ameliorate responses to trauma exposure, reduce PTSD severity, and improve treatment. Future studies should examine relationships between emotional memory and learning and disturbed sleep in clinical PTSD patients.
This paper explores extending shallow semantic parsing beyond lexical-unit triggers, using causal relations as a test case. Semantic parsing becomes difficult in the face of the wide variety of linguistic realizations that causation can take on. We therefore base our approach on the concept of constructions from the linguistic paradigm known as Construction Grammar (CxG). In CxG, a construction is a form/function pairing that can rely on arbitrary linguistic and semantic features. Rather than codifying all aspects of each construction’s form, as some attempts to employ CxG in NLP have done, we propose methods that offload that problem to machine learning. We describe two supervised approaches for tagging causal constructions and their arguments. Both approaches combine automatically induced pattern-matching rules with statistical classifiers that learn the subtler parameters of the constructions. Our results show that these approaches are promising: they significantly outperform naïve baselines for both construction recognition and cause and effect head matches.
Background
More than 3.5 million Americans live with autism spectrum disorder (ASD). Major challenges persist in diagnosing ASD as no medical test exists to diagnose this disorder. Digital phenotyping holds promise to guide in the clinical diagnoses and screening of ASD.
Objective
This study aims to explore the feasibility of using the Web-based social media platform Twitter to detect psychological and behavioral characteristics of self-identified persons with ASD.
Methods
Data from Twitter were retrieved from 152 self-identified users with ASD and 182 randomly selected control users from March 22, 2012 to July 20, 2017. We conducted a between-group comparative textual analysis of tweets about repetitive and obsessive-compulsive behavioral characteristics typically associated with ASD. In addition, common emotional characteristics of persons with ASD, such as fear, paranoia, and anxiety, were examined between groups through textual analysis. Furthermore, we compared the timing of tweets between users with ASD and control users to identify patterns in communication.
Results
Users with ASD posted a significantly higher frequency of tweets related to the specific repetitive behavior of counting compared with control users (P
Early detection is vital for diagnosing most critical illnesses, and the same holds for identifying people suffering from depression. A number of studies have successfully identified depressed persons based on their social media postings. However, an unexpected bias has been observed in these studies, which can be due to various factors such as unequal data distribution. In this paper, the imbalance found in terms of participation in the various age groups and demographics is normalized using the one-shot decision approach. Further, we present an ensemble model combining SVM and KNN with intrinsic explainability in conjunction with noisy label correction approaches, offering an innovative solution to the problem of distinguishing between depression symptoms and suicidal ideas. We achieved a final classification accuracy of 98.05%, with the proposed ensemble model ensuring that the data classification is not biased in any manner.
Nowadays, stress has become a growing problem for society due to its high impact on individuals but also on health care systems and companies. In order to overcome this problem, early detection of stress is a key factor. Previous studies have shown the effectiveness of text analysis in the detection of sentiment, emotion, and mental illness. However, existing solutions for stress detection from text are focused on a specific corpus. There is still a lack of well-validated methods that provide good results in different datasets. We aim to advance the state of the art by proposing a method to detect stress in textual data and evaluating it using multiple public English datasets. The proposed approach combines lexicon-based features with distributional representations to enhance classification performance. To help organize features for stress detection in text, we propose a lexicon-based feature framework that exploits affective, syntactic, social, and topic-related features. Also, three different word embedding techniques are studied for exploiting distributional representation. Our approach has been implemented with three machine learning models that have been evaluated in terms of performance through several experiments. This evaluation has been conducted using three public English datasets and provides a baseline for other researchers. The obtained results identify the combination of FastText embeddings with a selection of lexicon-based features as the best-performing model, achieving F-scores above 80%.
Text classification is a foundational task in many NLP applications. Traditional text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases and special tree kernels. In contrast to traditional methods, we introduce a recurrent convolutional neural network for text classification without human-designed features. In our model, we apply a recurrent structure to capture contextual information as far as possible when learning word representations, which may introduce considerably less noise compared to traditional window-based neural networks. We also employ a max-pooling layer that automatically judges which words play key roles in text classification to capture the key components in texts. We conduct experiments on four commonly used datasets. The experimental results show that the proposed method outperforms the state-of-the-art methods on several datasets, particularly on document-level datasets.
Social media like Weibo has become an important platform for people to ask for help during the COVID-19 pandemic. Using a complete dataset of help-seeking posts on Weibo during the COVID-19 outbreak in China (N=3,705,188), this study mapped their characteristics and analyzed their relationship with the epidemic development at the aggregate level. At the individual level, it used computational methods, for the first time, to examine the factors that determine whether and to what extent help-seekers' cries could be heard. It finds that the number of help-seeking posts on Weibo has a Granger causality relationship with the number of confirmed COVID-19 cases with a time lag of eight days. This study then proposes a 3C framework to examine the direct influence of content, context, and connection on the responses (measured by retweets and comments) and assistance that help-seekers might receive as well as their indirect effects on assistance through the mediation of both retweets and comments. The differential influences of content (theme and negative sentiment), context (Super topic community, spatial location of posting, and the period of sending time), and connection (the number of followers, whether mentioning others, and verified status of authors and sharers) have been reported and discussed.
Depression is the most incapacitating disease worldwide, and it has an alarming comorbidity rate with anxiety. The use of social networks to expose personal difficulties has enabled works on the automatic identification of specific mental conditions, particularly depression. In spite of many solutions proposed for the automatic recognition of depression, fewer exist for anxiety and its comorbidity with depression. In this paper, we propose DAC Stacking, a solution that leverages stacking ensembles and deep learning (DL) to automatically identify depression, anxiety, and their comorbidity, using data extracted from Reddit. The stacking is composed of single-label binary classifiers, that either distinguish between specific disorders and control users (experts), or between pairs of target conditions (differentiating). A meta-learner explores these base classifiers as a context for reaching a multi-label decision. We assessed alternative ensemble topologies, exploring roles for base models, DL architectures, and word embeddings. All base classifiers and ensembles outperformed the baselines for depression and anxiety (f-measures near 0.79). The ensemble topology with the best performance (Hamming Loss of 0.29 and Exact Match Ratio of 0.46) combines base classifiers of three DL architectures, and includes expert and differentiating base models. The analysis of the influential classification features according to SHAP revealed the strengths of our solution and provided insights on the challenges for the automatic classification of the addressed mental conditions.
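The two multi-label metrics quoted above, Hamming Loss and Exact Match Ratio, can be made concrete with a small worked example. This is a minimal sketch using the standard definitions. The two-column label layout (depression, anxiety, with comorbidity as both set to 1) and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly (lower is better)."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly (higher is better)."""
    matches = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return matches / len(y_true)


# Columns: [depression, anxiety]; [1, 1] would denote comorbidity, [0, 0] a control user.
# Toy labels for illustration only.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 0], [1, 0], [0, 0]]

print(hamming_loss(y_true, y_pred))       # 0.25 (2 wrong slots out of 8)
print(exact_match_ratio(y_true, y_pred))  # 0.5  (2 of 4 users fully correct)
```

The example shows why the two numbers can diverge: a prediction that misses one of two labels counts as half wrong for Hamming Loss but as a complete miss for Exact Match Ratio.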
Emotion-cause pair extraction is a recently proposed task that aims at extracting all potential clause-level pairs of emotion and cause in text. To solve this task, researchers first proposed a two-step pipeline method. This method extracts the emotions and causes individually in the first step, then pairs the extracted emotions and causes and filters the invalid emotion-cause pairs in the second step. Because the two-step method suffers from error accumulation and is hard to optimize jointly, several one-step end-to-end models have been proposed. These models share a similar underlying idea, that is, reframing the emotion-cause pair extraction task as a classification problem of candidate clause pairs. Unlike these models, in this paper, we reframe the emotion-cause pair extraction task as a unified sequence labeling problem, which allows emotion-cause pairs to be extracted through one pass of sequence labeling. This is realized by designing a special set of unified labels. In the unified label, we design a content part for emotion/cause identification and a pairing part for clause pairing. Then the emotion-cause pairs can be implicitly derived from the unified labels. To address this unified sequence labeling problem, we propose a unified target-oriented sequence-to-sequence model, which comprehensively utilizes the information of target clause, global context, and former decoded label, to perform end-to-end unified sequence labeling. The experimental results demonstrate the effectiveness of both our proposed unified sequence labeling scheme and unified target-oriented sequence-to-sequence model. All the code and data of this work can be obtained at https://github.com/zifengcheng/UTOS.
Depression is a widespread and intractable problem in modern society, which may lead to suicide ideation and behavior. Analyzing depression or suicide based on the posts of social media such as Twitter or Reddit has achieved great progress in recent years. However, most work focuses on English social media and depression prediction is typically formalized as being present or absent. In this paper, we construct a human-annotated dataset for depression analysis via Chinese microblog reviews which includes 6,100 manually-annotated posts. Our dataset includes two fine-grained tasks, namely depression degree prediction and depression cause prediction. The object of the former task is to classify a Microblog post into one of 5 categories based on the depression degree, while the object of the latter one is selecting one or multiple reasons that cause the depression from 7 predefined categories. To set up a benchmark, we design a neural model for joint depression degree and cause prediction, and compare it with several widely-used neural models such as TextCNN, BiLSTM and BERT. Our model outperforms the baselines and achieves at most 65+% F1 for depression degree prediction, 70+% F1 and 90+% AUC for depression cause prediction, which shows that neural models achieve promising results, but there is still room for improvement. Our work can extend the area of social-media-based depression analyses, and our annotated data and code can also facilitate related research.
Data on social media contain a wealth of user information. Big data research of social media data may also support standard surveillance approaches and provide decision-makers with usable information. These data can be analyzed using Natural Language Processing (NLP) and Machine Learning (ML) techniques to detect signs of mental disorders that need attention, such as depression and suicide ideation. This article presents the recent trends and tools that are used in this field, the different means for data collection, and the current applications of ML and NLP in the surveillance of public mental health. We highlight the best practices and the challenges. Furthermore, we discuss the current gaps that need to be addressed and resolved.
The number of mass shootings has increased in recent years. Understanding the factors that affect the emotional reactions of the population is very important to help them to cope with the constant sense of threat and fear. Related work has explored sentiment analysis on social media data to derive insights. However, existing solutions are limited to the negative polarity of the sentiment, disregard the intrinsic properties of the users and the events, and typically draw conclusions using a single mass violent event using an ad hoc method. In this article, we describe an analysis framework to investigate the emotional reaction on Twitter to mass traumatic events and use it to derive conclusions about eight mass shooting events. The framework encompasses the crawling of pre/post-events tweets to compare the emotional reactions, the classification of the sentiment in terms of Ekman’s basic emotions, and the use of data extracted from Twitter users’ profiles to understand these reactions in the light of users’ demographics (age and gender), proximity to the event, and the number of victims. To classify the emotion, we developed experiments with three deep learning strategies: CNN, biLSTM and BERT, where the former yielded the best result. The assessment of the emotion classifier revealed an encouraging performance (76%) and the ability to predict according to meaningful features. Our analyses revealed that anger, fear, and sadness are the most expressed emotions and that gender, age, and proximity to the event are influential factors. This analysis framework could be used to derive conclusions on the emotional reactions to all sorts of mass traumatic events.
The piecewise convolutional neural network (PCNN) is an important method for distant supervision relation extraction. However, the existing methods based on the PCNN still have the following shortcomings: these methods lack the consideration of the impacts of entity pairs and the sentence context on word encoding and do not distinguish the different contributions of the three segments in PCNN to relation classification. To solve these problems, we propose a novel gated piecewise CNN with entity-aware enhancement for distantly supervised relation extraction. First, we use a multi-head self-attention mechanism to combine the word embedding with the head/tail entity embedding and relative position embedding to generate an entity-aware enhanced word representation, which is capable of capturing the semantic dependency between each word and entity pair. Then we introduce a global gate to combine each entity-aware enhanced word representation with their average in the input sentence to form the final word representation of the PCNN input. Moreover, to determine the key segments where the most important information for relation classification appears, we design another gate mechanism to assign a different weight to each sentence segment to highlight the effects of key segments on the PCNN. Experiments on New York Times dataset demonstrate that our model significantly outperforms most of the state-of-the-art models.
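The piecewise pooling step and the segment-level gate described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation. It assumes the common PCNN convention of cutting the sentence into three segments at the two entity positions, and it uses fixed gate weights where the paper learns them. The function names and toy activations are hypothetical.

```python
def piecewise_max_pool(features, head_idx, tail_idx):
    """Max-pool each of the three segments cut by the two entity positions.

    features: one list of convolution-filter activations per token.
    Assumed segment convention: [0..first], [first+1..second], [second+1..end].
    """
    first, second = sorted((head_idx, tail_idx))
    segments = [
        features[: first + 1],
        features[first + 1 : second + 1],
        features[second + 1 :],
    ]
    n_filters = len(features[0])
    pooled = []
    for segment in segments:
        if segment:
            pooled.append([max(tok[f] for tok in segment) for f in range(n_filters)])
        else:
            # Empty segment, e.g. when an entity is the last token.
            pooled.append([0.0] * n_filters)
    return pooled


def apply_segment_gates(pooled, gates):
    """Scale each segment's pooled vector by its gate weight, then concatenate."""
    return [gate * value for segment, gate in zip(pooled, gates) for value in segment]


# Toy activations: 5 tokens, 2 filters; entities at token positions 1 and 3.
features = [[0.1, 0.5], [0.7, 0.2], [0.3, 0.9], [0.4, 0.1], [0.8, 0.6]]
pooled = piecewise_max_pool(features, head_idx=1, tail_idx=3)
print(pooled)  # [[0.7, 0.5], [0.4, 0.9], [0.8, 0.6]]

# Fixed illustrative gate weights; in the paper these come from a learned gate.
weighted = apply_segment_gates(pooled, gates=[0.5, 1.0, 0.25])
```

Pooling per segment keeps coarse positional information relative to the entities, and the per-segment weights let the classifier emphasize whichever segment carries the relation cue.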
Causal graphs play an essential role in the determination of causalities and have been applied in many domains including biology and medicine. Traditional causal graph construction methods are usually data-driven and may not deliver the desired accuracy of a graph. Considering the vast number of publications with causality knowledge, extracting causal relations from the literature to help to establish causal graphs becomes possible. Current supervised-learning-based causality extraction methods require sufficient labeled data to train a model, and rule-based causality extraction methods are limited by the predefined patterns. This paper proposes a causality extraction framework by integrating rule-based methods and unsupervised learning models to overcome these limitations. The proposed method consists of three modules, including data preprocessing, syntactic pattern matching, and causality determination. In data preprocessing, abstracts are crawled based on attribute names before sentences are extracted and simplified. In syntactic pattern matching, these simplified sentences are parsed to obtain the part-of-speech tags, and triples are obtained based on these tags by matching the two designed syntactic patterns. In causality determination, four verb seed sets are initialized, and word vectors are constructed for the verbs in both the seed sets and the triples by applying an unsupervised machine learning model. Causal relations are identified by comparing the similarity between the verbs in each triple and that in each seed set to overcome the limitation of the seed sets. Causality extraction results on the attributes from the risk factors for Alzheimer's disease show that our method outperforms Bui's method and Alashri's method in terms of precision, recall, specificity, accuracy and F-score, with increases in the F-score of 8.29% and 5.37%, respectively.
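The causality determination step, which compares the verb of each triple with the verbs in the seed sets, can be sketched with cosine similarity over word vectors. The abstract does not state the similarity measure, the aggregation rule, or the threshold, so cosine similarity, taking the maximum over a seed set, and the cutoff value are all assumptions here. The seed-set names and 3-dimensional vectors are invented for illustration (the paper uses four seed sets and learned word vectors).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_seed_set(verb_vec, seed_sets, threshold):
    """Return (name, score) of the most similar seed set, or (None, score) below threshold."""
    best_name, best_score = None, -1.0
    for name, vectors in seed_sets.items():
        # Assumed aggregation: similarity to the closest verb in the seed set.
        score = max(cosine(verb_vec, v) for v in vectors)
        if score > best_score:
            best_name, best_score = name, score
    return (best_name if best_score >= threshold else None, best_score)


# Toy vectors and hypothetical seed-set names, for illustration only.
seed_sets = {
    "increase": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "decrease": [[-1.0, 0.1, 0.0]],
}

label, score = best_seed_set([0.8, 0.3, 0.0], seed_sets, threshold=0.7)
print(label, round(score, 3))
```

Matching by vector similarity rather than exact membership is what lets verbs outside the seed sets still be recognized, which is the limitation of fixed seed sets that the abstract says the method addresses.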
The landscape of mental health has undergone tremendous changes within the last two decades, but the research on mental health is still at the initial stage with substantial knowledge gaps and the lack of precise diagnosis. Nowadays, big data and artificial intelligence offer new opportunities for the screening and prediction of mental problems. In this review paper, we outline the vision of digital phenotyping of mental health (DPMH) by fusing the enriched data from ubiquitous sensors, social media and healthcare systems, and present a broad overview of DPMH from sensing and computing perspectives. We first conduct a systematic literature review and propose the research framework, which highlights the key aspects related to mental health, and discuss the challenges elicited by the enriched data for digital phenotyping. Next, five key research strands including affect recognition, cognitive analytics, behavioral anomaly detection, social analytics, and biomarker analytics are unfolded in the psychiatric context. Finally, we discuss various open issues and the corresponding solutions to underpin the digital phenotyping of mental health.