Figure - available from: Scientific Reports
Temporal evolution of the normalized prevalence of social media expressions indicative of mental health symptomatic outcomes on the college subreddit.

Source publication
Article
Full-text available
The mental health of college students is a growing concern, and gauging the mental health needs of college students is difficult in real time and at scale. To address this gap, researchers and practitioners have encouraged the use of passive technologies. Social media is one such "passive sensor" that has shown potential as a viable "pass...

Citations

... In fact, the use of LIWC to extract language features is a longstanding practice among scientists. Scientists have used LIWC or SC-LIWC to construct computational prediction models of psychological traits, including personalities (12, 32), mental health status (33,34), and subjective wellbeing (35, 36). ...
... First, our results showed that SC-LIWC features made the highest overall contribution to predicting personality in this study, achieving a total importance of 70%. This finding is consistent with the results of numerous previous studies that used LIWC as a closed-vocabulary method of extracting social media features (33,61,62). The finding also provides evidence to support the validity of SC-LIWC (31). ...
Article
Background: Personality psychology studies personality and its variation among individuals and is an essential branch of psychology. In recent years, machine learning research related to personality assessment has started to focus on the online environment and has shown outstanding performance in personality assessment. However, the aspects of personality that these prediction models measure remain unclear because few studies focus on the interpretability of personality prediction models. The objective of this study is to develop and validate a machine learning model with domain knowledge introduced to enhance accuracy and improve interpretability. Methods: Study participants were recruited via an online experiment platform. After excluding unqualified participants and downloading the Weibo posts of eligible participants, we used six psycholinguistic and mental health-related lexicons to extract textual features. Then the predictive personality model was developed using the multi-objective extra trees method based on 3,411 pairs of social media expression and personality trait scores. Subsequently, the prediction model's validity and reliability were evaluated, and each lexicon's feature importance was calculated. Finally, the interpretability of the machine learning model was discussed. Results: The features from Culture Value Dictionary were found to be the most important predictors. The fivefold cross-validation results regarding the prediction model for personality traits ranged between 0.44 and 0.48 (p < 0.001). The correlation coefficients of five personality traits between the two "split-half" datasets ranged from 0.84 to 0.88 (p < 0.001). Moreover, the model performed well in terms of contractual validity. Conclusion: By introducing domain knowledge to the development of a machine learning model, this study not only ensures the reliability and validity of the prediction model but also improves the interpretability of the machine learning method. 
The study helps explain aspects of personality measured by such prediction models and finds a link between personality and mental health. Our research also has positive implications regarding the combination of machine learning approaches and domain knowledge in the field of psychiatry and its applications to mental health.
... Social media platforms provide a non-intrusive means to collect people's naturalistic data at scale. Research has leveraged social media data from different social media platforms to explore psychological and health issues using data from various domains such as drug misuse [18], minority stress [59], and mental health [9,48]. Social media platforms provide timely and relevant information on examining risk attributes longitudinally. ...
Preprint
The Papageno effect concerns how media can play a positive role in preventing and mitigating suicidal ideation and behaviors. With the increasing ubiquity and widespread use of social media, individuals often express and share lived experiences and struggles with mental health. However, there is a gap in our understanding about the existence and effectiveness of the Papageno effect in social media, which we study in this paper. In particular, we adopt a causal-inference framework to examine the impact of exposure to mental health coping stories on individuals on Twitter. We obtain a Twitter dataset with ∼2M posts by ∼10K individuals. We consider engaging with coping stories as the Treatment intervention, and adopt a stratified propensity score approach to find matched cohorts of Treatment and Control individuals. We measure the psychosocial shifts in affective, behavioral, and cognitive outcomes in longitudinal Twitter data before and after engaging with the coping stories. Our findings reveal that engaging with coping stories leads to decreased stress and depression, and improved expressive writing, diversity, and interactivity. Our work discusses the practical and platform design implications in supporting mental wellbeing.
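The stratified propensity score idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes propensity scores have already been estimated, uses equal-width strata, and the names `stratify` and `stratified_effect` are hypothetical.

```python
from statistics import mean


def stratify(scores, n_strata=5):
    """Assign each propensity score in [0, 1] to one of n_strata equal-width bins."""
    return [min(int(s * n_strata), n_strata - 1) for s in scores]


def stratified_effect(scores, treated, outcomes, n_strata=5):
    """Size-weighted average of the treated-minus-control outcome gap per stratum.

    Assumption: strata that lack either group are skipped rather than merged.
    """
    strata = stratify(scores, n_strata)
    total, weight = 0.0, 0
    for k in range(n_strata):
        t = [y for s, z, y in zip(strata, treated, outcomes) if s == k and z]
        c = [y for s, z, y in zip(strata, treated, outcomes) if s == k and not z]
        if t and c:
            n = len(t) + len(c)
            total += n * (mean(t) - mean(c))
            weight += n
    if weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return total / weight


# Toy data (invented for illustration): six individuals in three strata.
scores = [0.1, 0.15, 0.5, 0.55, 0.9, 0.95]
treated = [1, 0, 1, 0, 1, 0]
outcomes = [2.0, 3.0, 4.0, 6.0, 1.0, 1.5]
effect = stratified_effect(scores, treated, outcomes)
```

Comparing Treatment and Control individuals only within the same stratum means each comparison involves people with a similar estimated likelihood of engaging with coping stories, which is the motivation for stratifying before measuring outcome shifts.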
... Fisher and Appelbaum [54] provided case reports where clinicians had begun to incorporate patients' electronic communications, such as social media, as a new form of collateral information. Prior work in social computing and HCI has shown that, from such data, AI and machine learning can help to infer whether an individual is vulnerable to mental health conditions [28,67,130,131], such as major depression [35] and postpartum depression [36]. An emergent body of work in digital mental health has also argued how the computational power harnessed by AI systems could be leveraged to reveal the complex psychopathology of psychiatric disorders and thus better inform therapeutic applications and collaboration across different clinicians [85,165]. ...
Preprint
Explainable AI (XAI) systems are sociotechnical in nature; thus, they are subject to the sociotechnical gap, the divide between the technical affordances and the social needs. However, charting this gap is challenging. In the context of XAI, we argue that charting the gap improves our problem understanding, which can reflexively provide actionable insights to improve explainability. Utilizing two case studies in distinct domains, we empirically derive a framework that facilitates systematic charting of the sociotechnical gap by connecting AI guidelines in the context of XAI and elucidating how to use them to address the gap. We apply the framework to a third case in a new domain, showcasing its affordances. Finally, we discuss conceptual implications of the framework, share practical considerations in its operationalization, and offer guidance on transferring it to new contexts. By making conceptual and practical contributions to understanding the sociotechnical gap in XAI, the framework expands the XAI design space.
... Given the prevalence of social media, users generate a tremendous volume of content every day; social media can serve as a fountainhead of nonintrusive, real-time, and massive data that can be used to infer well-being [14]. The personal and social discourses that users follow on a daily basis make up the expression on social media, which can effectively reflect users' health status and PWB in various contexts [15-17]. Thus, researchers are able to study a variety of mental illnesses, including depression, anxiety, stress, and loneliness [13,15,16,18-20], based on linguistic cues on social media platforms. ...
... Although the capability of social media data has proved enormous, its predictive power corresponding to ground truth well-being data has yet to be endorsed [15]. By verifying whether social media expressions can reflect an individual's PWB, this study aimed to investigate this previously unconfirmed topic. ...
Article
Background: Positive mental health is arguably increasingly important and can be revealed, to some extent, in terms of psychological well-being (PWB). However, PWB is difficult to assess in real time on a large scale. The popularity and proliferation of social media make it possible to sense and monitor online users' PWB in a nonintrusive way, and the objective of this study is to test the effectiveness of using social media language expression as a predictor of PWB. Objective: This study aims to investigate the predictive power of social media corresponding to ground truth well-being data in a psychological way. Methods: We recruited 1427 participants. Their well-being was evaluated using 6 dimensions of PWB. Their posts on social media were collected, and 6 psychological lexicons were used to extract linguistic features. A multiobjective prediction model was then built with the extracted linguistic features as input and PWB as the output. Further, the validity of the prediction model was confirmed by evaluating the model's discriminant validity, convergent validity, and criterion validity. The reliability of the model was also confirmed by evaluating the split-half reliability. Results: The correlation coefficients between the predicted PWB scores of social media users and the actual scores obtained using the linguistic prediction model of this study were between 0.49 and 0.54 (P<.001), which means that the model had good criterion validity. In terms of the model's structural validity, it exhibited excellent convergent validity but less than satisfactory discriminant validity. The results also suggested that our model had good split-half reliability levels for every dimension (ranging from 0.65 to 0.85; P<.001). Conclusions: By confirming the availability and stability of the linguistic prediction model, this study verified the predictability of social media corresponding to ground truth well-being data from the perspective of PWB. 
Our study has positive implications for the use of social media to predict mental health in nonprofessional settings such as self-testing or a large-scale user study.
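The correlation-based checks reported in this abstract (criterion validity and split-half reliability) can be sketched as follows. This is a sketch under stated assumptions, not the study's code: `predict` is a hypothetical stand-in for any model that maps a list of posts to a score, and the odd/even split of each user's posts is an assumption, since the abstract does not say how the halves were formed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def split_half_reliability(predict, posts_by_user):
    """Correlate scores predicted from the two halves of each user's posts.

    Assumption: halves are the even- and odd-indexed posts of each user.
    """
    first = [predict(posts[0::2]) for posts in posts_by_user]
    second = [predict(posts[1::2]) for posts in posts_by_user]
    return pearson(first, second)


# Toy stand-in model (hypothetical): the score is the mean post length.
def mean_length(posts):
    return sum(len(p) for p in posts) / len(posts)


users = [["a", "a", "a", "a"], ["bbb", "bbb"], ["cc", "cccc", "cc", "cccc"]]
reliability = split_half_reliability(mean_length, users)
```

Criterion validity in this sense would be `pearson(predicted_scores, questionnaire_scores)` for the same participants, with both lists supplied by the reader.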
... Lastly, large-scale passively sensed signals have been harnessed in university campus environments to measure determinants of well-being [383] and performance [384], outside of nutrition [26,227,260,324]. Recent studies point towards the feasibility and the potential of leveraging behavioral traces for campus-centric applications [84,253,312,350]. ...
... These lists were made with the help of emotion assessment methods and glossaries, and a judging panel validated them [88]. Various studies leveraged LIWC as a reliable tool for emotion and linguistic analysis [42,68,69,88,179]. ...
Preprint
Mental health disorders can have severe consequences for every country's economy and health. For example, the impacts of the COVID-19 pandemic, such as isolation and travel bans, can make us feel depressed. Identifying early signs of mental health disorders is vital. For example, depression may increase an individual's risk of suicide. State-of-the-art research in identifying mental disorder patterns from textual data uses hand-labelled training sets, especially when a domain expert's knowledge is required to analyse various symptoms. This task could be time-consuming and expensive. To address this challenge, in this paper, we study and analyse the various clinical and non-clinical approaches to identifying mental health disorders. We leverage the domain knowledge and expertise in cognitive science to build a domain-specific Knowledge Base (KB) for the mental health disorder concepts and patterns. We present a weaker form of supervision by facilitating the generation of training data from a domain-specific Knowledge Base (KB). We adopt a typical scenario for analysing social media to identify major depressive disorder symptoms from the textual content generated by social users. We use this scenario to evaluate how our knowledge-based approach significantly improves the quality of results.
... physical activity, obesity, mental wellbeing, in the context of a crisis. Previous research findings have demonstrated the potential for social media to identify health service needs (Saha et al. 2022) and perceptions around preventive healthcare seeking behaviours (Ryu and Pratt 2022), as well as the value of social media in facilitating the spread of information and mobilisation efforts during crises (Osoro 2017). We considered the Helsinki ethical principles in the conduct of our study (World Medical Association 2013). ...
Article
Given the complexity of global health crises such as the COVID-19 pandemic, it is typical for crisis-focused interventions to have a multiplicity of impacts. Some of these impacts may yield positive or negative externalities for health priorities that do not have the same perceived urgency. The interplay between COVID-19 prevention (a high priority, high perceived urgency issue) and non-communicable disease (NCD) prevention (a high priority, low perceived urgency issue) provides a good case in point. By analysing tweets during Nigeria’s COVID-19 lockdowns, we identified avenues for social media to help adapt crisis responses to a wider range of wellbeing concerns.
... As an open community, r/Veterans is not limited to specific purposes or topics. Our work also draws motivation from a body of prior work that showed how Reddit data effectively reveals community dynamics and support among college students [10,81,86], LGBTQ+ individuals [82], sexual abuse sufferers [7], and individuals undergoing life transitions [39]. ...
Article
Veterans are a unique marginalized group facing multiple vulnerabilities. Current assessments of veteran needs and support largely come from first-person accounts guided by researchers' prompts. Social media platforms not only enable veterans to connect with each other, but also to self-disclose experiences and seek support. This paper addresses the gap in our understanding of veteran needs and their own support dynamics by examining self-initiated and ecologically-valid self-expressions. In particular, we adopt the Veteran Critical Theory (VCT) to conduct a computational study on the Reddit community of veterans. Using topic modeling, we find veteran-friendly gestures with good intentions might not be appreciated in the subreddit. By employing transfer learning methodologies, we find this community has more informational and emotional support behaviors than general online communities and a higher prevalence of informational support than emotional support. Lastly, an examination of support dynamics reveals some contrasts to previous scholarship in military culture and social media. We discover that positive language and author platform tenure have negative relations with posts receiving replies and replies getting votes, and that replies reflecting personal disclosures tend to get more votes. Through the lens of VCT, we discuss how online communities can help uncover veterans' needs and provide more effective social support.
... When analyzing research, including publications in the CLPsych workshop series, we could see that in comparison to traditional machine learning methods (Cohan et al., 2016; Coppersmith et al., 2015a; Jamil et al., 2017; Schwartz et al., 2014), recent research has focused more on using deep learning architectures (Husseini Orabi et al., 2018; Kshirsagar et al., 2017; Mohammadi et al., 2019) that considerably reduce the time and effort required for feature engineering. However, researchers have continued using traditional machine learning methods to predict individuals susceptible to mental disorders and suicide ideation, which could be due to the lack of large sets of annotated data (e.g., Hauser et al. (2019)) or to the requirement of explainability (e.g., Saha et al. (2022)). ...
... Fear of the virus and social distancing rules affected people's mental health, which negatively impacted their feelings, mood, daily habits, and social relationships, which are essential elements of human mental well-being. Specifically, restrictions due to social distancing and quarantines increased feelings of loneliness and social anxiety [113]. Many works have leveraged textual features for loneliness detection from social media content. ...
Preprint
Online social media provides a channel for monitoring people's social behaviors and their mental distress. Due to the restrictions imposed by COVID-19, people are increasingly using online social networks to express their feelings. Consequently, there is a significant amount of diverse user-generated social media content. However, the COVID-19 pandemic has changed the way we live, study, socialize and recreate, and this has affected our well-being and mental health. A growing body of research leverages online social media analysis to detect and assess users' mental status. In this paper, we survey the literature of social media analysis for mental disorder detection, with a special focus on the studies conducted in the context of COVID-19 during 2020-2021. Firstly, we classify the surveyed studies in terms of feature extraction types, varying from language usage patterns to aesthetic preferences and online behaviors. Secondly, we explore methods used for mental disorder detection, including machine learning and deep learning methods. Finally, we discuss the challenges of mental disorder detection using social media data, including the privacy and ethical concerns, as well as the technical challenges of scaling and deploying such systems at large scales, and discuss the lessons learnt over the last few years.