Conference Paper

A Large-Scale Dataset for Motivational Dialogue System: An Application of Natural Language Generation to Mental Health


... I hope it turned out to be a blessing. Lokala et al., 2022), counseling (Althoff et al., 2016; Shen et al., 2020, 2022) or motivational interviewing (Pérez-Rosas et al., 2016; Saha et al., 2021, 2022). Generally, the ESC system aims at reducing the user's emotional distress as well as assisting the user to identify and overcome the problem via conversations. ...
Preprint
Unlike empathetic dialogues, the system in emotional support conversations (ESC) is expected to not only convey empathy for comforting the help-seeker, but also proactively assist in exploring and addressing their problems during the conversation. In this work, we study the problem of mixed-initiative ESC where the user and system can both take the initiative in leading the conversation. Specifically, we conduct a novel analysis on mixed-initiative ESC systems with a tailor-designed schema that divides utterances into different types with speaker roles and initiative types. Four emotional support metrics are proposed to evaluate the mixed-initiative interactions. The analysis reveals the necessity and challenges of building mixed-initiative ESC systems. In light of this, we propose a knowledge-enhanced mixed-initiative framework (KEMI) for ESC, which retrieves actual case knowledge from a large-scale mental health knowledge graph for generating mixed-initiative responses. Experimental results on two ESC datasets show the superiority of KEMI in both content-preserving evaluation and mixed-initiative-related analyses.
... Motivation As NLG is increasingly integrated into online systems for tasks like mental health support [56] and behavioral interventions [33], ensuring individuals can disclose their gender in a safe environment is critical to their efficacy and the reduction of existing TGNB stigma. Therefore, another dimension in assessing gender non-affirmation in LLMs is evaluating how models respond to gender identity disclosure [47]. ...
Preprint
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization by TGNB persons contributes to and persists within Open Language Generation (OLG). By first understanding their marginalization stressors, we evaluate (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce the TANGO dataset, comprising template-based text curated from real-world text within a TGNB-oriented community. We discover a dominance of binary gender norms within the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, LLM text contained stigmatizing language and scored most toxic when triggered by TGNB gender disclosure. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.
... This often causes a lack of robustness which can hardly be avoided, and that is the reason why health-related data-driven DMs are extremely scarce in the literature. Among such systems, we can find [20,21], but these have not been tested with potential end-users. On the other hand, even though there are some other works describing DMs for tasks comparable to ours, the vast majority of them employ agenda, plan or rule-based designs (see Section 2.2). ...
Article
Designing human–machine interactive systems requires cooperation between different disciplines. In this work, we present a Dialogue Manager and a Language Generator that are the core modules of a Voice-based Spoken Dialogue System (SDS) capable of carrying out challenging, long and complex coaching conversations. We also develop an efficient integration procedure of the whole system that will act as an intelligent and robust Virtual Coach. The coaching task significantly differs from the classical applications of SDSs, resulting in a much higher degree of complexity and difficulty. The Virtual Coach has been successfully tested and validated in a user study with independent elderly people, in three different countries with three different languages and cultures: Spain, France and Norway.
... Psychological Counseling Datasets Some dialogue studies on mental health address the emotions in the dialogue process and endeavour to motivate users suffering from mood disorders. For example, Saha et al. (2021) present the dialogue dataset MotiVAte of imparting optimism, hope, and motivation for distressed people. Recently, works like ESConv switch their attention to Emotional Support Dialog Systems. ...
Preprint
In a depression-diagnosis-directed clinical session, doctors initiate a conversation with ample emotional support that guides the patients to expose their symptoms based on clinical diagnosis criteria. Such a dialog is a combination of task-oriented and chitchat, different from traditional single-purpose human-machine dialog systems. However, due to the social stigma associated with mental illness, the dialogue data related to depression consultation and diagnosis are rarely disclosed. Though automatic dialogue-based diagnosis foresees great application potential, data sparsity has become one of the major bottlenecks restricting research on such task-oriented chat dialogues. Based on clinical depression diagnostic criteria ICD-11 and DSM-5, we construct D$^4$: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat which simulates the dialogue between doctors and patients during the diagnosis of depression, including diagnosis results and symptom summary given by professional psychiatrists for each dialogue. Finally, we fine-tune state-of-the-art pre-trained models and present our dataset baselines on four tasks: response generation, topic prediction, dialog summary, and severity classification of depressive episode and suicide risk. Multi-scale evaluation results demonstrate that a more empathy-driven and diagnostic-accurate consultation dialogue system trained on our dataset can be achieved compared to rule-based bots.
... Recently, natural language generation approaches have become popular for health-related chatbot applications and for other domains. Saha et al. (2021) trained a variety of classifiers, based on a Seq2Seq architecture and reinforcement learning applications, on a manually annotated dialogue corpus for their mental health chatbot application. ...
Conference Paper
Open-domain large language models have progressed to generating natural-sounding and coherent text. Even though the generated texts appear human-like, the main stumbling block is that their output is never fully predictable, which runs the risk of resulting in harmful content such as false statements or inflammatory language. This makes it difficult to apply these models in highly sensitive domains, including personal health counselling. Hence, most of the chatbots for highly sensitive domains are developed using pre-scripted approaches. Although pre-scripted approaches are highly controlled, they suffer from repetitiveness and scalability issues. In this paper, we explore the possibility of combining the best of both worlds. We propose and describe in detail a new, flexible expert-driven hybrid architecture for harnessing the benefits of large language models in a controlled manner for highly sensitive domains and discuss the expectations and challenges.
Article
Depression is a major public health concern in the U.S. and globally. While successful early identification and treatment can lead to many positive health and behavioral outcomes, depression remains undiagnosed, untreated or undertreated due to several reasons, including denial of the illness as well as cultural and social stigma. With the ubiquity of social media platforms, millions of people are now sharing their online persona by expressing their thoughts, moods, emotions, and even their daily struggles with mental health on social media. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of depressive symptoms from tweets obtained unobtrusively. Particularly, we examine and exploit multimodal big (social) data to discern depressive behaviors using a wide variety of features including individual-level demographics. By developing a multimodal framework and employing statistical techniques to fuse heterogeneous sets of features obtained through the processing of visual, textual, and user interaction data, we significantly enhance the current state-of-the-art approaches for identifying depressed individuals on Twitter (improving the average F1-Score by 5 percent) as well as facilitate demographic inferences from social media. Besides providing insights into the relationship between demographics and mental health, our research assists in the design of a new breed of demographic-aware health interventions.
Conference Paper
Social media platforms are increasingly being used to share and seek advice on mental health issues. In particular, Reddit users freely discuss such issues on various subreddits, whose structure and content can be leveraged to formally interpret and relate subreddits and their posts in terms of mental health diagnostic categories. There is prior research on the extraction of mental health-related information, including symptoms, diagnosis, and treatments from social media; however, our approach can additionally provide actionable information to clinicians about the mental health of a patient in diagnostic terms for web-based intervention. Specifically, we provide a detailed analysis of the nature of subreddit content from a domain expert's perspective and introduce a novel approach to map each subreddit to the best matching DSM-5 (Diagnostic and Statistical Manual of Mental Disorders - 5th Edition) category using a multi-class classifier. Our classification algorithm analyzes all the posts of a subreddit by adapting topic modeling and word-embedding techniques, and utilizing curated medical knowledge bases to quantify relationship to DSM-5 categories. Our semantic encoding-decoding optimization approach reduces the false-alarm-rate from 30% to 2.5% over a comparable heuristic baseline, and our mapping results have been verified by domain experts achieving a kappa score of 0.84.
Article
Background: A World Health Organization 2017 report stated that major depression affects almost 5% of the human population. Major depression is associated with impaired psychosocial functioning and reduced quality of life. Challenges such as shortage of mental health personnel, long waiting times, perceived stigma, and lower government spending pose barriers to the alleviation of mental health problems. Face-to-face psychotherapy alone provides only point-in-time support and cannot scale quickly enough to address this growing global public health challenge. Artificial intelligence (AI)-enabled, empathetic, and evidence-driven conversational mobile app technologies could play an active role in filling this gap by increasing adoption and enabling reach. Although such technologies can help manage these barriers, they should never replace time with a health care professional for more severe mental health problems. However, app technologies could act as a supplementary or intermediate support system. Mobile mental well-being apps need to uphold privacy and foster both short- and long-term positive outcomes. Objective: This study aimed to present a preliminary real-world data evaluation of the effectiveness and engagement levels of an AI-enabled, empathetic, text-based conversational mobile mental well-being app, Wysa, on users with self-reported symptoms of depression. Methods: In the study, a group of anonymous global users were observed who voluntarily installed the Wysa app, engaged in text-based messaging, and self-reported symptoms of depression using the Patient Health Questionnaire-9. On the basis of the extent of app usage on and between 2 consecutive screening time points, 2 distinct groups of users (high users and low users) emerged. The study used a mixed-methods approach to evaluate the impact and engagement levels among these users.
The quantitative analysis measured the app impact by comparing the average improvement in symptoms of depression between high and low users. The qualitative analysis measured the app engagement and experience by analyzing in-app user feedback and evaluated the performance of a machine learning classifier to detect user objections during conversations. Results: The average mood improvement (ie, difference in pre- and post-self-reported depression scores) between the groups (ie, high vs low users; n=108 and n=21, respectively) revealed that the high users group had significantly higher average improvement (mean 5.84 [SD 6.66]) compared with the low users group (mean 3.52 [SD 6.15]); Mann-Whitney P=.03 and with a moderate effect size of 0.63. Moreover, 67.7% of user-provided feedback responses found the app experience helpful and encouraging. Conclusions: The real-world data evaluation findings on the effectiveness and engagement levels of Wysa app on users with self-reported symptoms of depression show promise. However, further work is required to validate these initial findings in much larger samples and across longer periods.
Article
Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient to the successful application of these representations is to train them on very large corpora, and use these pre-trained models in downstream tasks. In this paper, we describe how we trained such high quality word representations for 157 languages. We used two sources of data to train these models: the free online encyclopedia Wikipedia and data from the Common Crawl project. We also introduce three new word analogy datasets to evaluate these word vectors, for French, Hindi and Polish. Finally, we evaluate our pre-trained word vectors on 10 languages for which evaluation datasets exist, showing very strong performance compared to previous models.
Conference Paper
With the rise of social media, millions of people are routinely expressing their moods, feelings, and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential for detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expression on Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.
Article
Background Synchronous written conversations (or “chats”) are becoming increasingly popular as Web-based mental health interventions. Therefore, it is of utmost importance to evaluate and summarize the quality of these interventions. Objective The aim of this study was to review the current evidence for the feasibility and effectiveness of online one-on-one mental health interventions that use text-based synchronous chat. Methods A systematic search was conducted of the databases relevant to this area of research (Medical Literature Analysis and Retrieval System Online [MEDLINE], PsycINFO, Central, Scopus, EMBASE, Web of Science, IEEE, and ACM). There were no specific selection criteria relating to the participant group. Studies were included if they reported interventions with individual text-based synchronous conversations (ie, chat or text messaging) and a psychological outcome measure. Results A total of 24 articles were included in this review. Interventions included a wide range of mental health targets (eg, anxiety, distress, depression, eating disorders, and addiction) and intervention design. Overall, compared with the waitlist (WL) condition, studies showed significant and sustained improvements in mental health outcomes following synchronous text-based intervention, and post treatment improvement equivalent but not superior to treatment as usual (TAU) (eg, face-to-face and telephone counseling). Conclusions Feasibility studies indicate substantial innovation in this area of mental health intervention with studies utilizing trained volunteers and chatbot technologies to deliver interventions. While studies of efficacy show positive post-intervention gains, further research is needed to determine whether time requirements for this mode of intervention are feasible in clinical practice.
Article
Background Web-based cognitive-behavioral therapeutic (CBT) apps have demonstrated efficacy but are characterized by poor adherence. Conversational agents may offer a convenient, engaging way of getting support at any time. Objective The objective of the study was to determine the feasibility, acceptability, and preliminary efficacy of a fully automated conversational agent to deliver a self-help program for college students who self-identify as having symptoms of anxiety and depression. Methods In an unblinded trial, 70 individuals age 18-28 years were recruited online from a university community social media site and were randomized to receive either 2 weeks (up to 20 sessions) of self-help content derived from CBT principles in a conversational format with a text-based conversational agent (Woebot) (n=34) or were directed to the National Institute of Mental Health ebook, “Depression in College Students,” as an information-only control group (n=36). All participants completed Web-based versions of the 9-item Patient Health Questionnaire (PHQ-9), the 7-item Generalized Anxiety Disorder scale (GAD-7), and the Positive and Negative Affect Scale at baseline and 2-3 weeks later (T2). Results Participants were on average 22.2 years old (SD 2.33), 67% female (47/70), mostly non-Hispanic (93%, 54/58), and Caucasian (79%, 46/58). Participants in the Woebot group engaged with the conversational agent an average of 12.14 (SD 2.23) times over the study period. No significant differences existed between the groups at baseline, and 83% (58/70) of participants provided data at T2 (17% attrition). Intent-to-treat univariate analysis of covariance revealed a significant group difference on depression such that those in the Woebot group significantly reduced their symptoms of depression over the study period as measured by the PHQ-9 (F=6.47; P=.01) while those in the information control group did not. 
In an analysis of completers, participants in both groups significantly reduced anxiety as measured by the GAD-7 (F(1,54)=9.24; P=.004). Participants’ comments suggest that process factors were more influential on their acceptability of the program than content factors mirroring traditional therapy. Conclusions Conversational agents appear to be a feasible, engaging, and effective way to deliver CBT.
Article
In the past few years, the field of computer vision has gone through a revolution fueled mainly by the advent of large datasets and the adoption of deep convolutional neural networks for end-to-end learning. The person re-identification subfield is no exception to this, thanks to the notable publication of the Market-1501 and MARS datasets and several strong deep learning approaches. Unfortunately, a prevailing belief in the community seems to be that the triplet loss is inferior to using surrogate losses (classification, verification) followed by a separate metric learning step. We show that, for models trained from scratch as well as pretrained ones, using a variant of the triplet loss to perform end-to-end deep metric learning outperforms any other published method by a large margin.
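As a rough illustration of the loss family this abstract refers to, the sketch below computes the basic margin-based triplet loss for a single (anchor, positive, negative) triple. The abstract mentions "a variant of the triplet loss" without giving its form, so this is the textbook version, not the authors' variant; the function names and the margin value of 0.2 are assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Basic triplet loss: max(0, margin + d(a, p) - d(a, n)).

    The margin default is an illustrative assumption, not a value
    taken from the abstract.
    """
    return max(0.0, margin + euclidean(anchor, positive) - euclidean(anchor, negative))


anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same identity as the anchor
easy_negative = [3.0, 4.0]   # far from the anchor, so it contributes no loss
hard_negative = [0.0, 0.5]   # closer to the anchor than the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is why only hard triples drive learning in this formulation.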
Conference Paper
Cultural and gender norms shape how mental illness and therapy are perceived. However, there is a paucity of adequate empirical evidence around gender and cultural dimensions of mental illness. In this paper we situate social media as a "lens" to examine these dimensions. We focus on a large dataset of individuals who self-disclose to have an underlying mental health concern on Twitter. Having identified genuine disclosures in this data via semi-supervised learning, we examine differences in their posts, as measured via linguistic attributes and topic models. Our findings reveal significant differences between the content shared by female and male users, and by users from two western and two majority world countries. Males express higher negativity and lower desire for social support, whereas majority world users demonstrate more inhibition in their expression. We discuss the implications of our work in providing insights into the relationship of gender and culture with mental health, and in the design of gender and culture-aware health interventions.
Article
Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with recent neural network architectures. We evaluate the model performance through automatic evaluation metrics and by carrying out a human evaluation. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context.
Article
Mental illness is one of the most pressing public health issues of our time. While counseling and psychotherapy can be effective treatments, our knowledge about how to conduct successful counseling conversations has been limited due to lack of large-scale data with labeled outcomes of the conversations. In this paper, we present a large-scale, quantitative study on the discourse of text-message-based counseling conversations. We develop a set of novel computational discourse analysis methods to measure how various linguistic aspects of conversations are correlated with conversation outcomes. Applying techniques such as sequence-based conversation models, language model comparisons, message clustering, and psycholinguistics-inspired word frequency analyses, we discover actionable conversation strategies that are associated with better conversation outcomes.
Article
Objective: The aim of the current study was to explore the progress and depth of counselling processes used during online chat sessions, and their relationships to the number of sessions attended and client treatment outcomes. Method: Transcripts from 49 online clients were analysed using the Counselling Progress and Depth Instrument. Psychological distress, life satisfaction, and hope measures were collected prior to the participant’s first session and again 6 weeks later providing treatment outcomes. Results: Overall, progress and depth scores were higher for clients who attended multiple sessions and associated with greater alleviation in clients’ psychological distress. Problem clarification and action planning processes were both correlated with reductions in psychological distress. Conclusions: Findings imply that advancing through more of the stages of counselling in greater depth may help improve client outcomes from online counselling.
Conference Paper
Mental illnesses such as depression and anxiety are highly prevalent, and therapy is increasingly being offered online. This new setting is a departure from face-to-face therapy, and offers both a challenge and an opportunity: it is not yet known what features or approaches are likely to lead to successful outcomes in such a different medium, but online text-based therapy provides large amounts of data for linguistic analysis. We present an initial investigation into the application of computational linguistic techniques, such as topic and sentiment modelling, to online therapy for depression and anxiety. We find that important measures such as symptom severity can be predicted with comparable accuracy to face-to-face data, using general features such as discussion topic and sentiment; however, measures of patient progress are captured only by finer-grained lexical features, suggesting that aspects of style or dialogue structure may also be important.
Article
In this paper we propose a method for predicting the user mental state for the development of more efficient and usable spoken dialogue systems. This prediction, carried out for each user turn in the dialogue, makes it possible to adapt the system dynamically to the user needs. The mental state is built on the basis of the emotional state of the user and their intention, and is recognized by means of a module conceived as an intermediate phase between natural language understanding and the dialogue management in the architecture of the systems. We have implemented the method in the UAH system, for which the evaluation results with both simulated and real users show that taking into account the user's mental state improves system performance as well as its perceived quality.
Article
This paper reports on a shared task involving the assignment of emotions to suicide notes. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in the corpus of fully anonymized clinical text and annotated suicide notes. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
Article
Whereas before 2006 it appears that deep multilayer neural networks were not successfully trained, since then several algorithms have been shown to successfully train them, with experimental results showing the superiority of deeper vs less deep architectures. All these experimental results were obtained with new initialization or training mechanisms. Our objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future. We first observe the influence of the non-linear activation functions. We find that the logistic sigmoid activation is unsuited for deep networks with random initialization because of its mean value, which can drive especially the top hidden layer into saturation. Surprisingly, we find that saturated units can move out of saturation by themselves, albeit slowly, which explains the plateaus sometimes seen when training neural networks. We find that a new non-linearity that saturates less can often be beneficial. Finally, we study how activations and gradients vary across layers and during training, with the idea that training may be more difficult when the singular values of the Jacobian associated with each layer are far from 1. Based on these considerations, we propose a new initialization scheme that brings substantially faster convergence.
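The abstract does not spell out the initialization scheme it proposes. A minimal sketch, assuming the commonly cited "normalized" (Glorot/Xavier) uniform form, in which each weight is drawn from U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), could look like the following. The function name, the layer sizes, and the fixed seed are illustrative assumptions.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng=None):
    """Return a fan_in x fan_out weight matrix drawn from U[-limit, limit].

    limit = sqrt(6 / (fan_in + fan_out)) is the commonly cited normalized
    initialization; it is assumed here, since the abstract gives no formula.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


# Hypothetical layer sizes; with 256 inputs and 128 outputs the limit is 0.125.
weights = glorot_uniform(256, 128)
```

Scaling the range by both fan-in and fan-out is meant to keep the variance of activations and of back-propagated gradients roughly constant from layer to layer, which matches the motivation given in the abstract.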
Article
Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that cannot be reused.
Article
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
Chapter
When people suffer from issues like workplace stress, financial stress, marital stress, family stress, bullying, domestic violence, peer pressure, anxiety, relationship issues, career issues, parenting, loneliness, depression and others, they try to avoid them, either by saying life is just like this or by comparing their problems with others. Distracting themselves from the problems is a temporary solution which will give rise to permanent problems. The problem we attempt to solve through our web app is finding someone with whom to share our issues. We propose to develop a web app for online therapy, also known as Web Counseling. Online Therapy is a solution where one can find trained listeners and professional therapists with whom one can share problems or thoughts, and might be able to shift one’s focus from problems to solutions.
Article
Social determinants of health (SDOH) are known to influence mental health outcomes, which are independent risk factors for poor health status and physical illness. Currently, however, existing SDOH data collection methods are ad hoc and inadequate, and SDOH data are not systematically included in clinical research or used to inform patient care. Social contextual data are rarely captured prospectively in a structured and comprehensive manner, leaving large knowledge gaps. Extraction methods are now being developed to facilitate the collection, standardization, and integration of SDOH data into electronic health records. If successful, these efforts may have implications for health equity, such as reducing disparities in access and outcomes. Broader use of surveys, natural language processing, and machine learning methods to harness SDOH may help researchers and clinical teams reduce barriers to mental health care.
Chapter
Over the past decade, scholars have been able to actively engage with patients, informal caregivers, and providers through social media sites and patient-centered groups in ways that are reshaping patient-centered research design and recruitment. As with the introduction of any new technology, there exists both potential for new modes of inquiry and unforeseen ethical quandaries. This chapter presents researchers with the types of questions, ongoing points of debate, and nascent solutions relevant to a research platform in which ethical considerations have yet to be well defined.
Article
In this article, we propose a novel multitask learning attention-based deep neural network model, which facilitates the fusion of various modalities. In particular, we use this network to both regress and classify the level of depression. Acoustic, textual, and visual modalities have been used to train our proposed network. Various experiments have been carried out on the benchmark dataset, namely, the Distress Analysis Interview Corpus - Wizard of Oz. From the results, we empirically justify that a) multitask learning networks cotrained over regression and classification have better performance compared to single-task networks, and b) the fusion of all the modalities helps in giving the most accurate estimation of depression with respect to regression.
Conference Paper
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT-14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.7 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a strong phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which beats the previous state of the art. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
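The source-reversal trick mentioned at the end of this abstract is a preprocessing step and can be illustrated without any model code. The snippet below is a minimal sketch, assuming sentences are already tokenized into lists of strings; the function name `reverse_source` and the toy sentence pair are hypothetical and not taken from the paper.

```python
def reverse_source(pairs):
    """Reverse the token order of each source sentence; leave targets unchanged.

    `pairs` is a list of (source_tokens, target_tokens) tuples.
    """
    return [(list(reversed(src)), list(tgt)) for src, tgt in pairs]


# Toy example (hypothetical data): only the source side is reversed.
corpus = [(["a", "b", "c"], ["x", "y", "z"])]
reversed_corpus = reverse_source(corpus)
print(reversed_corpus)
```

After reversal, the first source words end up next to the first target words in the concatenated input, which is the kind of short-term dependency the abstract credits for easier optimization.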
Article
This cross-sectional study explored the hope and expectations of young people accessing an online chat counselling service, as these common therapeutic factors have not yet been investigated in the online environment. Participants included 1033 young people aged 16–25 years, mostly young women, who completed an online questionnaire available through the online mental health service's homepage. Findings showed that online clients had low levels of hope, high treatment outcome expectations, high levels of psychological distress, and low levels of life satisfaction. Hope and expectations were barely associated and about two-thirds of respondents reported low hope but high expectations. Only hope, however, was found to be related to psychological distress and life satisfaction, with higher hope being protective. Expectations, discordance between hope and expectations, and amount of online services received were not associated with psychological distress or life satisfaction. The low levels of hope and high levels of psychological distress, but high expectations, of young people accessing online counselling reveal challenges for this approach.
Article
The expressive power of a machine learning model is closely related to the number of sequential computational steps it can learn. For example, Deep Neural Networks have been more successful than shallow networks because they can perform a greater number of sequential computational steps (each highly parallel). The Neural Turing Machine (NTM) is a model that can compactly express an even greater number of sequential computational steps, so it is even more powerful than a DNN. Its memory addressing operations are designed to be differentiable; thus the NTM can be trained with backpropagation. While differentiable memory is relatively easy to implement and train, it necessitates accessing the entire memory content at each computational step. This makes it difficult to implement a fast NTM. In this work, we use the Reinforce algorithm to learn where to access the memory, while using backpropagation to learn what to write to the memory. We call this model the RL-NTM. Reinforce allows our model to access a constant number of memory cells at each computational step, so its implementation can be faster. The RL-NTM is the first model that can, in principle, learn programs of unbounded running time. We successfully trained the RL-NTM to solve a number of algorithmic tasks that are simpler than the ones solvable by the fully differentiable NTM. As the RL-NTM is a fairly intricate model, we needed a method for verifying the correctness of our implementation. To do so, we developed a simple technique for numerically checking arbitrary implementations of models that use Reinforce, which may be of independent interest.
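The abstract does not describe the checking technique itself, so the following is only a sketch of one common way to validate a Reinforce implementation, not necessarily the authors' method. It assumes a toy softmax policy over a few discrete actions with fixed rewards (all names and numbers are hypothetical), enumerates the expectation of the Reinforce estimator exactly, and compares it with a central finite-difference gradient of the expected reward.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(logits, rewards):
    """J(theta) = sum_a p(a) * R(a) for a softmax policy."""
    return sum(p * r for p, r in zip(softmax(logits), rewards))


def reinforce_gradient_exact(logits, rewards):
    """Expectation of R(a) * d/dtheta log p(a), enumerated over all actions."""
    probs = softmax(logits)
    k = len(logits)
    grad = [0.0] * k
    for a in range(k):
        for j in range(k):
            # For a softmax policy, d log p(a) / d theta_j = 1[a == j] - p_j.
            dlogp = (1.0 if a == j else 0.0) - probs[j]
            grad[j] += probs[a] * rewards[a] * dlogp
    return grad


def finite_difference_gradient(logits, rewards, eps=1e-5):
    """Central-difference estimate of dJ/dtheta."""
    grad = []
    for j in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[j] += eps
        down[j] -= eps
        diff = expected_reward(up, rewards) - expected_reward(down, rewards)
        grad.append(diff / (2 * eps))
    return grad


# Hypothetical toy problem: three actions with fixed rewards.
logits = [0.1, -0.4, 1.2]
rewards = [1.0, 0.0, -2.0]
analytic = reinforce_gradient_exact(logits, rewards)
numeric = finite_difference_gradient(logits, rewards)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_gap)
```

A small `max_gap` indicates that the score-function gradient agrees with the numerical derivative. In a sampled implementation, the same comparison would be made against an average over many samples, with a tolerance that accounts for sampling noise.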
Article
The nine-item Patient Health Questionnaire depression scale is a dual-purpose instrument that can establish provisional depressive disorder diagnoses as well as grade depression severity.
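The severity-grading use of the instrument can be sketched in a few lines. The snippet below assumes the commonly published PHQ-9 scoring convention, which is not stated in the abstract as shown here: nine items each scored 0 to 3, a total of 0 to 27, and cut points of 5, 10, 15, and 20 for mild, moderate, moderately severe, and severe depression. The function name `phq9_severity` is hypothetical, and the sketch covers severity grading only, not the provisional-diagnosis algorithm.

```python
def phq9_severity(item_scores):
    """Return (total, severity_label) for nine PHQ-9 item scores.

    Assumes each item is scored 0-3 and uses the cut points 5/10/15/20.
    """
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item score must be 0, 1, 2, or 3")

    total = sum(item_scores)
    if total >= 20:
        label = "severe"
    elif total >= 15:
        label = "moderately severe"
    elif total >= 10:
        label = "moderate"
    elif total >= 5:
        label = "mild"
    else:
        label = "minimal"
    return total, label


# Hypothetical questionnaire response.
print(phq9_severity([2, 2, 2, 2, 2, 0, 0, 0, 0]))
```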
Article
The Hamilton Depression Rating Scale (HDRS) is the most widely used scale for patient selection and follow-up in research studies of treatments of depression. Despite extensive study of the reliability and validity of the total scale score, the psychometric characteristics of the individual items have not been well studied. In the only reliability study to report agreement on individual items using a test-retest interview method, most of the items had only fair or poor agreement. Because this is due in part to variability in the way the information is obtained to make the various rating distinctions, the Structured Interview Guide for the HDRS (SIGH-D) was developed to standardize the manner of administration of the scale. A test-retest reliability study conducted on a series of psychiatric inpatients demonstrated that the use of the SIGH-D results in a substantially improved level of agreement for most of the HDRS items.
The distress analysis interview corpus of human and computer interviews
  • J Gratch
  • R Artstein
  • G M Lucas
  • G Stratou
  • S Scherer
  • A Nazarian
  • R Wood
  • J Boberg
  • D Devault
  • S Marsella