Conference Paper

MindTime: Deep Learning Approach for Borderline Personality Disorder Detection

Authors:
  • Faculty of Management Technology and Information System Port Said University

... Borderline Personality Disorder (BPD): BPD is a mental health condition characterized by unstable relationship patterns, strong emotional reactions, and low life satisfaction [37]. The text-based analysis of BPD symptoms is widely researched in the literature [38], [39]. ...
Article
Full-text available
Mental illness prediction through text involves employing natural language processing (NLP) techniques and deep learning algorithms to analyze textual data for the identification of mental disorders. Therefore, machine learning and deep learning algorithms have been utilized in the existing literature for the detection of mental illness. However, current systems exhibit suboptimal performance primarily due to their reliance on traditional embedding techniques and generic language models to generate text embeddings. To address this limitation, there is a requirement for domain-specific pretrained language models that comprehensively understand the context found in posts of mentally ill patients. Posts from individuals with mental illness often contain metaphorical expressions, posing a challenge for existing models in understanding such figurative language. In this study, we propose a hybrid transformer architecture, comprising MentalBERT and MelBERT pretrained language models, cascaded with CNN models to generate and concatenate deep features. MentalBERT is pretrained on an extensive corpus of text data specifically related to the mental health domain, while MelBERT is trained on a large corpus of metaphorical data for improved understanding of metaphorical expressions. The results reveal outstanding performance of the proposed architecture with an overall accuracy of 92% and an F1-score of 92%, surpassing state-of-the-art models in comparison. This study underscores the necessity for further research in this field and illustrates the potential of advanced technologies to address mental health issues in contemporary society.
... The problem of mental health is a serious issue in modern society. It usually refers to a person's mood, thoughts, and behavior [1]. One of the leading causes of suicide is poor mental health [2]. ...
Article
Full-text available
Narcissistic personality disorder (NPD) is a personality disorder that affects various aspects of life, including relationships, employment, school, and finances. People with NPD usually feel unhappy and disappointed when no one helps them or when they are not praised for their achievements. Narcissism is generally diagnosed using a screening test that is time-consuming and costly. This research aims to evaluate the performance of several feature selection (FS) approaches on machine learning (ML) techniques (support vector machine (SVM), random forest classifier (RFC), and Naive Bayes). Three FS scenarios (all features, the information gain technique, and the gain ratio (GR) technique) are used for each ML method. Several experiments were run on the benchmark narcissistic disorder dataset using 10-fold cross-validation. We evaluate each method’s performance by measuring its accuracy, error rate, and processing time. The RFC with GR feature selection gives the best performance, with an accuracy of 100%.
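The two feature-scoring criteria named in this abstract, information gain and gain ratio, can be sketched in a few lines of plain Python. This is a minimal illustration only: the function names and the toy data are assumptions, not taken from the study, and the study's own tooling is not described here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info <= 0:
        return 0.0  # a constant feature carries no split information
    return information_gain(feature, labels) / split_info


# Hypothetical toy data: one questionnaire answer per respondent and a binary label.
answers = ["yes", "yes", "no", "no"]
labels = [1, 1, 0, 0]
print(information_gain(answers, labels), gain_ratio(answers, labels))
```

Gain ratio penalizes features with many distinct values, which is the usual reason for preferring it over raw information gain when ranking features.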
... From 2015 onward there has been, according to Graham et al. (2019), a steep increase in the number of publications about AI for mental health. However, our database search (Scopus, Web of Science, Science Direct, PubMed, IEEE Xplore) with the terms "expert system," "decision support system," or "artificial intelligence" on the one hand, and "personality disorders" or any of the individual disorders on the other, only returned tangential research (Singh et al., 2020; Ellouze et al., 2021; Khazbak et al., 2021), proposals (Tuena et al., 2020; Sulistiani et al., 2021; Szalai, 2021), or proofs of concept (Nunes et al., 2009; Casado-Lumbreras et al., 2012; Randa and Permanasari, 2014; Laijawala et al., 2020). ...
Article
Full-text available
Personality disorders are psychological ailments with a major negative impact on patients, their families, and society in general, especially those of the dramatic and emotional type. Despite all the research, there is still no consensus on the best way to assess and treat them. Traditional assessment of personality disorders has focused on a limited number of psychological constructs or behaviors using structured interviews and questionnaires, without an integrated and holistic approach. We present a novel methodology for the study and assessment of personality disorders consisting in the development of a Bayesian network, whose parameters have been obtained by the Delphi method of consensus from a group of experts in the diagnosis and treatment of personality disorders. The result is a probabilistic graphical model that represents the psychological variables related to the personality disorders along with their relations and conditional probabilities, which allow identifying the symptoms with the highest diagnostic potential. This model can be used, among other applications, as a decision support system for the assessment and treatment of personality disorders of the dramatic or emotional cluster. In this paper, we discuss the need to validate this model in the clinical population along with its strengths and limitations.
Article
This study aims to diagnose patients with borderline personality disorder (BPD) based on the Cyberball social exclusion task and resting-state functional magnetic resonance imaging (fMRI) using a machine learning approach. The researchers used fMRI images to examine social brain function and learning in BPD. Thirty-six participants completed the ‘Cyberball’ task. Questionnaire data and features extracted from the fMRI data were used to diagnose BPD. Three statistical models were used to diagnose BPD, and the best model was selected based on appropriate criteria. The models also identified important features. In total, 20 participants had BPD and 16 were healthy; 83.3% were women and 16.7% were men. Logistic Lasso Regression (LLR) was the best model for the diagnosis of patients with BPD. Physical abuse, sexual abuse, and the use of antidepressants and antipsychotic drugs were selected as important features by the models. Because of the structure of the machine learning models used in the study, no separate feature selection stage is needed, and important features can be identified within the models. The diagnosis of BPD was also made with high accuracy, so clinicians can diagnose BPD using all available information, including questionnaire information and fMRI data.
Article
Full-text available
This paper analyzes and compares different classifiers on different datasets of psychiatric disorders: personality disorder, depression, anxiety, schizophrenia, and Alzheimer's disease. Psychiatric disorders, also referred to as mental disorders, are abnormalities of the mind that result in persistent behavior which can seriously affect day-to-day functioning and life. In AI, "stochastic" refers to uncertainty or randomness in results; stochastic methods are used during optimization and also help to provide precise results. The study of stochastic processes in AI uses mathematical knowledge and techniques from probability, set theory, calculus, linear algebra, and mathematical analysis such as Fourier analysis, real analysis, and functional analysis. This technique is used to construct neural networks for building artificial intelligence models that process data and minimize human effort. The classifiers considered are SVM, MLP, LR, KNN, DT, and RF. Several types of attributes are used, and the models were trained with the Weka tool, MATLAB, and Python. The results show that the SVM classifier performed best for all the attributes and disorders studied in this paper.
Article
Full-text available
The recognition of personal emotional state or sentiment conveyed through text is the main task we address in our research. The communication of emotions through text messaging and posts of personal blogs poses the ‘informal style of writing’ challenge for researchers expecting grammatically correct input. Our Affect Analysis Model was designed to handle the informal messages written in an abbreviated or expressive manner. While constructing our rule-based approach to affect recognition from text, we followed the compositionality principle. Our method is capable of processing sentences of different complexity, including simple, compound, complex (with complement and relative clauses), and complex-compound sentences. The evaluation of the Affect Analysis Model algorithm showed promising results regarding its capability to accurately recognize affective information in text from an existing corpus of personal blog posts.
Article
Full-text available
Emotion detection (ED) is a branch of sentiment analysis that deals with the extraction and analysis of emotions. The evolution of Web 2.0 has put text mining and analysis at the frontiers of organizational success. It helps service providers provide tailor‐made services to their customers. Numerous studies are being carried out in the area of text mining and analysis due to the ease in sourcing for data and the vast benefits its deliverable offers. This article surveys the concept of ED from texts and highlights the main approaches adopted by researchers in the design of text‐based ED systems. The article further discusses some recent state‐of‐the‐art proposals in the field. The proposals are discussed in relation to their major contributions, approaches employed, datasets used, results obtained, strengths, and their weaknesses. Also, emotion‐labeled data sources are presented to provide neophytes with eligible text datasets for ED. Finally, the article presents some open issues and future research direction for text‐based ED.
Article
Full-text available
Borderline personality disorder (BPD) is a severe and heterogeneous mental disorder that is known to have its onset at a young age, often in adolescence. For this reason, it is of fundamental importance to identify clinical conditions of childhood and adolescence that present a high risk of evolving into BPD. Investigations indicate that early borderline pathology (before 19 years) predicts long-term deficits in functioning, and a higher percentage of these patients continue to present some BPD symptoms up to 20 years. There is general agreement among investigators that good competence in both childhood and early adulthood is the main predictive factor of excellent recovery in BPD patients. Some authors suggest that specific childhood personality traits can be considered precursors of adult BPD, as well as some clinical conditions: disruptive behaviours, disturbance in attention and emotional regulation, conduct disorders, substance use disorders, and attention-deficit-hyperactivity disorder. Unfortunately, diagnosis and treatment of BPD are usually delayed, in part because some clinicians are reluctant to diagnose BPD in younger individuals. Yet the early identification of BPD symptoms has important clinical implications in terms of early intervention programs, and ensures that young people with personality disorders obtain appropriate treatments. This review aims to collect the current evidence on early risk and protective factors in young people that may predict BPD onset, course, and outcome.
Article
Full-text available
Emotional evaluation of video clips is a difficult task because a clip includes not only stationary objects as the background but also dynamic objects as the foreground. In addition, many video analysis problems must be solved beforehand to properly address emotion-related tasks. Recently, however, the convolutional neural network (CNN)-based deep learning approach has opened up this possibility by solving the action recognition problem. Inspired by CNN-based action recognition technology, this paper attempts to evaluate the emotion of video clips. We propose a deep learning model that captures video features and evaluates the emotion of a video clip on the Thayer 2D emotion space. In the model, a pre-trained convolutional 3D neural network (C3D) generates short-term spatiotemporal features of the video, a long short-term memory (LSTM) network accumulates those consecutive time-varying features to characterize long-term dynamic behaviors, and a multilayer perceptron (MLP) evaluates the emotion of a video clip by regression on the emotion space. Because labeled data are limited, the C3D is employed to extract diverse spatiotemporal features from various layers by transfer learning. The C3D pre-trained on the Sports-1M dataset and the LSTM followed by the MLP for regression are trained end to end to fine-tune the C3D and to adjust the weights of the LSTM and the MLP-type emotion estimator. The proposed method achieves concordance correlation coefficient values of 0.6024 for valence and 0.6460 for arousal. We believe this emotional evaluation of video could be easily associated with appropriate music recommendation, once the music is emotionally evaluated in the same high-level emotional space.
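For readers unfamiliar with the metric reported in this abstract, the concordance correlation coefficient (CCC) between predictions x and targets y is 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). A minimal sketch follows; the function name and the toy valence values are illustrative assumptions, not data from the paper.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (divide-by-n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are the same constant")
    return 2 * cov / denom


# Hypothetical valence predictions that track the targets but are offset by +1.
targets = [1.0, 2.0, 3.0]
predictions = [2.0, 3.0, 4.0]
print(concordance_correlation(predictions, targets))  # 4/7, about 0.571
```

Unlike Pearson correlation, which would be 1.0 for the offset example, CCC also penalizes differences in mean and scale, which is why it is a common choice for regression on valence and arousal.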
Article
Full-text available
Mental health is an indicator of the emotional, psychological, and social well-being of an individual. It determines how an individual thinks, feels, and handles situations. Positive mental health helps one to work productively and realize their full potential. Mental health is important at every stage of life, from childhood and adolescence through adulthood. Many factors contribute to mental health problems that lead to mental illnesses such as stress, social anxiety, depression, obsessive compulsive disorder, drug addiction, and personality disorders. It is becoming increasingly important to determine the onset of mental illness to maintain proper life balance. Machine learning algorithms and Artificial Intelligence (AI) can be harnessed for predicting the onset of mental illness. Such applications, when implemented in real time, will benefit society by serving as a monitoring tool for individuals with deviant behavior. This research work proposes to apply various machine learning algorithms, such as support vector machines, decision trees, naïve Bayes classifier, K-nearest neighbor classifier, and logistic regression, to identify the state of mental health in a target group. The responses obtained from the target group for the designed questionnaire were first subjected to unsupervised learning techniques. The labels obtained as a result of clustering were validated by computing the Mean Opinion Score. These cluster labels were then used to build classifiers to predict the mental health of an individual. Populations from various groups, such as high school students, college students, and working professionals, were considered as target groups. The research presents an analysis of applying the aforementioned machine learning algorithms to the target groups and also suggests directions for future work.
Article
Full-text available
Selecting text feature items is a basic and important problem in text mining and information retrieval. Traditional methods of feature extraction require handcrafted features. Hand-designing an effective feature is a lengthy process, whereas deep learning can acquire new, effective feature representations from training data for new applications. As a new feature extraction method, deep learning has made achievements in text mining. The major difference between deep learning and conventional methods is that deep learning automatically learns features from big data instead of adopting handcrafted features, which depend mainly on the prior knowledge of designers and can hardly take advantage of big data. Deep learning can automatically learn feature representations, involving millions of parameters, from big data. This thesis first outlines the common methods used in text feature extraction, then expands on frequently used deep learning methods in text feature extraction and their applications, and forecasts the application of deep learning in feature extraction.
Article
Full-text available
Mobile technologies offer new opportunities for prospective, high resolution monitoring of long-term health conditions. The opportunities seem of particular promise in psychiatry where diagnoses often rely on retrospective and subjective recall of mood states. However, deriving clinically meaningful information from the complex time series data these technologies present is challenging, and the current implications for patient care are uncertain. In this study, 130 participants with bipolar disorder (n = 48) or borderline personality disorder (n = 31) and healthy volunteers (n = 51) completed daily mood ratings using a bespoke smartphone app for up to 1 year. A signature-based learning method was used to capture the evolving interrelationships between the different elements of mood and exploit this information to classify participants’ diagnosis and to predict subsequent mood. The three participant groups could be distinguished from one another on the basis of self-reported mood using the signature methodology. The methodology classified 75% of participants into the correct diagnostic group compared with 54% using standard approaches. Subsequent mood ratings were correctly predicted with >70% accuracy. Prediction of mood was most accurate in healthy volunteers (89–98%) compared to bipolar disorder (82–90%) and borderline personality disorder (70–78%). The signature method provided an effective approach to the analysis of mood data both in terms of diagnostic classification and prediction of future mood. It also highlighted the differing predictability and the overlap inherent within disorders. The three cohorts offered internally consistent but distinct patterns of mood interaction in their reporting which have the potential to enable more efficient and accurate diagnoses and thus earlier treatment.
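The abstract above does not spell out how the signature features are computed. As a rough illustration only, the sketch below computes the first two levels of the path signature of a piecewise-linear path by applying Chen's identity one segment at a time. The function name and the two-channel toy path are assumptions, and the study's actual feature pipeline may differ.

```python
def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a list of points, each a sequence of d numbers.
    Returns (s1, s2): s1 is the total increment per channel, and
    s2[i][j] is the iterated integral of channel i against channel j.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        step = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one straight segment:
        # S2 <- S2 + (S1 outer step) + (step outer step) / 2
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * step[j] + 0.5 * step[i] * step[j]
        # S1 <- S1 + step
        for i in range(d):
            s1[i] += step[i]
    return s1, s2


# Hypothetical two-channel path: channel 0 rises first, then channel 1.
s1, s2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
print(s1)  # [1.0, 1.0]
print(s2)  # [[0.5, 1.0], [0.0, 0.5]]
```

The off-diagonal terms differ depending on which channel moved first, so the second level records the order of changes between channels, which the total increments alone cannot capture.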
Chapter
Full-text available
This chapter addresses the aspects of facial expression quantification to detect low, medium, and high levels of expressions. It develops an automatic emotion classification technique for recognizing six different facial emotions—anger, disgust, fear, happiness, sadness, and surprise. The authors evaluated two different facial features for this purpose: facial deformation features and marker-based features for extracting facial expression features. The results show that the sectored volumetric difference function (SVDF/VDF) shape transformation features allow better quantification of facial expressions as compared to marker-based features. The further plans for this research will be to find better methods to fuse audiovisual information that can model the dynamics of facial expressions and speech. Segmental level acoustic information can be used to trace the emotions at a frame level.
Article
Full-text available
Automatic emotion detection in speech is a recent research area in the fields of human-machine interaction and speech processing. The aim of this paper is to enable natural interaction between human and machine. It proposes an approach to recognize the user's emotional state by analysing the human speech signal. To extract features from the signal well, the proposed technique applies a high-pass filter before the feature extraction process. The high-pass filter reduces noise: it passes only high frequencies and attenuates lower frequencies. This paper uses a neural network as a classifier to classify different emotional states, such as happy, sad, and angry, from an emotional speech database. Classification uses speech features such as Mel-frequency cepstral coefficients (MFCC). The results show that a neural network classifier is a feasible technique for emotion classification. Using the high-pass filter should increase performance.
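The abstract does not say which high-pass filter is used. One common choice before MFCC extraction is a first-order pre-emphasis filter, y[n] = x[n] − α·x[n−1]; the sketch below assumes that filter and the conventional value α = 0.97, both of which are assumptions rather than details from the paper.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass (pre-emphasis) filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    out = [signal[0]]  # the first sample has no predecessor and is passed through
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


# A constant (zero-frequency) signal is almost removed after the first sample,
# while a rapidly alternating (high-frequency) signal is amplified.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
print(pre_emphasis([1.0, -1.0, 1.0, -1.0]))
```

The filtered samples would then be framed and passed to an MFCC extractor; that step is omitted here because the paper's framing parameters are not given.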
Conference Paper
Full-text available
In this paper, we adopt a supervised machine learning approach to recognize six basic emotions (anger, disgust, fear, happiness, sadness and surprise) using a heterogeneous emotion-annotated dataset which combines news headlines, fairy tales and blogs. For this purpose, different feature sets, such as bags of words and N-grams, were used. The Support Vector Machines classifier (SVM) performed significantly better than other classifiers, and it generalized well on unseen examples.
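The feature sets mentioned in this abstract, bags of words and N-grams, can be illustrated with a short sketch. The whitespace tokenizer, the function name, and the example sentence are assumptions for illustration; the paper's preprocessing is not described here.

```python
from collections import Counter


def ngram_counts(text, n_max=2):
    """Count word n-grams for n = 1..n_max (n = 1 is the bag of words)."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = ngram_counts("I am so so happy")
print(features["so"])        # 2
print(features["so happy"])  # 1
```

In a full pipeline, such counts would be mapped to a fixed vocabulary and passed as a feature vector to a classifier such as an SVM.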
Article
Full-text available
Little is known about the cross-national population prevalence or correlates of personality disorders. To estimate prevalence and correlates of DSM-IV personality disorder clusters in the World Health Organization World Mental Health (WMH) Surveys. International Personality Disorder Examination (IPDE) screening questions in 13 countries (n = 21 162) were calibrated to masked IPDE clinical diagnoses. Prevalence and correlates were estimated using multiple imputation. Prevalence estimates are 6.1% (s.e. = 0.3) for any personality disorder and 3.6% (s.e. = 0.3), 1.5% (s.e. = 0.1) and 2.7% (s.e. = 0.2) for Clusters A, B and C respectively. Personality disorders are significantly elevated among males, the previously married (Cluster C), unemployed (Cluster C), the young (Clusters A and B) and the poorly educated. Personality disorders are highly comorbid with Axis I disorders. Impairments associated with personality disorders are only partially explained by comorbidity. Personality disorders are relatively common disorders that often co-occur with Axis I disorders and are associated with significant role impairments beyond those due to comorbidity.
Article
Full-text available
Epidemiological data on personality disorders, comorbidity and associated use of services are essential for health service policy. To measure the prevalence and correlates of personality disorder in a representative community sample. The Structured Clinical Interview for DSM-IV Axis II disorders was used to measure personality disorder in 626 persons aged 16-74 years in households in England, Scotland and Wales, in a two-phase survey. The weighted prevalence of personality disorder was 4.4% (95% CI 2.9-6.7). Rates were highest among men, separated and unemployed participants in urban locations. High use of healthcare services was confounded by comorbid mental disorder and substance misuse. Cluster B disorders were associated with early institutional care and criminality. Personality disorder is common in the community, especially in urban areas. Services are normally restricted to symptomatic, help-seeking individuals, but a vulnerable group with cluster B disorders can be identified early, are in care during childhood and enter the criminal justice system when young. This suggests the need for preventive interventions at the public mental health level.
Article
Background: Earlier research indicated that nearly 20% of patients diagnosed with either bipolar disorder (BD) or borderline personality disorder (BPD) also met criteria for the other diagnosis. Yet limited data are available concerning the potential impact of co-occurring BPD and/or BPD features on the course or outcome in patients with BD. Therefore, this study examined this comorbidity utilizing the standardized Borderline Personality Questionnaire (BPQ). Methods: This study involved 714 adult patients with a primary diagnosis of BD per DSM-IV criteria who were admitted to the psychiatric unit at an academic hospital in Houston, TX between July 2013 and July 2018. All patients completed the BPQ within 72 hours of admission. Statistical analysis was used to detect correlations between severity of BD, length of stay (LOS), and scores on the BPQ. A machine learning model was constructed to predict the parameters affecting patients' readmission rates within 30 days. Results: Analysis revealed that the severity of certain BPD traits at baseline was associated with mood state and outcome measured by LOS. Inpatients with BD who were admitted during acute depressive episodes had significantly higher mean scores on 7 of the 9 BPQ subscales (P<0.05) compared with those admitted during acute manic episodes. Inpatients with BD with greater BPQ scores on 4 of the 9 BPQ subscales had significantly shorter LOS than those with lower BPQ scores (P<0.05). The machine learning model identified 6 variables as predictors for likelihood of 30-day readmission with a high sensitivity (83%), specificity (77%), and area under the receiver operating characteristic curve of 86%. Conclusions: Although preliminary, these results suggest that inpatients with BD who have higher levels of BPD features were more likely to have depressive rather than manic symptoms, fewer psychotic symptoms, and a shorter LOS. Moreover, machine learning models may be particularly valuable in identifying patients with BD who are at the highest risk for adverse consequences including rapid readmission.
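Sensitivity and specificity, as reported for the readmission model above, are simple ratios over a confusion matrix. The sketch below shows the arithmetic on made-up labels; the function name and the toy data are illustrative assumptions and do not reproduce the study's results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of actual positives that were flagged
    specificity = tn / (tn + fp)  # share of actual negatives that were cleared
    return sensitivity, specificity


# Hypothetical labels: 1 = readmitted within 30 days, 0 = not readmitted.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(actual, predicted))  # (0.75, 0.75)
```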
Article
Objective: Borderline personality disorder (BPD) occurs in 0.7-3.5% of the general population. Patients with BPD suffer from excessive comorbidity of psychiatric and somatic diseases and are known to be high utilizers of health care services. Because of a range of challenges related to adverse health behaviors and their interpersonal style, patients with borderline personality disorder are often regarded as "difficult" to interact with and treat optimally. Methods: This narrative review focuses on epidemiological studies on BPD and its comorbidity with a specific focus on somatic illness. Empirically-validated treatments are summarized and implementation of specific treatment models is discussed. Results: The prevalence of BPD among psychiatric inpatients (9-14%) and outpatients (12-18%) is high; medical service use is very frequent, annual societal costs vary between €11,000 and €28,000. BPD is associated with cardiovascular diseases and stroke, metabolic disease including diabetes and obesity, gastrointestinal disease, arthritis and chronic pain, venereal diseases and HIV infection as well as sleep disorders. Psychotherapy is the treatment of choice for BPD. Several manualized treatments for BPD have been empirically validated, including Dialectical Behavior Therapy (DBT), Transference-Focused Psychotherapy (TFP), Mentalization-based Therapy (MBT), and Schema-focused Therapy (SFT). Conclusions: Health care could be substantially improved if all medical specialties would be familiar with BPD, its pathology, medical and psychiatric co-morbidities, complications, and treatment. In mental health care, several empirically validated treatments are available that are applicable in a wide range of clinical settings.
Article
Multi-modal emotion recognition lacks an explicit mapping between emotion state and audio and image features, so extracting effective emotion information from audio/visual data is always a challenging issue. In addition, the modeling of noise and data redundancy is not solved well, so emotion recognition models often suffer from low efficiency. The deep neural network (DNN) performs excellently in feature extraction and highly non-linear feature fusion, and cross-modal noise modeling has great potential for solving data pollution and data redundancy. Inspired by these, our paper proposes a deep weighted fusion method for audio-visual emotion recognition. First, we conduct cross-modal noise modeling for the audio and video data, which eliminates most of the data pollution in the audio channel and the data redundancy in the visual channel. The noise modeling is implemented by voice activity detection (VAD), and the data redundancy in the visual data is solved by aligning the speech area in both audio and visual data. Then, we extract the audio emotion features and visual expression features via two feature extractors. The audio emotion feature extractor, audio-net, is a 2D CNN that accepts image-based Mel-spectrograms as input data. The facial expression feature extractor, visual-net, is a 3D CNN that is fed a facial expression image sequence. To train the two convolutional neural networks efficiently on the small data set, we adopt the strategy of transfer learning. Next, we employ a deep belief network (DBN) for highly non-linear fusion of multi-modal emotion features. We train the feature extractors and the fusion network synchronously. Finally, the emotion classification is obtained by a support vector machine using the output of the fusion network. With consideration of cross-modal feature fusion, denoising, and redundancy removal, our fusion method shows excellent performance on the selected data set.
Article
This paper proposes an audio-visual emotion recognition system that uses a mixture of rule-based and machine learning techniques to improve the recognition efficacy in the audio and video paths. The visual path is designed using the Bidirectional Principal Component Analysis (BDPCA) and Least-Square Linear Discriminant Analysis (LSLDA) for dimensionality reduction and class discrimination. The extracted visual features are passed into a newly designed Optimized Kernel-Laplacian Radial Basis Function (OKL-RBF) neural classifier. The audio path is designed using a combination of input prosodic features (pitch, log-energy, zero crossing rates and Teager energy operator) and spectral features (Mel-scale frequency cepstral coefficients). The extracted audio features are passed into an audio feature level fusion module that uses a set of rules to determine the most likely emotion contained in the audio signal. An audio visual fusion module fuses outputs from both paths. The performances of the proposed audio path, visual path, and the final system are evaluated on standard databases. Experiment results and comparisons reveal the good performance of the proposed system.
Article
The automatic analysis and classification of text using fine-grained attitude labels is the main task we address in our research. The developed @AM system relies on compositionality principle and a novel approach based on the rules elaborated for semantically distinct verb classes. The evaluation of our method on 1000 sentences, that describe personal experiences, showed promising results: average accuracy on fine-grained level was 62%, on middle level - 71%, and on top level - 88%.
Conference Paper
The understanding of emotion is greatly influenced by inputs such as voice, facial expressions, and body language. Yet few systems explore the broad field of the emotional human interface. No established analytical method, in either speech analysis or image processing, can reliably determine the intended or pure emotion. We concentrate only on voice-emotion analysis, based on the idea that humans can detect another person's emotional state from voice input without any semantic understanding. We developed a simplified human-based emotional model and a set of wavelet/cepstrum-based software tools for extracting emotion from the human voice. Our method applies calculations to short-time energy portions that correspond to the words in a sentence. The power calculated in short time windows was included because it especially emphasizes the difference between normal and angry speech. To study the general pattern of human emotional understanding, 100 English and 50 Japanese sound samples were processed in an attempt to find the relation between semantic and non-semantic emotional understanding. We divide the voice samples into angry, happy, normal, and “not defined” emotional state groups according to the general human understanding of the speech.
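The short-time power mentioned in this abstract can be sketched as the mean squared amplitude over sliding windows. The window length `win`, the step `hop`, and the function name are assumptions for illustration only; the abstract does not give the window sizes the authors used or how windows were matched to words.

```python
def short_time_power(x, win, hop):
    """Mean squared amplitude of x over windows of `win` samples, stepped by `hop`.

    Trailing samples that do not fill a whole window are ignored.
    """
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    return [
        sum(s * s for s in x[i:i + win]) / win
        for i in range(0, len(x) - win + 1, hop)
    ]


# Toy signal whose amplitude grows over time, with non-overlapping windows.
samples = [1, 1, 2, 2, 3, 3]
power = short_time_power(samples, win=2, hop=2)
```

A louder stretch of speech yields larger values in the corresponding windows, which is the kind of contrast the abstract associates with normal versus angry speech.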
Article
Machine recognition of human emotional state is an important component for efficient human-computer interaction. The majority of existing works address this problem by utilizing audio signals alone, or visual information only. In this paper, we explore a systematic approach for recognition of human emotional state from audiovisual signals. The audio characteristics of emotional speech are represented by the extracted prosodic, Mel-frequency Cepstral Coefficient (MFCC), and formant frequency features. A face detection scheme based on HSV color model is used to detect the face from the background. The visual information is represented by Gabor wavelet features. We perform feature selection by using a stepwise method based on Mahalanobis distance. The selected audiovisual features are used to classify the data into their corresponding emotions. Based on a comparative study of different classification algorithms and specific characteristics of individual emotion, a novel multiclassifier scheme is proposed to boost the recognition performance. The feasibility of the proposed system is tested over a database that incorporates human subjects from different languages and cultural backgrounds. Experimental results demonstrate the effectiveness of the proposed system. The multiclassifier scheme achieves the best overall recognition rate of 82.14%.
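The abstract mentions stepwise feature selection based on Mahalanobis distance without giving details. The sketch below shows only a plausible first step under stated assumptions: two classes, and each feature scored on its own. For a single feature, the Mahalanobis distance between two class means reduces to the absolute mean difference divided by the pooled standard deviation. The function names and the toy data are hypothetical, and a full stepwise procedure would go on to evaluate feature subsets with a multivariate covariance matrix.

```python
from statistics import mean, variance


def mahalanobis_1d(a, b):
    """Distance between the means of two samples of one feature.

    Uses the pooled sample variance; each sample needs at least two values,
    and the pooled variance must be non-zero.
    """
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
        len(a) + len(b) - 2
    )
    return abs(mean(a) - mean(b)) / pooled ** 0.5


def rank_features(class_a, class_b):
    """Score every feature column and return (indices by descending score, scores).

    class_a, class_b : lists of feature vectors (rows are samples).
    """
    n_features = len(class_a[0])
    scores = []
    for j in range(n_features):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        scores.append(mahalanobis_1d(col_a, col_b))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical data: feature 0 separates the two classes far better than feature 1.
class_a = [[1, 1], [2, 2], [3, 3]]
class_b = [[7, 2], [8, 3], [9, 4]]
order, scores = rank_features(class_a, class_b)
```

Scoring features one at a time ignores correlations between them, which is the main thing a multivariate Mahalanobis criterion would add.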
Investigation of Multimodal Features, Classifiers and Fusion Methods for Emotion Recognition
  • Zheng Lian
Facial Emotion Recognition Using Machine Learning
  • raut
Machine Learning Facial Emotion Recognition in Psychotherapy Research. A useful approach?
  • Martin Steppan