Conference Paper

An Acoustic Framework for Detecting Fatigue in Speech-Based Human-Computer Interaction

DOI: 10.1007/978-3-540-70540-6_7 Conference: Computers Helping People with Special Needs, 11th International Conference, ICCHP 2008, Linz, Austria, July 9-11, 2008. Proceedings
Source: DBLP

ABSTRACT This article describes a general framework for detecting accident-prone fatigue states based on prosody, articulation and
speech-quality-related speech characteristics. The advantages of this real-time measurement approach are that obtaining speech
data is non-obtrusive and free of sensor application and calibration effort. The core of the feature computation
is the combination of frame-level speech features and high-level contour descriptors, resulting in over 8,500 features
per speech sample. In general, the measurement process follows the speech-adapted steps of pattern recognition: (a) recording
speech, (b) preprocessing (segmenting the speech units of interest), (c) feature computation (using perceptual and signal-processing
features such as fundamental frequency, intensity, pause patterns, formants and cepstral coefficients), (d) dimensionality
reduction (filter- and wrapper-based feature subset selection, (un-)supervised feature transformation), (e) classification
(e.g. SVM, k-NN classifiers), and (f) evaluation (e.g. 10-fold cross-validation). The validity of this approach is briefly
discussed by summarizing the empirical results of a sleep deprivation study.
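Steps (b)–(f) of the pipeline above can be sketched in a few dozen lines. Everything below is a hedged, synthetic illustration: ten random stand-in "features" replace the paper's 8,500 acoustic features, a class-mean-difference filter stands in for the filter-based subset selection, and a 1-NN classifier with 10-fold cross-validation stands in for the SVM/k-NN evaluation.

```python
# Illustrative sketch of pipeline steps (b)-(f) on synthetic data;
# feature names, dimensions and parameters are NOT the paper's.
import math
import random
import statistics

random.seed(0)

# --- (b)/(c) stand-in: synthetic per-sample feature vectors --------------
# Two classes: 0 = alert, 1 = fatigued. Only the first 3 of 10 features
# carry class information, mimicking a large, partly redundant feature set.
def make_sample(label):
    shift = 1.5 if label == 1 else 0.0
    informative = [random.gauss(shift, 1.0) for _ in range(3)]
    noise = [random.gauss(0.0, 1.0) for _ in range(7)]
    return informative + noise

data = [(make_sample(y), y) for y in [0, 1] * 50]

# --- (d) filter-based feature selection: rank by standardized mean gap ---
def rank_features(samples):
    n_feat = len(samples[0][0])
    scores = []
    for j in range(n_feat):
        c0 = [x[j] for x, y in samples if y == 0]
        c1 = [x[j] for x, y in samples if y == 1]
        pooled = statistics.pstdev(c0 + c1) or 1.0
        scores.append(abs(statistics.mean(c0) - statistics.mean(c1)) / pooled)
    return sorted(range(n_feat), key=lambda j: -scores[j])

keep = rank_features(data)[:3]          # keep only the top-ranked features

# --- (e) classifier: simple 1-NN on the reduced feature set --------------
def knn_predict(train, x):
    dist = lambda a, b: math.dist([a[j] for j in keep], [b[j] for j in keep])
    return min(train, key=lambda s: dist(s[0], x))[1]

# --- (f) evaluation: 10-fold cross-validation ----------------------------
random.shuffle(data)
folds = [data[i::10] for i in range(10)]
correct = total = 0
for i, test in enumerate(folds):
    train = [s for j, f in enumerate(folds) if j != i for s in f]
    for x, y in test:
        correct += (knn_predict(train, x) == y)
        total += 1

accuracy = correct / total
print(f"10-fold CV accuracy: {accuracy:.2f}")
```

With well-separated synthetic classes the cross-validated accuracy lands well above chance, which is the point of the exercise: the pipeline shape, not the numbers, mirrors the paper's approach.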

  • ABSTRACT: While significant work is being done to develop empathic agents that can identify the user's emotions through eye gaze and facial expressions, a neglected area, especially in the pedagogical context, is the use of voice for detecting alertness, fatigue and emotions. Among the issues are the lack of constant monitoring and of visual feedback. However, such a system has advantages: the ability to work where visual monitoring is expensive, in darkness, or where mobile devices cannot provide adequate visual feedback. We propose a model for a system capable of identifying emotions as well as alertness and fatigue based entirely on voice interaction, keyboard and mouse clicks; we also propose to develop an engine that can intelligently improve its prediction of emotions and cognitive states based on earlier interaction, and suggest appropriate measures to improve emotions, reduce distractions and mitigate fatigue. Keywords: empathic agent; fatigue detection; intelligent e-learning system; speech emotion recognition.
    Today many e-learning systems are able to adapt to the needs, requirements and orientations of individual students. Such systems are considered intelligent or adaptive. A more recent development is systems that can identify the emotions and affective states of individuals and react intelligently to them. In the pedagogical context, such developments are considered significant, since they would allow computers to interact like human instructors and teachers through animated agents that appear and speak on the computer screen. Currently, many models [1], [2], [3] are being developed that use face, eye tracking, voice, etc. in combination as inputs to identify emotions and respond intelligently to them. Moreover, certain research focuses only on visual inputs, say face and eyes, in order to predict emotions [4], [5].
    However, there is hardly any research on developing robust technology to identify emotions, fatigue and alertness through voice input and to apply it in an intelligent, empathic feedback system. True, voice has already been used in other contexts, such as driver fatigue [6], [7] and stress detection [8], but its independent use in a pedagogical context needs exploration. This is even more challenging since the user does not provide constant audio feedback. We propose a model that uses the user's keyboard and mouse clicking behavior along with voice input to build an intelligent student monitoring and interactive system which can be used with any e-learning system.
    T4E, IIT Kharagpur; 12/2013
  • ABSTRACT: Fatigue is a natural phenomenon, a kind of self-regulation and protection for the human body. Detecting fatigue states has positive significance for many occupations. This paper presents a feature-parameter-based probabilistic neural network (PNN) speech recognition model to detect fatigue. A comprehensive identification system is established by training on voice samples recorded at different times. Experimental results show that this approach can reflect the degree of fatigue, and that MFCC parameters are superior to LPCC parameters.
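    The PNN referred to above is, in essence, a Parzen-window classifier: each class is scored by a Gaussian-kernel density estimate over its training samples. A minimal stdlib-only sketch, where the two-dimensional random points are stand-ins for real MFCC or LPCC feature vectors:

```python
# Minimal Parzen-window form of a probabilistic neural network (PNN);
# the features are random 2-D stand-ins, not real MFCC/LPCC vectors.
import math
import random

random.seed(1)

def pnn_predict(train, x, sigma=1.0):
    """Score each class by the mean Gaussian-kernel response of its
    training samples; return the class with the highest score."""
    scores = {}
    for xi, yi in train:
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        scores.setdefault(yi, []).append(math.exp(-d2 / (2 * sigma ** 2)))
    return max(scores, key=lambda c: sum(scores[c]) / len(scores[c]))

# Toy "alert (0) vs. fatigued (1)" clusters in a 2-D feature space.
train = [([random.gauss(m, 0.5), random.gauss(m, 0.5)], label)
         for label, m in [(0, 0.0), (1, 2.0)] for _ in range(30)]

print(pnn_predict(train, [0.1, -0.2]))  # point near the class-0 cluster
print(pnn_predict(train, [2.1, 1.9]))   # point near the class-1 cluster
```

The smoothing width `sigma` plays the role of the PNN's spread parameter; in practice it would be tuned on held-out data.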
  • ABSTRACT: This paper deals with the potential and limitations of using voice and speech processing to detect Obstructive Sleep Apnea (OSA). An extensive body of voice features has been extracted from patients who present various degrees of OSA as well as from healthy controls. We analyse the utility of a reduced set of features for detecting OSA. We apply various feature selection and reduction schemes (statistical ranking, Genetic Algorithms, PCA, LDA) and compare various classifiers (Bayesian classifiers, kNN, Support Vector Machines, neural networks, AdaBoost). S-fold cross-validation performed on 248 subjects shows that in the extreme cases (that is, 127 controls and 121 patients with severe OSA) voice alone discriminates quite well between the presence and absence of OSA. However, this is not the case with mild OSA and healthy snoring patients, where voice seems to play a secondary role. We found that the best classification results are achieved using a Genetic Algorithm for feature selection/reduction.
    Applied Soft Computing 10/2014; 23:346–354 (Impact Factor: 2.68)
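    Genetic-algorithm feature selection, the best-performing scheme in that study, can be illustrated with a toy stdlib-only version. The bitmask encoding, the Fisher-style fitness with a size penalty, and every parameter below are illustrative assumptions, not the study's configuration:

```python
# Toy genetic-algorithm feature selection over a bitmask of features;
# data, fitness and GA settings are illustrative, not the study's.
import random
import statistics

random.seed(2)

N_FEAT = 8
# Synthetic data: features 0 and 3 separate the classes; the rest are noise.
def sample(y):
    v = [random.gauss(0, 1) for _ in range(N_FEAT)]
    if y == 1:
        v[0] += 2.0
        v[3] += 2.0
    return v

data = [(sample(y), y) for y in [0, 1] * 40]

def fitness(mask):
    """Summed standardized class-mean separation of the selected
    features, penalised by subset size to favour small subsets."""
    score = 0.0
    for j, on in enumerate(mask):
        if on:
            c0 = [x[j] for x, y in data if y == 0]
            c1 = [x[j] for x, y in data if y == 1]
            sd = statistics.pstdev(c0 + c1) or 1.0
            score += abs(statistics.mean(c0) - statistics.mean(c1)) / sd
    return score - 0.3 * sum(mask)

def evolve(generations=30, pop_size=20):
    pop = [[random.randint(0, 1) for _ in range(N_FEAT)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_FEAT)   # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.2:           # bit-flip mutation
                k = random.randrange(N_FEAT)
                child[k] = 1 - child[k]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
selected = [j for j, on in enumerate(best) if on]
print("selected features:", selected)
```

Because the size penalty outweighs the noise features' chance-level separation, the surviving masks converge on the genuinely informative features.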
