Conference Paper

Recognition of stress in speech using wavelet analysis and Teager energy operator.

Conference: INTERSPEECH 2008, 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008
Source: DBLP
  • ABSTRACT: A full-wave and charge transport formulation is applied to the analysis of the dominant TM mode guided by a semiconductor substrate backed by a ground plane. Closed-form expressions for the field components, charge density, and current density are obtained, along with characteristic equations for the propagation constant and transverse wave numbers. Numerical results on the wave parameters are obtained for different doping levels. The screening effect of the charge carriers on the transverse component of the electric field is observed to be negligible for an intrinsic semiconductor substrate, but it gradually approaches that of a metallic conductor as the doping level reaches 10^18 cm^-3. An equivalent circuit for the structure is constructed to facilitate the development of the dispersion equation while providing physical insight into the process of the wave-charge interaction on the semiconductor surface.
    Conference Paper · Nov 2003
  • ABSTRACT: This paper presents a new system for automatic stress detection in speech. For feature extraction, speech spectrograms were used as the primary features. Sigma-pi neuron cells were then employed to derive the secondary features. The analysis was performed using three alternative sets of analytical frequency bands: critical bands, Bark scale bands and equivalent rectangular bandwidth (ERB) scale bands. The presented algorithm was tested at the vowel level using actual stressful speech utterances from the SUSAS (Speech Under Simulated and Actual Stress) database. The automatic stress-level classification was implemented using Gaussian mixture model (GMM) and k-nearest neighbor (KNN) classifiers. The strongest effect on the classification results was observed when selecting the type of frequency bands. The ERB scale provided the highest classification accuracy, ranging from 67.84% to 73.76%. The classification results did not differ between data sets containing specific types of vowels and data sets containing mixtures of vowels. This indicates that the proposed method can be applied to voiced speech in speech independent conditions.
    Conference Paper · Jan 2009
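The abstract above does not say how its ERB scale bands were laid out. As a rough illustration only, the sketch below uses the widely cited Glasberg and Moore approximation, ERB(f) = 24.7 (4.37 f / 1000 + 1) Hz, and spaces band edges uniformly on the corresponding ERB-rate scale. The function names, the 100 to 4000 Hz range and the choice of 16 bands are assumptions made for the example, not values taken from the paper.

```python
import math

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency f_hz
    (Glasberg and Moore approximation; assumed here, not from the paper)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (number of ERBs below f_hz)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37

def erb_band_edges(f_low, f_high, n_bands):
    """Return n_bands + 1 edge frequencies spaced uniformly on the ERB-rate scale."""
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / n_bands
    return [erb_rate_to_hz(lo + i * step) for i in range(n_bands + 1)]

# Hypothetical analysis range and band count, chosen only for illustration.
edges = erb_band_edges(100.0, 4000.0, 16)
```

Bands built this way are narrow at low frequencies and widen toward high frequencies, which is the property that distinguishes ERB, Bark and critical-band layouts from a uniform split of the spectrum.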
  • ABSTRACT: This study presents automatic stress recognition methods based on acoustic speech analysis. Novel approaches to feature extraction based on the nonlinear Teager energy operator (TEO) calculated within critical bands, discrete wavelet transform bands, and wavelet packet bands are presented. The classification process was performed using two types of neural networks: the multilayer perceptron neural network (MLPNN) and the probabilistic neural network (PNN). The classification efficiency was tested using the actual stress dataset from the SUSAS database. The speech recordings were made by 15 speakers (8 females and 7 males) reading a list of 35 words under three actual conditions: high stress, low stress, and neutral. The best overall performance was observed for the features extracted using the TEO parameters calculated within perceptual wavelet packet bands (TEO-PWP). Depending on the type of mother wavelet, the correct classification scores for the PWP features ranged from 71.24% to 91.56% (using the MLPNN classifier), and from 86.63% to 93.67% (using the PNN). The PNN classifier outperformed the MLPNN classifier.
    Conference Paper · Jan 2009
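For readers unfamiliar with the Teager energy operator named in the abstract above, its standard discrete form is psi[x(n)] = x(n)^2 - x(n-1) x(n+1). The sketch below is a minimal illustration of that operator alone, applied to a synthetic tone. It does not reproduce the band decomposition (critical, wavelet or wavelet packet bands) or the feature pipeline described in the paper, and the function name and test signal are assumptions made for the example.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator for the interior samples of x:
    psi[n] = x[n]^2 - x[n-1] * x[n+1].
    The output is two samples shorter than the input."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Illustrative input: a pure tone with amplitude 2 and digital
# frequency 0.3 rad/sample (values chosen only for the example).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(100)]
psi = teager_energy(tone)
```

For a pure tone A cos(omega n + phi) the operator returns the constant A^2 sin^2(omega), so it tracks amplitude and frequency together. In a band-based scheme such as the one the abstract describes, the operator would be applied to each band signal separately rather than to the full-band waveform.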