ABSTRACT: There is growing interest in modeling nonlinear behavior in the speech signal, particularly for applications such as speech recognition. Conventional tools for analyzing speech data use information from the power spectral density of the time series, and hence are restricted to the first two moments of the data. These moments do not provide a sufficient representation of a signal with strong nonlinear properties. In this paper, we investigate the use of features, known as invariants, that measure the nonlinearity in a signal. We analyze three popular measures: Lyapunov exponents, Kolmogorov entropy and correlation dimension. These measures quantify the presence (and extent) of chaos in the underlying system that generated the observable. We show that these invariants can discriminate between broad phonetic classes on a simple database consisting of sustained vowels using the Kullback-Leibler divergence measure. These features show promise in improving the robustness of speech recognition systems in noisy environments.
INTERSPEECH 2006 - ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006