ABSTRACT: This paper describes the development of the RWTH Mandarin LVCSR system. Different acoustic front-ends together with multiple system cross-adaptation are used in a two-stage decoding framework. We describe the system in detail and present systematic recognition results. In particular, we compare a variety of approaches for cross-adapting to multiple systems. During development we carried out a comparative study of different methods for integrating tone and phoneme posterior features. Furthermore, we apply lattice-based consensus decoding and system combination methods, and compare the effect of minimizing character errors instead of word errors. The final system obtains a character error rate of 17.7% on the GALE 2006 evaluation data.
ABSTRACT: In this paper we present an efficient and flexible approach to VTLN warping factor estimation. Because frequency warping is equivalent to a linear transformation of the cepstral coefficients, warping factors can be estimated efficiently by accumulating the sufficient statistics for linear transformation estimation and then searching the constrained space of transformations given by the explicit mapping between warping factors and linear transformation matrices. We show that the positive effect of using a properly normalized optimization criterion for warping factor estimation, which has previously been demonstrated for a signal analysis front-end without a filterbank, carries over to an MFCC front-end, resulting in a net improvement in word error rate.
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on; 06/2006 · 4.63 Impact Factor
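The estimation scheme outlined in the VTLN abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a single diagonal-covariance Gaussian stands in for the acoustic model, the warping function is piecewise linear with a break at 0.8π, the warping matrix is obtained by numerical integration over a plain cosine-series cepstrum (no mel filterbank), and "properly normalized" is read here as adding the Jacobian term T·log|det A| to the log-likelihood. The names `warp`, `warp_matrix`, `accumulate`, `log_likelihood` and `estimate_alpha` are hypothetical.

```python
import numpy as np

def warp(omega, alpha, knee=0.8 * np.pi):
    """Piecewise-linear warping of [0, pi] onto itself (assumed form)."""
    upper = alpha * knee + (np.pi - alpha * knee) * (omega - knee) / (np.pi - knee)
    return np.where(omega <= knee, alpha * omega, upper)

def warp_matrix(alpha, dim, n_points=512):
    """Matrix A such that A @ c is the cepstrum of the warped log spectrum."""
    omega = (np.arange(n_points) + 0.5) * np.pi / n_points  # midpoint grid on [0, pi]
    k = np.arange(dim)
    analysis = np.cos(np.outer(k, omega))                # cos(n * omega)
    synthesis = np.cos(np.outer(k, warp(omega, alpha)))  # cos(k * g_alpha(omega))
    scale = np.where(k == 0, 1.0, 2.0) / n_points        # cosine-series normalization
    return scale[:, None] * (analysis @ synthesis.T)

def accumulate(features):
    """Sufficient statistics: frame count, first-order and second-order sums."""
    return len(features), features.sum(axis=0), features.T @ features

def log_likelihood(alpha, stats, mean, var):
    """Log-likelihood (up to a constant) of the warped data, Jacobian term included."""
    count, first, second = stats
    A = warp_matrix(alpha, len(mean))
    quad = np.einsum("ij,jk,ik->i", A, second, A)  # diagonal of A S A^T
    lin = A @ first
    fit = -0.5 * np.sum((quad - 2.0 * mean * lin + count * mean ** 2) / var)
    _, logdet = np.linalg.slogdet(A)
    return fit + count * logdet

def estimate_alpha(features, mean, var, grid=np.linspace(0.8, 1.2, 21)):
    """Grid search over warping factors; the statistics are accumulated only once."""
    stats = accumulate(features)
    scores = [log_likelihood(a, stats, mean, var) for a in grid]
    return float(grid[int(np.argmax(scores))])
```

The design point is that `accumulate` touches the frames once; each candidate warping factor then costs only a few small matrix products, so the features never need to be recomputed per factor. Calling `estimate_alpha(features, mean, var)` with a `(frames, dim)` array of cepstral vectors returns the grid value with the highest normalized score.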