Citations (9)

Article: Hidden Markov models, maximum mutual information estimation, and the speech recognition problem
ABSTRACT: Hidden Markov Models (HMMs) are one of the most powerful speech recognition tools available today. Even so, the inadequacies of HMMs as a "correct" modeling framework for speech are well known. In that context, we argue that the maximum mutual information estimation (MMIE) formulation for training is more appropriate than maximum likelihood estimation (MLE) for reducing the error rate. We also show how MMIE paves the way for new training possibilities. We introduce Corrective MMIE training, a very efficient new training algorithm which uses a modified version of a discrete reestimation formula recently proposed by Gopalakrishnan et al. We propose reestimation formulas for the case of diagonal Gaussian densities, experimentally demonstrate their convergence properties, and integrate them into our training algorithm. In a connected digit recognition task, MMIE consistently improves the recognition performance of our recognizer.
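For context, the MMIE criterion contrasted with MLE in this abstract is conventionally written as follows (a sketch in standard notation; the symbols are mine, not taken from the paper):

```latex
F_{\mathrm{MMIE}}(\theta) \;=\;
\sum_{r=1}^{R} \log
\frac{p_\theta(O_r \mid M_{w_r})\, P(w_r)}
     {\sum_{w} p_\theta(O_r \mid M_{w})\, P(w)}
```

Here $O_r$ is the $r$-th training observation sequence, $w_r$ its reference transcription, $M_w$ the composite HMM for word sequence $w$, and $P(w)$ the language-model prior. MLE maximizes only the numerator; MMIE additionally drives down the denominator, i.e. the total likelihood over competing hypotheses, which is why it targets the error rate more directly.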
ABSTRACT: The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMMs is studied experimentally. The largest improvements in speech recognition were obtained when the classes for the LDA transform were defined to be subphone units. On a 12,000-word German recognition task with small overlap between training and test vocabulary, a reduction in error rate by one-fifth was achieved compared to the case without LDA. On the development set of the DARPA RM1 task the error rate was reduced by one-third. For the DARPA speaker-dependent no-grammar case, the error rate averaged over 12 speakers was 9.9%. This was achieved with a recognizer using LDA and a set of only 47 Viterbi-trained context-independent phonemes.
Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, 01/1992; 1:13-16.
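The LDA transform described in this abstract projects acoustic feature vectors onto directions that maximize between-class relative to within-class scatter. A minimal Fisher-LDA sketch, assuming toy Gaussian data in place of real acoustic features and arbitrary class labels in place of the paper's subphone units:

```python
import numpy as np

def lda_transform(X, y, n_components):
    """Fisher LDA: project features onto the directions that maximize
    between-class scatter relative to within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem Sb v = lambda Sw v, solved via Sw^{-1} Sb.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W

# Toy data: 3 classes, 20-dimensional features (dimensions illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(50, 20)) for c in range(3)])
y = np.repeat(np.arange(3), 50)
Z = lda_transform(X, y, n_components=2)  # with C classes, at most C-1 axes
```

With C classes the transform yields at most C-1 informative directions, which is why defining the classes as subphone units (rather than whole phonemes) enlarges the usable projected dimensionality.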
ABSTRACT: This paper describes a framework for optimising the parameters of a continuous density HMM-based large vocabulary recognition system using a maximum mutual information estimation (MMIE) criterion. To limit the computational complexity arising from the need to find confusable speech segments in the large search space of alternative utterance hypotheses, word lattices generated from the training data are used. Experiments are presented on the Wall Street Journal database using up to 66 hours of training data. These show that lattices combined with an improved estimation algorithm make MMIE training practicable even for very complex recognition systems and large training sets. Furthermore, experimental results show that MMIE training can yield useful increases in recognition accuracy.
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on; 06/1996.
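The role of the word lattice in this abstract is to make the MMIE denominator tractable: instead of summing over all possible word sequences, the sum runs only over the hypotheses present in the lattice. A toy sketch of the per-utterance objective under that approximation (all numbers illustrative; real systems accumulate lattice-arc posteriors via forward-backward rather than whole-hypothesis scores):

```python
import math

def mmie_utterance_score(num_loglik, lattice_logliks):
    """Per-utterance MMIE objective: reference-transcription log-likelihood
    minus a log-sum-exp over the finite set of lattice hypotheses, which
    stands in for the intractable sum over all word sequences."""
    m = max(lattice_logliks)  # shift for numerical stability
    log_denom = m + math.log(sum(math.exp(x - m) for x in lattice_logliks))
    return num_loglik - log_denom

# The reference hypothesis is itself in the lattice, so the score is <= 0;
# training pushes it toward 0 by separating it from the competitors.
score = mmie_utterance_score(-100.0, [-100.0, -105.0, -112.0])
```

The quality of this approximation depends on the lattice covering the confusable competitors, which is why the lattices are regenerated from the training data rather than taken from a fixed list.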