Neural networks (NNs) are discriminative classifiers which have been successfully integrated with hidden Markov models (HMMs), either in the hybrid NN/HMM or tandem connectionist systems. Typically, the NNs are trained with the frame-based cross-entropy criterion to classify phonemes or phoneme states. However, for word recognition, the word error rate is more closely related to the sequence
... [Show full abstract] classification criteria, such as maximum mutual information and minimum phone error. In this paper, the lattice-based sequence classification criteria are used to train the NNs in the hybrid NN/HMM system and the tandem system. A product-of-expert-based factorization and smoothing scheme is proposed for the hybrid system to scale the lattice-based NN training up to 6000 triphone states. Experimental results on the WSJCAM0 reveal that the NNs trained with the sequential classification criterion yield a 24.2% relative improvement compared to the cross-entropy trained NNs for the hybrid system.