Article

Isolated word recognition using modular recurrent neural networks

Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong; Department of Computer Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Pattern Recognition, pp. 751-760. DOI: 10.1016/S0031-3203(97)00106-4

ABSTRACT This paper describes a novel method of using recurrent neural networks (RNNs) for isolated word recognition. Each word in the target vocabulary is modeled by a fully connected recurrent network. To recognize an input utterance, the best matching word is determined based on its temporal output response. The system is trained in two stages. First, the RNN speech models (RSMs) are trained independently to capture the essential static and temporal characteristics of individual words. This is performed using an iterative re-segmentation training algorithm, which automatically gives the optimal phonetic segmentation for each training utterance. The second stage involves mutually discriminative training among the RSMs, aiming at minimizing the probability of misclassification. A series of simulation experiments has been performed to demonstrate the effectiveness of the proposed recognition method. For the recognition of (A) 20 English words, (B) 11 Cantonese digits and (C) 58 Cantonese CV syllables, the top-1 accuracies are 91.9%, 93.6% and 87.1%, respectively.
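
The modular recognition scheme summarized in the abstract (one recurrent model per word, with the best match chosen from the models' output responses) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class `TinyRecurrentScorer`, the choice of unit 0 as the output unit, the sigmoid activation, and the scoring rule (mean output activation over the utterance) are all hypothetical. The weights are random and untrained, so the sketch shows only the decision structure, not the reported accuracy.

```python
import math
import random


class TinyRecurrentScorer:
    """Hypothetical stand-in for one per-word recurrent model (not the paper's RSM)."""

    def __init__(self, n_inputs, n_units, seed):
        rng = random.Random(seed)
        # Input-to-unit weights and unit-to-unit weights (fully connected recurrence).
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_units)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n_units)]
                      for _ in range(n_units)]

    def output_trajectory(self, frames):
        """Return the activation of unit 0 (assumed output unit) for each frame."""
        state = [0.0] * len(self.w_rec)
        outputs = []
        for frame in frames:
            new_state = []
            for i in range(len(state)):
                total = sum(w * x for w, x in zip(self.w_in[i], frame))
                total += sum(w * s for w, s in zip(self.w_rec[i], state))
                new_state.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid
            state = new_state
            outputs.append(state[0])
        return outputs


def score(model, frames):
    """Assumed scoring rule: mean output activation over the whole utterance."""
    outputs = model.output_trajectory(frames)
    return sum(outputs) / len(outputs)


def recognize(models, frames):
    """Return the word whose model responds most strongly to the utterance."""
    return max(models, key=lambda word: score(models[word], frames))


# One model per vocabulary word; the feature frames are made-up numbers.
models = {
    "yes": TinyRecurrentScorer(n_inputs=3, n_units=4, seed=1),
    "no": TinyRecurrentScorer(n_inputs=3, n_units=4, seed=2),
}
frames = [[0.1, 0.4, -0.2], [0.3, 0.1, 0.0], [-0.1, 0.2, 0.5]]
best_word = recognize(models, frames)
```

Because each word has its own network, adding a word to the vocabulary in this sketch only means adding one more entry to `models`; the decision rule itself does not change.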


Keywords

connected recurrent network
individual words
input utterance
iterative re-segmentation training algorithm
matching word
misclassification
mutually discriminative training
novel method
optimal phonetic segmentation
proposed recognition method
recurrent neural networks
RNN speech models
target vocabulary
temporal output response
top-1 accuracy
training utterance
word recognition

Tan Lee