An efficient beam pruning with a reward considering the potential to reach various words on a lexical tree
User Interface Lab., KDDI R&D Labs. Inc., Fujimino, Japan
DOI: 10.1109/ICASSP.2010.5495098
Conference: Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Source: IEEE Xplore
This paper presents an efficient frame-synchronous beam pruning method for automatic speech recognition (ASR). Conventional beam pruning does not take into account a hypothesis's potential to reach various words on a lexical tree, so hypotheses with greater potential are likely to be pruned. To make beam pruning less restrictive for hypotheses with greater potential, and more restrictive for those with less, the proposed method adds a reward to the likelihood of each hypothesis. The reward is a monotonically increasing function of the number of words reachable from the lexical-tree node where the hypothesis currently stays, and it is designed not to collapse the ASR probabilistic framework. The proposed method reduces the processing time by 30% to 70% for grammar-based tasks. For a language-model-based dictation task, it also reduces processing time further, beyond that of beam pruning with the language model look-ahead technique.
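The pruning rule described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the reward formula, so everything specific here is an assumption made for illustration: the logarithmic form of `reward`, the `weight` parameter, the `Hypothesis` type, and the choice to use the reward only in the pruning comparison while leaving the stored likelihood untouched.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    node: str              # lexical-tree node where the hypothesis currently stays
    log_likelihood: float  # accumulated log-likelihood of the partial path


def reward(reachable_words: int, weight: float = 1.0) -> float:
    """Hypothetical reward: monotonically increasing in the number of
    reachable words. The log form is an assumption, not the paper's formula."""
    return weight * math.log(reachable_words)


def prune(hyps, reachable, beam_width, weight=1.0):
    """Keep hypotheses whose rewarded score lies within beam_width of the best.

    reachable maps a node to the number of words reachable from it (>= 1).
    The reward only affects the pruning decision in this sketch; the
    log_likelihood stored on each hypothesis is left unchanged.
    """
    if not hyps:
        return []
    scored = [(h.log_likelihood + reward(reachable[h.node], weight), h)
              for h in hyps]
    best = max(score for score, _ in scored)
    return [h for score, h in scored if score >= best - beam_width]


hyps = [
    Hypothesis("near_leaf_a", -10.0),
    Hypothesis("branching", -13.0),
    Hypothesis("near_leaf_b", -12.5),
]
reachable = {"near_leaf_a": 1, "branching": 100, "near_leaf_b": 1}

plain = prune(hyps, reachable, beam_width=2.0, weight=0.0)     # conventional beam
rewarded = prune(hyps, reachable, beam_width=2.0, weight=1.0)  # with reward
```

With `weight=0.0` the function reduces to conventional beam pruning and the hypothesis at the branching node is dropped. With the reward enabled, that hypothesis survives because many words are still reachable from its node.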
Full text available from: mirlab.org
ABSTRACT: This paper presents an improved pruning method that takes into account the number of states possessed by hypotheses in certain frames. With a conventional pruning strategy, hypotheses with a low score or a bad ranking are discarded. However, this neglects the fact that hypotheses several states ahead of or behind the right hypothesis in the prefix tree, which should be discarded, have scores and rankings similar to those of the right hypothesis. If a state is part of a partial path hypothesis, we say it is possessed by the hypothesis. In a given speech frame, we can therefore deduce that the hypotheses that possess the most states and those that possess the fewest states have little chance of being the right hypothesis. The proposed method analyzes the range of the number of states possessed by the hypotheses and discards the hypotheses that possess too many or too few states. According to the experiments, this method can effectively improve the performance of the ASR.
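The state-count criterion in this abstract can be sketched in a few lines of Python. The abstract does not say how the cutoffs are derived from the observed range, so the `margin` parameter and the rule of trimming a fixed fraction from each end of the range are assumptions made for illustration only.

```python
def prune_by_state_count(state_counts, margin=0.2):
    """Return the indices of hypotheses to keep in the current frame.

    state_counts[i] is the number of states possessed by hypothesis i.
    Hypotheses whose count falls in the lowest or highest `margin`
    fraction of the observed range are discarded (hypothetical rule).
    """
    if not state_counts:
        return []
    lo, hi = min(state_counts), max(state_counts)
    span = hi - lo
    lower = lo + margin * span   # below this: too few states
    upper = hi - margin * span   # above this: too many states
    return [i for i, n in enumerate(state_counts) if lower <= n <= upper]


counts = [2, 5, 6, 7, 12]
kept = prune_by_state_count(counts, margin=0.2)
```

In practice a criterion like this would presumably be combined with score-based or rank-based pruning rather than replace it, since the abstract presents it as an improvement to the conventional strategy.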