ABSTRACT: The “tandem approach” is a method used in speech recognition to improve performance by using classifier posterior probabilities as observations in a hidden Markov model. In this work, we study the effect of using multiple visual tandem features to improve audio-visual recognition accuracy. In addition, we investigate methods that combine the outputs of several audio and visual tandem classifiers in a classifier fusion system, which generates its outputs using learned weights. Experiments show that both approaches improve on regular audio-visual speech recognition, especially in noisy environments.
Signal Processing and Communications Applications (SIU), 2011 IEEE 19th Conference on; 05/2011
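
The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the log-linear combination rule, the function name `fuse_posteriors`, and the fixed weights are all assumptions made for illustration, whereas the paper's weights are learned. The sketch takes per-frame class posteriors from several classifiers (for example, one audio and one visual) and combines them into a single posterior per frame.

```python
import numpy as np


def fuse_posteriors(posteriors, weights):
    """Combine per-frame class posteriors from several classifiers.

    posteriors: shape (n_classifiers, n_frames, n_classes)
    weights:    shape (n_classifiers,), non-negative, not all zero
    Returns an array of shape (n_frames, n_classes) whose rows sum to 1.

    Assumption: a log-linear (weighted geometric mean) rule is used here;
    the source does not state which combination rule it applies.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalise weights to sum to 1

    # Weighted sum of log posteriors; clip to avoid log(0).
    log_p = np.log(np.clip(posteriors, 1e-12, None))
    fused_log = np.tensordot(weights, log_p, axes=1)  # (n_frames, n_classes)

    # Subtract the row maximum for numerical stability, then renormalise.
    fused_log = fused_log - fused_log.max(axis=1, keepdims=True)
    fused = np.exp(fused_log)
    return fused / fused.sum(axis=1, keepdims=True)


# Hypothetical posteriors for one frame and two classes.
audio = np.array([[0.7, 0.3]])
visual = np.array([[0.4, 0.6]])
fused = fuse_posteriors([audio, visual], weights=[0.5, 0.5])
tandem_features = np.log(fused)  # log posteriors usable as HMM observations
```

In a tandem setup, features such as `tandem_features` would be passed to the hidden Markov model as observations. In noisy conditions one would expect the learned weights to shift toward the visual classifiers, though the weight values above are placeholders only.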