Conference Paper

Exploiting complementary aspects of phonological features in automatic speech recognition

McGill Univ., Montreal
DOI: 10.1109/ASRU.2007.4430082
Conference: Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Source: IEEE Xplore

ABSTRACT This paper presents techniques for exploiting complementary information contained in multiple definitions of phonological feature systems. Three feature systems, differing in their structure and in the acoustic-phonetic features they represent, are considered. For each system, a two-stage process is implemented: a mechanism for frame-level phonological feature detection, followed by a mechanism for decoding phoneme sequences from the detected features. Two methods are investigated for integrating these features with MFCC-based ASR systems. First, phonological feature and MFCC-based systems are combined in a lattice re-scoring paradigm. Second, confusion-network-based system combination (CNC) is used to combine phone networks derived from phonological distinctive feature (PDF) and MFCC-based systems. Both methods are shown to reduce phone error rates by as much as 15% relative to the phone error rates obtained for any individual feature stream.
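
The 15% figure above is a relative reduction, not an absolute one. As a minimal sketch, assuming phone error rate (PER) is scored in the conventional way as the edit distance between reference and hypothesis phone sequences divided by the reference length (the abstract does not spell out its scoring procedure), the arithmetic looks like this in Python. All function names and numbers here are hypothetical illustrations, not values or code from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences
    (substitutions, insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """PER = edit distance / number of reference phones (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline_per, new_per):
    """Fraction by which new_per improves on baseline_per."""
    return (baseline_per - new_per) / baseline_per


# Hypothetical numbers: a baseline PER of 30.0% falling to 25.5%
# is a 4.5-point absolute drop but a 15% relative reduction.
print(relative_reduction(0.30, 0.255))
```

Under this definition, a 15% relative reduction from a hypothetical 30% baseline corresponds to only 4.5 absolute points, which is why the two kinds of figure should not be compared directly.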

Available from: Richard C. Rose, Jul 04, 2015
  • Source
    ABSTRACT: In this paper, we study the effect of using different phonological feature sets for detection-based automatic speech recognition in phone recognition tasks. Three phonological feature sets derived from different underlying phonological theories are investigated. Our experiments were conducted on the TIMIT database. By comparing the oracle phone recognition results achieved by assuming that all the phonological features are correctly detected based on each feature set, we show that selecting an appropriate phonological feature set is crucial to the performance of detection-based ASR. The highly accurate oracle phone recognition results show that the performance of the CRF-based backend, which is commonly used in detection-based ASR, is very satisfactory. Comparison of the oracle phone recognition results and the real phone recognition results indicates that investigation of high-accuracy front-end detectors is a key issue in improving the performance of detection-based ASR.
    Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on; 01/2009
  • Source
    ABSTRACT: In this paper, we study the effect of using different phonological feature sets for detection-based automatic speech recognition in phone recognition tasks. Three phonological feature sets derived from different underlying phonological theories are investigated. Our experiments were conducted on the TIMIT database. By comparing the oracle phone recognition results achieved by assuming that all the phonological features are correctly detected based on each feature set, we show that selecting an appropriate phonological feature set is crucial to the performance of detection-based ASR. The highly accurate oracle phone recognition results show that the performance of the CRF-based backend, which is commonly used in detection-based ASR, is very satisfactory. Comparison of the oracle phone recognition results and the real phone recognition results indicates that investigation of high-accuracy front-end detectors is a key issue in improving the performance of detection-based ASR.
    Index Terms: Detection-based ASR, phonological feature system, result fusion, speech recognition
    1. INTRODUCTION Currently, detection-based automatic speech recognition (ASR) is a popular research topic in fields related to ASR. Because human beings often understand speech by integrating multiple knowledge sources from the bottom up, detection-based ASR systems attempt to reduce the gap between human speech
  • Source
    ABSTRACT: In this paper we undertake the extraction of phonological features for the Spanish language. We also propose a method to integrate these features into an HMM-based speech recognition system using an architecture with independent feature streams. The experimental results show that higher recognition accuracy and lower computational cost can be obtained.