Conference Paper

Exploiting complementary aspects of phonological features in automatic speech recognition

McGill Univ., Montreal
DOI: 10.1109/ASRU.2007.4430082
Conference: IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2007
Source: IEEE Xplore

ABSTRACT This paper presents techniques for exploiting the complementary information contained in multiple definitions of phonological feature systems. Three feature systems, differing in their structure and in the acoustic-phonetic features they represent, are considered. For each phonological feature system, a two-stage process is implemented: a mechanism for frame-level phonological feature detection, followed by a mechanism for decoding phoneme sequences from the detected features. Two methods are investigated for integrating these features with MFCC-based ASR systems. First, phonological-feature and MFCC-based systems are combined in a lattice re-scoring paradigm. Second, confusion-network-based system combination (CNC) is used to combine phone networks derived from phonological distinctive feature (PDF) and MFCC-based systems. It is shown, using both methods, that phone error rates can be reduced by as much as 15% relative to the phone error rates obtained for any individual feature stream.
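As a rough illustration of the confusion-network combination idea mentioned in the abstract, the sketch below merges two phone confusion networks by a weighted sum of per-slot posteriors and picks the highest-scoring phone in each slot. It is a simplified sketch, not the authors' implementation: it assumes the networks have already been aligned slot by slot (real CNC must perform that alignment first), and the function name, weights, and toy posteriors are all hypothetical.

```python
from collections import defaultdict


def combine_confusion_networks(networks, weights):
    """Combine pre-aligned confusion networks slot by slot.

    networks: list of networks; each network is a list of slots, and each
              slot is a dict mapping a phone label to its posterior.
    weights:  one weight per network (hypothetical system weights).
    Returns the best phone per slot after weighted posterior summation.
    """
    n_slots = len(networks[0])
    if any(len(net) != n_slots for net in networks):
        raise ValueError("networks must be aligned to the same number of slots")

    best_path = []
    for slot_idx in range(n_slots):
        scores = defaultdict(float)
        for net, weight in zip(networks, weights):
            for phone, posterior in net[slot_idx].items():
                scores[phone] += weight * posterior
        best_path.append(max(scores, key=scores.get))
    return best_path


# Toy posteriors (invented for illustration only).
mfcc_net = [{"k": 0.6, "g": 0.4}, {"ae": 0.9, "eh": 0.1}, {"t": 0.5, "d": 0.5}]
pdf_net = [{"k": 0.3, "g": 0.7}, {"ae": 0.8, "eh": 0.2}, {"t": 0.7, "d": 0.3}]

result = combine_confusion_networks([mfcc_net, pdf_net], [0.5, 0.5])
print(result)
```

With equal weights, the second system overturns the first system's choice in the first slot and breaks its tie in the last slot, which is the kind of complementary behaviour that system combination relies on.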

Available from: Richard C. Rose, Sep 01, 2015
  • Source
    • "Among the various speech knowledge sources, phonological features are used most frequently by detection-based ASR research groups [2][3][4][5]. However, although there are several phonological feature sets based on different linguistic theories, it is not clear whether any of them could be considered as the best design for the detection-based ASR task [6]. Therefore, instead of building a detection-based ASR system by randomly selecting one of the phonological feature sets, it would be better to consider each of them before constructing a detection-based ASR system."
    ABSTRACT: In this paper, we study the effect of using different phonological feature sets for detection-based automatic speech recognition in phone recognition tasks. Three phonological feature sets derived from different underlying phonological theories are investigated. Our experiments were conducted on the TIMIT database. By comparing the oracle phone recognition results achieved by assuming that all the phonological features are correctly detected based on each feature set, we show that selecting an appropriate phonological feature set is crucial to the performance of detection-based ASR. The highly accurate oracle phone recognition results show that the performance of the CRF-based backend, which is commonly used in detection-based ASR, is very satisfactory. Comparison of the oracle phone recognition results and the real phone recognition results indicates that investigation of high-accuracy front-end detectors is a key issue in improving the performance of detection-based ASR.
    6th International Symposium on Chinese Spoken Language Processing (ISCSLP '08), 2008; 01/2009
  • Source
    ABSTRACT: In this paper, we study the effect of using different phonological feature sets for detection-based automatic speech recognition in phone recognition tasks. Three phonological feature sets derived from different underlying phonological theories are investigated. Our experiments were conducted on the TIMIT database. By comparing the oracle phone recognition results achieved by assuming that all the phonological features are correctly detected based on each feature set, we show that selecting an appropriate phonological feature set is crucial to the performance of detection-based ASR. The highly accurate oracle phone recognition results show that the performance of the CRF-based backend, which is commonly used in detection-based ASR, is very satisfactory. Comparison of the oracle phone recognition results and the real phone recognition results indicates that investigation of high-accuracy front-end detectors is a key issue in improving the performance of detection-based ASR.
    Index Terms: detection-based ASR, phonological feature system, result fusion, speech recognition
    1. INTRODUCTION Currently, detection-based automatic speech recognition (ASR) is a popular research topic in fields related to ASR. Because human beings often understand speech by integrating multiple knowledge sources from the bottom up, detection-based ASR systems attempt to reduce the gap between human speech
  • Source
    • "This is mainly due to the fact that direct measurements require expensive and invasive devices, such as an electropalatograph. On the other hand, different methods are used to extract phonological information from the surface waveform, such as the use of artificial neural networks [8] [10], dynamic Bayesian networks [4] [9] or Hidden Markov Models [14], among others."
    ABSTRACT: In this paper we undertake the extraction of phonological features for the Spanish language. We also propose a method to integrate these features into an HMM-based speech recognition system using an architecture with independent feature streams. The experimental results show that higher recognition accuracy and lower computational cost can be obtained.