-
ABSTRACT: We describe the application of decision trees to the automatic conversion of pronunciations between American, British and South African English accents. The resulting phoneme-to-phoneme (P2P) conversion technique derives the pronunciation of a word in a new target accent by taking advantage of its existing available pronunciation in a different source accent. We find that it is substantially more accurate to derive pronunciations in this way than directly from the orthography and available target accent pronunciations using more conventional grapheme-to-phoneme (G2P) conversion. Furthermore, by including both the graphemes and the phonemes of the source accent, grapheme-and-phoneme-to-phoneme (GP2P) conversion delivers additional increases in accuracy in relation to P2P. These findings are particularly important for less-resourced varieties of English, for which extensive manually-prepared pronunciation dictionaries are not available. By means of the P2P and GP2P approaches, the pronunciations of new words can be obtained with better accuracy than is possible using G2P methods.
Speech Communication. 01/2011;
-
INTERSPEECH 2009, 10th Annual Conference of the International Speech Communication Association, Brighton, United Kingdom, September 6-10, 2009; 01/2009
-
INTERSPEECH 2007, 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; 01/2007
-
ABSTRACT: We present a corpus-based analysis of the Afrikaans, English, Xhosa and Zulu languages, comparing these in terms of phonetic content, diversity and mutual overlap. Our aim is to shed light on the fundamental phonetic interrelationships between these languages, with a view to furthering progress in multilingual automatic speech recognition in general, and in the South African region in particular.
Southern African Linguistics and Applied Language Studies 10/2005; 23(4):459-474. · 0.03 Impact Factor
-
ABSTRACT: We investigate whether accent identification is more effective for English utterances embedded in a different language as part of a mixed code than for English utterances that are part of a monolingual dialogue. Our focus is on Xhosa and Zulu, two South African languages for which code-mixing with English is very common. In order to carry out our investigation, we extract English utterances from mixed-code Xhosa and Zulu speech corpora, as well as comparable utterances from an English-only corpus by Xhosa and Zulu mother-tongue speakers. Experiments using automatic accent identification systems show that identification is substantially more accurate for the utterances originating from the mixed-code speech. These findings are supported by a corresponding set of perceptual experiments in which human subjects were asked to identify the accents of recorded utterances. We conclude that accent identification is more successful for these utterances because accents are more pronounced for English embedded in mother-tongue speech than for English spoken as part of a monolingual dialogue by non-native speakers. Furthermore, we find that this is true for human listeners as well as for automatic identification systems.
Computer Speech & Language.
-
ABSTRACT: It has been shown that techniques known as grapheme-and-phoneme-to-phoneme (GP2P) conversion can be used to derive pronunciations in a poorly-resourced accent, such as South African English, using available pronunciations in better-resourced accents of the same language, such as British and American English. However, if the pronunciation is not available in either accent, it must be obtained using grapheme-to-phoneme (G2P) conversion in either the source or the target accent. The question therefore arises whether it is better to apply G2P in the source accent and then GP2P to obtain the desired pronunciation in the target accent, or to apply G2P directly to the target accent. This study finds that if the source dictionary used has a high G2P accuracy (due to the dictionary's size, regularity, or both), it is advantageous to generate a pronunciation in the source accent first using G2P, and subsequently convert this pronunciation to the target accent.
-
ABSTRACT: It is well established that accent can have a detrimental effect on the performance of automatic speech recognition (ASR) systems. While accents are usually classified in terms of a speaker's mother tongue, it remains to be determined if and when this linguistic classification is appropriate for the development of ASR technology. This study focuses on South African English as produced by mother tongue speakers of Nguni and Sotho languages, which account for over 70% of the country's population. The aim of the investigation is to determine whether these two accent groups should be treated as a single variety, or whether it is better to treat them separately. We begin with a perceptual experiment in which human listeners classify different English accents. Subsequently, speech recognition experiments are conducted to determine whether the acoustic models benefit from the incorporation of Nguni/Sotho accent classifications. The results of the perceptual experiment indicate that most listeners cannot correctly identify a speaker's mother tongue based on their English accent. This finding is supported by the results of the recognition experiments.
-
ABSTRACT: This paper presents a number of practical guidelines on the design, collection and annotation of South African speech databases. Most of the guidelines are based on personal experience gained during previous data collection exercises. The issues that are addressed in the paper include: the aim of data collection, the design of prompting material, speaker recruitment, recording equipment, as well as the recording, editing and annotation of the speech data.
-
ABSTRACT: In this study we explore the extension of a small Afrikaans pronunciation dictionary by applying phoneme-to-phoneme (P2P) and grapheme-and-phoneme-to-phoneme (GP2P) conversion to an existing and more extensive Dutch pronunciation dictionary. This is compared to the more common approach of extending the Afrikaans dictionary by means of grapheme-to-phoneme (G2P) conversion. The results indicate that the Afrikaans pronunciations obtained by P2P and GP2P from the Dutch dictionary are more accurate than the corresponding pronunciations obtained by the application of G2P. This result indicates that under-resourced languages can take advantage of existing and more extensive pronunciations available in a closely-related and better-resourced language in order to improve the extent and quality of a pronunciation dictionary.