Conference Proceeding

A nativeness classifier for TED Talks

INESC-ID Lisboa, Lisbon, Portugal
In proceedings of: Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on (impact factor: 4.63). 06/2011; pp. 5672-5675. DOI:10.1109/ICASSP.2011.5947647
Source: IEEE Xplore

ABSTRACT This paper presents a nativeness classifier for English. The detector was developed and tested on TED Talks collected from the web, in which the main non-native cues are segmental and prosodic. The first experiments used only acoustic features, with Gaussian supervectors used to train a classifier based on support vector machines, and resulted in an equal error rate of 13.11%. Subsequent experiments based on prosodic features alone did not yield good results. However, a fused system combining acoustic and prosodic cues achieved an equal error rate of 10.58%. A small human benchmark showed an inter-rater agreement of 0.88, a value very close to the agreement between humans and the best fused system.
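
The equal error rate quoted in the abstract is the operating point at which the false-acceptance and false-rejection rates coincide. A minimal sketch of how such a figure could be computed from classifier scores is given below. The function name, the score convention (higher means more likely non-native), and the toy scores are all assumptions made for illustration; none of them come from the paper, and this is not the authors' evaluation code.

```python
def equal_error_rate(native_scores, non_native_scores):
    """Return (eer, threshold) at the point where the two error rates are closest.

    Assumed convention (hypothetical, not from the paper): a higher score
    means the talk is more likely to be non-native, the target class.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(native_scores) | set(non_native_scores))
    best = None
    for t in thresholds:
        # False rejection: a non-native talk scored below the threshold.
        frr = sum(s < t for s in non_native_scores) / len(non_native_scores)
        # False acceptance: a native talk scored at or above the threshold.
        far = sum(s >= t for s in native_scores) / len(native_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, made up for illustration only.
native = [0.1, 0.2, 0.3, 0.6]
non_native = [0.4, 0.7, 0.8, 0.9]
eer, threshold = equal_error_rate(native, non_native)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With these toy scores, one native talk (0.6) and one non-native talk (0.4) fall on the wrong side of the threshold 0.6, so both error rates equal 25%. On real data the two rates rarely match exactly at any observed threshold, which is why the sketch reports the midpoint at the closest crossing.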

Keywords

acoustic features
detector
equal error rate
first experiments
following experiments
fused system
Gaussian supervectors
inter-rater agreement
major non-native cues
paper presents
prosodic cues
prosodic features
prosody
support vector machines