Jaekoo Kang
CUNY Graduate Center | CUNY · Program in Speech–Language–Hearing Sciences
Doctor of Philosophy
Director of AI Research @ i-Scream arts (until 2024)
About
Publications: 9
Reads: 950
Citations: 24
Introduction
variability in speech production; flexibility and synergy in articulatory movements; modeling of speech articulation and acoustics based on machine-learning approaches
Additional affiliations
August 2016 - November 2024
Education
August 2016 - April 2021
February 2008 - February 2014
Publications (9)
Contours traced by trained phoneticians have been considered to be the most accurate way to identify the midsagittal tongue surface from ultrasound video frames. In this study, inter-measurer reliability was evaluated using measures that quantified both how closely human-placed contours approximated each other as well as how consistent measurers we...
This study investigates how second language (L2) listeners’ perception is affected by two factors: the listeners’ experience with the target dialect – North American English (NAE) vs. Standard Southern British English (SSBE) – and talkers’ language background: native vs. non-native talkers; i.e. interlanguage speech intelligibility benefit (ISIB) t...
Background/aims: We investigated the efficacy of ultrasound imaging of the tongue as a tool for familiarizing naïve learners with the production of a class of nonnative speech sounds: palatalized Russian consonants. Methods: Two learner groups were familiarized, one with ultrasound and one with audio only. Learners performed pre- and postfamilia...
Ultrasound imaging is a non-invasive technique for the measurement of the tongue in speech. Recent advancements in analytical edge detection algorithms and deep learning methods have improved tongue contour segmentation. However, most edge detection algorithms require user input as initialization “seeds” and accuracy can d...
Variability is widespread in speech, but it is unlikely that all of it is harmful; variability in other domains has been shown to allow flexibility, within limits. Using one technique for separating the two, we applied Uncontrolled Manifold Analysis (UCM) to vowels in running speech. This results in two multidimensional manifolds, one the controlle...
Korean learners of English must create four vowel categories for English (/i, ɪ/ and /ɛ, æ/) in relation to two similar native categories (/i/ and /ɛ/). It is hypothesized that new categories should be easier to learn than similar ones (Flege, 1994), but it is unclear whether the English L2 vowels are similar or new. The degree of similarity betwee...
Speech production is a highly skilled sensorimotor activity defined by articulatory or acoustic coordinates. To compare the variabilities of those two conceptualizations, issues of dimension reduction, normalization, incompleteness of information, etc., need to be taken into account. The uncontrolled manifold (UCM) method analyzes high-dimensional move...
The tongue surface is a good indicator of the main supralaryngeal articulation of speech, and quantifying it with more measurement points increases accuracy. However, unlike acoustic variables (e.g., formants), articulatory variables (e.g., flesh-point pellets or multiple measurement points on an ultrasound image) are highly correlated with one another...
Speech inversion (acoustic-to-articulatory mapping) is an important but non-trivial problem because the mapping is highly non-linear and non-unique. This study investigated the performance of a Deep Neural Network (DNN) compared with that of a traditional Artificial Neural Network (ANN) in addressing the problem. The Wisconsin X-ray Microbeam D...