Tae-Jin Yoon

Verified
Tae-Jin verified their affiliation via an institutional email.
  • Doctor of Philosophy
  • Professor (Full) at Sungshin Women's University

About

70
Publications
9,448
Reads
474
Citations
Current institution
Sungshin Women's University
Current position
  • Professor (Full)

Publications (70)
Article
Full-text available
The modulation of vocal elements, such as pitch, loudness, and duration, plays a crucial role in conveying both linguistic information and the speaker’s emotional state. While acoustic features like fundamental frequency (F0) variability have been widely studied in emotional speech analysis, accurately classifying emotion remains challenging due to...
Preprint
Full-text available
The modulation of vocal elements such as pitch, loudness, and duration plays a crucial role in conveying both linguistic information and the speaker’s emotional state. While acoustic features like fundamental frequency (F0) variability have been widely studied in emotional speech analysis, challenges remain in accurately classifying emotions due to...
Article
Full-text available
Because of their expressiveness, ideophones are believed to exhibit phonotactic patterns distinct from those of regular language. Vowel harmony can be observed in ideophones in Modern Korean. However, over time, Korean has gradually lost its regular vowel harmony process, due to the influx of foreign words, especially from Chinese, and historical sound chang...
Article
Full-text available
The study examined the link between Korean-speaking children’s vowel production and its perception by inexperienced adults and also observed whether ongoing vowel changes in mid-back vowels affect adults’ perceptions when the vowels are produced by children. This study analyzed vowels in monosyllabic words produced by 20 children, ranging from 2 to...
Article
This paper presents an apparent-time study of the vowel length contrast merger in Seoul Korean based on duration measurements of over 370,000 vowels in word-initial syllables in a read-speech corpus. The effects of word frequency on vowel duration and the lexical diffusion of long-vowel shortening are also examined. The findings confirm the observa...
Conference Paper
Full-text available
Seoul Korean is currently undergoing a tonogenetic sound change wherein the traditional consonantal VOT cue has been replaced by the previously intrinsic f0 of the following vowel. This study makes use of a recently available apparent-time corpus of speech to examine how this change has unfolded across the lexicon. In particular, we examine the eff...
Article
Full-text available
This paper investigates the acoustic characteristics of English fricatives in the TIMIT corpus, with a special focus on the role of gender in rendering fricatives in American English. The TIMIT database includes 630 talkers and 2342 different sentences, comprising over five hours of speech. Acoustic analyses are conducted in the domain of spectral...
Article
Full-text available
The paper describes methods of conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The "Read Style Corpus of Standard Korean" is used for building the forced alignment system and a subset of the corpus for the processing and extraction of features for vowel analysis based on optimal...
Article
This paper analyses the rate of inter-speaker consistency in the way multiple speakers render prosodic events when they read the same scripts. Prosodically labeled data of five speakers from the Boston Radio Speech Corpus (BURSC) are used to measure the degree of speaker variation in rendering prosodic boundaries. The results indicate that the aver...
Article
Full-text available
We made quantitative rhythmic and timing measurements on speech samples obtained from a 61-year-old monolingual female English speaker who is reported to have acquired a rare but possible case of Foreign Accent Syndrome (FAS). The phonetic characteristics of speech produced by individuals with FAS affect both suprasegmental and segmental properties....
Article
Korean oral stops are unique in their three-way laryngeal contrast among plain, fortis, and aspirated stops. The primary acoustic correlate of the contrast has been shown to be voice onset time (VOT). Three native Korean-speaking mothers were recorded at their homes for one hour as they interacted with their 5-month-old infants. Mothe...
Article
Full-text available
A detailed phonetic analysis of a 61-year-old monolingual female English speaker is presented. There is considerable variability among reported cases of FAS in terms of phonetic characteristics and impairments. The speaker, LA, is a monolingual English-speaking Canadian woman who was 61 years old when the data were collected. One...
Article
Full-text available
An analysis is presented on the rate of inter-speaker consistency in the way multiple speakers realize prosodic events when they read the same scripts. The analysis is made on the Boston University Radio Speech Corpus (BURSC). The BURSC consists of data from five speakers (3 female and 2 male), each reading the same scripts that comprise more than...
Article
Full-text available
Statistical rhythmic metrics are applied to the Buckeye corpus [1] of spontaneous interview speech in order to investigate the extent of between-speaker as well as within-speaker rhythm variability. The corpus consists of speech produced by speakers who share the same regional dialect in North America. The Buckeye corpus is uni...
Conference Paper
One of the crucial issues in Internet chat is how to manage the corresponding pairs of questions and answers in a sequence of conversations. This paper addresses the problems of ambiguous dialogue logs, lack of a social interaction network of chat agents, and the rupture of the turn sequence in the plain chat room. Therefore we can resolve the ambi...
Article
A study was conducted to investigate the use of acoustic cues by Chinese learners of English in stress perception. The study included the stress judgment of synthesized disyllabic nonsense tokens and an oddity test using real English words from carrier sentences. The study also manipulated the F0 difference on disyllabic nonsense words at 5 levels, whil...
Article
Full-text available
Non-modal phonation conveys both linguistic and paralinguistic information, and is distinguished by acoustic source and filter features. Detecting non-modal phonation in speech requires reliable F0 analysis, a problem for telephone-band speech, where F0 analysis frequently fails. We demonstrate an approach to the detection of creaky phonation in...
Article
Full-text available
Article
Full-text available
Prosodic structure encodes the grouping of words into hierarchically layered prosodic constituents, including the prosodic word, intermediate phrase (ip) and intonational phrase (IP). This paper investigates the phonetic encoding of prosodic structure from a corpus of scripted broadcast news speech in American English through analysis of the acou...
Article
Full-text available
Speech prosody is manifest in the acoustic signal through the modulation of pitch, loudness, duration, and source characteristics (voice quality), which combine to encode the prosodic structure of an utterance. Prosodic structure defines the location of prominent words and syllables, and the grouping of words into phonological phrases. Prosodic str...
Article
Full-text available
Voice quality conveys both linguistic and paralinguistic information, and can be distinguished by acoustic source characteristics. We label objective voice quality categories based on the spectral and temporal structure of speech sounds, specifically the harmonic structure (H1-H2) and the mean autocorrelation ratio of each phone. Results from a cla...
Article
Full-text available
The prosodic structure of speech is based on complex interaction within and between several different levels of linguistic and paralinguistic organization, and is expressed in the modulation of F0, intensity, duration, and voice quality, as well as the occurrence of pauses. Even though leading theories of prosody maintain that prosody is shaped...
Article
Full-text available
Acoustic evidence for a distinction between low-toned intermediate (ip) and intonational phrase (IP) boundaries is presented from two speech corpora representing spontaneous, conversational speech and scripted broadcast speech. Robust effects of the two boundary levels are found in the phrase-final syllable rime in both corpora. Nucleus duratio...
Article
This paper describes automatic speech recognition systems that satisfy two technological objectives. First, we seek to improve the automatic labeling of prosody, in order to aid future research in automatic speech understanding. Second, we seek to apply statistical speech recognition models of prosody for the purpose of reducing the word error ra...
Article
Full-text available
Non‐modal phonation conveys both linguistic and paralinguistic information, and is distinguished by acoustic source and filter features. Detecting non‐modal phonation in speech requires reliable F0 analysis, a problem for telephone‐band speech, where F0 analysis frequently fails. An approach is demonstrated to the detection of creaky phonation in t...
Article
Full-text available
The relationship between prosodic structure and syntactic structure is an unresolved area of inquiry, partly due to the shortage of prosodically transcribed speech corpora, and partly due to the complexity involved in the analysis of both syntax and prosody. Chomsky & Halle (1968, p. 372) state that "although there is a substantial literature o...
Article
Full-text available
Complex disfluencies that involve the repetition or correction of words are frequent in conversational speech, with repetition disfluencies alone accounting for over 20% of disfluencies. These disfluencies generally do not lead to comprehension errors for human listeners. We propose that the frequent occurrence of parallel prosodic features in the...
Conference Paper
Full-text available
Two transcribers have labeled prosodic events independently on a subset of the Switchboard corpus using an adapted ToBI (TOnes and Break Indices) system. Transcriptions of two types of pitch accents (H* and L*), phrasal accents (H- and L-) and boundary tones (H% and L%) encoded independently by two transcribers are compared for intertranscriber reliabil...
Article
Full-text available
Since unaccented preboundary syllables were rare in the Radio News corpus and L* preboundary syllables were rare in both corpora, analyses of those items (unaccented in Radio News and L* in both) are not reported here. Creaky tokens were excluded from our analysis of pitch due to frequent pitch track failure. Published reliability studies, incl...
Article
Full-text available
Prosodic phrase boundaries, regardless of level of disjuncture, can be signaled by variation in pitch, loudness, and final-syllable length. In an attempt to find acoustically distinctive characteristics correlated with ip (intermediate phrase) versus IP (intonation phrase) labels in a ToBI-labeled subset of the Switchboard corpus, we compared F0 d...
Article
Full-text available
Abstract: Through an acoustic analysis and a machine learning experiment, this presentation demonstrates a categorical distinction in American English between accented high tones H* whose register is lowered (downstep: !H*) and those whose register is not. This study offers an explanation for the contradictory findings...
Article
Full-text available
Metathesis between a laryngeal feature and an adjacent stop is motivated by perceptual optimization, by means of which the acoustic/perceptual properties of the laryngeal feature in a less salient position are realized in a more salient position, or by means of which the acoustic/perceptual properties in a more salient position are resistant to be...
Article
Full-text available
Automatic speech recognition (ASR) is like solving a crossword puzzle. Context at every level is used to resolve ambiguity: the more context we can bring to bear, the higher will be the accuracy of the ASR. One of the ways in which ASR uses context is by defining context-dependent phonological units. This paper reviews and applies two types of phon...
Article
Full-text available
Prosodic structure encodes grouping of words into hierarchically layered prosodic constituents, including the prosodic word, intermediate phrase (ip) and intonational phrase (IP). This paper investigates the phonetic encoding of prosodic structure from a corpus of scripted broadcast news speech through analysis of the acoustic correlates of prosodi...
Article
Full-text available
The phonetic characteristics of declarative sentence-ending 'ta' were examined based on the speech of one male Korean speaker in his 20s drawn from a large-scale speech corpus. An analysis using a Unicode-based phone alignment system was compared to an analysis based on manually corrected alignment and the two methods produced largely comparable r...
Article
Full-text available
Acoustic-phonetic studies were conducted on the speech of speech clinicians as they communicated with speech- and/or language-impaired children through regular, one-to-one training sessions. The participants in this study included three speech clinicians and twelve children with a speech and/or language disorder. Four children of dif...
