
Acoustic evidence for the emergence of tonal contrast in contemporary Korean


Abstract

Acoustic evidence suggests that contemporary Seoul Korean may be developing a tonal system, which is arising in the context of a nearly completed change in how speakers use voice onset time (VOT) to mark the language's distinction among tense, lax and aspirated stops. Data from 36 native speakers of varying ages indicate that while VOT for tense stops has not changed since the 1960s, VOT differences between lax and aspirated stops have decreased, in some cases to the point of complete overlap. Concurrently, the mean F0 for words beginning with lax stops is significantly lower than the mean F0 for comparable words beginning with tense or aspirated stops. Hence the underlying contrast between lax and aspirated stops is maintained by younger speakers, but is phonetically manifested in terms of differentiated tonal melodies: laryngeally unmarked (lax) stops trigger the introduction of a default L tone, while laryngeally marked stops (aspirated and tense) introduce H, triggered by a feature specification for [stiff].
David J. Silva
University of Texas at Arlington
* I would like to thank Younjeoung Choi and Ji Eun Kim for their valuable assistance
in this project, not only as tireless fieldworkers, but as dedicated research assistants
in every respect. I further offer my thanks to the guest editors and to two anonymous
reviewers, all of whom provided useful suggestions on the first draft of the
text. As noted in the text, preliminary descriptions of the VOT data were presented
at the 2003 Harvard International Symposium on Korean Linguistics (ISOKL) and
the 2004 Meeting of the International Circle of Korean Linguistics. A first pass at
the phonological analysis was offered at the 2005 Harvard ISOKL. The research for
this paper was conducted under the auspices of the University of Texas at
Arlington’s Institutional Research Board for the Protection of Human Subjects.
Phonology 23 (2006) 287–308. © 2006 Cambridge University Press
doi:10.1017/S0952675706000911 Printed in the United Kingdom
1 Introduction
Despite long-standing descriptions of standard Korean as a non-tonal and
non-accentual language (Martin 1992: 60, Sohn 1999: 48, Lee & Ramsey
2000: 315), the data analysed for this study indicate that some self-identified
speakers of standard (Seoul) Korean employ a speech-production strategy
that includes a tonal contrast. Moreover, this tonal system has arisen in the
context of a diachronic shift in Korean’s three-member contrast among
tense, lax and aspirated obstruents: as reported by several researchers (Jun
1993, Kim 2000, Kim & Park 2001), many speakers of the standard
language – more specifically, younger speakers (Silva 2006) – no longer
mark the contrast between lax and aspirated stops in phrase-initial po-
sition with differences in voice onset time (VOT). Rather, some speakers
of Korean appear to be implementing surface-level phonation-to-tone
shifts akin to those reported for languages such as Tibetan (Duanmu 1992)
and Kammu (Svantesson & House 2006), languages in which historical/
underlying laryngeal distinctions are reinterpreted as systematic differ-
ences in fundamental frequency (F0).
As some younger speakers have
neutralised VOT differences between lax and aspirated stop phonemes
and mark this contrast tonally, formal accounts of the Korean obstruent
system need revision in the direction of feature representations that ad-
equately account for the phonetic behaviour of all speakers of the standard
variety, young and old alike. The modification proposed here involves
replacing underlying glottal aperture features (specifically [spread glottis]
and [constricted glottis]) with a more abstract laryngeal ‘tensity’ feature
(à la Kim 1965), [stiff]. This single feature has the advantage of
generalisability across the speaker pool: [stiff] may be phonetically
implemented in either a more traditional way, i.e. maintaining VOT distinctions
between lax and aspirated stops, or a more innovative manner,
i.e. backgrounding aspiration differences in favour of a tone-based strategy
for marking the underlying lax vs. aspirated distinction.
2 Background
Korean is unusual among the major languages of East Asia in at least two
ways. First, the consonant system of Korean includes a typologically rare
(if not unique) distinction among three types of voiceless obstruents: lax
(or plain), which Sohn (1999: 154) describes as ‘basically voiceless, with
only a minor degree of aspiration and no tenseness’; tense (or reinforced,
geminated), characterised by ‘building up air pressure behind the closed
place of articulation’ and an unaspirated release (Sohn 1999: 154); and
aspirated, produced ‘with strong aspiration lasting about 100 ms’ (Lee &
Ramsey 2000: 62).
Second, most varieties of Korean do not employ F0
lexically. While both earlier stages of Korean and contemporary varieties of
the language have been shown to employ pitch-based accentual systems,
the majority of Koreans speak a variety in which F0 plays no phonologically
contrastive role at the word level (Sohn 1999: 60–62, Lee & Ramsey
2000: 280, 315).
While it might be tempting to assume that the process described here is tonogenesis,
it is perhaps best to take a cue from Svantesson & House (2006), and hold off on
making such a claim. Whether standard Korean ultimately develops a purely tonal
means for distinguishing lexical items currently contrasted in terms of consonantal
phonation differences remains to be seen.
In a more recent re-evaluation of the Korean consonantal system, Kim (2000) and
Kim & Duanmu (2004) claim that the Korean system is, in fact, typologically unremarkable,
arguing that the fundamental difference among the three stop types can
be analysed in terms of the common voiced vs. voiceless contrast. For additional
discussion on this point, see §5.
As documented in the current study, however, the phonetic im-
plementation of the phonemic contrast among lax, tense and aspirated
segments has been changing over the past two generations. According to
widely cited phonetic accounts of Korean written in the 1960s and 1970s
(Han & Weitzman 1965, 1967, Kim 1965, Hardcastle 1973, Kagaya 1974),
a key acoustic correlate associated with each stop type is voice onset time:
tense stops manifest short VOTs (in the range of 6 to 18 ms), aspirated
stops manifest long VOTs (~100–115 ms), and lax stops manifest inter-
mediate VOT values (~20–60 ms). Yet more recent acoustic studies of
the language (Silva 1992, Kim 1995, Cho 1996, Han 1996, Flege et al.
1999) have reported noticeably different VOT values for the lax and
aspirated stops; specifically, lax stops are more aspirated (~40–70 ms),
while aspirated stops are less so (~85–105 ms).
The first published reference to a possible diachronic change in VOT
values is a passing mention in Silva (1992: 157). Indirect evidence for
such a change was presented in Silva (2002), a meta-analysis of
relevant phonetic studies published between 1965 and 2000. The conclusions
drawn from the meta-analysis led directly to the development of
the study being reported here. Preliminary accounts of the acoustic data
reported in this work first appeared in Silva et al. (2004), an initial description
of observed age-related VOT changes for 14 female speakers in
the current subject pool, which verified the existence of a robust age-related
reduction of VOT values associated with the aspirated stops of the
language (VOTasp). Silva (2006) provides an expanded phonetic account
of VOT behaviours for an additional 20 speakers, in which it is reported
that an observed apparent-time decrease in VOTasp, coupled with a less
clear-cut rise in VOTlax and a lack of any change for VOTtense, points to a
pattern whereby the VOT difference between lax and aspirated stops
has decreased over time. Indeed, for many younger speakers, particularly
those born during the 1970s and 1980s, any VOT-based distinction between
lax and aspirated stops has been completely neutralised. If VOT
no longer serves to distinguish lax from aspirated stops (at least for
some speakers), is there evidence of a phonemic merger? As the data
analysed here suggest, the answer is ‘no’: rather, contrasts previously
marked primarily by differences in VOT have now been complicated
by the addition of a tonal dimension, which, for some speakers, has assumed
a primary role in distinguishing lax stops from their aspirated
counterparts.
A similar claim appears in Choi (2002), in which data from three speakers of the
Seoul dialect are compared with those of three speakers from the Chonnam dialect.
The participants in Choi’s study (which did not consider the issue of language
change) are all ‘in their late twenties’ (2002: 5), squarely placing them in the
younger cohort analysed in the current work.
3 Methodology
3.1 Materials
The materials developed for this study consist of experimentally controlled
frame sentences containing nine different target forms, each of
which is a three-syllable lexical item beginning with a ‘C-a’ sequence
(Table I). The frame took the shape i ken ___-i-la-ko ha-cio ‘This thing is
called (a) ___’, where the left edge of the target corresponds to the left edge
of a phonological phrase, as discussed in Silva (1992) and Yu-Cho (1990)
(and roughly consistent with an accentual phrase as characterised by Jun
1993). This frame is consistent with that used in previous research (e.g.
Silva 1992), particularly in that it locates the target forms in ‘initial position’
(a term used without further elaboration by the earliest researchers
on this topic, e.g. Han & Weitzman 1965, 1967). The use of a topic-marked
pronominal immediately to the left of the target form, i ken (< i kes-eun: i
‘this’, kes ‘thing’, -eun TOPIC), further ensures the presence of a phrasal
boundary immediately before the target. As has been amply demonstrated
in the literature on Korean phonetics and phonology, the prosodic position
of a consonant segment plays a critical role in how that segment will
be realised on the surface. In phrase-initial position, for example, the
word-initial stops of the target items are expected to be realised in their
strongest (i.e. least lenited) variants: tense segments should manifest noticeably
long closures with very short VOTs, lax segments should manifest
moderate VOTs and aspirated segments should manifest very long
VOTs (Silva 1992).
Each sentence (e.g. i ken panulcililako hacio ‘this is called needlework’)
was printed in Korean script on individual index cards. Subjects were
instructed to read the cards in random order, along with other cards
functioning as distractors (items to be used in research not related to the
current study), at a self-selected rate of speech, described as ‘normal,
comfortable reading speed’. After the subject had read each of the randomly
presented sentences, the cards were shuffled and the reading process
repeated. For this study, data from either three or four rounds have been
analysed, thereby yielding a total of 27 to 36 tokens for each subject.
Table I
Target words employed in the study. All forms are given in Yale Romanisation.
Orthographic syllable boundaries are indicated by a period.
[Table body not recovered. Columns (place of articulation): labial, alveolar, velar. Surviving row label: lax (plain). Surviving glosses: ‘sewing, needlework’, ‘multiple cropping’, ‘grinding sound’, ‘far away time/place’, ‘scallion salad’, ‘alien place’.]
3.2 Subjects
The data for this study were elicited from 36 adult native speakers of
Korean residing in the area of Dallas-Fort Worth, Texas, 21 females and
15 males, born between 1943 and 1982. In response to a demographic
questionnaire, all subjects reported that they were born, raised and edu-
cated in the capital region of Korea (i.e. Seoul city or Gyeonggi province),
and that they speak the standard variety. All subjects entered the United
States after the age of 18 and use Korean at home, speaking English as a
second language (primarily for business or educational purposes). No
subjects reported any difficulties in speech or hearing.
Each of these selection criteria was implemented to increase the likelihood that the
data elicited were maximally representative of standard Korean, despite the practical
limitation imposed by the fact that the subjects under investigation are currently
resident in the United States. Further assurances regarding the authenticity of
the data were obtained through the on-field judgments of the two female native
speakers who were contracted to recruit and record subjects for this study; these
data collectors, both doctoral students in linguistics with specific training in phonetics
and phonology, ascertained that each subject was a bona fide native speaker of
Korean with no discernable second-language phonetic traits. A subsequent assessment
of each subject’s authenticity was independently made in early 2006 by a third
native Korean speaker, who judged all 36 subjects to be legitimate native speakers of
the language. Post hoc efforts to ensure the validity of the speech data aside, of key
importance is the fact that each subject entered the United States on or after the age
of 18, beyond the point at which influence from American English would appreciably
affect the subjects’ production of the stops under investigation. See Flege
et al. (1995) and Flege et al. (2003) for further discussion of ‘Age of Arrival’ or ‘Age
of Learning’ effects.
Subjects were recruited through a social networking approach and invited
to participate in a 20–30 minute recording session in a quiet location
of the subject’s choosing. Each session was recorded by one of two female
native speakers of Korean using a standard cassette audiotape and a lapel
microphone. The recordings were subsequently digitised (22,050 Hz,
16 bit) for acoustic analysis in Praat (version 4.2.14).
3.3 Data measurement
For each target word in the corpus, seven measurements were taken: VOT
of the word-initial stop, the F0 of the vowel in the first syllable at three
points (onset, midpoint and offset), and the averaged F0 of the vowels in
each of the target form’s three syllables. Averaged F0 values were obtained
by selecting the span of the vowel in acoustic display and allowing Praat to
automatically calculate the desired value. While employing this sort of
summary measure obscures F0 contours that might be associated with a
particular vowel, it successfully captures a more general sense of the vowel’s
pitch in relation to (a) the other two vowels in the same target word and (b)
corresponding syllables (i.e. first–second–third) in other target forms.
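The F0 measures described above (onset, midpoint, offset and an averaged value over the vowel span) can be illustrated with a minimal sketch. It assumes a pitch track exported as (time, F0) frames with unvoiced frames marked as None; the function name, the frame format and the sample values are hypothetical, and Praat's own averaging procedure may differ in detail.

```python
from statistics import mean

def vowel_f0_measures(track, start, end):
    """Summarise F0 over one vowel.

    track: list of (time_s, f0_hz or None) frames (hypothetical export format).
    start, end: vowel boundaries in seconds.
    """
    # Keep only voiced frames that fall inside the vowel span.
    voiced = [(t, f) for t, f in track if start <= t <= end and f is not None]
    if not voiced:
        raise ValueError("no voiced frames inside the vowel span")

    def nearest(target):
        # F0 of the voiced frame closest in time to the target point.
        return min(voiced, key=lambda frame: abs(frame[0] - target))[1]

    return {
        "onset": nearest(start),
        "midpoint": nearest((start + end) / 2),
        "offset": nearest(end),
        "mean": mean(f for _, f in voiced),
    }

# Invented pitch track with 10 ms frames, for illustration only.
track = [(0.00, None), (0.01, 180.0), (0.02, 184.0), (0.03, 188.0),
         (0.04, 190.0), (0.05, 186.0), (0.06, None)]
measures = vowel_f0_measures(track, 0.01, 0.05)
```

Applying the same function to the vowel of each of the three syllables would give the three averaged values per target word; together with VOT and the three first-syllable points, that accounts for the seven measurements.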
4 Results
4.1 VOT patterns
As predicted on the basis of previous research, while VOT values for the
tense stops have remained stable over time, there are clearly discernable
age-based differences in VOT values for lax and aspirated stops. In Fig. 1,
which (for the sake of clarity) aggregates subjects into a series of nine five-year
bands (following Labov 1994: 60ff), we find that for younger speakers
the distances between mean VOTlax and mean VOTasp are smaller than
those observed for older speakers. This apparent change in behaviour is
most evident in the VOT data associated with the cohort of speakers born
in 1965 and after: for each of these bands, the difference between VOTlax
and VOTasp diminishes considerably, with corresponding overlap in the
VOT data ranges (presented here as one standard deviation on either side
of the mean).
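The aggregation into five-year bands can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the band labels, the sample values and the choice to compute the standard deviation over per-speaker means (rather than over individual tokens) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def band_label(year_of_birth, width=5):
    """Label of the five-year band containing the given year, e.g. 1943 -> '1940-1944'."""
    start = year_of_birth - year_of_birth % width
    return f"{start}-{start + width - 1}"

def summarise_by_band(speakers):
    """speakers: list of (year_of_birth, mean_vot_ms).

    Returns {band: (mean, sd, n)}; sd is reported as 0.0 for single-speaker bands.
    """
    bands = defaultdict(list)
    for year_of_birth, vot in speakers:
        bands[band_label(year_of_birth)].append(vot)
    return {
        band: (mean(values), stdev(values) if len(values) > 1 else 0.0, len(values))
        for band, values in sorted(bands.items())
    }

# Invented per-speaker mean VOT values for aspirated stops (ms).
sample = [(1965, 70.0), (1967, 80.0), (1971, 60.0)]
summary = summarise_by_band(sample)
```

Running the same summary separately for tense, lax and aspirated stops would give one series of band means and ranges per stop type.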
Figure 1
Mean VOT values and range (1 standard deviation) over time. Each marker
represents the mean VOT value for speakers aggregated into five-year bands,
based on year of birth. [Plot not recovered. x-axis: year of birth aggregated in five-year bands; y-axis: mean VOT ±1 SD (ms).]
The changing role of VOT as a means of distinguishing lax and aspirated
stops in Korean becomes all the more evident when one disaggregates
the data and calculates the mathematical difference between the
mean VOT values for lax and aspirated stops for each of the 36 subjects.
Since there is no natural pairing of individual tokens in the corpus, extracting
VOT differences may only be done by determining the mean
VOTlax and mean VOTasp for each speaker and then calculating ‘delta-VOT’:
ΔVOT = mean VOTasp − mean VOTlax. In calculating ΔVOT,
we can better assess the relative (and putatively phonemically driven)
differences in VOT employed by each speaker, while accounting for the
fact that there are individual speaker differences in overall degree of aspiration
(Silva et al. 2004). As illustrated in Fig. 2, older speakers consistently
produce clear differences between mean VOTlax and mean
VOTasp (ΔVOT > 0 ms), while younger speakers produce noticeably
smaller differences between mean VOTlax and mean VOTasp (ΔVOT <
20 ms). In fitting a curve to the data, the best fit proved to be that for a
quadratic equation (R² = 0.686), indicating that the relationship between
ΔVOT and year of birth is non-linear (cf. the R² of 0.656 for a linear
regression). Perhaps most surprising is the finding that for a handful of
speakers ΔVOT is negative: lax stops appear to manifest longer voice onset
times than aspirated stops, a finding certainly at odds with widely accepted
descriptions of the language.
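The per-speaker difference measure, ΔVOT = mean VOTasp − mean VOTlax, can be illustrated with a short sketch. The token format, speaker labels and VOT values below are invented for the example and are not drawn from the corpus.

```python
from collections import defaultdict
from statistics import mean

def delta_vot(tokens):
    """tokens: iterable of (speaker_id, phonation, vot_ms), phonation in {'lax', 'asp', 'tense'}.

    Returns {speaker_id: mean VOT_asp - mean VOT_lax}. Speakers lacking either
    lax or aspirated tokens are skipped.
    """
    by_speaker = defaultdict(lambda: defaultdict(list))
    for speaker, phonation, vot in tokens:
        by_speaker[speaker][phonation].append(vot)
    return {
        speaker: mean(groups["asp"]) - mean(groups["lax"])
        for speaker, groups in by_speaker.items()
        if groups.get("asp") and groups.get("lax")
    }

# Invented tokens: S1 keeps lax and aspirated stops apart, S2 does not.
tokens = [
    ("S1", "lax", 40), ("S1", "lax", 50), ("S1", "asp", 100), ("S1", "asp", 110),
    ("S2", "lax", 70), ("S2", "lax", 74), ("S2", "asp", 66), ("S2", "asp", 70),
    ("S2", "tense", 10),
]
deltas = delta_vot(tokens)
```

Because the means are taken within each speaker before subtracting, a speaker who aspirates heavily overall and one who aspirates lightly overall can still be compared on the size of the lax vs. aspirated separation. A negative value, as for the hypothetical S2, corresponds to the case in which lax stops show longer mean VOT than aspirated stops.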
A repeated-measures ANOVA performed on the mean VOT values
for the 36 subjects further revealed that both place of articulation
(F(2, 66)=5.2, p=0.008) and phonation type (F(2, 66)=57.4, p<0.001)
are significant within-speaker effects, with year of birth functioning as a
significant covariate (F(1, 33)=10.9, p=0.002); speaker sex was not significant
(F(1, 33)=0.0, p=0.992). Moreover, there was a single significant
interaction: phonation × year of birth (F(2, 66)=20.9, p<0.001).
Figure 2
ΔVOT (VOTasp − VOTlax) as a function of subjects’ year of birth. The best-fit
curve is that representing a quadratic function (R² = 0.686). [Plot not recovered. x-axis: subjects’ year of birth; y-axis: ΔVOT (ms).]
Initial assumptions regarding the data for VOT (as well as F0, discussed below)
suggested that both within-speaker and between-speaker effects might play
significant roles in the analysis. As such, the data were subjected to a series of
repeated-measures analyses of variance, each of which included three factors initially
thought to be relevant: phonation type, place of articulation and repetition
number. In addition, each speaker’s year of birth was taken as a covariate. During
this preliminary stage of the analysis, what was systematically varied from one
ANOVA to the next was a series of extralinguistic factors associated with potential
between-speaker effects, including speaker sex and an array of other factors corresponding
to the items included in the questionnaire, namely demographic data (self-reported
educational level, occupation, father’s dialect, mother’s dialect, etc.) and a
series of self-assessments (relative importance of clear speech, self-appraisal of
speaking skill, etc.). Ideally, a single analysis including all of these variables would
have provided a more statistically disciplined account of the data. Given the fact
that only 36 subjects are represented in the corpus, however, such an ANOVA
would have proven problematic, given the enormous ratio of variables to subjects. A
less statistically sophisticated (and admittedly weaker) piecemeal approach was
adopted, one yielding the aforementioned ‘series’ of ANOVAs. Under this strategy,
the only between-speaker factors that played a significant role in any of the statistical
analyses (α=0.05) were year of birth and sex.
With respect to place of articulation effects, we find little new to report:
as the point of articulation moves toward the posterior region of the oral
cavity, VOT tends to increase (Laver 1994: 352, Ladefoged 2003: 98).
Although robust for the tense and the lax stops, this pattern is confounded
for the aspirated stops: while the mean VOT for /kʰ/ is the longest
(80.4 ms), the mean VOT for /pʰ/ (79.5 ms) is greater than that for /tʰ/
(73.5 ms). Scheffé post hoc tests reveal two homogeneous subsets, the first
of which includes the alveolar and labial places of articulation and the
second of which includes labial and velar.
With regard to phonation effects, VOT values associated with the tense
stops (/pp tt kk/) were significantly (and unsurprisingly) lower than those
associated with lax /p t k/ or aspirated /pʰ tʰ kʰ/ for all speakers in the
corpus, a result consistent with Silva’s (2002) meta-analysis of the VOT
literature. As concerns the mean VOT for lax and aspirated stops, however,
the data revealed more complex relationships involving the speakers’
year of birth. A hierarchical cluster analysis on the VOT data revealed an
important divide among the subjects: speakers born before 1965 belonged to
one cluster, while those born in 1965 and after belonged to another. One
subject, a male born in 1970, patterned with the pre-1965 group.
Among the older speakers, the ‘traditionalists’, the mean VOTs associated
with tense, lax and aspirated stops were all significantly different
(p<0.05), with mean VOTtense ranging from 3 to 18 ms, mean VOTlax
ranging from 36 to 90 ms and mean VOTasp ranging from 51 to 117 ms
(Fig. 3). For younger speakers, the ‘innovators’, the mean VOT associated
with tense and lax stops is nearly equal to that for the traditionalists, but
the mean VOT for aspirated stops is markedly lower (69.7 ms vs. 94.0 ms).
Moreover, a repeated-measures ANOVA run only on the innovators
(using place of articulation and phonation type as the independent factors
and year of birth as a covariate) indicates that despite the relatively lower
mean VOTasp, phonation type remains a significant factor (p<0.001): as a
group, these younger speakers appear to use VOT differences to mark the
relevant phonemic distinction, but without the same degree of separation
in evidence for older speakers of the community.
It is worth mentioning that the so-called traditionalists are not even as ‘traditional’
as the speakers documented in the first instrumental studies of Korean. Han &
Weitzman (1965: 163) report that ‘in the initial position, aspiration with /pʰ, tʰ, kʰ/
last, on the average, 14–15 centiseconds. In most cases, it lasts more than 10 centiseconds’.
In the current corpus, however, VOT values for the aspirated stops are
not so long, with the upper end of the range reaching only 11.4 centiseconds.
Moreover, in research conducted by Lisker & Abramson (1964) and Kim (1965), we
find considerable overlap in the distribution of VOT values for tense and lax stops.
In the current corpus, no subject manifested such overlap, producing statistically
significantly distinct distributions for VOTtense and VOTlax. Hence, while the
general patterns of phonemic differentiation appear to be preserved by the ‘traditionalists’
of the current corpus, the absolute values for VOT seem to have
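The hierarchical cluster analysis that separates ‘traditionalists’ from ‘innovators’ can be illustrated with a minimal agglomerative sketch. The text does not state which variables, distance measure or linkage method were used, so the choices here (per-speaker mean VOTlax and mean VOTasp as features, Euclidean distance, average linkage, a two-cluster cut) are assumptions, and the speaker labels and values are invented.

```python
from itertools import combinations
from math import dist

def agglomerate(points, n_clusters=2):
    """Average-linkage agglomerative clustering.

    points: {label: (feature_1, feature_2, ...)}.
    Returns a list of clusters, each a sorted list of labels.
    """
    clusters = [[label] for label in points]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between the members of two clusters.
        return sum(dist(points[p], points[q]) for p in a for q in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]

# Invented per-speaker means: (mean VOT lax, mean VOT aspirated), in ms.
speakers = {
    "1948F": (45, 105), "1955M": (50, 100), "1962F": (55, 95),
    "1968M": (65, 72), "1975F": (68, 70), "1981F": (72, 69),
}
groups = agglomerate(speakers)
```

With these invented values the two clusters fall on either side of a 1965 birth year, mirroring the reported divide; a speaker such as the male born in 1970 who patterned with the older group would simply appear in the other cluster.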
In considering further the production of these younger speakers, however,
one cannot help but be drawn back to the data in Fig. 2 and wonder if
the story to be told is, in fact, more complicated still. For the innovator
cohort alone, a correlation between year of birth and ΔVOT yields an R² of
0.173, indicating no meaningful age-related pattern. Moreover, there
are five speakers for whom ΔVOT is negative, suggesting that for these
subjects lax stops are produced with more aspiration than phonemic
‘aspirated’ stops, a departure from established norms. Can these speakers
be differentiated from the others on extralinguistic grounds other than age?
Given the current corpus, with its limited degree of social stratification,
the answer at this point is a qualified ‘no’. With ΔVOT as the dependent
variable, a series of one-way ANOVAs for each of the demographic and
self-evaluation questions posed on a survey used in the fieldwork revealed
no differences in means that even approximated α=0.05; the smallest
p-value obtained was 0.303 (for the question ‘How important is good
pronunciation?’). At this point in our understanding of VOT variation,
then, we are left to explore elsewhere other factors that might possibly
differentiate those subjects who present positive ΔVOTs from those who
do not.
Figure 3
Mean VOT values (ms) of tense, lax and aspirated stops for traditionalists vs.
innovators. Error bars: ±1.00 SD. [Plot not recovered. x-axis: generation group; y-axis: VOT (ms).]
The need for more carefully constructed and socially informed socio-
linguistic inquiry notwithstanding, a fundamental phonological question
remains to be addressed: if it is true that at least some of the younger
speakers in the community no longer use VOT to differentiate lax and
aspirated stops, how do they signal the underlying contrast? A potential
answer lies in an analysis of the corresponding F0 patterns.
4.2 Tonal melodies
Across the corpus, independent of a speaker’s age or sex, F0 patterns for
the target items are consistent: for words beginning with a lax stop, the F0
of the vowel in the first syllable is consistently lower than both (a) the
second and third syllables of the same target word and (b) the first syllable
of analogous words beginning with either a tense or aspirated stop (Fig. 4).
Figure 4
Comparison of averaged tonal melodies of male vs. female speakers. The phonation
types indicated refer to the first C in each target word. The bars represent 1 SD
above and below the mean value. [Plot not recovered. y-axis: mean F0 (Hz).]
The robustness of these patterns is confirmed by a repeated-measures
ANOVA in which the dependent variable was the mean F0 for the vowel
in each syllable of the trisyllabic target forms. In this analysis, three independent
variables were found to be statistically significant: phonation
type of the word-initial stop (F(2, 33)=121.5, p<0.001), location of the
vowel in the target word (first, second or third) (F(2, 33)=11.2, p<0.001)
and speaker’s sex (F(1, 34)=260.0, p<0.001). In addition, three interactions
proved significant (each at p≤0.001): Phonation × Sex, Phonation ×
Syllable and Phonation × Syllable × Sex. Two factors not significant in the
analysis (p>0.05) were the place of articulation of the word-initial stop
and the speaker’s year of birth. These findings suggest that F0 behaviours
differ from the VOT behaviours reported above in two
critical ways: place of articulation for the word-initial stop has no bearing
on the rate of vocal fold vibration of the targeted vowels, and there are no
clearly discernable diachronic effects on the same. This latter point is
particularly noteworthy, as it suggests that the tonal patterns presented in
Fig. 4 have been stable over time.
While one might expect phonation effects on the F0 values for the vowels
appearing in the target words’ initial syllables (Han & Weitzman 1965,
Silva 1998), differences among F0 values are also statistically significant
across syllables 2 and 3, but only with respect to a distinction between lax
and non-lax word-initial stops. In words beginning with aspirated and
tense stops, mean F0 values are not statistically significantly different in
either syllable 2 (p=0.318) or syllable 3 (p=0.059). Further probing by
means of Scheffé post hoc tests reveals that words beginning with a lax stop
exhibit a sequence of progressively higher-pitched syllables, words beginning
with an aspirated stop exhibit a sequence of progressively lower-pitched
syllables, and words beginning with a tense stop exhibit a pattern
whereby the F0 of syllable 2 is higher than those of syllables 1 and 3, but
the F0s of syllables 1 and 3 are not significantly different from each other.
These patterns are schematised using Chao numbers in (1):
5 Discussion
5.1 Variation and language change
The results of the acoustic study point to a change in how some Korean
speakers signal differences between lax and aspirated stops. The clearest
indicator of such a change is the presence of age-related shifts in VOT
values: as shown in Figs 1 and 2, older speakers are more likely to main-
tain clear VOT distinctions between lax and aspirated stops, while
younger speakers tend to minimise (or neutralise) such differences. The
non-linear pattern suggested by the data in Fig. 2 is arguably the first and
second components of a so-called s- (or z-) shaped curve, a pattern widely
documented in the sociolinguistic literature as representative of a language
change in apparent time (Labov 1972, 1994, Bailey et al. 1991, inter alia).
As Guy writes, ‘although such data actually constitute a synchronic
snapshot of a single point in time, the progress of the change is reflected in
the differential use by age’ (2003: 384–385). In the specific case of the
VOT data under analysis here, we can see the shift from a wide, statistically
significant differentiation between the VOTs of lax and aspirated stops
among older speakers, via a narrower (but still statistically significant)
differentiation of the same among middle-aged speakers, to the apparent
neutralisation of the VOT distinction for some of the youngest members
of the subject pool. As noted above, this younger cohort of speakers does
not appear to present a consistent set of behaviours, with some subjects
producing statistically significant differences in VOT for lax vs. aspirated
word-initial stops and other subjects appearing to neutralise VOT differ-
ences for these two phoneme types. How this situation plays out for a
larger pool of increasingly younger subjects remains to be documented.
The distribution of ΔVOT in Fig. 2, coupled with the fact that the best-fit
curve is a quadratic (and not a linear) relationship, provides support for
the claim that the sound change evidenced here has recently concluded. As
such, it is predicted that VOT data collected from speakers born after
1982 will yield an extension of points hugging the X-axis of the graph
in Fig. 2 (i.e. ΔVOT=0), thereby providing a more clearly discernible
(reversed) s-shaped curve.
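The comparison between a linear and a quadratic fit of ΔVOT against year of birth can be sketched with ordinary least squares. The implementation below (normal equations solved by Gaussian elimination, with year of birth centred on its mean) is one convenient way to obtain R² without external libraries; it is not necessarily the procedure used in the study, and the data points are invented.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    n = degree + 1
    # Augmented matrix [X^T X | X^T y] for the polynomial design matrix X.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (a[i][n] - tail) / a[i][i]
    return coeffs

def r_squared(xs, ys, degree):
    """Coefficient of determination for a polynomial fit of the given degree."""
    mean_x = sum(xs) / len(xs)
    centred = [x - mean_x for x in xs]  # centring keeps the system well conditioned
    coeffs = polyfit(centred, ys, degree)
    fitted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in centred]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Invented (year of birth, delta-VOT in ms) pairs, for illustration only.
years = [1943, 1950, 1955, 1960, 1965, 1970, 1975, 1980]
delta_vot_ms = [48, 46, 45, 40, 25, 10, 5, 2]
r2_linear = r_squared(years, delta_vot_ms, 1)
r2_quadratic = r_squared(years, delta_vot_ms, 2)
```

Because the quadratic model contains the linear model as a special case, its R² can never be lower; the size of the gain, rather than its mere presence, is what bears on the claim of non-linearity.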
Evidence further supporting the existence of a recent change lies in
the quasi-longitudinal data brought to bear in Silva (2002). Here we find
that speakers in their twenties during the 1960s and 1970s produced
‘traditional’ VOT patterns, i.e. clear differences in the mean VOTs of lax
vs. aspirated stops. Moreover, as Han & Weitzman report in their 1970
perception experiment, systematically manipulated differences in VOT
yielded responses that reflected a divide between the phonemic categories
of lax (shorter VOTs) and aspirated (longer VOTs). In contrast, speakers
in their twenties some 40 years later produce more ‘innovative’ VOT
patterns, i.e. ΔVOT < 20 ms, often approximating 0. Were the data from
these two time periods representative of some other sociolinguistic
phenomenon (e.g. age-grading), there should have been more consistency
in the behaviour of subjects of similar ages. This is not the case.
5.2 Phonological implications
Despite this change in VOT patterns, the underlying contrast between lax
and aspirated stops in Korean is still preserved in the speech of younger
speakers, apparently manifested in terms of differentiated F0 patterns: lax
stops are associated with a melody that begins with a low pitch and then
rises through the target word, while aspirated stops are associated with a
melody that begins with a relatively higher pitch followed by a falling F0.
Although phonation-associated F0 behaviours are not new to Korean
(Han & Weitzman 1965, Silva 1998), they have typically been relegated
to secondary status: while VOT values were taken as primary markers
of phonation type, F0 values were viewed as redundant.
Long-distance tonal associations correlated with lax vs. non-lax phonation
types have been documented in other varieties of Korean (e.g. Jun 1993);
but in these cases, F0 melodies have coexisted with robust VOT differences
(Kim 2000). The current study, however, presents a new situation:
for younger speakers of the standard variety, VOT differences have been
neutralised and F0 appears to have assumed a primary role in marking the
lax vs. aspirated distinction.
[Footnote: Systematic consonant effects on the F0 trajectory of /a/ following
word-initial target stops could not be discerned in this corpus, either within
or across subjects. This finding is consistent with Han & Weitzman’s assessment
of the issue: ‘there is mutual overlapping between the onset values of
fundamental frequency following all three stops. This extreme overlapping
suggests that the onset value of fundamental frequency cannot be too
significant a cue in the distinction of stop consonants’ (1967: 22). It is
surprising, however, given the local phonation effects reported in Silva
(1998). This lack of local effects neither undermines nor diminishes the more
global melodic patterns reported here.]
These acoustic data corroborate a perception study by Kim et al. (2002),
who claim that their native Korean listeners made a critical distinction
in how they processed vocalic information edited from source stimuli
originally drawn from CV sequences containing lax, tense and aspirated
onsets. When presented with only the vowel portions of the source syllables,
subjects were consistently able to identify those source syllables
beginning with lax onsets: ‘for both vowel-only stimuli and cross-spliced
stimuli with conflicting consonant and vowel portions, listeners heard lax
initial stops if and only if the vowel had an L tone’ (2002: 97). Vowel
portions from syllables with tense and aspirated onsets, however, were not
so readily differentiated. Indeed, vowels drawn from syllables with both
tense and aspirated onsets were largely identified as coming from tense
stops; F0 cues were not sufficiently robust differentiators between the two
stop types. It was only when significant periods of aspiration noise were
included in the stimuli that subjects included aspirated onsets in their
reactions. Thus we find perceptual evidence in support of the acoustic
study reported here: that underlying phonation types for consonants are
correlated with the fundamental frequency characteristics of the immediately
following vowel. The present study further generalises this claim by
arguing that these C-to-V correlations are not only local, but may be broader
in scope. Given that the presence of a word-initial lax stop yields a
trisyllabic tonal melody significantly lower than that associated with forms
beginning with either an aspirated or tense stop, we have reason to believe
that among these speakers of Korean, word-initial consonantal effects are
not restricted to the initial syllable, but extend into the higher prosodic
domain of the phonological word.
This shift in the relative weighting of phonetic events from the CV
transition (most critically, the post-release aspirated region) to the F0 of
the following vowel(s) provides evidence that standard Korean has undergone
a sound change analogous to that reported for various languages,
including Vietnamese (Haudricourt 1954, Thurgood 2002, among others),
Zaiwa (Wannemacher 1996) and Tibetan (Duanmu 1992). In Tibetan, for
example, historical differences between aspirated and unaspirated stops
have been neutralised and replaced by differences in tone in the contem-
porary Lhasa variety.
khó (H level)
kh† (L rising)
Svantesson & House (2006) make a similar case in their comparison of
two varieties of Kammu (Laos). Where Eastern Kammu presents a voiced
vs. voiceless contrast (e.g. buuc ‘wine’ vs. puuc ‘to undress’), Northern
Kammu manifests a corresponding L vs. H contrast, with no voicing
distinction (pùuc ‘wine’ vs. púuc ‘to undress’). Svantesson & House argue
that Northern Kammu is not truly tonal, despite the phonetic evidence.
Rather, they argue that the Eastern and Northern varieties share a com-
mon set of underlying representations, which are phonetically specified in
dialect-specific ways: speakers of Eastern Kammu realise the underlying
voicing contrast in initial consonants as such phonetically, while speakers
of Northern Kammu realise the same underlying contrast on the syllable
rhyme (i.e. higher vs. lower F0).
An analogous set of arguments has, in fact, been advanced for Korean
by Kim & Duanmu (2004). In their account, underlying lax stops are
argued to bear the features of a voiced obstruent ([+voiced, −aspirated]),
while the tense series is treated simply as voiceless ([−voiced, −aspirated])
and the aspirated series is assigned the features [−voiced, +aspirated].
Moreover, they write that ‘in this analysis, the main phonetic difference
in words pairs such as [t*al] ‘daughter’ and [tal] ‘moon’ [with tense and
lax initial stops, respectively] does not lie in the stops themselves but
in the tone of the vowel ([tál] vs. [dàl/tàl])’ (2004: 96). In phrase-initial
position, lax stops are devoiced, thereby leaving only tonal information
to differentiate them from their tense counterparts. Under this analysis,
one further presumes that in word-initial position, the lax~aspirated
contrast is maintained by the opposition between [−aspirated] and
[+aspirated].
Aside from the critical insight that phonation and tone should be phono-
logically integrated in Korean, Kim & Duanmu’s accounting evokes both
empirical and theoretical concerns. Empirically, their analysis provides no
clear accounting for the phonetic facts presented in §4 above: it is not clear
how they would capture the innovation manifested by younger speakers,
whereby lax and aspirated stops are no longer distinguished by differential
VOT values. Theoretically, their account raises two questions of considerable
import. First, it renders the tense series as the least marked in
the phonological system, as these stops are reanalysed as voiceless unaspirated
segments. Second, it suggests that in Korean, intervocalic position is the
primary locus for faithfulness (at least with respect to the feature
[+voiced]), an implication at odds with widely accepted accounts that give
primacy to domain-initial positions and prosodic heads in assessing fea-
tural faithfulness (Beckman 1998).
5.3 Toward a revised model of Korean stop phonology
In this section, we propose a theoretical explanation that accounts for
the ‘traditional’ and ‘innovative’ patterns reported above without
the need to introduce radical changes to our existing understanding of the
Korean feature inventory. The analysis advanced draws substantially
from that proposed by Ahn & Iverson (2004), which was predicated on a
fully ‘traditional’ understanding of the phonetic facts.
We begin by returning to the Tibetan and Kammu situations, noting
that when they are compared to more traditional treatments of Korean, a
key difference emerges: in Korean, there is no underlying voicing contrast
in play. As would be predicted, both Tibetan and Kammu voiced stops are
associated with a low tone on the following vowel, a relatively common
‘depressor effect’ triggered by the voicing, high sonorance or nasality of
an immediately preceding syllable onset (Goldsmith 1990, Koehler 1995).
In Korean, however, standard accounts of the consonantal system deny a
contrastive role for the feature [voiced], under the general redundancy
statement that links sonorance with voicing: [αsonorant] → [αvoiced] (see
Kim 1965, Kim-Renaud 1974, Lee & Ramsey 2000, Sohn 1999). This is
no problem, however, as support for an aspirated segment taking pre-
cedence over a corresponding unaspirated segment when it comes to tone
raising (the issue of voicing per se aside) can be found as far back as the
1970s, when researchers such as Hyman & Schuh (1974) advanced
the following hierarchy of consonant-induced tonal effects (reported in
Lee 1978: 218):
(3) implosive                (tone raising)
    voiceless aspirated
    voiceless unaspirated
    voiced obstruent
    breathy voiced           (tone lowering)
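One way to read the hierarchy in (3) is as an ordered scale, sketched below. The list order is taken directly from (3); the names `TONE_EFFECT_HIERARCHY` and `raises_more` are hypothetical and serve only to make the ordering explicit.

```python
# Order taken from (3): index 0 is the most tone-raising consonant class,
# the last index the most tone-lowering.
TONE_EFFECT_HIERARCHY = [
    "implosive",
    "voiceless aspirated",
    "voiceless unaspirated",
    "voiced obstruent",
    "breathy voiced",
]

def raises_more(a, b):
    """True if consonant class `a` sits closer to the tone-raising end than `b`."""
    return TONE_EFFECT_HIERARCHY.index(a) < TONE_EFFECT_HIERARCHY.index(b)

print(raises_more("voiceless aspirated", "voiceless unaspirated"))  # True
```

The comparison of the two voiceless classes is the one that matters for Korean, since neither involves phonemic voicing.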
With the hierarchy in (3) as background, the absence of phonemic voicing
in Korean poses little theoretical problem for the analysis advanced here,
which rests primarily on the fact that in most languages (including
Korean) voiceless unaspirated segments are the least marked elements in
the consonant inventory. Interpreting markedness in a structural sense,
the current account treats the presence of a privative feature [spread
glottis] (or an analogous feature marking aspiration) as the driving force
behind the shift from aspiration to tone to mark the underlying lax vs.
aspirated distinction. More specifically, the ‘innovators’ in the subject
pool employ a redundancy rule whereby a laryngeal node dominating any
content is interpreted not on the segmental plane but tonally, formalised
in (4) (inspired by Yip 1995).
(4) [Diagram not recoverable from the source; surviving labels: X, Glottal Aperture]
Under the analysis in (4), aspirated stops, by virtue of their underlying
marking for [spread glottis], acquire a tonal value of H; similarly, tense
stops, with their surface-level marking for [constricted glottis], likewise
acquire an H tone.
In contrast, lax stops do not receive any tonal marking,
leaving them to be realised by a default L tone. Appealing to a default L is
consistent with tonal analyses of other languages, e.g. Geman, Haya,
Kimatuumbi, Luganda and Somali (X-Tone 2005), as well as Lee’s (1987:
104) account of Gyeongsang Korean (but may be at odds with Kim’s 1997
account of the Northern Gyeongsang variety).
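The tonal assignment described for (4) can be sketched as a simple function, assuming that each word-initial stop is represented by a record stating whether its laryngeal node has any content. The class `Stop` and the function `initial_tone` are hypothetical names introduced for illustration; they are not part of the formal analysis.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Stop:
    """Hypothetical record for a word-initial stop.
    `laryngeal` is None when the laryngeal node has no content (lax stops)."""
    label: str
    laryngeal: Optional[str]

def initial_tone(stop: Stop) -> str:
    """Any laryngeal content is interpreted tonally as H; otherwise default L applies."""
    return "H" if stop.laryngeal is not None else "L"

stops = [
    Stop("lax", None),
    Stop("aspirated", "spread glottis"),
    Stop("tense", "constricted glottis"),
]
for s in stops:
    print(s.label, initial_tone(s))
```

Under these assumptions the lax stop is the only one to surface with L, which matches the two-way tonal split described above.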
What, then, might motivate this association of an H tone with a non-null
Laryngeal node? To address this question, let us turn to the theory of
glottal features originally advocated in Halle & Stevens (1971). We begin
by assuming that underlying lax stops are represented by a single laryn-
geally unspecified C. Additionally, phonemic tense stops are represented
as geminated Cs (linked to a single syllable node), which are subsequently
modified by the introduction of a phonetic-level laryngeal specification
[constricted glottis] (Han 1996: 191, Silva 1992: 63–67). In the current
account, however, the element interpolated to the C-C structure is a
more abstract ‘tensity feature’, privative [stiff] (Halle & Stevens 1971, Bao
1990). Here we adopt Kim’s (1965) perspective on the notion that tense
and aspirated stops do, in fact, form a natural class of tense (or fortis)
segments, a view later adopted by Kim-Renaud in her underlying feature
specification of tense and aspirated phonemes as [+tense] (1974: 5).
Finally, under this account, aspirated segments are laryngeally marked in
underlying representation, the relevant feature again being privative
[stiff]; they are the only segments that bear any laryngeal specification
in the lexicon.
As consonantal representations are interpreted by the phonetic com-
ponent of the grammar, C nodes marked by the tensity feature (i.e. tense
and aspirated) are realised with H tone, the primary reflex of the stiff
vocal folds. At the same time, any non-sonorant singleton consonant
(i.e. non-geminate and aspirated) in phrase-initial position is realised with
aspiration, indicated by the insertion of the feature [spread]. Such an
account, with its explicit reference to prosodic constituency, is a reflection
of what Kim-Renaud refers to as a ‘boundary-sensitive strengthening
phenomen[on]’ (1974: 3).
[Footnote: This account operates independently of any assumptions regarding the
underlying status of the tense stops in Korean, be they phonemically geminates
(which ultimately receive a language-specific marking for [constricted glottis])
or singletons (marked for [constricted glottis] from the outset).]
(5) [Diagram not fully recoverable from the source: representations of a. lax, b. aspirated and c. tense stops, built from X slots, [+cons, −son] root nodes and the feature [stiff]]
Under this analysis, the degree of aspiration associated with these phrase-
initial singleton Cs is assumed to be consistent, regardless of the con-
sonant’s marking for the tensity feature: [stiff] and [spread] is no more
aspirated than just [spread]. What is different about [stiff] and [spread] is
the fact that the stiffness leads to a higher F0 on the following vowel than a
structure marked by [spread] alone. This transfer of laryngeal character-
istics from C to V is formally accomplished by following Ahn & Iverson
(2004), who invoke the principle of ‘bipositionality’:
(6) [Diagram not recoverable from the source; surviving labels: H, [son]]
Applying the principle in (6) to the structures in (5), we can begin to
account for the tonal patterns schematised in (1). What remains to be
accounted for is the apparent long-distance nature of the C-to-V spreading.
It is suggested here that no further phonological spreading is required to
account for the observed tonal melodies; rather, phonetic interpolation
working across a higher prosodic domain, such as the minor phrase (Silva
1992) or the accentual phrase (Jun 1993), gives rise to the observed up-
ward trajectory for the lax/L items and the downward trajectory for the
non-lax/H items.
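The interpolation idea can be sketched as follows. The F0 values are invented for illustration, and the assumption that both melodies head towards a single later phrase-level target is a simplification made here, not a claim of the analysis; `interpolate_f0` is a hypothetical helper.

```python
def interpolate_f0(start_hz, end_hz, n_syllables=3):
    """Linearly interpolate per-syllable F0 targets from the anchored initial
    tone to a later phrase-level target."""
    if n_syllables < 2:
        return [float(start_hz)]
    step = (end_hz - start_hz) / (n_syllables - 1)
    return [start_hz + i * step for i in range(n_syllables)]

# Hypothetical targets in Hz, chosen only to show the direction of movement.
L_TARGET, H_TARGET, PHRASE_TARGET = 180.0, 240.0, 210.0

lax_contour = interpolate_f0(L_TARGET, PHRASE_TARGET)      # rises from L
non_lax_contour = interpolate_f0(H_TARGET, PHRASE_TARGET)  # falls from H
print(lax_contour)      # [180.0, 195.0, 210.0]
print(non_lax_contour)  # [240.0, 225.0, 210.0]
```

Only the initial syllable carries a specified tone in this sketch; the remaining syllables receive their values from interpolation alone, so no further phonological spreading is needed to obtain a rising or falling trajectory.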
When compared to more traditional accounts of the lax~aspirated~
tense contrast in Korean, the analysis above differs in several ways. First,
it suggests that for Korean language innovators, the aspiration feature
(here [spread]) no longer functions phonemically; rather, it is now a prosodically
conditioned redundant property. More specifically, the insertion
of [spread] occurs only in phrase-initial position, an example of a prosodi-
cally driven strengthening process. In word-internal intervocalic position,
by contrast, lax stops remain unmarked for laryngeal features, thereby
allowing for the phonetic interpolation of voicing (Silva 1992: 142).
Aspirated stops in word-internal position also fail to acquire any new
features; their underlying marking for [stiff] prevents intervocalic voicing
(be the process a matter of phonological feature spreading or one of pho-
netic interpolation), leaving these segments voiceless in this position. The
extent to which the presence of a word-internal instance of [stiff] yields a
raised F0 on the following vowel is a matter left for future research.
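The positional pattern just described can be summarised in a short sketch. It assumes three inputs per stop: whether it bears [stiff], whether it is a geminate (the tense series under the geminate analysis), and its position. The function `realise` and the position labels are hypothetical; the sketch covers only the two positions discussed here.

```python
def realise(stiff: bool, geminate: bool, position: str) -> set:
    """Surface laryngeal properties of a stop under the analysis sketched above.
    `position` is "phrase-initial" or "intervocalic" (word-internal)."""
    features = set()
    if stiff:
        features.add("stiff")   # underlying for aspirated stops, supplied later for tense stops
    if position == "phrase-initial" and not geminate:
        features.add("spread")  # prosodically conditioned aspiration of singletons
    if position == "intervocalic" and not stiff:
        features.add("voiced")  # phonetic voicing of laryngeally unmarked lax stops
    return features

LAX = dict(stiff=False, geminate=False)
ASPIRATED = dict(stiff=True, geminate=False)
TENSE = dict(stiff=True, geminate=True)

for name, spec in [("lax", LAX), ("aspirated", ASPIRATED), ("tense", TENSE)]:
    for position in ("phrase-initial", "intervocalic"):
        print(name, position, sorted(realise(position=position, **spec)))
```

In this sketch, phrase-initial lax and aspirated stops both receive [spread] and differ only in [stiff], which is the property that the analysis ties to a higher F0.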
Second, the current analysis assumes that the spreading effects attrib-
uted to bipositionality can be – if not must be – more than simply local. In
the spirit of Pierrehumbert & Beckman (1988), we argue that this phonologically
spread instantiation of [stiff] functions as an anchor for subsequent
F0 interpolation across tonally unspecified syllables within a
larger prosodic unit (most likely a phonological word, but perhaps even
a phrasal constituent), thereby yielding the F0 patterns displayed in Fig. 4.
Further consideration of [stiff] as a more general manifestation of ‘tonal
prominence’ will likely dictate whether the developing tonal contrast
might eventually be characterised as a pitch-accent system. Such an ac-
count would certainly place contemporary Seoul Korean in a familiar
context, given the existence of other pitch-accented dialects. All the same,
it would be foolhardy to assume that the phenomenon under discussion
here is unequivocally pitch-accentual, solely on the basis of what has been
claimed in other varieties of the language. In the absence of the necessary
corroborating data, no position is taken on the matter.
Finally, this account takes the explicit position that the relationship
between phonation type and the fundamental frequency of the following
vowel(s) is by-and-large ‘automatic’, as opposed to ‘controlled’. In con-
trast to the position put forth by Kingston & Diehl (1994: 423), the cur-
rent analysis claims that the phonological feature [stiff] actually serves to
predict phonetic behaviour, as opposed to simply limiting it. Moreover, it
is implied that this relationship predates the onset of the sound change
whereby VOT differences between lax and aspirated stops have become
neutralised, suggesting further that the relevant relationship is not one between
glottal aperture and F0 (as argued by Kingston & Diehl, who focus their
discussion on the feature [voice]), but rather one between glottal tension
and F0.
6 Conclusions
As the data from this study indicate, the obstruent system of Korean appears
to have undergone a process of sound change: while the underlying
tripartite distinction among tense, lax and aspirated stops persists, the
basic phonetic manifestation of the latter two stop types has shifted from
one of a clear voice onset time distinction to one whereby fundamental
frequency plays the primary role. While other researchers have docu-
mented this sort of relationship between phonation type and F0 in speech
of contemporary Korean speakers (Jun 1993, Kim 2000, Kim & Park
2001), the current study makes clear the diachronic status of this situation,
namely, that the change appears to be in its final stages, if not actually com-
pleted. This type of shift is much like the voicing-to-tone changes attested
in other languages (such as Tibetan), thereby providing further support
for a phonological theory that allows for unified functionality of
the laryngeal mechanism, a single set of features that account for both
tonal and phonation events; the relevant feature here is [stiff], analogous
to Kim’s (1965) ‘tensity’ feature. The acoustic results of the current
research further corroborate perceptual experiments conducted by Kim
et al. (2002), by supplementing their observations and analysis of related
data with a larger-scale, age-differentiated acoustic study. Indeed, the
Kim et al. methodology merits replication with a larger pool of listeners,
one that includes subjects older than the twelve 26–32-year-old ‘phonetically
untrained native speakers of the Seoul dialect of Korean’ (2002:
84). If the acoustic data reported here have any bearing on the outcome of
such a perception experiment, we might expect that the responses of
older speakers (i.e. those born before 1965) would differ from those of their
younger counterparts, with older speakers perhaps less likely to consistently
differentiate among all three stop types solely on the basis of F0 infor-
mation. In addition, adding sonorant- and zero-initial three-syllable
words to the corpus (e.g. mapali ‘packhorse’, nameci ‘remainder’, ladio
‘radio’, apeji ‘father’) will reveal the extent to which sonorance and/or
voicing influence the tonal melody of the entire word.
Finally, this study suggests that standard Korean is coming into align-
ment with other varieties of the language (most of which employ either
lexical or phrasal pitch accent), as well as with other East Asian languages,
which use F0 in phonemically relevant ways.
References
Ahn, Sang-Cheol & Gregory K. Iverson (2004). Dimensions in Korean laryngeal
phonology. Journal of East Asian Linguistics 13. 345–379.
Bailey, Guy, Tom Wikle, Jan Tillery & Lori Sand (1991). The apparent time con-
struct. Language Variation and Change 3. 241–264.
Bao, Zhiming (1990). On the nature of tone. PhD dissertation, MIT.
Beckman, Jill (1998). Positional faithfulness. PhD dissertation, University of
Massachusetts, Amherst.
Cho, Taehong (1996). Vowel correlates to consonant phonation: an acoustic-perceptual
study of Korean obstruents. MA thesis, University of Texas at Arlington.
Choi, Hansook (2002). Acoustic cues for the Korean stop contrast: dialectal variation.
ZAS Papers in Linguistics 28. 1–12.
Duanmu, San (1992). An autosegmental analysis of tone in four Tibetan languages.
Linguistics of the Tibeto-Burman Area 15. 65–91.
Flege, James E., Murray J. Munro & Ian R. A. MacKay (1995). Effects of age of
second-language learning on the production of English consonants. Speech
Communication 16. 1–26.
Flege, James E., Carlo Schirru & Ian R. A. MacKay (2003). Interaction between the
native and second language phonemic subsystems. Speech Communication 40.
Flege, James E., Grace H. Yeni-Komshian & Serena Liu (1999). Age constraints on
second-language acquisition. Journal of Memory and Language 41. 78–104.
Goldsmith, John A. (1990). Autosegmental and metrical phonology. Oxford &
Cambridge, Mass.: Blackwell.
Guy, Gregory R. (2003). Variationist approaches to phonological change. In Brian D.
Joseph & Richard D. Janda (eds.). The handbook of historical linguistics. Malden,
Mass. & Oxford: Blackwell. 369–400.
Halle, Morris & Kenneth N. Stevens (1971). A note on laryngeal features. MIT
Quarterly Progress Report 101. 198–212.
Han, Jeong-Im (1996). The phonetics and phonology of tense and plain consonants in
Korean. PhD dissertation, Cornell University. Distributed 1996, Ithaca: CLC
Han, Mieko S. & Raymond S. Weitzman (1965). Studies in the phonology of Asian
languages III: acoustic characteristics of Korean stop consonants. Los Angeles:
Acoustic Phonetics Research Laboratory, University of Southern California.
Han, Mieko S. & Raymond S. Weitzman (1967). Studies in the phonology of Asian
languages V: acoustic features in the manner-differentiation of Korean stop consonants.
Los Angeles: Acoustic Phonetics Research Laboratory, University of Southern
Han, Mieko S. & Raymond S. Weitzman (1970). Acoustic features of Korean /P, T, K/,
/p, t, k/ and /ph, th, kh/. Phonetica 22. 112–128.
Hardcastle, W. J. (1973). Some observations on the tense–lax distinction in initial stops
in Korean. JPh 1. 263–272.
Haudricourt, André-Georges (1954). De l’origine des tons en vietnamien. Journal
Asiatique 242. 69–82.
Hyman, Larry M. & Russell G. Schuh (1974). Universals of tone rules: evidence from
West Africa. LI 5. 81–115.
Jun, Sun-Ah (1993). The phonetics and phonology of Korean prosody. PhD dissertation,
Ohio State University.
Kagaya, Ryohei (1974). A fiberscopic and acoustic study of the Korean stops, affricates
and fricatives. JPh 2. 161–180.
Kim, Chin-W. (1965). On the autonomy of the tensity feature in stop classification
(with special reference to Korean stops). Word 21. 339–359.
Kim, Mi-Ryoung (2000). An alternative account of so-called lax consonants with
special reference to consonant-tone interaction. Studies in Phonetics, Phonology, and
Morphology 6. 333–352.
Kim, Mi-Ryoung, Patrice Speeter Beddor & Julie Horrocks (2002). The contribution
of consonant and vocalic information to the perception of Korean initial stops.
JPh 30. 77–100.
Kim, Mi-Ryoung & San Duanmu (2004). ‘Tense’ and ‘lax’ stops in Korean. Journal
of East Asian Linguistics 13. 59–104.
Kim, Mi-Ryoung & Kyung-Ja Park (2001). A Korean consonant-tone transfer in L2
(English) acquisition. Journal of the Pan-Pacific Association of Applied Linguistics 5.
Kim, No-Ju (1997). The optimality theoretic account of tones, segments, and their inter-
action in North Kyungsang Korean. PhD dissertation, Ohio State University.
Kim, Youngho (1995). Acoustic characteristics of Korean coronal stops, affricates and
fricatives. MA thesis, University of Texas at Arlington.
Kim-Renaud, Young-Key (1974). Korean consonantal phonology. PhD dissertation,
University of Hawaii.
Kingston, John & Randy L. Diehl (1994). Phonetic knowledge. Lg 70. 419–454.
Koehler, Loren S. (1995). An underspecification approach to Budu vowel harmony. MA
thesis, University of Texas at Arlington.
Labov, William (1972). Sociolinguistic patterns. Philadelphia: University of
Pennsylvania Press.
Labov, William (1994). Principles of linguistic change: internal factors. Oxford &
Cambridge, Mass.: Blackwell.
Ladefoged, Peter (2003). Phonetic data analysis: an introduction to fieldwork and
instrumental techniques. Malden, Mass. & Oxford: Blackwell.
Laver, John (1994). Principles of phonetics. Cambridge: Cambridge University Press.
Lee, Iksop & S. Robert Ramsey (2000). The Korean language. Albany: State
University of New York Press.
Lee, Sang Do (1987). A study of tone in Korean dialects. PhD dissertation, Georgetown
Lee, Sang Oak (1978). Middle Korean tonology. PhD dissertation, University of
Illinois at Urbana-Champaign. Published 1979, Seoul: Hanshin.
Lisker, Leigh & Arthur S. Abramson (1964). A cross-language study of voicing in
initial stops: acoustical measurements. Word 20. 384–422.
Martin, Samuel E. (1992). A reference grammar of Korean. Rutland: Charles E. Tuttle
Publishing Co.
Pierrehumbert, Janet B. & Mary E. Beckman (1988). Japanese tone structure.
Cambridge, Mass.: MIT Press.
Silva, David J. (1989). A prosody-based investigation into the phonetics of Korean
stop voicing. Harvard Studies in Korean Linguistics 4. 181–195.
Silva, David J. (1992). The phonetics and phonology of stop lenition in Korean. PhD
dissertation, Cornell University. Distributed 1992, Ithaca: CLC Publications.
Silva, David J. (1998). The effects of prosodic structure and consonant phonation on
vowel F0 in Korean: an examination of bilabial stops. In J. R. P. King & S. Robert
Ramsey (eds.) Progress in Korean linguistics. Ithaca: Cornell University Press. 1–23.
Silva, David J. (2002). Consonant aspiration in Korean: a retrospective. In Sang-Oak
Lee & Gregory K. Iverson (eds.) Pathways into Korean language and culture: essays
in honor of Young-Key Kim-Renaud. Revised edn. Seoul: Pagijong Press. 447–469.
Silva, David J. (2006). Variation in voice onset time for Korean stops: a case for recent
sound change. Korean Linguistics 13. 1–16.
Silva, David J., Younjeoung Choi & Ji Eun Kim (2004). Diachronic shift in VOT
values for Korean stop consonants. Harvard Studies in Korean Linguistics 10.
Sohn, Ho-Min (1999). The Korean language. Cambridge: Cambridge University
Svantesson, Jan-Olof & David House (2006). Tone production, tone perception and
Kammu tonogenesis. Phonology 23. 309–333 (this issue).
Thurgood, Graham (2002). Vietnamese and tonogenesis: revising the model and the
analysis. Diachronica 19. 333–363.
Wannemacher, Mark W. (1996). Aspects of Zaiwa prosody: an autosegmental account.
MA thesis, University of Texas at Arlington.
X-Tone: Cross-linguistic tonal database (2005). Available (July 2006) at http://xtone.
Yip, Moira (1995). Tone in East Asian languages. In John A. Goldsmith (ed.)
The handbook of phonological theory. Cambridge, Mass. & Oxford: Blackwell.
Yu-Cho, Young Mee (1990). Syntax and phrasing in Korean. In Sharon Inkelas &
Draga Zec (eds.) The phonology–syntax connection. Chicago: University of Chicago
Press. 47–62.
308 David J. Silva
... The original VOT difference between these stops is currently being lost, leading to an ongoing quasi-tonogenetic sound change in which the vowel following lenis stops takes over a lower f0. As in Afrikaans, this f0 difference is not limited to vowel onset (Silva, 2006;Kang, 2014;Bang et al., 2018). ...
... Registrogenesis is a typical case of transphonologization, as secondary phonetic properties (f0, vowel quality, phonation type) take over the distinctive function of the primary cue (closure voicing) to maintain phonological contrast. It is more complex than the transphonologization of VOT/voicing into f0 contrasts attested in several genetically unrelated languages of the world such as Kammu (Svantesson & House, 2006), Malagasy (Howe, 2017), Afrikaans , and Korean (Silva, 2006;Kang & Han, 2013;Bang et al., 2018), because it involves more cues. Once formed, register is also subject to cue shifts as the weights of its various properties and how they are combined evolve and vary across 2 Austroasiatic used to be divided into two large groups, Mon-Khmer and Munda. ...
... Is production or perception normally leading the way? Some studies found that secondary production properties can be exaggerated and phonologized while the original primary cue is still present in production: this is the case with the phonologization of the secondary f0 associated to VOT contrasts in Seoul Korean (Silva, 2006;Kang & Han, 2013), Malagasy (Howe, 2017), and Afrikaans . Other studies found that perception leads change, meaning that listeners shift their attention to innovative cues before increasing their weight in production, as in southern British /u/-fronting (Harrington, Kleber, & Reubold, 2008;Harrington, 2012), Dutch obstruent devoicing (Pinget, 2015;Pinget, Kager, & Van de Velde, 2016, Southern Yi restructuring of register into vowel contrast (Kuang & Cui, 2018), and Korean tonogenesis in Hunchun and Dangdong dialects (Schertz, Kang, & Han, 2019). ...
Chrau, a South Bahnaric language of the Austroasiatic family spoken in South Vietnam, has been described as having a voicing contrast in onset stops. However, a production experiment reveals that rather than a voicing contrast, Chrau, like many other Austroasiatic languages, has register, a two-way contrast realized on syllables through a bundle of phonetic properties including phonation type, vowel quality, and pitch differences. Stop voicing is marginally present in some speakers but seems to be an optional property of register. The results of a perception study on register further suggest that speakers roughly employ the same phonetic properties in perception as in production. Individual variation is observed in both production and perception, but there is not a straightforward correlation between the two modes at the individual level—an indication that listeners’ perception is flexible enough to accommodate variation in production. Our results raise questions about the diachronic scenarios proposed to account for the transphonologization of onset voicing into register and tone in Mainland Southeast Asian languages.
... Prior studies have described Korean as having a short Voice Onset Time (VOT) and high F0 for fortis stops, an intermediate VOT and low F0 for lenis stops, and a long VOT and high F0 for aspirated stops in word-initial position (e.g., Lisker and Abramson, 1964;Cho, 1996). It has also been documented that, in Seoul Korean, the VOT of lenis and aspirated stops has gradually merged over time, with the contrast now depending on the F0 of the following vowel (e.g., Silva, 2006;Kang and Guion, 2008;Kang, 2014). The realization of stops in Seoul Korean is dependent on the prosodic position in which these stops occur, such as the Accentual Phrase (Silva, 2006). ...
... It has also been documented that, in Seoul Korean, the VOT of lenis and aspirated stops has gradually merged over time, with the contrast now depending on the F0 of the following vowel (e.g., Silva, 2006;Kang and Guion, 2008;Kang, 2014). The realization of stops in Seoul Korean is dependent on the prosodic position in which these stops occur, such as the Accentual Phrase (Silva, 2006). More specifically, in trisyllabic Korean words, a low F0 (L) and upward F0 trajectory are observed if the word-initial segment is a lenis stop, and a high F0 (H) and downward F0 trajectory is observed if the initial segment is a non-lenis stop (i.e., fortis and aspirated stops). ...
... More specifically, in trisyllabic Korean words, a low F0 (L) and upward F0 trajectory are observed if the word-initial segment is a lenis stop, and a high F0 (H) and downward F0 trajectory is observed if the initial segment is a non-lenis stop (i.e., fortis and aspirated stops). In other words, the consonant-induced F0 distinction in Korean extends far beyond the initial portion of the immediately following vowel (Jun, 1996;Silva, 2006). Korean listeners have also been found to use F0 cues in the perception of stop contrasts: Lee et al. (2013) and Schertz et al. (2015) demonstrated that Seoul Korean listeners used F0 as a primary cue and VOT as a secondary cue to perceive the lenis-aspirated stop contrast and both F0 and VOT as primary cues to perceive the fortis-lenis stop contrast. ...
This study examines whether second language (L2) learners' processing of an intonationally cued lexical contrast is facilitated when intonational cues signal a segmental contrast in the native language (L1). It does so by investigating Seoul Korean and French listeners' processing of intonationally cued lexical-stress contrasts in English. Neither Seoul Korean nor French has lexical stress; instead, the two languages have similar intonational systems where prominence is realized at the level of the Accentual Phrase. A critical difference between the two systems is that French has only one tonal pattern underlying the realization of the Accentual Phrase, whereas Korean has two underlying tonal patterns that depend on the laryngeal feature of the phrase-initial segment. The L and H tonal cues thus serve to distinguish segments at the lexical level in Korean but not in French; Seoul Korean listeners are therefore hypothesized to outperform French listeners when processing English lexical stress realized with only tonal cues (H* on the stressed syllable). Seoul Korean and French listeners completed a sequence-recall task with four-item sequences of English words that differed in intonationally cued lexical stress (experimental condition) or in word-initial segment (control condition). The results showed higher accuracy for Seoul Korean listeners than for French listeners only when processing English lexical stress, suggesting that the processing of an intonationally cued lexical contrast in the L2 is facilitated when intonational cues signal a segmental contrast in the L1. These results are interpreted within the scope of the cue-based transfer approach to L2 prosodic processing.
... The investigation of sound change is naturally focused on differences between speakers of different ages. For example, earlier studies that offered evidence of sound change in the Korean stop laryngeal contrast, such as Silva (2006) and Kang and Guion (2008), did so by showing that younger speakers were producing the contrast with phonetic cue weightings different from older speakers. But the spread and adoption of sound change is conditioned on a range of factors, one of which is the specific community that the speaker is a part of: who are the speaker's peers, and what social or institutional pressures might influence the speaker's adoption of an innovative variant? ...
... Korean has a three-way laryngeal contrast among voiceless stops, namely, fortis (/p', t', k'/), lenis (/p, t, k/), and aspirated stops (/pʰ, tʰ, kʰ/), with the phonetic and phonological properties of this contrast having been thoroughly documented over the past six decades (e.g., Kim, 1965; Han & Weitzman, 1970; Cho, Jun, & Ladefoged, 2002; Kong, Beckman, & Edwards, 2011; Lee, 2016; Lee, Holliday, & Kong, 2020). Research in the past 15 years, however, has focused on a sound change in the stop contrast of standard Seoul Korean (e.g., Kim & Duanmu, 2004; Silva, 2006; Kang & Guion, 2006; Kang, 2014). Production data have shown that while the three stop types used to be differentiated by VOT alone, in the order fortis < lenis < aspirated, the VOTs of lenis and aspirated stops have come to overlap in younger Seoul speakers born after roughly 1965. ...
Previous studies have shown that the experience of higher education can influence speakers’ use of local and supralocal variants, but there has been less work examining its effect on perception. In the current study, we investigated the effect of higher education on perceptual cue-weighting by comparing high school and university students speaking two different dialects of Korean: Standard Seoul Korean (SSK) and Kyungsang Korean (KK). SSK speakers are known to perceptually weigh f0 over VOT in the stop laryngeal contrast, whereas KK speakers weigh VOT over f0. 117 high school and university students completed a stop identification task by responding to auditory stimuli built from VOT and f0 continua. Results revealed that while dialect-specific cue-weighting patterns existed among both SSK and KK listeners, the cue-weighting of university students in both regions was less dialect-specific than their respective high school counterparts. Comparing these patterns with those of 47 elementary school students confirmed that the trend is not directly correlated with the listeners’ ages. These findings suggest that the sociolinguistic experience accompanying the transition into higher education motivates listeners to flexibly accommodate supralocal phonetic variation regardless of dialectal prestige.
... Moreover, the word-initial lenis-aspirated stop contrast is believed to be undergoing tonogenesis, that is, the emergence of a tonal distinction on the basis of existing consonantal laryngeal contrasts and the ultimate replacement of the laryngeal categories by tone (Kang 2014). Specifically, an ongoing sound change in the Seoul-Gyeonggi dialect of Korean is merging the VOTs of lenis and aspirated stops, especially among younger speakers (Bang et al. 2018; Chang and Mandock 2019; Kang 2014; Kang and Guion 2008; Kong and Yoon 2013; Silva 2006), with onset F0 becoming the dominant cue to this distinction (Kang et al. 2010; Kim et al. 2002; Kong et al. 2011; Lee et al. 2013, 2020; Lee and Jongman 2019). ...
The present study examines the extent of crosslinguistic influence from English as a dominant language in the perception of the Korean lenis–aspirated contrast among Korean heritage speakers in the United States (N = 20) and English-speaking learners of Korean as a second language (N = 20), as compared to native speakers of Korean immersed in the first language environment (N = 20), by using an AX discrimination task. In addition, we sought to determine whether significant dependencies could be observed between participants’ linguistic background and experiences and their perceptual accuracy in the discrimination task. Results of a mixed-effects logistic regression model demonstrated that heritage speakers outperformed second language learners with 85% vs. 63% accurate discrimination, while no significant difference was detected between heritage speakers and first language-immersed native speakers (85% vs. 88% correct). Furthermore, higher verbal fluency was significantly predictive of greater perceptual accuracy for the heritage speakers. The results are compatible with the interpretation that the influence of English on the discrimination of the Korean laryngeal contrast was stronger for second language learners of Korean than for heritage speakers, while heritage speakers were not apparently affected by dominance in English in their discrimination of Korean lenis and aspirated stops.
Effects of vocal accommodation have been reported in a wide range of contexts, but they have typically been small. The absence of effects in some cases has proven perplexing. In the present investigation I present innovative methods for the representation of phonetic distance between phonetic tokens and the analysis of phonetic accommodation. I take a broad crosslinguistic perspective and report effects of linguistic background (L1) on patterns of phonetic convergence toward typical monolingual English voiceless stop voice-onset-times (VOTs). I propose that patterns of accommodation in laryngeal-oral coordination, as instantiated by voiceless stop VOTs, will reflect general principles of motor coordination (preferences for stable/in-phase coordination, cf. Browman & Goldstein, 1988; Haken, Kelso, & Bunz, 1985). Thus, stable, near-zero VOTs (cf. Spanish) will be less likely to show convergence toward intermediate English VOT, whereas less stable, long VOTs (cf. Korean) will be more likely to converge. Monolingual English and bilingual (Korean-English, Spanish-English) participants completed word shadowing and reading tasks. Their vowel-normalized voiceless stop VOTs were submitted to two analyses which confirm the articulatory stability hypothesis and reveal group-specific changes in vocal accommodation over time. The first involves a general baseline-to-test comparison, while in the second, a trial-specific difference from baseline is used as a dependent measure. The results offer new insights into the effects of language background on vocal accommodation, and the analytical approach offers a means to more cleanly isolate subtle effects of accommodation in speech among a multitude of competing factors.
This paper is a socio-phonetic study into the ways that affective meanings, particularly those related to politeness, are perceived cross-culturally. Building on previous research [Brown et al., 2014], the present experiments investigated the perception of acoustic cues associated with politeness in Korean by two groups of Chinese listeners: naïve listeners and experienced learners. The aim of this study is two-fold: to assess whether second language (L2) learners can access acoustic information for a politeness-related social stance in the L2, and whether L2 experience can improve learners’ ability to attend to the relevant cues. In Experiment 1, randomly ordered isolated stimuli produced in either a deferential or an intimate context by native Korean speakers were judged by naïve Chinese listeners. Similar to English listeners in Brown et al. (2014), Chinese listeners’ overall accuracies were below chance level (52.8%). When the same stimuli were presented in a blocked-by-speaker design in Experiment 2, accuracies increased to 57.8% for naïve listeners and 62.5% for experienced listeners, indicating that language experience can facilitate the acquisition of the language-specific acoustic correlates of politeness. The implications of these findings are discussed in light of the effects of L2 experience in immersion settings on the implicit learning of sociolinguistic knowledge.
This work examines Seoul Korean listeners’ perception of the five Korean sibilants: affricates /c′, c, cʰ/ and fricatives /s′, s/. Natural productions of the consonants were manipulated to vary orthogonally along several phonetic parameters relevant to the place/manner contrast ((denti)alveolar fricative vs. (palato)alveolar affricate) and the laryngeal contrast (fortis vs. lenis vs. aspirated). Of particular interest was listeners’ representation of /s/, whose laryngeal status is ambiguous. All manipulated parameters (baseline consonant and vowel affiliation, fundamental frequency at vowel onset, frication duration, and aspiration duration) influenced categorization, with consonant and vowel spectral information playing the primary role in distinguishing most sibilants. However, f0, a laryngeal cue, trumped place and manner cues in affricate vs. fricative classification, highlighting the increasing importance of f0 in Korean segmental phonology.
The phonetics/phonology interface refers to the relationship between the physical dimensions of phonetics and the abstract arrangement of phonemes and their manifestations within the phonological systems of languages. This chapter provides an overview of a range of approaches to the investigation of the phonetics/phonology interface, with particular attention to the relationships between phonetic factors such as positional prominence, acoustic salience and articulatory gestures, and phonological phenomena such as segment features and inventories, assimilation, and tone. I survey several clusters of theoretical orientation, each with distinct theoretical underpinnings and claims about the extent to which phonological concepts encode, reflect or direct phonetic details. I conclude with a discussion synthesising these seemingly disparate approaches, unifying them around a theme of linking the continuous physical dimensions of phonetic science with the abstract cognitive categories and rules of combination that typify phonological models. I discuss pedagogical implications and new directions in which facets of the interface can be explored.
This study investigated the sound change of aspirated stops in Korea by comparing neural and behavioral responses between younger and older generations of Korean speakers. Neural sensitivities were examined using event-related potentials (ERPs) in four conditions: /t/ vs. /tʰ/, /t/ vs. /t’/, /tʰ/ vs. /t/, and /t’/ vs. /t/. In addition, accuracies and reaction times in an AX discrimination task were measured. A total of 40 Korean native speakers participated in the study: 20 in the younger generation group in their 20s and 20 in the older generation group in their 50s. The results were as follows: (i) in the behavioral task, both the younger and the older generations distinguished aspirated stops from lax stops with accuracy rates over 90% and similar reaction times; (ii) the ERP results, however, showed generational differences: mismatch negativity (MMN) was elicited for the aspirated and lax stop distinction only for the older generation, not for the younger generation. Such group differences were not found for the tense and lax stop distinctions. The findings of this study provide neurophysiological evidence for the ongoing sound change of aspirated stops in Korea.
Acoustic data elicited from 34 native speakers of Korean living in the United States provide evidence for diachronic change in the voice onset time (VOT) of phrase-initial aspirated and lax stop phonemes. While older speakers produce aspirated and lax stops with clearly differentiated average VOT values, many younger speakers appear to have neutralized this difference, producing VOTs for aspirated stops that are substantially shorter than those of older speakers, and comparable to those for corresponding lax stops. The data further indicate that, within each age group, older speakers manifest sex-based differences in VOT while younger speakers do not. Despite this apparent shift in VOT values, the acoustic evidence suggests that all speakers in this study, regardless of age, continue to mark underlying differences between aspirated and lax stops in terms of stop closure and the fundamental frequency of the following vowel. It is concluded that the data point to a recent phonetic shift in the language, whereby VOT no longer serves as the primary cue to differentiate between lax and aspirated stops. There is not, however, evidence of any reorganization of the language at the phonemic level: the language's underlying lax ~ aspirated ~ tense contrasts endure.
Korean is thought to be unique in having three kinds of voiceless stops: aspirated /pʰ tʰ kʰ/, tense /p* t* k*/, and lax /p t k/. The contrast between tense and lax stops raises two theoretical problems. First, to distinguish them either a new feature [tense] is needed, or the contrast in voicing (or aspiration) must be increased from two to three. Either way there is a large increase in the number of possible stops in the world's languages, but the expansion lacks support beyond Korean. Second, initial aspirated and tense consonants correlate with a high tone, and lax and voiced consonants correlate with a low tone. The correlation cannot be explained in the standard tonogenesis model (voiceless-high and voiced-low). We argue instead that (a) underlyingly "tense" stops are regular voiceless unaspirated stops, and "lax" stops are regular voiced stops, (b) there is no compelling evidence for a new distinctive feature, and (c) the consonant-tone correlation is another case of voiceless-high and voiced-low. We conclude that Korean does not have an unusual phonology, and there is no need to complicate feature theory.
Japanese Tone Structure provides a thorough, phonetically grounded description of accent and intonation in Tokyo Japanese and uses it to develop an explicit account of surface phonological representation. The unusual amount of quantitative phonetic data analyzed and its testing in a detailed model make this an important new study for theoretical phonologists, phoneticians, and specialists in Japanese. The authors' broader purpose, however, is to develop a general theory of surface representation that can capture salient facts about prosodic structure in all languages and provide a suitable input to phonetic rules. The theory integrates autosegmental principles into a metrical account of prosodic structures in an explicit formalism. The work establishes phonology and phonetics as a productive area in cognitive science.
This paper argues that the phonetic interpretation of phonological representations may be controlled as well as automatic, because contextual variation in the realization of distinctive feature values is a flexible and adaptive response to variation in the demands on the production or perception of these values between contexts. The principal evidence presented in support of this argument is that the variation in the phonetic realization of speech sounds between contexts or languages involves reorganization of articulations into distinct phonetic categories. Extensive evidence of such reorganization in the realization of the feature [voice] is presented.