All content in this area was uploaded by Sky Onosson on Feb 15, 2022
Pre-print
A phonetic case study of Tŝilhqot’in /z/ and /zʕ/
Abstract: This paper provides an acoustic description of /z/ and /zʕ/ in Tŝilhqot’in (Northern
Dene). These sounds are noted by Cook (1993, 2013) to show lenition and some degree of
laterality in coda position. Based on recordings made in 2014 with a single, mother-tongue
speaker of Tŝilhqot’in, we describe their acoustic properties and examine their distribution as a
function of prosodic position and segmental environment. We find that they vary along three
dimensions: manner (fricative ~ approximant), degree of retraction (non-retracted ~ retracted),
and laterality (non-lateral ~ lateral). In addition, some tokens have a characteristic ‘buzziness’,
which has been associated with the Chinese front apical vowel (Shao & Ridouane 2018, 2019)
and the Swedish ‘viby-i’ (Westerberg 2019). We argue that ‘lenition’ (Kirchner 2004, Ennever,
Meakins & Round 2017) can only account for some of the observed variation and suggest that
both /z/ and /zʕ/ are specified for two tongue articulations: tongue tip/blade and tongue body
(Laver 1994), encompassing laterality (and concomitant retraction) in addition to the primary
coronal gesture.
1 Introduction
Tŝilhqot’in¹ (ISO 639-3) is a Northern Dene language spoken in the central interior of British
Columbia, Canada (see Figure 1). According to the 2018 Report on the Status of B.C. First
Nations Languages, there are currently 765 fluent speakers, the largest number of speakers of
any First Nations language spoken strictly within the boundaries of B.C. (Dunlop et al. 2018).
¹ Also written/referred to as Tsilhqut’in, Tsilhqút’ín, Chilcotin, Tzilkotin, Tŝinlhqot’in, and Nenqayni Ch’ih. The
spelling chosen here follows the spelling used by the Tŝilhqot’in National Government (http://www.tsilhqotin.ca/).
Authors: Sonya Bird (University of Victoria) and Sky Onosson (University of Manitoba).
To appear in the Journal of the International Phonetic Association.
This version may differ slightly from the published article. Please cite the published version.
Please do not distribute without permission of the authors.
Figure 1 Map of Tŝilhqot’in and surrounding First Nations territories. Original source (following
Creative Commons license CC-BY-NC-SA 2.0): First Nations People of British Columbia,
Ministry of Education, British Columbia. http://www.bced.gov.bc.ca/abed/map.htm
The bulk of the existing linguistic research on Tŝilhqot’in has been done by Eung-Do
Cook, culminating in his (2013) A Tsilhqút’ín grammar. While Cook’s grammar includes
thorough descriptions of Tŝilhqot’in phonology, morphology, and syntax, it does not provide any
details on the phonetic structures of the language. Tŝilhqot’in has an extremely rich sound
inventory, including features typical of Dene languages (e.g., tone) as well as ones shared with
neighbouring Salish and Wakashan languages (e.g. contrastive pharyngealization). Table 1
provides the consonant inventory, including a three-way voicing contrast in obstruents, a velar ~
uvular contrast (including secondary labialization), and a plain ~ pharyngealized contrast in
coronal fricatives and affricates. Table 1 is organized phonetically, based on the International
Phonetic Association’s (2015) International Phonetic Alphabet chart. However, it is important to
note that Cook (2013) groups the sounds /l/, /z/, /zʕ/, /j/, /w/, /ʁ/, and /ʁw/ together as “voiced
continuants (spirants)” (p. 15), based on phonological evidence. In van Eijk’s (1997) work on
neighbouring St’át’imcets (Interior Salish), /z/, /zʕ/, /ʁ/, and /ʁw/ are grouped together with /l/,
/l’/, /lʕ/, /lʕ’/, /j/, /j’/, /w/, /w’/, /ɣ/, /ɣ’/, /ʕ/, /ʕ’/, /ʕw/, /ʕ’w/, /ʔ/ and /h/, in this case as (voiced)
“resonants” (p. 2).² Thus, both Cook and van Eijk recognize that voiced fricatives and
(non-nasal) resonants share certain phonological properties; we return to the relevance of their
classification system for the study of /z/ and /zʕ/ in Section 4.
² Neither Cook nor van Eijk defines the terms “continuants”, “spirants”, and “resonants”. It is clear that, in their
descriptions of the sounds, they both have in mind phonological classification (based on distributional properties)
rather than phonetic realization.
Table 1 Tŝilhqot’in consonant inventory (adapted from Cook 2013: 15)
Manner | Bilabial | Alveolar | Post-alveolar | Velar | Uvular | Glottal
Plosive | b p | d t t’ | | g k k’; gʷ kʷ kʷ’ | ɢ q q’; ɢʷ qʷ qʷ’ | ʔ
Affricate | | d͡z t͡s t͡s’; d͡zʕ t͡sʕ t͡sʕ’ | d͡ʒ t͡ʃ t͡ʃ’ | | |
Lateral affricate | | d͡l tɬ tɬ’ | | | |
Nasal | m | n | | | |
Fricative | | s z; sʕ zʕ | ʃ | xʷ | χ χʷ; ʁ ʁʷ |
Lateral fricative | | ɬ | | | |
Approximant | w | | y | | | h
Lateral approximant | | l | | | |
Tŝilhqot’in also has a relatively complex vowel system: tense (long) vowels (/i a u/)
contrast with lax (short) ones (/ɨ e o/), and each of these six underlying vowels has (at least) two
realizations, one occurring in retracted environments, termed ‘flat’ by Cook (1993, 2013), and
the other occurring elsewhere, termed ‘sharp’ by Cook (1993, 2013). The presence of non-retracted
vs. retracted vowel allophones is a reliable perceptual cue to the quality (retracted or
not) of adjacent consonants (see Sections 3.1.1 and 3.1.2 below). In fact, a Tŝilhqot’in mother-tongue
speaker and language expert with whom co-author Bird has worked has talked about
reforming the orthography, so that the diacritic used to mark pharyngealization on consonants
(< ̂ >) is instead written on the vowels. At least in some dialects (Stone in particular), there is
evidence for contrastive nasalization (Cook 2013: 21). Finally, Tŝilhqot’in is a tone language:
vowels can have high (marked) or low (unmarked) tone.
The study described here focuses on the voiced alveolar plain and pharyngealized
fricatives, /z/ and /zʕ/. Contrastive pharyngealization is a key feature of the Tŝilhqot’in sound
system, and is also found in adjacent Interior Salish languages (Bessell 1992, Namdaran 2006,
Shahin 2002). Cook (1993) provides a phonological analysis of the local and non-local
assimilation (retraction) processes that are triggered by pharyngealized consonants in
Tŝilhqot’in, but no detailed phonetic work has yet been done on these sounds. In St’át’imcets
(Interior Salish), which is spoken to the south-east of Tŝilhqot’in (see Figure 1), pharyngealized
coronal consonants are articulated with significant tongue root retraction towards the lower
pharyngeal wall, pulling back both preceding and following vowels (Namdaran 2006). This
results in raised F1 and lowered F2 values associated with pharyngealized consonants compared
to their plain counterparts (Shahin 1997, 2002), a pattern reflective of pharyngealized sounds
cross-linguistically, including consonants (Al-Tamimi 2017, Shar & Ingram 2010) and vowels
(Chiu & Sun 2020).
Cook (1993) describes both /z/ and /zʕ/ as “spirants rather than fricatives, i.e., non-
strident, especially in syllable final position, so that they are sometimes perceived mistakenly as
dark l.” (p. 159). In his more recent work, Cook (2013) does not say anything particular about
the phonetic realization of /z/ and /zʕ/, although he does describe a more general weakening
(lenition) process of continuant consonants in coda position, i.e. a “stronger articulation
(fricative) in initial position and weaker articulation (spirant, glide) in final position of the
continuants” (p. 44). This lenition process is unusual in that, cross-linguistically, lenition occurs
most often in intervocalic position (Ennever et al. 2017, Katz & Pitzanti 2019). We return to the
question of whether “lenition” is the best term to use to describe the observed variation in /z/ and
/zʕ/ in Section 4.2.
To our ears (as trained phoneticians), /z/ and /zʕ/ clearly have an elusive phonetic target,
exhibiting substantial variation beyond stronger vs. weaker manners of articulation. In previous
work on the language, co-author Bird has transcribed these sounds using a variety of symbols:
[z], [z̞], [zð], [zʁ], [ɫ], [ɫð], [ʁ], [ʁ̞], [ɮ] (Bird, 2014). The purpose of this study is to characterize
the phonemes /z/ and /zʕ/ phonetically, to determine (a) what the possible phonetic variants of
these phonemes are, and (b) to what extent these variants are systematically distributed, based on
prosodic position and segmental environment. In terms of (b), our expectations are threefold. First,
lenited realizations will be more common in coda position than elsewhere (Cook 2013: 44);
furthermore, weakening may be affected by adjacent segments, although previous studies
disagree on the precise effects (Ennever et al. 2017, Kirchner 2001, 2004). Second, retracted realizations
will also be more common in coda position than elsewhere, based on cross-linguistic findings
showing that, in articulatorily complex segments, tongue body articulations (as opposed to
tongue tip articulations) are more prominent in coda than in onset position (Gick et al. 2006;
Krakow 1993). Third, following Cook’s (1993) observations, we
anticipate that lateralized realizations will be more common in coda position than elsewhere.
2 Methods
The study reported on below came out of an elicitation session in the fall of 2013 with a single
speaker who had worked closely with Eung-Do Cook in his linguistic fieldwork in the 1970s.
She grew up speaking Tŝilhqot’in in the home, where her mother was a monolingual speaker.
She is bilingual in English, and has continued to be involved in language work, as a linguist and
as a teacher. At the time of the elicitation session, co-author Sonya Bird was in Tŝilhqot’in
territory for other reasons and had the unique opportunity to work with her for an afternoon.
Because we only had a very limited time together, we made do with available word lists, based
primarily on Cook’s (1989, 1993, 2004) materials; this is reflected in the unevenness of the token
numbers across conditions, summarized in Table 2. Unfortunately, we were limited to making
audio recordings, and so were not able to capture articulation directly — this is clearly an area of
future exploration.
We recognize that working with a single speaker and using recording materials that were
not all specifically tailored for this study has implications in terms of the reliability of the
patterns described below. We note that the variable realizations of /z/ and /zʕ/ in the speaker’s
recordings are matched by several speakers in the FirstVoices Tŝilhqot’in (Xeni Gwet’in)
language portal (First Voices 2021),³ providing evidence that these are robust patterns that hold
across speakers of the language.
2.1 Stimuli and recording procedure
Stimuli consisted of words extracted from Cook’s (1989, 1993, 2004) materials⁴ illustrating
Tŝilhqot’in sounds, complemented by materials available from a Field Methods course
offered in spring 2006 at the University of Victoria by Dr. Leslie Saxon. Impressionistically,
/z/ and /zʕ/ share phonetic properties with /ʁ/ and /l/, especially in coda position. Therefore, the
set of words analyzed for this study included ones containing /z/ and /zʕ/ as well as representative
tokens of /ʁ/ and /l/ for comparison. The words themselves ranged in length from one to six
syllables, the majority being disyllabic, with three- and four-syllable words also being common
(see Appendix 1 for the full word list).
The word list was recorded in a quiet room in the home of the speaker’s sister, using a
Zoom H4N portable recorder and a head-mounted microphone set at approximately 3 cm from
the consultant’s mouth. The microphone was kept in a fixed position for the entire recording
session, ensuring that intensity could be reliably compared across tokens (Kingston 2008: 19).
For each word, the speaker was asked to check that she knew the word, and (if so) to repeat it
three times in a row. Because, in some cases, the pronunciation of the target sounds (/z/ and /zʕ/
in particular) varied across repetitions, all three repetitions of each word were included in the
analysis.
Table 2 summarizes the number of tokens per phoneme analyzed,⁵ organized by position.
Note that the token counts are unevenly distributed and, in some cases, quite small. They also do
not include word-initial onsets. For /z/ and /zʕ/, the token counts reflect the distribution of the
sounds in available written materials. Cook (2013: 16) notes that his corpus does not include any
word-initial /z/ or /zʕ/ tokens. Based on the materials available to us, /z/ also seems relatively
infrequent in non-intervocalic, word-medial onset position; in short, /z/ occurs most often in
VCV and VC# positions. The distribution of /zʕ/ is somewhat broader, including (in this dataset)
a fair number of word-medial coda (VCCV) tokens as well. Note also that lexical tone was not
incorporated into our study, because (a) we had no indication it might affect the realization of the
target segments and (b) our dataset did not allow us to include it as a predictor variable, given
that only seven words had a lexical high tone.⁶ /l/ and /ʁ/ did not vary much in their realization
and were included for comparison only, and as such made up relatively small sets.
³ FirstVoices (https://www.firstvoices.com/) is “an online space for Indigenous communities to share and promote
language, oral culture, and linguistic history”. It houses audio recordings, dictionaries, songs, and stories from many
Indigenous communities in British Columbia, Canada (as well as a few from elsewhere in Canada).
⁴ At the time of data collection, Cook’s (2013) grammar had not yet been published.
⁵ Note: CC sequences in our dataset are composed of a coda consonant + a (different) onset consonant; there are no
geminate consonants in Tŝilhqot’in.
⁶ Tŝilhqot’in tone is highly complex, including lexical tones as well as tones (high tone in particular) derived from
various phonological processes (see Cook, 2013, Section 1.5). The tonal system is not yet well understood, and is
not fully marked in Cook’s transcriptions, which formed the basis of our analysis.
Table 2. Tokens elicited, by syllabic position*
Consonant | Intervocalic (V[C]V) | Medial onset (VC[C]V) | Medial coda (V[C]CV) | Final coda (V[C]#) | Total
/z/ | 40 | 3 | 9 | 38 | 90
/zʕ/ | 41 | 3 | 45 | 55 | 144
/ʁ/ | 15 | 6 | 3 | 12 | 36
/l/ | 6 | 3 | 11 | 18 | 38
Total elicited | 102 | 15 | 68 | 123 | 308
Total analyzed | 93 | 12 | 53 | 109 | 267
*V=vowel; C=consonant; the target consonant is shown in square brackets.
In 41 of the 308 elicited word tokens the target phoneme was not present phonetically;
such deletion occurred most often in coda position (29/41). This was the case for all 15 tokens
with /ʁ/ in coda position (12 final and 3 medial), e.g., bilogh /biluʁ/ (‘knife’) was pronounced
[biloː], and for 12/45 words with /zʕ/ in medial coda position, e.g., tiẑlin /tizʕlin/ (‘Chilco Lake’),
pronounced [tɛɫɛin]. In addition, /z/ was not realized in final coda position in 2/38 words. Note
that in such cases of deletion, underlying consonantal retraction was generally still evident in the
adjacent vowels, e.g. in /biluʁ/, /u/ is retracted to [o] (see Cook 2013: 24, example c.); in /tizʕlin/,
both /i/ vowels are retracted, as is /l/.⁷ Deletion in coda position is mentioned in Cook (2013: 44)
as a process affecting all continuant consonants; it was therefore not surprising to observe in this
data set. In addition, target phonemes were not realized in 9/18 words with /ʁ/ in intervocalic
position and 3/37 words with /zʕ/ in intervocalic position. All medial onsets were realized
phonetically.
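The overall totals reported above can be cross-checked against Table 2 with a few lines of arithmetic. The following is a minimal Python sketch rather than part of the original analysis (which was carried out in Praat and R); the variable and function names are illustrative only, and the counts are transcribed from Table 2.

```python
# Token counts transcribed from Table 2.
# Column order: intervocalic, medial onset, medial coda, final coda.
ELICITED = {
    "z": [40, 3, 9, 38],
    "zʕ": [41, 3, 45, 55],
    "ʁ": [15, 6, 3, 12],
    "l": [6, 3, 11, 18],
}
ANALYZED_BY_POSITION = [93, 12, 53, 109]  # "Total analyzed" row of Table 2


def column_totals(rows):
    """Sum the per-phoneme counts position by position."""
    return [sum(column) for column in zip(*rows.values())]


elicited_by_position = column_totals(ELICITED)
total_elicited = sum(elicited_by_position)
total_analyzed = sum(ANALYZED_BY_POSITION)

# Tokens in which the target phoneme was not present phonetically.
not_realized = total_elicited - total_analyzed

# The two coda positions are the last two columns.
coda_not_realized = sum(elicited_by_position[2:]) - sum(ANALYZED_BY_POSITION[2:])

print(elicited_by_position, total_elicited, total_analyzed)
print(not_realized, coda_not_realized)
```

Run as written, this reproduces the "Total elicited" row of Table 2 and the figures of 41 unrealized tokens overall, 29 of them in coda position.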
2.2 Data analysis
Each token was analyzed qualitatively (2.2.1) and quantitatively (2.2.2), with a primary focus on
the target consonant itself. As mentioned above, vowels provide robust information about the
quality — retracted vs. non-retracted — of adjacent consonants. As such, we did include vowels
adjacent to the target consonants in our analysis. However, small sample sizes by vowel meant
that we were not able to conduct reliable statistical analyses on them.
2.2.1 Qualitative analysis
Qualitatively, tokens were categorized in terms of (a) manner of articulation (2 levels: non-lenited
vs. lenited), based on the presence of visible frication and/or formant structure (Katz &
Pitzanti 2019; Lee-Kim 2014; Shao & Ridouane 2018), (b) retraction (2 levels: non-retracted vs.
retracted), based on the quality of the target consonant and the adjacent vowels (observed
auditorily and visually), and (c) laterality (2 levels: non-lateral vs. lateral), based primarily on
auditory observation. Katz and Pitzanti (2019) have pointed out the limitations of using
subjective judgments to “force a binary classification onto continuous phonetic properties” (p.
11). Given the complexity of the observed variation in /z/ and /zʕ/, it seemed nonetheless likely
that these discrete categorizations would be useful in describing /z/ and /zʕ/ realizations and their
distribution. Note that we use the general term ‘retraction’ to describe perceived backing of the
tongue body (in both /z/ and /zʕ/), without specifying precisely where the articulatory target of
this backing is (raised vs. lowered).
⁷ See Cook (1993) for an analysis of Tŝilhqot’in retraction effects.
Coding was done using Praat textgrids (Boersma & Weenink 2018); six tiers were used,
to mark segments (both target consonants and adjacent vowels) of interest, token number,
underlying consonant and position, phonetic realization of manner specifically, phonetic
realization more generally (via phonetic transcription) of consonant, adjacent vowel quality
(sharp vs. flat), and phonemic transcription of adjacent vowel. We recognize that there are
certain limitations to our auditory coding, and consequently to the acoustic analyses that are
based on this auditory coding (3.1.2). First, although we are trained phoneticians, we are not
speakers of Tŝilhqot’in, and the auditory cues we used to classify sounds may not correspond
exactly to those that fluent speakers would use; future work should include a perceptual study
with fluent speakers, especially to disentangle realizations coded as having ambiguous (to us)
place features. Second, many of the realizations were ambiguous, varying along dimensions (e.g.
of lenition) in continuous ways, making auditory judgments challenging.⁸ To increase the
reliability of our analyses, we coded the data in a two-step process. Initial auditory coding and
transcription was carried out by co-author Bird. Subsequently, co-author Onosson independently
transcribed the entire dataset, using a Praat textgrid that did not specify the underlying phonemes
of the target sounds so as to minimize potential bias towards any particular phonetic realizations.
We initially agreed on the transcriptions of 223 out of 267 tokens (an inter-rater agreement rate
of 83.5%), both having noted a number of individual tokens of uncertain quality. A consensus of
opinion was reached on several of the latter, bringing the number of agreed-upon tokens to 230,
for an overall inter-rater agreement rate of 86.1%. For those tokens which were not fully agreed-
upon (37 of 267, or 13.9%), co-author Bird’s auditory coding was used in the quantitative
analysis, given her more extensive experience listening to these sounds and the language more
generally.
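The agreement rates above are simple proportions of the 267 analyzed tokens. As a minimal illustration (a Python sketch with hypothetical names, not the authors' own script), they can be recomputed as follows.

```python
def agreement_rate(agreed, total):
    """Percentage of tokens on which both transcribers agreed."""
    return 100.0 * agreed / total


TOTAL_TOKENS = 267          # tokens analyzed (Table 2)
INITIAL_AGREED = 223        # agreed on first pass
FINAL_AGREED = 230          # agreed after consensus discussion

initial_rate = agreement_rate(INITIAL_AGREED, TOTAL_TOKENS)
final_rate = agreement_rate(FINAL_AGREED, TOTAL_TOKENS)
unresolved = TOTAL_TOKENS - FINAL_AGREED
unresolved_rate = agreement_rate(unresolved, TOTAL_TOKENS)

print(f"initial: {initial_rate:.1f}%")
print(f"final: {final_rate:.1f}%")
print(f"unresolved: {unresolved} tokens ({unresolved_rate:.1f}%)")
```

These are raw percent-agreement figures, as reported in the text; no chance-corrected statistic is computed here.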
2.2.2 Quantitative analysis
Quantitative analysis focused on several measurements automatically extracted using a Praat
script. Within the target consonants themselves, we measured duration, mean intensity and band-
pass filtered zero crossing rate (bp-zcr) as correlates of lenition, as well as spectral moments 1-4
and mean F1 & F2 (Hz) as correlates of retraction and lateralization.
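For readers unfamiliar with spectral moments 1-4, the sketch below shows one common way to compute them from a magnitude spectrum: centre of gravity, standard deviation, skewness, and (excess) kurtosis, each weighted by the spectrum raised to a power. It is written in Python purely for illustration. The measurements in this study were taken in Praat, and the assumption that these formulas match the Praat implementation in every detail has not been verified against the authors' script. The function name and the toy spectrum are hypothetical.

```python
import math


def spectral_moments(freqs_hz, amplitudes, power=2.0):
    """Return (centre of gravity, SD, skewness, excess kurtosis) of a spectrum.

    Each frequency bin is weighted by |amplitude| ** power.
    """
    weights = [abs(a) ** power for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")

    # First moment: weighted mean frequency.
    cog = sum(f * w for f, w in zip(freqs_hz, weights)) / total

    def central_moment(order):
        return sum((f - cog) ** order * w for f, w in zip(freqs_hz, weights)) / total

    mu2 = central_moment(2)
    if mu2 == 0:
        raise ValueError("all energy is in a single bin; higher moments are undefined")

    sd = math.sqrt(mu2)
    skewness = central_moment(3) / mu2 ** 1.5
    kurtosis = central_moment(4) / mu2 ** 2 - 3.0  # excess kurtosis
    return cog, sd, skewness, kurtosis


# Toy symmetric spectrum (hypothetical values, not data from this study).
cog, sd, skewness, kurtosis = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
print(cog, sd, skewness, kurtosis)
```

Because the toy spectrum is symmetric about 2000 Hz, its centre of gravity is 2000 Hz and its skewness is zero; a spectrum with more energy at low frequencies and a long high-frequency tail would give positive skewness.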
Mean intensity (dB) across the duration of the target consonants was extracted from the
Intensity object created from the Sound object using Praat’s To Intensity... function. Bp-zcr has
been used as an alternative to harmonics-to-noise ratio (HNR) to quantify noisiness in a signal
without reference to periodicity (Gordeeva & Scobbie 2010; Westerberg 2018); the higher the
bp-zcr value, the noisier (i.e., less lenited) the sound is. Bp-zcr was measured in Praat
following Westerberg (2018), separately within each third of the target consonant’s duration, by
dividing the number of zero crossings (taken from a PointProcess “zeros” object, set by default
to include both “raisers” and “fallers”) by one third of the token duration. A mean per-token bp-zcr
was then calculated in R by averaging across the three per-third rates. Spectral moments (centre of
gravity, standard deviation, skewness and kurtosis) were included in our analysis as measures of
fricative place of articulation (Jongman, Wayland & Wong 2000). They were measured within a
30 ms Hamming window taken from the centre of each token, band-pass filtered between 200 and 22,050
Hz⁹ and extracted at a power setting of 2.0 (the default). In addition to spectral moments, we also
measured the mean first and second formants within the consonants themselves (Jassem 1965,
Soli 1981, Alwan 1986, Jongman et al. 2000) because, even for realizations coded as (voiced)
fricatives, formant structure was often visible within the consonant. F1 and F2 were calculated
using the Burg method and the following settings: 5 formants, ceiling of 5500 Hz, 25 ms window
length, and 50 Hz pre-emphasis.
⁸ We note that one reviewer disagreed with some of our transcriptions, even though we, as co-authors, agreed on
them. We suspect this may have been partly to do with the specific play-back devices used to listen to the audio
files, as documented in Sanker et al. (2021).
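The per-third zero-crossing rate described above can be sketched as follows. This is a simplified Python illustration, not the Praat script used in the study: it assumes the signal has already been band-pass filtered (the filtering step is omitted), counts both rising and falling crossings, and uses a synthetic sine wave as a stand-in for a recorded token. All names are hypothetical.

```python
import math


def count_zero_crossings(samples):
    """Count sign changes (rising and falling) between successive non-zero samples."""
    crossings = 0
    previous_positive = None
    for value in samples:
        if value == 0:
            continue  # an exact zero carries no sign; wait for the next non-zero sample
        positive = value > 0
        if previous_positive is not None and positive != previous_positive:
            crossings += 1
        previous_positive = positive
    return crossings


def mean_zcr_by_thirds(samples, sample_rate):
    """Zero crossings per second within each third of the token, averaged over the thirds."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples")
    bounds = [0, n // 3, 2 * n // 3, n]
    rates = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        duration_s = (end - start) / sample_rate
        rates.append(count_zero_crossings(samples[start:end]) / duration_s)
    return sum(rates) / len(rates)


# Hypothetical test signal: 90 ms of a 1000 Hz sine sampled at 44.1 kHz.
SAMPLE_RATE = 44100
tone = [
    math.sin(2 * math.pi * 1000 * (i + 0.5) / SAMPLE_RATE)
    for i in range(round(0.09 * SAMPLE_RATE))
]
tone_rate = mean_zcr_by_thirds(tone, SAMPLE_RATE)
print(round(tone_rate))
```

A pure 1000 Hz tone crosses zero twice per cycle, so the result is close to 2000 crossings per second; a noisy fricative would yield a much higher rate, and a smooth approximant a lower one.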
In addition to measuring various acoustic properties within the target consonants
themselves, we also measured F1 and F2 (using the same parameters as for the consonants)
within the vowels preceding and following /z/ and /zʕ/ since, as mentioned above, vowel quality
is a robust and reliable perceptual cue of retraction, or at least phonological pharyngealization, in
Tŝilhqot’in. We split each vowel into thirds (beginning, middle, end), and measured
mean F1 and F2 in each third. We referred to formants in the first third when describing vowels
following target consonants, and to the last third when describing vowels preceding target
consonants.
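The selection of vowel thirds described above can be illustrated with a short Python sketch. It assumes that a formant track is available as a time-ordered list of (time, F1, F2) measurements; the data structure, function names, and formant values are hypothetical and are not taken from the study.

```python
def mean_formants_by_third(track):
    """Mean (F1, F2) in each third of a vowel.

    `track` is a time-ordered list of (time, f1_hz, f2_hz) measurements.
    A third with no measurements is reported as None.
    """
    start, end = track[0][0], track[-1][0]
    span = end - start
    if span <= 0:
        raise ValueError("track must cover a positive time span")
    thirds = [[], [], []]
    for time, f1, f2 in track:
        index = min(int(3 * (time - start) / span), 2)
        thirds[index].append((f1, f2))
    means = []
    for part in thirds:
        if not part:
            means.append(None)
            continue
        mean_f1 = sum(f1 for f1, _ in part) / len(part)
        mean_f2 = sum(f2 for _, f2 in part) / len(part)
        means.append((mean_f1, mean_f2))
    return means


def adjacent_third(track, vowel_position):
    """Return the mean formants of the third that borders the target consonant."""
    means = mean_formants_by_third(track)
    if vowel_position == "preceding":  # vowel precedes the consonant: use its last third
        return means[2]
    if vowel_position == "following":  # vowel follows the consonant: use its first third
        return means[0]
    raise ValueError("vowel_position must be 'preceding' or 'following'")


# Hypothetical track: times in ms, formants in Hz.
example_track = [
    (0, 500, 1500), (10, 520, 1450), (20, 540, 1400),
    (30, 560, 1350), (40, 580, 1300), (50, 600, 1250),
]
print(adjacent_third(example_track, "preceding"))
print(adjacent_third(example_track, "following"))
```

For the example track, the last third averages 590 Hz (F1) and 1275 Hz (F2), and the first third 510 Hz and 1475 Hz.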
2.2.3 Statistical analysis
All statistical analyses of acoustic features and distributional properties were conducted in R (R
Core Team 2020) running in RStudio (RStudio Team 2020) and used several sub-components of
the tidyverse R package library (Wickham et al., 2019) as well as the stats package from the base
R library for specific statistical functions. Statistical analyses were carried out using the
following formulas. Chi-square tests of distributions: chisq.test(table(variable1, variable2)).
ANOVAs: aov(dependent.variable ~ independent.variable (* ind.var2 * ind.var3) ). Post-hoc
testing for significant interactions within multi-factor ANOVAs was carried out using Tukey’s
Honest Significant Difference (Tukey, 1953): TukeyHSD(anova.test.result). Statistical
significance was determined for all tests at p<.05; we omit specific p-values when reporting
results generally, except where discussing findings which fail to meet this threshold. Qualitative
or categorical variables which were tested included the following factors: phoneme (2 levels: /z/,
/zʕ/), preceding or following vowel (4 levels: /a/, /e/, /i/, /u/), preceding or following consonant (4
levels: /ʁ/, /l/, /z/, /zʕ/), manner (2 levels: lenited, non-lenited), retraction (2 levels: retracted,
non-retracted), laterality (2 levels: lateral, non-lateral). Quantitative or continuous variables
which were tested included the following measures: duration, intensity, band-passed zero-crossing
rate (bp-zcr), F1, F2, centre of gravity, standard deviation, skewness, and kurtosis.
⁹ We surveyed the literature on the use of spectral moments in the acoustic analysis of consonants and found
substantial variation in the choice of audio filtering. Such choices are not without consequence, however. Shadle and
Mair (1996) conducted a comparison of several filtering methods based on previous studies and determined that “the
values of the moments were strongly affected by the frequency range used” (p. 1522). Nonetheless, our study does
not focus on providing these measurements for comparison with other research, but rather on making internal
cross-comparisons. Our high-pass value of 200 Hz is meant to filter out voicing in both /z/ and /zʕ/. This matches the value
selected by Shadle and Mair as well as Sundara (2005) for the same purpose, although other studies have utilized
other cut-off values such as 500 Hz (Jannedy & Weirich 2017) or even as high as 1000 Hz (Nirgianaki 2014).
Because we were dealing with retracted realizations in many cases, we elected to use a lower threshold so as to
preserve much of the formant structure across a range of places of articulation.
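To make the chi-square test of distributions concrete, the sketch below computes the Pearson chi-square statistic and its degrees of freedom for a contingency table. It is a Python illustration only; the analyses in this study were run with R's chisq.test. Two assumptions should be noted: the counts in the example are invented for illustration and are not data from this study, and the sketch applies no continuity correction, whereas R's chisq.test applies Yates' correction by default to 2 × 2 tables, so the two will not give identical statistics in that case.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, all rows the same length.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows = onset vs. coda, columns = non-lenited vs. lenited.
hypothetical_counts = [
    [30, 10],
    [10, 30],
]
statistic, degrees_of_freedom = chi_square_statistic(hypothetical_counts)
print(statistic, degrees_of_freedom)
```

With these invented counts every expected cell is 20, giving a statistic of 20.0 on 1 degree of freedom.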
3 Results
In general, /z/ and /zʕ/ vary substantially from token to token and even from repetition to
repetition within a given token, giving the impression of a somewhat underspecified articulatory
and/or acoustic target. Nonetheless, certain patterns do emerge, pointing to syllabic (onset vs.
coda) and segmental (adjacent segment) effects on phonetic realization. Results are presented in
two parts: first, we describe the different phonetic realizations of /z/ and /zʕ/ with accompanying
acoustic analysis (Section 3.1); second, we analyze the distribution of the different phonetic
realizations, according to syllabic position and segmental environment (3.2).
3.1 Phonetic realizations of /z/ and /zʕ/
In this section, we describe the acoustic and auditory variants of /z/ and /zʕ/. In 3.1.1, we
compare the two phonemes to each other, to get a general sense of how /z/ and /zʕ/ differ
phonetically. In 3.1.2, we explore surface realizations of both phonemes in more detail, in terms
of variation in lenition, retraction, and lateralization as well as ‘buzziness’ (likely corresponding
to dentalization — see below).
3.1.1 Acoustic correlates of /z/ vs. /zʕ/
The phonetic realizations of /z/ and /zʕ/ included relatively similar ranges of variation in manner,
retraction, and laterality. Before exploring this variation in more detail, it is useful to compare
the two phonemes in terms of their overall phonetic features.
As we shall see in Section 3.2, /z/ vs. /zʕ/ differ somewhat in the frequency of different
realizations (see Tables 3 and 4). Nonetheless, we hypothesized that, overall, /zʕ/ realizations
would exhibit acoustic measures reflective of pharyngealization, in particular raised
F1 and lowered F2 both within /zʕ/ and in adjacent vowels (Namdaran 2006, Shahin 2002).
A set of one-way ANOVAs (phoneme) was conducted across our suite of acoustic parameters to
look for differences by phoneme (Appendix 2, Table 5). Acoustic correlates which distinguish
between /z/ vs. /zʕ/ vary depending on the realization, and often do not reach the level of
statistical significance, most likely due to low token counts in our data for certain realizations.
Nevertheless, the trends are consistent: realizations deriving from underlying /zʕ/ tend to have
longer duration, lower intensity, lower F2, and greater skewness, in comparison to realizations
deriving from underlying /z/. Figure 2 plots kernel density estimate distributions (Rosenblatt
1956, Parzen 1962) of the various acoustic measures by phoneme. Those correlates which meet
the level of statistical significance across all realizations are intensity (-1.6 dB for /zʕ/), F2 (-303
Hz for /zʕ/), centre of gravity (+228 Hz for /z/), and standard deviation (+225 Hz for /z/).
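A kernel density estimate of the kind plotted in Figure 2 smooths a set of measurements into a continuous curve by placing a small kernel on each observation. The following Python sketch uses a Gaussian kernel with a fixed bandwidth. Both choices are assumptions made for illustration, since the kernel and bandwidth used for the figures are not stated here, and the F2 values are invented rather than taken from the dataset.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical F2 values (Hz) and an arbitrary bandwidth of 100 Hz.
f2_density = gaussian_kde([1200.0, 1300.0, 1500.0], bandwidth=100.0)

# Riemann sum in 1 Hz steps over a range wide enough to cover the tails;
# a valid density should integrate to approximately 1.
area = sum(f2_density(x) for x in range(500, 2200))
print(f2_density(1300.0) > f2_density(2000.0), round(area, 3))
```

The estimated density is higher near the cluster of observations than far from it, and the curve integrates to approximately one, which is what allows distributions with different token counts to be compared on the same axes.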
Figure 2. Distributions of acoustic parameters across tokens of /z/ and /zʕ/.
Figure 3 provides kernel density estimate plots of /z/ and /zʕ/ distributions by mean F1 (y-
axis) and F2 (x-axis), as measured within the consonants. Overall, Figure 3 shows that the two
consonants differ substantially along F2 but much less so along F1; this is also the case for the
vowels adjacent to /z/ and /zʕ/ (Figure 4), similar to Zawaydeh & de Jong’s (2011) findings in
Ammani-Jordanian Arabic. This indicates that, unlike in other languages (see Al-Tamimi (2017)
on Arabic and Shahin (1997) on St’át’imcets), what is termed “pharyngealization” in Tŝilhqot’in
is manifested primarily as backing, without substantial lowering.
Figure 3. Mean F1 and F2 of /z/ vs. /zʕ/.
[Figure 2 panels: Duration (ms); Intensity (dB); Bp-zcr (log); F1 (Hz); F2 (Hz); Centre of gravity, log(Hz); Standard deviation, log(Hz); Skewness; Kurtosis (log). Legend: phoneme /z/ vs. /zʕ/.]
[Figure 3 axes: F2 (Hz) on the x-axis, F1 (Hz) on the y-axis; shading indicates density. Legend: phoneme /z/ vs. /zʕ/.]
As predicted based on Cook (2013) and as described for Arabic (Embarki et al. 2011,
Laver 1994), adjacent vowels also reliably correlate with phonemic pharyngealization. For
example, the final /a/ of teẑilhchaz /tezʕiɬt͡ʃaz/ ‘I started to fry it’ is realized as [æ] (with a
relatively high F2) whereas in telhant’aẑ /teɬant’azʕ/ ‘crowberry’ it is realized as [ɑ] (with a
relatively low F2).¹⁰ Figure 4 provides the acoustic vowel spaces before /z/ vs. /zʕ/ (a) and after
/z/ vs. /zʕ/ (b). This plot is based on F1 and F2 measurements averaged over the third of the
vowel that is adjacent to the target consonant, i.e. the last third for vowels preceding /z/ and /zʕ/
and the first third for vowels occurring after /z/ and /zʕ/. Although there are many more vowel
tokens preceding than following /z/ and /zʕ/, the general pattern is the same: the vowel space is
substantially further back (lower F2) and slightly lower down (higher F1) adjacent to
pharyngealized /zʕ/ compared to its non-pharyngealized counterpart /z/. These results mirror
those represented for the consonants themselves, in Figure 3 above.
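The "adjacent third" measurement described above can be illustrated with a minimal sketch. It assumes each vowel is represented by a list of formant values in time order; the function name, the argument names, and the sample F2 track are hypothetical and are not taken from our analysis scripts.

```python
from statistics import mean

def adjacent_third_mean(track, consonant_side):
    """Mean of the third of a vowel's formant track nearest the target consonant.

    track: formant values (Hz) in time order across the vowel.
    consonant_side: "after" if the consonant follows the vowel (use the last
    third), "before" if the consonant precedes the vowel (use the first third).
    """
    if not track:
        raise ValueError("track must contain at least one value")
    if consonant_side not in ("before", "after"):
        raise ValueError("consonant_side must be 'before' or 'after'")
    n = max(1, len(track) // 3)  # keep at least one frame per third
    window = track[-n:] if consonant_side == "after" else track[:n]
    return mean(window)

# Hypothetical F2 track (Hz) for a vowel that precedes the target consonant
f2_track = [1500, 1480, 1450, 1400, 1350, 1300]
print(adjacent_third_mean(f2_track, "after"))  # averages the last two frames
```

Averaging only the adjacent third keeps the measure focused on the part of the vowel most likely to carry coarticulatory effects of the consonant.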
To determine statistical significance of formant differences, two-way ANOVAs
(phoneme, retraction) were conducted for each formant per vowel according to the preceding
(Appendix 2, Table 6) or following (Appendix 2, Table 7) phoneme (for /u/, its scarcity in our
dataset meant that we were only able to make comparisons when preceding but not following /z/
vs. /zʕ/). For vowels following the target consonant, the main differences occur in F2: /a/, /e/,
and /i/ all have higher F2 values following /z/ than /zʕ/. The only significant F1 difference is for
/e/, with F1 being lower following /z/ than /zʕ/. For vowels preceding the target consonant, only
/a/ has a significantly higher F2 before /z/ than /zʕ/; only /i/ shows a significant F1 difference, but
in the opposite direction from that expected: F1 is higher before /z/ than /zʕ/. Although statistical
analysis is limited in reliability because of small sample sizes, the trends for F2 in particular are
consistent with the consonant-internal measurements (Figure 3). They also suggest that retraction
effects are stronger on vowels which follow consonants compared to vowels preceding
consonants, which is somewhat surprising given documented phonetic effects in other languages
(Nolan 2017) and also phonological effects of retraction described for Tŝilhqot’in and more
generally (Cook 1993; Zawaydeh & de Jong 2011).
¹⁰ Audio recordings of illustrative tokens are provided in the supplementary materials.
Figure 4. Vowel formants before /z/ vs. /zʕ/ (a) and after /z/ vs. /zʕ/ (b). Formants plotted are
mean values in adjacent third of the vowel. Ellipses show 95% confidence intervals (omitted
where low token count prohibits calculation).
As we shall see below, retraction is one component of the phonetic variation observed in
both /z/ and /zʕ/, in addition to being contrastive at the phonological level in the form of
pharyngealization. One of the most interesting things about the Tŝilhqot’in patterns described
here is this blurred role of retraction in the sound system.
3.1.2 Phonetic variation across /z/ and /zʕ/
In analyzing /z/ and /zʕ/ auditorily, we noted variation along three dimensions: degree of lenition,
degree of retraction, and lateralization. In this section, we explore the acoustic correlates of each
of these dimensions in turn. The results are complex, as a result of many interacting factors, and
not all tendencies reach statistical significance. In this section, we report on clear tendencies; full
statistical analyses are provided in Appendix 2 (Tables 8-12).
[Figure 4 panels: (a) Before: /z/ and Before: /zʕ/; (b) After: /z/ and After: /zʕ/. Axes: F2 (Hz) on the x-axis, F1 (Hz) on the y-axis. Legend: vowels a, e, i, u.]
Considering lenition first, both /z/ and /zʕ/ vary in how lenited their realizations are, from
clear fricatives to barely present approximants. We hypothesized that lenited forms should differ
from non-lenited forms acoustically in terms of intensity and bp-zcr in particular (see Appendix
2, Tables 10–12). Figure 5 provides a density comparison of bp-zcr, which showed the larger
effect size, in /z/ and /zʕ/ tokens coded as lenited vs. non-lenited. The distribution of bp-zcr
values is much flatter for non-lenited (dashed lines) than for lenited tokens and, crucially, the
mean is much higher, as expected: 3787 for non-lenited vs. 1337 for lenited.
Figure 5. Bp-zcr in /z/ and /zʕ/ tokens coded as lenited vs. non-lenited.
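For readers unfamiliar with the measure, the following is a minimal sketch of a zero-crossing rate computed over a stretch of waveform samples. It assumes that bp-zcr is a zero-crossing rate taken from a band-pass-filtered signal, as the name suggests; the filtering step and its band edges are not shown here, and the function name and test signal are hypothetical.

```python
import math

def zero_crossing_rate(samples, sample_rate):
    """Zero crossings per second in a list of waveform samples.

    Assumes the samples have already been band-pass filtered; that step is
    omitted here. A sample equal to zero is treated as non-negative.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * sample_rate / len(samples)

# Hypothetical test signal: one second of a 100 Hz sine sampled at 8000 Hz
sample_rate = 8000
sine = [math.sin(2 * math.pi * 100 * i / sample_rate + 0.1) for i in range(sample_rate)]
print(zero_crossing_rate(sine, sample_rate))
```

A noisier, more fricative-like signal changes sign more often than a periodic, approximant-like one, which is why higher values are associated with non-lenited tokens.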
Turning to retraction, we observed during auditory coding that it tended to coincide with
lenition. We tested the correlation between the two, finding that they are indeed highly correlated
(χ2=130.85, df=1), such that lenited realizations also (strongly) tend to be retracted (of the 167
tokens coded as retracted, 153 or 92% were also coded as lenited). We expected retraction to be
reflected in both consonantal and vocalic measures of place (spectral moments and formants). No
significant independent effect of retraction was found on the consonantal measures (see
Appendix 2, Table 10). Retraction did tend to have an effect on the quality of adjacent vowels,
although small token numbers per vowel made it difficult to reliably test this effect statistically
(see Appendix 2, Tables 6–7). Figure 6 shows vowel formants plotted according to the following
phoneme (/z/ or /zʕ/, the condition for which we have the most data) and for retraction (as coded
auditorily). In general, F2 values largely match our expectations, tending to be lower adjacent to
retracted versus non-retracted tokens (this effect was only significant in the case of /i/ occurring
after a retracted consonant). For F1, values are lower for /e/ and higher for /i/ following retracted
consonants, and lower for /u/ preceding retracted consonants. These preliminary F1 results
indicate that the retraction we hear on both /z/ and /zʕ/ is possibly more accurately described as
uvularization (see Zawaydeh & de Jong 2011 on Ammani-Jordanian Arabic), involving backing
and slight raising rather than lowering. This would explain both lowering of F1 in /e/ vs. raising
in /i/, as well as lowering of F1 in /u/, which is realized as [o] in retracted environments (Cook
2013).
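The association between lenition and retraction coding reported earlier in this paragraph is a test of independence on a contingency table. The following is a minimal sketch, not our analysis script. The retracted row uses the counts given in the text (153 lenited, 14 non-lenited); the non-retracted row is hypothetical, so the result will not reproduce the reported statistic. No continuity correction is applied, which is also an assumption.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: retracted, non-retracted. Columns: lenited, non-lenited.
counts = [
    [153, 14],  # from the text: 153 of 167 retracted tokens were lenited
    [12, 38],   # hypothetical counts, for illustration only
]
stat, df = chi_square(counts)
print(f"chi2={stat:.2f}, df={df}")
```

The same function applies to larger tables, such as a two-way coding crossed with four prosodic positions, where df=3.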
[Figure 5 panels: /z/ and /zʕ/. x-axis: Bp-zcr (log). Legend: manner (lenited, non-lenited).]
Figure 6. Vowel formants before /z/ vs. /zʕ/ coded as non-retracted (a) or retracted (b). Formants
plotted are mean values in adjacent third of the vowel. Ellipses show 95% confidence intervals
(omitted where low token count prohibits calculation).
The observed variation in retraction and lenition was such that, in some cases, both /z/
and /zʕ/ were realized as retracted and lenited [ʁ̞], which is also a common realization of
underlying /ʁ/. In Figure 7, we compare what was transcribed phonetically as [ʁ̞] from
intervocalic /zʕ/ vs. underlying /ʁ/, in the word teẑighin /tezʕiʁin/ ‘I started to pack or haul it’.
The realization of both /zʕ/ and /ʁ/ is a short approximant [ʁ̞] (/zʕ/: 55 ms; /ʁ/: 48 ms), although
underlying /zʕ/ is somewhat more lenited than underlying /ʁ/ in terms of intensity (69 dB vs. 65
dB) and formant structure (clearer in /zʕ/ than in /ʁ/). The vowel between /zʕ/ and /ʁ/ is
underlyingly /i/; its transitional nature both out of the preceding /zʕ/ and into the following /ʁ/
indicates retraction of both flanking consonants.
[Figure 6 panels: (a) Before: non-retracted /z/ and /zʕ/; (b) Before: retracted /z/ and /zʕ/. Axes: F2 (Hz) on the x-axis, F1 (Hz) on the y-axis. Legend: vowels a, e, i, u.]
Figure 7. Intervocalic [ʁ̞] from /zʕ/ vs. intervocalic [ʁ̞] from /ʁ/ in teẑighin /tezʕiʁin/ ‘I started to
pack or haul it’.
Given the apparent neutralization of the phonemic contrast between /zʕ/ and /ʁ/ (Figure
7), and also /z/, it is worth examining the phonetic realization of these three phonemes, to see
whether they are still distinguishable from one another acoustically. Figure 8 plots F2 (x-axis, as
the most reliable correlate of place) and bp-zcr (y-axis, as the most reliable correlate of manner)
by underlying phoneme for surface realizations of [ʁ̞]. Phonemic /ʁ/ mainly varies in bp-zcr over
a limited range of F2 values, meaning it varies a fair amount in manner (degree of lenition), but
is relatively consistent in place. Conversely, [ʁ̞] realizations of /z/ and /zʕ/ vary mainly in F2 over
a limited range of bp-zcr values, meaning they vary in place (degree of retraction), with /z/ being
somewhat more forward than /zʕ/ overall, but not much in manner (degree of lenition). We used
one-way ANOVAs (phoneme) to test for significant differences among [ʁ̞] realizations
corresponding to different phonemes (see Appendix 2, Table 8). F2 was found to differ
significantly by phoneme, but only between /z/ and /ʁ/, with a mean difference of +209 Hz for
/z/. For bp-zcr, /ʁ/ differed significantly from both /z/ (+782) and /zʕ/ (+682), /ʁ/ exhibiting less
lenition than /z/ and /zʕ/; /z/ and /zʕ/ did not differ from each other in bp-zcr. Other significantly
different measures among [ʁ̞] realizations included intensity (lower for /ʁ/, no significant
difference between /z/ vs. /zʕ/), centre of gravity (higher only for /ʁ/ vs. /z/), and standard
deviation (higher only for /ʁ/ vs. /z/).
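The logic of a one-way ANOVA by phoneme can be sketched as follows. This is an illustration only: the F2 values are hypothetical and do not come from our dataset, the function name is ours, and the sketch returns the F statistic and degrees of freedom without a p-value or pairwise comparisons, which would require a statistics package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups: list of lists, one list of measurements per factor level.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical F2 values (Hz) for surface approximants, grouped by underlying phoneme
f2_by_phoneme = [
    [1450, 1520, 1390, 1480],  # underlying /z/ (hypothetical)
    [1350, 1300, 1410, 1330],  # underlying /zʕ/ (hypothetical)
    [1250, 1280, 1230, 1300],  # underlying /ʁ/ (hypothetical)
]
f_stat, df_b, df_w = one_way_anova_f(f2_by_phoneme)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```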
Figure 8. [ʁ̞] F2 by bp-zcr according to phoneme: /z/, /zʕ/ and /ʁ/.
Laterality is the third dimension of variation observed in the dataset. Especially in coda
position (see Section 3.2), many tokens of both /z/ and /zʕ/ were coded as dark [ɫ]. Figure 9
provides a comparison of /l/ and /z/ in séla ninq’ez /séla ninq’ez/ ‘my hands are cold’, both of
which are realized as lateral approximants. Lateralized realizations of /z/ and /zʕ/ typically
sounded more retracted than underlying /l/. In Figure 9, this is reflected by the formant values of
/l/ vs. /z/, especially F2 (1296 Hz for /z/ vs. 1800 Hz for /l/). In addition, /z/ is preceded by the
retracted allophone of /e/: [ʌ]. Statistical analysis shows that tokens coded as lateral generally
have longer durations than ones coded as non-lateral, in addition to having lower intensity and
higher skewness values (Appendix 2, Table 11).
[Figure 8 axes: F2 (Hz) on the x-axis, bp-zcr on the y-axis; points plotted by phoneme.]
Figure 9. Dark [ɫ] in séla ninq’ez /séla ninq’ez/ ‘my hands are cold’.
Similar to the potential neutralization of /z/, /zʕ/ and /ʁ/ resulting from combined effects
of lenition and retraction (Figures 7 and 8), lateralization potentially leads to neutralization of /z/,
/zʕ/, and underlying /l/. Figure 10 plots F2 (x-axis; as the most reliable correlate of place) and bp-
zcr (y-axis; as the most reliable correlate of manner), by underlying phoneme. One-way
ANOVAs (phoneme) show that surface lateral realizations are distinguished by several acoustic
parameters (Appendix 2, Table 9), including F2 (higher in /z/ vs. /l/ and /zʕ/, no significant
difference between /l/ and /zʕ/) and bp-zcr (lower in /l/ vs. /z/ and /zʕ/, no significant difference
between /z/ and /zʕ/). The F2 results are surprising, since our perception was that /l/ is realized as
a lighter lateral than both /z/ and /zʕ/ (see Figure 9). Bp-zcr results reflect the fact that /l/ is a true
(and consistent) approximant, whereas /z/ and /zʕ/ are more variable in manner, even when coded
as [ɫ].
Figure 10. [l] F2 by bp-zcr according to phoneme: /z/, /zʕ/ and /l/.
In addition to the three main dimensions of lenition, retraction, and lateralization, we
observed what we called ‘buzziness’ in some of the /z/ and /zʕ/ tokens (39/88 and 43/129
respectively). Unfortunately, it was not possible to video record the elicitation session, but in
discussing the articulatory details of /z/ and /zʕ/ with the speaker, she confirmed that she
clenched her jaw during these sounds, as her mother had taught her. She also mentioned that,
when she taught these sounds to children, she showed them her teeth and told them to keep their
teeth closed. This articulatory tension sometimes led to secondary dentalization ([ð]) superimposed
on the primary articulation, leading to the auditory impression of buzziness (see also Zhou & Wu
1963 and others on the Chinese apical vowel). Acoustically, the buzzy nature of some [z] and
[zʕ] tokens is reflected in their spectral composition: it is not quite as ‘clean’ as that of canonical
/z/, with more noise throughout the frequency ranges, and especially below 4500 Hz.
In our data, buzziness was strongly correlated with manner (χ2=62.809, df=1), with 61%
of buzzy tokens also coded as fricatives and 89% of non-buzzy tokens coded as approximants.
Given the speaker’s description of /z/ and /zʕ/, it is not surprising that buzziness occurred
primarily as a secondary effect of jaw clenching, specifically in the more closed (fricative)
realizations of the phonemes. Statistically, we did not find a reliable independent effect of
buzziness on bp-zcr, as expected based on previous literature (Gordeeva & Scobbie 2010;
Westerberg 2019). We did however find interactions between buzziness and other factors
(Appendix 2, Table 12). In terms of manner, buzzy tokens had lower cog values than non-buzzy
tokens within non-lenited tokens specifically.
We end this section by noting that we observed a number of tokens that were ambiguous
and/or transitional in their realization. With respect to manner, several word-final tokens
transitioned from a relatively open (approximant) to a relatively closed (fricative) sound, e.g., the
final /z/ in jíz /d͡ʒíz/ ‘inside’ was realized as [ʁ̞͡ʁ]. Note that this pattern of realization is opposite
to what has been described for Chinese apical vowels (discussed in Section 4), which go from
more to less constricted (Shao & Ridouane 2018, 2019). With respect to place, realizations were
also especially variable in coda position, with token(s) coded as [ð], [z~ð], [l] (light), [ʁ̞~ɮ],
[ɫ~ɮ], [ʁ̞~ɫ~ɮ], and [ɮ~ʁ̞]. Such realizations point to the fact that /z/ and /zʕ/ varied not only
along three dimensions, but also along continua within these dimensions, and especially in coda
position.
[Figure 10 axes: F2 (Hz) on the x-axis, bp-zcr on the y-axis; points plotted by phoneme (/l/, /z/, /zʕ/).]
Summarizing so far, /z/ and /zʕ/ differ from one another in intensity (likely because of
lenition patterns) as well as in acoustic features associated with place, with /zʕ/ showing lower
F2, cog, and standard deviation values in particular, indicating tongue body/root backing. For
both consonants, lenited tokens were associated with higher intensity and lower bp-zcr
(correlates of manner), as well as lower F2, centre of gravity & standard deviation, and higher
skewness & kurtosis (correlates of place, reflecting the strong correlation between retraction and
lenition). No significant effects of retraction were found within the consonants themselves, but
adjacent vowel formants (F2 in particular) showed that retraction was associated with tongue
body/root backing (lowered F2). This preliminary finding supports the fluent speakers’ perceptions
(see Section 2) that Tŝilhqot’in retraction is carried on the vowels rather than the consonants.
Lateralization of /z/ and /zʕ/ was associated with longer duration, lower intensity, and higher
skewness. Finally, buzzy tokens were significantly correlated with non-lenited forms, and were
associated with lower cog values than non-buzzy tokens.
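The four spectral moments referred to in this summary (centre of gravity, standard deviation, skewness, and kurtosis) can be illustrated with a minimal sketch. It assumes a spectrum given as paired lists of frequencies and power values, uses power as the weighting, and reports excess kurtosis (subtracting 3); these choices and the toy spectrum are assumptions for illustration and may differ from the settings used in our measurements.

```python
import math

def spectral_moments(freqs, power):
    """Centre of gravity, standard deviation, skewness and excess kurtosis
    of a spectrum, treating power as a weight on each frequency bin."""
    total = sum(power)
    cog = sum(f * p for f, p in zip(freqs, power)) / total

    def central_moment(k):
        return sum((f - cog) ** k * p for f, p in zip(freqs, power)) / total

    sd = math.sqrt(central_moment(2))
    skewness = central_moment(3) / sd ** 3
    kurtosis = central_moment(4) / sd ** 4 - 3  # excess kurtosis (assumption)
    return cog, sd, skewness, kurtosis

# Toy spectrum (hypothetical): energy concentrated at low frequencies
freqs = [500, 1000, 2000, 4000, 8000]
power = [8.0, 4.0, 2.0, 1.0, 0.5]
cog, sd, skew, kurt = spectral_moments(freqs, power)
print(f"cog={cog:.0f} Hz, sd={sd:.0f} Hz, skewness={skew:.2f}, kurtosis={kurt:.2f}")
```

In a spectrum like this one, with most energy at low frequencies and a long high-frequency tail, centre of gravity is low and skewness is positive, which is the direction of the pattern summarized above for lenited tokens.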
3.2 Distribution of /z/ and /zʕ/ variants
Now that the phonetic variants of /z/ and /zʕ/ have been described, we consider their distribution
across prosodic positions (3.2.1) and segmental environments (3.2.2).¹¹ The analysis is based on
our auditory classification of /z/ and /zʕ/ realizations (via transcription), as supported by the
acoustic measures summarized in Section 3.1.
3.2.1 Phonetic realizations across prosodic positions
Based on Cook’s (1993, 2013) descriptions of /z/ and /zʕ/ as well as on more general effects of
sonority (Clements 1990), we expect the clearest predictor of phonetic realization to be prosodic
position, with more lenited and lateral realizations in coda position than elsewhere. Tables 3 and
4 summarize the number of tokens of each phonetic realization by underlying consonant, in
intervocalic position (Table 3) and in final coda position (Table 4). We focus on these positions
because they are the most common ones in our dataset (see Table 2 above).
Intervocalically (Table 3), the most common realization of /z/ is a retracted, lenited
approximant (coded as [ʁ̞]) and the most common realization of /zʕ/ is a non-retracted,
non-lenited fricative (coded as [z]). Crucially, unlike in coda position, intervocalic /z/ and /zʕ/ have
no lateral component, with the exception of a single token coded as ambiguous [ɮ~ʁ̞].
¹¹ Because the dataset used for this study includes three (and in one case four) repetitions per word, one other
potential effect which we investigated was repetition number. We did not find any correlation between repetition
number and any of several dependent variables including phonetic realization, manner, laterality, and retraction, for
any syllabic position. We therefore excluded it from further analysis.
Table 3. Token numbers per phonetic realization in V_V position, by underlying consonant
(token percentages refer to their respective columns, with bolding indicating the most common
realization(s) per column).
Manner        Phonetic realization    /z/         /zʕ/        Total
Fricative     [z] (Fig. 3)            14 (35%)    22 (58%)    36 (46%)
              [ʁ] (Fig. 4)            0           4 (11%)     4 (5%)
Approximant   [ʁ̞] (Figs. 4 & 5)       26 (65%)    10 (26%)    36 (46%)
              [z̞]                     0           1 (3%)      1 (1%)
Hybrid        [ɮ~ʁ̞]                   0           1 (3%)      1 (1%)
Total                                 40          38          78
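The column percentages in Table 3 follow directly from the token counts. As a minimal sketch, the snippet below recomputes them from the counts in the table; the dictionary keys and the function name are our own labels for illustration.

```python
# Token counts from Table 3 as (/z/, /zʕ/) pairs, one entry per realization
table3_counts = {
    "fricative_z": (14, 22),
    "fricative_uvular": (0, 4),
    "approximant_uvular": (26, 10),
    "approximant_z": (0, 1),
    "hybrid": (0, 1),
}

def column_percentages(counts):
    """Express each count as a rounded percentage of its column total."""
    totals = [sum(col) for col in zip(*counts.values())]
    return {
        label: tuple(round(100 * c / t) for c, t in zip(row, totals))
        for label, row in counts.items()
    }

for label, percentages in column_percentages(table3_counts).items():
    print(label, percentages)
```

For example, the 26 approximant tokens of /z/ out of 40 give 65%, and the 22 fricative tokens of /zʕ/ out of 38 give 58%, matching the table.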
In coda position (Table 4), the most common realization is [ɫ], especially for /zʕ/. Note
that in addition to the major realizations illustrated above, a few realizations were observed only
once or twice: [ð], [z~ð], [l] (light), and hybrid realizations [ʁ̞~ɮ], [ɫ~ɮ̞], and [ʁ̞~ɫ~ɮ]. These
reflect the fact that /z/ and /zʕ/ are particularly variable and ambiguous in coda position, much
more so than in onset position.
Table 4. Token numbers per phonetic realization in V_# position, by underlying consonant
(token percentages refer to their respective columns, with bolding indicating the most common
realization per column).
Manner            Phonetic realization                      /z/         /zʕ/        Total
Fricative         [z] (Figure 3)                            8 (22%)     5 (9%)      13 (14%)
                  [ð]                                       1 (3%)      0           1 (1%)
                  [z~ð]                                     0           1 (2%)      1 (1%)
Approximant       [ʁ̞] (Figures 5, 6, 8)                     3 (8%)      0           3 (3%)
                  [ɫ] (Figure 9)                            14 (39%)    46 (84%)    60 (66%)
                  [l] (light)                               2 (6%)      0           2 (2%)
                  vowel (Figure 10)                         4 (11%)     3 (5%)      7 (8%)
Other (hybrids)   [ʁ̞~ɮ], [ɫ~ɮ̞]; [ʁ̞~ɫ~ɮ] (all lenited)      4 (11%)     0           4 (4%)
Total                                                       36          55          91
Considering prosodic position as a whole (including medial onsets and codas as well —
see Table 2), degree of lenition is significantly correlated with prosodic position for /zʕ/
(χ2=50.64, df=3); for /z/, the relationship is weaker, falling slightly above the level of
significance (χ2=6.79, df=3, p=.079). In coda position, both medial and final, there is an
overwhelming tendency (between 75%–100%) for both phonemes to be lenited. Conversely, in
medial onset position there is equally strong resistance to lenition (67%-100%). These findings
support cross-linguistic tendencies related to sonority and syllable structure, whereby preference
is for low-sonority onsets and high-sonority codas (Clements 1990). Intervocalically, the two
phonemes differ, with /z/ leniting as it does in coda position and /zʕ/ resisting lenition as it does
in onset position. However, for both phonemes, intervocalic position shows the weakest
tendency towards categorical behaviour, i.e., there is more variability in lenition intervocalically
than in any other prosodic position.
Based on cross-linguistic findings that, in articulatorily complex segments, tongue body
articulations are more dominant in coda than in onset position (Gick et al. 2006; Krakow 1993),
we predicted that coda position would also lead to higher numbers of tokens perceived as
retracted. Although this is generally the case, the effect of prosodic position on retraction differs
between /z/ and /zʕ/, the correlation being significant only for /zʕ/ (χ2=39.833, df=3). Both /z/ and
/zʕ/ are categorically non-retracted in medial onset position. In final coda position, both
phonemes tend to be produced with retracted realizations, although this tendency is relatively
weak for /z/. In intervocalic and medial coda positions, the two phonemes behave with opposite
tendencies: /z/ tends to retract intervocalically and shows a weak tendency to non-retraction in
medial codas; in other words, intervocalic and final coda positions behave similarly for /z/, and
contrast with medial onsets and codas (which pattern together). In contrast, /zʕ/ tends not to
retract intervocalically but is nearly categorically retracted in medial codas; in other words,
intervocalic and onset positions behave similarly for /zʕ/, and contrast with medial and final
codas (which pattern together). Note that the relatively weak effects of prosody on retraction
reflect Cook’s (1993, 2013) descriptions of /z/ and /zʕ/ variation, in which he refers to lenition
and lateralization, but not retraction.
In support of Cook’s (1993) observations, there is a significant correlation between
prosodic position and laterality with the effect being generally quite strong and consistent for
both /z/ and /zʕ/ (/z/: χ2=36.577, df=3; /zʕ/: χ2=89.101, df=3): non-lateral realizations occur
almost categorically in intervocalic and medial onset positions (with the tendency in the latter
position being slightly weaker for /z/). In medial and final coda position, lateral realizations are
nearly categorical, except for /z/ in final codas where there is only a weak tendency.
Finally, the relationship between buzziness and prosodic position is statistically
significant for both /z/ (χ2=9.306, df=3) and /zʕ/ (χ2=42.988, df=3), reflecting the correlation
between buzziness and manner (see 3.1.2). For /zʕ/, the relationship between manner and position
is very clear (see above) and therefore predictive of buzziness in a straightforward way:
non-lenited intervocalic /zʕ/ realizations are by and large buzzy; lenited coda realizations are
non-buzzy. The only exception to this involves intervocalic lenited realizations, half of which are
buzzy. For /z/, the relationship between manner and position is not as clear, and therefore neither
is the relationship between buzziness and position: lenited intervocalic /z/ realizations are almost
categorically non-buzzy and non-lenited coda realizations are entirely buzzy, but other manner-
position combinations also occur and are more variable. Overall, the observed patterns of
buzziness support the idea that intervocalic /zʕ/ is syllabified as an onset, with lenition and hence
non-buzziness occurring only in coda position, whereas this is not so clearly the case for /z/.
The findings reported in 3.2.1 suggest that the prosodic affiliation of intervocalic
consonants is worth investigating further. Across dimensions of variation, /zʕ/ shows relatively
consistent syllabic effects, with intervocalic /zʕ/ behaving like onset /zʕ/, and in opposition to
medial and final coda /zʕ/. In contrast, /z/ does not show such consistent syllabic effects: with
respect to retraction, intervocalic /z/ patterns with medial and final coda /z/, and in opposition to
onset /z/; with respect to lenition, intervocalic /z/ patterns with final coda /z/, and in opposition to
medial coda or onset /z/; with respect to laterality, intervocalic /z/ patterns with onset /z/, in
opposition to coda /z/ (medial and final); with respect to buzziness, /z/ final codas stand apart
from other positions as the locus for buzzy realizations. This difference in syllabic affiliations
between intervocalic /z/ and /zʕ/ is interesting, and worth delving into in more detail, especially
given the complexity of syllabification of intervocalic consonants in other Dene languages (Bird
2002).
3.2.2 Phonetic realizations across segmental environments
In addition to prosodic position, a number of studies have shown that segmental environment
plays a role in lenition¹² patterns in particular (Kirchner 2001, 2004, Kingston 2008, Ennever et
al. 2017). In the dataset we are working with, /z/ and /zʕ/ occur either adjacent to vowels or (in a
small number of cases) to resonants. We focus here on the potential role of preceding vowels in
predicting the phonetic realization of /z/ and /zʕ/, since this is the condition for which we have
the most data.
There is ongoing debate about whether the quality of adjacent vowels affects degree of
consonant lenition, and existing findings are conflicting in terms of the direction of possible
effects (Ennever et al. 2017).¹³ In Tŝilhqot’in, /z/ and /zʕ/ show a strong tendency to lenite
following all oral¹⁴ vowels (intervocalic and coda position), compared to following another
consonant (medial onset position). Variation between vowels (significant only for /z/: χ2=11.45,
df=3) suggests a prohibitive effect of vowel proximity on lenition. Ranking vowels according to
lenition of /z/ (least to most lenited), we get the following scale: i (61%) > e (88%) > a, u (100%
i.e., categorical lenition). For lenition of /zʕ/ the rankings are: e (62%) > i (75%) > u (80%) > a
(88%). In both cases, distal (articulatorily incompatible) [a] leads to the most cases of lenition,
and the more proximal (articulatorily compatible) [e] and [i] lead to fewer cases of lenition. This
supports Kirchner (2001, 2004), and is also compatible with Iskarous et al.’s (2013) model of
coarticulatory resistance, which predicts that high and front vowels will resist coarticulatory
effects more than low and back vowels (Recasens & Rodriguez 2016).¹⁵ To the extent that [u]’s
behaviour is reliable (very few tokens exist, especially preceding /z/), it seems to reflect the
articulatory specification of /z/ vs. /zʕ/: [u] patterns with distal [a] before /z/, but closer to
proximal [e] and [i] before /zʕ/.
Recall from 3.1.2 that buzziness was strongly correlated with manner. The effect of
vowel quality on lenition is reflected in its effect on buzziness as well (significant only for /zʕ/:
χ2=15.785, df=5). For /z/, when ordered by buzziness (least to most buzzy), vowels are ranked as
follows: u (0%) > a (22%) > e (38%) > i (61%); only /i/ is likely to produce buzzy realizations,
which makes sense articulatorily if buzziness results from a high/fronted tongue body. For /zʕ/,
the ranking is: a (17%) > u (20%) > i (28%) > e (45%). The ranking of vowels preceding /zʕ/ is
exactly opposite for buzziness vs. lenition, reflecting the fact that less lenited realizations are
consistently more buzzy. The rankings for vowels preceding /z/ by buzziness vs. lenition are
more complex, reflecting the fact that the relationship between buzziness and lenition is also
more complex, and partly dependent on prosodic position.
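The observation that the /zʕ/ vowel rankings for buzziness and lenition are mirror images can be checked directly from the percentages reported above. The following is a minimal sketch; the variable and function names are ours.

```python
# Percentages reported in the text for vowels preceding /zʕ/
lenition_pct = {"e": 62, "i": 75, "u": 80, "a": 88}
buzziness_pct = {"a": 17, "u": 20, "i": 28, "e": 45}

def rank_least_to_most(rates):
    """Order vowels from the lowest to the highest rate."""
    return sorted(rates, key=rates.get)

lenition_rank = rank_least_to_most(lenition_pct)
buzziness_rank = rank_least_to_most(buzziness_pct)

print("lenition: ", " > ".join(lenition_rank))
print("buzziness:", " > ".join(buzziness_rank))
print("mirror images:", lenition_rank == buzziness_rank[::-1])
```

The same check is less informative for /z/, because [a] and [u] are tied at 100% lenition, so their relative order on the lenition scale is not determined.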
There is also a correlation between preceding vowel and retraction for both phonemes
(/z/: χ2=15.908, df=5; /zʕ/ χ2=16.925, df=5). If we rank vowels according to retraction (least to
most) for /z/, we get the following ranking: a (44%) > i (61%) > e (76%) > u (100%). In
comparison to /z/, /zʕ/ shows a much greater incidence of retraction overall, reflecting its
inherently retracted nature. When ranked according to retraction for /zʕ/, the vowels are ordered
as follows: e (62%) > i (72%) > a (88%) > u (100%). For both phonemes, [u] categorically
favours retraction, which supports the idea that what we coded as phonetically retracted
corresponds to raised backing, or uvularization (compatible with [u]) rather than
pharyngealization (Saltzman and Munhall 1989). The fact that [a] does not favour retraction of
/z/ further supports this view, since one would expect [a] to favour pharyngealization, but not
uvularization.
¹² We also tested for effects of vowel quality on laterality, retraction, and buzziness, but no clear patterns emerge for
any of these effects.
¹³ Existing studies are of plosives rather than fricatives (Cole, Hualde & Iskarous 1999, Simonet, Hualde & Nadeu
2012, Ortega-Llebaria 2004, Ennever et al. 2017).
¹⁴ We focus on oral vowels in this section, since we have very few tokens of nasalized vowels.
¹⁵ Recasens & Rodriguez (2016) show that co-articulatory resistance decreases in Catalan VCV sequences in this
progression: [i, e] > [a] > [o] > [u]. We can say that our data is largely compatible with this ranking insofar as we
have the data to match it. However, our data is particularly lacking at the “most variable, least resistant to
co-articulatory effects” end of the scale, as we have relatively few tokens of /u/, and Tŝilhqot’in’s four-vowel system
lacks an equivalent to /o/ (also Tŝilhqot’in /e/ is ["] rather than [e]).
Finally, in terms of laterality, the patterns differ somewhat by phoneme, and both are
statistically significant (/z/: χ2=21.342, df=5; /zʕ/: χ2=22.461, df=5). The order of vowels
preceding /z/, from least to most lateral, is as follows: u (0% lateral) > e (29%) > i
(42%) > a (100%); only [a] is followed categorically by lateral realizations. For /zʕ/, the order is:
e (52%) > u (60%) > a (75%) > i (81%); all vowels favour lateralization of following /zʕ/, which
reflects the near-categorical tendency for /zʕ/ to be lateralized in coda position.
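As a compact restatement of the percentages reported above, the following Python sketch orders the four vowels from lowest to highest rate of retraction and laterality for each phoneme, and flags vowels whose behaviour is categorical (0% or 100%). The figures are copied from the text; the dictionary layout and the function names `rank_vowels` and `categorical` are illustrative choices only, not part of the original analysis.

```python
# Percentage of tokens showing each property, by preceding vowel,
# as reported in the text above.
RATES = {
    ("z", "retraction"): {"a": 44, "i": 61, "e": 76, "u": 100},
    ("zʕ", "retraction"): {"e": 62, "i": 72, "a": 88, "u": 100},
    ("z", "laterality"): {"u": 0, "e": 29, "i": 42, "a": 100},
    ("zʕ", "laterality"): {"e": 52, "u": 60, "a": 75, "i": 81},
}


def rank_vowels(rates):
    """Return vowels ordered from lowest to highest rate."""
    return sorted(rates, key=rates.get)


def categorical(rates):
    """Return vowels whose rate is categorical (0% or 100%)."""
    return [vowel for vowel, rate in rates.items() if rate in (0, 100)]


for (phoneme, prop), rates in RATES.items():
    ranking = " > ".join(rank_vowels(rates))
    print(f"/{phoneme}/ {prop}: {ranking} (categorical: {categorical(rates)})")
```

Laid out this way, it is easy to see that [u] is the only vowel that is categorical for retraction with both phonemes, whereas [a] is categorical only for laterality of /z/.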
Overall, the findings presented in Sections 3.1 and 3.2 paint a relatively consistent picture
of /z/ and /zʕ/ variation, even though not all results are statistically significant.
Tokens coded as [ʁ̞], corresponding primarily to /z/ in intervocalic position (Table 3), are
acoustically lenited, retracted, and non-buzzy. Tokens coded as [z], often corresponding to /zʕ/ in
intervocalic position (Table 3), are acoustically non-lenited, non-retracted, and buzzy. Thus,
lenition, retraction, and buzziness co-vary in where they occur. Tokens coded as [ɫ] are
observed strictly in final coda position (Table 4) and this is reflected in the very clear results of
laterality by position, across both /z/ and /zʕ/. In terms of segmental environment, distal vowels
trigger lenition more so than proximal vowels; other effects vary by phoneme.
4 Discussion
Both the phonetic features of Tŝilhqot’in /z/ and /zʕ/ and their distributional properties across
syllabic and segmental environments are reminiscent of patterns observed in other languages,
locally and further afield. The discussion that follows considers how /z/ and /zʕ/ should be
characterized phonetically and phonologically (Section 4.1), and whether the observed patterns
can be described as lenition (4.2).
4.1 Phonetic and phonological features of Tŝilhqot’in /z/ and /zʕ/
Phonetically, the defining features of Tŝilhqot’in /z/ and /zʕ/ include a characteristic ‘buzziness’
(accompanying coronal articulations in particular) and acoustic features compatible with
engagement of both the tongue tip (TT; non-retracted/non-lateralized articulations) and the
tongue body (TB; retracted/lateralized articulations), or what Laver (1994) refers to as “double
articulations” (p. 314). Although phonologically, Tŝilhqot’in /z/ and /zʕ/ are clearly consonants,
their phonetic characteristics are reminiscent of two other sounds described in the literature: the
Chinese front apical vowel and the Swedish “Viby-i” vowel (see Laver 1994, Chapter 11,
Section 11.3).
The Chinese sound that has traditionally been called the ‘front apical vowel’ has been the
topic of much debate concerning its precise nature and whether it is best characterized
(phonetically) as a vowel, an approximant, or a fricative (Karlgren 1915, Ladefoged &
Maddieson 1996, Yu 1999, Duanmu 2007, Lee-Kim 2014). X-ray images in Zhou & Wu (1963)
and ultrasound images in Lee-Kim (2014) both show that the sound (transcribed here as [ɹ̪]
following Lee-Kim) has a double articulation, with TT raising and TB raising/backing (see also
Laver (1994) on double articulations). In fact, Lee-Kim (2014) notes that there seems to be an
inherent compatibility between dental TT articulations and TB retraction, citing Stevens, Keyser
& Kawasaki (1986), who “conjecture that the dental constriction, which requires a flat tongue
front, can be achieved more easily when the tongue back is retracted” (p. 271). Lee-Kim notes
that [ɹ̪] is “presumably unattested in any other language” (p. 279). While the phonological
distribution of [ɹ̪] is certainly a signature feature of Chinese languages, Tŝilhqot’in /z/ and /zʕ/
show that its phonetic realization is perhaps not so unique.
If Tŝilhqot’in /z/ and /zʕ/ indeed show the same kind of double articulation (TT and TB)
as Chinese [ɹ̪], as our acoustic and auditory analysis suggests, this might explain why all of these
sounds can exhibit what is referred to here as ‘buzziness’. Shao & Ridouane (2018, 2019)
describe the front apical vowel in Jixi-Hui Chinese, which they transcribe /z/. According to their
(2019) articulatory investigation, Jixi-Hui /z/ has a high TB and a raised TT. They hypothesize
that the high TT in particular might explain the presence of “abundant frication noise” during the
sound, which has also been described by Trubetzkoy (1969) as “frication-like noise resembling a
humming” (p. 171) and by Chao (1961) as “a buzzing quality” (p. 22).
16
Note that Jixi-Hui
Chinese /z/ transitions from a fricative to a vowel. In cases where Tŝilhqot’in /z/ and /zʕ/ are
transitional, they follow the opposite pattern, transitioning from a more lenited, sonorant sound
to a less lenited one. This is not surprising, given the differences in their phonological status,
Jixi-Hui Chinese /z/ acting as a syllable nucleus and Tŝilhqot’in /z/ and /zʕ/ acting as consonants.
Another language which exhibits a distinctly ‘buzzy’ sound is Swedish. Westerberg
(2018, 2019) has conducted the most recent work on the vowel termed ‘Viby-i’, which she
describes as “an /i:/ variant with an unusual ‘thick’, ‘buzzing’, and ‘damped’ quality
(Engstrand et al. 1998)” (Westerberg 2019: 3696). Westerberg cites similar descriptions by
previous authors: Björsten & Engstrand (1999) “suggest that viby-i is a high central unrounded
[ɨ], which may be produced with a raised tongue tip to amplify its ‘damped’ quality”
(Westerberg 2019: 3696–7). Frid et al. (2015) report that “Viby-i is produced with a lower and
backer tongue body, and different tongue tip behaviour, than [i:]” (Westerberg 2019: 3697).
Based on acoustic evidence, Westerberg hypothesizes that Viby-i is a centralized vowel (similar
to our auditory impressions of the most lenited versions of Tŝilhqot’in /z/ and /zʕ/) and links the
low F2 of Viby-i to a complex tongue shape. Of course, the articulatory properties of Tŝilhqot’in
/z/ and /zʕ/ can only be inferred from the acoustic signal here. Our hope is that, in the future, we
might also have the possibility of conducting an articulatory study of these sounds. Particularly
intriguing to us is the potential role of jaw clenching in producing /z/ and /zʕ/. Neither the
literature on the Chinese front apical vowel nor that on the Swedish Viby-i mentions the jaw as a
primary articulator, but according to the Tŝilhqot’in speaker we worked with, the jaw is tightly
shut for /z/ and /zʕ/. Her descriptions provide support for Esling’s (2005) laryngeal articulator
model of vowel production, which specifically includes the jaw as a primary articulator.
16
Bowei Shao, personal communication (November 30, 2019).
Although phonetically Tŝilhqot’in /z/ and /zʕ/ appear similar to the Chinese front apical
vowel and to the Swedish Viby-i, these sounds differ in their phonological status. Prosodically,
unlike the Chinese and Swedish sounds, Tŝilhqot’in /z/ and /zʕ/ are clearly consonants, acting as
syllable onsets and codas, but never as nuclei. Segmentally, their precise phonological
categorization is less clear. Although they are transcribed as fricatives by Cook (1993, 2013), /z/
and /zʕ/ show prosodically-based variation that is typical of sonorants cross-linguistically. Gick
et al. (2006) and Krakow (1993, 1999) discuss articulatory timing in liquids and nasals,
respectively. Liquids and nasals are articulatorily complex, consisting of (at least) two gestures:
TT + TB for liquids, and tongue (TT or TB) + velum for nasals. In such segments, both the
magnitude of the anterior and posterior gestures and their relative timing can vary as a function
of syllabic position (Gick et al. 2006): in onset position, if the posterior gesture occurs at all, it
occurs relatively synchronously with the anterior gesture; in coda position, the posterior gesture
precedes the anterior gesture and the anterior gesture is often lenited, sometimes to the point of
disappearing altogether. For liquids (/l/ in particular), asynchronous timing in coda position leads
to perceived dominance of the TB gesture, as in dark [ɫ]. For nasals, asynchronous timing in coda
position leads to perceived nasal co-articulation with the preceding vowel. In Tŝilhqot’in,
retracted (TB-dominant) variants of /z/ and /zʕ/ (especially lateral ones) tend to occur in coda
position, whereas non-retracted (TT-dominant) tokens are more common in onset position,
especially for /zʕ/. This pattern matches that found among sonorants across the world’s
languages.
In the literature on inter-gestural timing, the anterior and posterior gestures are sometimes
conceptualized instead as “consonantal” and “vocalic”, where more anterior gestures (TT) are
consonantal and more posterior gestures (TB) are vocalic. Inter-gestural timing is then described
in terms of peripherality within the syllable: consonantal (anterior) gestures occur more
peripherally, whereas vocalic (posterior) gestures occur more centrally (Sproat & Fujimura
1993). If we think of lenited variants in our data as more “vocalic” and non-lenited variants as
more “consonantal”, then the co-variation (χ2=73.263, df=1) observed between lenition and
retraction (see 3.1.2) makes good sense: as with sonorants in other languages, the more vocalic
variants of /z/ and /zʕ/ are also retracted (dominance of TB gesture; 91% of lenited forms),
whereas the more consonantal variants are also non-retracted (dominance of TT gesture; 90% of
non-lenited forms).
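For readers less familiar with the test behind the reported co-variation, the sketch below computes a Pearson chi-square statistic for a 2×2 lenition-by-retraction table in plain Python. The counts are hypothetical: they were chosen only to mirror the reported proportions (91% of lenited forms retracted, 90% of non-lenited forms non-retracted) and are not the token counts from this study, so the resulting statistic is not expected to match the reported value. The sketch also assumes no continuity correction; whether one was applied in the original analysis is not stated here.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# HYPOTHETICAL counts, for illustration only.
# Rows: lenited, non-lenited. Columns: retracted, non-retracted.
hypothetical_counts = [[91, 9], [10, 90]]

stat = chi_square_2x2(hypothetical_counts)
CRITICAL_DF1_P05 = 3.841  # chi-square critical value for df = 1, alpha = .05
print(f"chi-square = {stat:.3f}, df = 1, exceeds critical value: {stat > CRITICAL_DF1_P05}")
```

With a 2×2 table there is a single degree of freedom, which is why the lenition-by-retraction test is reported with df=1 while the vowel-based tests have more.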
The idea of /z/ and /zʕ/ patterning with sonorants is familiar from Interior Salish
languages that neighbour Tŝilhqot’in. For example, van Eijk (1997: 4) notes that in St’át’imcets
(Lillooet Salish; see Figure 1), /z/ behaves as a resonant, phonologically. Phonetically, van Eijk
characterizes /z/ as a “lax” fricative, implying that it is on the sonorant end of the fricative-
sonorant continuum. Phonologically, it has the same distributional restrictions as other resonants
in the language and, like more typical resonants, it also has a glottalized counterpart.
Interestingly, van Eijk notes that, in the Mount Currie dialect, /z’/ allows free variation between
[z’] and [l’] in coda position, e.g. /χez’p/ ‘ember(s)’ can be realized as [χez’p] or [χel’p]. This
suggests that [z] ~ [l] allophonic variation is an areal phenomenon, and/or that [z] and [l] are
compatible in some way that makes them likely allophones, not just in Tŝilhqot’in, but more
generally.
Summarizing so far, we have seen that the acoustics of Tŝilhqot’in /z/ and /zʕ/ appear to
indicate both TT and TB gestures, which vary in their magnitude and specific realization, and
which sometimes lead to the perception of buzziness within the sound. Phonologically, /z/ and
/zʕ/ behave similarly to /z/ in Interior Salish languages, which also has (phonological) properties
of sonorants (rather than fricatives). We return to the phonological specification of /z/ and /zʕ/ in
Section 4.2, after a discussion of whether their variation falls under the umbrella of lenition.
4.2 Tŝilhqot’in /z/ and /zʕ/ variation as lenition
Given the variation in manner observed across tokens of Tŝilhqot’in /z/ and /zʕ/, it is worth
considering to what extent this variation can be described in terms of lenition (Kirchner 2004,
Kingston 2008, Warner & Tucker 2011, Ennever et al. 2017, Katz & Pitzanti 2019, Broś et al.
2021, and others). Ennever et al. (2017) summarize the factors that have been shown to play a
role in lenition, including duration, prosodic position, and segmental environment (as well as
consonantal place of articulation, which we will not consider here).
17
In terms of duration, cross-linguistic patterns show, not surprisingly, that lenited forms tend to be
shorter. A common analysis of this correlation is ‘undershoot’,
where shorter durations (e.g. due to faster speech rates) lead to articulatory undershoot of targets.
As Ennever et al. (2017) state, “the shorter the duration afforded to a constriction, the less likely
full constriction will be achieved” (p. 4). Interestingly, the lenited and non-lenited variants of
Tŝilhqot’in /z/ and /zʕ/ are not significantly different in duration, and in fact the trend is in the
opposite direction from that predicted: the average durations for lenited vs. non-lenited tokens are
182 ms and 170 ms, respectively. Warner and Tucker (2011) report that, in American English,
lenition occurs more often at faster speech rates and in less formal registers. Recall that the
Tŝilhqot’in data were elicited in a word list task, and hence presumably reflect a relatively slow
speech rate and formal register. Perhaps a correlation between duration and lenition would
emerge in faster, more spontaneous speech. Given the observed variation more broadly though,
across three dimensions (manner, retraction, laterality), it seems unlikely that shortened duration
is a determining factor in the observed variation in /z/ and /zʕ/ realization.
In terms of prosodic position, intervocalic position is “widely accepted as the segmental
environment most favourable for consonantal lenition.” (Ennever et al. 2017: 5; see also
Kirchner 2004). Different scholars propose different motivations for lenition in this position.
According to Kirchner (2001, 2004), intervocalic consonant lenition occurs for articulatory
reasons, to reduce the effort required to move between vocalic (relatively open) and consonantal
(relatively closed) articulatory targets. Kingston (2008) argues that lenition occurs for perceptual
reasons, so as not to interrupt the speech stream word-internally; this helps to focus the listener’s
attention on morpheme and word boundaries (see also Katz & Pitzanti 2019). In Tŝilhqot’in, the
most favoured site for lenition, particularly so for /zʕ/, is coda position, a position that does not
require quick transitions between the target consonant and adjacent vowels (Kirchner 2001,
2004) and that would not otherwise interrupt the speech stream word-internally (Kingston 2008).
This particular pattern of /zʕ/ variation goes directly against Kingston (2008), who argues that
while lenition is common within prosodic constituents (including the word), it is often
disfavoured at their edges. The pattern for /z/ is less clear, and possibly more compatible with a
lenition analysis, in that its most common realization in intervocalic position is lenited [ʁ̞]. One
pattern that clearly does match cross-linguistic tendencies is that lenition is disfavoured in medial
onset position, for both /z/ and /zʕ/ (Escure 1977). Recall that Tŝilhqot’in /z/ and /zʕ/ do not occur
in word-initial onset position; it would be interesting to further investigate the potential
relationship between the distribution of these sounds and their lenition patterns.
17
Consonantal place of articulation is only relevant when lenition patterns hold of multiple places of articulation,
e.g., Spanish /b, d, g/. We are also unable to consider the potential effect of lexical frequency here (see Kingston
2008, Katz & Pitzanti 2019), since we do not have access to frequency data for Tŝilhqot’in.
Although we were unable to test the effect of adjacent consonant on lenition (Kingston
2008), we did test the effect of adjacent (preceding) vowel (Ennever et al. 2017) and found that,
for both /z/ and /zʕ/, lenition was favoured adjacent to distal ([a]) vowels and disfavoured
adjacent to proximal ([e], [i]) vowels. To the extent that vowel environment is reliable, this
supports Kirchner’s (2001, 2004) effort-based model of lenition.
Summarizing, while lenition is an important aspect of Tŝilhqot’in /z/ and /zʕ/ variation,
the specifics of the observed patterns do not consistently reflect ones found cross-linguistically:
although vocalic environment does seem to play a role consistent with articulatory (effort-based)
motivations for lenition, there is no correlation between (shorter) duration and lenition, and the
syllabic position that favours lenition most consistently is (final) coda position rather than
intervocalic position, especially for /zʕ/. Given that lenition is only one dimension of /z/ and /zʕ/
variation, this result is perhaps not surprising. Indeed, even if articulatory considerations could
explain the lenition aspect of /z/ and /zʕ/ variation, they could not explain the full variation
observed, which also includes laterality and retraction, both of which entail increased (rather than
reduced) articulatory complexity.
That the variation observed in Tŝilhqot’in /z/ and /zʕ/ occurs across three dimensions
(manner, retraction, and laterality) raises the question: What is the phonological specification of
these sounds, and what IPA symbol most appropriately represents them? Without direct
articulatory data, it is difficult to answer these questions. We provide some preliminary thoughts
here, as a foundation for future work. Ennever et al. (2017: 1–2) refer to Keating’s (1990)
Window Model to explain observed variation in manner: segments are specified for ‘windows’
or ‘ranges’ of targets rather than single points. This model potentially works well for explaining
variation along a single articulatory continuum, like manner (e.g., degree of lenition). It is not
clear how well it would work for variation in features like retraction and laterality.
18
To answer
this question, detailed articulatory work is needed to determine what precisely the gestures are
that create the auditory and acoustic characteristics of retraction and laterality in Tŝilhqot’in, and
whether these also vary continuously in strength. Whether target gestures are single points or
ranges, it seems clear that /z/ and /zʕ/ need to include retraction and laterality as part of their
phonetic and phonological specification; otherwise, there would be no way of explaining why
these properties turn up in their production.
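The contrast between point targets and range targets can be made concrete with a toy sketch. The code below is loosely inspired by the idea of target ‘windows’ and is not an implementation of Keating’s (1990) model: it assumes a single constriction-degree dimension scaled from 0 (fully open) to 1 (full constriction), hypothetical window bounds, and a simple rule whereby the realized value is whatever the context favours, clamped to the segment’s window.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Permitted range on one dimension (0 = fully open, 1 = full constriction)."""
    lo: float
    hi: float

    def realize(self, context_pull: float) -> float:
        """Return the value inside the window closest to what the context favours."""
        return max(self.lo, min(self.hi, context_pull))


# HYPOTHETICAL windows, for illustration only.
narrow = Window(lo=0.8, hi=0.9)  # little room for contextual variation
wide = Window(lo=0.3, hi=0.9)    # allows anything from approximant-like to fricative-like

for pull in (0.2, 0.85):  # an open (vowel-like) context vs. a close context
    print(f"context {pull}: narrow -> {narrow.realize(pull)}, wide -> {wide.realize(pull)}")
```

In this toy setting, a wide window yields gradient, context-dependent variation in manner, while a narrow window approximates a point target. It does not, by itself, say anything about how categorical-looking properties such as laterality would be represented, which is the open question raised above.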
5 Conclusion
The Tŝilhqot’in consonant inventory is incredibly rich, and has yet to be documented
phonetically in a thorough way. This study takes a first step towards this endeavour, focusing on
the acoustic characteristics of the voiced plain and pharyngealized coronal fricatives /z/ and /zʕ/.
Findings show that these sounds vary along three main dimensions: manner (fricative ~
approximant), retraction (non-retracted ~ retracted), and laterality (non-lateral ~ lateral), in
addition to buzziness, which is highly correlated with manner. Variation is based in part
on syllabic position and segmental environment. A fuller description of these sounds requires
studying their articulatory properties directly. We hope to be able to continue documenting the
Tŝilhqot’in sound system, and that this work will advance our understanding of phonetic
typology, while also contributing to pedagogical resources for teaching and learning Tŝilhqot’in
pronunciation.
18
Zawaydeh & de Jong (2011) consider Keating’s Window Model to explain gradient uvularization effects in
Ammani-Jordanian Arabic, but reject it because it is unable to capture the complexity of these effects.
Acknowledgments
This research was conducted on the lands of the Tŝilhqot’in Nation (data collection), the
Esquimalt, Songhees, and W̱SÁNEĆ Nations, and in Treaty One territory, original lands of the
Anishinaabeg, Cree, Oji-Cree, Dakota, and Dene peoples, and the homeland of the Métis Nation.
We are grateful to be able to live and work on these lands. We would like to thank the
Tŝilhqot’in speakers who welcomed us into their homes and shared their language with us. We
are also grateful to the audience members at the 2014 Phonetic Building Blocks of Speech
conference, and to the two anonymous reviewers who provided us with such thorough and
insightful feedback on our work. This work was funded by the Social Sciences and Humanities Research
Council of Canada, grant # 410-2011-224.
References
Al-Tamimi, Jalal-Eddin. 2017. Revisiting acoustic correlates of pharyngealization in Jordanian
and Moroccan Arabic: Implications for formal representations. Laboratory Phonology:
Journal of the Association for Laboratory Phonology, 8(1), 28. doi:10.5334/labphon.19
Alwan, Abeer A.-H. 1986. Acoustic and perceptual correlates of pharyngeal and uvular
consonants. M.Sc. thesis, Massachusetts Institute of Technology.
Bessell, Nicola J. 1992. Towards a phonetic and phonological typology of post-velar articulation.
Ph.D. dissertation, University of British Columbia.
Bird, Sonya. 2002. The phonetics and phonology of Lheidli intervocalic consonants.
Unpublished Ph.D. dissertation, University of Arizona.
Bird, Sonya. 2014. A phonetic investigation of Tsilhqut’in /z/. Paper presented at the Phonetic
Building Blocks of Speech conference in honour of Professor John Esling. Victoria,
Canada.
Boersma, Paul & David Weenink. 2018. Praat: doing phonetics by computer (version 6.0.43)
http://www.praat.org/ (accessed 8 September 2018).
Björsten, Sven & Olle Engstrand. 1999. Swedish “damped” /i/ and /y/: Experimental and
typological observations. Proceedings of the 14th International Congress of the Phonetic
Sciences. San Francisco.
Broś, Karolina, Marzena Żygis, Adam Sikorski & Jan Wołłejko. 2021. Phonological contrasts
and gradient effects in ongoing lenition in the Spanish of Gran Canaria. Phonology 38(1),
1–40.
Chao, Yuen-Ren. 1961. Mandarin primer: An intensive course in spoken Chinese. Oxford:
Harvard University Press.
Chiu, Chenhao & Jackson T.-S. Sun. 2020. On pharyngealized vowels in Northern Horpa: An
acoustic and ultrasound study. The Journal of the Acoustical Society of America 147,
2928. doi:10.1121/10.0001005
Clements, Nick. 1990. The role of the sonority cycle in core syllabification. In J. Kingston & M.
Beckman (eds.), Papers in Laboratory Phonology I: Between the Grammar and Physics
of Speech, 283–333. Cambridge: Cambridge University Press.
Cole, Jennifer, José I. Hualde & Khalil Iskarous. 1999. Effects of Prosodic and Segmental
Context on /g/-Lenition in Spanish. In O. Fujimura, B. D. Joseph, and B. Palek (eds.),
Proceedings of the Fourth International Linguistics and Phonetics Conference, 575–589.
Cook, Eung-Do. 1989. Chilcotin tone and verb paradigms. In Eung-Do Cook, Keren Rice (eds),
Athapaskan Linguistics: Current perspectives on a language family (Trends in
Linguistics State-of-the-Art Reports 15), 145–198. Berlin: Mouton de Gruyter.
Cook, Eung-Do. 1993. Chilcotin flattening and autosegmental phonology. Lingua 91, 149-174.
Cook, Eung-Do. 2004. A Linguistic Introduction to Tsinlhqut’in (Chilcotin). Unpublished
manuscript.
Cook, Eung-Do. 2013. A Tsilhqút’ín grammar. Vancouver: UBC Press.
Duanmu, San. 2007. The Phonology of Standard Chinese. New York: Oxford University Press.
Dunlop, Britt, Suzanne Gessner, Tracey Herbert & Aliana Parker. 2018. Report on the Status of
B.C. First Nations Languages, Third Edition. Brentwood Bay, BC: First Peoples’
Cultural Council. http://www.fpcc.ca/files/PDF/FPCC-LanguageReport-180716-
WEB.pdf (accessed 23 February 2021).
Embarki, Mohamed, Slim Ouni, Mohamed Yeou, Christian Guilleminot & Sallal Al Maqtari.
2011. Acoustic and electromagnetic articulographic study of pharyngealisation.
Coarticulatory effects as an index of stylistic and regional variation in Arabic. In Zeki
Majeed Hassan & Barry Heselwood (eds.), Instrumental Studies in Arabic Phonetics, 93–
215. Amsterdam/Philadelphia: John Benjamins.
Engstrand, O., Björsten, S., Lindblom, B., Bruce, G., Eriksson, A. 1998. Hur udda är Viby-i?
Experimentella och typologiska observationer. Folkmålsstudier 39, 83–95.
Ennever, Thomas, Felicity Meakins & Erich R. Round. 2017. A replicable acoustic measure of
lenition and the nature of variability in Gurindji stops. Laboratory Phonology: Journal of
the Association for Laboratory Phonology 8(1), 1–32. doi:10.5334/labphon.18
Escure, Geneviève. 1977. Hierarchies and phonological weakening. Lingua 43, 55–64. doi:
10.1016/0024-3841(77)90048-1
Esling, John. 2005. There are no back vowels: The Laryngeal Articulator model. Canadian
Journal of Linguistics 50, 13–44. doi: muse.jhu.edu/article/208962/pdf.
FirstVoices. 2021. Tsilhqot’in (Xeni Gwet’in).
https://www.firstvoices.com/explore/FV/sections/Data/Athabascan/Tsilhqot'in%20(Xeni
%20Gwet'in)/Tsilhqot'in%20(Xeni%20Gwet'in) (accessed 23 February 2021).
Frid, Johan, Susanne Schötz, Lars Gustafsson & Anders Löfqvist. 2015. Tongue articulation of
front close vowels in Stockholm, Gothenburg and Malmöhus Swedish. ICPhS18.
Glasgow.
Gick, Bryan, Fiona Campbell, Sunyoung Oh & Linda Tamburri-Watt. 2006. Toward universals
in the gestural organization of syllables: a cross-linguistic study of liquids. Journal of
Phonetics 34(1), 49–72. doi:10.1016/j.wocn.2005.03.005
Gordeeva, Olga B. & James M. Scobbie. 2010. Preaspiration as a correlate of word-final voice in
Scottish English fricatives. In Susanne Fuchs, Martine Toda & Marzena Zygis (eds.),
Turbulent Sounds. An Interdisciplinary Guide, 167–208. Berlin, New York: De Gruyter
Mouton.
International Phonetic Association. 2015. International Phonetic Alphabet.
https://www.internationalphoneticassociation.org/sites/default/files/IPA_Kiel_2015.pdf
(accessed 23 February 2021).
Iskarous, K, Christine Mooshammer, Phil Hoole, Daniel Recasens, Christine H. Shadle, Elliot
Saltzman & D. H. Whalen. 2013. The coarticulation/invariance scale: Mutual information
as a measure of coarticulation resistance, motor synergy, and articulatory invariance. The
Journal of the Acoustical Society of America 134, 1271. doi:10.1121/1.4812855
Jannedy, Stefanie & Melanie Weirich. 2017. Spectral moments vs discrete cosine transformation
coefficients: Evaluation of acoustic measures distinguishing two merging German
fricatives. The Journal of the Acoustical Society of America 142(1), 395–405.
doi:10.1121/1.4991347
Jassem, Wiktor. 1965. Formants of fricative consonants. Language and Speech 8: 1-16.
doi:10.1177/002383096500800101
Jongman, Allard, Ratree Wayland & Serena Wong. 2000. Acoustic Characteristics of English
Fricatives. Journal of the Acoustical Society of America 108(3), 1252–1263.
doi:10.1121/1.1288413
Karlgren, Bernhard. 1915. Études sur la phonologie chinoise. Uppsala, Sweden: KW Appelberg.
Katz, Jonah & Gianmarco Pitzanti. 2019. The phonetics and phonology of lenition: A
Campidanese Sardinian case study. Laboratory Phonology: Journal of the Association
for Laboratory Phonology 10(1), 16. doi:10.5334/labphon.184
Keating, Patricia A. 1990. The window model of coarticulation: articulatory evidence. Papers in
laboratory phonology I, 26, 451–470.
Kingston, John. 2008. Lenition. In Laura Colantoni & Jeffrey Steele (eds.), Selected proceedings
of the 3rd conference on laboratory approaches to Spanish phonology, 1–31. Somerville,
MA: Cascadilla Proceedings Project.
Kirchner, Robert. 2001. An effort approach to consonant lenition. New York: Routledge.
Kirchner, Robert. 2004. Consonant lenition. In Bruce Hayes, Robert Kirchner & Donca Steriade
(eds.), Phonetically based phonology, 313–345. Cambridge: Cambridge University Press.
Krakow, Rena A. 1993. Nonsegmental influences on velum movement patterns: Syllables,
sentences, stress, and speaking rate. In Marie A. Huffman & Rena A. Krakow (eds.),
Nasals, nasalization, and the velum (Phonetics and Phonology V), 87–116. New York:
Academic Press.
Krakow, Rena A. 1999. Physiological organization of syllables: A review. Journal of Phonetics
27, 23–54. doi:10.1006/jpho.1999.0089
Ladefoged, Peter & Ian Maddieson. 1996. The Sounds of the World’s Languages. Oxford UK &
Malden MA: Blackwell.
Laver, John. 1994. Principles of Phonetics. Cambridge: Cambridge University Press.
Lee-Kim, Sang-Im. 2014. Revisiting Mandarin ‘apical vowels’: An articulatory and acoustic
study. Journal of the International Phonetic Association 44(3), 261–282.
Namdaran, Nahal. 2006. Retraction in St’at’imcets: An ultrasonic investigation. Ph.D.
dissertation, University of British Columbia.
Nirgianaki, Elina. 2014. Acoustic characteristics of Greek fricatives. The Journal of the
Acoustical Society of America 135(5), 2964–2976. doi:10.1121/1.4870487
Nolan, Tess. 2017. A phonetic investigation of vowel variation in Lekwungen. M.A. thesis,
University of Victoria.
Ortega-Llebaria, Marta. 2004. Interplay between phonetic and inventory constraints in the degree
of spirantization of voiced stops: Comparing intervocalic /b/ and intervocalic /g/ in
Spanish and English. In Timothy L. Face (ed.), Laboratory approaches to Spanish
phonology, 237–253. Berlin: Mouton de Gruyter.
Parzen, Emanuel. 1962. On Estimation of a Probability Density Function and Mode. The Annals
of Mathematical Statistics 33(3), 1065-1076. doi:10.1214/aoms/1177704472
R Core Team. 2020. R: A language and environment for statistical computing. Vienna: R
Foundation for Statistical Computing. http://www.R-project.org/ (accessed 23 February
2021).
Recasens, Daniel & Clara Rodríguez. 2016. A study on coarticulatory resistance and
aggressiveness for front lingual consonants and vowels using ultrasound. Journal of
Phonetics 59, 58–75. doi:10.1016/j.wocn.2016.09.002
Rosenblatt, Murray. 1956. Remarks on Some Nonparametric Estimates of a Density Function.
The Annals of Mathematical Statistics 27(3), 832–837. doi:10.1214/aoms/1177728190
RStudio Team. 2020. RStudio: Integrated Development for R. https://www.rstudio.com/
(accessed 23 February 2021).
Saltzman, Elliot L. & Kevin G. Munhall. 1989. A dynamical approach to gestural patterning in
speech production. Ecological Psychology 1(4), 333–382. doi:10.1207/s15326969eco0104_2
Sanker, Chelsea, Sarah Babinski, Roslyn Burns, Marisha Evans, Juhyae Kim, Slater Smith,
Natalie Weber & Claire Bowern. 2021. (Don’t) try this at home! The effects of recording
devices and software on phonetic analysis. Language 97(4), e360–e382.
doi:10.1353/lan.2021.0075
Shadle, Christine H. & Sheila J. Mair. 1996. Quantifying spectral characteristics of fricatives.
International Conference on Spoken Language Processing 3, 1521–1524.
doi:10.1109/icslp.1996.607906
Shahin, Kimary N. 1997. Postvelar harmony: An examination of its bases and crosslinguistic
variation. Ph.D. dissertation, University of British Columbia.
Shahin, Kimary N. 2002. Postvelar harmony. Amsterdam/Philadelphia: John Benjamins.
Shao, Bowei & Rachid Ridouane. 2018. La « voyelle apicale » en chinois de Jixi:
caractéristiques acoustiques et comportement phonologique. Actes des XXXIIe Journées
d’Études sur la parole, 685–693.
Shao, Bowei & Rachid Ridouane. 2019. Apical vowel in Jixi-Hui Chinese: An articulatory study.
Proceedings of ICPhS 2019, 2358–2362.
Shar, Saeed & John Ingram. 2010. Pharyngealization in Assiri Arabic: An acoustic analysis.
Proceedings of the 2010 meeting of the Australian Speech Science and Technology
Association (ASSTA), 5–8.
Simonet, Miquel, José I. Hualde & Marianna Nadeu. 2012. Lenition of /d/ in spontaneous
Spanish and Catalan. INTERSPEECH 2012, 1416–1419.
Soli, Sigfrid D. 1981. Second formants in fricatives: Acoustic consequences of fricative-vowel
coarticulation. The Journal of the Acoustical Society of America 70(4), 976–984.
doi:10.1121/1.387032
Sproat, Richard & Osamu Fujimura. 1993. Allophonic variation in English /l/ and its
implications for phonetic implementation. Journal of Phonetics 21, 291–311.
Stevens, Kenneth N., Samuel Jay Keyser & Haruko Kawasaki. 1986. Toward a phonetic and
phonological theory of redundant features. In Joseph S. Perkell & Dennis H. Klatt (eds.),
Invariance and variability in speech processes, 426–449. Hillsdale, NJ: Lawrence
Erlbaum.
Sundara, Megha. 2005. Acoustic-phonetics of coronal stops: A cross-language study of Canadian
English and Canadian French. The Journal of the Acoustical Society of America 118(2),
1026–1037. doi:10.1121/1.1953270
Trubetzkoy, Nikolai S. 1969. Principles of phonology. Berkeley and Los Angeles, CA: University
of California Press.
Tukey, John W. 1953. The Problem of Multiple Comparisons. Unpublished paper. Princeton, NJ:
Princeton University.
van Eijk, Jan. 1997. The Lillooet Language: Phonology, Morphology, and Syntax. Vancouver
BC: UBC Press.
Warner, Natasha & Benjamin V. Tucker. 2011. Phonetic variability of stops and flaps in
spontaneous and careful speech. The Journal of the Acoustical Society of America 130(3),
1606–1617.
Westerberg, Fabienne. 2018. Rogue vowel: Acoustic variation and dynamics of Swedish Viby-i.
Poster presented at the British Association of Academic Phoneticians Colloquium. Kent,
UK.
Westerberg, Fabienne. 2019. Swedish “Viby-i”: Acoustics, articulation, and variation.
Proceedings of ICPhS 2019, 3696–3700.
Wickham, Hadley, et al. 2019. Welcome to the tidyverse. Journal of Open Source Software
4(43), 1686. doi:10.21105/joss.01686
Yu, Alan. 1999. Aerodynamic constraints on sound change: The case of syllabic sibilants. The
Journal of the Acoustical Society of America 105(2), 1096–1097. doi:10.1121/1.425139
Zawaydeh, Bushra Adnan & Kenneth de Jong. 2011. The phonetics of localizing uvularisation in
Ammani-Jordanian Arabic. In Zeki Majeed Hassan & Barry Heselwood (eds.),
Instrumental Studies in Arabic Phonetics, 257–276. Amsterdam/Philadelphia: John
Benjamins.
Zhou, Dianfu & Zongji Wu. 1963. Putonghua fayin tupu [Articulatory diagrams of Standard
Chinese]. Beijing: Shangwu yinshuguan.