A phonetic case study of Tŝilhqot’in /z/ and /zˤ/

Sonya Bird
University of Victoria
sbird@uvic.ca

Sky Onosson
University of Manitoba
sky@onosson.com

This paper provides an acoustic description of /z/ and /zˤ/ in Tŝilhqot’in (Northern Dene). These sounds are noted by Cook (1993, 2013) to show lenition and some degree of laterality in coda position. Based on recordings made in 2014 with a single, mother-tongue speaker of Tŝilhqot’in, we describe their acoustic properties and examine their distribution as a function of prosodic position and segmental environment. We find that they vary along three dimensions: manner (fricative–approximant), degree of retraction (non-retracted–retracted), and laterality (non-lateral–lateral). In addition, some tokens have a characteristic ‘buzziness’, which has been associated with the Chinese front apical vowel (Shao & Ridouane 2018, 2019) and the Swedish ‘Viby-i’ (Westerberg 2019). We argue that ‘lenition’ (Kirchner 2004, Ennever, Meakins & Round 2017) can only account for some of the observed variation and suggest that both /z/ and /zˤ/ are specified for two tongue articulations: tongue tip/blade and tongue body (Laver 1994), encompassing laterality (and concomitant retraction) in addition to the primary coronal gesture.
Journal of the International Phonetic Association, page 1 of 34. © The Author(s), 2022. Published by Cambridge University Press on behalf of the International Phonetic Association. doi:10.1017/S0025100322000093
https://doi.org/10.1017/S0025100322000093 Published online by Cambridge University Press

1 Introduction

Tŝilhqot’in¹ (ISO 639-3) is a Northern Dene language spoken in the central interior of British Columbia, Canada (see Figure 1). According to the 2018 Report on the Status of B.C. First Nations Languages, there are currently 765 fluent speakers, the largest number of speakers of any First Nations language spoken strictly within the boundaries of B.C. (Dunlop et al. 2018).

The bulk of the existing linguistic research on Tŝilhqot’in has been done by Eung-Do Cook, culminating in his 2013 A Tsilhqút’ín Grammar. While Cook’s grammar includes thorough descriptions of Tŝilhqot’in phonology, morphology, and syntax, it does not provide any details on the phonetic structures of the language. Tŝilhqot’in has an extremely rich sound inventory, including features typical of Dene languages (e.g. tone) as well as ones shared with neighbouring Salish and Wakashan languages (e.g. contrastive pharyngealization). Table 1 provides the consonant inventory, including a three-way voicing contrast in obstruents, a velar–uvular contrast (including secondary labialization), and a plain–pharyngealized contrast in coronal fricatives and affricates. Table 1 is organized phonetically, based on the International Phonetic Association’s (2015) International Phonetic Alphabet chart. However, it is important to note that Cook (2013) groups the sounds /l z zˤ j w ʁ ʁʷ/ together as ‘voiced continuants (spirants)’ (p. 15), based on phonological evidence. In van Eijk’s (1997) work on neighbouring St’át’imcets (Interior Salish), /z zˤ ʁ ʁʷ/ are grouped together with /l l’ lˤ lˤ’ j j’ w w’ ɣ ɣ’ ʕ ʕ’ ʕʷ ʕ’ʷ h/, in this case as (voiced) ‘resonants’ (p. 2).² Thus, both Cook and van Eijk recognize that voiced fricatives and (non-nasal) resonants share certain phonological properties; we return to the relevance of their classification system for the study of /z/ and /zˤ/ in Section 4.

Figure 1 (Colour online) Map of Tŝilhqot’in and surrounding First Nations territories. Original source (following Creative Commons license CC-BY-NC-SA 2.0): First Nations People of British Columbia, Ministry of Education, British Columbia, http://www.bced.gov.bc.ca/abed/map.htm.

¹ Also written/referred to as Tsilhqut’in, Tsilhqút’ín, Chilcotin, Tzilkotin, Tŝinlhqot’in, and Nenqayni Ch’ih. The spelling chosen here follows the spelling used by the Tŝilhqot’in National Government (http://www.tsilhqotin.ca/).
² Neither Cook nor van Eijk define the terms ‘continuants’, ‘spirants’, and ‘resonants’. From their descriptions of the sounds, it is clear that they both have in mind phonological classification (based on distributional properties) rather than phonetic realization.

Tŝilhqot’in also has a relatively complex vowel system: tense (long) vowels (/i a u/) contrast with lax (short) ones (/ɨ e o/), and each of these six underlying vowels has (at least) two realizations, one occurring in retracted environments, termed ‘flat’ by Cook (1993, 2013), and the other occurring elsewhere, termed ‘sharp’ by Cook (1993, 2013). The presence of non-retracted vs. retracted vowel allophones is a reliable perceptual cue to the quality (retracted or not) of adjacent consonants (see Sections 3.1.1 and 3.1.2 below). In fact, a Tŝilhqot’in mother-tongue speaker and language expert with whom co-author Bird has worked has talked about reforming the orthography, so that the diacritic used to mark pharyngealization on consonants (<ˆ>) is instead written on the vowels. At least in some dialects (Stone in particular), there is evidence for contrastive nasalization (Cook 2013: 21). Finally, Tŝilhqot’in is a tone language: vowels can have high (marked) or low (unmarked) tone.

Table 1 Tŝilhqot’in consonant inventory (adapted from Cook 2013: 15).

                     Bilabial  Alveolar      Post-alveolar  Velar       Uvular      Glottal
Plosive              b p       d t t’                       g k k’      ɢ q q’      ʔ
                                                            gʷ kʷ kʷ’   ɢʷ qʷ qʷ’
Affricate                      dz ts ts’     dʒ tʃ tʃ’
                               dzˤ tsˤ tsˤ’
Lateral affricate              dl tɬ tɬ’
Nasal                m         n
Fricative                      s z                          xʷ          χ χʷ        h
                               sˤ zˤ                                    ʁ ʁʷ
Lateral fricative              ɬ
Approximant          w                       y
Lateral approximant            l
The study described here focuses on the voiced alveolar plain and pharyngealized fricatives, /z/ and /zˤ/. Contrastive pharyngealization is a key feature of the Tŝilhqot’in sound system (see Table 1), and is also found in adjacent Interior Salish languages (Bessell 1992, Shahin 2002, Namdaran 2006). Cook (1993) provides a phonological analysis of the local and non-local assimilation (retraction) processes that are triggered by pharyngealized consonants in Tŝilhqot’in, but no detailed phonetic work has yet been done on these sounds. In St’át’imcets (Interior Salish), which is spoken to the south-east of Tŝilhqot’in (see Figure 1), pharyngealized coronal consonants are articulated with significant tongue root retraction towards the lower pharyngeal wall, pulling back both preceding and following vowels (Namdaran 2006). This results in raised F1 and lowered F2 values associated with pharyngealized consonants compared to their plain counterparts (Shahin 1997, 2002), a pattern reflective of pharyngealized sounds cross-linguistically, including consonants (Shar & Ingram 2010, Al-Tamimi 2017) and vowels (Chiu & Sun 2020).
Cook (1993) describes both /z/ and /zˤ/ as ‘spirants rather than fricatives, i.e. non-strident, especially in syllable final position, so that they are sometimes perceived mistakenly as dark l’ (p. 159). In his more recent work, Cook (2013) does not say anything particular about the phonetic realization of /z/ and /zˤ/, although he does describe a more general weakening (lenition) process of continuant consonants in coda position, i.e. a ‘stronger articulation (fricative) in initial position and weaker articulation (spirant, glide) in final position of the continuants’ (p. 44). This lenition process is unusual in that, cross-linguistically, lenition occurs most often in intervocalic position (Ennever et al. 2017, Katz & Pitzanti 2019). We return to the question of whether ‘lenition’ is the best term to use to describe the observed variation in /z/ and /zˤ/ in Section 4.2.
To our ears (as trained phoneticians), /z/ and /zˤ/ clearly have an elusive phonetic target, exhibiting substantial variation beyond stronger vs. weaker manners of articulation. In previous work on the language, co-author Bird has transcribed these sounds using a variety of
symbols: [zz4z
Dzëë
D““4L](Bird 2014). The purpose of this study is to characterize
the phonemes /z/ and /zˤ/ phonetically, to determine (a) what the possible phonetic variants of these phonemes are, and (b) to what extent these variants are systematically distributed, based on prosodic position and segmental environment. In terms of (b), our expectations are threefold: (i) Lenited realizations will be more common in coda position than elsewhere (Cook 2013: 44). Furthermore, weakening may be affected by adjacent segments, although previous studies disagree on precise effects (Kirchner 2001, 2004; Ennever et al. 2017). (ii) Retracted realizations will also be more common in coda position than elsewhere, based on cross-linguistic findings showing that, in articulatorily complex segments, tongue body articulations (as opposed to tongue tip articulations) are more prominent in coda than in onset position (Krakow 1993, Gick et al. 2006). Finally, (iii) in terms of lateralization, following Cook’s (1993) observations, we anticipate that lateralized realizations will be more common in coda position than elsewhere.
2 Method

The study reported on below came out of an elicitation session in the fall of 2013 with a single speaker who had worked closely with Eung-Do Cook in his linguistic fieldwork in the 1970s. She grew up speaking Tŝilhqot’in in the home, where her mother was a monolingual speaker. She is bilingual in English, and has continued to be involved in language work, as a linguist and as a teacher. At the time of the elicitation session, co-author Sonya Bird was in Tŝilhqot’in territory for other reasons and had the unique opportunity to work with her for an afternoon. Because we only had a very limited time together, we made do with available word lists, based primarily on Cook’s (1989, 1993, 2004) materials; this is reflected in the unevenness of the token numbers across conditions, summarized in Table 2 below. Unfortunately, we were limited to making audio recordings, and so were not able to capture articulation directly; this is clearly an area for future exploration.
We recognize that working with a single speaker and using recording materials that were not all specifically tailored for this study has implications in terms of the reliability of the patterns described below. We note that the variable realizations of /z/ and /zˤ/ in the speaker’s recordings are matched by several speakers in the FirstVoices Tŝilhqot’in (Xeni Gwet’in) language portal (FirstVoices 2021),³ providing evidence that these are robust patterns that hold across speakers of the language.
2.1 Stimuli and recording procedure

Stimuli consisted of words extracted from Cook’s (1989, 1993, 2004) materials⁴ illustrating Tŝilhqot’in sounds and complemented by materials available from a Field Methods course offered in the spring of 2006 at the University of Victoria by Dr. Leslie Saxon. Impressionistically, /z/ and /zˤ/ share phonetic properties with /ʁ/ and /l/, especially in coda position. Therefore, the set of words analyzed for this study included ones containing /z/ and /zˤ/ as well as representative tokens of /ʁ/ and /l/ for comparison. The words themselves ranged in length from one to six syllables, the majority being disyllabic, with three- and four-syllable words also being common (see Appendix A for the full word list).

Table 2 Tokens elicited, by syllabic position.

Consonant       Intervocalic  Medial onset  Medial coda  Final coda  Total
                (V[C]V)       (VC[C]V)      (V[C]CV)     (V[C]#)
/z/             40            3             9            38          90
/zˤ/            41            3             45           55          144
/ʁ/             15            6             3            12          36
/l/             6             3             11           18          38
Total elicited  102           15            68           123         308
Total analyzed  93            12            53           109         267

V = vowel; C = consonant; the target consonant is shown in square brackets.

³ FirstVoices (https://www.firstvoices.com/) is ‘an online space for Indigenous communities to share and promote language, oral culture, and linguistic history’. It houses audio recordings, dictionaries, songs, and stories from many Indigenous communities in British Columbia, Canada (as well as a few from elsewhere in Canada).

⁴ At the time of data collection, Cook’s (2013) grammar had not yet been published.
The word list was recorded in a quiet room in the home of the speaker’s sister, using a Zoom H4N portable recorder and a head-mounted microphone set at approximately 3 cm from the consultant’s mouth. The microphone was kept in a fixed position for the entire recording session, ensuring that intensity could be reliably compared across tokens (Kingston 2008: 19). For each word, the speaker was asked to check that she knew the word, and (if so) to repeat it three times in a row. Because, in some cases, the pronunciation of the target sounds (/z/ and /zˤ/ in particular) varied across repetitions, all three repetitions of each word were included in the analysis.
Table 2 summarizes the number of tokens per phoneme analyzed,⁵ organized by position. Note that the token counts are unevenly distributed and, in some cases, quite small. They also do not include word-initial onsets. For /z/ and /zˤ/, the token counts reflect the distribution of the sounds in available written materials. Cook (2013: 16) notes that his corpus does not include any word-initial /z/ or /zˤ/ tokens. Based on the materials available to us, /z/ also seems relatively infrequent in non-intervocalic, word-medial onset position; in short, /z/ occurs most often in VCV and VC# positions. The distribution of /zˤ/ is somewhat broader, including (in this dataset) a fair number of word-medial coda (VCCV) tokens as well. Note also that lexical tone was not incorporated into our study, because (i) we had no indication it might affect the realization of the target segments, and (ii) our dataset did not allow us to include it as a predictor variable, given that only seven words had a lexical high tone.⁶ /l/ and /ʁ/ did not vary much in their realization and were included for comparison only, and as such made up relatively small sets.
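The marginal totals of Table 2 can be recomputed from the per-cell counts. The following is a minimal Python sketch of that arithmetic only; it is not part of the authors' workflow, and the ASCII row labels (`zP` for the pharyngealized fricative, `gh` for the comparison fricative, after its orthographic spelling) are names assumed here for illustration.

```python
# Cell counts as read from Table 2. Column order: intervocalic (VCV),
# medial onset (VCCV), medial coda (VCCV), final coda (VC#).
# Row labels are illustrative ASCII stand-ins, not the paper's notation.
ELICITED = {
    "z":  [40, 3, 9, 38],
    "zP": [41, 3, 45, 55],   # pharyngealized counterpart of /z/
    "gh": [15, 6, 3, 12],    # comparison fricative (orthographic gh)
    "l":  [6, 3, 11, 18],
}
ANALYZED = [93, 12, 53, 109]  # "Total analyzed" row of Table 2

# Row totals: tokens elicited per consonant.
row_totals = {cons: sum(cells) for cons, cells in ELICITED.items()}

# Column totals: tokens elicited per position.
col_totals = [sum(col) for col in zip(*ELICITED.values())]

grand_total = sum(col_totals)

# Tokens elicited but not analyzed, per position.
not_analyzed = [e - a for e, a in zip(col_totals, ANALYZED)]

print(row_totals)
print(col_totals, grand_total)
print(not_analyzed, sum(not_analyzed))
```

The difference between the elicited and analyzed rows gives the number of tokens per position that did not enter the analysis.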
In 41 of the 308 elicited word tokens, the target phoneme was not present phonetically, most often in coda position (29/41). This turned out to be the case for all 15 tokens with /ʁ/ in coda position (12 final and three medial), e.g. bilogh /biluʁ/ ‘knife’ was pronounced [biloː], and for 12/45 words with /zˤ/ in medial coda position, e.g. tiẑlin /tizˤlin/ ‘Chilco Lake’, pronounced [tɛɫɛin]. In addition, /z/ was not realized in final coda position in 2/38 words. Note that in such cases of deletion, underlying consonantal retraction was generally still evident in the adjacent vowels, e.g. in /biluʁ/, /u/ is retracted to [o] (see Cook 2013: 24 example c.); in /tizˤlin/, both /i/ vowels are retracted, as is /l/.⁷ Deletion in coda position is mentioned in Cook (2013: 44) as a process affecting all continuant consonants; it was therefore not surprising to observe in this data set. In addition, target phonemes were not realized in 9/18 words with /ʁ/ in intervocalic position and 3/37 words with /zˤ/ in intervocalic position. All medial onsets (VCCV) were realized phonetically.

⁵ Note: CC sequences in our dataset are composed of a coda consonant + a (different) onset consonant; there are no geminate consonants in Tŝilhqot’in.

⁶ Tŝilhqot’in tone is highly complex, including lexical tones as well as tones (high tone in particular) derived from various phonological processes (see Cook 2013: Section 1.5). The tonal system is not yet well understood, and is not fully marked in Cook’s transcriptions, which formed the basis of our analysis.

⁷ See Cook (1993) for an analysis of Tŝilhqot’in retraction effects.
2.2 Data analysis

Each token was analyzed qualitatively (Section 2.2.1) and quantitatively (Section 2.2.2), with a primary focus on the target consonant itself. As mentioned above, vowels provide robust information about the quality (retracted vs. non-retracted) of adjacent consonants. As such, we did include vowels adjacent to the target consonants in our analysis. However, small sample sizes by vowel meant that we were not able to conduct reliable statistical analyses on them.
2.2.1 Qualitative analysis

Qualitatively, tokens were categorized in terms of (i) manner of articulation (two levels: non-lenited vs. lenited), based on the presence of visible frication and/or formant structure (Lee-Kim 2014, Shao & Ridouane 2018, Katz & Pitzanti 2019); (ii) retraction (two levels: non-retracted vs. retracted), based on the quality of the target consonant and the adjacent vowels (observed auditorily and visually); and (iii) laterality (two levels: non-lateral vs. lateral), based primarily on auditory observation. Katz & Pitzanti (2019: 11) have pointed out the limitations of using subjective judgments to ‘force a binary classification onto continuous phonetic properties’. Given the complexity of the observed variation in /z/ and /zˤ/, it seemed nonetheless likely that these discrete categorizations would be useful in describing /z/ and /zˤ/ realizations and their distribution. Note that we use the general term ‘retraction’ to describe perceived backing of the tongue body (in both /z/ and /zˤ/), without specifying precisely where the articulatory target of this backing is (raised vs. lowered).
Coding was done using Praat textgrids (Boersma & Weenink 2018); six tiers were used, to mark segments (both target consonants and adjacent vowels) of interest, token number, underlying consonant and position, phonetic realization of manner specifically, phonetic realization more generally (via phonetic transcription) of consonant, adjacent vowel quality (sharp vs. flat), and phonemic transcription of adjacent vowel. We recognize that there are certain limitations to our auditory coding, and consequently to the acoustic analyses that are based on this auditory coding (Section 3.1.2). First, although we are trained phoneticians, we are not speakers of Tŝilhqot’in, and the auditory cues we used to classify sounds may not correspond exactly to those that fluent speakers would use; future work should include a perceptual study with fluent speakers, especially to disentangle realizations coded as having ambiguous (to us) place features. Second, many of the realizations were ambiguous, varying along dimensions (e.g. of lenition) in continuous ways, making auditory judgments challenging.⁸ To increase the reliability of our analyses, we coded the data in a two-step process. Initial auditory coding and transcription were carried out by co-author Bird. Subsequently, co-author Onosson independently transcribed the entire dataset, using a Praat textgrid that did not specify the underlying phonemes of the target sounds so as to minimize potential bias towards any particular phonetic realizations. We initially agreed on the transcriptions of 223 out of 267 tokens (an inter-rater agreement rate of 83.5%), both having noted a number of individual tokens of uncertain quality. A consensus of opinion was reached on several of these, bringing the number of agreed-upon tokens to 230, for an overall inter-rater agreement rate of 86.1%. For those tokens which were not fully agreed-upon (37 of 267, or 13.9%), co-author Bird’s auditory coding was used in the quantitative analysis, given her more extensive experience listening to these sounds and the language more generally.

⁸ We note that one reviewer disagreed with some of our transcriptions, even though we, the co-authors, agreed on them. We suspect this may have been partly to do with the specific play-back devices used to listen to the audio files, as documented in Sanker et al. (2021).
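The agreement rates reported above are simple proportions of agreed-upon tokens over all analyzed tokens. A minimal Python sketch of the arithmetic follows; the function name `agreement_rate` is a hypothetical helper introduced here, and the calculation is plain percent agreement rather than any chance-corrected statistic.

```python
def agreement_rate(agreed, total):
    """Percent agreement: share of tokens on which both raters agree."""
    return 100.0 * agreed / total


TOTAL_TOKENS = 267       # tokens analyzed
INITIAL_AGREED = 223     # agreed on after independent transcription
FINAL_AGREED = 230       # agreed on after consensus discussion

initial_rate = agreement_rate(INITIAL_AGREED, TOTAL_TOKENS)
final_rate = agreement_rate(FINAL_AGREED, TOTAL_TOKENS)
unresolved = TOTAL_TOKENS - FINAL_AGREED
unresolved_rate = agreement_rate(unresolved, TOTAL_TOKENS)

print(f"initial: {initial_rate:.1f}%")
print(f"final: {final_rate:.1f}%")
print(f"unresolved: {unresolved} tokens ({unresolved_rate:.1f}%)")
```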
2.2.2 Quantitative analysis
Quantitative analysis focused on several measurements automatically extracted using a Praat
script. Within the target consonants themselves, we measured duration, mean intensity and
band-pass filtered zero crossing rate (bp-zcr) as correlates of lenition, as well as spectral
moments 1–4 and mean F1 & F2 (Hz) as correlates of retraction and lateralization.
Mean intensity (dB) across the duration of the target consonants was extracted from the Intensity object created from the Sound object using Praat’s To Intensity... command. Bp-zcr has been used as an alternative to harmonics-to-noise ratio (HNR) to quantify noisiness in a signal without reference to periodicity (Gordeeva & Scobbie 2010, Westerberg 2018); the higher the bp-zcr value, the noisier (i.e. less lenited) the sound is. Bp-zcr was measured in Praat following Westerberg (2018), separately within each one-third of target consonant duration, by dividing the number of zero crossings (taken from a PointProcess ‘zeros’ object, set by default to include both ‘raisers’ and ‘fallers’) by a third of the token duration. A mean per-token bp-zcr was then calculated in R by averaging across the three one-third rates. Spectral moments (centre of gravity, standard deviation, skewness and kurtosis) were included in our analysis, as measures of fricative place of articulation (Jongman, Wayland & Wong 2000). They were measured within a 30 ms Hamming window from the centre of each token, band-pass filtered between 200 Hz and 22,050 Hz⁹ and extracted at a power setting of 2.0 (the default). In addition to spectral moments, we also measured the mean first and second formants within the consonants themselves (Jassem 1965, Soli 1981, Alwan 1986, Jongman et al. 2000) because, even for realizations coded as (voiced) fricatives, formant structure was often visible within the consonant. F1 and F2 were calculated using the Burg method and the following settings: five formants, ceiling of 5500 Hz, 25 ms window length, and 50 Hz pre-emphasis.
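To make the two less familiar measures concrete, the sketch below computes a zero-crossing rate averaged over three equal thirds of a signal, and the four spectral moments of a magnitude spectrum weighted at a power of 2.0. It is a plain Python illustration under stated assumptions, not the authors' Praat script: the input is assumed to be already band-pass filtered, the function names are hypothetical, kurtosis is reported as excess kurtosis (fourth standardized moment minus 3), and the test signal and spectrum are invented.

```python
import math


def zero_crossings(samples):
    """Count sign changes between adjacent samples (rising and falling)."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def mean_zcr_by_thirds(samples, sample_rate):
    """Mean zero-crossing rate (crossings per second) over three thirds.

    Each third's crossing count is divided by a third of the total
    duration; the three rates are then averaged.
    """
    n = len(samples)
    third_duration = (n / sample_rate) / 3.0
    bounds = [0, n // 3, (2 * n) // 3, n]
    rates = [
        zero_crossings(samples[start:end]) / third_duration
        for start, end in zip(bounds, bounds[1:])
    ]
    return sum(rates) / len(rates)


def spectral_moments(freqs, magnitudes, power=2.0):
    """Centre of gravity, SD, skewness and excess kurtosis of a spectrum."""
    weights = [m ** power for m in magnitudes]
    total = sum(weights)
    cog = sum(f * w for f, w in zip(freqs, weights)) / total

    def central(k):
        # k-th central moment of frequency, weighted by spectral power.
        return sum(((f - cog) ** k) * w for f, w in zip(freqs, weights)) / total

    sd = math.sqrt(central(2))
    skewness = central(3) / sd ** 3
    kurtosis = central(4) / sd ** 4 - 3.0
    return cog, sd, skewness, kurtosis


# Invented input: 0.3 s of a 100 Hz sinusoid sampled at 8000 Hz.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 100 * k / SAMPLE_RATE + 0.1) for k in range(2400)]
zcr = mean_zcr_by_thirds(signal, SAMPLE_RATE)

# Invented three-bin magnitude spectrum, symmetric around 200 Hz.
cog, sd, skewness, kurtosis = spectral_moments([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
print(zcr, cog, sd, skewness, kurtosis)
```

A sinusoid crosses zero twice per cycle, so its rate is about twice its frequency; a noisier signal crosses far more often, which is why a higher rate is read as less lenition.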
In addition to measuring various acoustic properties within the target consonants themselves, we also measured F1 and F2 (using the same parameters as for the consonants) within the vowels preceding and following /z/ and /zˤ/ since, as mentioned above, vowel quality is a robust and reliable perceptual cue of retraction, or at least phonological pharyngealization, in Tŝilhqot’in. We split the vowels into three equal thirds (beginning, middle, end), and measured mean F1 and F2 in each third. We referred to formants in the first third when describing vowels following target consonants, and to the last third when describing vowels preceding target consonants.
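The choice of vowel third can be stated compactly: a vowel that precedes the target consonant is summarized by its last third, and a vowel that follows it by its first third. The Python sketch below illustrates this selection on an invented F2 track; the function names and the values are assumptions for illustration, and the track is taken to be sampled at equal time steps with at least three measurement points.

```python
def mean_in_third(track, which):
    """Mean of a formant track (Hz) within its first, middle or last third."""
    n = len(track)
    bounds = [0, n // 3, (2 * n) // 3, n]
    index = {"first": 0, "middle": 1, "last": 2}[which]
    chunk = track[bounds[index]:bounds[index + 1]]
    return sum(chunk) / len(chunk)


def adjacent_third_mean(track, vowel_position):
    """Mean formant value in the third of the vowel next to the consonant.

    vowel_position is "preceding" (vowel before the target consonant,
    so use the last third) or "following" (use the first third).
    """
    which = "last" if vowel_position == "preceding" else "first"
    return mean_in_third(track, which)


# Invented F2 track (Hz) across one vowel, nine equally spaced points.
f2_track = [1500, 1480, 1460, 1400, 1350, 1300, 1200, 1150, 1100]

f2_before_consonant = adjacent_third_mean(f2_track, "preceding")
f2_after_consonant = adjacent_third_mean(f2_track, "following")
print(f2_before_consonant, f2_after_consonant)
```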
⁹ We surveyed the literature on the use of spectral moments in the acoustic analysis of consonants and found substantial variation in the choice of audio filtering. Such choices are not without consequence, however. Shadle & Mair (1996) conducted a comparison of several filtering methods based on previous studies and determined that ‘the values of the moments were strongly affected by the frequency range used’ (p. 1522). Nonetheless, our study does not focus on providing these measurements for comparison with other research, but rather to make internal cross-comparisons. Our high-pass value of 200 Hz is meant to filter out voicing in both /z/ and /zˤ/. This matches the value selected by Shadle & Mair as well as Sundara (2005) for the same purpose, although other studies have utilized other cut-off values such as 500 Hz (Jannedy & Weirich 2017) or even as high as 1000 Hz (Nirgianaki 2014). Because we were dealing with retracted realizations in many cases, we elected to use a lower threshold so as to preserve much of the formant structure across a range of places of articulation.

2.2.3 Statistical analysis

All statistical analyses of acoustic features and distributional properties were conducted in R (R Core Team 2020) running in RStudio (RStudio Team 2020) and used several sub-components of the tidyverse R package library (Wickham et al. 2019) as well as the stats package from the base R library for specific statistical functions. Statistical analyses were carried out using the following formulas. Chi-square tests of distributions: chisq.test(table(variable1, variable2)). ANOVAs: aov(dependent.variable ~ independent.variable (* ind.var2 * ind.var3)). Post-hoc testing for significant interactions within multivariate ANOVAs was carried out using Tukey’s Honest Significant Difference (Tukey 1953): TukeyHSD(anova.test.result). Statistical significance was determined for all tests at p < .05; we omit specific p-values when reporting results generally, except where discussing findings which fail to meet this threshold. Qualitative or categorical variables which were tested include the following factors: phoneme (two levels: /z zˤ/), preceding or following vowel (four levels: /a e i u/), preceding or following consonant (four levels: /ʁ l z zˤ/), manner (two levels: lenited, non-lenited), retraction (two levels: retracted, non-retracted), laterality (two levels: lateral, non-lateral). Quantitative or continuous variables which were tested included the following factors: duration, intensity, band-passed zero-crossing rate (bp-zcr), F1, F2, centre of gravity, standard deviation, skewness, and kurtosis.
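For readers less familiar with the chi-square test of a cross-tabulation, the sketch below works through the statistic by hand in Python rather than R. It builds a contingency table from two parallel lists of category labels, in the spirit of table(variable1, variable2), and computes the Pearson statistic and its degrees of freedom. The counts are invented for illustration and are not taken from the study; the function names are hypothetical; no p-value is computed, only a comparison against the standard .05 critical value for two degrees of freedom (5.991). Note that R's chisq.test applies a continuity correction by default to 2 × 2 tables, which this sketch does not implement.

```python
from collections import Counter


def contingency_table(var1, var2):
    """Cross-tabulate two parallel lists of category labels."""
    rows = sorted(set(var1))
    cols = sorted(set(var2))
    counts = Counter(zip(var1, var2))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Invented data: manner coding by syllabic position, one entry per token.
manner = (["lenited"] * 10 + ["non-lenited"] * 20      # intervocalic
          + ["lenited"] * 20 + ["non-lenited"] * 20    # medial coda
          + ["lenited"] * 30 + ["non-lenited"] * 20)   # final coda
position = (["intervocalic"] * 30 + ["medial coda"] * 40 + ["final coda"] * 50)

rows, cols, table = contingency_table(manner, position)
statistic, df = chi_square(table)

CRITICAL_05_DF2 = 5.991  # chi-square critical value at p = .05, df = 2
print(rows, cols, table)
print(statistic, df, statistic > CRITICAL_05_DF2)
```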
3 Results

In general, /z/ and /zˤ/ vary substantially from token to token and even from repetition to repetition within a given token, giving the impression of a somewhat underspecified articulatory and/or acoustic target. Nonetheless, certain patterns do emerge, pointing to syllabic (onset vs. coda) and segmental (adjacent segment) effects on phonetic realization. Results are presented in two parts: first, we describe the different phonetic realizations of /z/ and /zˤ/ with accompanying acoustic analysis (Section 3.1); second, we analyze the distribution of the different phonetic realizations, according to syllabic position and segmental environment (Section 3.2).
3.1 Phonetic realizations of /z/ and /zˤ/

In this section, we describe the acoustic and auditory variants of /z/ and /zˤ/. In Section 3.1.1, we compare the two phonemes to each other, to get a general sense of how /z/ and /zˤ/ differ phonetically. In Section 3.1.2, we explore surface realizations of both phonemes in more detail, in terms of variation in lenition, retraction, and lateralization as well as ‘buzziness’ (likely corresponding to dentalization; see below).
3.1.1 Acoustic correlates of /z/ vs. /zˤ/

The phonetic realizations of /z/ and /zˤ/ included relatively similar ranges of variation in manner, retraction, and laterality. Before exploring this variation in more detail, it is useful to compare the overall phonetic features of the two phonemes.
As we shall see in Section 3.2, /z/ vs. /zˤ/ differ somewhat in the frequency of different realizations (see Tables 3 and 4). Nonetheless, we hypothesized that, overall, /zˤ/ realizations would exhibit acoustic measures reflective of pharyngealization, in particular raised F1 and lowered F2 both within /zˤ/ and in adjacent vowels (Shahin 2002, Namdaran 2006). A set of one-way ANOVAs (phoneme) was conducted across our suite of acoustic parameters to look for differences by phoneme (appendix Table B1). Acoustic correlates which distinguish between /z/ vs. /zˤ/ vary depending on the realization, and often do not reach the level of statistical significance, most likely due to low token counts in our data for certain realizations. Nevertheless, the trends are consistent: realizations deriving from underlying /zˤ/ tend to have longer duration, lower intensity, lower F2, and greater skewness, in comparison to realizations deriving from underlying /z/. Figure 2 plots kernel density estimate distributions (Rosenblatt 1956, Parzen 1962) of the various acoustic measures by phoneme. Those correlates which meet the level of statistical significance across all realizations are intensity (–1.6 dB for /zˤ/), F2 (–303 Hz for /zˤ/), cog (+228 for /z/), and standard deviation (+225 for /z/).

Figure 2 (Colour online) Distributions of acoustic parameters across tokens of /z/ and /zˤ/. Panels show density by phoneme for duration (ms), intensity (dB), bp-zcr (log), F1 (Hz), F2 (Hz), centre of gravity (log Hz), standard deviation (log Hz), skewness, and kurtosis (log).
Figure 3 provides kernel density estimate plots of /z/ and /zˤ/ distributions by mean F1 (y-axis) and F2 (x-axis), as measured within the consonants. Overall, Figure 3 shows that the two consonants differ substantially along F2 but much less so along F1; this is also the case for the vowels adjacent to /z/ and /zˤ/ (Figure 4), similar to Zawaydeh & de Jong’s (2011) findings in Ammani-Jordanian Arabic. This indicates that, unlike in other languages (see Al-Tamimi 2017 on Arabic, and Shahin 1997 on St’át’imcets), what is termed ‘pharyngealization’ in Tŝilhqot’in is manifested primarily as backing, without substantial lowering.
As predicted based on Cook (2013) and also described for Arabic (Laver 1994, Embarki et al. 2011), adjacent vowels also reliably correlate with phonemic pharyngealization. For example, the final /a/ of teẑilhchaz /tezˤiɬt͡ʃaz/ ‘I started to fry it’ is realized as [æ] (with a relatively high F2) whereas in telhant’aẑ /teɬant’azˤ/ ‘crowberry’ it is realized as [ɑ] (with a relatively low F2).¹⁰ Figure 4 provides the acoustic vowel spaces before /z/ vs. /zˤ/ (a) and after /z/ vs. /zˤ/ (b). This plot is based on F1 and F2 measurements averaged over the third of the vowel that is adjacent to the target consonant, i.e. the last third for vowels preceding /z/ and /zˤ/ and the first third for vowels occurring after /z/ and /zˤ/. Although there are many more vowel tokens preceding than following /z/ and /zˤ/, the general pattern is the same: the vowel space is substantially further back (lower F2) and slightly lower down (higher F1) adjacent to pharyngealized /zˤ/ compared to its non-pharyngealized counterpart /z/. These results mirror those represented for the consonants themselves, in Figure 3.
To determine statistical significance of formant differences, two-way ANOVAs
(phoneme, retraction) were conducted for each formant per vowel according to the preceding
(appendix Table B2) or following (appendix Table B3) phoneme (for /u/, its scarcity in
our dataset meant that we were only able to make comparisons when preceding but not
following /z/ vs. /zʕ/). For vowels following the target consonant, the main differences occur in
F2: /a/, /e/, and /i/ all have higher F2 values following /z/ than /zʕ/. The only significant F1
difference is for /e/, with F1 being lower following /z/ than /zʕ/. For vowels preceding the
target consonant, only /a/ has a significantly higher F2 before /z/ than /zʕ/; only /i/ shows a
10 Audio recordings of illustrative tokens are available upon request, by contacting co-author Bird.
https://doi.org/10.1017/S0025100322000093 Published online by Cambridge University Press
Figure 3 (Colour online) Mean F1 and F2 of /z/ vs. /zʕ/. [Kernel density plot omitted; x-axis: F2 (Hz); y-axis: F1 (Hz); shading: density; legend: phoneme.]
Figure 4 (Colour online) Vowel formants before /z/ vs. /zʕ/ (a) and after /z/ vs. /zʕ/ (b). Formants plotted are mean values in
adjacent third of the vowel. Ellipses show 95% confidence intervals (omitted where low token count prohibits calculation).
[Plots omitted; x-axis: F2 (Hz); y-axis: F1 (Hz); legend: vowel (a, e, i, u).]
significant F1 difference, but in the opposite direction from that expected: F1 is higher before /z/
than /zʕ/. Although statistical analysis is limited in reliability because of small sample sizes,
the trends for F2 are consistent with the consonant-internal measurements (Figure 3). They
also suggest that retraction effects are stronger on vowels which follow consonants compared
to vowels preceding consonants, which is somewhat surprising given documented phonetic
effects in other languages (Nolan 2017) and the phonological effects of retraction described
for Tŝilhqot’in and more generally (Cook 1993, Zawaydeh & de Jong 2011).
As we shall see below, retraction is one component of the phonetic variation observed
in both /z/ and /zʕ/, in addition to being contrastive at the phonological level in the form of
pharyngealization. One of the most interesting things about the Tŝilhqot’in patterns described
here is this blurred role of retraction in the sound system.
3.1.2 Phonetic variation across /z/ and /zʕ/
In analyzing /z/ and /zʕ/ auditorily, we noted variation along three dimensions: degree of
lenition, degree of retraction, and lateralization. In this section, we explore the acoustic correlates
of each of these dimensions in turn. The results are complex as a result of many interacting
factors, and not all tendencies reach statistical significance. We report here on clear
tendencies; full statistical analyses are provided in appendix Tables B4–B8.
Considering lenition first, both /z/ and /zʕ/ vary in how lenited their realizations are, from
clear fricatives to barely present approximants. We hypothesized that lenited forms should
differ from non-lenited forms acoustically in terms of intensity and bp-zcr in particular (see
Tables B6–B8). Figure 5 provides a density comparison of bp-zcr, which showed the larger
effect size, in /z/ and /zʕ/ tokens coded as lenited vs. non-lenited. The distribution of bp-zcr
values is much flatter for non-lenited (dashed lines) than for lenited tokens and, crucially, the
mean is much higher, as expected: 3787 for non-lenited vs. 1337 for lenited.
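To make the manner measure more concrete, the sketch below counts zero crossings per second in a signal. It assumes that bp-zcr denotes a zero-crossing rate computed on a band-pass filtered signal; the filtering step is not shown, and the function name, sampling rate, and synthetic test tone are hypothetical rather than taken from the authors' procedure.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return sign changes per second in an (already band-pass filtered) signal."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev >= 0) != (cur >= 0)
    )
    duration = (len(samples) - 1) / sample_rate  # seconds spanned by the samples
    return crossings / duration


# Synthetic check: one second of a 1000 Hz sine wave.
# The small phase offset keeps every sample away from exactly zero.
SAMPLE_RATE = 16000
tone = [
    math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE + 0.1)
    for n in range(SAMPLE_RATE + 1)
]
tone_zcr = zero_crossing_rate(tone, SAMPLE_RATE)  # a pure tone crosses zero twice per cycle
```

Noisier, more fricative-like signals change sign more often than smooth, periodic ones, which is consistent with the higher values reported for non-lenited tokens.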
Figure 5 (Colour online) Bp-zcr in /z/ and /zʕ/ tokens coded as lenited vs. non-lenited. [Density plots omitted; one panel per phoneme; x-axis: Bp-zcr (log); legend: manner (lenited, non-lenited).]
Turning to retraction, we observed during auditory coding that it tended to coincide
with lenition. We tested the correlation between the two, finding that they are indeed highly
correlated (χ² = 130.85, df = 1), such that lenited realizations also (strongly) tend to be
retracted (of the 167 tokens coded as retracted, 153 or 92% were also coded as lenited).
We expected retraction to be reflected in both consonantal and vocalic measures of place
(spectral moments and formants). No significant independent effect of retraction was found
on the consonantal measures (see Table B6). Retraction did tend to have an effect on the
quality of adjacent vowels, although small token numbers per vowel made it difficult to
reliably test this effect statistically (see Tables B2, B3). Figure 6 shows vowel formants plotted
according to the following phoneme (/z/ or /zʕ/, the condition for which we have the most
Figure 6 (Colour online) Vowel formants before /z/ vs. /zʕ/ coded as non-retracted (a) or retracted (b). Formants plotted are
mean values in adjacent third of the vowel. Ellipses show 95% confidence intervals (omitted where low token count
prohibits calculation). [Plots omitted; x-axis: F2 (Hz); y-axis: F1 (Hz); legend: vowel (a, e, i, u).]
data) and for retraction (as coded auditorily). In general, F2 values largely match our
expectations, tending to be lower adjacent to retracted versus non-retracted tokens (this effect was
only significant in the case of /i/ occurring after a retracted consonant). For F1, values are
lower for /e/ and higher for /i/ following retracted consonants, and lower for /u/ preceding
retracted consonants. These preliminary F1 results indicate that the retraction we hear on
both /z/ and /zʕ/ is possibly more accurately described as UVULARIZATION (see Zawaydeh &
de Jong 2011 on Ammani-Jordanian Arabic), involving backing and slight raising rather than
lowering. This would explain both lowering of F1 in /e/ vs. raising in /i/, as well as lowering
of F1 in /u/, which is realized as [o] in retracted environments (Cook 2013).
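The association between retraction and lenition reported earlier in this section (χ² = 130.85, df = 1) is the kind of result produced by a Pearson chi-square test on a 2 × 2 table of token counts. The sketch below shows that computation under stated assumptions: the function name is ours, and the counts in the example table are hypothetical, since the full cross-tabulation is not given in the text. Only the 153-of-167 proportion is taken from the text.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts (rows: retracted / non-retracted; columns: lenited / non-lenited).
hypothetical_counts = [[10, 20], [20, 10]]
statistic, df = pearson_chi_square(hypothetical_counts)

# Proportion reported in the text: 153 of the 167 retracted tokens were lenited.
share_lenited = round(100 * 153 / 167)
```

The same function applies to the 2 × 4 tables implied by the prosodic-position tests later in the paper, where two outcome categories crossed with four positions give df = 3.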
The observed variation in retraction and lenition was such that, in some cases, both /z/
and /zʕ/ were realized as retracted and lenited [“4], which is also a common realization of
underlying //. In Figure 7, we compare what was transcribed phonetically as [“4] from
intervocalic /zʕ/ vs. underlying //, in the word teẑighin /tezDi“in/ ‘I started to pack or haul it’. The
realization of both /zʕ/ and // is a short approximant [“4] (/zʕ/: 55 ms; //: 48 ms), although
underlying /zʕ/ is somewhat more lenited than underlying // in terms of intensity (69 dB vs.
65 dB) and formant structure (clearer in /zʕ/ than in //). The vowel between /zʕ/ and // is
underlyingly /i/; its transitional nature both out of the preceding /zʕ/ and into the following
// indicates retraction of both flanking consonants.
Given the apparent neutralization of the phonemic contrast between /zʕ/ and //
(Figure 7), and also /z/, it is worth examining the phonetic realization of these three
Figure 7 Intervocalic [4] from /zʕ/ vs. intervocalic [4] from // in teẑighin /tezDiin/ ‘I started to pack or haul it’.
Figure 8 (Colour online) [4] F2 by bp-zcr according to phoneme: /z/, /zʕ/ and //. [Plot omitted; x-axis: F2 (Hz); y-axis: Bp-zcr; legend: phoneme.]
phonemes, to see whether they are still distinguishable from one another acoustically.
Figure 8 plots F2 (x-axis, as the most reliable correlate of place) and bp-zcr (y-axis, as the
most reliable correlate of manner) by underlying phoneme for surface realizations of [“4].
Phonemic // mainly varies in bp-zcr over a limited range of F2 values, meaning it varies a
fair amount in manner (degree of lenition), but is relatively consistent in place. Conversely,
[“4] realizations of /z/ and /zʕ/ vary mainly in F2 over a limited range of bp-zcr values,
meaning they vary in place (degree of retraction), with /z/ being somewhat more forward than /zʕ/
overall, but not much in manner (degree of lenition). We used one-way ANOVAs (phoneme)
to test for significant differences among [“4] realizations corresponding to different phonemes
(see Table B4). F2 was found to differ significantly by phoneme, but only between /z/ and //,
with a mean difference of +209 Hz for /z/. For bp-zcr, // differed significantly from both
/z/ (+782) and /zʕ/ (+682), // exhibiting less lenition than /z/ and /zʕ/; /z/ and /zʕ/ did not
Figure 9 Dark [ë] in séla ninq’ez /se@la ninq'ez/ ‘my hands are cold’.
differ from each other in bp-zcr. Other significantly different measures among [“4]
realizations included intensity (lower for //, no significant difference between /z/ vs. /zʕ/), centre
of gravity (higher only for // vs. /z/), and standard deviation (higher only for // vs. /z/).
Laterality is the third dimension of variation observed in the dataset. Especially in coda
position (see Section 3.2), many tokens of both /z/ and /zʕ/ were coded as dark [ë]. Figure 9
provides a comparison of /l/ and /z/ in séla ninq’ez /se@la ninq'ez/ ‘my hands are cold’, both
of which are realized as lateral approximants. Lateralized realizations of /z/ and /zʕ/ typically
sounded more retracted than underlying /l/. In Figure 9, this is reflected by the formant values
of /l/ vs. /z/, especially F2 (1296 Hz for /z/ vs. 1800 Hz for /l/). In addition, /z/ is preceded
by the retracted allophone of /e/: []. Statistical analysis shows that tokens coded as lateral
generally have longer durations than ones coded as non-lateral, in addition to having lower
intensity and higher skewness values (Table B7).
Similar to the potential neutralization of /z/, /zʕ/ and // resulting from combined effects
of lenition and retraction (Figures 7 and 8), lateralization potentially leads to neutralization of
/z/, /zʕ/, and underlying /l/. Figure 10 plots F2 (x-axis; as the most reliable correlate of place)
and bp-zcr (y-axis; as the most reliable correlate of manner), by underlying phoneme.
One-way ANOVAs (phoneme) show that surface lateral realizations are distinguished by several
acoustic parameters (Table B5), including F2 (higher in /z/ vs. /l/ and /zʕ/, no significant
difference between /l/ and /zʕ/) and bp-zcr (lower in /l/ vs. /z/ and /zʕ/, no significant difference
between /z/ and /zʕ/). The F2 results are surprising, since our perception was that /l/ is realized
as a lighter lateral than both /z/ and /zʕ/ (see Figure 9). Bp-zcr results reflect the fact that /l/
is a true (and consistent) approximant, whereas /z/ and /zʕ/ are more variable in manner, even
when coded as [ë].
In addition to the three main dimensions of lenition, retraction, and lateralization, we
observed what we called ‘buzziness’ in some of the /z/ and /zʕ/ tokens (39/88 and 43/129,
respectively). Unfortunately, it was not possible to video record the elicitation session, but
in discussing the articulatory details of /z/ and /zʕ/ with the speaker, she confirmed that she
clenched her jaw during these sounds, as her mother had taught her. She also mentioned
that, when she taught these sounds to children, she showed them her teeth and told them to
keep their teeth closed. This articulatory tension sometimes leads to secondary dentalization
superimposed on the primary articulation, resulting in the auditory impression of buzziness
(see also Zhou & Wu 1963 and others on the Chinese apical vowel). Acoustically, the buzzy
nature of some [z] and [zD] tokens is reflected in their spectral composition: it is not quite
Figure 10 (Colour online) [l] F2 by bp-zcr according to phoneme: /z/, /zʕ/ and /l/. [Plot omitted; x-axis: F2 (Hz); y-axis: Bp-zcr; legend: phoneme.]
as ‘clean’ as that of canonical /z/, with more noise throughout the frequency ranges, and
especially below 4500 Hz.
In our data, buzziness was strongly correlated with manner (χ² = 62.809, df = 1), with
61% of buzzy tokens also coded as fricatives and 89% of non-buzzy tokens coded as
approximants. Given the speaker’s description of /z/ and /zʕ/, it is not surprising that buzziness
occurred primarily as a secondary effect of jaw clenching specifically in the more closed
(fricative) realizations of the phonemes. Statistically, we did not find a reliable independent
effect of buzziness on bp-zcr, as expected based on previous literature (Gordeeva & Scobbie
2010, Westerberg 2019). We did however find interactions between buzziness and other
factors (Table B8). In terms of manner, buzzy tokens had lower cog values than non-buzzy
tokens within non-lenited tokens specifically.
We end this section by noting that we observed a number of tokens that were ambiguous
and/or transitional in their realization. With respect to manner, several word-final tokens
transitioned from a relatively open (approximant) to a relatively closed (fricative) sound, e.g. the
final /z/ in jíz /dÉZçz/ ‘inside’ was realized as [É4“]. Note that this pattern of realization is
opposite to what has been described for Chinese apical vowels (discussed in Section 4), which go
from more to less constricted (Shao & Ridouane 2018, 2019). With respect to place,
realizations were also especially variable in coda position, with token(s) coded as [D], [zD], [l]
(light), [“4L], [¯L], and [“4¯L], [L“4]. Such realizations point to the fact that /z/ and
/zʕ/ varied not only along three dimensions, but also along continua within these dimensions,
and especially in coda position.
Summarizing so far, /z/ and /zʕ/ differ from one another in intensity (likely because of
lenition patterns) as well as in acoustic features associated with place, with /zʕ/ showing
lower F2, cog, and standard deviation values in particular, indicating tongue body/root
backing. For both consonants, lenited tokens were associated with higher intensity and lower
bp-zcr (correlates of manner), as well as lower F2, centre of gravity and standard deviation,
and higher skewness and kurtosis (correlates of place, reflecting the strong correlation between
retraction and lenition). No significant effects of retraction were found within the
consonants themselves, but adjacent vowel formants (F2 in particular) showed that retraction was
associated with tongue body/root backing (lowered F2). This preliminary finding supports the
fluent speakers’ perceptions (see Section 2) that Tŝilhqot’in retraction is carried on the vowels
rather than the consonants. Lateralization of /z/ and /zʕ/ was associated with longer duration,
lower intensity, and higher skewness. Finally, buzzy tokens were significantly correlated with
non-lenited forms, and were associated with lower cog values than non-buzzy tokens.
3.2 Distribution of /z/ and /zʕ/ variants
Now that the phonetic variants of /z/ and /zʕ/ have been described, we consider their
distribution across prosodic positions (Section 3.2.1) and segmental environments (Section 3.2.2).11
The analysis is based on our auditory classification of /z/ and /zʕ/ realizations (via
transcription), as supported by the acoustic measures summarized in Section 3.1.
3.2.1 Phonetic realizations across prosodic positions
Based on Cook’s (1993, 2013) descriptions of /z/ and /zʕ/ as well as on more general effects
of sonority (Clements 1990), we expect the clearest predictor of phonetic realization to be
prosodic position, with more lenited and lateral realizations in coda position than elsewhere.
Tables 3 and 4 summarize the number of tokens of each phonetic realization by underlying
consonant, in intervocalic position (Table 3) and in final coda position (Table 4). We focus on
these positions because they are the most common ones in our dataset (see Table 2 above).
Intervocalically (Table 3), the most common realization of /z/ is a retracted, lenited
approximant (coded as [“4]) and the most common realization of /zʕ/ is a non-retracted,
non-lenited fricative (coded as [z]). Crucially, unlike in coda position, intervocalic /z/ and /zʕ/
have no lateral component, with the exception of a single token coded as ambiguous [L“4].
Table 3 Token numbers per phonetic realization in V_V position by underlying consonant (token percentages
refer to their respective columns, with bolding indicating the most common realization(s) per column).
Manner        Phonetic realization    /z/        /zʕ/       Total
Fricative     [z] (Figure 3)          14 (35%)   22 (58%)   36 (46%)
              [] (Figure 4)           0          4 (11%)    4 (5%)
Approximant   [4] (Figures 4 & 5)     26 (65%)   10 (26%)   36 (46%)
              [z4]                    0          1 (3%)     1 (1%)
Hybrid        [L4]                    0          1 (3%)     1 (1%)
Total                                 40         38         78
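As a quick consistency check on Table 3, the column percentages can be recomputed from the token counts. The sketch below copies the counts from the table; the variable and function names are ours, and rounding to the nearest whole percent is an assumption about how the published figures were derived.

```python
# Token counts from Table 3, in row order (two fricative rows,
# two approximant rows, one hybrid row).
z_counts = [14, 0, 26, 0, 0]
z_phar_counts = [22, 4, 10, 1, 1]  # the pharyngealized phoneme


def column_percentages(counts):
    """Express each count as a whole-number percentage of its column total."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]


z_pct = column_percentages(z_counts)
z_phar_pct = column_percentages(z_phar_counts)
total_pct = column_percentages([a + b for a, b in zip(z_counts, z_phar_counts)])
```

Under that rounding assumption, the recomputed values agree with the percentages printed in the table.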
In coda position (Table 4), the most common realization is [ë], especially for /zʕ/.
Note that in addition to the major realizations illustrated above, a few realizations were
observed only once or twice: [D], [zD], [l] (light), and hybrid realizations [“4L], [¯L],
and [“4¯L]. These reflect the fact that /z/ and /zʕ/ are particularly variable and ambiguous
in coda position, much more so than in onset position.
Considering prosodic position as a whole (including medial onsets and codas as well; see
Table 2), degree of lenition is significantly correlated with prosodic position for /zʕ/
(χ² = 50.64, df = 3); for /z/, the relationship is weaker, falling slightly above the level
of significance (χ² = 6.79, df = 3, p = .079). In coda position, both medial and final,
there is an overwhelming tendency (75%–100%) for both phonemes to be lenited.
11 Because the dataset used for this study includes three (and in one case four) repetitions per word,
one other potential effect which we investigated was repetition number. We did not find any correla-
tion between repetition number and any of several dependent variables including phonetic realization,
manner, laterality, and retraction, for any syllabic position. We therefore excluded it from further
analysis.
Table 4 Token numbers per phonetic realization in V_# position, by underlying consonant (token percentages refer to their respective
columns, with bolding indicating the most common realization per column).
Manner            Phonetic realization               /z/        /zʕ/       Total
Fricative         [z] (Figure 3)                     8 (22%)    5 (9%)     13 (14%)
                  [D]                                1 (3%)     0          1 (1%)
                  [zD]                               0          1 (2%)     1 (1%)
Approximant       [4] (Figures 5, 6, 8)              3 (8%)     0          3 (3%)
                  [] (Figure 9)                      14 (39%)   46 (84%)   60 (66%)
                  [l] (light)                        2 (6%)     0          2 (2%)
                  vowel (Figure 10)                  4 (11%)    3 (5%)     7 (8%)
Other (hybrids)   [4L], [L]; [4L] (all lenited)      4 (11%)    0          4 (4%)
Total                                                36         55         91
Conversely, in medial onset position there is equally strong resistance to lenition (67–100%).
These findings support cross-linguistic tendencies related to sonority and syllable structure,
whereby preference is for low-sonority onsets and high-sonority codas (Clements 1990).
Intervocalically, the two phonemes differ, with /z/ leniting as it does in coda position and /zʕ/
resisting lenition as it does in onset position. However, for both phonemes, intervocalic
position shows the weakest tendency towards categorical behaviour, i.e. there is more variability
in lenition intervocalically than in any other prosodic position.
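The p-value reported above for /z/ (χ² = 6.79, df = 3, p = .079) can be reproduced from the chi-square distribution. For three degrees of freedom the upper-tail probability has the closed form erfc(√(x/2)) + √(2x/π)·e^(−x/2), which the sketch below evaluates; the function name is ours, and we assume the reported p-value is an uncorrected upper-tail probability.

```python
import math


def chi_square_p_value_df3(statistic):
    """Upper-tail probability of a chi-square statistic with 3 degrees of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    half = statistic / 2
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * statistic / math.pi) * math.exp(-half)


p_z = chi_square_p_value_df3(6.79)        # lenition by position for /z/
p_z_phar = chi_square_p_value_df3(50.64)  # lenition by position for the pharyngealized phoneme
```

With a conventional threshold of .05, the first value falls just short of significance while the second is far below the threshold, matching the contrast between the two phonemes described in the text.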
Based on cross-linguistic findings that, in articulatorily complex segments, tongue body
articulations are more dominant in coda than in onset position (Krakow 1993, Gick et al.
2006), we predicted that coda position would also lead to higher numbers of tokens perceived
as retracted. Although this is generally the case, the effect of prosodic position on
retraction differs between /z/ and /zʕ/, the correlation being significant only for /zʕ/ (χ² = 39.833,
df = 3). Both /z/ and /zʕ/ are categorically non-retracted in medial onset position. In final coda
position, both phonemes tend to be produced with retracted realizations, although this
tendency is relatively weak for /z/. In intervocalic and medial coda positions, the two phonemes
behave with opposite tendencies: /z/ tends to retract intervocalically and shows a weak
tendency to non-retraction in medial codas; in other words, intervocalic and final coda positions
behave similarly for /z/, and contrast with medial onsets and codas (which pattern together).
In contrast, /zʕ/ tends not to retract intervocalically but is nearly categorically retracted in
medial codas; in other words, intervocalic and onset positions behave similarly for /zʕ/,
and contrast with medial and final codas (which pattern together). Note that the relatively
weak effects of prosody on retraction reflect Cook’s (1993, 2013) descriptions of /z/ and /zʕ/
variation, in which he refers to lenition and lateralization, but not retraction.
In support of Cook’s (1993) observations, there is a significant correlation between
prosodic position and laterality, with the effect being generally quite strong and consistent
for both /z/ and /zʕ/ (/z/: χ² = 36.577, df = 3; /zʕ/: χ² = 89.101, df = 3): non-lateral
realizations occur almost categorically in intervocalic and medial onset positions (with the tendency
in the latter position being slightly weaker for /z/). In medial and final coda position, lateral
realizations are nearly categorical, except for /z/ in final codas where there is only a weak
tendency.
Finally, the relationship between buzziness and prosodic position is statistically
significant for both /z/ (χ² = 9.306, df = 3) and /zʕ/ (χ² = 42.988, df = 3), reflecting the correlation
between buzziness and manner (see Section 3.1.2). For /zʕ/, the relationship between manner and
position is very clear (see above) and therefore predictive of buzziness in a straightforward
way: non-lenited intervocalic /zʕ/ realizations are by and large buzzy; lenited coda
realizations are non-buzzy. The only exception to this involves intervocalic lenited realizations, half
of which are buzzy. For /z/, the relationship between manner and position is not as clear, and
therefore neither is the relationship between buzziness and position: lenited intervocalic /z/
realizations are almost categorically non-buzzy and non-lenited coda realizations are entirely
buzzy, but other manner–position combinations also occur and are more variable. Overall,
the observed patterns of buzziness support the idea that intervocalic /zʕ/ is syllabified as an
onset, with lenition and hence non-buzziness occurring only in coda position, whereas this is
not so clearly the case for /z/.
The findings reported in Section 3.2.1 suggest that the prosodic affiliation of
intervocalic consonants is worth investigating further. Across dimensions of variation, /zʕ/ shows
relatively consistent syllabic effects, with intervocalic /zʕ/ behaving like onset /zʕ/, and in
opposition to medial and final coda /zʕ/. In contrast, /z/ does not show such consistent
syllabic effects: with respect to retraction, intervocalic /z/ patterns with medial and final coda
/z/, and in opposition to onset /z/; with respect to lenition, intervocalic /z/ patterns with final
coda /z/, and in opposition to medial coda or onset /z/; with respect to laterality, intervocalic
/z/ patterns with onset /z/, in opposition to coda /z/ (medial and final); with respect to
buzziness, /z/ final codas stand apart from other positions as the locus for buzzy realizations. This
difference in syllabic affiliations between intervocalic /z/ and /zʕ/ is interesting, and worth
delving into in more detail, especially given the complexity of syllabification of intervocalic
consonants in other Dene languages (Bird 2002).
3.2.2 Phonetic realizations across segmental environments
In addition to prosodic position, a number of studies have shown that segmental
environment plays a role in lenition12 patterns in particular (Kirchner 2001, 2004; Kingston 2008;
Ennever et al. 2017). In the dataset we are working with, /z/ and /zʕ/ occur either adjacent
to vowels or (in a small number of cases) to resonants. We focus here on the potential role
of preceding vowels in predicting the phonetic realization of /z/ and /zʕ/, since this is the
condition for which we have the most data.
There is ongoing debate about whether the quality of adjacent vowels affects degree of
consonant lenition, and existing findings are conflicting in terms of the direction of
possible effects (Ennever et al. 2017).13 In Tŝilhqot’in, /z/ and /zʕ/ show a strong tendency to
lenite following all oral14 vowels (intervocalic and coda position), compared to following
another consonant (medial onset position). Variation between vowels (significant only for
/z/: χ² = 11.45, df = 3) suggests a prohibitive effect of vowel proximity on lenition. Ranking
vowels according to lenition of /z/ (least to most lenited), we get the following scale: i (61%)
> e (88%) > a, u (100%, i.e. categorical lenition). For lenition of /zʕ/ the rankings are: e (62%)
> i (75%) > u (80%) > a (88%). In both cases, distal (articulatorily incompatible) [a] leads
to the most cases of lenition, and the more proximal (articulatorily compatible) [e] and [i]
lead to fewer cases of lenition. This supports Kirchner (2001, 2004), and is also
compatible with Iskarous et al.’s (2013) model of coarticulatory resistance, which predicts that high
and front vowels will resist coarticulatory effects more than low and back vowels (Recasens
& Rodriguez 2016).15 To the extent that [u]’s behaviour is reliable (very few tokens exist,
especially preceding /z/), it seems to reflect the articulatory specification of /z/ vs. /zʕ/: [u]
patterns with distal [a] before /z/, but closer to proximal [e] and [i] before /zʕ/.
12 We also tested for effects of vowel quality on laterality, retraction, and buzziness, but no clear patterns
emerge for any of these effects.
13 Existing studies are of plosives rather than fricatives (Cole, Hualde & Iskarous 1999, Ortega-Llebaria
2004, Simonet, Hualde & Nadeu 2012, Ennever et al. 2017).
14 We focus on oral vowels in this section, since we have very few tokens of nasalized vowels.
15 Recasens & Rodriguez (2016) show that co-articulatory resistance decreases in Catalan VCV sequences
in this progression: [i, e] > [a] > [o] > [u]. We can say that our data is largely compatible with this ranking
insofar as we have the data to match it. However, our data is particularly lacking at the ‘most variable,
least resistant to co-articulatory effects’ end of the scale, as we have relatively few tokens of /u/, and
Tŝilhqot’in’s four-vowel system lacks an equivalent to /o/ (also Tŝilhqot’in /e/ is [] rather than [e]).
Recall from Section 3.1.2 that buzziness was strongly correlated with manner. The effect
of vowel quality on lenition is reflected in its effect on buzziness as well (significant only for
/zʕ/: χ² = 15.785, df = 5). For /z/, when ordered by buzziness (least to most buzzy), vowels
are ranked as follows: u (0%) > a (22%) > e (38%) > i (61%); only /i/ is likely to produce
buzzy realizations, which makes sense articulatorily if buzziness results from a high/fronted
tongue body. For /zʕ/, the ranking is: a (17%) > u (20%) > i (28%) > e (45%). The ranking
of vowels preceding /zʕ/ is exactly opposite for buzziness vs. lenition, reflecting the fact that
less lenited realizations are consistently more buzzy. The rankings for vowels preceding /z/
by buzziness vs. lenition are more complex, reflecting the fact that the relationship between
buzziness and lenition is also more complex, and partly dependent on prosodic position.
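The claim that the two vowel rankings for /zʕ/ are exact mirror images can be checked directly from the percentages reported in this section. The sketch below is illustrative only; the variable and function names are ours.

```python
# Percentages reported in the text for vowels preceding the pharyngealized phoneme.
lenition_pct = {"e": 62, "i": 75, "u": 80, "a": 88}
buzziness_pct = {"a": 17, "u": 20, "i": 28, "e": 45}


def rank_least_to_most(percentages):
    """Order vowel labels from the lowest to the highest percentage."""
    return sorted(percentages, key=percentages.get)


lenition_rank = rank_least_to_most(lenition_pct)
buzziness_rank = rank_least_to_most(buzziness_pct)

# The rankings mirror each other if one is the other read backwards.
rankings_are_opposite = lenition_rank == buzziness_rank[::-1]
```

Running the same comparison on the /z/ percentages would not give a mirror image, in line with the more complex relationship described for that phoneme.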
There is also a correlation between preceding vowel and retraction for both phonemes
(/z/: χ² = 15.908, df = 5; /zʕ/: χ² = 16.925, df = 5). If we rank vowels according to retraction
(least to most) for /z/, we get the following ranking: a (44%) > i (61%) > e (76%) > u (100%).
In comparison to /z/, /zʕ/ shows a much greater incidence of retraction overall, reflecting
its inherently retracted nature. When ranked according to retraction for /zʕ/, the vowels are
ordered as follows: e (62%) > i (72%) > a (88%) > u (100%). For both phonemes, [u]
categorically favours retraction, which supports the idea that what we coded as phonetically retracted
corresponds to raised backing, or UVULARIZATION (compatible with [u]) rather than
pharyngealization (Saltzman & Munhall 1989). The fact that [a] does not favour retraction of /z/
further supports this view, since one would expect [a] to favour pharyngealization, but not
uvularization.
Finally, in terms of laterality, the patterns differ somewhat by phoneme, and both are
statistically significant (/z/: χ² = 21.342, df = 5; /zʕ/: χ² = 22.461, df = 5). The order of vowels
with respect to laterality (least to most lateral) of /z/ is as follows: u (0% lateral) > e (29%)
> i (42%) > a (100%); only [a] is followed categorically by lateral realizations. For /zʕ/, the
order is: e (52%) > u (60%) > a (75%) > i (81%); all vowels favour lateralization of following
/zʕ/, which reflects the near categorical tendency for /zʕ/ to be lateralized in coda position.
Overall, the findings presented in Sections 3.1 and 3.2 paint a relatively consistent picture
with respect to /z/ and /zʕ/ variation, even though not all results are statistically significant.
Tokens coded as [“4], corresponding primarily to /z/ in intervocalic position (Table 3), are
acoustically lenited, retracted, and non-buzzy. Tokens coded as [z], often corresponding to
/zʕ/ in intervocalic position (Table 3), are acoustically non-lenited, non-retracted, and buzzy.
Thus, lenition, retraction, and buzziness pattern together in where they occur. Tokens coded
as [ë] are observed strictly in final coda position (Table 4) and this is reflected in the very clear
results of laterality by position, across both /z/ and /zʕ/. In terms of segmental environment,
distal vowels trigger lenition more so than proximal vowels; other effects vary by phoneme.
4 Discussion
Both the phonetic features of Tŝilhqot’in /z/ and /zʕ/ and their distributional properties across
syllabic and segmental environments are reminiscent of patterns observed in other languages,
locally and further afield. The discussion that follows considers how /z/ and /zʕ/ should
be characterized phonetically and phonologically (Section 4.1), and whether the observed
patterns can be described as lenition (Section 4.2).
4.1 Phonetic and phonological features of Tŝilhqot’in /z/ and /zʕ/
Phonetically, the defining features of Tŝilhqot’in /z/ and /zʕ/ include a characteristic
‘buzziness’ (accompanying coronal articulations in particular) and acoustic features compatible
with engagement of both the tongue tip (TT; non-retracted/non-lateralized articulations) and
the tongue body (TB; retracted/lateralized articulations), or what Laver (1994: 314) refers to
as ‘double articulations’. Although phonologically, Tŝilhqot’in /z/ and /zʕ/ are clearly
consonants, their phonetic characteristics are reminiscent of two other sounds described in the
literature: the Chinese front apical vowel and the Swedish ‘Viby-i’ vowel (see Laver 1994:
Chapter 11/Section 11.3).
The Chinese sound that has traditionally been called the ‘front apical vowel’ has been the
topic of much debate concerning its precise nature and whether it is best characterized
(phonetically) as a vowel, an approximant, or a fricative (Karlgren 1915, Ladefoged & Maddieson
1996, Yu 1999, Duanmu 2007, Lee-Kim 2014). X-ray images in Zhou & Wu (1963) and
ultrasound images in Lee-Kim (2014) both show that the sound (transcribed here as [®5]
following Lee-Kim) has a double articulation, with TT raising and TB raising/backing (see also
Laver (1994) on double-articulations). In fact, Lee-Kim (2014) notes that there seems to be
an inherent compatibility between dental TT articulations and TB retraction, citing Stevens,
Keyser & Kawasaki (1986), who ‘conjecture that the dental constriction, which requires a
flat tongue front, can be achieved more easily when the tongue back is retracted’ (p. 271).
Lee-Kim notes that [®5] is ‘presumably unattested in any other language’ (p. 279). While
the phonological distribution of [®5] is certainly a signature feature of Chinese languages,
Tŝilhqot’in /z/ and /zʕ/ show that its phonetic realization is perhaps not so unique.
If Tŝilhqot’in /z/ and /zˤ/ indeed show the same kind of double articulation (TT and
TB) as Chinese [ɹ̩], as our acoustic and auditory analysis suggests, this might explain why
all of these sounds can exhibit what is referred to here as ‘buzziness’. Shao & Ridouane
(2018, 2019) describe the front apical vowel in Jixi-Hui Chinese, which they transcribe /z/.
According to their 2019 articulatory investigation, Jixi-Hui /z/ has a high TB and a raised
TT. They hypothesize that the high TT in particular might explain the presence of ‘abundant
frication noise’ during the sound, which has also been described by Trubetzkoy (1969) as
‘frication-like noise resembling a humming’ (p. 171) and by Chao (1961) as ‘a buzzing quality’
(p. 22).16 Note that Jixi-Hui Chinese /z/ transitions from a fricative to a vowel. In cases
where Tŝilhqot’in /z/ and /zˤ/ are transitional, they follow the opposite pattern, transitioning
from a more lenited, sonorant sound to a less lenited one. This is not surprising, given the
differences in their phonological status, Jixi-Hui Chinese /z/ acting as a syllable nucleus and
Tŝilhqot’in /z/ and /zˤ/ acting as consonants.
Another language which exhibits a distinctly ‘buzzy’ sound is Swedish. Westerberg
(2018, 2019) has conducted the most recent work on the vowel termed ‘Viby-i’, which
she describes as ‘an /i:/ variant with an unusual “thick”, “buzzing”, and “damped” quality
(Engstrand et al. 1998)’ (Westerberg 2019: 3696). Westerberg cites similar descriptions
by previous authors: Björsten & Engstrand (1999) ‘suggest that viby-i is a high central
unrounded [ɨ], which may be produced with a raised tongue tip to amplify its “damped”
quality’ (Westerberg 2019: 3696–7). Frid et al. (2015) report that ‘Viby-i is produced with
a lower and backer tongue body, and different tongue tip behaviour, than [i:]’ (Westerberg
2019: 3697). Based on acoustic evidence, Westerberg hypothesizes that Viby-i is a centralized
vowel (similar to our auditory impressions of the most lenited versions of Tŝilhqot’in /z/
and /zˤ/) and links the low F2 of Viby-i to a complex tongue shape. Of course, the articulatory
properties of Tŝilhqot’in /z/ and /zˤ/ can only be inferred from the acoustic signal here. Our
hope is that, in the future, we might also have the possibility of conducting an articulatory
study of these sounds. Particularly intriguing to us is the potential role of jaw clenching in
producing /z/ and /zˤ/. Neither the literature on the Chinese front apical vowel nor that on the
Swedish Viby-i mentions the jaw as a primary articulator, but according to the Tŝilhqot’in
speaker we worked with, the jaw is tightly shut for /z/ and /zˤ/. Her descriptions provide support
for Esling’s (2005) laryngeal articulator model of vowel production, which specifically
includes the jaw as a primary articulator.
Although phonetically Tŝilhqot’in /z/ and /zˤ/ appear similar to the Chinese front apical
vowel and to the Swedish Viby-i, these sounds differ in their phonological status.
Prosodically, unlike the Chinese and Swedish sounds, Tŝilhqot’in /z/ and /zˤ/ are clearly
16 Bowei Shao, personal communication (30 November 2019).
consonants, acting as syllable onsets and codas, but never as nuclei. Segmentally, their precise
phonological categorization is less clear. Although they are transcribed as fricatives by