Article

A typological study of Voice Onset Time (VOT) in Indo-Iranian languages

Abstract

The stop consonants of Indo-Iranian languages are categorized into between two and five laryngeal categories. The present study investigates whether Voice Onset Time (VOT) reliably differentiates the word-initial stop laryngeal categories and how it covaries with different places of articulation in ten languages (two Iranian: Pashto and Wakhi; seven Indo-Aryan: Dawoodi, Punjabi, Shina, Jangli, Urdu, Sindhi, and Siraiki; and one isolate: Burushaski). The results indicated that there was a clear VOT distinction between the voiceless unaspirated and voiceless aspirated stops. The voiceless unaspirated stops showed shorter voicing lag VOTs than voiceless aspirated stops. Voiced unaspirated, voiced aspirated, and voiced implosive stops were characterized by voicing lead VOTs. In the voiceless unaspirated and aspirated categories, palatal affricates showed the longest voicing lag VOT due to the frication interval of this stop type. In contrast, voiceless unaspirated retroflex stops were characterized by the shortest voicing lag VOT. There were no clear place differences in the voiceless aspirated, voiced unaspirated, voiced aspirated, and voiced implosive categories. The findings of the current study suggest that VOT reliably differentiates the stop consonants of all the languages that contrast two (voiceless unaspirated vs. voiced unaspirated: Pashto and Wakhi) or three (voiceless unaspirated vs. voiceless aspirated vs. voiced unaspirated: Burushaski, Dawoodi, Punjabi, and Shina) laryngeal categories. However, VOT does not consistently distinguish the stop consonants of languages (Jangli, Urdu, Sindhi, and Siraiki) with contrastive voiced unaspirated, voiced aspirated, and voiced implosive categories.
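As an illustration of the measure discussed in the abstract, VOT is conventionally computed as the time of voicing onset minus the time of the stop release, so voicing lead gives negative values and voicing lag gives positive ones. The following is a minimal Python sketch, assuming hand-annotated release and voicing-onset times in seconds. The function names and the 30 ms short-lag/long-lag boundary are hypothetical choices made for illustration; they are not taken from the study.

```python
def vot_ms(release_s: float, voicing_onset_s: float) -> float:
    """Return VOT in milliseconds: voicing onset time minus stop release time."""
    return (voicing_onset_s - release_s) * 1000.0


def vot_category(vot: float, long_lag_threshold_ms: float = 30.0) -> str:
    """Label a VOT value as voicing lead, short lag, or long lag.

    The default 30 ms boundary is an illustrative assumption only;
    realistic boundaries depend on the language and place of articulation.
    """
    if vot < 0:
        return "lead"       # voicing starts before the release (prevoicing)
    if vot < long_lag_threshold_ms:
        return "short lag"  # typical of voiceless unaspirated stops
    return "long lag"       # typical of voiceless aspirated stops
```

For example, a token whose voicing begins 80 ms before the release would be labelled "lead" by this sketch. As the abstract notes, a single VOT value of this kind cannot by itself separate voiced unaspirated, voiced aspirated, and voiced implosive stops, since all three show voicing lead.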

... Durational/temporal measures. Temporal cues (e.g., closure and release durations) are among the most important acoustic correlates of stops, including coronals (Anderson and Maddieson 1994). Voice Onset Time (VOT) has been widely used as a reliable descriptor of place and laryngeal contrasts of syllable-initial stops (Cho and Keating 2001; Cho and Ladefoged 1999; Hussain 2018; Lisker and Abramson 1964). A related measure is release duration, which is generally measured in word-medial and word-final consonants. ...
... In Marathi, voiceless unaspirated retroflex stop releases were shorter in duration than dental stops (Karjigi and Rao 2012). Hussain (2018) presented data from nine Indo-Iranian languages and noted that VOT of voiceless dental /t̪/ was consistently longer than that for the voiceless retroflex /ʈ/. Similar observations were made by Dart and Nihalani (1999) for Malayalam dental and retroflex stops, while a near-lack of release differences was observed for coronals in some Australian languages (e.g., Central Arrernte: Tabain 2012). ...
... Cross-linguistic studies have also confirmed that VOT and release duration are longer in voiceless aspirated stops and shorter in voiceless unaspirated stops (Cho et al. 2019; Hussain 2018; Lisker and Abramson 1964). Punjabi has fairly complex systems of both place and laryngeal contrasts across different word positions, but laryngeal categories could be better distinguished by VOT/release than the subtle dental vs. retroflex place contrast (Hussain 2021b). ...
Article
Punjabi is an Indo-Aryan language which contrasts a rich set of coronal stops at dental and retroflex places of articulation across three laryngeal configurations. Moreover, all these stops occur contrastively in various positions (word-initially, -medially, and -finally). The goal of this study is to investigate how various coronal place and laryngeal contrasts are distinguished acoustically both within and across word positions. A number of temporal and spectral correlates were examined in data from 13 speakers of Eastern Punjabi: Voice Onset Time, release and closure durations, fundamental frequency, F1-F3 formants, spectral center of gravity and standard deviation, H1*-H2*, and cepstral peak prominence. The findings indicated that higher formants and spectral measures were most important for the classification of place contrasts across word positions, whereas laryngeal contrasts were reliably distinguished by durational and voice quality measures. Word-medially and -finally, F2 and F3 of the preceding vowels played a key role in distinguishing the dental and retroflex stops, while spectral noise measures were more important word-initially. The findings of this study contribute to a better understanding of factors involved in the maintenance of typologically rare and phonetically complex sets of place and laryngeal contrasts in the coronal stops of Indo-Aryan languages.
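Two of the spectral measures listed above, spectral center of gravity and standard deviation, can be sketched as power-weighted moments of a spectrum. The Python example below is a minimal sketch under that common definition, assuming the caller already has a list of frequency bins and the power in each bin. The function names are hypothetical, and the abstract does not state which software or weighting the authors used.

```python
import math


def spectral_cog(freqs_hz, power):
    """Power-weighted mean frequency (spectral centre of gravity) in Hz."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, power)) / total


def spectral_sd(freqs_hz, power):
    """Power-weighted standard deviation of frequency around the centre of gravity."""
    cog = spectral_cog(freqs_hz, power)
    total = sum(power)
    variance = sum(p * (f - cog) ** 2 for f, p in zip(freqs_hz, power)) / total
    return math.sqrt(variance)
```

A spectrum with more of its energy in the higher bins yields a higher center of gravity, which is why the measure is often applied to stop bursts when comparing places of articulation.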
... The four-way stop laryngeal contrast present in many Indo-Aryan languages (Hussain, 2018), including Hindi and Urdu, has been the subject of a large body of research, with the goal of determining which feature or set of features best defines the contrast. While the voiceless unaspirated, voiceless aspirated, and voiced unaspirated stops (henceforth P, PH, B) are generally unproblematic to define, the so-called "voiced aspirated" stops (henceforth BH) pose difficulties and therefore have caused the most controversy in terms of their proposed representation. ...
... Esposito & Khan, 2012 for a comparison of voiced aspirated consonants in Gujarati and the genetically unrelated White Hmong). Hussain (2018) compared VOT and its interaction with place of articulation across several Indo-Aryan languages including two (Jangli and Urdu) with the four-way contrast, and two (Sindhi and Siraiki) with a five-way contrast made up of the four-way contrast as well as an implosive series. Overall, effects were consistent across languages, although separate statistical models were run for each language, so the cross-language comparison was about the relative patterning of VOT across laryngeal categories and places of articulation within each language rather than a direct comparison of VOT values across languages. ...
... /B, BH/ vs. /P, PH/). Prevoicing duration was found to be longer in B than BH in Hindi (Davis, 1994;Dutta, 2007), Nepali (Schwarz et al., 2019), Marathi (Dmitrieva & Dutta, 2020), and Jangli (Hussain, 2018), though cf. Hussain, 2018 for Urdu, where no difference was found. ...
Article
This work examines cue weighting in production and perception of the four-way laryngeal contrast in Hindi and Urdu. Previous work has consistently identified several cues, including prevoicing (duration), aspiration (duration), voice quality, and f0, that are relevant to the contrast, although the phonetic specification of the contrast, and particularly the status of the so-called “voiced aspirates,” remains unclear. In this work, we confirm the importance of prevoicing and aspiration to the contrast overall, but argue that voice quality (murmur or breathy voice) best distinguishes the voiced aspirates in production. In perception, listeners make use of all cues, in line with production patterns. Tokens with concurrent prevoicing and aspiration are categorically identified as voiced aspirates, indicating that the joint presence of these two cues is sufficient for voiced stop identification and demonstrating the primacy of these features over all of the others tested. At the same time, neither prevoicing nor aspiration is strictly necessary for voiced aspirate identification; a stop token can be perceived as a voiced aspirate even when one of these is absent, as long as the breathy voice quality also characteristic of voiced aspirates is present. We attribute the disproportionately large perceptual category space for voiced aspirates to the variability of voiced aspirates in production.
... Other languages whose voicing contrast has not been fully understood despite the substantial number of their speakers include Brazilian Portuguese (Ahn, 2018a with 8 speakers); Thai (Kirby, 2018 with 12 speakers); Turkish (Ünal-Logacev, Fuchs & Lancia, 2018 with 6 speakers); and Russian (Kharlamov, 2018 with 60 speakers). Languages that have received even less attention but are covered in this special collection include Lebanese Arabic (Al-Tamimi & Khattab, 2018 with 20 speakers), Vietnamese and Khmer (Kirby, 2018 with 14 speakers each); Yerevan (Eastern) Armenian (Seyfarth & Garellek, 2018 with 8 speakers), and 10 languages (two Iranian, seven Indo-Aryan, and one isolate) spoken in India with 48 speakers in total (Hussain, 2018). ...
... As shown in Figure 1b, mean VOTs of aspirated (denti) alveolar stops in these languages are distributed over a wide range from 57 ms to 97 ms. Among these languages, 8 languages were studied by Hussain (2018) with a similar method. These languages also show a similar variation in mean VOT from 57 ms to 91. ...
... There are, however, cases in which voicing contrast cannot be fully captured by VOT alone, or cases in which the phonetic nature of voicing contrast can be further illuminated along phonetic dimensions other than VOT such as voice quality and F0. Many of the Indo-Aryan languages (Hussain 2018) present such cases as they employ multiple distinctions made along the negative VOT dimension with voiced (unaspirated) stops in contrast with voiced aspirated stops and voiced implosives. Hussain (2018) suggests a number of possible phonetic correlates of the laryngeal distinction especially for the stops whose VOT values overlap substantially to the extent that no further distinction could be made along the negative VOT dimension. ...
Article
In this special collection entitled Marking 50 Years of Research on Voice Onset Time and the Voicing Contrast in the World's Languages, we have compiled eleven studies investigating the voicing contrast in 19 languages. The collection provides extensive data obtained from 270 speakers across those languages, examining VOT and other acoustic, aerodynamic and articulatory measures. The languages studied may be divided into four groups: 'aspirating' languages with a two-way contrast (English, three varieties of German); 'true voicing' languages with a two-way contrast (Russian, Turkish, Brazilian Portuguese, two Iranian languages Pashto and Wakhi); languages with a three-way contrast (Thai, Vietnamese, Khmer, Yerevan Armenian, three Indo-Aryan languages, Dawoodi, Punjabi and Shina, and Burushaski spoken in India); and Indo-Aryan languages with a more than three-way contrast (Jangli and Urdu with a four-way contrast, and Sindhi and Siraiki with a five-way contrast). We discuss the cross-linguistic data, focusing on how much VOT alone tells us about the voicing contrast in these languages, and what other phonetic dimensions (such as consonant-induced F0 and voice quality) are needed for a complete understanding of laryngeal contrast in these languages. Implications for various issues emerge: universal phonetic feature systems, effects of language contact on linguistic levelling, and the relation between laryngeal contrast and supralaryngeal articulation. The cross-linguistic VOT data also lead us to discuss how the distribution of VOT as measured acoustically may allow us to infer the underlying articulation and how it might be approached in gestural phonologies. The discussion of these multiple issues sparks new questions to be resolved, and provides indications of where the field may be best directed in exploring laryngeal contrast in voicing in the world's languages.
... A recent volume on Hindko, Panjabi and Saraiki (Bashir et al. 2019) offers a description based on Shackle (1976). Hussain (2018) presents acoustic data on the Saraiki stops. The present Illustration may be compared to other studies of languages spoken in the same region, including Sindhi (Nihalani 1995) and Hindi (Ohala 1994). ...
... ). Iranian languages 'palatals are the only stop series that are produced as postalveolar affricates' and thus transcribes them as [ʧ ʧʰ ʤ ʤʱ]. In this article, we follow Hussain (2018). ...
Article
Saraiki (ISO 639-3: skr) is an Indo-Aryan language widely used in Pakistan and India (Bashir, Conners & Hefright 2019). The variety described here is Central Saraiki, spoken in the districts of Multan, Muzaffargarh, Bahawalpur and the northern parts of Dera Ghazi Khan in Pakistan, which form the largest of the Saraiki-speaking areas. Geographically, Pakistan is divided into four provinces, Punjab, Sindh, Khyber Pukhton Khaw (KPK) and Balochistan. Punjabi is spoken in Punjab, and Sindhi is the dominant language in Sindh. Most Pashto speakers live in KPK and Balochistan, while the inhabitants of Balochistan speak Balochi, Brahui and Saraiki. Other than Urdu, Saraiki is the only language which is spoken in all four provinces of Pakistan, with a majority of speakers in southern Punjab.
... From the stop series, let us take voiceless unaspirated retroflex /ʈ/ and dental /ṱ/, and sonorant retroflex /ɽ/ and alveolar /ɼ/. In the literature, it is claimed that the formant trajectories of vowels adjacent to retroflex consonants might be lower than those of non-retroflex sounds (Hussain, 2018), and this is one of the phonetic reasons for assigning the [+back] feature to retroflexes. In order to confirm this, we take into account the formant values of the vowels preceding and following Saraiki retroflexes. ...
Article
The study shows that retroflexes can be non-back and palatalized in Saraiki. It argues that assigning the [+back] feature to retroflexes is phonologically unsound, because retroflex consonants are coronal and only coronal features should be listed. Furthermore, a phonetically low F3 is not a distinctive property of retroflexes; rather, it is context- and language-dependent. Therefore, instead of [+back], the feature [+retracted] is used to distinguish retroflexes from other coronals, as this feature is common to all retroflex sounds.
... Shina is an endangered Indo-Aryan (Dardic) language spoken in Gilgit, Northern Pakistan. There is a three-way laryngeal contrast in Shina stops (e.g., voiceless unaspirated /p/, voiceless aspirated /pʰ/, and voiced unaspirated /b/) at five places of articulation (bilabial, dental, retroflex, palatal, and velar; Hussain 2018; Radloff 1999). From a typological perspective, voiceless aspirated stops have been described to have either a raising effect on fundamental frequency (F0) and spectral tilt or a lowering effect (see below). ...
Article
Shina is an endangered Indo-Aryan (Dardic) language spoken in Gilgit, Northern Pakistan. The present study investigates the acoustic correlates of Shina’s three-way stop laryngeal contrast across five places of articulation. A wide range of acoustic correlates were measured including fundamental frequency (F0), spectral tilt (H1*-H2*, H1*-A1*, H1*-A2*, and H1*-A3*), and cepstral peak prominence (CPP). Voiceless aspirated stops were characterized by higher fundamental frequency, spectral tilt, and cepstral peak prominence, compared to voiceless unaspirated and voiced unaspirated stops. These results suggest that Shina is among those languages which have a raising effect of aspiration on the pitch and spectral tilt onsets of the following vowels. Positive correlations among fundamental frequency, spectral tilt, and cepstral peak prominence were observed. The findings of this study will contribute to the phonetic documentation of endangered Dardic languages.
... Another important characteristic is the 4-way laryngeal contrast in plosives and (to a lesser extent) affricates, which includes plain voiceless, plain voiced, aspirated voiceless, and breathy voiced categories. A 4-way contrast of this kind is quite common in Indo-Aryan languages of the subcontinent (at least in plosives); however, most languages of the Hindu-Kush region tend to have a reduced, 3-way contrast (plain voiceless, plain voiced, and aspirated voiceless; Hussain 2018). Sample words with these and other consonants in word-initial position are provided below, both in phonemic IPA transcription and in the Kalasha orthography. ...
Article
Kalasha (ISO 639-3: kls), also known as Kalashamon, is a Northwestern Indo-Aryan language spoken in Chitral District of Khyber Pakhtunkwa Province in northern Pakistan, primarily in the valleys of Bumburet, Rumbur, Urtsun, and Birir, as shown in Figure 1. The number of speakers is estimated between 3000 and 5000. The Ethnologue classifies the language status as ‘vigorous’ (Eberhard, Simons & Fennig 2019) but some researchers consider it ‘threatened’ (Rahman 2006, Khan & Mela-Athanasopoulou 2011). Kalasha has been in close contact with Nuristani and other Northwestern Indo-Aryan languages. Among the latter, the influence of Khowar has been particularly strong because it functions as a lingua franca of Chitral District (Liljegren & Khan 2017). The Kalasha lexicon includes many loanwords from Khowar, as well as from Persian, Arabic, and Urdu (Trail & Cooper 1999). Early efforts to put the language in writing employed Arabic script but a Latin-based script was adopted in 2000 (Cooper 2005, Kalash & Heegård 2016).
... We use "Drenjongke" at the request of the informants; "Bhutia" is the official name recognized by the Indian government. 3) Indo-Iranian languages show up to a five-way laryngeal contrast (Hussain 2018). Dzongkha, the national language of Bhutan which is closely related to Drenjongke, exhibits a four-way laryngeal contrast as well. ...
Article
Drenjongke is a Tibeto-Burman language spoken in Sikkim, India, whose phonetic properties are under-studied. This language is reported to have a four-way laryngeal contrast: aspirated, voiceless, voiced, and "devoiced" (van Driem 2016). An acoustic analysis of twelve Drenjongke speakers shows that in addition to differences in VOT, there are systematic differences in F0 and F1 in the following vowel. Our analysis further suggests that high F1 after de-voiced consonants is controlled, rather than being an automatic consequence of long VOT. We conclude that Drenjongke speakers use at least three acoustic dimensions (VOT, F0 and F1) to distinguish the four-way laryngeal contrast.
Thesis
The aim of this work is to describe, using methods of acoustic analysis of speech, the realization of the plosive consonants of the Shughni language in initial and final positions.
Article
This paper presents a first detailed analysis of the Voice Onset Time (VOT) and Constriction Duration (CD) of stops /p t ʈ c k/ and flap /ɽ/ in the Indigenous Australian language Warlpiri as spoken in Lajamanu Community, in Australia’s Northern Territory. The results show that Warlpiri stops are realised as voiceless, long-lag stops word-initially, as well as word-medially, where /p t k/ are also characterised by CDs in excess of 100 ms. This is similar to what has been reported for Kriol, and for the emerging mixed language Light Warlpiri, also spoken in the community, and by some of the participants. The results indicate that Warlpiri does not obligatorily make a word-medial distinction between stops orthographically represented by ‘rt’ and ‘rd’, which have previously been argued to be realised as /ʈ/ and /ɽ/, respectively, at least in some varieties of Warlpiri. Finally, the results also suggest that the realisation of word-initial Warlpiri flap /ɽ/ is highly variable, potentially resulting in a near-merger with /ɻ/.
Preprint
Phonetic data on laryngeal contrasts in two-series systems are reconsidered, revealing incompatibility with both privative and binary approaches employing a feature [voice] in segment-oriented frameworks. An account of two-series laryngeal phonology within the Onset Prominence (OP) framework is presented. The OP model offers a conciliatory perspective on the privative-binary debate, making predictions that are compatible with available phonetic data on VOT, final laryngeal neutralization, and assimilation, and resolving a number of longstanding issues in laryngeal phonology.
Article
Punjabi (Western, ISO-639-3 pnb) is an Indo-Aryan language (Indo-European, Indo-Iranian) spoken in Pakistan and India, and in immigrant communities in the UK, Canada, USA, and elsewhere. In terms of number of native speakers, it is ranked 10th among the world’s languages, with more than 100 million speakers (Lewis, Simons & Fennig 2016). Aspects of the phonology of different varieties of Punjabi have been described in Jain (1934), Arun (1961), Gill & Gleason (1962), Singh (1971), Dulai & Koul (1980), Bhatia (1993), Malik (1995), Shackle (2003), and Dhillon (2010). Much of this literature is focused on Eastern varieties, and the phonology of Western Punjabi dialects has received relatively less attention (e.g. Bahri 1962, Baart 2003, 2014). https://sites.google.com/students.mq.edu.au/qandeelhussain/publications?authuser=0
Article
The multilingual and multicultural region of northern Pakistan, which has approximately 30 distinct languages, is described and evaluated from the perspective of language vitality, revealing the diverse and complex interplay of language policies, community attitudes and generational transmission. Based on the experience of conscious language maintenance efforts carried out in the area, some conclusions are offered concerning the particular effectiveness of regional networking and non-governmental institution support to promote local languages and sustain their vitality in times of great change. Web link: http://www.valentin.uu.se/research/in-house-publications/multiethnica/
Article
This study investigates the interaction between voice quality and pitch by revisiting the well-known case of Mandarin creaky voice. This study first provides several pieces of experimental data to assess whether the mechanism behind allophonic creaky voice in Mandarin is tied to tonal categories or is driven by phonetic pitch ranges. The results show that the presence of creak is not exclusively limited to tone 3, but can accompany any of the low pitch targets in the Mandarin tones; further, tone 3 is less creaky when the overall pitch range is raised, but more creaky when the overall pitch range is lowered. More importantly, tone 3 is not unique in this regard, and other tones such as tone 1 are also subject to similar variations. In sum, voice quality is quite systematically tied to F0 in Mandarin. Results from a pitch glide experiment further suggest that voice quality overall covaries with pitch height in a wedge-shaped function. Non-modal voice tends to occur when pitch production exceeds certain limits. Voice quality, thus, has the potential to enhance the perceptual distinctiveness of extreme pitch targets.
Article
The pronunciation of stop consonants varies markedly with age, gender, accent, etc. Yet by extracting appropriate cues common to these varying pronunciations, it is possible to correctly identify the spoken consonant. In this paper, the structure underlying Hindi stop consonants is presented. This understanding may potentially be used as a “recipe” for their artificial synthesis. Hindi alphabet stops were analyzed for this purpose. This alphabet has an organized and comprehensive inventory of stop consonants, and its consonants invariably terminate with the neutral vowel schwa. While the former consideration makes the findings potentially applicable to many languages including English, the latter rationale helped reduce the endeavor's analytical complexity. The alphabet has velar, palatal, retroflex, dental and bilabial stops in voiceless-unaspirated, voiceless-aspirated, voiced-unaspirated, voiced-aspirated, and nasal flavors. It is shown that additive combinations of relatively simple acoustic functions can be used to generate most of the 20 non-nasal stops. This work will potentially help speech therapists improve diagnosis and rectification of speech and hearing disabilities, speed up electronic communication of audio data, and improve voice recognition.
Article
One of the frequent questions by users of the mixed model function lmer of the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects. We have implemented Satterthwaite's method for approximating degrees of freedom for the t and F tests. We have also implemented the construction of Type I to III ANOVA tables. Furthermore, one may also obtain the summary as well as the anova table using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). The package also provides some other convenient mixed model analysis tools: a step method that performs backward elimination of nonsignificant effects (both random and fixed), calculation of population means, and multiple comparison tests, together with plot facilities.
Article
Substantial research has established that places of articulation of stop consonants (labial, alveolar, velar) are reliably differentiated using a number of acoustic measures such as closure duration, voice onset time (VOT), and spectral measures such as centre of gravity and the relative energy distribution in the mid-to-high spectral range of the burst. It is unclear, however, whether such measurable acoustic differences are present in multiple place of articulation contrasts among coronal stops. This article presents evidence from the highly endangered indigenous Australian language Wubuy, which maintains a 4-way coronal stop place contrast series in all word positions. The authors examine the temporal and burst characteristics of /t̪ t ʈ/ in three prosodic positions (utterance-initial, word-initial but phrase-medial, and word-medial). The results indicate that VOT, closure duration, and the spectral quality of the burst may indeed differentiate multiple coronal place contrasts in most positions, although measures that distinguish the apical contrast in absolute initial position remain elusive. The authors also examine measures (spectrum kurtosis, spectral tilt) previously used in other studies of multiple coronals in Australian languages. These results suggest that the authors' measures perform at least as well as those previously applied to multiple coronals in other Australian languages.
Article
This study investigates consonant-related F0 perturbations (“CF0”) in French and Italian by comparing the effects of voiced and voiceless obstruents on F0 to those of voiced sonorants. The voiceless obstruents /p f/ in both languages are found to have F0-raising properties similar to American English voiceless obstruents, while F0 following the (pre)voiced obstruents /b v/ in French and Italian patterns together with /m/, again similar to English [Hanson (2009). J. Acoust. Soc. Am. 125(1), 425–441]. In both languages, F0 is significantly depressed, relative to sonorants, during the closure for voiced obstruents, but cannot be differentiated from sonorants following the release of oral constriction. These findings are taken as support for a model on which F0 perturbations are fundamentally the result of laryngeal maneuvers initiated to sustain or inhibit phonation, regardless of other language-particular aspects of phonetic realization.
Article
Least-squares means are predictions from a linear model, or averages thereof. They are useful in the analysis of experimental data for summarizing the effects of factors, and for testing linear contrasts among predictions. The lsmeans package (Lenth 2016) provides a simple way of obtaining least-squares means and contrasts thereof. It supports many models fitted by R (R Core Team 2015) core packages (as well as a few key contributed ones) that fit linear or mixed models, and provides a simple way of extending it to cover more model classes.
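The idea of a least-squares mean as an average of model predictions can be illustrated without the lsmeans package itself. The Python sketch below is not the package's API: it assumes a saturated two-factor model, for which the fitted value of each cell is simply that cell's mean, and averages those predictions with equal weight over the levels of the second factor. All function names and the toy numbers in the usage note are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cell_means(rows):
    """Map (a, b) -> mean response.

    For a saturated two-factor model these are the fitted values per cell.
    """
    cells = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
    return {key: mean(values) for key, values in cells.items()}


def ls_means_for_a(rows):
    """Average the cell predictions over the levels of b with equal weight.

    Caveat: an empty (a, b) cell is silently skipped here, whereas a proper
    implementation would report that least-squares mean as non-estimable.
    """
    by_a = defaultdict(list)
    for (a, _b), cell_mean in cell_means(rows).items():
        by_a[a].append(cell_mean)
    return {a: mean(values) for a, values in by_a.items()}


def raw_means_for_a(rows):
    """Plain marginal means, which weight each cell by its sample size."""
    by_a = defaultdict(list)
    for a, _b, y in rows:
        by_a[a].append(y)
    return {a: mean(values) for a, values in by_a.items()}
```

With the made-up unbalanced rows `[("a1", "b1", 10.0), ("a1", "b1", 12.0), ("a1", "b1", 14.0), ("a1", "b2", 20.0)]`, the raw mean for `a1` is 14.0 because the `b1` cell dominates, while the least-squares mean is 16.0 because each cell prediction counts once. This difference is why such means are useful for summarizing factor effects in unbalanced experimental data.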
Conference Paper
Research suggests that nonsense and real words often exhibit differences in their acoustic properties. Despite this, the use of nonsense stimuli is prevalent in acoustic analyses of a range of phenomena and in experimental studies of segmental perception. The present study examined stop duration and preceding vowel formant transitions for two Bengali coronal stops produced in real and nonsense word stimuli. Firstly, significant differences were observed based on the stimulus type. Nonsense word production showed more distinct dental-retroflex differentiation. Secondly, the results revealed that F3 was a more reliable cue to place of articulation than closure duration and voice onset time.
Article
Hindko is an Indo-Aryan language that is mainly spoken in Khyber Pukhtoonkhaw province of Pakistan. This work aims to identify the oral stops of Hindko and determine the intrinsic acoustic cues for them. The phonemic analysis, based on minimal pairs and phoneme distribution in contrastive environments, reveals that Hindko has twelve oral stops in a three-way series. The acoustic analysis of these segments shows that voice onset time (VOT), closure duration and burst are reliable intrinsic cues that distinguish the stops of Hindko.
Article
This work reports cross-language differences in the voicing of initial voiced stops, and in the use of active maneuvers to achieve closure voicing, using multiparametric aerodynamic data. Oral pressure, oral and nasal flow, and acoustic data were obtained for utterance-initial /b d p t m/ for 10 speakers of Spanish, 6 speakers of French and 5 speakers of English. Voiced stops were classified as fully voiced or devoiced, and by shape of the oral pressure pulse (implosivized, other cavity enlarging maneuver) and/or occurrence of nasal flow (prenasalized) or oral flow (spirantized) during the stop phase in an attempt to relate aerodynamic data and actual glottal vibration to vocal-tract gestures and maneuvers to facilitate voicing. Maneuvers that favor the initiation (and sustaining) of voicing are then related to (i) language-specific differences in the use of glottal vibration during the constriction as a cue to voicing, (ii) place of articulation, and (iii) speaker dependent variation.
Article
Pakistan is a multilingual country with six major and over fifty-nine small languages. However, the languages of the domains of power (government, corporate sector, media, education, etc.) are English and Urdu. The state's policies have favored these two languages at the expense of others. This has resulted in the expression of ethnic identity through languages other than Urdu. It has also resulted in English having become a symbol of the upper class, sophistication and power. The less powerful indigenous languages of Pakistan are becoming markers of lower status and cultural shame. Some small languages are also on the verge of extinction. It is only by promoting additive multilingualism that Pakistani languages will gain vitality and survive as cultural capital rather than cultural stigma.
Article
Fiberscopic films and audio recordings were made of two native speakers of Hindi, producing # Ci, iCi, iC # utterances where C was one of the four types of stops and affricates. The voiced unaspirated type showed voicing through the whole consonant and no ab-/adduction gesture. The other three types, in the intervocalic case, all showed an ab-/adduction gesture, but the timing and the amplitude of this gesture differed for the three types. For the voiceless unaspirated type, the gesture started at the beginning of the consonant and ended at the release, whereas for the voiced aspirated type, it started at the release and ended at the end of the consonant. For the voiceless aspirated type, it started and ended with the consonant, reaching a glottal width approximately double of the latter two types. Similar results were obtained in initial and final position. Voice onset time values and durations of oral closure are examined. Pros and cons of the terms 'voiced aspirated', 'murmured aspirated', and 'voiced phonoaspirated' are discussed. Finally a mechanism underlying stop production is suggested.
Thesis
Madurese, a Western Malayo-Polynesian language spoken on the Indonesian island of Madura, exhibits a three-way laryngeal contrast distinguishing between voiced, voiceless unaspirated and voiceless aspirated stops and an unusual consonant-vowel (CV) co-occurrence restriction. The CV co-occurrence restriction is of phonological interest given the patterning of voiceless aspirated stops with voiced stops rather than with voiceless unaspirated stops, raising the question of what phonological feature they may share. Two features have been linked with the CV co-occurrence restriction: Advanced Tongue Root [ATR] and Lowered Larynx [LL]. However, as no evidence of voicing during closure for aspirated stops is observed and no other acoustic measures except voice onset time (VOT), fundamental frequency (F0), frequencies of the first (F1) and the second (F2) formants and closure duration relating to the proposed features have been conducted, it remains an open question which acoustic properties are shared by voiced and aspirated stops. Three main questions are addressed in the thesis. The first question is what acoustic properties voiced and voiceless aspirated stops share to the exclusion of voiceless unaspirated stops. The second question is whether [ATR] or [LL] accounts for the patterning together of voiceless aspirated stops with voiced stops. The third question is what the implications of the results are for a transparent phonetics-phonology mapping that expects phonological features to have phonetic correlates associated with them. In order to answer the questions, we looked into VOT, closure duration, F0, F1, F2 and a number of spectral measures, i.e. H1*-A1*, H1*-A2*, H1*-A3*, H1*-H2*, H2*-H4* and CPP. We recorded fifteen speakers of Madurese (8 females, 7 males) reading 188 disyllabic Madurese words embedded in a sentence frame. The results show that the three-way voicing categories in Madurese have different VOT values. 
The difference in VOT is robust between voiced stops on the one hand and voiceless unaspirated and voiceless aspirated stops on the other. Albeit statistically significant, the difference in VOT values between voiceless unaspirated and voiceless aspirated stops is relatively small. With regard to closure duration, we found that there is a difference between voiced stops on the one hand and voiceless unaspirated and aspirated stops on the other. We also found that female speakers distinguish F0 for the three categories while male speakers distinguish between F0 for voiced stops on the one hand and voiceless unaspirated and voiceless aspirated stops on the other. The results for spectral measures show that there are no significant differences in H1*-A1*, H1*-A3*, H1*-H2*, H2*-H4* and CPP between vowels adjacent to voiced and voiceless aspirated stops. In contrast, there are significant differences in these measures between vowels adjacent to voiced and voiceless unaspirated stops and between vowels adjacent to voiceless aspirated and voiceless unaspirated stops. Regarding the question whether voiced and voiceless aspirated stops share certain acoustic properties, our findings show that they do. The acoustic properties they share are H1*-A1* for both genders, H1*-H2* for females, H1*-A3* and H2*-H4* for males, and CPP for females at vowel onset and for males at vowel midpoint. However, they do not share such acoustic properties as VOT, closure duration and F0. Voiceless unaspirated and voiceless aspirated stops can be distinguished by VOT, F0 and spectral measures, i.e. H1*-A1*, H1*-A3*, H1*-H2*, H2*-H4* and CPP. However, these two voiceless stop categories have similar closure durations. As regards the question if [+ATR] or [+LL] might be responsible for the patterning together of voiceless aspirated stops with voiced stops, our findings suggest that either feature appears to be plausible. 
Acoustic evidence that lends support to the feature [+ATR] includes lower F1 and greater spectral tilt measures, i.e. H1*-A1*, H1*-A3*, H1*-H2* and H2*-H4*, and lower CPP values. Acoustic evidence that supports the feature [+LL] includes lower F1 and greater spectral tilt measures, i.e. H1*-A1*, H1*-A3*, H1*-H2* and H2*-H4*, and lower CPP values. However, the fact that voiceless aspirated stops are voiceless during closure raises a problem for the feature [+ATR] and the fact that F0 for voiceless aspirated stops is higher than for voiced stops also presents a problem for the feature [+LL]. The fact that not all acoustic measures fit in well with either feature is problematic for the idea that the relationship between phonetics and phonology is transparent in the sense that phonological features can be directly transformed into their phonetic correlates. Following the view that not all phonological features may be expected to be phonetically grounded, for example, when they are related to historical sound change, we hold the idea of a phonetics-phonology mapping which allows for other non-phonetic factors to account for a phonological phenomenon. We also provide historical and loanword evidence which could support the view that voiceless aspirated stops in Madurese may have derived from earlier voiced stops, which probably retain their historical laryngeal contrast through phonologisation.
Article
The phonological category “retroflex” is found in many Indo-Aryan languages; however, it has not been clearly established which acoustic characteristics reliably differentiate retroflexes from other coronals. This study investigates the acoustic phonetic properties of Punjabi retroflex /ʈ/ and dental /t̪/ in word-medial and word-initial contexts across /i e a o u/, and in word-final context across /i a u/. Formant transitions, closure and release durations, and spectral moments of release bursts are compared in 2280 stop tokens produced by 30 speakers. Although burst spectral measures and formant transitions do not consistently differentiate retroflexes from dentals in some vowel contexts, stop release duration and total stop duration reliably differentiate Punjabi retroflex and dental stops across all word contexts and vocalic environments. These results suggest that Punjabi coronal place contrasts are signaled by the complex interaction of temporal and spectral cues.
Article
Just over fifty years ago, Lisker and Abramson proposed a straightforward measure of acoustic differences among stop consonants of different voicing categories, Voice Onset Time (VOT). Since that time, hundreds of studies have used this method. Here, we review the original definition of VOT, propose some extensions to the definition, and discuss some problematic cases. We propose a set of terms for the most important aspects of VOT and a set of Praat labels that could provide some consistency for future cross-study analyses. Although additions of other aspects of realization of voicing distinctions (F0, amplitude, duration of voicelessness) could be considered, they are rejected as adding too much complexity for what has turned out to be one of the most frequently used metrics in phonetics and phonology.
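As a rough illustration of the measure this abstract reviews, VOT is the interval from the stop release to the onset of voicing, so voicing that starts before the release gives a negative value (voicing lead). The sketch below is only a minimal example under stated assumptions: the function names are hypothetical, the two timestamps are assumed to come from hand-placed annotations, and the 30 ms boundary between short lag and long lag is an illustrative default, not a value taken from the paper.

```python
def vot_ms(release_s, voicing_onset_s):
    """VOT in milliseconds: voicing onset time minus release time (both in seconds)."""
    return (voicing_onset_s - release_s) * 1000.0


def vot_category(vot, long_lag_threshold_ms=30.0):
    """Coarse three-way label; the threshold is an assumed, adjustable default."""
    if vot < 0:
        return "voicing lead"   # voicing begins before the release
    if vot < long_lag_threshold_ms:
        return "short lag"      # voicing begins shortly after the release
    return "long lag"           # long delay, typical of aspirated stops


# Hypothetical token: release at 0.100 s, voicing already under way from 0.015 s.
example = vot_ms(0.100, 0.015)
label = vot_category(example)
```

The threshold is exposed as a parameter because category boundaries differ across languages and places of articulation.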
Article
This paper presents the results of a preliminary investigation on the documentation of the language of Khetrans. The Khetrans, being one of the many Baloch tribes that speak a language apart from Balochi, primarily occupy the Barkhan district of Balochistan. Earlier observations describe their language as forming a part of Sindhi or being a type of Lahnda. Khetrani is undoubtedly a northwestern Indo-Aryan language, and the evidence at the researcher's disposal shows that it does share features with both Sindhi and Siraiki. Historically Khetrani lay well on a dialect continuum that spanned both of these languages and has preserved features intermediate to each. Its adjectival morphology is nearly the same as Siraiki's but the pronominal one closer to Sindhi. Khetrani verb structure is largely similar to and on par with Siraiki for it lacks the “richness” of Sindhi, despite the forms of many cognate verbs being identical to Sindhi. The most salient features of the verbal morphology aligning Khetrani with Siraiki are a sigmatic future and the continuous aspect. The valence model, however, is similar to Sindhi and Khetrani has a Passive Participle peculiar to itself. These features distinguish Khetrani as an independent language.
Article
This article demonstrates that implosives in Sindhi involve ingressive airflow, unlike the implosives in Hausa. The immediate consequence of this fact is that the proposal that there are no true implosives, i.e., those that involve suction, must be rejected. It also raises the question whether implosives should be characterized in phonological theory as sounds involving suction, or as sounds involving the lowering of the larynx. Comparison of the implosives in Sindhi with those of Hausa also demonstrates the need for including certain kinds of “phonetic implementational phenomena” in the domain of phonology.
Book
Kazakh: A Comprehensive Grammar is the first thorough analysis of Kazakh to be published in English. The volume is systematically organized to enable users to find information quickly and easily, and provides a thorough understanding of Kazakh grammar, with special emphasis given to syntax. Features of this book include: descriptions of phonology, morphology and syntax; examples from contemporary usage; tables summarizing discussions, for reference; a bibliography of works relating to Kazakh. Kazakh: A Comprehensive Grammar reflects the richness of the language, focusing on spoken and written varieties in post-Soviet Kazakhstan. It is an essential purchase for all linguists and scholars interested in Kazakh or in Turkic languages as well as advanced learners of Kazakh.
Book
This grammar provides a grammatical description of Palula, an Indo-Aryan language of the Shina group. The language is spoken by about 10,000 people in the Chitral district in Pakistan’s Khyber Pakhtunkhwa Province. This is the first extensive description of the formerly little-documented Palula language, and is one of only a few in-depth studies available for languages in the extremely multilingual Hindukush-Karakoram region. The grammar is based on original fieldwork data, collected over the course of about ten years, commencing in 1998. It is primarily in the form of recorded, mainly narrative, texts, but supplemented by targeted elicitation as well as notes of observed language use. All fieldwork was conducted in close collaboration with the Palula-speaking community, and a number of native speakers took active part in the process of data gathering, annotation and data management. The main areas covered are phonology, morphology and syntax, illustrated with a large number of example items and utterances, but also a few selected lexical topics of some prominence have received a more detailed treatment as part of the morphosyntactic structure. Suggestions for further research that should be undertaken are given throughout the grammar. The approach is theory-informed rather than theory-driven, but an underlying functional-typological framework is assumed. Diachronic development is taken into account, particularly in the area of morphology, and comparisons with other languages and references to areal phenomena are included insofar as they are motivated and available. The description also provides a brief introduction to the speaker community and their immediate environment. Online access link: http://langsci-press.org/catalog/book/82
Article
Measurements of formant frequencies and duration are reported for 8 Swedish vowels uttered by a male talker in three consonantal environments under varying timing conditions. An exponential function is used to describe the extent to which formant frequencies in the vowels reach their target values as a function of vowel-segment duration. A target is specified by the asymptotic values of the first two formant frequencies of the vowel and is independent of consonantal context and duration. It is thus an invariant attribute of the vowel. The results suggest an interpretation in terms of a simple dynamic model of vowel articulation.
Article
The computer-assisted learning of spoken language is closely tied to automatic speech recognition (ASR) technology which, as is well known, is challenging with non-native speech. By focusing on specific phonological differences between the target and source languages of non-native speakers, pronunciation assessment can be made more reliable. The four-way contrast of Hindi stops, where voicing and aspiration are phonemic for each of five distinct places-of-articulation, is typically challenging for a learner from a different native language group. The improper production of the aspiration contrast is thus often the salient cue to non-native accents of spoken Hindi. In this work, acoustic-phonetic features, motivated by an understanding of the production of the aspirated plosives, are evaluated for the classification of plosives along the aspiration dimension. Several new acoustic measures are proposed for the reliable detection of the aspiration contrast in unvoiced and voiced plosives. The acoustic-phonetic features are shown to perform well in the two-way classification task, and also appear robust to cross-language transfer where statistical models trained on Marathi speech were tested on native Hindi utterances. In experiments on native and non-native utterances of Hindi words by Tamil-L1 speakers, the acoustic-phonetic features clearly separate the non-native speakers from native on pronunciation quality of aspirated plosives. The acoustic-phonetic features also outperformed an ASR system based on more generic spectral features in terms of phone-level feedback that was consistent with human judgement.
Article
Recent articles by Voegelin and Voegelin (1965) and Kachru (1969) presented erroneous listings of the so-called "Dardic" languages. These listings were based on Grierson's now outdated classification, and they did not reflect the clear division between the Nūristānī (Kāfir) languages, which constitute a separate branch of Indo-Iranian, and the other Dardic languages, which are Indo-Aryan, as stated by Morgenstierne (1961). The present article points out the errors in the Voegelins' and Kachru's lists and updates Morgenstierne's scheme in the light of recent field research in the Hindu-Kush region of Afghanistan.
Article
Previous research has shown that in languages like English, the implementation of voicing in voiced obstruents is affected by linguistic factors such as utterance position, stress, and the adjacent sound. The goal of the current study is to extend previous findings in two ways: (1) investigate the production of voicing in connected read speech instead of in isolation/carrier sentences, and (2) understand the implementation of partial voicing by examining where in the constriction voicing appears or dies out. The current study examines the voicing of stops and fricatives in the connected read speech of 37 speakers. Results confirm that phrase position, word position, lexical stress, and the manner and voicing of the adjacent sound condition the prevalence of voicing, but they have different effects on stops and fricatives. The analysis of where voicing is realized in the constriction interval shows that bleed from a preceding sonorant is common, but voicing beginning partway through the constriction interval (i.e., negative voice onset time) is much rarer. The acoustic, articulatory, and aerodynamic sources of the patterns of phonation found in connected speech are discussed.
Article
Pitjantjatjara is an Australian language with five stop places of articulation /p t ʈ c k/ in three vowel contexts /a i u/. We present word-medial stop burst data from nine speakers, examining duration, formant, spectral moment and spectral tilt measures. Our particular focus is on the apical contrast (alveolar /t/ vs. retroflex /ʈ/) and on the alveo-palatal /c/ vs. velar /k/ contrast. We observe differences between the palatal and the velar depending on vowel context, and we discuss the possible aerodynamic and acoustic sources for these differences. By contrast, we find that differences between the alveolar and the retroflex are minimal in all three vowel contexts. Unexpectedly, in the context of /i/, various spectral measures suggest that the articulatory release for the retroflex /ʈ/ is in fact more anterior than the release for the alveolar /t/; we discuss this result in terms of possible articulatory overshoot of the target for /ʈ/ before /i/, and suggest that this result provides additional explanation for the cross-linguistic rarity of retroflexes in an /i/ vowel context.
Conference Paper
There is evidence that coronal contrasts involving retroflexes are less clearly distinguished after a high front vowel /i/ [2, 9]. However, no detailed acoustic studies have been conducted to investigate whether following front vowels affect the contrastiveness of dentals, retroflexes and palatals. We examined the acoustic characteristics of three Punjabi coronal onsets /t̪ ʈ tʃ/ produced before five vowels /i e a u o/ by 12 Punjabi speakers. The results showed that only VOT and spectral variance of the release burst reliably distinguished Punjabi coronal stops in all vocalic contexts. Centre of gravity, skewness and kurtosis of release bursts did not differentiate the coronals in the /i/ context, but did distinguish them before /e a u o/. These findings shed more light on the phonetic basis of coronal contrasts in Indo-Aryan languages, and the ways that they interact with different following vowels.
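The burst measures named in this abstract (centre of gravity, variance, skewness, kurtosis) are the first four spectral moments, obtained by treating the burst's power spectrum as a probability distribution over frequency. The following is a minimal sketch of that calculation, not the authors' procedure: the function name is hypothetical, the spectrum is assumed to be supplied already as paired lists of bin frequencies and power values, and kurtosis is reported as excess kurtosis (fourth standardized moment minus 3), which is one common convention.

```python
def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, variance, skewness, excess kurtosis).

    freqs_hz and power are equal-length sequences describing a power spectrum;
    the power values are normalised so that they act as weights summing to 1.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]

    # First moment: power-weighted mean frequency.
    cog = sum(f * w for f, w in zip(freqs_hz, weights))

    def central(k):
        # k-th central moment about the centre of gravity.
        return sum(((f - cog) ** k) * w for f, w in zip(freqs_hz, weights))

    variance = central(2)
    if variance == 0:
        return cog, 0.0, 0.0, 0.0  # all energy in a single frequency bin
    skewness = central(3) / variance ** 1.5
    kurtosis = central(4) / variance ** 2 - 3.0
    return cog, variance, skewness, kurtosis


# Toy symmetric spectrum with three bins (illustrative values only).
cog, var, skew, kurt = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
```

In practice the spectrum would come from a short windowed FFT over the release burst; that step is left out here to keep the example dependency-free.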
Article
Previous research has shown that F0 is positively related to H1*‐H2* across male speakers of English [Iseli et al. (2006)] and to H1‐H2 (after inverse filtering) within individual male speakers of Dutch [Swerts and Veldhuis (2001)]. That is, males who have overall higher‐pitched voices generally have overall higher values of H1*‐H2* (cross‐speaker relation), and as an individual male’s F0 goes up, H1‐H2 generally also goes up (within‐speaker relation). The present study investigates both of these relations, cross‐speaker and within‐speaker, for male and female speakers of English and Mandarin, and extends them to a large set of voice quality measures. The speech samples consist of repeated rising and falling tone sweeps, in which speakers began at a self‐selected comfortable pitch, and then swept either up or down in pitch to their highest or lowest comfortable pitch. The beginnings of the sweeps are tested for cross‐speaker relations, while the entire sweeps are tested for within‐speaker relations. VOICESAUCE, a new program for voice analysis, is used to extract F0, energy, cepstral peak prominence, formants and bandwidths, and a variety of harmonic amplitude measures. Many measures are shown to be strongly related to F0. [Work supported by NSF.]
Article
This paper presents jaw movement data from two female speakers of Central Arrernte, focusing on the four coronal places of articulation (dental, alveolar, retroflex and alveo-palatal) across stop, nasal and lateral manners of articulation. It also presents spectral burst data for the stop consonants. Results suggest that when there is a clear spectral peak for the release burst, as is the case primarily for the alveo-palatal, but also for the alveolar and retroflex, the jaw remains high at stop release; but when the spectral burst has a relatively diffuse spectrum, as is the case for the dental, the jaw begins to lower before stop release in anticipation of the following vowel. In addition, there is evidence that the highest jaw target for the alveo-palatal is timed not for the release of the stop closure, but for the frication portion which follows. These results are interpreted as lending support for the view that the lower teeth play a role in stop burst release, similar to their role in sibilant fricative production. Finally, with regard to the retroflex consonants, there is evidence that the continuous upwards jaw movement during acoustic closure is associated with a low jaw position for the initial posterior placement of the tongue tip, followed by a final high jaw position at the anterior release of the tongue tip closure. These results support the view that both biomechanical and acoustic considerations play a role in speech planning and coarticulation.
Article
Speech samples (720 CVC words) from 10 adult male Nepali speakers are analyzed with the aid of a video spectrograph. The distributions of VOT based on group data for each of four phonemic stop categories show that only three of the categories can be differentiated by VOT alone: voice lead, short-lag and long-lag stops. The fourth category, voiced aspirate, contains VOT values from both pre- and postrelease areas of the VOT timeline. Analysis of individual data reveals marked intersubject variability in the VOT distribution of the voiced aspirate category supporting the necessity of multiple subject samples in acoustically based cross-linguistic studies.