ABSTRACT: In most collections of segmental speech errors, exchanges are less frequent than anticipations and perseverations. However, it has been suggested that in inner speech, exchanges might be more frequent than either anticipations or perseverations, because many half-way repaired errors (Yew…uhh..New York) are classified as repaired anticipations, but may equally well be half-way repaired exchanges. In this paper it is demonstrated for experimentally elicited speech errors that in inner speech, exchanges are indeed more frequent than anticipations and perseverations. The predominance of exchanges can be explained by assuming a mechanism of planning and serial ordering of segments during the generation of speech that is qualitatively similar to the scan-copier model proposed by Shattuck-Hufnagel (Sublexical units and suprasegmental structure in speech production planning. In P.F. MacNeilage (Ed.), The production of speech (pp. 109–136). New York: Springer).
Journal of Memory and Language 01/2013; 68(1):26–38. · 2.80 Impact Factor
ABSTRACT: Motor resonance processes are involved both in language comprehension and in affect perception. Therefore we predict that listeners understand spoken affective words more slowly if the phonetic form of a word is incongruent with its affective meaning. A language comprehension study involving an interference paradigm confirmed this prediction. This interference suggests that affective phonetic cues contribute to language comprehension. A perceived smile or frown affects the listener, and hearing an incongruent smile or frown impedes our comprehension of spoken words.
ABSTRACT: In native speech, durational patterns convey linguistically relevant phenomena such as phrase structure, lexical stress, rhythm, and word boundaries. The lower intelligibility of non-native speech may be partly due to its deviant durational patterns. The present study aims to quantify the relative contributions of non-native durational patterns and of non-native speech sounds to intelligibility. In a Speech Reception Threshold study, duration patterns were transplanted between native and non-native versions of Dutch sentences. Intelligibility thresholds (critical speech-to-noise ratios) differed by about 4 dB between the matching versions with unchanged durational patterns. Results for manipulated versions suggest that about 0.4–1.1 dB of this difference was due to the durational patterns, and that this contribution was larger if the native and non-native patterns were more deviant. The remainder of the difference must have been due to non-native speech sounds in these materials. This finding supports recommendations to attend to durational patterns as well as native-like speech sounds, when learning to speak a foreign language.
ABSTRACT: Speech impairment often occurs in patients after treatment for head and neck cancer. New treatment modalities such as surgical reconstruction or (chemo)radiation techniques aim at sparing anatomical structures that are correlated with speech and swallowing. In randomized trials investigating efficacy of various treatment modalities or speech rehabilitation, objective speech analysis techniques may help improve speech outcome assessment. The goal of the present study is to investigate the role of objective acoustic-phonetic analyses in a multidimensional speech assessment protocol.
Speech recordings of 51 patients (6 months after reconstructive surgery and postoperative radiotherapy for oral or oropharyngeal cancer) and of 18 control speakers were subjectively evaluated regarding intelligibility, nasal resonance, articulation, and patient-reported speech outcome (speech subscale of the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Head and Neck 35 module). Acoustic-phonetic analyses were performed to calculate formant values of the vowels /a, i, u/, vowel space, air pressure release of /k/ and spectral slope of /x/.
Intelligibility, articulation, and nasal resonance were best predicted by vowel space and /k/. Within patients, /k/ and /x/ differentiated tumor site and stage. Various objective speech parameters were related to speech problems as reported by patients.
Objective acoustic-phonetic analysis of speech of patients is feasible and contributes to further development of a speech assessment protocol.
Folia Phoniatrica et Logopaedica 02/2009; 61(3):180-7. · 1.08 Impact Factor
ABSTRACT: This paper investigates phonological recursion by means of early accent placement (stress shift), which marks the initial boundary of a phonological phrase. The question is whether or not this early pitch accent placement can be applied recursively to phonological phrases that are embedded in larger phonological phrases. This was investigated in a map task experiment, with various Dutch phonological phrases as landmarks drawn on the map. The target phrases consisted of a noun modified by either one adjective, of the type aardrijkskùndig genóotschap ‘geographical society’, or by two adjectives, of the type Amsterdàms aardrijkskùndig genóotschap, i.e. syntactically recursive noun phrases. An early pitch accent was realized on both the first and the second adjective in 30% of the spoken syntactically recursive phrases: e.g. Àmsterdams àardrijkskundig genóotschap. These prosodically recursive structures indicate that recursion may apply in phonology, as it does in syntax.
ABSTRACT: Alaryngeal speakers (speakers in whom the larynx has been removed) have inconsistent control over acoustic parameters such as F0 and duration. This study investigated whether proficient tracheoesophageal and oesophageal speakers consistently convey phrase boundaries. It was further investigated whether these alaryngeal speakers used the same hierarchy of acoustic boundary cues that is found in normal speakers. A perception experiment revealed that listeners identified prosodic boundaries less accurately in oesophageal speakers. Acoustic analyses showed that laryngeal speakers used pre-boundary lengthening and pitch movements at phrase boundaries, as expected. Tracheoesophageal speakers used pre-boundary lengthening and pauses, and oesophageal speakers used pauses, to convey phrase boundaries. Two oesophageal speakers also paused inappropriately, within phrases. Although these two speakers differentiated between air-injection and prosodic pauses, listeners were unable to tell the two types of pauses apart. Alaryngeal speakers might benefit from therapy that specifically teaches them how to optimize their prosodic abilities.
ABSTRACT: Speech tempo (articulation rate) varies both between and within speakers. The present study investigates several factors affecting tempo in a corpus of spoken Dutch, consisting of interviews with 160 high-school teachers. Speech tempo was observed for each phrase separately, and analyzed by means of multilevel modeling of the speaker's sex, age, country, and dialect region (between speakers) and length, sequential position of phrase, and autocorrelated tempo (within speakers). Results show that speech tempo in this corpus depends mainly on phrase length, due to anticipatory shortening, and on the speaker's country, with different speaking styles in The Netherlands (faster, less varied) and in Flanders (slower, more varied). Additional analyses showed that phrase length itself is shorter in The Netherlands than in Flanders, and decreases with speaker's age. Older speakers tend to vary their phrase length more (within speakers), perhaps due to their accumulated verbal proficiency.
The Journal of the Acoustical Society of America 03/2008; 123(2):1104-13. · 1.65 Impact Factor
ABSTRACT: This paper reports two experiments designed to investigate whether lexical bias in phonological speech errors is caused by immediate feedback of activation, by self-monitoring of inner speech, or by both. The experiments test a number of predictions derived from a model of self-monitoring of inner speech. This model assumes that, after an error in inner speech, (1) an early interruption of speech may be made when speech was initiated too hastily, (2) the error may be covertly repaired, leading to the correct target, (3) the error may be covertly replaced by another speech error, or (4) an error may go undetected, leading to a completed spoonerism. This model of self-monitoring was supported by the speech errors observed in two SLIP experiments. The pattern of results supports the idea that lexical bias has two sources, immediate feedback of activation and self-monitoring of inner speech.
ABSTRACT: Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not nested. Traditional ANOVAs are compared against these crossed mixed-effects models, for simulated and real data. Results indicate that the mixed-effects method has a lower risk of capitalization on chance (Type I error). Second, mixed-effects models of logistic regression (generalized linear mixed models, GLMM) are discussed and demonstrated with simulated binomial data. Mixed-effects models effectively solve the “language-as-fixed-effect fallacy”, and have several other advantages. In conclusion, mixed-effects models provide a superior method for analyzing psycholinguistic data.
ABSTRACT: This study investigates how listeners cope with gradient forms of deletion of word-final /t/ when recognising words in a phonological context that makes /t/-deletion viable. A corpus study confirmed a high incidence of /t/-deletion in an /st#b/ context in Dutch. A discrimination study showed that differences between released /t/, unreleased /t/ and fully deleted /t/ in this specific /st#b/ context were salient. Two on-line experiments were carried out to investigate whether lexical activation might be affected by this form variation. Even though unreleased and released variants were processed equally fast by listeners, a detailed analysis of the unreleased condition provided evidence for gradient activation. Activating a target ending in /t/ is slowest for the most reduced variant because phonological context has to be taken into account. Importantly, activation for a target with /t/ in the absence of cues for /t/ is reduced if there is a surface-matching lexical competitor.
ABSTRACT: Certain types of speech, e.g. lists of words or numbers, are usually spoken with highly regular inter-stress timing. The main hypothesis of this study (derived from the Dynamic Attending Theory) is that listeners attend in particular to speech events at these regular time points. Better timing regularity should improve spoken-word perception. Previous studies have suggested only a weak effect of speech rhythm on spoken-word perception, but the timing of inter-stress intervals was not controlled in these studies. A phoneme monitoring experiment is reported, in which listeners heard lists of disyllabic words in which the timing of the stressed vowels was either regular (with equidistant inter-stress intervals) or irregular. In addition, metrical expectancy was controlled by varying the stress pattern of the target word, as either the same or the opposite of the stress pattern in its preceding words. Resulting reaction times show a main effect of timing regularity, but not of metrical expectancy. These results suggest that listeners employ attentional rhythms in spoken-word perception, and that regular speech timing improves speech communication.
ABSTRACT: Data from repeated measures experiments are usually analyzed with conventional ANOVA. Three well-known problems with ANOVA are the sphericity assumption, the design effect (sampling hierarchy), and the requirement for complete designs and data sets. This tutorial explains and demonstrates multi-level modeling (MLM) as an alternative analysis tool for repeated measures data. MLM allows us to estimate variance and covariance components explicitly. MLM does not require sphericity, it takes the sampling hierarchy into account, and it is capable of analyzing incomplete data. A fictitious data set is analyzed with MLM and ANOVA, and analysis results are compared. Moreover, existing data from a repeated measures design are re-analyzed with MLM, to demonstrate its advantages. Monte Carlo simulations suggest that MLM yields higher power than ANOVA, in particular under realistic circumstances. Although technically complex, MLM is recommended as a useful tool for analyzing repeated measures data from speech research.
ABSTRACT: Certain types of speech, e.g. lists of words or numbers, are usually spoken with a clear speech rhythm. Salient, stressed vowels are aligned to rhythmic points within the phrase period. The main hypothesis of this study (derived from the Dynamic Attending Theory; M.R. Jones (1976), Psych. Rev. 83, 323–355) is that listeners attend in particular to speech events at these rhythmic time points. Better rhythmic regularity should improve spoken-word perception. Previous studies that suggested only a weak effect of speech rhythm on spoken-word perception suffered from poor control over speech rhythm in their stimuli. A phoneme monitoring experiment is reported, in which listeners heard lists of disyllabic words in which the rhythm of the stressed vowels was either regular (with equidistant inter-stress intervals) or jittery. In addition, metrical expectancy was controlled by varying the stress pattern of the target word, as either the same or the opposite of the stress pattern in its preceding words. Resulting RTs show a main effect of rhythmic regularity, but not of metrical expectancy. Rhythmic regularity had its strongest effect in the perception of iambic words. These results suggest that listeners employ attentional rhythms in spoken-word perception, and that a clear rhythm improves speech communication.
Keywords: rhythm; dynamic attending; stress; speech; spoken-word recognition
Rhythmic regularity and metrical expectancy in spoken-word perception
ABSTRACT: Highly proficient alaryngeal speakers are known to convey prosody successfully. The present study investigated whether alaryngeal speakers not selected on grounds of proficiency were able to convey pitch accent (a pitch accent is realized on the word that is in focus, cf. Bolinger, 1958). The participating speakers (10 tracheoesophageal, 9 esophageal, and 10 laryngeal [control] speakers) produced sentences in which accent was cued by the preceding context. For each utterance, a group of listeners identified which word conveyed accent. All speakers were able to convey accent. Acoustic analyses showed that some alaryngeal speakers had little or no control over fundamental frequency. Contrary to expectation, these speakers did not compensate by using nonmelodic cues, whereas speakers using F0 did use nonmelodic cues. Thus, temporal and intensity cues are concomitant with the use of F0; if F0 is affected, these nonmelodic cues will be as well. A pitch perception experiment confirmed that alaryngeal speakers who had no control over F0 and who did not use nonmelodic cues were nevertheless able to produce pitch movements. Speakers with no control over F0 apparently relied on an alternative pitch system to convey accents and other pitch movements.
Journal of Speech Language and Hearing Research 01/2003; 45(6):1106-18. · 1.97 Impact Factor
ABSTRACT: In this study we investigate whether speakers, in line with the predictions of the Hyper- and Hypospeech theory, speed up most during the least informative parts and less during the more informative parts, when they are asked to speak faster. We expected listeners to benefit from these changes in timing, and our main goal was to find out whether making the temporal organisation of artificially time-compressed speech more like that of natural fast speech would improve intelligibility over linear time compression. Our production study showed that speakers reduce unstressed syllables more than stressed syllables, thereby making the prosodic pattern more pronounced. We extrapolated fast speech timing to even faster rates because we expected that the more salient prosodic pattern could be exploited in difficult listening situations. However, at very fast speech rates, applying fast speech timing worsens intelligibility. We argue that the non-uniform way of speeding up may not be due to an underlying communicative principle, but may result from speakers’ inability to speed up otherwise. As both prosodic and segmental information contribute to word recognition, we conclude that extrapolating fast speech timing to extremely fast rates distorts this balance between prosodic and segmental information.
ABSTRACT: In phrases like thirteen men, stress in thirteen is often shifted forward from its canonical final position. Presumably, the occurrence of this optional stress shift may be partly controlled by the rhythm of speech. Work on rhythmic speech production has demonstrated that given a repetition cycle, T, its harmonic fractions like T/2 attract stressed vowel onsets. Comparing phrases like ceMENT thirTEEN and GALaxy thirTEEN, differing in the number of weak syllables between strong ones, it was predicted that, during rhythmic production, the harmonic locations would attract shifted stress. Since shifting stress results in more even distribution of syllables through the cycle, we expected that faster repetition rates would also result in more stress shift. Dependent variables were the relative stress in the second word of each pair, and the location of onset of the nuclear vowel of the stressed syllable. Results confirmed the predictions, first, that with more intermediate unstressed syllables, stress was shifted forward more often (thereby locating the stressed vowel onset closer to T/2) and, second, that stress shifted forward more often at faster speaking rates. [Work supported by Fulbright Visiting Scholar program and by Utrecht University, The Netherlands.]
The Journal of the Acoustical Society of America 01/2002 · 1.65 Impact Factor