Article

Variable pronunciations reveal dynamic intra-speaker variation in speech planning

Abstract

In two speech production experiments, we investigated the link between phonetic variation and the scope of advance planning at the word form encoding stage. We examined cases where a word has, in addition to the pronunciation of the word in isolation, a context-specific pronunciation variant that appears only when the following word includes specific sounds. To the extent that the speaker uses the variant specific to the following context, we can infer that the phonological content of the upcoming word is included in the current planning scope. We hypothesize that the time alignment between selection of the phonetic variant in the currently-being-encoded word and retrieval of segmental details of the upcoming word is variable from moment to moment depending on current task demands and the dynamics of lexical access for each word involved. The results showed that the use of a context-sensitive phonetic variant of /t/ (“flapping”) by English speakers reliably increased under conditions which favor advance planning. Our hypothesis was supported by evidence compatible with its three key predictions: an increase in flapping in phrases with a higher frequency following word, more flapping in a procedure with a response delay relative to a speeded response, and an attenuation of the following word frequency effect with delayed responses. This reveals that within speakers, the degree of advance planning varies continuously from moment to moment, reflecting (in part) the accessibility of form properties of individual words in the utterance.

Article
Predictability has been shown to be associated with many dimensions of variation in speech, including durational variation and variable omission of segments. However, the mechanism or mechanisms that underlie these effects are still unclear. This paper presents data on a new aspect of predictability in speech, namely how it affects allophonic variation. We examine two coronal stop allophones in English, flap and glottal stop, and find that their relationship with predictability is quite different from what is expected under current theories of probabilistic reduction in speech. Flapping is more likely when the word that follows is more predictable, but is not influenced by the frequency of the word itself, while glottal stops are more likely in words that are less predictable. We propose that the crucial distinction between these two allophones is how they are conditioned by phonological context. This, we argue, interacts with online speech planning processes and gives rise to variability for context-dependent allophones. This hypothesis offers a specific, testable mechanism for certain predictability effects, and has the potential to extend to other factors that contribute to variability in speech.
Article
Interactive models of language production predict that it should be possible to observe long-distance interactions; effects that arise at one level of processing influence multiple subsequent stages of representation and processing. We examine the hypothesis that disruptions arising in nonform-based levels of planning—specifically, lexical selection—should modulate articulatory processing. A novel automatic phonetic analysis method was used to examine productions in a paradigm yielding both general disruptions to formulation processes and, more specifically, overt errors during lexical selection. This analysis method allowed us to examine articulatory disruptions at multiple levels of analysis, from whole words to individual segments. Baseline performance by young adults was contrasted with young speakers’ performance under time pressure (which previous work has argued increases interaction between planning and articulation) and performance by older adults (who may have difficulties inhibiting nontarget representations, leading to heightened interactive effects). The results revealed the presence of interactive effects. Our new analysis techniques revealed these effects were strongest in initial portions of responses, suggesting that speech is initiated as soon as the first segment has been planned. Interactive effects did not increase under response pressure, suggesting interaction between planning and articulation is relatively fixed. Unexpectedly, lexical selection disruptions appeared to yield some degree of facilitation in articulatory processing (possibly reflecting semantic facilitation of target retrieval) and older adults showed weaker, not stronger interactive effects (possibly reflecting weakened connections between lexical and form-level representations).
Article
This study investigates the interaction of lexical access and articulation in spoken word production, examining two dimensions along which theories vary. First, does articulatory variation reflect a fixed plan, or do lexical access-articulatory interactions continue after response initiation? Second, to what extent are interactive mechanisms hard-wired properties of the production system, as opposed to flexible? In two picture naming experiments, we used semantic neighbour manipulations to induce lexical and conceptual co-activation. Our results provide evidence for multiple sources of interaction, both before and after response initiation. While interactive effects can vary across participants, we do not find strong evidence of variation of effects within individuals, suggesting that these interactions are relatively fixed features of each individual’s production system.
Article
Many phonological processes can be affected by segmental context spanning word boundaries, which often lead to variable outcomes. This paper tests the idea that some of this variability can be explained by reference to production planning. We examine coronal stop deletion (CSD), a variable process conditioned by preceding and upcoming phonological context, in a corpus of spontaneous British English speech, as a means of investigating a number of variables associated with planning: Prosodic boundary strength, word frequency, conditional probability of the following word, and speech rate. From the perspective of production planning, (1) prosodic boundaries should affect deletion rate independently of following context; (2) given the locality of production planning, the effect of the following context should decrease at stronger prosodic boundaries; and (3) other factors affecting planning scope should modulate the effect of upcoming phonological material above and beyond the modulating effect of prosodic boundaries. We build a statistical model of CSD realization, using pause length as a quantitative proxy for boundary strength, and find support for these predictions. These findings are compatible with the hypothesis that the locality of production planning constrains variability in speech production, and have practical implications for work on CSD and other variable processes.
Article
One of the frequent questions by users of the mixed model function lmer of the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects. We have implemented Satterthwaite's method for approximating degrees of freedom for the t and F tests. We have also implemented the construction of Type I-III ANOVA tables. Furthermore, one may also obtain the summary as well as the anova table using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). The package also provides other convenient mixed model analysis tools: a step method that performs backward elimination of nonsignificant effects (both random and fixed), calculation of population means, and multiple comparison tests, together with plot facilities.
Conference Paper
The "fraction of locally unvoiced frames" measure in Praat's Voice Report (VR) is an automated method of obtaining the percentage of a segment which is voiced, but its accuracy has been called into question due to values that change based on scrolling and zooming in Praat's viewing window and don't always match manual voicing segmentation. This study offers statistical support for the accuracy of VR when certain guidelines are followed: (1) use the object window; (2) decrease the time step to increase temporal resolution; and (3) use gender-specific pitch ranges. The closure and frication portions of 277 affricates were analyzed using VR in this way and the results were compared to manual voicing segmentation using paired Wilcoxon tests. The results show that there is no significant difference between VR and manual segmentation, regardless of whether only the closure portion, only the frication portion, or the entire affricate is considered.
Article
This study investigated to what extent advance planning during sentence production is affected by a concurrent cognitive load. In two picture-word interference experiments in which participants produced subject-verb-object sentences while ignoring auditory distractor words, we assessed advance planning at a phonological (lexeme) and at an abstract-lexical (lemma) level under visuospatial or verbal working memory (WM) load. At the phonological level, subject and object nouns were found to be activated before speech onset with concurrent visuospatial WM load, but only subject nouns were found to be activated with concurrent verbal WM load, indicating a reduced planning scope as a function of type of WM load (Experiment 1). By contrast, at the abstract-lexical level, subject and object nouns were found to be activated regardless of type of concurrent load (Experiment 2). In both experiments, sentence planning had a more detrimental effect on concurrent verbal WM task performance than on concurrent visuospatial WM task performance. Overall, our results suggest that advance planning at the phonological level is more affected by a concurrently performed verbal WM task than advance planning at the abstract-lexical level. Also, they indicate an overlap of resources allocated to phonological planning in speech production and verbal WM.
Article
The number of phonological neighbours to a word (PND) can affect its lexical planning and pronunciation. Similar parallel effects on planning and articulation have been observed for other lexical variables, such as a word's contextual predictability. Such parallelism is frequently taken to indicate that effects on articulation are mediated by effects on the time course of lexical planning. We test this mediation assumption for PND and find it unsupported. In a picture naming experiment, we measure speech onset latencies (planning), word durations, and vowel dispersion (articulation). We find that PND predicts both latencies and durations. Further, latencies predict durations. However, the effects of PND and latency on duration are independent: parallel effects do not imply mediation. We discuss the consequences for accounts of lexical planning, articulation, and the link between them. In particular, our results suggest that ease of planning does not explain effects of PND on articulation.
Article
Speech production and reading aloud studies have much in common, especially the last stages involved in producing a response. We focus on the minimal planning unit (MPU) in articulation. Although most researchers now assume that the MPU is the syllable, we argue that it is at least as small as the segment based on negative response latencies (i.e., response initiation before presentation of the complete target) and longer initial segment durations in a reading aloud task where the initial segment is primed. We also discuss why such evidence was not found in earlier studies. Next, we rebut arguments that the segment cannot be the MPU by appealing to flexible planning scope whereby planning units of different sizes can be used due to individual differences, as well as stimulus and experimental design differences. We also discuss why negative response latencies do not arise in some situations and why anticipatory coarticulation does not preclude the segment MPU. Finally, we argue that the segment MPU is also important because it provides an alternative explanation of results implicated in the serial vs. parallel processing debate.
Article
The analysis of experimental data with mixed-effects models requires decisions about the specification of the appropriate random-effects structure. Recently, Barr et al. (2013) recommended fitting 'maximal' models with all possible random effect components included. Estimation of maximal models, however, may not converge. We show that failure to converge typically is not due to a suboptimal estimation algorithm, but is a consequence of attempting to fit a model that is too complex to be properly supported by the data, irrespective of whether estimation is based on maximum likelihood or on Bayesian hierarchical modeling with uninformative or weakly informative priors. Importantly, even under convergence, overparameterization may lead to uninterpretable models. We provide diagnostic tools for detecting overparameterization and guiding model simplification. Finally, we clarify that the simulations on which Barr et al. base their recommendations are atypical for real data. A detailed example is provided of how subject-related attentional fluctuation across trials may further qualify statistical inferences about fixed effects, and of how such nonlinear effects can be accommodated within the mixed-effects modeling framework.
Article
The literature on advance phonological planning in adjective-noun phrases (NPs) presents diverging results: while many experimental studies suggest that the entire NP is encoded before articulation, other results favor a span of encoding limited to the first word. Although cross-linguistic differences in the structure of adjective-NPs may account for some of these contrasting results, divergences have been reported even among similar languages and syntactic structures. Here we examined whether inter-individual differences account for variability in the span of phonological planning in the production of French NPs, where previous results indicated encoding limited to the first word. The span of phonological encoding is tested with the picture-word interference (PWI) paradigm using phonological distractors related to the noun or to the adjective of the NPs. In Experiment 1, phonological priming effects were limited to the first word in adjective NPs whichever the position of the adjective (pre-nominal or post-nominal). Crucially, phonological priming effects on the second word interacted with speakers' production speed suggesting different encoding strategies for participants. In Experiment 2, we tested this hypothesis further with a larger group of participants. Results clearly showed that slow and fast initializing participants presented different phonological priming patterns on the last element of adjective-NPs: while the first word was primed by a distractor for all speakers, only the slow speaker group presented a priming effect on the second element of the NP. These results show that the span of phonological encoding is modulated by inter-individual strategies: in experimental paradigms some speakers plan word by word whereas others encode beyond the initial word. We suggest that the diverging results reported in the literature on advance phonological planning may partly be reconciled in light of the present results.
Article
We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK word frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the word frequencies based on the British National Corpus and the SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech specific word frequencies, word frequencies in children's programmes, and word bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale, which we hope will stop the current misunderstandings of the word frequency effect.
Article
This chapter discusses the temporal patterns of rapid movement sequences in speech and typewriting and what these patterns might mean in relation to the advance planning or motor programming of such sequences. The chapter discusses response factors that affect the time to initiate a prespecified rapid movement sequence after a signal when the goal is to complete the sequence as quickly as possible as well as how such factors affect the rate at which movements in the sequence are produced. The response factor of central interest is number of elements in the sequence. The effect of the length of a movement sequence on its latency is based partly on the possibility that it reflects a latency component used for advance planning of the entire sequence: The length effect would then measure the extra time required to prepare extra elements. The idea that changes in reaction time might reflect changes in sequence preparation in this way proposed that simple reaction time increased with the number of elements in a sequence of movements made with one arm. A part of the reaction time includes the time to gain access to stored information concerning the whole sequence: a process akin to loading a program into a motor buffer, with sequences containing more elements requiring larger programs, and larger programs requiring more loading time.
Article
Five experiments explored the influence of repeated phonemes on the production of short utterances. In Experiment 1 coloured object naming showed faster latencies when colour and object started with the same phoneme (‘green goat’) than when they did not; the opposite was found when colour and object were named on consecutive trials (‘green’ – ‘goat’). Experiments 2 and 3 focused on adjective-noun phrases and showed no effect of repeated phonemes on either acoustical duration of speeded responses, or latencies in a delayed variant of the task, suggesting a higher-level – rather than articulatory – locus of the effect. Experiments 4 and 5 demonstrated that the facilitation induced by repeated segments is not specific to word onset (‘green chain’) and is independent of whether or not the repeated phonemes occupy the same within-word position (‘green flag’). These results indicate that in the production of multiple words, word forms are concurrently activated and evoke phonological segments represented in a position-nonspecific manner.
Conference Paper
We present a preliminary analysis of transcriber consistency in labeling and segmentation of words and phones in the Buckeye corpus of spontaneous, informal speech. We find that pairwise inter-transcriber agreement on exact phone label match was 76%, and segmentation agreement within 20% of phone pair length was 75%, though longer phones are more consistently segmented than shorter phones. Patterns of consistency variation in labeling are observed as a function of phonetic categories that are similar to patterns reported for read speech. More agreement is seen on consonants than on vowels, and on fricatives and labials than on other consonant classes. In general, we find that shorter, more reduced words and phones result in more transcriber disagreement.
Article
In the present article, we introduce OpenSesame, a graphical experiment builder for the social sciences. OpenSesame is free, open-source, and cross-platform. It features a comprehensive and intuitive graphical user interface and supports Python scripting for complex tasks. Additional functionality, such as support for eyetrackers, input devices, and video playback, is available through plug-ins. OpenSesame can be used in combination with existing software for creating experiments.
Article
The current study addresses the extent of phonological planning during spontaneous sentence production. Previous work shows that at articulation, phonological encoding occurs for entire phrases, but encoding beyond the initial phrase may be due to the syntactic relevance of the verb in planning the utterance. I conducted three experiments to investigate whether phonological planning crosses multiple grammatical phrase boundaries (as defined by the number of lexical heads of phrase) within a single phonological phrase. Using the picture-word interference paradigm, I found in two separate experiments a significant phonological facilitation effect for both the verb and noun of sentences like "He opens the gate." I also altered the frequency of the direct object and found longer utterance initiation times for sentences ending with a low-frequency vs. high-frequency object, offering further support that the direct object was phonologically encoded at the time of utterance initiation. That phonological information for post-verbal elements was activated suggests that the grammatical importance of the verb does not restrict the extent of phonological planning. These results suggest that the phonological phrase is a unit of planning, where all elements within a phonological phrase are encoded before articulation. Thus, consistent with other action sequencing behavior, there is significant phonological planning ahead in sentence production.
Article
We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of ‘culturomics,’ focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
Article
In Mandarin Chinese, speakers benefit from fore-knowledge of what the first syllable but not of what the first phonemic segment of a disyllabic word will be (Chen, Chen, & Dell, 2002), contrasting with findings in English, Dutch, and other Indo-European languages, and challenging the generality of current theories of word production. In this article, we extend the evidence for the language difference by showing that failure to prepare onsets in Mandarin (Experiment 1) applies even to simple monosyllables (Experiments 2-4), and confirm the contrast with English for comparable materials (Experiments 5 and 6). We also provide new evidence that Mandarin speakers do reliably prepare tonally unspecified phonological syllables (Experiment 7). To account for these patterns, we propose a language general proximate units principle whereby intentional preparation for speech as well as phonological-lexical coordination are grounded at the first phonological level below the word at which explicit unit selection occurs. The language difference arises because syllables are proximate units in Mandarin Chinese, whereas segments are proximate in English and other Indo-European languages. The proximate units perspective reconciles the aspiration toward a language general account of word production with the reality of substantial cross-linguistic differences.
Article
Word frequency is the most important variable in research on word processing and memory. Yet, the main criterion for selecting word frequency norms has been the availability of the measure, rather than its quality. As a result, much research is still based on the old Kucera and Francis frequency norms. By using the lexical decision times of recently published megastudies, we show how bad this measure is and what must be done to improve it. In particular, we investigated the size of the corpus, the language register on which the corpus is based, and the definition of the frequency measure. We observed that corpus size is of practical importance for small sizes (depending on the frequency of the word), but not for sizes above 16-30 million words. As for the language register, we found that frequencies based on television and film subtitles are better than frequencies based on written sources, certainly for the monosyllabic and bisyllabic words used in psycholinguistic research. Finally, we found that lemma frequencies are not superior to word form frequencies in English and that a measure of contextual diversity is better than a measure based on raw frequency of occurrence. Part of the superiority of the latter is due to the words that are frequently used as names. Assembling a new frequency norm on the basis of these considerations turned out to predict word processing times much better than did the existing norms (including Kucera & Francis and Celex). The new SUBTL frequency norms from the SUBTLEX(US) corpus are freely available for research purposes from http://brm.psychonomic-journals.org/content/supplemental, as well as from the University of Ghent and Lexique Web sites.
Article
Some theories of lexical access in production locate the effect of lexical frequency at the retrieval of a word's phonological characteristics, as opposed to the prior retrieval of a holistic representation of the word from its meaning. Yet there is evidence from both normal and aphasic individuals that frequency may influence both of these retrieval processes. This inconsistency is especially relevant in light of recent attempts to determine the representation of another lexical property--age of acquisition or AoA--whose effect is similar to that of frequency. To further explore the representations of these lexical variables in the word retrieval system, we performed hierarchical, multinomial logistic regression analyses of 50 aphasic patients' picture-naming responses. While both log frequency and AoA had a significant influence on patient accuracy and led to fewer phonologically related errors and omissions, only log frequency had an effect on semantically related errors. These results provide evidence for a lexical access process sensitive to frequency at all stages, but with AoA having a more limited effect.
Article
Picture-word interference experiments conducted with Italian speakers investigated how determiners are selected in noun phrase (NP) production. Determiner production involves the selection of a noun's syntactic features (mass or count, gender), which specify the type of determiner to be selected, and the subsequent selection of a particular phonological form (e.g., the/a in English). The research focused on the syntactic feature of gender. Results repeatedly failed to replicate the gender-congruity effect in NP production reported with Dutch speakers (longer latencies for target-distractor noun pairs with contrasting as opposed to the same gender). It is proposed that the discrepant results reflect processing differences in lexical access in Italian and Dutch: The selection of determiners in Italian, but not in Dutch, depends on phonological properties of the word that follows it in the NP. Evidence consistent with this explanation was obtained in an experiment in which determiner selection in NP production was hindered by conflicting phonological information in the NP.
Article
Four language production experiments examine how English speakers plan compound words during phonological encoding. The experiments tested production latencies in both delayed and online tasks for English noun-noun compounds (e.g., daytime), adjective-noun phrases (e.g., dark time), and monomorphemic words (e.g., denim). In delayed production, speech onset latencies reflect the total number of prosodic units in the target sentence. In online production, speech latencies reflect the size of the first prosodic unit. Compounds are metrically similar to adjective-noun phrases as they contain two lexical and two prosodic words. However, in Experiments 1 and 2, native English speakers treated the compounds as single prosodic units, indistinguishable from simple words, with RT data statistically different than that of the adjective-noun phrases. Experiments 3 and 4 demonstrate that compounds are also treated as single prosodic units in utterances containing clitics (e.g., dishcloths are clean) as they incorporate the verb into a single phonological word (i.e. dishcloths-are). Taken together, these results suggest that English compounds are planned as single recursive prosodic units. Our data require an adaptation of the classic model of phonological encoding to incorporate a distinction between lexical and postlexical prosodic processes, such that lexical boundaries have consequences for post-lexical phonological encoding.
Article
One of the most persistent arguments against the segment as the minimal planning unit is that the seemingly ubiquitous, thus, presumed obligatory, nature of anticipatory coarticulation (AC) effects favors the syllable or a larger unit. By contrast, we present the results of 3 experiments showing that AC is not ubiquitous, but graded and variable based on (a) phonological availability and (b) the specific criterion to initiate articulation adopted by a speaker. We further argue that phonological encoding is parallel. These results point to (a) the segment, and not the syllable, as the minimal planning unit and (b) a flexible planning scope. Implications with respect to the current formulation of AC regarding phonological availability and the minimal unit of speech articulation are discussed.
Article
This paper examines the factors conditioning the production of linguistic variables in real time by individual speakers: what we term the dynamics of variation in individuals. We propose a framework that recognizes three types of factors conditioning variation: sociostylistic, internal linguistic, and psychophysiological. We develop two main points against this background. The first is that sequences of variants produced by individuals display systematic patterns that can be understood in terms of sociostylistic conditioning and psychophysiological conditioning. The second is that psychophysiological conditioning and internal linguistic conditioning are distinct in their mental implementations; this claim has implications for understanding the locality of the factors conditioning alternations, the universality and language-specificity of variation, and the general question of whether grammar and language use are distinct. Questions about the dynamics of variation in individuals are set against community-centered perspectives to argue that findings in the two domains, though differing in explanatory focus, can ultimately be mutually informative.
Article
This study examines the production of words the pronunciation of which depends on the phonological context. Participants produced adjective-noun phrases starting with the French determiner un. The pronunciation of this determiner requires a liaison consonant before vowels. Naming latencies and determiner acoustic durations were shorter when the adjective and the noun both started with vowels or both with consonants, than when they had different onsets. These results suggest that the liaison process is not governed by the application of a local contextual phonological rule; they rather favor the hypothesis that pronunciation variants with and without the liaison consonant are stored in memory.
Article
This study examined whether the brain operations involved in processing successive words during multi-word noun phrase production take place sequentially or simultaneously. German speakers named pictures while ignoring a written distractor superimposed on the picture (picture-word interference paradigm) using the definite determiner and corresponding German noun. The gender congruency and the phonological congruency (i.e., overlap in first phonemes) between target and distractor were manipulated. Naming responses and EEG were recorded. The behavioural performance replicated both the phonology and the gender congruency effects (i.e., shorter naming latencies for gender congruent than incongruent and for phonologically congruent than incongruent trials). The phonological and gender manipulations also influenced the EEG data. Crucially, the two effects occurred in different time windows and over different sets of electrodes. The phonological effect was observed substantially earlier than the gender congruency effect. This finding suggests that the processing of determiners and nouns during determiner noun phrase production occurs at least partly sequentially.
Article
Speakers usually produce words in connected speech. In such contexts, the form in which many words are uttered is influenced by the phonological properties of neighboring words. The current article examines the representations and processes underlying the production of phonologically constrained word form variations. For this purpose, we consider determiners whose form is sensitive to phonological context (e.g., in English: a car vs. an animal; in French: le chien 'the dog' vs. l'âne 'the donkey'). Two hypotheses have been proposed regarding how these words are processed. Determiners either are thought to have different representations for each of their surface forms, or they are thought to have only 1 representation while other forms are generated online after selection through a rule-based process. We tested the predictions derived from these 2 views in 3 picture naming experiments. Participants named pictures using determiner-adjective-noun phrases (e.g., la nouvelle table 'the new table'). Phonologically consistent or inconsistent conditions were contrasted, based on the phonological onsets of the adjective and the noun. Results revealed shorter naming latencies for consistent than for inconsistent sequences (i.e., a phonological consistency effect) for all the determiner types tested. Our interpretation of these findings converges on the assumption that determiners with varying surface forms are represented in memory with multiple phonological-lexical representations. This conclusion is discussed in relation to models of determiner processing and models of lexical variability. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Article
Four experiments demonstrate effects of prosodic structure on speech production latencies. Experiments 1 to 3 exploit a modified version of the Sternberg et al. (1978, 1980) prepared speech production paradigm to look for evidence of the generation of prosodic structure during the final stages of sentence production. Experiment 1 provides evidence that prepared sentence production latency is a function of the number of phonological words that a sentence comprises when syntactic structure, number of lexical items, and number of syllables are held constant. Experiment 2 demonstrated that production latencies in Experiment 1 were indeed determined by prosodic structure rather than the number of content words that a sentence comprised. The phonological word effect was replicated in Experiment 3 using utterances with a different intonation pattern and phrasal structure. Finally, in Experiment 4, an on-line version of the sentence production task provides evidence for the phonological word as the preferred unit of articulation during the on-line production of continuous speech. Our findings are consistent with the hypothesis that the phonological word is a unit of processing during the phonological encoding of connected speech.
Article
It is quite normal for us to produce one or two million word tokens every year. Speaking is a dear occupation and producing words is at the core of it. Still, producing even a single word is a highly complex affair. Recently, Levelt, Roelofs, and Meyer (1999) reviewed their theory of lexical access in speech production, which dissects the word-producing mechanism as a staged application of various dedicated operations. The present paper begins by presenting a bird's-eye view of this mechanism. We then square the complexity by asking how speakers control multiple access in generating simple utterances such as a table and a chair. In particular, we address two issues. The first one concerns dependency: Do temporally contiguous access procedures interact in any way, or do they run in modular fashion? The second issue concerns temporal alignment: How much temporal overlap of processing does the system tolerate in accessing multiple content words, such as table and chair? Results from picture-word interference and eye tracking experiments provide evidence for restricted cases of dependency as well as for constraints on the temporal alignment of access procedures.
Article
The present paper examines the plausibility of two models of flapping in American English: (1) a traditional model of flapping as a categorical switch from stop to flap production in a specified linguistic environment, and (2) a model of flapping in which flapping arises as a by-product of articulatory changes associated with the general implementation of prosodic structure. These models are tested against a corpus of X-ray microbeam records of English speakers producing utterances with word-final coronal consonants in the appropriate segmental context for flapping, but in varied prosodic locations. Tokens were submitted to perceptual, acoustic, and articulatory analyses. Results show that listeners consistently transcribe the presence of flaps according to acoustic differences in the presence of voicing during closure and a release burst. Transcriptions and lingual measurements, however, suggest that the difference between flaps and [d] is associated with gradient differences in lingual positioning. Some articulatory correlates of perceived flapping correspond to predictions of a model of increased co-production of vowels and consonants yielding lenited stops heard as flaps, but others do not. Problems raised by these results for both traditional and prosodic by-product models are discussed.
Article
Four experiments investigated the span of advance planning for phrases and short sentences. Dutch undergraduates were presented with pairs of objects, which they named using noun-phrase conjunctions (e.g., the translation equivalent of "the arrow and the bag") or sentences ("the arrow is next to the bag"). Each display was accompanied by an auditory distractor, which was related in form or meaning to the 1st or 2nd noun of the utterance or unrelated to both. For sentences and phrases, the mean speech onset time was longer when the distractor was semantically related to the 1st or 2nd noun and shorter when it was phonologically related to the 1st noun than when it was unrelated. No phonological facilitation was found for the 2nd noun. Findings suggest that before utterance onset both target lemmas and the 1st target form were selected. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Investigated the timing relations between phonological encoding—that is, the generation of an abstract phonological representation of a to-be-produced utterance—and the initiation of articulation. Specifically, the authors examined whether a speaker completes phonological encoding of a complete word before articulation is initiated. This question was investigated for the production of German no-determiner noun phrases (e.g., roter Tisch, "red table") in 4 experiments, each with 16 college students as participants. Results showed reliable facilitation effects for distractors that are identical to the first syllable of the first word of the noun phrase. For the second syllable of the first word, only weak facilitation effects were obtained. However, additional analyses showed that two groups of speakers can be distinguished, one showing only facilitation effects for the first syllable of the first word, and the other showing an additional facilitation effect for the second syllable of the first word. It is suggested that (1) the (phonological) word is not the lower limit of phonological encoding before articulation can be initiated, and that (2) speakers can adjust the size of the advance planning unit at the phonological level based on the specific speaking context. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Phonological evidence supports the frequency-based model proposed in the article by Nick Ellis. Phonological reduction occurs earlier and to a greater extent in high-frequency words and phrases than in low-frequency ones. A model that accounts for this effect needs both an exemplar representation to show phonetic variation and the ability to represent multiword combinations. The maintenance of alternations conditioned by word boundaries, such as French liaison, also provides evidence that multiword sequences are stored and can accrue representational strength. The reorganization of phonetic exemplars in favor of the more frequent types provides evidence for some abstraction in categories beyond the simple registration of tokens of experience.
Article
In 7 experiments the authors investigated the locus of word frequency effects in speech production. Exp 1 demonstrated a frequency effect in picture naming that was robust over repetitions. Exps 2, 3, and 7 excluded contributions from object identification and initiation of articulation. Exps 4 and 5 investigated whether the effect arises in accessing the syntactic word (lemma) by using a grammatical gender decision task. Although a frequency effect was found, it dissipated under repeated access to a word's gender. Exp 6 tested whether the robust frequency effect arises in accessing the phonological form (lexeme) by having Ss translate words that produced homophones. Low-frequent homophones behaved like high-frequent controls, inheriting the accessing speed of their high-frequent homophone twins. Because homophones share the lexeme, not the lemma, this suggests a lexeme-level origin of the robust effect. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Producing a word to express a meaning requires the processes of lexical selection and phonological encoding. We argue that lexical selection is influenced by contextual constraint and phonological encoding by word frequency, and we use these variables to assess the processing relations between selection and encoding. In two experiments we examined latencies to name pictures presented within sentences. The sentences varied in degree of constraint, whereas the target picture-names varied in frequency. In both experiments, targets that followed constraining sentences showed substantially reduced frequency effects. When the targets followed incongruent sentence frames, the frequency effect returned. The results offer new support for the predictions of cascade theories of word production.
Article
The incremental approach to language production assumes that the production system interleaves planning and articulation processes. Two experiments examined this assumption. In the first, participants stated the sums of two two-digit numbers in one of three different kinds of utterances, the sum by itself, the sum followed by the sequence “is the answer,” or the frame “The answer is” followed by the sum. Problem difficulty was manipulated as well, so that in some conditions, speakers could (in principle) state the tens component of the sum while planning the ones. Latencies to begin to speak were the same for all three utterance types and were affected by the difficulty of the problem as a whole. Utterance durations were unaffected by problem difficulty. In the second experiment, participants were induced to speak incrementally through the use of a deadline procedure. Both latencies and utterance durations were influenced by the difficulty of the problem. This latter finding supports a basic premise of the incremental approach: Speakers sometimes speak and plan simultaneously. Nevertheless, the language production system appears not to be architecturally incremental; instead, the extent to which people speak incrementally is under strategic control.
Article
Griffin [Griffin, Z. M. (2003). A reversed length effect in coordinating the preparation and articulation of words in speaking. Psychonomic Bulletin & Review, 10, 603–609.] found that speakers naming object pairs spent more time before utterance onset looking at the second object when the first object name was short than when it was long. She proposed that this reversed length effect arose because the speakers’ decision when to initiate an utterance was based, in part, on their estimate of the spoken duration of the first object name and the time available during its articulation to plan the second object name. In Experiment 1 of the present study, participants named object pairs. They spent more time looking at the first object when its name was monosyllabic than when it was trisyllabic, and, as in Griffin’s study, the average gaze-speech lag (the time between the end of the gaze to the first object and onset of its name, which corresponds closely to the pre-speech inspection time for the second object) showed a reversed length effect. Experiments 2 and 3 showed that this effect was not due to a trade-off between the time speakers spent looking at the first and second object before speech onset. Experiment 4 yielded a reversed length effect when the second object was replaced by a symbol (x or +), which the participants had to categorise. We propose a novel account of the reversed length effect, which links it to the incremental nature of phonological encoding and articulatory planning rather than the speaker’s estimate of the length of the first object name.
Article
Our study addresses the scope of phonological advance planning during sentence production using a novel experimental procedure. The production of German sentences in various syntactic formats (SVO, SOV, and VSO) was cued by presenting pictures of the agents of previously memorized agent–action–patient scenes. To tap the phonological activation of the agent (encoded as sentence subject) and the patient (encoded as sentence object), auditory distractor words, which were either phonologically related to the subject, or to the object, or were unrelated to both, were used. Compared with the unrelated control condition, distractors related to the subject in the utterance-initial phrase facilitated the response, while distractors related to the subject or to the object appearing in a non-initial phrase interfered with the response. Because some automatic phonological activation resulting from the perceptual processing of the visual stimulus cannot be responsible for the object-related interference effects, our results suggest that phonological advance planning exceeds a single syntactic phrase and may even span a whole simple sentence. The data are discussed in the context of current models of phonological encoding.
Article
Three picture-word interference experiments addressed the question of whether the scope of grammatical advance planning in sentence production corresponds to some fixed unit or rather is flexible. Subjects produced sentences of different formats under varying amounts of cognitive load. When speakers described 2-object displays with simple sentences of the form "the frog is next to the mug," the 2 nouns were found to be lexically-semantically activated to similar degrees at speech onset, as indexed by similarly sized interference effects from semantic distractors related to either the first or the second noun. When speakers used more complex sentences (including prenominal color adjectives; e.g., "the blue frog is next to the blue mug") much larger interference effects were observed for the first than the second noun, suggesting that the second noun was lexically-semantically activated before speech onset on only a subset of trials. With increased cognitive load, introduced by an additional conceptual decision task and variable utterance formats, the interference effect for the first noun was increased and the interference effect for the second noun disappeared, suggesting that the scope of advance planning had been narrowed. By contrast, if cognitive load was induced by a secondary working memory task to be performed during speech planning, the interference effect for both nouns was increased, suggesting that the scope of advance planning had not been affected. In all, the data suggest that the scope of advance planning during grammatical encoding in sentence production is flexible, rather than structurally fixed.
Article
The form of a determiner is dependent on different contextual factors: in some languages grammatical number and grammatical gender determine the choice of a determiner variant. In other languages, the phonological onset of the element immediately following the determiner affects selection, too. Previous work has shown that the activation of opposing determiner forms by a noun's grammatical properties leads to slower naming latencies in a picture naming task, as does the activation of opposing forms by the interaction between a noun's gender and the phonological context. The present paper addresses the question of whether phonological context alone is sufficient to evoke competition between determiner forms. Participants produced English phrases in which a noun phrase's phonology required a determiner that was the same as or differed from the determiner required by the noun itself (e.g., a purple giraffe; an orange giraffe). Naming latencies were slower when the phrase-initial determiner differed from the determiner required by the noun in isolation than when the phrase-initial determiner matched the isolated-noun determiner. This was true both for definite and indefinite determiners. The data show that during the production of a determiner-noun phrase, nouns automatically activate the phonological forms of their determiners, which can compete with the phonological forms that are generated by an assimilation rule.
Article
Phrase-final words tend to be lengthened and followed by a pause. The dominant view of prosodic production is that word lengthening and pausing reflect the syntax of a sentence. The author demonstrates that, instead, lengthening and pausing reflect a distinctly prosodic representation, in which phonological constituents are arranged in a hierarchical, nonrecursive structure. Prosodic structure is created without knowledge of words' phonemic content. As a result, within a single sentential position, greater word lengthening necessitates shorter pauses, but across positions, word and pause durations show a positive correlation. The author presents a model of prosodic production that describes the process of prosodic encoding and provides a quantitative specification of the relation between word lengthening and pausing. This model has implications for studies of language production, comprehension, and development.
Article
In numerous languages determiner forms depend not only on semantic information but also on several other kinds of information, such as the grammatical gender of the controlling noun or the phonological properties of the context. In the present research we contrasted two possible accounts of determiner retrieval: one in which every type of required information is bundled into a unitized representation for determiner retrieval and one in which each type of information can individually activate determiner forms. These alternative hypotheses were investigated in three experiments in which native speakers of French named pictures with simple [determiner + noun] or complex [determiner + adjective + noun] noun phrases. In the experiments, the properties of the contextual cues that drive the retrieval of the determiner were manipulated; for example, we manipulated the number of determiner forms that are compatible with a given grammatical gender and the number of grammatical genders that a given determiner form can be used with. Neither hypothesis can fully account for the results of the three experiments. However, a hybrid hypothesis that combines the principal features of the two hypotheses provides a good account of the data.