Language and Speech

Published by SAGE Publications
Print ISSN: 0023-8309
The first part of this study examined (Parisian) French-learning 11-month-old infants' recognition of the six definite and indefinite French articles: le, la, les, un, une, and des. The six articles were compared with pseudo articles in the context of disyllabic or monosyllabic nouns, using the Head-turn Preference Procedure. The pseudo articles were similar to real articles in terms of phonetic composition and phonotactic probability, and real and pseudo noun phrases were alike in terms of overall prosodic contour. In three experiments, 11-month-old infants showed a preference for real over pseudo articles, suggesting they have the articles' word-forms stored in long-term memory. The second part of the study evaluates several hypotheses about the role of articles in 11-month-old infants' word recognition. Evidence from three experiments supports the view that articles help infants to recognize the following words. We propose that 11-month-olds have the capacity to parse noun phrases into their constituents, which is consistent with the more general view that function words define a syntactic skeleton that serves as a basis for parsing spoken utterances. This proposition is compared to a competing account, which argues that 11-month-olds recognize noun phrases as whole words.
In this paper, sung speech is used as a methodological tool to explore temporal variability in the timing of word-internal consonants and vowels. It is hypothesized that temporal variability/stability becomes clearer under the varying rhythmical conditions induced by song. This is explored crosslinguistically in German - a language that exhibits a potential vocalic quantity distinction - and the non-quantity languages French and Russian. Songs by non-professional singers, i.e. parents that sang to their infants aged 2 to 13 months in a non-laboratory setting, were recorded and analyzed. Vowel and consonant durations at syllable contacts of trochaic word types with CVCV or CV:CV structure were measured under varying rhythmical conditions. Evidence is provided that in German non-professional singing, the two syllable structures can be differentiated by two distinct temporal variability patterns: vocalic variability (and consonantal stability) was found to be dominant in CV:CV structures whereas consonantal variability (and vocalic stability) was characteristic for CVCV structures. In French and Russian, however, only vocalic variability seemed to apply. Additionally, findings suggest that the different temporal patterns found in German were also supported by the stability pattern at the tonal level. These results point to subtle (supra) segmental timing mechanisms in sung speech that affect temporal targets according to the specific prosodic nature of the language in question.
Vowels used in the experiment 
Previous research has shown that English infants are sensitive to mispronunciations of vowels in familiar words as early as 15 months of age. These results suggest that infants are sensitive not only to large mispronunciations of the vowels in words, but also to smaller mispronunciations involving changes to only one dimension of the vowel. The current study broadens this research by comparing infants' sensitivity to the different types of changes involved in the mispronunciations. These included changes to the backness, height, and roundedness of the vowel. Our results confirm that 18-month-olds are sensitive to small changes to the vowels in familiar words. Our results also indicate a differential sensitivity of vocalic specification, with infants being more sensitive to changes in vowel height and vowel backness than vowel roundedness. Taken together, the results provide clear evidence for specificity of vowels and vocalic features such as vowel height and backness in infants' lexical representations.
Listeners hearing an ambiguous speech sound flexibly adjust their phonetic categories in accordance with lipread information telling what the phoneme should be (recalibration). Here, we tested the stability of lipread-induced recalibration over time. Listeners were exposed to an ambiguous sound halfway between /t/ and /p/ that was dubbed onto a face articulating either /t/ or /p/. When tested immediately, listeners exposed to lipread /t/ were more likely to categorize the ambiguous sound as /t/ than listeners exposed to /p/. This aftereffect dissipated quickly with prolonged testing and did not reappear after a 24-hour delay. Recalibration of phonetic categories is thus a fragile phenomenon.
A previous study (Brady, Shankweiler, and Mann, 1983) demonstrated inferior speech repetition abilities for poor readers with degraded stimuli. The present study, in contrast, used clear listening conditions. Third-grade average and below-average readers were tested on a word repetition task with monosyllabic, multisyllabic, and pseudoword stimuli. No group differences were obtained on speed of responding, and the lack of reaction time differences between reading groups was corroborated on a control task which measured verbal response time to nonspeech stimuli. However, below-average readers were significantly less accurate at repeating the multisyllabic and pseudoword stimuli. This evidence is compatible with the hypothesis that encoding difficulties contribute to the memory deficits characteristic of poor readers.
It is possible to regard imitation within at least three theoretical frameworks. First, imitation may be regarded as a special case of another more general form of learning (Miller and Dollard, 1941; Gewirtz, 1971). Secondly, it is possible to consider imitation as a unique process which may be accounted for in its own right (Bandura, 1969). A third theory of imitation has its origin in the work of Piaget (1951) and suggests that imitation is only one aspect of the total functioning of an individual and lies between accommodation and assimilation.
The Delay of Principle B Effect (DPBE) has been discussed in various studies that show that children around age 5 seem to violate Principle B of Binding Theory (Chomsky, 1981, and related works) when the antecedent of the pronoun is a name, but not when the antecedent is a quantifier. The analysis we propose can explain the DPBE in languages of the Dutch-English type, and its exemption in languages with (dis)placed pronouns (clitics). In both types of languages, the phenomenon arises when children have to compare two alternative representations for equivalence. The principle that induces the comparison is different in the two cases, however. The comparison made by children speaking languages with pronouns occurring within the VP is induced by Grodzinsky and Reinhart's (1993) Rule I. However, the comparison made by children speaking languages where the pronouns occur above the VP is induced by Scope Economy. In both cases the result is similar: the children take guesses in the process of interpreting the anaphoric dependency, thereby performing at chance level.
Twenty-eight subjects were presented with computer generated grammatically deviant strings and asked to carry out two tasks on each of two experimental days. Task 1 was a forced-choice experiment in which 50 pairs of strings were presented aurally to each subject, and he had to select that member of the pair which he felt was the best approximation to a good English sentence. In Task 2, subjects were required to read and rank each string on a scale running from 1 (completely unacceptable) to 5 (completely acceptable). A different order of stimulus presentation was employed on each experimental day; 14 subjects were assigned to one order on the first day and received the other order on the second day. Results show that subjects tend to prefer the same statement over orders and that rank and preference are highly correlated. There are considerable differences in preference among the 50 pairs of stimulus items. Analysis of the data suggests that this task yields information relevant to the linguistic and in particular the syntactic competence of subjects when applied to grammatically deviant strings. Subjects appear to be trying to cope with the statements by comparing them to acceptable syntactic and/or semantic patterns.
Two experiments are reported which were designed to increase understanding of the process of abstracting from complex verbal passages. The experiments test three hypotheses: that a primacy effect will be observed (H1), that this effect will be more pronounced in the case of subjects who have little interest in the passage to be summarized (H2), and that interest in the passage will be positively related to the amount of material abstracted (H3). The experiments are of the same design. This requires that two versions of a passage be used, and that the order in which the halves of the passage are presented be varied. Results of the two experiments are consistent with the validity of H1, but do not support H2 or H3. The need to compare abstraction behaviour with recall performance is emphasized.
Rebuttal analogies (e.g., "Politicians arguing over the renaming of an airport is like watering your petunias when your house is on fire") are commonly used as responses in verbal conflicts. The following study investigated the role that irony mapping, absurdity comparison, and argumentative convention play in interpreters' derivations of speakers' intentions in using rebuttal analogies. In general, these intentions are to demonstrate the unsoundness of the opposed proposition ("argument") and its advocate ("social attack") (Whaley & Holloway, 1996). Three rebuttal analogy types were rated on argumentativeness and social attack in verbal conflict and nonverbal conflict scenarios. The results of six experiments on 120 participants showed that analogies with ironic bases (e.g., "Doubling the defense budget in order to intimidate North Korea is like using a chainsaw to file your nails") were perceived as more of a social attack and as more argumentative than analogies with absurd (e.g., "... using ketchup to wax your car") and nonironic (e.g., "... using a nailfile to file your nails") bases. No difference was found between the two scenario types. Norming data confirmed equivalence of absurdity of ironic and absurd bases, and greater irony of ironic over absurd bases. The results thus implicate hearers' use of the ironic structure between bases and targets in the interpretation of rebuttal analogies rather than mere absurdity comparison or argumentative convention.
In this study, a sentence verification task was used to determine the effect of a foreign accent on sentence processing time. Twenty native English listeners heard a set of English true/false statements uttered by ten native speakers of English and ten native speakers of Mandarin. The listeners assessed the truth value of the statements, and assigned accent and comprehensibility ratings. Response latency data indicated that the Mandarin-accented utterances required more time to evaluate than the utterances of the native English speakers. Furthermore, utterances that were assigned low comprehensibility ratings tended to take longer to process than moderately or highly comprehensible utterances. However, there was no evidence that degree of accent was related to processing time. The results are discussed in terms of the "costs" of speaking with a foreign accent, and the relevance of such factors as accent and comprehensibility to second language teaching.
It is generally recognized that the most significant cue to accent location is fundamental frequency (F0) in both Japanese and English. Furthermore, it is widely believed that a syllable is perceived as accented if the syllable contains an F0 peak. However, Sugito (1972) found that, in Japanese, if an F0 peak is followed by a steep F0 fall, the syllable preceding the F0 peak may be perceived as accented. In this article we present two experiments which investigate the relationship between F0 peak and F0 fall rate in accent perception for Japanese and English. The first experiment confirms that, for Japanese, both F0 peak location and F0 fall rate affect listeners' judgements of accent location. Specifically, the later the F0 peak occurs in a given syllable, relative to the syllable boundary, the greater the F0 fall rate necessary for listeners to perceive the preceding syllable as accented. The second experiment shows that this phenomenon is not unique to Japanese: Perception of accent location in English is also influenced by both F0 peak location and post-peak F0 fall rate.
Mean prosodic appropriateness ratings on a scale of 1 
Mean RT in ms ("yes" responses only), %YES (standard 
Four experiments investigated the effect of syntactic argument structure on the evaluation and comprehension of utterances with different patterns of pitch accents. Linguistic analyses of the relation between focus and prosody note that it is possible for certain accented constituents within a broadly focused phrase to project focus to the entire phrase. We manipulated focus requirements and accent in recorded question-answer pairs and asked listeners to make linguistic judgments of prosodic appropriateness (Experiments 1 and 3) or to make judgments based on meaningful comprehension (Experiments 2 and 4). Naive judgments of prosodic appropriateness were generally consistent with the linguistic analyses, showing preferences for utterances in which contextually new noun phrases received accent and old noun phrases did not, but suggested that an accented new argument NP was not fully effective in projecting broad focus to the entire VP. However, the comprehension experiments did demonstrate that comprehension of a sentence with broad VP focus was as efficient when only a lexical argument NP received accent as when both NP and verb received accent. Such focus projection did not occur when the argument NP was an "independent quantifier" such as nobody or everything. The results extend existing demonstrations that the ease of understanding spoken discourse depends on appropriate intonational marking of focus to cases where certain structurally-defined words can project focus-marking to an entire phrase.
The Grammar of Dutch Intonation (GDI) provides a description of the possible intonation contours of Dutch. The GDI distinguishes accent-lending and nonaccent-lending pitch configurations, but refrains from further functional statements. This paper describes an experimental attempt to verify meaning hypotheses for four Dutch single-accent pitch patterns as postulated in the linguistic literature. The four pitch accent types were realized on proper names; the abstract meanings, in terms of the manipulation of an element of the background shared between speaker and listener, were incorporated in situational contexts, distinguishing between a "default" and a vocative use of the proper name ("orientation"). Listeners ranked the four melodic shapes from most to least appropriate in their specific context. After revision of part of the materials a second perception experiment was conducted, in which subjects had to rank four contexts from most to least appropriate for a specific pitch accent type. Results show a distinct effect of "orientation" on the appropriateness of two of the investigated pitch accent types in the various context types; the other two pitch accent types are associated with the predicted context types (and vice versa) well above chance, indicating the viability of at least two of the linguistic proposals.
Experiment 1. Mean F0 difference (in Hz) between emphasis and non-emphasis conditions for four sentence locations with significance levels, F(1,21), throughout. 
Experiment 1. Percentages of post-target pause occurrence by condition and speaker 
Experiment 1. Numbers and lengths (in ms) of modified and new pauses following the target for each speaker 
This research aims (1) to describe the acoustic manifestations of emphatic accent in French by examining similarities and differences between four speakers; and (2) to identify, amongst the acoustic measures, those which determine the perception of emphasis. In Experiment 1, four speakers were asked to read twenty-four sentences aloud twice, first without any emphasis, and second with emphasis on a target word in the sentence. The acoustic modifications induced by emphasis production were analyzed on the target words and on their surrounding contexts, speaker by speaker. Acoustic measurements revealed that all speakers increased the contrast between the target and the contexts, by slowing down articulation on the targets and by increasing intensity and F0 on the targets relative to the adjacent syllables. The F0 peak was found either on the first or last syllable of the target word, and the F0 increase was shown to spread over the peak-bearing syllable to the whole word. Speakers' productions differed with respect to the production of pauses and the syllabic location of the F0 peak in the target words. In Experiment 2, the four speakers' productions were presented to listeners, who had to decide whether an emphasis had been produced or not. A stepwise regression analysis was conducted, using the acoustic measurements as independent variables and the percentage of emphasis perception as the dependent variable. The results suggest a major role of F0 manifestations: Listeners were found to be sensitive to an F0 increase on the first syllable of the target, relative to its value in the non-emphasis condition. Listeners would be sensitive to deviations from expected F0 patterns in French, and may interpret them as signaling emphatic accent.
This paper examines predictions made by two theories of the relationship between pitch accent and focus. The empirical evidence presented suggests that listeners are sensitive to a variety of factors that may affect the focus projection ability of pitch accents, that is the ability of a pitch accent on one word to mark focus on a larger constituent. The findings suggest that listeners' interpretation of focus structure is most sensitive to the presence or absence of a pitch accent on a focused constituent and the deaccenting of following unfocused material (pitch accent position). Preliminary evidence suggests that the status of a pitch accent as nuclear or prenuclear may also affect listeners' interpretations, though to a lesser extent than accent position. Finally, the results show that focus projection is affected only minimally, if at all, by the type of pitch accent (at least for the two accent types compared (H* vs. L + H*)).
In the first part of this study, we measured the alignment (relative to segmental landmarks) of the low F0 turning points between the accentual fall and the final boundary rise in short Dutch falling-rising questions of the form Do you live in [place name]? produced as read speech in a laboratory setting. We found that the alignment of these turning points is affected by the location of a postaccentual secondary stressed syllable if one is present. This is consistent with the findings and analyses of Grice, Ladd, & Arvaniti, 2000 (Phonology 17, 143-185), suggesting that the low turning points are the phonetic reflex of a "phrase accent." In the second part of this study, we measured the low turning points in falling-rising questions produced in a task-oriented dialog setting and found that their alignment is affected in the same way as in the read speech data. This suggests that read speech experiments are a valid means of investigating the phonetic details of intonation contours.
The article describes the contrastive possibilities of alignment of high accents in three Romance varieties, namely, Central Catalan, Neapolitan Italian, and Pisa Italian. The Romance languages analyzed in this article provide crucial evidence that small differences in alignment in rising accents should be encoded phonologically. To account for such facts within the AM model, the article develops the notion of "phonological anchoring" as an extension of the concept of secondary association originally proposed by Pierrehumbert and Beckman (1988), and later adopted by Grice (1995), Grice, Ladd, and Arvaniti (2000), and others to explain the behavior of edge tones. The Romance data represent evidence that not only peripheral edge tones seek secondary associations. We claim that the phonological representation of pitch accents should include two independent mechanisms to encode alignment properties with metrical structure: (1) encoding of the primary phonological association (or affiliation) between the tone and its tone-bearing unit; and (2), for some specific cases, encoding of the secondary phonological anchoring of tones to prosodic edges (moras, syllables, and prosodic words). The Romance data described in the article provide crucial evidence of mora-edge, syllable-edge, and word-edge H tonal associations.
Rate of Initial Accent on target words by adjective scope. In broad scope the constituent boundary precedes A. In narrow scope, the boundary precedes N2  
In addition to the phrase-final accent (FA), the French phonological system includes a phonetically distinct Initial Accent (IA). The present study tested two proposals: that IA marks the onset of phonological phrases, and that it has an independent rhythmic function. Eight adult native speakers of French were instructed to read syntactically ambiguous French sentences (e.g., Les gants et les bas lisses 'the smooth gloves and stockings') in a way that disambiguated the scope of the adjective. When the final adjective (lisses) applies to the conjoined NP, a prosodic boundary is warranted immediately before the adjective; when it applies to the second NP alone, a boundary before that NP is more appropriate. Length of the second noun and the adjective were varied from one to four syllables to investigate length-related tendencies toward phonological boundary marking and toward rhythmic placement of IA. For the materials from six speakers whose readings were correctly interpreted by native listeners, incidence of word-initial prosodic peaks was affected by both structure and length, with most reliable occurrence at onsets of Minor/Phonological Phrases. The only effect of rhythmicity independent of phrase structure was omission of FA in stress clash with IA.
This paper describes the development of the Wildcat Corpus of native- and foreign-accented English, a corpus containing scripted and spontaneous speech recordings from 24 native speakers of American English and 52 non-native speakers of English. The core element of this corpus is a set of spontaneous speech recordings, for which a new method of eliciting dialogue-based, laboratory-quality speech recordings was developed (the Diapix task). Dialogues between two native speakers of English, between two non-native speakers of English (with either shared or different L1s), and between one native and one non-native speaker of English are included and analyzed in terms of general measures of communicative efficiency. The overall finding was that pairs of native talkers were most efficient, followed by mixed native/non-native pairs and non-native pairs with shared L1. Non-native pairs with different L1s were least efficient. These results support the hypothesis that successful speech communication depends both on the alignment of talkers to the target language and on the alignment of talkers to one another in terms of native language background.
Percent correct intelligibility for each speaker category and noise condition and averaged across noise conditions. Standard deviations are shown in parentheses 
This study compared the intelligibility of native and foreign-accented English speech presented in quiet and mixed with three different levels of background noise. Two native American English speakers and four native Mandarin Chinese speakers for whom English is a second language each read a list of 50 phonetically balanced sentences (Egan, 1948). The authors identified two of the Mandarin-accented English speakers as high-proficiency speakers and two as lower-proficiency speakers, based on their speech intelligibility in quiet (about 95% and 80%, respectively). Original recordings and noise-masked versions of 48 utterances were presented to monolingual American English speakers. Listeners were asked to write down the words they heard the speakers say, and intelligibility was measured as content words correctly identified. While there was a modest difference between native and high-proficiency speech in quiet (about 7%), it was found that adding noise to the signal reduced the intelligibility of high-proficiency accented speech significantly more than it reduced the intelligibility of native speech. Differences between the two groups in the three added noise conditions ranged from about 12% to 33%. This result suggests that even high-proficiency non-native speech is less robust than native speech when it is presented to listeners under suboptimal conditions.
Productions of ten English vowels in /bVt/ and /bVd/ contexts were elicited from a group of native American English speakers and a group of native Arabic speakers who had learned English in adulthood. When a variety of acoustic measurements, including vowel durations, F1 and F2 frequencies, and movement in F1 and F2 were examined, the two groups were found to differ on at least one of these parameters for nearly every vowel considered. A subset of the Arabic speakers' productions, and the productions of two native English speakers, were rated for accentedness by five native English judges. The rating data indicated that only a minority of the Arabic group's productions were regarded by the judges as "native-like". When the acoustic measurement data were regressed on the mean ratings, it was found that the accentedness scores were correlated primarily with F1 frequency and movement in F2, although the significant predictors varied from vowel to vowel.
Approximately 100 college students were asked to evaluate Spanish-English bilingual speakers on the basis of taped readings of an English text. The speakers were chosen to represent a wide range of accentedness. The relationship between the amount of accentedness heard and the attributed characteristics of the speaker was investigated. The results show that the students made rather fine discriminations among varying degrees of accentedness in rating a speaker's personal attributes and speech. Support was thus found for the proposition that Spanish-accented English is negatively stereotyped and that the more accented the speech, the stronger the stereotype. By employing a seven-point rating scale with large groups rather than more involved scaling techniques based on individual testing, this study attempted to generalize the results of recent research which indicated that linguistically naive persons can reliably rate varying degrees of accentedness. Indeed, since the more convenient group-administered rating scale procedure provided high correlations with the accentedness scores obtained via more complicated scaling techniques, research concerned with reactions to a range of accentedness can progress rapidly.
Two experiments were carried out to investigate how the correspondence between sentence accentuation and distribution of information is used in human word processing. A forced-choice task with target words embedded in sentences was employed for this purpose. Target words provided either 'given' or 'new' information, and were either accented or unaccented. The subjects had to choose between two words that differed in the last consonant by one phonetic feature (e.g., mouth/mouse). The first experiment involved both normal-hearing and hearing-impaired listeners. A fixed amount of noise was used to reduce the quality of speech for normal-hearing listeners, in order to enable a comparison between the two listener groups. The results of the first experiment showed different processing patterns for normal-hearing and hearing-impaired listeners. The hearing-impaired listeners were more accurate with words that were properly accented for their information value, whereas the normal-hearing subjects were more accurate with accented than unaccented words regardless of their information value. A new group of normal-hearing subjects was tested in a second experiment with speech of more severely reduced quality. The results indicated that, under these circumstances, the normal-hearing listeners changed their strategy and also showed an interaction between information value and accent. It seems that, as speech becomes less intelligible, listeners depend increasingly on linguistic expectations stemming from the correlation between information value and accentuation.
F0-track idealization of a narrow and broad focus intonation for a German sentence. Narrow focus intonation is shown in (a), broad focus intonation in (b)  
Example of visual display presented to participants  
Average fixation proportions over time for contrastive referents, noncontrastive referents, and averaged distractors in Experiment 2: (a) for contrRef / noncontrAccent trials, (b) for contrRef / contrAccent trials, (c) for noncontrRef / noncontrAccent trials, and (d) for noncontrRef / contrAccent trials
In two eye-tracking experiments the role of contrastive pitch accents during the on-line determination of referents was examined. In both experiments, German listeners looked earlier at the picture of a referent belonging to a contrast pair (red scissors, given purple scissors) when instructions to click on it carried a contrastive accent on the color adjective (L + H*) than when the adjective was not accented. In addition to this prosodic facilitation, a general preference to interpret adjectives contrastively was found in Experiment 1: Along with the contrast pair, a noncontrastive referent was displayed (red vase) and listeners looked more often at the contrastive referent than at the noncontrastive referent even when the adjective was not focused. Experiment 2 differed from Experiment 1 in that the first member of the contrast pair (purple scissors) was introduced with a contrastive accent, thereby strengthening the salience of the contrast. In Experiment 2, listeners no longer preferred a contrastive interpretation of adjectives when the accent in a subsequent instruction was not contrastive. In sum, the results support both an early role for prosody in reference determination and an interpretation of contrastive focus that is dependent on preceding prosodic context.
This paper reports the results of an experiment that elicits contextual effects on Rising and Falling accents in Standard Serbian, with the goal of determining their acoustic correlates and their phonological representation. Materials systematically vary the distance between pitch accents, inducing “tone crowding,” in order to identify the phonetic dimensions that consistently distinguish the two pitch accent types, to examine the association between accents and the segmental string, as well as the timing relationship between accent minima and maxima, and to investigate the interaction between lexical accents and boundary tones. On the basis of the phonetic findings, a unified analysis of the phonological distribution and phonetic realization of Falling and Rising accents in Standard Serbian is proposed. It is proposed that both Rising and Falling accents consist of a single lexical High (H). The restricted distribution of the two accents emerges from the interaction of stress and tone: Falling accents are monosyllabic, such that stress and pitch prominence coincide; Rising accents are bisyllabic, such that the stressed syllable precedes the pitch-accented syllable. The phonetic differences between the Falling and Rising accents follow from the place of lexically designated H, the location of stress, and the effects of boundary tones.
Past research has not considered the possibility of stimulus mildness-broadness variations affecting the evaluation of spoken accents and dialects. This study was designed to consider whether listeners can perceive vocal differences along this pronunciation dimension, and if they could, whether their evaluations of the aesthetic, status and communicative contents of a standard, neutral passage of prose were a function of broadness. The results support the notion of this dimension's saliency, with "experience" (defined in terms of age and length of regional membership) being considered an important evaluative determinant. The data are discussed within the context of related research.
Starting with the knowledge that large numbers of Dublin teachers are not from Dublin themselves and speak, therefore, with a regional accent, the study used the matched-guise technique to investigate the reactions of Dublin secondary school students to five such regional accents. The subjects (N = 178), from different social strata, consistently rated the Donegal guise most favourably on traits reflecting competence. The Dublin speaker was perceived least favourably on these traits, and the Cork, Cavan and Galway guises were in the middle ranks. Evaluations were more varied on other dimensions, although the Dublin speaker was, with the Galway guise, rated most favourably in terms of social attractiveness. The implications of these findings are discussed with regard to future investigations of regional stereotypes in general, and the study of teacher-pupil dynamics in particular.
This paper presents patterns of accentual alignment in two varieties of Spanish spoken in the Basque Country: Lekeitio Spanish (LS), with speakers whose other native language is Lekeitio Basque (LB); and Vitoria Spanish (VS), with monolingual speakers of Spanish from the city of Vitoria. These patterns are compared to those of Madrid Spanish (MS); cf. Face (2002). In LS, accents are realized as pitch rises rather than falls, like in MS and unlike in LB, but peaks are aligned before the offset of the accented syllable, unlike in MS and like in LB. At the end of the subject phrase, peaks display later alignment, like in MS. Thus, LS displays mixed properties of Basque and Spanish intonation. In VS, stress is also realized as a tonal rise, with peaks aligned after the offset of the accented syllable, like in other varieties of Spanish and unlike LS. The low tone target is aligned before the onset of the stressed syllable, earlier than in LS and MS. The continuum of tonal target alignment is observed in LS, VS, and MS, and the difficulties in identifying a "starred" tone lead to a discussion of the suitability of the starred tone notation.
In Estonian, as in a number of other languages, the nuclear pitch accent is often low and level. This paper presents two studies of this phenomenon. The first, a phonetic analysis of carefully structured read sentences, shows that low accentuation can also spread to the prenuclear accents in an intonational phrase. The resulting sentence contours are used as evidence to evaluate alternative phonological analyses of low accentuation, and H + L* is shown to account best for the data. The second study presents quantitative evidence from fundamental frequency values which supports this phonological analysis. Finally, the distribution of prenuclear pitch accents is discussed. High and low accents can co-occur in an intonational phrase, but only in patterns obeying a specific sequential constraint. A fragment of an intonational grammar for Estonian is presented capturing the observed distributional restrictions.
The paper reports on a perception experiment in German that investigated the neuro-cognitive processing of information structural concepts and their prosodic marking using event-related brain potentials (ERPs). Experimental conditions controlled the information status (given vs. new) of referring and non-referring target expressions (nouns vs. adjectives) and were elicited via context sentences, which, unlike most previous ERP studies in the field, did not trigger an explicit focus expectation. Target utterances displayed prosodic realizations of the critical words which differed in accent position and accent type. Electrophysiological results showed an effect of information status, maximally distributed over posterior sites, displaying a biphasic N400–Late Positivity pattern for new information. We claim that this pattern reflects increased processing demands associated with new information, with the N400 indicating enhanced costs from linking information with the previous discourse and the Late Positivity indicating the listener's effort to update his/her discourse model. The prosodic manipulation registered more pronounced effects over anterior regions and revealed an enhanced negativity followed by a Late Positivity for deaccentuation, probably also reflecting costs from discourse linking and updating respectively. The data further lend indirect support for the idea that givenness applies not only to referents but also to non-referential expressions ('lexical givenness').
An investigation of the effects of experiential and linguistic variables on the preference for within-sentence connectives. 8 items were constructed, each consisting of 2 clauses describing sequential events which varied in the perceived frequency of relationship and perceived temporal order (determined by subject ratings). There were 4 conditions: the clause order was either congruent with the perceived temporal order (A-B order) or opposite to it (B-A order), and the clauses were either in the Past or the Present Tense. Subjects in each condition (n = 13) ranked connectives in order of preference for each item. Results showed that frequency of relationship between described events was most likely to be a determining factor in connective preference when the clauses were in the A-B order or they were in the Past Tense. These results were substantiated by an independent absolute judgment procedure in which each sentence was rated for acceptability. Different groups (n = 29) rated the sentences in the A-B order and the B-A order. Generally the results were in agreement with the preference task, although an adaptation-level effect was apparent in the judgments. These results are discussed in terms of the degrees of freedom of interpretation allowed by the various linguistic connectives.
From Chomsky's assertion that the deep and surface structures of very simple utterances are highly similar, it follows that judgments of the degree of acceptability of such utterances should approximate judgments of their grammaticality. To test Chomsky's assertion that all native speakers of English share the same deep structure, judgments of the acceptability of selected permutations of examples of Scott's subject-verb-object-qualifier (SVOQ) were obtained. The design of the experiment was 3 x 8 x 2 x 3 factorial, with three levels of education (Group 1 — university students, Group 2 — people with three or four years of high school, and Group 3 — people with one or two years of high school), eight of degree of disruption of SVOQ, two of familiarity (sentences consisted either of very low or very high frequency words), and three of qualifier (common adverbs, -ly adverbs, and prepositional phrases). The analysis of S judgments (cases where a permutation was said to be grammatical and the same in meaning as the SVOQ form) yielded a significant Groups x Permutations x Familiarity interaction because, in the low familiar sentences, Group 3 (and, to some extent, Group 2) showed less capacity for grammatical discrimination than Group 1. The analysis of D judgments (cases where a permutation was said to be grammatical but different in meaning from SVOQ) yielded a number of significant main effects and interactions, which were generally interpreted as showing that Group 1 showed more grammatical sophistication than the other groups. On the basis of the experimental results it was concluded that Chomsky's assertion regarding deep structure had been falsified.
Summary of hierarchical regression analysis for variables predicting lexical decision time
Until recently most models of word recognition have assumed that semantic effects come into play only after the identification of the word in question. What little evidence exists for early semantic effects in word recognition has relied primarily on priming manipulations using the lexical decision task, and has used visual stimulus presentation. The current study uses auditory stimulus presentation and multiple experimental tasks, and does not use priming. Response latencies for 100 common nouns were found to depend on perceptual dimensions identified by Osgood (1969): Evaluation, Potency, and Activity. In addition, the two-way interactions between these dimensions were significant. All effects were above and beyond the effects of concreteness, word length, frequency, onset phoneme characteristics, stress, and neighborhood density. Results are discussed against evidence from several areas of research suggesting a role of behaviorally important information in perception.
This investigation studied the influence of lexical factors, known to impact lexical access in adults, on the word retrieval of children. Participants included 320 typical and atypical (word-finding difficulties) language-learning children, ranging in age from 7 to 12 years. Lexical factors examined included word frequency, age-of-acquisition, neighborhood density, neighborhood frequency, and stress pattern. Findings indicated that these factors did influence lexical access in children. Words which were high in frequency and neighborhood frequency, low in neighborhood density and age-of-acquisition, and which contained the typical stress pattern for the language were easier to name. Further, the number of neighbors that were more frequent than the target word also had an effect on the word's ease of retrieval. Significant interactions indicated that age-of-acquisition effects decreased with maturation for typically-learning children whereas these effects continued to impact the lexical access of children with word-finding difficulties across the ages studied, suggesting that these children's difficulties in accessing words may have prevented them from developing strong access paths to these words. These findings support a view of lexical access in which access paths to words become strengthened with successful use.
Four experiments examined Dutch listeners' use of suprasegmental information in spoken-word recognition. Isolated syllables excised from minimal stress pairs such as VOORnaam/voorNAAM could be reliably assigned to their source words. In lexical decision, no priming was observed from one member of minimal stress pairs to the other, suggesting that the pairs' segmental ambiguity was removed by suprasegmental information. Words embedded in nonsense strings were harder to detect if the nonsense string itself formed the beginning of a competing word, but a suprasegmental mismatch to the competing word significantly reduced this inhibition. The same nonsense strings facilitated recognition of the longer words of which they constituted the beginning, but again the facilitation was significantly reduced by suprasegmental mismatch. Together these results indicate that Dutch listeners effectively exploit suprasegmental cues in recognizing spoken words. Nonetheless, suprasegmental mismatch appears to be somewhat less effective in constraining activation than segmental mismatch.
Infants' access and use of fine phonetic detail in laboratory tasks: a summary of relevant studies 
Several recent studies from our laboratory have shown that 14-month-old infants have difficulty learning to associate two phonetically similar new words to two different objects when tested in the Switch task. Because the infants can discriminate the same phonetic detail that they fail to use in the associative word-learning situation, we have argued that this word-learning failure results from a processing overload. Here we explore how infants perform in the Switch task with already known minimally different words. The experiment involved the same phonetic difference as used in our earlier word-learning studies. Following habituation to two familiar minimal pair object-label combinations (ball and doll), infants of 14 months looked longer to a violation in the object-label pairing (e.g., label 'ball' paired with object doll) than to an appropriate pairing. These results using well known words are consistent with the pattern of data recently obtained by Swingley and Aslin (2002) in which it was found that infants of 14 months look longer to the correct object when the accompanying well known word is spoken correctly rather than mispronounced. We discuss how these results are compatible with the limited resource explanation originally offered by Stager and Werker (1997).
Four cross-modal priming experiments and two forced-choice identification experiments investigated the use of suprasegmental cues to stress in the recognition of spoken English words, by native (English-speaking) and non-native (Dutch) listeners. Previous results had indicated that suprasegmental information was exploited in lexical access by Dutch but not by English listeners. For both listener groups, recognition of visually presented target words was faster, in comparison to a control condition, after stress-matching spoken primes, either monosyllabic (mus- from MUsic/muSEum) or bisyllabic (admi- from ADmiral/admiRAtion). For native listeners, the effect of stress-mismatching bisyllabic primes was not different from that of control primes, but mismatching monosyllabic primes produced partial facilitation. For non-native listeners, both bisyllabic and monosyllabic stress-mismatching primes produced partial facilitation. Native English listeners thus can exploit suprasegmental information in spoken-word recognition, but information from two syllables is used more effectively than information from one syllable. Dutch listeners are less proficient at using suprasegmental information in English than in their native language, but, as in their native language, use mono- and bisyllabic information to an equal extent. In forced-choice identification, Dutch listeners outperformed native listeners at correctly assigning a monosyllabic fragment (e.g., mus-) to one of two words differing in stress.
This study examines the relationship between individual reading subprocesses and general reading ability in college students. The reading measures included eye movements while reading a passage, lexical decision latencies, comprehension, and vocabulary size. The results indicate that a distinct relation exists between reading speed and fixation behaviour associated with regressions through a text. About half the variability in comprehension scores can be predicted by subjects' performance on nonword lexical decisions, gaze durations, and vocabulary scores. These findings are discussed with reference to past studies using similar reading measures.
A theory of interpersonal accommodation has proposed that if a member of one ethnolinguistic group adopts the language of the other group member, this will evoke positive attitudes in the other and also result in that ethnic member making an effort to accommodate back to the initiator. The present study was designed to demonstrate that using the other's language may not necessarily lead to positive evaluation and reciprocal accommodation. Guided by attribution theory, three conditions were created involving English Canadians who spoke French or English to French Canadian subjects: (1) Subjects were given no information about the language capacity of the English Canadian speaker, (2) the speaker was externally pressured to use French or English, and (3) Subjects were aware that the speaker was capable of speaking French. The results demonstrated that the accommodation model was not sufficient in its original form to account for language choice in all contexts and an elaboration was suggested in attribution terms.
Previous research has argued that fundamental frequency is a critical component of phonetic accommodation. We tested this hypothesis in an auditory naming task with two conditions. Participants in an unfiltered condition completed an auditory naming task with a single male model talker. A second group of participants was assigned to a filtered condition where the same stimuli had been high-pass filtered at 300 Hz, thereby eliminating the fundamental frequency. Acoustic analysis of f0 revealed that participants assigned to the unfiltered condition imitated the pitch of the model talker more than those assigned to the filtered condition. Although accommodation was statistically significant, the effect was small, so we followed with a perception study to examine listeners' abilities to detect differences in accommodation across conditions. Shadowed tokens from participants in the unfiltered condition were indeed judged by listeners to be more similar to the model talker's productions than those from participants in the filtered condition. However, acoustic measurements and listener judgments of accommodation were not significantly correlated, reinforcing the intuitive concept that accommodation and listeners' judgments of similarity are holistic and do not home in on singular features in the acoustic signal.
An experiment was conducted to study intonational characteristics that accompany the grammatical morphemes wa and ga in Japanese utterances. Twelve native speakers of Japanese read aloud twelve sentences that contained various sub-types of wa and ga. Acoustical analysis of fundamental frequency and pause durations revealed that wa was preceded by a larger fall in fundamental frequency than ga and that wa was frequently followed by a pause, unlike ga. Thematic and contrastive sub-types of wa were differentiated by pauses as well as fundamental frequency fall. Three sub-types of ga were not clearly differentiated by pauses, although differences were observed in the patterns of fundamental frequency.
An unavoidable problem in speech technology, particularly in the development of robust automatic speech recognition systems, is the extreme variability in the acoustic attributes of segments. Segments are highly sensitive to context and bear little resemblance to their intrinsic characteristics manifested when they are uttered in isolation. However, the problem can become tractable if we model the linguistic and physiological aspects of coarticulatory processes, the main source of systemic variability at the segmental level.
Some peculiar properties of children's passives have long been observed in various languages, such as an asymmetry between actional passives and nonactional passives. These peculiarities have been accounted for under the hypothesis that children's early passives are adjectival, and as such exhibit properties of adjectival passives in adult grammar. Under this hypothesis, a new prediction follows, namely that children's comprehension of passive predicates will vary depending upon the event structures of predicates. If a predicate has a target/result state in its event structure, it makes a good adjectival passive, and children will comprehend the predicate more easily. By contrast, if a predicate lacks a target/result state, it does not make a good adjectival passive, and children will comprehend the predicate less easily. This paper tested and confirmed this prediction in Korean children's passives. In a picture-aided comprehension task with 67 Korean children ranging in age from 3;10 to 8;8, we found a contrast due to the event structures of predicates. The result shows that children are sensitive to the event structures of passive predicates, and thus provides additional support for the adjectival passive hypothesis.
This investigation seeks to understand the factors causing vocalization and elision of dark/l/ in the Romance languages. Contrary to articulatory- and perceptual-based arguments in the literature it is claimed that preconsonantal vocalization conveys the phonemic categorization of the /w/-like formant transitions generated by the tongue dorsum retraction gesture (in a similar fashion to other processes such as /[symbol: see text] /Vjn/). The evolution /VwlC/ > /VwC/ may be explained using articulatory and perceptual arguments. A dissimilatory perceptual mechanism is required in order to account for a much higher frequency of vocalizations before dentals and alveolars than before labials and velars in the Romance languages. Through this process listeners assign the gravity property of dark /l/ to a following grave labial or velar consonant but not so to a following acute dental or alveolar consonant in spite of the alveolar lateral being equally dark (i.e., grave) in the three consonantal environments. Other articulatory facts appear to play a role in the vocalization of final /l/ (i.e., the occurrence of closure after voicing has ceased) and of geminate /ll/ (i.e., its being darker than non-geminate /l/). The elision of dark /l/ may occur preconsonantally and word finally either after vocalization has applied or not. This study illustrates the multiple causal factors and the articulatory-perceptual nature of sound change processes.
Characteristics of the two Japanese learners and one native speaker group 
Percentage correct scores of nativelike Japanese speakers on the R / L identification task
This study tested the issue of whether extended length of residence (LOR) in adulthood can provide sufficient input to overcome age effects. The study replicates Flege, Takagi, and Mann (1995), which found that 10 out of 12 Japanese learners of English with extensive residence (12 years or more) produced liquids as accurately as native speakers of English (NS). Further, for both accuracy and native-like accentedness, the Japanese with extensive residence performed statistically better as a group than inexperienced Japanese (less than 3 years of residence). Results with a new sample of Japanese learners in this study found no statistical difference between the Japanese groups with extended versus short LOR although both reported equal levels of daily input in English. Additionally, both groups received statistically lower scores than NS. Moreover, LOR affected the two groups differently: The accuracy and native-like accentedness of words and sentences by Japanese with extensive residence declined with LOR (and chronological age when age of arrival was partialled out), while for Japanese with short residence accent improved with increased LOR (but not age). This study is the first to document a decline in second language production ability with LOR and age in older second language learners. However, this finding deserves to be addressed with further research, as the study was not designed to investigate this question and thus not all relevant factors, such as motivation or attitude, were controlled for. The results from the short-residence learners indicate that the initial one to two years of immersion may be the most important for improving phonological ability.
As is well known, Japanese adults who have just begun to learn English often err in producing /r/ and /l/ because their native language does not possess such liquid consonants. The aim of this study was to determine if Japanese adults eventually learn to produce /r/ and /l/ accurately in words like read and lead. Liquids spoken by 12 native Japanese speakers who had lived in the United States for an average of two years were often misidentified by native English-speaking listeners. Their productions of /r/ and /l/ also received much lower (and thus foreign-accented) ratings than did the native English speakers' liquids. On the other hand, liquids produced by native Japanese speakers who had lived in the United States for an average of 21 years were identified correctly in forced-choice tests. This held true for liquids in words that had been read from a list as well as for words that had been spoken spontaneously. The ratings of liquids produced by 10 of the 12 experienced Japanese speakers fell within the range of ratings obtained for the 12 native English speakers. These findings challenge the widely accepted view that segmental production errors in a second language arise from the inevitable loss of ability to learn phonetic segments not found in the native language.
The aim of the present paper is to describe, in acoustic and perceptual terms, the prosodic pattern distinguishing English compound and non-compound noun phrases, and to determine how information structure and position affect the production and perception of the two forms. The study is based on the performance of ten English-speaking subjects (five speakers and five listeners). The test utterances were three minimal-pair noun phrases of two constituents, excised from conversational readings. These were analyzed acoustically, and submitted to the listeners for semantic identification. The results indicate that the distinction, when effective, lies primarily in the different prominence pattern: a sequence of an accented constituent followed by an unaccented one in compounds, and of two accented constituents (the second heard as stronger than the first) in non-compounds. It is also based on a different degree of internal cohesion, stronger in compounds and weaker in non-compounds. F0, associated or trading with intensity, has proved to be the main cue to this distinction, more so than duration, the major differentiating parameter in production. When an item is excised from the context, the perception of the intended category depends heavily on the communicative importance it had in the discourse. This means that information structure, through its effects on accentuation, becomes the determining factor in the perception of the distinction. The distinctive accentual pattern weakens or is completely neutralized when the test items convey old information. The degree of deaccentuation also seems to be affected by an immediately following focus, and, to a certain extent, by position. The data are viewed in the framework of speaker-listener interaction, and it is argued that deaccentuation, as well as accentuation, can have a communicative function.
Interarticulator timing is a mechanism for producing linguistic contrasts that is widely used in different languages. This paper explores acoustic and aerodynamic effects of variations in laryngeal-oral coordination in voiceless consonants. Measurements of voice onset time and interarticulator phasing for individual tokens of stop consonants show weak correlations, indicating that interarticulator timing is only one factor determining voice onset time. Other factors most likely involved are glottal opening, transglottal pressure and air flow, and vocal fold tension. Taken together, these observations suggest that speakers may only have limited control of voice onset time. This could explain why languages do not seem to make fine-grain use of VOT for linguistic contrasts. Measurements of peak and minimum air flow during individual source pulses, obtained by inverse-filtering oral flow, show a pattern of decrease and increase in vowels following voiceless consonants. Subtle differences in the time course of these patterns occur following different consonants, suggesting that interarticulator phasing may be partly responsible for them. Closer examination reveals consistent correlations with interarticulator phasing for one speaker but inconsistent results for another. The results are discussed in terms of speech motor control and controlled variables in speech.