The Role of Tonal Onglides in German Nuclear Pitch Accents
Simon Ritter, Martine Grice
IfL Phonetik, University of Cologne
{simon.ritter; martine.grice}@uni-koeln.de
Corresponding author:
Martine Grice
Herbert-Lewin-Str. 6, 50931 Cologne, Germany
E-Mail: martine.grice@uni-koeln.de
Phone: 0049 221 470 56 10
Abstract
A perception experiment with native German listeners provided evidence for the
relevance of the tonal onglide in nuclear accents – the pitch movement leading towards
the target on the accented syllable. Listeners were able to distinguish between two
pragmatic meanings of a short phrase (given/non-contrastive and new/contrastive) using
the tonal onglide as the sole acoustic cue. On the basis of these findings, we argue that
the onglide merits a phonological status in an intonation model of German and should
not be regarded as merely phonetic detail.
Keywords: Intonation, pitch accent, on-ramp, off-ramp, perception, pragmatic meaning,
leading tone, trailing tone
1. Introduction
In Autosegmental-Metrical Phonology, intonation contours in West Germanic
languages, such as English, Dutch and German, are composed of edge tones and pitch
accents. The tonal composition of pitch accents, however, differs from one analysis to
another. The very nature of autosegmental phonology makes it possible for the domain
of tones associated with a particular syllable to extend beyond the borders of that
syllable: Phonological association does not necessarily imply phonetic alignment.
Furthermore, tonal targets are often obtrusions and may thus be perceived by virtue of
tonal movements to and from them. The question we are concerned with in this paper is
whether the movement towards the target is relevant in terms of pragmatic meaning, and
thus whether this movement should be represented in the phonology. An analysis that
takes this movement towards the target into account has been referred to as an on-ramp
analysis, and one that takes only the target and the movement away from the target into
account as an off-ramp analysis (Gussenhoven, 2004).
Models in the tradition of the British School favour an off-ramp approach. Here,
intonation contours are decomposed into smaller constituents, prehead, head, nucleus
and tail. The meaning of the intonation contour over the whole phrase is primarily
determined by the nuclear tone, consisting of the nucleus and the tail (Cruttenden,
1997). This tonal movement begins on the last accented syllable and continues until the
end of the phrase. Although the meaning can also be influenced by the part of the
contour that precedes the nuclear tone, this portion plays a rather subordinate role. A
jump up from a lower level or a fall down from a higher level to the nucleus, i.e. the
transition between the so-called head and the nucleus, does not contribute to the
phonological category of the nuclear tone (cf. Couper-Kuhlen, 1986, 80). However,
Crystal (1969, 218) remarks that the placement of “prominence” within the nuclear
contours can vary. As a consequence, a part of the tonal movement may occur before the
beginning of the vowel in the accented syllable, e.g. the rising part of a rise-fall nuclear
tone may begin before the accented syllable. This is treated as phonetic variation, since
the categorisation of the nuclear tone is not affected by this movement. Crystal refers to
this transitional movement towards the target on the accented syllable as the onglide.
We adopt this terminology here, leaving open for the moment the question as to its
phonetic and phonological status.
The approaches of von Essen (1964) and Pheby (1975) in their accounts of German
intonation have many elements of the British school. In von Essen’s (1964) model, for
instance, the main intonational meaning is carried by the tone on the accented syllable.
In Pheby’s (1975, 51) model, it is the contour on the last accented syllable and what
follows. This analysis is very similar to the concept of the nuclear tone, a combination
of the British School nucleus and tail (Cruttenden, 1997).
The approach of the American Structuralists can also be considered to be off-ramp.
Here, intonation contours are analysed as consisting of levels, or static tones, that are
combined in sequence to represent movements. Thus, the levels rather than movements
are the primitives. As Pike (1945) notes, these levels are considered to be meaningless
by themselves; only intonation contours (i.e. combinations of levels) bear meaning.
The utterance-length intonation contour is divided into two components: the precontour
and the primary contour. These are equivalent to the head and nucleus (and tail) in the
British School. The beginning of a primary contour is a “stressed” syllable, and every
“heavily stressed” syllable constitutes the beginning of a new primary contour (Pike,
1945, 27), which can consist of several syllables. Precontours also carry meaning, but
the primary contour is said to have a stronger meaning than the precontour. Thus, the
cut between the major part of an intonationally defined phrase and a previous minor part
is at the same point for proponents of the British School and the American
Structuralists. It is the beginning of the nuclear accented syllable and even more
importantly the beginning of the tonal movement starting on that syllable, even if the
movement does not start at its very beginning. A movement that spans these two
subdomains is considered to be a transition, and does not play a (major) role in
determining the classification of intonation contours.
The framework of Autosegmental-Metrical Phonology (AM) treats intonation contours
as sequences of levels, or targets (or, in later analyses, structures made up of targets;
Pierrehumbert and Beckman, 1988; Grice, 1995; Ladd, 2008). In this approach, there is
a clear division of tones according to whether they have a prominence-lending or
delimitative function. While boundary tones (also edge tones) are associated with the
edges of phrases in the prosodic hierarchy, pitch accents are associated with metrically
strong syllables, and serve to lend prominence to words containing these syllables. The
association of a tone to a metrically strong syllable is expressed with a * symbol. If
there is more than one tone associated with a pitch accent, only one of the tones is
designated to be starred.
The issue that concerns us here is whether a model allows for a tone before the starred
tone, a leading tone (e.g. L+H*), as well as a tone after it, a trailing tone (e.g. L*+H).
An on-ramp analysis is characterised by the use of both leading tones and trailing tones,
while an off-ramp analysis is restricted to the use of trailing tones. Although there is a
broad consensus among different models within the AM framework on the analysis of
the starred tone itself, there are considerable differences when it comes to the choice of
an on-ramp or off-ramp analysis. One model that makes exclusive use of trailing tones
and thereby builds upon the British off-ramp tradition is ToDI (Gussenhoven, 2005).
Other models, such as ToBI, or, more specifically, MAE-ToBI (Beckman, Hirschberg, &
Shattuck-Hufnagel, 2005) have both leading and trailing tones in their inventory.
(Besides H* and L*, there are two pitch accents with leading tones, L+H* and H+!H*,
and one with a trailing tone, L*+H.)
Among the AM-models for German intonation, there are also differences in the bitonal
pitch accents: GToBI, in line with the original ToBI for American English, employs both
leading and trailing tones (Grice & Baumann, 2002; Grice et al., 2005). Other models
like Féry (1993), Mayer (1995), and Peters (2014) are in line with the British tradition,
and with the Dutch ToDI system, in that they have only trailing tones. Some of these
models (Féry, 1993; Mayer, 1995) have a leading tone in their inventory, but treat it as
an exception that leads to a tritonal pitch accent, e.g. HH*L. It is important to remember
that there is a certain asymmetry in the designation of on-ramp and off-ramp: whereas
an on-ramp analysis takes the tonal movement both before and after the accented syllable to
be relevant, an off-ramp analysis concentrates only on what follows the accented
syllable.
Determining which part of the pitch accent is meaningful plays an important role in
choosing between on-ramp and off-ramp analyses. The main difference between these two
analyses can be illustrated using an example from Gussenhoven (2004, 127), which is
depicted in Figure 1.
Figure 1: Transcriptions using ToDI (off-ramp) and ToBI (on-ramp) (adapted from
Gussenhoven 2004, 127-8)
Gussenhoven analyses this contour with a H*L accent on “BOW” followed by a low
boundary tone at the end of the phrase. The beginning of the rise on “BOW” is not
attributed to a tonal target. ToBI, on the other hand, represents the beginning of the rise as
a L target, which is the leading tone of a L+H* pitch accent. By employing an on-ramp
analysis, ToBI considers the rise on “BOW” to be a part of the pitch accent. ToDI, on the
other hand, does not. In fact, Gussenhoven states explicitly that “leading tones might
exist in Dutch, though they must be rare if they do” (Gussenhoven 2005, 126). By
analogy to the notion of on- and off-ramp, we call the tonal movement towards the
starred tone target the onglide and the tonal movement away from it the offglide.
IViE (Grabe, 2001), a transcription system based on the AM framework for British
English intonation that favours an off-ramp analysis, nonetheless incorporates a
mechanism for dealing with the onglide phonetically. The pitch accent types that are
transcribed on the phonological tier of the system can have one or two trailing tones. In
addition to the phonological tier, there is a phonetic tier in which the transcriber can add
a target specification for the pre- and posttonic syllable. This level can be low (l), mid
(m), or high (h). Thus, one possible transcription of an H*L accent that has a rise up to
the accented syllable could be lH-l, where the first l indicates the level of the pretonic
syllable, the H indicates the level of the tonic syllable, and the second l the level of the
posttonic syllable (the ‘-’ indicates interpolation). But this description is purely
phonetic; a difference in the onglide does not contribute in any way to the classification
of pitch accents. Hence, IViE’s treatment of the onglide resembles that of Crystal (1969)
discussed above.
As pointed out by Gussenhoven (2004, 128), the issue as to whether an on- or off-ramp
analysis is justified in the description of intonation systems has so far attracted little
empirical study. For Italian, also a language with stress accent (Beckman, 1986), Grice
and Savino (1995) have shown that the tonal movement up to a high target on the
accented syllable plays a major role in the categorisation of meaning. In a categorical
perception experiment involving an identification task with resynthesised stimuli,
subjects were asked to judge whether the stimulus they heard was an information-
seeking polar question (query) or a command. Grice and Savino found that listeners
were able to distinguish information-seeking questions from commands based solely on
the pitch movement towards a constant high peak on the accented syllable.
Consequently, they analysed the distinction as being reflected in the pitch accent types
L+H* for questions and H* for commands (see footnote 1).
The left panel of Figure 2 depicts the construction of the stimuli in Grice and Savino’s
experiment. By systematically varying the target before the accented syllable they
manipulated the magnitude of the rising onglide. The results are reproduced in the right
panel of Figure 2: If there was a low target before the peak, the stimulus tended to be
judged as a question; if the onglide was only slightly rising with no clear low target
before the peak, the utterance tended to be judged as a command. The authors
interpreted this finding as evidence for a leading tone and thus for the plausibility of an
on-ramp analysis for Italian.
Footnote 1: Although the curve is not as S-shaped as typically expected in categorical perception
tasks, it has to be remembered that the experiment is about intonation which, according
to Ladd and Morton (1997), can be said to be categorically interpreted rather than
categorically perceived. This would explain the shape of the response curve.
Figure 2: Stimuli and results from Grice and Savino (1995)
Left panel: Contours of the stimuli on “lo mandi a Massimiliano” (you send it to
Massimiliano) between a stylised command (dashed line) and a question (dotted line).
Right panel: Results of the experiment. The greater the onglide (= size of the dip in Hz),
the higher the percentage of question (query) responses.
Chen (2011) reports evidence from production in favour of an off-ramp analysis of
Dutch intonation. In her production study, she examines the realisation of peak accents
in topic and focus conditions, measuring the properties of the rise and the fall. She finds
two classes of peak accents that differ in their realisation of focus and topic conditions.
When the accented word is under focus, the rise up to the peak on the accented syllable
is steeper in one class, whereas the fall is steeper and the rise stays constant in the other
class. She concludes that the former class can be described as H*, whereas the latter
should be labelled H*L. Chen’s experimental approach builds on the reasoning that
pitch accents in one class will behave in a phonetically similar way, while pitch accents
from different classes will behave differently. However, it is unclear what the difference in
meaning might be between the two classes. Furthermore, in her study, prenuclear and
nuclear accents are pooled, making the results difficult to interpret.
Another empirical approach to the question of the appropriateness of on- and off-ramp
analyses is discussed in Gussenhoven (2008), also for Dutch intonation. To
assess which part of a high pitch accent is most important, he compares the semantic
judgements of the three contours depicted in Figure 3. In contour (a) both the rise and
the fall are present. In the other two contours, either the rise (b) or the fall (c) is present.
The contours are resynthesized with sentences containing words that can be either
interpreted in a modal or lexical sense, e.g. alleen (modal: ‘just’ or lexical: ‘alone’; e.g.
Hij zit alLEEN (met die man) in het caFÉ, lexical: ‘He is alone (with that man) in the
pub’, modal: ‘The thing is, he’s in the pub (with that man)’). The resynthesised contours
contained a pitch accent on the ambiguous target word (e.g. alleen) and a boundary after
it. Subjects indicated whether their interpretation was modal or lexical by judging on a
scale how well a written paraphrase matched the sentence they heard. The results
show that contours (a) and (c) pattern together whereas (b) does not. That is, (b) exhibits
more ratings for a modal interpretation relative to the other two contours. However, it is
also possible to interpret these findings in a different way: Contours (a) and (c) share
what in other autosegmental metrical models would be analysed as a low edge tone
(either a phrase accent or a boundary tone).
Figure 3: Resynthesized contours from Gussenhoven (2008) (see footnote 2)
Thus, the question as to whether the prenuclear movement (the onglide) is meaningful is
possibly eclipsed by the contribution to the meaning of the boundary tones (the offglide)
in this case. In order to assess the contribution of the onglide, it is necessary to keep all
other parts of the intonation contour constant. This is what we aim to achieve in the
present study. Furthermore, despite the relatedness of Dutch and German, it is necessary
to investigate the onglide in German separately, given that there may be differences
across the two languages precisely in this respect.
Consider the two short German utterances given in Figure 4. Both utterances end in a
low boundary tone. In the case of (a) the pitch falls down to a target on the accented
syllable “ni”. In the case of (b) the pitch rises up to a target on the accented syllable.
There is thus a difference in pitch before the target of the starred tone is reached, that is, a
difference in the onglide. For the purposes of phonetic transparency this difference will
be referred to henceforth with a leading tone: (a) as H+!H* L-% (see footnote 3) and (b) as L+H* L-%.
Footnote 2: The transcription in this reproduction differs from the original because of a
typographic error in the original. This is the correct version (Carlos Gussenhoven, p.c.).
(a) Falling (b) Rising
Figure 4: Two examples of the German utterance “Für Janina” produced with (a) a
falling and (b) a rising onglide. The accented syllable “ni” is highlighted in grey.
Footnote 3: The early peak accent has also been annotated as H+L*. However, Grice et al. (2009)
found that in early peak accents in similar contexts, the second tone was scaled more
like !H* than L*, leading to a preference for H+!H*. L-% is shorthand for L-L% in
GToBI.
Experimental data from Ritter, Krüger, Mücke and Grice (2012) suggest that the onglide
plays an important role for the identity of pitch accents in German. In that study, the
onglide was measured as the difference in semitones between a point 30 milliseconds
before the start of the accented syllable and the point on the accented syllable where a
GToBI (Grice et al., 2005) label was placed, providing a measure of the direction of the
onglide and the magnitude of the change in F0. The results showed that contrastive
focus is preferentially marked by accents with a rising onglide, whereas broad focus is
preferentially marked by accents with a falling (or only very slightly rising) onglide,
although not all speakers make use of this distinction. A perception experiment carried
out with these data further confirmed the findings on the function of the onglide.
However, since the stimuli were unmodified productions from a reading task, it was
impossible to control for other acoustic parameters such as duration, intensity (sonority
expansion), and vowel quality (hyperarticulation). In addition, many productions had a
prenuclear pitch accent earlier in the phrase. This part of the signal could in principle
have also carried information about the information status of the target word.
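The onglide measure described above, a signed interval in semitones between two F0 values, can be sketched as follows. This is an illustration only and not the analysis script of Ritter et al. (2012): the function name `onglide_semitones` and the F0 values are hypothetical, and the sign convention (positive for a rising onglide, negative for a falling one) is an assumption made here.

```python
import math


def onglide_semitones(f0_before_hz: float, f0_target_hz: float) -> float:
    """Signed interval in semitones from the pre-syllable point to the accent target."""
    return 12.0 * math.log2(f0_target_hz / f0_before_hz)


# Hypothetical F0 values in Hz, not taken from the study.
rising = onglide_semitones(120.0, 150.0)   # positive: pitch rises towards the target
falling = onglide_semitones(190.0, 150.0)  # negative: pitch falls towards the target
print(f"rising: {rising:+.2f} st, falling: {falling:+.2f} st")
```

Expressing the onglide in semitones rather than Hertz makes the measure comparable across speakers with different pitch ranges.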
The study presented in this paper isolates the tonal onglide and explores how far it has
an impact on the perception of German nuclear pitch accents. If the onglide plays a
significant role in the distinction of accent types, it should be possible to change the
pragmatic meaning of an utterance by solely manipulating the pitch contour before the
starred tone target. The background presented above leads us to the formulation of the
following hypothesis:
Native listeners of German are able to distinguish pragmatic meanings using the
tonal onglide as the only auditory cue.
To test this hypothesis a perception experiment was set up in which listeners were
presented with stimuli that differed only in their tonal onglides. All other cues were held
constant. The target phrases only contained the nuclear pitch accent under scrutiny and
no further prenuclear pitch accent.
2. Methods
2.1 Stimuli
We used stimuli with a resynthesised F0 contour in order to have maximal control over
the onglide. As base stimuli, three short utterances by one German native speaker were
used: Für Janina (‘For Janina’), Für Marlene (‘For Marlene’) and Für Ramona (‘For
Ramona’). In German, prenuclear accents are common if there is an accentable syllable
near the beginning of the phrase. Such accents are referred to as ornamental accents
(Büring, 2007). To ensure natural-sounding stimuli, we thus restricted the prenuclear
context to a function word followed by an unstressed syllable. All base stimuli were
produced as a single phrase; the speaker aimed for a monotone
pitch contour with prominence on the name (e.g. Janina).
The resynthesised contours were constructed with a falling onglide, a rising onglide, and
a level onglide. The tonal target of the accented syllable and the pitch contour
after the accent were kept constant (see Figure 5). Four points in the contour were
manipulated. First, all contours began at a value of 150 Hz. Second, the target on the
accented syllable (e.g. “ni” in Janina) was located at the midpoint of the syllable’s
vowel. This produced a fairly neutral phonetic alignment. This point was also set to 150
Hz. Third, all contours ended in a low boundary tone, the target for which was reached at
the midpoint of the last syllable (e.g. “na” in Janina) and set to 95 Hz.
The fourth point, the crucial pivot differentiating the stimuli in the series, was located 30 milliseconds
before the onset of the accented syllable. In the falling onglide case, the F0 contour falls
from a high target (189 Hz) to a target four semitones lower on the accented syllable. In
the rising onglide case, the contour rises from a low target (119.1 Hz) to a target four
semitones higher. In the level onglide case, the contour is level. All manipulations were
carried out in Praat (Boersma, 2001). A window of 30 milliseconds was used following
the analysis of production data in Ritter et al. (2012).
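The pivot values can be checked with the standard conversion between semitones and Hertz. The sketch below illustrates that arithmetic only; it is not the Praat procedure used to build the stimuli, and `shift_semitones` is a hypothetical helper name.

```python
def shift_semitones(f0_hz: float, semitones: float) -> float:
    """Return the frequency lying `semitones` above f0_hz (below, if negative)."""
    return f0_hz * 2.0 ** (semitones / 12.0)


TARGET_HZ = 150.0  # target on the accented syllable, as stated in the text

# Pivot 30 ms before the accented syllable, relative to the target.
pivots = {
    "falling": shift_semitones(TARGET_HZ, +4),  # pivot four semitones above the target
    "level": shift_semitones(TARGET_HZ, 0),     # pivot at the same height as the target
    "rising": shift_semitones(TARGET_HZ, -4),   # pivot four semitones below the target
}

for name, hz in pivots.items():
    print(f"{name}: {hz:.1f} Hz")
```

Rounded, the falling and rising pivots come out at 189 Hz and 119.1 Hz, the values given above.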
Figure 5: Manipulation of the pitch contours
2.2 Subjects
Twenty monolingual German native listeners (12 female, 8 male) took part in the
experiment. Most of the participants were undergraduate students from a number of
different disciplines. None of the subjects had extensive training in prosody.
2.3 Task, procedure
Subjects were presented with the resynthesised intonation contours along with a pair of
small dialogues displayed on a computer screen. In Figure 6 a sample of a dialogue pair
is shown. In dialogue (i), the target sentence provides an affirmative answer and
contains given information. We refer to this as given/non-contrastive. In dialogue (ii), the
target sentence negates the proposition of the preceding sentence. It contains new
information and has a corrective focus with an explicit contrast. We refer to this
meaning as new/contrastive. The listeners were asked to match the sentence they heard
to one of two small dialogues. Since the text of the sentence matches both dialogues,
listeners were explicitly asked to base their decision on “how the utterance is
pronounced”. To choose one of the two dialogues, subjects pressed either the key “a”
for the left context or the key “l” for the right context on a German computer keyboard.
Figure 6: Examples of mini-dialogues used in the experiment
The experiment started with a short training phase to familiarise the subjects with the
task. The training block contained two repetitions of the falling and rising stimuli of
each target word, a total of twelve trials. After that, subjects listened to the stimuli in four
blocks with a pause of 15 seconds between the blocks. Each block contained two
repetitions of the stimuli. All contours occurred in every block, and stimuli were randomised
in each block. The whole experiment lasted about 15 minutes. The visual presentation of
the contexts was reversed for 50 % of the subjects (i.e. they saw new/contrastive on the
left side, given/non-contrastive on the right side). The experiment was run on a
notebook with PsychoPy (Peirce, 2007). Stimuli were presented through headphones.
All sessions were held in a quiet room at the University of Cologne or at a participant’s
home.
2.4 Data
Each subject listened to 8 repetitions of the sentences, so that there was a total of 72
items for each subject (3 names * 3 manipulations * 8 repetitions). Because 9 items
from one subject had to be excluded for technical reasons, the analysed dataset
contained 1431 items. Reaction times were recorded in addition to the responses.
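The item counts follow directly from the design described in Sections 2.3 and 2.4. As a minimal sketch of the arithmetic (the variable names are illustrative and not taken from the experiment scripts):

```python
names = ["Janina", "Marlene", "Ramona"]
onglides = ["falling", "level", "rising"]
blocks = 4
repetitions_per_block = 2
subjects = 20
excluded_items = 9  # items from one subject, excluded for technical reasons

# Main experiment: every name/onglide combination, repeated across all blocks.
items_per_subject = len(names) * len(onglides) * blocks * repetitions_per_block
analysed_items = subjects * items_per_subject - excluded_items

# Training phase: falling and rising stimuli only, two repetitions per name.
training_trials = len(names) * 2 * 2

print(items_per_subject, analysed_items, training_trials)
```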
3. Results
Figure 7 shows the proportion of ratings as new/contrastive as means for all subjects. A
rising onglide was judged most frequently as encoding new/contrastive information
(75%). By contrast, a falling onglide was less often judged as encoding
new/contrastive information (27%). Results for the level onglide lie in between the two
extremes (55%).
We built a mixed effects logistic regression model to assess the effect of onglide on the
subjects’ judgements using R (R Core Team, 2012) with lme4 (Bates, Maechler, &
Bolker, 2012). The model had response (binary choice: context i or ii) as the
dependent variable. As fixed effects, we entered onglide, gender, presentation order of
the dialogues (left vs. right), base sentence, and number of repetition (mean-centred).
As random effects, we had intercepts for subjects, as well as by-subject random slopes
for the effect of onglide. We then ran a likelihood ratio test of the full model against a
null model without the effect of onglide. The result shows that onglide has a significant
effect on the response (χ²(2) = 15.043), with rising onglide having the highest
probability of ratings as new/contrastive information (log odds: 3.04, SE = 0.74) and
falling onglide the lowest (log odds: -1.85, SE = 0.43). Level onglide
lies in between (log odds: 1.76, SE = 0.60, p < 0.01). Base sentence had no significant effect on
the response (χ²(2) = 5.7957, p = 0.05514), and neither did gender (χ²(1) = 1.622, p =
0.2028), presentation order of the dialogues (χ²(1) = 0.1768, p = 0.6741), or repetition
(χ²(1) = 1.2421, p = 0.2651).
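The p-value of a likelihood ratio test is the upper-tail probability of the test statistic under a chi-square distribution whose degrees of freedom equal the number of parameters dropped from the full model. The sketch below recomputes these probabilities from the reported statistics. It is not the R/lme4 analysis itself: `chi2_sf` is a hypothetical helper that uses the closed forms available for one and two degrees of freedom, so that only the Python standard library is needed.

```python
import math


def chi2_sf(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-square statistic (closed forms for df = 1, 2)."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# Test statistics and degrees of freedom as reported in the text.
reported = {
    "onglide": (15.043, 2),
    "base sentence": (5.7957, 2),
    "gender": (1.622, 1),
    "presentation order": (0.1768, 1),
    "repetition": (1.2421, 1),
}

for effect, (stat, df) in reported.items():
    print(f"{effect}: chi2({df}) = {stat}, p = {chi2_sf(stat, df):.4f}")
```

For base sentence, gender, presentation order and repetition, the recomputed values agree with the reported p-values to the precision given; for onglide, the statistic corresponds to a p-value below 0.001.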
Figure 7: Mean responses as ‘new/contrastive’ (all subjects pooled)
A closer look at the individual results shows that there are different patterns across
listeners. In Figure 8 the judgements are plotted as means for each subject. While there
are some listeners that show the pattern that matches the overall means (highest ratings
for rising, lowest for falling), there are also some subjects whose results deviate from
the main trend: Either the pattern seems to be reversed (two subjects: 8 and 15), or there
is hardly any difference between the onglide conditions (four subjects, e.g. 13). Among
the other 14 listeners, there is variation in ratings for the level onglide. While for some
listeners the level onglide behaves like the rising onglide (e.g. subject 2), for others it
patterns with the falling onglide (e.g. subject 3). In other listeners, the results are in
between those for falling and rising onglide.
We also measured the reaction times of all subjects, calculated from the offset of the
stimulus (see Figure 9). They were somewhat longer for the level onglide (2.22
seconds) than for the other cases (falling: 1.9 seconds; rising: 1.98 seconds). Log-transformed
reaction times were analysed with a generalised linear mixed model, with
onglide, gender, presentation order of the dialogues, base sentence, and number of
repetition as fixed effects. Intercepts for subjects and by-subject random
slopes for the effect of onglide were used as random effects. A likelihood ratio test of
the full model against a null model was carried out. The result points to a significant
effect of onglide on the reaction times (χ²(2) = 6.081, p < 0.05). Level onglide had the
longest reaction times (β = 0.98, SE = 0.03); the reaction times for rising (β = 0.90, SE
= 0.78) and falling (β = 0.88, SE = 0.18) were shorter. In addition, base sentence had a significant
effect on the reaction times (χ²(2) = 13.642, p < 0.01), with the slowest reaction times for
“Janina” (descriptive mean: 3.24 seconds, β = 0.88, SE = 0.18), followed by “Ramona”
(descriptive mean: 3.15 seconds, β = 0.86, SE = 0.03), and “Marlene” (descriptive
mean: 2.90 seconds, β = 0.78, SE = 0.03). Also, an effect was found for repetition (χ²(1)
= 81.128, p < 0.001): reaction times decreased during the experiment (β = 0.77, SE =
0.01). No effect was found for gender (χ²(1) = 0.4083, p = 0.5228) or presentation
order of the dialogues (χ²(1) = 0.9415, p = 0.3319). It has to be pointed out that reaction
times were relatively long. This is likely due to the fact that the task involved
interpretation and contextualisation, rather than the type of decision often required in
experiments involving identification of stimuli.
Figure 8: Individual responses from all subjects (means for each subject)
Figure 9: Mean reaction times (all subjects pooled)
4. Discussion
The results of the present study show that German native listeners are able to distinguish
between two pragmatically different readings of the phrases (“Für Janina”, “Für
Marlene”, “Für Ramona”) using the tonal onglide as the sole acoustic cue. In particular,
the experiment provided clear overall results for the falling and rising onglide, although
there was some variation in the ratings across subjects (see Figure 7). This mirrors the
variation observed in the production of similar contrasts in Ritter et al. (2012) and is
consistent with the observation that intonation is particularly susceptible to individual
variation (Niebuhr et al., 2011). Moreover, individual differences in perception are to be expected, especially
when only one dimension is manipulated (Perkell, 2004).
It appears that listeners were not as certain in interpreting the level onglide as they were
in the other two conditions. This was reflected in the longer reaction times measured for
the responses to the stimuli in this condition. There are at least two possible reasons for
this outcome. Although the level onglide condition approximates a simple H* pitch
accent, it is possible that subjects perceived it as less natural than the other two
contours. Indeed, for English it has been argued that a rising onglide is necessary for H*-type
accents (Ladd & Schepman, 2003). Alternatively, the level onglide might not be
appropriate in either of the two pragmatic contexts we offered to the participants.
Results from Röhr and Baumann (2010, 2011) and Ritter et al. (2012) suggest that a
context where the target word is in focus but contains non-contrastive, new information
could be appropriate for a level onglide (e.g. if the contextualising question had been
Für wen ist das Paket? ‘Who is the parcel for?’).
Our main finding is that the pitch movement in the region before the H* tonal target
(including the movement on the previous syllable) makes a major contribution to the
interpretation of pragmatic meaning, warranting a phonological treatment in models of
intonation. The tonal movement, the onglide, is directly encoded in on-ramp
intonation models like ToBI for English (Beckman et al. 2005) and GToBI for German
(Grice & Baumann, 2002; Grice et al., 2005), where it is represented as a leading tone,
hence our choice for the phonetically transparent labels H+!H* L-% for the falling
onglide and L+H* L-% for the rising onglide.
ToDI (Gussenhoven 2005) and its German equivalent (Peters 2014), as off-ramp
approaches, do not make use of leading tones. However, these models have a way of
deriving a tonal cluster including a tone preceding the starred tone, but only if there is a
prenuclear accent. In this case, the trailing tone of a preceding bitonal pitch accent (e.g.
L of a prenuclear H*L) can be displaced to a position immediately before the accented
syllable (e.g. resulting in the cluster LH*L in nuclear position, Gussenhoven 2005).
Crucially, this approach cannot account for cases where there is no prenuclear accent, as
is the case with our data. Gussenhoven argues that a contour with a H peak before the
accented syllable in single-accent utterances is rare in Dutch, although he discusses a
hypothetical example:
If used on Dutch Met de TREIN (with the train ‘By train’), met would have low
pitch, de high pitch, while a fall from mid to low or low pitch, exactly as for
downstepped !H*, would occur on trein. The pitch accent would be transcribed as
H+!H*. (Gussenhoven 2005:126).
Based on equivalence of meaning between (i) H+!H* and (ii) !H* preceded by a high
pitch (manifested as a H tone in a preceding pitch accent, H* or L*+H, or a boundary
%H tone), Grice et al. (2009) argue that accents with leading tones are derived. In their
analysis, the previous H tone in (i) and (ii) is derived from a common source, a floating
H tone. That is, they argue that the H tone can surface as part of a prenuclear accent or
as an initial boundary tone, with only very subtle differences in meaning between the
various association patterns. What is important is whether the onglide is falling or not.
By extension, we might assume that the same floating tone analysis holds for L+H*,
although this was not explicitly discussed by Grice et al. Since we do not manipulate the
contour after the tonal target on the accented syllable, our data do not bear on the
analysis of the offglide (a trailing tone of the nuclear pitch accent or a final boundary
tone).
Chen (2011) points out that the motivation for an on- or off-ramp analysis in the
description of an intonation system, or the development of a transcription system, is not
always transparent. She argues that the decision between on- and off-ramp might not be
a theoretical matter but a typological difference between languages. In her view, the
models not only differ with respect to what they consider to be part of a pitch accent; it
is the intonation systems of the languages themselves that differ. As a consequence, for
example, the intonation system of German could be seen as on-ramp, whereas the
intonation system of Dutch could be seen as off-ramp.
However, in order to assign a language to a particular type, it is necessary to find out
whether the language is exclusively describable in terms of either analysis. In the
description of the intonation system of German, models have differed in their off-ramp
or on-ramp characteristics (where on-ramp analyses are indeed mixed, in that they have
leading and trailing tones in their systems). As discussed above, all approaches might be
able to account for the same contours, but by using different mechanisms and with
different degrees of phonetic transparency. In this light, it appears to be more a
theoretical debate than a discussion of typological differences across languages.
Nevertheless, one could argue that certain typological differences between languages
can facilitate one or the other approach. For example, a late alignment of peaks could
lead to a direct encoding of the onglide in the intonation model, i.e. the use of leading
tones. Although this does not account for the differences between ToDI for Dutch and
GToBI for German, since Dutch, like German, is said to have a late alignment
(Schepman, Lickley, & Ladd, 2006), it might account for the British School tendency to
focus on the accented syllable and what follows, since the peak in British English tends
to be fairly early in the syllable.
In this paper, we are calling into question whether the “off-ramp-on-ramp-divide” is the
right dimension along which models or even languages should be differentiated. The
results presented rather suggest that, whatever analysis one chooses, it is important to
look at what happens before the starred tone target. We have shown that in German this
region is a meaningful part of the intonation contour and argue that it should be
incorporated into a phonological account of the intonation system of German.
Thus, the results here provide evidence for extending the “window of analysis” to the
left, i.e. before the target for the starred tone. This does not call into question the
following tonal movement, which is of great importance for the meaning of an
utterance, as discussed in the introduction. In our view, a sound analysis of intonation
should take into account what is happening both before and after the tonal target on the
accented syllable, so as to be able to capture important generalisations.
5. Conclusion
We have shown experimentally that native listeners of German make use of the tonal
onglide, the tonal movement leading towards the target of the starred tone, as a cue
for distinguishing between two distinct pragmatic meanings. This result strongly
suggests that the onglide plays an important role in the intonation system of German,
adding to the scarce evidence available on the issue of on- vs. off-ramp analyses of
intonation. We conclude that, whatever abstract analysis is adopted (floating tone, leading
tone, displaced tone), the window before the target on the accented syllable is important
and thus needs to be taken into account in the analysis of the intonation of German.
References
Bates, D.M., Maechler, M., & Bolker, B. (2012). lme4: Linear mixed-effects models
using S4 classes. R package version 0.999999-0.
Beckman, M. E. (1986). Stress and non-stress accent. Netherlands Phonetics Archives,
vol. VII. Dordrecht: Foris Publications.
Beckman, M. E., Hirschberg, J., & Shattuck-Hufnagel, S. (2005). The original ToBI
system and the evolution of the ToBI framework. In Jun, S.-A. (Ed.), Prosodic
Typology: The Phonology of Intonation and Phrasing (pp. 9-54). Oxford: Oxford
University Press.
Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot
International, 5, 341-345.
Büring, D. (2007). Intonation, Semantics and Information Structure. In Ramchand, G.,
& Reiss, C. (Eds.), The Oxford Handbook of Linguistic Interfaces.
Chen, A. (2011). What’s in a rise: Evidence for an off-ramp analysis of Dutch
Intonation. Proceedings of the 17th International Congress of Phonetic Sciences,
Hong Kong, 448-451.
Couper-Kuhlen, E. (1986). An Introduction to English Prosody. London and Tuebingen:
Edward Arnold and Niemeyer.
Cruttenden, A. (1997). Intonation (2nd edition). Cambridge: Cambridge University
Press.
Crystal, D. (1969). Prosodic Systems and Intonation in English. London: Cambridge
University Press.
Essen, O. von (1964). Grundzüge der hochdeutschen Satzintonation (2nd edition).
Ratingen: Henn.
Féry, C. (1993). German Intonational Patterns. Tübingen: Niemeyer.
Grabe, E. (2001). The IViE labelling guide, Version 3. Retrieved from
www.phon.ox.ac.uk/files/apps/old_IViE/guide.html
Grice, M., & Baumann, S. (2002). Deutsche Intonation und GToBI. Linguistische
Berichte, 191, 267–298.
Grice, M., Baumann, S., & Benzmüller, R. (2005). German Intonation in Autosegmental-
Metrical Phonology. In Jun, S.-A. (Ed.), Prosodic Typology: The Phonology of
Intonation and Phrasing (pp. 55-83). Oxford: Oxford University Press.
Grice, M., Baumann, S. & Jagdfeld, N. (2009). Tonal association and derived nuclear
accents: the case of downstepping contours in German. Lingua, 119, 881-905.
Grice, M., & Savino, M. (1995). Low tone versus 'sag' in Bari Italian intonation; a
perceptual experiment. Proceedings of the XIII International Congress of Phonetic
Sciences, Stockholm, Sweden, 658-661.
Gussenhoven, C. (2004). The Phonology of Tone and Intonation. Cambridge:
Cambridge University Press.
Gussenhoven, C. (2005). Transcription of Dutch Intonation. In Jun, S.-A. (Ed.),
Prosodic Typology: The Phonology of Intonation and Phrasing (pp. 118-145).
Oxford: Oxford University Press.
Gussenhoven, C. (2008). Semantic judgements as evidence for the intonational structure
of Dutch. Proceedings of the 4th Conference on Speech Prosody, Campinas, 609-612.
Ladd, D. R. (2008). Intonational Phonology. Cambridge: Cambridge University Press.
Ladd, D. R., & Morton, R. (1997). The perception of intonational emphasis:
continuous or categorical? Journal of Phonetics, 25, 313-342.
Ladd, D. R., & Schepman, A. (2003). “Sagging transitions” between high pitch accents
in English: experimental evidence. Journal of Phonetics, 31, 81-112.
Mayer, J. (1995). Transcription of German intonation - the Stuttgart System. Retrieved
from http://www.ims.uni-stuttgart.de/institut/arbeitsgruppen/phonetik/joerg/labman/STGTsystem.html
Niebuhr, O., D’Imperio, M., Gili Fivela, B., & Cangemi, F. (2011). Are there "shapers"
and "aligners"? Individual differences in signalling pitch accent category.
Proceedings of the 17th ICPhS, Hong Kong, China, 120-123.
Peirce, J. W. (2007). PsychoPy - Psychophysics software in Python. Journal of
Neuroscience Methods, 162, 8-13.
Perkell, J. S., Guenther, F. H., Lane, H., Matthies, M. L., Stockmann, E., Tiede, M., &
Zandipour, M. (2004). The distinctness of speakers' productions of vowel contrasts is
related to their discrimination of the contrasts. Journal of the Acoustical Society of
America, 116, 2338-2344.
Peters, J. (2014). Intonation. Heidelberg: Winter.
Pheby, J. (1975). Intonation und Grammatik im Deutschen. Berlin: Akademie-Verlag.
Pierrehumbert, J., & Beckman, M. (1988). Japanese Tone Structure. Cambridge: MIT
Press.
Pike, K.L. (1945). The Intonation of American English. Ann Arbor: University of
Michigan Press.
R Core Team (2012). R: A Language and Environment for Statistical Computing.
Retrieved from http://www.R-project.org/.
Ritter, S., Krüger, M., Mücke, D., & Grice, M. (2012). Production and perception of
contrast: Tonal onglides and oral gestures. Poster presentation at the final symposium
of the DFG Priority programme 1234, July 26th.
Röhr, C., & Baumann, S. (2010). Prosodic Marking of Information Status in German.
Proceedings of the Fifth International Conference on Speech Prosody, Chicago,
100019, 1-4.
Röhr, C., & Baumann, S. (2011). Decoding Information Status by Type and Position of
Accent in German. Proceedings of the 17th International Congress of Phonetic
Sciences, Hong Kong, 1706-1709.
Schepman, A., Lickley, R., & Ladd, D.R. (2006). Effects of vowel length and “right
context” on the alignment of Dutch nuclear accents. Journal of Phonetics, 34, 1-28.
Acknowledgments:
Many thanks to Bodo Winter and Timo Roettger for advice on statistics. This work was
supported by the German Research Foundation (Grant GR 1610/5 awarded to Martine
Grice).