This article may not exactly replicate the final published version. [©2004 American Psychological Association] 1
This article may not exactly replicate the final
version published in the APA journal. It is not
the copy of record.
Representing a Described Sequence of Events: A Dynamic View of Narrative Comprehension
Technical University of Berlin, Germany
University of Hamburg, Germany
Technical University of Berlin, Germany
Journal of Experimental Psychology: Learning, Memory, and Cognition, 2004,
Vol. 30, No. 2, 451–464.
©2004 American Psychological Association
journal home page: http://www.apa.org/journals/xlm.html
This study explored the representation that readers construct when advancing through the
description of an unfolding occurrence. In 3 experiments, participants read narratives describing
a sequence of events and at a certain moment were tested for the accessibility of an entity from a
past event. Entities were less accessible when the temporal distance between that past event and
the current now point in the described world was relatively long than when it was shorter. This
effect occurred when temporal distance was varied in terms of the duration of an intervening
event but not when it was varied in terms of a temporal shift. The results suggest that the
representation constructed for the description of an unfolding occurrence mimics its temporal
structure. This is consistent with a dynamic view of narrative comprehension.
It is widely agreed that narrative comprehension involves creating a mental
representation of the states of affairs described by the text. This nonlinguistic representation is
referred to as a mental model (e.g., Glenberg, Meyer, & Lindem, 1987; Johnson-Laird, 1983) or
a situation model (e.g., Graesser, Millis, & Zwaan, 1997; Morrow, Bower, & Greenspan, 1990;
van Dijk & Kintsch, 1983; Zwaan & Radvansky, 1998). Mental models are constructed on-line
during text processing. Each incoming sentence is processed in consideration of prior
information, and then in turn, the model is updated according to the new information. However,
at a given moment during text processing, not all previously described states of affairs are
equally accessible. It seems likely that when interpreting a new sentence, the most easily
accessible states of affairs serve by default as the context (e.g., for determining the referents of
discourse anaphors, for resolving lexical ambiguities, and for drawing inferences). We refer to
the representation of these states of affairs as the current model.
One might suppose that the current model always represents the states of affairs that
were described in the last sentence. However, empirical findings suggest that recency of
mention is not the decisive variable. Rather, accessibility is mainly determined by what the
comprehender conceives as the current situation of the protagonist. For example, the state of
affairs that was described in the last sentence may in fact be difficult to access immediately
afterward, when this state of affairs was described as belonging to the protagonist’s past, as in
Sometime in the past, she worked as an economist for an international company, compared
with when it was described as belonging to the protagonist’s current situation, as in Now she
works as an economist for an international company (Carreiras, Carriedo, Alonso, &
Fernández, 1997). Numerous other studies have also shown that information concerning the
protagonist’s current situation is more accessible than information concerning other places or
time periods of the described world, even when recency of mention is controlled for (e.g.,
Anderson, Garrod, & Sanford, 1983; Bower & Rinck, 2001; Glenberg et al., 1987; Haenggi,
Kintsch, & Gernsbacher, 1995; Levine & Klin, 2001; Magliano & Schleich, 2000; Morrow,
1994; Morrow, Bower, & Greenspan, 1989; Morrow et al., 1990; Radvansky & Copeland,
2001; Rinck & Bower, 1995, 2000; Rinck, Hähnel, Bower, & Glowalla, 1997; Rinck,
Williams, Bower, & Becker, 1996; Zwaan, 1996; Zwaan, Madden, & Whitten, 2000). Thus,
there is good evidence that at a given moment during narrative processing, the current model
represents the protagonist’s situation at the respective narrative now.
It can be expected that a new sentence gives rise to different updating processes
depending on whether this sentence elaborates the protagonist’s current situation or implies a
shift to another situation. If the new sentence elaborates the current situation, the new
information can simply be added to the given current model, as in Jane was lying in bed,
listening to the final chords of her favorite Bach sonata. She enjoyed the full sound of her
In contrast, if the new sentence implies a shift of the narrative now, it can be expected
that a separate model is constructed, which then becomes the new current model: Jane was
lying in bed, listening to the final chords of her favorite Bach sonata. An hour later, she
turned off the stereo and fell asleep.
Empirical studies confirm that reading times for sentences implying a temporal shift
are prolonged (Rinck & Weber, in press; see also Hyönä, 1995), and entities or states of
affairs from the pre-shift situation become less accessible unless these entities or states of
affairs are also part of the situation at the new narrative now (Bestgen & Vonk, 1995; Levine
& Klin, 2001; Scott Rich & Taylor, 2000; Zwaan et al., 2000).
Yet, matters turn out to be more complicated when considering a text such as the
following, in which the second sentence implies situational continuity: Jane was lying in bed,
listening to the final chords of her favorite Bach sonata. She turned off the stereo and fell asleep.
It is not immediately clear whether the second sentence should be considered as
referring to the same or a different situation than the one that is represented in the current
model. On the one hand, one could propose that the new sentence refers to the same situation,
as it describes how this situation developed. On the other hand, the new sentence implies a
forward movement of the narrative now. On the basis of the previously mentioned findings
concerning the content of current models, it therefore seems reasonable to assume that in this
case a new model is constructed when processing the second sentence. Indeed, there is
empirical evidence that situational continuity sentences lead to diminished accessibility of
information concerning the prior situation compared with elaboration sentences (Levine &
Klin, 2001, Exp. 3; Magliano & Schleich, 2000). Thus, it seems that situational-continuity
sentences do not differ fundamentally from temporal shift sentences with respect to the
updating processes they trigger (disregarding potential processing advantages due to a larger
content overlap, see Gernsbacher, 1990, 1997). In both cases, a separate model is created for
representing the situation described in the new sentence. However, empirical findings suggest
that there is a difference between the result of processing a situational continuity sentence and
the result of a temporal shift sentence over and above the effect of differences in content
overlap. Information concerning the situation that was represented in the prior current model
(e.g., final chords) is more accessible after reading a situational continuity sentence than after
reading a time-shift sentence, even when both of them described the same new event, for
example, turning off the stereo (Bestgen & Vonk, 1995; Zwaan, 1996). It seems that in the
case of a situational-continuity sentence the new current model becomes more tightly
connected to the previous model than in the case of a temporal shift.
One way to account for these findings is to assume that the updating processes
are indeed basically identical in the two cases but that whenever a given current model is
replaced by a new current model, the continuity or discontinuity of the two respective
situations is explicitly encoded. This idea comes close to what is assumed in the event-
indexing model (Magliano, Zwaan, & Graesser, 1999; Zwaan, 1999a; Zwaan, Magliano, &
Graesser, 1995; Zwaan & Radvansky, 1998). According to this model, events are the core
units of the mental representation constructed for a narrative. It is assumed that after the
model for a given sentence is constructed in short-term working memory, the information of
this model is transferred to long-term working memory (cf. Ericsson & Kintsch, 1995), where
the previously described events are stored. The information is still easily accessible through
retrieval cues maintained in short-term working memory, which enables the comprehender to
connect the next current model to the information about previously described events. More
specifically, it is assumed that during text processing each incoming story event is indexed on
five situational dimensions: time, space, causality, intentionality, and agents. The more
indices two events share, the stronger the connection is between the event representations.
With respect to the temporal dimension, in particular, the event-indexing model assumes that
two events share a temporal index not only when they occur simultaneously but also when
they occur temporally contiguously in the described world. Thus, when the comprehender
constructs a model for a situational continuity sentence, this model becomes strongly
connected to the representation of the preceding event, which is already stored in long-term
memory. In contrast, the model for a comparable temporal shift sentence becomes relatively
weakly connected to the memory representation of the event described before. Apart from the
differences in connection strength, however, the result of the updating process is identical.
Hence, the event-indexing model implies that an unfolding occurrence, when being described
in consecutive situational continuity sentences, is represented by a series of distinct, though
interrelated, event representations. Note that Zwaan himself, in his more recent work (e.g.,
Zwaan, 1999b, in press; Zwaan & Yaxley, in press), has proposed an experiential view of
language comprehension that, in contrast to the standard version of the event-indexing model,
is based on the assumption that language comprehension involves perceptual and action
representations rather than amodal propositional representations.
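The indexing idea described above can be illustrated with a toy computation. The following Python sketch is an assumption-laden illustration, not part of the event-indexing model itself: the class `Event`, the function `shared_indices`, the flag `contiguous_with_previous`, and all index values are hypothetical. Because the continuity and time-shift versions describe the same new event, the two versions are given identical values on the nontemporal dimensions.

```python
from dataclasses import dataclass

DIMENSIONS = ("time", "space", "causality", "intentionality", "agents")

@dataclass(frozen=True)
class Event:
    time: int              # hypothetical ordinal position on the story timeline
    space: str
    causality: str
    intentionality: str
    agents: str
    contiguous_with_previous: bool = True  # False after a temporal shift

def shared_indices(prev: Event, new: Event) -> int:
    """Count the dimensions on which two consecutive events share an index."""
    count = 0
    for dim in DIMENSIONS:
        if dim == "time":
            # A temporal index is shared when the events are simultaneous
            # or temporally contiguous in the described world.
            if prev.time == new.time or new.contiguous_with_previous:
                count += 1
        elif getattr(prev, dim) == getattr(new, dim):
            count += 1
    return count

# Hypothetical index values for the Bach sonata example.
listening = Event(0, "bedroom", "music playing", "relax", "Jane")
continuity = Event(1, "bedroom", "music playing", "relax", "Jane", True)
time_shift = Event(2, "bedroom", "music playing", "relax", "Jane", False)

n_continuity = shared_indices(listening, continuity)
n_time_shift = shared_indices(listening, time_shift)
```

In this sketch the continuity version shares one more index with the preceding event than the time-shift version does, which corresponds to the stronger connection assumed for situational continuity sentences.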
In narrative comprehension research, the narrative now is usually conceived of as the
time interval in the described world that a given sentence refers to, similar to Reichenbach’s
(1947) reference time and Klein’s (1994) topic time. In other words, for each sentence there is
one particular constant narrative now. Accordingly, when a sequence of situational-continuity
sentences describes how an occurrence unfolds, the narrative now moves ahead in discrete
steps. This view of the narrative now invites the assumption that comprehenders represent the
occurrence by a sequence of static representations (one for each sentence). However, this
static view of event representation is not the only possible one.
An alternative view results when proceeding from the original notion of a mental
model, as articulated in Johnson-Laird’s (1983) theory. This theory implies that the mental
models constructed in language comprehension are of the same kind as the mental models
that are constructed in nonlinguistic cognition. Generally, mental models are internal models
of the real or a fictitious world that are grounded in perception (see also Craik, 1943).
Accordingly, it can be presumed that the mental models created in language comprehension
have important properties in common with the mental models of perceived states of affairs.
Numerous studies have shown that people represent perceived events and even static objects
or scenes dynamically. For example, people mentally simulate seeing a depicted object
rotating or a pulley operating (e.g., Cooper, 1976; Hegarty, 1992; Sims & Hegarty, 1997),
they mentally track the presumed trajectory of a depicted object (e.g., Freyd & Pantzer, 1995;
Hespos & Rochat, 1997), and they anticipate the upcoming position or orientation of an
object that seems to be in an unstable state (e.g., Bertamini, 1993; Freyd, 1987; Verfaillie &
d'Ydewalle, 1991; for a review see Thornton & Hubbard, 2002). Given the evidence for
dynamic representations in perception, it seems reasonable to entertain the view that the
mental models constructed in language comprehension are dynamic representations as well,
or in other words, that comprehenders mentally simulate the experience of the events
described in a text (cf. Johnson-Laird, 1983, pp. 10, 423; McGinn, 1989, chap. 3; Zwaan,
Yaxley, Madden, & Aveyard, 2003).
A dynamic representation is a representation that evolves in time. Its temporal
structure has a representational function. It intrinsically codes (Palmer, 1978) the temporal
structure of the respective perceived or conceived state of affairs (cf. Freyd, 1987, 1993). A
dynamic representation is most naturally conceived as a temporally continuous
representation. It can also be conceived as a series of discrete static representations of
successive states of the evolving occurrence. Note, however, that a series of static
representations qualifies as a dynamic representation only if the temporal structure of the
series reflects the temporal structure of the represented occurrence.
In principle, the notion of a dynamic mental representation does not imply any specific
assumptions as to the amount and kind of non-temporal information being represented.
However, in our particular theoretical framework, we assume that the dynamic
representations being constructed in narrative comprehension have an informational content
that is principally comparable to (although possibly more schematic than) the content of the
representations that would be constructed when actually experiencing the respective
Let us now consider in greater detail what the dynamic view implies with respect to
narrative processing. When starting to read a description of several events, comprehenders set up
a dynamic representation simulating the experience of the event that is described in the first
sentence. When the next sentence is a situational continuity sentence, they continue the hitherto
constructed dynamic representation. We refer to this type of updating, which consists of the
continuation of a given dynamic representation, as tracking. If a text describes an unfolding
occurrence by means of a series of consecutive situational continuity sentences and thus allows
for continuous tracking, then the occurrence becomes gradually coded in one dynamic
representation, without any clear cuts between the representations of successive subevents of the
occurrence. In other words, according to the dynamic view, people conceive of the narrative now
as continuously moving ahead, as long as each new sentence is a situational continuity sentence.
When a new sentence implies a temporal shift, the previously constructed dynamic
representation is discontinued, and a new dynamic representation is initiated. Consequently,
the time interval that was skipped in the description is not represented. The dynamic view
does not suggest anything specific as to where and how the information from the outdated
dynamic representation is stored when a fresh start is performed. For now, we assume that the
information of a terminated dynamic representation is stored in a static representation in long-
term working memory.
To summarize, the dynamic view implies that two types of updating must be
distinguished: tracking and a fresh start. Tracking occurs when the new sentence is a
situational continuity sentence. Tracking consists in continuing the given dynamic
representation. A fresh start is performed when the new sentence implies a temporal shift. A
fresh start involves the termination of the given dynamic representation and the initiation of a
new one. The dynamic view differs from the previously described static view mainly in
postulating an updating process like tracking. More specifically, immediately successive
events that are described in consecutive situational continuity sentences are assumed to be
encoded in one coherent dynamic representation. In contrast, the static view implies that the
comprehender sets up one static representation after the other, with the time between
constructing the individual representations having no representational function.
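The two types of updating summarized above can be sketched as a small state machine. This is only an illustration under simplifying assumptions: the class `Comprehender`, its attribute names, and the representation of events as strings are hypothetical, and whether a sentence implies a temporal shift is simply passed in rather than derived from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Comprehender:
    current: list = field(default_factory=list)  # ongoing dynamic representation
    stored: list = field(default_factory=list)   # terminated representations

    def update(self, event: str, temporal_shift: bool) -> str:
        """Integrate one sentence and report which kind of updating occurred."""
        if not self.current:
            # First sentence: set up the initial dynamic representation.
            self.current = [event]
            return "initial"
        if temporal_shift:
            # Fresh start: terminate the given representation, store it as a
            # static record, and initiate a new dynamic representation.
            self.stored.append(tuple(self.current))
            self.current = [event]
            return "fresh start"
        # Tracking: continue the given dynamic representation.
        self.current.append(event)
        return "tracking"

reader = Comprehender()
steps = [
    reader.update("listening to the sonata", temporal_shift=False),
    reader.update("turning off the stereo", temporal_shift=False),
    reader.update("waking up the next morning", temporal_shift=True),
]
```

The sketch captures only the control structure. It does not model the temporal structure of the dynamic representation, which is the property that distinguishes the dynamic view from the static view.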
One way to distinguish empirically between the dynamic and static views is to
study the impact that temporal remoteness has on the accessibility of events that all
occurred in the protagonist's past. Take, for example, the chronological description of an
occurrence consisting of three immediately successive events: E1, E2, and E3. Let us further
assume that the sentence describing E3 contains an anaphoric expression that refers to an
entity that was involved in E1. Thus, when processing the information about E3, the reader
has to access an element of E1. The dynamic view implies that when reading the text, a
coherent dynamic representation is set up from E1, via E2, to E3. As the temporal structure of
the occurrence is intrinsically coded in the representation, the temporal distance between E1
and E3 in the described world is reflected in the distance between the sections of the
representation that code the two events. Accordingly, accessing the representation of E1
should take longer when the temporal distance between E1 and E3 is greater. In the case at
hand, where E1, E2, and E3 are immediately successive events, the temporal distance
between E1 and E3 depends on the duration of E2 (ceteris paribus). Thus, the dynamic view
predicts that more time is needed to resolve the anaphor in the description of E3 when E2 is a
long-lasting event than when it is a short-lived event. In contrast, the static view does not
predict any effect of temporal distance or event duration. The static view implies that when
the reader creates the model for E3, the events E1 and E2 are already stored in distinct
memory representations. Thus, there is no reason to expect that the time needed to access an
element of E1 when processing the description of E3 depends on the duration of E2.
Along these lines of reasoning, we conducted three experiments. Participants read
narrative texts that chronologically described a coherent occurrence consisting of three
contiguous events. A sample text is given in Table 1. Within the context of the description of
the first event (e.g., decorating the Christmas tree) the target entity was mentioned. The
second event always started with a movement of the protagonist to another place, and it either
had a relatively short or a relatively long duration (e.g., putting cookies on plates vs. baking
cookies). In the course of the description of the third event, the accessibility of the target
entity was tested, either by means of an anaphoric expression referring to the target entity
(Experiments 1 and 3) or by means of a probe-recognition task (Experiment 2). The dynamic
view predicts that accessing the target entity will require more time when the duration of the
second event (in the described world) is long compared with when it is short. In contrast, no
duration effect is predicted by the static view.
Experiment 1
Method
Participants. Forty-four students at the Technical University of Berlin participated in the
experiment. They either received a monetary reimbursement or participated to fulfill
undergraduate requirements. All participants were native German speakers.
Materials. There were 20 experimental narratives, constructed according to the following
scheme (see Table 1). First, the setting was specified, and the protagonist was introduced by
means of a proper name. Then, a particular action of the protagonist was described (first event),
and in this context, the target entity was mentioned. The target entity was a distinct, short-lived
incident or component of the protagonist's action (e.g., throwing a tantrum). In the next sentence,
the protagonist was described as moving to another place where he or she then performed
another action (intermediate event). For each text, there were two versions of this sentence. The
short-duration version described an action that usually does not take long (e.g., putting
cookies on plates), whereas the long-duration version described an action that usually takes
relatively longer (e.g., baking cookies). The next sentence (filler sentence) described the
termination or result of the action and signaled the beginning of a new event (third event). This
new event was elaborated in the next sentence (anaphoric sentence), in which the target entity
was anaphorically referred to. The anaphoric expression either contained the same noun that was
previously used for denoting the incident or contained a nominalization of the verb that was
central in the first mention of the target incident. The anaphoric sentence was followed by one or
more sentences that completed the story.
There were 38 filler texts, which were similar to the experimental texts with respect to
topics, style, and length. Twenty-two of these filler texts served another purpose, not related to
the issue of temporal information. To encourage the participants to read for comprehension,
there was a declarative statement for each text that was to be verified by the participants
immediately after reading the text.
Design and procedure. The 20 experimental texts were randomly assigned to two sets, A
and B, which consisted of 10 texts each. Half of the participants received the short-duration
version of Set A and the long-duration version of Set B. The other half received the long-
duration version of Set A and the short-duration version of Set B. Thus, the two duration
versions were assigned to participant groups and text sets according to a 2(group) x 2(set) x
2(duration) Latin-square design. Experimental texts and filler texts were presented in various
mixed random orders.
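The counterbalancing scheme described above can be sketched as follows. The function `assign_versions`, the group labels, the text identifiers, and the fixed random seed are hypothetical and serve only to make the sketch reproducible; the original assignment of texts to sets is not known.

```python
import random

def assign_versions(participant_group: str, texts_by_set: dict) -> dict:
    """Return {text_id: duration_version} for one participant group."""
    # Group 1 reads Set A in the short version and Set B in the long version;
    # Group 2 receives the reverse assignment.
    plan = {
        "group1": {"A": "short", "B": "long"},
        "group2": {"A": "long", "B": "short"},
    }[participant_group]
    return {
        text: plan[set_name]
        for set_name, texts in texts_by_set.items()
        for text in texts
    }

rng = random.Random(0)          # fixed seed, for illustration only
texts = list(range(1, 21))      # 20 experimental texts, hypothetical ids
rng.shuffle(texts)              # random assignment of texts to sets
texts_by_set = {"A": texts[:10], "B": texts[10:]}

group1 = assign_versions("group1", texts_by_set)
group2 = assign_versions("group2", texts_by_set)
```

Each participant thus reads every text once, half in each duration version, and across the two groups every text appears in both versions.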
Texts were displayed on a computer screen in 14-point Palatino font. The participants
initiated the presentation of each text by pressing the return key and advanced through the text,
sentence by sentence, by pressing the space bar. When they pressed the space bar after the final
sentence of a text, the declarative statement was presented. Participants indicated their positive
or negative response by pressing the l key or d key, respectively. Participants were tested
individually. They were instructed to read the texts carefully at their normal pace. They were
informed that they would be asked to verify a statement after each text. It was not mentioned that
reading times were being measured. The procedure was illustrated by means of two practice
trials. The experimental session lasted approximately 50 minutes.
Results and Discussion
Analyses were carried out on the reading times for the anaphoric sentences and, to control
for potential spillover effects from the experimental manipulation, also on the reading times for
the immediately preceding filler sentences. Outliers were determined separately for the two
sentence types. In determining outliers, we took into account not only differences among the
participants but also differences among the sentences. First, the 20 reading times of
each participant were converted to z scores. Then reading times with a z score that deviated more
than 2.5 SDs from the mean z score of the respective item in the respective condition were
discarded. This eliminated 2.7% of the reading times for the filler sentences and 1.8% of the
reading times for the anaphoric sentences. In this article, F1 refers to tests against an error term
based on participant variability, and F2 refers to tests against an error term based on item
variability. The analysis by participants was a 2(group) x 2(sentence type: filler vs. anaphoric) x
2(duration: short vs. long) mixed analysis of variance (ANOVA) with group as the only
between-subjects variable. The analysis by items was analogous, with set instead of group.
Group and set, being derived from the before-mentioned Latin square, were included to reduce
error variance (cf. Pollatsek & Well, 1995). Because they lacked theoretical relevance, however,
their effects are not reported. Because of the group variable, the standard deviations presented in
the tables are the square root of the pooled variance of the participant groups in the respective
condition. An alpha level of .05 was used for all statistical tests. All effect sizes reported are
proportions of variance explained (PV), determined according to Murphy and Myors (1998).
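The two-step outlier screening described above can be sketched in a few lines. The sketch rests on assumptions that the text does not settle: the function name `flag_outliers` and the record layout are hypothetical, sample standard deviations are used throughout, and "2.5 SDs" is read as 2.5 standard deviations of the z scores within the respective item-by-condition cell.

```python
from collections import defaultdict
from statistics import mean, stdev

def flag_outliers(records, criterion=2.5):
    """records: list of (participant, item, condition, rt) tuples.

    Returns the set of (participant, item) pairs flagged as outliers.
    Assumes every participant and every cell has at least two varying values.
    """
    # Step 1: convert each reading time to a z score within its participant.
    by_participant = defaultdict(list)
    for participant, _item, _cond, rt in records:
        by_participant[participant].append(rt)
    stats = {p: (mean(rts), stdev(rts)) for p, rts in by_participant.items()}

    z_by_cell = defaultdict(list)  # (item, condition) -> [(participant, z)]
    for participant, item, cond, rt in records:
        m, sd = stats[participant]
        z_by_cell[(item, cond)].append((participant, (rt - m) / sd))

    # Step 2: compare each z score with the mean z score of its
    # item-by-condition cell (assumed criterion: 2.5 cell SDs).
    flagged = set()
    for (item, _cond), cell in z_by_cell.items():
        zs = [z for _p, z in cell]
        cell_mean, cell_sd = mean(zs), stdev(zs)
        for participant, z in cell:
            if abs(z - cell_mean) > criterion * cell_sd:
                flagged.add((participant, item))
    return flagged
```

Standardizing within participants first removes overall differences in reading speed, so that a reading time is judged against what is typical for that sentence in that condition rather than against the raw grand mean.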
The mean reading times for the two types of sentences in the two duration conditions are
displayed in Table 2. There was a main effect of sentence, F1(1,42) = 62.84, MSE = 35,631, p < .01;
F2(1,18) = 6.94, MSE = 149,564, p = .02, which, however, is of little interest, as the filler
and anaphoric sentences were not matched for length and complexity. The main effect of
duration was significant as well, F1(1,42) = 5.49, MSE = 28,541, p = .02; F2(1,18) = 5.54, MSE
= 15,239, p = .03. The Sentence x Duration interaction was significant in the analysis by
participants, F1(1,42) = 4.73, MSE = 16,898, p = .04, and marginally significant in the analysis
by items, F2(1,18) = 3.44, MSE = 9,370, p = .08. Planned comparisons showed that the reading
times for the filler sentences did not differ significantly in the two duration conditions (Fs < 1),
whereas the reading times for the anaphoric sentences were reliably shorter in the short-duration
condition than they were in the long-duration condition, F1(1,42) = 12.18, MSE = 18,891, p <
.01, PV = .22; F2(1,18) = 11.45, MSE = 9,652, p < .01, PV = .39.
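As a check on the reported effect sizes, the proportion of variance explained can be recovered from an F value and its degrees of freedom. The sketch below assumes the conversion PV = (df_hyp x F) / (df_hyp x F + df_err), which is the form usually associated with Murphy and Myors (1998); the function name is hypothetical.

```python
def proportion_of_variance(f_value: float, df_hyp: int, df_err: int) -> float:
    """PV = (df_hyp * F) / (df_hyp * F + df_err), assumed conversion."""
    return (df_hyp * f_value) / (df_hyp * f_value + df_err)

# Planned comparison for the anaphoric sentences (values from the text).
pv_participants = proportion_of_variance(12.18, 1, 42)  # rounds to .22
pv_items = proportion_of_variance(11.45, 1, 18)         # rounds to .39
```

Under this assumption, the computed values agree with the PV values reported for the planned comparison.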
The finding that the intermediate event affected the reading times for the anaphoric
sentence but not the reading times for the filler sentence suggests that the nature of the
intermediate event was specifically relevant to the resolution of the anaphor. Furthermore, the
fact that the reading times for the anaphoric sentence were shorter in the short-duration condition
than they were in the long-duration condition is consistent with the dynamic view of narrative
comprehension. The texts allowed for tracking. Thus, the readers represented the described
sequence of events in one coherent dynamic representation. They thus needed more time to
access the target incident when the first event was temporally more remote from the current now
of the protagonist compared with when it was temporally closer to it.
Note that the observed reading time difference cannot be interpreted in terms of an in-out
effect, which is observed when the accessibility of information concerning the protagonist’s
situation at the narrative now is compared with the accessibility of information about other time
periods (e.g., Anderson et al., 1983; Carreiras et al., 1997; Rinck & Bower, 2000; Zwaan, 1996;
Zwaan et al., 2000). In the present experiment, at the time of testing, the target incident belonged
to the protagonist's past in both conditions. The observed effect is therefore difficult to explain
on the basis of the event-indexing model or other theories that assume that comprehenders
construct a distinct static representation for each event described.
It could be argued that it was not the accessibility of the target incident that was affected
by the intermediate event but rather some other process involved in connecting the information
of the anaphoric sentence to the previous text information. For whatever reasons, this integration
process may have been easier with the short-duration text versions than with the long-duration
versions. Experiment 2 addressed this issue.
Experiment 2
The aim of this experiment was to examine whether an effect of the intermediate event
also emerges when the accessibility of the target entity is tested without requiring any further
processing at the sentence or text level. For these purposes, in the present experiment
accessibility was tested by means of a probe-recognition task. In place of the anaphoric sentence,
a probe word was presented that denoted the target entity (see Table 3).
An additional aim of Experiment 2 was to explore the generalizability of the results with
regard to the type of target entities. The targets in Experiment 1 were short-lived incidents,
existing only in a limited time period of the protagonist's past. It is conceivable that this
particular type of entity gives special weight to the temporal dimension as a basis for access. To
deemphasize the temporal dimension, the target entities in Experiment 2 were concrete physical
objects that endured over time and kept their place in the described world (e.g., a bottle of wine,
a letter). However, in one respect the targets could still be assigned to a particular temporal
interval in the described occurrence: Only in the first event were they an element of the
protagonist's situation. For example, in the story given in Table 3, the target object (carp) is an
element of the protagonist's here and now during the first event only, even though it exists for
the whole time.
Method
Participants. Forty-four students at the University of Hamburg took part in the experiment.
They were either paid for their participation or took part in the experiment to fulfill
undergraduate requirements. All participants were native German speakers. The data from 4
participants were excluded from the analyses because the accuracy of their probe-recognition
performance on the experimental items was not significantly better than chance (binomial test,
six or more errors, p > .05, one-tailed).
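The exclusion criterion can be made concrete with a short calculation. The sketch assumes that all 20 experimental probes enter the test and that chance performance on the yes/no probe task corresponds to a probability of .5; the function name `p_at_least` is hypothetical.

```python
from math import comb

def p_at_least(correct: int, n_items: int, p_chance: float = 0.5) -> float:
    """One-tailed P(X >= correct) for X ~ Binomial(n_items, p_chance)."""
    return sum(
        comb(n_items, k) * p_chance**k * (1 - p_chance) ** (n_items - k)
        for k in range(correct, n_items + 1)
    )

N_ITEMS = 20  # assumption: 20 experimental probes, chance level .5

p_six_errors = p_at_least(N_ITEMS - 6, N_ITEMS)   # about .058, not below .05
p_five_errors = p_at_least(N_ITEMS - 5, N_ITEMS)  # about .021, below .05
```

Under these assumptions, five errors still differ reliably from chance whereas six do not, which is consistent with the stated cutoff of six or more errors.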
Materials. Twenty new experimental texts were constructed. The construction scheme
was the same as in Experiment 1, with two exceptions. First, the target entity was always an
object (e.g., a bottle, a bag, a particular window), which was involved in the protagonist's
activity in the first event. Second, after the filler sentence, a probe word was presented instead of
an anaphoric sentence. The probe word was the name of the target object.
There were 60 filler texts. For 20 filler texts, the probe word was a word that had been
mentioned in the text. For the other 40 filler texts, the probe word had not been mentioned.
Thus, all in all, there were 50% positive probes and 50% negative probes in the experiment. As
in Experiment 1, there was a declarative statement for each text that was to be verified by the participants.
Design and Procedure. The participants were assigned to two groups, and the
experimental texts were assigned to two sets. The two duration versions were assigned to the
groups and text sets according to a 2(group) x 2(set) x 2(duration) Latin-square design. The
procedure was the same as in Experiment 1 with one exception. When the participants
pressed the space bar after having read the filler sentence, a short auditory signal occurred and
then the probe word appeared in the middle of the computer screen. Participants indicated
their positive or negative response by pressing the l or d key, respectively. Afterward the
declarative statement was presented, and participants responded by again pressing the l key or
d key. In the instructions, participants were asked to respond to the probe words as quickly and
accurately as possible. The experimental session lasted approximately 50 min.
Results and Discussion
Due to an error in the preparation of the text files for the computer program, one of the
experimental texts was presented incorrectly. The data pertaining to this text were discarded.
The mean error rates for the probe-recognition task in the experimental trials were 13.2% in the
short-duration condition and 12.8% in the long-duration condition. This difference was
statistically negligible (Fs < 1).
The reading times for the filler sentences and the latencies of the correct probe responses
were analyzed separately. Outliers were determined in the same way as in Experiment 1 (filler
sentences: 2.4%, correct probe responses: 2.3%). For each of the two data sets, the analysis by
participants was a 2 (group) x 2 (duration: short vs. long) mixed ANOVA, with group as
between-subjects variable and duration as within-subject variable. The two corresponding
analyses by items were analogous with set instead of group. The participants' mean reading times
for the filler sentences and mean latencies of correct probe responses are displayed in Table 4.
The duration of the intermediate event did not affect the reading times for the filler sentences (Fs
< 1), but it did affect the probe-recognition latencies. Latencies were significantly shorter in the
short-duration condition than they were in the long-duration condition, F1(1,38) = 6.80, MSE =
33,829, p = .01, PV = .15; F2(1,17) = 5.22, MSE = 25,604, p = .04, PV = .23.
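The PV values reported with these tests can be related to the F ratios. The sketch below assumes that PV denotes the proportion of variance accounted for by the effect, computed as PV = (F × df_effect) / (F × df_effect + df_error); this formula is an assumption, not something stated in this section, and the function name is hypothetical.

```python
def proportion_of_variance(f_value: float, df_effect: int, df_error: int) -> float:
    """Proportion of variance accounted for by an effect, derived from its F ratio."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported for the probe-recognition latencies.
pv_participants = proportion_of_variance(6.80, 1, 38)
pv_items = proportion_of_variance(5.22, 1, 17)

print(round(pv_participants, 2), round(pv_items, 2))  # prints: 0.15 0.23
```

Under this assumption, the computed values match the reported PV = .15 (analysis by participants) and PV = .23 (analysis by items).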
The results correspond to those of Experiment 1. Participants needed more time to
recognize the name of the target entity when the intermediate event had a long duration (in the
described world) than when it had a relatively short duration. This supports the claim that the
intermediate event affects the accessibility of information concerning a previous event rather
than some integration process. Moreover, the present experiment shows that this holds even
when the target entity is a concrete object that endures over time and stays stationary. This
suggests that the temporal dimension is used for access even when the target entity itself is not
bound to a particular period of time in the described world. This result may be interpreted in
the context of findings suggesting that states of affairs mentioned in a narrative are
generally mentally coded in terms of their significance to the protagonist (e.g., Sanford, Clegg, &
As was pointed out above, finding an effect of the duration of the intermediate event on
the accessibility of the target entity poses a problem for the static view. However, one could be
skeptical as to whether our experiments really provided evidence for a duration effect. After all,
the intermediate events mentioned in the two text versions differed not only with respect to their
typical duration but also in other respects. In particular, one may suspect that the intermediate
events mentioned in the long-duration versions were on average more complex than the
intermediate events mentioned in the short-duration versions. If this was indeed the case, then
the observed effect may be due to the fact that in the long-duration condition often a more
complex memory representation needed to be searched through to access the target entity than in
the short-duration condition. Thus, the alleged duration effect may in fact have been an effect of
the complexity of the intermediate event. Experiment 3 was designed to clarify this issue.
The primary purpose of Experiment 3 was to examine whether the duration of the
intermediate event affects the accessibility of a preceding event even when the complexity of
the event is controlled for. To this end, we used text versions that described the very same
type of intermediate event and differed only with respect to a durative adverbial that specified
the duration of this event (see Table 5). As in Experiment 2, the target entities were distinct
physical objects. In all other respects the methodology corresponded to that of Experiment 1.
In particular, the target's accessibility was assessed by measuring the reading times for an
anaphoric sentence that referred to the target entity.
Experiment 3 also addressed a second issue, concerning the distinction between
tracking and fresh starts. The dynamic view of mental models predicts a temporal distance
effect only when the entire occurrence is coded in one continuously evolving dynamic
representation. In other words, a temporal distance effect is contingent on continuous
tracking. A temporal distance effect is not expected when a fresh start is performed
somewhere between the processing of the descriptions of the first and third event. Take, for
example, a narrative text that at some point contains a sentence-initial temporal adverbial like
an hour later or six hours later, both of which indicate a time shift. When encountering one
of these expressions, the reader is likely to perform a fresh start. The present dynamic
representation is discontinued, and a new one is initiated. As a result, the elements that were
involved in the previously described occurrence but are not involved in the new event become
less accessible. However, there is no reason to expect that their accessibility depends on the
size of the time shift, as the skipped time interval is not coded in a dynamic representation.
Hence, in the case of a fresh start, the temporal distance between the target event and the
narrative now at the time of testing is predicted to have no impact on the accessibility of the
target entity. A finding by Zwaan (1996) corresponds to this prediction. Zwaan manipulated
the temporal distance between two events by means of sentence-initial temporal adverbials.
The second event was either described as occurring immediately after the first event or as
occurring one hour or one day later (e.g., A moment later/An hour later/A day later, he turned
very pale). Thus, applying our classification of updating to Zwaan's materials, the moment
condition allowed for tracking, whereas the hour and day conditions both probably triggered a
fresh start. Zwaan tested the accessibility of the first event by means of a probe-recognition
task immediately after the description of the second event. Information from the first event
was found to be less accessible in the two time-shift conditions (hour, day) than in the
moment condition. More important to our issue, no significant difference obtained between
the two shift conditions. This result is exactly what the dynamic view predicts.
To investigate the hypothesized difference between tracking and fresh starts, we
included two time-shift conditions in the experiment. In these conditions, the texts did not
specify the duration of the intermediate event, but instead implied a time shift to the moment
at which the third event started (see Table 5). The sizes of the time shifts matched the
Participants. Thirty-two students at the Technical University of Berlin took part in the
experiment. They either were paid for their participation or took part in the experiment to fulfill
undergraduate requirements. All participants were native German speakers.
Materials. Twenty-eight new experimental texts were constructed (see sample text in
Table 5). The target entity was a discrete physical object that the protagonist interacted with
during the first event. The target entity was anaphorically referred to in the description of the
third event. There were 2 x 2 versions of each text, realizing two ways of manipulating the
temporal distance between the first and third event (duration of intermediate event vs. time
shift) and two levels of temporal distance (short vs. long).
The two duration versions were of the same structure as the texts used in Experiments 1
and 2. After a few introductory sentences, the first event was described, and in this context, the
target entity was mentioned. The next sentence reported the protagonist's movement to
another location and described his or her activity there (intermediate event). However, unlike the
texts used in Experiments 1 and 2, the two versions described the same activity and differed only
with regard to the stated duration of the activity. The durations were numerically specified by
durative adverbials (e.g., for an hour vs. for six hours). In German, durative adverbials can have
different forms, all corresponding to the English durative adverbial with for (e.g., for two hours).
In our experiment, we used a bare durative adverbial (e.g., zwei Stunden) in 7 texts, a durative
adverbial containing lang (e.g., zwei Stunden lang) in 15 texts, and a durative adverbial
containing für (e.g., für zwei Stunden) in 6 texts.
The activities constituting the intermediate events all had a relatively large range of
plausible durations (e.g., lying in a bathtub, cycling on one's exercise bike, dancing at a ball).
The specific numeric values for the durative adverbials were chosen on the basis of duration
estimations of a separate group of 20 participants. To test whether the selected short and long
durations were comparable with respect to plausibility we collected ratings from another group
of 20 participants. These participants were presented with a booklet containing the 28 texts
either in the short-duration version or in the long-duration version (14 texts each). Participants
rated the plausibility of the stated duration by using a scale that ranged from 1 (very plausible) to
5 (very implausible), which corresponds to the grading system in German schools. The
difference between the plausibility ratings for the short- and long-duration version of a text (Ms
= 2.5 and 2.7, respectively) ranged from −2.8 to +2.4 (plausibility difference between versions,
M = 0.2, SD = 1.2) and was not significant according to a t test for correlated samples, t(27) =
−1.14, p = .26.
The sentence concerning the intermediate event was followed by a sentence that
introduced the beginning of the third event. This sentence was identical for the two duration
versions. The next sentence elaborated the third event. It was followed by the critical sentence,
which contained an anaphoric expression referring to the target object.
The two time-shift versions differed from the duration versions in the wording of two
sentences. First, the sentence about the intermediate event simply described the activity of the
protagonist at the new location, without specifying its duration. Second, the subsequent
sentence, describing the beginning of the third event, contained a temporal adverbial announcing
a time shift. The form of the temporal adverbial was nach zwei Stunden [after two hours] in 22
texts, and zwei Stunden später [two hours later] in 6 texts. The numeric values for the small and
large time shifts corresponded to the values given in the respective duration versions of the text.
Thus, the time-shift and duration versions of a given text implied the same temporal distances
between the event involving the target object and the protagonist's now described in the
anaphoric sentence. There were 24 filler texts. They were similar to the experimental texts with
respect to topics, style, and length.
Design and procedure. Participants were randomly assigned to four groups, and the
experimental texts were randomly assigned to four sets. The four text versions were assigned to
the groups and sets according to a 4(group) x 4(set) x 4(version) Latin square. Experimental and
filler texts were presented to the participants in mixed random order. The procedure was
identical to that of Experiment 1, except that a different task was used to encourage careful
reading. In 30% of the trials, participants were asked to write a short summary of the text just
read. The experimental session lasted about 60 minutes.
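The counterbalancing scheme can be illustrated with a cyclic Latin square. This is a minimal sketch, not the authors' actual assignment: the version labels and the cyclic construction are assumptions chosen for illustration.

```python
# Hypothetical labels for the four text versions (manner x temporal distance).
VERSIONS = ["duration-short", "duration-long", "shift-short", "shift-long"]


def latin_square_assignment(n: int) -> list[list[int]]:
    """Cyclic n x n Latin square: cell [group][text_set] holds a version index."""
    return [[(group + text_set) % n for text_set in range(n)] for group in range(n)]


square = latin_square_assignment(len(VERSIONS))

# Each participant group sees every text set, each set in a different version.
for group, row in enumerate(square, start=1):
    print(f"group {group}:", [VERSIONS[v] for v in row])
```

In such a scheme, every group reads each version equally often, and every text set appears in each version for exactly one group. With 32 participants and 28 texts, equal-sized groups and sets would comprise 8 participants and 7 texts, respectively.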
Results and Discussion
As in Experiment 1, analyses were conducted on the reading times of the filler
sentences and anaphoric sentences. Outliers were determined in the same way as in the
previous experiments (filler sentences: 3.6%, anaphoric sentences: 3.0%). The overall
analysis by participants was a 4(group) x 2(sentence: filler vs. anaphoric) x 2(manner of
manipulation: duration vs. time shift) x 2(temporal distance: small vs. large) mixed ANOVA,
with group being the only between-subjects variable. The analysis by items was identical, except that
it involved set instead of group.
Mean reading times for the filler and anaphoric sentences in the various conditions are
displayed in Table 6. The overall analyses yielded a significant main effect of sentence, F1(1,28)
= 92.90, MSE = 104,921, p < .01; F2(1,24) = 8.30, MSE = 1,032,199, p = .01. The interaction of
sentence and temporal distance was significant by participants but not by items, F1(1,28) = 6.42,
MSE = 29,746, p = .02; F2(1,24) = 2.02, MSE = 56,866, p = .17. No other effect reached the 5%
level of significance. Nonetheless, in accordance with our theoretical considerations, separate
analyses were performed for the duration and time-shift conditions.
For the duration conditions, the main effect of sentence was significant, F1(1,28) = 67.15,
MSE = 73,567, p < .01; F2(1,24) = 8.57, MSE = 481,445, p = .01, whereas the main effect of
temporal distance was not, F1(1,28) = 1.97, MSE = 58,395, p = .17; F2(1,24) = 1.40, MSE =
98,690, p = .25. The interaction of sentence and temporal distance proved significant in the
analysis by participants, F1(1,28) = 6.04, MSE = 43,764, p = .02; F2(1,24) = 2.43, MSE = 66,393,
p = .13. As expected, breaking up the interaction showed that the temporal distance had a
significant effect on the reading times for the anaphoric sentence, F1(1,28) = 9.24, MSE =
39,450, p = .01, PV = .25; F2(1,24) = 5.12, MSE = 58,445, p = .03, PV = .18, but not on the
reading times of the filler sentence (Fs < 1).
For the time-shift conditions, the main effect of sentence was significant, F1(1,28) =
62.30, MSE = 77,166, p < .01; F2(1,24) = 7.38, MSE = 602,955, p = .01, whereas the main effect
of the temporal distance and the interaction of the two variables were not (Fs < 1). A planned
comparison of the reading times for the anaphoric sentence showed that the difference between
the small and large time-shift conditions was not statistically significant (Fs < 1). Note that despite
the relatively small sample size, the power to detect a temporal-distance effect comparable
in size to the one observed in the duration conditions (PV = .25) was larger than .80 in the analysis
by participants, given an α level of .05. In the analysis by items, power was .57 for detecting an
effect of comparable size to the one observed in the duration conditions (PV = .18).
To test the difference between the duration and time-shift conditions more directly, the
reading times for the anaphoric sentence were submitted to a common analysis. The main effect
of the manner of manipulation (duration vs. time shift) was not significant (Fs < 1). The main
effect of temporal distance reached significance in the analysis by participants, F1(1,28) = 4.26,
MSE = 37,565, p < .05, but not in the analysis by items, F2(1,24) = 1.87, MSE = 81,155, p = .18.
The interaction of the two variables was marginally significant, F1(1,28) = 3.36, MSE = 61,355,
p = .08; F2(1,24) = 2.78, MSE = 52,956, p = .11.
The results for the duration conditions correspond to the results of Experiments 1 and 2.
The readers spent more time processing the anaphoric sentence when the described intermediate
event was relatively long-lasting than when it was relatively short-lived. The specific
contribution of the present experiment was to show that this effect occurs even when the
intermediate event consists of the same activity, with the only difference between the versions
being the numerical specification of the duration of the activity. One could argue that there was
still a difference of complexity between the representations of the long-lasting activities and
their short-duration counterparts, as longer lasting activities may often be conceived as involving
a larger number of objects (e.g., shorn sheep). Although it is true that for some of our texts one
could consider the duration of the activity as being correlated with the number of involved
objects, this does not challenge the interpretation of our results in terms of dynamic
representations. There is empirical evidence that suggests that elaborating a scene in terms of the
number of objects present in the scene does not affect the accessibility of an entity from a prior
situation (Rinck & Bower, 2000; see also Radvansky, Zwaan, Federico, & Franklin, 1998).
Hence, if the number of involved objects played a role in our effect at all, it could only have
done so because these objects came into play consecutively. If so, the effect would still be time
based. Longer retrieval times would be due to the fact that more subintervals containing the
same type of subevent were represented in the long-duration condition compared with the
short-duration condition. This is in line with the dynamic view.
For the time-shift conditions, we did not find any indication that the size of the time shift
had an impact on the accessibility of the target object. This corresponds to the finding of Zwaan (1996), who
also did not find a temporal distance effect with time shifts (an hour later vs. a day later). In the
present experiment, the negative result is especially remarkable because the informational
content of the texts was almost identical in the duration and time-shift conditions.
According to the dynamic view, the critical difference between the duration and time-
shift conditions was that the duration statements rendered it possible to mentally track the
described sequence of events, whereas the shift statements prompted fresh starts. Tracking had
the effect that the relative durations of the events became intrinsically coded in a coherent
dynamic representation. The first event was therefore represented in a more remote section of the
representation and thus less accessible in the long-duration condition than in the short-duration
Let us now address an issue that is not directly related to the main topic of our article but is
relevant to the dynamic view, namely, the reading times for the duration and shift sentences
themselves. According to the dynamic view, comprehending a sentence means mentally
simulating the experience of the described situation. Thus, processing a sentence that describes a
long-lasting event will take longer than processing a sentence that describes a relatively short-
lived event (provided that the same time scale is used). The analogous prediction does not hold
for sentences stating a time shift. The length of a time shift has no impact on the mental
simulation and therefore will not affect the reading times. Our experiments were not designed to
explore this aspect of the dynamic view. However, in the present experiment, the short- and
long-duration sentences (as well as the shift versions) were quite similar in wording, which
renders a preliminary test of the hypothesis possible. Accordingly, we compared the reading
times for the duration sentences in the short- and long-duration versions (e.g., Then he goes to
the pasture and shears sheep for an hour/for six hours), as well as the reading times for the shift
sentences in the small- and large-shift versions (e.g., After an hour/After six hours a young man
approaches him, and he stops). To adjust for differences in sentence length, a linear regression
equation predicting sentence reading time from the number of syllables was computed for each
participant, by using the reading times for all 156 sentences of the filler texts (cf. Ferreira &
Clifton, 1986; Trueswell, Tanenhaus, & Garnsey, 1994). On the basis of these regression
equations, the reading time residuals for the duration sentences and the time-shift sentences were
determined. Residuals that deviated more than 2.5 SDs from the participant’s mean in the
respective sentence-type condition (duration vs. shift) were discarded (duration: 2.9%, shift:
3.6%). The remaining residuals were submitted to separate mixed ANOVAs for the two types of
sentences, with time span (small vs. large) as a within-subject variable, and group or set as a
between-subjects variable. As predicted, the residual reading times for the duration sentences
were considerably shorter when a relatively short duration was stated (M = -228, SD = 490)
compared with when a long duration was stated (M = -2, SD = 530), F1(1, 28) = 11.57, MSE =
70.839, p < .01, PV = .29; F2(1, 24) = 3.41, MSE = 192.887, p = .08, PV = .12. For the time-shift
sentences, the mean residual reading times were M = -314 (SD = 236) and M = -247 (SD = 377)
for the small and large time-shift sentences, respectively. This difference was not significant,
F1(1, 28) = 1.06, MSE = 66.741, p > .30; F2 < 1 (see Appendix for the corresponding raw
reading times). Of course, these post hoc analyses cannot be considered a strict test of the
simulation hypothesis. However, the results are promising with regard to further exploring this
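The length adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions: an ordinary least-squares line is fitted for a single participant, the data values are invented purely for demonstration, and all function names are hypothetical.

```python
from statistics import mean, stdev


def fit_line(syllables: list[float], times: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept predicting reading time from syllables."""
    mx, my = mean(syllables), mean(times)
    sxx = sum((x - mx) ** 2 for x in syllables)
    sxy = sum((x - mx) * (y - my) for x, y in zip(syllables, times))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual(slope: float, intercept: float, n_syllables: float, time: float) -> float:
    """Observed reading time minus the time predicted from sentence length."""
    return time - (intercept + slope * n_syllables)


def trim(values: list[float], criterion: float = 2.5) -> list[float]:
    """Discard values deviating more than `criterion` SDs from the mean."""
    m, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= criterion * sd]


# Invented filler-sentence data for one participant (syllables, reading time in ms).
filler_syllables = [10, 14, 18, 22]
filler_times = [2000, 2600, 3200, 3800]

slope, intercept = fit_line(filler_syllables, filler_times)

# Residual for a hypothetical 16-syllable critical sentence read in 3100 ms.
print(residual(slope, intercept, 16, 3100))
```

A positive residual indicates that the sentence was read more slowly than its length alone would predict for that participant; a negative residual indicates faster reading.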
The aim of our study was to explore how readers represent a described sequence of
immediately successive events. Our results suggest that the events are represented in such a way
that the temporal structure of the entire occurrence is preserved. In all three experiments, it was
found that the readers needed more time for accessing an element of a previously described
event when this event was temporally more remote from the current narrative now than when it
was less remote. This effect was observed by using different methods of measuring the
accessibility, with different kinds of target entities, and with different ways of specifying the
relative duration of the intervening event. Thus, the effect of temporal distance on the
accessibility of an element of the protagonist's past seems empirically well established.
Our study also provided evidence that the temporal-distance effect is bound to a specific
condition. The effect occurred only when the sequence of events was described in consecutive
situational continuity sentences, so that the entire coherent occurrence was described and the
critical temporal distance was determined by the duration of the intervening events. No temporal
distance effect was observed when the text contained a temporal shift, even though the size of
the skipped interval was stated explicitly.
Let us now consider how these results can be explained. As noted in the introduction, in
research on narrative comprehension it is usually presupposed that described events are
represented through distinct static representations (one for each event). This is not only assumed
for the representations in long-term memory but also for the short-term memory representations
that are created directly from the linguistic input. This static view suggests that the difference
between situational continuity and discontinuity is captured in the strength of the connections
between the individual event representations. The event-indexing model (Magliano et al., 1999;
Zwaan et al., 1995; Zwaan & Radvansky, 1998) is one of the most precise and explicit
theoretical accounts of this kind. It assumes that the reader sets up a distinct model for each
sentence in short-term working memory and, before processing the next sentence, stores the
corresponding information in long-term working memory. During narrative processing, the
reader monitors the time frame and other situational dimensions and indexes each described
event on these dimensions. The more indices two events share, the stronger the connection
between the respective event representations. The difference between continuity and
discontinuity is thus captured in the connections between the event representations.
The temporal distance effect that we observed for continuous descriptions is difficult to
account for on the basis of the event-indexing model. According to this model, the target’s
accessibility would have depended only on the number of indices that the first event shared with
the third event, which was the last event that was represented in working memory before the test.
However, in this respect, the two experimental conditions did not differ. In fact, according to the
event-indexing model in its present form, the representations of the first and the third event even
received the very same temporal index, as all three described events were temporally contiguous
in the narrated world. One could of course revise the event-indexing model in this respect,
assuming that the order of the successive events is coded in long-term memory and that the
representation of an earlier event can only be accessed by searching through the representations
of intermediate events. However, it would still be unclear why the accessibility of the first event
would differ depending on whether the representation of the intermediate event contained the
information that this event lasted, for example, one hour or six hours.
Another possibility is that the activation levels of the event representations in long-term
memory reflect the events’ different temporal distances from the respective current narrative
now. Whenever the narrative now moves forward, the activation levels of all event
representations are updated. Assuming that the accessibility of an event depends on its activation
level, the temporal distance effect observed in the duration conditions of our experiments could
be accounted for (for similar proposals with respect to spatial relations, see Bower & Rinck,
2001; Haenggi et al., 1995; Kintsch, 1998). However, the problem with this account is that it
fails to provide a convincing explanation for the differential results obtained with situational
continuity and shift sentences. Why would only the information about the duration of an event,
but not the information about the size of a temporal shift, be used for updating the activation
All in all, it seems difficult to account for the results of our study in terms of the
organization of event representations in long-term memory and the retrieval processes that
operate on them. It may be reasonable to consider a revision of the event-indexing model with
regard to its assumptions as to the nature of the representations in short-term working
memory. More specifically, one may entertain the hypothesis that the representations that are
constructed online from incoming linguistic information are dynamic rather than static.
When assuming that comprehenders construct dynamic representations, the temporal-
distance effect observed in our experiments is readily explained. The texts that were
presented in the duration conditions allowed participants to mentally track the critical three
events. When processing the description of the first event, the readers set up a dynamic
representation, which they then continuously transformed in accordance with the incoming
information about the unfolding occurrence. Thus, the temporal structure of the entire
occurrence became encoded in the representation's own temporal structure. At the time of
testing, the first event of the occurrence was represented in a more remote section of the
dynamic representation than the intermediate event. Most important, the section representing
the first event was more remote when the intermediate event had a long duration than when it
had a short duration in the described world. Accessing an element of the first event of the
occurrence therefore took more time when the intermediate event had a long duration than
when it had a short duration.
The dynamic view also explains why no temporal distance effect occurred in the time-
shift conditions of our Experiment 3 and in Zwaan's (1996) time-shift conditions. When
encountering the temporal adverbial in the shift sentence describing the third event, the
readers terminated the current dynamic representation and initiated a new dynamic
representation. As a consequence, the critical three events were distributed over two
successive representations, and moreover, a certain period was not intrinsically encoded at all.
Thus, the conditions for a temporal distance effect were not met.
Taken together, according to the dynamic view, the differential results for the duration
and time-shift conditions are due to the different updating processes that were performed in
the duration and shift conditions. In the duration conditions, the readers mentally tracked the
described occurrence, with the result that the three relevant events were encoded in one
continuous dynamic representation. In contrast, in the shift conditions the readers performed a
fresh start before encoding the third event.
Some readers may note that the narratives in our experiments were written in the
historical present and may wonder whether this tense favored the construction of dynamic
representations. Indeed, in linguistics and literature research, the historical present is often
considered to make a story more vivid and to encourage the comprehender to place him- or
herself into the position of an observer of the described events (e.g., Fleischman, 1990). It
would thus certainly be interesting to systematically investigate the impact of the tense on the
construction of dynamic representations. Note, however, that compared with English, German
is much less restrictive with respect to the use of the present tense for describing future or
past events (cf. Lohnstein, 1996), and the historical present is definitely not uncommon in
German (cf. ten Cate, 1988). Moreover, the results of studies that use narratives written in the
past tense are in line with the dynamic view of narrative comprehension, as is pointed out
below. We therefore doubt that the present tense is a necessary condition for the construction
of dynamic representations.
The dynamic view suggests a distinction between two different types of updating,
tracking and fresh starts, but the question as to which precise conditions give rise to one or the
other type of updating needs to be clarified empirically. Of course, tracking is impossible if the
new sentence is about a situation that is widely separated from the currently represented situation
with respect to time and space. A more interesting question is what happens if the new sentence
implies a situational shift, but the skipped occurrence could in principle be inferred on the basis
of world knowledge or context information. The shift sentences in our Experiment 3 were of this
type, in both the small and the large temporal distance conditions (e.g., Then he goes to the
pasture and shears sheep. After an hour/After six hours a young man approaches him, and he stops).
The fact that no temporal distance effect occurred indicates that the readers performed a fresh
start. Obviously, the mere possibility of inferring the intervening occurrence is not a sufficient
condition for tracking.
In this context, a temporal effect reported by Rinck and Bower (2000, Experiment 1) is
particularly interesting. Entities from a past situation of the protagonist were found to be less
accessible after a sentence such as After two hours, Calvin was finally done cleaning up the
room than after a sentence such as After ten minutes, Calvin was finally done cleaning up the
room. The authors considered the two experimental conditions as comparable to the a-moment-
later condition and an-hour-later condition of Zwaan's (1996) study, respectively (Rinck &
Bower, 2000, p. 1319). Thus, in our words, the authors regarded the critical sentences as
situational continuity sentences and temporal shift sentences, respectively. However, an
alternative interpretation is also possible. One could consider the sentences in both conditions
to be temporal shift sentences, albeit of a type that differs in an important respect from the
type used by Zwaan (1996) and by us. In Zwaan's study and our Experiment 3, the shift sentence
described the event that started at the new narrative now. In contrast, the shift sentence in the
example given by Rinck and Bower (2000) informed the reader about the protagonist's activity
in the preceding time interval while referring to the present situation merely as the consequent
state. Thus, in a way, the content of the shift sentence called for representing the respective
intervening event. This event may have been inserted into the representation retroactively, so
that the entire occurrence, up to the new narrative now, was tracked in both conditions. Indeed,
as M. Rinck informed us (personal communication, November 26, 2002), 22 of the 24 shift
sentences used in the Rinck and Bower (2000) study described an activity that took place in the
preceding time interval. Interpreted in the way proposed above, the results of Rinck and Bower
complement ours: They indicate that a time shift is not a sufficient condition for a fresh start.
Thus, taking the results of all three studies together, it seems that whether tracking or a fresh
start is performed depends primarily on whether the text explicitly describes the entire
occurrence without gaps. By comparison, the point in time in the story world that is specified
as the new narrative now plays only a minor role.
The content of the new sentence is probably not the only relevant variable. Several
studies have shown that there are a number of linguistic structures that function as segmentation
markers in discourse and are interpreted by the comprehender as a signal to initiate a new model
(e.g., Bestgen & Costermans, 1994; Bestgen & Vonk, 1995, 2000; Vonk, Hustinx, & Simons,
1992). Segmentation markers may trigger a fresh start even when tracking is perfectly possible
as far as the new sentence’s content is concerned (Bestgen & Vonk, 2000). Among the
expressions that are usually classified as segmentation markers are sentence-initial temporal
adverbials (e.g., around two o'clock, after some minutes, two hours later), that is, the very same
type of expression that we used in the shift sentences. However, it is unlikely that the
segmentation markers alone were sufficient to prompt the fresh starts, because the same type of
linguistic expression was used by Zwaan (1996) in the a-moment-later condition and in the
experiment by Rinck and Bower (2000), where according to our reinterpretation, the described
occurrence was tracked. Thus, it seems that both the informational content and certain linguistic
devices have an impact on whether the comprehender attempts to mentally track the occurrence
or performs a fresh start. It is a task for future research to pinpoint the effects of the two factors
and their interaction.
The present study has focused on the representation of a sequence of events that is
described in several consecutive sentences. However, it is obvious that the dynamic view of
comprehension also has intriguing implications regarding the representation of one event
described in a single sentence. One implication, which we already mentioned in the context of
Experiment 3, concerns the time that is needed to set up the representation of an event being
described within a given narrative. The dynamic view predicts that more time is needed when
the event’s duration is longer rather than shorter. The results of the analysis of the reading
times for the duration sentences in Experiment 3 provide preliminary support for this idea
(see also Rinck et al., 1997, Experiment 1; Wender & Weber, 1982). Note that this prediction
only applies when the rest of the text is kept constant. Considering that texts widely differ
with respect to the temporal granularity of the description (e.g., a traffic accident vs. the
evolution of humans), it seems likely that comprehenders use different representational time
scales for different texts (cf. Nakhimovsky, 1988). Accordingly, it is impossible to make a
general statement about the time that is needed to construct the dynamic representation of a
particular event. The question of how comprehenders select a time scale that is appropriate
for the temporal granularity of a given text is a highly interesting topic for future research.
Another implication concerns the internal structure of the dynamic representation of
an event that is described in a single sentence. The dynamic representation of an event
evolves in time, simulating the observation of the event. Representing an event described in a
single sentence is thus in principle comparable to representing an unfolding occurrence that is
described in successive situational continuity sentences. More specifically, when reading a
sentence such as Mary went from the kitchen into the bathroom, the reader mentally tracks the
event, beginning with Mary in the kitchen, continuing with her walk through the hall, and
ending with Mary in the bathroom. Thus, it can be expected that after reading the sentence, the
entities from the final situation are more accessible than entities from the path, which in turn
will be more accessible than entities from the initial situation. This was indeed observed in a
number of studies using the Morrow paradigm (e.g., Morrow, 1994; Rinck & Bower, 1995;
Rinck et al., 1996; see also Morrow, 1985, 1990). The results of these studies are usually
interpreted not in terms of a temporal distance effect (as the dynamic view suggests), but in
terms of a spatial distance effect. On the basis of the available data, a decision between the
two interpretations is not yet possible.
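The ordering predicted by the temporal interpretation can be made concrete with a toy sketch. It is purely illustrative and not a model proposed in this article: the Phase class, the function name, and the story-world time values are hypothetical, and the only assumption built in is that accessibility decreases as the temporal distance from the narrative now increases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One phase of a tracked event (hypothetical illustration)."""
    label: str
    entities: tuple
    end_time: float  # assumed story-world time (arbitrary units) at which the phase ends


def rank_by_temporal_distance(phases, now):
    """Return (entity, distance) pairs, smallest temporal distance from `now` first."""
    pairs = [(entity, now - phase.end_time)
             for phase in phases
             for entity in phase.entities]
    return sorted(pairs, key=lambda pair: pair[1])


# "Mary went from the kitchen into the bathroom": three phases with assumed times.
phases = [
    Phase("initial situation", ("kitchen",), end_time=0.0),
    Phase("path", ("hall",), end_time=1.0),
    Phase("final situation", ("bathroom",), end_time=2.0),
]

ranking = rank_by_temporal_distance(phases, now=2.0)
print([entity for entity, _ in ranking])
```

Under these assumptions the entity from the final situation comes first, the path entity second, and the entity from the initial situation last. A spatial distance account would yield the same ordering for this sentence, which is why the available data cannot separate the two interpretations.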
As outlined in the introduction, the starting point of our considerations was the original
notion of mental models, as articulated in Johnson-Laird’s (1983, 1989) theory. In this theory,
the term mental model is not specific to the area of language processing. It generally refers to
representations that have their roots in perception and capture what the human mind, not having
direct access to the world, construes as the world (cf. projected world in Jackendoff’s, 1983,
terminology). According to Johnson-Laird, these representations are used in virtually all
cognitive processes. Hence, this theory implies that language comprehension results in
representations similar to those involved in direct experience. “A major function of language is
thus to enable us to experience the world by proxy” (Johnson-Laird, 1983, p. 471). In recent
years, a growing number of authors have presented proposals that are similar in spirit, stressing
that meaning must be considered as being grounded in perceptual experience and action (e.g.,
Barsalou, 1999a, 1999b, in press; Glenberg & Robertson, 2000; Glenberg & Kaschak, in press;
MacWhinney, 1999; Zwaan, in press). There is already some neuroscientific and behavioral
evidence for this claim, indicating that linguistically conveyed information about situations is
represented in the same mental subsystems as directly experienced situations (e.g., Glenberg &
Kaschak, 2002; Pulvermüller, 2003; Pulvermüller, Härle, & Hummel, 2001; Richardson, Spivey,
Barsalou, & McRae, 2003; Simmons, Pecher, Hamann, Zeelenberg, & Barsalou, 2003; Zwaan &
Yaxley, in press; for a review see Zwaan, in press).
With this perspective on language, research on nonlinguistic cognition becomes directly
relevant to research on text comprehension. In our study, we drew on research on perception
when hypothesizing that comprehenders create a dynamic representation when reading about an
unfolding occurrence. The results provide evidence for this hypothesis.
Anderson, A., Garrod, S. C., & Sanford, A. J. (1983). The accessibility of pronominal
antecedents as a function of episode shifts in narrative text. Quarterly Journal of
Experimental Psychology, 35A, 427-440.
Barsalou, L. W. (1999a). Language comprehension: Archival memory or preparation for situated
action? Discourse Processes, 28, 61-80.
Barsalou, L. W. (1999b). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577-
Barsalou, L. W. (in press). Situated simulation in the human conceptual system. Language and
Bertamini, M. (1993). Memory for position and dynamic representations. Memory & Cognition,
Bestgen, Y., & Costermans, J. (1994). Time, space, and action: Exploring the narrative structure
and its linguistic marking. Discourse Processes, 17, 421-446.
Bestgen, Y., & Vonk, W. (1995). The role of temporal segmentation markers in discourse
processing. Discourse Processes, 19, 385-406.
Bestgen, Y., & Vonk, W. (2000). Temporal adverbials as segmentation markers in discourse
comprehension. Journal of Memory and Language, 42, 74-87.
Bower, G. H., & Rinck, M. (2001). Selecting one among many referents in spatial situation
models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 81-
Carreiras, M., Carriedo, N., Alonso, M. A., & Fernández, A. (1997). The role of verb tense and
verb aspect in the foregrounding of information during reading. Memory & Cognition,
Cooper, L. A. (1976). Demonstration of a mental analog of an external rotation. Perception &
Psychophysics, 19, 296-302.
Craik, K. (1943). The nature of explanation. Cambridge, England: Cambridge University Press.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102,
Ferreira, F., & Clifton, C. Jr. (1986). The independence of syntactic processing. Journal of
Memory and Language, 25, 348-368.
Fleischman, S. (1990). Tense and narrativity. Austin: University of Texas Press.
Freyd, J. J. (1987). Dynamic mental representations. Psychological Review, 94, 427-438.
Freyd, J. J. (1993). Five hunches about perceptual processes and dynamic representations. In D.
E. Meyer & S. Kornblum (Eds.), Attention and performance XIV. Synergies in
experimental psychology, artificial intelligence, and cognitive neuroscience (pp. 99-119).
Cambridge, MA: MIT Press.
Freyd, J. J., & Pantzer, T. M. (1995). Static patterns moving in the mind. In S. M. Smith, T. B.
Ward, & R. A. Finke (Eds.), The creative cognition approach (pp. 181-204). Cambridge,
MA: MIT Press.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, NJ:
Gernsbacher, M. A. (1997). Two decades of structure building. Discourse Processes, 23, 265-
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin
& Review, 9, 558-565.
Glenberg, A. M., & Kaschak, M. P. (in press). The body’s contribution to language. In B. H.
Ross (Ed.), The Psychology of Learning and Motivation: Vol. 43. New York: Academic
Glenberg, A. M., Meyer, M., & Lindem, K. (1987). Mental models contribute to foregrounding
during text comprehension. Journal of Memory and Language, 26, 69-83.
Glenberg, A. M., & Robertson, D. A. (2000). Symbol grounding and meaning: A comparison of
high-dimensional and embodied theories of meaning. Journal of Memory and Language,
Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse comprehension. Annual
Review of Psychology, 48, 163-189.
Haenggi, D., Kintsch, W., & Gernsbacher, M. A. (1995). Spatial situation models and text
comprehension. Discourse Processes, 19, 173-199.
Hegarty, M. (1992). Mental animation: Inferring motion from static displays of mechanical
systems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18,
Hespos, S. J., & Rochat, P. (1997). Dynamic mental representation in infancy. Cognition, 64,
Hyönä, J. (1995). An eye movement analysis of topic-shift effect during repeated reading.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1365-1373.
Jackendoff, R. (1983). Semantics and cognition. Cambridge, MA: MIT Press.
Johnson-Laird, P. N. (1983). Mental models. Cambridge, England: Cambridge University Press.
Johnson-Laird, P. N. (1989). Mental models. In M. I. Posner (Ed.), Foundations of cognitive
science (pp. 469-499). Cambridge, MA: MIT Press.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge, England:
Cambridge University Press.
Klein, W. (1994). Time in language. London: Routledge.
Levine, W. H., & Klin, C. M. (2001). Tracking of spatial information in narratives. Memory &
Cognition, 29, 327-335.
Lohnstein, H. (1996). Formale Semantik und natürliche Sprache [Formal semantics and natural
language]. Opladen, Germany: Westdeutscher Verlag.
MacWhinney, B. (1999). The emergence of language from embodiment. In B. MacWhinney
(Ed.), The emergence of language (pp. 213-256). Mahwah, NJ: Erlbaum.
Magliano, J. P., & Schleich, M. C. (2000). Verb aspect and situation models. Discourse
Processes, 29, 83-112.
Magliano, J. P., Zwaan, R. A., & Graesser, A. (1999). The role of situational continuity in
narrative understanding. In H. van Oostendorp & S. R. Goldman (Eds.), The construction
of mental representations during reading (pp. 219-245). Mahwah, NJ: Erlbaum.
McGinn, C. (1989). Mental content. Oxford, England: Blackwell.
Morrow, D. G. (1985). Prepositions and verb aspect in narrative understanding. Journal of
Memory and Language, 24, 390-404.
Morrow, D. G. (1990). Spatial models, prepositions, and verb-aspect markers. Discourse
Processes, 13, 441-469.
Morrow, D. (1994). Spatial models created from text. In H. van Oostendorp & R. A. Zwaan
(Eds.), Naturalistic text comprehension (pp. 57-78). Norwood, NJ: Ablex.
Morrow, D. G., Bower, G. H., & Greenspan, S. L. (1989). Updating situation models during
narrative comprehension. Journal of Memory and Language, 28, 292-312.
Morrow, D. G., Bower, G. H., & Greenspan, S. L. (1990). Situation-based inferences during
narrative comprehension. In A. C. Graesser & G. H. Bower (Eds.), The psychology of
learning and motivation: Vol. 25. Inferences and text comprehension (pp. 123-135). San
Diego, CA: Academic Press.
Murphy, K. R., & Myors, B. (1998). Statistical power analysis. Mahwah, NJ: Erlbaum.
Nakhimovsky, A. (1988). Aspect, aspectual class, and the temporal structure of the narrative.
Computational Linguistics, 14, 29-43.
Palmer, S. E. (1978). Fundamental aspects of cognitive representation. In E. Rosch & B. B.
Lloyd (Eds.), Cognition and categorization (pp. 259-303). Hillsdale, NJ: Erlbaum.
Pollatsek, A., & Well, A. D. (1995). On the use of counterbalanced designs in cognitive
research: A suggestion for a better and more powerful analysis. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 21, 785-794.
Pulvermüller, F. (2003). The neuroscience of language. Cambridge, England: Cambridge
Pulvermüller, F., Härle, M., & Hummel, F. (2001). Walking or talking? Behavioral and
neurophysiological correlates of action verb processing. Brain and Language, 78, 143-
Radvansky, G. A., & Copeland, D. E. (2001). Working memory and situation model updating.
Memory & Cognition, 29, 1073-1080.
Radvansky, G. A., Zwaan, R. A., Federico, T., & Franklin, N. (1998). Retrieval from temporally
organized situation models. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 24, 1224-1237.
Reichenbach, H. (1947). Elements of symbolic logic. London: Macmillan.
Richardson, D. C., Spivey, M. J., Barsalou, L. W., & McRae, K. (2003). Spatial representations
activated during real-time comprehension of verbs. Cognitive Science, 27, 767-780.
Rinck, M., & Bower, G. H. (1995). Anaphora resolution and the focus of attention in situation
models. Journal of Memory and Language, 34, 110-131.
Rinck, M., & Bower, G. H. (2000). Temporal and spatial distance in situation models. Memory
& Cognition, 28, 1310-1320.
Rinck, M., Hähnel, A., Bower, G. H., & Glowalla, U. (1997). The metrics of spatial situation
models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23,
Rinck, M., & Weber, U. (in press). Who when where: An experimental test of the event-
indexing model. Memory & Cognition.
Rinck, M., Williams, P., Bower, G. H., & Becker, E. S. (1996). Spatial situation models and
narrative understanding: Some generalizations and extensions. Discourse Processes, 21,
Sanford, A. J., Clegg, M., & Majid, A. (1998). The influence of types of character on processing
background information in narrative discourse. Memory & Cognition, 26, 1323-1329.
Scott Rich, S., & Taylor, H. A. (2000). Not all narrative shifts function equally. Memory &
Cognition, 28, 1257-1266.
Simmons, W. K., Pecher, D., Hamann, S. B., Zeelenberg, R., & Barsalou, L. W. (2003, March).
fMRI evidence for modality-specific processing of conceptual knowledge on six
modalities. Paper presented at the meeting of the Society of Cognitive Neuroscience,
Sims, V. K., & Hegarty, M. (1997). Mental animation in the visuospatial sketchpad: Evidence
from dual-task studies. Memory & Cognition, 25, 321-332.
ten Cate, A. P. (1988). Zeitdeixis und das historische Präsens [Time deixis and the historical
present]. In H. Weber & R. Zuber (Eds.), Linguistik Parisette (pp. 15-27). Tübingen,
Thornton, I. M., & Hubbard, T. L. (Eds.). (2002). Representational momentum: New findings,
new directions. London: Taylor & Francis.
Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing:
Use of thematic role information in syntactic ambiguity resolution. Journal of Memory
and Language, 33, 285-318.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York:
Verfaillie, K., & d'Ydewalle, G. (1991). Representational momentum and event course
anticipation in the perception of implied periodical motions. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 17, 302-313.
Vonk, W., Hustinx, L. G. M. M., & Simons, W. H. G. (1992). The use of referential expressions
in structuring discourse. Language and Cognitive Processes, 7, 301-333.
Wender, K. F., & Weber, G. (1982). On the mental representation of motion verbs. In F. Klix, J.
Hoffmann, & E. van der Meer (Eds.), Cognitive research in psychology (pp. 108-113).
Zwaan, R. A. (1996). Processing narrative time shifts. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 22, 1196-1207.
Zwaan, R. A. (1999a). Five dimensions of narrative comprehension: The event-indexing model.
In S. R. Goldman, A. C. Graesser, & P. van den Broek (Eds.), Narrative comprehension,
causality, and coherence (pp. 93-110). Mahwah, NJ: Erlbaum.
Zwaan, R. A. (1999b). Situation models: The mental leap into imagined worlds. Current
Directions in Psychological Science, 8, 15-18.
Zwaan, R. A. (in press). The immersed experiencer: Toward an embodied theory of language
comprehension. In B. H. Ross (Ed.), The psychology of learning and motivation: Vol. 44.
New York: Academic Press.
Zwaan, R. A., Madden, C. J., & Whitten, S. N. (2000). The presence of an event in the narrated
situation affects its availability to the comprehender. Memory & Cognition, 28, 1022-
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model
construction in narrative comprehension. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 21, 386-397.
Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and
memory. Psychological Bulletin, 123, 162-185.
Zwaan, R. A., & Yaxley, R. H. (in press). Spatial iconicity affects semantic relatedness
judgments. Psychonomic Bulletin & Review.
Zwaan, R. A., Yaxley, R. H., Madden, C. J., & Aveyard, M. E. (2003). Moving words: Dynamic
representations in language comprehension. Manuscript submitted for publication.
Mean Raw Reading Times and Mean Number of Syllables (with Standard Deviations)
for the Two Versions of the Duration Sentences and Time-Shift Sentences in Experiment 3

                       Duration sentence            Time-shift sentence
Measure                Short         Long           Small         Large
Reading times in ms    3,097 (858)   3,189 (900)    2,257 (501)   2,193 (501)
No. syllables          23.3 (9.0)    22.0 (9.1)     16.8 (4.7)    15.5 (4.9)
Stephanie Kelter, Institute of Psychology, Technical University of Berlin, Berlin,
Germany; Barbara Kaup, Department of Computer Science, University of Hamburg, Germany;
Berry Claus, Institute of Psychology, Technical University of Berlin, Berlin, Germany.
Barbara Kaup is now at the Institute of Psychology, Technical University of Berlin,
This research was supported by Grant Ha 1237/5-2 from the German Research
Foundation (DFG) to Christopher Habel and Stephanie Kelter. We thank Carol J. Madden, Elena
Moore, Mike Rinck, Rolf A. Zwaan, and an anonymous reviewer for helpful comments on
earlier versions of this paper. We also thank Guido Liebe, Alexandra Peters, and Alexander
Richter for their assistance in collecting the data.
Correspondence concerning this article should be addressed to Stephanie Kelter, TU
Berlin, Institut für Psychologie und Arbeitswissenschaft, Sekr. FS 1, Franklinstr. 5-7, D-10587
Berlin, Germany. E-mail: firstname.lastname@example.org.
Sample Text Used in Experiment 1, Translated from German
Preparing for Christmas Eve
Around Christmas time Mrs. Strube always devotes herself entirely to her family. The day
before this year's Christmas Eve, she and her husband are decorating the living room together.
While he is setting up the crib beneath the tree, she lovingly decorates the tree with ornaments
and tinsel. Mr. Strube, however, does not like the decorations and says this bluntly.
As a result, Mrs. Strube throws a tantrum and throws the
remaining tinsel at Mr. Strube's feet.
Short duration version: She goes into the kitchen and puts some cookies on the
Long duration version: She goes into the kitchen and bakes some cookies to put on the
After doing this, the kitchen is sweet with the smell of Christmas.
Now she regrets her tantrum.
Note. Each participant read either the short or the long duration version. Texts were presented
Mean Reading Times in Milliseconds (with Standard Deviations) as a Function of the
Duration of the Intermediate Event in Experiment 1

Sentence              Short          Long
Filler sentence       1,725 (391)    1,742 (323)
Anaphoric sentence    1,909 (412)    2,011 (431)
Sample Text Used in Experiment 2, Translated from German
Preparing for New Year's Eve
Mrs. Quasten is wholeheartedly concerned about her family's well-being. This year, as every
year, she takes special care in arranging for New Year's Eve.
After getting up, she gets the carp ready for cooking and
prepares the sauce.
Short duration version: She then goes to the hairdresser and buys hairspray.
Long duration version: She then goes to the hairdresser and gets a perm.
When leaving the hairdresser, she hails a cab.
Note. Each participant read either the short or the long duration version. Texts were presented
Mean Reading Times for the Filler Sentence and Mean Recognition Latencies for the Probe in
Milliseconds (with Standard Deviations) as a Function of the Duration of the Intermediate Event
in Experiment 2

Stimulus           Short          Long
Filler sentence    2,211 (733)    2,222 (751)
Probe              1,335 (403)    1,442 (427)
Sample Text Used in Experiment 3, Translated from German
The old man and his flock of sheep
Mr. Satorius is a shepherd. He is saddened by the fact that his occupation is dying out. He
really enjoys his work and would like to pass his expertise on to the next generation.
However, young people prefer to work nowadays as bank clerks or for insurance
companies, and have almost completely lost contact with nature. Today Mr. Satorius has
learned that he will not have a successor.
First event: Full of sorrow, he puts the letter containing this message into the
saddle bag of his motor bicycle, together with his rain cape.
Duration version: He then goes to the pasture and shears sheep for an hour / six hours.
When a young man approaches him he stops.
Time-shift version: He then goes to the pasture and shears sheep.
After an hour / six hours a young man approaches him, and he stops.
Mr. Satorius looks up in astonishment.
He must again think of the letter.
Note. Each participant read only one version of each text. Texts were presented without
Mean Reading Times in Milliseconds (with Standard Deviations) as a Function of Manner of
Temporal Manipulation and Temporal Distance in Experiment 3

Sentence                Small          Large
Duration
  Filler sentence       1,927 (479)    1,896 (449)
  Anaphoric sentence    2,229 (514)    2,380 (588)
Time shift
  Filler sentence       1,925 (446)    1,879 (433)
  Anaphoric sentence    2,294 (461)    2,284 (514)
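
The temporal distance effect can be read off the means above as the difference between the large and the small distance condition. The following minimal sketch does this arithmetic. It assumes that the first pair of rows belongs to the duration conditions and the second pair to the time-shift conditions; this grouping is inferred from the pattern of results, and the variable and function names are illustrative only. The sketch compares descriptive means and is not a substitute for the inferential analyses.

```python
# Mean reading times in ms, copied from the table above.
# Assumption: rows 1-2 are the duration conditions, rows 3-4 the time-shift conditions.
means = {
    ("duration", "filler"): {"small": 1927, "large": 1896},
    ("duration", "anaphoric"): {"small": 2229, "large": 2380},
    ("shift", "filler"): {"small": 1925, "large": 1879},
    ("shift", "anaphoric"): {"small": 2294, "large": 2284},
}


def distance_effect(cell):
    """Large minus small temporal distance, in ms (positive = slower after large distance)."""
    return cell["large"] - cell["small"]


effects = {key: distance_effect(cell) for key, cell in means.items()}

for (manner, sentence), diff in effects.items():
    print(f"{manner:8s} {sentence:9s} large - small = {diff:+d} ms")
```

Under the assumed grouping, the anaphoric sentence is read 151 ms more slowly after the long than after the short intervening event in the duration conditions, whereas the corresponding difference in the time-shift conditions is -10 ms.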