Published in: Studies in Second Language Acquisition 27: 53-78.
Gaps in Second Language Sentence Processing
Theodore Marinis, Leah Roberts, Claudia Felser & Harald Clahsen
University of Essex
Running Head: Gaps in second language processing
© Cambridge University Press, 2005.
Department of Language and Linguistics
University of Essex
Colchester CO4 3SQ
Four groups of second language (L2) learners of English from different language
backgrounds (Chinese, Japanese, German & Greek) and a group of native speaker controls
participated in an on-line reading-time experiment with sentences involving long-distance
wh-dependencies. While the native speakers showed evidence of making use of intermediate
syntactic gaps during processing, the L2 learners appeared to associate the fronted wh-phrase
directly with its lexical subcategoriser, regardless of whether or not the subjacency constraint
was operative in their native language. This finding is argued to support the hypothesis that
L2 learners under-use syntactic information in L2 processing, which prevents them from
processing the L2 input in a native-like fashion.
The real-time processing of sentences involving displaced constituents, or 'filler-gap
dependencies', has been the focus of a considerable body of psycholinguistic research on
monolingual sentence comprehension. A syntactically dislocated constituent such as the
fronted wh-phrase which book in Which book did you read in only one hour? poses a
challenge for the human sentence processing mechanism insofar as it cannot be fully
integrated immediately into the emerging semantic or discourse representation but instead
must be retained in short-term memory until it can be linked to its subcategoriser, or thematic
role assigner. As the computational cost incurred by temporarily storing a filler in short-term
memory increases with the distance between the filler and its associated gap (see, among
others, Gibson 1998; King & Just, 1991; King & Kutas, 1995; Kluender & Kutas, 1993), the
human sentence processing mechanism will normally attempt to integrate a dislocated
element at the earliest grammatically possible point during parsing. This well-documented
preference for keeping filler-gap dependencies as short as possible is known as the Active
Filler Hypothesis (Clifton & Frazier, 1989).
Linguistic theories differ with respect to the way filler-gap dependencies are analysed.
Within the generative-transformational tradition, a displaced constituent is assumed to form
a syntactic dependency with an empty category at its base position, and is thus only
indirectly linked to its subcategoriser. According to the copy theory of movement (Chomsky
1995, and later), the empty category (= ei in example (1) below) involved in filler-gap
dependencies is a silent but otherwise identical copy of the displaced constituent itself.
(1) Which booki did you read ei in only an hour?
Some lexically-based syntactic frameworks including variants of Head-Driven Phrase
Structure Grammar, on the other hand, assume that a dislocated element is linked directly to
its lexical subcategoriser (Pollard & Sag, 1994). This linguistic controversy has given rise to
different hypotheses as to how filler-gap dependencies are processed: the Trace Reactivation
Hypothesis (TRH), according to which the human parser postulates empty categories
('traces') during the on-line comprehension of sentences containing such dependencies
(Bever & McElree 1988; Love & Swinney, 1996; Nicol & Swinney, 1989; Swinney, Ford,
Frauenfelder & Bresnan, 1988, among others), and the Direct Association Hypothesis
(DAH), which maintains that establishing a filler-gap dependency is a lexically-driven
process triggered by the automatic mental reconstruction of the subcategoriser's argument
structure when this is encountered (Pickering & Barry, 1991; Sag & Fodor, 1994).
Results from a number of studies on monolingual sentence comprehension suggest that two
distinct mental processes may in fact be involved in the processing of filler-gap
dependencies: (i) a phrase structure-based mechanism that triggers a filler's retrieval from
short-term memory at a specific structural position (the processing equivalent of inserting a
copy of the filler into a particular syntactic slot, as predicted by the TRH); and (ii) a
lexically-driven process of semantically integrating a displaced constituent with its thematic
role assigner or other licenser, as predicted by the DAH. Whereas these two processes are
usually difficult to dissociate empirically in head-initial languages like English (but see
Nicol, 1993), evidence for the TRH can be gathered from studies on the processing of filler-
gap dependencies in verb-final languages such as Japanese (Nakano, Felser & Clahsen,
2002) or German (Clahsen & Featherston, 1999; Featherston 2001; Fiebach, Schlesewsky &
Friederici, 2002), which found filler-reactivation effects before the subcategorising verb had
been encountered.
Regardless of whether or not a filler is assumed to be linked to its lexical subcategoriser via
empty categories located within the subcategoriser's extended projection, though, most
contemporary syntactic theories agree that for dependencies spanning more than one clause,
some kind of intermediate linguistic structure is present at intervening clause boundaries
which mediates between the filler and its ultimate gap (or subcategoriser). An example of
what is commonly referred to as 'successive-cyclic wh-movement' is provided in (2) below.
(2) Whoi do you think ei (that) John says ei (that) Mary likes ei?
Traditional evidence for the successive-cyclic nature of wh-movement includes various types
of 'island' effect (Ross, 1967), wh-complementiser agreement in languages like Irish
(McCloskey, 2001), children's use of medial wh in questions such as Who do you think who's
in the box? (Thornton, 1990), and wh-copying found in a number of languages including
German, Frisian, Afrikaans, and Romani (see Felser, in press, and references cited there).
Psycholinguistic evidence for successive-cyclicity has been found, for example, in a study by
Kluender & Kutas (1993) using event-related brain potentials (ERPs), and in a reading-time
study by Gibson & Warren (1999). Kluender & Kutas observed that the processing difficulty
for sentences containing subjacency violations such as (3) below increased (relative to
sentences in which subjacency was respected) both at the intervening wh-pronoun and at the
filler's base position.1
(3) *Whoi couldn't you decide who should sing something for ei at the family
Gibson & Warren (1999) investigated native English speakers' processing of grammatical
sentences containing long-distance wh-dependencies like that in sentence (4) below.
(4) The manager whoi the consultant claimed ei that the new proposal had pleased ei
will hire five workers tomorrow.
Similar to Kluender & Kutas (1993), the authors found that the availability of an
intermediate 'landing site' facilitated a filler's integration with its subcategoriser, thus
providing indirect evidence for the psychological reality of intermediate gaps in L1 sentence
processing. Gibson & Warren's reading-time study provided the model for the present study,
and will be discussed in more detail in section 3.1 below.
While there is ample evidence that the mental representations constructed during L1
sentence processing are built up rapidly and in an incremental fashion, and also include
abstract linguistic structure such as empty categories, or syntactic gaps, surprisingly little is
known to date about the way second language learners process the L2 input in real time.
Instead, L2 research has traditionally focused on the acquisition of grammatical knowledge
using off-line methodologies such as grammaticality judgement, elicitation techniques, or
comprehension tasks. Previous studies of L1 sentence processing in a range of different
languages have shown, however, that between-language variation is not restricted to
differences in grammar, but that some processing strategies may also be subject to cross-
linguistic variation (Cuetos, Mitchell & Corley, 1996; Frazier & Rayner, 1988; Gibson,
Pearlmutter, Canseco-Gonzalez & Hickok, 1996; Mazuka & Lust, 1990, among others).
Hence, besides being faced with the task of acquiring the L2 grammar, L2 learners may also
need to acquire any language-specific processing strategies that are used in the target
language. The observation that sentence processing is not necessarily uniform across
languages also raises the possibility of L1 processing transfer in L2 acquisition, an issue that
has featured prominently in much research within the framework of the Competition Model
of language acquisition and processing (Harrington, 1987; MacWhinney, 1997, 2002). It is
conceivable, for example, that L2 learners from wh-in-situ backgrounds fail to process wh-
dependencies in L2 English in a native-like way, whereas L2 learners whose L1 also shows
overt wh-movement are indistinguishable from native speakers in this domain.
Our previous studies of L2 processing indicate that although L2 learners, like native
speakers, are guided by lexical information during parsing, they rely on phrase-structure
information to a lesser extent than native speakers do, irrespective of their language
background (Felser, Roberts, Gross & Marinis, 2003; Papadopoulou & Clahsen, 2003;
Roberts, 2003). If this is correct, then we might expect that when processing wh-
dependencies, L2 learners perform in accordance with the DAH but do not postulate any
intermediate syntactic gaps.
2. Previous studies of L2 learners' processing of wh-dependencies
The vast majority of existing L2 studies on the acquisition of wh-movement and subjacency
have used off-line tasks such as grammaticality judgements, and their results are not fully
conclusive.2 Only a few published studies are available to date that have examined the real-
time processing of wh-movement by L2 learners using on-line tasks. A reading-time study
carried out by Juffs & Harrington (1995) addressed the issue of whether it is processing
difficulties or a competence deficit that causes problems with certain types of filler-gap
dependencies for learners of English whose native language does not show successive-cyclic
wh-movement and thus arguably lacks the subjacency constraint. Juffs & Harrington report
the results from two on-line grammaticality judgement experiments that measured Chinese-
speaking learners' accuracy and reading times for grammatical and ungrammatical sentences
involving either subject or object extractions. The results from the full-sentence
presentation version of the experiment showed that the Chinese-speaking learners' response
accuracy was comparable to the native speakers' for ungrammatical subject and object
extractions, indicating that they had acquired the subjacency constraint. They performed
significantly worse than the native speakers, however, on grammatical sentences involving
subject - but not object - extraction (compare also White & Juffs, 1998). The learners'
difficulties with subject extractions were also reflected in their on-line reading times. In the
experiment using word-by-word presentation, the two participant groups showed distinct
patterns of processing the grammatical sentences. Specifically, the learners were found to
slow down significantly more at the region following the matrix verb in subject extractions
from finite clauses such as (5a) below than in object extractions, as in (5b). No such
slowdown was attested in the group of native speaker controls.
(5) a. Whoi did Ann say ei likes her friend? (subject extraction)
b. Which mani did Jane say her friends like ei ? (object extraction)
The authors argue that the learners' relatively poorer performance on subject extractions
reflects processing rather than competence problems (cf. Juffs & Harrington, 1996). Observe
that in sentences like (5a) above, the gap following the verb say may initially be analysed as
an object gap, a decision that must be revised as soon as the embedded verb likes is
encountered. While this kind of reanalysis causes little or no processing difficulty for native
speakers, it does, according to Juffs & Harrington, pose a problem for L2 learners.
Note, however, that given the nature of Juffs & Harrington's materials, their results do not
provide any unequivocal evidence for the learners' use of empty categories during
processing. As the purported trace position is adjacent to the subcategorising verb, the
slowdown observed in the post-gap region may also be due to the learners' trying to link the
fronted wh-phrase directly to its subcategoriser, in accordance with the DAH. The possibility
that the learners may have a lexically or verb-driven processing strategy is strengthened by
the fact that the learners (but not the native speakers) also showed elevated reading times at
the matrix verb say, a region prior to the locus of reanalysis. Juffs & Harrington (1996,
p.300) speculate that the learners may be confused by the lack of semantic fit of the wh-
pronoun who as the object of say at this point.
Another reading-time study by Williams, Möbius & Kim (2001) investigated so-called
'filled-gap' effects in L2 processing, and the question of whether or not L2 learners are
sensitive to plausibility constraints during parsing. Their experimental sentences involved
adjunct extractions in two plausibility conditions, as shown in (6a) and (6b) below.
(6) a. Which friendi did the gangster hide the car for ei late last night?
b. Which cavei did the gangster hide the car in ei late last night?
In example (6a), the fronted wh-phrase is a plausible object of the verb hide, whereas in
example (6b) it is not. Previous studies have shown that native speakers of English initially
attempt to analyse the displaced wh-phrase as a direct object, a misanalysis that gives rise to
increased processing difficulty when the real object the car is encountered (compare e.g.
Stowe, 1986). In Williams et al.'s self-paced reading experiment, Chinese, Korean, and
German-speaking learners of English were asked to read sentences presented on a computer
screen in a word-by-word fashion, and to indicate the point at which they thought the
sentence had become implausible by pressing a 'stop' button. Assuming that on-line sentence
comprehension is incremental in nature, the authors predicted that if the learners adopt a
filler-driven or 'gap-as-first-resort' strategy, then the wh-phrase in both conditions would
initially be analysed as the object of the verb when this is encountered. A filled-gap effect
would then be observed on the post-verbal NP, reflected in longer reading times, due to the
need for reanalysis at this point. If, on the other hand, a gap is posited only as a last-resort
strategy (that is, to avoid ungrammaticality; compare Fodor, 1978), then no such slowdown
would be expected at the post-verbal NP.
In the 'stop-making-sense' task, the learners behaved similarly to the native speakers. All but
the Chinese-speaking participants made more 'stop' decisions at and immediately after the
verb in the 'Implausible-at-V' condition than in the corresponding plausible condition,
suggesting that both the learners and the native speakers were sensitive to plausibility
information. The analysis of the reading time data showed that for all participant groups, the
post-verbal noun in the 'Plausible-at-V' condition elicited longer reading times compared to
the post-verbal noun in the 'Implausible-at-V' condition. This indicates that both the native
speakers and the learners analysed the wh-filler as the direct object of the verb, and that the
plausibility of the wh-filler as a direct object affected the ease of reanalysis. The learners' L1
background did not appear to have any effect on how they processed the experimental
sentences. Only the native speakers showed an effect of plausibility at the determiner
introducing the post-verbal NP, however. According to the authors, the earlier onset of the
filled-gap effect observed in the native group may indicate a greater sensitivity to the
syntactic cue provided by the determiner, which signalled an incoming NP.
Williams et al.'s on-line experiment was complemented by an off-line acceptability
judgement task to investigate the different learner groups' ability to recover from
misanalysis. The results showed that the learners but not the native speakers judged the
'Plausible-at-V' sentences unacceptable significantly more often than the 'Implausible-at-V'
ones. Similarly to Juffs & Harrington (1995, 1996), the authors conclude that the learners
have more difficulty than native speakers recovering from an initial misanalysis, particularly
when this analysis is plausible, suggesting an over-commitment to a strongly plausible first
analysis.
Summarising, Williams et al.'s results suggest that L2 learners, like native speakers, employ
a filler-driven parsing strategy when processing wh-dependencies, irrespective of their
language background. A potential problem with this study, however, is that there is no
evidence that the learners interpreted the experimental items correctly. Recall that in the off-
line task, the learners judged many of the experimental sentences as unacceptable even
though they were both grammatical and fully plausible by the end of the sentence. Observe
further that like the results from Juffs & Harrington's (1995) study, Williams et al.'s results
do not bear directly on the question of whether or not L2 learners postulate empty categories
during processing. It is possible that the participants associated the wh-filler with the verb
directly, a decision that they were forced to undo when the actual Theme or Patient argument
became available. As the authors point out themselves, the filled-gap effect observed on the
post-verbal noun in the non-native participants may reflect purely thematic, rather than
thematic and syntactic, reanalysis processes. The current study aims to dissociate potentially
verb-driven integration effects from syntactic gap-filling by examining L2 learners'
processing of successive-cyclic wh-movement structures.
3. The present study
Our study was modelled after Gibson & Warren's (1999) study on the processing of long wh-
dependencies by adult native speakers of English. Using a self-paced reading task, Gibson &
Warren (hereafter, G&W) investigated how native speakers process sentences such as (7a)
and (7b) below.
(7) a. The manager whoi the consultant claimed e'i that the new proposal
had pleased ei will hire five workers tomorrow.
b. The manager whoi the consultant's claim about the new proposal
had pleased ei will hire five workers tomorrow.
The sentences in (7) above differ in that (7a) but not (7b) provides an intermediate landing
site for the fronted wh-pronoun. This is because in (7a), wh-movement has crossed a clause
boundary that signals the beginning of a new cyclic domain, whereas (7b) involves
extraction across a noun phrase. Crucially, the linear distance between the filler and its
ultimate gap (as measured in terms of the number of intervening words) was kept the same in
both experimental conditions. In order to control for a possible confounding effect of
subject-verb distance, G&W's materials also included sentences of the following types,
which did not involve any wh-movement but which differed in the relative distance between
the verb pleased and the head of its subject (viz. proposal in [8a], and claim in [8b]).
(8) a. The consultant claimed that the new proposal had pleased the manager who will
hire five workers tomorrow.
b. The consultant's claim about the new proposal had pleased the manager who will
hire five workers tomorrow.
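The claim that the filler and its subcategoriser are separated by the same number of words in both extraction conditions can be checked mechanically. The following is a minimal sketch in Python, assuming that linear distance is simply the count of whitespace-separated words strictly between who and pleased; the function name intervening_words is a hypothetical helper and not part of G&W's method.

```python
# Count the words strictly between the filler "who" and its
# subcategoriser "pleased" in extraction sentences (7a) and (7b).
# Assumption: distance = number of whitespace-separated words in between.

def intervening_words(sentence: str, filler: str, verb: str) -> int:
    """Return the number of words strictly between `filler` and `verb`."""
    words = sentence.rstrip(".").split()
    start = words.index(filler)          # position of the fronted wh-pronoun
    end = words.index(verb, start + 1)   # first occurrence of the verb after it
    return end - start - 1

SENTENCE_7A = ("The manager who the consultant claimed that the new proposal "
               "had pleased will hire five workers tomorrow.")
SENTENCE_7B = ("The manager who the consultant's claim about the new proposal "
               "had pleased will hire five workers tomorrow.")

distance_vp = intervening_words(SENTENCE_7A, "who", "pleased")
distance_np = intervening_words(SENTENCE_7B, "who", "pleased")
print(distance_vp, distance_np)
```

Under this word-count definition, both sentences yield the same distance, so any reading-time difference at pleased cannot be attributed to the number of intervening words alone.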
The authors found an interaction between extraction and intervening phrase type at the
region containing the wh-filler's subcategoriser pleased. Reading times were shorter for
sentences such as (7a) that provided an intermediate landing site than for sentences such as
(7b), an effect that was not present in the non-extraction conditions and thus cannot be
attributed to any differences in subject-verb distance between the VP and NP conditions.
Furthermore, the reading times elicited by the complementiser that in (7a) were found to be
longer than in the corresponding non-extraction condition (8a), although the interaction
between extraction and intervening phrase type did not reach significance here.
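The logic of an extraction by phrase-type interaction can be expressed as a difference of differences over the four condition means. The sketch below is only an illustration of that arithmetic: the function interaction_contrast is a hypothetical helper, and the reading times are invented placeholder values, not data from G&W or from the present study.

```python
# Difference-of-differences contrast for a 2 x 2 design
# (extraction vs. non-extraction) x (VP vs. NP intervening phrase).

def interaction_contrast(rt: dict) -> float:
    """rt maps (extraction, phrase_type) -> mean reading time in ms."""
    # Cost of the NP relative to the VP condition when a filler must be integrated
    extraction_effect = rt[("extraction", "NP")] - rt[("extraction", "VP")]
    # The same NP-minus-VP difference when no filler is involved
    baseline_effect = rt[("non-extraction", "NP")] - rt[("non-extraction", "VP")]
    return extraction_effect - baseline_effect

# Hypothetical placeholder values in ms, NOT results from any study.
example_rt = {
    ("extraction", "VP"): 400.0,
    ("extraction", "NP"): 460.0,
    ("non-extraction", "VP"): 380.0,
    ("non-extraction", "NP"): 385.0,
}

contrast = interaction_contrast(example_rt)
print(contrast)
```

A positive contrast indicates that the NP condition slows reading at the critical region more under extraction than it does without extraction, which is the pattern an intermediate-gap account would predict; whether such a contrast is reliable is a matter for the appropriate inferential statistics.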
The 'intermediate gap' effect observed by G&W supports a strong version of the Active
Filler Hypothesis according to which a filler is reactivated cyclically so as to break up long
dependencies into a series of shorter ones (compare Crocker, 1996; Frazier & Clifton, 1989).
Note, however, that there was an asymmetry in G&W's experimental materials between the
extraction and non-extraction conditions in that the extraction conditions contained more
words than the non-extraction conditions, and one additional level of embedding before the
critical segments. This asymmetry may have introduced a confound such that lower reading
times in the non-extraction conditions might have been partly due to the differences in length
and/or structural complexity between the extraction and non-extraction conditions.
The present study has two major aims: (i) to replicate G&W's finding with native speakers of
English using improved materials, and (ii) to investigate whether L2 learners of English from
different language backgrounds process long wh-dependencies in the same way as, or
differently from, native speakers. To test whether the learners' L1 background has an effect
on their processing of long wh-dependencies in L2 English, we examined learners from both
wh-movement (Greek, German) and wh-in-situ backgrounds (Chinese, Japanese).
Four groups of learners of L2 English participated in the current study: 34 Chinese-speaking
learners (mean age = 25, range = 17-33), 26 Japanese-speaking learners (mean age = 27,
range = 20-40), 24 German-speaking learners (mean age = 24, range = 19-46), and 30
Greek-speaking learners (mean age = 25, range = 20-37), as well as a group of 24 native
English-speaking controls (mean age = 24, range = 19-34). The participants were recruited
from among the undergraduate and postgraduate student communities at the University of
XXXX and were paid a small fee for their participation. All participants had normal or
corrected-to-normal vision, and were naïve with respect to the purpose of the experiment.
The Chinese-speaking learners were all native speakers of Mandarin Chinese. All learners
had first been exposed to English around the age of 11 in a classroom setting, and none of
them considered themselves bilingual. Table 1 provides an overview of the learners' age at
the time of testing, their age of first exposure to English, and the time the participants had
spent in the UK at the time of testing.
Insert Table 1 about here
To determine the learners' general proficiency in English at the time of testing, all of them
underwent a standardised proficiency test, the Oxford Placement Test (OPT; Allen, 1992).
As our experimental materials involved structurally complex sentences, only learners at or
above the 'upper intermediate' level (i.e., learners scoring 145/200 points or above) were
included in our study. In addition to the OPT, the learners also completed an off-line
questionnaire, the purpose of which was to ensure that they were able to comprehend
complex sentences of the kind that were later used in the on-line task. The questionnaire
consisted of 20 sentences that were similar but not identical to the sentences used in the self-
paced reading task. There were five sentences corresponding to each of the four
experimental conditions in the on-line experiment, as described below in the Materials
section. Each sentence was followed by a comprehension question and three choices, as
illustrated by (9) below (for the full set of questionnaire materials, see Appendix A).
(9) The captain who the officer decided that the young soldier had displeased will
write a formal report next week.
Who made a decision?
the captain the officer the soldier
The participants were instructed to read the sentences and indicate which of the three
answers they considered the most appropriate. Table 2 presents a summary of the
participants' scores in the OPT and in the off-line questionnaire.
Insert Table 2 about here
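The proficiency-based inclusion criterion described above (an OPT score of 145/200 or above) amounts to a simple threshold filter. A minimal sketch follows, assuming scores are stored per participant; the participant identifiers and scores are hypothetical placeholders, not data from this study.

```python
# Screen candidate participants by Oxford Placement Test (OPT) score.
OPT_CUTOFF = 145  # 'upper intermediate' threshold out of 200, as stated in the text

def is_eligible(opt_score: int) -> bool:
    """Return True if the score is at or above the inclusion threshold."""
    return opt_score >= OPT_CUTOFF

# Hypothetical candidate pool: identifiers and scores are invented for illustration.
candidates = {"P01": 152, "P02": 144, "P03": 145}

included = sorted(pid for pid, score in candidates.items() if is_eligible(score))
print(included)
```

Note that the comparison is inclusive, so a learner scoring exactly 145 is retained.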