Chunking Ability Shapes Sentence Processing at Multiple Levels of Abstraction
Stewart M. McCauley (Stewart.McCauley@liverpool.ac.uk)
Department of Psychological Sciences, University of Liverpool
Erin S. Isbilen (esi6@cornell.edu)
Morten H. Christiansen (christiansen@cornell.edu)
Department of Psychology, Cornell University
Abstract
Several recent empirical findings have reinforced the notion
that a basic learning and memory skill—chunking—plays a
fundamental role in language processing. Here, we provide
evidence that chunking shapes sentence processing at multiple
levels of linguistic abstraction, consistent with a recent
theoretical proposal by Christiansen and Chater (2016).
Individual differences in chunking ability at two different
levels are shown to predict on-line sentence processing in
separate ways: i) phonological chunking ability, as assessed
by a variation on the non-word repetition task, predicts
processing of complex sentences featuring phonological
overlap; ii) multiword chunking ability, as assessed by a
variation on the serial recall task, is shown to predict reading
times for sentences featuring long-distance number agreement
with locally distracting number-marked nouns. Together, our
findings suggest that individual differences in chunking
ability shape language processing at multiple levels of
abstraction, consistent with the notion of language acquisition
as learning to process.
Keywords: sentence processing; chunking; learning;
memory; usage-based approach; language
Introduction
Language takes place in real time: a fairly uncontroversial
observation, yet one with far-reaching consequences that are
rarely considered. For instance, a typical English speaker
produces between 10 and 15 phonemes per second
(Studdert-Kennedy, 1986), yet the ability of the auditory
system to process discrete sounds is limited to around 10 per
second, beyond which the signal is perceived as a single
buzz (Miller & Taylor, 1948). Moreover, the auditory trace
is limited to about 100ms (Remez et al., 2010).
Compounding matters even further, human memory for
sequences is limited to between 4 and 7 items (e.g., Cowan,
2001; Miller, 1956). Simply put, the sensory signal is so
incredibly short-lived, and our memory for it so very
limited, that language would seem to stretch the human
capacity for information processing beyond its breaking
point. We refer to this as the Now-or-Never bottleneck
(Christiansen & Chater, 2016).
How are language learning and processing possible in the
face of this real-time constraint? A key piece of the puzzle,
we suggest, lies in chunking: through experience with
language, we learn to rapidly recode incoming information
into chunks which can then be passed to higher levels of
representation.
As an intuitive demonstration of the necessity of
chunking, imagine being tasked with recalling a string of
letters, presented auditorily: u o p f m r e e p o a e c s g n p l
i r. After a single presentation of the string, very few
listeners would be able to recall a sequence consisting of
even half of the letters (cf. Cowan, 2001). However, if
exposed to the exact same set of letters but re-ordered
slightly, virtually any listener would be able to recall the entire
sequence with ease: f r o g m o u s e p a p e r p e n c i l.
Clearly, such a feat is possible by virtue of the ability to
rapidly chunk the sequence into familiar sub-sequences
(frog, mouse, paper, pencil).
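This recoding intuition can be made concrete with a toy sketch (not from the paper): a greedy left-to-right segmenter that collapses a letter stream into chunks drawn from a small lexicon. The lexicon and the greedy strategy are illustrative assumptions, not the mechanism the paper proposes.

```python
# Toy illustration of the chunking intuition: recode a letter stream into
# familiar word-level chunks wherever possible. The lexicon and the greedy
# longest-match strategy are illustrative assumptions.

LEXICON = {"frog", "mouse", "paper", "pencil"}

def chunk_letters(letters, lexicon=LEXICON, max_len=6):
    """Greedily group a sequence of letters into known word chunks."""
    chunks, i = [], 0
    while i < len(letters):
        # Try the longest candidate chunk first.
        for size in range(min(max_len, len(letters) - i), 0, -1):
            candidate = "".join(letters[i:i + size])
            if candidate in lexicon:
                chunks.append(candidate)
                i += size
                break
        else:
            # No known chunk: carry the letter as a singleton item.
            chunks.append(letters[i])
            i += 1
    return chunks

print(chunk_letters(list("frogmousepaperpencil")))
# ['frog', 'mouse', 'paper', 'pencil'] -- four items, well within memory limits
print(len(chunk_letters(list("uopfmreepoaecsgnplir"))))   # 20 singletons
```

The re-ordered string collapses to four chunks, while the scrambled string yields twenty unrelated items, mirroring the recall contrast described above.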
According to the proposal of Christiansen and Chater
(2016), the Now-or-Never Bottleneck requires language
users to perform similar chunking operations on speech and
text in order to process and learn from the input. This is
necessary both due to the fleeting nature of sensory memory
and the speed at which information is encountered during
processing. Specifically, language users must perform
Chunk-and-Pass processing, whereby input is chunked as
rapidly as possible and passed to a higher, more abstract
level of representation. Information at higher levels must
also be chunked before being passed to still higher,
increasingly abstract levels of representation.
Thus, in order to communicate in real-time, language
users must chunk at multiple levels of abstraction, ranging
from the level of the acoustic signal to the level of
phonemes or syllables, to words, to multiword units, and
beyond. Indeed, mounting empirical evidence supports the
notion of chunking at levels higher than that of the
individual word: children and adults appear to store and
utilize chunks consisting of multiple words in
comprehension and production (e.g., Arnon & Snider, 2010;
Bannard & Matthews, 2008). Moreover, usage-based (e.g.,
Tomasello, 2003) and generative (e.g., Culicover &
Jackendoff, 2005) theoretical approaches have highlighted
the importance of such units in grammatical development
and sentence processing alike.
Chunking has been considered a key learning and
memory mechanism in mainstream psychology for over half
a century (e.g., Miller, 1956), and has been used to
understand specific aspects of language acquisition (e.g.,
Jones, 2012; Jones, Gobet, Freudenthal, & Pine, 2014).
Nevertheless, few have sought to understand how it may
shape more complex linguistic skills, such as sentence
processing. McCauley and Christiansen (2015) took an
initial step in this direction, showing that individual
differences in low-level chunking abilities were predictive
of reading times for sentences involving relative clauses,
demonstrating the far-reaching impact of basic chunking
skills in shaping complex linguistic behaviors.
The present study seeks to evaluate the predictions of the
Chunk-and-Pass framework more closely, by examining
individual variation in chunking at two different levels of
abstraction. Specifically, whereas chunking has previously
been treated as a uniform memory ability, we test the novel
theoretical prediction that chunking abilities may be
relatively independent at different levels of linguistic
abstraction. Participants were first asked to take part in a
multiword-based serial recall task (Part 1) designed to yield
a measure of chunking at the word level. This was followed
by a variation on the non-word repetition task (Part 2),
designed to yield a measure of phonological chunking
ability. Importantly, due to the memory limitations
discussed above, participants must utilize chunking in order
to recall more than a few discrete words or phonemes in
these tasks (e.g., Cowan, 2001; Miller, 1956). Finally,
participants took part in an online self-paced reading task
(Part 3). The results show that chunking ability at each level
predicts different aspects of sentence processing ability:
chunking at the phonological level predicts the extent to
which low-level phonological information interferes with or
facilitates complex sentence processing, while chunking at
the multiword level predicts the role of local information in
processing sentences with long-distance dependencies.
Part 1: Measuring Individual Differences in
Word Chunking Ability
The first task sought to gain a measure of individual
participants’ ability to chunk words into multiword units. To
this end, we specifically isolate chunking as a mechanism
by employing a classic psychological paradigm: the serial
recall task. Serial recall has a long history of use in studies
of chunking, dating back to some of the earliest relevant
work (e.g., Miller, 1956), as well as being used extensively to
study individuals’ chunking abilities (e.g., Ericsson, Chase,
& Faloon, 1980).
Participants were tasked with recalling strings of 12
individual words, with each string consisting of 4 separate
word trigrams extracted from a large corpus of English.
Importantly, in order to recall more than a few discrete
items (as few as 4 in some accounts; e.g., Cowan, 2001),
listeners must chunk the words of the input sequence into
larger, multiword units. In this case, we expect them to draw
upon linguistic experience with the trigrams in the
experimental items.
In addition, we included a baseline performance measure:
matched control strings, which featured identical functors to
the experimental sequences, along with frequency-matched
content words (to avoid semantic overlap effects on recall),
presented in random order. Thus, comparing recall for
experimental and control trials provides a measure of word
chunking ability that reflects language experience while
controlling for such factors as attention, motivation, and—to
the extent that it is separable—working memory.
Method
Participants 42 native English speakers from the Cornell
undergraduate population (17 females; age: M=19.8,
SD=1.2) participated for course credit. Of the original 45
subjects, one was excluded due to audio recording errors,
while two subjects failed to complete all three tasks.
Materials Experimental stimuli consisted of word trigrams
spanning a range of frequencies, extracted from the
American National Corpus (Reppen, Ide & Suderman,
2005) and the Fisher corpus (Cieri, Graff, Kimball, Miller &
Walker, 2004). The combined corpus contained a total of 39
million words of American English. Each item was
compositional (non-idiomatic). Item frequencies, per million
words, ranged from 40 to 0.08, with a mean of 0.73.
Each word was synthesized independently using the
Festival speech synthesizer (Black, Clark, Richmond, King
& Zen, 2004) and concatenated into larger strings consisting
of 12 words (4 trigrams). Each trigram was matched as
closely as possible for frequency with the others occurring
in a sequence.
To provide a non-chunk-based control condition, each
item was matched to a sequence of words which contained
identical functors but random frequency-matched content
words (in order to avoid semantic overlap effects on recall,
content words were not re-used). The ordering of the words
was then randomized. An example of a matched set of
sequences is shown below:
1) have to eat good to know don’t like them is really nice
2) years got don’t to game have she mean to them far is
The final item set consisted of 20 sequences (10
experimental, 10 control).
Procedure Each trial featured a 12-word sequence
presented auditorily. Each word was followed by a 250ms
pause. Immediately upon completion of the string, the
participant was prompted to verbally recall as much of the
sequence as possible. Responses were recorded digitally and
later transcribed by a researcher blind to the conditions as
well as the purpose of the study.
The presentation order of the sequences was fully
randomized. The entire task took approximately 15 minutes.
Results and Discussion
Participants recalled significantly more words from
experimental strings than the frequency-matched control
sequences. The overall recall rate for words occurring in
experimental items was 74.0% (SE=2.3%), while the recall
rate for control sequences was just 39.2% (SE=1.1%). The
difference between conditions was significant (t(41)=18.8,
p<0.0001).
As the purpose of Part 1 was to gain an overall measure of
chunk sensitivity, we calculated the difference between
conditions individually for each subject (M=34.8%,
SE=1.8%), which afforded a measure of word-chunking
ability that reflects language experience while controlling
for factors such as working memory, attention, and
motivation. We refer to this difference measure as the Word
Chunk Sensitivity score, and it is used as a predictor of
sentence processing ability in Part 3.
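The difference-score computation can be sketched in a few lines of Python; the recall proportions below are invented for illustration (the real values come from the transcribed recall data of the 42 participants).

```python
from math import sqrt
from statistics import mean, stdev

# Invented per-subject recall proportions for the two conditions.
experimental = [0.80, 0.65, 0.90, 0.72, 0.85]   # trigram sequences
control      = [0.40, 0.38, 0.45, 0.41, 0.39]   # scrambled controls

# Word Chunk Sensitivity: per-subject experimental-minus-control recall.
sensitivity = [e - c for e, c in zip(experimental, control)]

def paired_t(diffs):
    """Paired t statistic for the condition difference."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

print([round(s, 2) for s in sensitivity])   # [0.4, 0.27, 0.45, 0.31, 0.46]
print(round(paired_t(sensitivity), 2))      # 9.98
```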
In addition to bolstering previous empirical support for
compositional (non-idiomatic) multiword sequences as
linguistic units in their own right (e.g., Bannard &
Matthews, 2008), Part 1 revealed considerable individual
differences across participants in word chunking ability.
Recall rates for experimental items ranged from as high as
93.3% to just 30.4%, with difference scores across the
conditions ranging from 50.8% to as low as 3.0%.
Part 2: Measuring Individual Differences in
Phonological Chunking Ability
While the first task sought to gain a measure of individual
participants’ chunking abilities at the level of words, Part 2
sought to gain a measure of chunking ability at the
phonological level. To this end, we re-purposed the standard
non-word repetition (NWR) task as a chunking task. NWR
has been used extensively to study various aspects of
language development. Recent studies, however, have
suggested that chunking may better account for NWR
performance than more nebulous psychological constructs,
such as working memory (e.g., Jones, 2012; Jones et al.,
2014). In one sense, the NWR task can be re-conceptualized
as a serial recall task, as in Part 1. Following such work, and
in keeping with the Now-or-Never perspective outlined
above, we propose that individual differences in chunking
ability underlie differences in NWR performance. In turn,
NWR—with appropriately constructed stimuli—can serve
as an additional dimension along which to measure
chunking ability at the level of phonological processing.
Participants engaged in a standard NWR task, with each
non-word consisting of 4, 5, or 6 syllables. However, the
stimuli were designed such that the same set of syllables
occurred in two different non-words, but in different
orderings: one ordering yielded an item with high
“chunkability,” according to corpus statistics, while the
other was estimated to be less “chunkable.” The two items
were then counterbalanced across halves of the task.
Method
Participants The same 42 subjects from Part 1 participated
directly afterwards in this task.
Materials Non-words were generated using an algorithm
which took a large list1 of English syllables and randomly
generated syllable combinations that were evaluated
according to distributional statistics at the phoneme level.
For the purpose of supplying statistics, the combined corpus
used in Part 1 was automatically re-transcribed phonetically
using the Festival speech synthesizer (Black et al., 2004).
1 http://semarch.linguistics.fas.nyu.edu/barker/Syllables/
For each of three different syllable lengths (4-, 5-, and 6-
syllables), the algorithm extracted item pairs that differed
maximally in sequence likelihood (based on phoneme
trigram statistics) across two different sequential orderings
of the same set of syllables. In other words, pairs were
selected in which one ordering of syllables was highly
“chunk-like,” while the other ordering of the same syllables
was less “chunk-like,” according to the phoneme statistics
of the corpus. Four sets of non-words (the four in which the
pair differed most greatly in terms of sequence likelihood)
were selected for each syllable length. An example of a
highly “chunk-like” 4-syllable item is krew-ih-tie-zuh,
which was matched to the less chunk-like tie-zuh-ih-krew.
Thus, the final set of items included 24 non-words, eight
in each of three syllable-length conditions (4-, 5-, and 6-
syllable), with four being highly “chunk-like” and the other
four consisting of alternate orderings of the same syllables
which were statistically less “chunk-like.”
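The ordering comparison can be illustrated with a minimal Python sketch: score each ordering by the summed smoothed log-probability of its phoneme trigrams and check which is more "chunk-like." The phoneme labels and trigram counts are invented; the paper's statistics come from the phonetically re-transcribed 39-million-word corpus.

```python
from math import log

# Toy "chunkability" score: summed add-alpha log-probabilities of a
# sequence's phoneme trigrams. Counts and phoneme labels are invented.

def trigram_score(phonemes, counts, alpha=1.0, vocab=50):
    """Higher score = more 'chunk-like' under the toy trigram model."""
    denom = sum(counts.values()) + alpha * vocab ** 3
    return sum(log((counts.get(tuple(phonemes[i:i + 3]), 0) + alpha) / denom)
               for i in range(len(phonemes) - 2))

# Syllables of the example item krew-ih-tie-zuh, as phoneme tuples.
syllables = [("k", "r", "uw"), ("ih",), ("t", "ay"), ("z", "ah")]
chunky = [p for s in syllables for p in s]                 # krew-ih-tie-zuh
reordered = [syllables[2], syllables[3], syllables[1], syllables[0]]
less_chunky = [p for s in reordered for p in s]            # tie-zuh-ih-krew

counts = {("k", "r", "uw"): 30, ("r", "uw", "ih"): 12, ("uw", "ih", "t"): 8,
          ("ih", "t", "ay"): 15, ("t", "ay", "z"): 5, ("ay", "z", "ah"): 9}

print(trigram_score(chunky, counts) > trigram_score(less_chunky, counts))  # True
```

Pairs whose two orderings diverge most on this score would then be kept, as in the item-selection procedure above.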
Procedure The task was split into two blocks, with all
NWR item pairs counterbalanced between them. The
auditory presentation of each non-word was followed by a
1500ms pause, after which the participant was prompted to
recall the item verbally. As with Part 1, responses were
recorded digitally and scored offline. The task took
approximately 4 minutes to complete.
Correct responses received a score of 1. Responses
involving alteration to a single phoneme (usually a vowel
substitution, which could easily stem from differences in
regional dialect) received a score of 0.5. All other responses
received scores of 0.
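The scoring rule can be sketched as a small function; treating "alteration to a single phoneme" as a same-length substitution is an assumption here, since the paper does not spell out how insertions or deletions were scored, and the phoneme labels are illustrative.

```python
# Sketch of the NWR scoring rule: 1 for an exact repetition, 0.5 for a
# single-phoneme substitution (e.g., a dialect-driven vowel change), 0
# otherwise. Same-length substitution is an assumption.

def score_repetition(target, response):
    if response == target:
        return 1.0
    if len(response) == len(target):
        mismatches = sum(t != r for t, r in zip(target, response))
        if mismatches == 1:
            return 0.5
    return 0.0

target = ["k", "r", "uw", "ih", "t", "ay", "z", "ah"]
print(score_repetition(target, target))                            # 1.0
vowel_sub = ["k", "r", "uw", "eh", "t", "ay", "z", "ah"]
print(score_repetition(target, vowel_sub))                         # 0.5
print(score_repetition(target, ["k", "r", "uw"]))                  # 0.0
```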
Results and Discussion
Participants achieved a mean NWR accuracy rate of 54.1%
(SE=2.3%). While the overall differences between the high
chunk-like (M=55.2%, SE=2.5%) and low chunk-like
(M=53.1%, SE=2.5%) conditions were in the expected
direction, they were subtle, with a mean difference of 2.1%
(non-significant: t(41)=1.12, p>0.1). However, there was
considerable individual variation in the size of this
difference across participants (SE=1.9%), ranging from
29.2% down to -16.6%. Therefore, in Part 3, we
assess both the overall NWR performance score as well as
the difference between the conditions (which we refer to as
the Phonological Chunk Sensitivity score) as predictors of
sentence processing.
Importantly, neither the overall raw task performance
(β=-0.03, p=0.9) nor the Chunk Sensitivity scores (β=-0.19,
p=0.22) from Parts 1 and 2 correlated with one another,
consistent with the notion that chunking at each level may
have different consequences for sentence processing.
Part 3: Measuring Individual Differences in
Sentence Processing and Chunking
In Part 1, we sought to gain a measure of individual
participants’ ability to chunk words together, while Part 2
aimed to provide a measure of phonological chunking
ability. In Part 3, the same subjects from the first two parts
participated in a self-paced reading task designed to: i)
assess on-line sentence processing across two different
sentence types which were hypothesized to involve
chunking at the word and phonological levels, but to
different extents; ii) determine the extent to which chunking
ability, as assessed in the first two tasks, predicted
processing difficulties for each sentence type.
The first sentence type featured long-distance subject-verb
number agreement with locally distracting number-marked
nouns, exemplified by (1):
1. The key to the cabinets was rusty from many years of
disuse.
Previous work (Pearlmutter, Garnsey, & Bock, 1999) has
shown that readers are slower to process the verb when the
number of the local noun (cabinets) does not match that of
the head noun (key), resulting in the sequence (cabinets
was). Reading times are compared to sentences in which the
number marking matches, as exemplified by (2):
2. The key to the cabinet was rusty from many years of
disuse.
In other words, reading times are higher at the verb when
the local information is distracting. Following the finding
that text-chunking ability predicts decreased difficulty with
complex sentences involving long-distance dependencies
(McCauley & Christiansen, 2015), we hypothesized that
participants with higher Word Chunk Sensitivity scores
(Part 1) would be less susceptible to interference from local
information in sentences such as (1). Subjects that are better
able to rapidly chunk words together and pass them to
higher levels of representation should not only experience
decreased computational burden from long-distance
dependencies, but should be less affected by locally
distracting information.
The second sentence type featured object-relative (OR)
clauses, which have been shown to be processed with
greater ease by good text chunkers (McCauley &
Christiansen, 2015). However, in the present study we
added an element of phonological interference: two pairs of
words in each sentence exhibited phonological overlap.
Previous work has shown that low-level phonological
overlap can interfere with the processing of sentences
featuring relative clauses (Acheson & MacDonald, 2011).
An experimental item and its matched control are shown in
(3) and (4):
3. The cook that the crook consoles controls the politician.
4. The prince that the crook comforts controls the politician.
In line with the Chunk-and-Pass framework, we predicted
that better phonological chunkers, as assessed in Part 2,
would be less susceptible to phonological interference, by
virtue of their ability to more rapidly chunk and pass
phonological information to a higher level of representation.
Thus, participants’ resilience to phonological interference
was hypothesized to be better predicted by Phonological
Chunk Sensitivity (Part 2), while participants’ susceptibility
to local number mismatch was expected to be better
predicted by Word Chunk Sensitivity (Part 1).
Method
Participants The same 42 subjects from Parts 1 and 2
participated in Part 3 immediately afterwards.
Materials There were two sentence lists—counterbalanced
across subjects—each consisting of 9 practice items, 20
experimental items, 20 matched control items, and 68 filler
items. There were two experimental conditions, each with
20 items; the first consisted of the OR sentences featuring
phonological overlap (the first 20 items from Acheson &
MacDonald, 2011). The second experimental condition
consisted of grammatical sentences featuring long-distance
number agreement with locally distracting number-marked
nouns (the 16 items from Pearlmutter et al., 1999, plus four
additional sentences with the same properties).
Each list included, for each condition, 10 of the items in
their experimental form and 10 of the items in their control
form (without rhymes in the case of the OR sentences;
without locally distracting nouns in the case of the number
agreement sentences). The lists were counterbalanced such
that half of the subjects saw the experimental versions of the
sentences that the other half saw in their control form.
Procedure Materials were presented in random order using
a self-paced, word-by-word moving window display (Just,
Carpenter, & Woolley, 1982). At the beginning of each trial,
a series of dashes appeared (one corresponding to each
nonspace character in the sentence). The first press of a
marked button caused the first word to appear, while
subsequent button presses caused each following word to
appear. The previous word would return once more to
dashes. Reaction times were recorded for each button press.
Following each sentence, subjects answered a yes/no
comprehension question using buttons marked “Y” and “N.”
The task took approximately 10 minutes.
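The display logic of the moving-window paradigm can be sketched as follows; this simplified version ignores punctuation handling and RT recording, which the real experiment software would include.

```python
# Simplified sketch of the word-by-word moving-window display: every
# non-space character starts as a dash, and the k-th button press reveals
# only the k-th word while re-masking the previous one.

def moving_window(sentence, presses):
    """Display state after `presses` button presses (0 = all dashes)."""
    return " ".join(
        word if i == presses - 1 else "-" * len(word)
        for i, word in enumerate(sentence.split())
    )

sentence = "The key to the cabinets was rusty"
print(moving_window(sentence, 0))   # --- --- -- --- -------- --- -----
print(moving_window(sentence, 1))   # The --- -- --- -------- --- -----
print(moving_window(sentence, 5))   # --- --- -- --- cabinets --- -----
```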
Results and Discussion
Only trials with correct answers to comprehension questions
were analyzed. Accuracy for the number agreement
condition was 88.3%; for the object-relatives it was 80.0%.
Following Acheson & MacDonald (2011), raw reaction
times over 3000ms were excluded. Prior to analysis, raw
reaction times (RTs) were log-transformed.
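These two preprocessing steps amount to a simple filter-and-transform, sketched below with invented RT values:

```python
from math import log

# Sketch of the RT preprocessing: exclude raw reaction times over 3000ms,
# then log-transform the remainder. The RT values are invented.

def preprocess_rts(rts_ms, cutoff=3000):
    return [log(rt) for rt in rts_ms if rt <= cutoff]

raw = [312, 405, 3500, 298, 2999]
clean = preprocess_rts(raw)
print(len(clean))   # 4 (the 3500ms trial is excluded)
```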
Mean RTs for the main verb in the number agreement and
phonological overlap sentences were comparable to those in
the corresponding original studies (respectively: Pearlmutter
et al., 1999; Acheson & MacDonald, 2011), as was the size
of the mean difference between conditions. In the number
agreement condition, the verb in experimental items
(M=361.1, SE=19.9) was processed more slowly than in
controls (M=316.7, SE=13.9), a mean difference of 44ms
(F1[1,41]=12.7, p<0.001; F2[1,18]=10.2, p<0.01). There
was a fair amount of individual variation in the difference
between conditions (SD=79.4).
Fig. 1: Correlation between Word Chunk Sensitivity (derived from
recall scores in Part 1) and the difference in main verb RTs
between sentences with locally distracting number information vs.
control sentences.
The critical main verb in OR sentences featuring
phonological overlap was processed more slowly (M=605.1,
SE=70.6) than in matched controls (M=546.3, SE=42.2), a
mean difference of 58.8ms, which was non-significant
(F1[1,41]=1.21, p=0.28; F2[1,18]=0.04, p=0.8; see
discussion). There was, however, considerable individual
variation in the difference between conditions (SD=343.7),
especially relative to the size of the group mean difference.
We were primarily interested in the extent to which
differences in RTs between experimental and control
sentences could be predicted by the Chunk Sensitivity
measures collected in Parts 1 and 2. Below, we analyze
these relationships using multiple linear regression, with
Word Chunk Sensitivity and Phonological Chunk
Sensitivity scores as predictors of RT differences between
conditions (recall that the two metrics were not correlated).2
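The analysis described here is standard multiple linear regression; a self-contained sketch (with invented subject scores, and a hypothetical `ols` helper solving the normal equations) might look like:

```python
# Sketch of the regression: RT difference scores modeled as a linear
# function of Word Chunk Sensitivity, Phonological Chunk Sensitivity, and
# their interaction. `ols` is a hypothetical helper; all data are invented.

def ols(X, y):
    """Return coefficients b solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for col in range(k):                      # forward elimination, pivoting
        pivot = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for row in range(col + 1, k):
            f = xtx[row][col] / xtx[col][col]
            xtx[row] = [a - f * b for a, b in zip(xtx[row], xtx[col])]
            xty[row] -= f * xty[col]
    beta = [0.0] * k
    for row in range(k - 1, -1, -1):          # back substitution
        s = sum(xtx[row][j] * beta[j] for j in range(row + 1, k))
        beta[row] = (xty[row] - s) / xtx[row][row]
    return beta

# Invented per-subject predictor scores and RT differences (ms).
word_cs = [0.40, 0.27, 0.45, 0.31, 0.46, 0.35]
phon_cs = [0.10, 0.02, -0.05, 0.15, 0.08, 0.00]
rt_diff = [20.0, 95.0, 5.0, 60.0, 1.0, 50.0]

# Design matrix: intercept, both sensitivity scores, their interaction.
X = [[1.0, w, p, w * p] for w, p in zip(word_cs, phon_cs)]
print(len(ols(X, rt_diff)))   # 4 coefficients
```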
For the difference between sentences featuring locally
distracting number information and their control
counterparts, we found that Word Chunk Sensitivity was a
significant predictor of RT difference at the verb (β=-0.79,
t=-3.19, p<0.01), while Phonological Chunk Sensitivity and
the interaction term did not reach significance. The model
for the significant main effect had an R value of 0.42. The
correlation between Word Chunk Sensitivity and the RT
difference is depicted in Figure 1. As can be seen, subjects
with higher Word Chunk Sensitivity scores appear less
susceptible to interference from the locally distracting
number information, as reflected by lower differences
between verb RTs for experimental vs. control sentences.
With regard to the difference between OR sentences with
and without phonological overlap, we found that
Phonological Chunk Sensitivity was a significant predictor
of RT differences at the main verb (β=-3.49, t=-2.43,
p<0.05), while Word Chunk Sensitivity and the interaction
term did not reach significance. The model for the
significant main effect had an R value of 0.36. A scatterplot
showing the correlation between Phonological Chunk
Sensitivity and the RT difference is shown in Figure 2:
better chunking ability resulted in less phonological
interference.
Fig. 2: Correlation between Phonological Chunk Sensitivity
(derived from repetition scores in Part 2) and the difference in
main verb RTs for OR sentences with and without phonological
overlap between words.
2 We found that raw NWR performance scores resulted in
weaker linear models and did not reach significance as a predictor.
Therefore, we focus on the Phonological Chunk Sensitivity metric
in the analyses (see Part 2).
Thus, consistent with the predictions of the Chunk-and-
Pass framework, we find evidence for the notion that
chunking ability shapes sentence processing differently at
two separate levels of abstraction: participants who were
more sensitive to word chunk information better processed
long-distance dependencies in the face of conflicting local
information, while those with higher phonological chunk
sensitivity better processed complex sentences with
phonological overlap between words. That the two chunk
sensitivity measures did not correlate with one another
further underscores the notion of chunking taking place at
multiple levels of abstraction.
While we failed to find the same effect of phonological
overlap on processing as did Acheson and MacDonald
(2011), it is likely that our subjects (Cornell undergraduates)
had more reading experience than subjects at UW-Madison,
and experienced less interference overall. Nonetheless, our
measure of phonological chunk sensitivity was sensitive
enough to pick up individual differences that predicted
sentence processing in the face of phonological interference.
Intriguingly, participants with very high Phonological
Chunk Sensitivity appeared to experience an advantage for
OR sentences featuring phonological overlap. This raises
the possibility that such subjects benefitted from
phonologically-based priming of subsequent rhyme words
in sentences such as (3). Further work will be necessary to
evaluate this possibility.
General Discussion
In the present study, we show that individual differences in
chunking ability predict on-line sentence processing at
multiple levels of abstraction: chunking at the phonological
level is shown to predict the way phonological information
is used during complex sentence processing, while chunking
at the multiword level is shown to predict the ease with
which long-distance dependencies are processed in the face
of conflicting local syntactic information. In Part 1, we
adapted the serial recall task—a paradigm used for over half
a century to study memory, including chunking
phenomena—in order to gain a measure of individual
variation in subjects’ ability to chunk word sequences into
multiword units. In Part 2, subjects participated in a NWR
task with non-words designed to vary according to the ease
with which their phonemes could be chunked. The
difference in correct repetition rates between highly chunk-
able and less chunk-able items provided a measure of
individual variation in chunking ability at the phonological
level. Finally, in Part 3 we showed that chunking at the
multiword level was predictive of processing for sentences
with long-distance dependencies and distracting local
information, while chunking at the phonological level was
predictive of complex sentence processing in the presence
of phonological overlap between words.
Expanding on the findings of a previous study that
showed low-level chunking of sub-lexical letter sequences
to predict sentence processing abilities (McCauley &
Christiansen, 2015), the present study supports the notion
that chunking not only takes place at multiple levels of
abstraction, but that individuals’ processing abilities may be
differently shaped by chunking at each level. Moreover,
chunking at lower levels (e.g., the phonological level) may
have substantial consequences for processing at higher levels
(e.g., sentence processing).
This work is highly relevant to the study of language
acquisition. The Now-or-Never bottleneck imposes
incremental, on-line processing constraints on language
learning, suggesting a key role for chunking. Indeed, a
number of recent computational modeling studies have
demonstrated that chunking can account for key empirical
findings on children’s phonological development and word
learning abilities (Jones, 2012; Jones et al., 2014), while
other work has captured a role for chunking in learning to
comprehend and produce sentences (McCauley &
Christiansen, 2011, 2014). There exists a clear need for
further developmental behavioral studies, including
longitudinal studies, examining individual differences in
chunking as they pertain to specific stages of language
development as well as more general language learning
outcomes.
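The chunk-based learning mechanism in the modeling work cited above (McCauley & Christiansen, 2011) builds an inventory of multiword chunks from backward transitional probabilities over child-directed speech. The sketch below is a deliberately simplified caricature: the actual model is incremental and uses a running-average criterion, whereas the fixed threshold, toy corpus, and function names here are our assumptions for illustration only.

```python
from collections import Counter

def backward_tps(corpus: list[list[str]]) -> dict:
    """Estimate backward transitional probabilities
    P(w1 | w2) from bigram and unigram counts."""
    unigrams, bigrams = Counter(), Counter()
    for utterance in corpus:
        unigrams.update(utterance)
        bigrams.update(zip(utterance, utterance[1:]))
    return {(w1, w2): count / unigrams[w2]
            for (w1, w2), count in bigrams.items()}

def chunk(utterance: list[str], tps: dict,
          threshold: float = 0.75) -> list[list[str]]:
    """Place a chunk boundary wherever the backward TP
    between adjacent words falls below the threshold."""
    chunks, current = [], [utterance[0]]
    for w1, w2 in zip(utterance, utterance[1:]):
        if tps.get((w1, w2), 0.0) >= threshold:
            current.append(w2)  # strong link: extend the chunk
        else:
            chunks.append(current)  # weak link: close the chunk
            current = [w2]
    chunks.append(current)
    return chunks

corpus = [["the", "dog", "ran"],
          ["the", "dog", "sat"],
          ["a", "cat", "ran"]]
tps = backward_tps(corpus)
print(chunk(["the", "dog", "ran"], tps))
# [['the', 'dog'], ['ran']]
```

Here "the dog" coheres into a chunk because "dog" is reliably preceded by "the", while the weaker "dog"–"ran" link triggers a boundary, mirroring the intuition that high-probability word sequences are stored and processed as units.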
Acknowledgments
Thanks to Nick Chater and Gary Jones for helpful
discussion, as well as S. Reig, K. Diamond, J. Kolenda, J.
Powell, S. Goldberg, and D. Dahabreh for assistance with
participant running and recruitment.
References
Acheson, D.J. & MacDonald, M.C. (2011). The rhymes that the
reader perused confused the meaning: Phonological effects
during on-line sentence comprehension. Journal of Memory and
Language, 65, 193-207.
Arnon, I. & Snider, N. (2010). More than words: Frequency effects
for multi-word phrases. Journal of Memory and Language, 62,
67-82.
Bannard, C. & Matthews, D. (2008). Stored word sequences in
language learning. Psychological Science, 19, 241-248.
Christiansen, M.H. & Chater, N. (2016). The Now-or-Never
bottleneck: A fundamental constraint on language. Behavioral &
Brain Sciences, 39, e62.
Cowan, N. (2001). The magical number 4 in short-term memory:
A reconsideration of mental storage capacity. Behavioral and
Brain Sciences, 24, 87-114.
Culicover, P.W. & Jackendoff, R. (2005). Simpler syntax. New
York: Oxford University Press.
Ericsson, K.A., Chase, W.G., & Faloon, S. (1980). Acquisition of a
memory skill. Science, 208, 1181-1182.
Jones, G. (2012). Why chunking should be considered as an
explanation for developmental change before short-term
memory capacity and processing speed. Frontiers in
Psychology, 3:167. doi:10.3389/fpsyg.2012.00167.
Jones, G., Gobet, F., Freudenthal, D., Watson, S.E. & Pine, J.M.
(2014). Why computational models are better than verbal
theories: The case of nonword repetition. Developmental
Science, 17, 298-310.
Just, M. A., Carpenter, P. A., & Woolley, J. D. (1982). Paradigms
and processes in reading comprehension. Journal of
Experimental Psychology: General, 111, 228-238.
McCauley, S.M. & Christiansen, M.H. (2011). Learning simple
statistics for language comprehension and production: The
CAPPUCCINO model. In L. Carlson, C. Hölscher, & T. Shipley
(Eds.), Proceedings of the 33rd Annual Conference of the
Cognitive Science Society (pp. 1619-1624). Austin, TX:
Cognitive Science Society.
McCauley, S.M. & Christiansen, M.H. (2014). Acquiring
formulaic language: A computational model. Mental Lexicon, 9,
419-436.
McCauley, S.M. & Christiansen, M.H. (2015). Individual
differences in chunking ability predict on-line sentence
processing. In D.C. Noelle & R. Dale (Eds.), Proceedings of the
37th Annual Conference of the Cognitive Science Society.
Austin, TX: Cognitive Science Society.
Miller, G.A. (1956). The magical number seven, plus or minus
two: Some limits on our capacity for processing information.
Psychological Review, 63, 81-97.
Miller, G.A. & Taylor, W.G. (1948). The perception of repeated
bursts of noise. Journal of the Acoustical Society of America, 20,
171-182.
Pearlmutter, N.J., Garnsey, S.M. & Bock, K. (1999). Agreement
processes in sentence comprehension. Journal of Memory and
Language, 41, 427-456.
Remez, R.E., Ferro, D.F., Dubowski, K.R., Meer, J., Broder, R.S.
& Davids, M.L. (2010). Is desynchrony tolerance adaptable in
the perceptual organization of speech? Attention, Perception, &
Psychophysics, 72, 2054-2058.
Studdert-Kennedy, M. (1986). Some developments in research on
language behavior. In N.J. Smelser & D.R. Gerstein (Eds.),
Behavioral and social science: Fifty years of discovery (pp. 208-
248). Washington, DC: National Academy Press.
Tomasello, M. (2003). Constructing a language: A usage-based
theory of language acquisition. Cambridge, MA: Harvard
University Press.