All content in this area was uploaded by Harald Baayen on Dec 13, 2016
Storage and computation in the mental lexicon
R. H. Baayen^a
^a Radboud University and Max Planck Institute for Psycholinguistics
P.O. Box 310, 6500 AH Nijmegen, The Netherlands
1. Introduction
In the seventies of the previous century, the mathematical properties of formal languages
provided a key source of inspiration to morphological theory. Models such as those
developed by Lieber (1980) and Selkirk (1984) viewed the lexicon as a calculus, a formal
system combining a repository of morphemes with rules for combining these morphological
atomic units into complex words.
This approach to the lexicon was driven by two fundamental assumptions. First, the
lexicon was assumed to be a compositional derivational system. Complex words were
believed to be generated from simpler forms (Bloch, 1947; Chomsky and Halle, 1968).
Second, following Bloomfield (1933), the set of atomic elements was assumed to comprise
any word or formative that is not predictable by rule. Rule-governed combinations of these
atomic units, the regular complex words, were assumed not to be available as units in the
lexicon, as storage would introduce unnecessary redundancy in the model. Instead of being
listed (i.e., stored without analysis or substructure), regular complex words were generated
(produced or parsed) by rule. Unsurprisingly, the goal of morphological theory was seen
as accounting for which words belong to the set of possible words in the languages of the
world. The question of whether a regular complex word exists in a language was regarded
as a question addressing performance rather than competence, and hence irrelevant for
morphological theory.
Although many other formalisms have been developed to replace sequences of rules
(Halle and Marantz, 1993; McCarthy and Prince, 1993), these formalisms did not chal-
lenge these fundamental assumptions of generative morphology. In optimality theory,
for instance, forms still enter into derivational relations, even though the algorithm that
relates underlying forms to surface forms is not based on a sequence of rules but on
constraint satisfaction.
This type of theory of the lexicon is to a surprising extent equally adequate as a com-
petence theory for how a pocket calculator works. A pocket calculator has a set of atomic
elements, the symbols on its keys. Its chip is endowed with a small set of arithmetic rules
that, when supplied with a legal string, compositionally evaluate this string. Whenever
a pocket calculator is requested to evaluate a string such as ‘2 + 3’, it computes the
outcome. It has no memory that holds the output of previous evaluations of the same
string. It never learns from past experience. The balance of storage and computation is
shifted totally to the maximization of computation and the minimization of storage.
A first goal of this chapter is to show that the pocket calculator provides a fundamen-
tally flawed metaphor for understanding morphological structure and processing in the
mental lexicon. To this end, we first survey evidence from experimental studies of lexical
processing, and then consider another source of information, the fine phonetic detail that
is present in the acoustic signal. We then address the second goal of this chapter, to pro-
vide an indication of the kind of formal mathematical model that may help us to better
understand process and representation in the mental lexicon.
2. Experimental evidence
Over the last twenty-five years, the regular and irregular past tense forms in English
and related languages have provided a rich testing ground for theories of morphological
processing. Whereas English regular verbs have a past tense form in the dental suffix -ed
(e.g., walked, claimed), irregular verbs have past tense forms that range from suppletion
(go/went) to invariance (cut/cut) and from pure vocalic alternation (give/gave) to com-
binations of vocalic alternation and a variant of the dental suffix (sell/sold). Bybee and
Slobin (1982) and Bybee and Moder (1983) called attention to the many kinds of sub-
regularities that characterize the irregular verbs, which older structuralists had already
characterized as semi-productive (e.g. Van Haeringen, 1940).
Most researchers understand regular past tense forms as being derived from their present
tense stems (Bloch, 1947; Chomsky and Halle, 1968) by a simple rule adding the den-
tal suffix. Although irregular past tense forms might also be analysed as governed by
various unproductive rules, such rule-based descriptions tend to be baroque, fairly arbi-
trary and uninsightful. For understanding the semi-regularities of the irregular past tense,
connectionist models offered an alternative that obviated the need for a series of ad hoc
unproductive rules (Rumelhart and McClelland, 1986; McClelland and Patterson, 2002b).
The response of generative linguists (Pinker and Prince, 1988, 1994; Pinker, 1991) was to
defend the Bloomfieldian model by claiming that regular and irregular morphology be-
long to two completely independent cognitive systems, the dual mechanisms of rule (for
regulars) and rote (for irregulars). Irregulars would be stored in an associative memory,
regulars would not be stored at all but always derived by means of morphological rules
(Pinker, 1991, 1997).
The theory of speech production developed by Levelt et al. (1999) and its computational
implementation in the weaver model (Roelofs, 1996, 1997a,b) provide a psycholinguis-
tic formalization of the generative approach to storage and computation. The weaver
model embodies a fully decompositional theory of morphological processing in production.
Conceptual representations for words like handbooks and walked are linked to lemma
representations, which specify their syntactic and inflectional properties (handbooks is a
plural noun, walked a verb in the past tense). These lemma representations activate the
form representations (lexemes) of their constituent morphemes: hand, book and -s for
handbooks, and walk and -ed for walked. In this model, complex words do not receive their
own lexemes; they are assumed always to be produced through their constituents. In the
absence of immediate constituents, irregular verbs are assigned their own lexemes. In
short, the weaver model, as well as the general approach of Pinker and colleagues, takes
as their point of departure that morphology is a simple formal system similar in essence
to the computational system implemented in a standard pocket calculator.
Although a model like weaver is attractive for its simplicity, economy, and the broad
range of experimental data that it accounts for, it is becoming increasingly clear that its
generative design leads to a series of subtle predictions that turn out to be demonstrably
incorrect.
2.1. Storage is ubiquitous
First, storage is not restricted to irregular words. Fully regular complex words also
leave traces in lexical memory, as shown by Taft (1979) and Sereno and Jongman (1997)
for comprehension in English, Baayen et al. (1997, 2002) for comprehension in Dutch,
and Bien et al. (2005) for production in Dutch. All these studies observed that the
frequency of a complex word itself was predictive for processing latencies, independently
of the frequencies of its constituents. Such a frequency effect is widely regarded as proof
of the existence of a separate independent representation for a complex word. It has been
argued that in English regular complex words would not leave traces in lexical memory
when their frequencies fall below a threshold of 6 per million (Alegre and Gordon, 1999).
However, Wurm and Baayen (2006) observed a word frequency effect for English regular
inflected words well below this threshold in both visual and auditory lexical decision.
Interestingly, the presence of a word frequency effect went hand in hand with the
absence of a stem frequency effect in the data of Wurm and colleagues. Traditionally,
word frequency and stem (or root) frequency effects have been interpreted as the hall-
marks of whole word based processing versus decompositional processing respectively. For
very low-frequency words, however, it is highly unlikely that morphological structure is
completely irrelevant. Therefore, Wurm and colleagues suggest that word frequency be
reinterpreted as a joint probability, the probability of the co-occurrence of the immediate
constituents. If this interpretation is correct, lexical frequency effects may derive from
two sources: memory for phonological sequences (morphs), and memory for sequences of
such sequences, i.e., sequences of morphs. For (regular) complex words, both kinds of
memory are probably involved simultaneously.
In more recent work, Pinker and colleagues acknowledge that regulars can be stored.
Nevertheless, their argument is that regulars may perhaps be stored, but crucially they
need not be stored. In normal language use, regulars would be processed by rule, and
only under extreme experimental conditions would one begin to see that regulars have
their own, albeit normally superfluous, representations (Pinker and Ullman, 2002a,b).
However, why would supposedly redundant storage of regulars take place at all? In
fact, we can infer from its mere existence in experiments that it must be advantageous
for the brain to keep track of detailed combinatorial probabilities, contrary to what the
metaphor of the pocket calculator would lead one to expect. Interestingly, De Vaan et al.
(2007) discuss preliminary evidence that regular complex forms may already leave a trace
in memory after just a single exposure. Crucially, their evidence is not restricted to visual
lexical decision, but extends to self-paced reading, a task with a much higher degree of
ecological validity.
One of the advantages that storage offers for lexical processing is probability-driven
elimination of unlikely but possible segmentations in comprehension. Spurious segmenta-
tions are ubiquitous not only in syntax but also in morphology, and knowledge about the
likelihood of substructures is crucial for efficient selection of the most likely morphological
parse (Baayen and Schreuder, 2000; Bod, 2006).
2.2. Processing is not derivational
Given that regular inflected words seem to have their own traces in memory, it is no
longer necessary to assume that a past tense form is derived from its stem. The idea
that complex forms are, in some real sense, constructed on-line from their parts, i.e.,
that a past-tense form like walked is derived in real time from its stem walk, is one of
the few assumptions that many connectionist models (Rumelhart and McClelland, 1986;
Plunkett and Marchman, 1991; MacWhinney and Leinbach, 1991) share with symbolic
theories. Evidence is accumulating, however, that this very common assumption is wrong.
One source of evidence stems from research on speech errors. Stemberger and Middleton
(2003) presented irregular verbs in the progressive form (is falling), and asked subjects to
respond with the simple past (fell). When the vowel of the past tense was more frequent
in the language in general than the vowel of the present tense, overregularisation
errors (falled for fell) decreased. When it was less frequent, overtensing errors (felled for
fall) were more likely. These data suggest that the present and past tense forms are in
competition, and that this competition is modulated by the a-priori probabilities of the
vowels in these verb forms (see also Stemberger 2004).
Tabak et al. (2005b) obtained further evidence for competition between the past and
present tense forms using a task in which subjects were shown a present or a past tense
form, and were asked to say out loud the corresponding past or present tense form. In this
task, henceforth cross-tense naming, the ratio of the frequency of the form seen to that
of the form said was inhibitory. This inhibitory effect was the same for both irregular
and regular verbs. It also did not vary with the direction of naming, from present to
past or from past to present. In cross-tense naming, the form seen (the cue) apparently
inhibits the form to be said (the target). The presence of this effect for irregulars is
expected in the light of the results of Stemberger and Middleton (2003). Its presence also
for regulars shows that the present and past tense forms of regulars exist in the mental
lexicon just as the present and past tense forms of irregulars. Interestingly, the ratio
of the two inflectional variants was not predictive at all in straightforward word naming
nor in picture naming of the same forms. This shows that in normal situations, the two
inflectional variants are not considered jointly, which would be necessary if one form is to
be derived from the other. Instead, the targeted form is retrieved from memory without
noticeable interference from its counterpart in the opposite tense.
2.3. Paradigmatic structure affects processing
We have seen that probabilistic information about individual inflectional variants is
available in lexical memory. It is well-known that inflectional variants are organized in
paradigms (see, e.g., Matthews, 1974). From the syntagmatic perspective of standard
decompositional approaches, paradigms are enigmatic oddities with little more status
than educationally useful ways of displaying inflectional variants. After all, to the extent
that an inflectional variant is decomposable, its structure can be accounted for by a
syntagmatic rule. However, paradigmatic structure and its complexity is emerging from
recent experimental studies as a genuine independent factor in lexical processing.
Moscoso del Prado Martín et al. (2004b) developed, as part of an overall
information-theoretic measure of processing complexity, a measure of paradigmatic inflectional
complexity based on the entropy measure of Shannon and Weaver (1949),
$$H = -\sum_{i} p_i \log_2(p_i), \qquad (1)$$
where $i$ ranges over inflectional variants, and $p_i$ is the probability of a variant in its
paradigm (estimated by its relative frequency in the paradigm). Baayen et al. (2006)
considered this measure (henceforth inflectional entropy) by itself in a regression study of
English monomorphemic words, applying (1) in its simplest form by allowing $i$ to range
over all inflectional variants of a given word. They observed a negative correlation of
inflectional entropy with response latencies in visual lexical decision, and a positive cor-
relation for subjective frequency estimates. Words with a more complex, informationally
rich inflectional paradigm have more connections to other words in the mental lexicon,
and this superior lexicality allows faster lexical decision responses and gives rise to higher
subjective frequency estimates. Tabak et al. (2005a) also observed a (non-linear) effect of
inflectional entropy for Dutch verbs in visual lexical decision.
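As an illustration of how equation (1) is applied to a single paradigm, the following is a minimal sketch, not the procedure of the cited studies. The function name and the frequency counts are hypothetical and invented for this example; probabilities are estimated as relative frequencies within the paradigm, as described above.

```python
import math

def inflectional_entropy(freqs):
    """Shannon entropy (in bits) over the inflectional variants of one word.

    freqs maps each variant to its corpus frequency. Probabilities are
    estimated as relative frequencies within the paradigm, as in (1).
    """
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("paradigm needs at least one attested variant")
    entropy = 0.0
    for count in freqs.values():
        if count > 0:  # unattested variants contribute nothing to the sum
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical counts, invented for illustration only.
balanced = {"walk": 250, "walks": 250, "walked": 250, "walking": 250}
skewed = {"walk": 970, "walks": 10, "walked": 10, "walking": 10}

print(inflectional_entropy(balanced))  # four equiprobable variants: 2 bits
print(inflectional_entropy(skewed))    # mass on one variant: lower entropy
```

A paradigm whose variants are used about equally often has the highest entropy, whereas a paradigm dominated by a single variant approaches zero bits.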
The information structure of more complex paradigms has also been shown to affect
morphological processing. Kostić et al. (2003) investigated Serbian nominal paradigms
with visual lexical decision. Their key predictor was the average probability of a syntactic
function of a given case inflection expressed in bits of information. This information
measure can be calculated across all relevant nouns in the language, or it can be calculated
for each noun separately. For instance, Serbian plural feminine nouns take the ending -ama
for the dative and the instrumental. We can base our estimate of a syntactic function
such as recipient either across all feminine plurals, or for a specific noun, say ženama.
Kostić and colleagues observed that both the general and the item-specific measures were
predictive, with forms carrying a higher information load giving rise to prolonged response
latencies.
Current work on Dutch and Spanish using the picture naming paradigm suggests that
inflectional structure is also predictive for speech production. Tabak et al. (2006) observed
a facilitatory effect for inflectional entropy for Dutch past-tense forms, for both regular
and irregular verbs. For Spanish verbs, Van Buren et al. (2006) observed facilitation for
the inflectional family size, i.e., for the number of nonzero inflectional variants realized in
a corpus, over and above an effect of the word’s lemma frequency. Note that the effects of
inflectional entropy and the inflectional family size bear further witness, albeit indirectly,
to the presence of memory traces for regular inflected words.
Paradigmatic effects in morphological processing are not restricted to inflection. The
derived words and compounds in which a word occurs, its morphological family,
co-determine lexical processing (Moscoso del Prado Martín et al., 2004a). Furthermore,
morphological family members sharing the same structural position have been found to
constitute a domain of analogical generalisation (Krott et al., 2001). Analogical gen-
eralization challenges the high level of abstraction that is part and parcel of classical
syntagmatic approaches — a syntagmatic rule by definition is blind to the properties
of individual words and has access only to a selection of abstract general features. Not
surprisingly, the syntagmatic design of decompositional models renders them unable to
account for the many analogical, graded effects in morphology and morphological pro-
cessing (Seidenberg and Gonnerman, 2000; Ernestus and Baayen, 2003; McClelland and
Patterson, 2002b). Importantly, whereas Pinker (1997) assumed that analogical similarity
would be restricted to irregulars, as irregulars and only irregulars would be stored in an
associative memory, Albright and Hayes (2003) have shown that regulars are also subject
to analogical similarity (islands of reliability in their terminology) just as are irregulars
(see also Ernestus and Baayen 2003).
2.4. Form and meaning interact
The decompositional approach of Levelt et al. (1999) implements a high degree of mod-
ularity and encapsulation. Conceptualisation leads to the selection of a lemma. Once a
lemma has been selected, a new process activating the relevant constituent morphemes
is started up. This process is completely independent of prior conceptualization pro-
cesses. Once a morpheme (lexeme) has been activated, it in turn activates its constituent
phonemes, again fully independently of any preceding processes. The hypothesis under-
lying encapsulated modeling is that the rules and regularities at the level of word form
operate independently from rules and regularities at the level of word meaning. Thus,
Pinker regards semantics as irrelevant for the past tense (Pinker, 1999).
In connectionist approaches to morphology, the modularity assumption is dropped, and
form and meaning are allowed to interact, as for instance in the triangle model (Joanisse
and Seidenberg, 1999; Seidenberg and Gonnerman, 2000). Patterson et al. (2001a,b) argue
that irregulars come to depend more on semantic similarity compared to regulars due to
reduced similarity in their phonological form. They call attention to the cooccurrence of
semantic deficits with degraded performance on irregular verbs (but see Tyler et al. 2004).
Furthermore, there is independent distributional evidence that irregular verbs tend to
have more semantic neighbors than do regular verbs (Baayen and Moscoso del Prado Martín,
2005). Experimental evidence that this difference in semantic density between regulars
and irregulars may affect lexical processing is reported by Tabak et al. (2005a). They
observed that in visual lexical decision, semantic density (gauged by means of the count
of synonym sets in WordNet) was more facilitatory for irregular past tense forms than
for regular past tense forms. It is precisely the forms of irregular verbs that carry the
irregularity for which we find that the greater semantic density of irregular verbs boosts
lexical processing.
Differences in embodiment (Barsalou, 2003; Feldman, 2006) may also be involved. Ta-
ble 1 lists a number of basic verbs for position and movement in English and Dutch. In
both languages, irregular verbs are in the majority, suggesting that irregular verbs have
a greater degree of embodiment than regulars. Work in progress (Tabak et al., 2006)
provides further evidence for this hypothesis. We had artists make photographs of verbal
actions, covering some 180 verbs. The artists reported that acting out irregulars was
much easier than acting out regulars. This subjective impression was supported by two
observations. First, the JPEG files of photographs for irregulars were significantly
smaller in byte size than those for regulars. Second,
if we look at the names elicited from subjects for the pictures, we see that the uncer-
tainty about the pictures’ names (measured by means of entropies over the frequency
distribution of different verbs elicited for a given picture) was much smaller for irregu-
lars than for regulars. Possibly, the greater degree of embodiment of irregular verbs, in
combination with their greater semantic density, may not only render them easier
to understand and conceptualize, but may also contribute to their remarkable resistance
against regularization.
Table 1
Basic verbs for position and motion in English and Dutch. Irregular verbs are shown in
upper case, regular verbs in lower case.

verbs of position
  English:  lie    sit    stand   lean    hang     float  hover
  Dutch:    lig    zit    sta     leun    hang     drijf  zweef
verbs of motion
  English:  walk   crawl  jump    run     swim     sink   dive
  Dutch:    loop   kruip  spring  ren     zwem     zink   duik
  English:  ride   fly    climb   ascend  descend  fall
  Dutch:    rijd   vlieg  klim    stijg   daal     val
We have seen that storage is ubiquitous and not restricted to irregulars, that morphological
processing is not derivational in the sense that more complex forms are derived
on-line from less complex forms, that the paradigmatic relations between complex words
across inflection and word formation co-determine lexical processing and generalization,
and that form and meaning interact. All these observations run counter to the predictions
of fully decompositional generative models.
3. Phonetic evidence
The evidence considered thus far was gathered from domains that traditionally are at
the heart of psycholinguistic investigation. In this section, we consider the consequences
of morphological structure for the fine phonetic detail of complex words. Evidence is
accumulating that the speech signal itself constitutes an additional source of information
about the architecture of the mental lexicon.
3.1. Lexical competition and the computation of fine phonetic detail
According to the model of Levelt et al. (1999), the process of word form encoding is
initiated once a word’s lexeme has been activated. The lexeme sequentially activates the
word’s phonemes from first to last. Phonemes are grouped into syllables, and syllables
are looked up in a syllabary, which provides gestural scores driving articulation.
The weaver model implements key insights of mainstream generative phonology. It
embodies an important intuition, an axiom driving its design, namely, that once we know
which word we need for expressing a given concept, we can select the form of that word
without interference from the forms of other irrelevant words. In weaver, this intuition
is formalized by the restrictions that the only way a lexeme can be activated is through
its own lemma, and that only the lemma selected from the set of candidate lemmas
will activate its lexeme. As a consequence, lexemes are viewed as highly encapsulated
representations that would not enter into competition with each other.
Studies investigating speech errors (e.g., Sevald et al., 1995; Dell et al., 1999) have
long suggested that word forms enter into a process of lexical competition during speech
production. More recently, advances in laboratory phonology and phonetics have pro-
vided further evidence for lexical competition during word form encoding. Recall that
in auditory comprehension, the cohort of a word’s lexical competitors is gradually re-
duced as acoustic information unfolds over time (Marslen-Wilson, 1996; Marslen-Wilson
and Welsh, 1978). Several phonetic studies provide evidence that a similar competition
process characterizes speech production.
Van Son and Pols (2003) observed that the fine phonetic detail of a given segment in the
word reflects the information load of that segment. In their study of Dutch, segments that
contributed more to reducing the cohort were pronounced with longer acoustic durations
and with increased articulatory effort, quantified by means of the spectral center of gravity
(see also Van Son and Van Santen, 2005).
The measure that Van Son and Pols used for gauging a segment’s information load,
$I_L$, is the negative log to base 2 of a ratio of two probabilities. The first probability, $p_+$,
estimates the joint probability (by means of token counts) of all words that begin with the
same sequence of segments up to and including the target segment. For the [I] in sit, all
words beginning with [sI] are taken into account. The second probability, $p_-$, estimates,
again by means of token counts, the joint probability of all words beginning with the
same sequence of segments up to but not including the target segment. For [I] in sit, this
second probability considers all words that begin with [s]. Thus, a segment’s information
load is defined as
$$I_L = -\log_2 \frac{p_+}{p_-}. \qquad (2)$$
What makes $I_L$ interesting is that it gauges the extent to which the [I] in sit contributes
to reducing the uncertainty in the cohort. Before the [I] is considered, the amount of
information in the cohort is $-\log_2(p_-)$ (Shannon and Weaver, 1949). Once the [I] is
considered, however, the amount of information in the cohort increases to $-\log_2(p_+)$.
Since $-\log_2(a/b) = \log_2(b) - \log_2(a)$, it is easy to see that $I_L$ quantifies the extent to which
the set of competitors before the [I] comes in (characterized by the larger and hence less
information rich joint probability $p_-$) is reduced by the [I] (resulting in the smaller more
informative joint probability $p_+$).
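The computation in (2) can be made concrete with a small sketch. It implements only the unweighted, token-based definition given above. The function name, the toy lexicon, and its counts are hypothetical, and each character in a string is assumed to stand for one segment.

```python
import math

def information_load(word, position, counts):
    """I_L = -log2(p_plus / p_minus) for the segment at `position` in `word`.

    counts maps words (one character per segment, an assumption of this
    sketch) to token frequencies. p_plus sums over words sharing the prefix
    up to and including the target segment; p_minus sums over words sharing
    the prefix up to but not including it.
    """
    with_segment = word[:position + 1]
    before_segment = word[:position]
    total = sum(counts.values())
    p_plus = sum(c for w, c in counts.items() if w.startswith(with_segment)) / total
    p_minus = sum(c for w, c in counts.items() if w.startswith(before_segment)) / total
    if p_plus == 0:
        raise ValueError("the target word must occur in the lexicon")
    return -math.log2(p_plus / p_minus)

# Toy lexicon with invented token counts; "I" and "E" stand in for vowels.
lexicon = {"sIt": 8, "sIk": 8, "sEt": 16, "mat": 32}

# The [I] in "sIt" halves the probability mass of the cohort: 1 bit.
print(information_load("sIt", 1, lexicon))
```

Because the ratios telescope, summing $I_L$ over all segments of a word yields $-\log_2$ of that word's own probability in the lexicon (here 3 bits for "sIt", whose relative frequency is 8/64).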
Van Son and Pols (2003) actually used a more complex estimate of $I_L$ that was weighted
for the contextual likelihood of the word containing the target segment, i.e., the word
for which duration and spectral center of gravity were measured (sit). However, how
to optimally estimate the cohort probabilities $p_+$ and $p_-$ requires further research. For
instance, Kuperman et al. (2006), following up on Van Son and Pols (2003), observed
improved correlations when estimating $p_+$ and $p_-$ not with token-based counts but instead
with type counts. Furthermore, this study investigated the predictivity of $I_L$ as defined
in (2) for the divergence of phonemes from their mean durations, rather than the raw
durations of these phonemes. Although the two studies are not directly comparable, their
results are consistent: It is clear that a segment with greater lexical information tends to
have a longer duration. Furthermore, the acoustic signal as produced by the speaker is
advantageous for the listener, as segments which are more important for distinguishing
the target word from its competitors in the auditory cohort are more prominently present
in the speech signal.
This finding raises the question of whether the speaker is modulating the fine pho-
netic detail of a word’s segments explicitly with the purpose of facilitating comprehension
for the listener, in line with Lindblom’s hypo- and hyper-articulation theory (Lindblom,
1990). Although it makes sense to assume that the communicative efficiency of speech
is enhanced by the hyperarticulation of informationally more salient segments and the
hypoarticulation of informationally more redundant segments (Aylett and Turk, 2004;
Van Son and Van Santen, 2005), it seems unlikely that this efficiency is due to some
conscious or even unconscious effort on the part of the speaker to accommodate to the
listener. Speakers can adjust their speech depending on their audience, the acoustics
of their environment, whether or not they are using a telephone, etc. But purposeful
modulation of phonetic detail at the fine-grained level that is at issue here seems highly
unlikely. For instance, Bard et al. (2000) observed that clarity of spontaneous speech was
predictable from the speaker’s knowledge, and not from the listener’s knowledge. Fur-
thermore, Kuperman et al. (2006) observed the effect of informational redundancy on the
details of acoustic durations within a single register, read-aloud speech from the library
for the blind as available in the Spoken Dutch Corpus. Although the speakers sampled
in this corpus produced carefully articulated speech modulated to fit the needs of their
intended audience, informational redundancy still emerges as an independent predictor
for phonetic detail within this register. Trón (2006) likewise points out that it is unlikely
that the modulation of acoustic duration by previous mentions of a word in the discourse
involves adaptation of the speaker to the needs of the listener.
Although it is logically possible that speakers purposefully adjust to their listeners, it
is logically equally possible that fine phonetic detail is the straightforward consequence
of the organization of the mental lexicon. Instead of repeating traditional explanations
based on the interaction between speaker and listener, we therefore explore the viability
of explanations based on what we know about lexical access.
We know from research on auditory word recognition that the incoming speech signal
is matched incrementally against a pool of lexical candidates that is winnowed down as
more acoustic information becomes available. According to the theory of Levelt et al.
(1999), word form encoding in speech production would require a fundamentally different
architecture, with word form selection being driven by semantics, and proceeding with-
out lexical competition. However, consider the possibility that word form encoding in
production makes use of the same phonological memory that is addressed in auditory
comprehension, and that accessing this phonological memory always involves a proba-
bilistic process of lexical competition during which the target word is singled out from its
phonological neighbors.
In comprehension, the greater amount of fine phonetic detail in the acoustic signal (in
the sense of a strengthened articulatory realization) that is present for more discriminative
segments allows the listener to distinguish the carrier words of these segments more rapidly
and reliably from their lexical competitors. In production, by contrast, the speaker has to
retrieve a representation from phonological memory. During this retrieval process, lexical
competitors are co-activated. The co-activation of these competitors seems to come with
two benefits. First, as suggested by naming experiments reported by Vitevitch (2002),
phonological neighbors appear to gang up to facilitate production in a morphologically
simple language like English (but see Vitevitch and Stamer (2006) for the opposite effect
in a morphologically rich language, Spanish).
Second, a greater neighborhood density also has been observed to correlate in English
with strengthened phonetic detail. Wright (2004) and Munson and Solomon (2004) re-
ported that words with a high frequency and low neighborhood density (easy words) were
articulated with more centralized vowels than words with a low frequency and a high
neighborhood density (hard words). Scarborough (2004) observed that words with a low
frequency and a high neighborhood density were characterized by higher degrees of coar-
ticulation. One type of coarticulation that she studied was nasal coarticulation, which
concerns the extent to which the vowel in a word like band is nasalized in anticipation
of the following nasal. Scarborough also measured vowel-to-vowel coarticulation, i.e., the
extent to which the first (or second) vowel affects the location in F1-F2 acoustic space
of the second (or first) vowel. What she found was that more coarticulation takes place
for words with more neighbors. Apparently, once in the course of lexical competition
the phonetic characteristics of a segment have been highly activated, these characteris-
tics are not easily de-activated and may spill over to neighboring segments with which
they are compatible. Interestingly, the effect of neighborhood density on coarticulation
also emerged in Scarborough’s experiments for nonwords. This indicates that these ef-
fects do not hinge on phonetically rich stored representations, but emerge during lexical
competition.
These studies suggest that hyperarticulation is part and parcel of increased lexical
competition. The more intense the process of lexical competition is, the more the unique
properties of a word become relevant for distinguishing it from its competitors. As a
consequence, greater lexical competition results in superior and more detailed lexical ac-
tivation. In short, the corollary of increased competition is greater articulatory precision.
The results obtained by Van Son and Pols (2003) and Kuperman et al. (2006) add
a temporal perspective to the consequences of neighborhood structure for articulation
reported by Wright (2004) and Munson and Solomon (2004). Standard definitions of a
word’s neighbors are string-based, and consider those words as competitors that differ
with respect to a single segment, unweighted for its position in the string. The measure
of lexical information studied by Van Son and Pols, by contrast, taps into the temporal
dynamics of lexical competition by gauging the extent to which a given segment succeeds
in disqualifying irrelevant competitors that up till then were viable alternatives. Further
evidence for sequentiality in speech production is provided by Sevald and Dell (1994), who
observed slowed production for sequences of words with discrepant initial segments (initial
neighbors) compared to words with discrepant final segments (final neighbors). Their
results suggest that the position of the segment that is exchanged to obtain a neighbor
is crucial for understanding word form encoding in speech production. (For discussion of
the vulnerability of initial segments against the backdrop of the phenomenon of prosodic
strengthening, see Keating (2006).)
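The string-based definition of a neighbor mentioned above can be made concrete with a minimal sketch. It assumes that a neighbor is a word of the same length differing in exactly one segment (substitution only), that each character stands for one segment, and that the toy lexicon and the function name substitution_neighbors are purely illustrative; none of these choices is taken from the studies cited.

```python
def substitution_neighbors(word, lexicon):
    """Return the words in `lexicon` that have the same length as `word`
    and differ from it in exactly one segment, regardless of position."""
    return sorted(
        other
        for other in lexicon
        if other != word
        and len(other) == len(word)
        and sum(a != b for a, b in zip(word, other)) == 1
    )


# Invented toy lexicon; each character is treated as one segment.
toy_lexicon = {"cat", "bat", "cot", "cap", "dog", "cut", "can"}

neighbors = substitution_neighbors("cat", toy_lexicon)
density = len(neighbors)  # the (unweighted) neighborhood density of "cat"
```

Because the count ignores where in the string the mismatch occurs, it cannot by itself distinguish initial from final neighbors, which is the limitation that the position-sensitive measures discussed here address.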
Further evidence for the relevance of the position at which words differ from their neigh-
bors, i.e., the position at which the competition is focussed, pertains to morphologically
complex words. Bien et al. (2006) calculated separate counts of the numbers of neighbors
at the initial, the second, and the third position of the stems of derived and inflected
words. In parallel, Bien also considered the entropy of the relative frequencies of the
cohort of lexical competitors at these three positions. She studied these predictors in a
position-response association task (cf. Cholin et al., 2004; Bien et al., 2005), a naming task
that seeks to minimize the effect of comprehension processes in production experiments.
Bien observed an inhibitory effect of the neighborhood count for the initial position, and
a facilitatory effect of the cohort entropy at the second position. Her results confirm that
lexical competition at the initial position slows word form encoding, and add the new
finding that lexical competition at the second position facilitates word form encoding.
Tabak et al. (2006) observed a similar pattern of results with the standard word naming
paradigm, using monomorphemic Dutch verbs. Again the positional neighborhood count
at the initial position of the word was inhibitory, whereas the positional count at the
second position, and also the summed count of neighbors for later positions in the word,
were both facilitatory.
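As a rough illustration of the two kinds of positional predictors, the following sketch counts substitution neighbors separately per position and computes a cohort entropy. The cohort definition (all words sharing the first n segments with the target), the use of base-2 logarithms, the function names, and the toy frequencies are assumptions made for this example; the exact operationalizations used by Bien et al. and Tabak et al. are not specified here.

```python
import math
from collections import Counter


def positional_neighbor_counts(word, lexicon):
    """Count one-segment substitution neighbors separately for each position."""
    counts = Counter()
    for other in lexicon:
        if other == word or len(other) != len(word):
            continue
        mismatches = [i for i, (a, b) in enumerate(zip(word, other)) if a != b]
        if len(mismatches) == 1:
            counts[mismatches[0]] += 1
    return [counts[i] for i in range(len(word))]


def cohort_entropy(word, n_shared, frequencies):
    """Entropy (in bits) of the relative frequencies of the words that share
    the first `n_shared` segments with `word` (an assumed cohort definition)."""
    prefix = word[:n_shared]
    cohort = [f for w, f in frequencies.items() if w.startswith(prefix) and f > 0]
    total = sum(cohort)
    return -sum((f / total) * math.log2(f / total) for f in cohort)


# Invented toy frequencies; each character is treated as one segment.
toy_frequencies = {"cat": 40, "cap": 20, "can": 20, "cot": 10,
                   "cut": 10, "bat": 10, "dog": 10}

position_counts = positional_neighbor_counts("cat", toy_frequencies)
entropy_after_two = cohort_entropy("cat", 2, toy_frequencies)
```

In this toy example the target has one neighbor at the initial position and two at each later position, so a summed, non-positional count would conflate competitors that, on the results reviewed above, may have opposite effects.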
The inhibition observed in naming latencies for initial neighbors is consistent with the
results reported by Sevald and Dell (1994) for rapid sequence naming. The facilitation
at later positions may be the driving factor behind the facilitation reported by Vitevitch
(2002) for a non-positional neighborhood count. Assuming that these results are robust
and replicable, the hypothesis suggests itself that positional densities might be predictive
for the duration with which the corresponding target segments are produced. Preliminary
results suggest that indeed a greater density at the initial phoneme gives rise to prolonged
acoustic duration of this phoneme.
In summary, rich phonetic detail seems to be the by-product, or perhaps even the goal,
of lexical competition in speech production. Assuming that replication studies will con-
solidate these findings, we may speculate that a word’s phonological form is not a static
representation (as a string in computer memory) nor a simple piece of code that sequen-
tially triggers the activation of an otherwise static sequence of segments, as in the WEAVER
model. Instead, a word’s phonological form may be the outcome of a dynamic competi-
tion process that is biased either by acoustic input (in comprehension) or by meaning (in
production). In other words, the morphs that from a morphologist’s perspective seem to
be the basic units stored in memory are themselves the resultant of a dynamic selection
process.
3.2. Syllabification and fine phonetic detail
Syllabification is a well-studied phonological process that affects morphologically com-
plex words, where it may assign stem-final consonants to the onset of a new syllable
containing a vowel-initial affix as rime. The ensuing changes in fine phonetic detail have
surprising consequences for the listener.
When the comparative suffix -er is added to an adjectival base, our orthographic conven-
tions suggest that the comparative form is simply a longer continuation of the adjectival
base, and that morphological information becomes available to the listener once the first
phoneme of the suffix has been heard. However, syllabification of warmer as war-mer
has far-reaching consequences for the fine phonetic detail of the stem (Lehiste, 1972).
Kemps et al. (2005a) and Kemps et al. (2005b) showed that listeners are highly sensitive
to the durational differences between a base word by itself (warm) and the base word
as it occurs in an inflected or derived word (warmer). In a word like warmer, the vowel
and the coda of warm tend to be articulated with shorter durations than when warm is a
word by itself. Even though there are tremendous differences in speech rate both between
and within speakers, listeners nevertheless have been found to be highly sensitive to these
durational differences. For instance, when the Dutch singular kant (’side’) is spliced out
of its plural kanten and presented to Dutch listeners in a number decision task, response
latencies to the spliced-out singular were longer than for normal singulars. Moreover,
the shorter the spliced-out singular was compared to its normal counterpart, the longer
response latencies were found to be. This prosodic mismatch effect was observed both
for words and (phonotactically legal) pseudowords, which shows that we are dealing with
general inferential processes that are not driven primarily by word-specific articulatory
information in memory.
A key challenge in this line of research is to establish whether these findings generalize
from laboratory speech to spontaneous conversational speech. The mere fact that listeners
are able to make use of these subtle cues already suggests that they must be functional
as well in normal language use. Work reviewed by Hawkins (2003) points in the same
direction.
3.3. Morphological effects on fine phonetic detail
Fine phonetic detail is predictable not only from the dynamics of lexical competition and
from general syllabification processes, but also from a word’s morphological properties.
We first consider syntagmatic properties, and then discuss paradigmatic properties.
According to Levelt et al. (1999), the word frequency effect arises at the level of the
lexeme. The higher the frequency of a morpheme is, the faster its lexeme is assumed to
initiate activation of its phonemes. This model predicts that frequency effects for complex
words do not arise. We have already reviewed ample chronometric evidence that falsifies
this prediction. Further evidence arguing against this decompositional approach to speech
production is provided by a detailed examination of the fine phonetic detail of complex
words.
Pluymaekers et al. (2005b) studied the acoustic durations of four Dutch derivational af-
fixes in spontaneous conversations. For three out of four affixes, Pluymaekers documented
that the acoustic duration of the derivational affix tended to decrease with increasing
word frequency. In a laboratory study eliciting complex words with these same deriva-
tional affixes across three speech rates, Pluymaekers et al. (2006) observed the very same
negative correlation between frequency and acoustic duration, which now was robust for
all four affixes studied. These results, which are in line with the data reported by Juraf-
sky et al. (2001) for monomorphemic words, show that stored word-specific information
co-determines a word’s acoustic realization.
That the amount of effort invested in articulation is inversely related to the frequency of
the complex words has also been demonstrated for assimilation in compounds by Ernestus
et al. (2006). Higher-frequency compounds tended to undergo more assimilation at the
constituent boundary (e.g., td in wet + boek, ’law book’, assimilating to db) than low-
frequency compounds.
Paradigmatic relations have also been shown to be predictive for fine phonetic detail.
Hay (2001) observed that t-deletion in words like swiftly is more likely than in words like
softly. The likelihood of deletion turns out to be positively correlated with the ratio of the
frequency of the complex word to that of its base. The greater the extent to which the
complex word is independent of its stem, the lesser the functionality of the low-probability
diphone tl becomes for morphological segmentation, and hence the greater the likelihood
that this cluster can be simplified without loss of comprehension.
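The ratio referred to here is simply the frequency of the complex word divided by the frequency of its base. A minimal sketch follows; the counts are invented for illustration only (they are not corpus counts), and the function name relative_frequency is hypothetical.

```python
def relative_frequency(complex_freq, base_freq):
    """Ratio of the frequency of a complex word to the frequency of its base."""
    if base_freq <= 0:
        raise ValueError("base frequency must be positive")
    return complex_freq / base_freq


# Invented counts (complex word, base), chosen only to illustrate the contrast.
invented_counts = {
    "swiftly": (200, 100),   # complex word more frequent than its base
    "softly": (300, 3000),   # complex word much less frequent than its base
}

ratios = {
    word: relative_frequency(complex_freq, base_freq)
    for word, (complex_freq, base_freq) in invented_counts.items()
}
```

A higher ratio indicates a complex word that is more independent of its base, which under the account above goes together with a greater likelihood of t-deletion.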
A second example of the reflection of paradigmatic structure in fine phonetic detail
concerns the duration of the interfix in Dutch compounds. Krott et al. (2001) showed
that the probability distribution of the interfixes in the set of compounds sharing the left
immediate constituent is crucial for understanding the otherwise enigmatic selection of the
interfix in Dutch novel and existing (Krott et al., 2004) compounds. The greater the like-
lihood of a given interfix in this probability distribution (henceforth its bias), the greater
the likelihood is that it is used, and the shorter its required processing time. Kuperman
et al. (2006) measured the acoustic duration of the interfixes -s and -en in the spoken
Dutch corpus (Oostdijk, 2002; Oostdijk et al., 2002), using the read aloud speech from the
subcorpus ’library of the blind’. Kuperman considered many other variables along with
this paradigmatic bias in the statistical analysis. One of these control variables was the
abovementioned I_L measure of Van Son and Pols (2003), which showed that if the interfix
conveyed more information with respect to its acoustic cohort, it was realized with longer
durations. Interestingly, Kuperman observed that independently of the other predictors
and independently of the I_L measure, interfixes with a greater bias were pronounced with
greater acoustic durations. Apparently, a greater bias is not a measure of greater informa-
tional redundancy, but a measure of the amount of paradigmatic support for an interfix:
Interfixes with greater paradigmatic support are articulated more robustly.
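The notion of bias can be sketched as the relative frequency of each interfix within the set of compounds that share a left constituent. The sketch below assumes a type-based count (each compound counts once) and uses an invented constituent family with placeholder labels; the function name interfix_bias is hypothetical, and the actual estimation procedure in the studies cited may differ.

```python
from collections import Counter


def interfix_bias(family_interfixes):
    """Map each interfix to its proportion among the compounds that share
    the same left constituent (one list entry per compound type)."""
    counts = Counter(family_interfixes)
    total = sum(counts.values())
    return {interfix: n / total for interfix, n in counts.items()}


# Invented constituent family: the interfix of each of eight compounds
# sharing one left constituent ("none" marks the absence of an interfix).
family = ["s", "s", "s", "en", "none", "s", "en", "s"]

bias = interfix_bias(family)
```

In this toy family the bias for -s is 0.625, so on the findings reviewed above -s would be the more likely choice for a novel compound with this left constituent, and would be expected to be realized with a longer duration than -en.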
A final example of the consequences of paradigmatic structure for morphological pro-
cessing, but now for comprehension, concerns the phenomenon known as final devoicing
in Dutch. For a subset of Dutch stems ending in an obstruent, this obstruent alternates
between voiceless (when syllable-final) and voiced (when syllable-initial), compare raaf
(’raven’), plural ra-ven. Ernestus and Baayen (2006b) calculated the probability in a
word’s inflectional and derivational paradigms that its obstruent is voiced, its paradig-
matic likelihood of voicing. They presented word forms in which the final obstruent is
voiceless or nearly voiceless in an auditory lexical decision experiment. Words with a high
paradigmatic likelihood of voicing, i.e., words that are predominantly used with inflec-
tional forms that have the obstruent voiced, elicited longer reaction times. This shows that
the distribution of voicing within the paradigm codetermines the listener’s expectations.
When these expectations are violated, responses are slowed. Interestingly, forms realized
with residual voicing elicited longer latencies than words with completely voiceless final
obstruents. The fine phonetic detail of residual voicing in the acoustic signal (which itself
probably arises due to phonological paradigmatic analogy, Ernestus and Baayen 2003,
2006a) is detected by the listener, and decreases the listener’s estimate that a voiceless
variant is being heard. As a consequence, the response to the voiceless variant is slowed.
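A paradigmatic likelihood of voicing of this kind can be sketched as the proportion of a word's paradigm in which the obstruent is voiced. The sketch below weights forms by token frequency, which is an assumption suggested by the phrase "predominantly used with"; the frequencies are invented, and the function name is hypothetical.

```python
def paradigmatic_likelihood_of_voicing(paradigm):
    """Token-weighted proportion of paradigm members whose obstruent is voiced.

    `paradigm` is a list of (form, token_frequency, obstruent_is_voiced)."""
    total = sum(freq for _, freq, _ in paradigm)
    if total <= 0:
        raise ValueError("paradigm must have a positive total frequency")
    voiced = sum(freq for _, freq, is_voiced in paradigm if is_voiced)
    return voiced / total


# Invented frequencies for the singular and plural mentioned in the text.
raaf_paradigm = [
    ("raaf", 30, False),   # syllable-final obstruent, voiceless
    ("raven", 90, True),   # syllable-initial obstruent, voiced
]

likelihood = paradigmatic_likelihood_of_voicing(raaf_paradigm)
```

With these invented counts the likelihood is 0.75, that is, a word for which a listener would mostly expect the voiced variant, so that a voiceless realization would count as a violation of expectation.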
We have seen that lexical competition among monomorphemic words gives rise to en-
hanced articulatory detail. This suggests that a word’s canonical form is dynamically
computed during a lexical competition process, and that this competition process causes a
word’s phonetic form to be optimally distinct from its nearest phonological and morpho-
logical neighbors. Dynamic computation likewise takes place at the paradigmatic level,
across sets of morphological neighbors instead of across sets of phonological neighbors.
For both monomorphemic and complex words, computations are involved that crucially
involve a word’s own specific neighbors. Since decompositional models only have abstract
rules expressing global syntagmatic generalizations across the lexicon at their disposal,
they are severely challenged by what is now known about the consequences of phonological
and morphological neighborhoods for the articulation of fine phonetic detail.
In addition to dynamic computation, word-specific biases for phonetic detail may be at
work. Jurafsky et al. (2002) argue that English of is realized differently in the Switchboard
corpus according to whether it expresses the genitive, the partitive, or a complement.
Kemps et al. (2005b) discuss the possibility that the odds of encountering the base versus
a bisyllabic derivative may further help listeners to optimize their responses in the number
decision task. Gahl (2006) reports evidence that homophones such as thyme and time
differ systematically in duration, with the more intensively used alternative receiving, on
average, the shorter acoustic realization.
Especially with respect to extremely reduced words, word-specific variation can be quite
extensive. Dutch natuurlijk (’nature-like’, i.e., ’of course’), for instance, is attested with
reduced forms ranging from ntuurlijk, tuurlijk, ntuuk, tuuk to tk (Ernestus, 2000). The
choice of a given reduced form depends in part on the syntactic and discourse context
(Plug, 2005), and in part on social and geographical variables. Keune et al. (2005) docu-
ment, for instance, that severe reduction in colloquial Dutch of high-frequency words such
as natuurlijk and eigenlijk (’own-like’, i.e., ’actually’) to tuuk and eigk is more common in
the Netherlands than in Flanders.
The syntagmatic and paradigmatic lexical forces affecting the fine phonetic detail of
a word are themselves part of a larger set of forces that co-determine the details of
articulation, such as the probability of a word given the preceding or following word
(Bell et al., 2003; Gregory et al., 1999; Jurafsky et al., 2001), the probability of the
syntactic construction in which a word is used (Gahl and Garnsey, 2004, 2006), and
the recency with which a word has been heard in (conversational) discourse (Fowler and
Housum, 1987; Fowler, 1988; Hawkins and Warren, 1994; Bard et al., 2000; Pluymaekers
et al., 2005a; Trón, 2006). From this overall perspective, it seems that fine phonetic
detail at lexical and sublexical levels realizes a kind of informational prosody that reflects
probabilistic generalizations at all levels of linguistic structure, complementary to (or
possibly subsuming) classical prosodic structure and its consequences for articulatory
realization (see, e.g., Keating, 2006).
4. Towards a new class of theories
The results reviewed in the previous sections show that the mathematics of formal
languages do not provide an adequate metaphor for understanding the mental lexicon.
One of the key issues for current research is what kind of formal frameworks might then
be worth pursuing instead.
An alternative approach that has been studied intensively but with hotly debated suc-
cess is connectionist modeling (e.g., Rumelhart and McClelland, 1986; Seidenberg and
Gonnerman, 2000; Pinker and Ullman, 2002b,a; McClelland and Patterson, 2002b,a).
Connectionist networks have the advantage that they can account for graded, probabilis-
tic phenomena. But they also have their share of disadvantages. One such disadvantage
is the merging of rules and representations. This might seem a step forward compared
to the standard Von Neumann computer architecture that is the source of inspiration
for symbolic models. However, rules and representations might have distinguishable neu-
ral substrates, as argued by Ullman (2001, 2004). According to Ullman, symbolic rules
reside in a procedural memory system, and monomorphemic words and formatives in a
declarative memory system. Evidence for such a division of labor for regular and irregu-
lar verbs comes from studies such as Jaeger et al. (1996) (positron emission tomography)
and Beretta et al. (2003) (functional magnetic resonance imaging). Interpretation of the
reported differential activation of brain regions for regulars and irregulars is not straight-
forward, unfortunately. For one thing, functional differentiation can take place in
neural networks. Furthermore, present-to-past naming as used by Jaeger et al. (1996)
is a task that, as discussed above, induces co-activation of inflectional variants that does
not take place in normal speech. In addition, the experimental materials of regulars and
irregulars are not appropriately controlled for semantic density and paradigm complexity
(Baayen and Moscoso del Prado Martín, 2005). Nevertheless, the distinction between
procedural and declarative knowledge is an important one and the available evidence
reviewed by Ullman seems compelling.
Given the data reviewed in the preceding sections, it is clear that Ullman’s straight-
forward traditional divide between regulars (processed by procedural memory) and ir-
regulars (stored in declarative memory) must be too simplistic. Several modifications
suggest themselves. Recall that the word frequency effect for very low-frequency complex
words suggests that combinatorial probabilities may be at issue, for higher-frequency com-
plex words probably in combination with phonological memory traces. It is conceivable
that such combinatorial probabilities are evaluated in procedural memory as memories of
previous assembly and decomposition allowing subsequent faster assembly and decompo-
sition, whereas the phonological memory traces reside in declarative memory. Another
possibility is that procedural memory is responsible for analogical generalization over
lexical exemplars in declarative memory. Yet another alternative interpretation is that
the weaker embodiment of regular verbs compared to irregulars goes hand in hand with
greater multimodal computational demands across memory systems in the brain for con-
ceptual interpretation, and that it is these additional costs that show up in brain imaging
studies.
A second disadvantage is that the connectionist networks in common use are neurolog-
ically implausible. This is a key criticism made by Hawkins and Blakeslee (2004), a study
that researches the consequences of the biological structure of the neocortex for intelligent
computation (see also Miller, 2006). Interestingly, Hawkins and George (2006) provide an
outline of a new implemented technology, Hierarchical Temporal Memory (HTM), that is
designed to replicate the general structural and algorithmic properties of the neocortex.
HTMs consist of a hierarchy of memory nodes. Each node is itself a network that learns
causes from its child nodes and forms beliefs that it propagates to its parent nodes. Go-
ing from low-level sensory nodes to higher-level nodes, the HTM performs as a classifier,
coalescing a series of input patterns into a relatively stable output pattern. Conversely, a
stable pattern at the top of the hierarchy can unfold into a complex temporal pattern at
the bottom of the hierarchy.
HTM memory is not designed specifically for language, but it promises to be a great
tool for the computational modeling of the mental lexicon. HTM as described by Hawkins
and George (2006) is probably best seen as a computational model of declarative memory,
albeit a memory system that is intrinsically predictive. In this sense, it is fundamentally
different from a declarative memory conceived of as a static store of bits and bytes in
the memory registers on a Von Neumann computer. Therefore, HTM bears the promise
of being able to deal in a natural way with graded linguistic phenomena such as fine
phonetic detail and the semi-morphology of phonaesthemes (Bergen, 2004). At the same
time, rules and representations are not merged a priori. Nodes in the HTM memory can
represent individuals and effectively function as symbols, albeit as symbols in a system
with ’rules’ that are inherently analogical and probabilistic in nature.
HTM memory may also help resolve the problem of positional encoding that is rampant
in analogical (Skousen, 1989) and machine learning (Daelemans and Van den Bosch, 2005)
models as well as in connectionist networks. The alternative developed by Albright and
Hayes (2003) is attractive in that it does not depend on positional encoding, but the price
paid is a highly complex system of discrete rules. Since HTM memory is designed explicitly
to match the spatial and temporal hierarchical structure of the real world, it may be able
to detect structure in time without depending on predefined slots for the constituents of
a linguistic unit.
Whereas the mathematics of formal languages has been a key source of inspiration for
morphological theory and models of the mental lexicon, I expect new advances at the
intersection of statistics, information science and the neurosciences such as Hierarchical
Temporal Memory (and models using related techniques such as Dynamic Bayesian Net-
works) to constitute an important source of inspiration for research on the mental lexicon
during the coming years.
As a consequence, not only the controversy between connectionist and symbolic ap-
proaches to the mental lexicon, but also the controversy between abstractionist and
exemplar-based approaches may well be resolved in harmony. Any current exemplar-
based machine learning algorithm must make use of smart economical storage, otherwise
the system will grind to a halt when trying to survey all exemplars in its memory (see,
e.g., the IG-tree technology used by TiMBL; Daelemans and Van den Bosch, 2005). On
the other hand, the abovementioned frequency effects for fully regular complex words and
the effects of inflectional and paradigmatic entropy bear witness to a remarkable sensi-
tivity of lexical memory to very item-specific probabilities. In memory systems such as
HTM, such item-specific probabilities are bound to be captured. At the same time, such
memory systems do not require as a matter of principle that all exemplars, such as the
inflectional variants of a Spanish or Georgian verb, or all of a word’s phonetic variants,
are represented by individual nodes. Instead, HTM-like lexical memory systems promise
to be fully compatible with the dynamic analogy-driven computation of morphologically
complex words and their fine phonetic detail.
Acknowledgements
I am indebted to Gonia Jarema, Gary Libben, Mark Pluymaekers, Mirjam Ernestus,
Rachel Smith and Susanne Gahl for comments and discussion.
References
Albright, A., & Hayes, B. (2003). Rules vs. analogy in English past tenses: A computa-
tional/experimental study. Cognition, 90, 119–161.
Alegre, M., & Gordon, P. (1999). Frequency effects and the representational status of
regular inflections. Journal of Memory and Language, 40, 41–61.
Aylett, M., & Turk, A. (2004). The smooth signal redundancy hypothesis: a functional
explanation for relationships between redundancy, prosodic prominence, and duration
in spontaneous speech. Language and Speech, 47, 31–56.
Baayen, R., Feldman, L., & Schreuder, R. (2006). Morphological influences on the recog-
nition of monosyllabic monomorphemic words. Journal of Memory and Language, 53,
496–512.
Baayen, R. H., Dijkstra, T., & Schreuder, R. (1997). Singulars and plurals in Dutch:
Evidence for a parallel dual route model. Journal of Memory and Language, 36, 94–
117.
Baayen, R. H., & Moscoso del Prado Martín, F. (2005). Semantic density and past-tense
formation in three Germanic languages. Language, 81, 666–698.
Baayen, R. H., & Schreuder, R. (2000). Towards a psycholinguistic computational model
for morphological parsing. Philosophical Transactions of the Royal Society (Series A:
Mathematical, Physical and Engineering Sciences), 358, 1–13.
Baayen, R. H., Schreuder, R., De Jong, N. H., & Krott, A. (2002). Dutch inflection: the
rules that prove the exception. In S. Nooteboom, F. Weerman, & F. Wijnen (Eds.),
Storage and Computation in the Language Faculty (pp. 61–92). Kluwer Academic Pub-
lishers, Dordrecht.
Bard, E., Anderson, A., Sotillo, C., Aylett, M., Doherty-Sneddon, G., & Newlands, A.
(2000). Controlling the Intelligibility of Referring Expressions in Dialogue. Journal of
Memory and Language, 42, 1–22.
Barsalou, L. W. (2003). Situated simulation in the human conceptual system. Language
and Cognitive Processes, 18, 513–562.
Bell, A., Jurafsky, D., Fosler-Lussier, E., Girand, C., Gregory, M., & Gildea, D. (2003).
Effects of disfluencies, predictability, and utterance position on word form variation in
English conversation. Journal of the Acoustical Society of America, 113, 1001–1024.
Beretta, A., Campbell, C., Carr, T., Huang, J., Schmitt, L. M., Christianson, K., & Cao,
Y. (2003). An ER-fMRI investigation of morphological inflection in German reveals that
the brain makes a distinction between regular and irregular forms. Brain and Language,
85, 67–92.
Bergen, B. K. (2004). The psychological reality of phonaesthemes. Language, 80, 290–311.
Bien, H., Levelt, W., & Baayen, R. H. (2005). Frequency effects in compound production.
Proceedings of the National Academy of Sciences of the USA, 102, 17876–17881.
Bien, H., Levelt, W., & Baayen, R. H. (2006). Frequency effects in the production of
derivations and inflections. Manuscript in preparation, Max Planck Institute for Psy-
cholinguistics, Nijmegen.
Bloch, B. (1947). English verb inflection. Language, 23, 399–418.
Bloomfield, L. (1933). Language. Allen and Unwin, London.
Bod, R. (2006). Exemplar-based syntax: How to get productivity from examples. The
Linguistic Review, 23, in press.
Bybee, J. L., & Moder, C. L. (1983). Morphological classes as natural categories. Language,
59, 251–270.
Bybee, J. L., & Slobin, D. I. (1982). Rules and schemas in the development and use of
the English past tense. Language, 58, 265–289.
Cholin, J., Schiller, N. O., & Levelt, W. J. M. (2004). The preparation of syllables in
speech production. Journal of Memory and Language, 50, 47–61.
Chomsky, N., & Halle, M. (1968). The sound pattern of English. Harper and Row, New
York.
Daelemans, W., & Van den Bosch, A. (2005). Memory-based language processing. Cam-
bridge University Press, Cambridge.
De Vaan, L., Schreuder, R., & Baayen, R. H. (2007). Regular morphologically complex
neologisms leave detectable traces in the mental lexicon. The Mental Lexicon, 2, in
press.
Dell, G., Chang, F., & Griffin, Z. (1999). Connectionist Models of Language Production:
Lexical Access and Grammatical Encoding. Cognitive Science, 23, 517–542.
Ernestus, M. (2000). Voice assimilation and segment reduction in casual Dutch. LOT,
Utrecht.
Ernestus, M., & Baayen, R. H. (2006)a. The functionality of incomplete neutralization in
Dutch. The case of past-tense formation. Laboratory Phonology, 8, 27–49.
Ernestus, M., & Baayen, R. H. (2006)b. Paradigmatic effects in auditory word recognition:
The case of alternating voice in Dutch. Language and Cognitive Processes, in press.
Ernestus, M., & Baayen, R. H. (2003). Predicting the unpredictable: Interpreting neu-
tralized segments in Dutch. Language, 79, 5–38.
Ernestus, M., Lahey, M., Verhees, F., & Baayen, R. H. (2006). Lexical frequency and
voice assimilation. Journal of the Acoustical Society of America, 120, 1040–1051.
Feldman, J. (Ed.) (2006). From molecule to metaphor. A neural theory of language. The
MIT Press, Cambridge, MA.
Fowler, C. (1988). Differential shortening of repeated content words produced in various
communicative contexts. Language and Speech, 31, 307–317.
Fowler, C., & Housum, J. (1987). Talkers’ Signalling of “New” and “Old” Words in Speech
and Listeners’ Perception and Use of the Distinction. Journal of Memory and Language,
26, 489–504.
Gahl, S. (2006). Is frequency a property of phonological forms? Evidence from spon-
taneous speech. Paper presented at the 19th Annual CUNY Conference on Human
Sentence Processing, New York City.
Gahl, S., & Garnsey, S. (2004). Knowledge of grammar, knowledge of usage: Syntactic
probabilities affect pronunciation variation. Language, 80, 748–774.
Gahl, S., & Garnsey, S. (2006). Knowledge of grammar includes knowledge of syntactic
probabilities. Language, 82, 405–410.
Gregory, M., Raymond, W., Bell, A., Fosler-Lussier, E., & Jurafsky, D. (1999). The effects
of collocational strength and contextual predictability in lexical production. CLS, 35,
151–166.
Halle, M., & Marantz, A. (1993). Distributed morphology and the pieces of inflection. In
K. Hale & S. J. Keyser (Eds.), The View from Building 20: Essays in Linguistics in
Honor of Sylvain Bromberger (pp. 111–176). Vol. 24 of Current Studies in Linguistics.
MIT Press, Cambridge, Mass.
Hawkins, J., & Blakeslee, S. (2004). On intelligence. Henry Holt and Company, New York.
Hawkins, J., & George, D. (2006). Hierarchical temporal memory. Concepts, theory and
terminology. Numenta Technology, http://www.numenta.com/technology.php.
Hawkins, S. (2003). Roles and representations of systematic fine phonetic detail in speech
understanding. Journal of Phonetics, 31, 373–405.
Hawkins, S., & Warren, P. (1994). Phonetic influences on the intelligibility of conversa-
tional speech. Journal of Phonetics, 22, 493–511.
Hay, J. (2001). Lexical frequency in morphology: Is everything relative? Linguistics, 39,
1041–1070.
Jaeger, J. J., Lockwood, A. H., Kemmerer, D. L., Van Valin, R. D., & Murphy, B. W.
(1996). A positron emission tomographic study of regular and irregular verb morphology
in English. Language, 72, 451–497.
Joanisse, M. F., & Seidenberg, M. S. (1999). Impairments in verb morphology after brain
injury: a connectionist model. Proceedings of the National Academy of Sciences, 96,
7592–7597.
Jurafsky, D., Bell, A., Gregory, M., & Raymond, W. (2001). Probabilistic relations be-
tween words: Evidence from reduction in lexical production. In J. Bybee & P. Hopper
(Eds.), Frequency and the emergence of linguistic structure (pp. 229–254). Benjamins,
Amsterdam.
Jurafsky, D., Bell, A., & Girand, C. (2002). The role of the lemma in form variation. In
C. Gussenhoven & N. Warner (Eds.), Papers in Laboratory Phonology VII (pp. 1–34).
Mouton de Gruyter, Berlin/New York.
Keating, P. A. (2006). Phonetic encoding of prosodic structure. In J. Harrington & M.
Tabain (Eds.), Speech production: Models, phonetic processes, and techniques (pp. 167–
186). Psychology Press, New York and Hove.
Kemps, R., Ernestus, M., Schreuder, R., & Baayen, R. H. (2005)a. Prosodic cues for
morphological complexity: The case of Dutch noun plurals. Memory and Cognition,
33, 430–446.
Kemps, R., Wurm, L., Ernestus, M., Schreuder, R., & Baayen, R. H. (2005)b. Prosodic
cues for morphological complexity in Dutch and English. Language and Cognitive Pro-
cesses, 20, 43–73.
Keune, K., Ernestus, M., Van Hout, R., & Baayen, R. H. (2005). Social, geographical, and
register variation in Dutch: From written ‘mogelijk’ to spoken ‘mok’. Corpus Linguistics
and Linguistic Theory, 1, 183–223.
Kostić, A., Marković, T., & Baucal, A. (2003). Inflectional morphology and word meaning:
Orthogonal or co-implicative domains? In: R. H. Baayen & R. Schreuder (Eds.),
Morphological Structure in Language Processing (pp. 1–44). Mouton de Gruyter, Berlin.
Krott, A., Baayen, R. H., & Schreuder, R. (2001). Analogy in morphology: modeling the
choice of linking morphemes in Dutch. Linguistics, 39, 51–93.
Krott, A., Hagoort, P., & Baayen, R. H. (2004). Sublexical units and supralexical
combinatorics in the processing of interfixed Dutch compounds. Language and Cognitive
Processes, 19, 453–471.
Kuperman, V., Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2006). Morphological
predictability and acoustic salience of interfixes in Dutch compounds. Submitted.
Lehiste, I. (1972). The timing of utterances and linguistic boundaries. Journal of the
Acoustical Society of America, 51, 2018–2024.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech
production. Behavioral and Brain Sciences, 22, 1–38.
Lieber, R. (1980). On the organization of the lexicon. Ph.D. thesis, MIT, Cambridge.
Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In W.
Hardcastle & A. Marchal (Eds.), Speech production and speech modeling (pp. 403–440).
Kluwer, Dordrecht.
MacWhinney, B., & Leinbach, J. (1991). Implementations are not conceptualizations:
revising the verb learning model. Cognition, 40, 121–157.
Marslen-Wilson, W. D. (1996). Function and process in spoken word recognition. In:
Attention and performance: Control of language processes. Vol. X. Lawrence Erlbaum
Associates, Hillsdale, NJ, pp. 125–150.
Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access
during word recognition in continuous speech. Cognitive Psychology, 10, 29–63.
Matthews, P. H. (1974). Morphology. An Introduction to the Theory of Word Structure.
Cambridge University Press, London.
McCarthy, J. J., & Prince, A. (1993). Generalized alignment. In G. E. Booij & J. van Marle
(Eds.), Yearbook of Morphology (pp. 79–154). Kluwer Academic Publishers, Dordrecht.
McClelland, J. L., & Patterson, K. (2002)a. Rules or connections in past-tense inflections:
what does the evidence rule out? Trends in Cognitive Sciences, 6, 465–472.
McClelland, J. L., & Patterson, K. (2002)b. ‘Words or rules’ cannot exploit the regularity
in exceptions: Reply to Pinker and Ullman. Trends in Cognitive Sciences, 6, 464–465.
Miller, G. (2006). An enterprising approach to brain science. Science, 314, 76–77.
Moscoso del Prado Martín, F., Bertram, R., Häikiö, T., Schreuder, R., & Baayen, R. H.
(2004)a. Morphological family size in a morphologically rich language: The case of
Finnish compared to Dutch and Hebrew. Journal of Experimental Psychology: Learn-
ing, Memory and Cognition, 30, 1271–1278.
Moscoso del Prado Martín, F., Kostić, A., & Baayen, R. H. (2004)b. Putting the bits together:
An information theoretical perspective on morphological processing. Cognition,
94, 1–18.
Munson, B., & Solomon, N. P. (2004). The effects of phonological neighborhood density on
vowel articulation. Journal of Speech, Language, and Hearing Research, 47, 1048–1058.
Oostdijk, N. (2002). The design of the Spoken Dutch Corpus. In P. Peters, P. Collins & A.
Smith (Eds.), New Frontiers of Corpus Research (pp. 105–112). Rodopi, Amsterdam.
Oostdijk, N., Goedertier, W., Van Eynde, F., Boves, L., Martens, J., Moortgat, M.,
& Baayen, R. H. (2002). Experiences from the Spoken Dutch Corpus Project. In M.
González Rodríguez & C. Paz Suárez Araujo (Eds.), Proceedings of the Third International
Conference on Language Resources and Evaluation, ELRA, pp. 340–347.
Patterson, K., Lambon Ralph, M., Hodges, J., & McClelland, J. (2001)a. Deficits in
irregular past-tense verb morphology associated with degraded semantic knowledge.
Neuropsychologia, 39, 709–724.
Patterson, K., Lambon Ralph, M. A., Hodges, J. R., & McClelland, J. L. (2001)b. Deficits
in irregular past-tense verb morphology associated with degraded semantic knowledge.
Neuropsychologia, 39, 709–724.
Pinker, S. (1991). Rules of language. Science, 253, 530–535.
Pinker, S. (1997). Words and rules in the human brain. Nature, 387, 547–548.
Pinker, S. (1999). Words and Rules: The Ingredients of Language. Weidenfeld and Nicol-
son, London.
Pinker, S., & Prince, A. (1988). On language and connectionism. Cognition, 28, 73–193.
Pinker, S., & Prince, A. (1994). Regular and irregular morphology and the psychological
status of rules of grammar. In S. Lima, R. Corrigan & G. Iverson (Eds.), The Reality
of Linguistic Rules (pp. 353–388). John Benjamins, Amsterdam.
Pinker, S., & Ullman, M. (2002)a. Combination and structure, not gradedness, is the
issue. Trends in Cognitive Sciences, 6, 472–474.
Pinker, S., & Ullman, M. (2002)b. The past and future of the past tense. Trends in
Cognitive Sciences, 6, 456–462.
Plug, L. (2005). From words to actions: The phonetics of ‘eigenlijk’ in two communicative
contexts. Phonetica, 62, 131–145.
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multi-
layered perceptron: implications for child language acquisition. Cognition, 38, 1–60.
Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2005)a. Articulatory planning is con-
tinuous and sensitive to informational redundancy. Phonetica, 62, 146–159.
Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2005)b. Frequency and acoustic length:
the case of derivational affixes in Dutch. Journal of the Acoustical Society of America,
118, 2561–2569.
Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2006). Effects of word frequency on
articulatory durations. Submitted.
Roelofs, A. (1996). Serial order in planning the production of successive morphemes of a
word. Journal of Memory and Language, 35, 854–876.
Roelofs, A. (1997)a. Morpheme frequency in speech production: Testing WEAVER. In G.
E. Booij & J. van Marle (Eds.), Yearbook of Morphology 1996 (pp. 135–154). Kluwer,
Dordrecht.
Roelofs, A. (1997)b. The WEAVER model of word-form encoding in speech production.
Cognition, 64, 249–284.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English
verbs. In J. L. McClelland & D. E. Rumelhart (Eds.), Parallel Distributed Processing.
Explorations in the Microstructure of Cognition. Vol. 2: Psychological and Biological
Models. (pp. 216–271). The MIT Press, Cambridge, Mass.
Scarborough, R. A. (2004). Coarticulation and the structure of the lexicon. UCLA disser-
tation.
Seidenberg, M. S., & Gonnerman, L. M. (2000). Explaining derivational morphology as
the convergence of codes. Trends in Cognitive Sciences, 4, 353–361.
Selkirk, E. (1984). Phonology and Syntax. The MIT Press, Cambridge.
Sereno, J., & Jongman, A. (1997). Processing of English inflectional morphology. Memory
and Cognition, 25, 425–437.
Sevald, A. C., & Dell, G. S. (1994). The sequential cuing effect in speech production.
Cognition, 53, 91–127.
Sevald, A. C., Dell, G. S., & Cole, J. S. (1995). Syllable Structure in Speech Production:
Are Syllables Chunks or Schemas? Journal of Memory and Language, 34, 807–820.
Shannon, C. E., & Weaver, W. (1949). The Mathematical Theory of Communication. The
University of Illinois Press, Urbana.
Skousen, R. (1989). Analogical Modeling of Language. Kluwer, Dordrecht.
Stemberger, J. P. (2004). Phonological priming and irregular past. Journal of Memory
and Language, 50, 82–95.
Stemberger, J. P., & Middleton, C. (2003). Vowel dominance and morphological process-
ing. Language and Cognitive Processes, 18, 369–404.
Tabak, W., Schreuder, R., & Baayen, R. H. (2005)a. Lexical statistics and lexical processing:
semantic density, information complexity, sex, and irregularity in Dutch. In S. Kepser
& M. Reis (Eds.), Linguistic Evidence — Empirical, Theoretical, and Computational
Perspectives (pp. 529–555). Mouton de Gruyter, Berlin.
Tabak, W., Schreuder, R., & Baayen, R. H. (2005)b. The processing of regular and irreg-
ular verbs. In Proceedings of the Interdisciplinary Workshop on the Identification and
Representation of Verb Features and Verb Classes, Saarbrücken, pp. 121–126.
Tabak, W., Schreuder, R., & Baayen, R. H. (2006). Nonderivational inflection.
Manuscript, Max Planck Institute for Psycholinguistics.
Taft, M. (1979). Recognition of affixed words and the word frequency effect. Memory and
Cognition, 7, 263–272.
Trón, V. (2006). Corpus evidence for a priming account of durational reduction. Paper
presented at the 2nd Annual Edinburgh Psycholinguistics Postgraduate Conference, 2006.
Tyler, L., Stamatakis, E., Jones, R., Bright, P., Acres, K., & Marslen-Wilson, W. (2004).
Deficits for semantics and the irregular past tense: A causal relationship? Journal of
Cognitive Neuroscience, 16, 1159–1172.
Ullman, M. (2001). The declarative/procedural model of lexicon and grammar. Journal
of Psycholinguistic Research, 30, 37–69.
Ullman, M. (2004). Contributions of memory circuits to language: the
declarative/procedural model. Cognition, 92, 231–270.
Van Buren, H., Tabak, W., Carreiras, M., & Baayen, R. H. (2006). Morphological effects
in picture naming of Spanish verbs by L1 and L2 speakers. Manuscript in preparation.
Van Haeringen, C. B. (1940). De taaie levenskracht van het sterke werkwoord. De Nieuwe
Taalgids, 34, 241–255.
Van Son, R., & Pols, L. (2003). Information Structure and Efficiency in Speech Produc-
tion. Proceedings of Eurospeech-2003. Geneva, Switzerland, 769–772.
Van Son, R., & Van Santen, J. (2005). Duration and spectral balance of intervocalic
consonants: A case for efficient communication. Speech Communication, 47, 100–123.
Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech
production. Journal of Experimental Psychology: Learning, Memory and Cognition, 28,
735–747.
Vitevitch, M. S., & Stamer, M. K. (2006). The curious case of competition in Spanish
speech production. Language and Cognitive Processes, 21, 760–770.
Wright, R. (2004). Factors of lexical competition in vowel articulation. In J. Local & R.
Ogden (Eds.), Papers in Laboratory Phonology 6 (pp. 75–87). Cambridge University
Press, Cambridge.
Wurm, L. H., & Baayen, R. H. (2006). Surface frequency effects below the threshold:
Comparing types, tasks, and modalities. Submitted.