On the emergence and resolution of tips of the tongue
and their relation to slips of the tongue
Nina Jeanette Sauer and Ulrich Schade
The paper presents a language production model based on the version of the Levelt model that
has been proposed by Roelofs from his 2005 paper onward. On the basis of that model, we argue that
slips of the tongue and word finding failures, particularly tip-of-the-tongue states (TOT
states), occur for the same reasons. This leads us to a subclassification of TOT states
analogous to the subclassification of slips of the tongue. That subclassification of TOT states
is evaluated against knowledge about the tip-of-the-tongue effect as presented in the literature.
Key words
impaired speech production, word finding failures, tip-of-the-tongue (TOT), slips of the
tongue, models of speech production
1 Introduction
Research on cognitive processes is guided by models. Models of language production in
general, and of lexical access in particular, were originally built on canny interpretations of
slips of the tongue and their distributions, e.g. by Victoria Fromkin (1973, 1980, 1988) and
Merrill Garrett (1975, 1982, 1990). Today’s models offer consistent explanations of the
occurrence of slips of the tongue and their distributions. In contrast, explanations of the
occurrence of tip-of-the-tongue states (TOT states) are still under discussion (e.g., Abrams et al.,
2003; James & Burke, 2000). Examining TOT states, we came to the conclusion that
there might be different kinds of TOT states, just as there are different kinds of slips of the
tongue. This leads to the question whether modelling the occurrence of TOT states
can be linked to the modelling of slips of the tongue, with the goal of a unified model of
production blunders.
Usually, speakers succeed in accessing the intended words quickly and correctly. If there is a
problem with lexical access, it normally shows only in pauses or hesitations, which Levelt
(1983) calls “covert repairs”. TOT states constitute a rare and special kind of word finding
problem. The speaker wants to produce a specific word but is not able to do so (the word is
on the tip of the speaker’s tongue). What characterizes a TOT state is that
the speaker has the semantic information of the word available and often also syntactic
information (Vigliocco et al., 1997). In addition, the speaker may have rudimentary
morpho-phonological information such as the word’s initial phoneme or its number of syllables (Brown
& McNeill, 1966).
In the following, we propose how the occurrence of TOT states can be modelled when linked
to the modelling of slips of the tongue. To this end, we first present a model of language
production (section 2). We then show how TOT states (and slips of the
tongue) can be explained in that model. This entails a subclassification of TOT states (section 3). In section 4,
we evaluate our modelling by discussing whether predictions that can be derived from it
are in accordance with experimental results. Section 5 presents a conclusion as well
as an outlook.
2 Modeling Language Production
The basis for our language production model is the version of the so-called Levelt model
(Levelt 1989; Levelt, Roelofs & Meyer 1999) that was proposed by Roelofs (2005,
2018). In this section, we first sketch the blueprint of that model along the lines of Levelt et
al. (1999) (section 2.1). In a second step, we turn to the differences between Levelt et al.
(1999) and Roelofs (2005, 2018) (section 2.2). In a third step, we explain how
our own model differs from Roelofs’ model (section 2.3). Throughout, we focus on
those aspects that are important for explaining slips of the tongue as well as TOT states.
2.1 The Levelt-Model
In his famous book Speaking: From Intention to Articulation (1989), Willem J.M.
Levelt presented a model of the cognitive process of language production that incorporated
most of the knowledge available at that time. He thereby created a coherent model of the whole
production process that is still the reference model for research in the field, which is
why we take it as our starting point.
The Levelt model distinguishes between three processing modules: the conceptualizer
transforms an intention into the so-called “preverbal message”, the formulator converts this
message into a syntactic structure (the surface structure) and afterwards converts that structure
into a sequence of phonemes (phonetic plan), and the articulator takes the plan and forms the
overt speech out of it.
The processes running in the conceptualizer are conceptualization processes that do
not depend on language, as is indicated by the conceptualizer’s result, the preverbal message.
Thus, when categorizing slips of the tongue and TOT states, we will mainly assign them to
processes running within the formulator. In particular, we need to focus on lexical access. For
this sub-process of language production, Levelt, Roelofs & Meyer (1999) presented a detailed
update. During lexical access, initially and as part of conceptualization, a lexical concept is
activated according to its semantic properties. Levelt et al. (1999), in contrast to Levelt
(1989), regard concepts as non-decompositional, following Roelofs (1992, 1993), and not as a
set of their features as proposed by, e.g., Miller & Johnson-Laird (1976). That lexical concept
activates a lexical entry in the formulator. This entry is twofold, consisting of a lemma and a form
(cf. Kempen & Huijbers 1983). The lemma is connected to the respective lexical concept in
the conceptualizer, and to syntactic features and its form in the formulator (see also Belke,
2013, p. 232, figure 1; Roelofs, 2018, p. 64, figure 3). During production, lemmata collect
activation received for example from lexical concepts. The lemma that collects most
activation is selected. In the Levelt model, only a selected lemma then can activate its form
(serial model). The form represents the lexical entry’s morphological and phonological
breakdown. It thus consists of corresponding morphemes, phonemes and the syllabic
structure. The latter serves sequencing and is comparable to Dell’s (1988) wordshape header
nodes or MacKay’s (1987) sequence nodes. In Figure 2, the respective node is labeled
“σ-σ-σ”; see also Levelt et al. (1999, p. 4, figure 2). Syllabic structure nodes are not to be mistaken
for syllable nodes. Following Levelt & Wheeldon (1994), Levelt et al. (1999) postulate
syllable nodes that are activated from the morpheme nodes via the phoneme nodes and that
trigger syllabic motor programs, see also Cholin, Dell & Levelt (2011). The syllable nodes
connect formulator and articulator. This is in contrast to other models, e.g. to those proposed
by Dell (1986) or MacKay (1987) which show a syllable level between the word level and the
phoneme level.
2.2 Roelofs’ Improvements
In the Levelt model, monitoring includes two loops: the internal one operates on the phonetic
plan and the external one on overt speech. Both are processed by the comprehension module
(cf. Nooteboom & Quené, 2013). In the original Levelt model, phonetic plan and overt speech
are processed by comprehension and the results then are compared to intended production
within the conceptualizer. Roelofs (2005), however, allows the monitor to compare results on
each level. We do not discuss monitoring in detail here, in particular not whether there is a
production-based monitoring module (at least in addition to the routes mentioned) or whether
such production-based monitoring operates on conflicts, as Nozari et al. (2011) proposed
and Nooteboom & Quené (2013) disputed. For these aspects, we refer to Gauvin et al. (2016).
However, since at least the detection of
phonological errors drops significantly if the external loop is blocked (Postma & Noordanus,
1996), partial monitoring of overt speech is likely. This allows us to follow Roelofs’
(2005) idea and to link cueing to overcome TOT states (cf. section 3.8) to the external loop of
monitoring. Figure 1 displays the principal architecture of the model following Roelofs
(2005, p. 46, figure 3.2). In addition, Figure 2 shows a small detail of the model’s network
with “banana” as an example of a lexical item.
Figure 1: The blueprint of the model’s architecture.
Figure 2: Part of the model’s network shown with “banana” as example for a lexical item.
2.3 Further Changes
Most language production models agree with the layers of the Levelt model, but some disagree
with its dynamics. The most widely discussed difference among language production models
results from the models’ conceptions of when and how an activated lemma can activate its
form. In contrast to the serial approach of the Levelt model, we assume that a lemma activates
its form as soon as it is active itself. Such a model is called a cascading model (Humphreys,
Riddoch & Quinlan, 1988). On the basis of evidence collected, e.g., by Peterson & Savoy (1998),
Blanken (1998), Morsella & Miozzo (2002), and Meyer & Damian (2007), cascading models
are preferred today. Cascading means that competing lemmata activate their
corresponding morphemes and phonemes in parallel. The production process in a cascading model is
faster but also more error-prone than that in a serial model.
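The contrast between the two regimes can be made concrete with a minimal sketch. The function name `form_input` and the activation values are hypothetical and chosen for illustration only; they are not parameters of the Levelt model or of our own model.

```python
def form_input(lemma_activation, cascading):
    """Return the activation that each lemma passes on to its form.

    lemma_activation: dict mapping a lemma name to its activation
                      (illustrative values, not fitted to data).
    cascading: if False, only the selected (most active) lemma feeds its
               form (serial regime); if True, every active lemma feeds its
               form in proportion to its own activation (cascading regime).
    """
    if cascading:
        return dict(lemma_activation)
    selected = max(lemma_activation, key=lemma_activation.get)
    return {lemma: (act if lemma == selected else 0.0)
            for lemma, act in lemma_activation.items()}


lemmas = {"banana": 0.8, "apple": 0.5}
serial = form_input(lemmas, cascading=False)
cascade = form_input(lemmas, cascading=True)
```

In the serial case, the form of the competitor “apple” receives nothing; in the cascading case it does, which is what makes partial form information and form-based errors possible without lemma selection.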
Dell (1986), MacKay (1987), Stemberger (1985), Berg (1988), and others argued for
so-called interactive models. These models incorporate not only cascading (a lemma that
gets activated immediately sends activation to its morphemes and phonemes) but also
feedback: activated morphemes and phonemes send activation back to their lemmata. This
facilitates the process further, which again comes at the cost of more errors. Our
model is interactive, an aspect that is relevant for cueing to overcome TOT states. The details
of our model’s dynamics are as presented by Berg and Schade (1992; Schade & Berg, 1992):
in order to support sequencing during production, our model, in contrast to the models proposed by Dell,
MacKay, and Stemberger, also includes lateral inhibition.
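A single update step of such an interactive network with lateral inhibition might be sketched as follows. This is a toy illustration under our own simplifying assumptions and not the dynamics specified by Berg and Schade (1992): the function `step`, the reduction to two layers (lemma and form), and all parameter values are hypothetical.

```python
def step(lemma, form, concept_input, spread=0.5, feedback=0.2, inhibition=0.3):
    """One synchronous update of a toy two-layer network.

    lemma, form: dicts mapping a word to the activation of its lemma node
                 and of its form node (values between 0 and 1).
    concept_input: activation each lemma receives from the conceptual level.
    spread, feedback, inhibition: assumed weights for cascading,
                 form-to-lemma feedback, and lateral inhibition.
    """
    new_lemma = {}
    for name, act in lemma.items():
        # Lateral inhibition: every rival lemma pushes this lemma down.
        rivals = sum(a for other, a in lemma.items() if other != name)
        net = (concept_input.get(name, 0.0)
               + feedback * form[name]      # feedback from the form level
               - inhibition * rivals)
        new_lemma[name] = min(1.0, max(0.0, act + net))
    # Cascading: every lemma feeds its form, whether selected or not.
    new_form = {name: min(1.0, form[name] + spread * lemma[name])
                for name in lemma}
    return new_lemma, new_form


lemma = {"banana": 0.4, "apple": 0.2}
form = {"banana": 0.0, "apple": 0.0}
new_lemma, new_form = step(lemma, form, {"banana": 0.3, "apple": 0.1})
```

With these values, the target “banana” gains activation while the competitor “apple” is pushed down by lateral inhibition, yet the form of “apple” still receives some activation through cascading.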
3 Classification
After having sketched our model of the cognitive process of language production, we can now
use that model to classify production blunders, in particular TOT states, but also slips of the
tongue. Our classification implies that TOT states, like slips of the tongue, can result from
different disturbances of the production process and should thus be differentiated. Each of
the following sections discusses one type of blunder, either a type of TOT state or a type of
slip. The type is named in the heading. With respect to modelling slips of the tongue, we
mostly adopt the principles outlined by Stemberger (1985), Dell (1986), MacKay (1987),
and Berg (1988). The sections on the modelling of slips are included in
order to underline the relation to the modelling of TOT states. The goal is a unified
model of production blunders.
3.1 Malapropisms
The term “malapropism” is not used uniformly. In his play “The Rivals” (1775), Richard
Brinsley Sheridan introduces the character Mrs. Malaprop. This lady likes to use foreign
words, but when doing so often replaces the target word with a phonologically similar foreign
word. For example, Mrs. Malaprop says: “Then Sir, she should have a supercilious
knowledge in accounts; and as she grows up, I would have her instructed in geometry, that
she might know something of the contagious countries; – but above all, Sir Anthony, she
should be mistress of orthodoxy, that she might not mis-spell, and mis-pronounce words so
shamefully as girls usually do; and likewise that she might reprehend the meaning of what she
is saying” (Act 1, Scene 2; emphasis added by the authors of this paper). We can assume that
Sheridan has Mrs. Malaprop act out of ignorance. In our model, this would mean that the
concept for, e.g., “geography” is incorrectly connected to the lemma for “geometry”.
The production process then operates correctly on that incorrect network.
Fay & Cutler (1977), however, used the term “malapropism” for those word errors in
which the target word and the error word are phonologically similar. Such slips of the tongue,
which are not caused by ignorance, we call “formal word errors” (see section 3.4).
3.2 Semantic Word Errors
Semantic word errors form a subclass of slips of the tongue. The speaker produces a word
that is semantically similar to the target word, e.g. “apple” instead of “banana”. In our model,
such errors occur under the following condition: the activation of the concepts is adequate, and
both the target lemma and the error lemma receive activation. Under normal conditions, the target lemma
would receive more activation and, hence, would be selected and produced. In the error
situation, however, the error lemma receives more activation, e.g. due to noise
in the network. In addition, the monitoring component does not catch the error
before production. As a result, the wrong word is produced.
3.3 TOTs with Semantic Blocking Word
According to our model, TOT states with semantic blocking words result from the same
constellation as semantic word errors: activation of the concepts is adequate, and both the target
lemma and the error lemma receive activation. Again, for some reason, the error lemma receives more
activation. In this case, however, monitoring catches the error and production is aborted.
The semantic word error is suppressed. Then, the production system tries to repair and starts
anew. If the restarted production is successful, only a delay, a so-called “covert
repair” (Levelt 1983), is apparent. If, however, the error lemma has been able to
stabilize its activation, it can, due to lateral inhibition, prevent the target lemma’s activation
from rising beyond its own activation. The target will then not be selected since its activation
does not surpass the activation of the blocker, and the blocking lemma is not selected due to
the monitor’s veto. As a result, a TOT state (with semantic blocking word) occurs.
In a serial model like the Levelt model, the failure of lemma selection would result in
a complete naming failure and not in a TOT state, since in such models no form nodes can be
activated without lemma selection. In cascading and in interactive models, however,
inadequate activation of the target lemma does not allow lemma selection but nevertheless
triggers activation of form nodes, including syllable structures. It also triggers the activation of
syntactic properties. This explains why some syntactic and rudimentary morpho-phonological
information might be available to a speaker in a TOT state.
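The case distinction of sections 3.2 and 3.3 can be summarized in a few lines of code. The sketch is a simplification under stated assumptions: the function name, its arguments, and the activation values are hypothetical, and monitoring and the stabilization of the error lemma are reduced to yes/no decisions.

```python
def lemma_level_outcome(target_act, rival_act, monitor_catches, rival_stabilizes):
    """Classify the outcome of lemma selection with one semantic competitor.

    target_act, rival_act: activation of the target and the error lemma
                           (illustrative values).
    monitor_catches: True if monitoring detects the error before production.
    rival_stabilizes: True if the error lemma keeps its activation after the
                      restart and blocks the target via lateral inhibition.
    """
    if target_act > rival_act:
        return "target selected"
    if not monitor_catches:
        return "semantic word error"
    if rival_stabilizes:
        return "TOT with semantic blocking word"
    return "covert repair"


outcomes = [
    lemma_level_outcome(0.6, 0.4, True, False),
    lemma_level_outcome(0.4, 0.6, False, False),
    lemma_level_outcome(0.4, 0.6, True, False),
    lemma_level_outcome(0.4, 0.6, True, True),
]
```

The same starting constellation (the error lemma is more active than the target) thus leads to a slip, a covert repair, or a TOT state, depending only on monitoring and on whether the competitor stabilizes.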
3.4 Formal Word Errors
Selection of the target lemma does not guarantee the correct production of the target word. There
is a chance that a stem morpheme different from the target’s stem morpheme collects more
activation than the target morpheme and is selected for production. Morphemes that share
phonemes with the target morpheme, and thus receive activation from those phonemes via
feedback, have the best chance of being selected wrongly. If the monitor then does
not notice the error before production, a formal word error occurs, e.g., “flamingo” instead of
“flamenco” or “professor” instead of “processor”.
3.5 TOTs with Phonological Blocking Word
According to our model, TOT states with a phonological blocking word correspond to formal
word errors as follows: the activated and selected target lemma activates its morphemes
(which in this case is normally only the stem) and phonemes, but an error stem morpheme
receives more activation than the target stem. If the monitoring fails, a formal word error
occurs, but if the monitor catches the error, production is aborted. If the error stem morpheme
due to lateral inhibition can hinder the target stem from collecting enough activation to reach
the selection threshold, a TOT state occurs. According to our model, the blocking item here is
a stem morpheme, but as this is most often a free morpheme, it can be called a phonological
blocking word.
Concepts can be expressed by different words, but a lemma has a distinct stem
morpheme. Thus, TOT states with a semantic blocking word should be much more frequent
than TOT states with a phonological blocking word. Beyond that, without a blocking stem, the
lemma should always be able to activate its stem adequately, so that we do not expect TOT
states on this level without blockers. The exception, of course, is a pathological
impairment that has weakened or even erased the connection between lemma and stem.
3.6 TOTs without Blocking Word
In cascading and interactive models, TOT states might also occur at the lemma level without
competing words blocking the target lemma. This is the case whenever the target lemma does
not receive adequate activation. In particular, it might not receive enough activation from the
conceptual representation if the connection between the lexical concept and its lemma is not
fully developed or has been disrupted at least partially and/or temporarily. Then, the lemma
collects some but not enough activation. Nevertheless, some of that inadequate activation
cascades into the form. A TOT state occurs. If, however, the collection of activation is totally
unsuccessful, e.g. if the connection between the lexical concept and its lemma is completely
erased, no TOT state occurs; instead, pathological anomia is present.
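The three outcomes described in this section depend only on how much activation reaches the lemma. A minimal sketch, assuming a multiplicative connection weight and a fixed selection threshold (both are our illustrative assumptions, as are the function name and all values):

```python
def outcome_without_blocker(concept_act, connection, threshold=0.6):
    """Classify lexical access when no competitor blocks the target lemma.

    concept_act: activation of the lexical concept.
    connection: strength of the concept-to-lemma connection (0 = erased).
    threshold: activation the lemma needs in order to be selected
               (an assumed value).
    """
    lemma_act = concept_act * connection
    if lemma_act >= threshold:
        return "word produced"
    if lemma_act > 0.0:
        # Too little for selection, but some activation cascades into the form.
        return "TOT without blocking word"
    return "anomia"


intact = outcome_without_blocker(1.0, 0.9)
weakened = outcome_without_blocker(1.0, 0.3)
erased = outcome_without_blocker(1.0, 0.0)
```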
3.7 Phonological Errors
A phonological error occurs if the lemma and the morpheme(s) of the target word have been
correctly selected for production but, during the activation of the related phonemes, a
phoneme differing from the target phoneme (the one next in line) succeeds in collecting more
activation than the target and is thus selected for production. If this is not noticed by the monitoring
component before production, the error occurs.
As mentioned above, the model-based explanation for slips of the tongue
has already been proposed by Berg, Dell, MacKay, Stemberger and others. This is in
particular true for phonological slips, for which there are no corresponding TOT states. The
model-based explanation for this kind of error is repeated here only for completeness.
3.8 Cueing
In principle, one can try to resolve another speaker’s TOT state (or word finding problem) by
offering a cue. We limit ourselves here to phonological cues and discuss as possible cues
phonemes, phoneme sequences, syllables, and words that are form-related to the target. Cues
are processed by the comprehension system. The modification of the Levelt model by Roelofs
(2005) allows activation to spill over to production on the phoneme level, the morpheme level, and
the lemma level, where it exerts cueing effects. Since our model does not assume a syllable level
between the lemma level and the phoneme level, syllable cues act as phoneme sequence cues.
All cues have an impact on the phoneme level: a cue’s phonemes that correspond to
the target’s phonemes exert a positive influence on those target phonemes (on the production
side). In an interactive production model, this effect influences the target’s
morphemes and its lemma positively via feedback (again on the production side). So, it may
help to resolve a TOT state. On the one hand, this positive effect is the bigger, the
more corresponding phonemes a cue includes. Furthermore, the positive effect is more likely to
resolve a TOT state on the morpheme level (as described in section 3.5) than a TOT state on the
lemma level (as described in sections 3.3 and 3.6). On the other hand, since there is lateral
inhibition in our model, phonemes of the cue that do not correspond to the target reduce the
cue’s positive effects.
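The two opposing influences on the phoneme level can be illustrated with a toy measure. The sketch rests on simplifying assumptions of ours: letters stand in for phonemes, cue and target are aligned from the word onset, and the two weights as well as the function name `phoneme_level_effect` are hypothetical.

```python
def phoneme_level_effect(cue, target, excitation=1.0, inhibition=0.5):
    """Net support that a cue gives to the target's phonemes (toy measure).

    Cue and target are compared position by position from the word onset.
    excitation: assumed gain for each cue segment that matches the target.
    inhibition: assumed cost for each cue segment that does not match.
    """
    effect = 0.0
    for position, segment in enumerate(cue):
        if position < len(target) and target[position] == segment:
            effect += excitation   # matching segment supports the target
        else:
            effect -= inhibition   # mismatching segment competes with it
    return effect


target = "absolutismus"
short_cue = phoneme_level_effect("ab", target)
long_cue = phoneme_level_effect("abs", target)
wrong_cue = phoneme_level_effect("sa", target)
```

Under these assumptions, a longer matching cue yields a larger positive effect than a shorter one, and a cue whose segments do not match yields a negative effect.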
If the cue is a phoneme sequence, it might be long and decisive enough that only a
few corresponding morphemes in the comprehension module become activated. If the cue’s
phoneme sequence is not decisive and activates too many morphemes, activation on the
morpheme level (comprehension side) stays low due to lateral inhibition. Activated
morphemes (if any) can activate their lemmata (comprehension side), and their activation can
also spill over to the morpheme level of the production module. If the target morpheme is
activated in this way, this helps to resolve the TOT state. Every other activated morpheme,
however, inhibits the target morpheme and works against resolution. This is true in particular
if the morpheme of a blocker is activated.
If the cue is a word, that word’s phonemes, morphemes and its lemma will be
activated on the comprehension side. If it is phonologically similar to the target, it can send
activation to the target via the phoneme level of the production module and via the
feedback on the production side. The influence on the morpheme level depends on whether
the target and the cueing word share morphemes. On the lemma level, however, the cueing word
activates its corresponding lemma on the production side, and that lemma will inhibit the target. If the
cueing word is identical to the blocker, everything is lost: the blocking word receives
additional support and stays. If, however, the blocker differs from the cueing word, which
should be the case if target and cueing word do not share the syntactic category, the cueing
lemma will inhibit both the target and the blocker. In that case, there might be a positive net
effect, all the more so as there is an excitatory effect from the phoneme level.
In sum, our model allows the following predictions: a) the chance of a cue to resolve
the TOT correlates with the number of phonemes shared by cue and target; b) a cue which is a
syllable exerts no additional effect beyond the syllable’s being a phoneme sequence; c)
phoneme sequences that are non-words are better cues than those that are words (words
different from the target, of course); d) cueing words that do not share the syntactic category with
the target are better cues than those that do. The following section compares these
predictions to experimental results.
4 Evaluation
From the selected model and the associated classification, predictions can be made for the
occurrence of TOTs and, above all, their possible resolution by presenting cues. These
predictions can be tested experimentally. In this section, we briefly show what experiments
on TOTs look like (section 4.1) and then discuss which hypotheses arise from the chosen model and which
experimental data are available (sections 4.2 and 4.3).
4.1 TOTs in Experiments
TOTs are quite rare. Burke et al. (1991) estimate their frequency at once a week for healthy adult
speakers. Accordingly, it is also difficult to elicit them in experiments. Normally,
subjects are confronted with definitions of rare terms that they are asked to name. If the
definitions are chosen with care and luck, such a naming task may evoke TOTs in 10% to
20% of the presented cases (Brown, 2012, p. 195). An example taken from Meyer & Bock
(1992, Appendix A, p. 725) is the definition “able to read and write” for the target “literate”.
After each definition, the subjects are first asked to indicate whether they know the
word (yes), do not know it (no), or whether it is on the tip of their tongue (TOT). If
they answer “yes”, they are asked to name the word. If a subject is in a TOT state, information
about the word on the tip of the tongue is asked for, e.g., information about the word class or
the initial phoneme. In some experiments, primes or cues are used to manipulate the
occurrence of TOT states and/or their resolution.
4.2 Activated Information
If there is a word on the tip of one’s tongue, some information about it is often available even
if the word itself cannot be produced. This can be syntactical and/or form information. Known
syntactical information might include the word’s syntactical category (cf., e.g., Brown &
McNeill, 1966; Burke et al., 1991), its grammatical gender, a relevant syntactic feature if the
experiments are executed with subjects speaking e.g. Italian, French, or German (cf., e.g.,
Vigliocco et al., 1997), or countability (cf., e.g., Vigliocco et al., 1999). Known form
information might include the initial phoneme (cf., e.g., Brown & McNeill, 1966; Burke et al.,
1991) or the number of syllables (cf., e.g., Brown & McNeill, 1966; Koriat & Lieblich, 1974).
With this paper, we postulate three kinds of TOT states: (a) TOT states with semantic
blocking word, (b) TOT states without blocking word, and (c) TOT states with phonological
blocking word. We further assume that states of types (a) and (b) indicate a lemma selection
problem, whereas states of type (c) indicate a form problem or, to be more precise, a problem of
stem morpheme selection. In a two-step model like the Levelt model, a failure of lemma
selection precludes the activation of the target’s form information. Thus, in a two-step model,
TOT states with form information available need to be located at the form tier (Meyer & Bock,
1992, p. 715). However, and this holds for all models of language production, if a TOT state
is located at the form tier, the target’s lemma has been selected successfully, and thus the correct
syntactic information is completely available.
Vigliocco et al. (1997) report that in those cases in which subjects later
resolved their TOT state successfully, they had been able to specify the gender correctly 84% of
the time, whereas the rate was only 53% in those cases in which subjects later resolved the
TOT state incorrectly. An incorrect resolution indicates the presence of a blocking
word, namely the one produced as the resolution, whereas a correct resolution indicates the
absence of a blocking word. Thus, we take the results from Vigliocco et al. (1997) as evidence
for the existence of at least two types of TOT states. In addition, all the cases in which the
subjects named the gender incorrectly need to be TOT states on the lemma level, so there is at
least evidence for types (a) and (b) of TOT states.
4.3 Cueing in Experiments
In order to resolve TOT states in experiments, cues are used. These cues have often been
whole words. For example, for the target “gosling” (definition: “a young goose”) Meyer &
Bock (1992) used “pelican” (semantic cue), “goblet” (phonological cue) and “beard”
(unrelated cue). Phonological cues share the first phoneme/grapheme or even the first syllable
with the target (James & Burke 2000; White & Abrams 2002; Abrams, White & Eitel 2003).
If such a word cue is read or heard, it provides, according to our model, additional activation
for the initial phonemes of the target word and thus also additional activation for the target
morphemes and the target lemma. At the morpheme and lemma level, however, the cue’s
morphemes and its lemma are activated as well, which, according to our model,
leads to competition with the target morphemes and the target lemma and thus
counteracts the cue’s function as an aid. If a cue is a word, it helps at the
phoneme level and interferes at the higher levels.
The model also suggests how these two counteracting effects can be independently
enhanced or mitigated. The interfering effect is especially strong when a blocking word
triggers the TOT and when the cue corresponds to this blocking word. However, this effect is
difficult to demonstrate experimentally because it is hard to induce a TOT with a fixed
blocking word. Fortunately, a possible blocking word generally shares the syntactic category
with the target word. Conversely, this means that word cues are better cues when the syntactic
categories of cue and target word are different. This effect was demonstrated by
Abrams and Rodriguez (2005): a word cue that shared its first syllable with the target
word facilitated TOT resolution only when the cue had a different syntactic category (e.g.
adjective cue “robust” for the noun target “rosary”). If cue and target word shared the
syntactic category, no effect was detected (e.g. noun cue “robot” for the noun target “rosary”).
Transferred to German, this means that the verb “legen” would be a better word cue for
“Legasthenie” than the noun “Leguan”. Abrams and Rodriguez’s finding shows that
phonological cueing depends on syntactic information, which in turn indicates interactivity
between the lexeme and lemma level because the syntactic category is part of the lemma.
Our model also predicts that cues that have no word status should work better than
word cues. In addition, the following holds: the longer the cue’s overlap (in phonemes)
with the target word, the bigger the cue effect. For both effects, we refer to the
results of the studies by Sauer (2015). In order to evoke TOTs, German word
definitions were presented on a computer screen, and whenever a TOT occurred, a cue was
presented visually. In the first experiment, this cue was either (a) the correct first syllable of the target word,
e.g. “Ab” (for Absolutismus/absolutism), or (b) an incorrect first syllable, e.g. “Sa”, or (c) a
neutral control consisting of three crosses (“xxx”). It turned out that the correct
syllable accelerates lexical access to the target word: the TOT state could be
resolved significantly faster than with the incorrect syllable or in the control condition. In
addition, subjects gave more accurate answers after the correct syllable had been
presented. Thus, the positive effect of phonological similarity, which had previously been shown
for word cues, also extends to syllable cues, i.e. to a sublexical component (see
also Hofferberth-Sauer & Abrams 2014).
In a second experiment of Sauer (2015), a different cue was presented in the TOT
state: either (a) the correct first syllable, i.e. the 'regular' syllable (e.g., “Ab” for
Absolutismus/absolutism), (b) the 'extended' syllable, i.e. the first syllable with a subsequent
segment of the next syllable (e.g., “Abs” for Absolutismus/absolutism, “Bum” for
Bumerang/boomerang, “Pat” for Paternoster/paternoster etc.) or (c) the control condition
(“xxx”). The cues were presented in isolation and not embedded in words in order to prevent
the cue’s activation at the lemma level that could compete with the target’s lemma. It was
demonstrated that the extended syllable both accelerated and enhanced the TOT resolution:
The TOT state could be resolved significantly faster after presentation of the extended
syllable in comparison to the regular syllable and the control condition. In addition, the
extended cue resulted in significantly more accurate TOT resolutions. From this, it can be
concluded that for a successful TOT resolution the syllable boundary might not play a special
role. The results allow the simpler explanation that it is only important to get as much
phonological information as possible.
Since in our model the syllables form the transition to articulation (as in the Levelt
model, see Figure 3) and do not, as in the models of Dell (1986) and MacKay (1987), define a
layer between the word node layer and the phoneme node layer, experimental results that
demonstrate a specific role of the syllable boundary could falsify our model.
Figure 3: Transition from the morphemes to the syllabic routines of articulation
5 Conclusion and Prospects
Slips of the tongue as well as TOT states result from something going wrong during the
process of language production. In this paper, we wanted to show how both kinds of errors
can be brought together and how they might be explained within a single model. The model we
propose for that approach includes the assumption that syllables serve to connect phoneme
representations to motor programs. Dispensing with a syllable level between the word or
morpheme level and the phoneme level, and instead using the syllable level as a gate to motor
programming, can explain the occurrence of word-sized, morpheme-sized, and phoneme-sized
slips of the tongue and the absence of syllable-sized slips. Our data on TOT states at least do
not contradict this modelling assumption.
All experiments on TOT states have the problem that TOTs cannot be invoked
reliably, in particular with respect to specific words. In principle, a term like “legacy” might
invoke a TOT state. Subjects with such a state could be given the cues “legate” (noun),
“legal” (adjective) or “legave” (pseudo word). All these cues agree with the target for the first
four phonemes and should thus carry the same positive effect on the subject. However,
according to our modelling approach, they do not carry the same negative effect. The net
effect should be biggest for the pseudo word (no competition on morpheme and no
competition on lemma level) and smallest for the noun (maximal competition on lemma level
due to sharing the syntactic category). Validating this hypothesis, however, would need
experiments in which subjects often show a TOT state with respect to “legacy”. This is hard
to implement due to the unreliable occurrence of TOT states.
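The predicted ordering of the three cues can be made explicit in a minimal sketch. It is not part of the model reported in this paper: the class `Cue`, the functions `shared_onset` and `net_effect`, and all numeric weights are hypothetical and chosen only for illustration, and letters are used as a stand-in for phonemes. The sketch assumes that each shared initial segment adds a fixed amount of facilitation, that a real word adds competition at the morpheme and lemma levels, and that sharing the target's syntactic category adds further lemma competition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    form: str            # orthographic stand-in for the phoneme string
    is_word: bool        # real words have morpheme and lemma nodes that can compete
    same_category: bool  # True if the cue shares the target's syntactic category


def shared_onset(cue: str, target: str) -> int:
    """Count the initial segments that the cue shares with the target."""
    count = 0
    for a, b in zip(cue, target):
        if a != b:
            break
        count += 1
    return count


def net_effect(cue: Cue, target: str,
               gain_per_segment: float = 1.0,  # hypothetical weight
               morpheme_cost: float = 0.5,     # hypothetical weight
               lemma_cost: float = 0.5,        # hypothetical weight
               category_cost: float = 0.5      # hypothetical weight
               ) -> float:
    """Facilitation from shared segments minus competition from the cue's own nodes."""
    benefit = gain_per_segment * shared_onset(cue.form, target)
    cost = 0.0
    if cue.is_word:
        cost += morpheme_cost + lemma_cost
        if cue.same_category:
            cost += category_cost
    return benefit - cost


cues = [
    Cue("legate", is_word=True, same_category=True),    # noun
    Cue("legal", is_word=True, same_category=False),    # adjective
    Cue("legave", is_word=False, same_category=False),  # pseudo word
]

# Order the cues from largest to smallest predicted net effect.
ranking = sorted(cues, key=lambda c: net_effect(c, "legacy"), reverse=True)
print([c.form for c in ranking])
```

With these illustrative weights, all three cues receive the same benefit (four shared segments) and differ only in cost, so the ordering is pseudo word, then adjective, then noun. Any positive cost values would yield the same ordering; only the ordering, not the magnitudes, reflects the hypothesis stated above.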
For word errors, moreover, the target word and the error word usually coincide in
syntactic category, so that a target noun is almost always replaced by another noun. This
support for the model presented here from slips of the tongue data could be strengthened by
picture-word interference experiments in the tradition of Schriefers, Meyer, and Levelt (1990).
If the subject names an object in such an experiment, a phonologically related word that is
played with a slight delay (about 150 ms after the presentation of the object) accelerates the
naming. Analogous to the cues in TOTs, a pseudoword should accelerate naming more than a
real word, and a word of a different syntactic category should accelerate it more than a word
of the same category as the target word. This variant of our prediction is easier to test than the
prediction about TOTs, as TOTs cannot be specifically generated (as mentioned above).
Modelling the process of language production offers clues on how to improve the
diagnosis and the therapy of aphasia and other language disorders. Our approach mainly
offers clues for the use of cues: for patients with anomia, e.g. amnestic aphasia, the beginning
of the sought-after word is a good cue, but another full word with that beginning is not. The
number of phonemes in the cue corresponds to its potency, which allows cues to be adjusted
to the severity of the impairment. In addition, if the patient tends to produce perseveration
errors, objects that follow each other in a naming task should not be from the same semantic
field, since that would increase the chance that the first object blocks the naming of the
second object.
We would like to thank Eva Belke, Thomas Berg, and Charlotte Bellinghausen as well as two
anonymous reviewers from Linguistische Berichte for helpful remarks and comments.
We also would like to thank our academic advisors, Prof. Dr. Helen Leuninger and Dr. Hans-
Jürgen Eikmeyer (†), for friendship and support.
Abrams, L. & Rodrigues, E. (2005). Syntactic class influences phonological priming of tip-of-
the-tongue resolution. Psychonomic Bulletin & Review, 12, 1018-1023.
Abrams, L., White, K.K. & Eitel, S.L. (2003). Isolating phonological components that
increase tip-of-the-tongue resolution. Memory & Cognition, 31, 1153-1162.
Belke, E. (2013). Long-lasting inhibitory semantic context effects on object naming are
necessarily conceptually mediated: Implications for models of lexical-semantic
encoding. Journal of Memory and Language, 69, 228–256.
Berg, T. (1988). Die Abbildung des Sprachproduktionsprozesses in einem
Aktivationsflussmodell. Tübingen: Niemeyer.
Berg, T. & Schade, U. (1992). The role of inhibition in a spreading-activation model of
language production. Part I: The psycholinguistic perspective. Journal of
Psycholinguistic Research, 21, 405-434.
Blanken, G. (1998). Lexicalization in speech production: Evidence from form-related word
substitutions in aphasia. Cognitive Neuropsychology, 15, 321-360.
Brown, A.S. (2012). The Tip of the Tongue State. New York: Psychology Press.
Brown, R. & McNeill, D. (1966). The "tip of the tongue" phenomenon. Journal of Verbal
Learning and Verbal Behavior, 5, 325-337.
Burke, D.M., MacKay, D.G., Worthley, J.S. & Wade, E. (1991). On the tip of the tongue:
What causes word finding failures in young and older adults? Journal of Memory and
Language, 30, 542-579.
Cholin, J., Dell, G.S. & Levelt, W.J.M. (2011). Planning and Articulation in Incremental
Word Production: Syllable-Frequency Effects in English. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 37, 109-122.
Dell, G.S. (1986). A spreading-activation theory of retrieval in sentence production.
Psychological Review, 93, 283-321.
---- (1988). The retrieval of phonological forms in production: Tests of predictions from a
connectionist model. Journal of Memory and Language, 27, 124-142.
Dell, G.S. & Reich, P.A. (1981). Stages in sentence production: An analysis of speech error
data. Journal of Verbal Learning and Verbal Behavior, 20, 611-629.
Fay, D. & Cutler, A. (1977). Malapropisms and the structure of the mental lexicon. Linguistic
Inquiry, 8, 505-520.
Fromkin, V.A. (Ed.) (1973). Speech errors as linguistic evidence. The Hague: Mouton.
Fromkin, V.A. (1980). Errors in linguistic performance: Slips of the tongue, ear, pen, and
hand. London: Academic Press.
Fromkin, V.A. (1988). Grammatical aspects of speech errors. In F.J. Newmeyer (Ed.),
Linguistics: The Cambridge survey, Vol. II. Linguistic theory: Extensions and
implications (pp. 117-138). Cambridge: Cambridge University Press.
Garrett, M.F. (1975). The analysis of sentence production. In G.H. Bower (Ed.), The
psychology of learning and motivation (pp. 133-177). New York: Academic Press.
Garrett, M.F. (1982). Production of speech: Observations from normal and pathological
language use. In A.W. Ellis (Ed.), Normality and pathology in cognitive functions.
London: Academic Press.
Garrett, M.F. (1990). Sentence processing. In D.N. Osherson & H. Lasnik (Eds.), An
invitation to cognitive science (Vol. 1: Language, pp. 133-175). Cambridge, MA: MIT Press.
Gauvin, H.S., de Baene, W., Brass, M. & Hartsuiker, R.J. (2016). Conflict monitoring in
speech processing: An fMRI study of error detection in speech production and
perception. Neuroimage, 126, 96-105.
Hofferberth, N.J. (2008). Das Tip-of-the-Tongue-Phänomen. Eine multiple Einzelfallstudie.
Examensarbeit, Goethe-Universität zu Frankfurt am Main.
---- (2011). The tip-of-the-tongue phenomenon: Search strategy and resolution during word
finding difficulties. In: Botinis, A. (Ed.), Proceedings of the 4th ISCA Tutorial and
Research Workshop on Experimental Linguistics. ExLing 25-27 May 2011. Paris:
ISCA, Université Paris Diderot & National and Kapodistrian University of Athens, 83-
Hofferberth-Sauer, N.J. & Abrams, L. (2014). Resolving tip-of-the-tongue states with syllable
cues. In Torrens, V. & Escobar, L. (Eds.), The processing of lexicon and
morphosyntax. Newcastle, UK: Cambridge Scholars Publishing, 43-68.
Humphreys, G.W., Riddoch, M.J. & Quinlan, P.T. (1988). Cascade processes in picture
identification. Cognitive Neuropsychology, 5, 67-103.
James, L.E. & Burke, D.M. (2000). Phonological priming effects on word retrieval and tip-
of-the-tongue experiences in young and older adults. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 26, 1378-1391.
Kempen, G. & Huijbers, P. (1983). The lexicalization process in sentence production and
naming: Indirect election of words. Cognition, 14, 185-209.
Koriat, A. & Lieblich, I. (1974). What does a person in TOT state know that a person in a
don't know state doesn't know. Memory & Cognition, 2, 647-655.
Levelt, W.J.M. (1983). Monitoring and self-repair in speech. Cognition, 14, 41-104.
---- (1989). Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Levelt, W.J.M., Roelofs, A. & Meyer, A.S. (1999). A theory of lexical access in speech
production. Behavioral and Brain Sciences, 22, 1-75.
MacKay, D.G. (1987). The organization of perception and action: A theory for language and
other cognitive skills. New York: Springer.
Meyer, A.S., & Bock, K. (1992). The tip-of-the-tongue phenomenon: Blocking or partial
activation? Memory & Cognition, 20, 715-726.
Meyer, A.S. & Damian, M.F. (2007). Activation of distractor names in the picture-picture
interference paradigm. Memory and Cognition, 35, 494-503.
Miller, G.A. & Johnson-Laird, P.N. (1976). Language and Perception. Cambridge, MA:
Belknap Press.
Morsella, E., & Miozzo, M. (2002). Evidence for a cascade model of lexical access in speech
production. Journal of Experimental Psychology: Learning, Memory, and Cognition,
28, 555-563.
Nooteboom, S.G. & Quené, H. (2013). Parallels between self-monitoring for speech errors
and identification of the misspoken segments. Journal of Memory and Language, 69,
Nozari, N., Dell, G.S. & Schwartz, M.F. (2011). Is comprehension necessary for error
detection? A conflict-based account of monitoring in speech production. Cognitive
Psychology, 63, 1-33.
Peterson, R.R., & Savoy, P. (1998). Lexical selection and phonological encoding during
language production: Evidence for cascaded processing. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 24, 539-557.
Postma, A. & Noordanus, C. (1996). Production and detection of speech errors in silent,
mouthed, noise-masked, and normal auditory feedback speech. Language and Speech,
39, 375-392.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition,
42, 107-142.
---- (2005). Spoken word planning, comprehension, and self-monitoring: Evaluation of
Weaver++. In: Hartsuiker, R.J., Bastiaanse, Y., Postma, A. & Wijnen, F.N.K. (Eds.),
Phonological Encoding and Monitoring in Normal and Pathological Speech. New
York: Psychology Press, 42-63.
---- (2018). A unified computational account of cumulative semantic, semantic blocking, and
semantic distractor effects in picture naming. Cognition, 172, 59-72.
Sauer, N.J. (2015). Das Tip-of-the-Tongue-Phänomen. Zur Rolle der Silbe beim Auflösen von
Wortfindungsstörungen. Dissertation, Goethe-Universität zu Frankfurt am Main.
---- (2016). Syllable cueing and segmental overlap effects in tip-of-the-tongue resolution. In:
Botinis, A. (Ed.), Proceedings of the 7th Tutorial and Research Workshop on
Experimental Linguistics. ExLing 01-02 July 2016. St. Petersburg: ISCA, Saint
Petersburg State University & National and Kapodistrian University of Athens, 155-
Schade, U. & Berg, T. (1992). The role of inhibition in a spreading-activation model of
language production. Part II: The simulational perspective. Journal of Psycholinguistic
Research, 21, 435-462.
Schriefers, H., Meyer, A.S. & Levelt, W.J.M. (1990). Exploring the time course of lexical
access in language production: Picture-word interference studies. Journal of Memory
and Language, 29, 86-102.
Sheridan, R.B. (1775). The Rivals. Bloomsbury Methuen Drama, 2nd Edition, 2004.
Stemberger, J.P. (1985). An interactive activation model of language production. In: Ellis,
A.W. (Ed.), Progress in the psychology of language, Vol. 1. London: Erlbaum, 143-
Vigliocco, G., Antonini, T. & Garrett, M.F. (1997). Grammatical gender is on the tip of
Italian tongues. Psychological Science, 8, 314-317.
Vigliocco, G., Vinson, D.P., Martin, R.C. & Garrett, M.F. (1999). Is "count" and "mass"
information available when the noun is not? An investigation of tip of the tongue
states and anomia. Journal of Memory and Language, 40, 534-558.
White, K.K. & Abrams, L. (2002). Does priming specific syllables during tip-of-the-tongue
states facilitate word retrieval in older adults? Psychology and Aging, 17, 226-235.