Neural inhibition during speech planning contributes to contrastive hyperarticulation
Michael C. Stern & Jason A. Shaw, Yale University
Contrastive hyperarticulation (CH)
Conditions (Figure 1 legend):
•No minimal pair in lexicon, no minimal pair in context (e.g. “pup”, cf. *“bup”)
•Minimal pair in lexicon, no minimal pair in context (e.g. “pun”, “bun” not on screen)
•Minimal pair in lexicon, minimal pair in context (e.g. “pun”, “bun” on screen)
•A word with a phonological minimal pair is hyperarticulated away from that minimal pair.
•For example, voice onset time (VOT) of a voiceless stop is lengthened away from a voiced minimal
pair competitor, especially when the competitor is contextually salient (Baese-Berk & Goldrick, 2009):
Claim: CH arises from partial activation of minimal pair competitor during speech planning
(Baese-Berk & Goldrick, 2009).
Based on interactive activation
models of speech planning (Dell,
1986; Dell et al., 1999, 2021):
Why would competitor activation cause hyperarticulation, rather than hypoarticulation, as
has been observed in speech errors (Alderete et al., 2021)?
CH from neural inhibition
Dynamic Field Theory (DFT: Schöner et al., 2016)
Dynamic neural field (DNF) model of VOT planning
Experiment: Generalizing to pseudowords
•Cognitive representations are continuous parameters
governed by populations of neurons.
•The distribution of activation across a neural population is
represented by a dynamic neural field (DNF).
•DNFs evolve over time under influence of inputs (e.g.,
percepts or movement plans) which can interact.
•Activation peaks drive behavior, perception, and cognition.
Selective inhibition (Houghton & Tipper, 1994)
Inhibition of some portion of feature space can derive dissimilation of movement targets away from
distractors in hand and eye movements (Tipper et al., 2000) and in speech articulation (Tilsen, 2009, 2013).
•Derive CH from selective inhibition of competitor phonological category in DNF model of VOT planning
•Test predictions of model with speech production experiment
Model summary
Figure 4. Voiced input s1(x,t) (left) and voiceless input s2(x,t) (right).
Figure 5. Left: example of DNF evolution with no inhibition (left), […] (middle), and strong inhibition. Right: voiceless VOT target by input amplitude of voiced input.
Simulation results (Figure 5)
•Inhibition of voiced category causes hyperarticulated voiceless VOT.
•Excitation of voiced category causes hypoarticulated voiceless VOT,
as observed in speech errors (Stern et al., 2022).
•Magnitude of inhibition correlates with magnitude of CH.
•Prediction: Reducing influence of voiced competitor should reduce
magnitude of CH.
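The qualitative pattern in these simulation results can be illustrated with a minimal Python sketch of a one-dimensional field over VOT. This is not the authors' implementation. Only the input parameters listed on the poster (a = 6, w = 30, p = 19 ms, p = 69 ms) come from the source, and assigning p = 19 ms to the voiced input and p = 69 ms to the voiceless input is an inference. The Gaussian input shape, time constant, resting level, sigmoid steepness, interaction kernel, and the omission of noise (q = 0) are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_input(x, a, p, w):
    """Gaussian input of amplitude a centred on p with width w (assumed form)."""
    return a * np.exp(-(x - p) ** 2 / (2 * w ** 2))


def simulate_vot(a_voiced, a_voiceless=6.0, p_voiced=19.0, p_voiceless=69.0,
                 w=30.0, tau=20.0, h=-5.0, c_exc=0.4, sigma_exc=5.0,
                 c_inh=0.02, beta=4.0, dt=1.0, t_max=500.0):
    """Return the VOT (ms) at the field maximum after t_max ms.

    a_voiced < 0 models inhibition of the voiced competitor category,
    a_voiced > 0 models excitation. All values other than a, w and p
    are hypothetical defaults, not taken from the poster.
    """
    x = np.arange(0.0, 150.5, 0.5)          # VOT axis in ms
    dx = x[1] - x[0]
    s = (gaussian_input(x, a_voiced, p_voiced, w)
         + gaussian_input(x, a_voiceless, p_voiceless, w))

    # Assumed kernel: local Gaussian excitation plus global inhibition.
    d = np.arange(-3 * sigma_exc, 3 * sigma_exc + dx, dx)
    k_exc = c_exc * np.exp(-d ** 2 / (2 * sigma_exc ** 2))

    u = np.full_like(x, h)                  # field starts at resting level
    for _ in range(int(t_max / dt)):
        g = 1.0 / (1.0 + np.exp(-beta * u))  # sigmoid output nonlinearity
        interaction = (np.convolve(g, k_exc, mode="same") * dx
                       - c_inh * g.sum() * dx)
        # Forward-Euler step of the noise-free field equation.
        u += dt / tau * (-u + h + s + interaction)
    return x[np.argmax(u)]


if __name__ == "__main__":
    for a1 in (-2.0, 0.0, 2.0):
        print(f"voiced input amplitude {a1:+.1f}: VOT target = {simulate_vot(a1):.1f} ms")
```

Under these assumed settings, a negative voiced amplitude should place the VOT target above the no-competitor baseline and a positive amplitude should place it below, mirroring the hyperarticulation and hypoarticulation bullets above.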
Purpose: Reduce influence of competitor by examining pseudowords (with no activation from lexical level of planning).
Method: Dyadic speech production task (N = 24) with pseudowords beginning with voiceless stops, varying in:
(1) whether there is a minimal pair in the lexicon and (2) whether the minimal pair is contextually salient (see Figure 6)
Figure 6. Left: example trials from each condition; target word is bolded and
underlined. Right: mean VOT by condition; error bars indicate standard error.
Experiment results (Figure 6)
•Significant interaction: CH observed, but only when
the minimal pair was a real word and presented as
•CH magnitude ≈ 2 ms (cf. 5–10 ms with real words;
Baese-Berk & Goldrick, 2009)
Stimuli were controlled for phonotactic probability
and phonological neighborhood measures.
Summary: Scaling a single parameter (competitor input amplitude) derives empirically observed range of phonetic effects:
→ Hyperarticulation: real words (context) > real words (no context) > pseudowords (Baese-Berk & Goldrick, 2009; this study)
→ Hypoarticulation: “trace effects” in speech errors (Alderete et al., 2021; Stern et al., 2022)
Future empirical work: DNF model incorporates time → test predicted relationship between response time and VOT
Open theoretical question: How to derive variation in competitor input amplitude from lexical-phonological coupling?
A speculative proposal: Input amplitude is a function of lexical activation, field state, and field position of lexical input.
Negative input amplitudes (inhibition) derive from large distances between field state and field position of lexical input.
→ Inhibition kicks in when it is needed most, contributing to the preservation of phonological contrast.
\[ s(x,t) = a \exp\left[-\frac{(x-p)^2}{2w^2}\right] \]
\[ \tau \dot{u}(x,t) = -u(x,t) + h + s_1(x,t) + s_2(x,t) + \int k(x-x')\, g\big(u(x',t)\big)\, dx' + q\,\xi(x,t) \]
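A rough way to see why an inhibitory competitor input lengthens VOT is to look only at the summed input, before lateral interaction shapes the peak. The following first-order sketch is not a result reported on the poster. It assumes Gaussian inputs of equal width w and a voiced amplitude a_1 that is small relative to the voiceless amplitude a_2. The symbols S, a_1, a_2, p_1, p_2, \Delta and \delta are introduced here for illustration, and a_1 = -2 is a hypothetical value.

```latex
\begin{align*}
S(x) &= a_2 \exp\!\left[-\frac{(x-p_2)^2}{2w^2}\right]
      + a_1 \exp\!\left[-\frac{(x-p_1)^2}{2w^2}\right],
      \qquad \Delta = p_2 - p_1 \\
S'(p_2+\delta) &\approx -\frac{a_2\,\delta}{w^2}
      - \frac{a_1\,\Delta}{w^2}\exp\!\left[-\frac{\Delta^2}{2w^2}\right] = 0 \\
\delta &\approx -\frac{a_1}{a_2}\,\Delta\,\exp\!\left[-\frac{\Delta^2}{2w^2}\right] \\
\text{With } a_2 = 6,\ w = 30,\ \Delta = 69 - 19 = 50,\ a_1 = -2:\quad
\delta &\approx \tfrac{2}{6}\cdot 50 \cdot e^{-2500/1800} \approx 4.2\ \text{ms}
\end{align*}
```

The sign of \delta is opposite to the sign of a_1, so in this approximation inhibition (a_1 < 0) shifts the input maximum toward longer VOT and excitation (a_1 > 0) shifts it toward shorter VOT, with the shift scaling linearly in a_1.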
12th International Workshop on Language Production, Pittsburgh, PA, June 2022
Figure 4 panel label: p = 19 ms
Figure 1. CH in voiceless stop consonants.
(Adapted from Figure 3, Baese-Berk & Goldrick, 2009:24)
Figure 3. Example DNF over time.
(Figure 2.12, Schöner et al., 2016:50)
Alderete, J., Baese-Berk, M., Leung, K., & Goldrick, M. (2021). Cascading activation in phonological planning and articulation: Evidence from spontaneous speech errors. Cognition, 210.
Baese-Berk, M., & Goldrick, M. (2009). Mechanisms of interaction in speech production. Language and Cognitive Processes, 24(4), 527–554.
Dell, G. S. (1986). A Spreading-Activation Theory of Retrieval in Sentence Production. Psychological Review, 93(3), 283–321.
Dell, G. S., Chang, F., & Griffin, Z. M. (1999). Connectionist models of language production: Lexical access and grammatical encoding. Cognitive Science, 23(4), 517–542.
Dell, G. S., Kelley, A. C., Hwang, S., & Bian, Y. (2021). The adaptable speaker: A theory of implicit learning in language production. Psychological Review, 128(3), 446–487.
Houghton, G., & Tipper, S. P. (1994). A model of inhibitory mechanisms in selective attention. In D. Dagenbach & T. Carr (Eds.), Inhibitory Processes of Attention, Memory and Language (pp. 53–112). Academic Press, Inc.
Schöner, G., Spencer, J., & the DFT Research Group. (2016). Dynamic Thinking: A Primer on Dynamic Field Theory. Oxford University Press.
Stern, M. C., Chaturvedi, M., & Shaw, J. A. (2022). A dynamic neural field model of phonetic trace effects in speech errors. Proceedings of the Annual Meeting of the Cognitive Science Society.
Tilsen, S. (2009). Subphonemic and cross-phonemic priming in vowel shadowing: Evidence for the involvement of exemplars in production. Journal of Phonetics, 37(3), 276–296.
Tilsen, S. (2013). Inhibitory mechanisms in speech planning maintain and maximize contrast. In A. Yu (Ed.), Origins of Sound Change: Approaches to Phonologization (pp. 112–127). Oxford University Press.
Tipper, S. P., Howard, L. A., & Houghton, G. (2000). Behavioral consequences of selection from neural population codes. Attention and
Many thanks to Marisa Norzagaray, Kevin Roon, the experiment participants,
and members of the Yale Phonologroup, Yale Phonetics Lab, and Yale
Figure 2. CH from interactive activation model of speech planning.
Figure 4 panel label: p = 69 ms
Best-fitting model: VOT ~ minimal_pair_in_lexicon * minimal_pair_on_screen +
phonotactic_probability + speech_rate + trial_number + place_of_articulation +
(1|subject) + (1|item)
Figure 4 panel labels: a = 6; w = 30 (shown twice)