Modern Theories of Language
Vera Kempe (1) and Patricia J. Brooks (2)
(1) Abertay University, Dundee, UK
(2) City University of New York, New York, NY, USA
Synonyms
Embodied language processing; Emergentist
approaches to language; Sociocultural theories
of language; Usage-based linguistics
Definition
Modern theories of language represent efforts to
account for the evolution, acquisition, and
processing of language within an integrated
framework. Such efforts acknowledge the rela-
tionship of language to sensorimotor experience,
social interaction, and general cognitive con-
straints on information processing.
Introduction
The study of language is a diverse field that
engages the applied interests of educators and
speech pathologists as well as the academic interests
of researchers working in a wide range of disciplines
including linguistics, speech and hearing
sciences, psychology, biology, computer science,
philosophy, sociology, and anthropology.
Scholars in these different disciplines invariably
conceptualize language in different ways, viewing
it, e.g., as a biological attribute, a cultural trait, a
set of communication skills, or a normative sys-
tem of signs. This entry reviews modern theories
of language that adopt the view of human lan-
guage as a complex form of human behavior.
The review is organized in accordance with
three perspectives considered to be complemen-
tary, as opposed to mutually exclusive: an evolu-
tionary perspective that focuses on the origins of
what are seemingly unique features of human
languages; a developmental perspective that high-
lights the social-interactional contexts in which
children acquire language as well as relationships
between perception, cognition, and action in
development; and a cognitive perspective that
examines the processing mechanisms that deter-
mine how language use unfolds in real time.
Theories of Language Evolution and Change
Human languages seem to display many discon-
tinuities when contrasted with the communication
systems of other species. These include duality of
patterning, symbolic signs, vast vocabularies,
syntactic rules, and propositional structure, all of
which allow for unlimited productivity in the creation
of communicative signals whose meaning
transcends the here and now. Modern theories of
language evolution attempt to understand precursors
to these discontinuous traits and what scenarios
may have given rise to their emergence.
© Springer International Publishing AG 2016
T.K. Shackelford, V.A. Weekes-Shackelford (eds.), Encyclopedia of Evolutionary Psychological Science,
DOI 10.1007/978-3-319-16999-6_3321-1
The study of the evolution of language presents
extraordinary methodological challenges. The
comparative method involves the study of homol-
ogous traits to uncover potential common ances-
try as well as the study of analogous traits that
have emerged across different lineages to under-
stand common selection pressures. This method
has revealed strikingly few parallels between the
vocal communicative repertoires of humans and
their nearest relatives, the great apes, although
there may be greater similarities in nonverbal
forms of communication such as gesture. At the
same time, the comparative method has revealed
some striking similarities between humans and
songbirds with respect to the imitative behaviors
of juveniles and the role of social feedback in
shaping immature vocalizations.
The comparative method is fraught with diffi-
culties when trying to establish which exact traits
constitute analogues or homologues in species
that do not possess the faculty of language.
Paleo-anthropological methods are of limited use
since the fossil record does not contain clear traces
of anatomical prerequisites (i.e., vocal tract struc-
ture) or of neural adaptations associated with lan-
guage. Similarly, archeological methods rely on
inferences about the link between the notoriously
incomplete record of artifacts and the cognitive
abilities involved in their production and use.
Finally, the recent applications of molecular biol-
ogy that try to trace the emergence of language
depend on the growing but currently still limited
knowledge about the genetic underpinnings of
this unique human ability. Thus, in the absence
of “hard” evidence, the extant theories of language
evolution and change apply conjecture or
reverse engineering, or use computational model-
ing and experimental studies of specific selection
pressures that operate during biological evolution
and cultural transmission to provide proof of con-
cept for basic principles that may underlie the
emergence of language.
Evolutionary theories of language often differ
with respect to what is regarded as the
phenomenon to be explained. Drawing on de
Saussure’s distinction between language as a sys-
tem of signs (“langue”) and language as the prod-
uct of the application of knowledge about this
system (“langage” or “parole”), linguistic theory
has upheld the conceptual distinction between
competence and performance. The concept of lan-
guage performance describes the behaviors asso-
ciated with language use such as the
comprehension, production, and learning of lin-
guistic signals. These behaviors and their anatom-
ical, neural, and cognitive underpinnings,
including domain-general cognitive mechanisms
such as sensorimotor processing, working mem-
ory, planning, cognitive control, conceptual rep-
resentations, and intentionality, have recently
been termed the “Faculty of Language in the
Broad Sense” (Hauser et al. 2002), and are the
subject of study of the discipline of psycholinguis-
tics with its vast arsenal of experimental and neu-
rophysiological paradigms.
Theories of the evolution of the underpinnings
of human language have been concerned with
abilities like control over vocalization and ges-
ture, vocal and gestural learning (including imita-
tion), sharing of conceptual representations, and
intention reading, which is required for the recognition
of conspecifics’ behaviors as communicative
signals. While there is debate as to whether
language is homologous with animal vocal sig-
naling systems like bird song or whether precur-
sors of language may have initially arisen in the
gestural modality, there is consensus that many of
these abilities constitute adaptations to a variety of
selection pressures that may not be related to
language, but were likely to be related to social
organization, mate choice, and tool use, and are
the product of a gradual and continuous process of
evolution by natural selection.
In contrast, language competence has been
viewed as a cognitive capacity encompassing
knowledge of a finite set of rules that can be
used to generate an infinite number of utterances.
It is this set of rules, called a Generative
Grammar, which is seen as being at the heart of
the human faculty for language. According to the
theory of Universal Grammar, for a child to
acquire the Generative Grammar of a specific
language they must have innate knowledge of
universal constraints that limit the types of struc-
tures that occur in human languages. Studying the
psychological reality of Universal Grammar (i.e.,
the innate substrate for acquiring a Generative
Grammar) is complicated by the fact that formal
descriptions of what constitutes knowledge of
grammar have changed since the 1960s: The Stan-
dard Theory encompassed the notions of a deep
structure describing the underlying logical rela-
tionships between the parts of a sentence, and a
surface structure describing the specific manifes-
tations of how those parts are assembled based on
a set of transformation rules, the specification of
which underwent major revisions in subsequent
editions of the theory throughout the 1970s. In the
1980s, the Principles-and-Parameters Framework
(aka Government and Binding Theory) viewed
Universal Grammar as an innate set of principles
comprising phrase-structure rules that specify
hierarchical relationships (Government) and rela-
tionships of coreference (Binding) between
words, and a set of parameters, i.e., values that
define variability in language-specific manifesta-
tions of these principles, the specific values of
which are set upon receiving input during the
process of acquisition. Finally, in the 1990s, the
Minimalist Program narrowed the human faculty
for language down to a basic combinatorial oper-
ation, Merge, which hierarchically combines pairs
of elements, consisting of a head and its comple-
ment, into a superordinate unit that inherits the
properties of the head (e.g., in the phrase big dog,
the adjective big serves as the dependent of dog,
and thus modifies its meaning). Crucially, the
product of Merge can subsequently combine
with another element (either as dependent or
head) to create a new unit (e.g., the big dog),
thereby implementing the fundamental property
of recursion considered at the core of the human
“Faculty of Language in the Narrow Sense”
(Hauser et al. 2002). As theories of the evolution
of language predate the proposal of the concept of
the “Faculty of Language in the Narrow Sense,”
this entry retains the term “Universal Grammar”
when talking about the evolution of language
competence.
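The description of Merge above can be made concrete with a minimal sketch. It is an illustration only, not a formalization taken from the Minimalist Program: the class names, the category labels ("N", "Adj", "Det"), the `head_first` flag, and the choice to treat "the" as a dependent are all assumptions made for this example. The sketch shows the two properties named in the text: the merged unit inherits the category of its head, and the output of one Merge can be the input to another.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    form: str
    category: str  # illustrative label, e.g. "N" or "Adj" (assumed, not from the source)


@dataclass(frozen=True)
class Phrase:
    head: "Node"
    dependent: "Node"
    head_first: bool  # linear order only; the hierarchy lives in head/dependent

    @property
    def category(self) -> str:
        # The merged unit inherits its properties from the head.
        return self.head.category


Node = Union[Word, Phrase]


def merge(head: Node, dependent: Node, head_first: bool = True) -> Phrase:
    """Combine a head and a dependent into one superordinate unit."""
    return Phrase(head, dependent, head_first)


def linearize(node: Node) -> str:
    """Spell out the word string of a (possibly nested) unit."""
    if isinstance(node, Word):
        return node.form
    parts = [linearize(node.head), linearize(node.dependent)]
    if not node.head_first:
        parts.reverse()
    return " ".join(parts)


def depth(node: Node) -> int:
    """Number of nested Merge steps above the deepest word."""
    if isinstance(node, Word):
        return 0
    return 1 + max(depth(node.head), depth(node.dependent))


# "big dog": dog is the head, big its dependent.
big_dog = merge(Word("dog", "N"), Word("big", "Adj"), head_first=False)
# Recursion: the product of Merge is itself merged with another element.
the_big_dog = merge(big_dog, Word("the", "Det"), head_first=False)
```

Because `merge` accepts its own output as input, the same operation can be applied any number of times, which is the sense of recursion at issue in the text.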
Language as Product of Biological Adaptation
Theories that explain the evolution of Universal
Grammar in analogy with the evolution of biolog-
ical traits can be distinguished with respect to
whether evolution is assumed to have taken
place continuously or in discontinuous jumps.
Adaptationist theories propose that Universal
Grammar evolved gradually following principles
of Darwinian selection because it confers repro-
ductive benefits by supporting language acquisi-
tion. In analogy to the evolution of the visual
system, the central argument is that of complexity
of design: In order for a cognitive entity as com-
plex as Universal Grammar to evolve it must have
been fine-tuned to fit its purpose through a lengthy
process of continuous adaptation. As a result,
humans are equipped with a set of innate, a priori
constraints that enable them to solve the logical
problem of language acquisition by allowing chil-
dren to derive correct structural generalizations
from limited input. Nonadaptationist theories
argue that Universal Grammar has emerged fairly
recently (roughly 90,000–50,000 years ago) as the result of a chance
mutation. This view draws on two lines of evidence:
(a) archeological evidence for a dramatic
rise in technological advancement, as reflected in
artifacts that can be dated from 50,000 years ago
onward, and (b) evidence for a mutation in the
FOXP2 gene that occurred within the last 100,000
years. However, given the exceedingly small likelihood
of a complex set of constraints such as
Universal Grammar arising through a chance
mutation, proponents of the nonadaptationist
view have suggested that the mutation affected
only a minimal, core element of
grammar: recursion (Hauser et al. 2002).
Whether the ability to apply Merge recursively
provides a sufficient advantage for the language-
learning child to derive generalizations about
grammatical structure continues to be debated,
with some theorists providing evidence that
Merge itself could be learned from the input,
rather than existing as innate knowledge (Ninio
2014).
Language as Product of Cultural Transmission
Critics of the idea that Universal Grammar is a
biological adaptation argue that this view is
fraught with a number of fallacies: First, given human
dispersion and linguistic variability, it is unclear
how an arbitrary set of universal rules could have
evolved as an adaptation to different specific lin-
guistic environments. Secondly, it is unclear why
only innate representations of highly abstract
grammatical properties should have evolved,
rather than predispositions to learn other specific
aspects of an ambient language such as its sound
structure or vocabulary. Finally, it is unlikely that
modifications in the genotype could have caught
up with the rapidly “moving target” of language,
which is liable to change over the course of even
just a few generations.
Instead, recent theorizing about language evo-
lution views language as a cultural trait that has
evolved through cultural transmission via social
interaction. Thus, it is not the human brain that has
adapted to language but languages have adapted
to be learnable and usable by humans
(Christiansen and Chater 2016). Language is
seen as a product of cumulative cultural, rather
than biological, evolution, shaped by require-
ments to be transmissible across generations and
expressive in communication. As a result, lan-
guage is suited to the capacities of the human
perceptual-motor system; the architecture of the
cognitive mechanisms that underlie learning,
memory, and information processing; the nature
of mental representations and thought; and the
pragmatic constraints that govern human
communication.
Evidence for this view comes from agent-
based computational simulations and laboratory
experiments that manipulate various aspects of
cultural transmission of language-like systems to
determine how linguistic structure emerges at var-
ious levels. These lines of research suggest that
combinatorial structure emerges in response to the
need to transmit signals through noisy channels
while compositional structure, i.e., the consistent
linking of form-based features to dimensions of
meaning, emerges from the combined pressure of
transmitting signals through limited capacity
memory systems coupled with the need for users
to be communicatively efficient.
Coevolution of Genes and Language
Gene-culture coevolution theories explore the
feedback between genetic and cultural inheritance
mechanisms. These theories draw on ideas from
domains in which the interplay between genetic
and cultural evolution in certain populations has
been demonstrated, such as the evolution of lactose
tolerance as an adaptation to cattle farming, or of
light pigmentation as an adaptation to colder cli-
mates. The central idea is that culture creates new
environments, which then exert specific selection
pressures by influencing mechanisms that enable
learning of ambient cultural traits. This view rec-
onciles the traditional juxtaposition of innateness
and learning. With respect to language, it implies
that although the genotype determines the avail-
able learning mechanisms, their selection is
governed by the structure of language that arises
from the interaction of many individuals over
time. This selection, in turn, leads to a weakening
of the role of innate biases that shape learning
(Kirby et al. 2007). Indeed, recent modeling
work has demonstrated that strong universality
in behavior can emerge from weak biases through
the process of cultural transmission. Thus,
although the process of cultural evolution causes
languages to adapt to the human cognitive system,
the resultant structural properties of language
select for cognitive abilities that are ever better
suited to learning and processing of language.
Following similar principles, the human larynx
and the associated neural and cognitive control
of vocal abilities may also have evolved
in response to the pressure to produce more dis-
tinct sounds and words (Lieberman 2012). How-
ever, gene-culture coevolution theories that
postulate genetic adaptations to language at the
level of individual populations need to reconcile
this view with the current lack of evidence for
individual predispositions for acquiring specific
languages, as would be evident if children
adopted into different cultural and ethnic back-
grounds exhibited specific difficulties in acquiring
features of the languages of their adopted families.
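The claim that weak biases can yield strong universality through cultural transmission can be illustrated with a toy transmission chain. The sketch below is a deliberately simplified assumption-laden example, not a reimplementation of any published model: it assumes only two candidate languages ("A" and "B"), a learner with a weak prior preference for A (`prior_a`), a teacher who produces `n` utterances with noise rate `eps`, and two hypothetical learning strategies, "sample" (adopt a language with its posterior probability) and "map" (adopt the most probable language). The long-run share of A speakers is computed exactly from the two-state transition probabilities.

```python
from math import comb


def likelihood(k, n, lang, eps):
    """P(k of n utterances look like language A | teacher speaks lang)."""
    p_a = 1 - eps if lang == "A" else eps
    return comb(n, k) * p_a**k * (1 - p_a) ** (n - k)


def posterior_a(k, n, prior_a, eps):
    """Learner's posterior probability of A after k A-type utterances."""
    a = prior_a * likelihood(k, n, "A", eps)
    b = (1 - prior_a) * likelihood(k, n, "B", eps)
    return a / (a + b)


def adopt_a(k, n, prior_a, eps, strategy):
    """Probability that the learner ends up speaking A."""
    post = posterior_a(k, n, prior_a, eps)
    if strategy == "map":
        return 1.0 if post > 0.5 else 0.0  # pick the most probable language
    return post  # "sample": pick A with its posterior probability


def stationary_share_a(n, prior_a, eps, strategy):
    """Long-run share of A speakers in a chain of teacher-learner pairs."""
    to_a = {
        lang: sum(
            likelihood(k, n, lang, eps) * adopt_a(k, n, prior_a, eps, strategy)
            for k in range(n + 1)
        )
        for lang in ("A", "B")
    }
    # Stationary distribution of a two-state Markov chain.
    return to_a["B"] / (to_a["B"] + (1 - to_a["A"]))


# Illustrative parameter values (assumptions, chosen to avoid ties).
share_map = stationary_share_a(n=2, prior_a=0.6, eps=0.3, strategy="map")
share_sample = stationary_share_a(n=2, prior_a=0.6, eps=0.3, strategy="sample")
```

With these assumed values, learners who sample from the posterior reproduce the prior (a long-run share of about 0.6), whereas learners who pick the most probable language push the share of A to about 0.85, so a weak individual bias is amplified at the population level. The size of the effect depends on the chosen parameters.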
How Is Language Learned?
Acknowledging the shift away from the nativist
stance that regards language acquisition as a mat-
urational process built on an innate blueprint for a
formal generative grammar, modern theories of
language development view first and second lan-
guage learning as a process of skill acquisition
(Christiansen and Chater 2016; Ninio 2006),
wherein learners develop fluency in processing
language in real time, i.e. converting input
consisting of sequences of sounds (or signs) into
mental representations of speakers’ communicative
intentions and generating sequences of
sounds or signs to express their own communica-
tive intentions. Under this view, the fundamental
context for language acquisition involves joint
activities where speakers and listeners share com-
mon ground and can make reasonable inferences
about their partner’s communicative goals (Clark
1996; Tomasello 1999). The relationship between
language learning and the social environment is
reciprocal: the developing ability to use language
facilitates social interaction, and social interaction
constrains and guides the process of language
learning.
Social Shaping
In the context of social interaction, language
learning involves sustained engagement with par-
ents, siblings, and other caregivers. Many theories
include social learning as an integral component
of language acquisition and consider how feed-
back from caregivers supports its development.
From early infancy, caregivers and infants coor-
dinate the timing of their communicative bids
when engaged in face-to-face (dyadic) social
interaction, using eye gaze as well as vocalization
to modulate the amount and intensity of social
stimulation. Such coordination of communicative
effort fosters the development of a sense of con-
nection between caregiver and infant while build-
ing a mutual awareness of doing something
together. Recent experimental work demonstrates
how contingent social feedback serves to promote
more advanced speech-like vocalizations (i.e.,
canonical babbling) in prelinguistic infants
(Goldstein et al. 2003). In this and similar studies,
caregivers were prompted via an earpiece to
affirm their infant’s communicative attempts by
smiling, touching them, or imitating their vocali-
zations, with half of the infants receiving contin-
gent feedback (i.e., the prompts were timed to
immediately follow the infant’s vocalizations),
while the other half received noncontingent feed-
back (i.e., the prompts were yoked to the timing of
the vocalizations of a different infant from a pre-
viously recorded session). Across studies, the pro-
vision of contingent feedback was found to
increase both the quantity and quality of infant
vocalizations, such that infants who received
prompt, contingent feedback made a greater num-
ber of communicative attempts to engage their
caregivers and produced more mature, canonical
forms of babbling relative to the yoked controls.
Similarly, in longitudinal designs, individual dif-
ferences in maternal responsiveness to their
infant’s communicative bids have been shown to
predict the timing of subsequent language mile-
stones, including the emergence of first words,
50 words in expressive language, combinatorial
speech, and talk about past events (Tamis-
LeMonda et al. 2001), with maternal repetition
or imitation of the child’s speech proving to be
the most impactful form of feedback for later
developmental milestones.
Embodied Cognition as Expressed in Gesture and Speech
For infants, one of the major challenges involved
in learning a language is to discern the referents of
the words heard in the ambient language, a process
referred to as word-to-world mapping. Caregivers
play an important role in facilitating this
process by talking about what the infant already
has in mind; for instance, by naming objects that
the infant is pointing at or holding. Most early
verbs and other relational terms like up or off are
acquired in contexts where the infant is involved
in purposeful activity, which allow, for example,
the infant to map a word such as up onto their
efforts to be picked up. Modern theories of lan-
guage have come to recognize that the semantic
representations associated with linguistic forms
are multimodal in nature and reflect the sensori-
motor experiences of learners. Multimodal
representations are one feature of embodied cog-
nition wherein conceptual processing involves
simulation or mental reenactment of previously
experienced situations (Barsalou 2009). In chil-
dren, as well as adults, accessing verb meanings is
associated with neural activity in motor areas
(frontal cortex), which varies as a function of the
type of verb, e.g., “hand” verbs such as clap or
throw vs. “leg” verbs such as run or chase, with
self-generated action seeming to play a key role in
the development of these neural signatures of
sensorimotor activity during verb acquisition
(James and Swain 2011).
Notably, infants are learning the meanings of
words at the same time as they are acquiring a
wide range of communicative gestures, many of
which may be viewed as forms of simulated
action, as when the child puts their hands over
their head as a request to be picked up (Tomasello
1999). Young children’s gestures seem to convey
their thoughts when they are at the cusp of acquir-
ing a new skill (Goldin-Meadow 2009). These
gestures provide crucial information to caregivers
about what their infant has in mind (i.e., what are
the infant’s communicative intentions), which
facilitates caregivers’ provision of relevant input
to aid word-to-world mapping. Infants often
acquire corresponding gestures prior to learning
verbs like wave, sleep, or drink, with the word
subsequently mapped onto the gesture as a component
of its meaning. Gestures also seem to play
a key role in young children’s acquisition of com-
binatorial speech, as infants first combine single
words with familiar gestures (e.g., pointing at a
cookie while saying more) prior to combining two
words into a single utterance (e.g., saying more
cookie). Deictic gestures (e.g., pointing at some-
thing), in particular, may be important for the
child’s acquisition of a general combinatorial
operation like Merge, by serving as variable
expressions (i.e., with meanings corresponding
to pronouns like it or that) that change in meaning
depending on the context of use. Thus, learning to
point as a form of request at a multitude of objects
(e.g., cookies, milk, out-of-reach toys) paves the
way for the child to acquire corresponding phrases
such as “get it” which exemplify the situational
meanings and flexible utility of linguistic
expressions.
Bootstrapping in Complex Dynamical Systems
In order to process incoming information, language
learners need to rapidly incorporate this
information into existing representations.
A number of theories provide accounts for how
representations of the hierarchical structure of
language can be built up incrementally from
the input. The evidence suggests that from the
very first moments of contact with language in
utero, infants become attuned to the statistical
regularities associated with the duration of vowels
and consonants and the resulting rhythm of the
language (Mehler et al. 1988). During infancy,
sensitivity to prosodic characteristics, i.e. intona-
tional and rhythmical patterns, provides a rich
source of information that facilitates discovery
of the underlying grammatical structure of a lan-
guage because prosodic and syntactic structures
tend to be aligned to a considerable degree
(Morgan and Demuth 1996). The process of dis-
covering the sound patterns and structures of the
ambient language is aided by the fact that child-
directed speech often exaggerates phonetic and
prosodic characteristics in ways that support the
discriminability of phonemes (i.e., meaning-distinguishing
speech sounds) and discovery of higher
order units, such as words, recurrent morphemes,
and phrasal constituents. Thus, the rich input sup-
ports online bootstrapping of linguistic structure
at multiple levels. More generally, the notion of
bootstrapping implies that as children acquire
aspects of language, such as the meanings of
concrete nouns that can be learned ostensively
through a process of word-to-world mapping
(e.g., pointing at an object to elicit its name),
they can make use of familiar forms (i.e., what
they have already learned) to aid them in learning
unfamiliar and more abstract words and structures
(Gleitman et al. 2005).
The notion of bootstrapping and learning as
being supported by affordances of the environ-
ment extends beyond language input: Language
development is grounded in the infant’s sensori-
motor experience, which undergoes abrupt
transitions as infants develop motor skills over the
first year of life. Recent work has explored how
the transition from crawling to walking provides
infants with new ways of sharing their interest in
objects with others, which in turn may alter the
communicative dynamics of caregiver-child inter-
action. In a study focusing on 13-month-olds,
where half of the infants were already walking
and half were still crawling (Karasik et al. 2011),
walkers more often carried objects to their
mothers, shared their interest in objects from a
distance, and communicated their interest in shar-
ing an object while in motion. The observed dif-
ferences in how walkers and crawlers shared
objects with others had unexpected consequences
for communicative development by influencing
how the mothers perceived their infants’ communicative
intentions. Mothers responded differentially
to the moving bids (favored by the walkers)
in comparison to the stationary bids (favored by
the crawlers) by producing more advanced action
directives, such as “open it,” when the child
offered or showed them the object while in transit.
These findings suggest that developmental
changes in one domain (i.e., motor development)
can yield cascading effects on development in a
seemingly unrelated domain (i.e., language devel-
opment) by altering parent-child conversational
patterns, which, in turn, enriches the input to the
language-learning child.
Usage-Based Learning
In the face of decay, incoming linguistic informa-
tion needs to be processed rapidly by the learner.
As a result, learning is based on local contingen-
cies processed in small chunks rather than on
surveying and generalizing over large corpora of
previously encountered input. This assumption is
built into a number of theoretical frameworks that
focus on the dynamics of language usage, chunk
and pattern extraction, categorization and gener-
alization, and change over time; these include
connectionist, emergentist, and usage-based
approaches to language and development.
According to these approaches, infants and tod-
dlers learn language in a piecemeal fashion,
starting out by acquiring words and phrasal pat-
terns that are relevant to their daily routines, such
as mealtime or book reading, using mutually
understood activities and predictable communica-
tive formats to make sense of the accompanying
language. Through a process of cross-situational
statistical learning, young children home in on the
meanings of words and at the same time become
sensitive to the unique ways that individual words
co-occur with other words, including their fre-
quency of occurrence in different syntactic and
situational contexts. Acquisition of item-specific
co-occurrence statistics supports priming,
wherein the processing of linguistic structure
becomes more fluent and efficient as a function
of item and pattern familiarity. Storage of
precompiled units or chunks of language confers
further benefits by allowing the child to make
incremental predictions about what comes next
when processing language in real time. By
emphasizing statistical learning of distributional
information, which includes monitoring how
often different lexical items appear in different
grammatical constructions, these approaches pro-
vide dynamic accounts of how children go from
being fairly conservative learners to making the
sorts of constrained generalizations that allow
sophisticated language users to achieve the unlim-
ited expressive potential of human language.
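The idea of cross-situational statistical learning mentioned above can be sketched in a few lines. This is a minimal illustration under strong simplifying assumptions, not a model from the literature cited here: each situation is reduced to a list of heard words paired with a list of candidate referents (written in capitals), the toy data are invented, and the hypothetical learner simply tallies word-referent co-occurrences and picks the most frequent referent for each word.

```python
from collections import Counter, defaultdict


def learn_mappings(situations):
    """Map each word to the referent it co-occurred with most often."""
    cooccurrence = defaultdict(Counter)
    for words, referents in situations:
        for word in words:
            # Every referent present in the scene is a candidate meaning.
            cooccurrence[word].update(referents)
    return {
        word: counts.most_common(1)[0][0]
        for word, counts in cooccurrence.items()
    }


# Invented toy data: any single situation is ambiguous on its own.
situations = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"], ["DOG", "CUP"]),
]

mappings = learn_mappings(situations)
```

No single situation tells the learner which word goes with which referent, but across the three situations each word co-occurs with its own referent twice and with every other referent only once, so the tallies resolve the ambiguity. Real learners presumably also weigh social cues and prior knowledge, which this sketch leaves out.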
How Is Language Used?
Language is transient. Humans are able to com-
prehend language at a rate of about 20 phonemes
per second despite the fact that the ability to
differentiate nonlinguistic sounds is limited to
about 1.5 per second. This implies that current
input is rapidly overwritten by new input, demanding
immediate processing, a phenomenon referred to
as the Now-or-Never bottleneck (Christiansen and
Chater 2016). Similarly, the planning of action
sequences, such as those required for speech pro-
duction, is temporally constrained as planning of
long sequences far in advance is hampered by
interference and forgetting. These constraints are
reflected in a number of basic features of
processing systems, which are discussed below.
These basic features of the architecture of the
language processing system, viewed as
consisting of a series of subprocesses that operate
in a cascaded manner utilizing different types of
information in an interactive manner in order to
interpret and even predict upcoming input, are
thought to govern both language comprehension
and language production, although controversy
persists about the extent to which the two systems
share parts of their architecture and underlying
neural substrate (Pickering and Garrod 2013).
Crucially, the principles that underlie language
processing and communicative interaction may
be shared with processing architectures in a num-
ber of other domains such as visual perception,
action planning, or social cognition, rather than
being unique to language.
Levels of Processing
Theories of language processing postulate com-
ponents or stages that deal with different types of
information in the signal, such as phonological,
prosodic, lexical, morphological, syntactic,
semantic, and pragmatic information. Early theo-
ries were heavily influenced by the idea of modu-
larity of processing of different types of
information (Fodor 1983). As a consequence, lan-
guage processing was assumed to handle different
types of information in several stages assembled
in a strictly sequential and independent way, such
that information processed at later stages could
not influence processing at earlier stages. For
example, the process of utterance or sentence
comprehension was conceived of as parsing the
input to uncover the underlying syntactic struc-
ture, formalized as a phrase-structure grammar.
Parsing was viewed as incorporating the input
into a syntactic tree in incremental fashion follow-
ing a set of heuristics that minimized memory load
by limiting the number of higher-order syntactic
units that could be assembled at any given point in
time. Semantic processing was postulated to be
subsequent to the identification of syntactic struc-
ture, and semantic information could not be
admitted for resolving processing uncertainties at
earlier stages. The idea that processing proceeds
from lower to higher levels in strictly sequential
fashion was also echoed in early theories of lan-
guage production where information was propa-
gated sequentially in the reverse direction.
Incremental Cascaded Processing
More recent theories acknowledge that the rapidly
fading signal necessitates efficient conversion of
sensorimotor input into higher-level representa-
tions. This constraint is reflected in the feature of
incremental processing, whereby small chunks of
input cycle through the various processing stages
so that earlier chunks are already integrated with
higher-level information while lower-level infor-
mation is still being processed to create later
chunks. For example, studies of speech
shadowing have shown that speakers routinely
correct errors in the speech stream, which sug-
gests that spoken word recognition proceeds
extremely rapidly based on partial input allowing
higher-level lexical information to be used to
assemble articulatory commands while lower-
level phonological information is still being
processed. For larger segments of input, like utter-
ances and sentences, the idea of incremental
processing was first introduced by the two-stage
sausage machine model (Frazier and Fodor 1978),
which proposed that incoming words first get
chunked into syntactic phrases, which are then
incrementally incorporated into a syntactic repre-
sentation. Modern theories assume a massively
cascaded architecture according to which the pro-
cessor chunks the incoming input and immedi-
ately passes the information on to higher levels
of information processing while at the same time
continuing to process lower-level information in
the subsequent input (Christiansen and Chater
2016).
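The cascaded architecture described above can be illustrated with a toy pipeline. The following is a minimal sketch, not a model from the cited literature: the function names, the use of "#" as a word boundary, and the fixed two-word "phrase" size are all assumptions made purely for illustration. It shows only the ordering property of a cascade, namely that a higher level starts working on early chunks before the lower level has finished with the rest of the input.

```python
from typing import Iterable, Iterator, List


def chunk_words(segments: Iterable[str], log: List[str]) -> Iterator[str]:
    """Lower level: group incoming segments into words and pass each word
    upward as soon as its boundary marker '#' is reached (toy assumption)."""
    current: List[str] = []
    for seg in segments:
        if seg == "#":
            word = "".join(current)
            log.append(f"word:{word}")
            yield word
            current = []
        else:
            current.append(seg)
    if current:  # flush the final word, which has no trailing boundary
        word = "".join(current)
        log.append(f"word:{word}")
        yield word


def chunk_phrases(words: Iterable[str], log: List[str],
                  size: int = 2) -> Iterator[List[str]]:
    """Higher level: group words into fixed-size 'phrases'. Real phrase
    boundaries are not fixed-size; this is a hypothetical simplification."""
    phrase: List[str] = []
    for w in words:
        phrase.append(w)
        if len(phrase) == size:
            log.append(f"phrase:{' '.join(phrase)}")
            yield phrase
            phrase = []
    if phrase:
        log.append(f"phrase:{' '.join(phrase)}")
        yield phrase


log: List[str] = []
signal = "the#dog#chased#a#cat"  # characters stand in for speech segments
phrases = list(chunk_phrases(chunk_words(signal, log), log))

for event in log:
    print(event)
```

Because both stages are generators, the log interleaves word-level and phrase-level events: the first phrase is assembled before the third word has been segmented, which is the sense in which processing is cascaded rather than strictly staged.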
An incremental cascaded architecture has also
been proposed in speech production theories,
which show, based on empirical findings from
picture-word interference tasks, that higher-level
structure, e.g., the semantic and syntactic structure
of an utterance, is constructed incrementally and
on the fly. Recent evidence from syntactic priming
suggests that speech planning involves a syntactic
planning phase that is independent from semantic
content and phonological form and that tends to
reuse the structure of recently encountered
input (e.g., Pickering and Branigan 1998). The
idea of purely syntactic representations derives
from earlier theories of speech production,
which postulated separate syntactic
representations of words called lemmas
(in contrast to lexemes carrying phonological
information). In the production of continuous
speech, syntactic planning spans somewhat lon-
ger time windows than planning at lower levels,
e.g., planning of prosodic structure, which, in turn,
is planned further ahead than syllabic and phono-
logical structure, attesting to the cascaded nature
of the process. Controversy exists with respect to
the size of the preplanned components, and the
extent to which preparation of multiple compo-
nents involves parallel processing.
Interactive Processing
In comprehension, incremental chunk-by-chunk
processing runs into problems when correct inter-
pretation of current input depends on information
that only becomes available in the yet-to-be-
processed input. At the level of identifying indi-
vidual words, ambiguities can arise from distor-
tions in the acoustic signal (i.e., noisy input), from
a lack of one-to-one correspondence between the
signal and the intended word form (i.e., polysemy
and homonymy), or from competition from phonologically
related words with similar onsets (i.e.,
words in a cohort such as carbon, carton, carpenter,
etc.). Often, such uncertainties in spoken word
recognition cannot be alleviated through bottom-
up processing, thus requiring the listener to make
inferences about the intended word based on prob-
abilistic contextual cues. Empirical evidence
shows that the lexical status of the ambiguous
word as well as the coarticulatory properties of
the surrounding sounds can affect the sensitivity
of early sensory processing of speech sounds,
suggesting that the processing system is interac-
tive. In many instances of noisy input, interactive
processing includes the use of crossmodal sensory
integration (e.g., lip reading) to facilitate rapid
restoration of missing/distorted phonemes
(Massaro 1987). Moreover, higher-level informa-
tion can influence processing retrospectively; for
example, accessing lexical knowledge about a
given word can influence interpretation of ambiguous
phonological information that was encountered
within that word. For instance, a sound with
a voice onset time between /d/ and /t/ is perceived
as /t/ at the end of /pi/ because /pit/ is a word but
/pid/ is not. Such effects have successfully been
implemented in connectionist word recognition
models such as TRACE (McClelland and Elman
1986).
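The competition among cohort members mentioned above can be made concrete with a small sketch. It is an illustration only, under loud assumptions: letters stand in for phonemes, the five-word lexicon is hypothetical, and candidates are eliminated categorically, whereas models such as TRACE use graded, interactive activation.

```python
# Hypothetical mini-lexicon; spelling is used as a stand-in for phonology.
LEXICON = ["carbon", "carton", "carpenter", "candle", "pit"]


def cohort_trace(segments, lexicon=LEXICON):
    """Return, after each incoming segment, the input heard so far and the
    words that are still compatible with it (the current cohort)."""
    heard = ""
    trace = []
    for seg in segments:
        heard += seg
        candidates = sorted(w for w in lexicon if w.startswith(heard))
        trace.append((heard, candidates))
    return trace


for heard, candidates in cohort_trace("carp"):
    print(heard, candidates)
```

With this lexicon, three candidates remain after "car", and the cohort narrows to a single word only once "carp" has been heard. Until that point, bottom-up input alone cannot settle the choice, which is where contextual cues are proposed to contribute.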
Ambiguities can also arise in sentence
processing. For example, in so-called garden-
path sentences, such as The horse raced past the
barn fell or Put the frog on the napkin in the box,
listeners may be led down the garden path to form
an incorrect interpretation of the sentence. This
occurs when information early in the sentence,
e.g., the past participle raced, is misunderstood,
e.g., taken to be the main verb in the sentence as
opposed to a modifier of horse (part of a reduced
relative clause). Early sequential theories postu-
lated that the processing system commits itself
deterministically to just one interpretation using
a set of processing heuristics. According to these
theories, listeners will subsequently revise their
interpretations of sentences only if incompatible
information is encountered, with reanalysis
manifesting itself in processing cost measurable
by error rates, reaction times or electrophysiolog-
ical markers indicating processing of unexpected
information. Crucially, the process of reanalysis is
thought to be triggered by various types of infor-
mation that are independent of syntactic structure,
such as semantics, discourse structure, pragmatic
inference, and/or real-world knowledge.
Reanalysis may not be fully available to young
children who show minimal evidence of revising
incorrect interpretations of temporarily ambigu-
ous sentences, like the examples given above
(Trueswell et al. 1999).
In contrast, interactive constraint-satisfaction
theories (MacDonald et al. 1994) permit the
processing system to maintain several alternative
interpretations of a sentence in parallel, with via-
ble interpretations continuously constrained by
multiple information sources (see list above for
sources of information guiding the process of
reanalysis), thereby limiting the need for second-
pass revisions under most circumstances. Thus, a
phrase such as The witness interrogated... is
much less likely to be interpreted as a noun +
main verb than the similarly structured phrase
The horse raced... because a variety of factors,
including frequency of use, encourage the listener
to interpret the noun phrase The witness to be the
object of the verb interrogated. Such probabilistic
information may be obtained through statistical
learning of co-occurrence patterns in the input
(see subsection above on Usage-Based Learning),
although other information may reside in the con-
text of the specific communicative episode. There
remains controversy with respect to the amount of
information that can be retained in a memory
buffer before an ambiguity must be resolved and
the types of information that can be considered at
various processing stages. It has been suggested
that listeners often do not consult all available
sources of information in processing a sentence,
but rather engage in satisficing to generate “good-enough”
representations that minimize effort and
processing cost (Ferreira et al. 2002).
Top-Down Processing
As indicated, higher-level information can be
used to resolve uncertainty at lower levels of
processing either by aiding in the selection of the
correct interpretation among the various alterna-
tives activated by bottom-up processing, or by
selectively preactivating just one compatible
interpretation to the exclusion of others, even
though such top-down processing could lead to
misinterpretation of upcoming input. Controversy
has arisen about whether context-based facilita-
tion of bottom-up processing reflects top-down
processing, or whether it just arises from priming,
i.e., from the spreading of lingering, yet rapidly
decaying, activation of already-processed input at
lower levels of representation. This controversy is
of greater relevance in sentence processing than in
word processing, where top-down processing has
been demonstrated by predictive eye movements
in the visual-world paradigm, for example, when
participants perform eye movements toward
objects predicted by the previous context
(Altmann and Kamide 1999). In contrast, in stud-
ies of sentence processing there is less agreement
on whether high-level information can influence
bottom-up processing of lower-level information
directly, or whether it merely aids in selection of
different alternatives constructed by lower-level
processing. One reason for the difference between
word and sentence processing lies in the length of
the time windows necessary for processing rele-
vant information. Whereas top-down influences
on word recognition occur within a relatively
short time window (not overly taxing on memory
and processing resources), top-down influences
on sentence processing require a much longer
time window for morphological, syntactic, and
semantic processing. However, recent neurophys-
iological evidence supports the idea of top-down
effects in sentence processing: differences in
low-frequency oscillatory neural activity indicate
reciprocal entrainment of cortical networks asso-
ciated with lower and higher-level information
processing prior to the point at which disambiguating
input is encountered (Lewis and Bastiaansen
2015).
Predictive Processing
By definition, introducing top-down influences
into a cascaded processing system makes
processing predictive. Prediction safeguards the
system against loss of information in the input
due to noise in the transmission channel and in
neural processing. The exact nature of how pre-
diction operates is currently the subject of consid-
erable debate. In its minimal sense, prediction
implies that contextual effects from other types
of information can influence the state of the
processing system before further bottom-up
processing has taken place. Theories differ with
respect to whether prediction operates in a deter-
ministic or probabilistic manner. As described
above, early theories were deterministic in that
they proposed top-down processing to favor one
possible interpretation, typically the strongest
contender of all possible interpretations, which
was either the simplest in terms of memory load
or the most frequent. In cases of mismatch
between the predicted and incoming input (as in
the case of garden-path sentences), the processing
system had to reanalyze the input to identify the
alternative with the greatest plausibility. Recent
evidence, however, favors graded prediction
effects, which are determined by the surprisal of
a given continuation, i.e., by the amount of new
information that would potentially be gained by
processing this particular alternative: the higher
the surprisal, the higher the processing cost. Thus,
recent theories propose that several alternatives,
each weighted by a certain strength of belief, can
predictively be computed and remain active in the
processing system until a resolution has been
achieved. The process of resolution is proposed
to follow an “ideal” or “rational observer” model,
which engages in incremental belief updating
using Bayesian inference to change an existing
prior distribution of probabilities for the various
interpretation alternatives to a new posterior prob-
ability distribution, which, in turn, becomes the
prior distribution for the next processing cycle.
Modern theories are trying to illuminate how peo-
ple trade off the benefits and costs associated with
predictive preactivation in different domains of
language processing.
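The two quantitative ideas in this subsection, surprisal and incremental Bayesian belief updating, can be sketched in a few lines. Surprisal is computed here with the standard information-theoretic definition, the negative log probability of a continuation given its context. All probabilities below are invented for illustration and are not estimates from any corpus or experiment; the interpretation labels and function names are likewise hypothetical.

```python
import math


def surprisal(p):
    """Surprisal in bits: -log2 P(continuation | context)."""
    return -math.log2(p)


def bayes_update(prior, likelihood):
    """One belief-updating cycle: posterior is proportional to
    prior times likelihood, renormalized over all interpretations."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}


# Assumed beliefs after hearing "The horse raced past the barn ..."
prior = {"main_verb": 0.9, "reduced_relative": 0.1}

# Assumed probability of the next word "fell" under each interpretation
likelihood_fell = {"main_verb": 0.01, "reduced_relative": 0.5}

# Probability of "fell" averaged over the current beliefs, and its surprisal
p_fell = sum(prior[h] * likelihood_fell[h] for h in prior)
print(f"surprisal of 'fell': {surprisal(p_fell):.2f} bits")

# The posterior becomes the prior for the next processing cycle
posterior = bayes_update(prior, likelihood_fell)
print(posterior)
```

Under these assumed numbers, the unexpected word is costly (high surprisal) and shifts most of the belief to the reduced-relative interpretation without any separate reanalysis step. This is how graded prediction differs from the deterministic accounts described earlier.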
Communicative Interaction
Traditional theories of language processing were
developed to account for empirical findings of
laboratory studies involving single speakers or
listeners subjected to sophisticated manipulations
of language input in the absence of a conversa-
tional partner. Recent theorizing has increasingly
shifted the focus toward understanding how
humans use language to engage cooperatively
with others (Clark 1996) and how the production
and comprehension systems of interlocutors inter-
act when taking turns in conversation (Pickering
and Garrod 2013). Modern theories of referential
communication attempt to apply principles of
interactive processing and Bayesian prediction to
explore how speaker intent can be encoded and
recovered efficiently in the face of pragmatic con-
straints on the amount of information conveyed in
speech (Frank and Goodman 2012). Processing
principles derived from comprehension and pro-
duction research can also be applied to model the
communicative interaction itself by demonstrat-
ing how mechanisms like priming, inference, and
monitoring of output lead to interactive alignment
of linguistic signals at various levels of information
processing, with the ultimate goal of achieving
an alignment between interlocutors’ internal
representations. Here, controversies persist
around the question of “audience
design”: under what conditions humans engage
in strategic and volitional attempts to align the
form of their output and adjust its informativeness
in relation to the expectations, needs, and assumed
mental models of different conversational part-
ners. Furthermore, the question as to how lan-
guage is embedded in, and governed by,
communicative behavior in general has led to pro-
posals about the evolutionary primacy of turn-
taking behavior and its ontological independence
from language (Levinson 2016).
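The Bayesian treatment of referential communication mentioned above can be sketched as a recursion between a listener and a speaker. The code below is a simplified illustration in the spirit of Frank and Goodman (2012), not their implementation: the three-object scene, the uniform prior over referents, and all function names are assumptions introduced here.

```python
# Hypothetical scene: each object is described by the words true of it.
OBJECTS = {
    "blue_square": {"blue", "square"},
    "blue_circle": {"blue", "circle"},
    "green_square": {"green", "square"},
}
WORDS = ["blue", "green", "square", "circle"]


def normalize(scores):
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


def literal_listener(word):
    """Spread belief evenly over all objects the word is true of."""
    return normalize({o: 1.0 if word in feats else 0.0
                      for o, feats in OBJECTS.items()})


def speaker(obj):
    """Prefer words in proportion to how well they single out obj
    for a literal listener (more informative words are favored)."""
    return normalize({w: literal_listener(w)[obj] for w in WORDS})


def pragmatic_listener(word, prior=None):
    """Infer the intended referent by Bayes' rule:
    P(obj | word) is proportional to P(word | obj) * P(obj)."""
    if prior is None:
        prior = {o: 1.0 / len(OBJECTS) for o in OBJECTS}  # assumed uniform
    return normalize({o: prior[o] * speaker(o)[word] for o in OBJECTS})


print(pragmatic_listener("blue"))
```

In this scene, a listener hearing "blue" favors the blue square over the blue circle, on the reasoning that a speaker who meant the circle would more likely have said "circle". The example shows how limited information in the signal can still support recovery of speaker intent.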
Conclusion
Modern theories of language try to understand
how language use arises from domain-general,
embodied learning and processing mechanisms
operating in service of communication that takes
place in richly structured social environments.
Research has begun to move away from the
study of language as the behavior of individuals
to consider how it arises as a cooperative enter-
prise within social groups ranging in size from
dyads to global communities. Contemporary the-
ories of language are informed by powerful math-
ematical and computational models, big-data
approaches to the analysis of natural language
corpora, neurophysiological methods, and tradi-
tional laboratory experiments, designed to illumi-
nate social as well as cognitive mechanisms
underpinning language learning and processing.
These complementary methodologies have the
potential to offer new insights for understanding
language as an emergent property of the biologi-
cal substrate of individuals engaged in complex,
hierarchical social interactions.
Cross-References
▶Communication
▶Communication and Developmental
Milestones
▶Early Theories of Language
▶Language
▶Language Acquisition
▶Language Acquisition in Infants and Toddlers
▶Language Development
▶Language Instinct, The
▶Linguistic Evolution
▶Modeling Language Transmission
▶Motherese
▶Nativism
▶Noam Chomsky and Linguistics
▶Phonemes and Symbols
▶Physiology of Language
▶Pinker’s (1994) The Language Instinct
▶Steven Pinker and Language Development
▶Words and Rules
References
Altmann, G. T., & Kamide, Y. (1999). Incremental inter-
pretation at verbs: Restricting the domain of subsequent
reference. Cognition, 73(3), 247–264.
Barsalou, L. W. (2009). Simulation, situated conceptuali-
zation, and prediction. Philosophical Transactions of
the Royal Society of London B: Biological Sciences,
364(1521), 1281–1289.
Christiansen, M. H., & Chater, N. (2016). Creating lan-
guage: Integrating evolution, acquisition, and
processing. Cambridge, MA: The MIT Press.
Clark, H. H. (1996). Using language. Cambridge, UK:
Cambridge University Press.
Ferreira, F., Bailey, K. G., & Ferraro, V. (2002). Good-
enough representations in language comprehension.
Current Directions in Psychological Science, 11(1),
11–15.
Fodor, J. A. (1983). The modularity of mind: An essay on
faculty psychology. Cambridge, MA: The MIT Press.
Frank, M. C., & Goodman, N. D. (2012). Predicting prag-
matic reasoning in language games. Science,
336(6084), 998.
Frazier, L., & Fodor, J. D. (1978). The sausage machine:
A new two-stage parsing model. Cognition, 6(4),
291–325.
Gleitman, L. R., Cassidy, K., Nappa, R., Papafragou, A., &
Trueswell, J. C. (2005). Hard words. Language Learn-
ing and Development, 1(1), 23–64.
Goldin-Meadow, S. (2009). How gesture promotes learn-
ing throughout childhood. Child Development Per-
spectives, 3(2), 106–111.
Goldstein, M. H., King, A. P., & West, M. J. (2003). Social
interaction shapes babbling: Testing parallels between
birdsong and speech. Proceedings of the National
Academy of Sciences, 100(13), 8030–8035.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The
faculty of language: What is it, who has it, and how did
it evolve? Science, 298(5598), 1569–1579.
James, K. H., & Swain, S. N. (2011). Only self-generated
actions create sensori-motor systems in the developing
brain. Developmental Science, 14(4), 673–678.
Karasik, L. B., Tamis-LeMonda, C. S., & Adolph, K. E.
(2011). Transition from crawling to walking and
infants’ actions with objects and people. Child Development,
82(4), 1199–1209.
Kirby, S., Dowman, M., & Griffiths, T. L. (2007). Innate-
ness and culture in the evolution of language. Proceed-
ings of the National Academy of Sciences, 104(12),
5241–5245.
Levinson, S. C. (2016). Turn-taking in human communi-
cation: Origins and implications for language
processing. Trends in Cognitive Sciences, 20(1), 6–14.
Lewis, A. G., & Bastiaansen, M. (2015). A predictive
coding framework for rapid neural dynamics during
sentence-level language comprehension. Cortex, 68,
155–168.
Lieberman, P. (2012). Vocal tract anatomy and the neural
bases of talking. Journal of Phonetics, 40(4), 608–622.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S.
(1994). The lexical nature of syntactic ambiguity reso-
lution. Psychological Review, 101(4), 676–703.
Massaro, D. W. (1987). Speech perception by ear and eye:
A paradigm for psychological inquiry. Mahwah:
Erlbaum.
McClelland, J. L., & Elman, J. L. (1986). The TRACE
model of speech perception. Cognitive Psychology,
18(1), 1–86.
Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N.,
Bertoncini, J., & Amiel-Tison, C. (1988). A precursor
of language acquisition in young infants. Cognition,
29(2), 143–178.
Morgan, J. L., & Demuth, K. (Eds.). (1996). Signal to
syntax: Bootstrapping from speech to grammar in
early acquisition. Mahwah: Erlbaum.
Ninio, A. (2006). Language and the learning curve: A new
theory of syntactic development. Oxford: Oxford Uni-
versity Press.
Ninio, A. (2014). Learning a generative syntax from trans-
parent syntactic atoms in the linguistic input. Journal of
Child Language, 41, 1249–1275.
Pickering, M. J., & Branigan, H. P. (1998). The represen-
tation of verbs: Evidence from syntactic priming in
language production. Journal of Memory and Lan-
guage, 39(4), 633–651.
Pickering, M. J., & Garrod, S. (2013). An integrated theory
of language production and comprehension. Behav-
ioral and Brain Sciences, 36(4), 329–347.
Tamis-LeMonda, C. S., Bornstein, M. H., & Baumwell,
L. (2001). Maternal responsiveness and children’s
achievement of language milestones. Child Develop-
ment, 72(3), 748–767.
Tomasello, M. (1999). Cultural origins of human cogni-
tion. Cambridge, MA: Harvard University Press.
Trueswell, J. C., Sekerina, I., Hill, N. M., & Logrip, M. L.
(1999). The kindergarten-path effect: Studying on-line
sentence processing in young children. Cognition,
73(2), 89–134.