A Conceptual Framework for the Study of Demonstrative Reference
David Peeters*, Emiel Krahmer, Alfons Maes
Tilburg University, Department of Communication and Cognition, TiCC, Tilburg, the
Netherlands
Author Note: This work was supported by a Veni grant awarded to the first author by De
Nederlandse organisatie voor Wetenschappelijk Onderzoek (NWO, the Dutch Research
Council).
Version 28 April 2020 (PsyArXiv)
*Corresponding Author:
David Peeters, PhD
Department of Communication and Cognition
Tilburg University
P.O. Box 90153
NL-5000 LE, Tilburg, The Netherlands
phone: +31 (0)13 4663584
email: d.g.t.peeters@uvt.nl
Abstract
Language allows us to efficiently communicate about the things in the world around us.
Seemingly simple words like this and that are a cornerstone of our capability to refer, as they
contribute to guiding the attention of our addressee to the specific entity we are talking about.
Such demonstratives are acquired early in life, ubiquitous in everyday talk, often closely tied
to our gestural communicative abilities, and present in all spoken languages of the world. Based
on a review of recent experimental work, we here introduce a new conceptual framework of
demonstrative reference. In the context of this framework, we argue that several physical,
psychological, and referent-intrinsic factors dynamically interact to influence whether a
speaker will use one demonstrative form (e.g., this) or another (e.g., that) in a given setting.
However, the relative influence of these factors themselves is argued to be a function of the
cultural language setting at hand, the theory-of-mind capacities of the speaker, and the
affordances of the specific context in which the speech event takes place. It is demonstrated
that the framework has the potential to reconcile findings in the literature that previously
seemed irreconcilable. We show that the framework may to a large extent generalize to
instances of endophoric reference (e.g., anaphora) and speculate that it may also describe the
specific form and kinematics a speaker’s pointing gesture takes. Testable predictions and novel
research questions derived from the framework are presented and discussed.
Keywords: referential communication; demonstratives; pointing; multimodal communication
1. Introduction: Demonstrative Reference as a Joint Action
Although the capacity to communicate about entities beyond the here-and-now is a powerful
design feature of human language (Hockett, 1960), we nevertheless also often talk about the
things in our immediate surroundings. In everyday conversations, speakers indeed naturally
exploit the communicative potential of words, gestures, and facial expressions to share their
thoughts about people, objects, and ongoing events in their direct environment. It has long been
acknowledged that referring to something in such face-to-face situations is a social and
collaborative enterprise (Bara, 2010; Clark & Bangerter, 2004; Clark & Wilkes-Gibbs, 1986;
Grice, 1975). When selecting from a wide range of possible referring expressions (cf. ‘that blue
bicycle right there’ to ‘the bike’ to ‘it’), speakers typically take into account the presumed
cognitive status of a referent in their addressee’s situation model (e.g., Ariel, 1988; Arnold,
2010; Chafe, 1976; Gundel et al., 1993; Hanks, 2011; Prince, 1981b). Addressees, in turn,
single out one or more referents based on the verbal and non-verbal information provided by
the speaker in light of their assumed common ground (Clark, 1996; Clark et al., 1983).
The collaborative nature of referring in face-to-face communication is also evident from
its multimodal characteristics. When physically pointing at a visible referent, for instance by
using the index finger, speakers typically alternate gaze between referent and addressee
(Bakeman & Adamson, 1984; Kita, 2003) and tailor the kinematic properties of their gesture
(Cleret de Langavant et al., 2011; Liu et al., 2019; Peeters, Chu, et al., 2015) and the specificity
of concurrently produced speech (Brennan & Clark, 1996; Clark & Wilkes-Gibbs, 1986;
Koolen et al., 2011) to the presumed informational needs of their addressee. Addressees may
use the vector created by the speaker’s gesture, available gaze cues, and any concomitant verbal
description to establish joint attention to the inferred, intended referent (Bangerter, 2004; Clark,
2020; Cooperrider, 2020; Diessel, 2006; Eco, 1976; Kita, 2003; Levinson, 2004), subsequently
verbally and non-verbally signaling their understanding to the speaker (Clark & Krych, 2004).
As such, referring can be considered both a social and a multimodal hallmark of human
communication (Peeters & Özyürek, 2016).
The current paper focuses on demonstratives (deictic words like this, that, these, and
those) as a central component of many such multimodal joint actions. As far as we know, all
spoken languages have an inventory of these linguistic expressions (Diessel, 1999; Dixon,
2003), present in the lexicon of a language as a closed-class set of words or morphemes such
as affixes or clitics (Diessel, 1999; Levinson, 2018). Demonstratives are among the earliest
words infants produce (Capirci et al., 1996; Clark, 1978; Clark & Sengul, 1978) and their usage
remains ubiquitous in face-to-face communication throughout life (Wu, 2004) as they occur in
various common speech acts, for instance when we express our attitudes about something (‘that
is a pretty flower’), provide our interlocutor with new information (‘this is your new
colleague’), or point at something as a request or imperative for assistance (‘could you pass me
that burrito, please?’). Frequency counts in lexical databases (e.g., Celex, Lexique, Subtlex)
for various languages indeed consistently rank demonstratives amongst the most highly used
lexical items in language (Baayen et al., 1993; Brysbaert & New, 2009; Keuleers et al., 2010;
New et al., 2004). Historically, demonstratives are so old that they cannot easily be traced back
to diachronically earlier linguistic expressions (Diessel, 1999; Himmelmann, 1996), suggesting
that they might even be “the most basic communicative acts in the vocal modality” (Tomasello,
2008, p. 233). Not surprisingly, therefore, they have long been a topic of extensive study in
various scientific disciplines such as philosophy (e.g., Kaplan, 1979; Peirce, 1940), psychology
(Bühler, 1934; Kemmerer, 1999), cross-linguistic typology (e.g., Anderson & Keenan, 1985;
Fillmore, 1982), linguistic anthropology (e.g., Enfield, 2003; Hanks, 1990), discourse studies
(e.g., Ariel, 1990; Gundel et al., 1993), and foreign language learning (e.g., Petch-Tyson, 2000;
Zhang, 2015). Furthermore, they play an important part in some of the most iconic works of
art, from Magritte’s ceci n’est pas une pipe to Shakespeare’s to be or not to be / that is the
question.
Despite the existence of demonstratives in all spoken languages (Diessel,
1999), the number of available demonstratives per language is a matter of remarkable cross-
linguistic diversity (Diessel, 2013; Levinson, 2018; Weissenborn & Klein, 1982). Whereas
English, for instance, distinguishes between a proximal (this or here) and a distal (that or
there) form, it is not uncommon for languages to have three (e.g., Spanish, Japanese), four
(e.g., Quileute, Somali), or even five or more (e.g., Malagasy, Navajo) different basic
demonstrative terms (Diessel, 2013). Speakers of other languages (e.g., Modern French,
German) may have only a single basic demonstrative determiner at their disposal, but can use
a richer set of demonstrative adverbs similar to English here and there (Diessel, 2013;
McCool, 1993). The existence of more than one demonstrative in a given language and the fact
that languages differ cross-linguistically in the number of available terms naturally raise the
question of what factors drive a speaker in their decision to use one demonstrative form and not
another in a specific context. Regardless of what exact factors may influence this selection
process, it is within the larger framework of referring as a collaborative joint action (Bangerter,
2004; Clark, 1996) that a speaker’s implicit decision to use one demonstrative form (e.g., this)
over another (e.g., that) should be situated (Enfield, 2003; Hanks, 2011; Jarbou, 2010; Peeters
& Özyürek, 2016).
Complementing earlier philosophical, linguistic, and anthropological work that was
predominantly based on armchair intuitions and field observations (Clark & Bangerter, 2004),
recent years have seen an increase in experimental research into the use and processing of
demonstratives (e.g., Bonfiglioli et al., 2009; Coventry et al., 2008; Peeters, Hagoort, et al.,
2015; Piwek et al., 2008; Rocca, Wallentin, et al., 2019). An important aim of many such well-
controlled studies has indeed been to pinpoint precisely, often in carefully monitored lab
settings, what factors (e.g., the location of a referent or its visibility) affect whether a speaker
selects one demonstrative form and not another, and as such, what demonstratives implicitly
tell the addressee about the relative location and/or cognitive status of the referent. This strictly
experimental work from the lab is further complemented by quasi-experimental work
performed at field sites around the world (e.g., Levinson et al., 2018; see also Da Milano, 2007).
Although the recent experimental approach to the study of demonstrative reference has yielded
several interesting insights, we do not yet understand the mechanisms at work in the mind of a
language user when they select a demonstrative form for inclusion in their referential utterance.
Moreover, a comprehensive account integrating the variety of observational and experimental
findings at a cognitive level is lacking.
The main aim of the current paper is therefore, based on a review of the experimental
literature on demonstratives situated in the broader context of earlier non-experimental work,
to introduce a conceptual framework that describes the various factors and mechanisms
involved in demonstrative reference across languages. We will initially focus on situations in
which speakers use demonstratives exophorically, i.e., in reference to entities present in the
immediate surroundings of the speech event (Halliday & Hasan, 1976; Levinson, 1983) and
show how the framework may explain a speaker’s choice of demonstrative form in different
contexts. We will then explore whether the framework conceptually generalizes to cases of
endophoric demonstrative reference (Levinson, 1983; Lyons, 1977), particularly situations in
which speakers or writers refer to elements of the ongoing discourse. We hope that the
framework will serve as a conceptual basis for future experimental work on demonstratives
and as a starting point for a computational model of demonstrative reference. Before
introducing the framework, we will now first provide a review of recent experimental findings
on demonstrative use across different languages.
2. The Experimental Study of Demonstratives: A Review of Recent Work
The traditional view on demonstratives in exophoric use is that they “indicate the relative
distance of a referent in the speech situation vis-à-vis (…) the speaker’s location at the time of
the utterance” (Diessel, 2013, p. 1). In a nutshell, this speaker-centric spatialist account
proposes that proximal demonstratives (e.g., English this) are used in reference to entities
relatively nearby the speaker, and distal demonstratives (e.g., English that) in reference to
entities relatively far from the speaker (Anderson & Keenan, 1985; Halliday & Hasan, 1976;
Levelt, 1989). This “folk-view on proximal and distal demonstratives” (Piwek et al., 2008, p.
695) has been found to be too simplistic (e.g., Enfield, 2003; Hanks, 2009; Jarbou, 2010;
Kemmerer, 1999; Peeters & Özyürek, 2016; Strauss, 2002), and extensive cross-linguistic
experimental and observational work questions “whether any language actually has a system
like this” (Levinson, 2018, p. 6). Based on a review of the experimental literature on
demonstratives, we here suggest instead that at least three types of factors influence a speaker’s
choice for a specific demonstrative form in any given setting. These three types of factors
(physical, psychological, and referent-intrinsic) are proposed to play a role, to a variable extent,
in all communicative situations in which a speaker uses a demonstrative in reference to an
entity in the world.
2.1. Physical factors influencing a speaker’s choice of demonstrative form
The experimental literature firstly suggests that physical factors play a role in
influencing a speaker’s choice of demonstrative form. We here define physical factors as
aspects of the external physical context in which language is used that can be objectively
observed and determined, such as the relative physical distance of a referent in relation to the
speaker or the speech situation, and a referent’s visibility to the interlocutors. Various
instantiations of the relative location of a referent have indeed been proposed to influence a
speaker’s decision to use one specific demonstrative form over another. A series of experiments
has made clear that whether a referent is located within (‘peripersonal space’) or beyond
(‘extrapersonal space’) a speaker’s physical reach can influence the form a demonstrative takes
in the speaker’s utterance (Caldano & Coventry, 2019; Coventry et al., 2008, 2014; Gudde et
al., 2016). Specifically, it has been observed for a variety of languages (Danish, English,
Spanish, Ticuna) that reachable referents within an elastic zone of peripersonal physical
proximity in front of the speaker typically elicit more proximal demonstratives than referents
located beyond the speaker’s reach (Caldano & Coventry, 2019; Coventry et al., 2008, 2014;
Rocca, Wallentin, et al., 2019; Skilton & Peeters, under review). On the basis of these findings,
the relative location of a referent as situated within or beyond a speaker’s reach should be
considered one clear factor driving a speaker’s choice for a specific demonstrative form.
A recent study suggests, however, that such speaker-anchored coding of space may not
necessarily occur in communicative contexts (Rocca, Wallentin, et al., 2019). When speakers
of Danish referred to shapes placed in a horizontal grid on a table in front of them, the
proportion of ‘proximal’ demonstratives they used increased when the referent was physically
closer to their concurrently pointing hand. Importantly, this effect was observed only when the
task was performed individually or when the speaker was joined by another speaker who
performed an independent, complementary task. Critically, when the task was communicative,
such that the information provided by the speaker was informative and relevant to the
addressee, ‘proximal’ demonstratives were anchored not to the speaker, but to the addressee or
to the speaker-addressee dyad (Rocca, Wallentin, et al., 2019). This is an important finding, as
referring in naturally occurring face-to-face communication is pre-eminently a communicative
and collaborative undertaking (Bangerter, 2004; Clark, 1996; Peeters & Özyürek, 2016).
Findings attributing importance to the distinction between
peripersonal and extrapersonal space in driving the choice of demonstrative form may hence
be specific to certain contexts (Kemmerer, 1999). This idea is in line with the corpus
observation that, even for languages with a relatively simple two-term system such as English,
“the traditional ‘near speaker’/‘far from speaker’ distinction fails to capture the majority of
phenomena in everyday spoken English in which the forms occur where there is no relation
whatsoever to any physical distance from the speaker” (Strauss, 2002, p. 151). Furthermore, in
contrast with theoretical accounts stressing the parallel between perceptual (peripersonal vs.
extrapersonal) and linguistic (proximal vs. distal) representations of space in the case of
demonstratives, kinematic work indicates that speakers may also sometimes prefer a distal
demonstrative for referents located within their peripersonal space (Bonfiglioli et al., 2009).
Together, these findings suggest that the relative location of a referent vis-à-vis the speaker
may play a role in the choice for a specific demonstrative form, but probably only in a limited
number of contexts. The more important the role of the addressee in the speech situation, the
smaller the influence of speaker-anchored physical factors on the speaker’s choice of
demonstrative form appears to be.
The physical location of a referent can indeed be calculated in relation to the speaker,
but also relative to the addressee (Brown & Levinson, 2018; Denny, 1982; Hanks, 1990;
Margetts, 2018), to the speaker-addressee dyad (Hanks, 1990; Hellwig, 2018; Jungbluth, 2003;
Meira & Guirardello-Damian, 2018; Peeters & Özyürek, 2016; Weinrich, 1988), or to some
external entity such as the sea, a river, a hill (Anderson & Keenan, 1985; Burenhult, 2008;
Diessel, 1999; Dixon, 2003; Levinson, 2018), or in exceptional cultural circumstances even the
palace of the local sultan (van Staden, 2018). Experimental work now indeed confirms that the
perspective of the addressee (Rocca, Wallentin, et al., 2019), or the speaker-addressee dyad
(Peeters, Hagoort, et al., 2015), can be taken as an anchoring point (Clark, 2020) by the speaker
when selecting a demonstrative form. The idea that demonstratives may in certain languages
moreover sometimes specify the referent’s relative location in relation to a geographical
landmark (the sea, a hill, a river, an iconic tree) is present in various typological sources
(Anderson & Keenan, 1985; Diessel, 1999; Dixon, 1972), but strict experimental work has not
been done. Furthermore, observational and documentary work suggests that demonstrative
form may also in certain languages mark the location of the referent in terms of its degree of
elevation, for instance specifying to the addressee whether it is located above or below the
current speech situation (Diessel, 1999). Additionally, speakers of certain languages may
encode in demonstrative form whether a referent is located downriver or upriver, or whether it
is moving towards the speech situation or away from it (Burenhult, 2008; Diessel, 1999;
Levinson, 2018). Quasi-experimental findings confirm these typological observations for
various languages (Levinson et al., 2018). In sum, the relative location of a referent vis-à-vis
entities (e.g., the addressee, the dyad, a geographical landmark) beyond the speaker seems a
common variable influencing the choice of demonstrative form across languages.
It is perhaps not surprising that the relative location of a referent may influence
demonstrative form, as the speaker often has to identify the location of a referent anyway when
deciding to produce a pointing gesture to guide the addressee’s visual attention in a desired
direction. This idea suggests that demonstrative form may vary as a function of whether the
speaker includes a pointing gesture in their multimodal referential utterance or not, which is
confirmed by recent observations (Bohnemeyer, 2018; Brown & Levinson, 2018; Cooperrider,
2016; Cutfield, 2018; Margetts, 2018; Meira, 2018; Stevens & Zhang, 2014; Terrill, 2018;
Wilkins, 2018). Rather than the presence or absence of a concurrent pointing gesture being a
factor influencing the speaker’s choice of demonstrative form, it may be the case that similar
factors (e.g., the relative location of the referent) simultaneously influence whether a speaker
produces a pointing gesture or not, and which specific demonstrative form they will use (cf.
Senft, 2004). Not surprisingly, then, in sign languages used by Deaf communities, it is pointing
signs that often function as demonstratives (Morford et al., 2019), suggesting a common
underlying machinery.
Another physical factor that may influence the choice for a specific demonstrative form
is the visibility of the referent. It has been claimed that several, typologically distinct languages
(e.g., Quileute, Ticuna, Ute, Warao, West Greenlandic) may have one or more demonstrative
forms that would be predominantly used in reference to invisible or visually obscured entities
(Anderson & Keenan, 1985; Diessel, 1999; Herrmann, 2018; Meira, 2018; Skilton, 2019). West
Greenlandic, for instance, is believed to have a specific demonstrative form inna that is opted
for when speakers of this language refer to entities that are currently out of sight (Diessel,
1999). Recent experimental work indicates that speakers of languages with a relatively
simple two-term demonstrative system may also take into account the visibility of a referent when
selecting a demonstrative form. It has been found that speakers of English use the ‘proximal’
form this significantly more often for visible than for invisible referents (Coventry et al., 2014).
Conversely, under similar experimental circumstances, speakers of the Indigenous Amazonian
language Ticuna are found to use their ‘distal’ demonstrative ɟe³a² significantly more in
reference to visible than invisible entities (Skilton & Peeters, under review). Taken together,
these experimental findings confirm earlier observations and strongly suggest that speakers
may take into account a referent’s degree of visibility when selecting a demonstrative form.
However, there is no universal cognitive tendency to conceptualize visible objects as relatively
more proximal (Skilton & Peeters, under review).
2.2. Psychological factors influencing a speaker’s choice of demonstrative form
In addition to the physical factors described above, psychological factors are found to
influence a speaker’s choice of demonstrative form. These factors relate not to an entity’s
objective relative physical location or visibility, but to the cognitive status of the referent in the
mind of the speaker and/or the addressee as assumed by the speaker. It is well established that
language users typically take into account the presumed cognitive status of a referent in the
addressee’s situation model when using a referring expression in general (e.g., Chafe, 1976;
Evans et al., 2018; Gundel et al., 1993; Prince, 1981b) and when producing a communicative
pointing gesture (Cleret de Langavant et al., 2011; Liu et al., 2019; Oosterwijk et al., 2017;
Peeters et al., 2013; Winner et al., 2019). Important considerations for the speaker when
selecting a demonstrative form may be whether the referent is in joint attention between speaker
and addressee or not (Brown & Levinson, 2018; Burenhult, 2003; Evans et al., 2018;
Herrmann, 2018; Knuchel, 2019; Küntay & Özyürek, 2006; Meira, 2018; Peeters et al., 2014;
Stevens & Zhang, 2013), whether it is considered perceptually, socially, and/or cognitively
accessible to the addressee (Burenhult, 2008; Hanks, 2009; Jarbou, 2010; Piwek et al., 2008),
and whether it can be considered in the psychologically construed shared space, the current
interactional space, or within or outside the interlocutors’ conceptually defined ‘here-space’
(Cutfield, 2018; Enfield, 2003, 2018; Jungbluth, 2003; Levinson, 2018; Meira & Guirardello-
Damian, 2018; Opalka, 1982; Peeters, Hagoort, et al., 2015).
Experienced emotions and attitudes towards the referent may also come into play here.
When the speaker experiences negative affect towards a referent, they may consider it
psychologically distant (Levinson, 1983, 2018; Lyons, 1977), increasing the odds that they will
use a ‘distal’ demonstrative form when referring to it. Indeed, “notions such as ‘near to the
speaker’ may be interpreted not only in the literal, physical sense, but also by extension to
‘psychological proximity’” (Anderson & Keenan, 1985, p. 278). Furthermore, if a referent is
placed behind a physical barrier, even when physically close and visible, it may be considered
by the interlocutors to be psychologically ‘not-here’, influencing a speaker’s choice of
demonstrative form (Enfield, 2018; Shin et al., 2020). In sum, interlocutors keep track of
whether a referent is psychologically proximal or distal to themselves, to the addressee, and/or
to the conversational dyad, adjusting their choice of demonstrative form accordingly.
It should be noted that, in the study of exophoric demonstrative reference, it is more
difficult to manipulate in an experimental lab setting the exact cognitive status of a referent in
the mind of the addressee compared to, for instance, the manipulation of a referent’s spatial
location or its visibility. As a spatial proxy of a referent’s psychological proximity within or
outside interlocutors’ shared space, researchers have experimentally varied the location of the
addressee vis-à-vis the speaker. This typically separates a zone of physically shared space
between speaker and addressee from a spatial zone outside the dyad (Coventry et al., 2008;
Jungbluth, 2003; Peeters, Hagoort, et al., 2015; Skilton & Peeters, under review). In addition,
the presence or absence of visual joint attention between speaker and addressee on a referent
has been experimentally manipulated to test whether this influences demonstrative production
and comprehension (Peeters et al., 2014; Stevens & Zhang, 2013). Furthermore, speakers’ use
of a particular demonstrative form when engaged in a joint activity has been correlated offline
with the assumed cognitive status of a referent in the situation model of the addressee as judged
by the researchers (Jarbou, 2010; Maes & de Rooij, 2007; Piwek et al., 2008; Shin et al., 2020).
Overall, these different approaches all indicate that the psychological proximity of a referent
in the mind of the addressee, as presumed by the speaker, modulates speakers’ choice of
demonstrative form.
2.3. Referent-intrinsic factors influencing a speaker’s choice of demonstrative form
Complementing physical and psychological factors, intrinsic properties or qualities of
the referent and grammatical conventions play a role in the speaker’s selection of a
demonstrative form. Clearly, non-deictic factors such as grammatical gender in many
languages influence demonstrative form (cf. French cette maison ‘this house’ to ce jardin ‘this
garden’). Moreover, number typically plays a role (cf. ‘this chair’ to ‘these chairs’), case may
influence which specific form is used, and the animacy, humanness, or biological gender of
the referent or even its current posture or positional orientation is in certain languages specified
in demonstrative form (Diessel, 1999; Guirardello-Damian, 2018; Hellwig, 2018; Meira,
2003).
Recent experimental findings suggest that, more broadly, speakers may indeed take
permanent or temporary qualities of the referent into account when selecting a demonstrative
form. A referent’s ownership properties and its familiarity to the speaker have for instance been
found to modulate the proportion of use of specific demonstrative forms (Coventry et al., 2014;
see also Margetts, 2018). Furthermore, when speakers of Danish, English, and Italian were
asked to select a demonstrative for a variety of singular nouns, without any further context, it
was found that the demonstrative form they opted for was modulated by the size (small vs.
large referent) and the potential harmfulness (harmful referents: e.g., shark, bomb; harmless
referents: e.g., lamb, tent) of the referent (Rocca, Tylén, et al., 2019). Although it remains to
be established whether the experimental, online observation that the size, harmfulness, or
potentially also the manipulability of a referent matters for demonstrative choice in Indo-
European languages generalizes to situations of face-to-face communication (Rocca &
Wallentin, 2020), it confirms that, in addition to physical and psychological factors, properties
of the referent may influence the speaker’s choice of demonstrative form.
3. A Conceptual Framework of Demonstrative Reference
Our review of the experimental literature indicates, in line with earlier typological and
observational work, that a wide range of physical, psychological, and referent-intrinsic factors
may influence a speaker’s choice of demonstrative form. But does having a list of different
influential factors mean that we fully understand what happens in the mind of a speaker when
they include a demonstrative form in their verbal utterance when referring to a certain entity in
a given context for a specific addressee? Ultimately, any comprehensive account of
demonstrative reference should go beyond describing whether one or a couple of individual
factors influence the choice for a specific demonstrative form in a particular language.
Figure 1 therefore provides an attempt to visually depict the minimal factors and
connections that need to be in place at different levels in a conceptual framework explaining
demonstrative reference in cognitive psychological terms. The framework critically
distinguishes between a typological level (i.e., a description of the demonstrative system per
se present in a specific language), a cognitive level (i.e., the range of physical, psychological,
and referent-intrinsic factors that may influence the choice of demonstrative form for speakers
of a given language), and a sociocultural level (i.e., how the broader cultural context, personal
characteristics of the individual speaker, and the affordances of the immediate physical context
shape in a top-down fashion which factors at the cognitive level play a more important role in
a specific setting). Elements at the three different levels (typological, cognitive, and
sociocultural) are proposed to interact dynamically through connections that may have different
strengths, which could be conceived of and computationally implemented as regression
weights.
Figure 1. Outline of a conceptual framework of demonstrative reference, here depicted for a
language with a three-term demonstrative system in which several physical, psychological, and
referent-intrinsic factors (non-exhaustive here), either categorical or continuous, influence
which pronominal or adnominal demonstrative form is selected and used by a speaker.
Language characteristics, speaker characteristics, and context affordances in turn drive which
physical, psychological, and referent-intrinsic factors are considered more important in a given
sociocultural context.
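As a minimal sketch of how the connection strengths described above could be written down as regression weights, the activation reaching a demonstrative form might be expressed as a weighted sum. The notation is an assumption introduced for illustration only: $x_f$ is the current value of a cognitive-level factor $f$, $w_{f,d}$ is the connection weight between factor $f$ and demonstrative form $d$, and $g_f$ is a top-down modulation of factor $f$ that depends on language characteristics $L$, speaker characteristics $S$, and context affordances $C$.

```latex
% Illustrative notation only; none of these symbols is part of the framework as published.
a_d \;=\; \sum_{f} g_f(L, S, C)\, w_{f,d}\, x_f
```

Under this reading, the sociocultural level only rescales the contribution of each factor, whereas the weights $w_{f,d}$ tie factors to the forms listed at the typological level.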
3.1. The typological level of the framework
The bottom, typological level of the framework simply comprises the different types of
demonstratives that are available to a speaker of a particular language. Languages vary
substantially in the number of available demonstratives (Diessel, 1999; Levinson et al., 2018);
the language-specific words, affixes, or clitics can be found in grammars of a given language.
At the same time, the orthographic and phonological forms and the syntactic properties of
individual demonstrative terms are stored in the lexical memory of proficient (and, for the
orthographic form, literate) speakers of the language.
As demonstratives are among the first words that we acquire in infancy (Capirci et al.,
1996; Clark & Sengul, 1978), it is likely that the typological level of the framework will be
represented in a speaker’s long-term lexical memory early in life. However, adult-like,
pragmatically appropriate use of these terms takes longer, potentially being fully mastered only
after age six, and possibly connected to and following the child's development of a theory of
mind (Chu & Minai, 2018; Clark & Sengul, 1978; Küntay & Özyürek, 2006; Tanz, 1980). The
developmental gap between acquisition of the lexical items themselves and their contextually
appropriate usage supports the idea that a cognitive and a sociocultural level should
complement the typological level, both in the conceptual framework and in the mind of the speaker.
3.2. The cognitive level of the framework
The middle, cognitive level of the framework ideally comprises all factors that may
influence the choice of demonstrative form in language. We have seen above that three types
of factors can be distinguished: physical, psychological, and referent-intrinsic factors. We
assume that many of these probabilistic factors will be continuous in nature. The relative
influence of the same factor may therefore differ over time. For instance, the higher the
psychological proximity of a referent to speaker and addressee becomes, all other things being
equal, the higher the odds that a speaker of Dutch will select a proximal (and not a distal)
demonstrative when referring to a specific object in a given context (Peeters & Özyürek, 2016).
Other factors influencing the speaker’s choice of demonstrative form may be intrinsically
binary and categorical, such as whether the referent is animate or inanimate (Levinson, 2018).
Careful experimentation may disclose how physical, psychological, and referent-
intrinsic properties of the referent as represented online in the mind of a speaker during a
conversation may interact to lead to that speaker’s use of a particular demonstrative form in a
given setting. We propose that different demonstratives may be activated at the same time in a
given context in the mind of a speaker, but that only the demonstrative with the highest degree
of activation will be selected and produced. Diachronic changes in the demonstrative system
of a language, such as an archaic medial demonstrative term no longer being used by speakers
of a language, in the framework correspond to a gradual disappearance of the connections
between all factors at the cognitive level and the specific demonstrative form at the typological
level. Furthermore, not all factors will be of equal importance in a specific language or culture,
for a specific speaker, and in a specific immediate context.
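The selection mechanism and the account of diachronic change proposed above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the three-term system, the factor names, and all numerical values are hypothetical placeholders, and a simple weighted sum stands in for whatever activation function the framework would ultimately require.

```python
# Hypothetical three-term system; all names and numbers are invented for illustration.
WEIGHTS = {
    "near_speaker": {"proximal": 1.0, "medial": 0.2, "distal": 0.0},
    "mid_distance": {"proximal": 0.3, "medial": 1.0, "distal": 0.4},
    "far_distance": {"proximal": 0.0, "medial": 0.2, "distal": 1.0},
}


def activations(factors, weights):
    """Sum the weighted factor values arriving at each demonstrative form."""
    forms = ["proximal", "medial", "distal"]
    return {
        form: sum(weights[name][form] * value for name, value in factors.items())
        for form in forms
    }


def select(acts):
    """Return the single form with the highest activation."""
    return max(acts, key=acts.get)


def weaken_form(weights, form, strength):
    """Scale every connection into `form` by `strength` (1.0 = intact, 0.0 = lost)."""
    return {
        name: {f: (w * strength if f == form else w) for f, w in per_form.items()}
        for name, per_form in weights.items()
    }


# A referent at intermediate distance, as represented by the speaker (assumed values).
referent = {"near_speaker": 0.3, "mid_distance": 0.9, "far_distance": 0.1}

before = activations(referent, WEIGHTS)
after = activations(referent, weaken_form(WEIGHTS, "medial", 0.0))

print(select(before), select(after))
```

All three forms receive some activation in `before`, but only one is selected. Once the connections into the medial form have faded to zero, that form can no longer win the competition, and one of the remaining forms is produced for the same referent.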
3.3. The sociocultural level of the framework
The top, sociocultural level of the framework therefore consists of three variables that
specify in a top-down fashion which factors play a relatively more important role in the specific
physical setting in which a multimodal act of demonstrative reference takes place. First, certain
factors identified at the cognitive level may play an important role in influencing the choice of
demonstrative form in one language, but not in another (‘language characteristics’). It has been
argued, for instance, that speakers of Dyirbal take into account whether a referent is uphill or
downhill from their own perspective when selecting a demonstrative form (Diessel, 1999;
Dixon, 1972). It is unlikely that this physical factor would be very influential, however, in
natural conversations among speakers who live in a country such as the Netherlands, where hills
and other evident differences in elevation are largely absent.
Second, the degree to which specific factors influence demonstrative choice may differ
across individuals who speak the same language (‘speaker characteristics’). If theory of mind
development is indeed critical for the acquisition of adult-like use of demonstratives (Chu &
Minai, 2018; Küntay & Özyürek, 2006), individual differences in the degree to which speakers
take into account the mental state of their addressee (Apperly, 2012; Carlson & Moses, 2001)
may drive whether they factor in the relation between the referent and their addressee when
selecting a specific demonstrative form. Such individual differences between speakers of the
same language may explain part of the substantial variability observed in experiments that elicit
demonstratives from different participants under virtually identical circumstances.
Third, the affordances of the immediate physical and conversational context will
influence the extent to which specific cognitive factors influence a speaker’s choice of
demonstrative form in a given situation (‘context affordances’). In the ‘memory game’
paradigm, for instance, the difference in physical location of the different referents is typically
the most salient contextual factor that can be exploited by experimental participants in
distinguishing their usage of different demonstrative forms (e.g., Coventry et al., 2008), and it
is therefore not surprising that they typically do so. However, in a context in which different
referents are most easily distinguishable based on, for instance, their degree of elevation,
speakers may exploit that particular affordance of the given context when opting to use one
demonstrative form rather than another.
4. Workings of the framework: The case of Spanish
To illustrate the proposed workings of the conceptual framework introduced above, we will
here describe how it has the potential to unite two opposite result patterns described in the
literature. We will focus on the use of demonstrative determiners in Spanish, a language that
has a three-term demonstrative system consisting of the basic (here singular and masculine)
terms este, ese, and aquel. In the World Atlas of Language Structures, the Spanish
demonstrative system is described as encoding a three-way distance contrast: proximal (este),
medial (ese), and distal (aquel) (Diessel, 2013).
Jungbluth (2003), in her in-depth analysis of the Spanish demonstrative system,
emphasizes that speakers and addressees when talking to each other in face-to-face situations
typically “treat their shared conversational space as uniform. Everything inside the
conversational dyad is treated as proximal without any further differentiation” (Jungbluth,
2003, p. 19). Crucially, she observes that in everyday Spanish conversations, the ‘proximal’
demonstrative form este is dominant and preferred for referents at any location inside such a
face-to-face dyad, also when these are located close to the addressee and outside the speaker’s
peripersonal space (Figure 2). This analysis is clearly not in line with traditional pure speaker-
centric distance-based views of the system, which did not attribute importance to the location
and orientation of the addressee in relation to the speaker in a speaker’s choice of demonstrative
form (see Hottenroth, 1982). It is also not in line with a ‘person-oriented’ description of the
system in which the ‘medial’ demonstrative ese would be predominantly used for referents that
are physically located near a speaker’s addressee (Alonso, 1968).
Figure 2. As observed by Jungbluth (2003), in naturally occurring communication, the Spanish
‘proximal’ demonstrative form este is dominant in reference to entities inside the face-to-face
conversational dyad formed by speaker (‘S’) and addressee (‘A’). Hence even a referent (‘R’)
that is located physically close to the addressee and outside the peripersonal space of the
speaker (but inside the dyad) would predominantly invite the speaker to use the ‘proximal’
demonstrative form este in face-to-face conversations.
Prima facie, the observations made by Jungbluth (2003) on the basis of her analysis of
naturally occurring interactions are conceptually difficult to reconcile with a subsequent
experimental study into Spanish (and English) demonstrative use (Coventry et al., 2008). This
latter study introduced the ‘memory game’ paradigm to experimentally investigate what factors
influence a speaker’s choice for a specific demonstrative form. In this paradigm, participants
are instructed to refer to objects that are placed at different locations on a table in front of them.
In addition to the physical distance of the referent to speaker (participant) and addressee
(experimenter), several theoretically interesting variables can be manipulated using the
paradigm, such as the visibility of the referent object, its familiarity to the speaker, and whether
it is owned by the participant or not (Gudde et al., 2018). On the basis of the theoretical account
provided by Jungbluth (2003), one may predict that Spanish speakers would predominantly use
este in reference to all entities inside the shared space between speaker and addressee when
these are seated face-to-face at opposite ends of the table, regardless of the exact location of
the referent on the table. After all, the table in between speaker and addressee would, at least
physically, constitute the shared space between the interlocutors.
The study observed, however, that este was used dominantly only for referents inside
the peripersonal space of the speaker (Coventry et al., 2008). Referents at medium distance
from the speaker mostly elicited the use of ese and referents at a further distance from the
speaker were predominantly referred to using a referential expression containing aquel (cf.
Figure 3). The region of space for which the ‘proximal’ form este was dominantly used was
slightly larger when speaker and addressee were seated face-to-face compared to when they
were seated side-by-side (Coventry et al., 2008), but clearly not to an extent that all referents
located inside the conversational dyad were “treated as proximal without any further
differentiation” (Jungbluth, 2003, p. 19). In sum, the conclusions drawn by Jungbluth (2003)
on the basis of analysis of naturally occurring Spanish interactions seem to contrast sharply
with the experimental results reported by Coventry et al. (2008) on speakers of the same
language. Intuitively, these results are difficult to reconcile, and one would have hoped that
experimental findings generalize to naturally occurring usage patterns ‘in the wild’.
Figure 3. In the experimental context of the ‘memory game’ paradigm, in which a speaker
(‘S’) participant and an addressee (‘A’) experimenter sit at a table, the Spanish ‘proximal’
demonstrative form este is dominant in reference to entities inside the peripersonal space of the
speaker, as observed by Coventry et al. (2008). This spatial zone is here indicated by the large
grey filled circle. A referent (‘R’) placed outside the peripersonal space of the speaker, although
located inside the shared space between speaker and addressee, in this context typically does
not elicit the ‘proximal’ demonstrative este.
An explanation for these divergent result patterns may be found in the fact that the
relative locations of the different referents, as typically indicated on the table by coloured dots
(Figure 3) in such experimental studies using the ‘memory game’ paradigm, are highly salient
to the experimental participants. The physical context hence explicitly invites speakers to
exploit the relative physical location of the referent as a salient factor influencing which
demonstrative form to use (cf. Shin et al., 2020). Moreover, in the absence of a broader
conversational context in which the use of the demonstratives takes place, interlocutors may
have no means to jointly construe at a psychological level what they consider their shared
space. In naturally occurring situations such as those observed by Jungbluth (2003), the
opposite is true. Interlocutors may prefer to use demonstratives in such a way that these align
with the jointly (verbally and non-verbally) construed distinction between the psychologically
shared space within the conversational dyad vs. any dyad-external location. In other words,
speakers in the ‘memory game’ paradigm may ascribe more importance to physical factors
such as the relative location of a referent, whereas in naturally occurring conversations
psychological factors such as the psychological distance of a referent may play a more
important role. We propose that the influence of physical factors decreases as a function of an
increase of importance of the addressee in the speech situation at hand (Rocca, Wallentin, et
al., 2019), and that psychological factors are by default most important in shaping a speaker’s
choice of demonstrative form in natural, communicative situations.
In our conceptual framework, the variable influence of physical vs. psychological
factors under different contextual circumstances is implemented through top-down
modulations of the various factors at the middle, cognitive level as a function of the broader
context affordances identified at the top, sociocultural level. Figure 4 illustrates the presumed
‘default’ situation of naturally occurring communication by speakers of Spanish. Here we
follow Jungbluth (2003) in assuming that, by definition, Spanish interlocutors jointly construe
a shared space and keep track of whether a referent is located inside the psychologically shared
space or not (Shin et al., 2020). They adapt their choice of demonstrative form accordingly,
and may even use a specific demonstrative form to indicate whether they consider a referent to
be located inside the assumed shared space or not (Shin et al., 2020). In line with the fact that
demonstrative reference is a fundamentally social and collaborative process (e.g., Bara, 2010;
Clark et al., 1983; Peeters & Özyürek, 2016), we hence assume that the psychological factor
‘psychological distance of the referent’ receives more top-down activation than physical factors
during natural conversations. Moreover, the context affordances also activate this
psychological factor as any natural face-to-face conversation allows for the construction of a
shared space between interlocutors. Because the referent is located inside the shared space in
the situation depicted in Figure 2, even though it is closer to addressee than to speaker, the
demonstrative este is strongly activated. If we here assume that the referent is relatively small
in size, and that it is in joint attention between speaker and addressee, additional activation of
este is provided through the referent-intrinsic factor ‘size of referent’ (Rocca, Tylén, et al.,
2019) and the psychological factor ‘joint attention’ (e.g., Küntay & Özyürek, 2006). Because
este is clearly more active than its competing alternatives (demonstratives ese and aquel), it
will be selected for articulation by the speaker.
Figure 4. The conceptual framework of demonstrative reference, here applied to the face-to-
face situation depicted in Figure 2, inspired by Jungbluth (2003). It is assumed that in natural
conversations, the psychological distance of a referent is the most important factor at the
cognitive level influencing the choice of a demonstrative form at the typological level. Both
language characteristics and context affordances in a top-down fashion activate this factor most
at the cognitive level in the situation depicted in Figure 2. Because the referent is
psychologically proximal, relatively small, and in joint attention, este is activated significantly
more than competing alternatives and therefore selected and articulated by the Spanish speaker.
Note that on the basis of the relative location of the referent to the speaker, both ese and aquel
also receive some activation, but not to such an extent that they are selected for articulation.
The default state of the framework, in which psychological factors trump physical
factors, may however be overruled, as in the context of the ‘memory game’ paradigm (Figure
3). In the absence of the opportunity to have a normal conversation, speakers in this context
may ascribe more importance to context-dependent physical factors than to the psychological
proximity of a referent in the mind of their addressee (Skilton & Peeters, under review). The
primacy of physical factors may further be primed by the salience of the different physical
locations in this experimental set-up (‘context affordances’) on which referents are placed.
Figure 5 illustrates again that the psychological distance of a referent is an important cognitive
factor influencing the choice of demonstrative form in Spanish (‘language characteristics’).
However, in the speech situation depicted in Figure 3, context affordances activate physical
factors such as the relative location of a referent more than psychological factors. Because the
referent is located relatively far away from the speaker in this set-up, aquel will receive more
activation than este, explaining why it is predominantly used in reference to entities located
relatively far away from the speaker.
Figure 5. The conceptual framework of demonstrative reference, here applied to the ‘memory
game’ paradigm set-up as depicted in Figure 3, and inspired by Coventry et al. (2008). It is
assumed that in this experimental set-up, the contextual salience (‘context affordances’) of the
relative location of the referent vis-à-vis the speaker makes this latter variable the most
important factor at the cognitive level influencing the choice of a demonstrative form at the
typological level. Because the referent is relatively small and in joint attention, este is activated.
However, the top-down activation of the factor ‘relative location of the referent’ is so dominant
that the referent’s relatively far location as calculated from the location of the speaker leads to
aquel becoming activated to such an extent that it is selected for production and articulated by
the Spanish speaker.
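The contrast between the two Spanish situations (natural face-to-face conversation vs. the ‘memory game’ paradigm) can be made concrete with a small numerical sketch. Everything quantitative here is an assumption made for illustration: the connection weights, the factor values for the referent, and the top-down gains are invented, and a gain of 1.0 is assumed for any factor that a context does not explicitly modulate. The single linear weight per factor is also a simplification that does not attempt to reproduce the medium-distance zone in which ese was observed.

```python
# Hypothetical connection weights from cognitive-level factors to Spanish forms.
WEIGHTS = {
    "psychological_proximity": {"este": 1.0, "ese": 0.2, "aquel": 0.0},
    "physical_distance": {"este": 0.0, "ese": 0.5, "aquel": 1.0},
    "small_referent": {"este": 0.3, "ese": 0.0, "aquel": 0.0},
    "joint_attention": {"este": 0.3, "ese": 0.0, "aquel": 0.0},
}

# The same referent in both situations: inside the shared space, far from the
# speaker (close to the addressee), small, and in joint attention (assumed values).
REFERENT = {
    "psychological_proximity": 1.0,
    "physical_distance": 0.8,
    "small_referent": 1.0,
    "joint_attention": 1.0,
}

# Top-down gains supplied by the sociocultural level (invented values).
GAINS = {
    "natural_conversation": {"psychological_proximity": 1.0, "physical_distance": 0.2},
    "memory_game": {"psychological_proximity": 0.2, "physical_distance": 1.5},
}


def activations(referent, gains):
    """Gain-modulated weighted sum of factor values for each demonstrative form."""
    forms = ["este", "ese", "aquel"]
    return {
        form: sum(
            gains.get(name, 1.0) * WEIGHTS[name][form] * value
            for name, value in referent.items()
        )
        for form in forms
    }


def select(acts):
    """Return the form with the highest activation."""
    return max(acts, key=acts.get)


for context, gains in GAINS.items():
    acts = activations(REFERENT, gains)
    print(context, select(acts), {form: round(a, 2) for form, a in acts.items()})
```

With these assumed values, the identical referent yields este when psychological proximity receives most top-down activation and aquel when the relative location of the referent does, while the non-selected forms still receive some activation in both cases.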
The considerations described above may explain why in different contexts the same
referent at a comparable distance from the speaker may elicit either a ‘proximal’ or a ‘distal’
demonstrative. In addition, experimental work makes clear that there are individual differences
in the choice of demonstrative form across speakers of the same language under virtually
identical experimental circumstances. For instance, although a majority of participants will use
a ‘distal’ demonstrative for the referent located close to the addressee in Figure 3, some
participants will use a ‘proximal’ demonstrative form in this very same context (Coventry et
al., 2008). The conceptual framework explains such individual differences by assuming that
factors at the middle, cognitive level of the framework may have different baseline activation
levels for different individual speakers. We hypothesize that individual differences in theory-
of-mind capacities may contribute to whether physical or psychological factors play a more
important role in different individuals. The more speakers take into account the mental states
of their addressee, and as such the presumed degree of psychological proximity of a referent in
the mind of the addressee, the more influential psychological factors (vs. physical factors) will
be in influencing a speaker’s choice of demonstrative form. Experimental research correlating
speakers’ theory-of-mind capacities with their choice of demonstrative form is needed to test
this proposal.
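The hypothesis about theory-of-mind capacities can likewise be sketched as a toy model. The sketch assumes, purely for illustration, that a speaker's theory-of-mind capacity can be summarized as a single score between 0 and 1 and that this score trades off linearly against the weight given to physical factors. Neither assumption is part of the framework itself, and the function names are hypothetical.

```python
def gains_from_tom(tom_score):
    """Map a hypothetical theory-of-mind score in [0, 1] to top-down gains."""
    if not 0.0 <= tom_score <= 1.0:
        raise ValueError("tom_score must lie in [0, 1]")
    # Assumed linear trade-off between psychological and physical factors.
    return {"psychological": tom_score, "physical": 1.0 - tom_score}


def choose_form(tom_score, psych_proximity=1.0, physical_distance=0.8):
    """Pick a form for a referent that is psychologically close to the addressee
    but physically far from the speaker (default values are invented)."""
    gains = gains_from_tom(tom_score)
    acts = {
        "proximal": gains["psychological"] * psych_proximity,
        "distal": gains["physical"] * physical_distance,
    }
    return max(acts, key=acts.get)


for score in (0.3, 0.7):
    print(score, choose_form(score))
```

In this sketch, two speakers facing the same referent under identical circumstances diverge in their choice of form solely because of the assumed difference in their scores, which is the pattern that the proposed correlational research would need to test.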
5. Putative Parallels between Exophoric and Endophoric Use of Demonstratives
Thus far, we have focused on situations in which speakers use demonstratives exophorically,
i.e., in reference to entities present in the immediate surroundings of the speech event (Halliday
& Hasan, 1976; Levinson, 1983). However, in naturally occurring communication
demonstratives also often function endophorically (Diessel, 1999; Himmelmann, 1996;
Levinson, 1983; Lyons, 1977), when they are used in reference to elements of the ongoing
spoken or written discourse. Although the exophoric use of demonstratives is considered the
ontogenetic, phylogenetic, and grammatical basis from which other types of use have derived
(e.g., Bühler, 1934; Diessel, 1999; Lyons, 1977; Tomasello, 2008), the endophoric use may be
(even) more frequent in present-day human communication, as not only physically available
referents but virtually all thinkable entities (concrete or abstract; existing or imaginary;
immediately present or absent) can be linguistically introduced and endophorically referred to.
Indeed, a powerful affordance of spoken, written, and signed language is that it allows one to
transform any portion of discourse (e.g., a word, gesture, clause, sentence, or cluster of
sentences) into a newly created endophoric referent.
The main aim of this section is to explore to what extent the conceptual framework of
demonstrative reference, as introduced and embedded above in an exophoric context,
generalizes to situations of endophoric reference. Parallels will be explored at each level
(typological, cognitive, and sociocultural) of the framework, as well as with regard to the top-down
connections between the different levels. To establish a solid basis for application of the
conceptual framework to situations of endophoric demonstrative reference, we will first
introduce and critically evaluate two relevant and influential existing theories of endophoric
reference (the accessibility hierarchy and the givenness hierarchy), and review the
experimental, qualitative, and corpus-based literature on endophoric demonstratives to disclose
whether the factors that may drive a speaker’s or writer’s choice for a specific demonstrative
form in a given discourse context are similar to those identified above for exophoric settings.
Before doing so, we acknowledge that different types of endophoric demonstrative use
can be distinguished (cf. Diessel, 1999; Doran & Ward, 2019; Himmelmann, 1996; Levinson,
2004). We will use the term anaphoric demonstrative both for demonstratives with a nominal
antecedent (e.g., The Bell Jar was first published in 1963. This is a wonderful novel.) and for
demonstratives with a propositional antecedent (e.g., The Bell Jar was first published in 1963.
This is something I learned in secondary school.). This implies that we restrict the term deictic
to non-anaphoric demonstratives in spoken and written discourse when these are used in
reference to the (displaced) deictic ground (Hanks, 1992), i.e., to deictic elements of the speech
or writing situation, thus covering (inter alia) situational (Himmelmann, 1996) and symbolic-
exophoric (Levinson, 2004) demonstratives (e.g., non-gestural deictic use of demonstratives in
speech or text as in this chapter, this year, this country, this book, etc.). Additionally, we will
distinguish between demonstrative pronouns (e.g., The Plague was first published in 1947. This
is still a highly relevant book.) and demonstrative noun phrases (e.g., The Plague was first
published in 1947. This book is still highly relevant.).
5.1. Accessibility and givenness in relation to endophoric demonstrative form
Arguably the two most influential theories in the domain of endophoric reference are
Ariel’s accessibility hierarchy (Ariel, 1990) and Gundel and colleagues’ givenness hierarchy
(Gundel et al., 1993). A remarkable difference in the study of endophoric vs. exophoric
demonstrative use that these accounts immediately illustrate is that endophoric demonstratives
have mostly been studied as part of the larger set of referring expressions available in a
language, while research on exophoric demonstrative use has predominantly focused on
variation within the set of demonstratives available in a language alone, as we have done above.
In the former case, different types of referring expression (e.g., the book vs. this book vs. it)
are argued to correspond to different cognitive statuses that a referent is presumed to have in
the mental model of the reader or listener (e.g., Ariel, 1990; Gundel et al., 1993; Prince, 1981b).
As such, in the study of endophoric reference demonstratives are typically seen as a small set
of referring expressions within a broader range of possibilities available to the speaker or
writer.
Both the accessibility hierarchy and the givenness hierarchy consistently assign
demonstratives an intermediate cognitive status in between personal pronouns and definite
noun phrases (Ariel, 1990; Gundel et al., 1993; Prince, 1981b). According to these views,
demonstratives are used in reference to entities that are on the one hand cognitively less
accessible than those that personal pronouns refer to, as a demonstrative (compared to a
personal pronoun such as it) is more often found to have a non-subject or propositional
antecedent (e.g., Brown-Schmidt et al., 2005; Çokal et al., 2018; Fossard et al., 2012; Kaiser
& Trueswell, 2008; Maes, 1997). On the other hand, demonstratives are argued to be
commonly used in reference to entities that are relatively more accessible than those referred
to by definite noun phrases (NPs). The idea is that demonstratives (e.g., “that book”) typically
require a referent (e.g., “Ulysses”) that has been previously activated, while definite NPs (e.g.,
“the book Ulysses”) more commonly and more successfully introduce new referents.
The two hierarchies differ, however, as to the cognitive status attributed within the
closed set of demonstratives. The accessibility hierarchy (Ariel, 1990) assumes that ‘proximal’
demonstrative forms refer to more accessible entities than ‘distal’ demonstrative forms do, and
that demonstrative pronouns in general refer to entities that are more accessible than those
referred to by demonstrative NPs. On the basis of distributional regularities of different
demonstrative forms in a small corpus, Ariel observed that the distance between antecedent
and anaphor was on average smaller for demonstrative pronouns compared to demonstrative
NPs, and also for ‘proximal’ demonstrative forms compared to ‘distal’ demonstrative forms.
The latter observation suggests that the simple ‘physical’ distance between antecedent and
demonstrative could be an important factor driving a speaker or writer’s choice of
demonstrative form. This intuitive and straightforward explanation of the difference between
endophoric this versus that was, however, not confirmed by subsequent larger-scale corpus
analyses (e.g., Botley & McEnery, 2001a, 2001b; Maes, 1996).
In the givenness hierarchy (Gundel et al., 1993), it is ‘distal’ demonstrative NPs
(‘thatN’, e.g., ‘that story’) that have a special status as they are assumed to refer to entities that
are currently less activated compared to entities referred to with ‘proximal’ or ‘distal’
demonstrative pronouns, or with ‘proximal’ demonstrative NPs (‘thisN’, e.g., ‘this story’). This
claim is arguably supported by examples of thatN referring to ‘familiar’ first-mention referents,
reminiscent of recognitional thatN (Diessel, 1999; Himmelmann, 1996; Levinson, 2004;
Schegloff, 1996). Yet, one should acknowledge that familiar or recognitional thatN clauses
are just one of many first-mention thatN cases, including exceptional (e.g., Chen, 1990;
Cheshire, 1999; Maclaren, 1982) as well as more commonly observed first-mentions (e.g., the
demonstrative form that or those followed by a noun and a relative clause: I would like to
thank those people who helped us during the crisis). Therefore it is conceptually difficult to
understand why familiar thatN deserves a special cognitive status compared to non-familiar
first-mention distal cases, or vis-à-vis other demonstrative forms. A counterexample,
moreover, is indefinite thisN, which also represents an exceptional case of first-mention
demonstrative use, but in this case of the ‘proximal’ demonstrative form this (Maclaren, 1982;
Prince, 1981a).
In sum, both the accessibility hierarchy and the givenness hierarchy assume that
differences in the presumed cognitive status of a referent in the mind of the addressee (reader
or listener) are reflected by a speaker or writer’s choice of demonstrative form, but the provided
evidence for these claims remains unconvincing. Of course, this does not invalidate the
hierarchies as a whole, but it does question the specific assumptions they make about
demonstratives. Before explaining a speaker’s or writer’s choice of endophoric demonstrative
form in an alternative way in the context of our conceptual framework, we will now first review
existing empirical work on the topic.
5.2. The study of endophoric demonstrative use
In general, at least three types of methodological approaches can be distinguished in
the empirical study of endophoric demonstrative reference. First, experimental work on the
production and comprehension of demonstratives in an endophoric context is surprisingly
scarce. Given the longstanding experimental tradition of investigating the cognitive status of
different types of anaphors (e.g., pronouns vs. nouns), it is striking that hardly any study in this
domain can tell us whether there is a difference in how speakers (or writers) and listeners (or
readers) produce or comprehend ‘proximal’ vs. ‘distal’ anaphoric demonstrative forms. It
should be relatively straightforward to carefully manipulate activation-sensitive variables like
a referent’s syntactic position, its position in a sentence, or its referential distance to the
antecedent in an experimental context. A notable exception (Çokal et al., 2014) experimentally
contrasted and tested a distance-based (i.e., that referring to topics that were introduced earlier
than this, cf. McCarthy, 2002) and a focus-based (i.e., this referring to newer information than
that, cf. Strauss, 2002) accessibility view of the difference between ‘proximal’ and ‘distal’
demonstrative forms in an endophoric context. Their eye tracking and completion task results,
interestingly, showed that ‘proximal’ and ‘distal’ demonstratives are largely fishing in the same
pond. In other words, no clear and straightforward correlation between the presumed
accessibility of a referent and the production and comprehension of specific demonstrative
forms was observed (Çokal et al., 2014).
Second, qualitative studies have provided fine-grained speculative analyses of
interesting cases of demonstrative use based on acceptability judgments of either invented or
naturally observed examples. Such approaches have for example identified and evaluated
specific instances of recognitional thatN (Consten & Averintseva-Klisch, 2012), indefinite
thisN (Maclaren, 1982; Prince, 1981a), interactional that (Cheshire, 1999), restrictive that
(Maclaren, 1982), emotional that (Chen, 1990; Lakoff, 1974), and cataphoric use of
demonstratives (Chen, 1990). Most of such studies focus on exceptional, often semi-anaphoric
and mostly ‘distal’ cases alone rather than on the majority of demonstrative anaphors where
“one could be replaced by the other with very little effect on the meaning” (Stirling &
Huddleston, 2002, p. 1506). Therefore, like the experimental study discussed above,
qualitative studies also do not convincingly disclose what factors may drive a speaker’s or writer’s
choice for one demonstrative form over another in a given endophoric setting.
Third, corpus-based studies have been carried out with the potential to provide
distributional evidence on factors influencing a speaker’s or writer’s choice of demonstrative
form in endophoric use (Botley & McEnery, 2001a, 2001b; Byron & Allen, 1998; Maes, 1996;
Petch-Tyson, 2000). Testing the theoretical views on demonstratives in the accessibility
hierarchy and the givenness hierarchy discussed above, these studies did not offer converging
evidence in favor of the presumed relation between a referent’s cognitive status and the used
demonstrative form. What they do show, first, is that anaphoric demonstratives (i.e.,
demonstratives with an NP or propositional antecedent) are in general more frequent than non-
anaphoric ones. More importantly in the context of this paper, they also indicate that the relative
proportions of occurrence of ‘proximal’ vs. ‘distal’ demonstrative anaphors vary widely and in
different directions across different corpora.
Specifically, the proportion of use of a given demonstrative form (e.g., this versus that)
seems to vary strongly as a function of text or discourse genre. For instance, researchers in the
field of English as a second language (L2) collected academic essays from students in different
countries, and compared their demonstrative use with similar essays written in students’ native
language (L1) (e.g., Blagoeva, 2004; Labrador, 2011; Lenko-Szymanska, 2004; Petch-Tyson,
2000; Sun-Young, 2009). The varied results of under- or overuse of demonstrative forms
between L1 and L2 are less relevant here than the observation that on average about 70% of all
demonstrative forms in all these corpora are ‘proximal’. This regularity is presumably found
more generally in the broader genre of scientific, expository literature (Gray, 2010).
Conversely, corpora of interactional spoken discourse consistently show (extreme) preferences
for ‘distal’ anaphors (Byron & Allen, 1998; Passonneau, 1989; see also Diessel, 1999, p. 119).
Such a predilection for anaphoric use of ‘distal’ demonstratives can also be found in news
corpora (Botley & McEnery, 2001a) in which information is clearly targeted towards the news
item’s consumer. Other genre categorizations, such as fiction vs. non-fiction, do not directly
seem to result in clear preferences, probably because they represent too rough a distinction
(Ariel, 1988; Kirsner, 1979; Labrador, 2011). Nevertheless, the specific text or discourse genre
seems a clear and reliable top-down factor influencing a speaker’s or writer’s choice of
demonstrative form.
On the basis of the experimental, qualitative, and corpus studies discussed above, we
conclude that it is time to broaden the perspective on endophoric demonstratives by shifting
attention from activation-sensitive discourse structural variables (e.g., ‘accessibility’ or
‘givenness’) to a comprehensive view that highlights the importance of the interaction between
speaker (or writer), listener (or reader), and referent at a psychological level. Specifically, we
propose that the bulk of anaphoric demonstratives, regardless of their specific form, express
the same cognitive status, namely the fact that a referent has been or can be activated on the
basis of previous discourse information. We will argue below that the different demonstrative
forms reflect subtle pragmatic and interactional inferences that significantly exceed the level
of simply ‘finding the intended referent’.
5.3. A comprehensive account of endophoric demonstrative use
The observation that text or discourse genre plays a fundamental role in driving a
speaker’s or writer’s choice of demonstrative form is indeed best explained in terms of the
presumed relation between speaker/writer, addressee, and referent in the mental model of the
speaker/writer. We propose that an increasing preference for ‘distal’ demonstrative anaphors
is observed when the role of the addressee becomes more prominent in the discourse setting at
hand (as in interactional and narrative discourse), while an increasing preference for ‘proximal’
demonstrative anaphors is found when speakers feel more responsible themselves for the
produced discourse, as in an expository context. Indeed, in a conversational corpus study, it
was observed that “that frequently co-occurs with features marking interpersonal involvement
in contexts where, in principle, it would seem equally possible for speakers to have chosen to
use this. This, on the other hand, tends to co-occur with linguistic features that encode the
speaker's own involvement in what is being said” (Cheshire, 1996, p. 375). Likewise, the strong
‘proximal’ preference shown in corpora of academic and scientific texts can be explained by
an assumed primordial psychological proximity between speaker and topic in the context of an
addressee to whom the topic (and, as such, the mentioned referents) is assumed to be
psychologically more distant. At the same time, the overwhelming preference for ‘distal’
demonstratives in narrative news corpora suggests a more intensive desired interaction with
and appeal to the text’s intended addressee(s). The use of a ‘proximal’ demonstrative thus
locates the topic of discourse and its referents in close psychological proximity to the
knowledgeable speaker or writer, while the use of a ‘distal’ demonstrative moves the referent(s)
into the shared space between speaker and addressee, and as such psychologically towards the
addressee.
Similar interactional inferences apply to specific types of demonstrative anaphors as
well. For example, the preference in expository contexts for speakers to construe modified
thisN anaphors may reflect the fact that a speaker is presenting information that is new to the
addressee (reminiscent of indefinite thisN). Likewise, the preference in narrative discourse for
long thatN anaphors (reminiscent of recognitional thatN) suggests an appeal to the addressee
to jointly engage in the narrative. Furthermore, cases of attitudinal demonstratives,
predominantly ‘distal’ ones, can be seen as weak variants of (mostly) non-anaphoric pragmatic
uses, with a positive appeal towards the addressee (cf. a typical greeting in Dutch such as ‘Ha
die Bob’, literally ‘Hey that Bob’; Kirsner, 1979, where the ‘proximal’ alternative is not a
reasonable option).
The presumed cognitive importance of the basic speaker-addressee dyad and the
relative location of a referent in their psychologically shared space is further supported by the
usage patterns of typical non-anaphoric demonstratives. Deictic ‘proximal’ demonstratives, for
instance, can be used as exclusive devices to refer to the nearest possible referents in the
endophoric context, i.e., those in the here-and-now of discourse, and in related deictic functions
such as quoted or reported speech (e.g., in news reports, Botley & McEnery, 2001b).
Furthermore, the association of distal demonstratives with an active role of the addressee is
substantiated by a larger variety of loose that references, which can be read as an invitation
and a signal to provide the addressee with the freedom to construct a suitable interpretation of
the referent on the basis of the available contextual information. In such cases, the speaker or
writer thus moves the referents psychologically towards the addressee. Indeed, ‘distal’ forms
are more productive in cases of loose or deferred anaphoric reference, for example in the case
of a referent shift between antecedent and anaphor (e.g. “John’s behavior is an exact match of
that of Peter”), a shift from a specific to a generic interpretation (e.g., Bowdle & Ward, 1995),
or a bridge between referents (e.g., A car drove by. The engine stuttered. Then another car
drove by. That engine stuttered, too; see examples in Apothéloz & Reichler-Béguelin, 1999;
Lücking, 2018).
Clearly, we do not intend to say that the role and importance of the addressee have been
neglected in earlier work. On the contrary, addressee assumptions have always been crucial in
defining cognitive statuses. For example, in work discussing the use of ‘familiar’ that, the
addressee is assumed to be “able to uniquely identify the intended referent because he already
has a representation of it in memory” (Gundel et al., 1993, p. 278). But once we assume that
most of the endophoric demonstratives easily tolerate replacement by alternative, competing
demonstrative forms without ‘losing the referent’ in the mind of the listener or reader, we have
to acknowledge that these purely identification-based addressee assumptions need to be
updated. This conclusion is in line with the observation that “demonstrative determiners encode
procedural meaning which does not necessarily or only guide the hearer to the intended
referent, but may in some cases contribute to what is implicitly communicated as well” (Scott,
2013, p. 56). In what follows, we explore how our conceptual framework of demonstrative
reference incorporates this perspective on endophoric demonstratives. We will do so by
distinguishing once more between the framework’s three different levels (typological,
cognitive, and sociocultural).
5.4. The conceptual framework of demonstrative reference in endophoric settings
As to the bottom, typological level of the framework, there are several languages with
demonstrative forms that are exclusively used as anaphors, but in most languages the existing
exophoric terms are also used in endophoric contexts (Diessel, 1999; Levinson, 2018).
Therefore, the typological level of our conceptual framework will for many languages be
identical or similar across endophoric and exophoric contexts. This overlap in lexical forms
used across exophoric and endophoric contexts makes it intuitively plausible that the choice of
demonstrative forms in endophoric use is to a certain extent affected by the three types of
cognitive variables at the middle level of the exophoric framework.
At the cognitive level, we previously distinguished between physical, psychological,
and referent-intrinsic variables influencing a speaker’s choice of demonstrative form in
exophoric settings. To what extent do these three types of factors indeed influence the use of
demonstratives in reference to elements of the ongoing discourse?
First, it seems trivial that endophoric demonstratives are not sensitive to physical factors
such as the visibility or relative physical/spatial location of a referent, as the endophoric
referent is typically located in the ephemeral (for spoken) or displaced (for written) sphere of
discourse (Clark, 2020). We have seen that the ‘physical distance’ between referent and
antecedent has been proposed to drive the choice of demonstrative form (Ariel, 1990), but that
this proposal was later falsified on the basis of more extensive, in-depth corpus analyses (e.g.,
Botley & McEnery, 2001a, 2001b; Maes, 1996). One exceptional situation in which physical
factors could play a role may be found in situations where discourse topics (person, object,
event) are visibly present in interactional endophoric contexts. However, it is questionable
whether in such contexts the demonstrative is used purely endophorically. In sum, as in
exophoric settings (Peeters & Özyürek, 2016), it is not physical factors that are primary in
driving an individual’s choice of endophoric demonstrative form.
Second, psychological factors seem fundamental in driving a speaker or writer’s choice
of endophoric demonstrative form by shaping the interaction between speaker, addressee, and
referent. We assume that speakers and writers commonly keep track of the psychological
proximity of a referent in their own mental model in relation to the mental model of their
addressee, and the degree of assumed joint attention between speaker/writer and addressee on
the referent. The chosen demonstrative form will often reflect the relative position of the
speaker or writer in relation to the addressee, as a function of the broader discourse genre, and
discloses where exactly referents are situated in the assumed (jointly attended) shared space
between speaker/writer and addressee. This can be psychologically relatively close to the
speaker, as in expository contexts, or more towards the addressee, as in interactional and
narrative discourse. We thus assume that the presumed psychological distance of a referent in
the mind of the addressee is an important factor in driving the speaker’s or writer’s choice of
demonstrative form at the cognitive level. We propose that the relative importance of this factor
is top-down influenced by genre knowledge, a factor that plays a crucial role at the sociocultural
level of the framework (see below).
Third, it has been hypothesized that referent-intrinsic characteristics such as animacy,
manipulability, or more fine-grained semantic characteristics of a referent may implicitly guide
a writer’s choice of demonstrative form (Rocca, Tylén, et al., 2019; Rocca & Wallentin, 2020).
We deem it unlikely that such subtle influences manage to beat genre affordances or
interactional strategies of speakers (see below). As an example, a few weeks after the outbreak
of the Covid-19 virus, a daily Google search for ‘this virus’ versus ‘that virus’ showed a stable
and substantial preference for this, suggesting an (expository) genre effect rather than a
predominant use of the ‘distal’ demonstrative as a function of pejorative qualities of the
referent. The influence of referent-intrinsic factors on the choice of endophoric demonstrative
form may thus be relatively small.
Nevertheless, the current status of a referent in the presumed common ground between
speaker and addressee could represent one flexible referent-specific variable influencing a
speaker’s choice of demonstrative form. In a study of language use in contexts of negotiation,
a systematic difference between unresolved (‘proximal’) and resolved (‘distal’) negotiation
topics was observed (Glover, 2000) - a dichotomy which can easily be interpreted as reflecting
a difference in spatio-temporal, and consequently psychological, distance between
interlocutors and the referent as a function of its current status (near, current, still under
discussion vs. far, past, finished). As such, the communicative status of a referent could
influence a speaker’s choice of endophoric demonstrative form as a temporary and flexible
referent-intrinsic factor.
On the sociocultural level, we consider the affordances provided by genre-related
knowledge as most crucial in influencing demonstrative variation in a top-down fashion. Text
or discourse genre, as such, is the endophoric counterpart of the exophoric context
affordances we discussed before. In spoken interaction, these affordances themselves differ
from what we discussed in the exophoric sections, as the prototypical situation of two
interlocutors engaged in talking about spatially arranged (and sometimes competing) visible
objects only represents one aspect of natural conversations. Instead, we consider the possibility
to have a physical interaction with an addressee as the crucial predictor for the endophoric
‘distal’ preference in narrative and interactional settings, as it enables speakers to immediately
express their social intention to create joint attention to a non-physical referent with the
addressee. More broadly, specific cultural genre knowledge (‘language characteristics’) can
afford and stimulate a large range of assumed relations between speaker, addressee, and
referent.
In addition to context affordances such as text and discourse genre, we predict that
personal characteristics of the speaker or writer are crucial for their choice of demonstrative
form, also in endophoric settings. Endophoric referential choices are based on speakers’
assumptions rather than on rock hard observable evidence (Prince, 1981b, p. 232). Choices can
differ across individuals and contexts, because discourse conditions do not always allow for a
univocal choice, and speakers will differ in their ability to construct adequate assumptions
about the mental model of their addressee(s). This may be due to individual speaker differences
in memory span and theory of mind abilities, or because speakers take the freedom to deviate
from the referential default, for instance by purposefully using a first-mention demonstrative
or demonstrative NP rather than a simple pronoun. For activation-based expressions, speakers’
leeway is intelligently covered by the idea that cognitive statuses are implicationally related,
predicting that “a form can appropriately encode the necessary and sufficient status (the status
immediately above the form in the table) as well as all higher statuses” (Gundel et al., 1993, p.
290). But once we assume that demonstrative forms largely encode the same cognitive status,
it is reasonable that they will show relatively more individual and less systematic variation than
other types of referring expression. Speakers with stronger theory of mind abilities, relatively
more genre knowledge, or enhanced general rhetorical skills will be able to exploit putative
implicational differences between different demonstrative forms more extensively and more
strategically than others. Furthermore, individual variation in choice of demonstrative form
will vary as a function of the degree to which discourse genre characteristics have been
contextually specified.
In sum, we argued in this section that different endophoric demonstratives typically
access referents with the same or a similar cognitive status, and that they carry subtle pragmatic
inferences related to the presumed relation between speaker, addressee, and referent at a
psychological level. We assume that cognitive abilities and stylistic, rhetorical skills of
individual speakers and writers lead to substantial variation in their choice of demonstrative
form, and consider (cultural knowledge on) genre affordances as the most predictive top-down
variable explaining the distribution of endophoric demonstratives across different contexts.
This knowledge is informative about the position of the speaker or writer in relation to their
addressee(s), and influences where exactly referents will be situated in the assumed (jointly
cognitively attended) shared space between speaker/writer and addressee. Physical factors and
referent-intrinsic variables on the cognitive level are considered less decisive.
Clearly, much work remains to be done to validate or reject our conceptual framework
of demonstrative reference, also with regard to its endophoric predictions. First, we need more
reliable corpus evidence (natural and elicited) that directly compares the use of demonstratives
across discourse genres. The development of a decent endophoric toolbox, comparable to the
one in use for elicitation of demonstratives in exophoric settings (Wilkins, 2018), would be
helpful in this respect. Second, more experimental evidence is needed, for instance through
controlled experiments investigating the effect of genre on individuals’ choice of demonstrative
form in different contexts, and on individual cognitive variability in relation to genre
knowledge and genre specificity.
6. Conclusions and Outlook
In this paper, we introduced a novel conceptual framework of demonstrative reference. Based
on a review of the literature, we proposed that physical, psychological, and referent-intrinsic
factors dynamically interact to influence what demonstrative form a speaker will use in a given
context. However, the relative influence of these factors themselves was argued to be a function
of the cultural language setting at hand, the theory-of-mind capacities of the speaker, and the
affordances of the specific context in which a speech event takes place. We showed that the
framework is capable of reconciling seemingly irreconcilable results, and that it may to a large
extent generalize to situations of endophoric reference. Box 1 summarizes a set of ten testable
predictions that our conceptual framework makes, which can be investigated by future work.
Box 2 additionally presents several open questions in the study of demonstrative reference. In
this final section, we will discuss such remaining open questions and look out on promising
computational, neuroscientific, and practical developments in the study of demonstrative
reference and its applications.
Box 1. Ten testable predictions derived from the conceptual framework of demonstrative
reference introduced in this paper.
1. Physical, psychological, and referent-intrinsic factors jointly influence a speaker’s choice of exophoric
demonstrative form in any given communicative setting.
2. The relative importance of these three types of factor differs as a function of the affordances of the
specific speech situation.
3. In natural, communicative situations, psychological factors are by default more influential than physical
factors in shaping a speaker’s choice of exophoric and endophoric demonstrative form.
4. The more important the role of the addressee in the speech situation, the smaller the influence of speaker-
anchored physical factors on the speaker’s choice of demonstrative form.
5. The relative influence of physical vs. psychological factors in shaping speakers’ and writers’ choice of
demonstrative form varies as a function of their theory of mind capacities.
6. Languages differ in the relative importance of individual physical, psychological, and referent-intrinsic
factors that influence a speaker’s choice of demonstrative form in a given language.
7. Discourse genre is the most important predictor of a speaker’s or writer’s choice of endophoric
demonstrative form.
8. Expository discourse will elicit clear overall preferences for the use of ‘proximal’ demonstratives,
whereas interactional and narrative discourse will elicit clear overall preference for ‘distal’ demonstratives.
9. The bulk of anaphoric demonstratives, regardless of their specific form, express the same cognitive status,
namely that a referent has been or can be activated on the basis of previous discourse information.
10. The production and comprehension of demonstratives is supported at a neurobiological level by
interactions between the perisylvian language network, the theory-of-mind network, and a visuo-attentional
network, which are together supervised online by activation of areas involved in cognitive control.
Box 2. Outstanding questions in the study of demonstrative reference.
1. To what extent does the conceptual framework of demonstrative reference as depicted in Figure 1 generalize
to cases of definite and indefinite reference (e.g., noun phrases including definite and indefinite articles)
beyond demonstratives?
2. What is the extent of variability across languages in terms of the basic configuration of the conceptual
framework?
3. To what extent do similar factors drive a speaker’s choice of demonstrative form and the exact form and
kinematics their pointing gesture takes?
4. To what extent can corpus data and experimental findings be used to determine the overall extent of
individual variation in speakers’ choice of demonstrative form?
5. What are the basic parameter settings of a computational implementation of the conceptual framework?
6.1. Beyond demonstratives: Referring expressions in general
Our review of the literatures on exophoric and endophoric demonstratives revealed an
interesting difference between these two related but often distinctly approached topics of study.
We saw that endophoric demonstratives are typically considered and studied as part of a larger
set of referring expressions available to the language user, whereas research on exophoric
demonstratives often focuses on the various factors influencing a speaker’s choice of one
demonstrative form versus another. This discrepancy in empirical scope naturally raises the
open question whether the conceptual framework of demonstrative reference, as introduced in
this paper, generalizes to a broader set of referring expressions (e.g., definite and indefinite
articles, personal pronouns such as English it) beyond demonstratives. In the case of exophoric
reference, for instance, do the various physical, psychological, and referent-intrinsic factors
identified at the middle, cognitive level of the framework also influence whether speakers will
use a demonstrative (versus an alternative referring expression) at all? In the case of endophoric
reference, for example, how influential is discourse genre in driving speakers’ choice of any
referring expression on the scale between zero anaphora and full definite expressions?
A promising development in the experimental study of exophoric demonstratives in this
vein is presented by a recent cross-linguistic study in which a well-established experimental
paradigm to study demonstratives (in isolation) was extended to study the use of
demonstratives vs. definite and indefinite articles (Skilton & Peeters, under review). This study
observed that speakers of Dutch (the Netherlands) consistently preferred to use noun phrases
containing a definite article in reference to objects that had been recently introduced and were
in cognitive joint attention between speaker and addressee (cf. Coello & Bonnotte, 2013;
Kirsner, 1993). Speakers of the Amazonian language isolate Ticuna (Peru), however,
consistently used demonstrative noun phrases in reference to the same objects under similar
experimental circumstances. This finding suggests that there may be interesting observations
to be made once exophoric researchers start broadening their horizons towards studying
referring expressions beyond demonstratives. Furthermore, it raises the question to what extent
there is variability across languages in terms of the basic configuration of the conceptual
framework in general, and when extended to include various referring expressions beyond
demonstratives.
6.2. Beyond demonstratives: The form and kinematics of pointing gestures
Another open issue is the extent to which our conceptual framework may describe and
explain not only a speaker’s choice of demonstrative form, but also the exact form their
pointing gesture takes when they refer to something. Three observations suggest that there may
be high degrees of overlap in the mechanisms involved in the speaker’s selection of a specific
demonstrative form, as described by our framework, and their selection of a type of pointing
gesture (e.g., index-finger pointing, thumb pointing, whole-hand pointing) and its specific
kinematics (e.g., fast vs. slow movement; small vs. large gesture).
First, it has been widely observed cross-linguistically that the demonstrative forms
speakers predominantly use differ for referents located in the space directly in front of them
compared to referents located behind them (Levinson, 2018). This distinction seems to align
well with the fact that in many language communities speakers often point with their thumb
when a referent is located behind them, and with their index-finger when a referent is located
in front of them (e.g., Kendon & Versante, 2003). Furthermore, referents in a relatively more
distant location typically elicit pointing gestures that have a larger stroke amplitude compared
to referents that are located relatively more nearby (Gonseth et al., 2013, 2017). Thus, the
relative location of a referent may influence the form a pointing gesture takes, in terms of both
its type (e.g., index-finger vs. thumb) and the specific kinematic parameters (e.g., stroke
amplitude) of the token.
Second, it has been observed that invisible referents, such as when giving an addressee
directions in the streets towards a currently invisible endpoint, often elicit whole-hand pointing
gestures whereas visible referents may be more typically referred to using index-finger pointing
(Flack et al., 2018; Wilkins, 2003). This observation seems to align with the fact that visibility
may impact speakers’ choice of demonstrative form, as incorporated in the conceptual
framework of demonstrative reference.
Third, experimental studies have observed that speakers meticulously tailor the
kinematics of their index-finger pointing gestures to the communicative needs of their
addressees (e.g., Cleret de Langavant et al., 2011; Liu et al., 2019; Peeters, Chu, et al., 2015).
For instance, speakers commonly lower the velocity of their pointing gesture, and keep their
index finger in apex position for a significantly longer time interval, when a referent is assumed
to be communicatively more relevant to the addressee (Peeters et al., 2013). Arguably, this
offers the addressee more time to correctly detect the location and identity of the intended
referent. These experimental findings are in line with the observation that pointing gestures in
natural interactions differ in size as a function of whether they carry more or less foregrounded
information for the addressee (Enfield et al., 2007). These observations thus also align nicely
with the finding that a speaker’s choice of demonstrative form varies as a function of the
presumed communicative relevance of a referent for the addressee (Rocca, Tylén, et al., 2019).
Taken together, it seems that similar factors (e.g., the relative location of a referent, its
visibility, and its presumed cognitive status in the mind of the addressee) shape a speaker’s
choice of demonstrative form as well as the form and kinematics of their pointing gesture.
Similar top-down factors (language characteristics, speaker characteristics, and context
affordances) may furthermore influence which of these cognitive factors play a more important
role in shaping the form and kinematics of a pointing gesture in a given context (Cooperrider,
2020; Kita, 2003). Language communities differ (‘language characteristics’) in the overall
proportion of use of specific articulators (hand, nose, chin, etc.) when pointing (Cooperrider et
al., 2018; Cooperrider & Núñez, 2012; Enfield, 2001; Orie, 2009; Sherzer, 1973). Individuals
will differ (‘speaker characteristics’) in the form their pointing gesture will take under similar
circumstances, as the relation between pointing and individual differences in theory-of-mind
development has been clearly established (e.g., Baron‐Cohen, 1989; Camaioni et al., 2004;
Tomasello et al., 2007). The broader physical and social context may again modulate which
cognitive factors are considered more important in a given setting (‘context affordances’).
In sum, we thus propose that our conceptual framework of demonstrative reference may
generalize surprisingly well to manual ways of referring. Both for speech communities that use
the hands in various ways to point, and for speech communities that commonly point using
articulators beyond the hands (e.g., the chin, nose, or lips) in addition to manual articulators,
the same factors that influence a speaker’s choice of demonstrative form may also influence
the form and kinematics of their pointing gestures. More work is needed to specifically test
these proposed parallels in the mechanisms leading to the articulation of demonstratives and
gestures.
6.3. A computational approach to the study of demonstrative reference
An additional open question is how the conceptual framework of demonstrative
reference, as introduced in the current paper, can be formalized and computationally
implemented. Specifically, the basic activation levels of the different factors at the cognitive
level of the framework, and the default weights of the connections between the entities
specified at the three levels, currently remain speculative. We envision two promising routes
towards a better computational understanding of demonstrative reference.
First, automatic text generation (Gatt & Krahmer, 2018; Reiter & Dale, 1997), that is,
artificial intelligence systems that convert input into fluent, coherent text, represents one
example in which (in this case, endophoric) demonstrative reference plays a potentially
important role. A key question for such existing computational systems is how to refer to target
referents throughout a text (see Krahmer & van Deemter, 2012, for a survey). This involves
the choice of form of a referring expression (should a target object be referred to using, say, a
proper name, a pronoun, or a definite or demonstrative description?). This choice problem has
been addressed in a number of studies (e.g., Callaway & Lester, 2002; Castro Ferreira et al.,
2016; Henschel et al., 2000). These studies often rely on linguistic factors, like syntactic role
(subject, object, etc.), recency, and salience (as modelled, for example, using centering theory,
Grosz et al., 1995; Poesio et al., 2004). None of these studies has so far paid any attention to
the choice between ‘this’ and ‘that’.
Nevertheless, it would be highly interesting to explore whether insights from such
earlier computational models could be used to formalize our proposed conceptual framework
of demonstrative reference. This would not only be helpful for the earlier computational
models, allowing for more fine-grained (and hence more human-like) outputs, but would also
help formalize our current framework (cf. van Deemter et al., 2012). This type of computational
cognitive modelling forces one to be explicit about the model parameters and hence potentially
furthers our understanding of the interplay between the relevant factors involved
(Lewandowsky & Farrell, 2010; van Gompel et al., 2019). Furthermore, it would require
quantifying the extent of individual variation in demonstrative use across speakers of the same
language. Potentially, automatic text generation could be used to disclose the various weights
in a computational implementation of our conceptual framework.
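To make concrete what such an implementation might involve, the following Python fragment offers a minimal sketch under strong assumptions; it is not a formalization proposed in this paper. It treats each middle-level factor as an activation between -1 (favouring the distal form) and 1 (favouring the proximal form), combines the activations through weights, and lets top-level influences rescale those weights. The function name, the factor names, the logistic link, and all numerical values are hypothetical placeholders chosen for illustration only.

```python
import math


def proximal_probability(activations, weights, context_gain=None):
    """Return a hypothetical probability of choosing the proximal form.

    activations:  factor name -> value in [-1, 1]; positive values favour
                  the proximal form ('this'), negative the distal ('that').
    weights:      factor name -> default connection weight (assumed values).
    context_gain: factor name -> multiplier standing in for top-level
                  influences (language setting, speaker, context affordances).
    """
    context_gain = context_gain or {}
    evidence = sum(
        weights[name] * context_gain.get(name, 1.0) * value
        for name, value in activations.items()
    )
    # Logistic link (an assumption): maps summed evidence onto (0, 1).
    return 1.0 / (1.0 + math.exp(-evidence))


# Illustrative values only; none of these numbers are empirical estimates.
activations = {"physical_proximity": 0.8, "addressee_attention": -0.5}
weights = {"physical_proximity": 1.0, "addressee_attention": 1.0}

p_default = proximal_probability(activations, weights)
# A context in which physical proximity is weighted more heavily.
p_spatial = proximal_probability(
    activations, weights, context_gain={"physical_proximity": 3.0}
)
```

Fitting such weights to production data, rather than setting them by hand, would be the substantive step; the sketch only shows the kind of explicitness that computational modelling demands.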
Second, it would also be interesting to see whether the rational speech act (RSA) model
(Frank & Goodman, 2012; Goodman & Frank, 2016) could be exploited for demonstrative
reference. An attractive property of this computationally explicit model is that it simultaneously
models the production and the comprehension of referring expressions, with speakers aiming
to produce expressions that are maximally beneficial for listeners, and listeners assuming that
speakers are maximally helpful. The RSA model has been shown to work well for descriptions,
but demonstrative reference has, to the best of our knowledge, not been explored using its set-
up.
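As a rough illustration of what an RSA treatment might look like, the sketch below applies the standard recursion (a literal listener, a pragmatic speaker, and a pragmatic listener) to a toy scene with one near and one far referent. The graded literal meanings of ‘this’ and ‘that’, the uniform prior, the rationality parameter alpha, and the omission of utterance costs are all assumptions made for illustration; they are not estimates from data, and the code is not taken from any published RSA implementation.

```python
import math

REFERENTS = ["near_object", "far_object"]
UTTERANCES = ["this", "that"]

# Hypothetical graded literal meanings: how well each form fits each referent.
LEXICON = {
    "this": {"near_object": 0.9, "far_object": 0.3},
    "that": {"near_object": 0.4, "far_object": 0.9},
}
# Assumed uniform prior over referents.
PRIOR = {"near_object": 0.5, "far_object": 0.5}


def normalize(scores):
    """Rescale non-negative scores so that they sum to one."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def literal_listener(utterance):
    """P_L0(referent | utterance), proportional to meaning times prior."""
    return normalize({r: LEXICON[utterance][r] * PRIOR[r] for r in REFERENTS})


def pragmatic_speaker(referent, alpha=1.0):
    """P_S1(utterance | referent), proportional to exp(alpha * log P_L0).

    Utterance costs are omitted here, which is a simplifying assumption.
    """
    return normalize({
        u: math.exp(alpha * math.log(literal_listener(u)[referent]))
        for u in UTTERANCES
    })


def pragmatic_listener(utterance, alpha=1.0):
    """P_L1(referent | utterance), proportional to P_S1 times prior."""
    return normalize({
        r: pragmatic_speaker(r, alpha)[utterance] * PRIOR[r] for r in REFERENTS
    })
```

Under these assumed meanings, the modelled speaker prefers ‘this’ for the near referent and ‘that’ for the far one. The factors of the present framework would have to enter through the literal meanings, the prior, or a cost term, and which of these is most appropriate is an open modelling question.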
6.4. The neurobiology of demonstrative reference
In addition to furthering our understanding of demonstrative reference through
computational modelling, more work is needed to disclose the neurobiological infrastructure
that allows us to refer to entities in the world around us and to comprehend a speaker’s
multimodal message when they do so. After all, it is our brain that controls our speech and
gestures when we guide our addressee’s attention in the direction of a referent. Recent fMRI
studies investigating how listeners integrate situated referential speech (and gesture) with
properties of the immediate visual context suggest that several networks of brain areas (e.g.,
implicated in language processing, theory of mind, and visuo-attentional orienting) are
involved in allowing one to understand a speaker’s multimodal, referential utterance (Peeters
et al., 2017; Rocca, Coventry, et al., 2019). When producing a demonstrative as part of a larger
utterance, the demonstrative’s specific form needs to be retrieved from lexical memory and
integrated into a larger syntactic structure, before it is articulated (Levelt, 1989). Core areas of
the perisylvian language network, such as left inferior frontal gyrus and middle temporal gyrus,
should be involved in those operations in language production (Hagoort, 2019) as they are in
the comprehension of referential speech (Peeters et al., 2017; Peeters, Hagoort, et al., 2015).
Additionally, the theory-of-mind network (Frith & Frith, 2006) is arguably involved in
calculating the perspective of one’s addressee on a referent and the cognitive status of the
referent in the mental situation model of the addressee (Brunetti et al., 2014; Cleret de
Langavant et al., 2011; Committeri et al., 2015; Peeters, Chu, et al., 2015) as it is in the
addressee’s online comprehension of a speaker’s intent (Clark et al., 1983; Peeters et al., 2017).
Furthermore, a “network of frontoparietal areas previously implicated in the construction,
maintenance, and navigation of visuospatial representations” (Rocca, Coventry, et al., 2019, p.
12) may be involved in the online orienting of visuospatial attention between speaker,
addressee, and referent also during the selection and articulation of demonstratives.
These various functional networks (language, theory-of-mind, visuo-attentional), which
arguably provide the neurobiological foundation for successful demonstrative reference, may
be dynamically activated in parallel while they subserve different aspects of the conceptual
framework depicted in Figure 1. The bottom, typological level of the framework should be
represented at a neural level as stored lexical representations of referring expressions and their
grammatical properties in speakers’ long-term lexical memory, for instance in temporal areas
of cortex (Hagoort, 2013). The influence of physical factors, such as a referent’s relative
physical location or its visibility, in the choice of demonstrative form may be supported by
activation of areas implicated in orienting visuospatial attention. Psychological factors may
rather be taxing the speaker’s theory-of-mind network as they relate to identifying the cognitive
status of a referent in the mental model of the addressee (Peeters et al., 2017). The top level of
the framework can be conceived of as a series of top-down factors dynamically adjusting the
context-dependent importance of middle-level factors, a process that would typically require
activation of areas implicated in cognitive control. Clearly, these considerations currently
remain speculative as more neuroscientific work is needed to disclose how the brain allows us
to refer in speech and gesture to entities in the world around us.
6.5. Demonstrative reference in human-computer interaction
Thus far, we have approached the use of demonstratives from a theoretical point of
view. The study of demonstrative reference, however, also has relevant practical implications.
Ever since researchers started thinking about natural, spoken interactions with computer
systems - and long before such systems became a real possibility, as they are now, with virtual
assistants like Siri, Cortana, and Google Assistant - the possibility of using deictic gestures to
point the computer's attention to an object has been explored. One of the best known examples
in this vein is arguably described by Bolt (1980), who proposed to combine speech and gesture
as a new, natural input modality in a graphical user interface. Using the (at the time) nascent
technologies of speech recognition and location sensing, Bolt's system could automatically
interpret an exophoric instruction like ‘put that there’, where ‘that’ was understood to refer to
whatever was pointed at (Bolt, 1980). Systems in this mould often model physical properties of
the target referent, such as its size and physical distance, as formalized in Fitts' law (Fitts, 1954;
MacKenzie, 1992), explaining why targets that are closer and larger are relatively easier to
point at compared to targets that are smaller or further away. Generally, the spoken utterance
accompanying the pointing gesture has received little attention in those endeavours. But
exceptions exist, like van der Sluis and Krahmer (2007), who focus on the trade-off between
information in gesture and in words, predicting that imprecise pointing gestures are more often
accompanied by more extensive verbal information, while more precise pointing is
accompanied by less verbal information. Importantly, however, in none of these approaches
has any attention yet been devoted to the choice between ‘this’ and ‘that’. Future work could
incorporate theoretical insights on demonstrative reference into systems that allow for human-
computer interaction.
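For reference, the Fitts' law relation mentioned above can be written down compactly. The sketch below uses the Shannon formulation of the index of difficulty, log2(D/W + 1), which is common in human-computer interaction research (MacKenzie, 1992). The intercept a and slope b are constants that have to be fitted empirically; the default values used here, like the function names, are hypothetical placeholders.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time: a + b * index of difficulty.

    The defaults for a and b are placeholder values, not fitted estimates.
    """
    return a + b * index_of_difficulty(distance, width)


# A closer, larger target is predicted to be faster to point at
# than a smaller target that is further away.
easy = movement_time(distance=20.0, width=10.0)
hard = movement_time(distance=80.0, width=2.0)
```

Note that this relation captures only the motor difficulty of pointing; it says nothing about which demonstrative form accompanies the gesture.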
6.6. Towards the interdisciplinary study of demonstrative reference
To conclude, what this paper as a whole makes clear is that reaching a full (linguistic,
cognitive, and neurobiological) understanding of demonstrative reference requires combining
insights from various academic disciplines. Close collaboration is needed between i) linguists-
anthropologists typologically describing the demonstrative systems of the different languages
of the world and identifying factors that might influence the choice of demonstrative form in a
particular language on the basis of in-depth documentary and corpus-based work, ii)
experimental psychologists testing for the unique contribution of a proposed factor in different
languages and different experimental contexts and testing to what extent certain factors
influencing the choice of demonstrative form are universal or language-specific, iii)
computational linguists incorporating demonstrative reference in computational models of
language production to specify the mechanisms involved in the speaker’s choice of
demonstrative form, leading to new hypotheses for experimental psychologists to empirically
test, and ultimately iv) neuroscientists specifying the underlying neural infrastructure and its
dynamic activation in supporting the online selection of demonstratives in naturally occurring
multimodal communication. Demonstratives are best studied in the context of pointing
gestures and from both an exophoric and an endophoric perspective in relation to other referring
expressions. We believe this multidisciplinary endeavour is worth undertaking, as the
fundamental importance of demonstrative reference for human communication cannot easily
be overstated.
References
Alonso, M. (1968). Gramatica del español contemporaneo. Guadarrame.
Anderson, S. R., & Keenan, E. L. (1985). Deixis. In T. Shopen (Ed.), Language typology and
syntactic description (pp. 259–308). Cambridge University Press.
Apothéloz, D., & Reichler-Béguelin, M.-J. (1999). Interpretations and functions of
demonstrative NPs in indirect anaphora. Journal of Pragmatics, 31(3), 363–397.
https://doi.org/10.1016/S0378-2166(98)00073-3
Apperly, I. A. (2012). What is “theory of mind”? Concepts, cognitive processes and
individual differences. Quarterly Journal of Experimental Psychology, 65(5),
825–839. https://doi.org/10.1080/17470218.2012.676055
Ariel, M. (1988). Referring and accessibility. Journal of Linguistics, 24(1), 65–87.
https://doi.org/10.1017/S0022226700011567
Ariel, M. (1990). Accessing antecedents. Routledge.
Arnold, J. E. (2010). How Speakers Refer: The Role of Accessibility. Language and
Linguistics Compass, 4(4), 187–203.
https://doi.org/10.1111/j.1749-818X.2010.00193.x
Baayen, R. H., Piepenbrock, R., & van Rijn, H. (1993). The CELEX lexical database (CD-
ROM). University of Pennsylvania.
Bakeman, R., & Adamson, L. B. (1984). Coordinating Attention to People and Objects in
Mother-Infant and Peer-Infant Interaction. Child Development, 55(4), 1278–1289.
https://doi.org/10.2307/1129997
Bangerter, A. (2004). Using Pointing and Describing to Achieve Joint Focus of Attention in
Dialogue. Psychological Science, 15(6), 415–419.
https://doi.org/10.1111/j.0956-7976.2004.00694.x
Bara, B. G. (2010). Cognitive Pragmatics: The Mental Processes of Communication. MIT
Press.
Baron‐Cohen, S. (1989). The Autistic Child’s Theory of Mind: A Case of Specific
Developmental Delay. Journal of Child Psychology and Psychiatry, 30(2), 285–297.
https://doi.org/10.1111/j.1469-7610.1989.tb00241.x
Blagoeva, R. (2004). Demonstrative reference as a cohesive device in advanced learner
writing: A corpus-based study. Advances in Corpus Linguistics, 49, 297–307.
https://doi.org/10.1163/9789004333710_018
Bohnemeyer, J. (2018). Yucatec Demonstratives in Interaction: Spontaneous versus Elicited
Data. In S. C. Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins
(Eds.), Demonstratives in Cross-Linguistic Perspective (pp. 176–205). Cambridge
University Press.
Bolt, R. A. (1980). “Put-that-there”: Voice and gesture at the graphics interface. Proceedings
of the 7th Annual Conference on Computer Graphics and Interactive Techniques,
262–270. https://doi.org/10.1145/800250.807503
Bonfiglioli, C., Finocchiaro, C., Gesierich, B., Rositani, F., & Vescovi, M. (2009). A
kinematic approach to the conceptual representations of this and that. Cognition,
111(2), 270–274. https://doi.org/10.1016/j.cognition.2009.01.006
Botley, S., & McEnery, T. (2001a). Demonstratives in English: A Corpus-Based Study.
Journal of English Linguistics, 29(1), 7–33.
https://doi.org/10.1177/00754240122005170
Botley, S., & McEnery, T. (2001b). Proximal and Distal Demonstratives: A Corpus-Based
Study. Journal of English Linguistics, 29(3), 214–233.
https://doi.org/10.1177/00754240122005341
Bowdle, B. F., & Ward, G. (1995). Generic Demonstratives. Proceedings of the Twenty-First
Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession
on Historical Issues in Sociolinguistics/Social Issues in Historical Linguistics, 32–43.
Brennan, S. E., & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(6),
1482–1493. https://doi.org/10.1037/0278-7393.22.6.1482
Brown, P., & Levinson, S. C. (2018). Tzeltal: The Demonstrative System. In S. C. Levinson,
S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 150–175). Cambridge University Press.
Brown-Schmidt, S., Byron, D. K., & Tanenhaus, M. K. (2005). Beyond salience:
Interpretation of personal and demonstrative pronouns. Journal of Memory and
Language, 53(2), 292–313. https://doi.org/10.1016/j.jml.2005.03.003
Brunetti, M., Zappasodi, F., Marzetti, L., Perrucci, M. G., Cirillo, S., Romani, G. L., Pizzella,
V., & Aureli, T. (2014). Do You Know What I Mean? Brain Oscillations and the
Understanding of Communicative Intentions. Frontiers in Human Neuroscience, 8.
https://doi.org/10.3389/fnhum.2014.00036
Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation
of current word frequency norms and the introduction of a new and improved word
frequency measure for American English. Behavior Research Methods, 41(4),
977–990. https://doi.org/10.3758/BRM.41.4.977
Bühler, K. (1934). Sprachtheorie. Fischer.
Burenhult, N. (2003). Attention, accessibility, and the addressee: The case of the Jahai
demonstrative ton. Pragmatics, 13(3), 363–379.
https://doi.org/10.1075/prag.13.3.01bur
Burenhult, N. (2008). Spatial coordinate systems in demonstrative meaning. Linguistic
Typology, 12(1). https://doi.org/10.1515/LITY.2008.033
Byron, D., & Allen, J. (1998). Resolving Demonstrative Anaphora in the TRAINS93 Corpus.
https://urresearch.rochester.edu/institutionalPublicationPublicView.action?institutionalItemId=1357
Caldano, M., & Coventry, K. R. (2019). Spatial demonstratives and perceptual space: To
reach or not to reach? Cognition, 191, 103989.
https://doi.org/10.1016/j.cognition.2019.06.001
Callaway, C. B., & Lester, J. C. (2002). Narrative prose generation.
https://core.ac.uk/reader/82795450
Camaioni, L., Perucchini, P., Bellagamba, F., & Colonnesi, C. (2004). The Role of
Declarative Pointing in Developing a Theory of Mind. Infancy, 5(3), 291–308.
https://doi.org/10.1207/s15327078in0503_3
Capirci, O., Iverson, J. M., Pizzuto, E., & Volterra, V. (1996). Gestures and words during the
transition to two-word speech. Journal of Child Language, 23(3), 645–673.
https://doi.org/10.1017/S0305000900008989
Carlson, S. M., & Moses, L. J. (2001). Individual Differences in Inhibitory Control and
Children’s Theory of Mind. Child Development, 72(4), 1032–1053.
https://doi.org/10.1111/1467-8624.00333
Castro Ferreira, T., Krahmer, E., & Wubben, S. (2016). Towards more variation in text
generation: Developing and evaluating variation models for choice of referential
form. Proceedings of the 54th Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers), 568–577. https://doi.org/10.18653/v1/P16-1054
Chafe, W. (1976). Givenness, contrastiveness, definiteness, subjects, topics, and point of
view. In C. N. Li (Ed.), Subject and topic (pp. 25–55). Academic Press.
Chen, R. (1990). English demonstratives: A case of semantic expansion. Language Sciences,
12(2), 139–153. https://doi.org/10.1016/0388-0001(90)90009-6
Cheshire, J. (1996). That jacksprat: An interactional perspective on English that. Journal of
Pragmatics, 25(3), 369–393. https://doi.org/10.1016/0378-2166(95)00032-1
Cheshire, J. (1999). Taming the vernacular: Some repercussions for the study of syntactic
variation and spoken grammar. Cuadernos de Filología Inglesa, 8.
https://revistas.um.es/cfi/article/view/65681
Chu, C.-Y., & Minai, U. (2018). Children’s Demonstrative Comprehension and the Role of
Non-linguistic Cognitive Abilities: A Cross-Linguistic Study. Journal of
Psycholinguistic Research, 47(6), 1343–1368.
https://doi.org/10.1007/s10936-018-9565-8
Clark, E. V. (1978). Strategies for Communicating. Child Development, 49(4), 953–959.
https://doi.org/10.2307/1128734
Clark, E. V., & Sengul, C. J. (1978). Strategies in the acquisition of deixis. Journal of Child
Language, 5(3), 457–475. https://doi.org/10.1017/S0305000900002099
Clark, H. H. (1996). Using Language. Cambridge University Press.
Clark, H. H. (2020). Anchoring Utterances. Topics in Cognitive Science, n/a(n/a).
https://doi.org/10.1111/tops.12496
Clark, H. H., & Bangerter, A. (2004). Changing Ideas about Reference. In I. A. Noveck & D.
Sperber (Eds.), Experimental Pragmatics (pp. 25–49). Palgrave Macmillan.
https://doi.org/10.1057/9780230524125_2
Clark, H. H., & Krych, M. A. (2004). Speaking while monitoring addressees for
understanding. Journal of Memory and Language, 50(1), 62–81.
https://doi.org/10.1016/j.jml.2003.08.004
Clark, H. H., Schreuder, R., & Buttrick, S. (1983). Common ground and the understanding of
demonstrative reference. Journal of Verbal Learning and Verbal Behavior, 22(2),
245–258. https://doi.org/10.1016/S0022-5371(83)90189-5
Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring as a collaborative process. Cognition,
22(1), 1–39. https://doi.org/10.1016/0010-0277(86)90010-7
Cleret de Langavant, L., Remy, P., Trinkler, I., McIntyre, J., Dupoux, E., Berthoz, A., &
Bachoud-Lévi, A.-C. (2011). Behavioral and Neural Correlates of Communication via
Pointing. PLoS ONE, 6(3). https://doi.org/10.1371/journal.pone.0017719
Coello, Y., & Bonnotte, I. (2013). The Mutual Roles of Action Representations and Spatial
Deictics in French Language. Quarterly Journal of Experimental Psychology, 66(11),
2187–2203. https://doi.org/10.1080/17470218.2013.775596
Çokal, D., Sturt, P., & Ferreira, F. (2014). Deixis: This and That in Written Narrative
Discourse. Discourse Processes, 51(3), 201–229.
https://doi.org/10.1080/0163853X.2013.866484
Çokal, D., Sturt, P., & Ferreira, F. (2018). Processing of It and This in Written Narrative
Discourse. Discourse Processes, 55(3), 272–289.
https://doi.org/10.1080/0163853X.2016.1236231
Committeri, G., Cirillo, S., Costantini, M., Galati, G., Romani, G. L., & Aureli, T. (2015).
Brain activity modulation during the production of imperative and declarative
pointing. NeuroImage, 109, 449–457.
https://doi.org/10.1016/j.neuroimage.2014.12.064
Consten, M., & Averintseva-Klisch, M. (2012). Tentative Reference Acts? ‘Recognitional
Demonstratives’ as Means of Suggesting Mutual Knowledge – or Overriding a Lack
of It. Research in Language, 10(3), 257–277.
https://doi.org/10.2478/v10015-011-0033-x
Cooperrider, K. (2016). The Co-Organization of Demonstratives and Pointing Gestures.
Discourse Processes, 53(8), 632–656.
https://doi.org/10.1080/0163853X.2015.1094280
Cooperrider, K. (2020). Fifteen ways of looking at a pointing gesture [Preprint]. PsyArXiv.
Cooperrider, K., & Núñez, R. (2012). Nose-pointing: Notes on a facial gesture of Papua New
Guinea. Gesture, 12(2), 103–129. https://doi.org/10.1075/gest.12.2.01coo
Cooperrider, K., Slotta, J., & Núñez, R. (2018). The Preference for Pointing With the Hand Is
Not Universal. Cognitive Science, 42(4), 1375–1390.
https://doi.org/10.1111/cogs.12585
Coventry, K. R., Griffiths, D., & Hamilton, C. J. (2014). Spatial demonstratives and
perceptual space: Describing and remembering object location. Cognitive Psychology,
69, 46–70. https://doi.org/10.1016/j.cogpsych.2013.12.001
Coventry, K. R., Valdés, B., Castillo, A., & Guijarro-Fuentes, P. (2008). Language within
your reach: Near–far perceptual space and spatial demonstratives. Cognition, 108(3),
889–895. https://doi.org/10.1016/j.cognition.2008.06.010
Cutfield, S. (2018). Dalabon Exophoric Uses of Demonstratives. In S. C. Levinson, S.
Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 90–115). Cambridge University Press.
Da Milano, F. (2007). Demonstratives in the languages of Europe. In P. Ramat & E. Roma
(Eds.), Europe and the Mediterranean as Linguistic Areas: Convergencies from a
Historical and Typological Perspective (pp. 25–47). John Benjamins Publishing.
Denny, J. P. (1982). Semantics of the Inuktitut (Eskimo) Spatial Deictics. International
Journal of American Linguistics, 48(4), 359–384. https://doi.org/10.1086/465747
Diessel, H. (1999). Demonstratives: Form, function and grammaticalization. John Benjamins
Publishing.
Diessel, H. (2006). Demonstratives, joint attention, and the emergence of grammar. Cognitive
Linguistics, 17(4). https://doi.org/10.1515/COG.2006.015
Diessel, H. (2013). Distance Contrasts in Demonstratives. In M. Dryer & M. Haspelmath
(Eds.), The World Atlas of Language Structures Online. Max Planck Institute for
Evolutionary Anthropology.
Dixon, R. M. W. (1972). The Dyirbal Language of North Queensland. CUP Archive.
Dixon, R. M. W. (2003). Demonstratives: A cross-linguistic typology. Studies in Language,
27(1), 61–112. https://doi.org/10.1075/sl.27.1.04dix
Doran, R. B., & Ward, G. (2019). A taxonomy of uses of demonstratives. In J. Gundel & B.
Abbott (Eds.), The Oxford Handbook of Reference (pp. 236–259). Oxford University
Press.
Eco, U. (1976). A Theory of Semiotics. Indiana University Press.
Enfield, N. J. (2001). ‘Lip-pointing’: A discussion of form and function with reference to data
from Laos. Gesture, 1(2), 185–211. https://doi.org/10.1075/gest.1.2.06enf
Enfield, N. J. (2003). Demonstratives in Space and Interaction: Data from Lao Speakers and
Implications for Semantic Analysis. Language, 79(1), 82–117.
https://doi.org/10.1353/lan.2003.0075
Enfield, N. J. (2018). Lao Demonstrative Determiners Nii4 and Nan4: An Intensionally
Discrete Distinction for Extensionally Analogue Space. In S. C. Levinson, S. Cutfield,
M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in Cross-
Linguistic Perspective (pp. 72–89). Cambridge University Press.
Enfield, N. J., Kita, S., & de Ruiter, J. P. (2007). Primary and secondary pragmatic functions
of pointing gestures. Journal of Pragmatics, 39(10), 1722–1741.
https://doi.org/10.1016/j.pragma.2007.03.001
Evans, N., Bergqvist, H., & San Roque, L. (2018). The grammar of engagement I:
Framework and initial exemplification. Language and Cognition, 10(1), 110–140.
https://doi.org/10.1017/langcog.2017.21
Fillmore, C. J. (1982). Towards a descriptive framework for spatial deixis. In R. J. Jarvella &
W. Klein (Eds.), Speech, place, and action: Studies in deixis and related topics (pp.
31–59). John Wiley & Sons.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the
amplitude of movement. Journal of Experimental Psychology, 47(6), 381–391.
https://doi.org/10.1037/h0055392
Flack, Z. M., Naylor, M., & Leavens, D. A. (2018). Pointing to Visible and Invisible Targets.
Journal of Nonverbal Behavior, 42(2), 221–236.
https://doi.org/10.1007/s10919-017-0270-3
Fossard, M., Garnham, A., & Cowles, H. W. (2012). Between anaphora and deixis … The
resolution of the demonstrative noun phrase “that N”. Language and Cognitive
Processes, 27(9), 1385–1404. https://doi.org/10.1080/01690965.2011.606668
Frank, M. C., & Goodman, N. D. (2012). Predicting Pragmatic Reasoning in Language
Games. Science, 336(6084), 998. https://doi.org/10.1126/science.1218633
Frith, C. D., & Frith, U. (2006). The Neural Basis of Mentalizing. Neuron, 50(4), 531–534.
https://doi.org/10.1016/j.neuron.2006.05.001
Gatt, A., & Krahmer, E. (2018). Survey of the State of the Art in Natural Language
Generation: Core tasks, applications and evaluation. Journal of Artificial Intelligence
Research, 61, 65–170. https://doi.org/10.1613/jair.5477
Glover, K. D. (2000). Proximal and distal deixis in negotiation talk. Journal of Pragmatics,
32(7), 915–926. https://doi.org/10.1016/S0378-2166(99)00078-8
Gonseth, C., Kawakami, F., Ichino, E., & Tomonaga, M. (2017). The higher the farther:
Distance-specific referential gestures in chimpanzees (Pan troglodytes). Biology
Letters, 13(11), 20170398. https://doi.org/10.1098/rsbl.2017.0398
Gonseth, C., Vilain, A., & Vilain, C. (2013). An experimental study of speech/gesture
interactions and distance encoding. Speech Communication, 55(4), 553–571.
https://doi.org/10.1016/j.specom.2012.11.003
Goodman, N. D., & Frank, M. C. (2016). Pragmatic Language Interpretation as Probabilistic
Inference. Trends in Cognitive Sciences, 20(11), 818–829.
https://doi.org/10.1016/j.tics.2016.08.005
Gray, B. (2010). On the use of demonstrative pronouns and determiners as cohesive devices:
A focus on sentence-initial this/these in academic prose. Journal of English for
Academic Purposes, 9(3), 167–183. https://doi.org/10.1016/j.jeap.2009.11.003
Grice, H. P. (1975). Logic and Conversation. Speech Acts, 41–58.
https://doi.org/10.1163/9789004368811_003
Grosz, B. J., Weinstein, S., & Joshi, A. K. (1995). Centering: A framework for modeling the
local coherence of discourse. Computational Linguistics, 21(2), 203–225.
Gudde, H. B., Coventry, K. R., & Engelhardt, P. E. (2016). Language and memory for object
location. Cognition, 153, 99–107. https://doi.org/10.1016/j.cognition.2016.04.016
Gudde, H. B., Griffiths, D., & Coventry, K. R. (2018). The (Spatial) Memory Game: Testing
the Relationship Between Spatial Language, Object Knowledge, and Spatial
Cognition. JoVE (Journal of Visualized Experiments), 132, e56495.
https://doi.org/10.3791/56495
Guirardello-Damian, R. (2018). Trumai: Non-contrastive Exophoric Uses of Demonstratives.
In S. C. Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.),
Demonstratives in Cross-Linguistic Perspective (pp. 242–256). Cambridge University
Press.
Gundel, J. K., Hedberg, N., & Zacharski, R. (1993). Cognitive Status and the Form of
Referring Expressions in Discourse. Language, 69(2), 274–307.
https://doi.org/10.2307/416535
Hagoort, P. (2013). MUC (Memory, Unification, Control) and beyond. Frontiers in
Psychology, 4. https://doi.org/10.3389/fpsyg.2013.00416
Hagoort, P. (2019). The neurobiology of language beyond single-word processing. Science,
366(6461), 55–58. https://doi.org/10.1126/science.aax0289
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. Routledge.
Hanks, W. F. (1990). Referential Practice: Language and Lived Space Among the Maya.
University of Chicago Press.
Hanks, W. F. (1992). The indexical ground of deictic reference. In A. Duranti & C. Goodwin
(Eds.), Rethinking Context: Language as an Interactive Phenomenon (pp. 43–76).
Cambridge University Press.
Hanks, W. F. (2009). Fieldwork on deixis. Journal of Pragmatics, 41(1), 10–24.
https://doi.org/10.1016/j.pragma.2008.09.003
Hanks, W. F. (2011). Deixis and indexicality. In W. Bublitz & N. R. Norrick (Eds.),
Foundations of Pragmatics (pp. 315–346). Walter de Gruyter.
Hellwig, B. (2018). “See This Sitting One”: Demonstratives and Deictic Classifiers in
Goemai. In S. C. Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins
(Eds.), Demonstratives in Cross-Linguistic Perspective (pp. 134–149). Cambridge
University Press.
Henschel, R., Cheng, H., & Poesio, M. (2000). Pronominalization revisited. Proceedings of
the 18th Conference on Computational Linguistics - Volume 1, 306–312.
https://doi.org/10.3115/990820.990865
Herrmann, S. (2018). Warao Demonstratives. In S. C. Levinson, S. Cutfield, M. Dunn, N.
Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in Cross-Linguistic
Perspective (pp. 282–302). Cambridge University Press.
Himmelmann, N. P. (1996). Demonstratives in Narrative Discourse: A Taxonomy of
Universal Uses. In B. A. Fox (Ed.), Studies in Anaphora (pp. 205–254). John
Benjamins Publishing.
Hockett, C. F. (1960). The Origin of Speech. Scientific American, 203(3), 88–97.
Hottenroth, P.-M. (1982). The system of local deixis in Spanish. In J. Weissenborn & W.
Klein (Eds.), Here and There: Cross-linguistic Studies on Deixis and Demonstration
(pp. 133–154). John Benjamins Publishing.
Jarbou, S. O. (2010). Accessibility vs. physical proximity: An analysis of exophoric
demonstrative practice in Spoken Jordanian Arabic. Journal of Pragmatics, 42(11),
3078–3097. https://doi.org/10.1016/j.pragma.2010.04.014
Jungbluth, K. (2003). Deictics in the conversational dyad. In F. Lenz (Ed.), Deictic
Conceptualisation of Space, Time and Person (pp. 13–40). John Benjamins
Publishing.
Kaiser, E., & Trueswell, J. C. (2008). Interpreting pronouns and demonstratives in Finnish:
Evidence for a form-specific approach to reference resolution. Language and
Cognitive Processes, 23(5), 709–748. https://doi.org/10.1080/01690960701771220
Kaplan, D. (1979). On the logic of demonstratives. Journal of Philosophical Logic, 8(1),
81–98. https://doi.org/10.1007/BF00258420
Kemmerer, D. (1999). “Near” and “far” in language and perception. Cognition, 73(1), 35–63.
https://doi.org/10.1016/S0010-0277(99)00040-2
Kendon, A., & Versante, L. (2003). Pointing by Hand in “Neapolitan”. In S. Kita (Ed.),
Pointing: Where Language, Culture, and Cognition Meet (pp. 109–138). Psychology
Press.
Keuleers, E., Brysbaert, M., & New, B. (2010). SUBTLEX-NL: A new measure for Dutch
word frequency based on film subtitles. Behavior Research Methods, 42(3), 643–650.
https://doi.org/10.3758/BRM.42.3.643
Kirsner, R. (1993). From meaning to message in two theories: Cognitive and Saussurean
views of the Modern Dutch demonstratives. In R. A. Geiger & B. Rudzka-Ostyn
(Eds.), Conceptualizations and Mental Processing in Language (pp. 81–114). Walter
de Gruyter.
Kirsner, R. S. (1979). Deixis in Discourse: An Exploratory Quantitative Study of the Modern
Dutch Demonstrative Adjectives. Discourse and Syntax, 355–375.
https://doi.org/10.1163/9789004368897_016
Kita, S. (2003). Pointing: Where Language, Culture, and Cognition Meet. Psychology Press.
Knuchel, D. (2019). Kogi Demonstratives and Engagement. Open Linguistics, 5(1), 615–629.
https://doi.org/10.1515/opli-2019-0034
Koolen, R., Gatt, A., Goudbeek, M., & Krahmer, E. (2011). Factors causing overspecification
in definite descriptions. Journal of Pragmatics, 43(13), 3231–3250.
https://doi.org/10.1016/j.pragma.2011.06.008
Krahmer, E., & van Deemter, K. (2012). Computational Generation of Referring
Expressions: A Survey. Computational Linguistics, 38(1), 173–218.
https://doi.org/10.1162/COLI_a_00088
Küntay, A. C., & Özyürek, A. (2006). Learning to use demonstratives in conversation: What
do language specific strategies in Turkish reveal? Journal of Child Language, 33(2),
303–320. https://doi.org/10.1017/S0305000906007380
Labrador, B. (2011). A corpus-based study of the use of Spanish demonstratives as
translation equivalents of English demonstratives. Perspectives, 19(1), 71–87.
https://doi.org/10.1080/0907676X.2010.481047
Lakoff, R. (1974). Remarks on this and that. In R. A. Fox, M. W. La Galy, & A. Bruck
(Eds.), Papers from the tenth regional meeting (pp. 345–356). Chicago Linguistic
Society.
Lenko-Szymanska, A. (2004). Demonstratives as anaphora markers in advanced learners’
English. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and Language
Learners (pp. 89–108). John Benjamins Publishing.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. MIT Press.
Levinson, S. C. (1983). Pragmatics. Cambridge University Press.
Levinson, S. C. (2004). Deixis. In L. Horn (Ed.), The handbook of pragmatics (pp. 97–121).
Blackwell.
Levinson, S. C. (2018). Introduction: Demonstratives: Patterns in Diversity. In S. C.
Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.),
Demonstratives in Cross-Linguistic Perspective (pp. 1–42). Cambridge University
Press.
Levinson, S. C., Cutfield, S., Dunn, M., Enfield, N., Meira, S., & Wilkins, D. (Eds.). (2018).
Demonstratives in Cross-Linguistic Perspective. Cambridge University Press.
Lewandowsky, S., & Farrell, S. (2010). Computational Modeling in Cognition: Principles
and Practice. SAGE Publications.
Liu, R., Bögels, S., Bird, G., Medendorp, W. P., & Toni, I. (2019). Hierarchical integration
of communicative and visuospatial perspective-taking demands in sensorimotor
control of referential pointing [Preprint]. PsyArXiv.
https://doi.org/10.31234/osf.io/htvqa
Lücking, A. (2018). Witness-loaded and Witness-free Demonstratives. In M. Coniglio, A.
Murphy, E. Schlachter, & T. Veenstra (Eds.), Atypical Demonstratives: Syntax,
Semantics and Pragmatics (pp. 255–284). Walter de Gruyter.
Lyons, J. (1977). Semantics. Cambridge University Press.
MacKenzie, I. S. (1992). Fitts’ Law as a Research and Design Tool in Human-Computer
Interaction. Human–Computer Interaction, 7(1), 91–139.
https://doi.org/10.1207/s15327051hci0701_3
Maclaren, R. (1982). The Semantics and Pragmatics of the English Demonstratives. Cornell
University.
Maes, A. (1996). Nominal Anaphors, Markedness and Coherence of Discourse. Peeters.
Maes, A. (1997). Referent ontology and centering in discourse. Journal of Semantics, 14(3),
207–235. https://doi.org/10.1093/jos/14.3.207
Maes, A., & de Rooij, C. (2007). (How) Do Demonstratives Code Distance? Papers
Presented at the Daarc 2007, 83–89.
https://research.tilburguniversity.edu/en/publications/92b50ee8-aee2-4b7d-830c-1829a3011ae5
Margetts, A. (2018). Saliba-Logea: Exophoric Demonstratives. In S. C. Levinson, S. Cutfield,
M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 257–281). Cambridge University Press.
McCarthy, M. (2002). It, this and that. In M. Coulthard (Ed.), Advances in Written Text
Analysis (pp. 266–275). Routledge.
McCool, G. J. (1993). The French demonstrative system: From Old to Modern French.
WORD, 44(1), 31–40. https://doi.org/10.1080/00437956.1993.11435892
Meira, S. (2003). “Addressee effects” in demonstrative systems: The cases of Tiriyó and
Brazilian Portuguese. In F. Lenz (Ed.), Deictic Conceptualisation of Space, Time and
Person (pp. 3–12). John Benjamins Publishing.
Meira, S. (2018). Tiriyó: Non-contrastive Exophoric Uses of Demonstratives. In S. C.
Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.),
Demonstratives in Cross-Linguistic Perspective (pp. 222–241). Cambridge University
Press.
Meira, S., & Guirardello-Damian, R. (2018). Brazilian-Portuguese: Non-contrastive
Exophoric Use of Demonstratives in the Spoken Language. In S. C. Levinson, S.
Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 116–133). Cambridge University Press.
Morford, J. P., Shaffer, B., Shin, N., Twitchell, P., & Petersen, B. T. (2019). An Exploratory
Study of ASL Demonstratives. Languages, 4(4), 80.
https://doi.org/10.3390/languages4040080
New, B., Pallier, C., Brysbaert, M., & Ferrand, L. (2004). Lexique 2: A new French lexical
database. Behavior Research Methods, Instruments, & Computers, 36(3), 516–524.
https://doi.org/10.3758/BF03195598
Oosterwijk, A. M., Boer, M. de, Stolk, A., Hartmann, F., Toni, I., & Verhagen, L. (2017).
Communicative knowledge pervasively influences sensorimotor computations.
Scientific Reports, 7(1), 1–12. https://doi.org/10.1038/s41598-017-04442-w
Opalka, H. (1982). Representations of local Ni-deixis in Swahili in Relation to Bühler’s
“Origo des Zeigfelds.” In J. Weissenborn & W. Klein (Eds.), Here and There:
Cross-linguistic Studies on Deixis and Demonstration (pp. 65–80). John Benjamins
Publishing.
Orie, O. O. (2009). Pointing the Yoruba way. Gesture, 9(2), 237–261.
https://doi.org/10.1075/gest.9.2.04ori
Passonneau, R. J. (1989). Getting at discourse referents. Proceedings of the 27th Annual
Meeting on Association for Computational Linguistics, 51–59.
https://doi.org/10.3115/981623.981630
Peeters, D., Azar, Z., & Özyürek, A. (2014). The Interplay between Joint Attention, Physical
Proximity, and Pointing Gesture in Demonstrative Choice. Proceedings of the 36th
Annual Meeting of the Cognitive Science Society, 1144–1149.
Peeters, D., Chu, M., Holler, J., Hagoort, P., & Özyürek, A. (2015). Electrophysiological and
Kinematic Correlates of Communicative Intent in the Planning and Production of
Pointing Gestures and Speech. Journal of Cognitive Neuroscience, 27(12),
2352–2368. https://doi.org/10.1162/jocn_a_00865
Peeters, D., Chu, M., Holler, J., Özyürek, A., & Hagoort, P. (2013). Getting to the point: The
influence of communicative intent on the kinematics of pointing gestures.
Proceedings of the 35th Annual Meeting of the Cognitive Science Society, 1127–1132.
Peeters, D., Hagoort, P., & Özyürek, A. (2015). Electrophysiological evidence for the role of
shared space in online comprehension of spatial demonstratives. Cognition, 136,
64–84. https://doi.org/10.1016/j.cognition.2014.10.010
Peeters, D., & Özyürek, A. (2016). This and That Revisited: A Social and Multimodal
Approach to Spatial Demonstratives. Frontiers in Psychology, 7.
https://doi.org/10.3389/fpsyg.2016.00222
Peeters, D., Snijders, T. M., Hagoort, P., & Özyürek, A. (2017). Linking language to the
visual world: Neural correlates of comprehending verbal reference to objects through
pointing and visual cues. Neuropsychologia, 95, 21–29.
https://doi.org/10.1016/j.neuropsychologia.2016.12.004
Peirce, C. S. (1940). Philosophical writings of Peirce (J. Buchler, Ed.). Dover.
Petch-Tyson, S. (2000). Demonstrative expressions in argumentative discourse: A computer
corpus-based comparison of non-native and native English. In S. P. Botley & T.
McEnery (Eds.), Corpus-based and Computational Approaches to Discourse
Anaphora (pp. 43–64). John Benjamins Publishing.
Piwek, P., Beun, R.-J., & Cremers, A. (2008). ‘Proximal’ and ‘distal’ in language and
cognition: Evidence from deictic demonstratives in Dutch. Journal of Pragmatics,
40(4), 694–718. https://doi.org/10.1016/j.pragma.2007.05.001
Poesio, M., Stevenson, R., Eugenio, B. D., & Hitzeman, J. (2004). Centering: A Parametric
Theory and Its Instantiations. Computational Linguistics, 30(3), 309–363.
https://doi.org/10.1162/0891201041850911
Prince, E. F. (1981a). On the Interfacing of Indefinite-This NPs. In A. K. Joshi, B. L.
Webber, & I. A. Sag (Eds.), Elements of Discourse Understanding (pp. 231–250).
Cambridge University Press.
Prince, E. F. (1981b). Towards a taxonomy of given-new information. In P. Cole (Ed.),
Radical Pragmatics (pp. 223–255). Academic Press.
Reiter, E., & Dale, R. (1997). Building applied natural language generation systems. Natural
Language Engineering, 3(1), 57–87. https://doi.org/10.1017/S1351324997001502
Rocca, R., Coventry, K. R., Tylén, K., Staib, M., Lund, T. E., & Wallentin, M. (2019).
Language beyond the language system: Dorsal visuospatial pathways support
processing of demonstratives and spatial language during naturalistic fast fMRI.
NeuroImage, 116128. https://doi.org/10.1016/j.neuroimage.2019.116128
Rocca, R., Tylén, K., & Wallentin, M. (2019). This shoe, that tiger: Semantic properties
reflecting manual affordances of the referent modulate demonstrative use. PLoS ONE,
14(1), e0210333. https://doi.org/10.1371/journal.pone.0210333
Rocca, R., & Wallentin, M. (2020). Demonstrative Reference and Semantic Space: A Large-
Scale Demonstrative Choice Task Study. Frontiers in Psychology, 11.
https://doi.org/10.3389/fpsyg.2020.00629
Rocca, R., Wallentin, M., Vesper, C., & Tylén, K. (2019). This is for you: Social modulations
of proximal vs. distal space in collaborative interaction. Scientific Reports, 9(1), 1–14.
https://doi.org/10.1038/s41598-019-51134-8
Schegloff, E. A. (1996). Some Practices for Referring to Persons in Talk-in-Interaction: A
Partial Sketch of a Systematics. In B. A. Fox (Ed.), Studies in Anaphora (pp.
437–486). John Benjamins Publishing.
Scott, K. (2013). This and that: A procedural analysis. Lingua, 131, 49–65.
https://doi.org/10.1016/j.lingua.2013.03.008
Senft, G. (Ed.). (2004). Deixis and demonstratives in Oceanic languages. Pacific Linguistics,
Research School of Pacific and Asian Studies.
Sherzer, J. (1973). Verbal and nonverbal deixis: The pointed lip gesture among the San Blas
Cuna. Language in Society, 2(1), 117–131.
https://doi.org/10.1017/S0047404500000087
Shin, N., Hinojosa-Cantú, L., Shaffer, B., & Morford, J. P. (2020). Demonstratives as
indicators of interactional focus: Spatial and social dimensions of Spanish esta and
esa. Cognitive Linguistics, 1(ahead-of-print). https://doi.org/10.1515/cog-2018-0068
Skilton, A. E. (2019). Spatial and non-spatial deixis in Cushillococha Ticuna. UC Berkeley,
Department of Linguistics.
Skilton, A. H., & Peeters, D. (under review). Cross-linguistic differences in demonstrative
systems: Comparing spatial and non-spatial influences on demonstrative use in
Ticuna and Dutch.
Stevens, J., & Zhang, Y. (2013). Relative distance and gaze in the use of entity-referring
spatial demonstratives: An event-related potential study. Journal of Neurolinguistics,
26(1), 31–45. https://doi.org/10.1016/j.jneuroling.2012.02.005
Stevens, J., & Zhang, Y. (2014). Brain mechanisms for processing co-speech gesture: A
cross-language study of spatial demonstratives. Journal of Neurolinguistics, 30,
27–47. https://doi.org/10.1016/j.jneuroling.2014.03.003
Stirling, L., & Huddleston, R. (2002). Deixis and anaphora. In R. D. Huddleston & G. K.
Pullum (Eds.), The Cambridge grammar of the English language (pp. 1449–1564).
Cambridge University Press.
Strauss, S. (2002). This, that, and it in spoken American English: A demonstrative system of
gradient focus. Language Sciences, 24(2), 131–152.
https://doi.org/10.1016/S0388-0001(01)00012-2
Sun-Young, O. (2009). Korean College Students’ Use of English Demonstratives in
Argumentative Essays. English Teaching, 64(1), 51–78.
https://doi.org/10.15858/engtea.64.1.200903.51
Tanz, C. (1980). Studies in the Acquisition of Deictic Terms. Cambridge Studies in
Linguistics, 26, 1–184.
Terrill, A. (2018). Lavukaleve: Exophoric Usage of Demonstratives. In S. C. Levinson, S.
Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 206–221). Cambridge University Press.
Tomasello, M. (2008). Origins of human communication. MIT Press.
Tomasello, M., Carpenter, M., & Liszkowski, U. (2007). A New Look at Infant Pointing.
Child Development, 78(3), 705–722.
https://doi.org/10.1111/j.1467-8624.2007.01025.x
van Deemter, K., Gatt, A., Gompel, R. P. G. van, & Krahmer, E. (2012). Toward a
Computational Psycholinguistics of Reference Production. Topics in Cognitive
Science, 4(2), 166–183. https://doi.org/10.1111/j.1756-8765.2012.01187.x
van der Sluis, I., & Krahmer, E. (2007). Generating Multimodal References. Discourse
Processes, 44(3), 145–174. https://doi.org/10.1080/01638530701600755
van Gompel, R. P. G., van Deemter, K., Gatt, A., Snoeren, R., & Krahmer, E. J. (2019).
Conceptualization in Reference Production. Psychological Review, 126(3), 345–373.
https://doi.org/10.1037/rev0000138
van Staden, M. (2018). Tidore: Non-contrastive Demonstratives. In S. C. Levinson, S.
Cutfield, M. Dunn, N. Enfield, S. Meira, & D. Wilkins (Eds.), Demonstratives in
Cross-Linguistic Perspective (pp. 343–360). Cambridge University Press.
Weinrich, H. (1988). Über Sprache, Leib, und Gedächtnis. In H.-U. Gumbrecht (Ed.),
Materialität der Kommunikation (pp. 80–93). Suhrkamp.
Weissenborn, J., & Klein, W. (1982). Here and There: Cross-linguistic Studies on Deixis and
Demonstration. John Benjamins Publishing.
Wilkins, D. (2003). When Pointing With the Index Finger Is Not a Universal (in
Sociocultural and Semiotic Terms). In S. Kita (Ed.), Pointing: Where Language,
Culture, and Cognition Meet (pp. 171–216). Psychology Press.
Wilkins, D. (2018). The Demonstrative Questionnaire: “THIS” and “THAT” in Comparative
Perspective. In S. C. Levinson, S. Cutfield, M. Dunn, N. Enfield, S. Meira, & D.
Wilkins (Eds.), Demonstratives in Cross-Linguistic Perspective (pp. 43–71).
Cambridge University Press.
Winner, T., Selen, L., Oosterwijk, A. M., Verhagen, L., Medendorp, W. P., Rooij, I. van, &
Toni, I. (2019). Recipient Design in Communicative Pointing. Cognitive Science,
43(5), e12733. https://doi.org/10.1111/cogs.12733
Wu, Y. (2004). Spatial Demonstratives in English and Chinese: Text and Cognition. John
Benjamins Publishing.
Zhang, J. (2015). An Analysis of the Use of Demonstratives in Argumentative Discourse by
Chinese EFL Learners. Journal of Language Teaching and Research, 6(2), 460–465.
https://doi.org/10.17507/jltr.0602.29