Content uploaded by Peter Dekker
Author content
All content in this area was uploaded by Peter Dekker on Dec 14, 2020
Content may be subject to copyright.
Neural Agent-based Models to Study Language
Contact using Linguistic Data
Peter Dekker
AI Lab
Vrije Universiteit Brussel
Brussels, Belgium
peter.dekker@ai.vub.ac.be
Bart de Boer
AI Lab
Vrije Universiteit Brussel
Brussels, Belgium
bart@ai.vub.ac.be
Abstract
In this paper, we propose an outline for linguistic research on language change,
as observed in the languages of the world, using neural agent-based models of
emergent communication. We describe how such models could be used to study
morphological simplification, using a case study of language contact in Eastern
Indonesia. A neural architecture is used to represent two hypothesized cognitive
mechanisms of language change: a generalization mechanism, the declarative/procedural
model, and a phonological mechanism, the hyper & hypo articulation (H&H) model,
which involves a theory of mind of the listener.
1 Introduction
What happens to a language when an established population of its speakers comes into
contact with a community of strangers who try to learn it? Agent-based computer
simulations of interactions between speakers can be effective models to study this question [28]. In
this paper, we outline how agent-based models with a neural model of production and perception
can be used to study linguistic questions about language change, based on real-world data from the
languages of the world. We provide the desiderata that a neural agent model should fulfill to be able to
study these questions, and sketch a possible model, which we plan to implement in the future.
We want to infer general factors behind language change by seeing how these factors surface
in case studies on real languages. Central to this paper is a case study of language contact in Eastern
Indonesia. We will study the hypothesis that morphological simplification is caused by contact
between native speakers of a language (L1) and adult learners, who learn the language as a second
language (L2). We use neural networks as an architecture to implement two cognitive mechanisms that
could lead to morphological simplification: a generalization mechanism, the declarative/procedural
model, and a phonological mechanism for simplifying or clarifying utterances, the hyper & hypo
articulation (H&H) model, which involves a theory of mind of the listener.
2 Previous work
Agent-based models are used in many areas to describe social or cultural processes, for example
social segregation or the spread of religion [12, 26]. Agent models have been used to
describe the evolution of language [7] and to describe concrete instances of language change, for
example change in word order in Dutch [2]. More abstract models from language evolution have
been used to study the influence of social factors (like population size and language contact) on
linguistic structure [6, 20, 25]. In addition to this work on agents for linguistic modelling, agent-based
approaches to emergent communication have been developed in the natural language
processing (NLP) community. Agents in these models use an abstract language to designate objects or images
and have to agree on a name for a certain object [18]. In recent agent models from NLP, deep neural
networks are used as the comprehension and production model [5, 9, 16]. Interesting analyses can be
made of the language that neural agents learn, such as the frequency distribution of symbols and word
order [3, 4]. [11] model contact between communities of deep neural agents, and the formation of a
creole language. Some of these models ultimately aim at constructing conversational agents.
We will use the same types of neural models, but apply them to study linguistic questions
about how real languages change. [15] suggest that linguistics and cognitive science, among other fields,
could contribute hypotheses to test experimentally in neural models of emergent communication.
34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
3 Method
3.1 Data
As a case study, we consider the linguistic situation on the Alor and Pantar islands in Eastern Indonesia.
Alorese, an Austronesian language, is spoken on the coasts of the islands, while further inland, Papuan
Alor-Pantar languages are spoken. Many L1 speakers of Alor-Pantar languages learn Alorese as a
second language. Alorese has lost nearly all of its morphology when compared to the closely related Lewoingu
Lamaholot [23], which has not been in contact with Alor-Pantar languages. It is assumed that Alorese
lost its morphology due to adult language contact [22].
Figure 1: Case study: morphological simplification in Alorese, spoken on the Alor & Pantar islands in Indonesia.
We will look specifically at the verb morphology of Alorese: compared to Lamaholot, all verb suffixes
marking the subject of the sentence on the verb have been lost. For example, the third person plural
of the verb lodo ‘to go down’ is lodo-ka (with the 3PL marker -ka) in Lewoingu Lamaholot, while
it is lodo (without person marker) in Alorese. In Alorese, only verb prefixes on a small number
of vowel-initial verbs have been retained. To initialize our model, we use verb forms and
their affixes from a grammar of Lewoingu Lamaholot [23], which can be seen as the starting state of
Alorese before it underwent morphological simplification. We use 56 verbs, the only verbs
in the grammar explicitly classified as prefixing or suffixing: 17 prefixing and 39 suffixing verbs. As
this grammar is descriptive, there is no distributional data over forms: per verb concept, one verb form
and one prefix/suffix per grammatical person is available. We compare the language that emerges
in the model to the current state of Alorese, drawing on a grammar of Alorese [14] and on demographic data
about the proportion of L2 speakers in Alorese communities [22]. For possible future research on the
lexicon, the digital dataset LexiRumah, which includes Alorese and Alor-Pantar languages, could be
used [13].
As data representation, we choose to stay close to the real language: agents communicate using real
word forms, prefixes and suffixes. Every phoneme in these forms and affixes is represented as a string
of phonetic features. By staying close to the real language, language-specific factors which make the
case study interesting, such as the phonologically conditioned retention of prefixes on vowel-initial verbs, can be included
in the study. However, we also want to test the validity of our model as a description of general mechanisms
of language change. Therefore, we plan to evaluate our model, keeping the method and data
representation as similar as possible, on other case studies of morphological simplification, such as
language contact between Scandinavian languages and Low German (possible dataset: NorthEuraLex [8]).
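As a minimal sketch of this data representation, the following Python snippet encodes a word form as a sequence of binary phonetic feature strings, one per phoneme. The feature inventory `FEATURES` and the feature values in `PHONEME_FEATURES` are simplified assumptions made for illustration, not the feature set of the planned model; only the forms lodo and lodo-ka are taken from the text.

```python
# Hypothetical, heavily simplified feature inventory (an assumption for illustration).
FEATURES = ("consonantal", "voiced", "lateral", "velar", "low", "round")

# Illustrative feature values for the phonemes of "lodo" and the suffix "-ka".
PHONEME_FEATURES = {
    "l": {"consonantal", "voiced", "lateral"},
    "d": {"consonantal", "voiced"},
    "k": {"consonantal", "velar"},
    "o": {"voiced", "round"},
    "a": {"voiced", "low"},
}


def encode_phoneme(phoneme):
    """Return one phoneme as a string of 0/1 feature values, ordered as in FEATURES."""
    active = PHONEME_FEATURES[phoneme]
    return "".join("1" if feature in active else "0" for feature in FEATURES)


def encode_form(stem, suffix=""):
    """Encode a stem plus optional suffix as a list of per-phoneme feature strings."""
    return [encode_phoneme(phoneme) for phoneme in stem + suffix]


lamaholot_3pl = encode_form("lodo", "ka")  # lodo-ka, with the 3PL marker -ka
alorese_3pl = encode_form("lodo")          # lodo, without person marker

print(lamaholot_3pl)
print(alorese_3pl)
```

In this representation, the Alorese form is simply the Lamaholot form with the last two feature strings removed, which is the kind of difference the simulation would need to reproduce.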
3.2 Task
The task for agents is to communicate successfully about concepts in the world, roughly inspired
by a Lewis signalling game [18] or naming game [29]. Every iteration, every agent in a population
speaks: it picks a concept, produces a form based on that concept and sends it to the listener. In
our case study, we look at verb morphology. We want to analyze the interaction between different
grammatical persons (e.g. 1st person singular, 1SG) in the verb paradigm, and we want to look at
transitivity (whether a verb has an object), because this determines the affixes being used. Therefore, a
concept consists of a combination of a lexical concept (e.g. the verb to go), a person (e.g. 1SG) and the
transitivity of the sentence (containing an object or not). The listener tries to infer the concept from the
received form and points to the corresponding object. The speaker then points to the correct object. Based on this
feedback, either only the listener, or both the speaker and the listener, update their internal models. We
create a simulation with both L1 and L2 agents. We initialize L1 agents by training them on a concept-form
mapping from Lewoingu Lamaholot, which we treat as a stand-in for the precursor of Alorese, as it has not
undergone morphological simplification. The L2 agents, with a neighboring Papuan language as first language, are initialized
with a random model. We do not initialize the agents with a Papuan language model, since the
literature presupposes no L1 effect on morphology, but rather a general effect of L2 learning at an
adult age [22]. By running a model with and without adult language contact, and comparing the
results to the current state of Alorese (where morphological simplification has taken place), we can
test the hypothesis that adult language contact was responsible for morphological simplification.
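The interaction described above can be sketched as a single game round in Python. This is an illustration under strong assumptions: a lookup table (`TableAgent`, a hypothetical name) stands in for the neural production and comprehension model, the L2 agent starts with an empty table and guesses at random, and only the listener updates. The form `lodo-<1SG>` is a placeholder, since the text only gives the 3PL form lodo-ka.

```python
import random
from typing import NamedTuple


class Concept(NamedTuple):
    lexeme: str       # lexical concept, e.g. "go down"
    person: str       # grammatical person, e.g. "3PL"
    transitive: bool  # whether the sentence contains an object


class TableAgent:
    """Lookup-table stand-in for the neural production/comprehension model."""

    def __init__(self, lexicon=None):
        self.production = dict(lexicon or {})  # concept -> form
        self.comprehension = {form: c for c, form in self.production.items()}

    def speak(self, concept):
        # Placeholder behaviour: unknown concepts surface as the bare lexeme label.
        return self.production.get(concept, concept.lexeme)

    def listen(self, form, candidates, rng):
        # Guess at random among the candidate concepts if the form is unknown.
        known = self.comprehension.get(form)
        return known if known is not None else rng.choice(candidates)

    def update(self, concept, form):
        self.production[concept] = form
        self.comprehension[form] = concept


def play_round(speaker, listener, concepts, rng):
    """One interaction: speak, guess, receive feedback, update the listener."""
    concept = rng.choice(concepts)
    form = speaker.speak(concept)
    guess = listener.listen(form, concepts, rng)
    listener.update(concept, form)  # the speaker "points" to the intended concept
    return guess == concept


GO_DOWN_3PL = Concept("go down", "3PL", False)
GO_DOWN_1SG = Concept("go down", "1SG", False)
concepts = [GO_DOWN_3PL, GO_DOWN_1SG]

# L1 agent initialized with a concept-form mapping; "<1SG>" is a placeholder affix.
l1_agent = TableAgent({GO_DOWN_3PL: "lodo-ka", GO_DOWN_1SG: "lodo-<1SG>"})
l2_agent = TableAgent()  # L2 agent starts without any mapping

rng = random.Random(0)
outcomes = [play_round(l1_agent, l2_agent, concepts, rng) for _ in range(20)]
print(sum(outcomes), "of", len(outcomes), "rounds succeeded")
```

A full simulation would let every agent speak in every iteration and would compare a population with L2 agents to one without them, as described above.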
3.3 Investigating cognitive factors
In our model we will implement two cognitive mechanisms: Ullman’s declarative/procedural model
of language learning [31, 32], and a phonological component, Lindblom’s H&H model [19] (Figure 2).
[Figure 2 diagram labels: concept, form; production, comprehension; procedural (generalization, L1); declarative (no generalization, L2); H&H (sloppy ... clear); re-entrance]
Figure 2: Cognitive mechanisms: the declarative/procedural model, used during comprehension and production,
and the H&H model, used during production, re-entering the produced utterance in the comprehension system.
A number of theories account for the differences between L1 and L2 language processing that lead
to morphological simplification during adult language contact, including: a critical threshold age
for learning languages [17]; missing surface inflection [24], which assumes that adult learners have
knowledge of inflection but cannot realize it; a noisy-channel approach (where L2 speakers
have less information to decode) [10]; and the role of L1 knowledge when learning an L2 [1, 27].
We choose Ullman’s declarative/procedural model as the cognitive mechanism of language learning and
generalization. According to the declarative/procedural model, grammar is produced by a procedural
cognitive system, while the lexicon is memorized in a declarative cognitive system. In L2 learners,
linguistic forms which are normally produced in the procedural system, such as morphology, are
memorized in the declarative system. We hypothesize that this computationally heavy memorization
step in L2 speakers leads to simplification. According to the theory, depending on age of acquisition and
experience, L2 speakers may also come to produce morphology more via the procedural system. We
do not claim that the declarative/procedural model is the only mechanism at play; some
of the aforementioned mechanisms may also play a role. Ullman’s mechanism is, however, the perspective we
will use to approach our problem.
Furthermore, we add a phonological component, because in the data of our Alorese case study,
morphological simplification is also phonologically conditioned. We use Lindblom’s hyper and
hypo articulation (H&H) theory: speakers produce a form more clearly or less clearly (e.g., drop an affix),
depending on their estimate of how intelligible the form is to the listener. This requires the speaker to have a
theory of mind about the listener. One option is re-entrance [30]: the speaker interprets its own utterance
using its own language comprehension system, as if it were the listener. Another option would be to
take the characteristics of the listener (e.g. L1 or L2) into account, but this assumes that these listener
characteristics are available to the speaker. The H&H articulation component turns the problem of
zero-shot learning upside down. Instead of burdening an L2 listener, who hears a form for the first
time, with the task of inferring a concept, the speaker tries to adjust its pronunciation to the L2
listener.
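The re-entrance option can be illustrated with a short Python sketch. It assumes that the speaker has a list of candidate pronunciations ordered from sloppiest to clearest, and that its own comprehension system can be called as a function from form to concept. The function name `choose_articulation` and the two example comprehension tables are hypothetical.

```python
def choose_articulation(concept, variants, comprehend):
    """Pick the sloppiest variant that the speaker itself would still understand.

    `variants` is ordered from sloppiest to clearest; `comprehend` is the
    speaker's own comprehension system (re-entrance), mapping a form to a
    concept or to None when the form is not understood.
    """
    for form in variants:
        if comprehend(form) == concept:
            return form
    return variants[-1]  # fall back to the clearest form


concept = ("go down", "3PL")
variants = ["lodo", "lodo-ka"]  # sloppy (suffix dropped) before clear

# Hypothetical comprehension states of two speakers.
strict_comprehension = {"lodo-ka": concept}
lenient_comprehension = {"lodo-ka": concept, "lodo": concept}

print(choose_articulation(concept, variants, strict_comprehension.get))
print(choose_articulation(concept, variants, lenient_comprehension.get))
```

A speaker whose own comprehension still needs the suffix keeps producing lodo-ka, while a speaker that already understands the bare form produces lodo. The alternative option, conditioning on listener characteristics, would replace `comprehend` by a model of the listener.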
3.4 Neural model
We want to implement the proposed cognitive mechanisms (section 3.3) in a neural network, which
serves as the language comprehension and production model of every agent in the simulation. It is a
challenging task to implement cognitive mechanisms in a neural model, since it is not obvious how a
certain cognitive trait (e.g. generalization versus no generalization) can be translated into a neural
network architecture. Therefore, we propose some possible ideas, drawing on the literature
on neural emergent communication. We want to develop a model that is cognitively plausible: it
should consist of cognitive modules which can be postulated in humans, and the model should be
able to exhibit, to some extent, human language processing behaviour in an agent-based simulation.
At the same time, the implementation does not have to be neurally plausible: the structure of the
neural network does not have to mimic the structure of the brain. A deep neural network is merely a
powerful and robust model in which to implement our cognitive mechanisms. The network will
perform a reinforcement learning task, where communicative success is the reward. As in [11],
we think a modular structure of the network can represent our cognitive mechanisms well. A specific
challenge is the relatively small amount of data in our setting, which may call for specific network
architectures or data augmentation.
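One common way to make "communicative success as reward" concrete is a policy-gradient (REINFORCE-style) objective for the speaker. The formulation below is an assumption for illustration, not a committed design choice. Here $c$ is the intended concept, $f$ the produced form, $\hat{c}$ the listener's guess, $\pi_\theta$ the speaker's production policy with parameters $\theta$, and $\rho$ the listener's comprehension policy:

```latex
\[
R(c, \hat{c}) = \mathbb{1}[\hat{c} = c], \qquad
\nabla_\theta J(\theta) =
\mathbb{E}_{c,\; f \sim \pi_\theta(\cdot \mid c),\; \hat{c} \sim \rho(\cdot \mid f)}
\left[ R(c, \hat{c}) \, \nabla_\theta \log \pi_\theta(f \mid c) \right]
\]
```

Under this formulation, forms that lead to a correct guess become more probable for the concept they were produced for, while unsuccessful forms receive no reinforcement.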
We will implement a declarative and a procedural module in both L1 and L2 agents. The procedural
system facilitates generalization over different concepts, grammatical persons and sentence
transitivities. The declarative system performs no generalization, but instead memorizes concept-form
mappings. It is an open question how this contrast between generalization and no generalization can be implemented
in a neural network. A possible approach could be to model the procedural module using a smaller
number of nodes, and to add dropout and regularization, forcing it to generalize over training examples.
The declarative system could consist of a larger number of nodes and layers, nudging it to overfit.
In L1 agents, there is a stronger bias towards using the procedural module, while L2 agents use the
declarative module more, but are able to shift to the procedural system after gaining more experience.
The weights which modulate procedural versus declarative system usage should thus be initialized
differently for L1 and L2 agents, but should be able to change with experience.
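The modulation between the two modules could, for instance, take the form of a single gate that mixes their outputs. The Python sketch below is one possible reading, not a fixed design: the class `GatedAgent`, the initial bias values of 2.0 and -2.0, and the fixed-step `gain_experience` update are all assumptions for illustration. In a trained network the gate would be learned from the reward signal instead.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GatedAgent:
    """Mixes a procedural and a declarative module through one scalar gate."""

    def __init__(self, speaker_type):
        # Assumed initial biases: L1 agents favour the procedural module,
        # L2 agents favour the declarative module.
        self.gate_bias = 2.0 if speaker_type == "L1" else -2.0

    @property
    def procedural_weight(self):
        return sigmoid(self.gate_bias)

    def mix(self, procedural_probs, declarative_probs):
        """Combine the two modules' output distributions over candidate forms."""
        g = self.procedural_weight
        return [g * p + (1.0 - g) * d
                for p, d in zip(procedural_probs, declarative_probs)]

    def gain_experience(self, amount=0.1):
        # Placeholder for a learned update: experience shifts usage
        # towards the procedural module.
        self.gate_bias += amount


l1_agent = GatedAgent("L1")
l2_agent = GatedAgent("L2")
print(round(l1_agent.procedural_weight, 2), round(l2_agent.procedural_weight, 2))

for _ in range(40):
    l2_agent.gain_experience()
print(round(l2_agent.procedural_weight, 2))
```

With these assumed values, the L2 agent starts out relying mostly on the declarative module and ends up relying mostly on the procedural module after 40 experience steps.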
The other cognitive mechanism, the H&H articulation model, could involve a step of re-entrance:
how would a speaker’s utterance be perceived if the speaker had to interpret it itself? This could
possibly be implemented using a game setting like self-play [21], where an agent plays against itself.
Based on the estimated intelligibility of the utterance, the utterance is produced more clearly or more sloppily.
This post-processing step can be performed in a separate module of the network.
This module will need to have knowledge of how exactly a word form can be pronounced more clearly or
more sloppily (e.g. by dropping an ending), possibly obtained by pre-training on language data.
4 Conclusion
We have described a dataset, task and model that show how neural models of language production
and comprehension can be used in an agent-based setting to study language change, using data from
languages of the world. We sketched how the declarative-procedural model, responsible for linguistic
generalization, and the hyper & hypo articulation model, which involves a theory of mind through
re-entrance, could be implemented in a neural model. Further research is needed to determine the
precise architectures to implement these mechanisms, specifically in a small data setting. We hope
the techniques proposed help to develop models that can better explain language change, and by
doing this, eventually shed light on human (pre)history.
Acknowledgments and Disclosure of Funding
This work was supported by funding from the Flemish Government under the Onderzoeksprogramma
Artificiële Intelligentie (AI) Vlaanderen programme. PD was supported by a PhD Fellowship
fundamental research (11A2821N) of the Research Foundation – Flanders (FWO).
References
[1] BAPTISTA, M., GELMAN, S. A., AND BECK, E. Testing the role of convergence in language
acquisition, with implications for creole genesis. International Journal of Bilingualism 20, 3
(June 2016), 269–296.
[2] BLOEM, J., VERSLOOT, A., AND WEERMAN, F. An agent-based model of a historical word
order change. In Proceedings of the Sixth Workshop on Cognitive Aspects of Computational
Language Learning (Lisbon, Portugal, 2015), Association for Computational Linguistics, pp. 22–27.
[3] CHAABOUNI, R., KHARITONOV, E., DUPOUX, E., AND BARONI, M. Anti-efficient encoding
in emergent communication. In Advances in Neural Information Processing Systems (2019),
pp. 6290–6300.
[4] CHAABOUNI, R., KHARITONOV, E., LAZARIC, A., DUPOUX, E., AND BARONI, M. Word-order
biases in deep-agent emergent communication. arXiv:1905.12330 [cs] (June 2019).
[5] DAGAN, G., HUPKES, D., AND BRUNI, E. Co-evolution of language and agents in referential
games. arXiv preprint arXiv:2001.03361 (2020).
[6] DALE, R., AND LUPYAN, G. Understanding the origins of morphological diversity: The
linguistic niche hypothesis. Advances in Complex Systems 15, 03n04 (May 2012), 1150017.
[7] DE BOER, B. Self-organization in vowel systems. Journal of Phonetics 28, 4 (Oct. 2000),
441–465.
[8] DELLERT, J., DANEYKO, T., MÜNCH, A., LADYGINA, A., BUCH, A., CLARIUS, N.,
GRIGORJEW, I., BALABEL, M., BOGA, H. I., BAYSAROVA, Z., MÜHLENBERND, R., WAHLE,
J., AND JÄGER, G. NorthEuraLex: A wide-coverage lexical database of Northern Eurasia.
Language Resources and Evaluation (Nov. 2019).
[9] EVTIMOVA, K., DROZDOV, A., KIELA, D., AND CHO, K. Emergent Communication in a
Multi-Modal, Multi-Step Referential Game. arXiv:1705.10369 [cs, math] (Apr. 2018).
[10] FUTRELL, R., AND GIBSON, E. L2 processing as noisy channel language comprehension.
Bilingualism: Language and Cognition 20, 4 (Sept. 2016), 683–684.
[11] GRAESSER, L., CHO, K., AND KIELA, D. Emergent Linguistic Phenomena in Multi-Agent
Communication Games. arXiv:1901.08706 [cs] (Jan. 2019).
[12] IANNACCONE, L. R., AND MAKOWSKY, M. D. Accidental Atheists? Agent-Based Explanations
for the Persistence of Religious Regionalism. Journal for the Scientific Study of Religion
46, 1 (Mar. 2007), 1–16.
[13] KAIPING, G. A., AND KLAMER, M. LexiRumah: An online lexical database of the Lesser
Sunda Islands. PLOS ONE 13, 10 (Oct. 2018), e0205250.
[14] KLAMER, M. A. F. A Short Grammar of Alorese (Austronesian). No. 486 in Languages of the
World. Materials. LINCOM Europa, Muenchen, 2011.
[15] LAZARIDOU, A., AND BARONI, M. Emergent Multi-Agent Communication in the Deep
Learning Era. arXiv:2006.02419 [cs] (July 2020).
[16] LAZARIDOU, A., PEYSAKHOVICH, A., AND BARONI, M. Multi-Agent Cooperation and the
Emergence of (Natural) Language. arXiv:1612.07182 [cs] (Mar. 2017).
[17] LENNEBERG, E. H. Response to reviews of biological foundations of language. Journal of
Communication Disorders 1, 4 (Oct. 1968), 320–322.
[18] LEWIS, D. Convention. Harvard University Press, Cambridge, MA, 1969.
[19] LINDBLOM, B. Explaining Phonetic Variation: A Sketch of the H&H Theory. In Speech
Production and Speech Modelling, W. J. Hardcastle and A. Marchal, Eds. Springer Netherlands,
Dordrecht, 1990, pp. 403–439.
[20] LOU-MAGNUSON, M., AND ONNIS, L. Social Network Limits Language Complexity.
Cognitive Science 42, 8 (Nov. 2018), 2790–2817.
[21] LOWE, R., GUPTA, A., FOERSTER, J., KIELA, D., AND PINEAU, J. On the interaction
between supervision and self-play in emergent communication. arXiv:2002.01093 [cs, stat]
(June 2020).
[22] MORO, F. R. Loss of Morphology in Alorese (Austronesian): Simplification in Adult Language
Contact. Journal of Language Contact 12, 2 (Aug. 2019), 378–403.
[23] NISHIYAMA, K., AND KELEN, H. A Grammar of Lamaholot, Eastern Indonesia. Lincom
Europa, 2007.
[24] PRÉVOST, P., AND WHITE, L. Missing Surface Inflection or Impairment in second language
acquisition? Evidence from tense and agreement. Second Language Research 16, 2 (Apr. 2000),
103–133.
[25] REALI, F., CHATER, N., AND CHRISTIANSEN, M. H. Simpler grammar, larger vocabulary:
How population size affects language. Proceedings of the Royal Society B: Biological Sciences
285, 1871 (Jan. 2018), 20172586.
[26] SCHELLING, T. C. Dynamic models of segregation. The Journal of Mathematical Sociology
1, 2 (July 1971), 143–186.
[27] SCHEPENS, J., VAN DER SLIK, F., AND VAN HOUT, R. The effect of linguistic distance across
Indo-European mother tongues on learning Dutch as a second language. In Approaches to
Measuring Linguistic Differences, L. Borin and A. Saxena, Eds. De Gruyter, Berlin, Boston,
Jan. 2013.
[28] SMITH, A. D. Models of language evolution and change. Wiley Interdisciplinary Reviews:
Cognitive Science 5, 3 (2014), 281–293.
[29] STEELS, L. Synthesising the origins of language and meaning using co-evolution,
self-organisation and level formation. In Approaches to the Evolution of Language. Cambridge
University Press, 1998, pp. 384–404.
[30] STEELS, L. Language re-entrance and the ‘inner voice’. Journal of Consciousness Studies 10,
4-5 (2003), 173–185.
[31] ULLMAN, M. T. The neural basis of lexicon and grammar in first and second language:
The declarative/procedural model. Bilingualism: Language and Cognition 4, 2 (Aug. 2001),
105–122.
[32] ULLMAN, M. T. A neurocognitive perspective on language: The declarative/procedural model.
Nature Reviews Neuroscience 2, 10 (Oct. 2001), 717–726.