
Understanding language genealogy: Alternatives to the tree model


Abstract

There are important reasons to be sceptical of the accuracy and usefulness of the family-tree model in historical linguistics. That model assumes that every linguistic innovation applies to a language considered as an undifferentiated whole, a point with no “width”. But this assumption makes it impossible to use a tree to model the partial diffusion of an innovation within a language community (“internal diffusion”), or the diffusion of an innovation across language communities (“external diffusion”). These limitations have long been noticed by historical linguists (Schmidt 1872, Schuchardt 1900); but they become glaringly obvious in the cases discussed by Ross (1988) and François (2014) under the heading of “linkages” – i.e., language families that arise through the diversification, in situ, of a dialect network. The articles in this special issue all contribute towards addressing this problem, from a range of perspectives.

Siva Kalyan, Alexandre François & Harald Hammarström (eds), 2019. "Understanding language genealogy: Alternatives to the tree model". Special issue of "Journal of Historical Linguistics" 9/1.
Journal of Historical Linguistics
John Benjamins Publishing Company
volume 9 number 1 2019
issn 2210-2116 / e-issn 2210-2124
Problems with, and alternatives to, the tree model in historical linguistics
Siva Kalyan, Alexandre François and Harald Hammarström
Detecting non-tree-like signal using multiple tree topologies
Annemarie Verkerk
Visualizing the Boni dialects with Historical Glottometry
Alexander Elias
Subgrouping the Sogeram languages: A critical appraisal
of Historical Glottometry
Don Daniels, Danielle Barth and Wolfgang Barth
Save the trees: Why we need tree models in linguistic
reconstruction (and when we should apply them)
Guillaume Jacques and Johann-Mattis List
When the waves meet the trees: A response to Jacques and List
Siva Kalyan and Alexandre François
    
General Editor: Silvia Luraghi (University of Pavia)
Review Editor: Eugenio R. Luján (University of Madrid Complutense)
Associate Editors:
Michela Cennamo (University of Naples)
Heiko Narrog (Tohoku University)
Gunther De Vogelaer (Universität Münster)
Sarah G. Thomason (University of Michigan)
Eitan Grossman (Hebrew University of Jerusalem)
Consulting Editor: Joseph C. Salmons (University of Wisconsin)
Editorial Assistant: Hope Wilson (The Ohio State University)
 
Edited by Siva Kalyan, Alexandre François and
Harald Hammarström
Australian National University / LaTTiCe, CNRS, École Normale Supérieure, Univ.
Paris 3 Sorbonne Nouvelle, Australian National University / Uppsala University
Problems with, and alternatives to,
the tree model in historical linguistics
Siva Kalyan,
Alexandre François, and
Harald Hammarström
Australian National University | LaTTiCe (CNRS; ENS-PSL; Paris 3-USPC) | Uppsala University
Ever since it was popularized by August Schleicher (, ), the family-tree
model has been the dominant paradigm for representing historical relations
senting language histories: for example, Johannes Schmidt’s (1872) “Wave Model”
(as illustrated, e.g., in Schrader : and Anttila :); Southworth’s ()
“tree-envelopes” (which seem to predate the “species trees” of phylogeography,
e.g. Goodman et al. ; Maddison ); Hock’s (:) “‘truncated octopus’-like tree”;
and, more recently, NeighborNet (Hurles et al. ; Bryant et al.
) and Historical Glottometry (Kalyan & François ). However, none of
pretability of the family tree model.
that every generation of speakers derives their language from the parental gener-
not share any further genealogical innovations with its unmodified variant, but
the use of powerful techniques of phylogenetic inference that have been developed
in biology (see Greenhill & Gray ; Baum & Smith ), and the stringent
assumptions underlying a family tree make it possible to infer the relative age of
family (see Jacques & List this volume: Section ., Kalyan & François : Section .,
and Baum & Smith : Chapter  for parallels in biology).
Journal of Historical Linguistics 9:1 (2019), pp. 1–8. issn 2210-2116 |eissn 2210-2124
© John Benjamins Publishing Company
There are important reasons to be sceptical of the accuracy and usefulness
of the family tree model in historical linguistics. When applying that model to a
language family, it is assumed that every linguistic innovation applies to a language
considered as an undifferentiated whole, a point with no “width”.
This assumption makes it impossible to use a tree to model the partial diffusion
of an innovation within a language community (“internal diffusion” in François
:) or the diffusion of an innovation across language communities (“external
diffusion” in François :, or simply “borrowing”). These limitations have
long been noticed by historical linguists (Schmidt 1872; Schuchardt 1900), but
they become glaringly obvious in the cases discussed by Ross (1988, ) under
the heading of “linkages”, i.e., language families that arise through the diversification,
in situ, of a dialect network.
Following the discussion in François (:), a linkage consists of separate
modern languages which are all related and linked together by intersecting layers
of innovations; it is a language family whose internal genealogy cannot be represented
by any tree. Figure 1 shows how innovations (isoglosses numbered  to
patterns, a configuration encountered both in dialect continua and in the linkages
that descend from them.
Figure 1. Intersecting isoglosses in a dialect continuum, or a “linkage”
Over the past several decades, linguistic research has revealed numerous
examples of linkage phenomena in a broad range of language families: these
1. As noted by Kalyan & François (this volume), this type of assumption is well-justified in
biology, where the rate at which innovations spread is far greater than the rate at which populations
split, so that for all practical purposes, each innovation affects a species as an undifferentiated
whole (Baum & Smith :).
examples can be found in (subgroups of) Sinitic (Hashimoto ; Chappell );
Semitic (Huehnergard & Rubin ; Magidow ); Western Romance (Penny
:–; Ernst et al. ), Germanic (Ramat ), Indo-Aryan (Toulmin
), and Iranian (Korn forthcoming); Athabaskan (Krauss & Golla ; Holton
); Pama-Nyungan (Bowern ); and Oceanic (Geraghty ; Ross ;
bone” tree accounting for vertical transmission with a sprinkle of additional borrowing
events (as exemplified by, e.g., Ringe et al.  or Nakhleh et al.  for
Indo-European); on the other end, the roles are reversed, with the bulk of linguis-
phylogeny” (as exemplified by, e.g., the “rake-like tree” discussed by Pawley 
search for ways of quantifying and representing the diversification of a linkage
(); still, it remains an open problem.
The articles in the present issue all contribute towards addressing this problem
ular language families that exhibit linkage-like behavior, using methodologies that
vary in the degree to which they accept the premises of the family-tree model.
Verkerk, in her article “Detecting non-tree-like signal using multiple tree
topologies”, addresses the question of how and where non-tree-like behaviour can
ing a single tree, her methods infer two trees for each language family: a “major-
tree” accounting for as much as possible of the remainder. The differences between
also possible to explore which specific characters (in this case lexical cognate sets)
ing datasets of the Austronesian, Sinitic, Indo-European, and Japonic families.
of lexical and phonological innovations occurring in this group is carefully sur-
veyed before addressing the question of which features are inherited and which
tent with a tree-like divergence, while the remaining innovations cross-cut any
quantified and illustrated using the newly-proposed technique of Historical Glottometry
(François ; Kalyan & François ). This helps the human observer
to visually appreciate the presence and extent of multiple subgrouping, chaining,
and areal spread.
Daniels, Barth & Barth, in “Subgrouping the Sogeram languages: A critical appraisal of Historical Glottometry”,
occur in this group, then address the question of which historical scenario(s)
could explain them. Using Historical Glottometry, the authors quantify and com-
tree-like break-ups in the history of this subfamily. Furthermore, some improve-
alization, the handling of missing data, and transparency of data analysis.
While all of the above papers discuss theoretical and methodological issues
in the context of particular datasets, the final two articles in this issue are more
general in nature; they try to make explicit the differences between the family tree
model and its alternatives and discuss the extent to which these may be combined
into a unified framework for thinking about language diversification.
Jacques & List, in “Save the trees: Why we need tree models in linguistic
reconstruction (and when we should apply them)”, address skeptics of the tree
in particular distinguishing “data display” from models that encode an explicit
patible with the tree model can in fact be the result of tree-like diversification,
once the phenomenon of “incomplete lineage sorting” is taken into account; thus
missed too quickly. Lastly, they give examples in which an assumption of tree-like
language diversification simplifies the task of inferring the histories of particular
Finally, Kalyan & François, in their contribution “When the waves meet the trees: A response to Jacques and List”,
ical Glottometry. They stress agreements between Jacques & List’s approach and
their own, then turn to the reading of glottometric diagrams. They define a sys-
diagram, thereby arguing that such diagrams are not limited to static data display.
eage sorting” (i.e., unresolved variation in a proto-language) is extended to the
case of dialectal (i.e. geographically-conditioned) variation.
In summary, the articles in this volume provide a sample of possible
approaches to analyzing the evolution of a language family in non-cladistic terms.
to which different approaches diverge from these assumptions. We hope that this
issue leads to a diversification of methods in historical linguistics, with ample borrowing
and diffusion among them.
This work contributes to the research program “Investissements d’Avenir”, overseen by the
French National Research Agency (ANR--LABX-): LabEx Empirical Foundations of Linguistics,
Strand  – “Typology and dynamics of linguistic systems”.
Anttila, Raimo. . Historical and Comparative Linguistics (Second Edition).
Amsterdam/Philadelphia: John Benjamins.
Baum, David A. & Stacey D. Smith. . Tree Thinking: An Introduction to Phylogenetic
Biology. New York: Macmillan.
Bowern, Claire. . Another Look at Australia as a Linguistic Area. Linguistic Areas ed. by
Yaron Matras, April McMahon & Nigel Vincent, –. Basingstoke: Palgrave
Bryant, David, Flavia Filimon & Russell D. Gray. . Untangling Our Past: Languages, Trees,
Splits and Networks. The Evolution of Cultural Diversity: Phylogenetic Approaches ed. by
Ruth Mace, Clare J. Holden & Stephen Shennan, –. London: UCL Press.
Chappell, Hilary. . Language Contact and Areal Diffusion in Sinitic Languages. Areal
Diffusion and Genetic Inheritance: Problems in Comparative Linguistics ed. by
Alexandra Aikhenvald & R.M.W. Dixon, –. Oxford: Oxford University Press.
van Driem, George. . Languages of the Himalayas: An Ethnolinguistic Handbook of the
Greater Himalayan Region. Leiden: Brill.
Ellegård, Alvar. . Statistical Measurement of Linguistic Relationship. Language :.–.
Ernst, Gerhard, Martin-Dietrich Gleßgen, Christian Schmitt & Wolfgang Schweickard. .
Romanische Sprachgeschichte – Ein internationales Handbuch zur Geschichte der
romanischen Sprachen/Histoire linguistique de la Romania – Manuel international
d’histoire linguistique de la Romania. Berlin: Mouton de Gruyter.
François, Alexandre. . Social Ecology and Language History in the Northern Vanuatu
Linkage: A Tale of Divergence and Convergence. Journal of Historical Linguistics
François, Alexandre. . Trees, Waves and Linkages: Models of Language Diversification. e
Routledge Handbook of Historical Linguistics ed. by Claire Bowern & Bethwyn Evans,
–. London: Routledge.
Problems with, and alternatives to, the tree model in historical linguistics
François, Alexandre. . Méthode comparative et chaînages linguistiques: Pour un modèle
diffusionniste en généalogie des langues. Diffusion: implantation, affinités, convergence ed.
by Jean-Léo Léonard, –. (= Mémoires de la Société de Linguistique de Paris, .)
Louvain: Peeters.
Geraghty, Paul A. . The History of the Fijian Languages (= Oceanic Linguistics Special
Publication, .) Honolulu: University of Hawaii Press.
Goodman, Morris, John Czelusniak, G. William Moore, A.E. Romero-Herrera &
Genji Matsuda. . Fitting the Gene Lineage into its Species Lineage: A Parsimony
Strategy Illustrated by Cladograms Constructed from Globin Sequences. Systematic
Biology :.–.
Greenhill, Simon J. & Russell D. Gray. . Austronesian Language Phylogenies: Myths and
Misconceptions About Bayesian Computational Methods. Austronesian Historical
Linguistics and Culture History: A Festschrift for Robert Blust ed. by Alexander Adelaar &
Andrew K. Pawley, –. (= Pacific Linguistics, .) Canberra: Pacific Linguistics.
Hashimoto, Mantarō J. . Hakka in Wellentheorie Perspective. Journal of Chinese Linguistics
Hock, Hans Henrich. . Principles of Historical Linguistics, Second Edition. Berlin: Mouton
de Gruyter.
Holton, Gary. . A Geo-Linguistic Approach to Understanding Relationships Within the
Athabaskan Family. Paper presented at Language in Space: Geographic Perspectives on
Language Diversity and Diachrony, Boulder, Colorado, USA, June –, .
Huehnergard, John & Aaron Rubin. . Phyla and Waves: Models of Classification of the
Semitic Languages. Semitic Languages: An International Handbook ed. by
Stefan Weninger, Geoffrey Khan, Michael P. Streck & Janet C.E. Watson, –. (=
Handbücher zur Sprach- und Kommunikationswissenschaft, .) Berlin: Mouton de
Hurles, Matthew E., Elizabeth Matisoo-Smith, Russell D. Gray & David Penny. .
Untangling Oceanic Settlement: The Edge of the Knowable. Trends in Ecology and
Evolution :.–.
Kalyan, Siva & Alexandre François. . Freeing the Comparative Method from the Tree
Model: A Framework for Historical Glottometry. Let’s Talk about Trees: Problems in
Representing Phylogenic Relationships Among Languages ed. by Ritsuko Kikusawa &
Lawrence A. Reid, –. (= Senri Ethnological Studies, .) Ōsaka: National Museum of
Korn, Agnes. Forthcoming. Isoglosses and Subdivisions of Iranian. Journal of Historical
Linguistics :.
Krauss, Michael E. & Victor Golla. . Northern Athapaskan Languages. Handbook of North
American Indians, Vol. : Subarctic ed. by June Helm & William C. Sturtevant, –.
Washington, DC: Smithsonian Institution.
Kroeber, Alfred L. & C. Douglas Chrétien. . Quantitative Classification of Indo-European
Languages. Language :.–.
Maddison, Wayne P. . Gene Trees in Species Trees. Systematic Biology :.–.
Magidow, Alexander. . Towards a Sociohistorical Reconstruction of Pre-Islamic Arabic
Dialect Diversity. University of Texas at Austin PhD dissertation.
Nakhleh, Luay, Don Ringe & Tandy Warnow. . Perfect Phylogenetic Networks: A New
Methodology for Reconstructing the Evolutionary History of Natural Languages.
Language :.–.
Pawley, Andrew. . Chasing Rainbows: Implications of the Rapid Dispersal of Austronesian
Languages for Subgrouping and Reconstruction. Selected Papers from the Eighth
International Conference on Austronesian Linguistics ed. by Elizabeth Zeitoun &
Paul Jen-Kuei Li, –. (= Symposium Series of the Institute of Linguistics: Academia
Sinica, .) Honolulu: University of Hawaii Press.
Penny, Ralph John. . Variation and Change in Spanish. Cambridge: Cambridge University
Ramat, Paolo. . The Germanic Languages. The Indo-European Languages ed. by
Anna Giacalone Ramat & Paolo Ramat, –. London: Routledge.
Ringe, Don, Tandy Warnow & Ann Taylor. . Indo-European and Computational
Cladistics. Transactions of the Philological Society :.–.
Ross, Malcolm D. . Proto Oceanic and the Austronesian Languages of Western Melanesia.
Canberra: Pacific Linguistics.
Ross, Malcolm. . Social Networks and Kinds of Speech-Community Event. Archaeology
and Language I: eoretical and Methodological Orientations ed. by Roger Blench &
Matthew Spriggs, –. London: Routledge.
Schleicher, August. . Die ersten Spaltungen des indogermanischen Urvolkes. Allgemeine
Monatsschri für Wissenscha und Literatur ed. by Johann Gustav Droysen &
G.W. Nitzsch, –. Braunschweig: C.A. Schwestchke & Sohn.
Schleicher, August. . Die Darwinsche eorie und die Sprachwissenscha: Offenes
Sendschreiben an Herrn Dr. Ernst Häckel, o. Professor der Zoologie und Director des
zoologischen Museums an der Universität Jena, Second Edition. Weimar: Hermann Böhlau.
Schmidt, Johannes. . Die Verwantschasverhältnisse der indogermanischen Sprachen.
Hermann Böhlau.
Schrader, Otto. . Sprachvergleichung und Urgeschichte: linguistisch-historische Beiträge zur
Erforschung des indogermanischen Altertums. Jena: Hermann Costenoble.
Schuchardt, Hugo. 1900. Über die Klassifikation der romanischen Mundarten. Probe-Vorlesung,
gehalten zu Leipzig am . April . Graz.
Southworth, Franklin C. . Family-Tree Diagrams. Language :.–.
Toulmin, Matthew. . From Linguistic to Sociolinguistic Reconstruction: The Kamta
Historical Subgroup of Indo-Aryan. (= Pacific Linguistics, .) Canberra: Pacific
Address for correspondence
Siva Kalyan
Department of Linguistics
School of Culture, History and Language
College of Asia and the Pacific
Australian National University
 Fellows Road
Acton, ACT 
Co-author information
Alexandre François
Harald Hammarström
Department of Linguistics and Philology
Uppsala University
Detecting non-tree-like signal
using multiple tree topologies
Annemarie Verkerk
University of Reading &
Recent applications of phylogenetic methods to historical linguistics have
been criticized for assuming a tree structure in which ancestral languages
differentiate and split up into daughter languages, while language evolution
is inherently non-tree-like (François ; Blench :–). This article
attempts to contribute to this debate by discussing the use of the multiple
topologies method (Pagel & Meade a) implemented in BayesPhylogenies
(Pagel & Meade ). This method is applied to lexical datasets from
four different language families: Austronesian (Gray, Drummond & Greenhill
), Sinitic (Ben Hamed & Wang ), Indo-European (Bouckaert
et al. ), and Japonic (Lee & Hasegawa ). Evidence for multiple
topologies is found in all families except, surprisingly, Austronesian. It is
suggested that reticulation may arise from a number of processes, including
dialect chain break-up, borrowing (both shortly after language splits and
later on), incomplete lineage sorting, and characteristics of lexical datasets.
It is shown that the multiple topologies method is a useful tool to study the
dynamics of language evolution.
Keywords: Bayesian phylogenetic inference, Austronesian, Sinitic, Indo-European,
Japonic, language contact, reticulation
1. Introduction
made inroads into historical linguistics: these methods are applied both to building
phylogenetic trees (Ben Hamed & Wang ; Gray, Drummond & Greenhill
; Lee & Hasegawa ; Bouckaert et al. ; Grollemund et al. ) and to
inferring the process of evolution of typological characteristics on trees (Dunn
et al. ; Verkerk ; Zhou & Bowern ). The reception of these studies
has been mixed, with studies criticizing data type (generally cognate-coded lexical
Journal of Historical Linguistics 9:1 (2019), pp. 9–69. issn 2210-2116 |eissn 2210-2124
© John Benjamins Publishing Company
data), data quality, the applicability of methods and models from another disci-
knowledge into the phylogenetic analysis (Eska & Ringe ; Heggarty ;
Holm ; Blench ; Pereltsvaig & Lewis ).
Nevertheless, phylogenetic methods have been adopted by a wide range of
linguists to answer questions regarding the diversification of language families
and of typological features (Bowern & Atkinson ; Bouchard-Côté et al. ;
Galucio et al. ; Macklin-Cordes & Round ; Meira, Birchall & Chousou-Polydouri
; Jaeger & Wichmann ; Widmer et al. ). Dunn (:),
in his chapter in Bowern & Evans () The Routledge Handbook of Historical
positive reception is for good reason, given that evolutionary biologists have been
considering statistical approaches to the study of species diversification for over
rial came from Edwards & Cavalli-Sforza (). Dunn () describes how lin-
guists began considering quantitative approaches to language history only when
Swadesh (Swadesh , ) first developed his methods of lexicostatistics and
ods from evolutionary biology by Gray & Jordan (). Thus, the methods of
evolutionary biology are likely to complement those of traditional historical linguistics,
especially when it comes to quantitative methods for inferring language
genealogies and changes in cultural and linguistic features in these genealogies
(Pagel ; Levinson & Gray ).
In the current article, a phylogenetic inference technique taken from biology
is applied to linguistic data, and its usability is reviewed. The technique is called
“multiple tree topologies”. It is implemented using the software BayesPhylogenies,
which provides a Bayesian framework for inferring trees for a variety of data types,
history, but instead fits multiple independent tree topologies to the data. If, for
whatever reason, certain sites (columns in the dataset, which here constitute the
cognate sets) point towards a different cladistic grouping than others do, these
different signals are picked up and then reflected by displaying statistical support
for two or more different tree topologies which are estimated simultaneously. This
method is applied to lexical datasets from four language families: Austronesian
(Gray, Drummond & Greenhill ), Sinitic (Ben Hamed & Wang ), Indo-European
(Bouckaert et al. ), and Japonic (Lee & Hasegawa ). This article
contributes to the use of phylogenetic methods in historical linguistics by applying
a specific phylogenetic method to linguistic data. It is especially relevant to
temporary entities in terms of descent with modification from common ancestors,
can deal with non-tree-like aspects of language change.
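The idea of fitting several topologies at once, with different sites supporting different trees, can be illustrated with a generic mixture calculation. The sketch below is an assumption-laden illustration and not the BayesPhylogenies implementation: it supposes that each site's likelihood is a weighted sum of its likelihoods under each tree, and it takes the per-site log-likelihoods and the weights as given inputs (both hypothetical here). It then reports, for each site, which tree receives the larger posterior share.

```python
import math


def site_responsibilities(site_logliks, weights):
    """For each site, return (mixture log-likelihood, per-tree posterior shares).

    site_logliks: one tuple per site with log P(site | tree_k) for each tree k.
    weights: mixture weights for the trees (assumed to sum to 1).
    """
    results = []
    for logliks in site_logliks:
        # log(w_k) + log P(site | tree_k) for every tree k
        terms = [math.log(w) + ll for w, ll in zip(weights, logliks)]
        # log-sum-exp keeps the sum numerically stable
        top = max(terms)
        total = top + math.log(sum(math.exp(t - top) for t in terms))
        shares = [math.exp(t - total) for t in terms]
        results.append((total, shares))
    return results


# Hypothetical per-site log-likelihoods under two trees (illustration only).
site_logliks = [(-2.0, -6.0), (-5.5, -1.5), (-3.0, -3.0)]
weights = (0.7, 0.3)  # hypothetical weights for tree 0 and tree 1

for i, (loglik, shares) in enumerate(site_responsibilities(site_logliks, weights)):
    best = max(range(len(shares)), key=shares.__getitem__)
    print(f"site {i}: logL={loglik:.3f}, favours tree {best}")
```

In a real analysis the trees, branch lengths, and weights would all be estimated jointly; here they are fixed inputs so that only the per-site bookkeeping is shown.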
we know includes both (i) genealogically inherited characteristics that can be
through other processes, including dialect chain break-up, borrowing (from a
substrate, superstrate, or adstrate), or incomplete lineage sorting. After applying
and non-tree-like signal is recovered and correctly characterized. However, this
required dataset does not exist, as the bundles of data that are studied by historical
linguistics contain features that could have arisen via any of these processes (Heggarty,
Maguire & McMahon :), and quantitative methods for exploring a
Scornavacca :). In some cases, it is possible to distinguish different sources
of divergent signals on the basis of knowledge about which features are more likely
topologies method or of any method that attempts to incorporate multiple historical
signals in some way is therefore not possible at this point in time.
In this light, this study should be seen as an exploratory experiment for the