There are important reasons to be sceptical of the accuracy and usefulness of the family-tree model in historical linguistics. That model assumes that every linguistic innovation applies to a language considered as an undifferentiated whole, a point with no "width". But this assumption makes it impossible to use a tree to model the partial diffusion of an innovation within a language community ("internal diffusion"), or the diffusion of an innovation across language communities ("external diffusion"). These limitations have long been noticed by historical linguists (Schmidt 1872, Schuchardt 1900); but they become glaringly obvious in the cases discussed by Ross (1988) and François (2014) under the heading of "linkages" – i.e., language families that arise through the diversification, in situ, of a dialect network. The articles in this special issue all contribute towards addressing this problem, from a range of perspectives.
Problems with, and alternatives to, the tree model in historical linguistics
Siva Kalyan, Alexandre François and Harald Hammarström
Detecting non-tree-like signal using multiple tree topologies
Annemarie Verkerk
Visualizing the Boni dialects with Historical Glottometry
Alexander Elias
Subgrouping the Sogeram languages: A critical appraisal of Historical Glottometry
Don Daniels, Danielle Barth and Wolfgang Barth
Save the trees: Why we need tree models in linguistic reconstruction (and when we should apply them)
Guillaume Jacques and Johann-Mattis List
When the waves meet the trees: A response to Jacques and List
Siva Kalyan and Alexandre François
Understanding language genealogy: Alternatives to the tree model
Problems with, and alternatives to, the tree model in historical linguistics
Siva Kalyan, Alexandre François, and Harald Hammarström
Australian National University | LaTTiCe (CNRS; ENS-PSL; Paris -USPC) | Uppsala University
Ever since it was popularized by August Schleicher (, ), the family-tree
model has been the dominant paradigm for representing historical relations
senting language histories: for example, Johannes Schmidt’s () “Wave Model”
(as illustrated, e.g., in Schrader : and Anttila :); Southworth’s ()
“tree-envelopes” (which seem to predate the “species trees” of phylogeography,
e.g. Goodman et al. ; Maddison ); Hock’s (:) “‘truncated octopus’-like tree”;
and, more recently, NeighborNet (Hurles et al. ; Bryant et al.
) and Historical Glottometry (Kalyan & François ). However, none of
pretability of the family tree model.
that every generation of speakers derives their language from the parental gener-
not share any further genealogical innovations with its unmodified variant, but
the use of powerful techniques of phylogenetic inference that have been developed
in biology (see Greenhill & Gray ; Baum & Schmidt ), and the stringent
assumptions underlying a family tree make it possible to infer the relative age of
family (see Jacques & List this volume: Section ., Kalyan & François : Section .,
and Baum & Schmidt : Chapter  for parallels in biology).
Journal of Historical Linguistics 9:1 (2019), pp. 1–8.
© John Benjamins Publishing Company
There are important reasons to be sceptical of the accuracy and usefulness
of the family tree model in historical linguistics. When applying that model to a
language family, it is assumed that every linguistic innovation applies to a language
considered as an undifferentiated whole, a point with no “width”.
This assumption makes it impossible to use a tree to model the partial diffusion
of an innovation within a language community (“internal diffusion” in François
:) or the diffusion of an innovation across language communities (“external
diffusion” in François :, or simply “borrowing”). These limitations have
long been noticed by historical linguists (Schmidt ; Schuchardt ), but
they become glaringly obvious in the cases discussed by Ross (, ) under
the heading of “linkages”, i.e., language families that arise through the
diversification, in situ, of a dialect network.
Following the discussion in François (:), a linkage consists of separate
modern languages which are all related and linked together by intersecting layers
of innovations; it is a language family whose internal genealogy cannot be
represented by any tree. Figure shows how innovations (isoglosses numbered  to
patterns – a configuration encountered both in dialect continua and in the
linkages that descend from them.
Figure . Intersecting isoglosses in a dialect continuum, or a “linkage”
As noted by Kalyan & François (this volume), this type of assumption is well-justified in
biology, where the rate at which innovations spread is far greater than the rate at which
populations split, so that for all practical purposes, each innovation affects a species as an
undifferentiated whole (Baum & Schmidt :).
Over the past several decades, linguistic research has revealed numerous
examples of linkage phenomena in a broad range of language families: these
examples can be found in (subgroups of) Sinitic (Hashimoto ; Chappell );
Semitic (Huehnergard & Rubin ; Magidow ); Western Romance (Penny
:–; Ernst et al. ), Germanic (Ramat ), Indo-Aryan (Toulmin
), and Iranian (Korn forthcoming); Athabaskan (Krauss & Golla ; Holton
); Pama-Nyungan (Bowern ); and Oceanic (Geraghty ; Ross ;
bone” tree accounting for vertical transmission with a sprinkle of additional
borrowing events (as exemplified by, e.g., Ringe et al.  or Nakhleh et al.  for
Indo-European); on the other end, the roles are reversed, with the bulk of linguis-
phylogeny” (as exemplified by, e.g., the “rake-like tree” discussed by Pawley 
search for ways of quantifying and representing the diversification of a linkage
(); still, it remains an open problem.
The articles in the present issue all contribute towards addressing this problem
ular language families that exhibit linkage-like behavior, using methodologies that
vary in the degree to which they accept the premises of the family-tree model.
Verkerk, in her article “Detecting non-tree-like signal using multiple tree
topologies”, addresses the question of how and where non-tree-like behaviour can
ing a single tree, her methods infer two trees for each language family – a “major-
tree” accounting for as much as possible of the remainder. The differences between
also possible to explore which specific characters (in this case lexical cognate sets)
ing datasets of the Austronesian, Sinitic, Indo-European, and Japonic families.
of lexical and phonological innovations occurring in this group is carefully sur-
veyed before addressing the question of which features are inherited and which
tent with a tree-like divergence, while the remaining innovations cross-cut any
quantified and illustrated using the newly-proposed technique of Historical
Glottometry (François ; Kalyan & François ). This helps the human observer
to visually appreciate the presence and extent of multiple subgrouping, chaining,
and areal spread.
Daniels, Barth & Barth, in “Subgrouping the Sogeram languages: A critical
appraisal of Historical Glottometry”,
occur in this group, then address the question of which historical scenario(s)
could explain them. Using Historical Glottometry, the authors quantify and com-
tree-like break-ups in the history of this subfamily. Furthermore, some improve-
alization, the handling of missing data, and transparency of data analysis.
While all of the above papers discuss theoretical and methodological issues
in the context of particular datasets, the final two articles in this issue are more
general in nature; they try to make explicit the differences between the family tree
model and its alternatives and discuss the extent to which these may be combined
into a unified framework for thinking about language diversification.
Jacques & List, in “Save the trees: Why we need tree models in linguistic
reconstruction (and when we should apply them)”, address skeptics of the tree
in particular distinguishing “data display” from models that encode an explicit
patible with the tree model can in fact be the result of tree-like diversification,
once the phenomenon of “incomplete lineage sorting” is taken into account; thus
missed too quickly. Lastly, they give examples in which an assumption of tree-like
language diversification simplifies the task of inferring the histories of particular
Finally, Kalyan & François, in their contribution “When the waves meet the
trees: A response to Jacques and List”,
ical Glottometry. They stress agreements between Jacques & List’s approach and
their own, then turn to the reading of glottometric diagrams. They define a sys-
diagram, thereby arguing that such diagrams are not limited to static data display.
eage sorting” (i.e., unresolved variation in a proto-language) is extended to the
case of dialectal (i.e. geographically-conditioned) variation.
In summary, the articles in this volume provide a sample of possible
approaches to analyzing the evolution of a language family in non-cladistic terms.
to which different approaches diverge from these assumptions. We hope that this
issue leads to a diversification of methods in historical linguistics, with ample bor-
rowing and diffusion among them.
This work contributes to the research program “Investissements d’Avenir”, overseen by the
French National Research Agency (ANR--LABX-): LabEx Empirical Foundations of
Linguistics, Strand  – “Typology and dynamics of linguistic systems”.
Annemarie Verkerk
University of Reading &
Recent applications of phylogenetic methods to historical linguistics have
been criticized for assuming a tree structure in which ancestral languages
differentiate and split up into daughter languages, while language evolution
is inherently non-tree-like (François ; Blench :–). This article
attempts to contribute to this debate by discussing the use of the multiple
topologies method (Pagel & Meade a) implemented in BayesPhylogenies
(Pagel & Meade ). This method is applied to lexical datasets from
four different language families: Austronesian (Gray, Drummond & Greenhill
), Sinitic (Ben Hamed & Wang ), Indo-European (Bouckaert
et al. ), and Japonic (Lee & Hasegawa ). Evidence for multiple
topologies is found in all families except, surprisingly, Austronesian. It is
suggested that reticulation may arise from a number of processes, including
dialect chain break-up, borrowing (both shortly after language splits and
later on), incomplete lineage sorting, and characteristics of lexical datasets.
It is shown that the multiple topologies method is a useful tool to study the
dynamics of language evolution.
Keywords: Bayesian phylogenetic inference, Austronesian, Sinitic,
Indo-European, Japonic, language contact, reticulation
. Introduction
made inroads into historical linguistics: these methods are applied both to build-
ing phylogenetic trees (Ben Hamed & Wang ; Gray, Drummond & Greenhill
; Lee & Hasegawa ; Bouckaert et al. ; Grollemund et al. ) and to
inferring the process of evolution of typological characteristics on trees (Dunn
et al. ; Verkerk ; Zhou & Bowern ). e reception of these studies
has been mixed, with studies criticizing data type (generally cognate-coded lexical
Journal of Historical Linguistics 9:1 (2019), pp. 9–69.
© John Benjamins Publishing Company
data), data quality, the applicability of methods and models from another disci-
knowledge into the phylogenetic analysis (Eska & Ringe ; Heggarty ;
Holm ; Blench ; Pereltsvaig & Lewis ).
Nevertheless, phylogenetic methods have been adopted by a wide range of
linguists to answer questions regarding the diversification of language families
and of typological features (Bowern & Atkinson ; Bouchard-Côté et al. ;
Galucio et al. ; Macklin-Cordes & Round ; Meira, Birchall & Chousou-
Polydouri ; Jaeger & Wichmann ; Widmer et al. ). Dunn (:),
in his chapter in Bowern & Epps () The Routledge Handbook of Historical
positive reception is for good reason, given that evolutionary biologists have been
considering statistical approaches to the study of species diversification for over
rial came from Edwards & Cavalli-Sforza (). Dunn () describes how lin-
guists began considering quantitative approaches to language history only when
Swadesh (Swadesh , ) first developed his methods of lexicostatistics and
ods from evolutionary biology by Gray & Jordan (). Thus, the methods of
evolutionary biology are likely to complement those of traditional historical lin-
guistics, especially when it comes to quantitative methods for inferring language
genealogies and changes in cultural and linguistic features in these genealogies
(Pagel ; Levinson & Gray ).
In the current article, a phylogenetic inference technique taken from biology
is applied to linguistic data, and its usability is reviewed. The technique is called
“multiple tree topologies”. It is implemented using the software BayesPhylogenies,
which provides a Bayesian framework for inferring trees for a variety of data types,
history, but instead fits multiple independent tree topologies to the data. If, for
whatever reason, certain sites (columns in the dataset, which here constitute the
cognate sets) point towards a different cladistic grouping than others do, these
different signals are picked up and then reflected by displaying statistical support
for two or more different tree topologies which are estimated simultaneously. This
method is applied to lexical datasets from four language families: Austronesian
(Gray, Drummond & Greenhill ), Sinitic (Ben Hamed & Wang ), Indo-
European (Bouckaert et al. ), and Japonic (Lee & Hasegawa ). This article
contributes to the use of phylogenetic methods in historical linguistics by apply-
ing a specific phylogenetic method to linguistic data. It is especially relevant to
temporary entities in terms of descent with modification from common ancestors,
can deal with non-tree-like aspects of language change.
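
The idea that different sites (cognate sets) can point towards different cladistic groupings can be illustrated with a deliberately crude toy example. This sketch is not the Bayesian mixture procedure implemented in BayesPhylogenies; it only assumes, for illustration, that a cognate set "fits" a candidate tree when the languages attesting it form one of that tree's clades. All names, trees, and cognate sets below are hypothetical.

```python
def supports(cognate_set, clades):
    """A cognate set fits a tree if its languages form one of the tree's clades."""
    return frozenset(cognate_set) in clades


def assign_sites(cognate_sets, trees):
    """Map each cognate-set label to the names of the candidate trees it fits."""
    return {
        label: [name for name, clades in trees.items() if supports(langs, clades)]
        for label, langs in cognate_sets.items()
    }


# Two hypothetical candidate topologies over languages A-D, given as sets of clades.
trees = {
    "tree_1": {frozenset("AB"), frozenset("CD"), frozenset("ABCD")},
    "tree_2": {frozenset("BC"), frozenset("AD"), frozenset("ABCD")},
}

# Hypothetical cognate sets: the languages in which each cognate class is attested.
cognate_sets = {
    "c1": {"A", "B"},
    "c2": {"C", "D"},
    "c3": {"B", "C"},
    "c4": {"A", "C"},
}

assignment = assign_sites(cognate_sets, trees)
for label, fitting in assignment.items():
    print(label, fitting)
```

Here two cognate sets fit only the first topology, one fits only the second, and one fits neither. A probabilistic method weighs such conflicting signals statistically rather than by the all-or-nothing rule used in this sketch.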
we know includes both (i) genealogically inherited characteristics that can be
through other processes, including dialect chain break-up, borrowing (from a
substrate, superstrate, or adstrate), or incomplete lineage sorting. After applying
and non-tree-like signal is recovered and correctly characterized. However, this
required dataset does not exist, as the bundles of data that are studied by historical
linguistics contain features that could have arisen via any of these processes
(Heggarty, Maguire & McMahon :), and quantitative methods for exploring a
Scornavacca :). In some cases, it is possible to distinguish different sources
of divergent signals on the basis of knowledge about which features are more likely
topologies method or of any method that attempts to incorporate multiple
historical signals in some way is therefore not possible at this point in time.
In this light, this study should be seen as an exploratory experiment for the