Relationships within core landbirds depend on data type and analytical approach. We present four candidate topologies that have been recovered when different data types and analytical approaches are used. Multispecies coalescent ("species tree") analyses are indicated using "MSC"; all other analyses used concatenated data. (a) Division into two major clades (Australaves and Afroaves; see Table 3), present in the Jarvis et al. (2014) TENT and the Reddy et al. (2017) analyses of concatenated noncoding data. (b) Accipitriformes sister to all other core landbirds, found in the binned MP-EST (Mirarab et al. 2014a) analysis of intronic data reported by Jarvis et al. (2014) and the Prum et al. (2015) concatenated tree. The Kimball et al. (2013) NJst analysis had a similar topology (the asterisk indicates a minor rearrangement placing mousebirds sister to owls). This topology was also supported by unbinned MP-EST analyses of three different noncoding datasets in Edwards et al. (2017), although the taxon sampling in that study was limited. (c) An accipitriform + owl clade sister to all other core landbirds, found in binned MP-EST analysis of the TENT data and unbinned MP-EST analysis of introns by Jarvis et al. (2014). Reddy et al. (2017) also found this topology in their concatenated analyses of a 104-locus coding data matrix (their "Prum noJar" tree). (d) Division into two major clades but mousebirds sister to all other Afroaves, found in the Jarvis et al. (2014) concatenated UCE tree and the Suh et al. (2015) analysis of TE insertions. Cavitaves is defined as Piciformes, Coraciiformes, Bucerotiformes, Trogoniformes, and Leptosomiformes (Yuri et al. 2013).
Two examples of gene trees from Reddy et al. (2017), emphasizing differences in relative rates and base composition. Support values from an ultrafast bootstrap (Minh et al. 2013) analysis in IQ-TREE are shown next to branches when they are ≥70%. (a) Tree based on an intron in the PPP2CB locus. This is one of the few individual gene trees that divides Neoaves into Columbea and Passerea. As in Fig. 7, the GC-content for informative sites is indicated with colored arrows (red for the six most GC-rich taxa and blue for the six least GC-rich taxa). Although the median GC-content for informative sites (45.4%) does differ from the GC-content for constant sites (49.7%), this locus exhibits limited GC-variation overall (range for informative sites = 41.5-58.4%). (b) Tree based on part of the BDNF coding exon. This locus exhibits substantial rate variation (note the very long branches for the bee-eater, Darwin's finch, and tinamou). Like PPP2CB, the median GC-content for informative sites (47.5%) differs from the GC-content of constant sites (52.5%). However, this region also exhibits substantial GC-content variation (range for informative sites = 33.9-84.8%). Most nodes in this gene tree have limited support, similar to the PPP2CB gene tree (and most trees based on short gene regions), but the BDNF tree does include some strongly supported clades. However, some of those strongly supported clades contradict monophyly of Palaeognathae and Neoaves, which are united by very long branches in all other estimates of the avian species tree. The extreme GC-content and rate variation suggest that these conflicts may reflect a biased estimate of phylogeny. All data supporting this figure are available in Braun (2018).
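The comparison of GC-content at informative and constant sites described in this caption can be illustrated with a minimal sketch. It assumes an aligned, uppercase nucleotide matrix and the usual definition of a parsimony-informative site (at least two bases, each present in at least two taxa). The toy alignment, the taxon labels, and the function names are hypothetical; this is not the pipeline used by Reddy et al. (2017).

```python
from collections import Counter
from statistics import median

def classify_columns(alignment):
    """Split alignment columns into parsimony-informative and constant sites.

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    seqs = list(alignment.values())
    informative, constant = [], []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs if s[i] in "ACGT")
        if len(counts) == 1:
            constant.append(i)
        elif sum(n >= 2 for n in counts.values()) >= 2:
            # at least two bases are each shared by two or more taxa
            informative.append(i)
    return informative, constant

def gc_fraction(seq, columns):
    """GC fraction of one sequence, restricted to the given columns."""
    bases = [seq[i] for i in columns if seq[i] in "ACGT"]
    if not bases:
        return float("nan")
    return sum(b in "GC" for b in bases) / len(bases)

# Hypothetical four-taxon alignment, for illustration only.
toy = {
    "taxonA": "ACGTGCA",
    "taxonB": "ACGTGTA",
    "taxonC": "ACATCCA",
    "taxonD": "ACATCTA",
}

informative, constant = classify_columns(toy)
per_taxon = {name: gc_fraction(seq, informative) for name, seq in toy.items()}
median_informative = median(per_taxon.values())
gc_constant = gc_fraction(toy["taxonA"], constant)
gc_range = (min(per_taxon.values()), max(per_taxon.values()))
```

In this toy case the per-taxon GC fraction at informative sites spans a wide range while the constant sites are AT-rich, which is the kind of contrast the caption reports for the BDNF exon.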
Resolving the Avian Tree of Life from Top
to Bottom: The Promise and Potential
Boundaries of the Phylogenomic Era
Edward L. Braun, Joel Cracraft, and Peter Houde
Abstract
Reconstructing relationships among extant birds (Neornithes) has been one of the
most difficult problems in phylogenetics, and, despite intensive effort, the avian
tree of life remains (at least partially) unresolved. Thus far, the most difficult
problem is the relationship among the orders of Neoaves, the major clade that
includes most (~95%) of the named bird species. This clade appears to have
undergone a rapid radiation near the end-Cretaceous mass extinction (the K-Pg
boundary). On the other hand, if one embraces a “glass half full” view, the fact
that most orders in Neoaves can be placed into seven clades, recently designated
the “magnificent seven,” could be viewed as remarkable progress. We propose
that the dawning era of whole-genome phylogenetics will resolve the
remaining relationships only if we improve data quality, exploit information from
other sources (i.e., rare genomic changes), and learn more about the functional
and evolutionary landscape of avian genomes. Of course, it is possible that the
remaining unresolved relationships are unresolvable regardless of the data available,
but we suggest that the community should avoid this conclusion until more
data collection has been completed and improved analyses have been conducted.
Author contributed equally with all other contributors. Edward L. Braun, Joel Cracraft and Peter
Houde
E. L. Braun
Department of Biology and Genetics Institute, University of Florida, Gainesville, FL, USA
e-mail: ebraun68@ufl.edu
J. Cracraft
Department of Ornithology, American Museum of Natural History, New York, NY, USA
e-mail: jlc@amnh.org
P. Houde (*)
Department of Biology, New Mexico State University, Las Cruces, NM, USA
e-mail: phoude@nmsu.edu
© Springer Nature Switzerland AG 2019
R. H. S. Kraus (ed.), Avian Genomics in Ecology and Evolution,
https://doi.org/10.1007/978-3-030-16477-5_6
We say this because there is ample evidence that estimates of avian phylogeny
based on large-scale datasets may be affected by well-characterized artifacts (e.g.,
long-branch attraction, heterotachy, and discordance among gene trees) and by
subtle “data-type effects” that reflect poor fit to empirical data for available
models of sequence evolution. Even if these analytical challenges can be
addressed, we need to integrate phylogenomic and fossil data. Finally, we also
emphasize that, regardless of the resolution (or lack thereof) for relationships
among major avian clades, we are only at the dawn of the phylogenomics of birds.
Large-scale molecular data remain unavailable for the vast majority of the
~10,000 named bird species, and those named bird species probably represent
an underestimate of the true number of distinct evolutionary lineages of birds
(whether or not those lineages are assigned the rank of species) by as much as
threefold. A true biodiversity genomics effort in birds is likely to reveal many
additional examples of cases where it is very difficult to resolve relationships; the
effort to resolve as many of those relationships as possible will represent a major
scientific achievement and provide lessons for phylogenomic studies in other
parts of the tree of life.
Keywords
Bird phylogeny · Phylogenetic estimation · Base composition · GC-content ·
Heterotachy · Rare genomic changes · Multispecies coalescent · Whole-genome
sequencing · Phylogenomics
1 Introduction
A renaissance within systematics began in the late 1980s with the introduction of the
polymerase chain reaction (PCR), which provided a simple method for directly
harvesting and sequencing DNA for comparative studies (Higuchi and Ochman
1989; Kocher et al. 1989; Saiki et al. 1988). This had an almost immediate effect
within systematic biology in general (Hillis and Moritz 1990; Miyamoto and
Cracraft 1991), as well as in ornithology in particular (Edwards et al. 1991; Edwards
and Wilson 1990; Helm-Bychowski and Cracraft 1993; Mindell 1997; also see
Sheldon and Bledsoe 1993 for a review describing the early history of molecular
systematics in birds). The human genome project began during the same period
(Cantor 1990; Watson 1990). This led to remarkable advances in DNA sequencing
technologies and methods for the storage and analysis of sequence data that continue
to have a profound impact on comparative biology, including the field of systematics
(for additional historical details, see Wink 2019). These technical achievements
resulted in DNA sequence comparative datasets of ever increasing sizes. Thus, the
next decade saw the widespread expansion of the use of DNA sequence data in avian
systematics, and knowledge about the avian tree of life deepened at all taxonomic
levels (reviewed by Cracraft et al. 2004). In particular, studies of avian relationships
were characterized by increased taxon sampling and the inclusion of multiple
mitochondrial regions and nuclear loci.
The idea of “phylogenomics” has existed for about two decades. Although the
original usage included the inference of gene function using evolutionary history
(Eisen 1998; Eisen et al. 1997), the term has largely become synonymous with
the use of large amounts of sequence data in phylogenetics (Delsuc et al. 2005).
And today, there are many thousands of citations alluding to “phylogenomics,” and
the majority of these imply the use of “genomic” or “large-scale” approaches to
estimate the tree of life. In some sense, the phylogenomic era of avian systematics
began with Hackett et al. (2008), in which the “Early Bird” consortium of
investigators employed 19 loci (~32 kb of sequence) from 169 species to construct
a phylogeny that included all major avian lineages (Table 1). It might be more
accurate to view the Early Bird effort as the beginning of a “proto-phylogenomic
era” of avian systematics because it represented a significant increase in scale of data
collection relative to previous work but it still relied on PCR for gene sampling.
However, the use of PCR sampling was not a major limitation, and, at the time,
Hackett et al. (2008) provided the broadest support for relationships within and
(to some degree) among avian orders.
Table 1 Recent large-scale estimates of avian phylogenyᵃ

Study                           Number of neornithine taxa   Data typeᵇ                       Analysisᶜ   Branch lengthsᵈ
Fain and Houde (2004)           149                          one nuclear locus (nc)           MP/ML/BI    n/a
Ericson et al. (2006)           111                          five nuclear loci (c/nc)         BI          time
Livezey and Zusi (2007a)        150                          morphologyᵉ                      MP          n/a
Hackett et al. (2008)           169                          19 nuclear loci (nc)             ML          mol
Kimball et al. (2013)           77                           50 nuclear loci (nc)             ML          mol
McCormack et al. (2013)         33                           1541 nuclear loci (nc)           BI          mol
Jarvis et al. (2014)            48                           12,020 nuclear loci (nc/c)       ML          time/mol
Prum et al. (2015)              198                          259 nuclear loci (c)             BI          time
Claramunt and Cracraft (2015)   48/230ᶠ                      up to 1156 nuclear loci (c)      BI          time
Reddy et al. (2017)             235                          54 nuclear loci (nc)             ML          mol

ᵃ We define large-scale trees as those that include most or all of the orders, as defined by Cracraft
(2013), in Neoaves
ᵇ Data types are reported as “c” for primarily coding and “nc” for primarily noncoding (introns,
UCEs, and untranslated regions)
ᶜ BI Bayesian inference, ML maximum likelihood, MP maximum parsimony
ᵈ Branch lengths available as estimates of absolute time or molecular change (substitutions per site).
“n/a” indicates that branch lengths are not available
ᵉ Livezey and Zusi (2007b) provide a detailed description of this morphological data; Mayr (2008)
discussed issues with character scoring and the interpretations of avian morphological variation that
are more congruent with molecular phylogenies
ᶠ The 48-taxon tree in Claramunt and Cracraft (2015) reflects an analysis of 1156 clocklike coding
regions. The 230-taxon tree reflects an analysis of two coding regions. Both analyses were
constrained to the Jarvis et al. (2014) backbone. Additional analyses using the Prum et al. (2015)
backbone are reported in their supplementary material
This achievement was soon surpassed by studies that adopted various next-
generation sequencing technologies (Glenn 2011). Arguably, the new and innova-
tive methods for sequence capture represent the most important technology for avian
phylogenomics at this time. Those methods greatly expanded the harvesting of
hundreds or thousands of loci across the genome (Table 2), truly launching the era
of avian phylogenomics. Currently, one of the more popular sequence capture
methods is the use of probes that hybridize to ultraconserved elements (UCEs),
which correspond largely to noncoding regions that are conserved across most or all
vertebrates (Bejerano et al. 2004). Many noncoding UCEs appear to be involved in
gene regulation (Dimitrieva and Bucher 2012), but their function is not related to
their use in phylogenetics. Instead, the conserved nature of the sequences allows
probes that hybridize to UCEs to be used for sequence capture in many different
vertebrates; most or all of the phylogenetic information in UCE datasets actually
reflects the less-conserved sequences that flank the conserved core of the UCE
(Crawford et al. 2012; Faircloth et al. 2012; McCormack et al. 2012, 2013).
Analyses of UCEs have begun to advance our understanding of avian systematics
at all levels, from the deepest branches (e.g., McCormack et al. 2013; Gilbert et al.
2018) to the tips of the tree (Table 2). UCEs even appear to be useful at
phylogeographic scales (Harvey et al. 2016; Smith et al. 2014). Lemmon et al.
(2012) developed a similar approach based on a distinct probe set (one focused
largely on coding exons) that they called anchored hybrid enrichment; like UCE
sequence capture, anchored hybrid enrichment is capable of harvesting vast
quantities of data, and it is also being applied in avian systematics (Prum et al. 2015).

Table 2 Published phylogenomic studies using the UCE sequence capture data

Study                         Focal order        Number of species   Number of samplesᵃ
Smith et al. (2014)           Passeriformes      5                   36
Sun et al. (2014)             Galliformes        15                  —
Bryson et al. (2016)          Passeriformes      30                  —
Hosner et al. (2016)          Galliformes        23                  —
Hosner et al. (2015b)         Galliformes        90                  —
Manthey et al. (2016)         Passeriformes      11                  28
McCormack et al. (2016)       Passeriformes      1 (3)ᵇ              27
Meiklejohn et al. (2016)      Galliformes        18                  —
Moyle et al. (2016)           Passeriformes      106                 —
Persons et al. (2016)         Galliformes        11                  —
Zarza et al. (2016)           Passeriformes      3                   26
Andersen et al. (2017)        Coraciiformes      21                  —
Bruxaux et al. (2017)ᶜ        Columbiformes      6                   21
Campillo et al. (2017)        Passeriformes      17                  —
Hosner et al. (2017)          Galliformes        115                 —
Wang et al. (2017)            Galliformes        20                  —
White et al. (2017)           Caprimulgiformes   12                  —
Andermann et al. (2018)       Caprimulgiformes   2                   9
Musher and Cracraft (2018)    Passeriformes      29                  62
Younger et al. (2018)         Passeriformes      3                   23

ᵃ The total number of samples is listed if multiple individuals were sequenced for at least one of the
focal species; a dash indicates that only one sample per species was sequenced
ᵇ McCormack et al. (2016) sequenced 27 Western scrub jays (Aphelocoma californica) from
3 lineages that could represent species
ᶜ Bruxaux et al. (2017) used genome skimming followed by bioinformatic extraction of UCE loci
(and other loci) rather than sequence capture
At present, the only higher-level phylogenetic study of birds that has employed
whole genomes (more accurately, draft genome sequences) is that of Jarvis et al.
(2014). However, analyses of draft genome sequences are increasingly informing
avian evolutionary studies (Cornetti et al. 2015; Lamichhaney et al. 2015;
Nadachowska-Brzyska et al. 2015; Nater et al. 2015; Poelstra et al. 2014; Toews
et al. 2016a; Tuttle et al. 2016; Ottenburghs et al. 2017a; Stryjewski and Sorenson
2017; Tiley et al. 2018; for recent general reviews of avian evolutionary genomics,
see Joseph and Buchanan 2015; Kraus and Wink 2015; Toews et al. 2016b). Indeed,
there are currently major efforts in multiple labs to increase the number of avian
genome sequences (like the B10K project; Zhang et al. 2015), and within just a few
years, the amount of comparative data available for phylogenetic and evolutionary
studies in birds will expand exponentially. Yet, although the last decade has seen a
great improvement in our understanding of avian relationships, these large-scale data
have also revealed remarkable incongruence in avian relationships due to using
different genes, distinct data types (e.g., across exons, introns, and UCEs), and
various taxon and character sampling regimes (e.g., Jarvis et al. 2014; Hosner
et al. 2015b, 2016; Ottenburghs et al. 2016a; Reddy et al. 2017). Thus, far from
heralding “the end of incongruence,” as Gee (2003) asserted, phylogenomics has
actually revealed the complex nature of the phylogenetic signals in genomes (e.g.,
Jarvis et al. 2014). This raises several questions: Why do some nodes in
phylogenomic trees have limited support even when large amounts of data are
analyzed? What does it mean to use “whole genomes” in phylogenetics? What are
the limitations to “whole-genome” analysis? If we shift our focus toward avian
phylogenomics more specifically, there are several additional questions that emerge:
What are the most problematic relationships in the bird tree based on the data and
analyses that are currently available? And finally, what are likely to be the best ways
forward to resolve many of the very difficult problems that still exist across the avian
tree?
In this chapter we seek to address these questions by examining several fundamental
issues relevant to the theory and practice of phylogenetics, using the evolution
of birds as a “model system” to understand phylogenomic methods. We focus
exclusively on extant birds (Neornithes) and refer readers to recent reviews (Brusatte
et al. 2015; Wang and Zhou 2017) for discussions of Mesozoic birds. We also avoid
discussion of non-avian dinosaurs, although we would like to point interested
readers to the chapter on “dinosaur genomics” by Griffin et al. (2019). This chapter
begins by highlighting the progress that has been made (or has not been made) in
elucidating the avian tree of life, largely focusing on higher-level taxonomic
relationships. Throughout, we have used common names (unless their use is
unwieldy) in order to make this chapter more accessible to readers working outside
of avian systematics (the scientific names associated with those common names are
listed in Table 3). Then we discuss the lessons of current genome-scale efforts to estimate the
bird tree, focusing on analytical challenges, such as the impact of data-type effects
(Reddy et al. 2017) and the computational challenges (e.g., the need to use more than
400 years of CPU time to analyze 48 bird genomes; see Jarvis et al. 2014). We place
this review of avian phylogenomics in a broader context, discussing the potential for
phylogenomics to have an impact on the fields of systematics, paleontology, genomics,
molecular biology, evolutionary developmental biology (evo-devo), and
biodiversity studies more generally. We also discuss the potential for those fields
to influence avian phylogenomics.
We have sometimes been deliberately provocative in this review, making it our
goal to summarize hypotheses embraced by many in the avian systematics community
and to play devil’s advocate regarding those hypotheses. We believe that this
will stimulate integrative studies to answer the many remaining questions in avian
phylogeny. Finally, we take a look forward and discuss the potential impact of
very high-quality (“platinum” or “reference” quality) genome assemblies generated
using third- and fourth-generation sequencing technologies (for recent reviews of
sequencing technologies, see Bleidorn 2016; Feng et al. 2015; Korlach et al. 2017).
We certainly expect platinum-quality genome assemblies to have a major impact
on phylogenomics, especially when those high-quality data are combined with
improved analytical methods. However, the phylogenomic data available at this
time (e.g., sequence capture and draft genome assemblies) have already enabled the
community to solve phylogenetic problems that have long been thought to be
intractable; we expect this trend to continue.
2 What We Do (And Do Not) Know About the Avian Tree
Systematic studies during the twenty-first century rapidly affirmed the
non-monophyly of many traditionally recognized orders, notably Gruiformes,
Ciconiiformes, Pelecaniformes, and Falconiformes. All of these figured prominently
in classifications for more than a century, and their ordinal names have been retained
in modern taxonomies (Table 3). However, the results of analyses using molecular
data required subsuming some traditional orders into more inclusive ones (e.g.,
Ciconiiformes within Pelecaniformes, Apodiformes within Caprimulgiformes) or
defining other orders more narrowly (e.g., Gruiformes and Falconiformes). Likewise,
some largely abandoned names have been resurrected (e.g., Accipitriformes)
and some families have been reassigned to ordinal rank (e.g., Mesitornithiformes
and Eurypygiformes). Efforts based on many genes, including some early
phylogenomic efforts, have also begun to resolve superordinal clades with some
degree of confidence (see below). In some cases, these studies have revealed
counterintuitive relationships, like the sister relationship of grebes and flamingos
as well as the placement of tropicbirds sister to Eurypygiformes (Sunbittern
and Kagu).
Table 3 Comparison of ordinal circumscriptions in commonly used avian taxonomies (“—” indicates ordinal name identical to H&Mᵃ)
Taxa H&M Clements IOC HBW Traditional
Number of orders: 37 40 40 36 n/a
Number of species: 10,021 10,550 10,694 10,964 n/a
Palaeognathae
Ostrich Struthioniformes ——— —
Rheas Rheiformes ——Struthioniformes
Emu and cassowaries Casuariiformes ——Struthioniformes
Kiwis Apterygiformes ——Struthioniformes
Elephant birds† Aepyornithiformes
Moas† Dinornithiformes
Tinamous Tinamiformes ——Struthioniformes
Galloanseres
Waterfowl Anseriformes ——— —
Landfowl Galliformes ——— —
Neoaves
1. Telluraves (core landbirds)
1a. Australaves (originally “Australavis” in Ericson 2012)
Passerines Passeriformes ——— —
Parrots Psittaciformes ——— —
Falcons Falconiformes ——— —
Seriemas Cariamiformes ——— —
1b. Afroaves (monophyly uncertain; Cavitaves sensu Yuri et al. 2013 italicized)
Woodpeckers and allies Piciformes ——— —
Puffbirds and jacamars Piciformes Galbuliformes —— —
Rollers and allies Coraciiformes ——Coraciiformes
Hornbills and allies Bucerotiformes ——Coraciiformes
Trogons Trogoniformes ——— —
Cuckoo roller Leptosomiformes ——Coraciiformes
Mousebirds Coliiformes ——— —
Owls Strigiformes ——— —
Hawks and allies Accipitriformes ——Falconiformes
New World vultures Accipitriformes Cathartiformes Cathartiformes Falconiformes
2. Aequornithes (core waterbirds)
Penguins Sphenisciformes ——— —
Tubenoses Procellariiformes ——— —
Pelicans Pelecaniformes ——See text
Cormorants and allies Pelecaniformes Suliformes Suliformes Suliformes
Storks Pelecaniformes Ciconiiformes Ciconiiformes Ciconiiformes Ciconiiformes
Loons Gaviiformes ——— —
3. Phaethontimorphae
Tropicbirds Phaethontiformes ——Pelecaniformes
Sunbittern and Kagu Eurypygiformes ——Gruiformes
4. Otidimorphae
Cuckoos Cuculiformes ——Cuculiformes
Bustards Otidiformes ——Gruiformes
Turacos Musophagiformes Cuculiformes —— Cuculiformes
5. Caprimulgiformes (Strisores)
Hummingbirds Caprimulgiformes Apodiformes Apodiformes
Swifts and tree-swifts Caprimulgiformes Apodiformes Apodiformes
Owlet-nightjars Caprimulgiformes Apodiformes ——
Frogmouths Caprimulgiformes ——— —
Nightjars Caprimulgiformes ——— —
Potoos Caprimulgiformes ——— —
Oilbird Caprimulgiformes ——— —
6. Columbimorphae
Mesites Mesitornithiformes ——Gruiformes
Sandgrouse Pterocliformes ——Columbiformes
Doves Columbiformes ——— —
7. Phoenicopterimorphae (Mirandornithes)
Grebes Podicipediformes ——— —
Flamingos Phoenicopteriformes ——— —
“Orphan orders” (Cursorimorphae sensu Jarvis et al. 2014 italicized, but monophyly uncertain)
Hoatzin Opisthocomiformes ——See text
Cranes, rails, and allies Gruiformes ——— —
Shorebirds Charadriiformes ——— —
ᵃ H&M = (Dickinson and Christidis 2014; Dickinson and Remsen Jr. 2013); Clements = (Clements et al. 2017); IOC = IOC World Bird List (Gill and Donsker
2017); HBW = (del Hoyo et al. 2017)
† extinct
Below we showcase what is currently known with reasonable certainty about the
relationships of the supraordinal lineages of birds (summarized as a consensus tree in
Fig. 1), highlighting the contributions of phylogenomics to our understanding of the
bird tree. At present, Jarvis et al. (2014) and Prum et al. (2015) are the largest-scale
estimates of the avian tree. Jarvis et al. (2014) presented many trees, but we largely
focus on the “total evidence nucleotide tree” (TENT), which was based on a
maximum likelihood (ML) analysis of a data matrix comprising intronic, exonic,
and UCE data. Prum et al. (2015) presented a single tree based on a Bayesian
analysis of a largely exonic dataset. They also provided an ML tree with a virtually
identical topology but much lower support; in this chapter, we focus on the ML
bootstrap support because it is more comparable to the support values associated
with Jarvis et al. (2014) trees. The tree in Fig. 1 is a strict consensus of the Jarvis et al.
(2014) TENT and the Prum et al. (2015) tree, modified to include information about
specific taxa that were not sampled by Jarvis et al. (2014). For those taxa that were
not included in Jarvis et al. (2014), we also considered Reddy et al. (2017), who
presented ML and Bayesian analyses of a data matrix dominated by noncoding
sequences with a sample of taxa similar to the Prum et al. (2015) study. We view the
consensus tree in Fig. 1 as the best corroborated hypothesis for the bird tree, but the
topology of the bird tree remains far from certain at this time. We emphasize this
uncertainty by calling attention to the key unsolved problems of higher-level
relationships.
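
For readers less familiar with consensus methods, a strict consensus of rooted trees retains only those clades that appear in every input tree; any conflict collapses into a polytomy. The following minimal sketch illustrates the idea for two trees on the same taxon set. The nested-tuple representation, the function names, and the five placeholder taxa (A to E) are assumptions made for illustration. The topologies are hypothetical and are not the published Jarvis et al. (2014) or Prum et al. (2015) trees, which also differ in taxon sampling.

```python
def clades(tree):
    """Return every clade in a rooted tree as a frozenset of tip names.

    A tree is a nested tuple; a tip is a string (a convention of this sketch).
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found

def strict_consensus_clades(tree_a, tree_b):
    """Clades present in both trees; all other groupings collapse."""
    return clades(tree_a) & clades(tree_b)

# Two hypothetical topologies on the same five placeholder taxa.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "B"), ("C", "D")), "E")

shared = strict_consensus_clades(tree_1, tree_2)
for clade in sorted(shared, key=len):
    print(sorted(clade))
```

Here only the A + B clade and the root survive, so the consensus is a polytomy of (A, B), C, D, and E. The same logic explains why a division supported by only one of the two source trees cannot appear in a strict consensus.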
2.1 Palaeognathae (“Ratites” and Tinamous)
One of the most surprising results from the first comprehensive avian phylogeny
based on multiple nuclear loci (Hackett et al. 2008) was the failure to support
monophyly of ratites (the large, flightless paleognaths such as the ostrich and
emu). Indeed, Hackett et al. (2008) had 100% bootstrap support for a node
contradicting ratite monophyly, placing ostriches sister to a clade comprising the
other ratites and the volant tinamous. Ratites had long been regarded as a textbook
exemplar (e.g., Bergstrom and Dugatkin 2012; Futuyma 2005; Stearns and Hoekstra
2005) of Gondwana vicariance following a single loss of flight in their common
ancestor (Cracraft 1973, 1974). The widespread occurrence of volant Paleogene
paleognaths (lithornithids; Houde 1986, 1988) certainly raised questions regarding
the prevailing paradigm that the distribution of ratites reflects a single loss of flight
and vicariance due to the breakup of Gondwana in the Cretaceous. Likewise,
analyses of complete mitochondrial genomes conducted shortly before Hackett
et al. (2008) reported, at best, equivocal support for ratite monophyly (Braun and
Kimball 2002; Slack et al. 2007) and analyses of some nuclear loci conflicted with
ratite monophyly (MYC in Cracraft et al. 2004; GH1 in Yuri et al. 2008; combined
CLTC and CLTCL1 in Chojnowski et al. 2008). However, an explicit hypothesis of
ratite non-monophyly based on broad sampling of the genome was not advanced
until Hackett et al. (2008). Ratite non-monophyly does not, in and of itself, falsify
the Gondwana biogeography hypothesis. However, it does raise the possibility
of dispersal by a volant ancestor followed by independent losses of flight. The
initial suggestion of ratite non-monophyly was followed by a detailed reanalysis of
the Early Bird data focused on determining whether the support for ratite
non-monophyly reflects misleading phylogenetic signal (Harshman et al. 2008);
that study did not reveal any sources of bias (for a detailed discussion of bias in
phylogenetic analyses, see below, in Sect. 3). Subsequent molecular studies, including
reanalyses of mitochondrial DNA (Phillips et al. 2010), the addition of nuclear
genes (Baker et al. 2014; Haddrath and Baker 2012; Smith et al. 2013), analyses of
transposable element (TE) insertions (Baker et al. 2014; Haddrath and Baker 2012),
and whole-genome analyses (Sackton et al. 2018), have also corroborated ratite
non-monophyly. However, those studies have shown conflicts regarding the other
relationships within paleognaths (we emphasize this by presenting the base of
Palaeognathae except ostriches as a polytomy in Fig. 1).

Fig. 1 Consensus phylogenomic tree of birds. This backbone is a strict consensus of the Jarvis
et al. (2014) “total evidence nucleotide tree” (TENT) and the Prum et al. (2015) tree. The division
of Neoaves into Columbea and Passerea is based on Jarvis et al. (2014), although the division
is not presented in the consensus tree because it is not present in the Prum et al. (2015) tree.
The numbered clades correspond to the “magnificent seven” of Reddy et al. (2017): (1) core
landbirds (Telluraves); (2) core waterbirds (Aequornithes); (3) tropicbirds and Sunbittern …
The recent results raise a profound question about our understanding of
paleognath evolution: how strongly corroborated is the new paradigm? It is impor-
tant to recognize that this new view of palaeognath history actually has three major
components: (1) ratites are not monophyletic; (2) ratites had a volant and vagile
ancestor that lost ight independently after dispersing; and (3) the morphological
features that unite ratites reect convergence. The rst of these is strongly
corroborated, although the phylogenomic study of Le Duc et al. (2015), which
included an analysis of 623 coding regions, supported ratite monophyly. However,
the Le Duc et al. (2015) results are unlikely to be accurate because their taxon
sampling was poor (the only paleognaths sampled were ostrich, a kiwi, and a
tinamou) and analyzed coding data, which is more likely to be misleading than
noncoding sequences (see below, in Sect. 3). It is possible to argue that the second
and third components of the current hypothesis are more equivocal. They are also the
more interesting components of the current hypothesis. A tree topology that nests the
volant tinamous within the ightless ratites does not provide denitive evidence for
multiple losses of ight; logically, it could reect a single loss of ight followed by a
reversal to a volant state in tinamous. In fact, a skeptic might point out that a single
loss of ight followed by the reacquisition of ight in tinamous is actually the most
parsimonious optimization of the volant/ightless character state. Several authors
(e.g., Harshman et al. 2008; Smith et al. 2013) argued against that position by
pointing out that there are many examples of evolution from a volant to ightless
state (e.g., Wright et al. 2016) whereas evidence for evolution in the other direction
is absent. That argument provides evidence for multiple losses of ight within
paleognaths when the data are interpreted in a likelihood framework. The nal
Fig. 1 (continued) (Phaethontimorphae); (4) cuckoos, bustards, and turacos (Otidimorphae);
(5) nightjars, swifts, hummingbirds, and allies (Caprimulgiformes); (6) doves, mesites, and sand-
grouse (Columbimorphae); and (7) amingos and grebes (Phoenicopterimorphae). We indicate the
limited support for Otidimorphae using a hash (#). Reddy et al. (2017) called shorebirds, cranes, and
the Hoatzin the orphan orders; shorebirds and cranes form a clade (Cursorimorphae) in the Jarvis
et al. (2014) TENT but they represent independent lineages in the Prum et al. (2015) tree
162 E. L. Braun et al.
component of the current hypothesis (that the other features that appear to unite
ratites arose by convergence) might appear to have been resolved by Johnston (2011),
who presented a morphological phylogeny that supports ratite non-monophyly and
places ostrich sister to all other paleognaths (like the molecular studies). However,
the vast majority of morphological phylogenies support ratite monophyly, including
recent studies (Bourdon et al. 2009; Worthy and Scoeld 2012; and the uncon-
strained analyses in Worthy et al. 2017). This suggests a truly remarkable degree of
convergence if the current molecular hypothesis is correct.
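
The contrast between equal-cost parsimony and a framework in which regaining flight is penalized can be illustrated with a small sketch. The code below uses Sankoff (weighted) parsimony as a simple stand-in, not the likelihood analyses cited above. The tree shape, the neognath outgroup, and the cost values are illustrative assumptions (Fig. 1 treats most of these relationships as a polytomy); only the nesting of volant tinamous among flightless lineages is taken from the text.

```python
import math

STATES = ("volant", "flightless")

# Hypothetical, simplified topology (an assumption for illustration only).
TREE = ("neognath_outgroup",
        ("ostrich", ("rhea", ("tinamou", ("emu", "kiwi")))))

TIP_STATES = {
    "neognath_outgroup": "volant",
    "ostrich": "flightless",
    "rhea": "flightless",
    "tinamou": "volant",
    "emu": "flightless",
    "kiwi": "flightless",
}


def sankoff(tree, tip_states, cost):
    """Return {state: minimum cost of this subtree if its root has that state}."""
    if isinstance(tree, str):  # a tip: only the observed state is allowed
        return {s: (0 if s == tip_states[tree] else math.inf) for s in STATES}
    child_tables = [sankoff(child, tip_states, cost) for child in tree]
    return {
        s: sum(min(table[t] + cost[(s, t)] for t in STATES)
               for table in child_tables)
        for s in STATES
    }


def min_cost(tree, tip_states, loss_cost=1, gain_cost=1):
    """Minimum total cost over all assignments of states to internal nodes."""
    cost = {
        ("volant", "volant"): 0,
        ("flightless", "flightless"): 0,
        ("volant", "flightless"): loss_cost,   # loss of flight
        ("flightless", "volant"): gain_cost,   # regain of flight
    }
    return min(sankoff(tree, tip_states, cost).values())


equal_costs = min_cost(TREE, TIP_STATES)               # one loss + one regain
costly_gain = min_cost(TREE, TIP_STATES, gain_cost=3)  # three independent losses
print(equal_costs, costly_gain)
```

On this assumed tree, equal costs favor a single loss followed by a regain in tinamous (total cost 2), whereas any regain cost above 2 makes three independent losses (total cost 3) the cheaper explanation.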
Understanding the developmental basis for the loss of flight in different ratite
lineages could provide a direct way to examine the multiple loss of flight hypothesis.
Faux and Field (2017) found that tinamous retain the ancestral pattern of wing length
development (assuming the chicken character state is ancestral) whereas ostriches
and emus exhibit different patterns of wing development. However, Faux and Field
(2017) ultimately map three character states onto a four-taxon tree; thus, all possible
topologies are equally parsimonious (all trees require two character state changes to
explain the observed data). It will probably be necessary to understand the molecular
basis for the loss of flight in each lineage to resolve this issue definitively.
Whole-genome sequencing along with the identification of functional elements (using
approaches similar to Seki et al. 2017) is likely to facilitate the necessary evo-devo
studies. Sackton et al. (2018) used this approach, analyzing 14 paleognath genome
assemblies and identifying 63 noncoding elements that are likely to be
transcriptional enhancers and that also exhibit an unusually high degree of sequence
divergence in ratites. They examined one of these “ratite-accelerated regions” experimentally,
finding that the chicken or tinamou sequences had enhancer activity in the developing
chick forelimb, whereas the orthologous rhea sequence did not. Similar experiments
focused on “ratite-accelerated regions” from other paleognath species should be very
informative (e.g., Cloutier et al. 2018). Examining the activity of resurrected
ancestors of these regions (i.e., sequences reflecting computational ancestral state
reconstructions) could be even more informative; these types of experiments could be
conducted by combining standard ancestral state reconstruction methods (e.g.,
Huelsenbeck and Bollback 2001) with approaches from synthetic biology (reviewed
by Hughes and Ellington 2017). These types of tools are likely to usher in a new era
for our understanding of these fascinating birds.
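
The parsimony point about three character states on a four-taxon tree can be checked directly. The following is a minimal sketch using Fitch parsimony on the three possible unrooted topologies for four taxa. The state labels ("ancestral", "ostrich_pattern", "emu_pattern") are placeholder names for the three wing-development patterns, not terms used by Faux and Field (2017). Each tree is rooted on its internal branch, which does not affect the Fitch length.

```python
# Placeholder state labels (assumed names): chicken and tinamou share one
# pattern; ostrich and emu each show a different pattern.
TIP_STATES = {
    "chicken": "ancestral",
    "tinamou": "ancestral",
    "ostrich": "ostrich_pattern",
    "emu": "emu_pattern",
}

# The three possible unrooted topologies for four taxa.
TOPOLOGIES = {
    "chicken+tinamou | ostrich+emu": (("chicken", "tinamou"), ("ostrich", "emu")),
    "chicken+ostrich | tinamou+emu": (("chicken", "ostrich"), ("tinamou", "emu")),
    "chicken+emu | tinamou+ostrich": (("chicken", "emu"), ("tinamou", "ostrich")),
}


def fitch(tree, tip_states):
    """Return (candidate state set, number of changes) for a binary tree."""
    if isinstance(tree, str):  # a tip
        return {tip_states[tree]}, 0
    (left_set, left_changes), (right_set, right_changes) = (
        fitch(child, tip_states) for child in tree
    )
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no change needed
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_set | right_set, left_changes + right_changes + 1


lengths = {name: fitch(tree, TIP_STATES)[1] for name, tree in TOPOLOGIES.items()}
for name, length in lengths.items():
    print(f"{name}: {length} changes")
```

Every topology requires two changes, so this character cannot discriminate among the trees under parsimony.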
Studies focused on the phylogenetic position of extinct paleognaths (moas and
elephant birds) represent another exciting research area in that they have now moved
into the phylogenomic era (Baker et al. 2014; Grealy et al. 2017; Yonezawa et al.
2017). Surprisingly, these investigations have further corroborated earlier studies
that placed the Neotropical tinamous as sister to the extinct New Zealand moas
(Haddrath and Baker 2012; Phillips et al. 2010; Smith et al. 2013), on the one hand,
and the New Zealand kiwis sister to the extinct Malagasy elephant birds, on the other
(Mitchell et al. 2014). Estimates of divergence times for key taxa in these clades
(50–60 Mya for the kiwi and elephant bird; Mitchell et al. 2014; Yonezawa et al.
2017) have also been interpreted as precluding plausible avenues of overland
dispersal. The very young age for crown paleognaths inferred by Prum et al.
(2015) implies that those divergence times could be even more recent. Berv and
Field (2018) suggested the conflicts in divergence times reflect accelerated molecular
evolution early in the Paleogene. They attributed this to the combination of two
observations: birds with smaller body size tend to exhibit higher rates of
molecular evolution, and lineages that survive mass extinctions tend to exhibit a
marked decrease in body size (the “Lilliput effect”; Urbanek 1993). However,
estimates of paleognath divergence time are probably too
recent for overland dispersal even if the Berv and Field (2018) hypothesis of an early
Paleogene rate acceleration is incorrect; if their hypothesized rate acceleration
is correct, it would provide additional evidence against overland dispersals. Those
late divergences among paleognaths have been viewed as additional evidence
corroborating the multiple loss of flight hypothesis (reviewed by Allentoft and
Rawlence 2012; for a dissenting argument, see Worthy and Scofield 2012, p. 88).
Regardless of the details, it seems clear that studies focused on both extant and
extinct paleognaths will continue to provide many interesting findings in the
genomic era.
2.2 Galloanseres (Landfowl and Waterfowl)
In sharp contrast to Palaeognathae and Neoaves (see below), in which there is
substantial uncertainty regarding many relationships, the picture for Galloanseres
is one of greater certainty. Monophyly of Galloanseres was initially controversial
(e.g., Ericson 1996), but that controversy was resolved prior to the phylogenomic era
(cf. Cracraft 2001). Relationships among the families were also established without
phylogenomic data, with Cox et al. (2007) resolving the last major question, the
positions of New World quail (Odontophoridae) and guineafowl (Numididae), using
only eight nuclear loci and three mitochondrial regions. Phylogenomic approaches
have been remarkably successful within galloanserine families; species-rich trees
with 100% bootstrap support at almost every node are now available, using sequence
capture data for Galliformes (summarized in Table 2) and more than 6.6 million base
pairs (Mbp) of coding sequence data for Anseriformes (Ottenburghs et al. 2016a).
There are certainly a few relationships that remain poorly supported, both in
Galliformes (Hosner et al. 2015b; Meiklejohn et al. 2016) and Anseriformes (e.g.,
Reddy et al. 2017 were unable to resolve the radiation of tribes within Anatidae with
confidence). However, the poor resolution of anatid tribes might be viewed as overly
pessimistic based on Sun et al. (2017), although that study reflected analyses of
the mitochondrial genome, which ultimately yields a single gene tree that can differ from
the species tree (see below in Sect. 3). Regardless, problematic nodes within
Galloanseres are the exception and not the rule. Likewise, some taxa have not
been included in phylogenomic trees because they have only been sampled for a
limited amount of data (or because molecular data remain unavailable). However,
analyses of sparse supermatrices that combine legacy markers (sequences obtained
by PCR) with phylogenomic data have proven to be a successful strategy even in
those cases for which specific taxa are data limited (e.g., Hosner et al. 2016; Persons
et al. 2016). Overall, it is very likely that a species-level phylogenomic tree of
Galloanseres will be completed in the near future (at least for the level of species
named in current checklists).
One area in which our knowledge of the early evolution of Galloanseres is limited
is the estimation of time-calibrated trees; this is problematic because the oldest
neornithine fossils are putatively galloanserines. The oldest of these, Austinornis
lentus from the Cretaceous Austin chalk, can probably be dismissed. The fossil used
to place Austinornis within Galloanseres (as a stem galliform) is fragmentary, and Clarke
(2004) was only able to score it for 9 of 202 characters; the skeptical position that
Austinornis is a stem galloanserine (or even some other Cretaceous bird lineage) is
more appropriate than viewing it as evidence for the existence of crown Galloanseres
ca. 85 million years ago (Mya). Another putative Cretaceous galloanserine, Vegavis
iaai, has attracted substantial attention because Clarke et al. (2005) placed it within
Anseriformes with high (99%) bootstrap support. However, more recent analyses
place Vegavis outside crown Anseriformes (Agnolín et al. 2017; Lee et al. 2014;
O'Connor and Zhou 2013; Worthy et al. 2017). Indeed, Mayr et al. (2018) went
further and questioned whether Vegavis was even galloanserine. Placing Vegavis
within crown Anseriformes has a major impact on divergence time estimates for the
avian tree as a whole (e.g., Prum et al. 2015), so resolving its position with
confidence is critical. Ancient galloanserine fossils that are reliably placed within
crown Anseriformes do exist (e.g., Presbyornithidae; De Pietri et al. 2016), but they
are younger than Vegavis. For example, the oldest fossil placed within the
anseriform crown in the maximum parsimony (MP) and Bayesian analyses of
Worthy et al. (2017) was the Eocene Presbyornis pervetus. Kurochkin et al.
(2002) did place the Cretaceous Teviornis gobiensis in Presbyornithidae, but the
presbyornithid affinities of Teviornis are questionable (Clarke and Norell 2004).
Indeed, all of the putative upper Cretaceous fossils assigned to extant avian orders,
including a number of galloanserines (see Hope 2002), are quite fragmentary.
Fountaine et al. (2005) suggested the fragmentary nature of the Cretaceous
neornithine fossil record reflects a biological signal given the collecting efforts; if
so, it strongly suggests that the ancient calibrations that have been used for
Galloanseres in many clock studies are inappropriate. Even within the Paleogene,
a number of galloanserine fossils appear to have been placed incorrectly (Ksepka
2009; Wang et al. 2016). Finding better ways to incorporate fossil evidence is likely
to be more important for establishing divergence times for Galloanseres (and birds as
a whole) than the availability of genome-scale sequence data.
2.3 Neoaves (All Remaining Extant Birds)
Recent phylogenomic studies support the division of Neoaves into ten major
lineages, seven of which contain multiple orders. Reddy et al. (2017) called the
superordinal clades the “magnificent seven” and referred to the remaining three
lineages as the “orphan orders.” Interrelationships among these major lineages
remain poorly resolved [Fig. 1; also see Thomas (2015) for another comparison of
the Jarvis et al. (2014) and Prum et al. (2015) trees]. Moreover, differences across
recent studies are sensitive to the data type used for analyses (i.e., the estimates of
phylogeny differ depending on whether exons, introns, noncoding ultraconserved
elements, conserved non-exonic elements, or TE insertions were used for analyses;
Jarvis et al. 2014; Reddy et al. 2017). The observed conflict among analyses is so
striking that Suh (2016) suggested that the relationships among these taxa might
reflect a hard polytomy. We are skeptical of this extreme hypothesis; there has been
remarkable progress toward a satisfying resolution of Neoaves in the early part of the
phylogenomic era (i.e., the convincing evidence for the magnificent seven). We
believe this progress represents a good reason to expect that continued data
collection and method development will ultimately resolve relationships among these
major groups. However, it is clear that branching did occur over a very short interval
of time early in the Paleocene, and this has made Neoaves one of the most difficult
problems in modern phylogenetics. Below we summarize the ongoing progress
made toward resolving the neoavian tree and highlight some of the remaining
open questions (although we emphasize that our discussion is not exhaustive).
The Base of the Neoavian Tree Identifying the basal split within the Neoaves has
long been a vexing problem, but one that is seemingly approaching resolution. Initial
large-scale analyses (Fain and Houde 2004; Ericson et al. 2006; Hackett et al. 2008)
identified a large but relatively poorly supported cluster called Metaves (we indicate
the metavian taxa in Fig. 2 with an asterisk). Metaves comprised Caprimulgiformes
(as currently defined to include nightjars, swifts, hummingbirds, and allies) along
with doves, mesites, sandgrouse, flamingos, grebes, the Kagu and Sunbittern
(Eurypygiformes), tropicbirds, and (in some analyses) the Hoatzin. Those studies
placed Metaves sister to all other neoavian taxa, the latter being designated
Coronaves by Fain and Houde (2004). Kimball et al. (2013) showed that the signal
supporting Metaves was almost exclusively associated with a single locus
(FGB/β-fibrinogen). Phylogenomics is based on the idea that analyses of many
loci can overcome the existence of misleading signal in any individual locus. The
preponderance of the phylogenomic data of Jarvis et al. (2014) clarified the
phylogeny of the metavian taxa, identifying a cluster that includes flamingos and grebes along with
doves, sandgrouse, and mesites (collectively named Columbea by Jarvis et al. 2014)
and placing that clade sister to all other Neoaves (termed Passerea; Jarvis et al.
2014). In contrast, Prum et al. (2015) placed Caprimulgiformes sister to all other
Neoaves (Fig. 2). Both Jarvis et al. (2014) and Prum et al. (2015) dispersed the other
metavian taxa across several other basal neoavian lineages.

[Fig. 2 image not reproduced. Panel A: Jarvis et al. TENT. Panel B: Prum et al. anchored
hybrid enrichment tree.]
Fig. 2 The Jarvis et al. (2014) TENT and the Prum et al. (2015) anchored hybrid enrichment
(sequence capture) tree exhibit many differences at the base of Neoaves. Both of these trees are
presented as rooted trees for Neoaves and taxa placed in “Metaves” (see text) are indicated with
asterisks. (a) Jarvis et al. (2014) TENT with low-support branches (branches with <100% bootstrap
support) indicated using thin lines. Very low-support branches (branches with <70% bootstrap
support) are indicated with a hash (#) below the relevant branch. The strongly supported basal
division of Neoaves into Columbea and Passerea is indicated. (b) Prum et al. (2015) tree with
low-support branches (<70% bootstrap support) and very low-support branches (<50%
bootstrap support) indicated in the same manner. Taxa placed in Columbea and Passerea are also
indicated to the side of the Prum et al. (2015) tree to emphasize their non-monophyly in that analysis; an
“expanded waterbird clade” (named Aequorlitornithes by Prum et al. 2015) is indicated using a gray
box. We used different bootstrap support cutoffs for the two trees because they are based on data
matrices of different sizes (more than 13,000 loci for the Jarvis et al. 2014 TENT but only 259 loci
for the Prum et al. 2015 tree)

Reddy et al. (2017) analyzed the topological conflicts between Jarvis et al. (2014)
and Prum et al. (2015) and concluded that the observed conflicts reflect “data-type
effects,” which they defined as the observation that there are different signals
associated with analyses of subsets of the genome that can be defined a priori
using non-phylogenetic criteria. The data used to generate the Prum et al. (2015)
tree were 82.5% exonic, whereas the Jarvis et al. (2014) TENT reflects an analysis of 41.8 Mbp
that included a mixture of introns, coding exons (first and second codon positions
only), and noncoding UCEs. Reddy et al. (2017) argued that exons are more likely to
be misleading than noncoding regions for two reasons: (1) coding exons exhibited
greater GC-content variation than noncoding regions and (2) the structure of the
genetic code combined with selection to maintain the amino acid sequence
represents a violation of most models used for phylogenetic analyses. However,
the fundamental observations underlying Reddy et al. (2017) were empirical:
(1) analyses of a largely noncoding 54-locus data matrix for 235 species supported
the same basal split in Neoaves as the Jarvis et al. (2014) TENT; and (2) trees based
on large-scale coding datasets exhibited more topological similarities to each other
than to trees based on noncoding data. Reddy et al. (2017) also found that trees based
on rare genomic changes, like TE insertions, were more congruent with the trees
based on noncoding data than with the trees based on coding data (see Fig. 6 in
Reddy et al. 2017). Taken as a whole, those results indicate that the observed
differences between Jarvis et al. (2014) and Prum et al. (2015) are more likely to
reflect data type than taxon sampling. We address below some of the remaining
conflicts revealed by these three studies as they represent some of the most important
puzzles in avian phylogenetics.
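
Statements about trees being "more topologically similar" to one another are usually made quantitative with a tree distance. The sketch below computes the Robinson-Foulds distance (the number of bipartitions found in one unrooted tree but not the other); whether Reddy et al. (2017) used this particular metric is an assumption here. The three toy trees and the taxon names A to E are placeholders, not results from any of the cited studies.

```python
def collect_clades(tree, out):
    """Append the leaf set of every internal node to `out`; return this node's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Return the set of non-trivial bipartitions of the tree, treated as unrooted."""
    clades = []
    all_leaves = collect_clades(tree, clades)
    result = set()
    for clade in clades:
        rest = all_leaves - clade
        if len(clade) >= 2 and len(rest) >= 2:  # skip trivial splits
            result.add(frozenset([clade, rest]))
    return result


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: size of the symmetric difference of the split sets."""
    return len(splits(tree_a) ^ splits(tree_b))


# Toy trees with placeholder taxa (hypothetical, for illustration only).
coding_tree_1 = (("A", "B"), ("C", ("D", "E")))
coding_tree_2 = (("A", "C"), ("B", ("D", "E")))
noncoding_tree = (("A", "D"), ("B", ("C", "E")))

print(rf_distance(coding_tree_1, coding_tree_2))   # the two "coding" toy trees
print(rf_distance(coding_tree_1, noncoding_tree))  # "coding" versus "noncoding"
```

In this toy case the two "coding" trees differ by one internal branch each (distance 2), while the "noncoding" tree shares no internal branch with the first coding tree (distance 4), which mirrors the kind of pattern described above.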
Phoenicopterimorphae (Clade 7; Also Called Mirandornithes)
and Columbimorphae (Clade 6) There is little question that a close relationship
between flamingos and grebes (clade 7 in Fig. 2) was astonishing to most
ornithologists when it was first proposed (Van Tuinen et al. 2001). However, that
node was relatively easy to resolve (even with mitochondrial DNA; Van Tuinen
et al. 2001; also see Cracraft et al. 2004, p. 476) and was present in the results of
analyses conducted by many different investigators in multiple labs (e.g., Chubb
2004; Fain and Houde 2004; Mayr 2004a; Ericson et al. 2006; Hackett et al. 2008).
A result that few believed at the time is now solidly accepted. The closest
relative of the flamingo-grebe clade is another matter. Hackett et al. (2008) placed
them sister to a subset of metavian taxa, in a clade comprising doves, mesites, and
sandgrouse (clade 6 in Fig. 2) and tropicbirds. Jarvis et al. (2014) placed flamingos
and grebes sister to doves, mesites and sandgrouse, separating the tropicbirds from
both lineages. Importantly, the clade comprising doves, mesites, sandgrouse,
flamingos, and grebes (Columbea) was strongly (100% bootstrap) supported and
sister to all other Neoaves (Passerea). Reddy et al. (2017) also recovered the deep
division between Columbea and Passerea, albeit with lower bootstrap support
("95% for Columbea but only 50–90% for Passerea). In sharp contrast, Prum
et al. (2015) did not recover Columbea or Passerea. Instead, Prum et al. (2015)
placed flamingos and grebes sister to shorebirds in a generalized waterbird clade
(emphasized using a gray box in Fig. 2). Provocatively, many analyses of coding
exons in Jarvis et al. (2014) also supported a generalized waterbird clade, albeit with
rearrangements relative to Prum et al. (2015). Those results are consistent with the
data-type effects hypothesis advanced by Reddy et al. (2017). Prum et al. (2015) also
supported monophyly of doves, mesites, and sandgrouse but placed that clade sister
to cuckoos, bustards, and turacos; they named this large clade, which comprises
clades 4 and 6 from Fig. 2, Columbaves. We consider the latter relationship to be less
likely given the nature of character support discussed by Reddy et al. (2017), but it
seems clear that the Columbea and Columbaves hypotheses both deserve additional
study.
Caprimulgiformes (Clade 5; Also Called Strisores) Many current taxonomies
treat Caprimulgiformes (the clade comprising nightjars, nighthawks, oilbirds,
potoos, frogmouths, owlet-nightjars, swifts, and hummingbirds) as a single order
(Table 3), but those diverse taxa were split into at least two orders in older
classifications (e.g., hummingbirds and swifts in Apodiformes; Table 3). For this
reason, Reddy et al. (2017) viewed Caprimulgiformes as one of their “magnificent
seven” superordinal clades. As described above, many analyses of coding exons
(including Prum et al. 2015) place Caprimulgiformes sister to all other Neoaves.
There is also uncertainty regarding the relationships among the families within
Caprimulgiformes. This could reflect a data-type effect, since Reddy et al. (2017)
place the potoo-oilbird clade sister to all other caprimulgiforms (Fig. 3a), whereas
Prum et al. (2015) place nightjars and nighthawks (Caprimulgidae) sister to all
other caprimulgiforms (Fig. 3b). Both studies support a clade comprising the
five remaining families (frogmouths, owlet-nightjars, swifts, treeswifts, and
hummingbirds). The uncertainty actually reflects alternative placements of the root
since both studies supported the same unrooted ingroup topology (Fig. 3c).
Obviously, this group is an excellent target for additional phylogenomic analyses. In fact,
phylogenomic analyses of caprimulgiforms could be especially informative given
their extensive Paleogene fossil record (Mayr 2004b, 2009). Those fossils have long
suggested that nightjars and their relatives arose early in the neoavian radiation so
they can provide an excellent source of calibrations for molecular clock studies.

[Fig. 3 image not reproduced. Panel A: Reddy et al. (2017), non-coding data. Panel B: Prum
et al. (2015), coding data. Panel C: unrooted topology of Caprimulgidae, Steatornithidae,
Nyctibiidae, Podargidae, Aegothelidae, Trochilidae, Apodidae, and Hemiprocnidae, with the
Reddy root and the Prum root marked.]
Fig. 3 The position of the root of Caprimulgiformes (clade 5) is uncertain. (a) Most analyses
reported by Reddy et al. (2017) place the root between the oilbird + potoos clade and all other
Caprimulgiformes, although some analyses (see supporting material for Reddy et al. 2017) place the
root between potoos and all other Caprimulgiformes. (b) Prum et al. (2015) place the root between
the nightjar family and all other Caprimulgiformes. (c) The unrooted ingroup topology for
Caprimulgiformes is identical in the Prum et al. (2015) and Reddy et al. (2017) analyses, but the
position of the root differs. Support is indicated as above, with thin branches or thin branches and a
hash (#). We do not consider Jarvis et al. (2014) in this figure because that study only included three
Caprimulgiformes (a nightjar, a swift, and a hummingbird); relationships among those taxa are
strongly supported in many analyses (including analyses of individual genes; cf. Fig. 1b in Hackett
et al. 2008)
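
The claim that two rooted trees can disagree while sharing one unrooted topology can be verified mechanically. The sketch below encodes the two rootings described for Caprimulgiformes and compares their rooted clades with their unrooted bipartitions. The branching order assumed within the five-family clade (frogmouths, then owlet-nightjars, then hummingbirds, then swifts plus treeswifts) is an illustrative assumption; the text states only that both studies recover that clade and the same unrooted ingroup topology.

```python
def leaf_set(tree):
    """Return the set of tip names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def rooted_clades(tree):
    """Leaf sets of all internal nodes, including the root."""
    if isinstance(tree, str):
        return set()
    found = {leaf_set(tree)}
    for child in tree:
        found |= rooted_clades(child)
    return found


def unrooted_splits(tree):
    """Non-trivial bipartitions, which do not depend on where the root is placed."""
    everything = leaf_set(tree)
    result = set()
    for clade in rooted_clades(tree):
        rest = everything - clade
        if len(clade) >= 2 and len(rest) >= 2:
            result.add(frozenset([clade, rest]))
    return result


# Assumed internal arrangement of the five-family clade (illustrative only).
CORE = ("Frogmouths",
        ("Owlet-nightjars", ("Hummingbirds", ("Swifts", "Treeswifts"))))

# Root between oilbird + potoos and everything else (as in Fig. 3a).
REDDY_ROOTING = (("Oilbird", "Potoos"), ("Nightjars", CORE))
# Root between nightjars and everything else (as in Fig. 3b).
PRUM_ROOTING = ("Nightjars", (("Oilbird", "Potoos"), CORE))

same_rooted = rooted_clades(REDDY_ROOTING) == rooted_clades(PRUM_ROOTING)
same_unrooted = unrooted_splits(REDDY_ROOTING) == unrooted_splits(PRUM_ROOTING)
print(same_rooted, same_unrooted)
```

Under these assumptions the rooted clade sets differ but the unrooted bipartitions are identical, so the disagreement is confined to the placement of the root.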
Otidimorphae (Clade 4) A clade comprising cuckoos, bustards, and turacos has
emerged in only the most recent phylogenomic studies (Jarvis et al. 2014; Prum
et al. 2015), although we do not view this group as decisively established (Fig. 1).
Clade 4 has strong (100%) bootstrap support in the Jarvis et al. (2014) TENT, which
placed cuckoos sister to a turaco-bustard clade. However, the position of turacos
sister to bustards is the most poorly supported clade in the Jarvis et al. (2014) TENT
(only 55% bootstrap support). Prum et al. (2015) did not place turacos sister to
bustards; instead, the Prum et al. (2015) tree places turacos sister to a cuckoo-bustard
clade. Clade 4 does have strong support in the Bayesian analysis reported by Prum
et al. (2015), but it has very low support (only 41%) in their ML tree. The
cuckoo-bustard clade was strongly supported (98% bootstrap) in the Prum et al. (2015) ML
analysis. Reddy et al. (2017) did not recover clade 4. Instead, Reddy et al. (2017)
recovered a cuckoo-bustard clade (albeit with limited support in many analyses)
and placed turacos elsewhere, sister to cranes, rails, and allies (Gruiformes).
Provocatively, Reddy et al. (2017) also found the turaco-gruiform clade in their
reanalyses of the Prum et al. (2015) dataset after excluding data that were also present
in the Jarvis et al. (2014) TENT dataset (this was done to produce a coding exon
dataset that was truly independent of Jarvis et al. 2014). Clearly, the relationships among
these taxa, their relationships to other clades, and whether they even form a clade
remain open questions that are ripe for additional phylogenomic exploration.
Shorebirds, Cranes, Rails, and the Hoatzin (the “Orphan Orders” in Reddy
et al. 2017) The three orders containing shorebirds (Charadriiformes); cranes, rails,
and allies (Gruiformes); and the Hoatzin (Opisthocomus hoazin, the only extant
species in the order Opisthocomiformes) form a clade on the Jarvis et al. (2014)
TENT, although this clade did not receive 100% bootstrap support in the TENT. In
contrast to the TENT, the larger Jarvis et al. (2014) “whole-genome tree” placed
shorebirds sister to core landbirds (core landbirds are clade 1 in Fig. 2) and placed
caprimulgiforms as sister to gruiforms, with the Hoatzin as their sister. The
whole-genome tree reflected an analysis of 322 Mbp, more than seven times larger than the
TENT dataset. However, Jarvis et al. (2014) expressed concern that the sequence
alignment and the assessment of orthology for the whole-genome tree dataset were
inferior to the TENT dataset. Prum et al. (2015) placed the Hoatzin sister to the core
landbirds (clade 1), naming that more inclusive clade Inopinaves. Prum et al. (2015)
also separated shorebirds and gruiforms, placing the former in a clade with flamingos
and grebes and the latter sister to a larger neoavian clade (Fig. 2). Reddy et al. (2017)
also separated all three of these orders, albeit with low support. Additional
phylogenomic analyses are clearly necessary to resolve these relationships.
Other than relationships among palaeognathous birds (see above), perhaps the
most long-standing problem in avian higher-level relationships has been that of the
170 E. L. Braun et al.
Hoatzin. The Hoatzin exhibits peculiar specializations, including a folivorous diet
and foregut fermentation as well as the hypertrophy and use of forelimb claws by
nestlings. These features fueled fanciful speculations that the Hoatzin might be
primitive among extant birds (Feduccia 1996; Olson 1985). Historically, many
authorities placed Hoatzin close to fowl (Galliformes) or within a questionably
monophyletic circumscription of Cuculiformes (in that case dened as comprising
both cuckoos and turacos). Studies that supported the latter position placed the
Hoatzin sister to either cuckoos or turacos (see Sibley and Ahlquist 1990 for review;
also see Hughes and Baker 1999; Sorenson et al. 2003). This uncertainty continues
today; as stated above, no phylogenomic study upholds any of the previous
hypotheses. Moreover, none of the comprehensive phylogenomic studies (Jarvis
et al. 2014; Prum et al. 2015; Reddy et al. 2017) agree with one another regarding the
position of the Hoatzin. The McCormack et al. (2013) UCE study, which did not
include all major avian clades, placed Hoatzin sister to shorebirds, similar to the
Jarvis et al. (2014) TENT (Fig. 2a). However, it placed gruiforms (represented in
that study by the trumpeter, family Psophiidae) in another position. The difficulty
of resolving the position of the Hoatzin likely reflects the rapid radiation of
Neoaves itself and the fact that the Hoatzin lineage diverged very close in time to
all lineages that are putative close relatives. Coupled with that is the monotypy of
Opisthocomiformes, which eliminates the possibility of breaking up its long-branch
stem. The fossil record unfortunately sheds little light on the issue except to
document that early hoatzins of modern appearance (as far as it is known from
limb bones) were distributed in South America, Europe, and Africa during the Oligo-
Miocene (Mayr 2014; Mayr et al. 2011; Mayr and De Pietri 2014).
Phaethontimorphae (Clade 3), Aequornithia (Core Waterbirds; Clade 2),
and Telluraves (Core Landbirds; Clade 1) These clades comprise many pheno-
typically distinct lineages. Clade 3 comprises tropicbirds and Eurypygiformes (the
Sunbittern and Kagu), lineages placed in Metaves by early phylogenomic studies
(Fig. 2). Later studies resolved them as a distinct lineage that is either related to core
waterbirds (clade 2; Fig. 4a) or core landbirds (clade 1; Fig. 4b). This could also be a
data-type effect; analyses that include exonic data (including the Jarvis et al. 2014
TENT) place the tropicbird-eurypygiform clade as sister to waterbirds, whereas
analyses of exclusively or almost exclusively noncoding data matrices [the analyses
of introns and noncoding UCEs in Jarvis et al. (2014) and Reddy et al. (2017)] place
the tropicbird-eurypygiform clade sister to landbirds. However, if this is a data-type
effect, it is unusual. Reddy et al. (2017) proposed the terminology “data-type effects”
in order to discuss the conflicts between the Jarvis et al. (2014) TENT and the Prum
et al. (2015) tree with respect to the deepest divergence in Neoaves. In this case, the
Jarvis et al. (2014) TENT (which reflects a data matrix that is 68% noncoding)
supports monophyly of Columbea (clades 6 and 7) and places Columbea sister to
Passerea (all other Neoaves). The Jarvis et al. (2014) TENT and intron trees as well
as the Reddy et al. (2017) tree support reciprocal monophyly of Columbea and
Passerea; the Jarvis et al. (2014) UCE tree supports monophyly of Passerea. Thus, the
TENT resembles trees based on noncoding data. In contrast, the position of the
tropicbird-eurypygiform clade in the TENT corresponds to its position in analyses of
coding data. Another potential explanation for the difficulty placing the tropicbird-
eurypygiform clade is that they are long branches. This does not reflect length due to
an accelerated rate of molecular evolution; instead it reflects the fact that neither
lineage has close relatives. There are only three tropicbird species, all of which are
very closely related and are placed in a single genus (Phaethon). There is another
extant eurypygiform (the New Caledonian Kagu; Fig. 4), which is placed in a
different family. Reddy et al. (2017) did break up the long branch to the Sunbittern
as much as possible by including Kagu, and it remains possible that adding genome-
scale Kagu data will be helpful; however, adding Kagu represents the limit for the
addition of taxa to clade 3 in a meaningful way. Thus, the practice of adding taxa,
which many systematists believe to be one of the best ways to improve estimates
of phylogeny (Heath et al. 2008), is unlikely to be a way to improve placement of
clade 3. Overall, because tropicbirds and eurypygiforms are both long branches that
can only be subdivided near the tips, improved phylogenomic analyses may repre-
sent the only hope for a convincing resolution of the position of the tropicbird-
eurypygiform clade.
The relationships among core landbirds, core waterbirds, and the tropicbird-
eurypygiform clade are somewhat more complex than we indicate in Fig. 4. The
TENT and intron tree in Jarvis et al. (2014) both support a clade comprising core
landbirds, core waterbirds, and tropicbirds + eurypygiforms, but most other analyses
place other lineages within that clade. Analyses using exonic data typically nest the
core waterbird + (tropicbird + eurypygiforms) clade within a larger clade comprising
many aquatic and semiaquatic lineages (e.g., shorebirds, flamingos, and grebes);
analyses using noncoding data separate those lineages. We believe this “greater
waterbird clade” (named Aequorlitornithes by Prum et al. 2015) is unlikely to be a
true clade because analyses of rare genomic changes, which are likely to have
different strengths and weaknesses relative to analyses of either coding or
noncoding sequences (see below, in Sect. 3), yield a tree topology closer to the
noncoding trees (see Reddy et al. 2017).
[Figure 4 image omitted. Panel labels: (a) Jarvis et al. (2014) TENT and WGT, Prum et al.
(2015) coding; (b) Jarvis et al. (2014) introns and UCEs, Reddy et al. (2017) non-coding,
with the Kagu sampled only in Reddy et al. (2017); (c) Reddy et al. (2017) waterbird
topology, with branches α and β marked.]
Fig. 4 Relationships among core landbirds (clade 1), core waterbirds (clade 2), and the
tropicbird + eurypygiform clade (clade 3). (a) Phylogeny based on TENT, WGT, and coding
data. (b) Phylogeny based on introns, UCEs, and non-coding data. The position of the
tropicbird + eurypygiform clade sister to either core waterbirds or core landbirds depends on the
data type analyzed (analyses of coding data and mixtures of coding and noncoding data support a
waterbird sister hypothesis, whereas analyses of noncoding data alone support a landbird sister
hypothesis). (c) The inset shows the topology within core waterbirds. Most large-scale relationships
within core waterbirds are strongly supported and robust both to data type and to analytical
approach, but there are two exceptions (indicated using Greek letters, see text for details)
The core waterbirds [clade 2, called Aequornithes by Mayr (2011) and
Aequornithia by Cracraft (2013)] are perhaps one of the most exceptional groups
in Neoaves: almost all major groups within this clade have the same topology in all
recent phylogenomic analyses. There are, however, two nodes of interest (branches
α and β in Fig. 4). Branch α, which unites pelicans with the Hamerkop, is especially
surprising; the RAxML (Stamatakis 2014) analyses in both Prum et al. (2015) and
Reddy et al. (2017) support the resolution shown in Fig. 4, whereas the Bayesian
analyses conducted in both of those studies support a different resolution (Hamerkop
sister to pelicans + shoebill). The Bayesian analyses in Prum et al. (2015) reflect the
use of ExaBayes (Aberer et al. 2014) whereas Reddy et al. (2017) used both
ExaBayes and MrBayes (Ronquist et al. 2012); thus, these results are not associated
with specific software. It is unclear whether this topological difference reflects the
details of the specific programs used for analyses or more fundamental differences
between ML analyses (where the parameters used for analyses are assigned the
values that result in the optimal likelihood score) and Bayesian analyses (where the
method integrates over the uncertainty in those parameters, assuming some prior
distribution for the parameter values). Regardless, the observed differences between
ML and Bayesian analyses as implemented in commonly used phylogenetic
programs deserve further scrutiny. The other branch (β in Fig. 4) is somewhat
more variable across analyses. Both of these remaining questions regarding core
waterbird phylogeny deserve attention in coming analyses of phylogenomic data.
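The contrast drawn above between ML estimation (parameters fixed at their optimal
values) and Bayesian estimation (parameters integrated over a prior) can be illustrated
with a deliberately small example. The sketch below is not drawn from any of the studies
discussed here. It assumes a single pairwise distance under the Jukes-Cantor (JC69) model,
hypothetical data (10 differences among 50 sites), an exponential prior with mean 0.1, and
a simple grid approximation in place of MCMC; all function names are illustrative.

```python
import math

def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of a pairwise distance d (substitutions per site) under JC69."""
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))  # probability that a site differs
    return n_diff * math.log(p_diff) + (n_sites - n_diff) * math.log(1.0 - p_diff)

def ml_distance(n_sites, n_diff):
    """ML: the single value of d that maximizes the likelihood (closed form for JC69)."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)

def posterior_mean_distance(n_sites, n_diff, prior_mean=0.1, grid_max=2.0, steps=20000):
    """Bayesian: average d over its posterior (likelihood x exponential prior),
    approximated on a midpoint grid as an illustrative shortcut."""
    width = grid_max / steps
    grid = [(i + 0.5) * width for i in range(steps)]
    log_post = [jc69_log_likelihood(d, n_sites, n_diff) - d / prior_mean for d in grid]
    shift = max(log_post)  # subtract the maximum to avoid numerical underflow
    weights = [math.exp(lp - shift) for lp in log_post]
    return sum(d * w for d, w in zip(grid, weights)) / sum(weights)

# Hypothetical data: 10 differences among 50 aligned sites.
print("ML estimate:", round(ml_distance(50, 10), 4))
print("Posterior mean:", round(posterior_mean_distance(50, 10), 4))
```

The two estimates differ because the posterior mean depends on the prior and on the
whole likelihood surface rather than on its peak alone. Whether an effect of this kind
contributes to the conflict at branch α is not established by this toy example.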
Monophyly of core landbirds (clade 1, also called Telluraves; Yuri et al. 2013)
is strongly supported in almost all multigene analyses with sufficient taxon sampling
that have been published since Hackett et al. (2008). Like the core waterbirds,
core landbirds comprise some of the most widespread and familiar avian
lineages, including raptors (hawks, eagles, falcons, and owls), songbirds and allies
(Passeriformes, typically called passerines), parrots, and the woodpeckers and their
allies (such as rollers, kingfishers, bee-eaters, hornbills, hoopoes and woodhoopoes,
trogons, and other lineages). The Jarvis et al. (2014) TENT, like a number of prior
analyses (e.g., Ericson et al. 2006; Hackett et al. 2008; Kimball et al. 2013), splits
core landbirds into two clades that Ericson (2012) named Australaves and Afroaves
(Table 3). That split renders the raptorial lineages para- or polyphyletic. Specifically,
the falcons are placed in Australaves sister to parrots and passerines whereas the
hawks, eagles, New World vultures, and owls are placed in Afroaves as the successive
sister groups of a clade comprising mousebirds and a diverse assemblage that
includes woodpeckers and their allies (Fig. 2a). However, the Jarvis et al. (2014)
TENT conflicts with the Prum et al. (2015) topology with respect to the position of
the clade comprising hawks, eagles, and New World vultures (Accipitriformes); the
TENT (and Reddy et al. 2017) places accipitriforms sister to all other Afroaves
whereas Prum et al. (2015) places them sister to all other core landbirds (Fig. 2b).
The Jarvis et al. (2014) analysis of first and second codon positions supported a third
topology, with a clade comprising accipitriforms and owls sister to all other core
landbirds. Although there is some conflict between the Jarvis et al. (2014) exon
analysis and the Prum et al. (2015) tree, this suggests the position of accipitriforms
could reflect another data-type effect.
Although differences between the results of analyses using coding vs. noncoding
data have emerged as a major source of conflict in the avian tree of life, differences
due to data-type effects are not sufficient to explain all of the conflicts at the base of
core landbirds; analytical methods also play an important role (Fig. 5). Analyses of
noncoding data using multispecies coalescent (MSC) methods (species tree
methods; see below in Sect. 3) yield trees that are more congruent with standard
concatenated analyses of exon data, either placing an accipitriform-owl clade or
accipitriforms alone sister to all other landbirds (Fig. 5). Thus, the position of
accipitriforms and owls represents an interesting case of uncertainty related to two
different factors: (1) data type and (2) whether the analytical method assumes a
single underlying tree or a mixture of trees (for additional details regarding analytical
methods, see below in the next section).
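To make the idea of a “mixture of trees” concrete, the standard three-taxon result for
the multispecies coalescent can be written in a few lines. The sketch below assumes the
simplest case (three species, one sampled lineage per species, constant population size,
no gene flow), with the internal branch measured in coalescent units; the function name is
illustrative and is not taken from any software package.

```python
import math

def gene_tree_probs(internal_branch_coalescent_units):
    """Three-taxon MSC: probability that a gene tree matches the species tree,
    and the probability of each of the two discordant topologies."""
    t = internal_branch_coalescent_units
    p_discordant_each = math.exp(-t) / 3.0
    p_concordant = 1.0 - 2.0 * p_discordant_each
    return p_concordant, p_discordant_each

# Shorter internal branches produce more gene tree discordance.
for t in (0.01, 0.1, 1.0, 5.0):
    concordant, discordant = gene_tree_probs(t)
    print(f"T = {t:5.2f}  concordant = {concordant:.3f}  each discordant = {discordant:.3f}")
```

As the internal branch approaches zero length, the three topologies become nearly
equally frequent, which is why a method that assumes a single underlying tree and a
method that models a mixture of trees can respond differently to the same data.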
Mousebirds represent an equally striking source of conflict (Fig. 5). Many
analyses, regardless of data type or analytical approach, place mousebirds sister to
the diverse clade comprising woodpeckers and their allies (named Cavitaves by Yuri
et al. 2013; see Fig. 5). The analysis of UCEs in Jarvis et al. (2014) and the analysis
of TE insertions by Suh et al. (2015) are the major exceptions; analyses of both of those
data types support monophyly of Afroaves, but they place mousebirds sister to the
other taxa in that clade. However, earlier multigene analyses recognized mousebirds
as a rogue taxon, shifting to various positions within core landbirds depending on the
analytical approach, taxon sample, and data (Suh et al. 2011; Wang et al. 2012;
McCormack et al. 2013). Those analyses include positions sister to or even within
Australaves or sister to all other landbirds. Like tropicbirds and eurypygiforms,
mousebirds represent a long branch that can only be subdivided close to the tip.
This led Suh (2016) to favor the Fig. 5d topology by arguing that the UCE and TE
analyses are more resistant to the long-branch attraction artifact (see below in Sect.
3). However, Gilbert et al. (2018) found that the position of mousebirds was unstable
when they applied data filtering approaches designed to reduce noise to the Jarvis
et al. (2014) UCE data. The observation that analyses of UCE data are sensitive to
filtering does not refute the Fig. 5d topology. However, it does question one of the
examples of congruence advanced by Suh (2016) for favoring that topology (i.e., the
congruence of the UCE and TE analyses). These conflicts regarding the position of
mousebirds are especially frustrating given the excellent fossil record of this lineage
(Ksepka and Clarke 2009; Ksepka et al. 2017). However, when the available
analyses are taken as a whole, it seems the positions of accipitriforms, owls, and
mousebirds within core landbirds should all be approached with caution. Regardless
of the details, core landbirds appear to represent yet another major avian clade where
additional phylogenomic analyses will provide surprises and insights.
[Figure 5 image omitted: four candidate topologies (a-d) for New World vultures, eagles and
hawks, owls, mousebirds, Cavitaves, and Australaves.]
Fig. 5 Relationships within core landbirds depend on data type and analytical approach. We
present four candidate topologies that have been recovered when different data types and analytical
approaches are used. Multispecies coalescent (species tree) analyses are indicated using “MSC”;
all other analyses used concatenated data. (a) Division into two major clades (Australaves and
Afroaves; see Table 3), present in the Jarvis et al. (2014) TENT and the Reddy et al. (2017) analyses
of concatenated noncoding data. (b) Accipitriformes sister to all other core landbirds, found in
the binned MP-EST (Mirarab et al. 2014a) analysis of intronic data reported by Jarvis et al. (2014)
and the Prum et al. (2015) concatenated tree. The Kimball et al. (2013) NJst analysis had a
similar topology (the asterisk indicates a minor rearrangement placing mousebirds sister to owls).
This topology was also supported by unbinned MP-EST analyses of three different noncoding
datasets in Edwards et al. (2017), although the taxon sampling in that study was limited. (c) An
accipitriform + owl clade sister to all other core landbirds, found in binned MP-EST analysis of the
TENT data and unbinned MP-EST analysis of introns by Jarvis et al. (2014). Reddy et al. (2017)
also found this topology in their concatenated analyses of a 104-locus coding data matrix (their
“Prum noJar” tree). (d) Division into two major clades but mousebirds sister to all other Afroaves,
found in the Jarvis et al. (2014) concatenated UCE tree and the Suh et al. (2015) analysis of TE
insertions. Cavitaves is defined as Piciformes, Coraciiformes, Bucerotiformes, Trogoniformes, and
Leptosomiformes (Yuri et al. 2013)
3 Why Are the Deep Branches in Neoaves So Difficult?
The motivation for developing phylogenomic methods was the hypothesis that
analyses of large data matrices would result in a fully resolved tree of life (Gee
2003). However, as described above, simply collecting more data does not appear to
be sufficient to resolve the tree of life with confidence. Arguably, the failure of
phylogenomics to provide a convincing and simple resolution of the tree of life
should have been expected. Long before it was possible to collect phylogenomic-
scale data, mathematical studies (e.g., Felsenstein 1978; Hendy and Penny 1989) and
simulations (e.g., Hillis et al. 1994) had revealed that some analyses of large-scale
data matrices fail to converge on the correct tree. Much of the early theoretical work
focused on identifying conditions where specific phylogenetic methods converge
on an incorrect topology with confidence when additional data are added. This
suggests that one might expect analyses of genome-scale data to exhibit a high
degree of support, at least when support is assessed using standard methods (i.e., the
nonparametric bootstrap; Felsenstein 1985). In sharp contrast to this simplistic
expectation, many analyses of large data matrices actually result in low support
and conflicting results.
In this section we provide a brief review of those foundational results in theoretical
phylogenetics and connect those early results to observed conflicts in the bird tree
(see above). However, we emphasize that we do not view the base of Neoaves (or the
bird tree in general) as unique; indeed, these types of conflicts have been observed in
phylogenomic studies focused on other groups (e.g., Philippe et al. 2011; King and
Rokas 2017; Pease et al. 2018), and we expect many additional challenging nodes to
be identified across the tree of life during the phylogenomic era. Hinchliff et al.
(2015) synthesized published phylogenetic information for all organisms, including
taxa that range from microbes to mammals and birds, and they found 4610 nodes that
conflict with their taxonomy. Some of those conflicting nodes are likely to reflect
data-limited studies or cases in which the taxonomy was incorrect, but they also
highlighted a few cases that reflect conflicts among phylogenomic studies. The
Hinchliff et al. (2015) study underscores the work that still needs to be done to
resolve the tree of life, even in the phylogenomic era. However, the base of Neoaves
is among the best studied phylogenetic problems that remain unresolved. Therefore,
it seems likely that understanding the reasons for these continuing difficulties in
resolving the bird tree will have general implications for the resolution of other
challenging nodes on the tree of life.
Although many systematists have suggested that complex analytical methods
may be necessary to arrive at a satisfying resolution of the difficult nodes in the tree
of life (e.g., Philippe et al. 2011; Reddy et al. 2017; Steel 2005), the biological basis
of those difficult nodes is actually quite straightforward. The nodes most difficult to
infer are those associated with short internal branches (“int” in Fig. 6a). This reflects
the fact that, ultimately, all phylogenetic methods (including both parametric and
nonparametric approaches) rely on the existence of characters that unite taxa
(synapomorphies; cf. Hennig 1966). Given even the simplest models of evolution,
the probability that a synapomorphic substitution uniting a specific group exists is
directly linked to the length of the internal branch uniting that group (Braun and
Kimball 2001). However, the terminal branch lengths (“term” in Fig. 6a) also play an
important role. All other things being equal, longer terminal branches result in a
higher probability that (1) subsequent substitutions will obscure synapomorphic
substitutions and (2) convergent substitutions will create the appearance of
synapomorphies that unite other groups. Thus, correctly inferring the topology for
short internal branches deep in the tree will be more challenging than inferring short
branches closer to the tips. In fact, reconstructing the phylogeny for deep branches is
impossible if the rate of accumulation for substitutions exceeds a specific value
(Mossel 2003), a particularly pessimistic finding for phylogenomicists.
[Figure 6 image omitted: tree diagrams; panel (a) labels the short internal (int) and long
terminal (term) branches.]
Fig. 6 Potential sources of nonhistorical signals relevant to phylogenomic analyses. (a) Almost all
of the most challenging nodes in the tree of life reflect short internal branches (int) that provide little
time for synapomorphic changes to accumulate. Long terminal branches (term) exacerbate the
problems associated with short internal branches.
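The dependence of synapomorphies on both internal and terminal branch lengths can be
illustrated with a back-of-the-envelope calculation. The sketch below is our own
simplification rather than the derivation in Braun and Kimball (2001): it assumes a
four-taxon tree, substitutions arriving as a Poisson process, equal terminal branch
lengths, and it counts a site as a “clean” synapomorphy only when exactly one
substitution falls on the internal branch and none fall on the terminal branches. The
function names and numerical values are hypothetical.

```python
import math

def clean_synapomorphy_prob(internal, terminal, n_terminal=4):
    """P(exactly one substitution on the internal branch and none on any terminal
    branch) for one site; branch lengths are in expected substitutions per site."""
    one_change_internal = internal * math.exp(-internal)
    no_change_terminals = math.exp(-n_terminal * terminal)
    return one_change_internal * no_change_terminals

def expected_clean_synapomorphies(n_sites, internal, terminal, n_terminal=4):
    """Expected number of clean synapomorphies among n_sites independent sites."""
    return n_sites * clean_synapomorphy_prob(internal, terminal, n_terminal)

# Hypothetical 10,000-site alignment: vary the internal, then the terminal, branch length.
for internal, terminal in ((0.01, 0.05), (0.001, 0.05), (0.01, 0.5)):
    count = expected_clean_synapomorphies(10_000, internal, terminal)
    print(f"int = {internal}, term = {terminal}: {count:.1f} expected sites")
```

Under these assumptions, shortening the internal branch and lengthening the terminal
branches both reduce the expected number of unobscured synapomorphies, which matches
the qualitative argument in the text.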
The extreme case in which phylogenetic reconstruction is impossible is unlikely
to be the case for the data types typically used for avian phylogenomics (i.e., those
used by Jarvis et al. 2014). Chojnowski et al. (2008) used simulations to show that
analyses of sequences evolving at a rate similar to avian introns could resolve a tree
with branch lengths similar to those at the base of Neoaves; even as little as 32 kb of
simulated intron data could yield trees with an average of only one rearrangement.
Introns are the most rapidly evolving data type analyzed by Jarvis et al. (2014).
However, examining the results of Jarvis et al. (2014), Prum et al. (2015), and Reddy
et al. (2017) in light of the earlier Chojnowski et al. (2008) study raises another
important question: why is there so much incongruence among those studies given
the large size of the data matrices in each study? The conflict within the Jarvis et al.
(2014) study is especially troubling. Chojnowski et al. (2008) found that analyses of
simulated exon data did not perform as well as analyses using simulated intron data.
However, that study simulated a maximum of 8000 base pairs (bp) of exonic data;
the Jarvis et al. (2014) exon datasets were three orders of magnitude larger than that.
Thus, the observed conflicts at the base of Neoaves must reflect much larger issues
than simply the overall rate of sequence evolution, the time between cladogenic
events at the base of Neoaves, or even the amount of available data.
Long-Branch Attraction, Changes in Patterns of Sequence Evolution, and Data
Types Two major phenomena with the potential to explain the observed conflicts
were identified long before the phylogenomic era using mathematical approaches
and simulations: long-branch attraction and shifts in the model of sequence evolution.
Highly unequal rates of evolution (e.g., Fig. 6b) are thought to be a major
source of long-branch attraction (Felsenstein 1978). This potential source of
misleading signal is actually one of the reasons that many systematists advocate
breaking up long branches by adding taxa (Bergsten 2005), although there are many
reasons that adding taxa is likely to be beneficial. However, highly unequal rates are
not absolutely necessary for long-branch attraction; Hendy and Penny (1989) found
cases where the MP criterion is inconsistent (i.e., it converges on an incorrect tree in
expectation) even when the data conform to the molecular clock. Long-branch
attraction has often been viewed as a problem associated with the MP criterion that
can be solved by parametric methods (e.g., Huelsenbeck 1997; Swofford et al. 2001),
such as ML and Bayesian inference. However, a number of studies have shown that
long-branch attraction can mislead those parametric methods when the model used
for analyses is incorrect (e.g., Gaut and Lewis 1995; Lockhart et al. 1996), a fact
that should be troubling given that “true underlying models” of evolution are both
unknown and ultimately unknowable (Sanderson and Kim 2000). Regardless, the
fundamental finding of this theoretical work is that large amounts of sequence
data generated by a single model have the potential to converge on an incorrect tree
with high support. The body of older theoretical work should also give pause to
systematists who focus only on high support values inasmuch as those support
values could be inflated. However, the results of recent phylogenomic studies do not
conform to the expectation of relatively simple artifacts like long-branch attraction
because many challenging nodes on the tree of life receive low support even when
very large datasets are analyzed. It is now clear that the real-world evolution of
genomic data cannot be characterized by a single model, making it reasonable to
speculate the limited support observed in recent phylogenomic studies reflects the
existence of a complex mixture of evolutionary processes rather than a simple artifact.
Fig. 6 (continued) Long terminal branches (term) exacerbate the problems associated with short
internal branches. (b) Example of long-branch attraction. Branch lengths reflect numbers of
substitutions per site. (c) Changes in model parameters, illustrated for this tree by focusing on
shifts in the equilibrium GC-content. (d) Example of a heterotachous tree mixture. The number of
substitutions per site for the locus associated with the black gene tree differs from the number of
substitutions per site for the gray tree. (e) Example of a tree mixture that has multiple topologies
(potentially generated by the MSC). The black and gray gene trees have the same topology but
different branch lengths and the dashed gene tree has a distinct topology
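The inconsistency of parsimony under highly unequal rates can be reproduced with a small
calculation in the spirit of Felsenstein (1978). The sketch below assumes a four-taxon
tree ((A,B),(C,D)), a symmetric two-state model, and branches described by their
probability of a net change; taxa A and C are the long branches. It computes the expected
frequency of the three parsimony-informative site patterns. The function names and the
chosen probabilities are illustrative assumptions.

```python
from itertools import product

def pattern_prob(states, p_a, p_b, p_c, p_d, p_int):
    """Probability of tip states (A, B, C, D) on the true tree ((A,B),(C,D))
    under a symmetric two-state model; p_* are per-branch change probabilities."""
    a, b, c, d = states

    def edge(parent, child, p):
        return p if parent != child else 1.0 - p

    total = 0.0
    for u, v in product((0, 1), repeat=2):  # states at the two internal nodes
        total += (0.5 * edge(u, a, p_a) * edge(u, b, p_b) * edge(u, v, p_int)
                  * edge(v, c, p_c) * edge(v, d, p_d))
    return total

def informative_support(p_long, p_short):
    """Expected frequency of sites favoring each split under parsimony when A and C
    are long branches and B, D, and the internal branch are short."""
    args = (p_long, p_short, p_long, p_short, p_short)
    true_split = 2 * pattern_prob((0, 0, 1, 1), *args)   # AB|CD, the true tree
    lba_split = 2 * pattern_prob((0, 1, 0, 1), *args)    # AC|BD, long branches together
    other_split = 2 * pattern_prob((0, 1, 1, 0), *args)  # AD|BC
    return true_split, lba_split, other_split

print("unequal rates:", informative_support(0.4, 0.05))
print("equal rates:  ", informative_support(0.1, 0.1))
```

With the unequal branch lengths used here, sites grouping the two long branches are
expected to outnumber sites supporting the true split, so adding more sites only
strengthens support for the wrong tree; with equal short branches the true split is
favored.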
Patterns of sequence evolution with the potential to mislead phylogenetic
methods are not limited to long-branch attraction; there are myriad model violations
that might mislead available analytical methods. The most obvious and easy model
violation to detect is a case in which the base composition changes across the tree
(Fig. 6c). Indeed, the results of some phylogenetic analyses appear to be driven
largely by convergence in base composition (e.g., Katsu et al. 2009; Phillips et al.
2004), and some authors have suggested that genes with variable base composition
(often called “nonstationary” base composition) should be excluded from phylogenetic
analyses for this reason (e.g., Collins et al. 2005; Jeffroy et al. 2006). The
general time reversible (GTR) model is the most commonly used model in
phylogenomics; it assumes that base frequencies remain constant (i.e., stationary)
over time. However, many other shifts in the patterns of substitution are possible,
and some do have the potential to affect phylogenetic estimation. For example,
there is evidence that the rate matrix (the relative rates of various substitution
types, including the transition-transversion ratio and the relative rates of different
transitions and transversions) can change across trees (e.g., Ota and Penny 2003).
Those changes in model parameters can degrade the performance of ML analyses
using standard models like GTR (Casanellas and Fernandez-Sanchez 2007). It is
unclear whether changes in base composition or in the rate matrix across the tree will
necessarily lead to nodes with limited support. It is clear, however, that the bird tree
exhibits both highly unequal branch lengths (Fig. 7) and variation in model
parameters (Fig. 7 shows variation in GC-content). This variation may, at least in
part, explain the limited support for specic avian clades in phylogenomic analyses.
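The stationarity assumption of the GTR model can be checked directly from the way its rate
matrix is constructed. The sketch below builds an unnormalized GTR matrix from six
exchangeabilities and four base frequencies and verifies that those frequencies are the
equilibrium of the process. The exchangeabilities and the two compositions (one low GC,
one high GC) are hypothetical values chosen for illustration, and the helper names are
our own.

```python
def gtr_rate_matrix(exchangeabilities, freqs):
    """Build an unnormalized GTR rate matrix with Q[i][j] = r_ij * pi_j for i != j
    and diagonal entries chosen so that each row sums to zero (order: A, C, G, T)."""
    order = "ACGT"
    q = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i != j:
                pair = "".join(sorted(order[i] + order[j]))
                q[i][j] = exchangeabilities[pair] * freqs[j]
        q[i][i] = -sum(q[i])  # the diagonal is still 0.0 here, so this sums off-diagonals
    return q

def stationarity_residual(q, freqs):
    """Largest |sum_i pi_i * Q[i][j]| over columns; zero means pi is the equilibrium."""
    return max(abs(sum(freqs[i] * q[i][j] for i in range(4))) for j in range(4))

# Hypothetical parameters: transitions four times as exchangeable as transversions.
rates = {"AC": 1.0, "AG": 4.0, "AT": 1.0, "CG": 1.0, "CT": 4.0, "GT": 1.0}
low_gc = [0.3, 0.2, 0.2, 0.3]   # A, C, G, T
high_gc = [0.2, 0.3, 0.3, 0.2]

q = gtr_rate_matrix(rates, low_gc)
print("residual for the model's own composition:", stationarity_residual(q, low_gc))
print("residual for a high-GC composition:      ", stationarity_residual(q, high_gc))
```

A single GTR matrix has exactly one equilibrium composition, so a lineage whose
GC-content has shifted (as in Fig. 6c) cannot be at equilibrium under the matrix that fits
the rest of the tree. This is the sense in which compositional shifts violate the model.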
[Figure 7 image omitted. Scale bar: 0.05 substitutions per site. Colors: black, evolutionary rate;
red, high GC taxa; blue, low GC taxa. GC-content labels (informative sites / fast sites):
lark 50% / 51.8%; sparrow 50% / 52.4%; bee-eater 49.8% / 54.2%; tinamou 50.2% / 53.1%;
rhea 43.6% / 34.9%; cuckoo 45.3% / 38.4%; nighthawk 45.4% / 40.9%; woodhoopoe 49.4% / 47.1%;
parrot 45.6% / 39.0%; screamer 45.2% / 38.4%. Other labels: Palaeognathae; Galloanseres;
Neoaves (Passerea); Neoaves (Columbea); hemipode; flufftails; woodpeckers, honeyguides, and
barbets; “birds of prey” (owls, hawks, eagles, and New World vultures); “birds of prey” (falcons
and seriemas).]
Fig. 7 Phylogram of the Prum et al. (2015) ML tree emphasizing shifts in GC-content and
evolutionary rate. Taxa with the most extreme values for the GC-content of the parsimony
informative sites and the “fast sites” (sites in the 95th percentile for number of MP steps given
this tree) are indicated using colored arrows (red for high values and blue for low values). The
median GC-content for informative sites was 46.8% (range 43.6–50.2%), and the median
GC-content of the fast sites was 44.7% (range 34.9–54.2%). The files used for these analyses can
be found in Braun (2018). Some taxa with very high GC-contents were also long branches (long
branches indicate lineages with elevated substitution rates). We also emphasize lineages with
branch lengths that differ substantially from their sister groups: (1) rails and allies (which are sister
to cranes and allies), (2) hemipodes (which are sister to the gulls, skuas, and allies), (3) a kingfisher
(nested within the order Coraciiformes), and (4) the woodpeckers and allies (sister to jacamars and
puffbirds). We also emphasize the “birds of prey,” as defined in Jarvis et al. (2014), because they
have the shortest branches (i.e., lowest evolutionary rates) within Telluraves (core landbirds). Many
high-rate taxa are characterized by long branches in analyses of other data (e.g., compare this
phylogram to Fig. 4 in Reddy et al. 2017). We have indicated taxa in Columbea and Passerea, the
two major clades within Neoaves in the Jarvis et al. (2014) TENT, on the tree to emphasize that they
are non-monophyletic in the Prum et al. (2015) tree
Much of our discussion regarding conflicts within Neoaves (see above) focused
on data-type effects. Data-type effects are not a phylogenetic artifact like long-
branch attraction or base compositional shifts; they are simply a way to discuss
different topological signals that emerge in phylogenetic analyses using distinct
subsets of the genome that can be defined using non-phylogenetic criteria. The
fundamental idea is that distinct subsets of the genome might exhibit different
degrees of model violation. If the model used for analysis is not violated by a subset
of the genome (or, more likely, the model is only violated to a modest degree), then
analyses of that subset are likely to yield an accurate estimate of phylogeny; if the
model violation is strong, then analyses could yield an inaccurate estimate of
phylogeny. Reddy et al. (2017) defined their data types crudely (i.e., coding
vs. noncoding regions), but one could subdivide the genome more finely. For
example, it would seem logical to subdivide noncoding data into transcribed and
non-transcribed regions, whereas coding regions might be subdivided using protein
structure (in fact, Pandey and Braun 2018 recently reported a data-type effect linked
to protein structure for the base of Metazoa). Regardless, the fundamental reason that
Reddy et al. (2017) proposed data-type effects was to provide a framework for
exploring and discussing variation in the phylogenetic signal evident in different
parts of the genome; the actual reason(s) why analyses of any particular data type
might yield an incorrect estimate of phylogeny will relate to specific model
violations associated with each data type.
It might seem that one could overcome data-type effects by conducting analyses
that apply different models to each data partition. Partitioned analyses are common
practice in phylogenomics (Lanfear et al. 2014), and partitioned analyses could, at
least in principle, solve data-type effects if one could identify adequate models for
each partition. However, the idea that Reddy et al. (2017) articulated is that there
may not be any models that yield accurate estimates of phylogeny for a specific data
type (or, at the very least, none of the models that are “good enough” have been
implemented in a software package that is practical to use for phylogenomic
analyses). Most programs used in phylogenomics, like RAxML and MrBayes,
only implement the GTR model and its submodels (typically in combination with
methods to describe among-sites rate heterogeneity like Γ-distributed rates and/or
invariant sites). This has led to the use of the GTR model (or a submodel) in almost
all empirical studies (Sumner et al. 2012). If the GTR model is fundamentally
problematic for analyses of one or more of the data type(s), then partitioned analyses
that apply the GTR model to each partition will also be problematic. The only
solution would be excluding the problematic data or using models that differ from
the GTR model in a more fundamental way. Efforts to do the latter are ongoing; for
example, IQ-TREE (Nguyen et al. 2015) is fast enough for phylogenomic studies
(Zhou et al. 2018), and it implements a broader suite of models than many other
programs. If these newer models have a better fit to the data that are poorly described
by the GTR model and yield accurate estimates of phylogeny for those data, then
partitioned analyses (using the appropriate models) should ameliorate data-type
effects.
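To make the constraint imposed by the GTR family concrete, the following minimal Python sketch builds a GTR rate matrix from six symmetric exchangeabilities and four equilibrium base frequencies and stores the result in `Q`. The function name and all parameter values are illustrative assumptions, not estimates from any avian dataset; a partitioned analysis would estimate a separate set of these parameters for each partition while keeping the same functional form.

```python
# Minimal sketch of a GTR instantaneous rate matrix (illustrative values only).
BASES = "ACGT"

def gtr_rate_matrix(exchangeabilities, freqs):
    """Build Q with Q[i][j] = r_ij * pi_j for i != j; each row sums to zero."""
    q = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            if i != j:
                pair = frozenset((BASES[i], BASES[j]))
                q[i][j] = exchangeabilities[pair] * freqs[j]
        # The diagonal is still 0.0 here, so this is minus the off-diagonal sum.
        q[i][i] = -sum(q[i])
    # Rescale so the expected rate at equilibrium is one substitution per site.
    mean_rate = -sum(freqs[i] * q[i][i] for i in range(4))
    return [[x / mean_rate for x in row] for row in q]

# Hypothetical parameter values: six exchangeabilities and four base frequencies.
rates = {frozenset("AC"): 1.0, frozenset("AG"): 4.0, frozenset("AT"): 0.8,
         frozenset("CG"): 1.2, frozenset("CT"): 4.5, frozenset("GT"): 1.0}
pi = [0.30, 0.20, 0.25, 0.25]
Q = gtr_rate_matrix(rates, pi)

for base, row in zip(BASES, Q):
    print(base, [round(x, 3) for x in row])
```

Submodels such as HKY correspond to tying some exchangeabilities together (e.g., one value for transitions and one for transversions); none of them relaxes the stationarity and time reversibility built into this matrix, which is why partitioning alone cannot help if those assumptions are the problem.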
Mixtures of Gene Trees That Reflect Heterotachy, Incomplete Lineage Sorting,
or Reticulation Another challenge for phylogenomic studies is the fact that genomic
data may reflect a mixture of trees rather than a single tree (Maddison 1997). The
simplest case is a mixture of trees with the same topology but different branch
lengths (Fig. 6d), a phenomenon also called heterotachy (Lopez et al. 2002).
Heterotachy in protein-coding regions is often thought to reflect shifts in selective
constraints, causing sites to transition between a state in which substitutions can
accumulate, and a second state in which substitutions are removed by purifying
selection (Penny et al. 2001). However, regional variation in mutation rates is well
characterized in birds (e.g., Axelsson et al. 2005), and shifts in the mutation rate for a
specific neutral region will also result in heterotachy. Regardless of the biological
basis for heterotachy, analyses of heterotachous data using standard (i.e.,
non-heterotachous) ML methods can be misleading (Matsen and Steel 2007), sometimes
resulting in a “mixed branch repulsion” analogous to long-branch attraction. A
more extreme tree mixture involves gene trees with different topologies (Fig. 6e).
Those tree mixtures are biologically realistic; a number of processes, such as
incomplete lineage sorting (ILS) and hybridization, can result in mixtures of trees
with different topologies (Maddison 1997). Edwards (2009) pointed out that ILS
actually results in both heterotachy and discordant trees; the heterotachy reflects
variation in coalescence times for gene trees with the same topology. In fact, some
gene trees with the same topology as the species tree still reflect a deep coalescence
(DC) in which the split in the gene tree occurs prior to multiple speciation events
(e.g., the gray gene tree in Fig. 6e); branch lengths of those trees will certainly differ
from the non-DC trees. Mixtures with multiple topologies arise when DC gene trees
are discordant with the species tree (e.g., the dashed gene tree in Fig. 6e). There is
strong direct evidence for heterotachy (e.g., note the very different terminal branch
lengths for BDNF and PPP2CB in Fig. 8), although the impact of heterotachy on
estimates of avian phylogeny using standard ML methods remains unclear. The
evidence for discordance among gene trees due to ILS is indirect, but there is a
strong theoretical basis for expecting ILS whenever there are short branches in a
species tree. The impact of ILS on estimates of the bird tree obtained using standard
ML methods also remains uncertain.
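The theoretical expectation can be illustrated for the simplest case, a rooted three-taxon species tree ((A,B),C) under the multispecies coalescent: the probability that a gene tree matches the species tree is 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t), where t is the internal branch length in coalescent units. The function name below is hypothetical, and the sketch assumes neutrality, no gene flow, and independent loci.

```python
import math

def triple_gene_tree_probs(t):
    """Gene tree probabilities for the rooted species tree ((A,B),C).

    t is the internal branch length in coalescent units
    (generations divided by 2Ne for a diploid population).
    """
    p_discordant_each = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant_each,  # matches the species tree
        "((A,C),B)": p_discordant_each,              # discordant
        "((B,C),A)": p_discordant_each,              # discordant
    }

# Shorter internal branches imply more discordance due to ILS.
for t in (0.1, 0.5, 1.0, 2.0, 5.0):
    probs = triple_gene_tree_probs(t)
    print(f"t = {t}: concordant = {probs['((A,B),C)']:.3f}, "
          f"each discordant = {probs['((A,C),B)']:.3f}")
```

As t approaches zero the three topologies become equally frequent, which is why short internal branches in a species tree are expected to be accompanied by extensive gene tree discordance.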
The theoretical and computational phylogenetics community has put a tremen-
dous amount of effort into the development of MSC (species tree) methods over
the past decade (reviewed by Edwards 2009; Edwards et al. 2016; Liu et al. 2009;
Warnow 2018). MSC methods are designed to infer the correct species tree given
mixtures of gene trees due to ILS. Standard ML analyses using concatenated gene
sequences implicitly assume a single underlying tree with a specific set of branch
lengths. Thus, standard ML analyses violate the MSC model, and those analyses will
converge on an incorrect estimate of the true species tree in certain parts of parameter
space (Kubatko and Degnan 2007; Mendes and Hahn 2017; Roch and Steel 2015).
Although the fact that MSC methods are consistent (i.e., they converge on the correct
tree in expectation) is viewed as a desirable property, some authors have raised
concerns about the criterion of statistical consistency. Warnow (2015) pointed out
that available proofs of consistency for MSC methods actually focus on a weak

Fig. 8 Two examples of gene trees from Reddy et al. (2017), emphasizing differences in relative
rates and base composition. Support values from an ultrafast bootstrap (Minh et al. 2013) analysis in
IQ-TREE are shown next to branches when they are ≥70%. (a) Tree based on an intron in the
PPP2CB locus. This is one of the few individual gene trees that divides Neoaves into Columbea and
Passerea. As in Fig. 7, the GC-content for informative sites is indicated with colored arrows (red for
the six most GC-rich taxa and blue for the six least GC-rich taxa). Although the median GC-content
for informative sites (45.4%) does differ from the GC-content for constant sites (49.7%), this locus
exhibits limited GC-variation overall (range for informative sites = 41.5–58.4%). (b) Tree based on
part of the BDNF coding exon. This locus exhibits substantial rate variation (note the very long
branches for the bee-eater, Darwin’s finch, and tinamou). Like PPP2CB, the median GC-content for
informative sites (47.5%) differs from the GC-content of constant sites (52.5%). However, this
region also exhibits substantial GC-content variation (range for informative sites = 33.9–84.8%).
Most nodes in this gene tree have limited support, similar to the PPP2CB gene tree (and most trees
based on short gene regions), but the BDNF tree does include some strongly supported clades.
However, some of those strongly supported clades contradict monophyly of Palaeognathae and
Neoaves, which are united by very long branches in all other estimates of the avian species tree. The
extreme GC-content and rate variation suggest that these conflicts may reflect a biased estimate of
phylogeny. All data supporting this figure are available in Braun (2018)
version of statistical consistency, showing only that “tree estimated by the species
tree method will converge in probability to the true species tree as the number of sites
per locus and the number of loci both increase.” Even this weak version of statistical
consistency assumes data were generated by gene trees that reflect the MSC model
along with a model of sequence evolution for which available methods are consistent.
Springer and Gatesy (2016) pointed out that the MSC model ignores important
aspects of genome evolution such as selection and linkage. Even as ubiquitous a
phenomenon as population subdivision will yield a different spectrum of gene trees
than expected given the simple MSC model (Slatkin and Pollack 2008). It is
certainly possible that the real-world processes underlying genome evolution are
close enough to the assumptions that underlie MSC methods that those methods will
yield an accurate estimate of the species tree; indeed, this is generally assumed to be
the case when those methods are applied. However, a recent study examined the fit
of 25 datasets to the MSC model, finding that 20 of those datasets violate the model
(Reid et al. 2013). Ultimately, it should be clear that the criterion of consistency only
yields guarantees in the abstract world of mathematics, not in the real world of
biology.
Given their growing use in phylogenomics, it is important to understand the types
of MSC methods available at this time, despite the concerns we expressed regarding
their justification by appealing to their statistical consistency. Modern MSC methods
often yield trees that are quite congruent with trees based on standard (concatenated)
ML analyses (Tonini et al. 2015), and some are actually less computationally
burdensome than those ML approaches. Xu and Yang (2016) reviewed MSC methods,
highlighting two basic approaches: (1) full-likelihood methods and (2) gene tree
summary methods. Xu and Yang (2016) also described (but did not name) a third
approach that we call site pattern methods. Full-likelihood methods integrate over
the uncertainty in gene trees (this is the approach Maddison 1997 originally
suggested). At this time the full-likelihood approach has been implemented in
BEST (Liu 2008), *BEAST (Heled and Drummond 2010), RevBayes (Höhna
et al. 2016), and BPP (Rannala and Yang 2017); all of those programs use a
Bayesian Markov chain Monte Carlo approach, and none are able to scale to
phylogenomic analyses similar in size to Jarvis et al. (2014). Gene tree summary
methods involve two steps: (1) a standard method (e.g., ML) is used to generate gene
trees and (2) the estimated gene trees are combined to generate the species tree.
Examples of gene tree summary methods include MP-EST (Liu et al. 2010),
ASTRAL (Mirarab et al. 2014b; Mirarab and Warnow 2015; Zhang et al. 2018),
and NJst/ASTRID (Liu and Yu 2011; Vachaspati and Warnow 2015). There are
tools to visualize discordance among estimated gene trees, either as networks or
using other approaches (e.g., Ottenburghs et al. 2016b; Sayyari et al. 2018). Many
phylogenomic studies, including a number focused on birds (e.g., Kimball et al.
2013; McCormack et al. 2013; Jarvis et al. 2014; Edwards et al. 2017), have used
gene tree summary methods. The practice of estimating gene trees and then combining
those trees actually imposes less computational burden than standard
ML analyses of concatenated loci. Finally, site pattern species tree methods use a
concatenated data matrix as input. However, they differ from standard analyses of
concatenated data by decomposing the data matrix into quartets (SVDquartets;
Chifman and Kubatko 2014) or rooted triples (SMRT-ML; DeGiorgio and Degnan
2010) and identifying the optimal tree for those subsets of taxa. Obviously, the
method used to infer the quartet (or rooted triplet) subtrees must be consistent given
the MSC for the approach to be viewed as a species tree method. However, methods
that are consistent for those subtrees (under at least some circumstances) do exist
(DeGiorgio and Degnan 2010; Long and Kubatko 2017). Use of singular value
decomposition to choose the subtrees (the criterion used by SVDquartets) is also
very fast computationally. After the subtrees are generated, they are combined using a
supertree method such as MRP (Baum 1992; Ragan 1992) or Quartet MaxCut (Snir
and Rao 2012) to generate the species tree. Although the use of site pattern methods
remains less common than the gene tree summary methods, they have begun to
attract attention in avian phylogenomics (e.g., Hosner et al. 2015b; Meiklejohn et al.
2016; Moyle et al. 2016; Sun et al. 2014).
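The two-step logic of gene tree summary methods can be illustrated with a deliberately simple toy: take a set of already-estimated gene trees, and for every set of three taxa count how often each pair appears as closest relatives. Under the MSC the most frequent rooted triple is expected to match the species tree for those three taxa. This sketch is not the algorithm used by MP-EST, ASTRAL, or ASTRID; the nested-tuple tree encoding, the function names, and the four example gene trees are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def leaves(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def sister_pair(tree, triple):
    """Return the two taxa of `triple` that are closest relatives in `tree`."""
    if isinstance(tree, str):
        return None
    for child in tree:
        inside = leaves(child) & triple
        if len(inside) == 3:
            return sister_pair(child, triple)  # all three are deeper in this clade
        if len(inside) == 2:
            return inside                      # the third taxon lies outside
    return None                                # unresolved for this triple

def triple_votes(gene_trees):
    """Count, for every taxon triple, how many gene trees support each sister pair."""
    taxa = sorted(leaves(gene_trees[0]))       # assumes identical taxon sets
    votes = {}
    for triple in combinations(taxa, 3):
        counts = Counter()
        for tree in gene_trees:
            pair = sister_pair(tree, frozenset(triple))
            if pair is not None:
                counts[tuple(sorted(pair))] += 1
        votes[triple] = counts
    return votes

# Step 1 (not shown) would estimate these gene trees, e.g., by ML.
gene_trees = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    ((("A", "C"), "B"), "D"),
    (("A", "B"), ("C", "D")),
]
# Step 2: summarize the gene trees.
votes = triple_votes(gene_trees)
for triple, counts in votes.items():
    print(triple, counts.most_common())
```

Because step 2 only counts topological features of the input trees, any error in the estimated gene trees is passed directly to the summary, which is the weakness of this class of methods discussed in the text.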
Hybridization represents another major source of discordance among gene trees.
Hybrids have been documented in most bird orders (Ottenburghs et al. 2015, 2017b),
and hybridization can impact phylogenetic estimation for recent radiations (e.g.,
Lavretsky et al. 2014). Phylogenomics is likely to revolutionize the study of these
complex radiations (Lamichhaney et al. 2015; Grant and Grant 2016; Stryjewski and
Sorenson 2017). However, the amount of gene tree discordance due to introgression
in many lineages may be limited by the fact that hybrids often have lower fitness than
parental types (e.g., Bronson et al. 2003). Two phenomena which might be especially
important for generating mixtures of gene trees that reflect hybridization are
(1) hybrid speciation and (2) despeciation. Hybrid speciation reflects cases where
hybrid populations become reproductively isolated from both parental species. The
Golden-crowned manakin appears to be an example of such a hybrid species; ~2/3 of
its genome is more closely related to the Opal-crowned manakin, whereas ~1/3 is
more closely related to the Snow-capped manakin (Barrera-Guzmán et al. 2018).
Despeciation refers to the fusion of two related species, resulting in a single species
and erasing the initial speciation event. Kearns et al. (2018) presented phylogenomic
evidence that Common ravens arose by fusion of two raven lineages that diverged
ca. 1.5 Mya and would likely be viewed as species if they were extant (Holarctic
and California). Chihuahuan ravens diverged from the California lineage while it
was isolated from the Holarctic birds. Both of these phenomena lead to a spectrum of
gene trees that differs from the expectation given ILS alone (see Fig. 3 in Hahn and
Nakhleh 2016).
If we focus deeper in evolutionary history, the impact of hybridization is expected
to be more difficult to examine: gene tree estimation error might make it virtually
impossible to distinguish the descendants of lineages that underwent limited
amounts of ancient hybridization from those with purely treelike history (combined
with ILS). However, the expectation that sex-linked genes will exhibit less
introgression than autosomal genes (Rheindt and Edwards 2011) might be useful for
distinguishing conflict due to ILS from conflict due to introgression. Indeed, Fuchs
et al. (2013) used this to explain a difference between autosomal and Z-linked genes
they observed in woodpeckers. That study used seven autosomal loci, three Z-linked
loci, and mitochondrial sequences, so it would be interesting to reexamine it in a
phylogenomic framework. Another test uses the expectation that the two minority
topologies for rooted triplets in gene trees will be recovered in equal numbers if ILS
alone is responsible for discordance; the failure of a collection of true gene
trees to exhibit this equality would lead one to reject treelike evolution under ILS
alone. However, errors in estimated gene trees can either produce or obscure
inequalities in the numbers of gene trees with each minority resolution, limiting
the utility of the inequality test. To address this, Zwickl et al. (2014) proposed a
“cumulative support distribution” test that incorporates information about support in
the gene trees. Developing practical tests that are able to establish whether the null
hypothesis of ILS alone can explain discordance among gene trees represents a
major challenge in the phylogenomic era.
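The simple inequality test described above can be sketched as an exact two-sided binomial test on the counts of the two minority rooted-triplet topologies, which are expected to be equal under ILS alone. This is only the naive test, not the cumulative support distribution test of Zwickl et al. (2014); the function name and example counts are hypothetical, and the sketch assumes independent, error-free gene trees, which is precisely the assumption the text warns is unrealistic.

```python
from math import comb

def minority_symmetry_pvalue(n_minor1, n_minor2):
    """Exact two-sided binomial test that the two minority triplets are equally frequent."""
    n = n_minor1 + n_minor2
    if n == 0:
        return 1.0
    k = min(n_minor1, n_minor2)
    # Probability of a split at least as uneven as the observed one when p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical counts of gene trees showing each minority topology.
for counts in [(48, 52), (30, 70), (5, 5)]:
    print(counts, minority_symmetry_pvalue(*counts))
```

A small p-value would lead one to reject ILS alone for true gene trees, but with estimated gene trees the same asymmetry can arise from systematic estimation error.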
Rare Genomic Changes A fundamental problem for studies focused on discordance
among gene trees is the indirect nature of the evidence. Unlike evolutionary
rate heterogeneity and heterotachy, in which it is relatively straightforward to find
direct evidence for the phenomena, the evidence for ILS and introgression is indirect
because it depends on the comparison of gene trees. However, phylogenetic trees
estimated using short sequences, like individual genes, are subject to substantial
estimation error (e.g., Chojnowski et al. 2008; Gatesy and Springer 2014; Mirarab
et al. 2014a; Patel et al. 2013) and could be subject to systematic error (e.g., if there is
strong evolutionary rate variation and/or nonstationary base composition). Thus,
observing incongruence among estimated gene trees does not provide direct evidence
of ILS or introgression because that incongruence could reflect error. This is
true even for loci with limited evolutionary rate and GC-content variation (e.g.,
Fig. 8a); it is certainly true for gene trees for loci with substantial rate and
GC-content variation (e.g., Fig. 8b).
Rare genomic changes, which correspond to a heterogeneous set of slowly
accumulating changes in the genome, provide an alternative means to examine
ILS and introgression that has the potential to be better than the use of estimated
gene trees. Ideally, rare genomic changes represent uniquely derived genomic
characters (i.e., homoplasy-free changes that are subject neither to reversal nor to
convergence). Genuinely homoplasy-free genomic characters would define a single
branch in their associated gene tree perfectly. Analyses of avian ILS have used three
types of rare genomic changes: (1) TE insertions, (2) numt insertions, and (3) indels
(insertions and deletions) as a whole. TE insertions are the most commonly used
(e.g., Haddrath and Baker 2012; Suh et al. 2011). Many conflicting TE insertions
have been identified in birds (Han et al. 2011; Jarvis et al. 2014; Matzke et al. 2012;
Suh et al. 2011, 2015, 2017); this observed conflict has typically been interpreted
as direct evidence of ILS. Unfortunately, precise quantification of ILS using TE
insertions is difficult because (1) the rate of TE insertion is quite variable over time
(Kapusta and Suh 2017); (2) informative TE insertions are relatively rare even
when they are scored at a whole-genome scale (Suh 2015); and (3) avian TE
insertions do not appear to be completely free of true homoplasy (Han et al. 2011).
Unlike TE insertions, the sole numt study (Liang et al. 2018) did not reveal any
homoplasy. However, Liang et al. (2018) only identified a small number of informative
numt insertions and that study included a limited taxon sample. If we expand our
focus to indels as a whole, which are much more numerous than TE or numt
insertions, Jarvis et al. (2014) predicted that discordance due to ILS would yield a
positive relationship between internal branch lengths and the proportion of apparent
synapomorphies mapping to each of those branches that appear non-homoplastic.
That exact relationship was observed, although that approach cannot provide a
quantitative estimate of ILS. Regardless, the observed levels of conflict among
rare genomic changes indicate there was a relatively large amount of ILS near the
base of Neoaves.
Analyses of rare genomic changes could be revolutionized by truly whole-
genome phylogenetics. Many analyses reported by Jarvis et al. (2014) actually did
not require whole-genome sequencing. For example, sequence capture could
(at least in principle) have been used to generate the data for the TENT. In contrast,
generating rare genomic change data would have been much more difficult (potentially
even impossible) without genome sequencing. Identifying TE insertions prior
to the phylogenomic era required labor-intensive laboratory methods
(described by Haddrath and Baker 2012). Bioinformatic screens of
whole-genome assemblies can reveal TE insertions without the need for complex
laboratory methods. Repetitive sequences, like TEs, can be challenging to assemble,
but this can be solved by using platinum-quality genome sequences (Kapusta and
Suh 2017; Weissensteiner and Suh 2018). It is also challenging to target numt
insertions, but they are straightforward to identify by searching whole-genome
data. However, because numt insertions are nonfunctional mitochondrial sequences
in the nuclear genome, they have a relatively high likelihood of being misassembled
in genomes based exclusively on short-read data (because some numt reads are likely
to cluster with mitochondrial reads). Thus, platinum-quality genome sequences
should improve numt scoring relative to most currently available assemblies. Finally,
the availability of more avian genome assemblies could allow the use of other types
of rare genomic changes. Microinversions are one of these types of rare genomic
change that are impossible to identify without sequence data (Braun et al. 2011).
Microinversions technically include all inversions that cannot be identified
cytologically (Chaisson et al. 2006). However, the sole avian microinversion study (Braun
et al. 2011) focused on short (<50 bp) inversions, finding that they accumulate at a
rate comparable to TE insertions and suggesting that they are likely to be as useful
for phylogenetics as TE insertions. Large-scale identification of microinversions will
revolutionize their use and provide another way to examine discordance among gene
trees. We anticipate that the phylogenomic era will lead to a flood of rare genomic
change datasets.
Rare genomic changes are interesting from a computational standpoint because
the optimal tree for ideal rare genomic changes is the MP tree (Steel and Penny
2004, 2005). MP is orders of magnitude more computationally efficient than ML
(or Bayesian) methods (Sanderson and Kim 2000). It is unclear whether the MP tree
for a collection of ideal rare genomic changes generated on the tree mixture
generated by the MSC will be the species tree. However, Mendes and Hahn
(2017) have shown that MP analyses of concatenated data are consistent given the
MSC (given certain assumptions), providing reasons for optimism if
one views the criterion of consistency as critical (for detailed arguments against the
position that consistency is a necessary feature of phylogenetic methods, see Brower
2018; Sanderson and Kim 2000). Of course, there are two reasons that empirical rare
genomic change datasets will not be “perfect” (i.e., absolutely homoplasy-free):
(1) the relevant type of rare genomic change could exhibit some true homoplasy
and (2) the genome assembly and/or orthology detection pipeline could introduce
errors. It may be necessary to develop analytical methods that consider those sources
of error. Nevertheless, it seems reasonable to postulate that rare genomic changes will make it
possible to estimate the species tree and amount of ILS accurately.
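A minimal sketch of why MP is so cheap for rare genomic changes: the Fitch algorithm scores each presence/absence character with a single pass over the tree, and an ideal (homoplasy-free) character costs exactly one step on any tree that contains the branch it defines. Treating insertions as unordered binary characters scored with Fitch parsimony is an assumption made here for illustration; the tree encoding, function names, and the three example characters are hypothetical.

```python
def fitch_steps(tree, states):
    """Return (possible ancestral states, number of changes) for one character.

    `tree` is a rooted binary tree written as nested tuples of taxon names;
    `states` maps each taxon name to its observed state (e.g., 1 = insertion present).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # No shared state: one change is required on a branch below this node.
    return left_set | right_set, left_steps + right_steps + 1

def parsimony_score(tree, characters):
    """Total number of changes required by all characters on `tree`."""
    return sum(fitch_steps(tree, ch)[1] for ch in characters)

# Hypothetical insertions: two shared by A and B, one shared by A and C
# (the conflicting character could reflect ILS or true homoplasy).
characters = [
    {"A": 1, "B": 1, "C": 0, "D": 0},
    {"A": 1, "B": 1, "C": 0, "D": 0},
    {"A": 1, "B": 0, "C": 1, "D": 0},
]
tree_ab = ((("A", "B"), "C"), "D")
tree_ac = ((("A", "C"), "B"), "D")
score_ab = parsimony_score(tree_ab, characters)
score_ac = parsimony_score(tree_ac, characters)
print("score with A+B:", score_ab, " score with A+C:", score_ac)
```

The tree uniting A and B requires fewer changes and would be preferred; the one extra step on that tree comes entirely from the conflicting character, so the excess over the number of characters gives a direct count of conflict.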
Assessing Support and Confronting Theory with Phylogenomic Data Despite
the large body of theoretical work, we do not have a complete explanation for the
observed results of avian phylogenomic studies. The strongest type of theoretical
studies, proofs of consistency or inconsistency (e.g., Felsenstein 1978; Hendy and
Penny 1989; Kim 2000; Matsen and Steel 2007; Roch and Steel 2015; Mendes and
Hahn 2017), provides information about the behavior of specific analytical methods
given an infinite amount of data that were generated under a specific model. Kim
(2000) discussed an elegant geometric interpretation of phylogenetic methods that
has the interesting corollary that nonparametric bootstrap support for clades should
increase to 100% as the sample size increases. If the analytical method is consistent
given the true underlying model of evolution, the correct clade will exhibit 100%
support given sufficient data; if the data reflect a part of parameter space where the
method is not consistent, an incorrect clade is expected to exhibit 100% support.
However, neither Jarvis et al. (2014) nor Prum et al. (2015) observed 100% bootstrap
support for all clades. This raises the question of how many sites are necessary to
provide a “sufficient amount” of data to observe the expected asymptotic behavior.
Many empirical systematists would have predicted that an alignment of 41.8 Mbp
(the size used to generate the Jarvis et al. 2014 TENT) would be sufficient for all
nodes to converge on 100% support (recognizing that some nodes might be resolved
incorrectly due to inconsistency). But there are nine nodes in the TENT that have
<100% support (and one has only 55% support). Increasing the data matrix size
more than sevenfold does not eliminate those low-support branches; six nodes with
<100% support remain present in the Jarvis et al. (2014) whole-genome tree (four of
those nodes have 62% support). Thus, the limited support for clades at the base of
Neoaves given available analytical methods appears to reflect an intrinsic property of
the data rather than a trivial limitation in the amount of data.
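The asymptotic expectation can be illustrated with a caricature in which every site simply “votes” for one of two competing trees and the method picks the tree favored by more sites. This is not Kim's (2000) geometric argument and not a real phylogenetic analysis; the function names, the per-site probabilities, and the numbers of sites and replicates are all assumptions chosen for illustration.

```python
import random

def simulate_sites(n_sites, p_tree1, seed=0):
    """Simulate sites; each favors tree 1 (coded 1) with probability p_tree1, else tree 2 (0)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_tree1 else 0 for _ in range(n_sites)]

def bootstrap_support(site_prefs, n_replicates=200, seed=1):
    """Fraction of bootstrap replicates in which most resampled sites favor tree 1."""
    rng = random.Random(seed)
    n = len(site_prefs)
    wins = 0
    for _ in range(n_replicates):
        resampled = rng.choices(site_prefs, k=n)  # resample sites with replacement
        if sum(resampled) * 2 > n:
            wins += 1
    return wins / n_replicates

# A weak per-site preference (55%) for tree 1, with increasing numbers of sites.
for n_sites in (50, 500, 5000):
    sites = simulate_sites(n_sites, p_tree1=0.55)
    print(n_sites, bootstrap_support(sites))
```

In this caricature support is driven toward 100% for whichever tree the sites favor on average as the number of sites grows, whether or not that tree is correct (setting `p_tree1` below 0.5 mimics an inconsistent method). Support that stays well below 100% at genome scale is therefore not what this simple picture predicts.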
The fact that analyses of a very large (even genome-scale) dataset can yield 100%
bootstrap support for an incorrect clade may seem disturbing. However, it is a natural
outcome when the analytical method is not consistent. By definition, a consistent
estimator converges on the true value (in phylogenetics this would be the true tree)
as the amount of data available for analysis increases (as more of the genome is
sampled). If an estimator is based on a fundamental misunderstanding of the
processes that generated the data, then the method is unlikely to be consistent, at
least in some parts of parameter space. It should not be surprising that analyses using
such a method could lead to an incorrect conclusion with 100% support; after all, we
defined the analytical method as one based on fundamentally incorrect assumptions
regarding the processes that generated the data. This raises a question: are there
phylogenomic methods that are immune to this issue? There have certainly been
efforts to create metrics of support for the phylogenomic era. For example, Seo
(2008) extended the standard bootstrap to a multilocus framework (ASTRAL
includes an easy-to-use implementation of this multilocus bootstrap). The general
idea of subsampling genes has also attracted attention (Edwards 2016);
bootstrapping is often used to assess support in locus subsampling studies. There
are other methods like concordance factors (Ané et al. 2007; Baum 2007) and the
information theoretic measures of Salichos et al. (2014), which instead focus on
examining the agreement among gene trees. The latter is especially interesting
because it has been pointed out that it can also be applied to rare genomic changes. It is
important to recognize that measures of agreement among gene trees do not provide
information regarding the support for clades in the species tree. If data fit the MSC,
one may still have a high degree of confidence regarding the presence of a specific
clade in the species tree even when a relatively small proportion of gene trees agree
with that clade (Pamilo and Nei 1988). If enough gene trees are sampled, multilocus
bootstrapping (Seo 2008) and local posterior probabilities (Sayyari and Mirarab
2016) can both yield strong support for clades that only agree with a plurality of
gene trees. However, all methods exhibit the same behavior as the standard bootstrap
when the analytical method is inconsistent (i.e., incorrect clades will receive 100%
support if sufficient data are analyzed and those data reflect a part of parameter space
where the method is inconsistent). But all is not lost; many trees are both robust to
the analytical method (e.g., Rindal and Brower 2011) and very likely to be correct. In
fact, the story of avian phylogenomics is arguably one of steady progress, with parts
of the tree that would have been viewed as hopeless a decade ago now emerging as
“solved” in a satisfying manner (e.g., Fig. 2). It is the parts of the tree that remain in
flux despite the availability of very large amounts of data that represent a problem:
we must either conclude that the 41.8 Mbp analyzed by Jarvis et al. (2014) is not
sufficient to observe the expected asymptotic behavior (i.e., 100% support at all
nodes) or that there is something about the data and available methods of phylogenetic
analysis that we do not understand.
There are four basic hypotheses that can explain the limited support for major
groups at the base of Neoaves in multiple phylogenomic analyses (Fig. 2). First,
errors in the data matrices (e.g., assembly, orthology assignment, and alignment)
could introduce noise. If this hypothesis is correct, the limited support should
disappear as the quality of the aligned dataset is improved, either by removing
problematic regions (e.g., using alignment-filtering methods like DivA; Mendoza
et al. 2014) or by extracting data from improved genome assemblies that lead to less
downstream error. Second, the poor support could reflect limitations of the available
computational implementations of methods. If the bird data lie very close to
“boundaries,” then with even minor differences in numerical optimization during the
calculation of the likelihood, the methods might choose different trees in different
bootstrap replicates. However, this would require all of the very large Jarvis et al.
(2014) datasets to be in parts of parameter space where those computational issues
manifest themselves. Third, it could be the case that the heterogeneous nature of the
data tends to obscure phylogenetic signal. If this hypothesis is correct, it might be
possible to improve estimates of the avian tree by focusing on less noisy parts of the
genome. Reddy et al. (2017) presented several arguments that noncoding data, such
as introns and UCEs, might perform better in phylogenetic analyses than coding
exons. The strongest empirical argument favoring noncoding data was the fact that
trees based on rare genomic changes (one for TE insertions and one for all indels) are
more similar to the intron and UCE trees. Nonetheless, it would clearly be more
convincing to identify additional data types that yield trees congruent with the
noncoding trees. Finally, as described above, the base of Neoaves could represent
a hard polytomy (Suh 2016). If Suh (2016) is correct, then individual gene trees will
be random for the lineages involved in the hard polytomy. The hard polytomy
hypothesis predicts that any relatively small set of gene trees is unlikely to be
very congruent. Suh (2016) proposed a nine-taxon hard polytomy, and there are
>135,000 possible rooted nine-taxon trees. However, Reddy et al. (2017) found
that analyses of a largely intronic 54-locus dataset yielded a tree similar to the Jarvis
et al. (2014) intron tree (and the TENT, which was 68% noncoding data). Reddy
et al. (2017) also observed that analyses of a largely coding 104-locus dataset that
did not overlap with any loci in Jarvis et al. (2014) resulted in a tree similar to the
Jarvis et al. (2014) exon tree. Those results seem unlikely if the low support at the
base of Neoaves reflects a hard polytomy; if the hard polytomy hypothesis is correct,
it would require that analyses of data reflecting relatively small sets of gene trees
coincidentally converge on two specific parts of tree space that were identified by
Jarvis et al. (2014), and do so in a manner that is correlated with data type. We suggest that
understanding the heterogeneity present in genomic datasets will ultimately be
necessary to obtain a well-supported estimate of the avian tree of life.
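The size of tree space for a nine-taxon polytomy can be checked with the standard double-factorial counts for fully bifurcating labeled trees: (2n - 5)!! unrooted and (2n - 3)!! rooted topologies for n taxa. The sketch below is illustrative only, and the function names are ours rather than anything used in the studies cited. For nine taxa it gives 135,135 unrooted and 2,027,025 rooted topologies, so the figure quoted above is, if anything, a conservative lower bound for rooted trees.

```python
def double_factorial(k: int) -> int:
    """Return k!! = k * (k - 2) * (k - 4) * ... for k >= 1 (and 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def n_unrooted_trees(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating labeled trees: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa for an unrooted tree")
    return double_factorial(2 * n_taxa - 5)


def n_rooted_trees(n_taxa: int) -> int:
    """Number of rooted, fully bifurcating labeled trees: (2n - 3)!!."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa for a rooted tree")
    return double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    n = 9  # size of the hard polytomy proposed by Suh (2016)
    print(n, "taxa:", n_unrooted_trees(n), "unrooted;", n_rooted_trees(n), "rooted")
```

If gene trees were truly random with respect to these lineages, two small, independent sets of loci would rarely be expected to land in the same region of a space this large.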
The Impact of Genome Assembly Quality in the Phylogenomic Era. All of the
issues discussed above focus on the behavior of analytical methods, raising an
important question: what is the role for platinum-quality genome assemblies in
avian phylogenomics? After all, draft genome assemblies typically capture at least
90% of most bird genome sequences, even when they reflect very low-coverage
sequencing (e.g., Tiley et al. 2018). The increased amount of data available in
platinum-quality genome assemblies is unlikely to have much direct impact on
phylogenomic analyses. However, high-quality genome assemblies are likely to
improve dataset quality. Springer and Gatesy (2018) reported that many alignments
analyzed by Jarvis et al. (2014) had homology errors (e.g., the inadvertent alignment
of exons to introns due to incorrect gene annotation). The existence of problematic
alignments in a phylogenomic dataset is not unexpected; any computational pipeline
used to generate a phylogenomic data matrix will yield both false positives (i.e., it
will align some nonhomologous sequences) and false negatives (i.e., it will fail to
align some truly homologous sequences). The greater contiguity and smaller number
of misassemblies in platinum-quality genome assemblies could make it easier to
extract orthologous sequences. Improved genome annotations will also make it
easier to extract orthologous regions. Functional data, such as RNA-seq, can be
very important for genome annotation (Roberts et al. 2011), and it is accumulating at
a rapid pace (e.g., Seki et al. 2017). Iso-seq data have the potential to be even more
helpful because they are generated using long-read sequencing technologies that yield
full-length mRNA sequences (Gonzalez-Garay 2016), unlike RNA-seq data, which reflect short
reads that do not define complete transcripts in a direct manner. Both Iso-seq data
and platinum-quality genome assemblies are now being generated for birds (e.g.,
Korlach et al. 2017; Workman et al. 2017). Better annotation will also highlight
changes in gene structure (e.g., precise intron deletions; Coulombe-Huntington and
Majewski 2007); this will provide another type of rare genomic change (Bleidorn
2017). Finally, better genome assemblies will also permit better analyses of genome
structure and content (e.g., inversions, rearrangements, gene losses, and gene
duplications). Methods to use gene order information for phylogenetic estimation
already exist (Hu et al. 2014), and all that is needed are genome assemblies of
sufficient quality. Ultimately, resolving the most difficult questions in the avian tree
of life is likely to require improved genome assembly and annotation, the extraction
of multiple data types, and improved analytical methods.
4 Progress Toward Species-Level Avian Megaphylogenies
Resolving the bird tree of life actually involves two related but somewhat distinct
research programs: (1) the resolution of difficult nodes, like the base of Neoaves, and
(2) the generation of large-scale trees that include all bird species. The latter is
complicated by evidence that many avian taxa should be considered species but are
not currently recognized as such (Gill 2014; Barrowclough et al. 2016). The decision
to assign the rank of species to taxa depends on the species concept one chooses to
employ (see Ottenburghs 2019 for a recent review of species concepts). Nonetheless,
it is likely that the number of evolutionary entities (regardless of whether or not those
sometimes-cryptic taxa are assigned the rank of species) will ultimately increase
from the ~10,000 bird species that are recognized in most current taxonomies
(Table 3) by at least twofold (and possibly even increase threefold). The Brown
et al. (2017) megaphylogeny is the only existing bird tree with more than 10,000
taxa, although it is unclear how many of those taxa will ultimately be assigned a
status equivalent to species upon detailed study. Nevertheless, generating large-scale
avian trees (megaphylogenies or simply big trees) that include most or all of the
currently recognized avian species would represent an excellent starting point for the
next phase of avian phylogenomics.
The available avian megaphylogenies (Table 4) have two major limitations:
(1) most fail to include all named bird species and (2) none of them reflect analyses
of phylogenomic-scale data. The reason that many megaphylogenies exclude
species is simple: molecular data are absent for a number of species. The complete
megaphylogenies (Brown et al. 2017; Jetz et al. 2012) incorporate data-deficient
species using taxonomic information. Two megaphylogenies (Hedges et al. 2015;
Jetz et al. 2012) also used strong backbone constraints (i.e., a number of
relationships were fixed). Other big trees (Brown et al. 2017; Davis and Page
2014) lack meaningful branch lengths because they reflect the synthesis of published
trees rather than a direct analysis of any molecular (or morphological) data. Burleigh
et al. (2015) is the only avian megaphylogeny generated using a purely empirical
approach; it reflects an unconstrained ML analysis of orthologous avian sequences
published before June 2011. The newest megaphylogeny (Brown et al. 2017) reflects
a synthesis of published trees (i.e., a supertree) so it does incorporate phylogenomic
data in the form of source trees generated using phylogenomic data; in fact, it has a
backbone identical to the Prum et al. (2015) tree (Fig. 2b).
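To make the supertree idea concrete, matrix representation with parsimony (MRP; Baum 1992; Ragan 1992) recodes every clade of every source tree as one binary character: taxa inside the clade are scored 1, other taxa in that source tree are scored 0, and taxa absent from that tree are scored as missing. The sketch below is a minimal illustration under our own assumptions (trees as nested tuples, toy taxon labels A to E, and hypothetical helper names `clades` and `mrp_matrix`). It is not the pipeline used by any study cited here, and it omits the all-zero outgroup row that is often added before the parsimony search.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# A tree is either a taxon name or a tuple of subtrees (hypothetical toy format).
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Tuple[FrozenSet[str], List[FrozenSet[str]]]:
    """Return (leaf set of this subtree, leaf sets of every internal node in it)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves: FrozenSet[str] = frozenset()
    found: List[FrozenSet[str]] = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)  # the clade defined by this internal node
    return leaves, found


def mrp_matrix(source_trees: List[Tree]) -> Dict[str, str]:
    """Code each non-root clade of each source tree as a 1/0/? character."""
    all_taxa = sorted(set().union(*(clades(t)[0] for t in source_trees)))
    rows: Dict[str, List[str]] = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        tree_taxa, tree_clades = clades(tree)
        for clade in tree_clades:
            if clade == tree_taxa:
                continue  # the root clade contains every taxon; uninformative
            for taxon in all_taxa:
                if taxon not in tree_taxa:
                    rows[taxon].append("?")  # taxon not sampled in this tree
                else:
                    rows[taxon].append("1" if taxon in clade else "0")
    return {taxon: "".join(states) for taxon, states in rows.items()}


if __name__ == "__main__":
    tree_1 = (("A", "B"), ("C", "D"))
    tree_2 = (("B", "C"), "E")
    for taxon, row in mrp_matrix([tree_1, tree_2]).items():
        print(taxon, row)
```

Because the characters encode only topology, a supertree built from such a matrix carries no information about molecular branch lengths, which is consistent with the "n/a" entries for supertree analyses in Table 4.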
The value of phylogenomic data, especially whole-genome data, for the generation
of a species-rich bird tree is actually an open question, especially if we choose to
include more than 10,000 terminal evolutionary lineages. The additional information
available in whole-genome sequences may not justify data collection costs or the
computational burden of assembling, annotating, and analyzing whole-genome data
when we are near the tips of the tree. Sequence capture methods (e.g., Table 2)
sidestep problems associated with homology assignment to a fairly large degree
because they focus on assembling relatively short contiguous regions; there is no
need to annotate the genome and extract the relevant data types. Moreover, it is
usually possible to recover complete or virtually complete mitochondrial genome
sequences even if mitochondrial sequences are not targeted by probes (Meiklejohn
et al. 2014; Raposo do Amaral et al. 2015; Wang et al. 2017); alternatively,
low-coverage genome sequencing can yield mitochondrial genome sequences
(Barker et al. 2015). Mitochondrial sequences are likely to be especially valuable
near the tips of trees (Barrowclough and Zink 2009). At least in the near term, it
seems likely that sequence capture will contribute substantial amounts of data to
avian megaphylogenies.
The data included in current megaphylogenies are heterogeneous in quality, and
this has no doubt led to topological (and, when they are available, branch-length)
errors in those trees. The impact of those errors on downstream analyses is unclear,
and it probably depends on the specific analyses that are conducted. Indeed, in a
study focused on a single family (woodpeckers; about 200 species), Dufort (2016)
found at least 28 sequences for 10 different species in Jetz et al. (2012) that have
been (or could possibly be) assigned to other species. Burleigh et al. (2015) fared a
little better, including only 14 problematic sequences for 5 species. These errors
largely appear to reflect difficulties associated with reconciling the NCBI taxonomy
with current species limits, although Dufort (2016) also noted that both big trees
included a cytochrome b sequence that Fuchs et al. (2008) deemed a pseudogene.
Table 4 Avian megaphylogenies generated by synthesizing data from multiple sources

Study                    Number of neornithine taxa   Analytical approach (a)          Method (b)   Branch lengths (c)
Jetz et al. (2012)       6670/9993 (d)                Supermatrix (15 regions)         BI           time
Davis and Page (2014)    5379                         Supertree analysis               MRP          n/a
Burleigh et al. (2015)   6714                         Supermatrix (29 regions)         ML           mol
Hedges et al. (2015)     7163                         Synthesis of divergence times    TTOL         time
Brown et al. (2017)      13,579 (e)                   Supertree analysis               RH           n/a

(a) Analytical approaches are supermatrix (analysis of concatenated sequence data), supertree
(combination of published trees), and synthesis of divergence times (refining a taxonomy using
published timetree data)
(b) BI Bayesian inference (with constraints in the case of Jetz et al. 2012), ML maximum likelihood,
MRP matrix representation with parsimony (Baum 1992; Ragan 1992), RH Redelings and Holder
(2017) supertree pipeline, TTOL "time tree of life" approach (Hedges et al. 2015)
(c) Branch lengths available as estimates of absolute time or molecular change (substitutions per site).
"n/a" indicates that branch lengths are not available
(d) Two Jetz et al. (2012) trees are available. One comprises 6670 taxa for which at least some
sequence data were available. The other comprises 9993 taxa, and it includes taxa for which no data
are available; taxa without any associated sequence data were placed using taxonomic information
(e) The Brown et al. (2017) tree includes taxa that are not assigned the rank of species in current
checklists

Although these findings should raise concerns, the more important question is
whether the phylogenetic errors caused by these issues influence downstream
analyses. Wang et al. (2016) provided empirical evidence that errors can influence
downstream analyses (in this case biogeographic inference) by showing that a single
misplaced species in the Jetz et al. (2012) megaphylogeny had a large impact on their
conclusions. The practice of placing data-deficient taxa using taxonomic information,
which has been criticized on theoretical grounds (Rabosky 2015; Title and
Rabosky 2017), is potentially a bigger problem. There are clades in Jetz et al. (2012)
with posterior probabilities of 1.0 that depend on the position of taxa with no data
that conflict with clades that received 100% bootstrap support in an ML analysis of
eight nuclear loci and three mitochondrial regions (Hosner et al. 2015a). These
problems are not unique to Jetz et al. (2012); even those megaphylogenies that
eschew the use of data-deficient taxa suffer from problems inherent to the synthesis
of diverse data sources. The only solution is the collection of much larger datasets
from all bird species; there are ongoing efforts, like the B10K project (Zhang et al.
2015) and the OpenWings project (Pennisi 2018), that aim to produce the requisite
sequence data.
The exercise of generating an accurate, taxon-rich phylogeny of birds is not an
end in itself. Rather, it is a necessary component of comparative studies that
address evolutionary pattern and process. Trees allow us to ask about the evolutionary
opportunities, constraints, and processes that led to the biodiversity we now
observe. Comparative methods reveal whether observed patterns in biodiversity data
or correlations among traits require a deeper explanation or have the potential to
be explained by simple null hypotheses. Studies using these methods have revealed
patterns in biogeography (e.g., Claramunt and Cracraft 2015; Field and Hsiang
2018; Moyle et al. 2016; Wang et al. 2016), rates of diversification (e.g., Ricklefs
2007; Jetz et al. 2012), and many other types of phenotypic changes (e.g., Cooney
et al. 2017; Hosner et al. 2017; Field et al. 2018; Stoddard et al. 2017). Phylogenetic
trees can inform conservation priorities (e.g., Diniz-Filho et al. 2013; Jetz et al.
2014). Trees are also a necessary component of analyses that range from those
focused on examining sequence conservation and the genomic landscape (e.g.,
Botero-Castro et al. 2017; Zhang et al. 2015) to the relationship between whole-
organism traits and patterns of molecular change, such as evolutionary rates, amino
acid substitutions, and base composition (e.g., Zhang et al. 2014; Berv and Field
2018; Weber et al. 2014a,b). Phylogenomic studies have proven to be the most
revealing, surprising, and reliable of all sources of phylogenetic information. This
assertion may seem surprising given the conflicts among phylogenomic studies that
we have highlighted (Fig. 2), but phylogenomic trees exhibit substantially more
topological similarities than trees generated before the phylogenomic era. Indeed, it
has been only by virtue of conflicts that we have better come to understand the nature
and complexity of the evolutionary and historical processes that appear to have
misled earlier studies, both molecular (e.g., heterotachy, nonstationarity, ILS, and
hybridization) and morphological (e.g., convergent evolution; cf. Mayr 2008;
Sackton et al. 2018).
5 Where Do We Go From Here? The Future of Avian
Phylogenomics
We anticipate many challenges as the field of phylogenetics moves from analyses of
"phylogenomic-scale" datasets, like those generated by sequence capture methods
(Table 2), to analyses of truly whole-genome scale. The Jarvis et al. (2014) analyses
of 48 bird genomes required more than 400 years of CPU time; 42 of those CPU
years were devoted to the 322-Mbp whole-genome tree. Obviously, using the same
methods of analysis for the ~10,000 bird species recognized at this time will be
impractical. Filtering genome alignments to focus on the type(s) of data that are most
likely to provide an accurate estimate of evolutionary history may prove to be
necessary; thus, further exploration of data-type effects will no doubt be helpful.
Of course, the assertion that a specific data type yields a topology close to the true
tree is a hypothesis; finding ways to corroborate specific hypotheses regarding data-type
effects represents a major challenge for the field of phylogenomics. In principle,
improved models of sequence evolution could address data-type effects, but efforts
to develop complex (and presumably very computationally-demanding) models may
actually take us in the wrong direction. We argued that rare genomic changes might
be an extremely valuable source of information; rare genomic change data could
provide another solution to the computational problem because it might be possible
to use MP for their analyses without sacrificing accuracy. Moreover, the use of
rare genomic change data intrinsically reduces whole-genome alignments to much
smaller and, therefore, easier-to-analyze data matrices. This shifts the computational
challenges away from the tree search and toward the identification of rare genomic
changes; those computational challenges are likely to be further ameliorated by the
availability of platinum-quality genome assemblies with high-quality gene annotations.
Obviously, there may be other solutions to the computational challenges associated
with phylogenetic analysis of complete avian genomes, but the fact that it is possible
to envision some practical ways forward makes us sanguine that the challenges can
be solved.
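The appeal of pairing rare genomic changes with MP is that the data reduce to a small presence/absence matrix whose parsimony score on a candidate tree is cheap to compute. The sketch below illustrates this with Fitch (unordered) parsimony on a toy matrix. The choice of Fitch parsimony, the taxon labels, the matrix, and the function names are all our assumptions for illustration; analyses of real TE insertions may prefer other parsimony variants, and the code assumes rooted binary trees with no missing data.

```python
from typing import Dict, Set, Tuple, Union

# A rooted binary tree is a taxon name or a pair of subtrees (toy format).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate state set at this node, minimum changes below it)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, matrix: Dict[str, str]) -> int:
    """Sum Fitch costs over all characters (columns) of the matrix."""
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    total = 0
    for i in range(n_chars):
        column = {taxon: matrix[taxon][i] for taxon in taxa}
        total += fitch(tree, column)[1]
    return total


if __name__ == "__main__":
    # Hypothetical presence (1) / absence (0) of three insertions in four taxa.
    insertions = {"A": "110", "B": "110", "C": "001", "D": "000"}
    candidates = {
        "((A,B),(C,D))": (("A", "B"), ("C", "D")),
        "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    }
    for label, tree in candidates.items():
        print(label, parsimony_score(tree, insertions))
```

Scoring a tree is linear in the number of taxa and characters, so the practical bottleneck becomes identifying the rare genomic changes in the first place, as the text above argues.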
In summary, it is clear that the avian tree of life will grow substantially over the
next few years, and we can expect that nearly all named taxa will eventually be
included. As we alluded to in our discussion of megaphylogenies, there is now a
major impetus among systematists globally to identify more and more cryptic taxa
(whatever their rank). This might easily swell avian taxonomic diversity from the
~10,000 species that are currently recognized to more than 30,000 evolutionary
entities. Investigators of avian diversity and biology will want all of these taxa to find
a home on the avian tree. Throughout this review, we have generally been agnostic
regarding the direction that analytic methods will take in the future, simply describing
the results of various analyses and, in some cases, the strengths and weaknesses
of those methods. However, we believe any comprehensive avian tree should
ultimately reflect the results of an empirical approach that links the data and the
tree in a direct manner, such as a global supermatrix approach akin to that of the Burleigh
et al. (2015) study. Ideally, that tree would incorporate data from rare genomic
changes as well as sequence information. This will present huge computational
challenges, and it is unclear, for reasons we articulated above, how the community
will meet these challenges. Increasing incorporation of more taxa and more
characters has always called for shifts in analytical thinking, and these shifts will
no doubt continue to happen. Important questions in avian biology will be asked and
answered using many different kinds and amounts of data. Some of those questions
will require genome-scale data; some will even require reference quality genome
assemblies. However, it will remain possible to address many innovative questions
without genome-scale data. With that in mind, it is important that ornithologists
focus on developing and framing the innovative questions in avian biology; there is
little doubt that genome-scale data for birds will become broadly available and be
combined with dense taxon sampling. We expect that genome-scale data will make it
possible to answer new questions and push the frontiers of avian biology forward
over the coming years.
Acknowledgments We are grateful to Robert Kraus for inviting this chapter and for his encouragement
(and patience) while we were writing it. We would also like to express our gratitude to
Tom Gilbert and two anonymous reviewers for insightful comments that improved the manuscript.
E.L.B. acknowledges support from the US National Science Foundation grants DEB-1118823 and
DEB-1655683 (the "OpenWings" project) and a seed grant from the University of Florida Biodiversity
Institute; J.C. acknowledges the US National Science Foundation awards DEB-1241066 and
DEB-1146423.
References
Aberer AJ, Kobert K, Stamatakis A (2014) ExaBayes: massively parallel Bayesian tree inference
for the whole-genome era. Mol Biol Evol 31:2553–2556.
https://doi.org/10.1093/molbev/msu236
Agnolín FL, Egli FB, Chatterjee S, Marsà JAG, Novas FE (2017) Vegaviidae, a new clade of
southern diving birds that survived the K/T boundary. Sci Nat 104:87.
https://doi.org/10.1007/s00114-017-1508-y
Allentoft ME, Rawlence NJ (2012) Moa's Ark or volant ghosts of Gondwana? Insights from
nineteen years of ancient DNA research on the extinct moa (Aves: Dinornithiformes) of
New Zealand. Ann Anat 194:36–51. https://doi.org/10.1016/j.aanat.2011.04.002
Andermann T et al (2018) Allele phasing greatly improves the phylogenetic utility of
ultraconserved elements. Syst Biol. https://doi.org/10.1093/sysbio/syy039
Andersen MJ, McCullough JM, Mauck WM, Smith BT, Moyle RG (2017) A phylogeny of
kingfishers reveals an Indomalayan origin and elevated rates of diversification on oceanic
islands. J Biogeogr. https://doi.org/10.1111/jbi.13139
Ané C, Larget B, Baum DA, Smith SD, Rokas A (2007) Bayesian estimation of concordance among
gene trees. Mol Biol Evol 24:412–426. https://doi.org/10.1093/molbev/msl170
Axelsson E, Webster MT, Smith NG, Burt DW, Ellegren H (2005) Comparison of the chicken and
turkey genomes reveals a higher rate of nucleotide divergence on microchromosomes than
macrochromosomes. Genome Res 15:120–125. https://doi.org/10.1101/gr.3021305
Baker AJ, Haddrath O, McPherson JD, Cloutier A (2014) Genomic support for a moa-tinamou
clade and adaptive morphological convergence in flightless ratites. Mol Biol Evol
31:1686–1696. https://doi.org/10.1093/molbev/msu153
Barker FK, Oyler-McCance S, Tomback DF (2015) Blood from a turnip: tissue origin of
low-coverage shotgun sequencing libraries affects recovery of mitogenome sequences.
Mitochondrial DNA 26:384–388. https://doi.org/10.3109/19401736.2013.840588
Barrera-Guzmán AO, Aleixo A, Shawkey MD, Weir JT (2018) Hybrid speciation leads to novel
male secondary sexual ornamentation of an Amazonian bird. Proc Natl Acad Sci USA
115:E218–E225. https://doi.org/10.1073/pnas.1717319115
Barrowclough GF, Zink RM (2009) Funds enough, and time: mtDNA, nuDNA and the discovery of
divergence. Mol Ecol 18:2934–2936. https://doi.org/10.1111/j.1365-294X.2009.04271.x
Barrowclough GF, Cracraft J, Klicka J, Zink RM (2016) How many kinds of birds are there and
why does it matter? PLoS One 11:e0166307. https://doi.org/10.1371/journal.pone.0166307
Baum BR (1992) Combining trees as a way of combining data sets for phylogenetic inference, and
the desirability of combining gene trees. Taxon 41:3–10. https://doi.org/10.2307/1222480
Baum DA (2007) Concordance trees, concordance factors, and the exploration of reticulate
genealogy. Taxon 56:417–426. https://doi.org/10.1002/tax.562013
Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D (2004)
Ultraconserved elements in the human genome. Science 304:1321–1325.
https://doi.org/10.1126/science.1098119
Bergsten J (2005) A review of long-branch attraction. Cladistics 21(2):163–193.
https://doi.org/10.1111/j.1096-0031.2005.00059.x
Bergstrom CT, Dugatkin LA (2012) Evolution. W. W. Norton & Company, New York
Berv JS, Field DJ (2018) Genomic signature of an avian Lilliput effect across the K-Pg extinction.
Syst Biol 67:1–13. https://doi.org/10.1093/sysbio/syx064
Bleidorn C (2016) Third generation sequencing: technology and its potential impact on
evolutionary biodiversity research. Syst Biodivers 14:1–8.
https://doi.org/10.1080/14772000.2015.1099575
Bleidorn C (2017) Rare genomic changes. In: Bleidorn C (ed) Phylogenomics. Springer, Cham,
pp 195–211. https://doi.org/10.1007/978-3-319-54064-1_10
Botero-Castro F, Figuet E, Tilak MK, Nabholz B, Galtier N (2017) Avian genomes revisited:
hidden genes uncovered and the rates versus traits paradox in birds. Mol Biol Evol
34:3123–3131. https://doi.org/10.1093/molbev/msx236
Bourdon E, de Ricqles A, Cubo J (2009) A new transantarctic relationship: morphological evidence
for a Rheidae-Dromaiidae-Casuariidae clade (Aves, Palaeognathae, Ratitae). Zool J Linn Soc
156:641–663. https://doi.org/10.1111/j.1096-3642.2008.00509.x
Braun EL (2018) Data for: Resolving the avian tree of life from top to bottom: the promise and
potential boundaries of the phylogenomic era (Version 1.0) [Data set]. Zenodo.
https://doi.org/10.5281/zenodo.1419827
Braun EL, Kimball RT (2001) Polytomies, the power of phylogenetic inference, and the stochastic
nature of molecular evolution: a comment on Walsh et al. (1999). Evolution 55:1261–1263
Braun EL, Kimball RT (2002) Examining basal avian divergences with mitochondrial sequences:
model complexity, taxon sampling, and sequence length. Syst Biol 51:614–625.
https://doi.org/10.1080/10635150290102294
Braun EL et al (2011) Homoplastic microinversions and the avian tree of life. BMC Evol Biol
11:141. https://doi.org/10.1186/1471-2148-11-141
Bronson CL, Grubb TC, Braun MJ (2003) A test of the endogenous and exogenous selection
hypotheses for the maintenance of a narrow avian hybrid zone. Evolution 57:630–637.
https://doi.org/10.1111/j.0014-3820.2003.tb01554.x
Brower AVZ (2018) Statistical consistency and phylogenetic inference: a brief review. Cladistics
34:562–567. https://doi.org/10.1111/cla.12216
Brown JW, Wang N, Smith SA (2017) The development of scientific consensus: analyzing conflict
and concordance among avian phylogenies. Mol Phylogenet Evol 116:69–77.
https://doi.org/10.1016/j.ympev.2017.08.002
Brusatte SL, O'Connor JK, Jarvis ED (2015) The origin and diversification of birds. Curr Biol
25:R888–R898. https://doi.org/10.1016/j.cub.2015.08.003
Bruxaux J et al (2017) Recovering the evolutionary history of crowned pigeons (Columbidae:
Goura): implications for the biogeography and conservation of New Guinean lowland birds.
Mol Phylogenet Evol. https://doi.org/10.1016/j.ympev.2017.11.022
Bryson RW, Faircloth BC, Tsai WLE, McCormack JE, Klicka J (2016) Target enrichment of
thousands of ultraconserved elements sheds new light on early relationships within New World
sparrows (Aves: Passerellidae). Auk 133:451–458. https://doi.org/10.1642/Auk-16-26.1
Burleigh JG, Kimball RT, Braun EL (2015) Building the avian tree of life using a large-scale, sparse
supermatrix. Mol Phylogenet Evol 84:53–63. https://doi.org/10.1016/j.ympev.2014.12.003
Campillo LC, Oliveros CH, Sheldon FH, Moyle RG (2017) Genomic data resolve gene tree
discordance in spiderhunters (Nectariniidae, Arachnothera). Mol Phylogenet Evol.
https://doi.org/10.1016/j.ympev.2017.12.011
Cantor CR (1990) Orchestrating the human genome project. Science 248:49–51
Casanellas M, Fernandez-Sanchez J (2007) Performance of a new invariants method on
homogeneous and nonhomogeneous quartet trees. Mol Biol Evol 24:288–293.
https://doi.org/10.1093/molbev/msl153
Chaisson MJ, Raphael BJ, Pevzner PA (2006) Microinversions in mammalian evolution. Proc Natl
Acad Sci USA 103:19824–19829. https://doi.org/10.1073/pnas.0603984103
Chifman J, Kubatko L (2014) Quartet inference from SNP data under the coalescent model.
Bioinformatics 30:3317–3324. https://doi.org/10.1093/bioinformatics/btu530
Chojnowski JL, Kimball RT, Braun EL (2008) Introns outperform exons in analyses of basal avian
phylogeny using clathrin heavy chain genes. Gene 410:89–96.
https://doi.org/10.1016/j.gene.2007.11.016
Chubb AL (2004) New nuclear evidence for the oldest divergence among neognath birds: the
phylogenetic utility of ZENK (i). Mol Phylogenet Evol 30:140–151.
https://doi.org/10.1016/S1055-7903(03)00159-3
Claramunt S, Cracraft J (2015) A new time tree reveals Earth history's imprint on the evolution of
modern birds. Sci Adv 1:e1501005. https://doi.org/10.1126/sciadv.1501005
Clarke JA (2004) Morphology, phylogenetic taxonomy, and systematics of Ichthyornis and
Apatornis (Avialae: Ornithurae). Bull Am Mus Nat Hist 286:1–179.
https://doi.org/10.1206/0003-0090(2004)286<0001:MPTASO>2.0.CO;2
Clarke JA, Norell MA (2004) New avialan remains and a review of the known avifauna from the
Late Cretaceous Nemegt Formation of Mongolia. Am Mus Novit 3447:1–12.
https://doi.org/10.1206/0003-0082(2004)447<0001:NARAAR>2.0.CO;2
Clarke JA, Tambussi CP, Noriega JI, Erickson GM, Ketcham RA (2005) Definitive fossil evidence
for the extant avian radiation in the Cretaceous. Nature 433:305–308.
https://doi.org/10.1038/nature03150
Clements JF, Schulenberg TS, Iliff MJ, Roberson D, Fredericks TA, Sullivan BL, Wood CL (2017)
The eBird/Clements checklist of birds of the world: v2016.
http://www.birds.cornell.edu/clementschecklist/download/. Accessed 31 Aug 2017
Cloutier A, Sackton TB, Grayson P, Edwards SV, Baker AJ (2018) First nuclear genome assembly
of an extinct moa species, the little bush moa (Anomalopteryx didiformis). bioRxiv:262816.
https://doi.org/10.1101/262816
Collins TM, Fedrigo O, Naylor GJP (2005) Choosing the best genes for the job: the case for
stationary genes in genome-scale phylogenetics. Syst Biol 54:493–500.
https://doi.org/10.1080/10635150590947339
Cooney CR et al (2017) Mega-evolutionary dynamics of the adaptive radiation of birds. Nature
542:344–347. https://doi.org/10.1038/nature21074
Cornetti L, Valente LM, Dunning LT, Quan X, Black RA, Hebert O, Savolainen V (2015) The
genome of the "great speciator" provides insights into bird diversification. Genome Biol Evol
7:2680–2691. https://doi.org/10.1093/gbe/evv168
Coulombe-Huntington J, Majewski J (2007) Characterization of intron loss events in mammals.
Genome Res 17:23–32. https://doi.org/10.1101/gr.5703406
Cox WA, Kimball RT, Braun EL (2007) Phylogenetic position of the New World quail
(Odontophoridae): eight nuclear loci and three mitochondrial regions contradict morphology
and the Sibley-Ahlquist tapestry. Auk 124:71–84.
https://doi.org/10.1642/0004-8038(2007)124[71:Ppotnw]2.0.Co;2
Cracraft J (1973) Continental drift, palaeoclimatology, and the evolution and biogeography of birds.
J Zool 169:455545. https://doi.org/10.1111/j.1469-7998.1973.tb03122.x
Cracraft J (1974) Phylogeny and evolution of ratite birds. Ibis 116:494521. https://doi.org/10.
1111/j.1474-919X.1974.tb07648.x
Cracraft J (2001) Avian evolution, Gondwana biogeography and the Cretaceous-Tertiary mass
extinction event. Proc Biol Sci 268:459469. https://doi.org/10.1098/rspb.2000.1368
Cracraft J (2013) Avian higher-level relationships and classication: nonpasseriforms. In:
Dickinson EC, Remsen JV Jr (eds) The Howard and Moore complete checklist of the birds of
the world, vol 1, 4th edn. Aves Press, Eastbourne, pp xxixliii
Cracraft J et al (2004) Phylogenetic relationships among modern birds (Neornithes): towards an
avian tree of life. In: Cracraft J, Donoghue MJ (eds) Assembling the tree of life. Oxford
University Press, New York, pp 468489
Crawford NG, Faircloth BC, McCormack JE, Brumeld RT, Winker K, Glenn TC (2012) More
than 1000 ultraconserved elements provide evidence that turtles are the sister group of
archosaurs. Biol Lett 8:783786. https://doi.org/10.1098/rsbl.2012.0331
Davis KE, Page RDM (2014) Reweaving the tapestry: a supertree of birds. PLoS Curr 6. https://doi.
org/10.1371/currents.tol.c1af68dda7c999ed9f1e4b2d2df7a08e
De Pietri VL, Scoeld RP, Zelenkov N, Boles WE, Worthy TH (2016) The unexpected survival of
an ancient lineage of anseriform birds into the Neogene of Australia: the youngest record of
Presbyornithidae. R Soc Open Sci 3:150635. https://doi.org/10.1098/rsos.150635
DeGiorgio M, Degnan JH (2010) Fast and consistent estimation of species trees using supermatrix
rooted triples. Mol Biol Evol 27:552569. https://doi.org/10.1093/molbev/msp250
del Hoyo J, Elliott A, Sargatal J, Christie DA, de Juana E (eds) (2017) Handbook of the birds of the
world alive. Lynx Edicions, Barcelona. http://www.hbw.com
Delsuc F, Brinkmann H, Philippe H (2005) Phylogenomics and the reconstruction of the tree of life.
Nat Rev Genet 6:361375. https://doi.org/10.1038/nrg1603
Dickinson EC, Christidis L (2014) The Howard and Moore complete checklist of the birds of the
world, 4th edn, vol 2. Passerines. Aves Press, Eastbourne
Dickinson EC, Remsen JV Jr (2013) The Howard and Moore complete checklist of the birds of the
world, 4th edn, vol 1. Passerines. Aves Press, Eastbourne
Dimitrieva S, Bucher P (2012) UCNEbase a database of ultraconserved non-coding elements and
genomic regulatory blocks. Nucleic Acids Res 41:D101D109. https://doi.org/10.1093/nar/
gks1092
Diniz-Filho JA, Loyola RD, Raia P, Mooers AO, Bini LM (2013) Darwinian shortfalls in biodiver-
sity conservation. Trends Ecol Evol 28:689695. https://doi.org/10.1016/j.tree.2013.09.003
Dufort MJ (2016) An augmented supermatrix phylogeny of the avian family Picidae reveals
uncertainty deep in the family tree. Mol Phylogenet Evol 94:313326. https://doi.org/10.
1016/j.ympev.2015.08.025
Edwards SV (2009) Is a new and general theory of molecular systematics emerging? Evolution
63:119. https://doi.org/10.1111/J.1558-5646.2008.00549.X
Edwards SV (2016) Phylogenomic subsampling: a brief review. Zool Scr 45:6374. https://doi.org/
10.1111/zsc.12210
Edwards SV, Wilson AC (1990) Phylogenetically informative length polymorphism and sequence
variability in mitochondrial DNA of Australian songbirds (Pomatostomus). Genetics
126:695711
Edwards SV, Arctander P, Wilson AC (1991) Mitochondrial resolution of a deep branch in the
genealogical tree for perching birds. Proc Biol Sci 243:99107. https://doi.org/10.1098/rspb.
1991.0017
Edwards SV et al (2016) Implementing and testing the multispecies coalescent model: a valuable
paradigm for phylogenomics. Mol Phylogenet Evol 94:447–462.
https://doi.org/10.1016/j.ympev.2015.10.027
Edwards SV, Cloutier A, Baker AJ (2017) Conserved nonexonic elements: a novel class of marker
for phylogenomics. Syst Biol 66:1028–1044. https://doi.org/10.1093/sysbio/syx058
Eisen JA (1998) Phylogenomics: improving functional predictions for uncharacterized genes by
evolutionary analysis. Genome Res 8:163–167
Eisen JA, Kaiser D, Myers RM (1997) Gastrogenomic delights: a movable feast. Nat Med
3:1076–1078
Ericson PGP (1996) The skeletal evidence for a sister-group relationship of anseriform and
galliform birds: a critical evaluation. J Avian Biol 27:195–202. https://doi.org/10.2307/3677222
Ericson PGP (2012) Evolution of terrestrial birds in three continents: biogeography and parallel
radiations. J Biogeogr 39:813–824. https://doi.org/10.1111/j.1365-2699.2011.02650.x
Ericson PGP et al (2006) Diversification of Neoaves: integration of molecular sequence data and
fossils. Biol Lett 2:543–547. https://doi.org/10.1098/rsbl.2006.0523
Fain MG, Houde P (2004) Parallel radiations in the primary clades of birds. Evolution
58:2558–2573. https://doi.org/10.1111/j.0014-3820.2004.tb00884.x
Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC (2012)
Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary
timescales. Syst Biol 61:717–726. https://doi.org/10.1093/sysbio/sys004
Faux C, Field DJ (2017) Distinct developmental pathways underlie independent losses of flight in
ratites. Biol Lett 13:20170234. https://doi.org/10.1098/rsbl.2017.0234
Feduccia A (1996) The origin and evolution of birds. Yale University Press, New Haven, CT
Felsenstein J (1978) Cases in which parsimony or compatibility methods will be positively
misleading. Syst Zool 27:401–410. https://doi.org/10.2307/2412923
Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution
39:783–791. https://doi.org/10.1111/j.1558-5646.1985.tb00420.x
Feng Y, Zhang Y, Ying C, Wang D, Du C (2015) Nanopore-based fourth-generation DNA
sequencing technology. Genomics Proteomics Bioinformatics 13:4–16.
https://doi.org/10.1016/j.gpb.2015.01.009
Field DJ, Hsiang AY (2018) A North American stem turaco, and the complex biogeographic history
of modern birds. BMC Evol Biol 18:102. https://doi.org/10.1186/s12862-018-1212-3
Field DJ et al (2018) Early evolution of modern birds structured by global forest collapse at the
end-Cretaceous mass extinction. Curr Biol 28:1825–1831.
https://doi.org/10.1016/j.cub.2018.04.062
Fountaine TM, Benton MJ, Dyke GJ, Nudds RL (2005) The quality of the fossil record of Mesozoic
birds. Proc Biol Sci 272:289–294. https://doi.org/10.1098/rspb.2004.2923
Fuchs J, Pons JM, Ericson PGP, Bonillo C, Couloux A, Pasquet E (2008) Molecular support for a
rapid cladogenesis of the woodpecker clade Malarpicini, with further insights into the genus
Picus (Piciformes: Picinae). Mol Phylogenet Evol 48:34–46.
https://doi.org/10.1016/j.ympev.2008.03.036
Fuchs J, Pons JM, Liu L, Ericson PGP, Couloux A, Pasquet E (2013) A multi-locus phylogeny
suggests an ancient hybridization event between Campephilus and melanerpine woodpeckers
(Aves: Picidae). Mol Phylogenet Evol 67:578–588.
https://doi.org/10.1016/j.ympev.2013.02.014
Futuyma DJ (2005) Evolution. Sinauer, Sunderland, MA
Gatesy J, Springer MS (2014) Phylogenetic analysis at deep timescales: unreliable gene trees,
bypassed hidden support, and the coalescence/concatalescence conundrum. Mol Phylogenet
Evol 80:231–266. https://doi.org/10.1016/j.ympev.2014.08.013
Gaut BS, Lewis PO (1995) Success of maximum likelihood phylogeny inference in the four-taxon
case. Mol Biol Evol 12:152–162. https://doi.org/10.1093/oxfordjournals.molbev.a040183
Gee H (2003) Evolution: ending incongruence. Nature 425:782. https://doi.org/10.1038/425782a
Gilbert PS, Wu J, Simon MW, Sinsheimer JS, Alfaro ME (2018) Filtering nucleotide sites by
phylogenetic signal to noise ratio increases confidence in the Neoaves phylogeny generated
from ultraconserved elements. Mol Phylogenet Evol 126:116–128.
https://doi.org/10.1016/j.ympev.2018.03.033
Gill FB (2014) Species taxonomy of birds: which null hypothesis? Auk 131:150–161.
https://doi.org/10.1642/Auk-13-206.1
Gill F, Donsker D (2017) IOC World Bird List (v 7.3). http://www.worldbirdnames.org/. Accessed
31 Aug 2017
Glenn TC (2011) Field guide to next-generation DNA sequencers. Mol Ecol Resour 11:759–769.
https://doi.org/10.1111/j.1755-0998.2011.03024.x
Gonzalez-Garay ML (2016) Introduction to isoform sequencing using Pacific Biosciences technology
(Iso-Seq). In: Wu J (ed) Transcriptomics and Gene Regulation. Translational Bioinformatics,
vol 9. Springer, Dordrecht, pp 141–160. https://doi.org/10.1007/978-94-017-7450-5_6
Grant PR, Grant BR (2016) Introgressive hybridization and natural selection in Darwin's finches.
Biol J Linn Soc 117:812–822. https://doi.org/10.1111/bij.12702
Grealy A et al (2017) Eggshell palaeogenomics: Palaeognath evolutionary history revealed through
ancient nuclear and mitochondrial DNA from Madagascan elephant bird (Aepyornis sp.)
eggshell. Mol Phylogenet Evol 109:151–163. https://doi.org/10.1016/j.ympev.2017.01.005
Griffin DK, Larkin M, O'Connor RE (2019) Jurassic Spark: what did the genomes of dinosaurs look
like? In: Kraus RHS (ed) Avian genomics in ecology and evolution: from the lab into the wild.
Springer, Cham
Hackett SJ et al (2008) A phylogenomic study of birds reveals their evolutionary history. Science
320:1763–1768. https://doi.org/10.1126/science.1157704
Haddrath O, Baker AJ (2012) Multiple nuclear genes and retroposons support vicariance and
dispersal of the palaeognaths, and an Early Cretaceous origin of modern birds. Proc Biol Sci
279:4617–4625. https://doi.org/10.1098/rspb.2012.1630
Hahn MW, Nakhleh L (2016) Irrational exuberance for resolved species trees. Evolution 70:7–17.
https://doi.org/10.1111/evo.12832
Han K-L et al (2011) Are transposable element insertions homoplasy free? An examination using
the avian tree of life. Syst Biol 60:375–386. https://doi.org/10.1093/sysbio/syq100
Harshman J et al (2008) Phylogenomic evidence for multiple losses of flight in ratite birds. Proc
Natl Acad Sci USA 105:13462–13467. https://doi.org/10.1073/pnas.0803242105
Harvey MG, Smith BT, Glenn TC, Faircloth BC, Brumfield RT (2016) Sequence capture versus
restriction site associated DNA sequencing for shallow systematics. Syst Biol 65:910–924.
https://doi.org/10.1093/sysbio/syw036
Heath TA, Hedtke SM, Hillis DM (2008) Taxon sampling and the accuracy of phylogenetic
analyses. J Syst Evol 46:239–257. https://doi.org/10.3724/SP.J.1002.2008.08016
Hedges SB, Marin J, Suleski M, Paymer M, Kumar S (2015) Tree of life reveals clock-like
speciation and diversification. Mol Biol Evol 32:835–845. https://doi.org/10.1093/molbev/msv037
Heled J, Drummond AJ (2010) Bayesian inference of species trees from multilocus data. Mol Biol
Evol 27:570–580. https://doi.org/10.1093/molbev/msp274
Helm-Bychowski K, Cracraft J (1993) Recovering phylogenetic signal from DNA sequences:
relationships within the corvine assemblage (class Aves) as inferred from complete sequences
of the mitochondrial DNA cytochrome-b gene. Mol Biol Evol 10:1196–1214
Hendy MD, Penny D (1989) A framework for the quantitative study of evolutionary trees. Syst Zool
38:297–309. https://doi.org/10.2307/2992396
Hennig W (1966) Phylogenetic systematics. University of Illinois Press, Chicago, IL
Higuchi RG, Ochman H (1989) Production of single-stranded DNA templates by exonuclease
digestion following the polymerase chain reaction. Nucleic Acids Res 17:5865. https://doi.org/
10.1093/nar/17.14.5865
Hillis DM, Moritz C (eds) (1990) Molecular systematics. Sinauer, Sunderland, MA
Hillis DM, Huelsenbeck JP, Cunningham CW (1994) Application and accuracy of molecular
phylogenies. Science 264:671–677. https://doi.org/10.1126/science.8171318
Hinchliff CE et al (2015) Synthesis of phylogeny and taxonomy into a comprehensive tree of life.
Proc Natl Acad Sci USA 112:12764–12769. https://doi.org/10.1073/pnas.1423041112
Höhna S et al (2016) RevBayes: Bayesian phylogenetic inference using graphical models and an
interactive model-specification language. Syst Biol 65(4):726–736.
https://doi.org/10.1093/sysbio/syw021
Hope S (2002) The Mesozoic radiation of Neornithes. In: Chiappe LM, Witmer LM (eds) Mesozoic
birds: above the heads of dinosaurs. University of California Press, Berkeley, CA, pp 339–388
Hosner PA, Braun EL, Kimball RT (2015a) Land connectivity changes and global cooling shaped
the colonization history and diversification of New World quail (Aves: Galliformes:
Odontophoridae). J Biogeogr 42:1883–1895
Hosner PA, Faircloth BC, Glenn TC, Braun EL, Kimball RT (2015b) Avoiding missing data biases
in phylogenomic inference: an empirical study in the landfowl (Aves: Galliformes). Mol Biol
Evol 33:1110–1125. https://doi.org/10.1093/molbev/msv347
Hosner PA, Braun EL, Kimball RT (2016) Rapid and recent diversification of curassows, guans,
and chachalacas (Galliformes: Cracidae) out of Mesoamerica: phylogeny inferred from
mitochondrial, intron, and ultraconserved element sequences. Mol Phylogenet Evol 102:320–330.
https://doi.org/10.1016/j.ympev.2016.06.006
Hosner PA, Tobias JA, Braun EL, Kimball RT (2017) How do seemingly non-vagile clades
accomplish trans-marine dispersal? Trait and dispersal evolution in the landfowl (Aves:
Galliformes). Proc Biol Sci 284:20170210. https://doi.org/10.1098/rspb.2017.0210
Houde P (1986) Ostrich ancestors found in the Northern Hemisphere suggest new hypothesis of
ratite origins. Nature 324:563–565. https://doi.org/10.1038/324563a0
Houde P (1988) Paleognathous birds from the early Tertiary of the Northern Hemisphere. Publ
Nuttall Ornithol Club 22:1–148
Hu F, Lin Y, Tang J (2014) MLGO: phylogeny reconstruction and ancestral inference from gene-
order data. BMC Bioinformatics 15:354. https://doi.org/10.1186/s12859-014-0354-6
Huelsenbeck JP (1997) Is the Felsenstein zone a fly trap? Syst Biol 46:69–74.
https://doi.org/10.2307/2413636
Huelsenbeck JP, Bollback JP (2001) Empirical and hierarchical Bayesian estimation of ancestral
states. Syst Biol 50:351–366. https://doi.org/10.1080/10635150119871
Hughes JM, Baker AJ (1999) Phylogenetic relationships of the enigmatic Hoatzin (Opisthocomus
hoazin) resolved using mitochondrial and nuclear gene sequences. Mol Biol Evol
16:1300–1307. https://doi.org/10.1093/oxfordjournals.molbev.a026220
Hughes RA, Ellington AD (2017) Synthetic DNA synthesis and assembly: putting the synthetic in
synthetic biology. Cold Spring Harb Perspect Biol 9:a023812.
https://doi.org/10.1101/cshperspect.a023812
Jarvis ED et al (2014) Whole-genome analyses resolve early branches in the tree of life of modern
birds. Science 346:1320–1331. https://doi.org/10.1126/science.1253451
Jeffroy O, Brinkmann H, Delsuc F, Philippe H (2006) Phylogenomics: the beginning of
incongruence? Trends Genet 22:225–231. https://doi.org/10.1016/j.tig.2006.02.003
Jetz W, Thomas GH, Joy JB, Hartmann K, Mooers AO (2012) The global diversity of birds in space
and time. Nature 491:444–448. https://doi.org/10.1038/nature11631
Jetz W, Thomas GH, Joy JB, Redding DW, Hartmann K, Mooers AO (2014) Global distribution
and conservation of evolutionary distinctness in birds. Curr Biol 24:919–930.
https://doi.org/10.1016/j.cub.2014.03.011
Johnston P (2011) New morphological evidence supports congruent phylogenies and Gondwana
vicariance for palaeognathous birds. Zool J Linn Soc 163:959–982.
https://doi.org/10.1111/j.1096-3642.2011.00730.x
Joseph L, Buchanan KL (2015) A quantum leap in avian biology. Emu 115:1–5.
https://doi.org/10.1071/MUv115n1_ED
Kapusta A, Suh A (2017) Evolution of bird genomes: a transposon's-eye view. Ann N Y Acad Sci
1389:164–185. https://doi.org/10.1111/nyas.13295
Katsu Y, Braun EL, Guillette LJ Jr, Iguchi T (2009) From reptilian phylogenomics to reptilian
genomes: analyses of c-Jun and DJ-1 proto-oncogenes. Cytogenet Genome Res 127:79–93.
https://doi.org/10.1159/000297715
Kearns AM et al (2018) Genomic evidence of speciation reversal in ravens. Nat Commun 9:906.
https://doi.org/10.1038/s41467-018-03294-w
Kim J (2000) Slicing hyperdimensional oranges: the geometry of phylogenetic estimation. Mol
Phylogenet Evol 17:58–75. https://doi.org/10.1006/mpev.2000.0816
Kimball RT, Wang N, Heimer-McGinn V, Ferguson C, Braun EL (2013) Identifying localized
biases in large datasets: a case study using the avian tree of life. Mol Phylogenet Evol
69:1021–1032. https://doi.org/10.1016/j.ympev.2013.05.029
King N, Rokas A (2017) Embracing uncertainty in reconstructing early animal evolution. Curr Biol
27:R1081–R1088. https://doi.org/10.1016/j.cub.2017.08.054
Kocher TD, Thomas WK, Meyer A, Edwards SV, Paabo S, Villablanca FX, Wilson AC (1989)
Dynamics of mitochondrial DNA evolution in animals: amplification and sequencing with
conserved primers. Proc Natl Acad Sci USA 86:6196–6200
Korlach J et al (2017) De novo PacBio long-read and phased avian genome assemblies correct and
add to reference genes generated with intermediate and short reads. Gigascience 6:1–16.
https://doi.org/10.1093/gigascience/gix085
Kraus RHS, Wink M (2015) Avian genomics: edging into the wild! J Ornithol 156:851–865.
https://doi.org/10.1007/s10336-015-1253-y
Ksepka DT (2009) Broken gears in the avian molecular clock: new phylogenetic analyses support
stem galliform status for Gallinuloides wyomingensis and rallid affinities for Amitabha
urbsinterdictensis. Cladistics 25:173–197. https://doi.org/10.1111/j.1096-0031.2009.00250.x
Ksepka DT, Clarke JA (2009) Affinities of Palaeospiza bella and the phylogeny and biogeography
of Mousebirds (Coliiformes). Auk 126:245–259. https://doi.org/10.1525/auk.2009.07178
Ksepka DT, Stidham TA, Williamson TE (2017) Early Paleocene landbird supports rapid
phylogenetic and morphological diversification of crown birds after the K-Pg mass extinction.
Proc Natl Acad Sci USA 114:8047–8052. https://doi.org/10.1073/pnas.1700188114
Kubatko LS, Degnan JH (2007) Inconsistency of phylogenetic estimates from concatenated data
under coalescence. Syst Biol 56:17–24. https://doi.org/10.1080/10635150601146041
Kurochkin EN, Dyke GJ, Karhu AA (2002) A new presbyornithid bird (Aves, Anseriformes) from
the Late Cretaceous of Southern Mongolia. Am Mus Novit 3386:1–11.
https://doi.org/10.1206/0003-0082(2002)386<0001:ANPBAA>2.0.CO;2
Lamichhaney S et al (2015) Evolution of Darwin's finches and their beaks revealed by genome
sequencing. Nature 518:371–375. https://doi.org/10.1038/nature14181
Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A (2014) Selecting optimal partitioning
schemes for phylogenomic datasets. BMC Evol Biol 14:82.
https://doi.org/10.1186/1471-2148-14-82
Lavretsky P, Hernández-Baños BE, Peters JL (2014) Rapid radiation and hybridization contribute
to weak differentiation and hinder phylogenetic inferences in the New World Mallard complex
(Anas spp.). Auk 131:524–538. https://doi.org/10.1642/AUK-13-164.1
Le Duc D et al (2015) Kiwi genome provides insights into evolution of a nocturnal lifestyle.
Genome Biol 16:147. https://doi.org/10.1186/s13059-015-0711-4
Lee MSY, Cau A, Naish D, Dyke GJ (2014) Morphological clocks in paleontology, and a
mid-Cretaceous origin of crown Aves. Syst Biol 63:442–449.
https://doi.org/10.1093/sysbio/syt110
Lemmon AR, Emme SA, Lemmon EM (2012) Anchored hybrid enrichment for massively
high-throughput phylogenomics. Syst Biol 61:727–744. https://doi.org/10.1093/sysbio/sys049
Liang B, Wang N, Li N, Kimball RT, Braun EL (2018) Comparative genomics reveals a burst of
homoplasy-free numt insertions. Mol Biol Evol 35(8):2060–2064.
https://doi.org/10.1093/molbev/msy112
Liu L (2008) BEST: Bayesian estimation of species trees under the coalescent model.
Bioinformatics 24:2542–2543. https://doi.org/10.1093/bioinformatics/btn484
Liu L, Yu L (2011) Estimating species trees from unrooted gene trees. Syst Biol 60:661–667.
https://doi.org/10.1093/sysbio/syr027
Liu L, Yu L, Kubatko L, Pearl DK, Edwards SV (2009) Coalescent methods for estimating
phylogenetic trees. Mol Phylogenet Evol 53:320–328.
https://doi.org/10.1016/j.ympev.2009.05.033
Liu LA, Yu LL, Edwards SV (2010) A maximum pseudo-likelihood approach for estimating
species trees under the coalescent model. BMC Evol Biol 10:302.
https://doi.org/10.1186/1471-2148-10-302
Livezey BC, Zusi RL (2007a) Higher-order phylogeny of modern birds (Theropoda, Aves:
Neornithes) based on comparative anatomy. II. Analysis and discussion. Zool J Linn Soc
149:1–95. https://doi.org/10.1111/j.1096-3642.2006.00293.x
Livezey BC, Zusi RL (2007b) Higher-order phylogeny of modern birds (Theropoda, Aves:
Neornithes) based on comparative anatomy. I. Methods and characters. Bull Carnegie Mus
Nat Hist 37:1–544
Lockhart PJ, Larkum AW, Steel M, Waddell PJ, Penny D (1996) Evolution of chlorophyll and
bacteriochlorophyll: the problem of invariant sites in sequence analysis. Proc Natl Acad Sci
USA 93:1930–1934. https://doi.org/10.1073/pnas.93.5.1930
Long C, Kubatko L (2017) Identifiability and reconstructibility of species phylogenies under a
modified coalescent. arXiv preprint:1701.06871
Lopez P, Casane D, Philippe H (2002) Heterotachy, an important process of protein evolution. Mol
Biol Evol 19:1–7. https://doi.org/10.1093/oxfordjournals.molbev.a003973
Maddison WP (1997) Gene trees in species trees. Syst Biol 46:523–536.
https://doi.org/10.2307/2413694
Manthey JD, Campillo LC, Burns KJ, Moyle RG (2016) Comparison of target-capture and
restriction-site associated DNA sequencing for phylogenomics: a test in cardinalid tanagers
(Aves, Genus: Piranga). Syst Biol 65:640–650. https://doi.org/10.1093/sysbio/syw005
Matsen FA, Steel M (2007) Phylogenetic mixtures on a single tree can mimic a tree of another
topology. Syst Biol 56:767–775. https://doi.org/10.1080/10635150701627304
Matzke A et al (2012) Retroposon insertion patterns of neoavian birds: strong evidence for an
extensive incomplete lineage sorting era. Mol Biol Evol 29:1497–1501.
https://doi.org/10.1093/molbev/msr319
Mayr G (2004a) Morphological evidence for sister group relationship between flamingos (Aves:
Phoenicopteridae) and grebes (Podicipedidae). Zool J Linn Soc 140:157–169.
https://doi.org/10.1111/j.1096-3642.2003.00094.x
Mayr G (2004b) Old World fossil record of modern-type hummingbirds. Science 304:861–864.
https://doi.org/10.1126/science.1096856
Mayr G (2008) Avian higher-level phylogeny: well-supported clades and what we can learn from a
phylogenetic analysis of 2954 morphological characters. J Zool Syst Evol Res 46:63–72.
https://doi.org/10.1111/j.1439-0469.2007.00433.x
Mayr G (2009) Paleogene fossil birds. Springer, Berlin
Mayr G (2011) Metaves, Mirandornithes, Strisores and other novelties: a critical review of the
higher-level phylogeny of neornithine birds. J Zool Syst Evol Res 49:58–76.
https://doi.org/10.1111/j.1439-0469.2010.00586.x
Mayr G (2014) A Hoatzin fossil from the middle Miocene of Kenya documents the past occurrence
of modern-type Opisthocomiformes in Africa. Auk 131:55–60.
https://doi.org/10.1642/Auk-13-134.1
Mayr G, De Pietri VL (2014) Earliest and first Northern Hemispheric Hoatzin fossils substantiate
Old World origin of a Neotropic endemic. Naturwissenschaften 101:143–148.
https://doi.org/10.1007/s00114-014-1144-8
Mayr G, Alvarenga H, Mourer-Chauvire C (2011) Out of Africa: fossils shed light on the origin of
the Hoatzin, an iconic Neotropic bird. Naturwissenschaften 98:961–966.
https://doi.org/10.1007/s00114-011-0849-1
Mayr G, De Pietri VL, Scofield RP, Worthy TH (2018) On the taxonomic composition and
phylogenetic affinities of the recently proposed clade Vegaviidae Agnolín et al., 2017 –
neornithine birds from the Upper Cretaceous of the Southern Hemisphere. Cretac Res
86:178–185. https://doi.org/10.1016/j.cretres.2018.02.013
McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC (2012)
Ultraconserved elements are novel phylogenomic markers that resolve placental mammal
phylogeny when combined with species-tree analysis. Genome Res 22:746–754.
https://doi.org/10.1101/gr.125864.111
McCormack JE, Harvey MG, Faircloth BC, Crawford NG, Glenn TC, Brumfield RT (2013) A
phylogeny of birds based on over 1,500 loci collected by target enrichment and high-throughput
sequencing. PLoS One 8:e54848. https://doi.org/10.1371/journal.pone.0054848
McCormack JE, Tsai WLE, Faircloth BC (2016) Sequence capture of ultraconserved elements from
bird museum specimens. Mol Ecol Resour 16:1189–1203.
https://doi.org/10.1111/1755-0998.12466
Meiklejohn KA, Danielson MJ, Faircloth BC, Glenn TC, Braun EL, Kimball RT (2014)
Incongruence among different mitochondrial regions: a case study using complete mitogenomes. Mol
Phylogenet Evol 78:314–323. https://doi.org/10.1016/j.ympev.2014.06.003
Meiklejohn KA, Faircloth BC, Glenn TC, Kimball RT, Braun EL (2016) Analysis of a rapid
evolutionary radiation using ultraconserved elements: evidence for a bias in some multispecies
coalescent methods. Syst Biol 65:612–627. https://doi.org/10.1093/sysbio/syw014
Mendes FK, Hahn MW (2017) Why concatenation fails near the anomaly zone. Syst Biol.
https://doi.org/10.1093/sysbio/syx063
Mendoza MLZ, Nygaard S, da Fonseca RR (2014) DivA: detection of non-homologous and very
divergent regions in protein sequence alignments. BMC Res Notes 7:806.
https://doi.org/10.1186/1756-0500-7-806
Mindell DP (ed) (1997) Avian molecular evolution and systematics. Academic, San Diego, CA
Minh BQ, Nguyen MA, von Haeseler A (2013) Ultrafast approximation for phylogenetic bootstrap.
Mol Biol Evol 30:1188–1195. https://doi.org/10.1093/molbev/mst024
Mirarab S, Warnow T (2015) ASTRAL-II: coalescent-based species tree estimation with many
hundreds of taxa and thousands of genes. Bioinformatics 31:44–52.
https://doi.org/10.1093/bioinformatics/btv234
Mirarab S, Bayzid MS, Boussau B, Warnow T (2014a) Statistical binning enables an accurate
coalescent-based estimation of the avian tree. Science 346:1250463.
https://doi.org/10.1126/science.1250463
Mirarab S, Reaz R, Bayzid MS, Zimmermann T, Swenson MS, Warnow T (2014b) ASTRAL:
genome-scale coalescent-based species tree estimation. Bioinformatics 30:i541–i548.
https://doi.org/10.1093/bioinformatics/btu462
Mitchell KJ et al (2014) Ancient DNA reveals elephant birds and kiwi are sister taxa and clarifies
ratite bird evolution. Science 344:898–900. https://doi.org/10.1126/science.1251981
Miyamoto MM, Cracraft J (eds) (1991) Phylogenetic analysis of DNA sequences. Oxford
University Press, New York
Mossel E (2003) On the impossibility of reconstructing ancestral data and phylogenies. J Comput
Biol 10:669–676. https://doi.org/10.1089/106652703322539015
Moyle RG et al (2016) Tectonic collision and uplift of Wallacea triggered the global songbird
radiation. Nat Commun 7:12709. https://doi.org/10.1038/ncomms12709
Musher LJ, Cracraft J (2018) Phylogenomics and species delimitation of a complex radiation of
Neotropical suboscine birds (Pachyramphus). Mol Phylogenet Evol 118:204–221.
https://doi.org/10.1016/j.ympev.2017.09.013
Nadachowska-Brzyska K, Li C, Smeds L, Zhang G, Ellegren H (2015) Temporal dynamics of avian
populations during Pleistocene revealed by whole-genome sequences. Curr Biol 25:1375–1380.
https://doi.org/10.1016/j.cub.2015.03.047
Nater A, Burri R, Kawakami T, Smeds L, Ellegren H (2015) Resolving evolutionary relationships
in closely related species with whole-genome sequencing data. Syst Biol 64:1000–1017.
https://doi.org/10.1093/sysbio/syv045
Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ (2015) IQ-TREE: a fast and effective
stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol
32:268–274. https://doi.org/10.1093/molbev/msu300
O'Connor JK, Zhou Z (2013) A redescription of Chaoyangia beishanensis (Aves) and a
comprehensive phylogeny of Mesozoic birds. J Syst Palaeontol 11:889–906.
https://doi.org/10.1080/14772019.2012.690455
Olson SL (1985) The fossil record of birds. Avian Biol 8:79–252
Ota R, Penny D (2003) Estimating changes in mutational mechanisms of evolution. J Mol Evol
57(Suppl 1):S233–S240. https://doi.org/10.1007/s00239-003-0032-1
Ottenburghs J (2019) Avian species concepts in the light of genomics. In: Kraus RHS (ed) Avian
genomics in ecology and evolution: from the lab into the wild. Springer, Cham
Ottenburghs J, Ydenberg RC, Van Hooft P, Van Wieren SE, Prins HH (2015) The Avian Hybrids
Project: gathering the scientific literature on avian hybridization. Ibis 157:892–894.
https://doi.org/10.1111/ibi.12285
Ottenburghs J et al (2016a) A tree of geese: a phylogenomic perspective on the evolutionary history
of True Geese. Mol Phylogenet Evol 101:303–313
Ottenburghs J, van Hooft P, van Wieren SE, Ydenberg RC, Prins HH (2016b) Birds in a bush:
toward an avian phylogenetic network. Auk 133:577–582. https://doi.org/10.1642/AUK-16-53.1
Ottenburghs J et al (2017a) A history of hybrids? Genomic patterns of introgression in the True
Geese. BMC Evol Biol 17:201. https://doi.org/10.1186/s12862-017-1048-2
Ottenburghs J, Kraus RHS, van Hooft P, van Wieren SE, Ydenberg RC, Prins HH (2017b) Avian
introgression in the genomic era. Avian Res 8:30. https://doi.org/10.1186/s40657-017-0088-z
Pamilo P, Nei M (1988) Relationships between gene trees and species trees. Mol Biol Evol
5:568–583. https://doi.org/10.1093/oxfordjournals.molbev.a040517
Pandey A, Braun EL (2018) Why do phylogenomic analyses of early animal evolution continue to
disagree? Sites in different structural environments yield different answers. bioRxiv:400465.
https://doi.org/10.1101/400465
Patel S, Kimball RT, Braun EL (2013) Error in phylogenetic estimation for bushes in the tree of life.
J Phylogen Evol Biol 1:110. https://doi.org/10.4172/jpgeb.1000110
Pease JB, Brown JW, Walker JF, Hinchliff CE, Smith SA (2018) Quartet sampling distinguishes
lack of support from conflicting support in the green plant tree of life. Am J Bot 105:385–403.
https://doi.org/10.1002/ajb2.1016
Pennisi E (2018) Bigger, better bird tree of life will soon fly into view. Science.
https://doi.org/10.1126/science.aat8989
Penny D, McComish BJ, Charleston MA, Hendy MD (2001) Mathematical elegance with
biochemical realism: the covarion model of molecular evolution. J Mol Evol 53:711–723.
https://doi.org/10.1007/s002390010258
Persons NW, Hosner PA, Meiklejohn KA, Braun EL, Kimball RT (2016) Sorting out relationships
among the grouse and ptarmigan using intron, mitochondrial, and ultra-conserved element
sequences. Mol Phylogenet Evol 98:123–132. https://doi.org/10.1016/j.ympev.2016.02.003
Philippe H, Brinkmann H, Lavrov DV, Littlewood DT, Manuel M, Worheide G, Baurain D (2011)
Resolving difficult phylogenetic questions: why more sequences are not enough. PLoS Biol 9:
e1000602. https://doi.org/10.1371/journal.pbio.1000602
Phillips MJ, Delsuc F, Penny D (2004) Genome-scale phylogeny and the detection of systematic
biases. Mol Biol Evol 21:1455–1458. https://doi.org/10.1093/molbev/msh137
Phillips MJ, Gibb GC, Crimp EA, Penny D (2010) Tinamous and moa flock together: mitochondrial
genome sequence analysis reveals independent losses of flight among ratites. Syst Biol
59:90–107. https://doi.org/10.1093/sysbio/syp079
Poelstra JW et al (2014) The genomic landscape underlying phenotypic integrity in the face of gene
flow in crows. Science 344:1410–1414. https://doi.org/10.1126/science.1253226
Prum RO, Berv JS, Dornburg A, Field DJ, Townsend JP, Lemmon EM, Lemmon AR (2015) A
comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing.
Nature 526:569–573. https://doi.org/10.1038/nature15697
Rabosky DL (2015) No substitute for real data: a cautionary note on the use of phylogenies from
birth-death polytomy resolvers for downstream comparative analyses. Evolution 69:3207–3216.
https://doi.org/10.1111/evo.12817
Ragan MA (1992) Phylogenetic inference based on matrix representation of trees. Mol Phylogenet
Evol 1:53–58. https://doi.org/10.1016/1055-7903(92)90035-F
Rannala B, Yang Z (2017) Efficient Bayesian species tree inference under the multispecies
coalescent. Syst Biol 66:823–842. https://doi.org/10.1093/sysbio/syw119
Raposo do Amaral F, Neves LG, Resende MF Jr, Mobili F, Miyaki CY, Pellegrino KC, Biondo C
(2015) Ultraconserved elements sequencing as a low-cost source of complete mitochondrial
genomes and microsatellite markers in non-model amniotes. PLoS One 10:e0138446.
https://doi.org/10.1371/journal.pone.0138446
Reddy S et al (2017) Why do phylogenomic data sets yield conflicting trees? Data type influences
the avian tree of life more than taxon sampling. Syst Biol 66:857–879.
https://doi.org/10.1093/sysbio/syx041
Redelings BD, Holder MT (2017) A supertree pipeline for summarizing phylogenetic and
taxonomic information for millions of species. PeerJ 5:e3058. https://doi.org/10.7717/peerj.3058
Reid NM, Hird SM, Brown JM, Pelletier TA, McVay JD, Satler JD, Carstens BC (2013) Poor fit to
the multispecies coalescent is widely detectable in empirical data. Syst Biol 63:322–333.
https://doi.org/10.1093/sysbio/syt057
Rheindt FE, Edwards SV (2011) Genetic introgression: an integral but neglected component of
speciation in birds. Auk 128:620–632. https://doi.org/10.1525/auk.2011.128.4.620
Ricklefs RE (2007) Estimating diversification rates from phylogenetic information. Trends Ecol
Evol 22:601–610. https://doi.org/10.1016/j.tree.2007.06.013
Rindal E, Brower AVZ (2011) Do model-based phylogenetic analyses perform better than
parsimony? A test with empirical data. Cladistics 27:331–334.
https://doi.org/10.1111/j.1096-0031.2010.00342.x
Roberts A, Pimentel H, Trapnell C, Pachter L (2011) Identification of novel transcripts in annotated
genomes using RNA-Seq. Bioinformatics 27:2325–2329.
https://doi.org/10.1093/bioinformatics/btr355
Roch S, Steel M (2015) Likelihood-based tree reconstruction on a concatenation of aligned
sequence data sets can be statistically inconsistent. Theor Popul Biol 100:56–62.
https://doi.org/10.1016/j.tpb.2014.12.005
Ronquist F et al (2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice
across a large model space. Syst Biol 61:539–542. https://doi.org/10.1093/sysbio/sys029
Sackton TB et al (2018) Convergent regulatory evolution and the origin of flightlessness in
palaeognathous birds. bioRxiv:262584. https://doi.org/10.1101/262584
Saiki RK et al (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA
polymerase. Science 239:487–491
Salichos L, Stamatakis A, Rokas A (2014) Novel information theory-based measures for
quantifying incongruence among phylogenetic trees. Mol Biol Evol 31:1261–1271.
https://doi.org/10.1093/molbev/msu061
Sanderson MJ, Kim J (2000) Parametric phylogenetics? Syst Biol 49:817–829.
https://doi.org/10.1080/106351500750049860
Sayyari E, Mirarab S (2016) Fast coalescent-based computation of local branch support from
quartet frequencies. Mol Biol Evol 33:1654