A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing

Although reconstruction of the phylogeny of living birds has progressed tremendously in the last decade, the evolutionary history of Neoaves, a clade that encompasses nearly all living bird species, remains the greatest unresolved challenge in dinosaur systematics. Here we investigate avian phylogeny with an unprecedented scale of data: >390,000 bases of genomic sequence data from each of 198 species of living birds, representing all major avian lineages, and two crocodilian outgroups. Sequence data were collected using anchored hybrid enrichment, yielding 259 nuclear loci with an average length of 1,523 bases for a total data set of over 7.8 × 10⁷ bases. Bayesian and maximum likelihood analyses yielded highly supported and nearly identical phylogenetic trees for all major avian lineages. Five major clades form successive sister groups to the rest of Neoaves: (1) a clade including nightjars, other caprimulgiforms, swifts, and hummingbirds; (2) a clade uniting cuckoos, bustards, and turacos with pigeons, mesites, and sandgrouse; (3) cranes and their relatives; (4) a comprehensive waterbird clade, including all diving, wading, and shorebirds; and (5) a comprehensive landbird clade with the enigmatic hoatzin (Opisthocomus hoazin) as the sister group to the rest. Neither of the two main, recently proposed neoavian clades (Columbea and Passerea¹) was supported as monophyletic. The results of our divergence time analyses are congruent with the palaeontological record, supporting a major radiation of crown birds in the wake of the Cretaceous–Palaeogene (K–Pg) mass extinction.
LETTER
doi:10.1038/nature15697
Richard O. Prum¹,²*, Jacob S. Berv³*, Alex Dornburg¹,²,⁴, Daniel J. Field²,⁵, Jeffrey P. Townsend¹,⁶, Emily Moriarty Lemmon⁷ & Alan R. Lemmon⁸
Birds (Aves) are the most diverse lineage of extant tetrapod vertebrates. They comprise over 10,000 living species², and exhibit an extraordinary diversity in morphology, ecology, and behaviour³. Substantial progress has been made in resolving the phylogenetic history of birds. Phylogenetic analyses of both molecular and morphological data support the monophyletic Palaeognathae (the tinamous and flightless ratites) and Galloanserae (gamebirds and waterfowl) as successive, monophyletic sister groups to the Neoaves, a diverse clade including all other living birds⁴. Resolving neoavian phylogeny has proven to be a difficult challenge because this radiation was very rapid and deep in time, resulting in very short internodes⁴.
In the last decade, phylogenetic analyses of large, multilocus data sets have resulted in the proposal of numerous, novel neoavian relationships. For example, a clade consisting of diving and wading birds has been consistently recovered, as well as a large landbird clade in which falcons and parrots are successive sister groups to the perching birds⁴–⁸. Recently, phylogenetic analyses of 48 whole avian genomes resulted in the proposal of a novel phylogenetic resolution of the initial branching sequence within Neoaves¹. Although this genomic study provided much needed corroboration of many neoavian clades, the limited taxon sampling precluded further insights into the evolutionary history of birds.
It has long been recognized that phylogenetic confidence depends not only on the number of characters analysed and their rate of evolution, but also on the number and relationships of the taxa sampled relative to the nodes of interest⁹–¹¹. Theory predicts that sampling a single taxon that diverges close to a node of interest will have a far greater effect on phylogenetic resolution than will adding more characters¹¹. Despite using an alignment of >40 million base pairs, sparse sampling of 48 species in the recent avian genomic analysis may not have been sufficient to confidently resolve the deep divergences among major lineages of Neoaves. Thus, expanded taxon sampling is required to test the monophyly of neoavian clades, and to further resolve the phylogenetic relationships within Neoaves.
Here, we present a phylogenetic analysis of 198 bird species and 2 crocodilians (Supplementary Table 1) based on loci captured using anchored enrichment¹². Our sample includes species of 122 avian families in all 40 extant avian orders², with denser representation of non-oscine birds (108 families) than of oscine songbirds (14 families). Effort was made to include taxa that would break up long phylogenetic branches, and provide the highest likelihood of resolving short internodes at the base of Neoaves¹¹. We also sampled multiple species within groups whose monophyly or phylogenetic interrelationships have been controversial: that is, tinamous, nightjars, hummingbirds, turacos, cuckoos, pigeons, sandgrouse, mesites, rails, storm petrels, petrels, storks, herons, hawks, hornbills, mousebirds, trogons, kingfishers, barbets, seriemas, falcons, parrots, and suboscine passerines.
We targeted 394 loci centred on conserved anchor regions of the genome that are flanked by more variable regions¹². We performed all phylogenetic analyses on a data set of 259 genes with the highest quality assemblies. The average locus was 1,524 bases in length (361–2,316 base pairs (bp)), and the total percentage of missing data was 1.84%. The concatenated alignment contained 394,684 sites. To minimize overall model complexity while accurately accounting for substitution processes, we performed a partition model sensitivity analysis with PartitionFinder¹³,¹⁴, and compared a complex partition model (one partition per locus) to a heuristically optimized (rclust) partition model. Phylogenetic informativeness (PI) approaches¹⁵,¹⁶ provided strong evidence that the phylogenetic utility of our data set was high, with low declines in PI profiles for individual loci, data set partitions, and the concatenated matrix (Supplementary Fig. 4). We estimated concatenated trees in ExaBayes¹⁷ and RAxML¹⁸ using a 75 partition model. Coalescent species trees were estimated with the gene tree summation methods in STAR¹⁹, NJst²⁰, and ASTRAL²¹ from gene trees estimated with RAxML (see Methods).
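As a rough consistency check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the variable names are ours, and reading the "total data set" as one cell per taxon per alignment site is an assumption rather than something stated in the text.

```python
# Figures quoted in the text; everything derived below is illustrative.
n_loci = 259
mean_locus_length = 1524      # bases, average locus length
alignment_sites = 394_684     # sites in the concatenated alignment
n_taxa = 198 + 2              # bird species plus crocodilian outgroups
missing_fraction = 0.0184     # 1.84% missing data

# Number of loci times mean length should land close to the alignment length.
approx_sites = n_loci * mean_locus_length

# Assumption: total data set = one cell per taxon per site.
matrix_cells = alignment_sites * n_taxa
non_missing_cells = round(matrix_cells * (1 - missing_fraction))

print(f"loci x mean length : {approx_sites:,}")
print(f"alignment sites    : {alignment_sites:,}")
print(f"matrix cells       : {matrix_cells:,}")
print(f"non-missing cells  : {non_missing_cells:,}")
```

Under this reading, the product of loci and mean length differs from the reported alignment length by only a few tens of sites, and the full matrix is on the order of 7.9 × 10⁷ cells, in line with the "over 7.8 × 10⁷ bases" quoted in the abstract.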
Our concatenated Bayesian analyses resulted in a completely resolved, well supported phylogeny. All clades had a posterior probability (PP) of 1, except for a single clade including shoebill (Balaeniceps) and pelican (PP = 0.54) (Fig. 1). The concatenated
*These authors contributed equally to this work.
¹Department of Ecology & Evolutionary Biology, Yale University, New Haven, Connecticut 06520, USA.
²Peabody Museum of Natural History, Yale University, New Haven, Connecticut 06520, USA.
³Department of Ecology and Evolutionary Biology, Fuller Evolutionary Biology Program, Cornell University, and Cornell Laboratory of Ornithology, Ithaca, New York 14853, USA.
⁴North Carolina Museum of Natural Sciences, Raleigh, North Carolina 27601, USA.
⁵Department of Geology & Geophysics, Yale University, New Haven, Connecticut 06520, USA.
⁶Department of Biostatistics, and Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.
⁷Department of Biological Science, Florida State University, Tallahassee, Florida 32306, USA.
⁸Department of Scientific Computing, Florida State University, Tallahassee, Florida 32306, USA.
© 2015 Macmillan Publishers Limited. All rights reserved
[Figure 1, first panel: time-calibrated tree of Aves with tips labelled by genus and clade labels for Palaeognathae, Galloanserae, and Neoaves (Strisores, Columbaves, Gruiformes, Aequorlitornithes). Horizontal axis: time in Ma, from 70 to 0, with bands for the Upper Cretaceous, Palaeogene (Palaeocene, Eocene, Oligocene), Neogene (Miocene, Pliocene), and Quaternary (Pleistocene). Nodes in this panel are numbered 1–6 and 98–197.]
Figure 1 | Phylogeny of birds. Time-calibrated phylogeny of 198 species of birds inferred from a concatenated, Bayesian analysis of 259 anchored phylogenomic loci using ExaBayes¹⁷. Figure continues on the opposite page from green arrow at the bottom of this panel. Complete taxon data in Supplementary Table 1. Higher taxon names appear at right. All clades are supported with posterior probability (PP) of 1.0, except for the Balaeniceps–Pelecanus clade (PP = 0.54; clade 109). The five major, successive, neoavian sister clades are: Strisores (brown), Columbaves (purple), Gruiformes (yellow), Aequorlitornithes (blue), and Inopinaves (green). Background colours mark geological periods. Ma, million years ago; Ple, Pleistocene; Pli, Pliocene; Q., Quaternary. Clade numbers refer to the plot of estimated divergence dates (Supplementary Fig. 7). Fossil age-calibrated nodes are shown in grey. Illustrations of representative bird species³⁰ are depicted by their lineages. See Supplementary Information for details and further discussion.
maximum likelihood analysis recovered a single topology that was identical to the Bayesian tree except for three clades, all of which are far from the base of Neoaves: the relationships among pigeons; among skimmers, gulls, and terns; and among pelicans, shoebill, and waders (Supplementary Fig. 1). Almost all clades in the maximum likelihood tree were maximally supported with bootstrap scores (BS) of 1.00, but nine clades within Neoaves (including four of the most inclusive neoavian clades) received support <0.70 (Supplementary Fig. 1). Coalescent species tree analyses produced substantially different hypotheses for neoavian relationships (Supplementary Fig. 3), but
[Figure 1, second panel: Neoaves continued, with tips labelled by genus and clade labels for Inopinaves, Accipitriformes, Coraciimorphae, Australaves, and Passeriformes. Horizontal axis: time in Ma, from 70 to 0, with the same geological bands as the first panel. Nodes in this panel are numbered 7–97.]
Figure 1 | Continued.
most of the discordant clades received conspicuously lower bootstrap support values (0.07 < BS < 0.30). Quantifying the phylogenetic informativeness of individual loci¹⁵,¹⁶ revealed that these low support values were not due to homoplasy driven by saturation of nucleotide states, but rather by the low power of individual loci to resolve the entire range of internode lengths across the depth of the tree (Supplementary Figs 4 and 5; see Methods). This result was not unexpected. The low phylogenetic information content of individual genes at deep timescales has been demonstrated to impede phylogenetic resolution in a coalescent species tree framework²²,²³. Furthermore, when clades with <0.75 bootstrap support values in the species trees are collapsed, the resulting topology is exactly congruent with the concatenated Bayesian tree (except for the relationships of tinamous among palaeognaths; Supplementary Fig. 3). Although coalescent species trees account for incomplete lineage sorting, simulations show that species tree methods based on gene tree summation may not provide significantly better performance over concatenation methods²².
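The collapsing step mentioned above can be illustrated with a minimal sketch. The `Node` class, the `collapse` and `to_newick` functions, and the four-tip example tree with its support values are hypothetical; they show only the general idea of turning weakly supported clades into polytomies and are not the software or data used in this study.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: Optional[str] = None        # tip label; None for internal nodes
    support: Optional[float] = None   # bootstrap support for the clade under this node
    children: List["Node"] = field(default_factory=list)


def collapse(node: Node, threshold: float = 0.75) -> Node:
    """Return a copy of the tree in which internal nodes with support
    below `threshold` are removed and their children attached to the parent."""
    new_children: List[Node] = []
    for child in node.children:
        child = collapse(child, threshold)
        is_weak = (bool(child.children)
                   and child.support is not None
                   and child.support < threshold)
        if is_weak:
            new_children.extend(child.children)   # splice into a polytomy
        else:
            new_children.append(child)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Topology-only Newick string (no branch lengths, no support values)."""
    if not node.children:
        return node.name or ""
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"


# Hypothetical example: clade (A,B) has BS = 0.30, clade (C,D) has BS = 0.95.
tree = Node(children=[
    Node(support=0.30, children=[Node("A"), Node("B")]),
    Node(support=0.95, children=[Node("C"), Node("D")]),
])
collapsed = collapse(tree, threshold=0.75)
print(to_newick(tree))
print(to_newick(collapsed))
```

Comparing two collapsed topologies then amounts to checking that every clade retained in one tree is compatible with the other, which is the sense in which the collapsed species trees are described as congruent with the concatenated tree.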
Our phylogeny identifies many new clades, and supports many phylogenetic relationships proposed in previous studies (see detailed phylogenetic discussion in the Supplementary Information). Congruent with all recent studies, the phylogeny places palaeognaths as the sister group to the rest of birds, and the flying tinamous (Tinamidae) within the flightless ratites. This tree, however, places tinamous as the sister group to cassowary and emu alone (Fig. 1, grey). The phylogeny of Galloanserae is exactly congruent with previous studies⁴ (Fig. 1, red).
Within the monophyletic Neoaves, we recover five major clades, each of which is the successive sister group to the remaining clades in the series (Fig. 1). The Strisores includes the nightjars and their nocturnal relatives with the diurnal swifts and hummingbirds (Fig. 1, brown). Four nocturnal lineages (nightjars, a neotropical oilbird-potoo clade, frogmouths, and owlet-nightjars) form successive sister groups to the diurnal swift and hummingbird clade.
The Columbaves is a novel clade that consists of two monophyletic groups recently identified by Jarvis et al.¹ (Fig. 1, purple). A clade consisting of turacos, bustards, and cuckoos (Otidimorphae) is sister to a clade consisting of pigeons as the sister group to sandgrouse and mesites (Columbimorphae). The third neoavian clade consists of a well recognized monophyletic group of core gruiform birds (Gruiformes; Fig. 1, yellow), with interrelationships that are consistent with previous phylogenies⁴.
The Aequorlitornithes is a novel, comprehensive clade of waterbirds, including all shorebirds, diving birds, and wading birds (Fig. 1, blue). Within this group, the flamingos and grebes¹,⁴–⁶ are the sister group to shorebirds, and the sunbittern and tropicbirds¹,⁴,⁶ are the sister group to the wading and diving birds (Fig. 1, blue). Other interrelationships within these groups are extensively congruent with the results in ref. 4 and the work of others (see Supplementary Information).
The fifth major neoavian clade, which we name Inopinaves, is a very diverse landbird clade with the same composition as previously recognized (Telluraves)¹,⁴–⁶, but with the enigmatic, neotropical hoatzin (Opisthocomus hoazin) as the sister group to all other landbirds (Fig. 1, green). The phylogeny of the landbirds shares many points of congruence with earlier hypotheses, including the relationships of seriemas, falcons, parrots, and perching birds¹,⁴–⁶, and the interrelationships among oscine songbirds²⁴. However, we find that hawks (Accipitriformes) are the sister group to a new clade including the rest of the landbirds, to be called Eutelluraves (see Supplementary Information).
Our divergence time analyses employed 19 phylogenetically and geologically well-constrained fossil calibrations (following recently proposed best practices²⁵), documenting many deep divergences within the avian crown group (Fig. 1, grey nodes; see Supplementary Information). Our analysis supports an extremely rapid radiation of the avian crown group in the wake of the K–Pg mass extinction event (Fig. 1, Supplementary Figs 6 and 7). Although the post-K–Pg radiation hypothesis has long been strongly supported by the avian fossil record²⁶,²⁷, it has so far received little support from molecular divergence time analyses⁴,²⁸. The tempo and mode of the extant avian radiation remain contentious. For example, an alternative calibration analysis including the fossil Vegavis did not support significantly different dates of divergence outside of the Galloanserae (see Supplementary Information and Supplementary Figs 10–12). Confident determination of the age of crown Aves will have to await discoveries of Mesozoic stem neognaths and palaeognaths, and detailed assessments of the influence of soft maximum bound parameterization on the age of the deepest avian divergences.
Our results indicate that the recent genome phylogeny¹ may contain some erroneous relationships induced by long branch attraction from sparse taxon sampling. Maximum likelihood analysis of our sequence data pruned down to a phylogenetically equivalent subsample of 48 species produces relationships along the neoavian 'backbone' (Supplementary Fig. 8) that are entirely discordant with the phylogeny based on our full data set (Fig. 1). This reduced taxon analysis recovers some of the specific features of the recent genome phylogeny by Jarvis et al.¹ (Supplementary Fig. 8): for example, the placement of the pigeons, mesites, and sandgrouse (a subclade of Columbea¹) outside of the rest of the Neoaves. Differences in tree topology when taxa are excluded are to be expected if early internodes in Neoaves are very short. Adding taxa that have diverged near nodes of interest has been theoretically demonstrated to constrain the possible historical substitution patterns, and increase the accuracy of phylogenetic inference¹¹. By increasing our taxon sampling to include all major avian lineages, we have minimized the possibility that additional taxon sampling alone will alter the relationships in our tree.
Jarvis et al.¹ also identified a well supported clade consisting of the hoatzin (Opisthocomus) as the sister group to a crane (Grus) and a plover (Charadrius) (total evidence nucleotide tree, BS = 0.91, 0.96, respectively). However, Grus and Charadrius were the only species sampled from two very diverse neoavian orders: Gruiformes, 185 species; and Charadriiformes, 385 species². Our results indicate that Opisthocomus is the most ancient bird lineage (~64 million years) consisting of only a single, extant species. Thus, the three taxa placed in this assemblage by Jarvis et al.¹ comprise three of the most ancient, and under-sampled lineages within all birds, indicating the strong possibility of long branch attraction artefacts. By contrast, these same groups are represented by 26 species in our analysis, and they do not form an exclusive clade (Fig. 1).
In addition to providing a new backbone for comprehensive avian supertrees and comparative evolutionary analyses²⁸, this new avian phylogeny supports many interesting hypotheses about avian evolution. This phylogeny upholds the hypothesis that the ancestor of the diurnal swifts and hummingbirds evolved from a clade that had been predominantly nocturnal for ~10 million years. Although hummingbirds have acute near-ultraviolet vision²⁹, the effect of extended ancestral nocturnality on the evolution of the visual system in this group of birds is unknown. Our findings also support the emerging pattern that landbirds evolved from a raptorial grade¹. The sister group relationships of hawks to the rest of the landbirds, of owls to the diverse coraciimorph clade, and of seriemas and falcons to the parrots and passerines indicate the persistence of a raptorial ecology among ancestral landbirds. Lastly, the identification of a new, broadly comprehensive waterbird–shorebird clade indicates a striking, and previously unappreciated, level of evolutionary constraint on the ecological diversification of birds that will be exciting to investigate in the future.
Online Content Methods, along with any additional Extended Data display items and Source Data, are available in the online version of the paper; references unique to these sections appear only in the online paper.
Received 3 May; accepted 9 September 2015.
Published online 7 October 2015.
1. Jarvis, E. D. et al. Whole-genome analyses resolve early branches in the tree of life of
modern birds. Science 346, 1320–1331 (2014).
2. Gill, F. & Donsker, D. IOC World Bird List (v5.1) http://dx.doi.org/10.14344/IOC.ML.5.1 (2015).
3. Gill, F. B. Ornithology 2nd edn (W. H. Freeman and Co., 1995).
4. Hackett, S. J. et al. A phylogenomic study of birds reveals their evolutionary history.
Science 320, 1763–1768 (2008).
5. Ericson, P. G. P. et al. Diversification of Neoaves: integration of molecular sequence
data and fossils. Biol. Lett. 2, 543–547 (2006).
6. McCormack, J. E. et al. A phylogeny of birds based on over 1,500 loci collected by
target enrichment and high-throughput sequencing. PLoS ONE 8, e54848 (2013).
7. Mayr, G. Paleogene Fossil Birds (Springer, 2009).
8. Mayr, G. Metaves, Mirandornithes, Strisores and other novelties – a critical review
of the higher-level phylogeny of neornithine birds. J. Zoological Syst. Evol. Res. 49,
58–76 (2011).
9. Graybeal, A. Is it better to add taxa or characters to a difficult phylogenetic
problem? Syst. Biol. 47, 9–17 (1998).
10. Heath, T. A., Hedtke, S. M. & Hillis, D. M. Taxon sampling and the accuracy of
phylogenetic analyses. Journal of Systematics and Evolution 46, 239–257 (2008).
11. Townsend, J. P. & Lopez-Giraldez, F. Optimal selection of gene and ingroup taxon
sampling for resolving phylogenetic relationships. Syst. Biol. 59, 446–457 (2010).
12. Lemmon, A. R., Emme, S. A. & Lemmon, E. M. Anchored hybrid enrichment for
massively high-throughput phylogenomics. Syst. Biol. 61, 727–744 (2012).
13. Lanfear, R., Calcott, B., Ho, S. Y. & Guindon, S. PartitionFinder: combined selection
of partitioning schemes and substitution models for phylogenetic analyses. Mol.
Biol. Evol. 29, 1695–1701 (2012).
14. Berv, J. S. & Prum, R. O. A comprehensive multilocus phylogeny of the neotropical
cotingas (Cotingidae, Aves) with a comparative evolutionary analysis of breeding
system and plumage dimorphism and a revised phylogenetic classification. Mol.
Phylogenet. Evol. 81, 120–136 (2014).
15. Townsend, J. P. Profiling phylogenetic informativeness. Syst. Biol. 56, 222–231
(2007).
16. Townsend, J. P., Su, Z. & Tekle, Y. I. Phylogenetic signal and noise: predicting the
power of a data set to resolve phylogeny. Syst. Biol. 61, 835–849 (2012).
17. Aberer, A. J., Kobert, K. & Stamatakis, A. ExaBayes: massively parallel Bayesian tree
inference for the whole-genome era. Mol. Biol. Evol. 31, 2553–2556 (2014).
18. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis
of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
19. Liu, L., Yu, L., Pearl, D. K. & Edwards, S. V. Estimating species phylogenies using
coalescence times among sequences. Syst. Biol. 58, 468–477 (2009).
20. Liu, L. & Yu, L. Estimating species trees from unrooted gene trees. Syst. Biol. 60,
661–667 (2011).
21. Mirarab, S. et al. ASTRAL: genome-scale coalescent-based species tree estimation.
Bioinformatics 30, i541–i548 (2014).
22. Tonini, J., Moore, A., Stern, D., Shcheglovitova, M. & Ortí, G. Concatenation and
species tree methods exhibit statistically indistinguishable accuracy under a
range of simulated conditions. PLOS Currents Tree of Life 1,
http://dx.doi.org/10.1371/currents.tol.34260cc27551a527b124ec5f6334b6be (2015).
23. Mirarab, S., Bayzid, M. S. & Warnow, T. Evaluating summary methods for multilocus
species tree estimation in the presence of incomplete lineage sorting. Syst.
Biol. http://dx.doi.org/10.1093/sysbio/syu063 (2014).
24. Barker, F. K., Cibois, A., Schikler, P., Felsenstein, J. & Cracraft, J. Phylogeny and
diversification of the largest avian radiation. Proc. Natl Acad. Sci. USA 101,
11040–11045 (2004).
25. Parham, J. F. et al. Best practices for justifying fossil calibrations. Syst. Biol. 61,
346–359 (2012).
26. Longrich, N. R., Tokaryk, T. & Field, D. J. Mass extinction of birds at the Cretaceous–
Paleogene (K–Pg) boundary. Proc. Natl Acad. Sci. USA 108, 15253–15257 (2011).
27. Feduccia, A. The Origin and Evolution of Birds 2nd edn (Yale Univ. Press, 1999).
28. Jetz, W., Thomas, G. H., Joy, J. B., Hartmann, K. & Mooers, A. O. The global diversity
of birds in space and time. Nature 491, 444–448 (2012).
29. Goldsmith, T. H. Hummingbirds see near ultraviolet light. Science 207, 786–788
(1980).
30. del Hoyo, J., Elliott, A., Sargatal, J., Christie, D. A. & de Juana, E. Handbook of the Birds
of the World Alive (Lynx Edicions, 2015).
Supplementary Information is available in the online version of the paper.
Acknowledgements The research was supported by W. R. Coe Funds from Yale University to R.O.P., and by NSF grants to A.R.L. and E.M.L. We thank the ornithology curators and staff of the following collections for granting research access to the invaluable avian tissue collections that made this work possible: American Museum of Natural History, Field Museum of Natural History, Royal Ontario Museum, University of Kansas Museum of Natural History and Biodiversity Research Center, University of Washington Burke Museum of Natural History, and Yale Peabody Museum of Natural History. We thank M. Kortyna and H. Ralicki for contributions to laboratory work, S. Gullapalli for computational assistance, and N. J. Carriero and R. D. Bjornson at the Yale University Biomedical High Performance Computing Center, which is supported by the NIH. Bird illustrations reproduced with permission from the Handbook of the Birds of the World Alive Online, Lynx Edicions, Barcelona³⁰. The research was aided by discussions with R. Bowie, S. Edwards, I. Lovette, J. Musser, T. Near, and K. Zyskowski.
Author Contributions R.O.P., J.S.B., A.R.L., and E.M.L. conceived of and designed the
study. R.O.P. selected the taxa studied. A.R.L. selected the loci and designed the probes.
J.S.B., A.R.L., and E.M.L. collected the data. J.S.B. and A.R.L. performed the phylogenetic
analyses. A.D. and J.P.T. performed the phylogenetic informativeness, and signal and
noise analyses. D.J.F. selected fossil taxa for calibration, and J.S.B., D.J.F., and A.D.
designed and performed the dating analyses. R.O.P. wrote the paper with contributions
from all other authors.
Author Information Electronic data files and software are permanently archived at
http://dx.doi.org/10.5281/zenodo.28343. Reprints and permissions information is
available at www.nature.com/reprints. The authors declare no competing financial
interests. Readers are welcome to comment on the online version of the paper.
Correspondence and requests for materials should be addressed to R.O.P.
(richard.prum@yale.edu) or J.S.B. (jsb439@cornell.edu).
METHODS
Locus selection and probe design. Anchor loci described in ref. 12 were extended such that each contained approximately 1,350 bp. In some cases neighbouring loci were joined to form a single locus. Also, loci that performed poorly in ref. 12 were removed from the locus set. This process produced 394 loci (referred to as the version 2 vertebrate loci). Genome coordinates corresponding to these regions in the Gallus gallus genome (galGal3, UCSC genome browser) were identified and sequences corresponding to this region were extracted (coordinates are available in the Zenodo archive (http://dx.doi.org/10.5281/zenodo.28343)). In order to improve the capture efficiency for passerines, we also obtained homologous sequences for Taeniopygia guttata. After aligning the Gallus and Taeniopygia sequences using MAFFT³¹, alignments were trimmed to produce the final probe region alignments (alignments available in the Zenodo archive), and probes were tiled at approximately 1.5× tiling density (probe specification will be made available upon publication).
Data collection. Data were collected following the general methods of ref. 12 through the Center for Anchored Phylogenomics at Florida State University (http://www.anchoredphylogeny.com). Briefly, each genomic DNA sample was sonicated to a fragment size of ~150–350 bp using a Covaris E220 focused-ultrasonicator with Covaris microTUBES. Subsequently, library preparation and indexing were performed on a Beckman-Coulter Biomek FXp liquid-handling robot following a protocol modified from ref. 32. One important modification is a size-selection step after blunt-end repair using SPRIselect beads (Beckman-Coulter; 0.9× ratio of bead to sample volume). Indexed samples were then pooled at equal quantities (typically 12–16 samples per pool), and enrichments were performed on each multi-sample pool using an Agilent Custom SureSelect kit (Agilent Technologies), designed as specified above. After enrichment, the 12 enrichment pools were pooled in groups of three in equal quantities for sequencing on four PE150 Illumina HiSeq2000 lanes (three enrichment pools per lane). Sequencing was performed in the Translational Science Laboratory in the College of Medicine at Florida State University.
Data processing. Paired-read merging (Merge.java). Typically, between 50% and
75% of sequenced library fragments had an insert size between 150 bp and 300 bp.
As 150 bp paired-end sequencing was performed, this means that the majority of
the paired reads overlap and thus should be merged before assembly. The over-
lapping reads were identified and merged following the methods of ref. 33. In
short, for each degree of overlap for each read we computed the probability of
obtaining the observed number of matches by chance, and selected the degree of
overlap that produced the lowest probability, with a P value less than 10^-10
required to merge reads. When reads are merged, mismatches are reconciled using
base-specific quality scores, which were combined to form the new quality scores
for the merged read (see ref. 33 for details). Reads failing to meet the probability
criterion were kept separate but still used in the assembly. The merging process
produces three files: one containing merged reads and two containing the
unmerged reads.
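
The overlap test described above can be illustrated with a minimal Python sketch. It is not the Merge.java implementation: the chance-match probability of 0.25 per base, the minimum overlap of 10 bases, and the function names are assumptions made for illustration, and quality-score reconciliation is omitted. The second read is assumed to be already reverse-complemented.

```python
from math import comb


def tail_prob(n, k, p=0.25):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more matching bases."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_overlap(read1, read2_rc, min_overlap=10, threshold=1e-10):
    """Return (overlap, p) for the overlap least likely to arise by chance.

    Returns None when no overlap reaches the threshold, in which case the
    reads would be kept separate.
    """
    best = None
    for ov in range(min_overlap, min(len(read1), len(read2_rc)) + 1):
        # Compare the 3' end of read 1 with the 5' end of read 2.
        matches = sum(a == b for a, b in zip(read1[-ov:], read2_rc[:ov]))
        p = tail_prob(ov, matches)
        if best is None or p < best[1]:
            best = (ov, p)
    if best is not None and best[1] < threshold:
        return best
    return None
```

With a uniform 0.25 match probability, a perfect 20-base overlap has a chance probability of 0.25^20 (about 9 × 10^-13), which passes the 10^-10 criterion, whereas a perfect 16-base overlap (about 2 × 10^-10) does not.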
Assembly (Assembler.java). The reads were assembled into contigs using an
assembler that makes use of both a divergent reference assembly approach to
map reads to the probe regions and a de novo assembly approach to extend the
assembly into the flanks. The reference assembler uses a library of spaced 20-mers
derived from the conserved sites of the alignments used during probe design. A
preliminary match was called if at least 17 of 20 positions matched between a spaced
k-mer and the corresponding positions in a read. Reads obtaining a preliminary
match were then compared to an appropriate reference sequence used for probe
design to determine the maximum number of matches out of 100 consecutive
bases (all possible gap-free alignments between the read and the reference were
considered). The read was considered mapped to the given locus if at least
55 matches were found. Once a read is mapped, an approximate alignment posi-
tion was estimated using the position of the spaced 20-mer, and all 60-mers
existing in the read were stored in a hash table used by the de novo assembler.
The de novo assembler identifies exact matches between a read and one of the 60-
mers found in the hash table. Simultaneously using the two levels of assembly
described above, the three read files were traversed repeatedly until an entire pass
through the reads produced no additional mapped reads.
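
As a rough illustration of the second mapping criterion (at least 55 matches within 100 consecutive bases over all gap-free alignments), the following Python sketch scans every offset between a read and a reference. It is not the Assembler.java code; the function names are hypothetical, and the preliminary spaced 20-mer screen and the de novo extension step are left out.

```python
def max_window_matches(read, ref, window=100):
    """Largest number of matches in any `window` consecutive aligned bases,
    taken over all gap-free alignments of `read` against `ref`."""
    best = 0
    for offset in range(-(len(read) - 1), len(ref)):
        # read[i] is aligned with ref[i + offset]; clip to the overlapping part.
        r_start = max(0, -offset)
        f_start = max(0, offset)
        n = min(len(read) - r_start, len(ref) - f_start)
        flags = [read[r_start + i] == ref[f_start + i] for i in range(n)]
        run = sum(flags[:window])
        best = max(best, run)
        for i in range(window, n):
            # Slide the window one base to the right.
            run += flags[i] - flags[i - window]
            best = max(best, run)
    return best


def is_mapped(read, ref, min_matches=55):
    """Apply the 55-of-100 mapping criterion described in the text."""
    return max_window_matches(read, ref) >= min_matches
```

The brute-force scan is quadratic in sequence length, which is acceptable only because the spaced 20-mer screen is assumed to have already reduced the set of candidate reads.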
For each locus, mapped reads were then grouped into clusters using 60-mer
pairs observed in the reads mapped to that locus. In short, a list of all 60-mers
found in the mapped reads was compiled, and the 60-mers were clustered if found
together in at least two reads. The 60-mer clusters were then used to separate the
reads into clusters for contig estimation. Relative alignment positions of reads
within each cluster were then refined in order to increase the agreement across
the reads. Up to one gap was also inserted per read if needed to improve the
alignment. Note that given sufficient coverage and an absence of contamination,
each single-copy locus should produce a single assembly cluster. Low coverage
(leading to a break in the assembly), contamination, and gene duplication can all
lead to an increased number of assembly clusters. A whole-genome duplication,
for example, would increase the number of clusters to two per locus.
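
The 60-mer clustering step can be sketched with a union-find structure, as below. This is an illustrative reconstruction under stated assumptions, not the authors' code: k-mers are linked when they co-occur in at least two reads, and each read is then assigned to the cluster that holds most of its k-mers (the assignment rule and all names are assumptions).

```python
from collections import Counter
from itertools import combinations


def cluster_reads(reads, k=60, min_reads=2):
    """Group read indices into clusters via k-mers that co-occur in reads."""
    kmers_per_read = [{r[i:i + k] for i in range(len(r) - k + 1)} for r in reads]

    # Count, for every pair of k-mers, the number of reads containing both.
    pair_counts = Counter()
    for kmers in kmers_per_read:
        pair_counts.update(combinations(sorted(kmers), 2))

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Link k-mers found together in at least `min_reads` reads.
    for (a, b), n in pair_counts.items():
        if n >= min_reads:
            parent[find(a)] = find(b)

    clusters = {}
    for idx, kmers in enumerate(kmers_per_read):
        roots = Counter(find(km) for km in kmers if km in parent)
        if roots:  # reads with no linked k-mer stay unassigned
            clusters.setdefault(roots.most_common(1)[0][0], []).append(idx)
    return sorted(clusters.values())
```

Counting all k-mer pairs within a read is quadratic in read length; a production assembler would presumably use a cheaper scheme, so this sketch is meant only to show the logic.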
Consensus bases were called from assembly clusters as follows. For each site an
unambiguous base was called if the bases present were identical or if the poly-
morphism of that site could be explained as sequencing error, assuming a binomial
probability model with the probability of error equal to 0.1 and alpha equal to 0.05.
If the polymorphism could not be explained as sequencing error, the ambiguous
base was called that corresponded to all of the observed bases at that site (for
example, ‘R’ was used if ‘A’ and ‘G’ were observed). Called bases were soft-masked
(made lowercase) for sites with coverage lower than five. A summary of the
assembly results is presented in a spreadsheet in the electronic data archive
(http://dx.doi.org/10.5281/zenodo.28343; Prum_AssemblySummary_Summary.xlsx).
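
A minimal Python sketch of the consensus-calling rule follows. The text does not specify exactly how the binomial test was applied, so the sketch assumes a one-sided test on the number of non-majority bases at a site; the function name and the restriction to the characters A, C, G and T are also assumptions.

```python
from collections import Counter
from math import comb

# IUPAC ambiguity codes keyed by the set of bases observed at a site.
IUPAC = {frozenset(k): v for k, v in {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N"}.items()}


def call_base(column, error=0.1, alpha=0.05, min_cov=5):
    """Call a consensus base from the bases observed at one site."""
    counts = Counter(column)
    n = len(column)
    major, major_n = counts.most_common(1)[0]
    minor_n = n - major_n
    # Probability of seeing at least `minor_n` erroneous bases among n reads.
    p = sum(comb(n, i) * error**i * (1 - error) ** (n - i)
            for i in range(minor_n, n + 1))
    # If errors can explain the polymorphism, keep the majority base;
    # otherwise report the ambiguity code covering all observed bases.
    base = major if p >= alpha else IUPAC[frozenset(counts)]
    # Soft-mask (lowercase) sites with coverage below `min_cov`.
    return base.lower() if n < min_cov else base
```

Under these assumptions, one discordant base among ten reads is attributed to error (P ≈ 0.65), whereas a 5:5 split of A and G (P ≈ 0.002) yields the ambiguity code R.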
Contamination filtering (IdentifyGoodSeqsViaReadsMapped.r,
GatherALLConSeqsWithOKCoverage.java). In order to filter out possible low-level
contaminants, consensus sequences derived from very low coverage assembly clusters
(<10 reads) were removed from further analysis. After filtering, consensus
sequences were grouped by locus (across individuals) in order to produce sets
of homologues.
Orthology (GetPairwiseDistanceMeasures.java, plotMDS5.r). Orthology was
then determined for each locus as follows. First, a pairwise distance measure
was computed for pairs of homologues. To compute the pairwise distance between
two sequences, we computed the percent of 20-mers observed in the two sequences
that were found in both sequences. Note that the list of 20-mers was constructed
from consecutive 20-mers as well as spaced 20-mers (every third base), in order to
allow increased levels of sequence divergence. Using the distance matrix, we
clustered the sequences using a neighbour-joining algorithm, but allowing at most
one sequence per species to be in a given cluster. Clusters containing fewer than
50% of the species were removed from downstream processing.
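
The pairwise distance used for orthology assessment can be sketched as follows. The text gives the shared-20-mer percentage but not the exact transformation into a distance, so this sketch assumes one minus the shared fraction; the function names and the tagging of consecutive versus spaced k-mers are illustrative choices.

```python
def kmer_set(seq, k=20, step=3):
    """Consecutive k-mers plus spaced k-mers sampling every `step`-th base."""
    consecutive = {("c", seq[i:i + k]) for i in range(len(seq) - k + 1)}
    span = (k - 1) * step + 1  # bases covered by one spaced k-mer
    spaced = {("s", seq[i:i + span:step]) for i in range(len(seq) - span + 1)}
    return consecutive | spaced


def kmer_distance(a, b, k=20):
    """1 - (k-mers found in both sequences) / (k-mers found in either)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 1.0  # sequences too short to compare
    return 1.0 - len(ka & kb) / len(union)
```

The resulting matrix would then be passed to a neighbour-joining clustering that admits at most one sequence per species per cluster, a step not shown here.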
Alignment (MAFFT). Sequences in each orthologous set were aligned using
MAFFT v7.023b (ref. 31) with the “--genafpair” and “--maxiterate 1000” flags.
Alignment Trimming (TrimAndMaskRawAlignments3). The alignment for
each locus was then trimmed/masked using the following procedure. First, each
alignment site was identified as ‘good’ if the most common character observed was
present in >40% of the sequences. Second, 20 bp regions of each sequence that
contained <10 good sites were masked. Third, sites with fewer than 12 unmasked
bases were removed from the alignment. Lastly, entire loci were removed if both
outgroups or more than 40 taxa were missing. This filter yielded 259 trimmed loci
containing fewer than 2.5% missing characters overall.
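
The first three trimming steps can be sketched in Python as below. Several details are not stated in the text and are assumed here: a position in a sequence counts as ‘good’ only when its column is good and the sequence carries the most common character; the 20 bp regions are treated as sliding windows; and masked bases are written as N. The locus-level filter on missing taxa is omitted.

```python
from collections import Counter


def trim_alignment(seqs, good_frac=0.40, window=20, min_good=10,
                   min_bases=12, mask="N"):
    """Mask poorly matching regions, then drop sparsely populated columns."""
    n, length = len(seqs), len(seqs[0])

    # Step 1: most common character per column, kept only if it is 'good'.
    consensus = []
    for col in zip(*seqs):
        counts = Counter(c for c in col if c not in "-N")
        base, cnt = counts.most_common(1)[0] if counts else (None, 0)
        consensus.append(base if cnt / n > good_frac else None)

    # Step 2: mask windows of each sequence with too few good positions.
    masked = []
    for s in seqs:
        good = [consensus[i] is not None and s[i] == consensus[i]
                for i in range(length)]
        out = list(s)
        for start in range(0, max(1, length - window + 1)):
            if sum(good[start:start + window]) < min_good:
                for i in range(start, min(length, start + window)):
                    if out[i] != "-":
                        out[i] = mask
        masked.append(out)

    # Step 3: keep columns with at least `min_bases` unmasked bases.
    keep = [i for i in range(length)
            if sum(row[i] not in "-N" for row in masked) >= min_bases]
    return ["".join(row[i] for i in keep) for row in masked]
```

In this sketch a sequence that disagrees with the majority across a whole window is masked in that window, while well-supported columns are retained for the remaining sequences.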
Model selection and phylogenetic inference. To minimize the overall model
complexity while accurately accounting for substitution processes, we performed
a partition-model sensitivity analysis with the development version of
PartitionFinder v2.0 (ref. 13), sensu ref. 14, and compared a complex partition-model
(one partition per gene) to a heuristically optimized (relaxed clustering with the
RAxML option for accelerated model selection) partition-model using BIC. Based
on a candidate pool of potential partitioning strategies that spanned a single
partition for the entire data set to a model allowing each locus to represent a
unique partition, the latter approach suggested that 75 partitions of our data set
represented the best-fitting partitioning scheme, which reduced the number of
necessary model parameters by 71% and greatly decreased computation time.
We analysed each individual locus in RAxML v8.0.20 (ref. 18), and then the
concatenated alignment, using the two partitioning strategies identified above
with both maximum likelihood and Bayesian based approaches in RAxML
v8.0.20, and ExaBayes v1.4.2 9 (ref. 34). For each RAxML analysis, we executed
100 rapid bootstrap inferences and thereafter a thorough ML search using a
GTR+Γ4 model of nucleotide substitution for each data set partition. Although
this may potentially over-parameterize a partition with respect to substitution
model, the influence of this form of model over-parameterization has been found
to be negligible in phylogenetic inference (ref. 35). For the Bayesian analyses, we ran four
Metropolis-coupled ExaBayes replicates for 10 million generations, each with
three heated chains, and sampling every 1,000 generations (default tuning and
branch swap parameters; branch lengths among partitions were linked).
Convergence and proper sampling of the posterior distribution of parameter
values were assessed by checking that the effective sample sizes of all estimated
parameters and branch lengths were greater than 200 in the Tracer v1.6 software (ref. 36;
most were greater than 1,000), and by using the ‘sdsf’ and ‘postProcParam’ tools
included with the ExaBayes package to ensure the average standard deviation of
split frequencies and potential scale reduction factors across runs were close to
zero and one, respectively. Finally, to check for convergence in topology and clade
posterior probabilities, we summarized a greedily refined majority-rule consensus
tree (default) from 10,000 post burn-in trees using the ExaBayes ‘consense’ tool for
each run independently and then together. Analyses of the reduced data set refer-
enced in the main text were conducted using the same partition-model as the full
data set.
RESEARCH LETTER
© 2015 Macmillan Publishers Limited. All rights reserved
To explore variation in gene tree topology and to look for outliers that might
influence combined analysis, we calculated pairwise Robinson–Foulds (RF; ref. 37) and
Matching Splits (MS) tree distances implemented in TreeCmp (ref. 38). We then
visualized histograms of tree distances and multidimensional scaling plots in R,
and estimated neighbour-joining ‘trees-of-trees’ in the Phangorn R package
sensu lato (refs 39, 40). Using RF and MS distances, outlier loci were identified as those
that occurred in the top 10% of pairwise distances for >30 comparisons to other
loci (~10%) in the data set. We also identified putative outlier loci using the
kdetrees.complete function of the kdetrees R package (ref. 41). All three methods
identified 13 of the same loci as potential outliers; however, removal of these loci from
the analysis had no effect on estimating topology or branch lengths.
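
The distance-based outlier rule can be expressed as a short Python sketch. It assumes a symmetric matrix of pairwise gene-tree distances (RF or MS) indexed by locus; the function name is hypothetical, and ties at the cutoff are counted as falling in the top fraction, which may differ from the authors' handling.

```python
def outlier_loci(dist, top_frac=0.10, min_hits=30):
    """Indices of loci whose distances fall in the top fraction of all
    pairwise distances for more than `min_hits` comparisons."""
    n = len(dist)
    pairs = sorted((dist[i][j] for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    n_top = max(1, int(len(pairs) * top_frac))
    cutoff = pairs[n_top - 1]  # smallest distance still in the top fraction
    hits = [sum(1 for j in range(n) if j != i and dist[i][j] >= cutoff)
            for i in range(n)]
    return [i for i, h in enumerate(hits) if h > min_hits]
```

With 259 loci each locus takes part in 258 comparisons, so the threshold of more than 30 corresponds to roughly 10% of a locus's comparisons, as noted in the text.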
Coalescent species tree analyses. Although fully parametric estimation (for
example, *BEAST, see ref. 42) of a coalescent species tree with hundreds of genes
and hundreds of taxa is not currently possible, we estimated species trees using
three gene-tree summation methods that have been shown to be statistically
consistent under the multispecies coalescent model (ref. 43). First, we used the
STRAW web server (ref. 44) to estimate bootstrapped species trees using the STAR (ref. 19)
and NJst (ref. 20) algorithms (also available through STRAW). The popular MP-EST (ref. 45)
method cannot currently work for more than ~50 taxa. STAR takes rooted
gene trees and uses the average ranks of coalescence times (ref. 19) to build a distance
matrix from which a species tree is computed with the neighbour-joining
method (ref. 46). By contrast, NJst applies the neighbour-joining method to a distance
matrix computed from average gene-tree internode distances, and relaxes the
requirement for input gene trees to be rooted (ref. 20).
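
To make the NJst distance concrete, the sketch below averages internode distances across gene trees. It is an illustration rather than the STRAW implementation: gene trees are assumed to be supplied as nested tuples of taxon names with a multifurcating top level (so that every tuple is a genuine internal node), the internode distance is taken as the number of internal nodes on the path between two tips, and the neighbour-joining step itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def leaf_paths(tree):
    """Map each tip to the tuple of internal-node ids from the top to its parent."""
    paths = {}
    next_id = [0]

    def walk(node, prefix):
        if isinstance(node, str):
            paths[node] = prefix
            return
        node_id = next_id[0]
        next_id[0] += 1
        for child in node:
            walk(child, prefix + (node_id,))

    walk(tree, ())
    return paths


def njst_distances(gene_trees):
    """Average number of internal nodes separating each pair of taxa."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tree in gene_trees:
        paths = leaf_paths(tree)
        for a, b in combinations(sorted(paths), 2):
            pa, pb = paths[a], paths[b]
            shared = 0
            for x, y in zip(pa, pb):
                if x != y:
                    break
                shared += 1
            # Nodes below the last shared ancestor on each side, plus that ancestor.
            totals[a, b] += len(pa) + len(pb) - 2 * shared + 1
            counts[a, b] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}
```

For the two gene trees ((A,B),C,D) and ((A,C),B,D), taxa A and B are separated by one internal node in the first tree and two in the second, giving an average distance of 1.5.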
We also summarized a species tree with the ASTRAL 4.7.6 algorithm. With
simulated data, ASTRAL has been shown to outperform concatenation or other
summary methods under certain amounts of incomplete lineage sorting (ref. 21). For
very large numbers of taxa and genes, ASTRAL uses a heuristic search to find the
species tree that agrees with the largest number of quartet trees induced by the set
of input gene trees. For analysis with ASTRAL, we also attempted to increase the
resolution of individual gene trees (Supplementary Fig. 2) by generating supergene
alignments using the weighted statistical binning pipeline of refs 47, 48 with a
bootstrap score of 0.75 as a bin threshold.
STAR, NJst (not shown), and the binned ASTRAL (Supplementary Fig. 3)
analysis produced virtually identical inferences when low support branches
(<0.75) were collapsed, and differed only with respect to the resolution of a few
branches. NJst resolved the Passeroidea (Fringilla plus Spizella) as the sister group
to a paraphyletic sample of Sylvioidea (Calandrella, Pycnonotus, and Sylvia), while
STAR does not resolve this branch. Comparing STAR/NJst to ASTRAL, we find
five additional differences: (1) within tinamous, STAR/NJst resolves Crypturellus
as sister to the rest of the tinamous, whereas ASTRAL resolves Crypturellus as
sister to Tinamus (similar to ExaBayes/RAxML); (2) STAR/NJst resolves pigeons
as sister to a clade containing Mesitornithiformes and Pteroclidiformes, while
ASTRAL does not resolve these relationships; (3) STAR/NJst fails to resolve
Oxyruncus and Myiobius as sister genera, while ASTRAL does (similar to
RAxML/ExaBayes); (4) in STAR/NJst, bee-eaters (Merops) are resolved as the
sister group to coraciiforms (congruent with ref. 4), while ASTRAL resolves
bee-eaters as sister to the rollers (Coracias) (similar to RAxML/ExaBayes);
(5) lastly, in STAR/NJst, buttonquail (Turnix) is resolved as sister to the most
inclusive clade of Charadriiformes not including Burhinus, Charadrius,
Haematopus, and Recurvirostra, while in ASTRAL, buttonquail is resolved as sister
to a clade containing Glareola, Uria, Rynchops, Sterna, and Chroicocephalus (sim-
ilar to RAxML/ExaBayes).
Although lower level relationships detected with concatenation are generally
recapitulated in the species trees, few of the higher level, or interordinal,
relationships are resolved. This lack of resolution of the gene-tree species-tree based
inferences relative to the inferences based on concatenation is not surprising,
as it is increasingly recognized that the phylogenetic information content
required to resolve the gene-tree histories of individual loci becomes scant at
deep timescales (ref. 47). Despite our extensive taxon sampling and the slow rate of
nucleotide substitution that characterizes loci captured using anchored
enrichment (ref. 12), no single locus was able to fully resolve a topology, and this lack of
information will challenge the accuracy of any coalescent-based summary
approach relative to concatenation (refs 49–54). Finally, all summation methods tested
here assume a priori that the only source of discordance among gene trees is deep
coalescence, and violations of this assumption may introduce systematic error in
phylogeny estimation (ref. 54).
Phylogenetic informativeness. Site-specific evolutionary rates, λ_{i…j}, were calculated
for each locus using the program HyPhy (ref. 55) in the PhyDesign web interface (ref. 56)
in conjunction with a guide chronogram generated by a nonparametric rate
smoothing algorithm (ref. 57) applied to our concatenated RAxML tree. Using these rates
to predict whether an alignment will yield correct, incorrect, or no resolution of a
given node, we quantified the probability of phylogenetically informative changes
(ψ; ref. 16) contributing to the resolution of the earliest divergences in Neoaves.
Estimates generated under a three character state model (ref. 58) reveal that the majority
of loci have a high value of ψ, and suggest a high potential for most loci and
partitions containing multiple loci (assigned by PartitionFinder) to correctly
resolve this internode. The potential for resolution as a consequence of phylogenetic
signal is therefore high relative to the potential for saturation and misleading
inference induced by stochastic changes along the subtending lineages
(Supplementary Fig. 4a).
To assess the information content of the loci across the entire topology, we
profiled their phylogenetic informativeness (PI; ref. 15) (Supplementary Fig. 4b). There
was considerable variation in PI across loci (Supplementary Fig. 4). In all cases, the
loci with the lowest values of ψ are characterized by substantially lower (60–90%)
values of PI, rather than sharp declines in their PI profiles. The absence of a sharp
decline in the PI profile suggests that a lack of phylogenetic information, rather
than rapid increases in homoplasious sites, underlies low values of the probability of
signal ψ (ref. 59).
Because declines in PI can be attributed to increases in homoplasious site
patterns (ref. 59), we further assessed the phylogenetic utility of data set partitions by
quantifying the ratio of PI at the most recent common ancestor of Neoaves to the
PI at the most recent common ancestor of Aves (Supplementary Fig. 4c). Values of
this ratio that are less than 1 correspond to a rise in PI towards the root. Values
close to 1 correspond to fairly uniform PI. Values greater than 1 correspond to a
decline in PI towards the root. Sixty-six out of 75 partitions demonstrated less than
a 50% decline in PI, and only six partitions demonstrated a decline of PI
greater than 75% (Supplementary Fig. 4c). As all but a few nodes in this study
represent divergences younger than the crown of Neoaves, these ratios of PI
suggest that the predicted impact of homoplasy on our topological inferences
should be minimal.
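
The interpretation of the PI ratio can be written out as a small Python sketch. The text does not define how the percent decline was computed from the ratio, so the sketch assumes it is the drop from the Neoaves node to the Aves node relative to the value at the Neoaves node; the tolerance used for "close to 1" and the function names are likewise assumptions.

```python
def pi_ratio(pi_neoaves, pi_aves):
    """PI at the Neoaves ancestor divided by PI at the Aves ancestor."""
    return pi_neoaves / pi_aves


def classify(ratio, tol=0.05):
    """Describe the trend in PI from the Neoaves node towards the root."""
    if ratio < 1 - tol:
        return "rise towards root"
    if ratio > 1 + tol:
        return "decline towards root"
    return "roughly uniform"


def percent_decline(ratio):
    """Assumed definition: (PI_Neoaves - PI_Aves) / PI_Neoaves, as a percentage."""
    return max(0.0, (1 - 1 / ratio) * 100)
```

Under this assumed definition, a ratio of 2 corresponds to a 50% decline and a ratio of 4 to a 75% decline in PI between the two nodes.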
As PI profiles do not directly predict the impact of homoplasious site patterns
on topological resolution (refs 16, 60), we evaluated probabilities of ψ for focal nodes using
both the concatenated data set as well as individual loci that span the variance in
locus lengths. Concordant with expectations from the PI profiles, all quantifications
strongly support the prediction that homoplasy will have a minimal impact
on topological resolution for the concatenated data set across a range of tree depths
and internode distances (ψ = 1.0 for all nodes), while individual loci vary in their
predicted utility (Supplementary Fig. 4d). As the guide tree does not represent a
true known tree, we additionally quantified ψ across a range of tree depths and
internode distances to test if our predictions of utility are in line with general
trends in the data. Concordant with our results above, the concatenated data set
is predicted to be of high phylogenetic utility at all timescales (ψ = 1.0 for all
nodes), while the utility of individual loci begins to decline for small internodes
at deep tree depths (Supplementary Fig. 5).
Estimating a time-calibrated phylogeny. We estimated a time-calibrated tree
with a node dating approach in BEAST 1.8.1 (ref. 42) that used 19 well justified
fossil calibrations phylogenetically placed by rigorous, apomorphy-based dia-
gnoses (see the descriptions of avian calibration fossils in the Supplementary
Information). We used a starting tree topology based on the ExaBayes inference
(Fig. 1), and prior node age calibrations that followed a lognormal parametric
distribution based on occurrences of fossil taxa. To prevent BEAST from exploring
topology space and only allow estimation of branch lengths, we turned off the
subtree-slide, Wilson–Balding, and narrow and wide exchange operators (refs 61, 62).
Finally, we applied a birth–death speciation model with default priors.
As rates of molecular evolution are significantly variable across certain bird
lineages (refs 63–65), we applied an uncorrelated relaxed clock (UCLN) to each partition
of the data set where rates among branches are distributed according to a lognormal
distribution (ref. 66). All dating analyses were performed without crocodilian
outgroups to reduce the potential of extreme substitution rate heterogeneity to bias
rate and consequent divergence time estimates of the UCLN model (ref. 67).
All calibrations were modelled using soft maximum age bounds to allow for the
potential of our data to overwhelm our user-specified priors (ref. 68). Soft maximum
bounds are the preferred method for assigning upper limits on the age of phylogenetic
divergences (ref. 69). As effective priors necessarily reflect interactions between
user-specified priors, topology, and the branching model, they may not precisely
reflect the user-specified priors (ref. 70). To correct for this potential source of error, we
carefully examined the effective calibration priors by first running the prepared
BEAST XML without any nucleotide data (until all ESS values were above 200).
We then iteratively adjusted our user-defined priors until all of the effective priors
(as examined in the Tracer software) reflected the intended calibration densities.
Finally, using the compare.phylo function in the Phyloch R package, we examined
how the inclusion of molecular data influenced the divergence time estimates
relative to the effective prior (Supplementary Fig. 9; see below).
Defining priors. Our initial approach was to set a prior’s offset to the age of its
associated fossil; the mean was then manually adjusted such that 95% of the
calibration density fell more recently than the K–Pg boundary at 65 Ma (million
years ago) (the standard deviation was fixed at 1 Ma). In general, priors
constructed this way generated calibration densities that specified their highest density
peak (their mode) about 3–5 million years older than the age of the offset.
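
The relationship between offset, mean and the 65 Ma soft bound can be illustrated with the following Python sketch of an offset lognormal density. The priors in this study were adjusted manually, so this closed-form version is only an approximation of that procedure, and it assumes that the fixed standard deviation acts as the log-scale parameter; under a different parameterization (for example, a mean specified in real space) the resulting numbers would differ. All function names are hypothetical.

```python
from math import exp, log
from statistics import NormalDist


def lognormal_mu(offset, soft_max=65.0, sigma=1.0, mass=0.95):
    """Log-scale mean placing `mass` of the density younger than `soft_max`."""
    z = NormalDist().inv_cdf(mass)
    return log(soft_max - offset) - sigma * z


def quantile(offset, mu, sigma, q):
    """Age (Ma) below which a fraction `q` of the calibration density lies."""
    return offset + exp(mu + sigma * NormalDist().inv_cdf(q))


def mode(offset, mu, sigma):
    """Age (Ma) of the highest-density peak of the offset lognormal."""
    return offset + exp(mu - sigma**2)
```

For a hypothetical fossil of 50 Ma, `lognormal_mu(50.0)` returns the log-scale mean for which `quantile(50.0, mu, 1.0, 0.95)` equals 65 Ma, and `mode(50.0, mu, 1.0)` gives the corresponding peak of the density, which always lies older than the offset.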
We applied a loose gamma prior to the node reflecting the most recent common
ancestor of crown birds—we used an offset of 60.5 Ma (the age of the oldest known
definitive, uncontroversial crown bird fossil; the stem penguin Waimanu), and
adjusted the scale and shape of the prior such that 97.5% of the calibration density
fell more recently than 86.5 Ma (ref. 71) (see below and Supplementary Information for
discussion of the >65 Ma putative crown avian Vegavis). This date (86.5 Ma)
reflects the upper bound age estimate of the Niobrara Formation—one of many
richly fossiliferous Mesozoic deposits exhibiting many crownward Mesozoic stem
birds, without any trace of avian crown group representatives. The Niobrara, in
particular, has produced hundreds of stem birds and other fragile skeletons,
without yielding a single crown bird fossil, and therefore represents a robust choice for
a soft upper bound for the root divergence of the avian crown (refs 71–73). Previous soft
maxima employed for this divergence have arbitrarily selected the age of other
Mesozoic stem avians (that is, Gansus yumenensis, 110 Ma) that are
phylogenetically stemward of the Niobrara taxa (ref. 28). Although the implementation of very
ancient soft maxima such as the age of Gansus is often done in the name of
conservatism, the extremely ancient divergence dates yielded by such analyses
illustrate the misleading influence of assigning soft maxima that are vastly too
old to be of relevance to the divergence of crown group birds (ref. 74). However, this
problem has been eliminated in some more recent analyses (ref. 75).
All of the fossil calibrations employed in our analysis represent neognaths;
rootward divergences within Aves (for example the divergence between
Palaeognathae and Neognathae, and Galloanserae and Neoaves) cannot be con-
fidently calibrated due to a present lack of fossils representing the palaeognath,
neognath, galloanserine, and neoavian stem groups. As such, the K–Pg soft bound
was only applied to comparatively apical divergences within neognaths. Although
the question of whether major neognath divergences occurred during the
Mesozoic has been the source of controversy (refs 76–78), renewed surveys of Mesozoic
sediments for definitive crown avians or even possible crown neoavians have been
unsuccessful (with the possible exception of Vegavis; see Supplementary
Information), and together with recent divergence dating analyses have cast doubt
on the presence of neoavian subclades before the K–Pg mass extinction (refs 1, 74, 79).
Further, recent work has demonstrated the tendency of avian divergence estimates
to greatly exceed uninformative priors, resulting in spuriously ancient divergence
dating results (for example, refs 28, 75, 76, 80). These results motivated our
implementation of the 65 Ma soft bound for our neoavian calibrations.
Contrary to expectation, when we compared the effective prior on the entire tree
to the final summary derived from the posterior distribution of divergence times
(Supplementary Fig. 9), we found no overall trend of posterior estimated ages post-
dating prior calibrations. In fact, the inclusion of our molecular data decreases the
inferred ages of almost all of the deepest nodes in our tree. A similar result has been
obtained for mammals by using large amounts of nuclear DNA sequences (ref. 81).
Future work investigating the interplay of the density of genomic sampling and
the application of various calibration age priors will be indispensable for sensitivity
analyses to help us further develop a robust timescale of avian evolution. However,
the pattern of posterior versus prior age estimates observed in our study raises the
prospect that the new class of data used in this study (that is, semi-conserved
anchor regions) may exhibit some immunity to longstanding problems associated
with inferring avian divergence times, such as systematically over-estimating the
antiquity of extant avian clades.
Implementing BEAST and summarizing a final calibrated tree. In addition to
making predictions about the phylogenetic utility of a locus or partition towards
topological resolution, PI profiles have recently also been used to mitigate the
influence of substitution saturation on divergence time estimates (ref. 82). Given the
variance in PI profile shapes for captured loci and their subsequent partition
assignments (Supplementary Fig. 4c), and observations that alignments and subsets
of data alignments characterized by high levels of homoplasy can mislead
branch length estimation (refs 83, 84), we limited our divergence time estimates to 36
partitions that did not exhibit a decline in informativeness towards the root of
the tree. We ran BEAST on each partition separately until parameter ESS values
were greater than 200 (most were greater than 1,000) to ensure adequate posterior
sampling of each parameter value. After concatenating 10,000 randomly sampled
post burn-in trees from each of these completed analyses, we summarized a final
MCC tree with median node heights in TreeAnnotator v1.8.1 (ref. 42).
Supplementary Fig. 6 shows the full, calibrated Bayesian tree (Fig. 1) with 95%
highest posterior density (HPD) intervals on the node ages, and Supplementary Fig. 7 shows the
distribution of estimated branching times, ranked by median age (using clade
numbers from Fig. 1). All computations were carried out on 64-core PowerEdge
M915 nodes on the Louise Linux cluster at the Yale University Biomedical High
Performance Computing Center.
Data reporting. No statistical methods were used to predetermine sample size.
31. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software
version 7: improvements in performance and usability. Mol. Biol. Evol. 30,
772–780 (2013).
32. Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly
multiplexed target capture and sequencing. Cold Spring Harb. Protoc.
http://dx.doi.org/10.1101/pdb.prot5448 (2010).
33. Rokyta, D. R., Lemmon, A. R., Margres, M. J. & Arnow, K. The venom-gland
transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus).
BMC Genomics 13, 312 (2012).
34. Misof, B. et al. Phylogenomics resolves the timing and pattern of insect evolution.
Science 346, 763–767 (2014).
35. Dornburg, A., Santini, F. & Alfaro, M. E. The influence of model averaging on clade
posteriors: an example using the triggerfishes (Family Balistidae). Syst. Biol. 57,
905–919 (2008).
36. Tracer. v1.6 http://beast.bio.ed.ac.uk/Tracer (2014).
37. Robinson, D. F. & Foulds, L. R. in Combinatorial Mathematics VI in Lecture Notes in
Mathematics, Vol. 748 (eds Horadam A. F. & Wallis W. D.) Ch. 12 119–126
(Springer, 1979).
38. Bogdanowicz, D., Giaro, K. & Wróbel, B. TreeCmp: comparison of trees in
polynomial time. Evol. Bioinform. 8, 475–487 (2012).
39. Nye, T. M. W. Trees of Trees: an approach to comparing multiple alternative
phylogenies. Syst. Biol. 57, 785–794 (2008).
40. Schliep, K. P. phangorn: phylogenetic analysis in R. Bioinformatics 27, 592–593
(2011).
41. Weyenberg, G., Huggins, P. M., Schardl, C. L., Howe, D. K. & Yoshida, R. KDETREES:
non-parametric estimation of phylogenetic tree distributions. Bioinformatics 30,
2280–2287 (2014).
42. Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. Bayesian phylogenetics with
BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012).
43. Rannala, B. & Yang, Z. Bayes estimation of species divergence times and ancestral
population sizes using DNA sequences from multiple loci. Genetics 164,
1645–1656 (2003).
44. Shaw, T. I., Ruan, Z., Glenn, T. C. & Liu, L. STRAW: species tree analysis web server.
Nucleic Acids Res. 41, W238–W241 (2013).
45. Liu, L., Yu, L. & Edwards, S. A maximum pseudo-likelihood approach for estimating
species trees under the coalescent model. BMC Evol. Biol. 10, 302 (2010).
46. Saitou, N. & Nei, M. The neighbor-joining method: a new method for
reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).
47. Mirarab, S., Bayzid, M. S., Boussau, B. & Warnow, T. Statistical binning enables an
accurate coalescent-based estimation of the avian tree. Science 346 (2014).
48. Mirarab, S., Bayzid, M. S. & Warnow, T. Evaluating summary methods for
multilocus species tree estimation in the presence of incomplete lineage sorting.
Syst. Biol. (2014).
49. Bayzid, M. S. & Warnow, T. Naive binning improves phylogenomic analyses.
Bioinformatics 29, 2277–2284 (2013).
50. DeGiorgio, M. & Degnan, J. H. Fast and consistent estimation of species trees using
supermatrix rooted triples. Mol. Biol. Evol. 27, 552–569 (2010).
51. Kimball, R. T., Wang, N., Heimer-McGinn, V., Ferguson, C. & Braun, E. L. Identifying
localized biases in large datasets: a case study using the avian tree of life. Mol.
Phylogenet. Evol. 69, 1021–1032 (2013).
52. McCormack, J. E. et al. A phylogeny of birds based on over 1,500 loci collected by
target enrichment and high-throughput sequencing. PLoS ONE 8, e54848 (2013).
53. Springer, M. S. & Gatesy, J. Land plant origins and coalescence confusion. Trends
Plant Sci. 19, 267–269 (2014).
54. Tonini, J., Moore, A., Stearn, D., Shcheglovitova, M. & Ortí, G. Concatenation and
species tree methods exhibit statistically indistinguishable accuracy under a
range of simulated conditions. PLOS Currents Tree of Life 1 (2015).
55. Pond, S. L. K. & Muse, S. V. in Statistical Methods in Molecular Evolution (ed. Nielsen,
R.) 125–181 (Springer, 2005).
56. López-Giráldez, F. & Townsend, J. P. PhyDesign: an online application for profiling
phylogenetic informativeness. BMC Evol. Biol. 11, 152 (2011).
57. Sanderson, M. A nonparametric approach to estimating divergence times in the
absence of rate constancy. Mol. Biol. Evol. 14, 1218 (1997).
58. Simmons, M. P., Carr, T. G. & O’Neill, K. Relative character-state space, amount of
potential phylogenetic information, and heterogeneity of nucleotide and amino
acid characters. Mol. Phylogenet. Evol. 32, 913–926 (2004).
59. Townsend, J. P. & Leuenberger, C. Taxon sampling and the optimal rates of
evolution for phylogenetic inference. Syst. Biol. 60, 358–365 (2011).
60. Klopfstein, S., Kropf, C. & Quicke, D. L. J. An evaluation of phylogenetic
informativeness profiles and the molecular phylogeny of Diplazontinae
(Hymenoptera, Ichneumonidae). Syst. Biol. 59, 226–241 (2010).
61. Drummond, A. J. & Bouckaert, R. R. Bayesian Evolutionary Analysis With BEAST
(Cambridge Univ. Press, 2015).
62. Hsiang, A. Y. et al. The origin of snakes: revealing the ecology, behavior, and
evolutionary history of early snakes using genomics, phenomics, and the fossil
record. BMC Evol. Biol. 15, 87 (2015).
63. Phillips, M. J., Gibb, G. C., Crimp, E. A. & Penny, D. Tinamous and moa flock
together: mitochondrial genome sequence analysis reveals independent losses of
flight among ratites. Syst. Biol. 59, 90–107 (2010).
64. Pereira, S. L. & Baker, A. J. A mitogenomic timescale for birds detects variable
phylogenetic rates of molecular evolution and refutes the standard molecular
clock. Mol. Biol. Evol. 23, 1731–1740 (2006).
65. Nam, K. et al. Molecular evolution of genes in avian genomes. Genome Biol. 11, R68
(2010).
66. Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics
and dating with confidence. PLoS Biol. 4, e88 (2006).
67. Dornburg, A. et al. Relaxed clocks and inferences of heterogeneous patterns of
nucleotide substitution and divergence time estimates across whales and
dolphins (Mammalia: Cetacea). Mol. Biol. Evol. 29, 721–736 (2012).
68. Yang, Z. & Rannala, B. Bayesian estimation of species divergence times under a
molecular clock using multiple fossil calibrations with soft bounds. Mol. Biol. Evol.
23, 212–226 (2006).
69. Ho, S. Y. W. Calibrating molecular estimates of substitution rates and divergence
times in birds. J. Avian Biol. 38, 409–414 (2007).
70. Heled, J. & Drummond, A. J. Calibrated tree priors for relaxed phylogenetics and
divergence time estimation. Syst. Biol. 61, 138–149 (2012).
71. Benton, M. J. & Donoghue, P. C. J. Paleontological evidence to date the tree of life.
Mol. Biol. Evol. 24, 26 (2007).
72. Clarke, J. A. Morphology, phylogenetic taxonomy, and systematics of Ichthyornis
and Apatornis (Avialae: Ornithurae). Bull. Am. Mus. Nat. Hist. 286, 1–179 (2004).
73. Field, D. J., LeBlanc, A., Gau, A. & Behlke, A. D. B. Pelagic neonatal fossils support
viviparity and precocial life history of Cretaceous mosasaurs. Palaeontology 58,
401–407 (2015).
74. Mayr, G. The age of the crown group of passerine birds and its evolutionary
significance – molecular calibrations versus the fossil record. Syst. Biodivers. 11,
7–13 (2013).
75. Jetz, W. et al. Global distribution and conservation of evolutionary distinctness in
birds. Curr. Biol. 24, 919–930 (2014).
76. Hedges, S. B., Parker, P. H., Sibley, C. G. & Kumar, S. Continental breakup and the
ordinal diversification of birds and mammals. Nature 381, 226–229 (1996).
77. Benton, M. J. Early origins of modern birds and mammals: molecules vs.
morphology. Bioessays 21, 1043–1051 (1999).
78. Hope, S. in Mesozoic Birds: Above the Heads of Dinosaurs (eds Chiappe L. M. &
Witmer L. M.) 339–388 (Univ. of California Press, 2002).
79. Longrich, N. R., Tokaryk, T. & Field, D. J. Mass extinction of birds at the Cretaceous–
Paleogene (K–Pg) boundary. Proc. Natl Acad. Sci. USA 108, 15253–15257 (2011).
80. Baker, A. J., Pereira, S. L. & Paton, T. A. Phylogenetic relationships and divergence
times of Charadriiformes genera: multigene evidence for the Cretaceous origin of
at least 14 clades of shorebirds. Biol. Lett. 3, 205–209 (2007).
81. dos Reis, M. et al. Phylogenomic datasets provide both precision and accuracy in
estimating the timescale of placental mammal phylogeny. Proc. R. Soc. B 279,
3491–3500 (2012).
82. Dornburg, A., Townsend, J. P., Friedman, M. & Near, T. J. Phylogenetic
informativeness reconciles ray-finned fish molecular divergence times. BMC Evol.
Biol. 14, 169 (2014).
83. Brandley, M. C. et al. Accommodating heterogenous rates of evolution in molecular
divergence dating methods: an example using intercontinental dispersal of
Plestiodon (Eumeces) lizards. Syst. Biol. 60, 3–15 (2011).
84. Phillips, M. J. Branch-length estimation bias misleads molecular dating for a
vertebrate mitochondrial phylogeny. Gene 441, 132–140 (2009).
... To determine how inertial characteristics vary during wing morphing, we developed a general analytical method to quantify any flying bird's center of gravity and I, and used a comparative analysis to investigate 22 species spanning the phylogeny defined by Prum et al. [116] except for Palaeognathae as this clade contains largely flightless birds. First, we measured geometric and mass properties of cadavers (Section 3.3.1) ...
... al. (Fig. 1.1c and Table 1.1) [116]. A phylogeny captures how closely related different species are, in a manner similar to a family tree. ...
Thesis
Uncrewed aerial vehicle (UAV) design has advanced substantially over the past century; however, there are still scenarios where birds outperform UAVs. Birds regularly maneuver through cluttered environments or adapt to sudden changes in flight conditions, tasks that challenge even the most advanced UAVs. Thus, there remains a gap in our general knowledge of flight maneuverability and adaptability that can be filled by improving our understanding of how birds achieve these desirable flight characteristics. Although maneuverability is difficult to quantify, one approach is to leverage an expected trade-off between stability and maneuverability, wherein a stable flyer must generate larger moments to maneuver than an unstable flyer. Birds' stability and adaptability have previously been associated with their ability to morph their wing shape in flight. Birds morph their wings by actuating their musculoskeletal system, including the shoulder, elbow and wrist joints. Thus, to take an important step towards deciphering avian flight stability and adaptability, I investigated how manipulating avian wing joints affects longitudinal stability and control characteristics. First, I used an open-source low-fidelity model to calculate the lift and pitching moment of a gull wing and body across the full range of flexion and extension of the elbow and wrist. To validate the model, I measured the forces and moments on nine 3D printed equivalent wing-body models mounted in a wind tunnel. With the validated numerical results, I identified that extending the wing using different combinations of elbow and wrist angles would provide a method for adaptive control of loads and static stability. However, I also found that gulls were unable to trim for the tested shoulder angle. Next, I developed an open-source, mechanics-based method (AvInertia) to calculate the inertial characteristics of 22 bird species across the full range of flexion and extension of the elbow and wrist. 
This method allowed a detailed investigation of how manipulating the elbow and wrist angle changed the center of gravity and moment of inertia tensor. Leveraging the previous aerodynamic results, I developed a method to estimate the neutral point of any bird wing configuration and derived a novel metric for pitch agility. With the neutral point and center of gravity, I found that the majority of investigated species had the ability to shift between stable and unstable flight. Further, I implemented an evolutionary analysis that revealed evidence of evolutionary pressures maintaining this capacity to shift, which transforms our understanding of avian flight evolution. Finally, I combined the aerodynamic and inertial results to investigate the dynamic stability of a gull across a range of shoulder, elbow, and wrist angles. This analysis revealed that a positive dihedral and forward-swept wing allowed a trimmed flight condition. For trimmed configurations, I found that high wrist angles were statically unstable and exhibited a non-oscillating, divergent response to disturbances. Lower wrist angles were both statically and dynamically stable and exhibited a short period and phugoid mode like traditional aircraft. I found that most trimmed configurations exhibited short period characteristics that would be flyable by a human pilot, although with a heavily damped phugoid mode. In summary, I found that the avian elbow and wrist joints can act as adaptive controls and permit birds to shift between stable and unstable flight. Identifying these characteristics provides a starting point for future UAV designs that hope to incorporate avian-like maneuverability and adaptability.
... This may have led to the evolution of daytime flowering in some plant groups that had previously mostly flowered at night, and this facilitated the evolution of diurnalism in hawkmoths. At the same time, the Indian Plate continued to extrude northward and continuously insert under the Asian plate, causing the gradual uplift of the Tibetan Plateau, while gradually producing the Tibet-Himalayan orogenic belt, so differentiation of diurnal hawkmoths may also have been influenced by the uplift of the Tibetan Plateau and the surrounding region [51][52][53]. Alternatively, the main natural predators of nocturnal hawkmoths are bats, which originated in the Eocene (33.9-56.0 million years ago) and evolved on a large scale in the Oligocene-Pre-Miocene (20.44-33.9 million years ago). ...
Article
In this study, the mitochondrial genomes of 22 species from three subfamilies in the Sphingidae were sequenced, assembled, and annotated. Eight diurnal hawkmoths were included, of which six were newly sequenced (Hemaris radians, Macroglossum bombylans, M. fritzei, M. pyrrhosticta, Neogurelca himachala, and Sataspes xylocoparis) and two were previously published (Cephonodes hylas and Macroglossum stellatarum). The mitochondrial genomes of these eight diurnal hawkmoths were comparatively analyzed in terms of sequence length, nucleotide composition, relative synonymous codon usage, non-synonymous/synonymous substitution ratio, gene spacing, and repeat sequences. The mitogenomes of the eight species, ranging in length from 15,201 to 15,461 bp, encode the complete set of 37 genes usually found in animal mitogenomes. The base composition of the mitochondrial genomes showed A+T bias. The most commonly used codons were UUA (Leu), AUU (Ile), UUU (Phe), AUA (Met), and AAU (Asn), whereas GCG (Ala) and CCG (Pro) were rarely used. A phylogenetic tree of Sphingidae was constructed based on both maximum likelihood and Bayesian methods. We verified the monophyly of the four current subfamilies of Sphingidae, all of which had high support. In addition, we performed divergence time estimation and ancestral character reconstruction analyses. Diurnal behavior in hawkmoths originated 29.19 million years ago (Mya). It may have been influenced by the combination of herbaceous flourishing, which occurred 26–28 Mya, the uplift of the Tibetan Plateau, and the large-scale evolution of bats in the Oligocene to Pre-Miocene. Moreover, diurnalism in hawkmoths had multiple independent origins in Sphingidae.
... occiputs, particularly when compared with the primarily terrestrial/arboreal landbird clade Inopinaves (figure 8) [46,47]. Both Enaliornis (a foot-propelled diver, [48,49]) and Cerebavis have been hypothesized to have exhibited water-linked ecologies, along with much of the avian stem lineage crownward of Enantiornithes [24], which could therefore underlie their lower degree of ventralization with respect to MPM-334-1. ...
Article
Among terrestrial vertebrates, only crown birds (Neornithes) rival mammals in terms of relative brain size and behavioural complexity. Relatedly, the anatomy of the avian central nervous system and associated sensory structures, such as the vestibular system of the inner ear, are highly modified with respect to those of other extant reptile lineages. However, a dearth of three-dimensional Mesozoic fossils has limited our knowledge of the origins of the distinctive endocranial structures of crown birds. Traits such as an expanded, flexed brain, a ventral connection between the brain and spinal column, and a modified vestibular system have been regarded as exclusive to Neornithes. Here, we demonstrate all of these 'advanced' traits in an undistorted braincase from an Upper Cretaceous enantiornithine bonebed in southeastern Brazil. Our discovery suggests that these crown bird-like endocranial traits may have originated prior to the split between Enantiornithes and the more crownward portion of avian phylogeny over 140 Ma, while coexisting with a remarkably plesiomorphic cranial base and posterior palate region. Altogether, our results support the interpretation that the distinctive endocranial morphologies of crown birds and their Mesozoic relatives are affected by complex trade-offs between spatial constraints during development.
... This is the first study that compares flower-visiting bird assemblages and their flower resources at two small urbanised sites in two almost antipodal countries that harbour two different bird groups specialised in exploiting nectar. Despite the differences in species composition and evolutionary origins (Barker et al., 2004; Prum et al., 2015), the two studied assemblages share several similarities. The same number of flower-visiting species (six) at each assemblage could be a coincidence or, more likely, due to constraints imposed by the highly man-modified landscapes, which usually support a small set of habitat-generalised species (Callaghan et al., 2019). ...
Article
Urbanised sites around the world harbour bird assemblages capable of tolerating human-induced environmental changes. Assemblages of flower-visiting birds at urbanised sites usually are composed of a limited number of habitat generalist species that favour open areas and edges, a tendency recorded for fruit or insect-eating species as well. We compared flower-visiting bird assemblages and their flower resources in two small urbanised sites in two almost antipodal countries: Brazil and Australia. The flower-visiting birds at the two study sites are composed of completely different families and species but have similar functional traits. Each study site has flower-visiting bird assemblages composed of six species, with predominance of hummingbirds in Brazil and honeyeaters in Australia. A large hummingbird in Brazil and a medium-sized honeyeater in Australia monopolise the nectar resources and aggressively expel other birds. Landscaping and gardening activities provide year-round nectar-producing flowers exploited by the birds. The two flower-visiting bird assemblages share several similarities despite their different species composition. At both sites the flower-visiting birds retain their ecological functions and deliver ecosystem services. Pollination and cultural services are the most prominent ones provided by the birds and their flowers. Such natural history-oriented comparisons help to understand the poorly known relationships between flower-visiting birds and their flowers in small urbanised areas.
... Indeed, the hippocampus of zebra finches (a non-food-hoarding species) exhibited fewer place-selective cells compared to the food-hoarding titmice (Payne et al., 2021). Even fewer may exist in galliformes, such as quails and chickens, which retained more ancestral traits compared to neoaves (Prum et al., 2015). This hypothesis is supported by our most recent paper (Morandi-Raikova and Mayer, 2022). ...
Article
In this review, we discuss the functional equivalence of the avian and mammalian hippocampus, based mostly on our own research in domestic chicks, which provide an important developmental model (most research on spatial cognition in other birds relies on adult animals). In birds, like in mammals, the hippocampus plays a central role in processing spatial information. However, the structure of this homolog area shows remarkable differences between birds and mammals. To understand the evolutionary origin of the neural mechanisms for spatial navigation, it is important to test how far theories developed for the mammalian hippocampus can also be applied to the avian hippocampal formation. To address this issue, we present a brief overview of studies carried out in domestic chicks, investigating the direct involvement of chicks’ hippocampus homolog in spatial navigation.
... Advances in next-generation sequencing technologies have facilitated examination of some of these complex hypotheses with substantial sequence data sampled across the genome (e.g., Chen et al. 2019; Meleshko et al. 2021). However, some phylogenetic problems have remained unresolved despite the use of large amounts of unlinked sequence data, for example, the early evolution of metazoans (Philippe et al. 2009; Simion et al. 2017; Pandey and Braun 2020), the root of placental mammals (McCormack et al. 2012; Song et al. 2012) and the early divergences of Neoaves (McCormack et al. 2013; Jarvis et al. 2014; Prum et al. 2015). ...
Article
Some phylogenetic problems remain unresolved even when large amounts of sequence data are analyzed and methods that accommodate processes such as incomplete lineage sorting are employed. In addition to investigating biological sources of phylogenetic incongruence, it is also important to reduce noise in the phylogenomic dataset by using an appropriate filtering approach that addresses gene tree estimation errors. We present the results of a case study in manakins, focusing on the very difficult clade comprising the genera Antilophia and Chiroxiphia. Previous studies suggest that Antilophia is nested within Chiroxiphia, though relationships among Antilophia+Chiroxiphia species have been highly unstable. We extracted more than 11,000 loci (ultra-conserved elements and introns) from whole genomes and conducted analyses using concatenation and multi-species coalescent methods. Topologies resulting from analyses using all loci differed depending on the data type and analytical method, with two clades (Antilophia+Chiroxiphia and Manacus+Pipra+Machaeropterus) in the manakin tree showing incongruent results. We hypothesized that gene trees that conflicted with a long coalescent branch (e.g., the branch uniting Antilophia+Chiroxiphia) might be enriched for cases of gene tree estimation error, so we conducted analyses that either constrained those gene trees to include monophyly of Antilophia+Chiroxiphia or excluded these loci. While constraining trees reduced some incongruence, excluding the trees led to completely congruent species trees, regardless of the data type or model of sequence evolution used. We found that a suite of gene metrics (most importantly the number of informative sites and likelihood of intralocus recombination) collectively explained the loci that resulted in non-monophyly of Antilophia+Chiroxiphia. 
We also found evidence for introgression that may have contributed to the discordant topologies we observe in Antilophia+Chiroxiphia and led to deviations from expectations given the multi-species coalescent model. Our study highlights the importance of identifying factors that can obscure phylogenetic signal when dealing with recalcitrant phylogenetic problems, such as gene tree estimation error, incomplete lineage sorting and reticulation events.
Article
Islands are natural laboratories for studying patterns and processes of evolution. Research on island endemic birds has revealed elevated speciation rates and rapid phenotypic evolution in several groups (e.g., white-eyes, Darwin's finches). However, understanding the evolutionary processes behind these patterns requires an understanding of how genotypes map to novel phenotypes. To date, there are few high-quality reference genomes for species found on islands. Here, we sequence the genome of one of Ernst Mayr's ‘great speciators’, the collared kingfisher (Todiramphus chloris collaris). Utilizing high molecular weight DNA and linked-read sequencing technology, we assembled a draft high-quality genome with highly contiguous scaffolds (scaffold N50 = 19 Mb). Based on universal single-copy orthologues (BUSCO), we estimated a gene space completeness of 96.6% for the draft genome assembly. Population demographic history analyses reveal a distinct pattern of contraction and expansion in population size throughout the Pleistocene. Comparative genomic analysis of gene family evolution revealed that species-specific and rapidly expanding gene families in the collared kingfisher (relative to other Coraciiformes) are mainly involved in the ErbB signaling pathway and focal adhesion. Todiramphus kingfishers are a species-rich group that has become a focus of speciation research. This draft genome will be a platform for future taxonomic, phylogeographic, and speciation research in the group. For example, target genes will enable testing of changes in sensory structures associated with changes in vision and taste genes across kingfishers.
Article
Notosuchia is a group of mostly terrestrial crocodyliforms. The presence of a prominent crest overhanging the acetabulum, slender straight-shafted long bones with muscular insertions close to the joints, and a stable knee joint suggests that they had an erect posture. This stance has been proposed to be linked to endothermy, because it is present in mammals and birds and contributes to the efficiency of their respiratory systems. However, a bone paleohistological study unexpectedly suggested that Notosuchia were ectothermic organisms. The thermophysiological status of Notosuchia deserves further analysis, because the methodology of the previous study can be improved. First, it was based on a relationship between red blood cell size and bone vascular canal diameter tested using 14 extant tetrapod species. Here we present evidence for this relationship using a more comprehensive sample of extant tetrapods (31 species). Moreover, contrary to previous results, bone cross-sectional area appears to be a significant explanatory variable (in addition to vascular canal diameter). Second, red blood cell size estimations were performed using phylogenetic eigenvector maps, and this method excludes a fraction of the phylogenetic information. This is because it generates a high number of eigenvectors requiring a selection procedure to compile a subset of them to avoid model overfitting. Here we inferred the thermophysiology of Notosuchia using phylogenetic logistic regressions, a method that overcomes this problem by including all of the phylogenetic information and a sample of 46 tetrapods. These analyses suggest that Araripesuchus wegeneri, Armadillosuchus arrudai, Baurusuchus sp., Iberosuchus macrodon, and Stratiotosuchus maxhechti were ectothermic organisms.
Article
The hyper‐diverse clade Passeriformes (crown group passerines) comprises over half of extant bird diversity, yet disproportionately few studies have targeted passerine comparative anatomy on a broad phylogenetic scale. This general lack of research attention hinders efforts to interpret the passerine fossil record and obscures patterns of morphological evolution across one of the most diverse clades of extant vertebrates. Numerous potentially important crown passeriform fossils have proven challenging to place phylogenetically, due in part to a paucity of phylogenetically informative characters from across the passerine skeleton. Here, we present a detailed analysis of the morphology of extant passerine carpometacarpi, which are relatively abundant components of the passerine fossil record. We sampled >70% of extant family‐level passerine clades (132 extant species) as well as several fossils from the Oligocene of Europe and scored them for 54 phylogenetically informative carpometacarpus characters optimised on a recently published phylogenomic scaffold. We document a considerable amount of previously undescribed morphological variation among passerine carpometacarpi, and, despite high levels of homoplasy, our results support the presence of representatives of both crown Passeri and crown Tyranni in Europe during the Oligocene. Here, we bolster knowledge of passerine evolution by presenting a comparative framework for a morphologically variable, functionally important and frequently‐fossilised skeletal component of the wing, the carpometacarpus. We show that some of the earliest crown passerines, including the enigmatic and controversial Wieslochia weissi, were crown‐group suboscines—a diverse radiation of extant passerines completely absent from Europe in the present day. Our study constitutes an important first step towards illuminating passerine skeletal evolution by revealing that widespread homoplasy characterises the passerine carpometacarpus.
Article
Background The highly derived morphology and astounding diversity of snakes has long inspired debate regarding the ecological and evolutionary origin of both the snake total-group (Pan-Serpentes) and crown snakes (Serpentes). Although speculation abounds on the ecology, behavior, and provenance of the earliest snakes, a rigorous, clade-wide analysis of snake origins has yet to be attempted, in part due to a dearth of adequate paleontological data on early stem snakes. Here, we present the first comprehensive analytical reconstruction of the ancestor of crown snakes and the ancestor of the snake total-group, as inferred using multiple methods of ancestral state reconstruction. We use a combined-data approach that includes new information from the fossil record on extinct crown snakes, new data on the anatomy of the stem snakes Najash rionegrina, Dinilysia patagonica, and Coniophis precedens, and a deeper understanding of the distribution of phenotypic apomorphies among the major clades of fossil and Recent snakes. Additionally, we infer time-calibrated phylogenies using both new ‘tip-dating’ and traditional node-based approaches, providing new insights on temporal patterns in the early evolutionary history of snakes. Results Comprehensive ancestral state reconstructions reveal that both the ancestor of crown snakes and the ancestor of total-group snakes were nocturnal, widely foraging, non-constricting stealth hunters. They likely consumed soft-bodied vertebrate and invertebrate prey that was subequal to head size, and occupied terrestrial settings in warm, well-watered, and well-vegetated environments. The snake total-group – approximated by the Coniophis node – is inferred to have originated on land during the middle Early Cretaceous (~128.5 Ma), with the crown-group following about 20 million years later, during the Albian stage. 
Our inferred divergence dates provide strong evidence for a major radiation of henophidian snake diversity in the wake of the Cretaceous-Paleogene (K-Pg) mass extinction, clarifying the pattern and timing of the extant snake radiation. Although the snake crown-group most likely arose on the supercontinent of Gondwana, our results suggest the possibility that the snake total-group originated on Laurasia. Conclusions Our study provides new insights into when, where, and how snakes originated, and presents the most complete picture of the early evolution of snakes to date. More broadly, we demonstrate the striking influence of including fossils and phenotypic data in combined analyses aimed at both phylogenetic topology inference and ancestral state reconstruction. Electronic supplementary material The online version of this article (doi:10.1186/s12862-015-0358-5) contains supplementary material, which is available to authorized users.
Article
When a phylogenetic reconstruction does not result in one tree but in several, tree metrics make it possible to find out how far the reconstructed trees are from one another. They also make it possible to assess the accuracy of a reconstruction if a true tree is known. TreeCmp implements eight metrics that can be calculated in polynomial time for arbitrary (not only bifurcating) trees: four for unrooted (Matching Split metric, which we have recently proposed, Robinson-Foulds, Path Difference, Quartet) and four for rooted trees (Matching Cluster, Robinson-Foulds cluster, Nodal Splitted and Triple). TreeCmp is the first implementation of Matching Split/Cluster metrics and the first efficient and convenient implementation of Nodal Splitted. It allows relatively large trees to be compared. We provide an example of the application of TreeCmp to compare the accuracy of ten approaches to phylogenetic reconstruction with trees up to 5000 external nodes, using a measure of accuracy based on normalized similarity between trees.
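To make one of the unrooted metrics named above concrete, the following is a minimal Python sketch of the Robinson-Foulds distance, computed as the number of non-trivial splits found in one tree but not the other. It is an illustration only and is not TreeCmp's implementation: the nested-tuple tree encoding, the function names (`leaves`, `splits`, `robinson_foulds`) and the example trees are assumptions introduced here.

```python
# Illustrative sketch only (not TreeCmp code). Trees are nested tuples whose
# leaves are taxon-name strings; this encoding is an assumption of the sketch.

def leaves(tree):
    """Return the set of taxon names under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions induced by the internal edges."""
    all_taxa = leaves(tree)
    anchor = min(all_taxa)  # fixed taxon used to give each split one canonical side
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        # Keep only splits with at least two taxa on each side.
        if 1 < len(clade) < len(all_taxa) - 1:
            # Store the side that does not contain the anchor taxon, so the
            # same split is recorded identically however the tree is rooted.
            found.add(all_taxa - clade if anchor in clade else clade)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Count the splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must be defined on the same taxon set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon example: the trees share the split DE|ABC but
# disagree on whether A groups with B or with C.
tree_a = (("A", "B"), "C", ("D", "E"))
tree_b = (("A", "C"), "B", ("D", "E"))
print(robinson_foulds(tree_a, tree_b))  # 2
```

Because each split is stored from the side that excludes a fixed anchor taxon, the result does not depend on where either tree happens to be rooted, which matches the unrooted definition of the metric.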
Article
Phylogeneticists have long understood that several biological processes can cause a gene tree to disagree with its species tree. In recent years, molecular phylogeneticists have increasingly foregone traditional supermatrix approaches in favor of species tree methods that account for one such source of error, incomplete lineage sorting (ILS). While gene tree-species tree discordance no doubt poses a significant challenge to phylogenetic inference with molecular data, researchers have only recently begun to systematically evaluate the relative accuracy of traditional and ILS-sensitive methods. Here, we report on simulations demonstrating that concatenation can perform as well as or better than methods that attempt to account for sources of error introduced by ILS. Based on these and similar results from other researchers, we argue that concatenation remains a useful component of the phylogeneticist's toolbox and highlight that phylogeneticists should continue to make explicit comparisons of results produced by contemporaneous and classical methods.
Article
Many evolutionary studies of birds rely on the estimation of molecular divergence times and substitution rates. In order to perform such analyses, it is necessary to incorporate some form of calibration information: a known substitution rate, radiometric ages of heterochronous sequences, or inferred ages of lineage splitting events. All three of these techniques have been employed in avian molecular studies, but their usage has not been entirely satisfactory. For example, the 'traditional' avian mitochondrial substitution rate of 2% per million years is frequently adopted without acknowledgement of the associated uncertainty. Similarly, fossil and biogeographic information is almost always converted into an errorless calibration point. In both cases, the resulting estimates of divergence times and substitution rates will be artificially precise, which has a considerable impact on hypothesis testing. In addition, using such a simplistic approach to calibration discards much of the information offered by the fossil record. A number of more sophisticated calibration methods have recently been introduced, culminating in the development of probability distribution-based calibrations. In this article, I discuss the use of this new class of methods and offer guidelines for choosing a calibration technique.
Article
A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
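The clustering principle summarised in this abstract can be sketched in a few lines of Python. The version below is a minimal sketch, not the authors' implementation: it uses the commonly presented Q-matrix form of the pair-selection criterion, and the function name `neighbor_joining`, the internal node labels (`node0`, `node1`, ...) and the five-taxon distance matrix are assumptions chosen for illustration.

```python
# Illustrative sketch of neighbor-joining (Q-matrix formulation).
from itertools import combinations


def neighbor_joining(labels, matrix):
    """Return the unrooted tree as a list of (node, node, branch_length) edges.

    `labels` must be unique taxon names; `matrix` is a symmetric distance
    matrix given in the same order as `labels`.
    """
    # Working copy of the distances as a dict of dicts keyed by node label.
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(labels) if b != a}
         for i, a in enumerate(labels)}
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        r = {a: sum(d[a].values()) for a in d}  # row sums
        # Pick the pair minimising Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j).
        i, j = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = f"node{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        # Branch lengths from the joined pair to the new node.
        len_i = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        edges += [(i, u, len_i), (j, u, len_j)]
        # Distances from the new node to every remaining node.
        new_row = {k: 0.5 * (d[i][k] + d[j][k] - d[i][j])
                   for k in d if k not in (i, j)}
        del d[i], d[j]
        for k in d:
            del d[k][i], d[k][j]
            d[k][u] = new_row[k]
        d[u] = new_row
    a, b = d  # the two nodes left are joined by the final edge
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical additive distances among five taxa.
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for left, right, length in neighbor_joining(taxa, distances):
    print(left, right, length)
```

Each pass removes two nodes and adds one, so the loop runs n - 2 times for n taxa. Recomputing all pairwise Q values on every pass is the simplest choice and is adequate for small examples; it is not tuned for large matrices.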
Article
Mosasaurs were large marine squamates that inhabited all of the world's oceans during the Late Cretaceous. Their success as apex predators has been attributed to their rapid acquisition of aquatic adaptations, which allowed them to become fully pelagic. However, little is known about the breeding biology of derived, flipper-bearing mosasaurs, as the record of neonatal mosasaur fossils is extremely sparse. Here, we report on the fragmentary cranial remains of two neonatal mosasaurs from the Niobrara Formation, referred to Clidastes sp. Comparison with other preliminary reports of neonatal mosasaurs reveals that these specimens are among the smallest individuals ever found and certainly represent the smallest known Clidastes specimens. The recovery of these extremely young specimens from a pelagic setting indicates that even neonatal mosasaurs occupied open oceanic habitats and were likely born in this setting. These data shed new light on the ecology of neonatal mosasaurs and illustrate the degree to which size-related taphonomic and collection biases have influenced our understanding of the early life history of these iconic marine reptiles.