Phylogenetic diversity measures based on Hill numbers

Anne Chao (1,*), Chun-Huo Chiu (1,2) and Lou Jost (3)
(1) Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan 30043
(2) Institute of Statistics, National Chiao Tung University, Hsin-Chu, Taiwan 30043
(3) Via a Runtun, Baños, Tungurahua Province, Ecuador
We propose a parametric class of phylogenetic diversity (PD) measures that are sensitive to both species abundance and species taxonomic or phylogenetic distances. This work extends the conventional parametric species-neutral approach (based on ‘effective number of species’ or Hill numbers) to take into account species relatedness, and also generalizes the traditional phylogenetic approach (based on ‘total phylogenetic length’) to incorporate species abundances. The proposed measure quantifies ‘the mean effective number of species’ over any time interval of interest, or the ‘effective number of maximally distinct lineages’ over that time interval. The product of the measure and the interval length quantifies the ‘branch diversity’ of the phylogenetic tree during that interval. The new measures generalize and unify many existing measures and lead to a natural definition of taxonomic diversity as a special case. The replication principle (or doubling property), an important requirement for species-neutral diversity, is generalized to PD. The widely used Rao’s quadratic entropy and the phylogenetic entropy do not satisfy this essential property, but a simple transformation converts each to our measures, which do satisfy the property. The proposed approach is applied to forest data for interpreting the effects of thinning.
Keywords: doubling property; Hill numbers; phylogenetic diversity; replication principle;
species-neutral diversity; taxonomic diversity
‘We are all blind men (and women) trying to describe a monstrous elephant of ecological and evolutionary diversity...’
(Nanney 2004, p. 721)
‘Phylogenetic measures are better indicators of conservation worth than species richness, and measures using branch-lengths are better than procedures relying solely on topology...’
(Crozier 1997, p. 243)
1. INTRODUCTION
An enormous number of diversity measures have been proposed, not only in ecology but also in genetics, economics, information science, linguistics, physics and social sciences, among others (e.g. Pielou 1975; Magurran 2004). Until recently, most of these measures were species-neutral, treating all species as if they were equally distinct. However, as Pielou (1975, p. 17) was the first to notice, the concept of diversity could be broadened to consider taxonomic differences between species. Many biologists (see Vellend et al. 2010 for a review) have recognized that, all else being equal, an assemblage of phylogenetically divergent species (say, eagle, magpie and dunlin) is in an important sense more diverse than an assemblage consisting of closely related species (magpie, blue magpie and tree pie). Since there was never real agreement among biologists about the simpler base concept of species-neutral diversity (e.g. Hulbert 1971; Routledge 1979; Patil & Taillie 1982; Purvis & Hector 2000; Jost 2007; Jost et al. 2010), and since there is even less agreement about how to incorporate phylogenetic differentiation (e.g. Crozier 1997; Faith 2002; Cavender-Bares et al. 2009; Pavoine et al. 2009; Vellend et al. 2010), we now face a hyperdiverse and rapidly increasing assemblage of non-neutral diversity measures.
We show below that most of these non-neutral measures lack an essential mathematical property implicit in biological reasoning about diversity. Conclusions based on these measures will often be invalid, especially in conservation applications (Hardy & Jost 2008). We derive a general class of measures that take into account both species abundances and species phylogenetic differences, and that possess all the mathematical properties implicit in standard biological reasoning about diversity. The new measures behave more intuitively than previous measures. Many of the previous measures can be transformed into this new class.
*Author for correspondence (chao@stat.nthu.edu.tw).
Dedicated to the memory of Ross Crozier, a long-time friend of
Anne Chao.
Electronic supplementary material is available at http://dx.doi.org/10.1098/rstb.2010.0272 or via http://rstb.royalsocietypublishing.org.
One contribution of 16 to a Discussion Meeting Issue ‘Biological diversity in a changing world’.
Phil. Trans. R. Soc. B (2010) 365, 3599–3609
doi:10.1098/rstb.2010.0272
This journal is © 2010 The Royal Society
2. PREVIOUS NON-NEUTRAL MEASURES
Most of the non-neutral measures that have been proposed are generalizations of the classic species-neutral ecological diversity measures: species richness, the Shannon entropy and the Gini–Simpson index. The pioneering work of Vane-Wright et al. (1991) generalized species richness to take into account cladistic diversity (CD), based on the total nodes in a taxonomic tree. Subsequent important work was done by Faith (1992, 1994), Crozier (1992, 1997), Weitzman (1992, 1998) and Warwick & Clarke (1995). Faith (1992) defined the phylogenetic diversity (PD) as the sum of the branch lengths of the phylogeny connecting all species in the target community. This concept of PD is essentially a measure of the total amount of evolutionary history embodied in an assemblage since the time of the most recent common ancestor of the assemblage. The branch lengths may be proportional to time of divergence, or they may be proportional to the number of base changes in a given gene, or they may use some other measure of change. If the branch lengths are proportional to divergence time, all branch tips are the same distance from the tree base (the first node). Such trees are called ‘ultrametric trees’ and have particularly simple mathematical properties.
These generalizations of species richness do not take into account species relative abundances, because nearly all early studies were based on a coarse spatial scale, and data were mostly collected from museum specimen records; thus, relative abundances could not be reliably estimated. These measures are still useful for many conservation purposes or in cases where species abundances are difficult to count, such as micro-organisms or clumped plants. However, species abundances, if available, provide a more complete description of the ecosystem, and it seems reasonable from the perspective of community ecology to weigh a lineage by the numerical importance of its descendants. There is also a strong practical motivation for using measures that weigh species by their abundance. In many ecosystems, most species are rare, and unreasonable or impossible effort is required to detect them all. Species richness is therefore very difficult to estimate reliably. By contrast, most abundance-based diversity measures can be reliably estimated from small samples.
Diversity measures combining both phylogeny and abundances have been proposed in the literature (Rao 1982; Solow et al. 1993; Solow & Polasky 1994; Warwick & Clarke 1995; Izsák & Papp 2000; Webb 2000; Ricotta & Szeidl 2006, 2009; Weikard et al. 2006; Hardy & Senterre 2007; Hardy & Jost 2008; Allen et al. 2009; Pavoine et al. 2009; Cadotte et al. 2010). Rao’s quadratic entropy (Rao 1982), a generalization of the Gini–Simpson index, is the most well-developed of these. When pairwise differences between species are specified, Rao’s $Q$ gives the mean phylogenetic distance between any two randomly chosen individuals in the community:

$$Q = \sum_{i,j} d_{ij}\, p_i p_j, \qquad (2.1)$$

where $d_{ij}$ denotes the phylogenetic distance between species $i$ and $j$, and $p_i$ and $p_j$ denote the relative abundances of species $i$ and $j$. Ricotta & Szeidl (2009) proposed a transformation of $Q$ if distances are normalized to the range of [0, 1]:

$$\hat{Q} = \frac{1}{1 - Q}. \qquad (2.2)$$

The advantages of this transformation will become clear in the following sections.
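As a minimal sketch of equations (2.1) and (2.2), the following Python fragment computes Rao's $Q$ and the Ricotta & Szeidl transformation. The function names and the three-species distance matrix are hypothetical, chosen only for illustration; they are not data from this paper.

```python
def rao_q(dist, p):
    """Rao's quadratic entropy Q = sum_{i,j} d_ij * p_i * p_j (equation 2.1)."""
    n = len(p)
    return sum(dist[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


def ricotta_szeidl(q_value):
    """Transformation 1 / (1 - Q) of equation (2.2); assumes distances in [0, 1]."""
    return 1.0 / (1.0 - q_value)


# Hypothetical assemblage: three equally abundant species,
# every pair separated by the maximal normalized distance 1.
dist = [[0.0, 1.0, 1.0],
        [1.0, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
p = [1 / 3, 1 / 3, 1 / 3]

q_value = rao_q(dist, p)         # six off-diagonal terms of 1/9 each
q_hat = ricotta_szeidl(q_value)  # transformed value
print(q_value, q_hat)
```

For this illustrative assemblage $Q = 2/3$ and the transformed value is 3, the number of species, whereas $Q$ itself stays below 1 however many maximally distant species are added.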
Allen et al. (2009) generalized the Shannon entropy to take into account phylogenetic differences. For a rooted tree, their phylogenetic entropy $H_p$ is

$$H_p = -\sum_i L_i a_i \log a_i, \qquad (2.3)$$

where the summation is over all branches, $L_i$ is the length of branch $i$, and $a_i$ denotes the abundance descending from branch $i$. This measure includes the Shannon entropy as a special case.
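Equation (2.3) can be sketched in a few lines of Python. The representation of a tree as a list of (branch length, descending abundance) pairs is an assumption made for this illustration, as is the choice of a star tree with unit-length branches to show how the Shannon entropy arises as a special case.

```python
import math


def phylogenetic_entropy(branches):
    """H_p = -sum_i L_i * a_i * log(a_i) over (length, abundance) pairs (equation 2.3)."""
    return -sum(L * a * math.log(a) for L, a in branches if a > 0)


def shannon_entropy(p):
    """Classical Shannon entropy -sum_i p_i * log(p_i)."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Hypothetical relative abundances for three species.
p = [0.5, 0.3, 0.2]

# Star tree: one unit-length branch per species and no shared internal branches.
star_tree = [(1.0, x) for x in p]

h_p = phylogenetic_entropy(star_tree)
h = shannon_entropy(p)
print(h_p, h)
```

With every branch of unit length and no internal branches, each term $L_i a_i \log a_i$ reduces to $p_i \log p_i$, so the two quantities coincide.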
For ultrametric trees, Pavoine et al. (2009) integrated Faith’s PD, Allen et al.’s $H_p$ and Rao’s $Q$ into a parametric class of measures called $I_q$. The parameter $q$ corresponds to the order in the Tsallis (1988) generalized entropy. The three named measures correspond to the orders $q = 0$, 1 and 2: $I_0$ = Faith’s PD minus the tree height, $I_1 = H_p$ and $I_2 = Q$.
3. THE REPLICATION PRINCIPLE
While biologists have traditionally used the Shannon entropy and the Gini–Simpson index to quantify diversity, this practice is inconsistent with their own rules of inference about diversity. For example, users of these measures often judge the compositional similarity of two or more groups by taking the ratio of mean within-group diversity to total (pooled) diversity. If the within-group diversity is close to the total diversity, biologists infer that the groups are similar in composition. Yet, when used with the Shannon entropy or the Gini–Simpson index, this ratio does not directly reflect compositional similarity. When within-group diversity is high, the ratio approaches unity, supposedly indicating that the groups are nearly identical in composition, even if the groups are in fact completely distinct (no shared species).
Conservation biologists use diversity measures to judge the impact of human activities or to design conservation strategies. Yet, the Shannon entropy and the Gini–Simpson index can be very misleading when judging human impacts, and are logically self-contradictory when used to assess conservation plans (Jost 2009), because of their nonlinearity with respect to increasing diversity. We conclude that these measures, in spite of their popularity, do not capture biologists’ notions of diversity. The forms of reasoning that biologists apply to diversity lead to invalid conclusions when used with these measures. These common forms of reasoning about diversity implicitly assume that diversity obeys the ‘replication principle’. The replication principle for species-neutral diversity states that if we have $N$ equally large, equally diverse groups with no species in common, the diversity of the pooled groups must be $N$ times the diversity of a single group. Many authors (MacArthur 1965, 1972; Whittaker 1972; Peet 1974; Routledge 1979; Jost 2006, 2007, 2009; Jost et al. 2010) have shown that only diversity measures that satisfy this replication principle or ‘doubling property’ (Hill 1973; Jost 2007, 2008, 2009; Ricotta & Szeidl 2009) give mathematically, logically and intuitively correct results. The replication principle is best known in economics, where it has long been recognized as an important property of concentration and diversity measures (Hannah & Kay 1977).
To see the importance of this property, compare the behaviour of the Gini–Simpson index $1 - \sum_i p_i^2$ with that of the inverse Simpson concentration $1/\sum_i p_i^2$, which does obey the replication principle. Consider an archipelago of 20 equally large, equally diverse islands, each with completely distinctive tree floras. There are no shared species among islands. Assume the tree floras have the frequency distributions of the trees of Barro Colorado Island, Panama. To measure the compositional similarity among the islands, ecologists will take the mean diversity of the islands (a Gini–Simpson index of 0.95) and divide it by the diversity of the archipelago as a whole (0.998). For this example, the ratio is 0.95/0.998 = 0.95, near unity, supposedly indicating that the islands are nearly identical in composition, even though the islands are actually completely distinct (no shared species). This ratio does not reflect compositional similarity. Doing the same with the inverse Simpson concentration gives a ratio of 20.3/406 = 1/20, the smallest possible value for a set of 20 equally large islands, correctly showing that they are completely distinct in composition. The same problems apply to the Shannon entropy, and are resolved by using the exponential of Shannon entropy.
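The archipelago argument can be reproduced numerically. The sketch below does not use the Barro Colorado Island frequencies, which are not given here; as a stated assumption it substitutes a hypothetical island of 20 equally abundant species, which happens to give the same within-island Gini–Simpson value of 0.95.

```python
def gini_simpson(p):
    """Gini-Simpson index 1 - sum_i p_i^2."""
    return 1.0 - sum(x * x for x in p)


def inverse_simpson(p):
    """Inverse Simpson concentration 1 / sum_i p_i^2."""
    return 1.0 / sum(x * x for x in p)


n_islands = 20
species_per_island = 20  # hypothetical stand-in for the real island flora

# One island: equally abundant species (illustrative assumption).
island = [1.0 / species_per_island] * species_per_island

# Pooled archipelago: equally large islands with no shared species,
# so each island's frequencies are divided by the number of islands.
archipelago = [x / n_islands for x in island] * n_islands

gs_ratio = gini_simpson(island) / gini_simpson(archipelago)
inv_ratio = inverse_simpson(island) / inverse_simpson(archipelago)
print(gs_ratio, inv_ratio)
```

Under these assumptions the Gini–Simpson ratio stays above 0.95 although no species are shared, while the inverse Simpson ratio equals 1/20, in line with the replication principle.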
Since the Shannon entropy and the Gini–Simpson index do not obey the replication principle, neither do their phylogenetic generalizations: Rao’s quadratic entropy, Allen et al.’s phylogenetic entropy and Pavoine’s generalized Tsallis entropy family. Though each of these has a useful interpretation, they cannot be applied directly to judge efficacy of conservation plans, magnitudes of human impacts or compositional similarity among groups. These considerations motivate our search for a family of PD measures that are sensitive to species relative abundances and that obey the replication principle.
Species-neutral diversity measures that do obey the replication principle (the ‘true’ diversity defined by Jost 2007) include species richness, the exponential of Shannon entropy and the inverse Simpson concentration. These are special cases of a general class of measures, known as Hill numbers (Hill 1973):

$${}^{q}D = \left( \sum_{i=1}^{S} p_i^{q} \right)^{1/(1-q)}, \qquad (3.1)$$

where $S$ is the number of species, $p_i$ is the relative abundance of the $i$th species and the parameter $q$, called the ‘order’ of the diversity measure, determines its sensitivity to species frequencies. The measure ${}^{0}D$ corresponds to species richness and ${}^{2}D$ corresponds to the inverse Simpson concentration, giving roughly the number of ‘very abundant’ species in a community (Hill 1973). The measure is undefined when $q = 1$, but the limit as $q$ approaches unity exists and equals

$${}^{1}D = \lim_{q \to 1} {}^{q}D = \exp\left( -\sum_{i=1}^{S} p_i \log p_i \right), \qquad (3.2)$$

which is the exponential of Shannon entropy. Roughly, ${}^{1}D$ measures the number of ‘common’ (or ‘typical’) species in a community. Hill numbers provide a unified framework for the three most popular groups of diversity measures, $q = 0$, 1 and 2.
The Hill numbers are interpreted as the ‘effective number of species’ or ‘species equivalents’ (MacArthur 1965, 1972; Hill 1973; Jost 2006, 2007). For any community, if we obtain a value ${}^{q}D = w$, then the diversity of this community is the same as that of a community with $w$ equally abundant species. Hill numbers will be the basis for our phylogenetic generalization. We give the appropriate phylogenetic generalization of the replication principle in §6.
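A short Python sketch of equations (3.1) and (3.2) may make the ‘effective number of species’ reading concrete. The function name and the two abundance vectors are illustrative assumptions.

```python
import math


def hill_number(p, q):
    """Hill number of order q (equation 3.1), with the q -> 1 limit of equation (3.2)."""
    p = [x for x in p if x > 0]  # absent species contribute nothing
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


# Hypothetical communities with four species each.
equal = [0.25, 0.25, 0.25, 0.25]
uneven = [0.7, 0.1, 0.1, 0.1]

for q in (0, 1, 2):
    print(q, hill_number(equal, q), hill_number(uneven, q))
```

For the equally abundant community every order returns 4, the actual number of species. For the uneven community the value falls as $q$ increases, because higher orders give more weight to the dominant species.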
4. PHYLOGENETIC DIVERSITY MEASURES
(a) Conceptual framework
To emphasize the conceptual simplicity of our framework, we first explain it verbally, and then derive the corresponding formulae. We start by considering a phylogenetic tree that uses divergence times to place the nodes (so that the tree is ultrametric). At any given moment $t$, we can find the species by slicing the tree as in figure 1a. We can find their ‘abundances’ by summing the abundances of their descendants in the present-day assemblage. These abundances are not estimates of the actual abundances of these ancestral species at time $t$, but rather measures of their importance for the present-day assemblage. The lineage diversity at time $t$ can be found by dividing these abundances by the total abundance at this time $t$, and inserting these relative abundances into the equation for Hill numbers of order $q$, equation (3.1). We call this ${}^{q}D(t)$.
We can average the diversities ${}^{q}D(t)$ of the phylogenetic tree over any time interval of interest. We will be interested in the time interval from $-T$ years to the present time. While previous phylogenetic studies have focused on $T$ as the age of the first node (root), we do not make this restriction, because we may want to compare diversities of systems with different ages of the first node. Also, how diversity varies with time for any individual tree provides important information about evolution.
The average diversity of order $q$ over the interval $[-T, 0]$ incorporates information about the tree’s branching pattern, its relative branch lengths and the relative abundances flowing through each of its branch segments. For a given present-day diversity, this average will be large when there are many deep branches, each well represented in the present-day assemblage. It will be small when all branches emerge recently and/or when older branches are poorly represented in the present-day assemblage. It is always less than or equal to the present diversity of order $q$.
There are many ways to take this average. If we want the replication principle to be valid in its strongest possible form, then we must average the diversities ${}^{q}D(t)$ according to Jost’s (2007) derivation of the formula for the mean ($\alpha$) diversity of a set of equally weighted assemblages. This mean diversity over the time interval $[-T, 0]$ will be called ${}^{q}\bar{D}(T)$ (mean diversity of order q over T years). With this choice of mean, when $N$ maximally distinct trees with equal mean diversities (for fixed $T$) are combined, the mean diversity of the combined tree is $N$ times the mean diversity of any individual tree. The branching patterns, abundances and richnesses of the $N$ trees can all be different, as long as each of the trees is completely distinct (all branching off from the earliest point in the tree, at or before time $T$). Some choices of averaging formulae obey weaker versions of the principle, and these may be useful for some purposes. We discuss an alternative choice of mean in §8.
We may want to consider not just the mean diversity but the branch or lineage diversity of the tree as a whole, over the interval from $T$ years ago to the present. At any point within a branch, the abundance or importance of each branch lineage is the sum of the abundances of the present-day species descending from that point, as described above. Then the total diversity of all the ‘species’ that evolved in the tree during the time interval $[-T, 0]$ is found by taking the Hill number of this entire virtual assemblage of ancestral species. The Hill numbers depend only on the relative abundances of each species, so we need to divide the abundances by the total abundance of all the species in the tree. If each branch is weighted by its corresponding branch length, then we show below that this diversity depends only on the branching pattern and on the relative abundances of the species in the present-day assemblage. We call this measure ‘phylogenetic diversity of order q through T years ago’ or ‘branch diversity’ and denote it by ${}^{q}PD(T)$. This turns out to be just the product of the interval duration $T$ and the mean diversity over that interval, ${}^{q}\bar{D}(T)$. For $q = 0$ (only species richness is considered), and $T$ = the age of the first node, this branch diversity is just Faith’s PD.
Instead of using time as the metric for a phylogenetic tree, we often want to use a more direct measure of evolutionary work, such as the number of base changes at a selected locus, or the amount of functional or morphological differentiation from a common ancestor. The branches of the resulting tree will then be uneven, so the tree will not be ultrametric. However, we can easily apply the idea of branch diversity to such non-ultrametric trees. The branch lengths are calculated in the appropriate units, such as base changes. In non-ultrametric cases, the time $T$ is replaced by $\bar{T}$, the mean of the distances from root node to each of the terminal branch tips (i.e. the mean evolutionary change per species); see figure 1b for a numerical example. Thus, we can obtain the total effective number of ‘changes’ based on Hill numbers.

Figure 1. (a) A hypothetical ultrametric rooted phylogenetic tree with four species. Three different slices corresponding to three different times are shown. For a fixed $T$ (not restricted to the age of the root), the nodes divide the phylogenetic tree into segments 1, 2 and 3 with duration (length) $T_1$, $T_2$ and $T_3$, respectively. In any moment of segment 1, there are four species (i.e. four branches cut); in segment 2, there are three species; and in segment 3, there are two species. The mean species richness over the time interval $[-T, 0]$ is $(T_1/T) \times 4 + (T_2/T) \times 3 + (T_3/T) \times 2$. In any moment of segment 1, the species relative abundances (i.e. node abundances corresponding to the four branches) are $\{p_1, p_2, p_3, p_4\}$; in segment 2, the species relative abundances are $\{g_1, g_2, g_3\} = \{p_1, p_2 + p_3, p_4\}$; in segment 3, the species relative abundances are $\{h_1, h_2\} = \{p_1 + p_2 + p_3, p_4\}$. (b) A hypothetical non-ultrametric tree with branch lengths 4, 3.5, 1 and 2 and node abundances $p_1 = 0.5$, $p_2 = 0.2$, $p_3 = 0.3$ and $p_2 + p_3 = 0.5$. Let $\bar{T}$ be the weighted (by species abundance) mean of the distances from root node to each of the terminal branch tips: $\bar{T} = 4 \times 0.5 + (3.5 + 2) \times 0.2 + (1 + 2) \times 0.3 = 4$. Note $\bar{T}$ is also the weighted (by branch length) total node abundance because $\bar{T} = 0.5 \times 4 + 0.2 \times 3.5 + 0.3 \times 1 + 0.5 \times 2 = 4$. Conceptually, the ‘branch diversity’ is defined for an assemblage of four branches: each has, respectively, relative abundance $0.5/\bar{T} = 0.125$, $0.2/\bar{T} = 0.05$, $0.3/\bar{T} = 0.075$ and $0.5/\bar{T} = 0.125$; and each has, respectively, weight (i.e. branch length) 4, 3.5, 1 and 2. This is equivalent to an assemblage with 10.5 equally weighted ‘branches’: there are 4 branches with relative abundance $0.5/\bar{T} = 0.125$; 3.5 branches with relative abundance $0.2/\bar{T} = 0.05$; 1 branch with relative abundance $0.3/\bar{T} = 0.075$ and 2 branches with relative abundance $0.5/\bar{T} = 0.125$.
(b) Formulae
To make the above discussion precise and derive formulae from it, we need to introduce some notation. Assume that for any fixed time $T$ the phylogenetic tree is divided into $k$ segments with durations $T_1, T_2, \ldots, T_k$ and species richness $S_1, S_2, \ldots, S_k$ as in figure 1a. Note that $S_1 = S$, the present-day species richness. Each branching point must form a segment boundary, so that the species richness in any given segment is a constant. Our derivation and formulae would be unchanged by making finer segment divisions. To obtain the formula for ${}^{0}\bar{D}(T)$, assume there are $S_i$ species (i.e. $S_i$ branches cut) in the $i$th segment. Then ${}^{0}\bar{D}(T)$ (mean diversity of order 0 over T years) is

$${}^{0}\bar{D}(T) = \frac{T_1}{T} S_1 + \frac{T_2}{T} S_2 + \cdots + \frac{T_k}{T} S_k = \frac{{}^{0}PD(T)}{T}. \qquad (4.1)$$

When $T$ is the time corresponding to the root, then ${}^{0}PD(T)$ is Faith’s PD measure. Our equation (4.1) connects Faith’s PD to the mean species richness over the time interval from the terminal tips to the root.
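Equation (4.1) is a duration-weighted average of segment richness. A minimal Python sketch follows; the segment richnesses 4, 3 and 2 are those of figure 1a, while the durations $T_1 = 1$, $T_2 = 2$ and $T_3 = 1$ are hypothetical values chosen for illustration.

```python
def mean_richness(durations, richness):
    """Mean species richness over [-T, 0]: sum_k (T_k / T) * S_k (equation 4.1)."""
    total_time = sum(durations)
    return sum(t * s for t, s in zip(durations, richness)) / total_time


durations = [1.0, 2.0, 1.0]  # hypothetical T1, T2, T3, so T = 4
richness = [4, 3, 2]         # S1, S2, S3 as in figure 1a

d0 = mean_richness(durations, richness)  # mean diversity of order 0
pd0 = d0 * sum(durations)                # total branch length in [-T, 0]
print(d0, pd0)
```

With these durations the mean richness is 3 lineages and the product with $T = 4$ is 12, the total branch length crossed by the interval.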
At each moment within a given segment, the set of species relative abundances is constant. In segment 1, the species relative abundances are $\{p_1, p_2, \ldots, p_{S_1}\}$, with $\sum_{i=1}^{S_1} p_i = 1$. Assume that in segment 2 the relative abundances are $\{g_1, g_2, \ldots, g_{S_2}\}$, with $\sum_{i=1}^{S_2} g_i = 1$, ..., and in segment $k$ the relative abundances are $\{h_1, h_2, \ldots, h_{S_k}\}$, with $\sum_{i=1}^{S_k} h_i = 1$ (figure 1a). Without loss of generality, we can assume $T_1, T_2, \ldots, T_k$ are all positive integers, because the mean diversity ${}^{q}\bar{D}(T)$ is invariant to the units of time. Weighing each moment in time equally, we can conceptually imagine that there are $T_1$ assemblages with abundance vector $\{p_1, p_2, \ldots, p_{S_1}\}$, $T_2$ assemblages with abundance vector $\{g_1, g_2, \ldots, g_{S_2}\}$, ..., and $T_k$ assemblages with abundance vector $\{h_1, h_2, \ldots, h_{S_k}\}$. There are a total of $T_1 + T_2 + \cdots + T_k = T$ assemblages, and each is given the same weight $1/T$. Jost (2007) showed that, in the context of calculating alpha diversity for equally weighted assemblages, the alpha diversity should be obtained by first averaging the sums $\sum p_i^q$, $\sum g_i^q$, ..., and $\sum h_i^q$, and then converting this average to a ‘true’ diversity by raising it to the power $1/(1-q)$. We use this same kind of average to obtain the formula for ${}^{q}\bar{D}(T)$ (mean diversity of order q over T years):

$${}^{q}\bar{D}(T) = \left\{ \frac{T_1}{T} \sum_{i=1}^{S_1} p_i^{q} + \frac{T_2}{T} \sum_{i=1}^{S_2} g_i^{q} + \cdots + \frac{T_k}{T} \sum_{i=1}^{S_k} h_i^{q} \right\}^{1/(1-q)}. \qquad (4.2)$$

When $q = 0$, equation (4.2) reduces to equation (4.1).
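The segment-wise average of equation (4.2) can be sketched as follows for $q \neq 1$. The tree shape is that of figure 1a; the present-day abundances and the segment durations are hypothetical, and the function name is an assumption of this illustration.

```python
def mean_diversity_segments(durations, abundance_sets, q):
    """Mean diversity of order q over T years (equation 4.2), for q != 1.

    durations[k] is T_k and abundance_sets[k] is the vector of relative
    abundances of the lineages present during segment k.
    """
    total_time = sum(durations)
    average = sum(
        (t / total_time) * sum(x ** q for x in abundances if x > 0)
        for t, abundances in zip(durations, abundance_sets)
    )
    return average ** (1.0 / (1.0 - q))


# Hypothetical present-day abundances p1..p4 and durations T1, T2, T3.
p1, p2, p3, p4 = 0.2, 0.3, 0.1, 0.4
durations = [1.0, 2.0, 1.0]
abundance_sets = [
    [p1, p2, p3, p4],    # segment 1: four lineages
    [p1, p2 + p3, p4],   # segment 2: species 2 and 3 share an ancestor
    [p1 + p2 + p3, p4],  # segment 3: two lineages
]

d0 = mean_diversity_segments(durations, abundance_sets, 0)
d2 = mean_diversity_segments(durations, abundance_sets, 2)
print(d0, d2)
```

At $q = 0$ the result is the mean richness of equation (4.1), here 3. At $q = 2$ it is about 2.60, below the present-day inverse Simpson value of about 3.33, as expected of an average that includes the less diverse deeper segments.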
The same formula (4.2) may be computed more easily by numbering every branch in the time interval $[-T, 0]$. Denote the set of all branches in this time interval by $B_T$. Then ${}^{q}\bar{D}(T)$ can be calculated as

$${}^{q}\bar{D}(T) = \left\{ \sum_{i \in B_T} \frac{L_i}{T} a_i^{q} \right\}^{1/(1-q)} = \frac{1}{T} \left\{ \sum_{i \in B_T} L_i \left( \frac{a_i}{T} \right)^{q} \right\}^{1/(1-q)}, \qquad (4.3)$$

where $L_i$ is the length (duration) of branch $i$ in the set $B_T$ and $a_i$ is the total abundance descended from branch $i$. This diversity may also be interpreted as the effective number of maximally distinct lineages (or species) during the interval $[-T, 0]$. For maximally distinct species we have all branch lengths equal to $T$, and thus ${}^{q}\bar{D}(T)$ reduces to the Hill number ${}^{q}D$ in equation (3.1). This gives a simple reference tree for a value of ${}^{q}\bar{D}(T) = z$, i.e. the observed mean diversity in the time period $[-T, 0]$ is the same as the mean diversity of a community consisting of $z$ equally abundant and maximally distinct species with branch length $T$.
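The branch-based form of equation (4.3) lends itself to a direct implementation. In the sketch below a tree is represented, as an assumption of this illustration, by a list of ($L_i$, $a_i$) pairs; the six branches describe a hypothetical four-species ultrametric tree with the shape of figure 1a and $T = 4$. The $q = 1$ case uses the exponential form that appears later as equation (4.7).

```python
import math


def mean_diversity(branches, T, q):
    """First form of equation (4.3); branches is a list of (L_i, a_i) pairs."""
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exp(-sum_i (L_i / T) * a_i * log a_i).
        return math.exp(-sum((L / T) * a * math.log(a) for L, a in branches))
    return sum((L / T) * a ** q for L, a in branches) ** (1.0 / (1.0 - q))


def mean_diversity_alt(branches, T, q):
    """Second form of equation (4.3), valid for q != 1."""
    return sum(L * (a / T) ** q for L, a in branches) ** (1.0 / (1.0 - q)) / T


# Hypothetical tree: four terminal branches followed by two internal branches.
branches = [
    (3.0, 0.2),  # species 1
    (1.0, 0.3),  # species 2
    (1.0, 0.1),  # species 3
    (4.0, 0.4),  # species 4
    (2.0, 0.4),  # ancestor of species 2 and 3
    (1.0, 0.6),  # ancestor of species 1, 2 and 3
]
T = sum(L * a for L, a in branches)  # equals 4 for this ultrametric tree

for q in (0, 1, 2):
    print(q, mean_diversity(branches, T, q))
```

The two forms of equation (4.3) agree for every order tested, and the values decrease from 3 at $q = 0$ to about 2.60 at $q = 2$.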
The effective diversity of the whole tree during the interval $[-T, 0]$ is the product of the effective number of lineages during the interval and the duration of the interval. We denote this measure by ${}^{q}PD(T)$ (phylogenetic diversity of order q through T years ago):

$${}^{q}PD(T) = T \times {}^{q}\bar{D}(T) = T \left\{ \sum_{i \in B_T} \frac{L_i}{T} a_i^{q} \right\}^{1/(1-q)} = \left\{ \sum_{i \in B_T} L_i \left( \frac{a_i}{T} \right)^{q} \right\}^{1/(1-q)}. \qquad (4.4)$$

This has dimensions of ‘effective number of lineage years’. If $q = 0$, this equals ${}^{0}PD(T)$ as defined above, regardless of branching pattern or abundances. If all species are maximally distinct and equally common, and if $T$ is the age of the highest node, this equals Faith’s PD for all $q$.
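The two limiting statements about equation (4.4) can be checked numerically. The sketch below assumes $q \neq 1$ and uses two hypothetical star trees (all species diverging at time $-T$): one with equal abundances and one with unequal abundances. The function names are illustrative.

```python
def branch_diversity(branches, T, q):
    """qPD(T) = {sum_i L_i * (a_i / T)^q}^(1/(1-q)) (equation 4.4), for q != 1."""
    return sum(L * (a / T) ** q for L, a in branches) ** (1.0 / (1.0 - q))


def faith_pd(branches):
    """Faith's PD: the total branch length of the tree."""
    return sum(L for L, _ in branches)


# Hypothetical star tree: 5 maximally distinct, equally common species, T = 2.
S, T = 5, 2.0
star = [(T, 1.0 / S)] * S
pd_equal = [branch_diversity(star, T, q) for q in (0, 0.5, 2, 3)]

# Same shape with unequal abundances (four species, T = 2).
uneven = [(2.0, 0.7), (2.0, 0.1), (2.0, 0.1), (2.0, 0.1)]
pd_uneven_q0 = branch_diversity(uneven, 2.0, 0)
pd_uneven_q2 = branch_diversity(uneven, 2.0, 2)
print(pd_equal, pd_uneven_q0, pd_uneven_q2)
```

For the equally common species every order gives Faith's PD (10 lineage years). For the uneven tree the order-0 value is still Faith's PD (8), while the order-2 value drops to about 3.85 because three of the lineages are rare.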
For an ultrametric tree, we can express the time parameter $T$ as $T = \sum_{i \in B_T} L_i a_i$. Therefore, the time length $T$ can also be interpreted as the total abundance (weighted by branch lengths) in the time interval $[-T, 0]$, and $a_i/T$ represents the relative abundance of the $i$th branch. Using this idea, equation (4.4) suggests that instead of dividing the tree into several segments and treating the mean diversity as the alpha diversity of several assemblages, we could conceptually think of all the branch segments in the interval $[-T, 0]$ as forming a single assemblage consisting of relative abundances $\{a_i/T;\ i \in B_T\}$, with each branch weighted by its corresponding branch length. (Equivalently, we can also think for each $i$ that there are $L_i$ equally weighted ‘branches’ with the relative abundance $a_i/T$.) Then the Hill number of order $q$ for this assemblage is exactly the branch diversity ${}^{q}PD(T)$ given in equation (4.4). Dividing this Hill number by $T$, we obtain ${}^{q}\bar{D}(T)$ given in equation (4.3).
For the extension to non-ultrametric trees, let $B_{\bar{T}}$ denote the set of branches connecting all focal species with mean base change $\bar{T}$. The total node abundance weighted by branch lengths is $\bar{T} = \sum_{i \in B_{\bar{T}}} L_i a_i$, which also represents the weighted (by species abundance) mean evolutionary change per species (figure 1b). (In ultrametric trees, $\bar{T} = T$.) Based on the assemblage consisting of all branches with relative abundance set $\{a_i/\bar{T};\ i \in B_{\bar{T}}\}$ and under the assumption that each branch is weighted by its corresponding branch length (figure 1b), a parallel derivation gives the following measures, which are exactly the same as those in equations (4.3) and (4.4), except that the parameter $T$ there must be replaced by the mean quantity $\bar{T}$:

$${}^{q}\bar{D}(\bar{T}) = \left\{ \sum_{i \in B_{\bar{T}}} \frac{L_i}{\bar{T}} a_i^{q} \right\}^{1/(1-q)} = \frac{1}{\bar{T}} \left\{ \sum_{i \in B_{\bar{T}}} L_i \left( \frac{a_i}{\bar{T}} \right)^{q} \right\}^{1/(1-q)} \qquad (4.5)$$

and

$${}^{q}PD(\bar{T}) = \left\{ \sum_{i \in B_{\bar{T}}} L_i \left( \frac{a_i}{\bar{T}} \right)^{q} \right\}^{1/(1-q)}. \qquad (4.6)$$

We thus can conclude that the diversity of a non-ultrametric tree with mean evolutionary change $\bar{T}$ (however this might be measured) is exactly the same as that of an ultrametric tree with time parameter $\bar{T}$. Therefore, for non-ultrametric trees, if ${}^{q}\bar{D}(\bar{T}) = z$, then the diversity is the same as the diversity of an ultrametric tree consisting of $z$ equally abundant and maximally distinct species with branch length $\bar{T}$.
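Equations (4.5) and (4.6) can be applied to the non-ultrametric tree of figure 1b, whose four branches have lengths 4, 3.5, 1 and 2 and node abundances 0.5, 0.2, 0.3 and 0.5. The following Python sketch is restricted to $q \neq 1$; the function names and the (length, abundance) representation are assumptions of this illustration.

```python
def mean_change(branches):
    """T-bar = sum_i L_i * a_i, the abundance-weighted mean root-to-tip distance."""
    return sum(L * a for L, a in branches)


def branch_diversity(branches, q):
    """qPD(T-bar) of equation (4.6), for q != 1."""
    t_bar = mean_change(branches)
    return sum(L * (a / t_bar) ** q for L, a in branches) ** (1.0 / (1.0 - q))


def mean_diversity(branches, q):
    """Mean diversity of equation (4.5): branch diversity divided by T-bar."""
    return branch_diversity(branches, q) / mean_change(branches)


# Branches of figure 1b as (length, node abundance) pairs.
fig_1b = [(4.0, 0.5), (3.5, 0.2), (1.0, 0.3), (2.0, 0.5)]

t_bar = mean_change(fig_1b)
pd0 = branch_diversity(fig_1b, 0)
d2 = mean_diversity(fig_1b, 2)
print(t_bar, pd0, d2)
```

The sketch recovers $\bar{T} = 4$ and an order-0 branch diversity of 10.5, the total branch length quoted in the caption of figure 1; the order-2 mean diversity is about 2.31 effective lineages.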
(c) Relationship with Rao’s $Q$ and phylogenetic entropy $H_p$
In the limit as $q$ approaches unity, the formula ${}^{q}\bar{D}(\bar{T})$ in equation (4.5) equals

$${}^{1}\bar{D}(\bar{T}) = \exp\left[ -\sum_{i \in B_{\bar{T}}} \frac{L_i}{\bar{T}} a_i \log a_i \right]. \qquad (4.7)$$

The measure ${}^{1}\bar{D}(\bar{T})$ has the following simple relationship with the phylogenetic entropy $H_p$:

$${}^{1}\bar{D}(\bar{T}) = \exp\left( \frac{H_p}{\bar{T}} \right) \quad \text{or} \quad \log\left( {}^{1}\bar{D}(\bar{T}) \right) = \frac{H_p}{\bar{T}}. \qquad (4.8)$$

When $q = 2$, from equation (4.5), we have

$${}^{2}\bar{D}(\bar{T}) = \left\{ \sum_{i \in B_{\bar{T}}} \frac{L_i}{\bar{T}} a_i^{2} \right\}^{-1}. \qquad (4.9)$$

After some algebra, we have the relationship between ${}^{2}\bar{D}(\bar{T})$ and Rao’s quadratic entropy $Q$:

$${}^{2}\bar{D}(\bar{T}) = \frac{\bar{T}}{\bar{T} - Q} = \frac{1}{1 - Q/\bar{T}}. \qquad (4.10)$$

Formula (4.10) represents the equivalent number of completely distinct species (of age $\bar{T}$) for the assemblage. Ricotta & Szeidl (2009) derived a similar formula, given in equation (2.2), for the special case in which the pairwise distance between any two species is normalized to the range of [0, 1]. While their formula is identical to our equation (4.10) for ultrametric trees when our time parameter $T$ is scaled to 1, for non-ultrametric trees, our theory leads to the conclusion that the equivalent number of species for $Q$ should be $1/(1 - Q/\bar{T})$.
We give an example to illustrate this point. Consider a non-ultrametric tree in which three equally abundant species are maximally distinct with branch lengths 1, 1 and 0.2, respectively, from a divergence point. The pairwise distances between the three species are $d_{12} = 1$, $d_{13} = 0.6$ and $d_{23} = 0.6$. We have Rao’s $Q = 4.4/9 = 0.489$ and $\bar{T} = (1/3)(1 + 1 + 0.2) = 2.2/3 = 0.733$. Based on our equivalent number of species formula, we have $1/(1 - Q/\bar{T}) = 3$ maximally distinct species with equal branch lengths of 0.733, and the total length = 0.733 × 3 = 2.2, which is Faith’s PD. However, based on the Ricotta & Szeidl (2009) formula, we obtain $1/(1 - Q) = 1.957$, implying there are 1.957 maximally distinct species with branch length of 1. The total length is thus 1.957 × 1 = 1.957, which is not Faith’s PD.
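The arithmetic of this example can be verified with a short Python sketch. The distances, branch lengths and abundances are those stated in the example; the function and variable names are illustrative, and the last line is an added cross-check against equation (4.9) rather than part of the original example.

```python
def rao_q(dist, p):
    """Rao's quadratic entropy Q = sum_{i,j} d_ij * p_i * p_j (equation 2.1)."""
    n = len(p)
    return sum(dist[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


lengths = [1.0, 1.0, 0.2]   # branch lengths from the divergence point
p = [1 / 3, 1 / 3, 1 / 3]   # three equally abundant species
dist = [[0.0, 1.0, 0.6],
        [1.0, 0.0, 0.6],
        [0.6, 0.6, 0.0]]    # pairwise distances d12, d13, d23 from the text

q_value = rao_q(dist, p)                           # 4.4 / 9
t_bar = sum(L * a for L, a in zip(lengths, p))     # 2.2 / 3
ours = 1.0 / (1.0 - q_value / t_bar)               # equation (4.10)
ricotta_szeidl = 1.0 / (1.0 - q_value)             # equation (2.2)
faith_pd = sum(lengths)                            # total branch length

# Cross-check with the branch form of equation (4.9).
from_branches = 1.0 / sum((L / t_bar) * a ** 2 for L, a in zip(lengths, p))
print(q_value, t_bar, ours, ricotta_szeidl, from_branches)
```

Multiplying the effective number from equation (4.10) by $\bar{T}$ returns the total branch length 2.2, while the untransformed distance scale of equation (2.2) gives 1.957.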
5. TAXONOMIC DIVERSITY
Rather than using time or the number of base changes at a locus as our measure of evolutionary work, we might want to use a more holistic measure of evolutionary work, such as a phylogenetic tree based on the classical Linnaean taxonomic categories. Consider the special case in which each Linnaean taxonomic category is given unit length, and assume all species are classified in all levels. Our formulae above can be easily applied to this ultrametric tree, with $T$ replaced by an integer representing the number of taxonomic categories needed to characterize the assemblage. We thus change the continuous time parameter $T$ to an integer parameter $L$ (level) to distinguish taxonomic diversity from the general PD measures ${}^{q}\bar{D}(T)$ and ${}^{q}PD(T)$. If we use species and genus, then $L = 2$; if we use species, genus and family, then $L = 3$. Additional intermediate levels, such as subgenus or subfamily, may be appropriate depending on the group. Notice that in a taxonomic tree, the total length is identical to the total number of nodes. Setting all the segment lengths $L_i$ to unity in equations (4.3) and (4.4), we have the following mean diversity of order q for L taxonomic levels, ${}^{q}\bar{D}(L)$,

$${}^{q}\bar{D}(L) = \left\{ \frac{\sum_i a_i^{q}}{L} \right\}^{1/(1-q)} = \frac{1}{L} \left\{ \sum_i \left( \frac{a_i}{L} \right)^{q} \right\}^{1/(1-q)}, \qquad (5.1)$$

where $i$ is over all nodes in the $L$-level taxonomic tree. The measure ${}^{q}\bar{D}(L)$ quantifies ‘the mean effective number of cladistic nodes per level in a taxonomic tree of $L$ levels’. The diversity of a taxonomic tree with ${}^{q}\bar{D}(L) = z$ is the same as the diversity of a community consisting of $z$ equally abundant species, with each species classified in its own genus and family, so that there are $z$ species, $z$ genera and $z$ families.
3604 A. Chao et al. Phylogenetic diversity measures
Phil. Trans. R. Soc. B (2010)
The taxonomic diversity of order q for L levels, $^q\mathrm{TD}(L)$, is the product of $^q\bar{D}(L)$ and the level $L$. This measure quantifies ‘the effective number of total cladistic nodes in a taxonomic tree of $L$ levels’ and has the formula

$$^q\mathrm{TD}(L) = L \times {}^q\bar{D}(L) = \left\{ \sum_i \left( \frac{a_i}{L} \right)^q \right\}^{1/(1-q)}. \qquad (5.2)$$

In the special case $L = 1$, the measure $^q\bar{D}(L) = {}^qD$. When $q = 0$, $^0\mathrm{TD}(L)$ = total number of nodes, which is Vane-Wright’s CD. Equations (4.8) and (4.10) reduce to the following transformations: $^1\bar{D}(L) = \exp(H_p/L)$ and $^2\bar{D}(L) = 1/[1 - (Q/L)]$; see table 1 for a summary of all proposed measures and their relationships with conventional measures. The decomposition of taxonomic diversity into diversity of each level is provided in the electronic supplementary material.
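Equations (5.1) and (5.2) can be sketched in Python as follows. This is a minimal illustration rather than the authors' software: the function name, the dictionary-based input format and the two toy assemblages are hypothetical, and the $q = 1$ branch uses the limit $\exp(H_p/L)$ with all segment lengths set to unity.

```python
import math
from collections import defaultdict


def taxonomic_diversity(classification, abundance, q):
    """Return (mean diversity, taxonomic diversity) for a fully classified assemblage.

    classification: species -> tuple of taxon names, one per level (all of length L).
    abundance:      species -> relative abundance (summing to 1).
    """
    levels = len(next(iter(classification.values())))
    node = defaultdict(float)
    # A node's abundance is the summed abundance of the species it contains.
    for species, taxa in classification.items():
        for level, name in enumerate(taxa):
            node[(level, name)] += abundance[species]
    a = [v for v in node.values() if v > 0]
    if q == 1:
        mean = math.exp(-sum(x * math.log(x) for x in a) / levels)
    else:
        mean = (sum(x ** q for x in a) / levels) ** (1 / (1 - q))
    return mean, levels * mean


# Hypothetical assemblages: four equally abundant species, L = 3 levels.
abund = {f"sp{i}": 0.25 for i in range(4)}
distinct = {f"sp{i}": (f"sp{i}", f"gen{i}", f"fam{i}") for i in range(4)}
lumped = {f"sp{i}": (f"sp{i}", "gen0", "fam0") for i in range(4)}

for q in (0, 1, 2):
    print(q, taxonomic_diversity(distinct, abund, q), taxonomic_diversity(lumped, abund, q))
```

When every species sits in its own genus and family, the mean diversity equals the number of species for every order $q$; when all species share one genus and one family, the $q = 0$ taxonomic diversity is the node count (six) and the mean diversity falls to two.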
6. REPLICATION PRINCIPLE FOR
PHYLOGENETIC DIVERSITY
Some basic properties of our proposed measures (table 1) are summarized in the electronic supplementary material; details of the proofs are provided in Chiu (2010) and Jost & Chao (in preparation). In this section, we only refine the concept of the replication principle for phylogenetic trees, and prove its validity for the most general case (i.e. non-ultrametric case), implying that it is valid for all measures in table 1. Suppose we have $N$ completely distinct assemblages (no shared lineages), all with the same mean branch length $\bar{T}$ (hence same $T$ in the case of ultrametric trees) and the same mean PD $^q\bar{D}(\bar{T}) = X$. Then we can prove the following strong replication principle: if these assemblages are pooled in equal proportions, the pooled assemblages have mean PD $N \times X$.
Proof. Suppose in tree $k$, the branch set is $B_{\bar{T},k}$ (we omit $\bar{T}$ in the subscript and just use $B_k$ in the following proof for notational simplicity) with branch lengths $\{L_{ik}; i \in B_k\}$ and the corresponding node abundances $\{a_{ik}; i \in B_k\}$, $k = 1, 2, \ldots, N$. The $N$ trees have the same mean diversity $X$, implying $\sum_{i \in B_k} (L_{ik}/\bar{T})\, a_{ik}^q = X^{1-q}$ for all $k = 1, 2, \ldots, N$. When the $N$ trees are pooled with equal weight for each tree, each node abundance $a_{ik}$ in the pooled tree becomes $a_{ik}/N$. Then, the $^q\bar{D}(\bar{T})$ measure for the pooled tree becomes

$$\left\{ \sum_{k=1}^{N} \sum_{i \in B_k} \frac{L_{ik}}{\bar{T}} \left( \frac{a_{ik}}{N} \right)^q \right\}^{1/(1-q)} = \left\{ N^{1-q} X^{1-q} \right\}^{1/(1-q)} = N \times X.$$

In our proof of this replication principle, the $N$ assemblages must have the same average quantity $\bar{T}$, but may have different numbers of species if $q > 0$, and the tree structures of the $N$ assemblages can be totally different.
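The replication principle can also be checked numerically. The sketch below is illustrative only: the two toy trees, the helper names and the choice $q = 2$ are our assumptions. One tree is a star with three equally abundant species; the other has four equally abundant species arranged in two pairs, with the split time chosen so that both trees have mean diversity 3 at the same depth $T = 2$.

```python
def mean_pd(tree, q, t_bar):
    """Mean diversity of order q (q != 1) for a list of (branch length, node abundance)."""
    s = sum((length / t_bar) * a ** q for length, a in tree if a > 0)
    return s ** (1 / (1 - q))


def pool(trees):
    """Pool completely distinct trees with equal weight: each node abundance is divided by N."""
    n = len(trees)
    return [(length, a / n) for tree in trees for length, a in tree]


T = 2.0  # common depth (both toy trees are ultrametric, so the mean branch length is T)
q = 2

# Star tree: three equally abundant species, each on a branch of length T.
star = [(T, 1 / 3)] * 3

# Paired tree: four equally abundant species; each pair splits t1 time units ago.
t1 = 4 / 3
paired = [(t1, 0.25)] * 4 + [(T - t1, 0.5)] * 2

print(mean_pd(star, q, T), mean_pd(paired, q, T))  # both close to 3
print(mean_pd(pool([star, paired]), q, T))          # close to 2 * 3 = 6
```

The two assemblages differ in species number and in tree shape, yet pooling them in equal proportions doubles the mean diversity, as the proof requires.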
7. EXAMPLES
To show the general behaviour of our proposed
measures, we give two simple hypothetical examples
in the electronic supplementary material. Here, we
apply the proposed $^q\bar{D}(T)$ and $^q\mathrm{PD}(T)$ measures to
the real forest data discussed by Shimatani (2001),
who collected data from the over-storey tree species
in the Fred Russ experimental forest in Michigan.
For illustrative purpose, we only consider the abun-
dance data of block 4 in his paper for two sites: CT
(thinned site) and CU (un-thinned site). Both sites
were 28-year-old (in 1999) secondary forests. The
two sites were dominated by oak trees. No thinning
was conducted for the CU site after clear cutting in
1971, while thinning was done for non-oak species
in the site CT in 1982 and 1996.
Shimatani (2001) proposed a four-level (species,
genus, family, subclass) taxonomic measure based on
the Simpson index, and concluded that the traditional
diversity indices and the taxonomic diversity consider-
ing species relatedness give different conclusions about
the effect of thinning. We constructed the phylogeny
trees for species in each site by using the software
PHYLOMATIC (from http://www.phylodiversity.net/phylomatic; Webb & Donoghue 2004). The phylogenetic
tree for the species in the two sites, and the two sets
of species relative abundances, are shown in figure 2.
Table 1. A summary of species-neutral and phylogenetic diversity measures and their interpretations; all satisfy the replication principle. CD, cladistic diversity (total number of nodes) by Vane-Wright et al. (1991); PD, phylogenetic diversity (sum of branch lengths) by Faith (1992); $Q$, quadratic entropy, equation (2.1); $H_p$, phylogenetic entropy, equation (2.3).

| diversity types | species-neutral diversity | taxonomic classification ($L$ levels) | ultrametric phylogenetic tree | non-ultrametric phylogenetic tree |
| --- | --- | --- | --- | --- |
| diversity or mean diversity of general order $q$ | $^qD$: equation (3.1), Hill numbers (effective number of species) | $^q\bar{D}(L)$: equation (5.1), mean effective number of cladistic nodes per level | $^q\bar{D}(T)$: equation (4.3), mean effective number of species (or lineages) over $T$ years | $^q\bar{D}(\bar{T})$: equation (4.5), mean effective number of species (or lineages) over $\bar{T}$ mean base changes |
| $q = 0$ | species richness | CD/$L$ | PD/$T$ | PD/$\bar{T}$ |
| $q = 1$ | exp(entropy) | $\exp(H_p/L)$ | $\exp(H_p/T)$ | $\exp(H_p/\bar{T})$ |
| $q = 2$ | 1/Simpson | $1/[1 - (Q/L)]$ | $1/[1 - (Q/T)]$ | $1/[1 - (Q/\bar{T})]$ |
| branch (or lineage) diversity | | $^q\bar{D}(L) \times L$: equation (5.2), effective number of cladistic nodes for $L$ levels | $^q\bar{D}(T) \times T$: equation (4.4), effective number of lineage lengths over $T$ years | $^q\bar{D}(\bar{T}) \times \bar{T}$: equation (4.6), effective number of base changes over $\bar{T}$ mean base changes |
We calculated three types of diversities: (i) the mean diversity $^q\bar{D}(T)$ and the phylogenetic diversity $^q\mathrm{PD}(T)$ based on the phylogeny trees and the relative abundances in figure 2, (ii) the taxonomic diversity $^q\bar{D}(L)$ based on taxonomic classification in fig. 1 of Shimatani (2001), and (iii) the species-neutral diversity based on Hill numbers ($^qD$) in equation (3.1) for $q = 0$, 1 and 2.
In figure 3, the profile of $^q\bar{D}(T)$ and $^q\mathrm{PD}(T)$ when $0 < T < 150$ is shown for $q = 0$, 1 and 2. For ultrametric trees, the two measures give consistent comparison as clearly seen in figure 3. We focus on comparing the measure $^q\bar{D}(T)$, which gives the mean effective number of species as a function of evolutionary time $T$. Based on species richness ($q = 0$), the diversity $^q\bar{D}(T)$ of the thinned site CT dominates that of un-thinned site CU for all values of $T$. But for the common species ($q = 1$) and very abundant species ($q = 2$), we have the reverse conclusion. When abundance is taken into account, the un-thinned CU site is more diverse than the thinned CT site for all values of $T$, except for a very small interval in the case of $q = 2$.
Table 2 shows the three types of diversity ($^q\bar{D}(T)$, $^q\bar{D}(L)$ and $^qD$) for three orders of $q$ (0, 1 and 2). All three measures are in the same units of species. The $^q\bar{D}(T)$ measure is only shown for $T = 142.3$, which is the age of the root in the pooled phylogenetic tree. The taxonomic measure $^q\bar{D}(L)$ is computed for $L = 4$ level classifications. For any fixed order $q$, we have proved that $^qD$ is always greater than or equal to $^q\bar{D}(T)$ and $^q\bar{D}(L)$, and this is seen numerically in table 2.
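The species-neutral columns of table 2 can be recomputed from the relative abundances (%) listed with figure 2. The sketch below is a minimal illustration with a hypothetical helper name; Hill numbers do not depend on which species carries which abundance, so only the two abundance columns are needed. Because the published percentages are rounded to one decimal place, agreement with table 2 is expected only to about two decimal places.

```python
import math


def hill_number(abundances, q):
    """Hill number of order q from raw or relative abundances (zeros are ignored)."""
    total = sum(abundances)
    p = [x / total for x in abundances if x > 0]
    if q == 1:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


# Relative abundances (%) of the eleven species, as listed with figure 2.
ct = [0, 7.6, 0.8, 30.3, 1.5, 1.5, 0.8, 37.9, 13.6, 4.5, 1.5]  # thinned site
cu = [3.4, 4.1, 2.1, 1.4, 2.1, 17.2, 0, 34.5, 12.4, 22.8, 0]   # un-thinned site

for q in (0, 1, 2):
    print(f"q = {q}: CT {hill_number(ct, q):.3f}, CU {hill_number(cu, q):.3f}")
```

Richness ($q = 0$) is higher in CT, while the orders $q = 1$ and $q = 2$ are higher in CU, which is the reversal discussed in the text.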
Based on table 2, we confirm the finding of Shimatani (2001) that the traditional Simpson diversity measure $^2D$ implies that the thinned site is less diverse. A similar implication is also valid for the $^1D$ measure, whereas species richness $^0D$ shows that the thinned site is more diverse. Based on $^q\bar{D}(L)$, the taxonomic diversity of the thinned site for all three orders is greater, but the difference is not large. Shimatani thus concluded that the thinning operation contributed to an increase in taxonomic diversity.
In contrast to Shimatani’s conclusion, our results based on $^q\bar{D}(T)$ for $q = 1$ and 2 imply the opposite
conclusion, as shown in figure 3, and our results are
consistent with those based on the species-neutral
diversity. Our conclusion may be understood intui-
tively by noting that thinning concentrates the
abundance into a few species of intermediate phyloge-
netic distinctiveness (figure 2), while in the un-thinned
site, abundance is spread more equitably throughout
the phylogenetic tree. The plots in figure 3 provide
additional insights about the thinning effect when
both evolutionary history and species abundances
($q = 1$ and 2) are considered.
8. CONCLUDING REMARKS AND DISCUSSION
(a) Advantages of the new measures
We have proposed a unified class of PD measures that are based on Hill numbers and that obey the replication principle (§§3 and 6). Most previous PD measures that take into account species abundances, such as Rao’s (1982) quadratic entropy $Q$, Allen et al. (2009) phylogenetic entropy $H_p$ and Pavoine et al. (2009) generalized phylogenetic entropy $I_q$, do not obey the replication principle.
Measures that do not obey the replication principle
give self-contradictory results in conservation analyses
(Jost 2009). Furthermore, for such measures, the
commonly used ratio of within-group to total ‘diver-
sity’ does not reflect the compositional similarity of
the groups, since it always approaches unity when
diversity is high (§3 and Hardy & Jost 2008). Finally,
it is difficult to use such measures to judge the magni-
tude of human or natural impacts on the environment.
The problem with these measures is their nonlinearity
with species addition. A numerical example is pro-
vided in the electronic supplementary material. Our
measures solve these problems.
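As a minimal illustration of this nonlinearity (a sketch of our own under stated assumptions, not the numerical example of the electronic supplementary material), consider $S$ equally abundant, maximally distinct species on an ultrametric star tree of depth $T$, so that every pairwise distance is assumed to equal $T$. The function name is hypothetical.

```python
def rao_q_star(s, t=1.0):
    """Rao's Q for s equally abundant species whose pairwise distances all equal t."""
    p = 1 / s
    # Sum over i != j of t * p_i * p_j equals t * (1 - sum of p_i squared).
    return t * (1 - s * p * p)


T = 1.0
for s in (10, 20, 40):
    q_val = rao_q_star(s, T)
    effective = 1 / (1 - q_val / T)  # transformation of equation (4.10)
    print(s, round(q_val, 3), round(effective, 1))
```

Doubling the number of species from 10 to 20 raises $Q$ only from 0.9 to 0.95, and $Q$ saturates towards $T$ as richness grows, whereas the transformed measure $1/(1 - Q/T)$ doubles with the number of species.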
If a dendrogram can be constructed from a trait-
based distance matrix using a clustering scheme
(Petchey & Gaston 2002), then we can apply our pro-
posed measures to quantify functional diversity; see
Chao & Jost (2011) for interpretation. Our proposed
approach can also be extended to the case of multiple
communities. The formulations of phylogenetic alpha,
beta and gamma diversities as well as the construction
of similarity (or differentiation) measures are devel-
oped in Chiu (2010). These results will be reported
in forthcoming papers.
(b) Interpretation of the new measures
For ultrametric trees, the mean diversity (in unit of species) $^q\bar{D}(T)$, defined in equation (4.3), quantifies ‘the mean effective number of species from the present to $T$ time units ago’. Here the parameter $q$ determines the diversity’s sensitivity to node (or branch segment) abundances; high values of $q$ emphasize those nodes with high relative abundances. The product of $^q\bar{D}(T)$ and $T$ is the phylogenetic diversity measure $^q\mathrm{PD}(T)$, defined in equation (4.4), and quantifies the ‘effective branch diversity’ of the phylogenetic tree. For a non-ultrametric tree, the only difference is in the replacement of $T$ by the mean evolutionary change $\bar{T}$ (the mean of the distances from root node to each of the terminal branch tips). See table 1 for a summary of the proposed measures.

[Figure 2 image: phylogenetic tree. Tip labels: Sassafras albidum, Populus grandidentata, Ostrya virginiana, Quercus rubra, Ulmus americana, Ulmus rubra, Celtis occidentalis, Prunus serotina, Acer rubrum, Acer saccharum, Fraxinus american.]
[Relative abundance (%) rows, CT | CU: 0 | 3.4; 7.6 | 4.1; 0.8 | 2.1; 30.3 | 1.4; 1.5 | 2.1; 1.5 | 17.2; 0.8 | 0; 37.9 | 34.5; 13.6 | 12.4; 4.5 | 22.8; 1.5 | 0.]
Figure 2. The combined phylogenetic tree for the species in the site CT (grey line) and site CU (black line). The age of the root for the CT site is 116.6 units and 142.3 units for the CU site and the pooled site. The species relative abundance (%) in the two sites CT and CU are shown in the last two columns (abundance data are from Shimatani 2001).
For ultrametric trees, the most complete picture of PD is provided by graphing it as a function of $T$. We recommend three profiles for $q = 0$, 1 and 2 (figure 3) and a range of time between 0 and a maximum value $T$ (such as the age of the first node or the age at which the group of interest diverges from other groups or the time of origin of life). Profiles may also be taken for fixed $T$ (using the $T$ values just described, for example) as a function of $q$. Such profiles will show the effect of taking abundances into account ($q = 0$ gives no abundance accounting, while high $q$ takes into account only the most abundant species). For non-ultrametric trees, similar recommendations can be made based on the mean base change.
Species-neutral diversity measures discussed in this paper are featured in the program SPADE (Chao & Shen 2010), which can be freely downloaded from the website http://chao.stat.nthu.edu.tw/softwareCE.html. The new PD measures will be featured in the program PhD (phylogenetic diversity) in the same website.
(c) Alternative formulation
We have developed our new measures to obey the strongest possible version of the replication principle, facilitating decomposition into independent within- and between-group components. This was accomplished by taking the average of $^qD(t)$ over the time interval $T$ using the mean derived by Jost (2007). However, some other kinds of means may also yield useful results. If we had used the ordinary mean of $^qD(t)$ over the time interval $T$, we would obtain the expectation value of $^qD(t)$ over the interval $T$. Multiplying this by $T$ would give a measure of the amount of evolutionary history embodied by the tree in this interval, or the amount of evolutionary work done on the assemblage during this interval. This product would be monotonically increasing in $T$, an advantage over the formulation we have developed above. However, this alternative mean does not obey the strong version of the replication principle, but only the following weaker one: when $N$ maximally distinct trees with equal diversities at each time $t$, and equal total abundances, are combined, the mean diversity of the combined trees is $N$ times the mean diversity of any individual tree. When this weaker version of the replication principle is deemed sufficient, the alternative formulation may be useful in some applications.
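The difference between the two kinds of mean can be sketched on a toy ultrametric tree. The example is ours and rests on one stated assumption: we read equation (4.3) as a power mean of order $1 - q$ of $^qD(t)$ over time, which follows from rewriting $\sum_i L_i a_i^q$ as the time integral of $[^qD(t)]^{1-q}$. The tree (species A and B diverging one time unit ago, their ancestor and C diverging at $T = 3$), the abundances and all function names are hypothetical.

```python
import math


def hill(abundances, q):
    """Hill number of order q for the lineages present at one instant."""
    if q == 1:
        return math.exp(-sum(a * math.log(a) for a in abundances))
    return sum(a ** q for a in abundances) ** (1 / (1 - q))


T = 3.0
# (duration, lineage abundances) for each time slice, from the present backwards.
slices = [(1.0, [0.5, 0.3, 0.2]), (2.0, [0.8, 0.2])]
# The same tree as (branch length, node abundance): tips A, B, C and the AB ancestor.
branches = [(1.0, 0.5), (1.0, 0.3), (3.0, 0.2), (2.0, 0.8)]


def mean_diversity_from_branches(branches, q, t):
    """Branch-sum form of the mean diversity (q != 1)."""
    return sum((length / t) * a ** q for length, a in branches) ** (1 / (1 - q))


def power_mean_over_time(slices, q, t):
    """Power mean of order 1 - q of the instantaneous Hill numbers (q != 1)."""
    return (sum(d * hill(a, q) ** (1 - q) for d, a in slices) / t) ** (1 / (1 - q))


def ordinary_mean_over_time(slices, q, t):
    """Arithmetic (expectation) mean of the instantaneous Hill numbers."""
    return sum(d * hill(a, q) for d, a in slices) / t


for q in (0, 2):
    print(q,
          mean_diversity_from_branches(branches, q, T),
          power_mean_over_time(slices, q, T),
          ordinary_mean_over_time(slices, q, T))
```

For $q = 0$ the two means coincide. For $q = 2$ the power mean is a time-weighted harmonic mean, so it cannot exceed the ordinary mean.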
[Figure 3 image: six panels, one column per order ($q = 0$, $q = 1$, $q = 2$). Upper row: mean diversity against time, $T$ (0 to 150). Lower row: PD($T$) against time, $T$ (0 to 150).]
Figure 3. The profile of $^q\bar{D}(T)$ and $^q\mathrm{PD}(T)$ measures for $0 < T < 150$ and $q = 0$, 1 and 2. Solid line, CT site; dashed line, CU site.

Table 2. Comparison of three types of diversities ($^q\bar{D}(T)$, $^q\bar{D}(L)$ and $^qD$) for $q = 0$, 1 and 2. The $^q\bar{D}(T)$ value is computed at $T = 142.3$, which is the age of the root of the pooled tree. See figure 3 for other values of $^q\bar{D}(T)$.

| order $q$ | site CT (thinned site): $^q\bar{D}(T = 142.3)$ | site CT: $^q\bar{D}(L = 4)$ | site CT: $^qD$ | site CU (un-thinned site): $^q\bar{D}(T = 142.3)$ | site CU: $^q\bar{D}(L = 4)$ | site CU: $^qD$ |
| --- | --- | --- | --- | --- | --- | --- |
| $q = 0$ | 5.402 | 7.25 | 10 | 5.338 | 6.750 | 9 |
| $q = 1$ | 2.660 | 3.951 | 4.967 | 2.797 | 3.904 | 5.664 |
| $q = 2$ | 1.940 | 3.187 | 3.809 | 2.054 | 3.012 | 4.548 |

This paper is dedicated to Ross Crozier, a pioneer in the phylogenetic research and in the study of genetics in social insects. Ross unfortunately passed away in November 2009. A.C. sincerely thanks Ross for his friendship of many years and for his inspiring and encouraging discussion and guidance to the field of phylogenetic diversity. We also acknowledge helpful discussions with Carlo Ricotta, Olivier Hardy and Bruno Senterre. The valuable comments and suggestions from Nick Gotelli and an anonymous reviewer helped substantially improve the paper. A.C. and C.C. were supported by Taiwan National Science Council. L.J. is grateful for support by a grant from John Moore to the Population Biology Foundation.
REFERENCES
Allen, B., Kon, M. & Bar-Yam, Y. 2009 A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. Am. Nat. 174, 236–243. (doi:10.1086/600101)
Cadotte, M. W., Davies, T. J., Regetz, J., Kembel, S. W., Clevand, E. & Oakley, T. 2010 Phylogenetic diversity metrics for ecological communities: integrating species richness, abundance and evolutionary history. Ecol. Lett. 13, 96–105. (doi:10.1111/j.1461-0248.2009.01405.x)
Cavender-Bares, J., Kozak, K. H., Fine, P. V. A. & Kembel, S. W. 2009 The merging of community ecology and phylogenetic biology. Ecol. Lett. 12, 693–715. (doi:10.1111/j.1461-0248.2009.01314.x)
Chao, A. & Jost, L. 2011 Diversity measures. In Sourcebook in theoretical ecology (eds A. Hastings & L. Gross). Berkeley, CA: University of California Press.
Chao, A. & Shen, J. 2010 SPADE: species prediction and diversity estimation. See http://chao.stat.nthu.edu.tw/softwareCE.html.
Chiu, C. H. 2010 Development and statistical estimation of phylogenetic diversity indices. PhD thesis, National Chiao Tung University, Taiwan.
Crozier, R. H. 1992 Genetic diversity and the agony of choice. Biol. Conserv. 61, 11–15. (doi:10.1016/0006-3207(92)91202-4)
Crozier, R. H. 1997 Preserving the information content of species: genetic diversity, phylogeny, and conservation worth. Annu. Rev. Ecol. Syst. 28, 243–268. (doi:10.1146/annurev.ecolsys.28.1.243)
Faith, D. P. 1992 Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10. (doi:10.1016/0006-3207(92)91201-3)
Faith, D. P. 1994 Phylogenetic pattern and the quantification of organismal biodiversity. Phil. Trans. R. Soc. Lond. B 345, 45–58. (doi:10.1098/rstb.1994.0085)
Faith, D. P. 2002 Quantifying biodiversity: a phylogenetic perspective. Conserv. Biol. 16, 248–252. (doi:10.1046/j.1523-1739.2002.00503.x)
Hannah, L. & Kay, J. A. 1977 Concentration in the modern industry: theory, measurement, and the UK experience. London, UK: MacMillan.
Hardy, O. J. & Jost, L. 2008 Interpreting and estimating measures of community phylogenetic structuring. J. Ecol. 96, 849–852. (doi:10.1111/j.1365-2745.2008.01423.x)
Hardy, O. J. & Senterre, B. 2007 Characterizing the phylogenetic structure of communities by an additive partitioning of phylogenetic diversity. J. Ecol. 95, 493–506. (doi:10.1111/j.1365-2745.2007.01222.x)
Hill, M. O. 1973 Diversity and evenness: a unifying notation and its consequences. Ecology 54, 427–431. (doi:10.2307/1934352)
Hulbert, S. 1971 The nonconcept of diversity: a critique and alternative parameters. Ecology 52, 577–586. (doi:10.2307/1934145)
Izsák, J. & Papp, L. 2000 A link between ecological diversity indices and measures of biodiversity. Ecol. Model. 130, 151–156. (doi:10.1016/S0304-3800(00)00203-9)
Jost, L. 2006 Entropy and diversity. Oikos 113, 363–375. (doi:10.1111/j.2006.0030-1299.14714.x)
Jost, L. 2007 Partitioning diversity into independent alpha and beta components. Ecology 88, 2427–2439. (doi:10.1890/06-1736.1)
Jost, L. 2008 G_ST and its relatives do not measure differentiation. Mol. Ecol. 17, 4015–4026. (doi:10.1111/j.1365-294X.2008.03887.x)
Jost, L. 2009 Mismeasuring biological diversity: response to Hoffman and Hoffman (2008). Ecol. Econ. 68, 925–927. (doi:10.1016/j.ecolecon.2008.10.015)
Jost, L. & Chao, A. In preparation. Diversity analysis. London, UK: Taylor and Francis.
Jost, L., DeVries, P., Walla, T., Greeney, H., Chao, A. & Ricotta, C. 2010 Partitioning diversity for conservation analyses. Divers. Distrib. 16, 65–76. (doi:10.1111/j.1472-4642.2009.00626.x)
MacArthur, R. H. 1965 Patterns of species diversity. Biol. Rev. 40, 510–533. (doi:10.1111/j.1469-185X.1965.tb00815.x)
MacArthur, R. H. 1972 Geographical ecology. New York, NY: Harper & Row.
Magurran, A. E. 2004 Measuring biological diversity. Oxford, UK: Blackwell Science.
Nanney, D. L. 2004 No trivial pursuit. Bioscience 54, 720–721. (doi:10.1641/0006-3568(2004)054[0720:NTP]2.0.CO;2)
Patil, G. P. & Taillie, C. 1982 Diversity as a concept and its measurement. J. Am. Stat. Assoc. 77, 548–567. (doi:10.2307/2287709)
Pavoine, S., Love, M. S. & Bonsall, M. B. 2009 Hierarchical partitioning of evolutionary and ecological patterns in the organization of phylogenetically-structured species assemblages: applications to rockfish (genus: Sebastes) in the Southern California Bight. Ecol. Lett. 12, 898–908. (doi:10.1111/j.1461-0248.2009.01344.x)
Peet, R. K. 1974 The measurement of species diversity. Annu. Rev. Ecol. Syst. 5, 285–307. (doi:10.1146/annurev.es.05.110174.001441)
Petchey, O. L. & Gaston, K. J. 2002 Functional diversity (FD), species richness and community composition. Ecol. Lett. 5, 402–411. (doi:10.1046/j.1461-0248.2002.00339.x)
Pielou, E. C. 1975 Ecological diversity. New York, NY: John Wiley and Sons.
Purvis, A. & Hector, A. 2000 Getting the measure of biodiversity. Nature 405, 212–219. (doi:10.1038/35012221)
Rao, C. R. 1982 Diversity and dissimilarity coefficients: a unified approach. Theor. Popul. Biol. 21, 24–43. (doi:10.1016/0040-5809(82)90004-1)
Ricotta, C. & Szeidl, L. 2006 Towards a unifying approach to diversity measures: bridging the gap between the Shannon entropy and Rao’s quadratic index. Theor. Popul. Biol. 70, 237–243. (doi:10.1016/j.tpb.2006.06.003)
Ricotta, C. & Szeidl, L. 2009 Diversity partition of Rao’s quadratic entropy. Theor. Popul. Biol. 76, 299–302. (doi:10.1016/j.tpb.2009.10.001)
Routledge, R. D. 1979 Diversity indices: which ones are admissible. J. Theor. Biol. 76, 503–515. (doi:10.1016/0022-5193(79)90015-8)
Shimatani, K. 2001 On the measurement of species diversity incorporating species differences. Oikos 93, 135–147. (doi:10.1034/j.1600-0706.2001.930115.x)
Solow, A. R. & Polasky, S. 1994 Measuring biological diversity. Environ. Ecol. Stat. 1, 95–107. (doi:10.1007/BF02426650)
Solow, A. R., Polasky, S. & Broadus, J. 1993 On the measurement of biological diversity. J. Environ. Econ. Manage. 24, 60–68. (doi:10.1006/jeem.1993.1004)
Tsallis, C. 1988 Possible generalization of Boltzmann–Gibbs statistics. J. Stat. Phys. 52, 480–487.
Vane-Wright, R. I., Humphries, C. J. & Williams, P. M. 1991 What to protect: systematics and the agony of choice. Biol. Conserv. 55, 235–254. (doi:10.1016/0006-3207(91)90030-D)
Vellend, M., Cornwell, W. K., Magnuson-Ford, K. & Mooers, A. Ø. 2010 Measuring phylogenetic biodiversity. In Biological diversity: frontiers in measurement and assessment (eds A. Magurran & B. McGill). Oxford, UK: Oxford University Press.
Warwick, R. M. & Clarke, K. R. 1995 New ‘biodiversity’ measures reveal a decrease in taxonomic distinctness with increasing stress. Mar. Ecol. Prog. Ser. 129, 301–305. (doi:10.3354/meps129301)
Webb, C. O. 2000 Exploring the phylogenetic structure of ecological communities: an example from rain forest trees. Am. Nat. 156, 145–156. (doi:10.1086/303378)
Webb, C. O. & Donoghue, M. J. 2004 Phylomatic: tree assembly for applied phylogenetics. Mol. Ecol. Notes 5, 181–183. (doi:10.1111/j.1471-8286.2004.00829.x)
Weikard, H. P., Punt, M. & Wessler, J. 2006 Diversity measurement combining relative abundances and taxonomic distinctiveness of species. Divers. Distrib. 12, 215–217. (doi:10.1111/j.1366-9516.2006.00234.x)
Weitzman, M. L. 1992 On diversity. Q. J. Econ. 107, 363–405. (doi:10.2307/2118476)
Weitzman, M. L. 1998 The Noah’s Ark problem. Econometrica 66, 1279–1298. (doi:10.2307/2999617)
Whittaker, R. H. 1972 Evolution and measurement of species diversity. Taxon 12, 213–251.
... Amazonian rainforests are characterized by high abundance and diversity of woody angiosperms, where a limited number of species dominate the community Draper et al., 2021). The relatively low number of dominant taxa means that tropical communities usually harbour extremely large numbers of rare species (Ter Steege et al., 2013;Leitão et al., 2016;Draper et al., 2021;Cazzolla Gatti et al., 2022). Many studies have been conducted in Amazonian rainforests over the last century, but none analysed the latitudinal gradients of woody plants within tropical regions by considering all three diversity attributes comprising TD, FD, and PD. ...
... Hence, the contribution of rare species to diversity decreases as the value of the parameter q increases. Hill numbers were typically only used for TD but Chao et al. (2010) extended them to PD based on the phylogenetic distances between species, and Chiu and Chao (2014) extended them to FD based on the functional distances between species traits. ...
... All entities are treated as taxonomically, functionally and phylogenetically equally distinct. Hill numbers for q = 0, q = 1, and q = 2 were obtained for the three diversity attributes using the 'renyi' function in the vegan package for TD, with the code provided by Chiu and Chao (2014) for FD, and with the 'ChaoPD' function in the entropart package (Chao et al., 2010) for PD. Using different Hill numbers to explore diversity allowed us to examine how rare, abundant, and dominant species responded to the environmental factors and latitude. ...
Article
Full-text available
Elucidating how environmental factors drive plant species distributions and how they affect latitudinal diversity gradients, remain essential questions in ecology and biogeography. In this study we aimed: 1) to investigate the relationships between all three diversity attributes, i.e ., taxonomic diversity (TD), functional diversity (FD), and phylogenetic diversity (PD); 2) to quantify the latitudinal variation in these diversity attributes in western Amazonian terra firme forests; and 3) to understand how climatic and edaphic drivers contribute to explaining diversity patterns. We inventoried ca. 15,000 individuals from ca. 1,250 species, and obtained functional trait records for ca. 5,000 woody plant individuals in 50 plots of 0.1 ha located in five terra firme forest sites spread over a latitudinal gradient of 1200 km covering ca . 10°C in latitude in western Amazonia. We calculated all three diversity attributes using Hill numbers: q = 0 (richness), q = 1 (richness weighted by relative abundance), and q = 2 (richness weighted by dominance). Generalized linear mixed models were constructed for each diversity attribute to test the effects of different uncorrelated environmental predictors comprising the temperature seasonality, annual precipitation, soil pH and soil bulk density, as well as accounting for the effect of spatial autocorrelation, i.e ., plots aggregated within sites. We confirmed that TD ( q = 0, q = 1, and q = 2), FD ( q = 0, q = 1, and q = 2), and PD ( q = 0) increased monotonically towards the Equator following the latitudinal diversity gradient. The importance of rare species could explain the lack of a pattern for PD ( q = 1 and q = 2). Temperature seasonality, which was highly correlated with latitude, and annual precipitation were the main environmental drivers of variations in TD, FD, and PD. All three diversity attributes increased with lower temperature seasonality, higher annual precipitation, and lower soil pH. 
We confirmed the existence of latitudinal diversity gradients for TD, FD, and PD in hyperdiverse Amazonian terra firme forests. Our results agree well with the predictions of the environmental filtering principle and the favourability hypothesis, even acting in a 10°C latitudinal range within tropical climates.
... Species richness is a diversity index of order 0, Shannon entropy is a diversity index of order one, and all Simpson measures are diversity indices of order two. The order q determines a diversity measure's sensitivity to rare or common species (Hill, 1973;Keylock, 2005;Jost, 2007;Chao et al., 2010Chao et al., , 2014. ...
... The Hill numbers are a parametric family of diversity indexes differing among themselves only by the parameter q that determines sensitivity to species relative abundance. It provides the number of equally abundant species that are needed to give the same value of a diversity measures (Chao et al., 2010(Chao et al., , 2014. Compare the diversity calculation results of simulated sample plots (Table 4) and the investigated natural forest sample plots (Table 5), it is found that the trend of species richness and evenness of new diversity D RE (SR and SE) with Hill numbers (q = 0 and q = 2) are almost identical. ...
Article
Full-text available
The significance of biodiversity research is to understand the structure and function of the community, and then to protect and monitor the community. The metric of biodiversity is the base of biodiversity conservation. Species richness and evenness are the most common descriptors of biodiversity. Whether it is diversity information measure, probability measure or geometric measure, they all express the combination of species richness and evenness in different ways. This study presents a new biologically meaningful measure of species diversity, which evaluates species richness and evenness independently, designated as DRE. The novelty of our method is to use “absolute discrepancy” to express the dissimilarity between the observed community and the uniform distribution community with the same species composition and same abundance of each species, and then measure the species evenness. The logarithmic transformation of the species number is used to measure species richness with values ranging between 0 and 1. We test the performance of this measure using simulated data and observations of natural and planted forests in different climatic zones. The results showed that the new diversity index (DRE) has superior statistical qualities compared with the traditional indices. Especially, in extremely uneven communities, the new measure describes the causes of diversity changes than the traditional DRE. In addition, DRE is more sensitive to the abundance changes of rare species in the simulated community, and the interpretation of the results is more intuitive and meaningful. It is an improved method to evaluate the species diversity of any ecosystem.
... They are also congruent with the global reptile patterns described byRoll et al. (2017) andGumbs et al. (2020) in finding higher diversities in the tropics. However, the PD pattern using Faith's PD differs substantially from our findings using the MPD, due to the former being a phylogenetic generalization of species richness(Chao & Jost, 2010). Concerning the FD, we found a non-monotonic relationship with species richness, similar to a previous study focused solely on lizards(Vidan et al., 2019). ...
Article
Full-text available
Aim: Our aim is to document the dimensions of current squamate reptile biodiversity in the Americas by integrating taxonomic, phylogenetic, and functional data, and assessing how this may vary across phylogenetic scales. We also explore the potential underlying mechanisms that may have originated the observed geographical diversity patterns. Location: The Americas. Time period: Present. Major taxa: Squamata. Methods: We used published data on the distribution, phylogeny, and body size of squamate reptiles to document the current dimensions of their alpha diversity in the Americas. We overlapped species ranges to estimate taxonomic diversity (TD) and calculated phylogenetic diversity using mean phylogenetic pairwise distance (MPD), speciation rate (DivRate), and Faith’s index (PD). In addition, we estimated functional diversity as trait dispersion in the multivariate space using body size and leg development data. We implemented a deconstructive macroecological approach using spatial autoregressive models to understand how spatial mismatches between the three facets of diversity vary across phylogenetic scales, and the potential eco-evolutionary mechanisms driving these patterns across the geography. Results: We found a strong latitudinal gradient of taxonomic diversity with a large accumulation in tropical regions. Phylogenetic and functional diversity patterns were largely congruent given the high phylogenetic signal in the traits used, and higher values tended to be concentrated in harsh and/or heterogeneous environments. We found differences between major clades within Squamata that display contrasting geographical patterns. Several regions across the continent shared the same spatial mismatches between dimensions across clades, suggesting that similar eco-evolutionary processes are shaping these regional reptile assemblages. However, we also found evidence that non-mutually exclusive processes can operate differently across clades. 
Main conclusions: The implementation of a deconstructive macroecological approach across different facets of biodiversity enhances our capacity to establish spatially explicit eco-evolutionary hypotheses regarding the processes that may be shaping biodiversity. Further studies are necessary to test the role of additional ecological factors driving these geographic patterns of diversity across Squamata. Our approach is based on macroecological and macroevolutionary theory and therefore can be used in other taxa with the aim to test how these eco-evolutionary mechanisms are varying across geography.
... Before performing sequence statistical analysis, the number of sequences was normalized to the sample with the smallest number of sequences (60779) (Zhong et al., 2016). Hill numbers including Richness index, Shannon index, Simpson index and Pielou evenness index were calculated to evaluate the α-diversities of bacterial communities in each geographical region (QD, YRH, and HX) using the "hilldiv" package (Hill 1973; Chao et al., 2010). The Mann-Whitney U test was used to examine the difference in alpha diversity between any two regions, and p-values < 0.05 were considered significant (Li et al., 2019). ...
Article
We analyze bacterial composition, diversity, geographical distribution, and community networks in lake water in three adjacent regions on the Qinghai-Tibet Plateau (QTP). Results show that bacterial alpha-diversity indices are much lower in the Hoh Xil (HX) region than in the Yellow River Headwater (YRH) and Qaidam (QD) regions. The dominant phylum in QD and YRH is Proteobacteria, which accounts for 42.45 % and 43.64 % of all detected phyla, respectively, while Bacteroidetes is the dominant bacterial taxon in HX (46.07 %). Redundancy analysis results suggest that the most important factors driving bacterial community composition in the three regions are altitude (QD), total nitrogen (YRH), and pH (HX), respectively. Both environmental and spatial factors significantly affect the bacterial community composition in QD and HX, while only environmental factors are major drivers in YRH. Finally, network analyses reveal that the bacterial network structure in QD is more complex than those in YRH and HX, whereas the bacterial network in HX is the most stable, followed by those in QD and YRH.
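As a hedged illustration of the Hill-number calculation mentioned in the citing excerpt for this entry, the following minimal Python sketch computes the effective number of species of order q from a vector of abundances. It does not reproduce the "hilldiv" package; the function name `hill_number` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def hill_number(abundances, q):
    """Effective number of species (Hill number) of order q.

    abundances: non-negative counts or relative abundances, one per species.
    q: diversity order (q = 0 richness, q = 1 exp(Shannon), q = 2 inverse Simpson).
    """
    total = sum(abundances)
    # Relative abundances of the species that are actually present.
    p = [a / total for a in abundances if a > 0]
    if q == 1:
        # The general formula is undefined at q = 1; its limit is the
        # exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


# Hypothetical counts for one sample (an assumption, not data from the source).
counts = [50, 30, 15, 5]
for order in (0, 1, 2):
    print(order, hill_number(counts, order))
```

For a community of S equally abundant species, every order q returns S, which is what makes the values interpretable as effective numbers of species.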
... The effective absence of phylogenetic mitigation recovers the theoretical expectation that PD is a generalized form of richness index (Chao et al., 2010). Studies have found high correlation between PD and species richness (Davies & Buckley, 2011; Dias et al., 2020; Safi et al., 2011). ...
Article
To help address the underrepresentation of arthropods and Asian biodiversity from climate-change assessments, we carried out year-long, weekly sampling campaigns with Malaise traps at different elevations and latitudes in Gaoligongshan National Park in southwestern China. From these 623 samples, we barcoded 10,524 beetles and compared scenarios of climate-change-induced biodiversity loss, by designating seasonal, elevational, and latitudinal subsets of beetles as communities that plausibly could go extinct as a group, which we call 'loss sets.' The availability of a published mitochondrial-genome-based phylogeny of the Coleoptera allowed us to compare the loss of species diversity with and without accounting for phylogenetic relatedness. We hypothesised that phylogenetic relatedness would mitigate extinction, since the extinction of any loss set would result in the disappearance of all its species but only part of its evolutionary history, which is still extant in the remaining loss sets. We found different patterns of community clustering by season and latitude, depending on whether phylogenetic information was incorporated. However, accounting for phylogeny only slightly mitigated the amount of biodiversity loss under climate change scenarios, against our expectations: there is no phylogenetic "escape clause" for biodiversity conservation. We obtained the same results whether phylogenetic information was derived from the mitogenome phylogeny or from a de novo barcode-gene tree. We encourage interested researchers to use this dataset to study lineage-specific community assembly patterns in conjunction with life-history traits and environmental covariates.
... In particular, another type of curve-based index, the Hill's diversity index, emphasizes the relative (but not absolute) abundances and integrates measures of species richness and species abundance into a diversity profile curve (asymptote) (Kondratyeva et al. 2019; Saha et al. 2022). Hill numbers can be effectively generalized to incorporate taxonomic, phylogenetic, and functional diversity, and thus provide a unified framework for measuring biodiversity (Chao et al. 2010; Gotelli and Chao 2013). ...
Article
Wetland diversity metrics are critical indicators of habitat carrying capacity and anthropogenic disturbance on ecosystem health. This study aimed to compare the composition and ecological features of bird communities across gradients of habitat effects within wetland areas of freshwater tropical lakes. For this study, the total number, diversity, and dominance of bird communities (n = 519; 29 species) within a 1 km radius riparian area of four tropical lakes were measured and compared alongside habitat effect variables. Three categories of habitat effect variables, i.e., lake morphology (surface area of lake, lake perimeter, shape index i.e., form factor), landscape structure (vegetation density within 1 km radius measured using Normalized Difference Vegetation Index [NDVI]), and human disturbance/urbanization features (distance to major road, distance to residential area), were assessed across the four lake wetlands. Non-metric multidimensional scaling (NMDS) of the six habitat effect variables, with Hill numbers diversity as dependent variables, showed that variation in bird species composition was strongly negatively correlated with lake shape index, and positively correlated with vegetation density (NDVI) and distance to the road (indicating complementary effects). This implies that wetlands with heterogeneously shaped lakes, greater vegetation density (NDVI), and greater distance from major highways (urbanization features) were more likely to attract and accommodate more diverse bird communities. Furthermore, the habitat-specific avian guild structure highlighted the carrying capacity of the different lake wetlands. The study findings imply that the conservation of wetland avifauna will require emphasis on habitat-specific variables.
... Another direct view is that the diversity of a set is just the totality of the dissimilarities between the elements of a set. 14 This view is not sensitive to the distribution of the totality of dissimilarities among individual Papp (2000); for diversity as number and evenness, see Shannon and Weaver (1949), Simpson (1949), McIntosh (1967), Junge (1994), and Justus (2011); for diversity as number and dissimilarities, see Helmus et al. (2007), Chao, Chiu, and Jost (2010) and Dowden (2011); for diversity as abundance and dissimilarities, see Barker (2002); for diversity as evenness and dissimilarities, see Barker (2002), Ricotta (2004); for diversity as number, evenness and dissimilarities, see Warwick and Clarke (1995), Ganeshaiah, Chandrashekara, and Kumar (1997), Shimatani (2001), Mason et al. (2003), Ricotta and Avena (2003), Weikard, Punt, and Wesseler (2006), Helmus et al. (2007), Stirling (2007), and Allen, Kon, and Bar-Yam (2009). 11 For diversity as dissimilarities to elements in the universal set, see Gravel (2009) and Eiswerth and Haney (1992). ...
Article
A large number of diversity measures have been proposed in the literature, almost all of which are designed for sets with elements that differ in multiple aspects. However, these types of measures are not appropriate for sets with elements that differ in a single aspect only, such as duration, value or probability. In this essay I present a new measure of diversity, designed specifically for single aspect diversity. The measure captures the intuitive idea that single aspect diversity is affected by the maximal dissimilarity between the elements, the number of different elements, and the even spacing of the elements’ aspect values, when represented as points in one dimension. I show that the measure satisfies six plausible conditions for a single aspect diversity measure, which concern scale independence, minimal diversity, maximal dissimilarity, restricted function growth, strict monotonicity, and even spacing. The measure is also characterised.
Article
The Huasteca Potosina region has a relevant landscape heritage of biocultural diversity, due to high biological diversity and the presence of the Teenek (Huastec Mayan), Nahua, and Xi’iuy (Pame) ethnic groups. The object of this study is to analyze, among the different cultural groups of the region, how the performances of the relevant Socioecological Systems (SESs) influence the conservation of biocultural diversity. Quantitative approaches are used to determine the expected trends of indices (Informant Consensus Factor, ICF; Cultural Importance Index, CII; Shannon–Wiener Biodiversity Index, SWI) commonly used in the ethnobotanical field. Data on the main domestic forest species used by the groups mentioned above were collected in 2021. We analyzed the SES profile for each of the ethnic groups and a mestizo group, as well as their relationship with the biome they mainly inhabit and the domestic functions fulfilled by the ethnobotanical species. As a result, we found that the low deciduous forest and the sub-evergreen tropical forest biomes, which co-evolved mainly with the Nahua and the Teenek SESs, present higher diversity and effective use of species and thus offer better chances for conserving the landscape heritage of biocultural diversity. Conversely, the results also highlight the critical situation of the biomes inhabited by the Pame and mestizo SESs.
Article
Functional diversity is an important component of biodiversity, yet in comparison to taxonomic diversity, methods of quantifying functional diversity are less well developed. Here, we propose a means for quantifying functional diversity that may be particularly useful for determining how functional diversity is related to ecosystem functioning. This measure of functional diversity, "FD", is defined as the total branch length of a functional dendrogram. Various characteristics of FD make it preferable to other measures of functional diversity, such as the number of functional groups in a community. Simulating species' trait values illustrates how the relative importance of richness and composition for FD depends on the effective dimensionality of the trait space in which species separate. Fewer dimensions increase the importance of community composition and functional redundancy. More dimensions increase the importance of species richness and decrease functional redundancy. Clumping of species in trait space increases the relative importance of community composition. Five natural communities show remarkably similar relationships between FD and species richness.
Article
This paper puts forth the view that diversity is an average property of a community and identifies that property as species rarity. An intrinsic diversity ordering of communities is defined and is shown to be equivalent to stochastic ordering. Also, the sensitivity of an index to rare species is developed, culminating in a crossing-point theorem and a response theory to perturbations. Diversity decompositions, analogous to the analysis of variance, are discussed for two-way classifications and mixtures. The paper concludes with a brief survey of genetic diversity, linguistic diversity, industrial concentration, and income inequality.
Book
"Measuring Biological Diversity assumes no specialist mathematical knowledge and includes worked examples and links to web-based software. It will be essential reading for all students, researchers, and managers who need to measure biological diversity."--BOOK JACKET.
Article
When pairwise differences (relatedness) between species are numerically given, the average of the species differences weighted by relative frequencies can be used as a species diversity index. This paper first theoretically develops the indices of this type, then applies them to forestry data. As examples of diversity indices, this paper explores the taxonomic diversity and the newly introduced amino acid diversity, which is a modification of the nucleotide diversity in genetics. The first, mathematical part shows that both indices can be decomposed into three inner factors: evenness of relative frequencies (= the Simpson index), the simple average over species differences regardless of relative frequencies, and the taxonomic or genetic balance in relative frequencies. The taxonomic diversity has another decomposition: the sum over the Simpson indices at all the taxonomic levels. The second part examines the effects of different forest management techniques on diversity. It is shown that a thinning operation for promoting survival of specific desirable species also contributed to increasing the taxonomic diversity. If we calculated only conventional indices that do not incorporate species relatedness, we would simply conclude that the thinning did not significantly affect the diversity. The theoretical developments of the first part complement the result, leading us to a better interpretation of contrasting vegetation structures. The mathematical results also reveal that the amino acid diversity involves redundant species, which is undesirable when measuring diversity; hence, this index is used to demonstrate crucial points when we introduce species relatedness. The results suggest further possibilities of applying diversity indices incorporating species differences to a variety of ecological studies.
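The idea of averaging pairwise species differences weighted by relative frequencies can be sketched in a few lines of Python. The sketch below uses one common form of this average, Rao's quadratic entropy, Q = sum over i and j of d_ij p_i p_j; normalisation conventions differ between published indices, so this is an illustration rather than the exact index of the paper above. The function name `quadratic_entropy` and the example frequencies and distances are hypothetical.

```python
def quadratic_entropy(p, d):
    """Abundance-weighted average pairwise difference, sum_ij d[i][j] * p[i] * p[j].

    p: relative frequencies of the species (should sum to 1).
    d: square matrix of pairwise differences with zeros on the diagonal.
    """
    n = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


# Hypothetical relative frequencies for three species.
p = [0.5, 0.3, 0.2]

# If every pair of distinct species is equally different (d = 1), the index
# ignores relatedness and reduces to the Gini-Simpson index, 1 - sum(p_i^2).
unit = [[0 if i == j else 1 for j in range(3)] for i in range(3)]
print(quadratic_entropy(p, unit))

# Hypothetical taxonomic distances: species 0 and 1 are close relatives,
# species 2 is distant from both.
taxo = [[0, 1, 3],
        [1, 0, 3],
        [3, 3, 0]]
print(quadratic_entropy(p, taxo))
```

Comparing the two calls shows how the same relative frequencies yield different values once species relatedness enters through the distance matrix.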