ABSTRACT: Taylor's Law (TL) describes the scaling relationship between the mean and
variance of one or more populations as a power-law. TL is widely observed in
ecological systems across space and time with exponents varying largely between
1 and 2. Many ecological explanations have been proposed for TL but it is also
commonly observed outside ecology. We propose that TL arises from the
constraining influence of two primary variables: the number of individuals and
the number of censuses or sites. We show that most possible configurations of
individuals among censuses or sites produce the power-law form of TL with
exponents between 1 and 2. This feasible set approach suggests that TL is a
statistical pattern driven by two constraints, providing an a priori
explanation for this ubiquitous pattern. However, the exact form of any
specific mean-variance relationship cannot be predicted in this way, suggesting
that TL may still contain ecological information.
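The feasible-set argument above can be illustrated with a small simulation. This is a toy sketch, not the authors' code: it assumes that "possible configurations" can be approximated by uniform random weak compositions of N individuals among k sites (drawn with the stars-and-bars construction), and it estimates the TL exponent as the slope of log variance against log mean. The function names, k = 20, the list of totals, and the number of replicates are all illustrative choices.

```python
import math
import random
import statistics


def random_composition(n, k, rng):
    """Draw a weak composition of n into k non-negative parts, uniformly
    at random, by placing k - 1 bars among n + k - 1 positions."""
    bars = sorted(rng.sample(range(n + k - 1), k - 1))
    parts, prev = [], -1
    for bar in bars:
        parts.append(bar - prev - 1)  # stars between consecutive bars
        prev = bar
    parts.append(n + k - 1 - prev - 1)  # stars after the last bar
    return parts


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def taylor_exponent(totals, k, rng, replicates=50):
    """Estimate the TL exponent across several totals N.

    For each N the mean per site is fixed at N / k; the variance is
    averaged over random configurations before taking logs."""
    log_means, log_vars = [], []
    for n in totals:
        variances = [
            statistics.variance(random_composition(n, k, rng))
            for _ in range(replicates)
        ]
        log_means.append(math.log10(n / k))
        log_vars.append(math.log10(statistics.fmean(variances)))
    return ols_slope(log_means, log_vars)


rng = random.Random(0)  # fixed seed so the run is repeatable
totals = [50, 100, 200, 400, 800, 1600, 3200]
z = taylor_exponent(totals, k=20, rng=rng)
print(f"estimated TL exponent: {z:.2f}")
```

Other sampling schemes (for example, integer partitions rather than compositions) would define the feasible set differently and may give different exponents.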
ABSTRACT: We describe a set of best practices for scientific software development, based on research and experience, that will improve scientists' productivity and the reliability of their software.
ABSTRACT: Ecological patterns are often used to infer the importance of ecological processes. However, it is rarely known whether sufficient variation exists among all possible forms of a pattern (i.e. the feasible set) to distinguish the effects of different processes. Despite astronomically large feasible sets, ecological patterns such as distributions of abundance can closely mirror the average and the majority of all possible forms. Feasible sets provide rich and informative contexts for examining empirical patterns and can reveal powerful influences of ecological constraint combinations, fundamental explanations for common patterns, and unanticipated problems of ecological indices.
ABSTRACT: Conservation strategies depend on a basic understanding of how communities are spatially structured. The Maximum Entropy Theory of Ecology (METE) proposes that communities may be in the most likely spatial distribution given two ecological constraints: total number of species and total number of individuals. We developed and tested the spatial predictions of METE using the species-area and distance-decay relationships across a global forest dataset. The METE predictions were extremely accurate for the species-area relationship, but METE generally predicted steeper patterns of distance-decay than observed empirically. Our results demonstrate that ecological constraints can inform predictions of spatial community structure.
ABSTRACT: Background/Question/Methods
The species-abundance distribution (SAD) is a fundamental pattern in community ecology yet until recently there was not an a priori model for the shape of the SAD. The Maximum Entropy Theory of Ecology (METE) provides a statistical framework that predicts the shape of the SAD from two key empirical constraints: the total number of species (S) and the total number of individuals (N). The goal of our project is to test how well S and N and subsequently the SAD can be predicted using remotely-sensed environmental data across 6 continental-scale datasets that encompass birds, trees, butterflies, and small mammals.
In general, the environmental variables explained approximately equal amounts of variance in S (average R² = 0.42) and N (average R² = 0.38). Additionally, we observed a positive correlation between the R² value of S and N. Predictions of S and N were most accurate for mammals and trees. Winter and summer bird communities were equally predictable. We explained the least amount of variance in S and N for the butterfly dataset. The predicted SADs based on predicted S and N were surprisingly accurate (average R² = 0.62), indicating that the exact empirical values of S and N are not necessary to generate reasonable empirical predictions of the SAD using the METE approach. Our findings suggest that remotely-sensed environmental data can provide a quick and relatively accurate method of predicting the pattern of dominance and rarity in a community that has yet to be sampled. Additionally, our results demonstrate how a constraint-based, maximum entropy approach can be combined with other modeling approaches to yield simple yet powerful predictions using relatively little information.
ABSTRACT: Increasingly large amounts of ecological and environmental data are available for analysis. Using existing data can save time and money, allow us to address otherwise intractable problems, and provide general answers to ecological questions. I will discuss why we should be actively using this data in ecology, how to get started, and give examples of what can be accomplished if we embrace an era of big data in ecology.
ABSTRACT: Background/Question/Methods
General theories for macroecological patterns have become increasingly prevalent in the last decade. These theories potentially allow predictions to be made in the absence of detailed understanding of the processes structuring an ecosystem. We discuss research testing one of these general theories, the Maximum Entropy Theory of Ecology, which posits that many macroecological patterns are emergent statistical phenomena. If this theory is correct, it would mean that the form of many common patterns in ecology could be unlocked simply by knowing the total number of individuals and species in a system, and the total metabolic energy use of all of the individuals. To provide the most general test of the theory possible we compare it to large ecological datasets containing thousands of sites and species, and millions of individuals.
Three major macroecological patterns (the species-abundance distribution, the species-area relationship, and the individual size distribution) are well predicted by the theory (R² > 0.8). Three other patterns (the distance decay of similarity, the relationship between the size of a species and the number of individuals, and the distribution of individual body sizes within a species) are not well predicted by the theory. We discuss what the failures of the current theory suggest for future iterations of this approach.
ABSTRACT: The Maximum Entropy Theory of Ecology (METE) is a unified theory of
biodiversity that predicts a large number of macroecological patterns using
only information on the species richness, total abundance, and total metabolic
rate of the community. We conducted a strong test of METE, where four of its
major predictions were evaluated simultaneously using data from 60 globally
distributed communities including over 300,000 individuals and nearly 2000
species. While METE successfully captured 96% and 93% of the variation in the
species abundance distribution and the individual size distribution, it
performed poorly when characterizing the size-density relationship and the
intraspecific distribution of individual body size. Specifically, METE predicts
a negative correlation between individual energy use and species abundance,
which is weak in natural communities. By evaluating multiple predictions with
large quantities of data, our study not only identifies a mismatch between
abundance and body size in METE, but also serves as a general example on the
importance of conducting strong tests of ecological theories.
ABSTRACT: The species abundance distribution (SAD) is one of the most intensively studied distributions in ecology and its hollow-curve shape is one of ecology's most general patterns. We examine the SAD in the context of all possible forms having the same richness (S) and total abundance (N), i.e. the feasible set. We find that feasible sets are dominated by similarly shaped hollow curves, most of which are highly correlated with empirical SADs (most R² values > 75%), revealing a strong influence of N and S on the form of the SAD and an a priori explanation for the ubiquitous hollow curve. Empirical SADs are often more hollow and less variable than the majority of the feasible set, revealing exceptional unevenness and relatively low natural variability among ecological communities. We discuss the importance of the feasible set in understanding how general constraints determine observable variation and influence the forms of predicted and empirical patterns.
ABSTRACT: Studies of biodiversity typically assume that all species are equivalent. However, some species in a community maintain viable populations in the study area, while others occur only occasionally as transient individuals. Here we show that North American bird communities can reliably be divided into core and transient species groups and that the richness of each group is driven by different processes. The richness of core species is influenced primarily by local environmental conditions, while the richness of transient species is influenced primarily by the heterogeneity of the surrounding landscape. This demonstrates that the well-known effects of the local environment and landscape heterogeneity on overall species richness are the result of two sets of processes operating differentially on core and transient species. Models of species richness should focus on explaining two distinct patterns, those of core and transient species, rather than a single pattern for the community as a whole.
The American Naturalist 04/2013; 181(4):E83-E90. · 4.55 Impact Factor
ABSTRACT: Ecological research relies increasingly on the use of previously collected data. Use of existing datasets allows questions to be addressed more quickly, more generally, and at larger scales than would otherwise be possible. As a result of large-scale data collection efforts, and an increasing emphasis on data publication by journals and funding agencies, a large and ever-increasing amount of ecological data is now publicly available via the internet. Most ecological datasets do not adhere to any agreed-upon standards in format, data structure or method of access. Some may be broken up across multiple files, stored in compressed archives, and violate basic principles of data structure. As a result, acquiring and utilizing available datasets can be a time-consuming and error-prone process. The EcoData Retriever is an extensible software framework which automates the tasks of discovering, downloading, and reformatting ecological data files for storage in a local data file or relational database. The automation of these tasks saves significant time for researchers and substantially reduces the likelihood of errors resulting from manual data manipulation and unfamiliarity with the complexities of individual datasets.
PLoS ONE 01/2013; 8(6):e65848. · 3.53 Impact Factor
ABSTRACT: The Maximum Entropy Theory of Ecology (METE) predicts a universal species-area relationship (SAR) that can be fully characterized using only the total abundance (N) and species richness (S) at a single spatial scale. This theory has shown promise for characterizing scale dependence in the SAR. However, there are currently four different approaches to applying METE to predict the SAR and it is unclear which approach should be used due to a lack of empirical comparison. Specifically, METE can be applied recursively or non-recursively and can use either a theoretical or observed species-abundance distribution (SAD). We compared the four different combinations of approaches using empirical data from 16 datasets containing over 1000 species and 300,000 individual trees and herbs. In general, METE accurately downscaled the SAR (R² > 0.94), but the recursive approach consistently under-predicted richness. METE's accuracy did not depend strongly on using the observed or predicted SAD. This suggests that the best approach to scaling diversity using METE is to use a combination of non-recursive scaling and the theoretical abundance distribution, which allows predictions to be made across a broad range of spatial scales with only knowledge of the species richness and total abundance at a single scale.
ABSTRACT: Macroecological patterns such as the species-area relationship (SAR), the species-abundance distribution (SAD), and the species-time relationship (STR) exhibit regular behavior across ecosystems and taxa. However, determinants of these patterns remain poorly understood. Emerging theoretical frameworks for macroecology attempt to understand this regularity by ignoring detailed ecological interactions and focusing on the influence of a small number of community-level state variables, such as species richness and total abundance, on these patterns. We present results from a 15-year rodent removal experiment evaluating the response of three different macroecological patterns in two distinct annual plant communities (summer and winter) to two levels of manipulated seed predation. Seed predator manipulations significantly impacted species composition on all treatments in both communities, but did not significantly impact richness, community abundance, or macroecological patterns in most cases. However, winter community abundance and richness responded significantly to the removal of all rodents. Changes in richness and abundance were coupled with significant shifts in macroecological patterns (SADs, SARs, and STRs). Because altering species interactions only impacted macroecological patterns when the state variables of abundance and richness also changed, we suggest that, in this system, local-scale processes primarily act indirectly through these properties to determine macroecological patterns.
ABSTRACT: Scientists spend an increasing amount of time building and using software.
However, most scientists are never taught how to do this efficiently. As a
result, many are unaware of tools and practices that would allow them to write
more reliable and maintainable code with less effort. We describe a set of best
practices for scientific software development that have solid foundations in
research and experience, and that improve scientists' productivity and the
reliability of their software.
ABSTRACT: Background/Question/Methods
The allocation of mass and energy among individuals and species within an assemblage has long been studied in ecology. Recently, a new unified theory, Maximum Entropy (MaxEnt), has provided the first attempt to link patterns of energy consumption with patterns of diversity. Originating from informatics, MaxEnt has been used to derive a variety of macroecological patterns, including the species-abundance distribution (SAD) and the species-area relationship (SAR), as well as individual-level and species-level metabolic rate distributions, with four state variables constraining the community: total abundance, total species richness, total metabolic rate, and area (Harte 2011).
While the SAD predicted by MaxEnt is supported by extensive tests with empirical data (White et al. in revision), tests on the predicted energy distributions have been scarce (but see Harte et al. 2008, Harte 2011). Furthermore, current construction of MaxEnt machinery allows substitution of the metabolic rate constraint with another variable, without affecting the predictions for SAD and SAR. Here we try to answer the following two questions: 1. Does MaxEnt accurately predict energy allocations of real communities? 2. If not, is there a surrogate of energy that leads to better predictions under the MaxEnt framework?
To answer these questions, we have compiled a large number of community-level data including trees, birds, and mammals. The individual-level and species-level energy distributions predicted by MaxEnt will be compared with the empirically observed patterns. Different measures correlated with energy consumption (e.g., body mass, diameter for trees) will be adopted to substitute metabolic rate, and goodness of fit of these measures will be examined.
Preliminary analysis has been conducted with data from two tropical forests, Barro Colorado Island (BCI) and Sherman plot. The predicted metabolic rate distribution among individuals shows a clear deviation from empirical data for both communities, with MaxEnt overpredicting the energy consumption of individuals with high metabolic rate. Moreover, while MaxEnt predicts relatively constant total within-species energy consumption across species (i.e., Damuth’s energetic equivalence rule), in both communities the measure is shown to increase with abundance. The relationship between intraspecific average metabolic rate and species abundance is polygonal, similar to the pattern observed by Brown and Maurer (1987) in bird communities. Substituting metabolic rate with body mass yields qualitatively similar results.
Based on the preliminary results, MaxEnt’s prediction for energy distributions is unlikely to work with the current constraint on metabolic rate, and other surrogates for energy consumption need to be explored.
ABSTRACT: The species abundance distribution (SAD) is one of the most studied patterns in ecology due to its potential insights into commonness and rarity, community assembly, and patterns of biodiversity. It is well established that communities are composed of a few common and many rare species, and numerous theoretical models have been proposed to explain this pattern. However, no attempt has been made to determine how well these theoretical characterizations capture observed taxonomic and global-scale spatial variation in the general form of the distribution. Here, using data of a scope unprecedented in community ecology, we show that a simple maximum entropy model produces a truncated log-series distribution that can predict between 83% and 93% of the observed variation in the rank abundance of species across 15 848 globally distributed communities including birds, mammals, plants, and butterflies. This model requires knowledge of only the species richness and total abundance of the community to predict the full abundance distribution, which suggests that these factors are sufficient to understand the distribution for most purposes. Since geographic patterns in richness and abundance can often be successfully modeled, this approach should allow the distribution of commonness and rarity to be characterized, even in locations where empirical data are unavailable.
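To make the idea of predicting an abundance distribution from only richness (S) and total abundance (N) concrete, here is a minimal sketch. It assumes the upper-truncated log-series form p(n) ∝ exp(-βn)/n for n = 1..N, with the parameter β chosen so that the mean abundance per species equals N/S. The symbol β, the function names, the bisection bounds, and the example values S = 20 and N = 1000 are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def mean_abundance(beta, n_total):
    """Mean of the truncated log-series p(n) ∝ exp(-beta * n) / n, n = 1..n_total."""
    num = sum(math.exp(-beta * n) for n in range(1, n_total + 1))
    den = sum(math.exp(-beta * n) / n for n in range(1, n_total + 1))
    return num / den


def solve_beta(s, n_total, lo=1e-9, hi=10.0, iters=100):
    """Find beta by bisection so that the mean abundance equals n_total / s.

    Assumes a positive root exists in [lo, hi]; the mean decreases as
    beta increases, which is what the bisection relies on."""
    target = n_total / s
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean_abundance(mid, n_total) > target:
            lo = mid  # mean too high, so beta must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def log_series_pmf(beta, n_total):
    """Probabilities p(1), ..., p(n_total) of the truncated log-series."""
    weights = [math.exp(-beta * n) / n for n in range(1, n_total + 1)]
    total = sum(weights)
    return [w / total for w in weights]


beta = solve_beta(s=20, n_total=1000)  # hypothetical community
pmf = log_series_pmf(beta, 1000)
print(f"beta = {beta:.5f}, P(n = 1) = {pmf[0]:.3f}")
```

Turning these probabilities into a predicted rank-abundance curve for comparison with data is a further step that this sketch does not cover.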
ABSTRACT: Ecologists have long sought to understand the mechanisms underlying the assembly and structure of communities. Such understanding is relevant to both basic science and conservation-related issues. The macroecological approach to this problem involves asking scientific questions using a large number of communities in order to elucidate generalities in pattern and process. Such analyses are typically conducted using a substantial amount of data from a particular taxonomic group across a diversity of systems. Large community databases are available for a number of taxa, but no publicly available database exists for mammals. Given the logistical challenges of collecting such data de novo, compiling existing information from the literature provides the best avenue for acquiring the necessary data. Here, we provide a data set that includes species lists for 1000 mammal communities, excluding bats, with species-level abundances available for 940 of these communities. All communities found in the literature that included complete, site-specific sampling data, composed of species lists with or without associated abundances, were included in the data set. Most, but not all, sites are limited to species groups that are sampled using a single technique (e.g., small mammals sampled with Sherman traps). The data set consists of 7977 records from 1000 georeferenced sites encompassing a variety of habitats throughout the world, and it includes data on 660 mammal species with sizes ranging from 2 g to >500 kg.
ABSTRACT: Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
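The two fitting approaches contrasted above can be sketched side by side for a power law y = a·x^b. This is an illustrative sketch, not the code the authors provide. LR is ordinary least squares on log-transformed data; NLR here minimizes the untransformed sum of squared errors by using the closed-form best a for each candidate b and a golden-section search over b. The search bounds, function names, and simulated data are assumptions, and the search presumes the error surface has a single minimum within the bounds.

```python
import math
import random


def fit_log_linear(xs, ys):
    """LR: least squares on log y = log a + b * log x (multiplicative error)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    b = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum(
        (u - mx) ** 2 for u in lx
    )
    return math.exp(my - b * mx), b


def fit_nonlinear(xs, ys, b_lo=0.0, b_hi=3.0, iters=60):
    """NLR: minimize sum (y - a * x**b)**2 on the original scale (additive error)."""

    def profile(b):
        xb = [x ** b for x in xs]
        # For a fixed b, the least-squares a has a closed form.
        a = sum(y * u for y, u in zip(ys, xb)) / sum(u * u for u in xb)
        sse = sum((y - a * u) ** 2 for y, u in zip(ys, xb))
        return sse, a

    phi = (math.sqrt(5) - 1) / 2
    lo, hi = b_lo, b_hi
    for _ in range(iters):  # golden-section search over the exponent b
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if profile(c)[0] < profile(d)[0]:
            hi = d
        else:
            lo = c
    b = (lo + hi) / 2
    return profile(b)[1], b


# Simulated data with multiplicative lognormal error (hypothetical values).
rng = random.Random(0)
xs = [float(i) for i in range(1, 51)]
ys = [2.0 * x ** 0.75 * math.exp(rng.gauss(0.0, 0.2)) for x in xs]
print("LR  (a, b):", fit_log_linear(xs, ys))
print("NLR (a, b):", fit_nonlinear(xs, ys))
```

Swapping the simulated error for additive normal noise is a simple way to see the two estimates diverge in the other direction.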
ABSTRACT: Background/Question/Methods
Understanding broad-scale patterns in ecological systems is critical to predicting and mitigating impacts of global change. To that end, ecologists have developed and tested myriad models of numerous patterns that characterize community structure, typically studying one pattern at a time with one dataset from one taxonomic group. The species abundance distribution (SAD) is one of the most commonly studied of these patterns. Decades of research have revealed that all communities are comprised of a few common and many rare species, but a mechanistic or statistical framework to account for this universal form and link it to other important macroecological patterns has been lacking. Consequently, John Harte and colleagues have recently applied the maximum entropy method of inference from information theory (MaxEnt) to develop such a framework. In this framework, the state variables of an ecological community, species richness (S0), total abundance (N0), and total metabolic requirements (E0), are used to predict the least-biased form of the SAD. Here, we test the ability of MaxEnt to characterize SADs using five databases of continental or global extent representing three major taxa (Breeding Bird Survey, Christmas Bird Count, Forest Inventory Analysis, Alwyn Gentry’s Forest Transects, and a compilation of published mammal community data).
We compared MaxEnt predictions with observed SADs for 15,663 sites distributed throughout six continents, including 47,774,772 individuals representing 8,368 species. S0 ranged from 10 to 250 and N0 from 11 to 10,280,057. Goodness of fit was assessed by coefficients of determination (R²) that measured agreement between observed and predicted values. MaxEnt predictions were remarkably similar to observed species abundances for all taxa and datasets (R²: BBS = 0.91; CBC = 0.90; Mammals = 0.83; FIA = 0.96; Gentry = 0.93), with P values of < 0.001 (based on comparisons to simulated SADs sampled from the discrete uniform distribution). The Fisher log-series is the form of the SAD that emerges from the MaxEnt framework, and we found that, for each dataset, >75% of the observed SADs were better characterized by the log-series than by the log-normal (Akaike weights > 0.6). These results provide compelling evidence that the MaxEnt approach can be used to infer species abundances, estimate the number of rare species, and predict extinction in various scenarios of global change. Further testing of this MaxEnt framework is planned, to further assess its potential to unify macroecological patterns and thereby simplify the search for the mechanisms underlying broad-scale patterns in ecology.
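The two summary statistics mentioned above can be sketched as follows. The first function computes a coefficient of determination that measures agreement between observed and predicted values directly (around the 1:1 line, not around a fitted regression line); the second converts AIC values into Akaike weights using the standard formula. The abstract does not say whether abundances were transformed before computing R², so that choice is left to the caller. All numbers in the example are made up for illustration.

```python
import math


def r2_one_to_one(observed, predicted):
    """1 - SS_res / SS_tot, with residuals taken as observed - predicted."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def akaike_weights(aic_values):
    """Akaike weights: exp(-delta_i / 2) normalized over all candidate models."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical rank abundances and model predictions.
observed = [120, 45, 20, 9, 4, 1, 1]
predicted = [110, 50, 22, 8, 3, 2, 1]
print("R^2 (1:1):", r2_one_to_one(observed, predicted))

# Hypothetical AIC values for two candidate models (log-series, log-normal).
print("Akaike weights:", akaike_weights([100.0, 102.0]))
```

This form of R² can be negative when predictions are worse than simply using the observed mean.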