Linking demography with drivers: Climate and competition

Article in Methods in Ecology and Evolution 7(2):171-183 · February 2016
DOI: 10.1111/2041-210X.12486
Brittany Teller at Pennsylvania State University
Peter B. Adler at Utah State University
Collin Edwards at Cornell University
Abstract
In observational demographic data, the number of measured factors that could potentially drive demography (such as daily weather records between two censuses) can easily exceed the number of independent observations. Thus, identifying the important drivers requires alternatives to standard model selection and variable selection methods. Spline methods that estimate smooth functions over continuous domains (such as space or time) have the potential to resolve high-dimensional problems in ecological systems. We consider two examples that are important for many plant populations: competition with neighbours that vary in size and distance from the focal individual and climate variables during a window of time before a response (growth, survival, etc.) is measured. For competition covariates, we use a simulation study based on empirical data to show that a monotone spline estimate of competition kernels via approximate AIC returns very accurate estimates. We then apply the method to long-term, mapped quadrat data on the four dominant species in an Idaho (US) sagebrush steppe community. For climate predictors and their temporal lags, we use simulated data sets to compare functional smoothing methods with competing linear (LASSO) or machine learning (random forests) methods. Given sufficient data, functional smoothing methods outperformed the other two methods. Functional smoothing methods can advance data-driven population modelling by providing alternatives to specifying competition kernels a priori and to arbitrarily aggregating continuous environmental covariates. However, there are important open questions related to modelling of nonlinear climate responses and size × climate interactions.
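The lagged-climate problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of a spline basis it uses a simpler stand-in, a second-difference roughness penalty on one coefficient per lag, which likewise forces the estimated lag-response curve to be smooth. The function names, the simulated data, the 24 monthly lags, and the penalty weight `lam` are all assumptions made for illustration.

```python
import numpy as np


def second_difference_matrix(n_lags):
    """Return the (n_lags - 2) x n_lags matrix D whose rows are [1, -2, 1]."""
    return np.diff(np.eye(n_lags), n=2, axis=0)


def fit_lag_coefficients(Z, y, lam):
    """Penalized least squares: minimize ||y - Z b||^2 + lam * ||D b||^2.

    Z   : (n_obs, n_lags) matrix, one column of climate values per lag
    y   : (n_obs,) demographic response (e.g. mean growth per census)
    lam : roughness penalty weight (hypothetical value, not tuned here)
    """
    D = second_difference_matrix(Z.shape[1])
    return np.linalg.solve(Z.T @ Z + lam * (D.T @ D), Z.T @ y)


def simulate(n_obs=40, n_lags=24, noise_sd=0.5, seed=0):
    """Simulate a response driven by a smooth bump of lagged climate effects."""
    rng = np.random.default_rng(seed)
    lags = np.arange(n_lags)
    # Assumed "true" lag-response curve: climate about 6 lags back matters most.
    beta_true = np.exp(-0.5 * ((lags - 6) / 2.0) ** 2)
    Z = rng.standard_normal((n_obs, n_lags))
    y = Z @ beta_true + rng.normal(0.0, noise_sd, n_obs)
    return Z, y, beta_true


Z, y, beta_true = simulate()
beta_smooth = fit_lag_coefficients(Z, y, lam=5.0)  # smooth estimate
beta_rough = fit_lag_coefficients(Z, y, lam=0.0)   # ordinary least squares
print("lag with largest estimated effect:", int(np.argmax(beta_smooth)))
```

Because every lag keeps its own coefficient, no aggregation window has to be chosen in advance; the penalty only asks that neighbouring lags have similar effects. In practice `lam` would be chosen from the data (for example by cross-validation) rather than fixed.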


  • ... Since plants may be impacted by climate events over a long period of time (see Dahlgren and Ehrlén, 2011; Clark et al., 2011), we will consider the past two years of data. Following Teller et al. (2016), these are thought of as functional covariates leading to a representation of E as a functional linear term: ...
  • ... Detecting delayed effects of weather variables on the demography of perennial plants may require statistical models that explicitly include time lags. Teller et al. (2016) demonstrated an elegant statistical method that uses functional linear models (FLMs) of lagged weather data. A functional linear model is a smooth spline f(x) whose values are multiplied by a vector of observed data z_x and then summed. ...
    ... A prerequisite for successfully fitting FLMs is the availability of a sufficient number of independent observations of vital rates under different environmental conditions. Simulation results from Teller et al. (2016) indicate that at least 20-25 observations of the response y are required to detect climate signals. This requirement is a severe limitation for analyzing demographic data with functional linear smooth splines because most demographic studies are much shorter than 20 yr (Crone et al. 2011, Salguero-Gómez et al. 2015). ...
    ... How well this space for time substitution works is an open question. Teller et al. (2016) did explore the effect of correlations within a time series and found the method robust to correlations. They also found the method robust to cross-correlations between different weather drivers, but this is not the same situation as spatial correlations within a single weather driver. ...
  • ... Disadvantages of this approach are the requirement for long time series (Teller et al., 2016), and the assumptions that responses to short-term weather fluctuations can be extrapolated over longer time scales. Mechanistic representations of key processes, rather than empirical correlations, are the basis for a third approach, exemplified by dynamic global vegetation models (Prentice et al., 2007) and bioenergetics models (Buckley, 2008). ...
  • ... Other work on mismatched time series in ecology has focused on using weighted splines to smooth fine-scaled covariate data (Teller et al., 2016), their exponential smoother is the continuous equivalent of our geometric weighted covariate. In econometrics, models for mismatched data are more common. ...
  • ... Yet multiple climate signals may be fairly common and the ability to test and compare these simultaneously would be useful. With advances in computing and statistics a number of data-driven methods to tackle high-dimensional problems like climate analysis have become common, such as machine learning, least absolute shrinkage and selection operator (LASSO) and functional linear models using splines [12]. These alternative methods offer additional flexibility compared to Weibull and GEV functions, by allowing for the detection of multiple signals with a single analysis (e.g., [12]). ...
    ... With advances in computing and statistics a number of data-driven methods to tackle high-dimensional problems like climate analysis have become common, such as machine learning, least absolute shrinkage and selection operator (LASSO) and functional linear models using splines [12]. These alternative methods offer additional flexibility compared to Weibull and GEV functions, by allowing for the detection of multiple signals with a single analysis (e.g., [12]). Furthermore, they open up the possibility of multi-dimensional climate window analysis, analysing multiple climate variables at the same time, potentially improving upon the unidimensional analysis currently employed in climwin. ...
    ... Furthermore, they open up the possibility of multi-dimensional climate window analysis, analysing multiple climate variables at the same time, potentially improving upon the unidimensional analysis currently employed in climwin. Splines in particular may provide a suitable alternative for weighted window analysis, as they are ideally suited for modelling a smooth function over a continuum (e.g., time; [12, 31]). In their work, Teller et al. [12] successfully apply a spline function to assess climate signals, demonstrating the ability to detect multiple climate signals within a single weight distribution. ...
  • ... It is possible that we did not choose the optimal time periods over which to aggregate. New methods using functional linear models (or splines) may offer a data-driven approach for identifying the appropriate time periods over which to aggregate to produce a tractable set of candidate climate variables (Sims et al. 2007; van de Pol & Cockburn 2011; Teller et al. 2016). We also expected IPM forecast accuracy to decline at a lower rate than the QBM as the time between the model initialization and the forecast increased. ...
    ... Unfortunately, we have few ideas about how to improve population forecasts that have not already been proposed (Mouquet et al. 2015; Petchey et al. 2015). Longer time series should improve our ability to detect exogenous drivers such as climate (Teller et al. 2016), and modeling larger spatial extents may reduce parameter uncertainty (Petchey et al. 2015). We may also have to shift our perspective from making explicit point forecasts to making moving average forecasts (Petchey et al. 2015). ...
  • ... Roberts (2008) and Teller et al. (2016) have suggested alternative explorative methods to identify the critical time window, but their ability to distinguish true from false signals, as well as accuracy and precision of most of the key metrics are unknown. These studies used multiple regression methods in which each daily, weekly or monthly mean temperature is used as a separate predictor variable, and subsequently identified which predictor variables over which time window best explain variation in the response variable. ...
    ... Further research is needed to determine the performance of different methods on the same simulated data over a wider part of the parameter space and different data structures, while keeping in mind that different biologists are interested in optimizing the reliability of different metrics (slope, R², false positive or negative rate). Our aim is to extend climwin to include a variety of methods and provide the tools and benchmarks to compare them, as the question of what constitutes the best method may depend on the biological question (Teller et al. 2016; this study). Another interesting avenue would be to adapt our approach to the question of over which spatial window one should aggregate environmental predictors (Mesquita et al. 2015), as for species moving between various locations, the locations at which the weather influence is strongest may in fact need to be determined (note that climwin can already incorporate weather data from different locations in a single model, see Supp. ...
    ... signal can be directly compared to the output from models fitted by the slidingwin function to investigate whether a weighted mean model is better supported by the data than, for example, a model with the aggregate statistic unweighted mean (see Supporting Information B). For alternative nonparametric methods using smoothing, see Roberts 2008 and Teller et al. 2016. ...
  • ... The overarching issue of correctly identifying the influence of one or more particular environmental factors is addressed by Teller et al. (2016). In practice, researchers often rely on arbitrary decisions about the spatial scale over which competition occurs and aggregate continuous climate data into discrete lags representing factors such as mean temperature. ...
    ... In addition, many of the manuscripts in this special feature have provided open-access R scripts of their analyses in their online materials (e.g. Childs, Sheldon & Rees 2016; Teller et al. 2016). ...
  • ... Instead, techniques such as model averaging or parameter shrinkage using ridge regression, lasso regression or the elastic net could be used (Dahlgren 2010). For environmental drivers that are continuously monitored, such as climatic variables, and where the response to the driver is a smooth function of distance in time or space, functional linear models constitute an alternative (Teller et al. 2016). When there are multiple potential drivers, we must also recognize that some of them may be highly correlated, and the statistical model selection procedures have no way of distinguishing between them so that we cannot say which of them are true drivers and which are simply correlated with drivers. ...
  • ... Svenning et al. 2014). In particular, plant demography has long established that reproduction and survival of plants depend strongly on intraspecific density and that this density dependence can be negative (Stoll & Weiner 2000; Teller et al. 2016) or positive (causing Allee effects; Lamont, Klinkhamer & Witkowski 1993; Courchamp, Berec & Gascoigne 2008). While determinants of small-scale demographic variation are thus reasonably well understood, only a few studies have identified environmental drivers of range-wide variation in key plant demographic rates (Angert 2009; Doak & Morris 2010; Merow et al. 2014). ...
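The verbal definition of a functional linear model quoted in the excerpts above (a smooth spline f(x) whose values are multiplied by the observed data z_x and then summed) can be written compactly. The intercept \beta_0, the number of lags L, and the symbol \eta for the linear predictor are notation assumed here for illustration; they do not appear in the excerpts.

```latex
\[
  \eta \;=\; \beta_0 \;+\; \sum_{x=1}^{L} f(x)\, z_x ,
\]
% z_x  : observed driver (e.g. a weather variable) at lag x before the census
% f(x) : smooth spline giving the weight of lag x, estimated from the data
% \eta : linear predictor for the response y (growth, survival, etc.)
```

Lags where f(x) is estimated to be close to zero contribute little to the response, which is how the approach avoids choosing an aggregation window in advance.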