TECHNICAL ADVANCE
Overlooked climate parameters best predict flowering onset:
Assessing phenological models using the elastic net
Isaac W. Park | Susan J. Mazer
Department of Ecology, Evolution and
Marine Biology, University of California,
Santa Barbara, California
Correspondence
Isaac W. Park, Department of Ecology,
Evolution, and Marine Biology, University of
California, Santa Barbara, CA.
Email: park@lifesci.ucsb.edu
Funding information
National Science Foundation, Grant/Award
Number: DEB‐1556768
Abstract
Determining the manner in which plant species shift their flowering times in
response to climatic conditions is essential to understanding and forecasting the
impacts of climate change on the world's flora. The limited taxonomic diversity and
duration of most phenological datasets, however, have impeded a comprehensive,
systematic determination of the best predictors of flowering phenology. Addition-
ally, many studies of the relationship between climate conditions and plant phenol-
ogy have included only a limited set of climate parameters that are often chosen a
priori and may therefore overlook those parameters to which plants are most phe-
nologically sensitive. This study harnesses 894,392 digital herbarium records and
1,959 in situ observations to produce the first assessment of the effects of a large
number (25) of climate parameters on the flowering time of a very large number
(2,468) of angiosperm taxa throughout North America. In addition, we compare the
predictive capacity of phenological models constructed from the collection dates of
herbarium specimens vs. repeated in situ observations of individual plants using a
regression approach—elastic net regularization—that has not previously been used
in phenological modeling, but exhibits several advantages over ordinary least
squares and stepwise regression. When herbarium‐derived data and in situ pheno-
logical observations were used to predict flowering onset, the multivariate models
based on each of these data sources had similar predictive capacity (R² = 0.27).
Further, apart from mean maximum temperature (TMAX), the two best predictors of
flowering time have not commonly been included in phenological models: the num-
ber of frost‐free days (NFFD) and the quantity of precipitation as snow (PAS) in the
seasons preceding flowering. By vetting these models across an unprecedented
number of taxa, this work demonstrates a new approach to phenological modeling.
KEYWORDS
flowering time, herbarium specimen, phenoclimate modeling, phenology
1 | INTRODUCTION
Observations of how individual plants alter the timing of leaf pro-
duction, flowering, and fruiting in response to local temperature or
rainfall provide a way to evaluate the impacts of climate variation on
the world's flora. Changes in flowering phenology that have occurred
in response to recent warming have resulted not only in reproduc-
tive failure in some taxa (Inouye, 2008; Inouye & McGuire, 1991;
Inouye, Saavedra, & Lee‐Yang, 2003), but in some cases have also produced
mismatches between plants and the animals that depend on
their flowers as food resources (Huang & Hao, 2018; Reddy et al.,
2015; Schenk, Krauss, & Holzschuh, 2017). Thus, identifying the cli-
mate parameters that best predict changes in the timing of
Received: 26 February 2018 | Accepted: 8 August 2018
DOI: 10.1111/gcb.14447
Glob Change Biol. 2018;1–13. wileyonlinelibrary.com/journal/gcb ©2018 John Wiley & Sons Ltd
flowering, and accurately predicting the changes in flowering phenol-
ogy that are likely to occur under future climate change, is essential
to the prediction and management of the effects of climate change
on the reproductive success of angiosperm taxa and on the antago-
nistic (e.g., herbivores) and mutualistic (e.g., pollinators and seed dis-
persers) animals that rely on them. Generating robust predictions of
the effects of local climatic conditions on plant phenology is there-
fore a critical first step toward forecasting the effects of climate
change on plant populations, species, and communities, as well as on
the animals that depend on them.
To date, the intensive work required for repeated in situ phenolog-
ical observation has largely restricted long‐term studies of plant phe-
nology and its relation to climate in the United States to either a
comparatively small number of species (Leopold & Jones, 1947;
Schwartz & Reiter, 2000; Zhao & Schwartz, 2003) or to a narrow geo-
graphic range (Abu‐Asab, Peterson, Shetler, & Orli, 2001; Cook et al.,
2007; Dunnell & Travers, 2011; Miller‐Rushing & Primack, 2008). As a
result, our ability to generalize from these studies to a wider array of
species and climatic conditions remains limited. The design and appli-
cation of models that can detect the climatic factors that best predict
timing of phenological events in native plant species have until
recently also been limited by the lack of spatially extensive, long‐term
climate data (particularly for populations located at some distance
from the nearest weather monitoring station), and by the limited num-
ber of gridded climatic variables that have been readily available.
As a result, most spatially extensive examinations of the relation-
ship between local climate conditions and plant phenology have
depended on comparatively simple climate parameters, many of
which are chosen a priori. In such cases, the resulting models may
fail to include either the specific parameters to which plants are
most phenologically sensitive or all of the climate parameters to
which plants respond. The recent availability of digital herbarium
records, however, in combination with datasets such as those pro-
duced by PRISM and ClimateNA, which collectively provide esti-
mates of a wide array of historical climate parameters at local scales
throughout much of the globe (Wang, Hamann, Spittlehouse, & Carroll,
2016), offers the opportunity not only to conduct phenological
assessments across an unparalleled diversity of taxa and at broad
spatial scales, but also to conduct a continental‐scale assessment
designed to identify those climate parameters that best predict the
flowering phenology of each focal species.
Herbarium collections have been used in numerous studies to
document the seasonality of a wide array of species (Borchert,
Robertson, Schwartz, & Williams‐Linera, 2005; Boulter, Kitching, &
Howlett, 2006; Sahagun‐Godinez, 1996) and to examine regional, cli-
mate‐based variation in the phenological timing of well‐collected
species (Lavoie & Lachance, 2006; Matthews & Mazer, 2015; Park,
2016; Willis et al., 2017) at spatial scales that exceed the current
spatial and temporal scope of repeated in situ phenological observa-
tions. Furthermore, the unparalleled taxonomic diversity of herbar-
ium records has been leveraged to examine the collective
phenological properties of entire floras (Park, 2014, 2016) that could
not be assessed using other kinds of phenological records.
Assessments of phenological change over recent decades (Bertin,
Searcy, Hickler, & Motzkin, 2017; Lavoie & Lachance, 2006; Primack,
Imbres, Primack, & Miller‐Rushing, 2004) or across spatial climate
gradients (Bowers, 2007; Hereford, Schmitt, & Ackerly, 2017; Houle,
2007; Miller‐Rushing, Primack, Primack, & Mukunda, 2006) have
reported similar shifts based on observations of both living plants
and herbarium‐based phenological records.
While herbarium records are a useful source of phenological
information (Jones & Daehler, 2018), few studies have compared the
capacity of phenoclimatic models based on herbarium records to
predict flowering to those constructed from repeated in situ obser-
vations of the phenological status of living plants (hereafter referred
to as in situ observations, in contrast to phenological records derived
from herbarium collections). There is good reason to expect that
models based on herbarium collections will have lower predictive
power than those based on in situ observations of individual plants.
At the level of individual plants, if the flowering date is estimated by
the collection date of an herbarium specimen, it is intrinsically less
precise than if it is estimated using repeated observations of individ-
ual plants recorded at known intervals. This is because an herbarium
specimen may have been collected at any time during its flowering
period, so the collection date itself does not provide a precise metric
of either the date of flowering onset, its midpoint, or peak flowering.
Moreover, the digitally recorded information that is associated with
the majority of herbarium records typically documents only whether
a given specimen was in flower at the time of collection and there-
fore cannot distinguish among specimens collected at the onset of
flowering, at peak bloom, or at any other stage of flowering. By contrast,
in situ phenological observations of an individual that extend
from before the onset of flowering to after its termination within a
single flowering season can be used to estimate the individual's flow-
ering onset and termination dates with a known level of precision
(depending on the frequency of observation). These dates, in turn,
can be used to estimate the date of the midpoint of flowering of an
individual plant.
Previous examinations of bias in herbarium collections have
found that temporal gaps in collection often occur during periods of
inclement weather; that collection effort is often concentrated at
locations that are easily accessible; and that herbarium holdings
often under‐sample threatened or endangered taxa while preferen-
tially sampling certain clades (most notably graminoids, Daru et al.,
2017). While in situ phenological observations may exhibit similar
biases, the repeated nature of in situ observations allows those cases
where gaps in observation occur (potentially leading to biased esti-
mates of flowering time) to be identified and removed, which is not
possible for herbarium specimens. Nevertheless, the collection dates
of herbarium specimens from Boston were found to provide accurate
estimates of mean flowering time; to exhibit variation in flowering
date similar to in situ
observations; and to remain accurate among taxa with both short
and long flowering durations (Primack et al., 2004).
The current study was designed to construct phenological mod-
els using a regression approach—elastic net regularization—that has
2
|
PARK AND MAZER
several advantages over ordinary least squares regression and step-
wise regression analysis, both of which have been used extensively
to identify climatic parameters that influence the flowering dates
(FDs) of species represented by either herbarium‐derived data or
observations of living plants. In particular, elastic net regularization is
capable of incorporating multiple collinear explanatory factors (De
Mol, De Vito, & Rosasco, 2009; Raschka, 2017). This is highly
advantageous in the development of robust phenoclimate models, as
potentially important climate parameters are often highly collinear
(Rawal, Kasel, Keatley, & Nitschke, 2015). To our knowledge, this is
the first study to apply elastic net regularization to develop pheno-
logical models that predict the FD of any species.
Here, we harnessed the power of 894,392 digital herbarium
records and 1,959 in situ observations to construct species‐specific
models of flowering phenology for each of 2,468 angiosperm taxa
using 25 distinct climate parameters. For seven of these species,
we constructed phenological models using both herbarium‐based
data and repeated in situ phenological observations. With this
unprecedented number of species‐specific phenological models, we
aimed to (a) determine the predictive ability of these species‐specific
phenological models at a continental scale; (b) compare the predic-
tive capacity of phenological models derived from herbarium records
of flowering dates vs. repeated in situ observations of flowering; and
(c) determine which climate parameters best predict flowering phe-
nology, while conducting model selection from a more extensive
array of climatic parameters (25 distinct climate parameters) than has
previously been used. By developing and vetting these phenoclimatic
models across an unparalleled number of taxa throughout North
America using elastic net regularization, a powerful under‐utilized
method, our goal is to provide a foundation and launching point for
a new approach to phenological modeling.
2 | MATERIALS AND METHODS
2.1 | Phenological data
Herbarium‐based estimates of FDs were obtained from 894,392 spec-
imen records of angiosperm species drawn from the digital archives of
72 herbaria throughout North America (see acknowledgements and
supporting information for complete listing) collected between 1901
and 2015. From these records, specimens that were not explicitly
recorded as being in flower were eliminated, as were those that did
not include either the precise GPS coordinates from which the sample
was collected or the precise date of collection. Duplicate specimens
(i.e., specimens of a given species collected on the same date and from
the same location) were also excluded from analysis.
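The record filtering described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the record layout (the keys `species`, `collected`, `lat`, `lon`, and `in_flower`) and the example values are hypothetical and chosen only for demonstration.

```python
from datetime import date

def filter_specimens(records):
    """Keep flowering, georeferenced, dated records and drop duplicates.

    Each record is a dict with hypothetical keys: species, collected,
    lat, lon, in_flower. A duplicate is a record of the same species
    with the same collection date and the same coordinates.
    """
    seen = set()
    kept = []
    for rec in records:
        if not rec.get("in_flower"):
            continue  # not explicitly recorded as being in flower
        if rec.get("lat") is None or rec.get("lon") is None:
            continue  # no precise coordinates
        if rec.get("collected") is None:
            continue  # no precise collection date
        key = (rec["species"], rec["collected"], rec["lat"], rec["lon"])
        if key in seen:
            continue  # duplicate specimen
        seen.add(key)
        kept.append(rec)
    return kept

# Made-up example records, for illustration only.
example_records = [
    {"species": "Cornus florida", "collected": date(1987, 4, 12),
     "lat": 35.9, "lon": -79.0, "in_flower": True},
    {"species": "Cornus florida", "collected": date(1987, 4, 12),
     "lat": 35.9, "lon": -79.0, "in_flower": True},   # duplicate
    {"species": "Quercus rubra", "collected": date(1990, 5, 2),
     "lat": None, "lon": None, "in_flower": True},    # no coordinates
    {"species": "Larrea tridentata", "collected": date(2001, 3, 20),
     "lat": 33.1, "lon": -115.5, "in_flower": False}, # not in flower
    {"species": "Larrea tridentata", "collected": date(2001, 4, 2),
     "lat": 33.1, "lon": -115.5, "in_flower": True},
]
kept = filter_specimens(example_records)
```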
In situ estimates of FD among living plants were derived from
flowering onset phenometric data collected from 2009 to 2015, as
provided by the USA National Phenology Network's database
(https://data.usanpn.org/observations/), and defined as the midpoint
between the estimated dates of flowering onset and termination by
a given individual in a given year. In order to ensure the accuracy of
these in situ estimates of flowering time, we included only those
individual plant records for which no more than 10 days had elapsed
between a date on which the plant had been recorded not to have
flowered yet and the date on which it was first observed to have
started flowering, and for which no more than 10 days had elapsed
between a date on which the plant was last observed in flower for a
given year and the date on which it was first observed to no longer
be in flower. In other words, data from the USA‐NPN included only
those individual plants for which the estimated flowering onset date
was no more than 10 days after a date on which the plant was
observed not to be in flower, and for which the last date on which
an individual was observed in flower was no more than 10 days
prior to a date on which the plant was observed not to be in flower.
As a result of this filtering, the date of the midpoint of flowering is
accurate within a maximum of 5 days.
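The 10‐day filtering rule and the midpoint estimate can be illustrated with a short sketch. It assumes that flowering onset is taken as the first date with a positive observation and termination as the last such date; the function name, the data layout, and the observation dates are hypothetical.

```python
from datetime import date, timedelta

MAX_GAP_DAYS = 10  # maximum allowed gap between a "no" and the adjacent "yes"

def flowering_midpoint(observations, max_gap=MAX_GAP_DAYS):
    """Return the flowering midpoint for one plant in one year, or None.

    observations: list of (date, in_flower) pairs. The record is rejected
    (None) unless a "no" precedes the first "yes" and follows the last
    "yes", each within max_gap days.
    """
    obs = sorted(observations)
    yes_idx = [i for i, (_, flowering) in enumerate(obs) if flowering]
    if not yes_idx:
        return None
    first, last = yes_idx[0], yes_idx[-1]
    if first == 0 or last == len(obs) - 1:
        return None  # no bracketing "no" observation
    onset, end = obs[first][0], obs[last][0]
    prior_no, next_no = obs[first - 1][0], obs[last + 1][0]
    if (onset - prior_no).days > max_gap or (next_no - end).days > max_gap:
        return None  # onset or termination too loosely bracketed
    return onset + timedelta(days=(end - onset).days // 2)

# Made-up weekly observations of a single plant.
weekly = [
    (date(2012, 4, 1), False),
    (date(2012, 4, 8), True),
    (date(2012, 4, 15), True),
    (date(2012, 4, 22), True),
    (date(2012, 4, 29), False),
]
midpoint = flowering_midpoint(weekly)

# Same plant, but the last "no" before onset is 14 days earlier.
sparse = [(date(2012, 3, 25), False)] + weekly[1:]
rejected = flowering_midpoint(sparse)
```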
2.2 | Data preparation and standardization
Herbarium specimens were collected across many decades and by
many collectors who sometimes documented collections using differ-
ing taxonomic nomenclature, so we standardized the taxonomic
nomenclature using the Taxonomic Name Resolution Service iPlant
Collaborative, Version 4.0 (Boyle et al., 2013, Accessed: April 4,
2017; http://tnrs.iplantcollaborative.org). Specimen identification was
updated using taxonomic information from The Plant List, the Inter-
national Legume Database and Information Service, the Global Com-
positae Checklist, and Tropicos.org. Specimens that could not be
identified unambiguously to the species level were eliminated.
In order to include only those species with a sufficient number of
observations for the development of accurate phenological models,
we excluded species represented by fewer than 100 herbarium samples.
In total, 2,468 taxa met this criterion, comprising 2,171 distinct species
as well as 117 taxa with subspecific epithets and 180 horticultural
varieties across 119 plant families, representing a total of 563,501
herbarium specimens distributed across North America (Supporting
Information Figure S1). These taxa represent a combination of woody
and herbaceous taxa, including both annual and perennial species. We
further identified seven of these angiosperm species that were also
represented in the USA‐NPN database by at least 100 in situ esti-
mates of FD; this dataset comprised a total of 1,959 individual FD
estimates. These seven species, which consisted of three tree species
(Cornus florida, Quercus agrifolia, and Quercus rubra) and four perennial
shrubs (Baccharis pilularis, Eriogonum fasciculatum, Larrea tridentata,
and Symphoricarpos albus) distributed throughout North America (Fig-
ure 1), were analyzed separately in order to compare the explanatory
power of statistical models based on herbarium records to the
explanatory power of independently constructed models based on
repeated in situ phenological observations.
2.3 | Azimuthal date corrections
The collection date of each herbarium specimen was converted into a
day of year (DOY) value from 1 (January 1) to 366 (December 31 on a
leap year). However, DOY values exhibit an artificial discontinuity
between December 31 (DOY 365 or 366) of one year and January 1
(DOY 1) of the next. This discontinuity makes it problematic to treat
DOY as a continuous variable when considering species in which indi-
viduals flower both before and after January 1 in different locations or
years. In order to eliminate this discontinuity, we converted DOY into
a circular variable (Batschelet, 1981; Jammalamadaka & SenGupta,
2001) by rescaling the DOY into an azimuth (A, in degrees), using Equation 1, or
Equation 2 in the case of leap years.
$$A = \mathrm{DOY} \times \frac{360}{365} \tag{1}$$
$$A = \mathrm{DOY} \times \frac{360}{366} \tag{2}$$
The coordinates of the endpoint of a vector with azimuth (A)
and length 1, beginning at the origin point (0,0), were then calculated
using the formula [x = cos(A) and y = sin(A)]. The mean position of
these coordinates was then calculated across all specimens of each
species. The mean azimuth (or angular direction) from the origin
point (0,0) to this mean position was then calculated for each species
and rescaled into a DOY value representing the mean FD of each
species across all climatic regions and all available years. Angular
deviations of each specimen's azimuth from its respective species’
mean azimuth were then calculated, with the direction of angular
rotation being enforced as the direction of rotation that required the
smallest angular change. The angular difference of each specimen
from its species‐wide mean was rescaled into a measure of depar-
ture in DOY (ΔDOY), with the direction of the difference (i.e.,
toward earlier or later DOY) being determined by the direction of
angular rotation. The adjusted DOY (hereafter referred to simply as
DOY) of collection for each specimen was then computed by adding
its ΔDOY to its species‐wide mean flowering DOY.
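The azimuthal correction described above can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions: all dates are rescaled with a single year length (365 days by default), and the function names are hypothetical.

```python
import math

def doy_to_azimuth(doy, days_in_year=365):
    """Rescale a day of year into an azimuth in degrees."""
    return doy * 360.0 / days_in_year

def circular_mean_azimuth(azimuths):
    """Azimuth of the mean endpoint of unit vectors with the given azimuths."""
    x = sum(math.cos(math.radians(a)) for a in azimuths) / len(azimuths)
    y = sum(math.sin(math.radians(a)) for a in azimuths) / len(azimuths)
    return math.degrees(math.atan2(y, x)) % 360.0

def signed_deviation(azimuth, mean_azimuth):
    """Smallest signed rotation (degrees) from the mean to the azimuth."""
    return (azimuth - mean_azimuth + 180.0) % 360.0 - 180.0

def adjusted_doys(doys, days_in_year=365):
    """Return (mean DOY, list of adjusted DOYs) for one species."""
    azimuths = [doy_to_azimuth(d, days_in_year) for d in doys]
    mean_az = circular_mean_azimuth(azimuths)
    mean_doy = mean_az * days_in_year / 360.0
    adjusted = [
        mean_doy + signed_deviation(a, mean_az) * days_in_year / 360.0
        for a in azimuths
    ]
    return mean_doy, adjusted

# A species flowering around the turn of the year (made-up dates).
mean_doy, adjusted = adjusted_doys([360, 5, 10])
```

In this example the specimen collected on DOY 360 receives an adjusted DOY of about −5, so it sits next to the early January specimens rather than almost a full year away from them.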
Among specimens for which the resulting collection date was
prior to January 1 (DOY < 1) but the mean DOY was after January
1, the respective year of collection was converted to year + 1 in
order to place it in the same year as the flowering season to which
it was closest (e.g., a specimen of a species with an overall mean FD
of January 15 that was collected on December 23, 2007, would be
converted to DOY = −8, that is, ΔDOY = −23 relative to the mean,
year 2008). Similarly, in cases where the resulting collection date
was after December 31 (DOY > 365, or 366 in leap years) but the
mean DOY for the species was prior to December 31, the respective
year of collection was converted to year − 1
(e.g., a specimen of a species with an overall mean FD of December
10 that was collected on January 5, 2006, would be converted to
DOY = 370, year 2005).
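The year reassignment can be checked against the two worked cases with a small sketch. It assumes that the adjusted DOY equals the species mean DOY plus the shortest signed deviation, computed here directly in days (equivalent to the angular calculation for a fixed year length); the function name is hypothetical. Under this assumption, the December 23, 2007 specimen deviates by −23 days from a mean of January 15, which corresponds to day −8 relative to January 1, 2008.

```python
import calendar
from datetime import date

def assign_flowering_year(collected, mean_doy):
    """Return (adjusted DOY, flowering year) for one specimen.

    collected: collection date; mean_doy: species-wide mean flowering DOY.
    """
    year_len = 366 if calendar.isleap(collected.year) else 365
    doy = collected.timetuple().tm_yday
    half = year_len / 2.0
    # Shortest signed difference in days between the specimen and the mean.
    delta = (doy - mean_doy + half) % year_len - half
    adjusted = mean_doy + delta
    if adjusted < 1:
        return adjusted, collected.year + 1  # belongs to next year's season
    if adjusted > year_len:
        return adjusted, collected.year - 1  # belongs to previous year's season
    return adjusted, collected.year

# Mean FD of January 15 (DOY 15), specimen collected December 23, 2007.
early = assign_flowering_year(date(2007, 12, 23), 15)
# Mean FD of December 10 (DOY 344), specimen collected January 5, 2006.
late = assign_flowering_year(date(2006, 1, 5), 344)
```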
2.4 | Climate data
Climate parameters included in this study consisted of a variety of
annual and seasonal climate metrics across multiple periods of refer-
ence. Seasonal data in this study consisted of mean conditions dur-
ing the autumn of the previous year (from October 1 to December
31), and from the winter (January 1 –March 31), spring (April 1 –
June 30), summer (July 1 –September 30), and autumn (October 1 –
December 31) of the year in which flowering occurred. In order to
ensure that phenological behavior was modeled using only condi-
tions prior to flowering for each species, we also calculated the
mean FD for each species across all years and collection locations,
and excluded from the phenoclimate models those climate variables
representing all seasons that fell after the mean FD for that species.
FIGURE 1 Distribution of herbarium specimens and repeated in situ observations of Baccharis pilularis, Cornus florida, Eriogonum fasciculatum, Larrea tridentata, Quercus agrifolia, Quercus rubra, and Symphoricarpos albus throughout North America
All climate data used in this study were estimated using the ClimateNA
v5.21 software package, available at http://tinyurl.com/ClimateNA
(Wang et al., 2016), which produces estimates of local
monthly, seasonal, and annual climate conditions at 4 km resolution.
Climate parameters used to characterize conditions within each season
included the number of frost‐free days (NFFD), mean daily minimum
temperatures (TMIN), mean daily maximum temperatures
(TMAX), total precipitation (PPT), and total precipitation as snow
(PAS) within each season. In addition, the date on which the frost‐free
period began (BFFP), the mean temperature of the coldest
month (MCMT; i.e., January or February) in the year of flowering (i.e., the
calendar year in which flowering occurred), as well as the date on
which the previous year's frost‐free period ended (EFFP), the total
annual precipitation (TAP) throughout the previous year, and the
mean annual temperature (MAT) of the previous year were consid-
ered as aspects of annual climate. In locations that typically do not
experience freezes, the date on which the previous year's frost‐free
period ended was considered to be December 31, and the date on
which the frost‐free period began was considered to be January 1.
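The rule that excludes seasons falling after a species' mean FD can be sketched as below. The sketch rests on two assumptions that the text does not spell out: a season is excluded only if it begins after the mean FD (so the season containing the mean FD is retained), and season start days are those of a non‐leap year. The season labels are hypothetical.

```python
# Start DOY of each season in a non-leap year; the previous autumn
# always precedes flowering and is therefore always retained.
SEASON_STARTS = [
    ("autumn_prev", None),  # October 1 - December 31 of the previous year
    ("winter", 1),          # January 1
    ("spring", 91),         # April 1
    ("summer", 182),        # July 1
    ("autumn", 274),        # October 1
]

def seasons_before(mean_doy):
    """Return the seasons whose climate variables are kept for a species."""
    kept = []
    for name, start in SEASON_STARTS:
        if start is None or start <= mean_doy:
            kept.append(name)
    return kept

# A species with a mean FD of April 30 (DOY 120).
spring_flowering = seasons_before(120)
```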
2.5 | Modeling reproductive phenology
In order to model the flowering phenology of each species, multiple
regression methods have commonly been used to construct predictive
models. Stepwise regression, in particular, represents a frequently
used framework for constructing phenological models, particularly
when the goal is to select which climate parameters to include in such
models (Doi & Katano, 2007; Fraga et al., 2016; Gerst, Rossington, &
Mazer, 2017; Hart, Salick, & Xu, 2014; Mazer, Gerst, Matthews, &
Evenden, 2015; Richardson, Chaney, Shaw, & Still, 2017; Roy &
Sparks, 2000; Sparks & Carey, 1995; Sparks, Jeffree, & Jeffree, 2000;
Szabó, 2016; Tryjanowski, Kuźniak, & Sparks, 2005). In order to avoid
collinearity, however, stepwise regression techniques often eliminate
variables that are highly correlated. This may reduce the accuracy of
the resulting phenological models and result in distorted perceptions
of the importance of the parameters involved if important information
is discarded. As many of the climate parameters that were considered
in this study are highly correlated (Supporting Information Table S1),
we instead use an alternative regression method, elastic net regular-
ization, which is better suited to cases in which explanatory factors
are strongly collinear.
2.6 | Elastic net regularization
Elastic net regularization is an increasingly popular method for multi-
ple regression that is often used in place of stepwise linear regres-
sion techniques, particularly in cases where the number of
explanatory factors is high or where significant collinearity among
explanatory factors exists (De Mol et al., 2009; Zou & Zhang, 2009).
Instead of selecting variables in a binary fashion, as with forward
selection or backward elimination regression techniques, elastic net
regularization enforces parsimony through the use of two penalty
terms: the sum of the absolute value of all parameter coefficients
(L1, Equation 3) and the sum of all parameter coefficients squared
(L2, Equation 4; Zou & Hastie, 2005).
$$L_1 = \sum_j |\beta_j| \tag{3}$$
$$L_2 = \sum_j \beta_j^2 \tag{4}$$
The degree to which model complexity is penalized is controlled
by a penalty weighting term (α), while the relative weighting of L1
vs. L2 penalties is controlled by a relative weighting term (ρ). The
overall model is then identified as the model for which the sum of
the SSE (sum of squared errors) and the L1 and L2 penalties, modi-
fied by the two weighting terms, is minimized (C; Equation 5).
$$C = \mathrm{SSE} + \alpha \rho L_1 + \alpha (1 - \rho) L_2 \tag{5}$$
In combination, L1 and L2 penalize model complexity and force
the coefficients of unimportant parameters to zero, as does lasso
regression (Tibshirani, 2011). The combination of L1 and L2 penaliza-
tion also provides several advantages over OLS regression, particu-
larly in cases where potential explanatory factors are highly
correlated. In OLS‐based regression methods, a high degree of
collinearity often leads to large increases in the variance of coeffi-
cients as well as in their standard errors, making the resulting models
unstable and therefore unreliable (Berry & Feldman, 2011). In elastic
net regularization, however, the L2 penalty term prevents the model
from generating extreme coefficients when confronted with highly
collinear parameters. Instead, models constructed using this method
typically exhibit a “grouping effect”(Zou & Hastie, 2005), in which
the weights of the coefficients are distributed across all of the colli-
near parameters. As a result, models constructed through elastic net
regularization typically remain highly stable when confronted by col-
linear parameters, while also avoiding the problems associated with
variance inflation of parameter coefficients that occurs when con-
ducting OLS‐based regressions on datasets with high collinearity (De
Mol et al., 2009; Raschka, 2017). Given that potentially important
climate parameters are often highly collinear (Rawal et al., 2015;
Supporting Information Table S1), this makes elastic net regulariza-
tion a better tool for the construction of, and variable selection
among, phenoclimatic models.
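The penalized cost described in this section can be written out directly, which also makes the grouping effect easy to see. The sketch below is illustrative only: it evaluates the cost for given coefficients rather than fitting a model, and the function name and numerical values are made up.

```python
def elastic_net_cost(y, y_pred, beta, alpha, rho):
    """Cost C = SSE + alpha * rho * L1 + alpha * (1 - rho) * L2."""
    sse = sum((obs - pred) ** 2 for obs, pred in zip(y, y_pred))
    l1 = sum(abs(b) for b in beta)   # sum of absolute coefficients
    l2 = sum(b * b for b in beta)    # sum of squared coefficients
    return sse + alpha * rho * l1 + alpha * (1 - rho) * l2

# Two perfectly collinear predictors give the same fit (SSE = 0 here)
# whether the weight sits on one predictor or is shared by both.
y = [1.0, 2.0, 3.0]
cost_one_predictor = elastic_net_cost(y, y, beta=[2.0, 0.0], alpha=1.0, rho=0.5)
cost_shared_weight = elastic_net_cost(y, y, beta=[1.0, 1.0], alpha=1.0, rho=0.5)
```

Both coefficient vectors have the same L1 penalty, but the shared‐weight solution has the smaller L2 penalty and therefore the lower cost (2.0 vs. 3.0), which is why the weight tends to be distributed across collinear parameters.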
2.7 | Constructing phenoclimate models
For each of the 2,468 plant taxa for which sufficient herbarium data
were available, phenological models were constructed using the
ElasticNetCV class contained within Scikit‐Learn 0.814‐4 in Python in order
to predict the FD of each species using local climate data. This
method represents an internally cross‐validated version of the elastic
net regularization methods developed by Zou and Hastie (2005), and
selects the optimal balance both between L1 and L2 penalization (ρ)
and between the sum of squared errors (SSE) and the combined
L1 and L2 penalties (α) in order to minimize both prediction error and
model complexity.
For each species, this method conducted iterative fitting along a
regularization path, using 100 values of α and 22 values of ρ (ranging
from 0.01 to 0.99) in order to determine the optimal balance
between minimizing error vs. model complexity and between L1 and
L2 penalization. The optimal model coefficients were then selected
using 25‐fold cross‐validation. For the seven species for which suffi-
cient in situ data were also available from the USA‐NPN database to
model FD, the same method was used to develop models using
in situ observations. The R² values of these models (i.e., their
explanatory power) were then compared to those based on the
herbarium‐based data representing the same seven species.
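A fit of this kind might look like the following sketch, which uses scikit‐learn's `ElasticNetCV` on synthetic data. Several details are assumptions: the 22 values of ρ are taken to be evenly spaced between 0.01 and 0.99, the predictors are standardized before fitting, and the climate variables and coefficients are simulated rather than drawn from the study. Note also that scikit‐learn scales the squared‐error and L2 terms differently from the cost function given in Section 2.6, so its `alpha_` is not numerically identical to α as defined there.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(0)
n = 200

# Simulated, strongly collinear climate predictors (illustration only).
tmax = rng.normal(15.0, 3.0, n)
tmin = tmax - 8.0 + rng.normal(0.0, 0.5, n)
nffd = 60.0 + 2.0 * tmin + rng.normal(0.0, 2.0, n)
pas = rng.gamma(2.0, 10.0, n)
X = np.column_stack([tmax, tmin, nffd, pas])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each predictor

# Simulated flowering DOY that advances with warmth.
doy = 150.0 - 2.0 * tmax - 0.1 * nffd + rng.normal(0.0, 5.0, n)

rho_grid = np.linspace(0.01, 0.99, 22).tolist()  # assumed even spacing
# By default the regularization path contains 100 values of alpha.
model = ElasticNetCV(l1_ratio=rho_grid, cv=25)
model.fit(X, doy)

n_nonzero = int(np.count_nonzero(model.coef_))
print("alpha:", model.alpha_, "rho:", model.l1_ratio_, "nonzero:", n_nonzero)
```

The fitted attributes `alpha_` and `l1_ratio_` hold the selected penalty weight and L1/L2 balance, and the number of nonzero entries in `coef_` gives one simple measure of model complexity.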
2.8 | Evaluating the predictive capacity of models derived from herbarium collections and in situ observations
The R² value for each model is the mean of the 25 iterations in
which it was trained and tested using separate datasets; this value
was considered to represent the capacity of each phenological model
to predict the timing of FD for a given species under novel condi-
tions that were not included in the training data set. Using the seven
species for which sufficient data were available to construct models
using both herbarium collections and in situ phenological observations,
we then compared the predictive capacity (i.e., the R² values)
of the models constructed using herbarium records vs. in situ
observations using paired‐sample t tests in SPSS.
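One way to obtain a mean out‐of‐sample R² across 25 folds is sketched below with scikit‐learn's `cross_val_score`. This is not necessarily the computation used in the study: the data are simulated, the penalty settings (`alpha=0.1`, `l1_ratio=0.5`) are arbitrary placeholders for values that would normally come from the tuning step, and shuffled folds are an assumption.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(1)

# Simulated standardized predictors and flowering DOY (illustration only).
X = rng.normal(size=(250, 3))
y = 120.0 - 4.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 3.0, 250)

# 25 folds: each fold is held out once while the model is trained on the rest.
folds = KFold(n_splits=25, shuffle=True, random_state=0)
estimator = ElasticNet(alpha=0.1, l1_ratio=0.5)  # placeholder penalties
scores = cross_val_score(estimator, X, y, cv=folds, scoring="r2")

mean_r2 = scores.mean()  # mean R² over the 25 held-out folds
print(round(float(mean_r2), 3))
```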
2.9 | Relationship of sampling intensity to model complexity and to predictive capacity
In order to determine whether the number of specimens analyzed
for each species influenced the complexity or predictive power of
the resulting phenological model, we conducted two linear regres-
sions among all species. In each regression, the number of herbarium
specimens was the independent variable and the dependent variable
was either (a) the number of parameters with nonzero coefficients in
each phenological model (which we considered to be an estimate of
its complexity) or (b) the predictive capacity (as measured by the
cross‐validated R²) of each phenological model.
2.10 | Importance of each type of climate parameter in predicting flowering phenology
For each species represented by herbarium data, the importance of
each type of climate parameter (i.e., TMAX, TMIN, NFFD, BFFP,
EFFP, MAT, MCMT, PPT, PAS, or TAP) for predicting FD was estimated
based on the R² values of parameter‐specific phenological
models (Table 1). These models were constructed using a series of
multiple regressions in which only those variables associated with a
given type of climate parameter (e.g., TMAX, etc.) were included as
independent variables in a given model; in all cases, the DOY of col-
lection was the dependent variable. In the case of climate parameter
types that were measured across multiple reference periods, the
value of that parameter in each time period within which it was
measured was included in the model as an independent variable,
with the exception of season‐specific variables (i.e., values for the
selected type of climate parameter within each season, such as
TMAX_winter, TMAX_spring, etc.) that were not retained in the overall
model. For example, the assessment of each type of parameter (e.g.,
TMAX) included up to five distinct variables: the mean value during
the autumn of the previous year, and the mean value during the
winter, spring, summer, and autumn of the year in which flowering
occurred. For each species, the conditions during any season(s) expe-
rienced after its mean flowering date were always excluded. Using
elastic net regularization, each regression was conducted using 25‐
fold cross‐validation, and the overall predictive power of each model
was calculated using the mean R² of all iterations.
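The construction of the predictor sets for these parameter‐specific models can be sketched as follows. The variable naming scheme (`TYPE_season`, with annual variables named by type alone) and the list of retained variables are hypothetical; only variables retained in the overall model are passed in, so season‐specific variables that were dropped there never enter a parameter‐specific model.

```python
PARAMETER_TYPES = ["TMAX", "TMIN", "NFFD", "BFFP", "EFFP",
                   "MAT", "MCMT", "PPT", "PAS", "TAP"]

def parameter_specific_sets(retained_variables, parameter_types=PARAMETER_TYPES):
    """Group the variables retained in the overall model by parameter type."""
    sets = {}
    for ptype in parameter_types:
        columns = [v for v in retained_variables if v.split("_")[0] == ptype]
        if columns:
            sets[ptype] = columns  # predictors for this type's model
    return sets

# Made-up set of variables with nonzero coefficients in an overall model.
retained = ["TMAX_winter", "TMAX_spring", "TMIN_winter", "BFFP", "PAS_winter"]
predictor_sets = parameter_specific_sets(retained)
```

Each entry of `predictor_sets` would then serve as the independent variables of one elastic net regression with the DOY of collection as the dependent variable.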
Prior to testing for significant differences among the 10 distinct
types of climate parameters listed above with respect to the mean
R² values of the models that included them, we first tested for the
homogeneity of variances of the R² values using Levene's test. As
variances in the R² values of models constructed using each parameter
type were found to be unequal (F(9, 24670) = 591.013, p < 0.01),
the mean R² values of models constructed using each of the 10 types of
climate parameters evaluated in this study were then compared fol-
lowing a nonparametric ANOVA (with type of climate parameter as
the independent variable) using Tamhane's T2 tests in SPSS. These
parameter‐specific models typically exhibited lower explanatory
power than the overall models. This reduction in explanatory power
is intentional, however, as these models were used to evaluate the
relative importance of each type of climate parameter in explaining
the observed phenological variation.
TABLE 1 Types and purposes of regression models tested in this study
| Model type | Climate parameters | Purpose | Example |
|---|---|---|---|
| Overall | All | Prediction of FD by all potential climate parameters | BFFP + Tmax_Winter + Tmax_Spring + Tmax_Summer + Tmin_Winter + Tmin_Spring + Tmin_Summer … |
| Parameter-specific | All season-specific values of a single type of climate parameter | Determine the predictive power of each climate parameter on FD, independent of season | Tmax_Winter + Tmax_Spring + Tmax_Summer |
| Reference period-specific | All climate parameters within a given season | Determine the predictive power of season-specific climate parameters on FD, independent of individual climate parameters | Tmax_Winter + Tmin_Winter + NFFD_Winter … |
In order to evaluate the possibility that some parameters might be retained only rarely in the phenoclimate models, but have high explanatory power when included (such as the potential for precipitation as snow to be highly important for species inhabiting locations with high snowfall, but irrelevant in areas with little to no snowfall), we also calculated the number of species in which a given parameter exhibited a partial R² of more than 0.5, more than 0.3, more than 0.2, and more than 0.1.
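The threshold tally described above amounts to a simple count per cutoff. The sketch below uses hypothetical species labels and R² values, chosen to show how a parameter with a modest mean R² can still be a strong predictor for a few species.

```python
THRESHOLDS = (0.5, 0.3, 0.2, 0.1)

def count_exceeding(r2_by_species, thresholds=THRESHOLDS):
    """Number of species whose R^2 for one parameter exceeds each threshold."""
    return {t: sum(1 for r2 in r2_by_species.values() if r2 > t) for t in thresholds}

# Hypothetical R^2 values for precipitation as snow (PAS) in five species
pas_r2 = {"sp_a": 0.62, "sp_b": 0.02, "sp_c": 0.25, "sp_d": 0.00, "sp_e": 0.31}

pas_counts = count_exceeding(pas_r2)
pas_mean = sum(pas_r2.values()) / len(pas_r2)
```

In this toy example the mean R² is 0.24, yet one species exceeds 0.5, which is the kind of pattern a mean alone would hide.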
2.11 | Importance of climate conditions during different reference periods
For each species represented by herbarium data, we constructed
seven season‐specific phenological models using elastic net regular-
ization. Excluding those parameters that were not retained in the
overall model, each model potentially included all types of climate
parameter within one of the following reference periods: the autumn
of the prior year; the winter, spring, summer, or autumn of the year
in which flowering occurred; or, for those parameters that are inher-
ently annual rather than seasonal in nature, the year in which flower-
ing occurred or the year prior to flowering (Table 1).
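One way to assemble the predictor sets for these season-specific models is sketched below, assuming the overall model's coefficients are stored by (parameter, reference period). The data structure, period labels, and coefficient values are hypothetical; only the rule itself (drop parameters with zero coefficients in the overall model, then group the remainder by reference period) comes from the text.

```python
from collections import defaultdict

def predictors_by_period(overall_coefs):
    """Group parameters retained by the overall model (nonzero coefficient) by period."""
    grouped = defaultdict(list)
    for (parameter, period), coef in overall_coefs.items():
        if coef != 0.0:  # the elastic net sets excluded parameters exactly to zero
            grouped[period].append(parameter)
    return {period: sorted(params) for period, params in grouped.items()}

# Hypothetical overall-model coefficients for one species
overall_coefs = {
    ("TMAX", "winter"): -1.8,
    ("TMIN", "winter"): 0.0,
    ("NFFD", "winter"): -0.6,
    ("TMAX", "spring"): -2.4,
    ("PAS", "spring"): 0.9,
    ("PPT", "prior_autumn"): 0.0,
    ("BFFP", "annual"): 0.3,
}

season_models = predictors_by_period(overall_coefs)
```

Each entry of `season_models` then lists the candidate predictors for one reference period-specific regression; periods with no retained parameter simply do not appear.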
As with previous models, each regression was conducted using 25-fold cross-validation, and the predictive power of each model was estimated as the mean R² of all iterations. The homogeneity of variances of the R² values among the seven distinct reference periods listed above was tested using Levene's test. As the variances in the R² values were unequal among reference periods (F(6, 14802) = 1217.7, p < 0.01), the mean R² values were then compared following a nonparametric ANOVA (with reference period as the independent variable) using Tamhane's T2 tests in SPSS. In order to determine the reference period that exhibited the greatest predictive power for the greatest number of species, we also calculated the number of species in which conditions during each reference period exhibited a partial R² of more than 0.5, more than 0.3, more than 0.2, and more than 0.1.
3 | RESULTS AND DISCUSSION
Models of flowering phenology can be produced using digitized herbarium records across a wide array of taxa, as phenological models of FD derived from herbarium data explained an average of 27% of the variance in FD among observations not used in model construction, with models for 1,514 taxa explaining over 20% of observed variance, and models for 494 taxa explaining <10% of observed variance (Figure 2). The predictions of FD based on herbarium specimens were as accurate as those produced based on in situ observations; no significant difference was detected in the mean explanatory power (R²) of phenoclimatic models constructed using herbarium records vs. in situ observations (t = −0.765, df = 6, p = 0.474, Figure 3, Supporting Information Table S3). Similarly, the complexity of the phenological models constructed using herbarium vs. in situ observations did not differ significantly, as represented by the number of variables selected for model inclusion (t = −0.525, df = 6, p = 0.619). Further, phenological models constructed using herbarium and in situ observations selected or excluded the same climate parameters 79% of the time on average (Supporting Information Table S4). No significant differences in the mean values of the regression coefficients for each climate parameter were detected between the phenoclimate models constructed using herbarium-derived vs. repeated in situ phenological observations (Supporting Information Table S5).
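The two comparisons reported above can be illustrated with the sketch below. It assumes a paired design, which is an inference from the reported df = 6 for seven taxa rather than something stated in this section; the R² values, parameter sets, and function names are hypothetical, and no p value is computed because the standard library lacks a t distribution.

```python
import math

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched observations."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1

def selection_agreement(selected_a, selected_b, all_parameters):
    """Share of candidate parameters that both models include or both exclude."""
    same = sum((p in selected_a) == (p in selected_b) for p in all_parameters)
    return same / len(all_parameters)

# Hypothetical cross-validated R^2 values for the same seven taxa
herbarium_r2 = [0.31, 0.22, 0.40, 0.18, 0.27, 0.35, 0.12]
in_situ_r2 = [0.33, 0.25, 0.38, 0.20, 0.30, 0.33, 0.15]
t_stat, df = paired_t(herbarium_r2, in_situ_r2)

# Hypothetical parameter selections for one taxon
agreement = selection_agreement(
    {"TMAX_spring", "NFFD_winter"},
    {"NFFD_winter", "PAS_winter"},
    ["TMAX_spring", "NFFD_winter", "PAS_winter", "PPT_spring"],
)
```

In the toy selection example the two models agree on two of four candidate parameters (one included by both, one excluded by both), giving an agreement of 0.5; the study reports a mean of 79%.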
FIGURE 2 Distribution of cross-validated R² values of all phenoclimatic models derived from herbarium data using elastic net regularization (n = 2,468 taxa, Table S2)
FIGURE 3 Cross-validated R² values among phenoclimatic models independently constructed using digital records of herbarium collections and in situ estimates of FD provided by the USA National Phenology Network's database (NPN). Vertical black lines indicate standard errors. Each set of phenoclimate models evaluated seven distinct taxa
The number of observations required to construct such models also appears to be comparatively small, as extremely low correlations were detected between sample size and model accuracy when considering species represented by 100 or more herbarium specimens (R² ≤ 0.01, df = 2,467, p < 0.01, Figure 4a). Similarly, the relationship between sample size and model complexity was also very weak (R² = 0.03, df = 2,467, p < 0.01, Figure 4b), indicating that limited specimen availability does not overly restrict the complexity of the resulting models. Herbarium-based phenological models incorporated a mean of 9.38 climate parameters (Figure 5a), and increased model complexity was associated with moderate increases in predictive power (R² = 0.23, df = 2,467, p < 0.01; Figure 5b). Variation among species in the mean temperature of collection sites, the breadth of their climate envelope, the mean latitude of the collection sites, or the number of years across which they were observed played a minimal role in determining the predictive power of the resulting phenoclimate models (R² < 0.03 in all cases, Supporting Information Table S6).
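The sensitivity checks above each reduce to the R² of a simple linear regression between a species-level attribute (such as sample size) and the R² of that species' model. A minimal sketch follows; the function name and the example values are hypothetical.

```python
def linear_r2(x, y):
    """R^2 of the least-squares line of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical specimen counts and cross-validated model R^2 for six species
n_specimens = [100, 250, 400, 900, 1500, 3200]
model_r2 = [0.22, 0.35, 0.18, 0.30, 0.21, 0.29]
sample_size_effect = linear_r2(n_specimens, model_r2)
```

A value near zero, as reported for the 2,468 taxa, indicates that adding specimens beyond the 100-record minimum does little to change model accuracy.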
3.1 | Importance of climate parameters to the prediction of FD
Parameter-specific climatic models differed significantly with respect to their mean explanatory power (Supporting Information Table S7). Among phenoclimate models that included only a single type of climate parameter, significant differences were detected in the mean R² value of models corresponding to different types of climate parameter (F = 315.51, df₁ = 9, df₂ = 24,679, p < 0.01, Supporting Information Table S7). Similarly, models corresponding to different reference periods differed significantly with respect to their mean explanatory power (F = 848.00, df₁ = 5, df₂ = 14,807, p < 0.01, Supporting Information Table S8).
FIGURE 4 Sensitivity of model R² and model complexity to sample size, estimated from the linear relationship between the number of digital herbarium records available for each species and (a) the predictive power (represented by cross-validated R² values) or (b) the complexity (measured as the number of climate parameters with nonzero coefficients) of the associated phenoclimatic model for that species. Points represent the explanatory power and model complexity of the phenoclimatic models associated with each species. Each species is represented by one model (selected by the elastic net regularization approach). Solid lines represent significant linear relationships. n = 2,468 taxa in both analyses
FIGURE 5 Summary of elastic net regularization models across all species and selected models. Frequency distribution of the number of climate parameters with nonzero coefficients among all phenoclimatic models constructed from digital records of herbarium collections (a); relationship between the explanatory power (represented by cross-validated R²) of phenoclimatic models for each species and the number of climate parameters with nonzero coefficients (b). Each point represents the phenoclimate model that was developed for a single taxon. The solid line represents the linear relationship between the predictive power of each model and the number of explanatory variables included in it. n = 2,468 taxa in both analyses
Temperature-related parameters were the primary contributors to the predictive capacity of phenoclimatic models. Of these, the most powerful predictors of FD across the 2,468 taxa evaluated in this study were the number of frost-free days (NFFD), the mean maximum temperatures (TMAX), and the quantity of precipitation that fell as snow in the seasons preceding flowering (PAS). NFFD explained a mean of 14% of the variance in FD across species (Figure 6a, Table 2). TMAX explained 12% of the variance in FD, and PAS explained 11% of observed variance in FD. By comparison, TMIN, which has commonly been used in phenoclimate models (Bertin, 2015; Mohandass, Zhao, Xia, Campbell, & Li, 2015; Munson & Long, 2017; Munson & Sher, 2015; Rawal et al., 2015; Robbirt, Davy, Hutchings, & Roberts, 2011), exhibited less than a third of the predictive power of NFFD on average (Figures 6a, 7a and Table 2). NFFD and TMAX, which were highly correlated, were likely the best predictors due to the fact that flowering time across many species has been associated with spring warming. PAS, on the other hand, may be a reliable proxy for the date of snow melt, which has been shown to be highly tied to flowering times for some species that occupy habitats with substantial winter snow cover (Inouye & McGuire, 1991).
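The ranking stated above can be checked directly against the parameter-type means reported in Table 2. The short sketch below only re-expresses those published values; the variable names are illustrative.

```python
# Mean predictive power (R^2) by parameter type, as reported in Table 2
MEAN_R2 = {
    "TMAX": 0.12, "TMIN": 0.04, "NFFD": 0.14, "BFFP": 0.10, "EFFP": 0.06,
    "MAT": 0.03, "MCMT": 0.02, "PPT": 0.09, "PAS": 0.11, "TAP": 0.05,
}

# Parameter types ordered from strongest to weakest mean predictive power
ranked = sorted(MEAN_R2, key=MEAN_R2.get, reverse=True)
top_three = ranked[:3]

# TMIN relative to NFFD: 0.04 / 0.14 is roughly 0.29, i.e. less than one third
tmin_to_nffd = MEAN_R2["TMIN"] / MEAN_R2["NFFD"]
```

The three leading parameter types are NFFD, TMAX, and PAS, and the TMIN-to-NFFD ratio falls below one third, matching the comparison made in the text.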
FIGURE 6 Mean predictive power (R²) associated with each type of climate parameter (a), and with conditions during each reference period (b) in predicting the FD of all taxa included in this analysis and represented by herbarium records (n = 2,468 species), as derived from species-specific linear regression analyses conducted using 25-fold cross-validation. Climate parameters consisted of maximum mean seasonal temperature (TMAX), minimum mean seasonal temperature (TMIN), seasonal number of frost-free days (NFFD), date of the beginning of the annual frost-free period (BFFP), date of the end of the annual frost-free period during the prior year (EFFP), mean annual temperature of the prior year (MAT), mean temperature of the coldest month (MCMT), seasonal total precipitation (PPT), seasonal precipitation as snow (PAS), and total annual precipitation of the previous year (TAP). Vertical black lines indicate standard errors of the associated mean. Within each panel, letters that are shared between bars indicate groups that do not differ significantly with respect to their mean R² value, based on Tamhane's T2 tests
TABLE 2 Mean predictive power (R²) associated with each type of climate parameter and reference period
| | Mean predictive power (R²) | Standard deviation of predictive power |
|---|---|---|
| Parameter type | | |
| TMAX | 0.12 | 0.17 |
| TMIN | 0.04 | 0.09 |
| NFFD | 0.14 | 0.16 |
| BFFP | 0.10 | 0.14 |
| EFFP | 0.06 | 0.11 |
| MAT | 0.03 | 0.10 |
| MCMT | 0.02 | 0.06 |
| PPT | 0.09 | 0.08 |
| PAS | 0.11 | 0.12 |
| TAP | 0.05 | 0.07 |
| Reference period | | |
| Prior autumn | 0.10 | 0.13 |
| Winter | 0.14 | 0.14 |
| Spring | 0.19 | 0.18 |
| Summer | 0.04 | 0.08 |
| Autumn | 0.01 | 0.01 |
| Annual | 0.17 | 0.15 |
When winter-, spring-, and summer-flowering species were examined separately, three patterns emerged. First, the relative importance of each type of climatic parameter and season was largely similar among spring- and summer-flowering species. For these species, TMAX and NFFD are the variables that most strongly affect flowering date. Second, the models applied to spring-flowering species exhibited higher predictive power than those applied to summer-flowering species (Supporting Information Figure S2 and S3). Third, winter-flowering species exhibited more similar R² values across all climate parameters and seasons than the spring- and summer-flowering species (Supporting Information Figure S2 and S3).
Interestingly, a survey of phenological studies published over the
past 3 years (representing 35 individual studies, Supporting Informa-
tion Table S9) found no cases in which the number of frost‐free days
was included in the construction of phenological models, indicating
that this parameter has largely been overlooked. Similarly, this sur-
vey detected no papers that included PAS in the phenological mod-
els. Snow melt dates, which likely represent a similar aspect of
climate, have been used in previous examinations of phenology in
alpine (Wipf, Stoeckli, & Bebi, 2009), subalpine or montane (Dunne,
Harte, & Taylor, 2003; Forrest, Inouye, & Thompson, 2010; Inouye,
2008; Inouye & McGuire, 1991; Price & Waser, 1998), and arctic
environments (Bjorkman, Elmendorf, Beamish, Vellend, & Henry,
2015; Cooper, Dullinger, & Semenchuk, 2011; Mortensen, Schmidt,
Høye, Damgaard, & Forchhammer, 2016; Wheeler, Høye, Schmidt,
Svenning, & Forchhammer, 2015). This study, however, indicates
that PAS should be considered in phenological models of taxa that
occupy a much wider range of climate regimes. Increases in NFFD
and TMAX were typically associated with advances in flowering,
while increases in PAS were associated with delays in flowering
(Supporting Information Table S10).
3.2 | Importance of reference period to the prediction of FD
When considered across all species, climate conditions during spring exhibited higher mean explanatory power than conditions during any other season, explaining a mean of 18.8% of the observed variance in FD (Figure 6b). Annual climate conditions explained a mean of 17% of the variance in FD, while conditions during the preceding winter explained only 14% of the variance on average, and conditions during the prior autumn explained a mean of 10% of the variance. Thus, it appears that annual or winter conditions are weaker predictors of FD than conditions during spring (Figures 6b and 7b). Climate conditions during spring were also found to exhibit higher explanatory power than any other reference period among both spring- and summer-flowering species. Among winter-flowering species, however, climate conditions during the prior year were found to exhibit the highest explanatory power (Supporting Information Figure S3).
4 | CONCLUSIONS
Collectively, this study demonstrates that herbarium datasets can be
used to produce powerful models for the prediction of flowering
date across a vast array of species and that the sample size required
to develop phenological models is easily achieved. Further, this study
demonstrates that elastic net regression is a powerful tool for the
design of phenoclimatic models, and that some of the most impor-
tant climate parameters for the prediction of phenological variation,
such as the number of frost‐free days, the quantity of snowfall, and
the date of the beginning of the frost‐free period, are in fact climate
parameters that have largely been overlooked in the construction of
phenoclimate models. This study also demonstrates a scalable
method for modeling phenoclimate variation across a large number
of species and represents a powerful new approach for assessing
the relationship between recent climatic conditions and flowering
phenology. Future work will leverage these methods to evaluate
whether systematic differences exist in the phenological responses
of angiosperm taxa that exhibit different growth forms, to evaluate
the degree of phylogenetic conservatism in the phenological respon-
siveness of angiosperm taxa, to measure the degree to which the
timing of phenological events has changed over time, and to evalu-
ate the degree to which future climate changes are likely to disrupt
or enhance synchronies among historically coflowering taxa.
FIGURE 7 Percentage of the 2,468 plant taxa among which the predictive power (R²) of each species' parameter-specific (a) or reference period-specific (b) model exceeded 0.1, 0.2, 0.3, or 0.5 for each type of climate parameter. Climate parameters consisted of mean maximum seasonal temperature (TMAX), minimum mean seasonal temperature (TMIN), seasonal number of frost-free days (NFFD), date of the beginning of the annual frost-free period (BFFP), date of the end of the annual frost-free period during the prior year (EFFP), mean annual temperature of the prior year (MAT), mean temperature of the coldest month (MCMT), seasonal total precipitation (PPT), seasonal precipitation as snow (PAS), and total annual precipitation of the previous year (TAP)
ACKNOWLEDGEMENTS
This work was supported by NSF DEB-1556768 (to PIs Mazer and Park). All collection data used in this study were drawn from participating institutions of the Consortium of California Herbaria (ucjeps.berkeley.edu/consortium/), SEINet (http://swbiodiversity.org/seinet/), the SERNEC Data Portal (http://sernecportal.org/portal/index.php), the Consortium of Midwest Herbaria (http://midwestherbaria.org/), the Intermountain Regional Herbarium Network (http://intermountainbiota.org), the North American Network of Small Herbaria (http://nansh.org/), the Northern Great Plains Regional Herbarium Network (http://ngpherbaria.org), and the Consortium of Pacific Northwest Herbaria (http://pnwherbaria.org/), and were accessed on March 14, 2017. A complete list of contributing herbaria is included in the supporting information.
ORCID
Isaac W. Park http://orcid.org/0000-0001-5539-1641
Susan J. Mazer http://orcid.org/0000-0001-8080-388X
REFERENCES
Abu-Asab, M. S., Peterson, P. M., Shetler, S. G., & Orli, S. S. (2001). Earlier plant flowering in spring as a response to global warming in the Washington, DC, area. Biodiversity and Conservation, 10, 597–612. https://doi.org/10.1023/A:1016667125469
Batschelet, E. (1981). Circular statistics in biology. London, UK: Academic Press.
Berry, W. D., & Feldman, S. (2011). Multicollinearity quantitative applications in the social sciences: Multiple regression in practice (pp. 38–51). Thousand Oaks, CA: SAGE Publications Ltd.
Bertin, R. I. (2015). Climate change and flowering phenology in Worcester County, Massachusetts. International Journal of Plant Sciences, 176(2), 107–119. https://doi.org/10.1086/679619
Bertin, R. I., Searcy, K. B., Hickler, M. G., & Motzkin, G. (2017). Climate change and flowering phenology in Franklin County, Massachusetts. Journal of the Torrey Botanical Society, 144(2), 153–169. https://doi.org/10.3159/TORREY-D-16-00019R2
Bjorkman, A. D., Elmendorf, S. C., Beamish, A. L., Vellend, M., & Henry, G. H. R. (2015). Contrasting effects of warming and increased snowfall on Arctic tundra plant phenology over the past two decades. Global Change Biology, 21(12), 4651–4661. https://doi.org/10.1111/gcb.13051
Borchert, R., Robertson, K., Schwartz, M. D., & Williams-Linera, G. (2005). Phenology of temperate trees in tropical climates. International Journal of Biometeorology, 50, 57–65. https://doi.org/10.1007/s00484-005-0261-7
Boulter, S. L., Kitching, R. L., & Howlett, B. G. (2006). Family, visitors and the weather: Patterns of flowering in tropical rainforests of northern Australia. Journal of Ecology, 94(2), 369–382. https://doi.org/10.1111/j.1365-2745.2005.01084.x
Bowers, J. E. (2007). Has climatic warming altered spring flowering date of Sonoran desert shrubs? The Southwestern Naturalist, 52(3), 347–355. https://doi.org/10.1894/0038-4909(2007)52[347:HCWASF]2.0.CO;2
Boyle, B., Hopkins, N., Lu, Z., Garay, J. A. R., Mozzherin, D., Rees, T., … Enquist, B. J. (2013). The taxonomic name resolution service: An online tool for automated standardization of plant names. BMC Bioinformatics, 14, 16. https://doi.org/10.1186/1471-2105-14-16
Cook, B. I., Cook, E. R., Huth, P. C., Thompson, J. E., Forster, A., & Smiley, D. (2007). A cross-taxa phenological dataset from Mohonk Lake, NY and its relationship to climate. International Journal of Climatology, 28, 1369–1383.
Cooper, E. J., Dullinger, S., & Semenchuk, P. (2011). Late snowmelt delays plant development and results in lower reproductive success in the High Arctic. Plant Science, 180(1), 157–167. https://doi.org/10.1016/j.plantsci.2010.09.005
Daru, B. H., Park, D. S., Primack, R. B., Willis, C. G., Barrington, D. S., Whitfeld, T. J. S., … Davis, C. C. (2017). Widespread sampling biases in herbaria revealed from large-scale digitization. New Phytologist, 217, 939–955. https://doi.org/10.1111/nph.14855
De Mol, C., De Vito, E., & Rosasco, L. (2009). Elastic-net regularization in learning theory. Journal of Complexity, 25(2), 201–230. https://doi.org/10.1016/j.jco.2009.01.002
Doi, H., & Katano, I. (2007). Phenological timings of leaf budburst with climate change in Japan. Agricultural and Forest Meteorology, 148, 512–516.
Dunne, J. A., Harte, J., & Taylor, K. J. (2003). Subalpine meadow flowering phenology responses to climate change: Integrating experimental and gradient methods. Ecological Monographs, 73(1), 69–86. https://doi.org/10.1890/0012-9615(2003)073[0069:SMFPRT]2.0.CO;2
Dunnell, K., & Travers, S. (2011). Shifts in the flowering phenology of the northern Great Plains: Patterns over 100 years (Vol. 98).
Forrest, J., Inouye, D. W., & Thompson, J. D. (2010). Flowering phenology in subalpine meadows: Does climate variation influence community co-flowering patterns? Ecology, 91(2), 431–440. https://doi.org/10.1890/09-0099.1
Fraga, H., Santos, J. A., Moutinho-Pereira, J., Carlos, C., Silvestre, J., Eiras-Dias, J., … Malheiro, A. C. (2016). Statistical modelling of grapevine phenology in Portuguese wine regions: Observed trends and climate change projections. The Journal of Agricultural Science, 154(5), 795–811. https://doi.org/10.1017/S0021859615000933
Gerst, K. L., Rossington, N. L., & Mazer, S. J. (2017). Phenological responsiveness to climate differs among four species of Quercus in North America. Journal of Ecology, 105(6), 1610–1622. https://doi.org/10.1111/1365-2745.12774
Hart, R., Salick, J., & Xu, J. (2014). Herbarium specimens show contrasting phenological response to Himalayan climate. PNAS, 111(29), 10615–10619. https://doi.org/10.1073/pnas.1403376111
Hereford, J., Scmitt, J., & Ackerly, D. D. (2017). The seasonal climate niche predicts phenology and distribution of an ephemeral annual plant, Mollugo verticillata. Journal of Ecology, 105, 1323–1334. https://doi.org/10.1111/1365-2745.12739
Houle, G. (2007). Spring-flowering herbaceous plant species of the deciduous forests of eastern Canada and 20th century climate warming. Canadian Journal of Forest Research, 37(2), 505–512. https://doi.org/10.1139/X06-239
Huang, J., & Hao, H. (2018). Detecting mismatches in the phenology of cotton bollworm larvae and cotton flowering in response to climate change. International Journal of Biometeorology, 62(8), 1507–1520. https://doi.org/10.1007/s00484-018-1552-0
Inouye, D. W. (2008). Effects of climate change on phenology, frost damage, and floral abundance of montane wildflowers. Ecology, 89(2), 353–362. https://doi.org/10.1890/06-2128.1
Inouye, D. W., & McGuire, A. D. (1991). Effects of snowpack on timing and abundance of flowering in Delphinium nelsonii (Ranunculaceae): Implications for climate change. American Journal of Botany, 78(7), 997–1001. https://doi.org/10.1002/j.1537-2197.1991.tb14504.x
Inouye, D. W., Saavedra, F., & Lee-Yang, W. (2003). Environmental influences on the phenology and abundance of flowering by Androsace septentrionalis (Primulaceae). American Journal of Botany, 90(6), 905–910. https://doi.org/10.3732/ajb.90.6.905
Jammalamadakka, S., & Sengupta, A. (2001). Topics in circular statistics. River Edge, NJ: World Scientific. https://doi.org/10.1142/SMA
Jones, C. A., & Daehler, C. C. (2018). Herbarium specimens can reveal impacts of climate change on plant phenology; a review of methods and applications. PeerJ, 6, e4576. https://doi.org/10.7717/peerj.4576
Lavoie, C., & Lachance, D. (2006). A new herbarium-based method for reconstructing the phenology of plant species across large areas. American Journal of Botany, 93(4), 512–516. https://doi.org/10.3732/ajb.93.4.512
Leopold, A., & Jones, S. E. (1947). A phenological record for Sauk and Dane Counties, Wisconsin, 1935-1945. Ecological Monographs, 17(1), 81–122. https://doi.org/10.2307/1948614
Matthews, E. R., & Mazer, S. J. (2015). Historical changes in flowering phenology are governed by temperature x precipitation interactions in a widespread perennial herb in western North America. New Phytologist, 210, 157–167.
Mazer, S. J., Gerst, K. L., Matthews, E. R., & Evenden, A. (2015). Species-specific phenological responses to winter temperature and precipitation in a water-limited ecosystem. Ecosphere, 6, 98. https://doi.org/10.1890/ES14-00433.1
Miller-Rushing, A. J., & Primack, R. B. (2008). Global warming and flowering times in Thoreau's Concord: A community perspective. Ecology, 89(2), 332–341. https://doi.org/10.1890/07-0068.1
Miller-Rushing, A. J., Primack, R. B., Primack, D., & Mukunda, S. (2006). Photographs and herbarium specimens as tools to document phenological changes in response to global warming. American Journal of Botany, 93(11), 1667–1674. https://doi.org/10.3732/ajb.93.11.1667
Mohandass, D., Zhao, J. L., Xia, Y. M., Campbell, M. J., & Li, Q. J. (2015). Increasing temperature causes flowering onset time changes of alpine ginger Roscoea in the central Himalayas. Journal of Asia-Pacific Biodiversity, 8, 191–198. https://doi.org/10.1016/j.japb.2015.08.003
Mortensen, L. O., Schmidt, N. M., Høye, T. T., Damgaard, C., & Forchhammer, M. C. (2016). Analysis of trophic interactions reveals highly plastic response to climate change in a tri-trophic High-Arctic ecosystem. Polar Biology, 39(8), 1467–1478. https://doi.org/10.1007/s00300-015-1872-z
Munson, S. M., & Long, A. L. (2017). Climate drives shifts in grass reproductive phenology across the western USA. New Phytologist, 213(4), 1945–1955. https://doi.org/10.1111/nph.14327
Munson, S. M., & Sher, A. A. (2015). Long-term shifts in the phenology of rare and endemic Rocky Mountain plants. American Journal of Botany, 102(8), 1268–1276. https://doi.org/10.3732/ajb.1500156
Park, I. (2014). Impacts of differing community composition on flowering phenology throughout warm temperate, cool temperate and xeric environments. Global Ecology and Biogeography, 23(7), 789–801. https://doi.org/10.1111/geb.12163
Park, I. (2016). Timing the bloom season: A novel approach to evaluating reproductive phenology across distinct regional flora. Landscape Ecology, 31, 1567–1579. https://doi.org/10.1007/s10980-016-0339-0
Price, M. V., & Waser, N. M. (1998). Effects of experimental warming on plant reproductive phenology in a subalpine meadow. Ecology, 79(4), 1261–1271. https://doi.org/10.1890/0012-9658(1998)079[1261:EOEWOP]2.0.CO;2
Primack, D., Imbres, C., Primack, R. B., & Miller-Rushing, A. J. (2004). Herbarium specimens demonstrate earlier flowering times in response to warming in Boston. American Journal of Botany, 91(8), 1260–1264. https://doi.org/10.3732/ajb.91.8.1260
Raschkla, S. (2017). Python machine learning. Birmingham, UK: Packt Publishing.
Rawal, D. S., Kasel, S., Keatley, M. R., & Nitschke, C. R. (2015). Herbarium records identify sensitivity of flowering phenology of eucalypts to climate: Implications for species response to climate change. Austral Ecology, 40, 117–125. https://doi.org/10.1111/aec.12183
Reddy, G. C. P., Shi, P., Hui, C., Cheng, X., Fang, O., & Ge, F. (2015). The seesaw effect of winter temperature change on the recruitment of cotton bollworms Helicoverpa armigera through mismatched phenology. Ecology and Evolution, 5(23), 5652–5661. https://doi.org/10.1002/ece3.1829
Richardson, B. A., Chaney, L., Shaw, N., & Still, S. M. (2017). Will phenotypic plasticity affecting flowering phenology keep pace with climate change? Global Change Biology, 23, 2499–2508. https://doi.org/10.1111/gcb.13532
Robbirt, K. M., Davy, A. J., Hutchings, M. J., & Roberts, D. L. (2011). Validation of biological collections as a source of phenological data for use in climate change studies: A case study with the orchid Ophrys sphegodes. Journal of Ecology, 99(1), 235–241. https://doi.org/10.1111/j.1365-2745.2010.01727.x
Roy, D. B., & Sparks, T. H. (2000). Phenology of British butterflies and climate change. Global Change Biology, 6, 407–416. https://doi.org/10.1046/j.1365-2486.2000.00322.x
Sahagun-Godinez, E. (1996). Trends in the phenology of flowering in the Orchidaceae of western Mexico. Biotropica, 28(1), 130–136. https://doi.org/10.2307/2388778
Schenk, M., Krauss, J., & Holzschuh, A. (2017). Desynchronizations in bee-plant interactions cause severe fitness losses in solitary bees. Journal of Animal Ecology, 87(1), 139–149.
Schwartz, M. D., & Reiter, B. E. (2000). Changes in North American spring. International Journal of Climatology, 20, 929–932. https://doi.org/10.1002/(ISSN)1097-0088
Sparks, T. H., & Carey, P. D. (1995). The responses of species to climate over two centuries: An analysis of the Marsham phenological record, 1736-1947. Journal of Ecology, 83(2), 321–329. https://doi.org/10.2307/2261570
Sparks, T. H., Jeffree, E. P., & Jeffree, C. E. (2000). An examination of the relationship between flowering times and temperature at the national scale using long term phenological records from the UK. International Journal of Biometeorology, 44, 82–87. https://doi.org/10.1007/s004840000049
Szabó, B. (2016). Flowering phenological changes in relation to climate change in Hungary. International Journal of Biometeorology, 60(9), 1347–1356. https://doi.org/10.1007/s00484-015-1128-1
Tibshirani, R. (2011). Regression shrinkage and selection via the lasso: A retrospective. Journal of the Royal Statistical Society, 73(3), 273–282. https://doi.org/10.1111/j.1467-9868.2011.00771.x
Tryjanowski, P., Kuźniak, S., & Sparks, T. H. (2005). What affects the magnitude of change in first arrival dates of migrant birds. Journal of Ornithology, 146, 200–205.
Wang, T., Hamann, A., Spittlehouse, D. L., & Carrol, C. (2016). Locally downscaled and spatially customizable climate data for historical and future periods for North America. PLoS ONE, 11, e0156720.
Wheeler, H. C., Høye, T. T., Schmidt, N. M., Svenning, J.‐C., & Forchham-
mer, M. C. (2015). Phenological mismatch with abiotic conditions—
implications for flowering in Arctic plants. Ecology,96(3), 775–787.
https://doi.org/10.1890/14-0338.1
Willis, C. G., Ellwood, E. R., Primack, R. B., Davis, C. C., Pearson, K. D., Gallinat, A. S., & Soltis, P. S. (2017). Old plants, new tricks: Phenological research using herbarium specimens. Trends in Ecology and Evolution, 32(7), 531–546. https://doi.org/10.1016/j.tree.2017.03.015
Wipf, S., Stoeckli, V., & Bebi, P. (2009). Winter climate change in alpine tundra: Plant responses to changes in snow depth and snowmelt timing. Climatic Change, 94(1), 105–121. https://doi.org/10.1007/s10584-009-9546-x
Zhao, T., & Schwartz, M. D. (2003). Examining the onset of spring in Wisconsin. Climate Research, 24, 59–70. https://doi.org/10.3354/cr024059
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Zou, H., & Zhang, H. H. (2009). On the adaptive elastic-net with a diverging number of parameters. Annals of Statistics, 37(4), 1733–1751. https://doi.org/10.1214/08-AOS625
SUPPORTING INFORMATION
Additional supporting information may be found online in the
Supporting Information section at the end of the article.
How to cite this article: Park IW, Mazer SJ. Overlooked
climate parameters best predict flowering onset: Assessing
phenological models using the elastic net. Glob Change Biol.
2018;00:1–13. https://doi.org/10.1111/gcb.14447