All content in this area was uploaded by Joseph A E Stewart on Oct 18, 2024
Distribution and Apparent Decline of Aspen (Populus tremuloides)
in the Broader Lake Tahoe Area:
A Four-Decade Assessment
Prepared by:
Joseph A. E. Stewart, Department of Plant Sciences, UC Davis
Jonathan W. Long, Pacific Southwest Region, US Forest Service
Suggested Citation:
Stewart J.A.E. & Long J.W. 2024. Distribution and Apparent Decline of Aspen (Populus tremuloides)
in the Broader Lake Tahoe Area: A Four-Decade Assessment. Tahoe Science Advisory Council (TSAC),
Incline Village, NV. 10.5281/zenodo.13948510
Table of Contents
Abstract
Introduction
Results
    Spatial Models
    Spatiotemporal Models
Discussion and Management Implications
Methods
    Sample Design & Data Collection
    Reflectance Data
    Model Parameterization
Contributors
Supplementary Materials
References
Abstract
We used machine learning models to produce remotely sensed maps of aspen (Populus
tremuloides) percent cover in the broader Lake Tahoe area (BLTA). Aspen is an important
ecological and cultural resource both sensitive to and dependent on wildfire, and also vulnerable
to climate change. Elsewhere in its range aspen declines have been well documented. More
information on where aspen are and how aspen cover has changed over time is needed to
inform management. Our ensemble model of aspen cover over the past decade (2014–2023)
performed well in cross-validated metrics of predictive performance (e.g., R² = 0.81). The maps
provide a more accurate and detailed view of the distribution of aspen in our area compared
with previous maps that delineated aspen presence but did not assess levels of cover. Model
outputs indicate that aspen cover has declined in our study area over the past 40 years. This
result is consistent across several distinct versions of our models, appears to be robust to
potential sources of statistical bias, and is supported by multiple lines of outside evidence.
Across the BLTA we provide an initial estimate that aspen cover has declined by about 26%
(95% CI: 9–39%) over the 1984–2023 period. A greater focus on restoration treatments, such
as prescribed fire, strategic management of naturally ignited fires, and targeted thinning of
conifers overtopping aspen, could slow or reverse apparent aspen declines in our region.
Detailed maps, such as our product, can serve to inform strategic and adaptive management.
Introduction
Hardwood communities are an important ecological and cultural resource both sensitive
to and dependent on wildfire, and also vulnerable to climate change. The Tahoe Regional
Planning Agency has adopted an “environmental threshold” for riparian hardwoods in the basin
through various policies and restoration projects, and it has sought to map trends over time and
in response to restoration treatments targeting aspen (Populus tremuloides) communities.
Similarly, the Land and Resource Management Plan for the Lake Tahoe Basin Management
Unit prioritized a monitoring question, “What is our progress towards maintaining and improving
willow and aspen habitats within the Basin?” Existing vegetation maps for the Tahoe Basin have
deficiencies in their resolution, accuracy, and/or temporal update cycle that limit their utility in
tracking the condition of existing hardwood stands. Updating these maps and providing accurate
quantification of hardwoods has been identified as a management need. This project builds on
recent regional work to build high-resolution maps of aspen stands in the Tahoe Basin.
Results
Spatial Models
Our remotely sensed estimate of the area of the Lake Tahoe Basin Management Unit
(hereafter referred to simply as “the basin” or LTBMU) with at least 10% aspen cover over the
period 2014–2023 was 886 ha (0.34% of terrestrial area). We estimate that 166 ha had at least
50% aspen cover during this period (0.06% of the terrestrial area in the basin). Aspen cover was
higher over our entire study area, the Broader Lake Tahoe Area (BLTA), defined as the basin
buffered by 20 km (Fig 1). We estimate that the areas of the BLTA with at least 10% and 50% aspen
cover are 3,658 ha and 872 ha, respectively (0.43% and 0.10% of terrestrial area; Fig 2). These
estimates are derived from our ensemble machine learning model, version
4.6.LS4to9.Ensemble.T02, referred to simply as the “ensemble model” elsewhere in this report.
They reflect estimated aspen cover within 900-m² Landsat-aligned grid cells. Estimates of
“cover” refer to percent cover from above (PCFA): the percent cover visible from directly
overhead, as seen by satellites in low Earth orbit. The term “cover” is used interchangeably with
PCFA elsewhere in this report.
Fig 1. Map of aspen cover for the period 2014–2023 as estimated by our ensemble model. The
external boundary is the broader Lake Tahoe area. The internal boundary is the Lake Tahoe Basin
Management Unit. Lakes are shown in light blue.
Fig 2. Area of the broader Lake Tahoe area with aspen cover above threshold levels, as estimated by our
ensemble model for the period 2014–2023 (x-axis: aspen cover threshold [%]; y-axis: area with aspen
cover > threshold [ha]). Areas with sparse scattered aspen trees (e.g., 10–20% aspen
cover) appear to be more extensive than areas with high aspen cover (e.g., > 80% aspen cover). The
areas with greater than 10% and 50% aspen cover are shown with small dots. Estimates of the area with
less than about 10% aspen cover may be less reliable, as commission and omission
errors appear to be more prevalent below this threshold.
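The threshold curve summarized in Fig 2 can be sketched as a simple count of grid cells at or above each cover threshold, multiplied by the cell area. The snippet below is only an illustration: the function name and the per-cell cover values are hypothetical, and the use of "at least" (>=) follows the wording of the text rather than any documented implementation.

```python
# One 900-m² Landsat-aligned grid cell expressed in hectares (900 / 10,000).
CELL_AREA_HA = 900 / 10_000


def area_above_threshold(cover_pct, threshold):
    """Area (ha) of cells whose estimated aspen cover is at least `threshold` percent."""
    n_cells = sum(1 for c in cover_pct if c >= threshold)
    return n_cells * CELL_AREA_HA


# Hypothetical per-cell cover estimates (percent); not values from the report.
cover = [0, 0, 5, 12, 18, 35, 55, 90]

# Area above each 10% step, analogous to the curve plotted in Fig 2.
curve = {t: area_above_threshold(cover, t) for t in range(10, 101, 10)}
```

Dividing each value by the total terrestrial area would give the percentages quoted alongside the hectare estimates.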
Our models of aspen cover over the 10-year period 2014–2023 performed well in
cross-validated metrics of predictive performance (Table 1). Ensemble and extreme gradient boosting
(XGB) models were trained and evaluated on 82,967 900-m² Landsat-aligned plots, including
1,108 surveyed plots, 41,874 background plots, and 39,985 NPP plots (see Methods). Survey
plots with disturbance events between the survey date and model period (2014–2023) were
excluded. Maxent (ME) models were trained and evaluated on a smaller dataset, consisting of
plots where each seasonal period had at least one unobstructed satellite observation of spectral
reflectance over the full model period.
The coefficient of determination for our ensemble model was 0.81 (Table 1). The
ensemble model had the best performance in terms of coefficient of determination, root mean
square error, and Brier Score, while the XGB model had the highest performance in terms of
mean absolute error and log loss. Plots of observed vs predicted performance indicate the XGB
and ME models each had their own strengths and weaknesses. XGB had better performance
distinguishing areas with aspen from areas without aspen. ME had better accuracy and less
bias distinguishing the level of aspen cover. Missing (i.e., obstructed) spectral data resulted in
the ME model being unable to estimate cover in about 1.4% of the terrestrial area within our
study region. To capture the best aspects of both models we composed an ensemble model
with aspen cover calculated as mean(P_XGB, P_ME) ⋅ (P_XGB ≥ 2), where P_XGB and P_ME are percent
aspen cover, as estimated by the two component models, and the comparison evaluates to 1 when
true and 0 otherwise. The 2% threshold applied to the XGB
model was set by examining aerial imagery, reliability diagrams, and performance metrics, and
by driving around the BLTA with binoculars, attempting to balance the resulting levels of
commission and omission errors. The resulting ensemble model has strong overall
performance, with a notable bias toward underpredicting aspen cover in plots with > 80% aspen
cover (Fig 3).
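As a minimal sketch of the ensemble rule described above, the function below averages the XGB and Maxent estimates and sets the result to zero wherever the XGB estimate falls below the 2% threshold. The function and argument names are hypothetical. How the ensemble treats cells where the Maxent model has no estimate is not stated in this section, so that case is deliberately left out.

```python
def ensemble_cover(p_xgb, p_me, xgb_threshold=2.0):
    """Ensemble aspen cover (percent) for one grid cell.

    Implements mean(P_XGB, P_ME) * (P_XGB >= threshold), with both inputs
    given as percent cover from the two component models.
    """
    if p_xgb < xgb_threshold:
        # XGB estimate below the threshold: the indicator term is 0.
        return 0.0
    return (p_xgb + p_me) / 2.0


# Hypothetical cell-level estimates (percent), for illustration only.
low_xgb = ensemble_cover(1.5, 40.0)    # XGB below 2%, so the ensemble is 0
both_high = ensemble_cover(60.0, 80.0)  # simple mean of the two models
```

Gating on the XGB estimate reflects the observation that XGB was better at separating aspen from non-aspen areas, while averaging draws on the Maxent model's better calibration of cover levels.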
Table 1. Cross-validated predictive performance of top performing models of aspen cover for the period
2014–2023. Ensemble and XGB models were trained and evaluated on 82,967 900-m² Landsat-aligned
plots, including 1,108 surveyed plots, 41,874 background plots, and 39,985 NPP plots. Survey plots with
disturbance events between the survey date and model period (2014–2023) were excluded. ME models
were trained and evaluated on a smaller dataset, consisting of plots where no seasonal periods were
completely obstructed over the full model period. Ensemble models use the simple mean of XGB and ME
predictions.
Model                      R²      MAE     RMSE    Log Loss  Brier Score
4.6.LS4to9.Ensemble.T02    0.8120  0.0051  0.0255  0.0097    0.0006
4.6.LS4to9.XGB             0.7813  0.0037  0.0275  0.0087    0.0008
4.6.LS4to9.ME              0.7989  0.0230  0.0517  0.0408    0.0027
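For readers who want to reproduce metrics like those in Table 1, the sketch below computes them from paired observed and predicted cover values. It assumes cover is expressed as a proportion (0 to 1) and that log loss and Brier score are applied directly to fractional cover. The report does not specify either choice, so both are assumptions here. The reported Brier scores are close to the squares of the reported RMSE values (e.g., 0.0255² ≈ 0.00065), which would be consistent with that reading.

```python
import math


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def brier(obs, pred):
    """Brier score, taken here as the mean squared difference (assumption)."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(brier(obs, pred))


def log_loss(obs, pred, eps=1e-15):
    """Cross-entropy with fractional targets (assumption); predictions are clipped."""
    total = 0.0
    for o, p in zip(obs, pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= o * math.log(p) + (1.0 - o) * math.log(1.0 - p)
    return total / len(obs)


# Hypothetical held-out observations and predictions (proportion cover).
observed = [0.0, 0.0, 0.5, 1.0]
predicted = [0.0, 0.1, 0.5, 0.9]
scores = {f.__name__: f(observed, predicted)
          for f in (r_squared, mae, rmse, log_loss, brier)}
```

In a cross-validated setting these functions would be applied to predictions for plots held out of model fitting.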
Informal field surveys (driving, walking, bicycling, binoculars) conducted from June–October
2024 by JAES and JWL also suggest our ensemble model has strong overall
performance. The boundaries of large aspen stands are depicted with remarkable detail. The
model successfully detects the presence of aspen that are intermixed with a multitude of other
plant species and understory conditions. Notably, the model often fails to detect low levels of
aspen cover in more urban or suburban environments (e.g., parking lots, irrigated lawns, denser
buildings); this is unsurprising given the data used to train the model came mostly from less
disturbed native ecosystems. Omission errors in 900-m² pixels with greater than about 10%
aspen cover from above appear to be relatively uncommon. Below this level omission errors
become more common. We observed many instances where the model failed to detect low
levels of aspen cover (e.g., a single large aspen tree within a pixel, sparse saplings typically
totaling less than about 10% cover within a pixel). As expected, when satellite views of aspen
are largely obstructed by taller trees (i.e., high understory cover but low cover from above) the
model’s ability to detect aspen is hampered. Commission errors typically consist of the model
estimating low levels of aspen cover in areas dominated by allied species and vegetation types
(e.g., montane riparian, alder, cottonwood, willows). Commission errors appear to be relatively
uncommon in areas the model estimated have greater than about 10% aspen cover. Providing
our model with additional training data—spanning a wider range of adjacent vegetation
compositions—would improve its overall performance. In particular, the model would benefit
from additional survey data from areas dominated by other montane riparian species.
Fig 3. Reliability diagram depicting out-of-sample predictive performance of our ensemble model. The
model appears to perform remarkably well distinguishing areas with aspen stands from areas without
aspen stands and moderately well predicting aspen cover within individual 30-m x 30-m grid cells.
Compared with previous maps, our ensemble model provides a more information-rich
picture of aspen cover. While our product provides quantitative estimates of aspen cover within
900-m² pixels, the previous maps classify polygons by vegetation type or taxa and do not
estimate aspen cover. To the extent that these disparate data types can be compared, our
product appeared to outperform the previous products. Still, there are locations where our product
missed aspen that a previous product correctly mapped.
We compared our map with previous products by reviewing areas of disagreement
between the products. We examined sequences of Google Earth imagery and conducted
informal field surveys in these areas. Compared to previous maps, our map appeared to have
higher overall levels of accuracy and detail. Our product appeared to be much more accurate
than FVEG WHR (FRAP 2015). The two WHR types that explicitly include aspen are called
“aspen” and “montane riparian”. FVEG had high rates of omission errors and moderate rates of
apparent commission errors for identifying aspen stands. The FVEG map appears to omit a high
proportion of aspen stands in our study area. We identified 925 ha where our ensemble model
estimated aspen cover was > 25% but that were not mapped as aspen or montane riparian types by
FVEG, and 2,200 ha where our model estimated aspen cover was > 10% but that were mapped as
non-aspen types by FVEG. The non-aspen WHR types that FVEG most often assigned to these areas
include juniper, montane hardwood-conifer, fresh emergent wetland, and lodgepole pine
(Fig 4).
Fig 4. Discrepancies between our product and WHR vegetation type as mapped by FVEG. Left panel:
Mean aspen cover, as estimated by our ensemble model, within areas mapped as WHR types by FVEG.
Only the aspen and montane riparian WHR types explicitly include aspen in their type descriptions.
Review of aerial imagery in areas of disagreement between the two products suggests that our product is
much more accurate. In the FVEG map, WHR types that had a high proportion of aspen omission errors
included juniper and montane hardwood-conifer types. Right panel: Locations predicted to have > 25%
aspen cover by our model that are classified by FVEG as WHR types that do not explicitly include aspen.
Location boundaries are outlined in blue to enhance their visibility.
Compared with FVEG WHR, the Dilts et al. (2020) map had far lower levels of both
omission and commission errors. However, the Dilts et al. (2020) map appeared to be
sometimes inconsistent in its level of spatial detail; some polygons have detailed boundaries
that mostly exclude non-aspen areas, while some polygons include large areas (e.g., > 1 ha)
where aspen are not apparent (i.e., apparent commission errors). Our ensemble model appears
to perform better at correctly identifying the presence of aspen than the Dilts et al. (2020) map,
but both products are useful for finding errors made by the other product. Our product identifies
many small aspen stands that were omitted by the Dilts et al. (2020) map. The Dilts et al. (2020)
map includes many areas of sparse (e.g., 5%) aspen cover that were omitted by our product.
Spatiotemporal Models
Annual to decadal temporal resolution models were fit for periods from 1984–2023. We
evaluated modeling approaches that either pooled data over multiple year–year periods or fit
models separately for each period. Of these two categories, models fit separately to each period
performed better. Interannual differences in weather and phenology appear to result in distinct
vegetation signals for each year–year period. Within the limited number of models we tested,
models fit to one year–year period did not tend to generalize well to other periods. However, we
anticipate that the predictive performance of pooled-period approaches can be improved with
further model tuning and data collection. Hybrid and/or hierarchical modeling approaches
appear poised to result in improved performance for estimates of vegetation cover over time.
Fig 5. Model estimates of aspen decline and recovery around Marlette Lake during a mass summertime
defoliation event caused by an outbreak of white satin moths. Model estimates of aspen cover over time
broadly align with both written accounts and observations from NAIP and Google Earth imagery. Less
clear is the degree to which year-to-year fluctuations before the mass defoliation event and after recovery
reflect real ecological changes or statistical artifacts.
Machine learning models fit separately to each period demonstrated skill tracking clear
cases of large-scale changes in aspen cover over time. For instance, model predictions
generally tracked the mass defoliation and recovery event surrounding the circa 2017 white satin
moth outbreak at Marlette Lake (Fig 5), in which a large proportion of aspen trees lacked leaves
during the summer growing season. At Marlette Lake, model estimates generally align with both
written accounts and with clearly observable patterns in the sequence of Google Earth imagery.
Similarly, model predictions for an area of the 2021 Tamarack Fire that burned at high severity
align with a die-off event that is clearly observable from Google Earth imagery (Fig 6).
To assess more broadly the ability of our model to accurately track changes over time at local spatial
scales, we used linear regression on annual predicted aspen cover over time for
each 900-m² pixel (i.e., aspen_cover ~ intercept + slope * year). We examined trends over
various time periods (e.g., 2004–2023) and identified clusters of pixels with higher
coefficients of determination (e.g., R² > 0.5). We then examined Google Earth imagery in a few
dozen of these areas predicted to have substantial change in aspen cover over time. This
evaluation had mixed results. In most areas Google Earth imagery was not of sufficient quality
to determine if the model was correctly identifying trends. When Google Earth imagery allowed
for trend identification, our model appeared to outperform random chance in predicting the
direction of aspen cover change. However, this exercise left us with the sense that model
estimates of change in cover over time may be noisier than estimates of cover across space,
and that further model improvements would be prudent to improve its capability to inform local-
scale management.
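The per-pixel trend screening described above can be sketched as an ordinary least-squares fit of annual cover on year, keeping pixels whose coefficient of determination exceeds a cutoff. The function name and the example series are hypothetical, and the authors' actual implementation is not shown in this report.

```python
def linear_trend(years, cover):
    """Fit cover ~ intercept + slope * year by least squares.

    Returns (intercept, slope, r2). r2 is reported as 0.0 for a constant
    series, where the coefficient of determination is undefined.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(cover) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in cover)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, cover))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return intercept, slope, r2


# Hypothetical annual cover estimates (percent) for a single pixel.
years = [2004, 2005, 2006, 2007, 2008]
cover = [50.0, 48.0, 46.0, 44.0, 42.0]
intercept, slope, r2 = linear_trend(years, cover)

# Flag the pixel for imagery review if the trend explains most of the variance.
flag_for_review = r2 > 0.5
```

A high R² only indicates a consistent modeled trend. As the evaluation above suggests, it does not by itself confirm that cover actually changed on the ground.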
Several areas, where the model identified potential changes in aspen cover, appear to
have experienced mass defoliation events, where a large proportion of the mature aspen trees
within a stand were clearly missing their leaves in the middle of summer for one or more years.
We suspect that defoliation events in our training data may be causing the model to
overestimate aspen cover for defoliated aspen. Refining our training data with further surveys,
focused on tracking change over time in individual pixels, and well stratified across both space
and time, would improve the model’s ability to track changes in cover over time. Further, we
suspect that building a remote sensing algorithm specifically focused on identifying summertime
mass aspen defoliation events would be fruitful. There appears to be an abundance of mid-
summer images of aspen defoliation events on Google Earth that could be used to develop data
to train this model. Using this complementary method to identify mass defoliation events (e.g.,
white satin moth, fire) and the extent to which stands subsequently recovered, could
substantially improve our ability to track change in aspen cover over time.
Fig 6. Aspen decline in an area of the 2021 Tamarack Fire that burned at high severity, followed by initial
apparent recovery. Model estimates of aspen cover broadly align with observations from NAIP and
Google Earth. The postfire recovery estimated by the model aligns with observations that aspen typically
resprouts following high severity fire. However, because low-growing aspen saplings are more difficult to
survey via aerial imagery than taller trees, on-the-ground surveys may be necessary to accurately train
and validate model estimates of remotely sensed aspen recovery following high severity fire.
Our models appeared to have skill tracking aspen cover over time. However, we do not
yet have sufficient repeated-measure survey data to quantify their accuracy in this task. Local
estimates of change in cover over time may be noisier than estimates of cover across space,
particularly where training data were sparser. We surmise that an expanded training dataset,
focused on changes over time within individual plots, would improve and better quantify the
model’s ability to accurately track vegetation cover over time. Further model refinements are
needed to more precisely and reliably track changes, particularly at finer spatiotemporal
resolutions. Users seeking to track change over time at local spatial scales should exercise
caution interpreting the current version of our model estimates.
Fig 7. Estimated area by percent aspen cover within the broader Lake Tahoe area over time. The general
trend of declining aspen cover over time is consistent across distinct versions of the model (e.g., distinct
parameterizations for each period vs. pooled data over all periods; 1-, 2-, 5-, and 10-year periods, etc.).
This figure depicts estimates from version 4.6.LS4to7.XGB of the model. We used only data from Landsat
4–7 here to avoid potential for shifting biases over time caused by inclusion of data from Landsat 8–9,
which have slightly different spectral bands and begin in 2013.
Model outputs consistently indicated a decrease in aspen cover over time (1984–2023)
in our study area (Fig 7). We used two methods to obtain estimates of percent change in aspen
cover over time. For both methods we estimated the total area of aspen cover in our study
region for each period, which we calculated as Σ(aspen_PCFA/100 * grid_cell_area *
(aspen_PCFA > 10)), excluding grid cells where estimated aspen cover was ≤ 10% (i.e., where
commission errors become more prevalent). Both methods used only spectral data from
Landsat 4–7 to avoid potential biases from including Landsat 8–9 (i.e., we used model version
4.6.LS4to7.XGB). In the first method we simply calculated the ratio of the total area of aspen
cover between the first (1984–1993) and last (2014–2023) 10-yr periods. This method yielded
an estimated decline in aspen cover of 16%. In the second method we used log-linear
regression on 5-yr resolution estimates of the total area of aspen cover. This method yielded an
estimated decline of 26% (95% CI: 9–39%) over the 1984–2023 period.
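The two estimates of percent change can be sketched as follows. The total-area function follows the summation given in the text. The regression function fits ln(area) against year and converts the slope into a proportional decline over a chosen span. The period midpoints, the span endpoints, and all function names are assumptions made for illustration, and the confidence interval reported above is not reproduced here.

```python
import math


def total_aspen_area(cover_pct, cell_area_ha=0.09, min_cover=10.0):
    """Sum of aspen_PCFA/100 * cell area over cells with cover > min_cover percent."""
    return sum(c / 100.0 * cell_area_ha for c in cover_pct if c > min_cover)


def ratio_decline(first_area, last_area):
    """Method 1: proportional decline between the first and last periods."""
    return 1.0 - last_area / first_area


def log_linear_decline(years, areas, start_year, end_year):
    """Method 2: regress ln(area) on year, then express the fitted slope
    as a proportional decline between start_year and end_year."""
    logs = [math.log(a) for a in areas]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return 1.0 - math.exp(slope * (end_year - start_year))


# Hypothetical 5-yr period midpoints and total areas (ha); not the report's data.
midpoints = [1986, 1991, 1996, 2001, 2006, 2011, 2016, 2021]
areas = [1000.0 * math.exp(-0.01 * (y - 1986)) for y in midpoints]
decline = log_linear_decline(midpoints, areas, 1984, 2023)
```

The regression uses all periods rather than only the two endpoints, which may be one reason the two methods give different figures.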
One potentially confounding factor that could bias model estimates is the quantity or
quality of Landsat spectral data. More limited availability of spectral data can cause the model to
make more commission errors, resulting in higher estimates of aspen cover. If the quantity of
spectral data increased over time, this could bias the model toward estimating a decline in
aspen cover. To mitigate this issue, we examined predicted aspen cover over time using only
data from Landsat 4–7, thereby reducing the quantity of spectral data after the 2013 launch of
Landsat 8. Landsat 8 and 9 use slightly different spectral bands, introducing another potential
source of bias. The quantity of unobstructed Landsat 4–7 data over time exhibits a hump-
shaped relationship; the 1999–2011 period has about twice as many unobstructed observations
per pixel per year compared to the preceding, 1984–1998, and subsequent, 2012–2023,
periods, which have about the same number of unobstructed observations per pixel per year.
Estimates of change over time from models that used only Landsat 4–7 data show somewhat
attenuated decline over time compared to models that used Landsat 4–9, but still estimate
that substantial decline has occurred (Fig 7). Thus, while trends over time in the quantity of
spectral data may bias our estimates of change in aspen cover over time, the qualitative trend of
declining aspen cover over time was robust to this potentially confounding factor.
Another potentially confounding factor that could bias model estimates is change in
aspen cover over time within our training data. For most of the plots in our dataset we assumed
that aspen cover did not change over time. Exceptions included the area near Marlette
Lake, where a well-documented mass summertime defoliation event occurred circa 2017, and a
small number of areas that experienced > 10% loss of basal area due to fire. If aspen cover
tended to increase over time within our training data, this could bias the model to overestimate
aspen cover during earlier periods. To mitigate this issue, we also examined trends in aspen
cover using an earlier, pooled-period version of our model (version 4.1), where data from all
periods was fed into a single machine-learning model parameterization and the single model
was used for estimating aspen cover across all periods. Compared with later versions that
parameterize the model separately for each period, this version of the model was less accurate.
However, because the model parameterization does not change over time, the model has less
potential to be biased by gradual changes in aspen cover within plots in our training data.
Outputs from this pooled-period version of the model also indicated that aspen cover has
decreased over time within our study area. Similarly, outputs from versions of our model that
used 1-, 2-, 5-, and 10-year temporal resolution all indicated that aspen cover has declined.
Notably, observations from field surveys and aerial imagery suggest that the opposite bias may
be present; our training data appears to include plots where aspen cover decreased over time.
For this reason, our current methods may underestimate the true rate of decline in aspen cover
over time.
Discussion and Management Implications
Our study provides an assessment of aspen (Populus tremuloides) distribution, cover,
and change over time in the broader Lake Tahoe area over the past four decades. The high-
resolution maps produced by our ensemble machine learning model offer a detailed picture of
aspen cover, significantly improving upon previous vegetation mapping efforts in the region,
which estimated aspen distribution but not the level of aspen cover. Our findings suggest a
concerning trend of aspen decline, estimated at approximately 26% decline (95% CI: 9–39%)
over the period from 1984 to 2023. This decline aligns with broader patterns observed across
the western United States and highlights opportunities for targeted management interventions.
The ensemble model developed in this study demonstrated strong predictive
performance (R² = 0.81) in estimating aspen cover across the study area. The model's ability to
detect aspen intermixed with various vegetation types and understory conditions is particularly
noteworthy, given the limited training data (approximately 1,000 aerial surveys of 900-m² plots
with non-zero aspen cover). This performance underscores the potential of machine learning
approaches in vegetation mapping, especially when combined with strategic sampling and
diverse data. Additional data and model refinements would increase the accuracy of model
estimates across both space and time.
The model also demonstrates capacity to track changes in aspen cover over time. Its
performance tracking known disturbance events, such as the white satin moth outbreak at
Marlette Lake and high-severity wildfire impacts, provides evidence of the model's ability to
quantify cover changes. Nevertheless, users should exercise caution when interpreting
fine-scale estimates of change over time from our current models. Local-scale estimates of change
in cover over time appear to be noisy, and we do not yet have sufficient repeated-measure
survey data to more broadly quantify the accuracy of model estimates of change over time.
We have greater confidence in region-wide estimates of change over time.
The estimated 26% decline in aspen cover over the past four decades is consistent with
observations from other parts of the western United States (Pierce and Taylor 2010, Estes
2016, Refsland and Cushman 2021). This trend is particularly concerning given the ecological
importance of aspen in the Lake Tahoe Basin, where it has been identified as one of nine
Ecologically Significant Areas that disproportionately support biodiversity relative to their area
(Murphy et al. 2000). Several interacting factors may be contributing to this decline. Fire
suppression hinders aspen regeneration and facilitates conifer encroachment (Krasnow and
Stephens 2015). This mechanism was apparent during our informal field surveys; we found
areas where mature aspen stands appeared to have been recently replaced by dense conifers
overtopping aspen understories (i.e., fallen trunks of large aspen trees were prevalent on the
ground). The recent invasion of white satin moths (Leucoma salicis), first detected in the Tahoe
region in 2011, caused large-scale defoliation events in 2017 and 2018 (Tahoe Environmental
Research Center 2019). Aspen declines have also been attributed to climate change, with
further declines projected, unless there is a substantial increase in fire frequency (Rehfeldt et al.
2009, Yang et al. 2015, White et al. 2022).
Recent conifer thinning treatments in the basin alone appear unlikely to stem aspen
declines because of their limited extent and limits on the size and amount of conifer trees that
have been removed (Berrill et al. 2016, Berrill et al. 2017). Prospects for slowing or reversing
aspen decline appear to hinge primarily on fire, specifically management of naturally ignited
fires or higher-severity prescribed fires. Removal of conifers can boost regeneration but is less
effective than fire. Aspen regeneration is more vigorous following high-severity wildfires than
after low-severity burns or prescribed fires. Fire also creates conditions conducive to dispersal
via seedling establishment. For this reason, fire may be crucial for aspen resilience to climate
change because it creates opportunities for aspen to shift their distribution toward more
favorable climate conditions. Strategic management of fire appears to be critical to maintaining
and restoring aspen populations (Krasnow and Stephens 2015, White et al. 2022). Detailed
maps of aspen cover—such as our product—can aid strategic and adaptive management,
allowing managers to steer fire for beneficial effects and target treatments to where they may be
most effective. Our product may also be useful for identifying areas where conifers are
suppressing aspen understories (e.g., where aspen cover has declined), though further model
development would be prudent to improve this capacity.
Methods
Sample Design & Data Collection
We used human interpretation of vegetation cover from remote imagery to develop a
dataset to train machine-learning models. Remote imagery consisted of aerial photographs (i.e.,
Google Earth, drone flights, NAIP), and publicly available ground-based photographs (e.g.,
Google Street View). Drone flights, focused on collecting high-resolution images during the fall
leaf-senescence period, were conducted by Derek Young in October 2023. Surveyors used
these images to estimate vegetation cover within 900-m² square plots, aligned to USGS
Landsat pixels. Our study area consisted of the broader Lake Tahoe Area (BLTA), defined by
the Lake Tahoe Basin Management Unit (LTBMU or “the basin”) buffered by 20 km. USGS
Landsat imagery in our study area spans raster grids in two projections: UTM zones 10N and
11N, with resolutions of 30 m and origins of (15,15). We processed these raster grids into two
vector-based sampling grids, spanning our study region, for surveyors to overlay on top of aerial
imagery.
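The alignment of plots to the Landsat grid can be sketched as follows. This is a minimal illustration, not the code used in this project: it assumes that cell edges fall at 15 m plus whole multiples of 30 m in each UTM zone (our reading of "origins of (15,15)"), and the function name and example coordinates are hypothetical.

```python
import math

CELL = 30.0    # Landsat pixel size in metres (from the text)
ORIGIN = 15.0  # grid origin offset in metres (from the text; assumed to mark cell edges)


def landsat_cell_bounds(x, y, cell=CELL, origin=ORIGIN):
    """Return (xmin, ymin, xmax, ymax) of the grid cell containing UTM point (x, y)."""
    xmin = math.floor((x - origin) / cell) * cell + origin
    ymin = math.floor((y - origin) / cell) * cell + origin
    return (xmin, ymin, xmin + cell, ymin + cell)


# Hypothetical easting/northing, used only to demonstrate the snapping
bounds = landsat_cell_bounds(751_234.0, 4_325_678.0)
area_m2 = (bounds[2] - bounds[0]) * (bounds[3] - bounds[1])
```

Each resulting cell covers 900 m2, matching the plot size used by surveyors. Coordinates from UTM zones 10N and 11N would need to be snapped within their own zone's grid.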
Fig 8. A screenshot of our initial vegetation survey database and data entry form. A subsequent version
of this form includes a range of dates, instead of a single date, for which surveyors assess vegetation
cover estimates to be accurate.
We developed a vegetation survey protocol in collaboration with Laura Young-Hart
(LYH), of the Mapping and Remote Sensing Program at the US Forest Service. LYH is a
botanist with strong expertise and experience surveying vegetation via aerial imagery for
mapping and assessment. LYH helped us identify which tree and shrub taxa can be reliably
identified by human observers using aerial imagery. We identified five physiognomic groups and
ten native hardwood taxonomic groups that are common in our study region and can be readily
identified by skilled observers. Physiognomic categories included conifer tree, hardwood tree,
shrub, herbaceous, and non-vegetation. The most readily identifiable taxonomic groups in our
study area include: Populus tremuloides (quaking aspen), Populus trichocarpa/fremontii (black
and Fremont cottonwoods, which are difficult to distinguish in part because they commonly
hybridize in the region forming Populus × parryi), Cercocarpus ledifolius (mountain mahogany),
Quercus kelloggii (California black oak), Quercus wislizenii/chrysolepis (interior and canyon live
oak), Salix spp. (tree willows), Salix spp. (shrub willows), Alnus rhombifolia (white alder), Alnus
incana ssp. tenuifolia (shrubby mountain alder), and Acer glabrum (Rocky Mountain maple).
The survey protocol consists of first visually scanning aerial and ground-based imagery
to find plots where vegetation cover can be identified with high confidence. Typically, this
involves finding regions where vegetation is clearly visible and can be viewed in multiple
seasonal conditions (e.g., spanning the range from winter leaf-off conditions to autumn leaf
senescence). Once clearly identifiable plots were selected, surveyors estimated percent cover
from above (PCFA) for each physiognomic and taxonomic group (Fig 8). Each survey includes a
date, or range of dates, for which the surveyor believes their PCFA estimates are accurate. We
added date ranges to the survey protocol late in our data collection effort, to better support
parameterization of models with higher temporal predictive accuracy. Surveys with date ranges included repeated
surveys of 63 plots where changes in vegetation cover over time were evident. We recommend
that any similar future efforts begin with date ranges and repeat measures instead of a single
survey date for each plot.
Fig 9. Three Google Earth images of the same Landsat-aligned plot (red boundaries) containing mature
aspen trees and a large conifer. Leaf-off images can be particularly diagnostic for identifying mature
aspen, revealing their light-colored stems. Distinctive branching patterns are especially apparent from
oblique views (right) and from shadows on the ground (left). Images of spring green-up and autumn leaf
senescence (not shown) can also be diagnostic but are less abundant in our study region.
LYH spearheaded data collection efforts by collecting vegetation cover data for about 80
plots. A primary objective here was to provide characteristic examples of a range of hardwood
species, identified by an expert, for use in training other vegetation surveyors. We contracted
with SIG-NAL to conduct the rest of the surveys for this project, to be performed by workers with
strong experience surveying vegetation via aerial imagery. Despite the examples provided by
LYH, SIG-NAL surveyors found identifying non-aspen hardwood species challenging and time
consuming. Aspen is the most abundant hardwood tree species in the basin, one of the most
important species from a biological conservation perspective (the Lake Tahoe Watershed
Assessment (Murphy and Knopp 2000) identified aspen groves as one of 9 Ecologically
Significant Areas that disproportionately support biodiversity relative to their area), and one of
the easiest to distinguish via aerial imagery (Fig 9). Therefore, to enhance the odds that we
could produce maps that address identified management needs, we prioritized collection of
aspen cover data (Fig 10). We hoped that SIG-NAL surveyors would become more confident
identifying the other hardwood species over the course of their work and that they would have
sufficient time to estimate cover for additional hardwood species. To increase our sample size,
we reallocated some of the UC Davis funding to hire an additional surveyor, which proved to be
cost-effective, as the UC Davis surveyor completed more than ten times as many surveys per
dollar spent compared with SIG-NAL (Table 2). While the surveyors employed by SIG-NAL did
high quality work, they were considerably more expensive.
Fig 10. Number of surveys of 900-m2 Landsat-aligned plots with percent-cover-from-above estimates for
each physiognomic and taxonomic group. Most surveys where non-zero cover was estimated for non-
aspen hardwood taxonomic groups were conducted by LYH (see text). The dearth of non-zero cover
estimates for non-aspen hardwood taxa precluded us from developing useful maps for additional taxa.
Table 2. Number of surveys conducted as part of this project by survey team.
Survey Team    900-m2 Plot Surveys    No-Populus Polygon Surveys
SIG-NAL        927                    3
USFS           188                    77
UC Davis       290                    0
Total          1,405                  80
Over the course of data collection, we held weekly coordination meetings with surveyors.
We attempted to balance survey efficiencies that are gained by surveying several nearby plots
with spreading out surveys sufficiently to reduce spatial autocorrelation in our sample. We also
sought to stratify surveys across climatic gradients, to capture variability in phenological spectral
signals, and gradients in taxon-specific vegetation cover (Fig 11). The following quality control
measures were taken. Plots with potentially confusing images or from potentially confusing
contexts were reviewed as a group (e.g., shrubby aspen growing on poor substrate). We used
out-of-sample preliminary model predictions (see model parameterization) to identify potential
transcription errors in vegetation cover data. We individually reviewed plots with large
discrepancies between recorded and predicted aspen cover and asked surveyors to re-assess a
subset of these plots. Similarly, we identified plots with impossible combinations of data (e.g.,
aspen cover estimated to be higher than total hardwood cover). Upon re-assessment, we
corrected plots that had data-entry errors.
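The two screening checks described above can be sketched as follows. This is an illustrative example only: the field names, the example records, and the 40-percentage-point discrepancy threshold are hypothetical, as the report does not state what size of discrepancy triggered review.

```python
DISCREPANCY_THRESHOLD = 40.0  # hypothetical cutoff, in percentage points of cover


def flag_surveys(surveys, threshold=DISCREPANCY_THRESHOLD):
    """Map plot IDs to the reasons each survey should be sent back for review."""
    flags = {}
    for s in surveys:
        reasons = []
        # Aspen is a hardwood, so its cover cannot exceed total hardwood cover
        if s["aspen"] > s["hardwood"]:
            reasons.append("aspen exceeds hardwood")
        # Large gaps between recorded and out-of-sample predicted cover
        # may indicate a transcription error
        if abs(s["aspen"] - s["predicted_aspen"]) > threshold:
            reasons.append("large model discrepancy")
        if reasons:
            flags[s["plot_id"]] = reasons
    return flags


# Hypothetical survey records (percent cover from above)
surveys = [
    {"plot_id": "A", "aspen": 40, "hardwood": 55, "predicted_aspen": 35},
    {"plot_id": "B", "aspen": 70, "hardwood": 60, "predicted_aspen": 65},
    {"plot_id": "C", "aspen": 5, "hardwood": 10, "predicted_aspen": 80},
]
flags = flag_surveys(surveys)
```

Flagged plots are candidates for re-assessment rather than automatic correction, since a large discrepancy may reflect model error rather than a data-entry error.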
Fig 11. Spatial and climatic distribution of survey data collected by this project. Left panel: Survey
locations are shown in red. The Lake Tahoe Basin Management Unit (LTBMU) is shown in grey. The
boundary of our study region, the broader Lake Tahoe Area (BLTA), is a 20-km buffer around the LTBMU
and shown as a black outline. Major lakes are shown in blue. Right panel: Our sampling density with
respect to number of frost-free days per year, a proxy for the duration of the growing season,
approximates the distribution of aspen as previously mapped by FVEG within our study area, with a
notable dearth of plots from areas with longer growing seasons (i.e., ≥ 165 frost-free days per year).
The consensus among surveyors was that larger aspen trees are easily identifiable via
aerial imagery, while lower growing aspen is challenging to identify even when aerial images are
entirely unobscured by taller trees. Ground-based imagery did support confident identification of
low-growing aspen, though these images (e.g., Google Street View) were less available. As a
result, our survey method biases our sample away from including plots with low-growing aspen.
Aspen stands may be low growing because they are young, but aspen also commonly have a
short, shrubby stature where they grow on suboptimal substrates (e.g., talus) in the study area.
To the extent that the spectral and phenological signals of low growing aspen differ from larger
aspen trees, this could bias our mapped outputs to under-detect low-growing aspen, even when
satellite imagery is unobstructed by taller trees. However, our informal field surveys suggest our
models generally perform well in identifying low-growing aspen.
Toward the end of our survey efforts, manual review of maps produced by preliminary
machine learning models revealed remaining commission errors (i.e., predicting aspen was
present, typically at low cover, in areas where it clearly was not). To tamp down on these errors,
we asked surveyors to identify larger polygons where commission errors were present. Due to
apparent similarity in spectral signals between aspens and cottonwoods, we asked surveyors to
identify large polygons where neither aspen nor other Populus spp. were present, and where
the model predicted > 5% aspen cover over a substantial portion of the polygon. We called this
category of survey no-Populus polygons (NPP). From the NPPs we extracted a weighted
random sample of 900-m2 plots that were fully contained within the NPPs (i.e., not spanning the
edge of NPP boundaries) for model training. While commission errors remain, this approach
greatly reduced the prevalence of commission errors.
Overall, our vegetation survey efforts resulted in 1,405 quality-controlled surveys of 900-
m2 plots, aligned to the USGS Landsat grid, and 80 large no-Populus polygons. 1,396 of the plot
surveys include estimates of aspen cover. 197 plot surveys include estimated percent cover for
at least one non-aspen hardwood taxon. 57 surveys include non-zero estimates of percent cover
for at least one non-aspen hardwood taxon (primarily collected by USFS). Due to the dearth
of taxon-specific cover estimates for other hardwood species we were limited to producing maps
of aspen cover only. Surveys span the period 2004 to 2023 and include 63 plots that were
resampled over time, in areas where disturbance events were evident (Fig 12). The sampling
density of 900-m2 plots appears to closely approximate the distribution of aspen within our study
area with respect to number of frost-free days per year, a climatic proxy for differences in the
timing of aspen phenology (Fig 11). The sample has a notable bias toward inclusion of plots
with higher (e.g., 70–90% cover) aspen cover, compared to plots with lower percent cover (Fig
13).
Fig 12. Number of surveys of 900-m2 Landsat-aligned plots over time. Each survey included either a
single representative date or a range of dates, for which the surveyor was confident in the accuracy of
their vegetation cover estimates. The sample includes 63 plots that were sampled over multiple distinct
periods (e.g., before and after substantial disturbances).
Fig 13. Histogram of Aspen percent cover from above in our sample of 900-m2 Landsat-aligned plots.
Reflectance Data
To capture phenological signals we categorized spectral data into distinct seasonal
periods by calendar day of year. Spectral data was filtered to remove obstructions, including
clouds, dilated clouds, cloud shadows, and surface water. We endeavored to balance the
tradeoff between signal improvements associated with finer temporal resolution seasonal
periods against reductions in the sample size of obstruction-free images. For models focused on
estimating vegetation cover over longer, multi-year periods (e.g., for a five-year period, such as
2019–2023) finer temporal resolution (e.g., 15-day periods during the summer low cloud period,
from day of year 165 to 330) tended to achieve higher cross-validated predictive accuracy. For
models focused on estimating vegetation cover over shorter annual periods (e.g., for a single
year up to a few years) longer seasonal periods appear to be more appropriate.
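The binning of observations into seasonal periods, followed by an obstruction-free median, can be sketched as follows. This is a simplified illustration under stated assumptions: it covers only 15-day periods within the low-obstruction window, treats day 330 as exclusive, and uses hypothetical function names and reflectance values. Periods outside this window, which our models also used, are omitted for brevity.

```python
from collections import defaultdict
from statistics import median

PERIOD_START = 165  # first day of year of the low-obstruction period (from the text)
PERIOD_END = 330    # end of that period (from the text; treated as exclusive here)
PERIOD_LENGTH = 15  # days per seasonal period (the example given in the text)


def seasonal_period(doy):
    """Return the 0-based index of the period containing doy, or None if outside."""
    if not PERIOD_START <= doy < PERIOD_END:
        return None
    return (doy - PERIOD_START) // PERIOD_LENGTH


def median_by_period(observations):
    """observations: iterable of (day_of_year, reflectance, obstructed) tuples."""
    grouped = defaultdict(list)
    for doy, value, obstructed in observations:
        period = seasonal_period(doy)
        if period is not None and not obstructed:
            grouped[period].append(value)
    return {p: median(v) for p, v in sorted(grouped.items())}


# Hypothetical single-band observations for one pixel
obs = [
    (166, 0.10, False),
    (172, 0.14, False),
    (176, 0.12, False),
    (178, 0.90, True),   # flagged as obstructed (e.g., cloud), so excluded
    (181, 0.20, False),
    (100, 0.30, False),  # outside the low-obstruction period
]
medians = median_by_period(obs)
```

Shorter periods sharpen the phenological signal but leave fewer unobstructed observations per period, which is the tradeoff described above.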
Fig 14. Phenological patterns in spectral signatures provide signals that can be used to identify
vegetation composition from high-temporal resolution multi-spectral satellite imagery. Distinct
phenological signatures both inside and outside of the low-obstruction period appear to be highly
informative. The spring peak in short-wave infrared bands 6 and 7 around day of year 120 is frequently
obstructed by clouds (Fig 15).
We obtained and processed gridded reflectance data from the USGS Landsat 4–9
Collection 2, Level 2, Tier 1 dataset using Google Earth Engine and R. To minimize data
degradation, we maintained the data in its native USGS grid projections within our project area,
which falls under UTM zones 10 and 11 (30-m resolutions, origins at 15, 15). Data was
downloaded in tabular and raster formats. Tabular data consisted of unprocessed individual
reflectance measurements and obstruction flags, for plots (i.e., raster cells) where we had
collected vegetation cover data. This time-series data was processed in R and used for initial
model tuning. Raster data was processed into obstruction-free, median-over-time values
covering our study area. Time periods consisted of combinations of up to 16 seasonal periods
per year (e.g., days of year 165–180, etc.), with both annual and multi-annual (e.g., 2019–2023)
periods. Spatially extensive raster data was used for final model parameterization and
predictions. Compressed reflectance data occupy a couple hundred gigabytes of storage.
Fig 15. Number of unobstructed Landsat 8–9 observations per pixel in our study region by seasonal
period, from 2013 to 2023. Boxes depict the median and interquartile range. Whiskers depict the 99%
central range. Obstructions include clouds, dilated clouds, cloud shadows, and snow. The “low
obstruction period” in our study region approximately spans the period from day of year 165 to 330.
Model Parameterization
We combined three types of vegetation cover data for model training and validation. The
first and most important category was surveys of 900-m2 Landsat-aligned plots. The second
category was background or “pseudo-absence” plots. Locations of about 40,000 background
plots were randomly drawn from our study region. To increase the odds that background plots
were not located within aspen stands, we discarded plots in, or within 60-m of, areas previously
mapped as aspen or montane riparian types. The third category, no-Populus polygons (NPP),
was introduced last to tamp down on remaining commission errors that were apparent when
manually reviewing model predictions against remote imagery. We provided vegetation
surveyors with predicted maps of aspen cover and asked them to identify large polygons where
they had high confidence that Populus spp. were not present and where portions of the
polygons were predicted to have substantially greater than zero aspen cover. We then selected
a weighted random sample of about 40,000 Landsat-aligned plots that were fully contained in
(i.e., not spanning the edge of) the NPPs.
We temporally aligned vegetation cover estimates with reflectance data. Fire severity
was estimated using the CALFIRE fire perimeter database and methods from Parks et al. 2018
and Stewart et al. 2021. When moderate wildfire (e.g., > 10% basal area loss) and other known
disturbances (e.g., white satin moth outbreaks) were not apparent, we assumed that vegetation
cover remained relatively unchanged over time. The strength of this key assumption was
iteratively tuned for different versions of the model (e.g., vegetation cover was assumed to be
constant for 1–15 years in absence of disturbance). Resulting model predictions were then
evaluated for their cross-validated predictive performance as well as their accuracy when
assessed with remote imagery and field surveys.
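The constant-cover assumption can be sketched as a rule for deciding which years of reflectance data may be paired with a given survey. This is a hypothetical illustration, not the alignment code used in this project: the function name, the treatment of a disturbance year as a hard boundary, and the example years are all assumptions.

```python
def valid_years(survey_start, survey_end, disturbance_years, tolerance):
    """Return the years whose reflectance may be paired with a survey's cover estimate.

    The survey's own date range is always included. The window is then extended
    by up to `tolerance` years in each direction, stopping before any year in
    which a disturbance is known to have occurred.
    """
    years = set(range(survey_start, survey_end + 1))
    # Extend backward in time
    for year in range(survey_start - 1, survey_start - tolerance - 1, -1):
        if year in disturbance_years:
            break
        years.add(year)
    # Extend forward in time
    for year in range(survey_end + 1, survey_end + tolerance + 1):
        if year in disturbance_years:
            break
        years.add(year)
    return sorted(years)


# Hypothetical plot surveyed in 2018, with a known disturbance in 2021
years = valid_years(2018, 2018, disturbance_years={2021}, tolerance=5)
```

Larger tolerance values yield more training data per survey but increase the risk of pairing a cover estimate with reflectance from a period when cover had changed.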
Fig 16. Importance of Landsat 8–9 imagery from different spectral bands and seasonal periods (day of
year ranges) for predicting aspen cover for extreme gradient boosting and Maxent. Note that extreme
gradient boosting reports higher importance for the spring period from day of year 121 to 160, when the
spring green-up signal is often obscured by cloud cover, reflecting XGB’s flexibility in accommodating
missing independent data.
Of the many model-fitting algorithms we evaluated, extreme gradient boosting (XGB)
and Maxent (ME) had the best cross-validated predictive performance and the most plausible
predictions when compared with remote photography. Hyperparameters for XGB were manually
tuned by iteratively fitting models and examining cross-validated predictive accuracy and
comparing predictions against aerial imagery. Maxent typically requires Bernoulli-distributed (0–1)
response variables; to transform percent cover estimates into binary data we disaggregated
each individual binomial cover estimate into up to 100 Bernoulli-distributed values for the
response variable (i.e., 25% cover becomes 25 observations of 100% cover and 75
observations of 0% cover).
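The disaggregation step can be sketched as follows. This is a minimal illustration assuming exactly 100 Bernoulli values per survey and rounding of non-integer cover to the nearest whole value; the function name is hypothetical, and the text above indicates that fewer than 100 values were used in some cases.

```python
def disaggregate_cover(percent_cover, n_trials=100):
    """Expand one percent-cover estimate into a list of 0/1 presence values."""
    if not 0 <= percent_cover <= 100:
        raise ValueError("percent_cover must be between 0 and 100")
    successes = round(percent_cover / 100 * n_trials)
    # 1 represents 100% cover and 0 represents 0% cover
    return [1] * successes + [0] * (n_trials - successes)


values = disaggregate_cover(25)
n_present = sum(values)
n_absent = len(values) - n_present
```

For a 25% cover estimate this yields 25 presences and 75 absences, matching the example in the text.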
Between these two algorithms XGB is particularly useful due to its seamless ability to
produce predictions when missing independent variables are present. This ability enables
incorporation of finer temporal resolution seasonal spectral data, where missing data may be
present due to prevalence of obstructions such as cloud cover. The implication here is that XGB
has superior ability to incorporate spectral signals that are sometimes obstructed, such as
spring peaks in short-wave infrared bands 6 and 7 that are often obstructed by clouds (Figs 14,
15, 16). In contrast, feeding missing data into ME (e.g., one out of many seasonal periods has
no obstruction-free spectral data) results in both discarding whole vegetation surveys from
model parameterization and in missing (no data) areas on predicted maps of vegetation cover.
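The cost of requiring complete data can be illustrated with a small sketch. The example does not call XGB or Maxent; it only shows how many surveys survive when every seasonal predictor must be present, using hypothetical reflectance values and a hypothetical function name.

```python
import math


def complete_cases(rows):
    """Keep only rows with no missing (NaN) predictor values."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]


# Hypothetical surveys: one reflectance value per seasonal period
rows = [
    [0.12, 0.30, 0.25],
    [0.10, float("nan"), 0.22],  # one period has no obstruction-free observation
    [0.15, 0.28, float("nan")],
    [0.11, 0.31, 0.24],
]
kept = complete_cases(rows)
retained_fraction = len(kept) / len(rows)
```

In this example half of the surveys would be discarded by an algorithm that requires complete data, whereas an algorithm that tolerates missing values could train on all four. The loss grows as more, shorter seasonal periods are added, because each added period is another chance for a survey to have a gap.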
Composite Image of Aspen in October 2023 by Derek Young
Contributors
Laura Young-Hart contributed to designing our vegetation survey protocol, conducting
surveys, and advising other surveyors. Travis Freed and Nick Miley performed the bulk of 900-
m2-plot vegetation surveys for this project. Jennifer O'Brien contributed to vegetation surveys.
Derek Young conducted drone flights in October 2023 and assembled composite images. Quinn
Sorenson contributed to initial data carpentry and model parameterization. Enikoe Bihari
contributed to compiling previous vegetation maps.
Supplementary Materials
Estimates of aspen cover in the broader Lake Tahoe Area over the period 1984–2023 are
available at https://stewartecology.org/TahoeAreaAspenMaps/
References
Berrill, J.-P., Dagley, C.M., Coppeto, S.A. (2016). Predicting treatment longevity after
successive conifer removals in Sierra Nevada Aspen Restoration. Ecological Restoration,
34(3), 236–244.
Berrill, J.-P., Dagley, C.M., Coppeto, S.A., Gross, S.E. (2017). Curtailing succession: Removing
conifers enhances understory light and growth of young aspen in mixed stands around Lake
Tahoe, California and Nevada, USA. Forest Ecology and Management, 400, 511–522.
California Department of Forestry and Fire Protection (CALFIRE), Fire and Resource
Assessment Program (FRAP). (2015). Forest Vegetation (FVEG) Database, version 15.1.
Retrieved from https://frap.fire.ca.gov/mapping/gis-data/
Dilts, T.E., Williams, H.P., Refsland, T.K., Cushman, J.H. (2020). Lake Tahoe Basin Aspen Map
of 2018, version 0.92 [Geospatial Data]. University of Nevada, Reno. Available at
https://nevada.box.com/s/o5lbh3eh939kb5yn9otaeeetlo3rkd84.
Estes, B.L. (2013). Historic Range of Variability for Aspen in the Sierra Nevada and South
Cascades. Placerville, CA: Central Sierra Province Ecologist, Eldorado National Forest. 47 p.
Retrieved from https://www.fs.usda.gov/Internet/FSE_DOCUMENTS/stelprdb5434340.pdf
Krasnow, K.D., Stephens, S.L. (2015). Evolving paradigms of aspen ecology and management:
impacts of stand condition and fire severity on vegetation dynamics. Ecosphere, 6(1), 12.
Murphy, D.D., Knopp, C.M. (2000). Lake Tahoe Watershed Assessment, Volume 1. USDA
Forest Service, Pacific Southwest Forest and Range Experiment Station General Technical
Report PSW-175: Albany, CA.
https://www.fs.usda.gov/psw/publications/documents/psw_gtr175/
Parks, S.A., Holsinger, L.M., Voss, M.A., Loehman, R.A., Robinson, N.P. (2018). Mean
composite fire severity metrics computed with Google Earth Engine offer improved accuracy
and expanded mapping potential. Remote Sensing, 10(6), 879.
Pierce, A.D., Taylor, A.H. (2010). Competition and regeneration in quaking aspen-white fir
(Populus tremuloides-Abies concolor) forests in the Northern Sierra Nevada, USA. Journal of
Vegetation Science, 21, 507–519.
Refsland, T.K., Cushman, J.H. (2021). Continent-wide synthesis of the long-term population
dynamics of quaking aspen in the face of accelerating human impacts. Oecologia, 197, 25–
42.
Rehfeldt, G.E., Ferguson, D.E., Crookston, N.L. (2009). Aspen, climate, and sudden decline in
western USA. Forest Ecology and Management, 258, 2353–2364.
White, A.M., Holland, T.G., Abelson, E.S., Kretchun, A., Maxwell, C.J., Scheller, R.M. (2022).
Simulating wildlife habitat dynamics over the next century to help inform best management
strategies for biodiversity in the Lake Tahoe Basin, California. Ecology and Society, 27(2).
Yang, J., Weisberg, P.J., Shinneman, D.J., Dilts, T.E., Earnst, S.L., Scheller, R.M. (2015). Fire
modulates climate change response of simulated aspen distribution across topoclimatic
gradients in a semi-arid montane landscape. Landscape Ecology, 30, 1055–1073.