
Accuracy, Bias, and Improvements in Mapping Crops and Cropland across the United States Using the USDA Cropland Data Layer


Abstract

The U.S. Department of Agriculture’s (USDA) Cropland Data Layer (CDL) is a 30 m resolution crop-specific land cover map produced annually to assess crops and cropland area across the conterminous United States. Despite its prominent use and value for monitoring agricultural land use/land cover (LULC), there remains substantial uncertainty surrounding the CDLs’ performance, particularly in applications measuring LULC at national scales, within aggregated classes, or changes across years. To fill this gap, we used state- and land cover class-specific accuracy statistics from the USDA from 2008 to 2016 to comprehensively characterize the performance of the CDL across space and time. We estimated nationwide area-weighted accuracies for the CDL for specific crops as well as for the aggregated classes of cropland and non-cropland. We also derived and reported new metrics of superclass accuracy and within-domain error rates, which help to quantify and differentiate the efficacy of mapping aggregated land use classes (e.g., cropland) among constituent subclasses (i.e., specific crops). We show that aggregate classes embody drastically higher accuracies, such that the CDL correctly identifies cropland from the user’s perspective 97% of the time or greater for all years since nationwide coverage began in 2008. We also quantified the mapping biases of specific crops throughout time and used these data to generate independent bias-adjusted crop area estimates, which may complement other USDA survey- and census-based crop statistics. Our overall findings demonstrate that the CDLs provide highly accurate annual measures of crops and cropland areas, and when used appropriately, are an indispensable tool for monitoring changes to agricultural landscapes.
remote sensing
Tyler J. Lark 1,*, Ian H. Schelly 1 and Holly K. Gibbs 1,2
Citation: Lark, T.J.; Schelly, I.H.; Gibbs, H.K. Accuracy, Bias, and Improvements in Mapping Crops and Cropland across the United States Using the USDA Cropland Data Layer. Remote Sens. 2021, 13, 968.
Academic Editors: Georgios Mallinis
and Charalampos Georgiadis
Received: 1 January 2021
Accepted: 24 February 2021
Published: 4 March 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
1 Nelson Institute Center for Sustainability and the Global Environment (SAGE),
University of Wisconsin-Madison, Madison, WI 53726, USA; (I.H.S.); (H.K.G.)
2 Department of Geography, University of Wisconsin-Madison, Madison, WI 53726, USA
Keywords: accuracy assessment; accuracy metrics; map bias; confidence; crop maps; Cropland Data
Layer; land use/land cover change; remote sensing products
1. Introduction
Mapping and monitoring crops and croplands can generate powerful insights about
our environment and agricultural production systems [
]. Because satellite-based remote
sensing products are able to efficiently capture land use/land cover (LULC) and
their variations across space and time, these data are increasingly chosen as the basis for
agricultural and environmental decision making, including policy creation, evaluation,
and enforcement [
]. With the increased availability and use of detailed remotely sensed
land cover products, however, there is a growing need to understand their accuracy and
reliability for different applications [9–12].
In the United States, the Department of Agriculture’s (USDA) Cropland Data Layer
(CDL) is frequently utilized to monitor agricultural land due to its nationwide coverage,
agricultural focus, and annual frequency [
]. Produced by the National Agricultural
Statistics Service (NASS), this satellite-derived map has provided complete coverage of
the conterminous U.S. each year since 2008. Since it tracks specific crops at field-relevant
resolutions, it is an ideal tool to detect geographic trends and changes in cultivation.
Previous studies have used the CDL to track crop rotations and planting patterns [17–20],
evaluate Farm Bill policies such as crop insurance and the Sodsaver program [
], and
assess the environmental outcomes of various land management systems [
], among
many other applications. Estimates of cropland area from the CDL are also used internally
by NASS for a variety of reports and survey applications as well as considered by other
government organizations such as the Environmental Protection Agency, for example, to
monitor compliance with land protections in renewable energy policies [4,25].
To characterize the CDL’s performance, NASS calculates land cover class-specific
accuracies at the state level and releases them with each annual state CDL product [
These estimates are based on a comparison with parcel level data from the USDA Farm
Service Agency (FSA) [27] and another land cover map, the National Land Cover Dataset
(NLCD) [
]. While these comparisons provide insights into the accuracy of the CDL
for a given state and year, applications of the CDL product typically extend well beyond
this scope; many analyses utilize modifications of the original CDL datasets, compare
across the state products, and/or estimate changes in LULC over time [
]. Despite the
prevalence of these applications, the performance of the CDLs in many of these extensions
has not been evaluated.
Given this lack of evaluation, several articles have questioned the reliability of analyses
that use CDL data to identify recent agricultural trends, citing concerns about both the
CDL’s accuracy and its appropriateness for measuring changes to the landscape [
Such critiques often cite low reported accuracies for the CDLs when mapping certain
crops in specific regions or when depicting nonagricultural land covers such as grasslands.
Despite the potential validity of these concerns, all such critiques to date have lacked a
systematic nationwide assessment of the CDL accuracy beyond comparisons with coarse
data, thereby leaving substantial uncertainty surrounding the CDL’s ultimate dependability.
Furthermore, select approaches for measuring LULC change using the CDL and other land
cover products may help overcome some of the CDL’s limitations and improve analysis
outcomes [
], though the efficacy of these techniques has not yet been fully quantified. For
example, aggregating specific land cover classes into broader domains, such as cropland
and non-cropland, can help address low classifier accuracies of specific cover classes by
eliminating errors associated with distinguishing different crop types and among various
non-cropland covers, such as the many grassland categories historically delineated in the
CDL [30,40,42].
In this paper, we comprehensively quantified the accuracy of the CDL at the national
scale and evaluated the outcomes relevant for applications of the CDL for mapping crops
and cropland. First, we investigated the benefits of consolidating classes within remote
sensing products and quantified the CDL’s ability to distinguish between crop and non-
cropland covers at multiple spatial scales and thematic resolutions. Then, we calculated
nationwide accuracies for both specific and aggregate classes of the CDL and mapped
the spatial variation in accuracies across the U.S. based on congruence with FSA and
NLCD data. We then explored the use of pixel-level classifier confidence information
to provide additional higher-resolution understanding of thematic certainty. Finally, we
estimated the annual bias in mapping specific crops within the CDL and derived new,
bias-adjusted area estimates for the major crop types. We conclude with a discussion of the
implications of these analyses with a particular focus on recommendations for improving
LULC change analyses.
2. Materials and Methods
2.1. Overview of Assessed and Reference Datasets
The Cropland Data Layer is a crop-specific land cover map produced annually by
the USDA National Agricultural Statistics Service (NASS). Complete coverage of the
conterminous United States dates back to 2008, while some states and years predate the
nationwide product. Primary satellite imagery inputs for the CDL vary according to
availability and effectiveness but have included the Resourcesat-1 Advanced Wide Field
Sensor (AWiFS), Resourcesat-2 Linear Imaging Self Scanning (LISS), Landsat-5 Thematic
Mapper (TM), Landsat-7 Enhanced TM Plus (ETM+), Landsat-8 Operational Land Imager (OLI),
Sentinel-2 A/B, and Deimos-1 and UK-2 from the Disaster Monitoring Constellation. Input
images are collected and used internally by NASS throughout the growing season, and the
final, publicly released CDL is intended to capture the area and geospatial distribution of
crops in midsummer. Data processing and classification generally occur independently
at the state level by NASS analysts, and the nationwide CDL mosaic that results contains
up to 155 classes of cultivated crops and 23 classes of non-cropland covers. Most states,
however, contain a smaller subset of applicable classes, typically fewer than 30 crops and a
dozen non-crop covers [26].
In producing the CDL, NASS uses supplementary information from both the FSA and
the USGS. Specifically, NASS leverages a selection of data from the FSA’s Common Land
Unit (CLU) administrative database to train all cultivated crop classes of the CDL and
assess their accuracy. CLU data are collected and confirmed by USDA County Field Service
Centers and constitute a comprehensive geospatially tagged database of all land owned by
agricultural producers who participate in an FSA program [
]. This represents the most
complete dataset on U.S. agricultural land use, but is not available to the public [16].
For training and assessing non-cropland cover categories, NASS uses the USGS-led
NLCD as a reference [
]. The NLCD is a nationwide 30-meter resolution, 20-class
land cover map that follows a modified Anderson level I/II classification system [
The product’s mapping emphasizes non-cropped vegetative areas, and was historically
produced for 5-year epochs, though the most recent product release has improved coverage
to 2–3-year intervals. It should be noted that while the NLCD is used as an input in
training the CDL classifier, the CDL does not simply revert to the NLCD in non-crop
locations. Instead, the CDL incorporates the NLCD and other data to generate its own
unique mapping of non-crop areas.
During the assessment of the CDL, NASS produces and publishes online the confusion
matrices used to determine the reported accuracies. Referred to as the “error supermatrices,”
these datasets are generated each year at the state or multistate level and report the
number of times specific CDL classes were mapped either consistently or inconsistently
against CLU data from the FSA for all cultivated crops, or against the NLCD for non-
cultivated land covers [
]. While the FSA data and NLCD provide valuable references
for comparison, each differs from traditional reference data used for land cover map
evaluation. In particular, the FSA data are not selected via a probability sampling design.
In addition, because the dataset is generated for other USDA programmatic purposes,
its classes do not always align perfectly with the classes of the CDL, leading to potential
mismatch between the target and reference data. Nevertheless, the FSA dataset represents
an incredibly rich and extensive source of reference information that is of a quality rarely
available for remote sensing accuracy assessments. The NLCD, as a satellite-based land
cover map, is not fully independent nor necessarily more accurate than the CDL. The
NLCD is also not produced annually, such that the closest NLCD product available at
the time of CDL production must be utilized, leading to potential temporal mismatch
between the target and reference data. Despite these limitations, these two datasets provide
powerful points of comparison for understanding how CDL performance varies across
space and time.
2.2. Investigating Effects of Aggregation: Superclass and Consolidated Class Accuracies
We used the data reported in the CDL error supermatrices to derive supplemental
accuracy metrics useful for characterizing and understanding the CDL across scales and
applications. A summary and example of each accuracy metric we assessed is presented in
Table 1, with further details of their derivation described in the section below.
Table 1. Accuracy metrics, measured classes, and associated examples. The table describes each of the four main metrics reported in this paper and provides an example of each metric from the producer’s accuracy and user’s accuracy perspectives.

| Metric | Reported For | Measures Accuracy of Identifying | Producer’s Example | User’s Example |
|---|---|---|---|---|
| Class Accuracy | Specific classes | Specific classes | The likelihood that actual corn is mapped as corn | The likelihood an area mapped as corn is actually corn |
| Superclass Accuracy | Specific classes | An aggregated domain | The likelihood that actual corn is mapped as cropland | The likelihood an area mapped as corn is actually cropland |
| Consolidated Class Accuracy | An aggregated domain | An aggregated domain | The likelihood that actual cropland is mapped as cropland | The likelihood an area mapped as cropland is actually cropland |
| Average Class Accuracy | An aggregated domain | Specific classes | The likelihood that any crop is mapped as that specific crop | The likelihood that any mapped crop is actually that crop |
Initially, NASS treats their reference data as a simple random sample and calculates
the class accuracies for all specific land cover classes within each state according to the
general formula:
$$\text{Class Accuracy}_x = \frac{\text{Pixels correct}_x}{\text{Pixels total}_x} \qquad (1)$$
for each specific crop x, where pixels correct is the number of mapped pixels that match the
reference data in a given region, and pixels total is either the total number of reference data
observations (for calculating producer’s accuracy) or mapped pixels (for calculating user’s
accuracy) for each class. Producer’s accuracies reflect errors of omission; they indicate how
likely a feature is to be correctly captured by the remote sensing product. User’s accuracies
reflect errors of commission, and indicate how likely a mapped class correctly resembles
features on the landscape [45].
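As a minimal sketch of Equation (1), the following Python snippet computes producer’s and user’s accuracies from a small confusion matrix. The matrix values, the class list, and the function name `class_accuracies` are hypothetical and are not taken from the NASS error supermatrices. Rows are assumed to hold reference classes and columns mapped classes.

```python
def class_accuracies(matrix, classes):
    """Return {class: (producer_accuracy, user_accuracy)}.

    Assumed layout: matrix[i][j] is the number of assessed pixels whose
    reference class is classes[i] and whose mapped class is classes[j].
    """
    result = {}
    for k, name in enumerate(classes):
        correct = matrix[k][k]
        reference_total = sum(matrix[k])              # row sum: reference pixels
        mapped_total = sum(row[k] for row in matrix)  # column sum: mapped pixels
        producer = correct / reference_total if reference_total else float("nan")
        user = correct / mapped_total if mapped_total else float("nan")
        result[name] = (producer, user)
    return result


# Hypothetical pixel counts for illustration only.
classes = ["corn", "soybeans", "grassland"]
matrix = [
    [90, 5, 5],
    [10, 80, 10],
    [20, 15, 165],
]
acc = class_accuracies(matrix, classes)
print(acc["corn"])  # (producer's accuracy, user's accuracy) for corn
```

In this toy matrix, corn has a higher producer’s accuracy than user’s accuracy because pixels of other reference classes are also mapped as corn.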
Aggregating land cover classes to broader thematic classes increases accuracy by
lowering thematic specificity [
]. To understand how well the CDL can distinguish
general cropland from non-cropland areas, we assessed the accuracy of aggregated cropland
and non-cropland domains as delineated in Lark et al. (2015), based on original NASS
distinctions [
]. The aggregated cropland category includes all annually cultivated
row, closely planted, and horticultural crops as well as tree crops and actively tilled fallow
(Appendix A, Table A1). The non-cropland domain includes all remaining CDL classes.
First, we calculated how frequently each specific class of the CDL is mapped as any
class within the correct cropland or non-cropland domain. We refer to this as the superclass
accuracy for each specific class, and derived it as
$$\text{Superclass Accuracy}_{C,x} = \frac{\text{Pixels in correct domain}_C}{\text{Pixels assessed}_x} \qquad (2)$$
for each specific class x included in the domain C (e.g., cropland or non-cropland). For
the cropland domain, the superclass producer’s accuracy indicates how frequently a
specific crop on the landscape (e.g., corn) was mapped by the CDL as any type of crop
in the cropland domain. The corresponding superclass user’s accuracy represents how
likely a pixel mapped as a specific crop was actually any type of crop (i.e., cropland) on
the landscape.
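The superclass calculation in Equation (2) can be sketched as follows. The confusion matrix, the domain membership set `CROPLAND`, and the function names are hypothetical. As before, rows are assumed to be reference classes and columns mapped classes.

```python
# Hypothetical domain membership: which classes count as cropland.
CROPLAND = {"corn", "soybeans"}


def superclass_producer_accuracy(matrix, classes, domain, target):
    """Share of reference pixels of `target` mapped as any class in `domain`."""
    row = matrix[classes.index(target)]
    in_domain = sum(n for name, n in zip(classes, row) if name in domain)
    return in_domain / sum(row)


def superclass_user_accuracy(matrix, classes, domain, target):
    """Share of pixels mapped as `target` whose reference class is in `domain`."""
    k = classes.index(target)
    column = [row[k] for row in matrix]
    in_domain = sum(n for name, n in zip(classes, column) if name in domain)
    return in_domain / sum(column)


# Hypothetical pixel counts for illustration only.
classes = ["corn", "soybeans", "grassland"]
matrix = [
    [90, 5, 5],
    [10, 80, 10],
    [20, 15, 165],
]
corn_producer = superclass_producer_accuracy(matrix, classes, CROPLAND, "corn")
corn_user = superclass_user_accuracy(matrix, classes, CROPLAND, "corn")
print(corn_producer, corn_user)
```

Corn that is confused with soybeans still counts as correct here, because both classes belong to the cropland domain.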
From the relationship between specific class accuracy and superclass accuracy, it is
possible to quantify the relative number of mapping errors where confusion occurs with
another class within the same broader domain. We define this metric, which we refer
to as the within-domain error rate, as the difference between a class’s error rate and its
superclass error rate, normalized by the class error rate. It can also be derived directly from
the previously calculated accuracy metrics as
$$\text{Within Domain Error Rate}_{C,x} = \frac{\text{Superclass Accuracy}_{C,x} - \text{Class Accuracy}_x}{1 - \text{Class Accuracy}_x} \qquad (3)$$
for each specific class x included in the domain C.
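A short sketch of Equation (3) follows. The function name and the input values are illustrative assumptions. Accuracies are expressed as fractions between 0 and 1.

```python
def within_domain_error_rate(class_accuracy, superclass_accuracy):
    """Fraction of a class's errors that are confusions inside its own domain.

    Returns NaN when the class has no errors, since the rate is undefined.
    """
    class_error = 1.0 - class_accuracy
    if class_error == 0:
        return float("nan")
    return (superclass_accuracy - class_accuracy) / class_error


# Hypothetical values: 90% class accuracy and 95% superclass accuracy.
rate = within_domain_error_rate(0.90, 0.95)
print(round(rate, 3))
```

With these inputs, half of the class’s errors are confusions with other classes in the same domain and therefore disappear when the domain is consolidated.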
Then, we calculated the overall consolidated class accuracy for the entire cropland
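domain according to the following equation:
$$\text{Consolidated Class Accuracy}_C = \frac{\sum_{x \in C} (\text{area}_x \times \text{Superclass Accuracy}_{C,x})}{\sum_{x \in C} \text{area}_x} \qquad (4)$$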
where x is each specific class belonging to the set of all classes in domain C, area is the area
of class x, and superclass accuracy is the value calculated in Equation (2) above. Because
the superclass accuracies give the likelihood that a specific class will correctly identify the
broader domain, taking the area-weighted mean of the superclass accuracies across all
classes within a domain depicts the likelihood that any class in a domain will correctly
identify the broader domain. For the consolidated cropland domain, this calculation
generates a single value that represents the accuracy with which the CDL can identify
cropland in a given state and year. The user’s accuracy for consolidated cropland represents
the likelihood that any randomly selected pixel mapped as cropland in the CDL is actually
cropland on the landscape. The producer’s accuracy for consolidated cropland is the
likelihood that cropland on the landscape is correctly mapped as cropland in the CDL. In
similar fashion, Equations (2) and (4) can be used to calculate superclass accuracies for
each specific non-crop class and for the single consolidated non-cropland domain.
For thoroughness and comparison, we also calculated the average specific class
accuracy for each domain, according to the following equation:
$$\text{Average Class Accuracy}_C = \frac{\sum_{x \in C} (\text{area}_x \times \text{Class Accuracy}_{C,x})}{\sum_{x \in C} \text{area}_x} \qquad (5)$$
The average specific class accuracy indicates how accurately, on average across the
full domain, a randomly selected class is mapped in a given year. Tracking the average
specific class accuracy across several years can thus indicate how well the CDL historically
performed and improved over time at delineating specific crops.
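Equations (4) and (5) are both area-weighted means over the classes of a domain and differ only in which accuracy is averaged. The sketch below illustrates this with one helper. The class names, areas, and accuracy values are hypothetical.

```python
def area_weighted_mean(values, areas):
    """Area-weighted mean of per-class values for the classes listed in `values`."""
    total_area = sum(areas[name] for name in values)
    return sum(values[name] * areas[name] for name in values) / total_area


# Hypothetical areas (any consistent unit) and accuracies (fractions) for one domain.
areas = {"corn": 60.0, "soybeans": 30.0, "oats": 10.0}
superclass_acc = {"corn": 0.98, "soybeans": 0.97, "oats": 0.90}
class_acc = {"corn": 0.95, "soybeans": 0.94, "oats": 0.60}

consolidated = area_weighted_mean(superclass_acc, areas)  # Equation (4)
average_class = area_weighted_mean(class_acc, areas)      # Equation (5)
print(round(consolidated, 3), round(average_class, 3))
```

In this toy example the consolidated accuracy exceeds the average class accuracy, which is the pattern expected whenever some errors are confusions between classes of the same domain.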
2.3. Calculating Nationwide Accuracies
We next estimated nationwide accuracies for each original CDL class as well as for
the newly derived aggregated metrics. To calculate nationwide accuracies, we weighted
each state accuracy to account for disproportionate class areas and reference observations.
For specific class accuracies of the original CDL, we normalized according to the
following equation:
$$\text{Nationwide Accuracy}_x = \frac{\sum_{i \in S} (\text{Accuracy}_{x,i} \times \text{area}_{x,i})}{\sum_{i \in S} \text{area}_{x,i}} \qquad (6)$$
where S is the set of states or multistate regions for which data are produced in a given
year, area is the total area of class x mapped within the state or region i, and accuracy is the
user’s or producer’s accuracy (Equation (1)) for region i. Similarly, Equation (6) was used to
calculate the nationwide superclass accuracies for each crop by replacing the specific class
accuracies with the appropriate superclass accuracies derived from Equation (2) above.
To derive the nationwide accuracies for consolidated land cover classes, we also
area-weighted by each constituent class. This accounted for unequal areas of each class
within the consolidated domain and ensured proportional contributions to the accuracy
of the combined class. We considered only classes for which accuracy data existed when
summing class accuracies and areas, since failure to exclude the area of classes without data
would falsely skew the nationwide mean values. Using the available data, we calculated
nationwide accuracies for consolidated classes using the following formula:
$$\text{Nationwide Consolidated Accuracy}_C = \frac{\sum_{x \in C} (\text{Nationwide Superclass Accuracy}_x \times \text{nationwide area}_x)}{\sum_{x \in C} (\text{nationwide area}_x)} \qquad (7)$$
Using the specific class accuracy in this formula gives the nationwide-specific class
accuracy averaged across all land covers in the broader domain. Specifically:
$$\text{Nationwide Average Class Accuracy}_C = \frac{\sum_{x \in C} (\text{Nationwide Accuracy}_x \times \text{nationwide area}_x)}{\sum_{x \in C} (\text{nationwide area}_x)} \qquad (8)$$
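The two-stage weighting described in this section can be sketched as below: state-level values are first combined into a nationwide value per class (Equation (6)), and the classes are then combined into a consolidated domain value (Equation (7)). All records are hypothetical, and the handling of missing data (skipping records whose accuracy is `None`, together with their area) is one possible reading of the exclusion rule described above, not a confirmed implementation.

```python
def weighted_mean(pairs):
    """Area-weighted mean of (value, area) pairs; pairs with value None are skipped."""
    usable = [(v, a) for v, a in pairs if v is not None]
    total_area = sum(a for _, a in usable)
    if total_area == 0:
        return None
    return sum(v * a for v, a in usable) / total_area


# Hypothetical state-level records: class -> list of (superclass accuracy, area).
state_records = {
    "corn": [(0.98, 50.0), (0.96, 50.0)],
    "soybeans": [(0.95, 40.0), (0.99, 10.0)],
    "hops": [(None, 2.0)],  # no accuracy data reported for this class
}

# Stage 1 (Equation 6): nationwide accuracy and assessed area for each class.
nationwide = {}
for name, records in state_records.items():
    accuracy = weighted_mean(records)
    if accuracy is not None:
        area = sum(a for v, a in records if v is not None)
        nationwide[name] = (accuracy, area)

# Stage 2 (Equation 7): consolidate classes, weighting by nationwide area.
consolidated = weighted_mean(nationwide.values())
print(round(consolidated, 3))
```

Substituting specific class accuracies for the superclass accuracies in `state_records` would give the nationwide average class accuracy of Equation (8) instead.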
2.4. Mapping Spatial Patterns of CDL Accuracy and Confidence
We mapped a composite of all state- and class-level users’ and producers’ accuracies
for each specific crop and non-cropland cover to better understand how the accuracy of
CDL data varies spatially across the U.S. To generate these maps, each original CDL pixel
was assigned the value of its specific class accuracy for that state and year and rounded to
the nearest integer to facilitate storage as an eight-bit raster. We also mapped and delineated
crop and non-crop components of the CDL confidence layer, which was provided courtesy
of USDA NASS. The confidence layer is a coproduct of the remote sensing classification
process and provides a measure of how well a specific pixel fits within the decision tree
ruleset used to classify it [
]. A unique benefit of the confidence dataset is that it
provides an independent value for each individual pixel, rather than a single value for all
pixels of a given class within a state. It thus varies at the pixel level, enabling improved
spatial understanding of expected errors within the CDL product [26].
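The accuracy-mapping step described above (assigning each pixel the accuracy of its class and rounding to an integer for eight-bit storage) can be illustrated with a small lookup. This is a simplified sketch on nested lists rather than real raster data. The class codes follow Table 2 (1 = corn, 5 = soybeans, 176 = grassland/pasture), but the accuracy values, the `nodata` value of 255, and the function name are assumptions.

```python
def accuracy_raster(class_grid, accuracy_by_class, nodata=255):
    """Replace each class code with its accuracy (percent) rounded to an integer.

    Codes without a reported accuracy receive `nodata`. Values stay within
    0-255, so the result fits in an eight-bit raster.
    """
    return [
        [round(accuracy_by_class[code]) if code in accuracy_by_class else nodata
         for code in row]
        for row in class_grid
    ]


# Hypothetical per-class accuracies (percent) for one state and year.
accuracy_by_class = {1: 95.2, 5: 93.6, 176: 50.4}
class_grid = [
    [1, 5],
    [176, 0],  # code 0 has no reported accuracy in this example
]
grid = accuracy_raster(class_grid, accuracy_by_class)
print(grid)
```

Note that Python’s built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding convention of a given GIS package.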
We then combined the assessed accuracy and classifier confidence data into a single
metric of CDL certainty to better understand the spatial variation in CDL performance.
By integrating the pixel-resolution confidence layer into the state- and class-resolution
accuracy estimates, a combined metric may offer additional insight or improved spatial
representation of expected errors compared to standalone accuracy indicators. This is
similar to the approach of using posterior probability spaces in change vector analysis [
We considered several ways to combine the accuracy and confidence data, including
multiplying the two components (Equation (9)), averaging them (Equation (10)), and
additional more elaborate combinations (e.g., Equation (11)):
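$$\text{Certainty} = \text{Class Accuracy} \times \text{Pixel Confidence} \qquad (9)$$
$$\text{Certainty} = \frac{\text{Class Accuracy} + \text{Pixel Confidence}}{2} \qquad (10)$$
$$\text{Certainty} = \text{Class Accuracy} \times \left(1 + \frac{\text{Pixel Confidence} - \text{Average Class Confidence}}{\text{Average Class Confidence}}\right) \qquad (11)$$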
The approaches of Equations (9) and (10) benefit from their simplicity and intuitiveness.
In Equation (11), the confidence data are used as a scalar multiplier to modify the class-
level accuracy: if a pixel is mapped more confidently than the average of the other pixels
in its class, then its certainty value will be greater than its class accuracy; if a pixel is
mapped less confidently than average, then its certainty value will be lower than its class
accuracy. Ultimately, the selection of a formula should be based on the needs of the specific
application [
]. Thus, we present results only from the simple product combination
(Equation (9)) in order to illustrate the concept and potential value of combining accuracy
and confidence data but leave further investigation to future work and specific applications.
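To make the three candidate combinations concrete, the sketch below evaluates Equations (9) through (11) for a single pixel. The function names and input values are hypothetical, and all quantities are expressed as fractions between 0 and 1.

```python
def certainty_product(class_accuracy, pixel_confidence):
    """Equation (9): product of class accuracy and pixel confidence."""
    return class_accuracy * pixel_confidence


def certainty_mean(class_accuracy, pixel_confidence):
    """Equation (10): simple average of the two components."""
    return (class_accuracy + pixel_confidence) / 2


def certainty_scaled(class_accuracy, pixel_confidence, average_class_confidence):
    """Equation (11): class accuracy scaled by relative deviation from mean confidence."""
    deviation = (pixel_confidence - average_class_confidence) / average_class_confidence
    return class_accuracy * (1 + deviation)


# Hypothetical pixel: class accuracy 0.90, pixel confidence 0.80,
# average confidence of its class 0.75.
product = certainty_product(0.90, 0.80)
mean = certainty_mean(0.90, 0.80)
scaled = certainty_scaled(0.90, 0.80, 0.75)
print(round(product, 2), round(mean, 2), round(scaled, 2))
```

Because the pixel is mapped more confidently than its class average, the scaled value exceeds the class accuracy. The scaled form is not bounded by 1, so an application using it may need to cap the result.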
2.5. Estimating Map Biases and Bias-Adjusted Crop Acreages
Due to misclassifications within remote sensing products, area estimates derived
directly from pixel counts are likely to be incorrect and either over- or under-predict actual
class area. Using data derived from confusion matrices, it is possible to quantify this
bias relative to the reference data and subsequently make bias-adjusted area estimates
accordingly [
]. While best practices in accuracy assessment stipulate the use of bias
adjusted estimators with a probability sampling design [
], a simplified estimate of
map bias and adjusted area estimates may still be derived and useful for products such as
the CDL, where a large and high quality—though non-probabilistic—reference dataset is
available. To illustrate this, we calculated the nationwide relative bias of each crop using
the producer’s and user’s accuracy:
$$\text{Simple Bias}_x = \frac{\text{Producer's Accuracy}_x}{\text{User's Accuracy}_x} - 1 \qquad (12)$$
for each class x, where the producer’s and user’s accuracies are those derived in Equation (6).
The ratio of producer’s to user’s accuracy is equivalent to the number of assessed pixels mapped as class x
divided by the number of assessed pixels classified as class x in the reference data, such
that the bias reflects the relative over- or under-mapping of a class compared to the reference
data. We then calculated bias-adjusted area estimates for each class x by scaling the raw
CDL acreage estimates by the amount of over- or underprediction suggested by the bias:
$$\text{Bias Adjusted Area}_x = \text{Class Area}_x - (\text{Class Area}_x \times \text{Simple Bias}_x) \qquad (13)$$
where Class Area is the area estimate for each class x derived from pixel counting and the
Simple Bias is that derived in Equation (12).
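Equations (12) and (13) can be illustrated with a short worked example. The accuracies below are the rounded 2012 nationwide values for alfalfa from Table 2 (75% producer’s, 80% user’s). The mapped area of 1000 units and the function names are hypothetical.

```python
def simple_bias(producer_accuracy, user_accuracy):
    """Equation (12): relative over-mapping (> 0) or under-mapping (< 0) of a class."""
    return producer_accuracy / user_accuracy - 1


def bias_adjusted_area(class_area, bias):
    """Equation (13): pixel-count area corrected by the estimated bias."""
    return class_area - class_area * bias


bias = simple_bias(0.75, 0.80)               # negative value: class is under-mapped
adjusted = bias_adjusted_area(1000.0, bias)  # hypothetical mapped area of 1000 units
print(round(bias, 4), round(adjusted, 1))
```

A negative bias indicates that fewer pixels were mapped as the class than appear in the reference data, so the adjusted area is larger than the raw pixel-count area.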
3. Results
We first present results from our nationwide analysis of specific class accuracies,
followed by nationwide results for the aggregated superclass and consolidated class metrics.
Throughout the results section, we focus on data for the year 2012 as an example because
it represents an intermediate year within the CDL’s modern era of nationwide coverage,
it was used in multiple applications [
], and it aligns well with the Census of
Agriculture, the Natural Resources Inventory, and other intermittent data sources often
used for comparisons with the CDL. The year 2012 was also particularly challenging for
mapping agricultural LULC—moderate resolution imagery was limited, and a severe
drought impacted crop development in many regions—such that our findings should
be considered a conservative estimate of the performance of the CDL. For completeness,
results were also generated for all years of nationwide CDL coverage 2008–2016 and have
been reposited online as companion datasets at
(accessed on 1 January 2021).
3.1. Nationwide Accuracy of Specific CDL Classes
Nationwide area-weighted accuracies for the major crop classes of the CDL are generally
very high. In 2012, corn, soybeans, and winter wheat, the three largest crops by
area, were mapped correctly 95%, 94%, and 92% of the time, respectively, from both the producer’s and
user’s perspectives. The top 20 CDL land cover classes by area and their associated producer’s
and user’s accuracies for 2012 are presented in Table 2, with accuracies for all 130
assessed land cover classes for 2012 included in Appendix A, Table A2.
Overall, 10 crops had nationwide producer’s accuracies of 90% or greater in 2012.
These included sugarcane (97%); rice (96%); corn (95%); soybeans (94%); sugarbeets (94%);
canola (94%); winter wheat (92%); cotton (91%); almonds (91%); and cranberries (91%).
Five additional crops had class producer’s accuracies higher than the average for all crops,
88.7%, and the remaining 90 crops with computable accuracies fell below the average class
accuracy. In the same year, 17 crops had nationwide user’s accuracies of 90% or greater
(Appendix A, Table A2). The remaining 88 crops had user’s accuracies below the average
of 90.3%. The disproportionate number of crops with below-average accuracy reinforces
observations that the CDL performs best for major crops (defined by area) and less so for
minor crops. To this end, the 10 crops with the highest producer’s accuracies made up
71.5% of the total mapped crop area.
Table 2. Nationwide class accuracies of major individual land covers in the 2012 Cropland Data Layer (CDL). Table shows area-weighted national average accuracies for the 20 most common classes by area in the 2012 CDL, calculated according to Equation (6), based on data from USDA National Agricultural Statistics Service (NASS). National accuracies of all crops and land covers for 2012 are listed in Appendix A, Table A2.

| Class Name | ID | Producer’s Accuracy | Omission Error | User’s Accuracy | Commission Error |
|---|---|---|---|---|---|
| Corn | 1 | 95% | 5% | 95% | 5% |
| Cotton | 2 | 91% | 9% | 89% | 11% |
| Soybeans | 5 | 94% | 6% | 94% | 6% |
| Spring Wheat | 23 | 89% | 11% | 87% | 13% |
| Winter Wheat | 24 | 92% | 8% | 92% | 8% |
| Alfalfa | 36 | 75% | 25% | 80% | 20% |
| Other Hay/No Alfalfa | 37 | 57% | 43% | 57% | 43% |
| Fallow/Idle Cropland | 61 | 69% | 31% | 79% | 21% |
| Open Water | 111 | 90% | 10% | 81% | 19% |
| Developed/Open Space | 121 | 89% | 11% | 61% | 39% |
| Developed/Low Intensity | 122 | 83% | 17% | 74% | 26% |
| Developed/Med Intensity | 123 | 84% | 16% | 81% | 19% |
| Barren | 131 | 74% | 26% | 75% | 25% |
| Deciduous Forest | 141 | 88% | 12% | 75% | 25% |
| Evergreen Forest | 142 | 87% | 13% | 73% | 27% |
| Mixed Forest | 143 | 44% | 56% | 51% | 49% |
| Shrubland | 152 | 87% | 13% | 71% | 29% |
| Grassland/Pasture | 176 | 79% | 21% | 50% | 50% |
| Woody Wetlands | 190 | 70% | 30% | 63% | 37% |
| Herbaceous Wetlands | 195 | 61% | 39% | 47% | 53% |
| Average of All Crops | N/A | 88.7% | 11.3% | 90.3% | 9.7% |
| Average of All Non-Crops | N/A | 82.4% | 17.6% | 69.4% | 30.6% |
Reported accuracies of specific non-crop classes of the CDL were generally lower
than those of major crops (Table 2). However, it is important to acknowledge that the
reported figures do not represent congruence with a verified ground or truth dataset
of non-cropped areas, but rather are assessed against a reference dataset consisting of
both FSA administrative crop data and the NLCD, itself a remotely sensed land cover
map subject to misclassifications. Nonetheless, the lower levels of reported accuracy in
the CDL non-crop classes suggest higher levels of uncertainty and potential error in the
product and/or reference data, particularly when compared to the high-performance crop
classes. The specific categories of open-, low-, and medium-intensity developed land as
well as deciduous and coniferous forest, shrubland, and open water were all mapped with
nationwide producer’s accuracies of greater than 80%, whereas specific classes of herbaceous and
woody wetlands and grassland/pasture had lower nationwide performance, with producer’s
and user’s accuracies that ranged from 47–79% (Table 2).
3.2. Consolidated Cropland and Non-Cropland Accuracies
Specific land cover classes of the CDLs are often combined into aggregated categories
for applications such as measuring cropland area or conversion between major land cover
types. As an example of aggregation, we assessed the accuracy of consolidated cropland
and non-cropland domains across the U.S. from 2008–2016.
The area- and class-weighted nationwide accuracies for consolidated cropland in 2012
were 95.0% (producer’s) and 97.4% (user’s). Accuracies for the consolidated non-cropland
domain were 97.8% and 88.8%, respectively. Consolidated classes also performed consistently
well across time (Table 3). For example, in 2008, the earliest year for which nationwide data
were produced, cropland producer’s and user’s accuracies were 95% and 98%, respectively.
Table 3. Average specific class and consolidated class accuracies for each year of the CDL. Data from USDA NASS (2016)
based on the comparison of CDL with data from Farm Service Agency (FSA) and National Land Cover Dataset (NLCD)
and processed according to Equations (7) and (8). Cropland and non-cropland domains based on class distinctions in
Appendix A, Table A1.
Metric                       Type   2008  2009  2010  2011  2012  2013  2014  2015  2016
Average Crop Accuracy        Prod:  88%   89%   89%   89%   89%   89%   90%   90%   92%
Average Crop Accuracy        User:  90%   90%   91%   91%   90%   91%   92%   91%   92%
Average Non-Crop Accuracy    Prod:  82%   82%   81%   82%   82%   82%   81%   85%   85%
Average Non-Crop Accuracy    User:  63%   64%   65%   61%   69%   67%   69%   82%   82%
Consolidated Cropland        Prod:  95%   95%   95%   95%   95%   96%   96%   96%   98%
Consolidated Cropland        User:  98%   98%   97%   98%   97%   99%   99%   98%   99%
Consolidated Non-Cropland    Prod:  97%   97%   98%   98%   98%   98%   97%   99%   99%
Consolidated Non-Cropland    User:  84%   85%   89%   82%   89%   87%   89%   98%   98%
In 2012, 30 of the 40 state or multistate assessment regions of the CDL had consolidated
cropland producer’s accuracies of 90% or greater (Appendix A, Table A3). On the user’s
side, all but two states (New York and Pennsylvania) mapped cropland correctly 90% of
the time or greater. Oklahoma (OK) and Arizona (AZ)—more arid states where cropland
contrasts with the surrounding landscape and is often irrigated—had the highest cropland
user’s accuracies, with values over 99%. More broadly, states with greater amounts of
cropland typically had higher consolidated cropland accuracies (Figure 1), though this
effect appeared to saturate beyond a certain threshold of crop area (e.g., 5 million acres).
Similar trends were also observed when assessed by proportion (rather than total area) of
cropland within each state [16,35].
Figure 1. Plot of consolidated cropland user’s and producer’s accuracies for each state for 2012. Accuracies plotted against
total crop area in each state. States with greater amounts of cropland typically had higher consolidated cropland accuracies.
Plotting accuracy against the proportion of cropland within each state generated similar trends (data not shown).
3.3. Superclass Accuracies of Specific Crops and Land Covers
Within the aggregated domains, certain classes are more (or less) likely to align with
their broader domain. Among crops mapped in the CDL with greater than one million
acres, rice was the most accurate predictor of cropland on the landscape and most likely
to be correctly identified as cropland, having superclass user’s and producer’s accuracies
both over 99% in 2012 (Table 4). Areas of corn, the most prevalent crop, were labeled as
cropland by the CDL 98% of the time in 2012 (superclass producer’s accuracy), and pixels
mapped as corn in the CDL were actually cropland on the landscape 98.5% of the time
(superclass user’s). Fields of alfalfa, oats, and fallow/idle cropland, on the other hand,
were correctly labeled as cropland by the CDL just over 80% of the time. On the user’s
side, alfalfa was the only low outlier, yet still had an 86% superclass user’s accuracy for the
cropland domain.
Within the non-cropland domain, most superclass accuracies were high, with only a
few exceptions (Table 5). Developed/Open Space was incorrectly mapped in locations that
were actually cropland 25% of the time in 2012. Grassland/Pasture had an even lower user’s
accuracy and was mapped in cropped locations 32% of the time that year. Furthermore, the
high ratio of superclass producer’s accuracy to superclass user’s accuracy—indicative of
bias—in each of these classes suggests they are both considerably overmapped in locations
that are actually cropland.
Table 4. Superclass producer’s and user’s accuracies for the top 20 classes by area in the cropland domain in 2012 as well as
the relative rate of within-domain errors. Superclass accuracy is the likelihood that a given crop is identified correctly as
cropland. Percentage of errors within domain is the proportion of errors in the original CDL where the confusion occurs
among two crops within the cropland domain, rather than between a crop and non-cropland cover.
                                               Superclass Accuracy    % of Errors within Domain
CDL ID  Crop Class              CDL Acreage    Producer’s  User’s     Omission  Commission
1       Corn                    94,983,301     98%         99%        57%       73%
2       Cotton                  13,114,321     98%         100%       77%       96%
3       Rice                    2,671,894      99%         100%       84%       95%
4       Sorghum                 6,262,444      96%         99%        81%       95%
5       Soybeans                69,810,086     98%         99%        65%       76%
6       Sunflower               1,595,069      94%         99%        52%       79%
10      Peanuts                 1,657,438      98%         99%        88%       93%
21      Barley                  2,852,300      94%         99%        78%       92%
22      Durum Wheat             1,860,552      98%         99%        92%       97%
23      Spring Wheat            12,303,171     96%         99%        64%       92%
24      Winter Wheat            34,784,199     97%         99%        55%       88%
26      Dbl Win-Wht/Soybeans    5,311,121      97%         98%        73%       87%
28      Oats                    1,285,192      81%         93%        67%       84%
31      Canola                  1,700,926      97%         99%        53%       86%
36      Alfalfa                 16,167,152     80%         86%        27%       40%
41      Sugarbeets              1,238,159      99%         100%       77%       94%
42      Dry Beans               1,743,309      97%         99%        84%       94%
61      Fallow/Idle Cropland    24,395,076     80%         92%        35%       67%
69      Grapes                  1,136,718      96%         98%        69%       82%
75      Almonds                 1,155,344      98%         99%        78%       94%
Overall, the high superclass accuracies of non-crop classes compared to their low
specific class accuracies reported in Table 2 suggest that a sizable portion of the mapping
errors result from within-domain confusion among the various non-crop classes, rather
than between non-cropland covers and crops. To quantify this, we calculated the relative
within-domain error rate for each CDL class. This metric indicates what percentage of
mapping errors were a result of confusion within the same domain. For example, corn
had a relative within-domain omission error rate of 57% in 2012, which means that slightly
more than half of the missed (i.e., omitted) corn fields were mapped as another crop in the
CDL, rather than mapped as a non-cropland cover (Table 4). The within-domain proportion
of commission errors for corn was 73%, which indicates that roughly three-quarters of all
pixels that were incorrectly mapped as corn in the CDL were actually another crop on the
landscape rather than a non-cropland cover.
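The within-domain error rates described above can be illustrated with a small sketch. It assumes a confusion matrix of pixel counts indexed as `counts[mapped][reference]`; the three-class legend, the counts, and the function names are hypothetical and are not taken from the NASS accuracy tables.

```python
# Toy confusion matrix of pixel counts, indexed as counts[mapped][reference].
# All numbers and the three-class legend are hypothetical.
CROPLAND = {"corn", "soybeans"}  # every other class is treated as non-cropland

counts = {
    "corn":      {"corn": 90, "soybeans": 6,  "grassland": 4},
    "soybeans":  {"corn": 5,  "soybeans": 92, "grassland": 3},
    "grassland": {"corn": 5,  "soybeans": 2,  "grassland": 93},
}

def same_domain(a, b):
    """True when both classes fall in the same consolidated domain."""
    return (a in CROPLAND) == (b in CROPLAND)

def within_domain_omission(counts, cls):
    """Share of omitted `cls` pixels that were mapped as another class in the same domain."""
    errors = {m: row.get(cls, 0) for m, row in counts.items() if m != cls}
    total = sum(errors.values())
    within = sum(n for m, n in errors.items() if same_domain(cls, m))
    return within / total if total else float("nan")

def within_domain_commission(counts, cls):
    """Share of pixels wrongly mapped as `cls` whose reference class is in the same domain."""
    errors = {r: n for r, n in counts[cls].items() if r != cls}
    total = sum(errors.values())
    within = sum(n for r, n in errors.items() if same_domain(cls, r))
    return within / total if total else float("nan")

omission_rate = within_domain_omission(counts, "corn")      # 5 of 10 omitted corn pixels
commission_rate = within_domain_commission(counts, "corn")  # 6 of 10 false corn pixels
```

In this toy case, half of the missed corn pixels were mapped as soybeans rather than grassland, which is the same reading given to the 57% omission figure for corn in Table 4.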
Table 5. Superclass producer’s and user’s accuracies for all 16 classes in the non-cropland domain in 2012. Superclass accuracy
is the likelihood a given class is correctly identified as non-cropland. Percentage of errors within domain is the proportion
of errors in the original CDL where the confusion occurs among two land covers within the non-cropland domain, rather
than between a crop and non-cropland cover.
                                                  Superclass Accuracy    % of Errors within Domain
CDL ID  Land Cover Class           CDL Acreage    Producer’s  User’s     Omission  Commission
37      Other Hay/Non Alfalfa      23,881,755     89%         86%        68%       62%
92      Aquaculture                203,750        87%         84%        58%       17%
111     Open Water                 32,373,788     99%         95%        89%       76%
        Perennial Ice/Snow         427,601        100%        99%        100%      97%
121     Developed/Open Space       64,041,431     97%         75%        72%       41%
122     Developed/Low Intensity    28,380,971     99%         91%        96%       69%
123     Developed/Med Intensity    11,279,299     100%        96%        98%       81%
124     Developed/High Intensity   3,900,690      100%        98%        99%       87%
131     Barren                     20,800,191     99%         96%        96%       87%
141     Deciduous Forest           239,843,277    100%        97%        94%       89%
142     Evergreen Forest           249,399,532    100%        99%        99%       96%
143     Mixed Forest               29,952,005     100%        99%        100%      98%
152     Shrubland                  429,532,225    99%         89%        89%       64%
176     Grassland/Pasture          383,816,367    93%         68%        66%       37%
190     Woody Wetlands             75,447,681     99%         93%        97%       83%
195     Herbaceous Wetlands        23,005,862     94%         86%        88%       75%
Nationwide, most crops had within-domain error proportions greater than 50%,
which signifies that they were most frequently confused with another crop when mapped
incorrectly. Two notable exceptions were alfalfa and fallow/idle cropland, which had
within-domain omission error rates of 27 and 35%, respectively. Thus, alfalfa and fallow
fields that were incorrectly captured by the CDL were most frequently classified as a
non-cropland cover. Alfalfa’s within-domain commission error rate was also less than 50%,
which suggests that pixels incorrectly mapped as alfalfa in the CDL were most likely to be
non-cropland covers on the landscape.
The proportions of within-domain errors of omission for all non-cropland
covers were greater than 50%, indicating that misclassified non-cropland covers were
most likely to be labeled as another non-crop cover by the CDL. However, aquaculture,
developed/open space, and grassland/pasture all had low within-domain rates of errors
of commission, which indicates that when incorrect, these land covers were frequently
mapped in locations that were actually cropland.
3.4. Spatial Patterns of CDL Accuracy, Confidence, and Certainty
CDL accuracy for specific crops varied greatly across the U.S. In general, most crop
accuracies in 2012 were highest within major cropping regions such as the Corn Belt,
Central Plains, and Mississippi Delta (Figure 2a; Appendix A, Figure A1). Conversely,
crop accuracies were lower along the periphery of these core production zones and in
less dominant agricultural regions of the eastern, southern, and western parts of the U.S.
These locations with lower accuracy have a higher prevalence of less common crops (e.g.,
crops other than corn and soybeans), which are typically mapped less accurately due
to more limited reference and training data from FSA and a charter by USDA to focus
mapping efforts on major program crops. In addition, a greater mixture of crop and
non-cropland covers in these areas generates more opportunities for misclassification.
Figure 2. Panel of the user’s accuracy (a,b), confidence layer (c,d), and combined product of user’s accuracy and confidence
layer (e,f) delineated for crop (a,c,e) and non-crop (b,d,f) classes of the CDL for 2012.
Non-crop classes had the highest levels of reported disagreement between mapped
and reference sources in the northern and southern plains (Figure 2b). Most western states,
on the other hand, had a clearer identification of non-cropland cover types, particularly
across the vast non-cultivated areas in the region. Mid-Atlantic states and the eastern
Corn Belt also contained relatively high non-crop accuracies considering their diverse
composition of land cover classes.
The visual inspection of confidence layers suggests that the locations of mixed pixels—
map units which fall across two or more land covers—are often mapped with lower
confidence than adjacent single cover pixels. For example, in heavily cultivated regions of
the country such as Iowa, mixed pixels commonly occur between adjacent fields and along
roadways, where they are often the cause of misclassification in the CDL and other remote
sensing products. In forested regions of the U.S., confidence levels were also low,
even across large uninterrupted swaths of forest land cover. In such locations, the
low confidence reflects difficulty by the classification algorithm in delineating the specific
type of forest cover—i.e., deciduous, coniferous, mixed forest, or woody wetland.
Regionally, CDL confidence levels are high across the Midwest and west, and lowest
in the southeast, northeast, and Great Lakes regions (Figure 2c,d). Within specific regions of
similar land cover, there is also variation. For example, in the cultivated region of the Texas
panhandle, cotton and corn on the western edge are both mapped with lower confidence,
perhaps due to a greater amount of land use change and intermittent cropping patterns in
that area. Across North and South Dakota, crops tend to be consistently mapped with
lower confidence the farther west they are located (Appendix A, Figure A2).
To extract further insights about the within-class spatial variation of CDL performance,
we combined the classifier confidence data with assessed class accuracy into a single
measure of CDL certainty. Figure 2e,f shows an example of the combined accuracy ×
confidence product at the national scale. Integrating pixel resolution spatial variation
from the confidence layer into the existing state and class resolution accuracy estimates
is particularly applicable to nationwide and multistate analyses since the confidence data
have greater continuity among state products.
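The combined certainty measure can be sketched as a per-pixel product. This is a minimal illustration rather than the processing chain used to produce Figure 2: it assumes the confidence layer is stored on a 0 to 100 scale and that a state-level user's accuracy is available for each class, and the accuracy values, grids, and function name below are hypothetical.

```python
# Hypothetical state-level user's accuracies keyed by CDL class ID (as fractions).
users_accuracy = {1: 0.97, 36: 0.80, 176: 0.50}

def certainty(class_grid, confidence_grid, accuracy_by_class):
    """Per-pixel product of the mapped class's accuracy and the classifier confidence.

    Assumes confidence values are percentages (0-100); the result is a fraction (0-1).
    """
    return [
        [accuracy_by_class[cls] * (conf / 100.0) for cls, conf in zip(class_row, conf_row)]
        for class_row, conf_row in zip(class_grid, confidence_grid)
    ]

# A 2 x 2 toy raster: mapped class IDs and matching confidence values.
classes = [[1, 1], [36, 176]]
confidence = [[100, 80], [90, 60]]
certainty_grid = certainty(classes, confidence, users_accuracy)
```

Two pixels of the same class in the same state share one accuracy value, so any difference between them in the product comes entirely from the confidence layer.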
In addition to helping normalize certainty across regions, the use of both accuracy
and confidence information independently or in combination may provide improved
insights into local uncertainty. Figure 3 shows an example of an agriculturally intensive
region of southern Iowa. Here, accuracy data help demarcate field-sized tracts of land
that have low class accuracies (Figure 3a), which are locations that data users may wish
to withhold from analyses due to the large uncertainty associated with their classification.
Alternatively, the confidence layer captures finer levels of uncertainty due to mixed pixels
or other contributors to local uncertainty such as topography or ambiguity among land
covers (Figure 3b) but fails to consider the likelihood of the mapped class being incorrect.
Considering both accuracy and confidence data (Figure 3c) thus provides insights into
multiple dimensions of uncertainty and may be valuable for improving the certitude of
mapping and map applications.
Figure 3. Maps of CDL user’s accuracy (a), confidence levels (b), and a combined layer of certainty,
shown as the product of accuracy and confidence (c).
3.5. Measured Biases and Adjusted Crop Area Estimates
Adjusted estimates of crop area informed by map biases can improve upon raw pixel-
count area estimates by calibrating them against the reference data used for assessment.
Table 6 presents the simple map biases (Equation (12)) and associated adjusted acreage
estimates (Equation (13)) for the 18 largest crop classes for the CDL for which there are also
relevant data from official USDA acreage estimates. Given that the CDL represents mid-
summer estimates of crop extent, we include NASS data for both planted and harvested
areas, as well as the average of these two metrics. For ten of the 16 crops with comparable
NASS planted and harvested data, the simple bias-adjusted acreage estimate was closer
than the raw pixel-count estimate to the average of NASS planted and harvested areas. As
such, the adjusted results provide refined measures of crop area that are independent of
(but more consistent with) other acreage estimates such as the NASS Surveys or Census
of Agriculture and could be used to complement or replace raw CDL pixel count area
estimates in various applications.
Table 6. Simple bias and bias-adjusted acreage estimates for major crops for 2012. CDL area represents the summed area of
all pixels in the CDL. CDL bias and bias-adjusted acreage were calculated for each crop according to Equations (12) and (13)
using the producer’s and user’s accuracy data of Appendix A, Table A2. NASS planted and harvested areas are from the
annual NASS acreage report, released on June 29, 2012. Harvested cotton from 2012 October production report. All area
values are reported in acres.
Crop Name               CDL Area      CDL Bias   Bias-Adjusted Area   NASS Planted    NASS Harvested   NASS Ave
Corn                    94,983,301    0.43%      94,572,035           96,405,000      88,851,000       92,628,000
Soybeans                69,810,086    -0.03%     69,829,899           76,080,000      75,315,000       75,697,500
Winter Wheat            34,784,199    -0.22%     34,860,122           41,819,000      35,023,000       38,421,000
Fallow/Idle Cropland    24,395,076    -12.24%    27,382,251           * 14,145,567    ** 36,382,032    n/a
Alfalfa                 16,167,152    -5.52%     17,059,748           19,213,000      18,827,000       19,020,000
Cotton                  13,114,321    1.88%      12,868,014           12,635,000      10,443,400       11,539,200
Spring Wheat            12,303,171    2.97%      11,937,985           11,995,000      11,681,000       11,838,000
Sorghum                 6,262,444     -6.60%     6,675,868            6,210,000       5,238,000        5,724,000
Dbl Win-Wht/Soybeans    5,311,121     2.85%      5,159,595            ***             ***              ***
Barley                  2,852,300     -12.90%    3,220,316            3,678,000       3,268,000        3,473,000
Rice                    2,671,894     -1.51%     2,712,326            2,661,000       2,640,000        2,650,500
Durum Wheat             1,860,552     -9.80%     2,042,794            2,203,000       2,122,000        2,162,500
Dry Beans               1,743,309     -6.65%     1,859,213            1,632,700       1,573,600        1,603,150
Canola                  1,700,926     -2.23%     1,738,835            1,631,500       1,593,100        1,612,300
Peanuts                 1,657,438     -1.42%     1,680,900            1,526,000       1,486,000        1,506,000
Sunflower               1,595,069     -8.79%     1,735,327            1,804,500       1,735,400        1,769,950
Oats                    1,285,192     -33.81%    1,719,707            2,746,000       1,091,000        1,918,500
Sugarbeets              1,238,159     -0.55%     1,244,915            1,244,100       1,215,900        1,230,000
* Estimate of fallow cropland area from the 2012 Census of Agriculture. ** Estimate of idle cropland area from the 2012 Census of
Agriculture. *** Double cropped winter wheat/soybean area from the CDL may be added to both CDL soybeans and CDL winter wheat
areas to facilitate comparison with NASS estimates for each individual crop.
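As a rough check on how the bias and the adjusted area relate, the sketch below applies a simple multiplicative adjustment to two rows of Table 6. Equations (12) and (13) are defined earlier in the paper and are not reproduced here; the form `area * (1 - bias)` is an assumption, chosen only because it reproduces the tabulated values to within the rounding of the published bias. The sign convention (positive for over-mapping, negative for under-mapping) follows the Figure 4 caption, and the function name is hypothetical.

```python
def bias_adjusted_area(pixel_count_area, bias):
    """Scale a raw pixel-count area by an assumed (1 - bias) factor.

    `bias` is a fraction: positive when the CDL over-maps a crop relative to the
    reference data, negative when it under-maps it.
    """
    return pixel_count_area * (1.0 - bias)

# Corn: slightly over-mapped, so the adjusted area falls below the pixel count.
corn_adjusted = bias_adjusted_area(94_983_301, 0.0043)

# Oats: strongly under-mapped, so the adjusted area rises well above the pixel count.
oats_adjusted = bias_adjusted_area(1_285_192, -0.3381)
```

Both results land within a small fraction of a percent of the bias-adjusted areas listed in Table 6 (94,572,035 acres for corn and 1,719,707 acres for oats); the residual difference reflects the two-decimal rounding of the reported bias.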
Assessing the changes in mapped biases over time may also aid in understanding
the true dynamic compositions of crops on the landscape. Figure 4 charts the simple bias
of four major crops over time. According to the estimates, the mapping of both corn and
soybeans by the CDL relative to their reference data has increased only slightly, and
in tandem, over time. In contrast, alfalfa has gone from being under-mapped by 12.6%
relative to the reference data in 2008 to being under-mapped by only 1.6% in 2016, which
marks a considerable change over time. As a result, estimates of alfalfa area based on direct
CDL pixel counts could embody a sizeable artificial increase.
Figure 4. Mapping bias of select crops over time. The biases represent the relative over-representation (positive values)
or under-representation (negative values) of crops by the CDL in each year according to comparison with the product’s
reference data.
4. Discussion
The Cropland Data Layer currently provides the only annual information on agricultural
land use/land cover across the United States that is geographically comprehensive,
spatially explicit, and crop specific. Despite its prominent use and application, the accuracy
of the CDL had not been well characterized at national scales nor across common
aggregated classes. To fill this gap, we derived and analyzed multiple metrics of certainty
for the CDL across space and time to better understand its performance and associated
implications for measuring LULC and its change.
4.1. CDL Performance
Based on nationwide assessment, it is evident that the CDL consistently identifies
specific major crops like corn and soybeans with very high accuracy. On the other hand,
select land cover classes such as alfalfa and grassland/pasture are captured correctly only
about 75% of the time, which reflects the CDL’s generally lower performance outside of the
major crop classes, a point frequently discussed in state and regional evaluations [35,40].
To accommodate low accuracies, specific classes can be aggregated into broader land
cover domains such as cropland or non-cropland. Our results spatially and numerically
quantify the effectiveness of this approach and show that across the U.S., cropland areas are
mapped correctly by the CDL at least 97% of the time for all years. These findings confirm
the CDL’s acuity of identification and demonstrate its validity for monitoring cropland
locations and associated shifts over time.
Mapping the spatial variation in class accuracies across the United States reveals
clear geographic trends and patterns in the CDL’s performance. In general, specific crop
accuracies are highest within core agricultural areas and among major USDA program
crops. Cropland superclass and consolidated cropland accuracies, however, are consistently
high across the country, and further illustrate the value of aggregating to broader domains
when attempting to measure land cover across large areas or across all CDL classes,
particularly on the margins of major crop zones.
The use of map bias information to adjust area estimates provides a quantitative means
to improve crop area calculations based on remote sensing products. Similarly, the
simplified bias-adjusted approach for estimating crop area reported here improved upon
raw CDL pixel-based estimates by correcting for misclassifications and also provides a
more comprehensive accounting of cropland than the FSA reference data would provide
on its own, since that data source only captures land with crops that participate in FSA
programs. Our approach thus combines desirable features of both the CDL and FSA
datasets, while remaining independent of other USDA data sources like the NASS surveys
or Census of Agriculture that are occasionally used for calibration or comparison.
4.2. Improvements over Time
For most metrics, we reported on the performance of the 2012 CDL, although variability
exists across years. Overall, CDL accuracy has improved over time, due in part to use of
additional satellite input (more sources and more images per year), a more robust classification
process (an ensemble decision tree instead of maximum likelihood methodology), and
increasing amounts of training data from the FSA and elsewhere. As a result, average
class-specific accuracy for all crop classes has improved from 87% in 2008 to 92% in 2016.
By 2016, a total of 17 crops were mapped with 90% or higher producer’s accuracy, up from
just 10 crops in 2008. Aggregate metrics, including consolidated and superclass accuracies
for the cropland and non-cropland domains, have also improved. However, the magnitude
of their increases is more limited due to their already high performance across time.
The annual changes in performance of the CDL can have important ramifications for
CDL-based analyses. If the bias or relative over- or under-mapping of a class changes over
time, it can induce false signals of LULC change or skew estimates of crop area change.
Lark et al. (2017) explore the implications from the change in total cropland bias and
suggest potential solutions. Here, we show that there are also sizable changes in bias
for specific crop types. These changes, if disregarded, may influence the results of analyses
of those crops over time. For example, unadjusted estimates of the increase in corn acreage
following the biofuels boom could be affected by artificial changes in corn mapping across
time. However, the magnitude and direction of impact depends on the specific years of
analysis and may be counterbalanced by parallel biases in soybeans and other crops. Thus,
analyses that focus on the relationship among corn, soybeans, and cropland—or any classes
that have experienced synchronized changes in bias—likely remain valid despite potential
eccentricities in the underlying data. Nonetheless, it is important to consider the biases of
mapped data in applied analyses, particularly when results may influence industry and
4.3. Implications for Measuring LULC Change
The use of aggregated classes to measure LULC change benefits from the high acuity of
the product to detect a broader domain while avoiding challenges of delineating spectrally
similar land covers within the same domain. When measuring conversions between
cropland and non-cropland, the consolidated classes can thus be used to initially detect
change, followed by subsequent identification of the specific land cover or crop planted
before and after the conversion. The assessment of crop specificity after detecting
change maintains the thematic richness of the original CDL dataset without adversely
affecting detection of a conversion between the aggregated domains. In practice, this
isolates the known uncertainty in specific class identification and removes it from the
change detection process.
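The two-stage logic described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not the workflow of the cited studies: pixels are represented as flat lists of CDL class IDs for two years, the cropland set is a small subset of IDs taken from the tables in this article, and the function names are hypothetical.

```python
# CDL class IDs from the tables above; the cropland set is a small illustrative subset.
CROPLAND_IDS = {1, 5, 24, 36}  # corn, soybeans, winter wheat, alfalfa

def is_cropland(class_id):
    """Consolidate a specific CDL class into the cropland / non-cropland domains."""
    return class_id in CROPLAND_IDS

def detect_conversions(year1, year2):
    """Stage 1 flags pixels whose domain changed; stage 2 records the specific classes."""
    conversions = []
    for pixel, (before, after) in enumerate(zip(year1, year2)):
        if is_cropland(before) != is_cropland(after):  # stage 1: domain-level change only
            direction = "to_cropland" if is_cropland(after) else "from_cropland"
            conversions.append((pixel, direction, before, after))  # stage 2: class detail
    return conversions

labels_2008 = [176, 1, 1, 141]   # grassland/pasture, corn, corn, deciduous forest
labels_2012 = [1, 5, 176, 141]   # corn, soybeans, grassland/pasture, deciduous forest
conversions = detect_conversions(labels_2008, labels_2012)
```

The corn-to-soybeans pixel is not flagged because both classes sit in the cropland domain, so confusion between specific crops cannot create a false conversion at stage 1.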
Using this two-stage approach, the likelihood that a conversion occurred becomes a
function of the highly accurate aggregated classes, whereas the certainty of which specific
land cover class preceded and followed a conversion (given that the conversion was
correctly identified) is dependent upon the land cover’s specific class accuracy. Thus, for
cropland conversion estimates such as Lark et al. (2015) or Morefield et al. (2016), the class
accuracies reported in our Table 2 most closely represent the likelihood that a given crop
was planted on newly converted land, rather than directly indicate the likelihood that a
conversion occurred [22,54].
The challenges of mapping less-common specific crops and the ease of mapping
aggregate cropland have additional implications for CDL-based applications. For example,
it might be argued that the CDL is more appropriate for detecting broad land use changes
(e.g., conversion between cropland and non-cropland) than for identifying nuanced changes
among specific crops (e.g., identifying crop rotations) unless the focus of rotations remains
on major crop types. Crop-specific applications should also consider each class’s
prevalence and accuracy and how such factors may influence results.
Our findings can also be used to guide how specific crops should be treated within
analyses. Alfalfa, for example, is often cited as a problem crop due to its semi-perennial
nature, spectral similarity to non-cropland covers, and occasional interplanting within
mixed species hay and pasture. The crop was incorrectly mapped in non-cultivated areas
14% of the time in 2012. By 2016, this superclass error rate dropped to just 8%. From a
producer’s perspective, alfalfa was mapped as a non-cultivated land cover 20% of the time
in 2012, but this error rate dropped to 8% by 2016. Overall, the lower superclass accuracies
for alfalfa relative to other crops reinforce precautions of past analyses, such as the exclusion
by Morefield et al. (2016) of all non-crop to alfalfa conversions from their change analysis
and the exclusion by Lark et al. (2015) of grassland-to-alfalfa conversion. The relative
within-domain error rates (Table 4) further highlight the challenge of including alfalfa in
the cropland domain, since the crop is more frequently confused with non-cropland covers
than with other crop classes. However, the latest improvements in alfalfa accuracy suggest
that analyses of more recent CDL data may want to consider including the forage crop in
their analyses.
Visual mapping of specific and aggregate accuracies can help users identify hotspots
and problem areas within the country and understand how they vary across space and time.
Coupling accuracy data with its spatial location on the landscape thus offers opportunities
unafforded by the nonspatial structure of the NASS metadata tables and confusion matrices
for each state and year. For example, rather than excluding entire land cover classes from
analyses, such as the exclusions of alfalfa by Morefield et al. (2016) and Lark et al. (2015), the
spatial mapping of the accuracy of individual classes would allow the empirical removal
of just those pixels with low mapped accuracy in certain state–year combinations, while
retaining those with a higher likelihood of being correct. The value of this spatial approach
is greatest in analyses that consider multiple years of CDL data, where the number of
state, class, and year combinations is multiplicative. For example, for an assessment of
change between two years, there are typically over a million unique combinations of state
and class pairs, each with its own likelihood of being correct (e.g., 50 classes times 40
states for year one multiplied by 50 classes times 40 states for the second year yields four
million combinations). The manual selection of which specific LULC class combinations
to include or exclude based on accuracy thus becomes intractable, whereas the spatial
accuracy maps can be used to easily select only those combinations that meet a quantitative
accuracy threshold.
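The combination count and the threshold-based selection can be illustrated as follows. The lookup table, its accuracy values, the state code, and the rule of requiring both years to meet the threshold are all assumptions made for this sketch; the text above does not prescribe a particular selection rule.

```python
# Combination count from the text: 50 classes x 40 states per year, over two years.
n_classes, n_states = 50, 40
combos_per_year = n_classes * n_states
combos_two_years = combos_per_year ** 2

# Hypothetical accuracy lookup keyed by (state, CDL class ID, year).
accuracy = {
    ("IA", 1, 2012): 0.97,    # corn
    ("IA", 5, 2012): 0.96,    # soybeans
    ("IA", 36, 2008): 0.82,   # alfalfa
    ("IA", 176, 2008): 0.55,  # grassland/pasture
}

def keep_transition(state, class_y1, year1, class_y2, year2, threshold=0.8):
    """Keep a from/to class pair only if both mapped classes meet the accuracy threshold."""
    a1 = accuracy.get((state, class_y1, year1))
    a2 = accuracy.get((state, class_y2, year2))
    return a1 is not None and a2 is not None and min(a1, a2) >= threshold

alfalfa_to_corn = keep_transition("IA", 36, 2008, 1, 2012)
grass_to_corn = keep_transition("IA", 176, 2008, 1, 2012)
```

With these made-up values the alfalfa-to-corn pair is retained and the grassland-to-corn pair is dropped, without any class being excluded nationwide.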
The integration of confidence layer data with assessed accuracy data may also improve
spatial insights. For example, in many CDL-based change detection analyses, post-classification
processes such as spatial filters and minimum mapping units have been used
to indiscriminately remove areas of apparent change that are likely falsely mapped due to
mixed pixels or misclassifications. Alternatively, accuracy and confidence data could be
used to set a threshold of certitude below which any identified potential change is flagged
for removal. Probability information from the remote sensing process has previously been
used to improve vector-based detection of land cover change using unclassified Landsat
data. Here, we suggest that confidence information from the remote sensing process
could similarly help improve the post-classification detection of LULC change using land
cover products. While we have not quantified the impact of such an approach, it has since
been used in other studies to set a higher threshold of certainty for change detection [56].
Confidence layer data could also be used in concert with accuracy information to
spatially allocate error adjustments. For example, here we modified area estimates for each
crop using an accuracy-derived indicator of bias (Table 6). However, such area adjustments
typically do not spatially correct pixels on the map, unless this issue of reconciliation
is specifically addressed. To help achieve this reconciliation in post-classification
environments, confidence data could similarly be used to select the pixels with the lowest
confidence as candidates for reclassification. For example, if the CDL overestimated corn
area by 500 pixels in a given state, the 500 pixels of corn with the lowest confidence could
be removed to make a spatially explicit, bias-adjusted map of corn that was consistent with
the reference data estimates of area.
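The corn example above amounts to sorting one class's pixels by confidence and taking the lowest-ranked ones as reclassification candidates. A minimal sketch, assuming each pixel is a hypothetical `(pixel_id, class_id, confidence)` tuple and using a toy excess of 2 pixels rather than 500:

```python
def lowest_confidence_candidates(pixels, target_class, n_excess):
    """Return IDs of the `n_excess` least-confident pixels mapped as `target_class`.

    `pixels` is a list of (pixel_id, class_id, confidence) tuples.
    """
    candidates = sorted(
        (p for p in pixels if p[1] == target_class),
        key=lambda p: p[2],  # ascending confidence, least certain first
    )
    return [pixel_id for pixel_id, _, _ in candidates[:n_excess]]

# Toy state with four corn pixels (class 1) and one soybean pixel (class 5).
pixels = [("a", 1, 95), ("b", 1, 40), ("c", 5, 10), ("d", 1, 60), ("e", 1, 55)]
to_reclassify = lowest_confidence_candidates(pixels, target_class=1, n_excess=2)
```

The soybean pixel is ignored even though it has the lowest confidence overall, since only pixels of the over-mapped class are candidates for removal.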
4.4. Limitations, Representativeness, and Uncertainty of Results
The class consolidation techniques described here do not modify the underlying performance
of the remote sensing product, but rather improve the representativeness of the
accuracy at which the product maps aggregate domains. Of note, aggregating classes
improves accuracy by lowering the product’s thematic resolution or specificity; thus,
improvements are made by accommodating errors rather than by correcting them. The greatest
benefits are therefore achieved when the thematic resolution of the product matches the
desired application. When using aggregated remote sensing products in applications, it is
important to quantify these associated changes in accuracy so that the reported metrics and
critiques reflect the actual data used.
There may also be variation in the representativeness of the CDL’s reported accuracy
statistics. The FSA reference data used to assess the CDL are not based on a probabilistic
sample, but rather on an availability approach, with the majority coming from 10 key
USDA program crops. As a result, the reported consolidated class accuracies are most
representative for those crops, and less characteristic for specialty crops and non-crop
covers. Similarly, the distribution of crop sample data across geographic regions is
in some places disproportionate to the amount of crop produced there. Therefore, the
accuracies of certain regions are more reliable than others due to differing levels of reference
data available for assessment.
To maintain the highest level of representativeness while calculating national average
crop accuracies, we weighted the accuracy of each crop in each state by the total acreage of
that crop in that state. For example, Iowa produced 14% of all corn in the nation in 2012;
thus, its accuracy was weighted to contribute 14% of the national accuracy for corn. An
alternative method for calculating nationwide accuracies is to sum all national reference
observations without regard to spatial distributions of the data, and such an approach has
recently been implemented by NASS to report nationwide accuracies for select years in the
online CDL metadata [
]. Here, we choose to area-weight by class prevalence, such that
the nationwide estimates reflect the accuracy expected for a pixel selected at random and are not skewed by
nonrepresentatively sampled reference data.
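The difference between the two approaches can be sketched as follows. The three states, their acreages, and their reference sample counts are hypothetical, and the function names are illustrative assumptions. State "A" is given 14% of the crop's area, mirroring the Iowa example, but a disproportionately large share of the reference samples.

```python
# (state, crop acres in millions, reference samples, correctly mapped samples)
# All values are hypothetical and chosen only to illustrate the weighting.
states = [
    ("A", 14.0, 700, 679),  # state accuracy 0.97, heavily sampled
    ("B", 36.0, 200, 190),  # state accuracy 0.95
    ("C", 50.0, 100, 90),   # state accuracy 0.90, lightly sampled
]


def area_weighted_accuracy(records):
    """Weight each state's accuracy by its share of the crop's total acreage."""
    total_acres = sum(acres for _, acres, _, _ in records)
    return sum(
        acres * correct / n_ref for _, acres, n_ref, correct in records
    ) / total_acres


def pooled_accuracy(records):
    """Sum all reference samples nationally, ignoring where they came from."""
    total_ref = sum(n_ref for _, _, n_ref, _ in records)
    total_correct = sum(correct for _, _, _, correct in records)
    return total_correct / total_ref


weighted = area_weighted_accuracy(states)
pooled = pooled_accuracy(states)
print(f"area-weighted: {weighted:.4f}")
print(f"pooled:        {pooled:.4f}")
```

In this example the pooled estimate is pulled toward the accuracy of the heavily sampled state, whereas the area-weighted estimate reflects where the crop is actually grown.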
Uncertainty can also stem from errors in the reference data or a mismatch between
reference and evaluated data. For example, the FSA CLU classifications of grasslands are
often inconsistent across states and time, and occasionally they do not align with CDL
land cover designations. Thus, analysts at NASS make a judgement for each state and year
on how to best utilize the FSA data for training and assessing accuracy. Discrepancies in
how the FSA data are reported and incorporated can thus occasionally lead to apparent
differences in error rates across states and years, when in reality the inconsistencies between
the CDL and the landscape are much smaller. Similarly, errors exist in the NLCD data used
for training and assessing non-crop areas of the CDLs, which in turn affect their production
and assessment. It is possible that some CDL non-crop classes are more correct than the
associated NLCD classes on which they are based and evaluated, given that the CDL is
updated and improved annually, it includes exclusive confidential FSA training data, and it
generates higher accuracies for cultivated areas. Thus, the reported non-crop accuracies of
the CDL (based on comparison with the NLCD) may underestimate the true performance
of those CDL classes.
5. Conclusions
The CDL is a powerful and unrivaled tool for the exploration of agricultural land-
scapes and is poised to remain the premier remotely sensed agricultural LULC map in the
U.S. due to its annual availability, crop-specific detail, and exclusive access to expansive and
robust ground-based reference datasets from the USDA. We show that the CDL identifies
major crops and certain land covers with high accuracy across the U.S., and that this ability
holds true for all years of nationwide data coverage. Our findings also confirm that the
CDL exhibits extremely high acuity at discerning the aggregated classes of cropland and
non-cropland across spatial and thematic scales. Explicitly considering the bias within
specific classes and incorporating confidence layer data provide two additional opportu-
nities to further improve CDL performance and its use in LULC change assessments and
other applications.
While the original CDL dataset can indeed pose challenges for applications that
are beyond its original intent of mapping annual crop locations, it is the responsibility of its
users to apply the data in ways that do not compromise results. The CDL’s consistent and
reliable performance in mapping crops and cropland nationwide and across time clearly
demonstrates that many of the critiques and concerns regarding the underlying accuracy
of the product are unfounded or dissipate when thoroughly assessed at appropriate scales.
Furthermore, the substantial uncertainty and resource costs of alternative methods for
monitoring crops and croplands, such as ground surveys or air photo interpretations,
underscore the need for approaches that can systematically identify continental-scale
LULC change in an automated, reproducible, and verifiable manner. While many
products based on remote sensing seek to fill this gap, the CDL is a dataset proven to be
well suited for the task. When used appropriately, the CDL is a valid and indispensable tool
for studying LULC and a crucial asset for monitoring contemporary cropland dynamics
across the United States.
Author Contributions: Conceptualization, T.J.L. and H.K.G.; formal analysis, T.J.L. and I.H.S.;
writing—original draft preparation, T.J.L.; writing—review and editing, T.J.L., I.H.S., and H.K.G. All
authors have read and agreed to the published version of the manuscript.
Funding: This material is based upon work supported in part by the National Science Foundation
(DRL-1713110) and the Great Lakes Bioenergy Research Center, U.S. Department of Energy, Office of
Science, Office of Biological and Environmental Research (DE-SC0018409).
Data Availability Statement: All results and data including those for additional years of the CDL
have been archived online via Zenodo and are available at
(accessed on 1 January 2021).
Acknowledgments: Special thanks to Rick Mueller, Dave Johnson, and Patrick Willis at USDA NASS
for their helpful discussions and for providing CDL confidence data. Thanks also to Meghan Salmon
for her insights and initial help exploring consolidated crop accuracies using the CDL supermatrices
and to Carol Barford and Volker Radeloff for their feedback and suggestions of analyses from early
discussions of this manuscript. Thanks to George Allez for editing.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
Table A1. List of CDL codes and class names and whether they were included in the cropland or non-cropland domain in
the analyses of superclass and consolidated class accuracies. Domain delineations follow those of Lark et al. (2015) based on
original NASS distinctions [16,22].
Cropland Non-Cropland
ID Class Name ID Class Name ID Class Name ID Class Name
1 Corn 48 Watermelons 216 Peppers 37 Other Hay/Non
2 Cotton 49 Onions 217 Pomegranates
3 Rice 50 Cucumbers 218 Nectarines 63 Forest
4 Sorghum 51 Chickpeas 219 Greens 64 Shrubland
5 Soybeans 52 Lentils 220 Plums 65 Barren
6 Sunflower 53 Peas 221 Strawberries 81 Clouds/No Data
10 Peanuts 54 Tomatoes 222 Squash 82 Developed
11 Tobacco 55 Caneberries 223 Apricots 83 Water
12 Sweet Corn 56 Hops 224 Vetch 87 Wetlands
13 Pop or Orn Corn 57 Herbs 225 Dbl Crop WinWht/Corn 88 Nonag/Undefined
14 Mint 58 Clover/Wildflowers 226 Dbl Crop Oats/Corn 92 Aquaculture
21 Barley 59 Sod/Grass Seed 227 Lettuce 111 Open Water
22 Durum Wheat 60 Switchgrass 229 Pumpkins 112
23 Spring Wheat 61 Fallow/Idle 230 Dbl Crop Lettuce/Durum Wht 121
24 Winter Wheat 66 Cherries 231 Dbl Crop Lettuce/Cantaloupe 122
25 Other Small Grains 67 Peaches 232 Dbl Crop Lettuce/Cotton 123
26 Dbl WinWht/Soy 68 Apples 233 Dbl Crop Lettuce/Barley 124
27 Rye 69 Grapes 234 Dbl Crop Durum Wht/Sorghum 131 Barren
28 Oats 70 Christmas Trees 235 Dbl Crop Barley/Sorghum 141 Deciduous Forest
29 Millet 71 Other Tree Crops 236 Dbl Crop WinWht/Sorghum 142 Evergreen Forest
30 Speltz 72 Citrus 237 Dbl Crop Barley/Corn 143 Mixed Forest
31 Canola 74 Pecans 238 Dbl Crop WinWht/Cotton 152 Shrubland
32 Flaxseed 75 Almonds 239 Dbl Crop Soybeans/Cotton
33 Safflower 76 Walnuts 240 Dbl Crop Soybeans/Oats 176 Grassland/Pasture
34 Rape Seed 77 Pears 241 Dbl Crop Corn/Soybeans
35 Mustard 204 Pistachios 242 Blueberries 190 Woody Wetlands
36 Alfalfa 205 Triticale 243 Cabbage 195
38 Camelina 206 Carrots 244 Cauliflower
39 Buckwheat 207 Asparagus 245 Celery
41 Sugarbeets 208 Garlic 246 Radishes
42 Dry Beans 209 Cantaloupes 247 Turnips
43 Potatoes 210 Prunes 248 Eggplants
44 Other Crops 211 Olives 249 Gourds
45 Sugarcane 212 Oranges 250 Cranberries
46 Sweet Potatoes 213 Honeydew Melons 254 Dbl Crop Barley/Soybeans
47 Misc Vegs and Fruits 214 Broccoli
Table A2.
Nationwide area, producer’s accuracy, and user’s accuracy for each crop type in the 2012 CDL. Sorted in order of
descending producer’s accuracy.
CDL ID Crop Name CDL Acreage Producer’s Accuracy User’s Accuracy
45 Sugarcane 1,026,752 96.52% 94.44%
3 Rice 2,671,894 95.54% 97.01%
1 Corn 94,983,301 95.23% 94.82%
5 Soybeans 69,810,086 93.82% 93.85%
41 Sugarbeets 1,238,159 93.67% 94.18%
31 Canola 1,700,926 93.51% 95.64%
24 Winter Wheat 34,784,199 92.18% 92.38%
2 Cotton 13,114,321 91.06% 89.39%
75 Almonds 1,155,344 91.04% 91.56%
250 Cranberries 36,040 91.02% 95.23%
23 Spring Wheat 12,303,171 89.47% 86.89%
212 Oranges 1,019,334 89.24% 91.45%
54 Tomatoes 353,534 89.24% 89.60%
51 Chickpeas 1838 89.19% 84.44%
43 Potatoes 1,083,450 88.98% 92.66%
69 Grapes 1,136,718 87.39% 89.89%
26 Dbl Crop WinWht/Soybeans 5,311,121 86.70% 84.30%
230 Dbl Crop Lettuce/Durum Wht 39,776 86.08% 80.01%
68 Apples 444,242 85.67% 88.41%
56 Hops 24,903 84.53% 96.44%
6 Sunflower 1,595,069 84.09% 92.20%
10 Peanuts 1,657,438 81.17% 82.33%
42 Dry Beans 1,743,309 79.97% 85.66%
204 Pistachios 201,944 78.50% 85.69%
46 Sweet Potatoes 84,332 77.54% 87.22%
4 Sorghum 6,262,444 77.43% 82.91%
77 Pears 28,048 77.36% 80.67%
36 Alfalfa 16,167,152 75.40% 79.81%
245 Celery 2460 74.95% 93.43%
76 Walnuts 341,480 74.80% 79.49%
52 Lentils 388,352 74.57% 82.45%
22 Durum Wheat 1,860,552 73.30% 81.26%
49 Onions 139,769 72.90% 78.67%
66 Cherries 199,450 72.70% 78.60%
211 Olives 45,218 72.58% 90.34%
21 Barley 2,852,300 72.41% 83.14%
247 Turnips 1990 72.37% 79.65%
53 Peas 774,135 72.14% 83.45%
208 Garlic 17,233 71.20% 84.66%
61 Fallow/Idle Cropland 24,395,076 69.29% 78.96%
32 Flaxseed 284,228 68.10% 81.77%
59 Sod/Grass Seed 797,216 68.00% 82.93%
57 Herbs 104,376 67.07% 86.46%
14 Mint 8429 67.00% 77.65%
50 Cucumbers 32,698 65.44% 78.26%
12 Sweet Corn 301,474 65.35% 80.95%
244 Cauliflower 1956 64.25% 79.31%
226 Dbl Crop Oats/Corn 109,775 63.82% 62.71%
234 Dbl Crop Durum Wht/Sorghum 4095 63.24% 66.43%
47 Misc Vegs and Fruits 47,159 62.89% 78.30%
225 Dbl Crop WinWht/Corn 402,067 61.81% 69.33%
71 Other Tree Crops 68,927 61.69% 75.22%
27 Rye 453,504 61.47% 72.91%
254 Dbl Crop Barley/Soybeans 113,764 61.24% 78.16%
72 Citrus 139,758 60.68% 81.33%
33 Safflower 148,336 59.74% 80.07%
11 Tobacco 112,733 59.62% 79.97%
232 Dbl Crop Lettuce/Cotton 7770 58.53% 69.78%
213 Honeydew Melons 6430 58.09% 75.87%
231 Dbl Crop Lettuce/Cantaloupe 3833 57.97% 85.54%
242 Blueberries 90,911 57.70% 74.20%
248 Eggplants 357 57.69% 68.18%
227 Lettuce 28,621 57.45% 66.98%
58 Clover/Wildflowers 146,851 57.21% 70.80%
209 Cantaloupes 18,325 57.00% 72.44%
217 Pomegranates 20,652 56.79% 76.84%
216 Peppers 19,796 55.46% 67.81%
207 Asparagus 19,258 54.93% 78.11%
74 Pecans 398,572 53.68% 83.55%
29 Millet 457,674 53.43% 64.84%
221 Strawberries 43,438 52.63% 80.70%
39 Buckwheat 22,586 52.11% 78.32%
246 Radishes 10,175 50.75% 70.24%
235 Dbl Crop Barley/Sorghum 12,071 49.65% 50.19%
67 Peaches 53,255 49.19% 68.69%
35 Mustard 32,734 48.15% 78.68%
241 Dbl Crop Corn/Soybeans 16,998 48.07% 75.59%
60 Switchgrass 10,684 47.62% 56.33%
220 Plums 53,436 46.92% 65.53%
55 Caneberries 11,633 46.19% 85.35%
206 Carrots 42,670 45.93% 70.76%
229 Pumpkins 23,094 43.87% 72.29%
70 Christmas Trees 65,800 43.65% 75.04%
243 Cabbage 18,368 43.03% 59.38%
38 Camelina 4977 42.94% 69.91%
214 Broccoli 11,202 41.89% 63.04%
28 Oats 1,285,192 41.18% 62.21%
238 Dbl Crop WinWht/Cotton 324,242 41.14% 70.23%
48 Watermelons 37,670 40.78% 62.93%
219 Greens 15,028 40.62% 54.26%
13 Pop or Orn Corn 120,463 40.33% 91.15%
223 Apricots 3760 39.37% 71.61%
222 Squash 20,832 37.05% 61.87%
237 Dbl Crop Barley/Corn 37,530 36.55% 70.59%
236 Dbl Crop WinWht/Sorghum 386,258 34.26% 60.55%
224 Vetch 4595 33.12% 69.06%
44 Other Crops 171,449 32.93% 63.82%
205 Triticale 156,684 32.74% 67.26%
218 Nectarines 2589 32.16% 70.51%
25 Other Small Grains 5008 28.18% 73.07%
34 Rape Seed 3211 23.92% 58.74%
239 Dbl Crop Soybeans/Cotton 7388 20.90% 66.57%
240 Dbl Crop Soybeans/Oats 17,928 19.50% 62.42%
30 Speltz 2811 16.32% 60.80%
249 Gourds 150 10.00% 100.00%
Table A3. Accuracy of CDL-derived consolidated cropland and non-cropland classifications for each U.S. state or multistate
region for 2012. Results calculated according to Equation (4) and consolidated according to Appendix A, Table A1.
Cropland Non-Cropland
State Producer’s Accuracy User’s Accuracy Producer’s Accuracy User’s Accuracy
AL 84% 93% 98% 94%
AR 97% 100% 99% 89%
AZ 91% 97% 99% 95%
CA 96% 98% 99% 93%
CO 93% 98% 98% 90%
87% 93% 100% 99%
DE_MD_NJ 93% 94% 98% 96%
FL 89% 95% 98% 92%
GA 86% 91% 98% 93%
IA 97% 99% 95% 77%
ID 93% 96% 99% 95%
IL 98% 97% 92% 95%
IN 98% 97% 94% 96%
KS 97% 99% 99% 95%
KY 92% 99% 97% 84%
LA 95% 98% 98% 90%
MI 96% 95% 95% 90%
MN 98% 98% 97% 93%
MO 98% 96% 97% 98%
MS 94% 98% 98% 92%
MT 91% 95% 99% 98%
NC 91% 95% 97% 93%
ND 95% 98% 97% 90%
NE 97% 100% 95% 67%
NM 88% 98% 99% 87%
NV 88% 96% 100% 99%
NY 86% 88% 98% 95%
OH 96% 97% 96% 95%
OK 97% 100% 96% 62%
OR 93% 97% 98% 93%
PA 83% 82% 98% 96%
SC 86% 92% 98% 94%
SD 95% 97% 98% 97%
TN 95% 99% 98% 90%
TX 92% 100% 96% 56%
UT 90% 98% 99% 94%
VA_WV 92% 94% 99% 98%
WA 97% 98% 100% 97%
WI 95% 97% 93% 84%
WY 88% 98% 99% 93%
Figure A1.
Map of 2012 state level user’s accuracies for specific crop classes of the CDL for the conterminous U.S. Data
from USDA NASS (2016) based on the comparison of CDL with FSA reference data for crop classes. An arbitrary grading
scale of “A”–“F” was assigned to accuracy intervals to help users easily identify where the CDL crop map excels versus
where additional caution may be warranted.
Figure A2.
Confidence of pixels mapped as corn in the 2012 CDL. Within a specific state, there can
be large spatial variation in the degree of certainty with which specific crops are mapped. In South
Dakota and North Dakota, corn is mapped more confidently in the eastern parts of the states (dark
blue), where the crop is more prevalent, and is mapped less confidently (green to yellow) as one
moves westward and the crop becomes less prominent.
References
1. Turner, B.L.; Lambin, E.F.; Reenberg, A. The Emergence of Land Change Science for Global Environmental Change and Sustainability. Proc. Natl. Acad. Sci. USA 2007, 104, 20666–20671. [CrossRef] [PubMed]
2. Lillesand, T.; Kiefer, R.W.; Chipman, J. Remote Sensing and Image Interpretation; John Wiley & Sons: Hoboken, NJ, USA, 2014; ISBN 978-1-118-34328-9.
3. Howard, D.M.; Wylie, B.K.; Tieszen, L.L. Crop Classification Modelling Using Remote Sensing and Environmental Data in the Greater Platte River Basin, USA. Int. J. Remote Sens. 2012, 33, 6094–6108. [CrossRef]
4. Environmental Protection Agency. Renewable Fuel Standard (RFS) program are under 40 CFR Part 80: Regulation of Fuels and Fuel Additives; Environmental Protection Agency: Washington, DC, USA, 2010.
5. Blackman, A. Evaluating Forest Conservation Policies in Developing Countries Using Remote Sensing Data: An Introduction and Practical Guide. For. Policy Econ. 2013, 34, 1–16. [CrossRef]
6. Lark, T.J. Protecting Our Prairies: Research and Policy Actions for Conserving America’s Grasslands. Land Use Policy, 97, 104727.
7. Han, W.; Yang, Z.; Di, L.; Mueller, R. CropScape: A Web Service Based Application for Exploring and Disseminating US Conterminous Geospatial Cropland Data Products for Decision Support. Comput. Electron. Agric. 2012, 84, 111–123. [CrossRef]
8. Holland, A.; Bennett, D.A.; Secchi, S. Complying with Conservation Compliance? An Assessment of Recent Evidence in the US Corn Belt. Environ. Res. Lett. 2020. [CrossRef]
9. Foody, G.M. Valuing Map Validation: The Need for Rigorous Land Cover Map Accuracy Assessment in Economic Valuations of Ecosystem Services. Ecol. Econ. 2015, 111, 23–28. [CrossRef]
10. Olofsson, P.; Foody, G.M.; Stehman, S.V.; Woodcock, C.E. Making Better Use of Accuracy Data in Land Change Studies: Estimating Accuracy and Area and Quantifying Uncertainty Using Stratified Estimation. Remote Sens. Environ., 129, 122–131. [CrossRef]
11. Morales-Barquero, L.; Lyons, M.B.; Phinn, S.R.; Roelfsema, C.M. Trends in Remote Sensing Accuracy Assessment Approaches in the Context of Natural Resources. Remote Sens. 2019, 11, 2305. [CrossRef]
12. Stehman, S.V.; Foody, G.M. Key Issues in Rigorous Accuracy Assessment of Land Cover Products. Remote Sens. Environ., 231, 111199. [CrossRef]
13. Johnson, D.; Mueller, R. The 2009 Cropland Data Layer. Photogramm. Eng. Remote Sens. 2010, 76, 1201–1205.
14. Mueller, R.; Harris, M. Reported Uses of CropScape and the National Cropland Data Layer Program. In Proceedings of the International Conference on Agricultural Statistics VI, Rio de Janeiro, Brazil, 23–25 October 2013.
15. Boryan, C.; Yang, Z.; Mueller, R.; Craig, M. Monitoring US Agriculture: The US Department of Agriculture, National Agricultural Statistics Service, Cropland Data Layer Program. Geocarto Int. 2011, 26, 341–358. [CrossRef]
16. Johnson, D.M. A 2010 Map Estimate of Annually Tilled Cropland within the Conterminous United States. Agric. Syst., 114, 95–105. [CrossRef]
17. Stern, A.J.; Doraiswamy, P.C.; Hunt, E.R., Jr. Changes of Crop Rotation in Iowa Determined from the United States Department of Agriculture, National Agricultural Statistics Service Cropland Data Layer Product. J. Appl. Remote Sens., 6, 063590.
18. Sahajpal, R.; Zhang, X.; Izaurralde, R.C.; Gelfand, I.; Hurtt, G.C. Identifying Representative Crop Rotation Patterns and Grassland Loss in the US Western Corn Belt. Comput. Electron. Agric. 2014, 108, 173–182. [CrossRef]
19. Plourde, J.D.; Pijanowski, B.C.; Pekin, B.K. Evidence for Increased Monoculture Cropping in the Central United States. Agric. Ecosyst. Environ. 2013, 165, 50–59. [CrossRef]
20. Hendricks, N.P.; Sinnathamby, S.; Douglas-Mankin, K.; Smith, A.; Sumner, D.A.; Earnhart, D.H. The Environmental Effects of Crop Price Increases: Nitrogen Losses in the US Corn Belt. J. Environ. Econ. Manag. 2014, 68, 507–526. [CrossRef]
21. Cox, C.; Rundquist, S.; Weir, A. Boondoggle: Prevented Planting Insurance Plows up Wetlands, Wastes Billions; Environmental Working Group: Washington, DC, USA, 2015.
22. Lark, T.J.; Salmon, J.M.; Gibbs, H.K. Cropland Expansion Outpaces Agricultural and Biofuel Policies in the United States. Environ. Res. Lett. 2015, 10, 044003. [CrossRef]
23. Werling, B.P.; Dickson, T.L.; Isaacs, R.; Gaines, H.; Gratton, C.; Gross, K.L.; Liere, H.; Malmstrom, C.M.; Meehan, T.D.; Ruan, L.; et al. Perennial Grasslands Enhance Biodiversity and Multiple Ecosystem Services in Bioenergy Landscapes. Proc. Natl. Acad. Sci. USA 2014, 111, 1652–1657. [CrossRef] [PubMed]
24. Meehan, T.D.; Gratton, C.; Diehl, E.; Hunt, N.D.; Mooney, D.F.; Ventura, S.J.; Barham, B.L.; Jackson, R.D. Ecosystem-Service Tradeoffs Associated with Switching from Annual to Perennial Energy Crops in Riparian Zones of the US Midwest. PLoS ONE 2013, 8, e80093. [CrossRef]
25. USDA NASS QuickStats. Available online: (accessed on 10 May 2012).
26. USDA-NASS-RDD Spatial Analysis Research Section Cropland Data Layer Metadata. Available online: http://www.nass.usda.gov/research/Cropland/metadata/meta.htm (accessed on 20 July 2015).
27. USDA; FSA. FSA Common Land Unit Infosheet 2012. Available online: _infosheet_2012.pdf (accessed on 10 May 2012).
28. Wickham, J.D.; Stehman, S.V.; Fry, J.A.; Smith, J.H.; Homer, C.G. Thematic Accuracy of the NLCD 2001 Land Cover for the Conterminous United States. Remote Sens. Environ. 2010, 114, 1286–1296. [CrossRef]
29. Homer, C.; Dewitz, J.; Yang, L.; Jin, S.; Danielson, P.; Xian, G.; Coulston, J.; Herold, N.; Wickham, J.; Megown, K. Completion of the 2011 National Land Cover Database for the Conterminous United States–Representing a Decade of Land Cover Change Information. Photogramm. Eng. Remote Sens. 2015, 81, 345–354.
30. Wright, C.K.; Wimberly, M.C. Cropland Data Layer Provides a Valid Assessment of Recent Grassland Conversion in the Western Corn Belt. Proc. Natl. Acad. Sci. USA 2013. [CrossRef]
31. Mladenoff, D.J.; Sahajpal, R.; Johnson, C.P.; Rothstein, D.E. Recent Land Use Change to Agriculture in the U.S. Lake States: Impacts on Cellulosic Biomass Potential and Natural Lands. PLoS ONE 2016, 11, e0148566. [CrossRef]
32. Gage, A.M.; Olimb, S.K.; Nelson, J. Plowprint: Tracking Cumulative Cropland Expansion to Target Grassland Conservation. Gt. Plains Res. 2016, 26, 107–116. [CrossRef]
33. Johnston, C.A. Wetland Losses Due to Row Crop Expansion in the Dakota Prairie Pothole Region. Wetlands, 33, 175–182.
34. Lark, T.J.; Spawn, S.A.; Bougie, M.; Gibbs, H.K. Cropland Expansion in the United States Produces Marginal Yields at High Costs to Wildlife. Nat. Commun. 2020, 11, 4295. [CrossRef] [PubMed]
35. Reitsma, K.D.; Clay, D.E.; Clay, S.A.; Dunn, B.H.; Reese, C. Does the US Cropland Data Layer Provide an Accurate Benchmark for Land-Use Change Estimates? Agron. J. 2015.
36. Dunn, J.B.; Merz, D.; Copenhaver, K.L.; Mueller, S. Measured Extent of Agricultural Expansion Depends on Analysis Technique. Biofuels Bioprod. Biorefining 2017, 11, 247–257. [CrossRef]
37. Laingen, C. Measuring Cropland Change: A Cautionary Tale. Pap. Appl. Geogr. 2015, 1, 65–72. [CrossRef]
38. Larsen, A.E.; Hendrickson, B.T.; Dedeic, N.; MacDonald, A.J. Taken as a given: Evaluating the Accuracy of Remotely Sensed Crop Data in the USA. Agric. Syst. 2015, 141, 121–125. [CrossRef]
39. Shrestha, D.S.; Staab, B.D.; Duffield, J.A. Biofuel Impact on Food Prices Index and Land Use Change. Biomass Bioenergy 2019, 124, 43–53. [CrossRef]
40. Kline, K.L.; Singh, N.; Dale, V.H. Cultivated Hay and Fallow/Idle Cropland Confound Analysis of Grassland Conversion in the Western Corn Belt. Proc. Natl. Acad. Sci. USA 2013. [CrossRef] [PubMed]
41. Lark, T.J.; Mueller, R.M.; Johnson, D.M.; Gibbs, H.K. Measuring Land-Use and Land-Cover Change Using the U.S. Department of Agriculture’s Cropland Data Layer: Cautions and Recommendations. Int. J. Appl. Earth Obs. Geoinf., 62, 224–235. [CrossRef]
42. Rashford, B.S.; Albeke, S.E.; Lewis, D.J. Modeling Grassland Conversion: Challenges of Using Satellite Imagery Data. Am. J. Agric. Econ. 2013, 95, 404–411. [CrossRef]
43. Homer, C.H.; Fry, J.A.; Barnes, C.A. The National Land Cover Database. US Geol. Surv. Fact Sheet 2012, 3020, 1–4.
44. National Land Cover Database (NLCD) Multi-Resolution Land Characteristics Consortium (MRLC). Available online: http:// (accessed on 17 January 2012).
45. Congalton, R.G.; Green, K. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices; CRC Press: Boca Raton, FL, USA, 2008.
46. Anderson, J.R. A Land Use and Land Cover Classification System for Use with Remote Sensor Data; U.S. Government Printing Office: Washington, DC, USA, 1976.
47. Liu, W.; Gopal, S.; Woodcock, C.E. Uncertainty and Confidence in Land Cover Classification Using a Hybrid Classifier Approach. Photogramm. Eng. Remote Sens. 2004, 70, 963–971. [CrossRef]
48. Chen, J.; Chen, X.; Cui, X.; Chen, J. Change Vector Analysis in Posterior Probability Space: A New Method for Land Cover Change Detection. IEEE Geosci. Remote Sens. Lett. 2011, 8, 317–321. [CrossRef]
49. Stehman, S.V. Estimating Area from an Accuracy Assessment Error Matrix. Remote Sens. Environ., 132, 202–211. [CrossRef]
50. Olofsson, P.; Foody, G.M.; Herold, M.; Stehman, S.V.; Woodcock, C.E.; Wulder, M.A. Good Practices for Estimating Area and Assessing Accuracy of Land Change. Remote Sens. Environ. 2014, 148, 42–57. [CrossRef]
51. Johnston, C.A. Agricultural Expansion: Land Use Shell Game in the U.S. Northern Plains. Landsc. Ecol., 29, 81–95. [CrossRef]
52. Falcone, J.A. U.S. Conterminous Wall-to-Wall Anthropogenic Land Use Trends (NWALT), 1974–2012; Data Series; U.S. Geological Survey: Reston, VA, USA, 2015; p. 45.
53. Theobald, D.M. Development and Applications of a Comprehensive Land Use Classification and Map for the US. PLoS ONE 2014, 9, e94628. [CrossRef] [PubMed]
54. Morefield, P.E.; LeDuc, S.D.; Clark, C.M.; Iovanna, R. Grasslands, Wetlands, and Agriculture: The Fate of Land Expiring from the Conservation Reserve Program in the Midwestern United States. Environ. Res. Lett. 2016, 11, 094005. [CrossRef]
55. Hendricks, N.P.; Smith, A.; Sumner, D.A. Crop Supply Dynamics and the Illusion of Partial Adjustment. Am. J. Agric. Econ., 96, 1469–1491. [CrossRef]
56. Lark, T.J.; Larson, B.; Schelly, I.; Batish, S.; Gibbs, H.K. Accelerated Conversion of Native Prairie to Cropland in Minnesota. Environ. Conserv. 2019, 1–8. [CrossRef]
57. Song, X.-P.; Potapov, P.V.; Krylov, A.; King, L.; Di Bella, C.M.; Hudson, A.; Khan, A.; Adusei, B.; Stehman, S.V.; Hansen, M.C. National-Scale Soybean Mapping and Area Estimation in the United States Using Medium Resolution Satellite Imagery and Field Survey. Remote Sens. Environ. 2017, 190, 383–395. [CrossRef]
... The CDL provided >100 land cover categories, including different types of crops. In the U.S. Corn Belt, the accuracy of maize and soybean exceeded 95% (Lark et al., 2021). According to the data description provided by NASS/USDA, the producer's accuracy (PA) and user's accuracy (UA) of maize in Iowa were higher than 95%. ...
Full-text available
Maize (Zea mays), the second most-produced crop worldwide, serves as the cornerstone for global food security and human livelihood. Early-season maize mapping benefits maize production forecasting and other pre-harvest decision-making applications. However, most existing early-season mapping efforts rely heavily on either the current-year or historical crop labels to train classifiers, limiting the potential applications to new regions lacking crop labels. To explore the possibility of maize mapping only using satellite data in the early season, we proposed a Multi-temporal Gaussian Mixture Model (MGMM) to map maize planting areas without any crop labels. A chlorophyll content relevant proxy, named the Red-edge position (REP), was selected as model input, based on the truth that summer maize tends to show a higher chlorophyll content than other summer crops (e.g., soybean, cotton, peanut, sunflowers, etc.) in the peak season. The novel early-season mapping framework using the REP-based MGMM (MGMM-REP) was applied in four diverse areas (Iowa and Georgia in the US, Heilongjiang province (HLJ) in China, and GrandEst in France). The MGMM-REP could generate maize maps more than two months before harvest with reasonable accuracy (F1 ≥ 77%) using all the available Sentinel-2 (S2) images and the Google Earth Engine platform (GEE). Our early-season maps agreed well with the existing crop maps and official statistics. The correlation coefficient (R) of the maize acreage between our early-season maps and statistics was higher than 0.94. The high inter-class difference of REP between maize and other summer crops could increase the F1 score by 2-47% compared to the other commonly used Vegetation indices (VIs). Since MGMM-REP does not rely on crop labels, it had the potential to be transferred to label-scarce maize-cropped regions and contribute to the international commodity trade and food security forecast.
... In general, the 2016 NLCD was evaluated to be as accurate as the 2011 NLCD [46]. Based on the recent assessments, the accuracy performance of CDL at the national scale for the categories of interest was similar to that of NLCD, and has improved during 2008-2016 [47]. ...
Full-text available
The National Agricultural Statistics Service, the statistical arm of the US Department of Agriculture, and the Multi-Resolution Land Characteristics Consortium, a group of the US federal agencies, collect and publish several land-use and land-cover data sets. The aim of this study is to analyze the consistency of forestland estimates based on two widely used, publicly available products: the National Land-Cover Database (NLCD) and Cropland Data Layer (CDL). Both remote-sensing-based products provide raster-formatted land-cover categorization at a spatial resolution of 30 m. Although the processing of the yearly published CDL non-agricultural land-cover data is based on less frequently updated NLCD, the consistency of large-area forestland mapping between these two datasets has not been assessed. To assess the similarities and the differences between CDL- and NLCD-based forestland mappings for the state of North Carolina, we overlay the two data products for the years 2011 and 2016 in ArcMap 10.5.1 and analyze the location and attributes of the matched and mismatched forestland. We find that the mismatch is relatively smaller for the areas of the state where forests occupy larger shares of the total land, and that the relative mismatch is smaller in 2011 when compared to 2016. We also find that a large portion of the forestland mismatch is attributable to the dynamics of re-growth of periodically harvested and otherwise disturbed forests. Our results underscore the need for a holistic approach to data preparation, data attribution, and data accuracy when performing high-scale map-based analyses using each of these products.
... It provides statistics on rice production and the planting structure for the government [75]. Cropland Data Layer (CDL) data is a typical crop-specific coverage data layer, which is produced for the continental United States, using the annual-medium resolution satellite images and agricultural ground measurement points [77]. The CDLs were made by the USDA and other American scientific institutions such as the National Agricultural Statistics Service (NASS) [78]. ...
Full-text available
Rice is one of the most important food crops around the world. Remote sensing technology, as an effective and rapidly developing method, has been widely applied to precise rice management. To observe the current research status in the field of rice remote sensing (RRS), a bibliometric analysis was carried out based on 2680 papers of RRS published during 1980–2021, which were collected from the core collection of the Web of Science database. Quantitative analysis of the number of publications, top countries and institutions, popular keywords, etc. was conducted through the knowledge mapping software CiteSpace, and comprehensive discussions were carried out from the aspects of specific research objects, methods, spectral variables, and sensor platforms. The results revealed that an increasing number of countries and institutions have conducted research on RRS and a great number of articles have been published annually, among which, China, the United States of America, and Japan were the top three and the Chinese Academy of Sciences, Zhejiang University, and Nanjing Agricultural University were the first three research institutions with the largest publications. Abundant interest was paid to “reflectance”, followed by “vegetation index” and “yield” and the specific objects mainly focused on growth, yield, area, stress, and quality. From the perspective of spectral variables, reflectance, vegetation index, and back-scattering coefficient appeared the most frequently in the frontiers. In addition to satellite remote sensing data and empirical models, unmanned air vehicle (UAV) platforms and artificial intelligence models have gradually become hot topics. This study enriches the readers’ understanding and highlights the potential future research directions in RRS.
... For example, Nelson et al. (2021) found a sizable portion of the recent literature defined rural/urban based on land cover; i.e., by identifying the land cover that is attributable to populated areas (developed land) vs. agricultural, forested or natural land. In recent years, multiple GIS-based, consistently maintained and updated databases on land cover and use become easily accessible to both researchers and practitioners (Yang et al., 2018;Lark et al., 2021). An investigation of the implications of using the population density approach versus land cover approach for quantification of rural would be an exciting future research topic. ...
Contemporary research has measured differences between rural and its urban/suburban counterparts on the backdrop of social, economic, political and health phenomena. However, given the ambiguity of its definition, varying meanings and applications of the word ‘rural’ exist. In this paper we explored three different popular uses of the term rural on the backdrop of quantitative data with findings highlighting 1) there do exist statistical differences in data depending upon how rural is defined and 2) the definition of rural provided through the USDA’s Rural-Urban Commuting Areas (RUCA) best aligned with other definitions of rural.
The land of the conterminous United States (CONUS) has been transformed dramatically by humans over the last four centuries through land clearing, agricultural expansion and intensification, and urban sprawl. High-resolution geospatial data on long-term historical changes in land use and land cover (LULC) across the CONUS are essential for predictive understanding of natural–human interactions and land-based climate solutions for the United States. A few efforts have reconstructed historical changes in cropland and urban extent in the United States since the mid-19th century. However, the long-term trajectories of multiple LULC types with high spatial and temporal resolutions since the colonial era (early 17th century) in the United States are not available yet. By integrating multi-source data, such as high-resolution remote sensing image-based LULC data, model-based LULC products, and historical census data, we reconstructed the history of land use and land cover for the conterminous United States (HISLAND-US) at an annual timescale and 1 km × 1 km spatial resolution in the past 390 years (1630–2020). The results show widespread expansion of cropland and urban land associated with rapid loss of natural vegetation. Croplands are mainly converted from forest, shrub, and grassland, especially in the Great Plains and North Central regions. Forest planting and regeneration accelerated the forest recovery in the Northeast and Southeast since the 1920s. The geospatial and long-term historical LULC data from this study provide critical information for assessing the LULC impacts on regional climate, hydrology, and biogeochemical cycles as well as achieving sustainable use of land in the nation. The datasets are available at (Li et al., 2022).
The National Land Cover Database (NLCD), a product suite produced through the Multi-Resolution Land Characteristics (MRLC) consortium, is an operational land cover monitoring program. Starting from a base year of 2001, NLCD releases a land cover database every 2–3 years. The recent release of NLCD2019 extends the database to 18 years. We implemented a stratified random sample to collect land cover reference data for the 2016 and 2019 components of the NLCD2019 database at Level II and Level I of the classification hierarchy. For both dates, Level II land cover overall accuracies (OA) were 77.5% ± 1% (± value is the standard error) when agreement was defined as a match between the map label and primary reference label only, and increased to 87.1% ± 0.7% when agreement was defined as a match between the map label and either the primary or alternate reference label. At Level I of the classification hierarchy, land cover OA was 83.1% ± 0.9% for both 2016 and 2019 when agreement was defined as a match between the map label and primary reference label only, and increased to 90.3% ± 0.7% when agreement also included the alternate reference label. The Level II and Level I OA for the 2016 land cover in the NLCD2019 database were 5% higher compared to the 2016 land cover component of the NLCD2016 database when agreement was defined as a match between the map label and primary reference label only. No improvement was realized by the NLCD2019 database when agreement also included the alternate reference label. User’s accuracies (UA) for forest loss and grass gain were >70% when agreement included either the primary or alternate label, and UA was generally <50% for all other change themes. Producer’s accuracies (PA) were >70% for grass loss and gain and water gain and generally <50% for the other change themes. We conducted a post-analysis review for map-reference agreement to identify patterns of disagreement, and these findings are discussed in the context of potential adjustments to mapping and reference data collection procedures that may lead to improved map accuracy going forward.
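The two agreement definitions described above (primary reference label only versus primary or alternate label) can be illustrated with a minimal sketch. The sample units below are hypothetical, and the calculation is a plain unweighted proportion; it does not reproduce the stratified estimators or standard errors reported in the abstract.

```python
# Hypothetical sample units: (map_label, primary_reference, alternate_reference or None).
samples = [
    ("forest", "forest", None),
    ("grass", "shrub", "grass"),
    ("crop", "crop", "grass"),
    ("water", "wetland", None),
    ("grass", "grass", None),
]


def overall_agreement(units, allow_alternate):
    """Unweighted fraction of units whose map label matches the reference label(s)."""
    hits = 0
    for map_label, primary, alternate in units:
        if map_label == primary or (allow_alternate and map_label == alternate):
            hits += 1
    return hits / len(units)


# Agreement with the primary reference label only.
oa_primary = overall_agreement(samples, allow_alternate=False)
# Agreement with either the primary or the alternate reference label.
oa_either = overall_agreement(samples, allow_alternate=True)
print(oa_primary, oa_either)
```

Allowing the alternate label can only add matches, so the second figure is never lower than the first, which is consistent with the direction of the differences reported above.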
Accurate mapping of crop types globally is essential for maintaining food security. In recent years, with the continued launch of earth observation (EO) satellites, freely accessible EO data with high spatial–temporal resolution have made it possible to map crop types at a finer scale. However, the difficulty of crop type mapping increases dramatically with multi-source data (such as optical and SAR), especially when dealing with long time-series across the temporal domain. Currently, most existing crop mapping studies have not considered exploiting complementary information from multi-modular time-series. Therefore, this study proposes a multi-branch self-learning Vision Transformer (MSViT) structure for crop classification to better achieve spatial–temporal feature extraction by jointly using optical-SAR time-series. The experimental results show that our proposed method outperforms the most commonly used deep learning crop classification schemes in terms of overall classification accuracy, Kappa coefficient, and F1-score. In addition, the experiments quantitatively evaluate the impact of model depth, multi-source features, and contrast learning strategy on accurate crop identification, and visualize the degree of spatial–temporal feature contributions within the model.
Due to our increasing understanding of the role the surrounding landscape plays in ecological processes, a detailed characterization of land cover, including both agricultural and natural habitats, is ever more important for both researchers and conservation practitioners. Unfortunately, in the United States, different types of land cover data are split across thematic datasets that emphasize agricultural or natural vegetation, but not both. To address this data gap and reduce duplicative efforts in geospatial processing, we merged two major datasets, the LANDFIRE National Vegetation Classification (NVC) and USDA-NASS Cropland Data Layer (CDL), to produce an integrated land cover map. Our workflow leveraged strengths of the NVC and the CDL to produce detailed rasters comprising both agricultural and natural land-cover classes. We generated these maps for each year from 2012–2021 for the conterminous United States, quantified agreement between input layers and accuracy of our merged product, and published the complete workflow necessary to update these data. In our validation analyses, we found that approximately 5.5% of NVC agricultural pixels conflicted with the CDL, but we resolved most of these conflicts based on surrounding agricultural land, leaving only 0.6% of agricultural pixels unresolved in our merged product. These ready-to-use rasters characterizing both agricultural and natural land cover will be widely useful in environmental research and management.
Recent expansion of croplands in the United States has caused widespread conversion of grasslands and other ecosystems with largely unknown consequences for agricultural production and the environment. Here we assess annual land use change 2008-16 and its impacts on crop yields and wildlife habitat. We find that croplands have expanded at a rate of over one million acres per year, and that 69.5% of new cropland areas produced yields below the national average, with a mean yield deficit of 6.5%. Observed conversion infringed upon high-quality habitat that, relative to unconverted land, had provided over three times higher milkweed stem densities in the Monarch butterfly Midwest summer breeding range and 37% more nesting opportunities per acre for waterfowl in the Prairie Pothole Region of the Northern Great Plains. Our findings demonstrate a pervasive pattern of encroachment into areas that are increasingly marginal for production, but highly significant for wildlife, and suggest that such tradeoffs may be further amplified by future cropland expansion.
Grasslands are among the most endangered ecosystems in the world. They supply vital resources for society, support an abundance of wildlife species, and store rich carbon reserves beneath their surfaces. Despite this, only a fraction of original grasslands in the United States now remains, and their rate of conversion to cropland has recently reaccelerated. This paper discusses opportunities that are immediately available to reduce the loss of U.S. native grasslands (i.e., prairie) and advance toward collective goals in grassland conservation. Potential solution-oriented actions include inventorying and monitoring remaining prairie, reconsidering public and private incentives for conversion and conservation, and establishing an industry-led moratorium on natural ecosystem loss. There is also a need among the engaged communities to develop unified messaging and a shared vision for grassland conservation in the U.S., such as “no prairie conversion” or “zero net loss of grasslands.” Additional tangible steps for action are outlined across the science, policy, and public-driven support arenas and offered for multiple stakeholder groups, including agricultural producers, policymakers, academics, and conservation organizations.
Conservation provisions of US farm bills since 1985 have been aimed at mitigating negative environmental impacts of US agriculture. One of the long term goals has been to protect against soil erosion, with a focus specifically on highly erodible land (HEL). Conservation Compliance (CC) mandates that, in order to receive federal subsidies, farmers who plant annual crops on HEL must implement a conservation plan, with practices such as rotating crops and no-till farming. When crop prices increase, however, the incentives not to follow the plan increase, as conservation activities can reduce farmers’ profits. This study is the first to assess the performance of conservation compliance between 2007 and 2019, a period of historically high and variable crop prices, using geographical information system tools and crop data in a critical agricultural production region, the US Corn Belt. Our results indicate there was a substantial increase in continuous corn on HEL, a proxy measure for non-compliance, in several portions of the study area in correspondence with higher crop prices following the 2007 Energy Bill. This mirrored the change in crop rotations on all cropland. The increase was positively correlated with both absolute and relative corn prices. While at the height of absolute and relative corn prices there were increases in continuous corn on HEL everywhere across the study region except parts of Missouri, some of the largest changes occurred in environmentally sensitive regions and areas which use irrigation, thereby potentially creating disproportionate environmental impacts. Similar changes in continuous corn also occurred in all cropland in the region, indicating that mandatory conservation programs are as vulnerable to periods of high crop prices as voluntary programs. Better monitoring for both CC and other conservation programs is critical to ensure the policies work as intended.
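As a rough illustration of how continuous corn might be flagged from annual crop maps, the sketch below scans a per-pixel sequence of yearly class codes for a run of corn. The codes 1 (corn) and 5 (soybeans) follow the CDL legend; the three-year minimum run and the example sequences are assumptions made here for illustration, not the definition used in the study.

```python
CORN = 1      # CDL class code for corn
SOYBEANS = 5  # CDL class code for soybeans


def longest_run(sequence, value):
    """Length of the longest run of consecutive items equal to value."""
    best = current = 0
    for item in sequence:
        current = current + 1 if item == value else 0
        best = max(best, current)
    return best


def is_continuous_corn(yearly_codes, min_years=3):
    """Flag a pixel whose history has at least min_years consecutive corn years.

    min_years=3 is an assumed threshold, not one taken from the source.
    """
    return longest_run(yearly_codes, CORN) >= min_years


# Hypothetical six-year histories for two pixels.
rotated = [CORN, SOYBEANS, CORN, SOYBEANS, CORN, SOYBEANS]
continuous = [SOYBEANS, CORN, CORN, CORN, CORN, SOYBEANS]
print(is_continuous_corn(rotated), is_continuous_corn(continuous))
```

Restricting such a test to pixels that also fall on highly erodible land would require a separate HEL mask, which is not sketched here.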
The utility of land cover maps for natural resources management relies on knowing the uncertainty associated with each map. The continuous advances typical of remote sensing, including the increasing availability of higher spatial and temporal resolution satellite data and data analysis capabilities, have created both opportunities and challenges for improving the application of accuracy assessment. There are well established accuracy assessment methods, but their underlying assumptions have not changed much in the last couple of decades. Consequently, revisiting how map error and accuracy have been assessed and reported over the last two decades is timely, to highlight areas where there is scope for better utilization of emerging opportunities. We conducted a quantitative literature review on accuracy assessment practices for mapping via remote sensing classification methods, in both terrestrial and marine environments. We performed a structured search for land and benthic cover mapping, limiting our search to journals within the remote sensing field and to papers published between 1998 and 2017. After an initial screening process, we assembled a database of 282 papers, and extracted and standardized information on various components of their reported accuracy assessments. We discovered that only 56% of the papers explicitly included an error matrix, and a very limited number (14%) reported overall accuracy with confidence intervals. The use of kappa continues to be standard practice, being reported in 50.4% of the literature published on or after 2012. Reference datasets used for validation were collected using a probability sampling design in 54% of the papers. For approximately 11% of the studies, the sampling design used could not be determined. No association was found between classification complexity (i.e. number of classes) and measured accuracy, independent of the size of the study area. Overall, only 32% of papers included an accuracy assessment that could be considered reproducible; that is, they included a probability-based sampling scheme to collect the reference dataset, a complete error matrix, and provided sufficient characterization of the reference datasets and sampling unit. Our findings indicate that considerable work remains to identify and adopt more statistically rigorous accuracy assessment practices to achieve transparent and comparable land and benthic cover maps.
Unplowed native grasslands are among the most endangered ecosystems in the world, due in large part to their agricultural suitability and widespread conversion to cropland. Despite this, remaining locations of these species- and carbon-rich landscapes are neither well monitored nor effectively protected. A recent spike in US prices for corn (Zea mays) and soybeans (Glycine max) intensified incentives to bring new land into production, potentially hastening the conversion of grasslands to crops. We combined satellite-based land cover data with aerial photographs and a field-based inventory of remaining native grassland (hereafter prairie) in Minnesota to assess the areas, rates, and locations of prairie conversion since 2008. Our results reveal that during 2008–2012, prairie was converted at average annual rates more than four times greater than the previous decade and a half. Corn and soybeans were the initial crops planted on 73% of converted prairie, and more than 80% of conversion occurred in recently established conservation priority zones, thereby magnifying the urgency to protect these sites. Broader land-use trends in Minnesota suggest that expansion of both croplands and developed lands continues to threaten all grasslands, including the subset that is prairie, and that the growth of developed or built-up land may be amplifying the conversion pressure exerted by agriculture, though further research is needed. Despite the small total area of prairie lost, the multi-fold increase in conversion rates and the confirmation of native habitat clearing may have substantial conservation implications, especially given the very limited prairie that remains in the region. The overall results reveal challenges for federal policies, including a loophole in the crop insurance Sodsaver provision surrounding alfalfa hay and limitations in the current enforcement of the Renewable Fuel Standard.
Monitoring agricultural land is important for understanding and managing food production, environmental conservation efforts, and climate change. The United States Department of Agriculture's Cropland Data Layer (CDL), an annual satellite imagery-derived land cover map, has been increasingly used for this application since complete coverage of the conterminous United States became available in 2008. However, the CDL is designed and produced with the intent of mapping annual land cover rather than tracking changes over time, and as a result certain precautions are needed in multi-year change analyses to minimize error and misapplication. We highlight scenarios that require special considerations, suggest solutions to key challenges, and propose a set of recommended good practices and general guidelines for CDL-based land change estimation. We also characterize a problematic issue of crop area underestimation bias within the CDL that needs to be accounted for and corrected when calculating changes to crop and cropland areas. When used appropriately and in conjunction with related information, the CDL is a valuable and effective tool for detecting diverse trends in agriculture. By explicitly discussing the methods and techniques for post-classification measurement of land-cover and land-use change using the CDL, we aim to further stimulate the discourse and continued development of suitable methodologies. Recommendations generated here are intended specifically for the CDL but may be broadly applicable to additional remotely-sensed land cover datasets including the National Land Cover Database (NLCD), Moderate Resolution Imaging Spectroradiometer (MODIS)-based land cover products, and other regional, national, and global land cover classification maps.
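One simple way to reason about an area underestimation bias is through user's and producer's accuracy. If user's accuracy (UA) is the share of mapped pixels of a class that are correct, and producer's accuracy (PA) is the share of true pixels of that class that the map captures, then mapped area × UA and true area × PA both describe the correctly mapped area, so true area ≈ mapped area × UA / PA. The sketch below applies that identity to hypothetical numbers; it is an illustration of the reasoning, not necessarily the correction procedure proposed in the paper.

```python
def bias_adjusted_area(mapped_area, users_accuracy, producers_accuracy):
    """Estimate true class area from mapped area using UA / PA.

    Accuracies are fractions in (0, 1]. Assumes both were computed for the
    same class and region as the mapped area.
    """
    if not (0 < users_accuracy <= 1 and 0 < producers_accuracy <= 1):
        raise ValueError("accuracies must be fractions in (0, 1]")
    return mapped_area * users_accuracy / producers_accuracy


# Hypothetical crop: 1000 area units mapped, UA = 0.95, PA = 0.90.
# PA below UA means omission exceeds commission, so the map undercounts.
adjusted = bias_adjusted_area(1000.0, 0.95, 0.90)
print(round(adjusted, 1))
```

When change is computed between two years, applying the adjustment to each year separately before differencing avoids carrying year-specific bias into the change estimate.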
Concern is rising that ecologically important, carbon-rich natural lands in the United States are losing ground to agriculture. We investigate how quantitative assessments of historical land-use change (LUC) to address this concern differ in their conclusions depending on the data set used through an examination of LUC between 2006 and 2014 in 20 counties in the Prairie Pothole Region using the Cropland Data Layer, a modified Cropland Data Layer dataset, data from the National Agricultural Imagery Program, and in-person ground-truthing. The Cropland Data Layer analyses overwhelmingly returned the largest amount of LUC with associated error that limits drawing conclusions from it. Analysis with visual imagery estimated a fraction of this LUC. Clearly, analysis technique drives understanding of the measured extent of LUC; different techniques produce vastly different results that would inform land management policy in strikingly different ways. Best practice guidelines are needed. © 2017 Society of Chemical Industry and John Wiley & Sons, Ltd.
Accuracy assessment and land cover mapping have been inexorably linked throughout the first 50 years of publication of Remote Sensing of Environment. The earliest developers of land-cover maps recognized the importance of evaluating the quality of their maps, and the methods and reporting format of these early accuracy assessments included features that would be familiar to practitioners today. Specifically, practitioners have consistently recognized the importance of obtaining high quality reference data to which the map is compared, the need for sampling to collect these reference data, and the role of an error matrix and accuracy measures derived from the error matrix to summarize the accuracy information. Over the past half century these techniques have undergone refinements to place accuracy assessment on a more scientifically credible footing. We describe the current status of accuracy assessment that has emerged from nearly 50 years of practice and identify opportunities for future advances. The article is organized by the three major components of accuracy assessment, the sampling design, response design, and analysis, focusing on good practice methodology that contributes to a rigorous, informative, and honest assessment. The long history of research and applications underlying the current practice of accuracy assessment has advanced the field to a mature state. However, documentation of accuracy assessment methods needs to be improved to enhance reproducibility and transparency, and improved methods are required to address new challenges created by advanced technology that has expanded the capacity to map land cover extensively in space and intensively in time.
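The accuracy measures derived from an error matrix can be sketched in a few lines. The matrix below is hypothetical, with rows as map classes and columns as reference classes, and the calculation assumes the counts come from a simple random sample; stratified designs call for area-weighted estimators, which are not shown.

```python
# Hypothetical error matrix: rows = map class, columns = reference class.
classes = ["crop", "grass", "forest"]
matrix = [
    [45, 4, 1],
    [5, 30, 5],
    [0, 6, 24],
]


def accuracy_measures(m):
    """Return (overall accuracy, user's accuracies, producer's accuracies)."""
    n = len(m)
    total = sum(sum(row) for row in m)
    overall = sum(m[i][i] for i in range(n)) / total
    # User's accuracy: correct count divided by the row (map) total.
    users = [m[i][i] / sum(m[i]) for i in range(n)]
    # Producer's accuracy: correct count divided by the column (reference) total.
    producers = [m[i][i] / sum(m[r][i] for r in range(n)) for i in range(n)]
    return overall, users, producers


overall, users, producers = accuracy_measures(matrix)
for name, ua, pa in zip(classes, users, producers):
    print(f"{name}: UA={ua:.2f} PA={pa:.2f}")
print(f"overall accuracy={overall:.3f}")
```

User's accuracy answers how often a mapped label is right, while producer's accuracy answers how much of the reference class the map found; reporting both alongside the full matrix is what makes an assessment reproducible.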
Food price and land use data over an extended time period have been examined to identify possible correlations between biofuel production and food price or land use changes. We compared the food price index before and after the biofuel boom in the 2000s to evaluate biofuel's impact on the inflation rate. We found that the U.S. food price inflation rate since 1973 could be divided into three distinct periods. The inflation rate was lowest at 2.6% during 1991–2016, which encompasses the biofuel boom. Among many factors, continuously rising food production per capita was identified as the likely cause of low food price inflation during this period. US exports of corn have not declined since the 1990s and soybean exports are rising at a steady rate. Among several variables tested as a cause of food price index increase, crude oil price had the highest correlation. We also manually verified the automated land use classification of satellite imagery covering 664 km² in three selected areas in the US. We found that 10.90% of non-agricultural land was misclassified as agriculture, whereas only 2.23% of agricultural land was misclassified as non-agricultural. The automated classification showed an 8.53% increase in agricultural land from 2011 to 2015, while the manual classification showed only a 0.31% (±1.92%) increase. This result was within the margin of error, indicating no significant land use change. We concluded that automated satellite image land use classification should be verified more rigorously before being used for land use change analysis.
Reliable and timely information on agricultural production is essential for ensuring world food security. Freely available medium-resolution satellite data (e.g. Landsat, Sentinel) offer the possibility of improved global agriculture monitoring. Here we develop and test a method for estimating in-season crop acreage using a probability sample of field visits and producing wall-to-wall crop type maps at national scales. The method is illustrated for soybean cultivated area in the US for 2015. A stratified, two-stage cluster sampling design was used to collect field data to estimate national soybean area. The field-based estimate employed historical soybean extent maps from the U.S. Department of Agriculture (USDA) Cropland Data Layer to delineate and stratify U.S. soybean growing regions. The estimated 2015 U.S. soybean cultivated area based on the field sample was 341,000 km² with a standard error of 23,000 km². This result is 1.0% lower than USDA's 2015 June survey estimate and 1.9% higher than USDA's 2016 January estimate. Our area estimate was derived in early September, about 2 months ahead of harvest. To map soybean cover, the Landsat image archive for the year 2015 growing season was processed using an active learning approach. Overall accuracy of the soybean map was 84%. The field-based sample estimated area was then used to calibrate the map such that the soybean acreage of the map derived through pixel counting matched the sample-based area estimate. The strength of the sample-based area estimation lies in the stratified design that takes advantage of the spatially explicit cropland layers to construct the strata. The success of the mapping was built upon an automated system which transforms Landsat images into standardized time-series metrics. The developed method produces reliable and timely information on soybean area in a cost-effective way and could be applied to other regions and potentially other crops in an operational mode.
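The abstract does not say how the map was calibrated to the sample-based area estimate. One hypothetical way to make a pixel-counted area match a target is to rank pixels by a classifier score and keep the highest-scoring pixels until the target area is reached. The scores, pixel size, and target below are all assumed values for illustration.

```python
def calibrate_threshold(scores, pixel_area, target_area):
    """Return the score threshold whose mapped area best matches target_area.

    Pixels scoring at or above the threshold are labeled as the crop.
    Tied scores at the threshold can push the mapped area slightly over.
    """
    ranked = sorted(scores, reverse=True)
    n_pixels = round(target_area / pixel_area)
    n_pixels = max(0, min(n_pixels, len(ranked)))
    if n_pixels == 0:
        return float("inf")  # no pixel qualifies
    return ranked[n_pixels - 1]


# Hypothetical per-pixel soybean scores for eight 30 m pixels (0.0009 km² each).
scores = [0.95, 0.10, 0.80, 0.55, 0.30, 0.70, 0.05, 0.60]
# Assumed sample-based target area, equal to four pixels.
threshold = calibrate_threshold(scores, pixel_area=0.0009, target_area=0.0036)
mapped = [s >= threshold for s in scores]
print(threshold, sum(mapped))
```

This keeps the sample-based estimate as the authority on total area and uses the map only to decide where that area is located.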