Machine-learned prediction of annual crop planting in the U.S. Corn Belt
based on historical crop planting maps
Chen Zhang (a,b), Liping Di (a,b; corresponding author), Li Lin (a,b), Liying Guo (a)
(a) Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA 22030, USA
(b) Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA 22030, USA
Abstract
An accurate crop planting map can provide essential information for decision support in agriculture. Methods for post-season and in-season crop mapping have been widely studied in the land use and land cover community. However, predicting the spatial distribution of crop planting before the growing season remains a challenge. This paper is the first attempt to apply a machine learning approach to the prediction of field-level annual crop planting from historical crop planting maps. We present an end-to-end machine learning framework for crop planting prediction that uses the Cropland Data Layer (CDL) time series as reference data and a multi-layer artificial neural network as the prediction model. The proposed framework was first tested in Lancaster County, Nebraska, and then scaled up to the U.S. Corn Belt. According to the experimental results from 53 Agricultural Statistics Districts, the machine-learned crop planting map is expected to reach 88% agreement with the future CDL. Meanwhile, the crop acreage estimates derived from the machine-learned prediction were highly correlated (R² > 0.9) with the crop acreage estimates of the CDL and the official statistics of the U.S. Department of Agriculture National Agricultural Statistics Service. This study provides a low-cost and efficient way to predict the annual crop planting map, which can support many agricultural applications and decisions before the beginning of a growing season.
Keywords: Crop planting prediction, Machine learning, Artificial neural network, Cropland Data Layer,
Crop mapping
Corresponding author
Email addresses: czhang11@gmu.edu (Chen Zhang), ldi@gmu.edu (Liping Di), llin2@gmu.edu (Li Lin), lguo2@gmu.edu (Liying Guo)
Preprint submitted to Computers and Electronics in Agriculture (doi: 10.1016/j.compag.2019.104989) May 17, 2020
C. Zhang et al. doi: 10.1016/j.compag.2019.104989
1. Introduction
With the rapid growth in the volume of Earth Observation (EO) data over the past decades, diverse agricultural monitoring data have been gathered from remotely sensed images collected by various sensors such as the Advanced Very High Resolution Radiometer (AVHRR), the Moderate Resolution Imaging Spectroradiometer (MODIS), and the Landsat Thematic Mapper (TM) (Seelan et al., 2003; Liaghat, 2010; Mulla, 2013; Khanal et al., 2017; Yang et al., 2017). As an essential technology in agricultural monitoring and analysis, remote sensing-based crop mapping aims to identify specific crops and delineate the spatial distribution of crop planting from EO data. A timely and accurate crop planting map not only provides fundamental information for agricultural decision-making applications (Lu et al., 2014; Di et al., 2017; Shrestha et al., 2017), but also supports food security, global food supplies, market planning, and many other socioeconomic activities (Doraiswamy et al., 2003; Thenkabail et al., 2009; Brown, 2016; Qu and Hao, 2018).
According to the phase of planting, crop mapping can be divided into post-season mapping, in-season mapping, and pre-season mapping. Post-season and in-season mapping have been widely explored by the land use and land cover (LULC) community (Kussul et al., 2015; Belgiu and Csillik, 2018; Dahal et al., 2018; Phalke and Özdoğan, 2018; Hao et al., 2015, 2018a). Many of these methods and algorithms can achieve excellent classification results early in the growing season (Potgieter et al., 2010; McNairn et al., 2014; Vaudour et al., 2015; Zhong et al., 2016; Skakun et al., 2017; Hao et al., 2018b). Pre-season prediction, on the other hand, aims to forecast crop planting before the beginning of a growing season. Many studies quantify future crop area and yield using remotely sensed vegetation indices, such as the Normalized Difference Vegetation Index (NDVI), the Vegetation Condition Index (VCI), and the Enhanced Vegetation Index (EVI) (Wiegand et al., 1991; Quarmby et al., 1993; Dabrowska-Zielinska et al., 2002; Kastens et al., 2005; Teal et al., 2006; Prasad et al., 2006; Bolton and Friedl, 2013; Yagci et al., 2015). However, studies on pre-season crop mapping and on predicting the spatial distribution of crop planting are few. In essence, pre-season crop mapping can be considered an LULC prediction task. The Cellular Automata-Markov model, which is effective for handling continuous processes, has been widely adopted for LULC change prediction (Corner et al., 2014; Halmy et al., 2015; Singh et al., 2015; Mondal et al., 2016; Xu et al., 2016). The basic idea of these works is to predict future LULC change from historical Landsat TM time series. Osman et al. (2015) proposed a Markov logic network-based model for early crop mapping, which predicts the crop type from the historical crop rotation with an accuracy of 60% before the growing season. Meanwhile, several models based on field-level crop sequences could potentially be used to predict future crop planting information (Aurbacher and Dabbert, 2011; Xiao et al., 2014; Zhang et al., 2019a). These previous studies raise a question: “can we predict the future crop planting from the historical crop planting maps?” This paper attempts to answer it.
Data and methods are the two keys to addressing this question. The Cropland Data Layer (CDL) of the U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) provides publicly available annual crop planting maps from 1997 to the present, which makes it an ideal reference data set for this study. On the other hand, machine learning, particularly approaches based on artificial neural networks (ANN), has been used in LULC mapping of EO data since the 1980s (Surkan and Di, 1989; Ritter and Hepner, 1990; Dreyer, 1993; Yoshida and Omatu, 1994; Foody, 1995). The advent of deep learning, a branch of ANN-based machine learning technology, in the last decade is pushing the use of machine learning in agriculture to new heights. A variety of deep learning architectures, such as Deep Neural Network (DNN), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN), have been applied in crop mapping (Kussul et al., 2017; Zhong et al., 2019), crop yield prediction (Kaul et al., 2005; Kuwata and Shibasaki, 2015; Pantazi et al., 2016; Nevavuori et al., 2019), plant identification (Lee et al., 2015; Grinblat et al., 2016; Mohanty et al., 2016; Kaya et al., 2019), and many other agricultural applications (Chen et al., 2002; Kamilaris and Prenafeta-Boldú, 2018). In this study, we choose the ANN as the machine learning algorithm for crop type prediction.
The overall objective of this study is to predict the spatial distribution of field-level crop planting over a large geographic area. Specifically, an end-to-end machine learning framework for the prediction of the annual crop planting map is developed and demonstrated. In Section 2, we introduce the study area, provide information about the CDL data, and describe the structure of the proposed machine learning framework. In Section 3, we conduct experiments at both county and regional scales and thoroughly analyze the prediction performance. In addition, the crop acreage calculated from the predicted crop planting map is validated using the CDL data and official statistics from USDA NASS. Section 4 discusses the usefulness and contribution of the study, the advantages of the proposed method, and the limitations of the current implementation. The conclusion and future work are given in Section 5.
2. Data and methods
2.1. Study area
This study focuses on the Corn Belt region of the Midwestern United States. USDA NASS divides each U.S. state into as many as nine Agricultural Statistics Districts (ASDs), except Texas, which is divided into fifteen. Each ASD consists of several contiguous counties with relatively similar agricultural characteristics and environment. Figure 1 shows the geography of the study area, which contains 53 contiguous ASDs covering the entire states of Iowa, Illinois, and Indiana, and parts of Minnesota, North Dakota, Nebraska, South Dakota, and Wisconsin. Some ASDs that are mainly covered by grass, forest, or non-major crops, such as ASD #3110/#3120/#3170 (Western Nebraska) and ASD #3930/#3960/#3980/#3990 (Eastern Ohio), were excluded. According to the statistics of the 2018 CDL, the study area has a total of 149,992,381.2 acres of cropland, which accounts for 75% of the total land area. Corn and soybeans are the two dominant crop types in the study area, accounting for 37% and 35% of the total cropland, respectively. Other crop types, such as alfalfa, winter wheat, sorghum, rye, and oats, account for a small proportion of all croplands.
Figure 1: Geography of the study area (data from 2018 CDL by USDA NASS).
2.2. Cropland Data Layer
Cropland Data Layer is an annual crop-specific land cover data layer produced by USDA NASS. It covers the entire Contiguous United States (CONUS) from 2008 to the present and some states from 1997 to 2007 at 30-meter spatial resolution, with an accuracy higher than 95% for major crop types (i.e., corn, soybeans, and wheat). The production of CDL is based on extensive internal ground truth data of NASS and remote sensing images including the Landsat 8 OLI/TIRS, the Disaster Monitoring Constellation (DMC) DEIMOS-1 and UK2, the ISRO ResourceSat-2 LISS-3, and the ESA Sentinel-2 images (Boryan et al., 2011; NASS, 2019). As the only available annual field-level agricultural land use map for the entire CONUS, CDL has been widely used as reference data to support many studies, such as LULC change measurement (Lark et al., 2017), flood impact estimation (Shrestha et al., 2017), flood mapping (Shan et al., 2010; Lin et al., 2019), national-scale cultivated area estimation (King et al., 2017), crop modeling (Resop et al., 2012; Sahajpal et al., 2014; Roy and Yan, 2018), and land surface modeling (Liu et al., 2016; Nguyen et al., 2018).
As an end-of-season product, the current-year CDL is normally released for public use early in the following year, usually around February or March. The latest CDL product at the time of writing, the 2018 CDL, was released at the end of February 2019. All CDL products and their derivative products (e.g., crop mask layer, crop frequency layer, cultivated layer) can be accessed through CropScape (Han et al., 2012; Zhang et al., 2019b). Figure 2 summarizes the availability of CDL products for the states of the U.S. Corn Belt.
2.3. Machine learning framework for crop planting prediction
This section presents an end-to-end machine learning framework for the prediction of annual crop planting. The prediction model is based on a multi-layer ANN architecture. The input is the historical CDL time series. The output is the predicted annual crop planting map, which has the same spatial resolution as the CDL time series. We discuss the proposed framework in three parts: data preprocessing, structure design of the ANN, and evaluation.
Figure 2: Availability of historical CDL data for the states of U.S. Corn Belt (the coverage of CDL for Iowa in 2000 and CDL
for Nebraska in 2001 are incomplete).
2.3.1. Data preprocessing
Intuitively, the crop prediction model predicts the coming-year crop type from the historical crop sequence. For example, if a field has been consistently planted with corn during the past few years, we can reasonably predict that it will be planted with corn in the next year. If a field follows a crop rotation pattern (e.g., corn-soybean rotation), we can predict that its crop type will change in the next year. Therefore, the training/testing set should contain abundant samples with reliable temporal information. This step processes the historical CDL data into a training/testing set with structured pixel-based crop sequence information.
Figure 3 shows the preparation of the training/testing set for the prediction of the 2017 crop planting map. The training set (Figure 3a) contains a stack of CDL time series and a label set for the corresponding region of interest (ROI). To extend the training set, a recursive training strategy is adopted. For example, when training a prediction model for 2017, we use not only the sub-training set of 2008–2015 CDL labeled with 2016 CDL, but also the sub-training set of 2007–2014 CDL labeled with 2015 CDL and the sub-training set of 2006–2013 CDL labeled with 2014 CDL. In this way, the total number of training samples is tripled. According to our tests, the recursive training strategy improves the performance of the prediction model by 1–5%. The label set maps the coming-year crop type of all training samples. The testing set (Figure 3b) contains a stack of CDL time series from 2009 to 2016, which is used to predict the annual crop planting map of 2017.
To convert the data set into a format accepted by the neural network, we flatten the training/testing set into a structured 2-D table. An example of the structured training/testing set is shown in Figure 4. In this example, we label each training sample with its crop value (e.g., “1” for corn, “5” for soybean, “36” for alfalfa) or as “others” (e.g., water, developed area, forest, and grass). Each row of the table represents a sample composed of the pixel-level crop sequence features. For example, a training sample of a pixel that follows the corn-soybean rotation pattern is represented as “(1, 5, 1, 5, 1, 5, 1, 5)” and labeled with “1”.
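A minimal sketch of the flattening step follows, assuming the CDL stack is held as nested Python lists indexed as `[year][row][col]`. The helper names `flatten_stack` and `flatten_labels` and the 2 x 2 toy raster are hypothetical and only illustrate the table layout; a real implementation would read the rasters with a geospatial library.

```python
def flatten_stack(stack):
    """Turn a [year][row][col] stack of class codes into one row per pixel."""
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    table = []
    for r in range(n_rows):
        for c in range(n_cols):
            # One sample = the crop sequence of a single pixel across all years.
            table.append(tuple(layer[r][c] for layer in stack))
    return table


def flatten_labels(label_layer):
    """Flatten the label raster in the same row-major order as flatten_stack."""
    return [value for row in label_layer for value in row]


# Toy 2 x 2 raster over eight years (1 = corn, 5 = soybean, 36 = alfalfa).
stack = [
    [[1 if t % 2 == 0 else 5, 1],
     [5 if t % 2 == 0 else 1, 36]]
    for t in range(8)
]
label_layer = [[1, 1], [5, 36]]  # coming-year crop type of each pixel

features = flatten_stack(stack)
labels = flatten_labels(label_layer)
print(features[0], "->", labels[0])
```

The first row of the table is the corn-soybean rotation sample from the text, (1, 5, 1, 5, 1, 5, 1, 5) labeled with 1.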
Figure 3: Preparation of the training/testing set for the prediction of the 2017 crop planting map: (a) training set; (b) testing set. The training set contains three subsets, each composed of a stack of CDL time series with a label set. The testing set is composed of a stack of CDL time series.
2.3.2. Structure design of artificial neural network
Figure 5 illustrates the architecture of the crop type prediction model. The proposed model has a typical multilayer perceptron (MLP) structure with one input layer, three hidden layers, and one output layer. The input layer has eight input nodes corresponding to the crop types of eight years in the historical CDL time series for an individual pixel. We used binary encoding to represent each input, allocating one bit per input category, so that exactly one bit is set for each input. In this example, each input pixel is encoded as (1,0,0,0), (0,1,0,0), (0,0,1,0), or (0,0,0,1), corresponding to corn, soybeans, alfalfa, and others, respectively. There are three hidden layers between the input layer and the output layer. By comparing with other model configurations, we found that the performance difference between three hidden layers and more hidden layers is negligible. To simplify the model structure and improve computing efficiency, we adopted a three-hidden-layer architecture for the ANN model.
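The encoding of one eight-year sequence can be sketched as follows. The class tuple, the helper names, and the choice to concatenate the eight codes into a single 32-value vector are assumptions made for illustration; the text only specifies the four per-input codes.

```python
# Target classes from the example in the text: 1 = corn, 5 = soybeans, 36 = alfalfa.
# Every other CDL value falls into the trailing "others" slot.
CLASSES = (1, 5, 36)


def one_hot(code, classes=CLASSES):
    """Encode one class code with one bit per category (last bit = others)."""
    vec = [0] * (len(classes) + 1)
    index = classes.index(code) if code in classes else len(classes)
    vec[index] = 1
    return vec


def encode_sequence(sequence, classes=CLASSES):
    """Concatenate the codes of all years (assumed layout: year-major)."""
    encoded = []
    for code in sequence:
        encoded.extend(one_hot(code, classes))
    return encoded


print(one_hot(1), one_hot(5), one_hot(36), one_hot(999))  # 999: any non-target value
print(len(encode_sequence((1, 5, 1, 5, 1, 5, 1, 5))))
```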
Figure 4: Example of the structured training/testing set. Each sample of the training set is composed of a sequence of historical crop type features and labeled with the coming-year crop type value. Each sample of the testing set is composed of a sequence of historical crop type features.
Figure 5: Structure of the MLP-based crop type prediction model.
The activation function for both the input layer and the hidden layers is the Rectified Linear Unit (ReLU), which can be written as:
$$f(z) = \max(0, z) \tag{1}$$
where $z$ is the input to a neuron. The output layer uses the SoftMax function to normalize the result into a probability distribution. The SoftMax function is defined as:
$$\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \tag{2}$$
where $\mathbf{z}$ represents the vector of inputs to the output layer, $j$ represents the index of the output unit, and $K$ represents the total number of classes. In the proposed model, the crop type of an input pixel is predicted as the type with the highest probability. Figure 6 shows an example of the machine-learned annual crop planting prediction based on the probability distribution of SoftMax. In addition, we use cross entropy, which is frequently used in classification problems, as the loss function, and the Adam optimization algorithm (Kingma and Ba, 2014) as the optimizer.
Figure 6: Example of mapping the machine-learned prediction result based on the probability distribution of SoftMax.
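To make Equations (1) and (2) concrete, the sketch below runs one forward pass through an untrained network with three hidden layers and evaluates the cross-entropy loss for a single sample. It is not the authors' implementation: the hidden-layer width of 16, the random initialization, the 32-value input layout, and all function names are assumptions, and the Adam training loop is omitted.

```python
import math
import random


def relu(values):
    """Equation (1), applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Equation (2); subtracting the maximum avoids overflow in exp."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU after every layer except the last, which feeds SoftMax."""
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(x, weights, bias))


def cross_entropy(probs, true_index):
    """Loss for one sample whose true class is true_index."""
    return -math.log(probs[true_index])


rng = random.Random(0)
sizes = [32, 16, 16, 16, 4]  # assumed widths: 8 years x 4 bits in, 4 classes out
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

x = [1, 0, 0, 0, 0, 1, 0, 0] * 4  # corn-soybean rotation, one-hot encoded
probs = forward(x, layers)
predicted = max(range(len(probs)), key=probs.__getitem__)  # highest probability wins
print(probs, predicted, cross_entropy(probs, 0))
```

Because the weights are random, the printed class is meaningless; the point is only that the output sums to one and the predicted class is the arg-max of the SoftMax vector.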
Because this study deals with regional-scale mapping, differences in geography, climate, and other factors may influence the crop distribution and crop sequences across the study area. Since the croplands within one ASD have relatively similar agricultural characteristics and environment, we train and apply the model at the ASD level. Feeding the testing set of an ASD to its trained model produces a pixel-by-pixel prediction map of the coming-year crop planting type. Repeating this procedure ASD by ASD yields a crop planting map for the entire study area.
2.4. Evaluation
The prediction result is evaluated using overall accuracy (OA), Kappa, precision, recall, and F1 score. OA measures the proportion of all pixels whose predicted label exactly matches the corresponding label in the reference image. Kappa (Cohen, 1960) measures inter-annotator agreement and is defined as
$$\kappa = \frac{p_0 - p_e}{1 - p_e} \tag{3}$$
where $p_0$ represents the actual observed agreement and $p_e$ represents the hypothetical probability of chance agreement.
Meanwhile, we use precision and recall to measure the prediction result of each class. Precision and recall are defined as follows:
$$Precision = \frac{TP}{TP + FP} \tag{4}$$
$$Recall = \frac{TP}{TP + FN} \tag{5}$$
where $TP$ represents the number of true positives, $FP$ represents the number of false positives, and $FN$ represents the number of false negatives. Precision measures the ability of the predictor not to label a negative sample as positive. Recall measures the ability of the predictor to find all the positive samples.
We also use the F1 score to combine precision and recall, which is defined as:
$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{6}$$
OA, precision, recall, and F1 score lie between 0 and 1; Kappa is at most 1 and becomes negative only when the agreement is worse than chance. For all metrics, the higher the value, the better the prediction performance.
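Equations (3) to (6) can all be computed from a confusion matrix. The following sketch uses only the Python standard library; the function names and the ten-pixel toy example are hypothetical and serve to check the arithmetic by hand.

```python
def confusion_matrix(reference, predicted, classes):
    """Rows index the reference class, columns the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def overall_accuracy(m):
    total = sum(map(sum, m))
    return sum(m[i][i] for i in range(len(m))) / total


def kappa(m):
    """Equation (3): p_e is the agreement expected from the marginals alone."""
    total = sum(map(sum, m))
    p0 = overall_accuracy(m)
    pe = sum(sum(m[i]) * sum(row[i] for row in m) for i in range(len(m))) / total ** 2
    return (p0 - pe) / (1 - pe)


def precision_recall_f1(m, i):
    """Equations (4)-(6) for the class at index i."""
    tp = m[i][i]
    fp = sum(row[i] for row in m) - tp
    fn = sum(m[i]) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


classes = ("corn", "soybean", "others")
reference = ["corn"] * 4 + ["soybean"] * 4 + ["others"] * 2
predicted = ["corn", "corn", "corn", "soybean",
             "soybean", "soybean", "soybean", "corn",
             "others", "corn"]

m = confusion_matrix(reference, predicted, classes)
print(overall_accuracy(m), kappa(m), precision_recall_f1(m, 0))
```

In this toy case seven of ten pixels agree (OA = 0.7), the chance agreement is 0.38, and Kappa is therefore 0.32 / 0.62, or about 0.52, which shows how Kappa discounts the agreement that the class proportions alone would produce.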
3. Experiments and results
This section presents a group of experiments to validate the proposed method. First, we tested the
prediction model at the county level (Section 3.1). Then we predicted and validated the annual crop planting
maps for the U.S. Corn Belt (Section 3.2). Moreover, we compared the crop acreage calculated from the
predicted crop planting map with the CDL and official statistics by USDA NASS (Section 3.3).
3.1. County-level crop planting prediction
3.1.1. Region of interest
We selected Lancaster County, Nebraska, as the ROI to test the feasibility of the prediction model. The geography of the county is shown in Figure 7. It is located in East Nebraska (ASD #1960), which falls in the western part of the U.S. Corn Belt (Figure 7a). As one of the top agricultural production counties in Nebraska, Lancaster County has 418,347 acres of agricultural land, which accounts for 77% of the total land area. According to the 2018 CDL statistics (Figure 7b), corn and soybeans are the two dominant crops in this area, as in the entire U.S. Corn Belt. In addition, alfalfa accounts for 1% of croplands. In this experiment, we predicted and evaluated 12 land use classes, each of which takes up at least 1% of the total land area of the county.
Figure 7: Geography of Lancaster County, Nebraska: (a) location; (b) 2018 CDL statistics.
3.1.2. Evaluation of county-level crop planting prediction
We used the 2018 CDL data as reference data to evaluate the prediction result. The predicted crop planting map of 12 land use classes achieved an OA of 90% and a Kappa value of 0.86. Table 1 summarizes the performance of the model in the county-level test. The table is sorted by the support value, which represents the number of pixels of each class in the 2018 CDL data. Classes other than the 12 target land use classes were labeled as “others” during data preprocessing. The precision was at least 0.83 for every class except “others” (0.60) and alfalfa (0.64). The F1 score for most classes, including the two dominant crops, corn and soybean, is 0.9 or higher. However, the recall of “other hay/non-alfalfa” (0.12) and “others” (0.04) was low. There are two main reasons for the low recall. On the one hand, the “others” category mixes different classes into a single one, so most “others” pixels do not have clear crop sequence patterns. On the other hand, the crop categories of the CDL data may vary from year to year. For example, the 2010 CDL for Lancaster County does not include the “other hay/non-alfalfa” class.
Table 1: Performance of the 2018 prediction result for Lancaster County, Nebraska.
Class                      Support   Precision   Recall   F1 score
Soybean                    832519    0.94        0.86     0.90
Corn                       827942    0.87        0.93     0.90
Grassland/Pasture          724026    0.87        0.95     0.91
Deciduous Forest           212598    0.83        0.93     0.88
Developed/Low Intensity    176608    0.94        0.94     0.94
Developed/Open Space       149933    0.94        0.96     0.95
Developed/Med Intensity    83745     0.93        0.90     0.91
Other Hay/Non-Alfalfa      56394     0.95        0.12     0.22
Open Water                 56195     0.95        0.88     0.92
Developed/High Intensity   36420     0.90        0.96     0.93
Others                     34752     0.60        0.04     0.07
Alfalfa                    32025     0.64        0.66     0.65
The crop planting map is produced from the probability distribution of the SoftMax function. By filtering out the pixels of lower probability, a crop prediction map of high-probability pixels can be created. The trade-off is that the number of predicted pixels decreases as the probability threshold increases. Figure 8 shows the curves of prediction performance versus the probability threshold for corn, soybeans, and alfalfa from the prediction result.
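The filtering step and its trade-off can be illustrated with a short sketch. The function names and the four toy probability vectors are hypothetical; pixels whose highest SoftMax probability falls below the threshold are marked as `None` (no prediction).

```python
def filter_by_probability(probabilities, threshold):
    """Keep the arg-max class where its probability reaches the threshold."""
    kept = []
    for probs in probabilities:
        best = max(range(len(probs)), key=probs.__getitem__)
        kept.append(best if probs[best] >= threshold else None)
    return kept


def retained_fraction(kept):
    """Share of pixels that still carry a prediction after filtering."""
    return sum(k is not None for k in kept) / len(kept)


# Toy SoftMax outputs for four pixels and three classes.
probabilities = [
    [0.90, 0.05, 0.05],
    [0.50, 0.40, 0.10],
    [0.34, 0.33, 0.33],
    [0.10, 0.80, 0.10],
]

for threshold in (0.0, 0.6, 0.85):
    kept = filter_by_probability(probabilities, threshold)
    print(threshold, kept, retained_fraction(kept))
```

Raising the threshold from 0.0 to 0.6 drops the two ambiguous pixels, so the map covers half of the pixels but keeps only the confident ones.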
Figure 8: Curves of prediction performance vs. probability threshold.
3.1.3. Mapping of county-level crop planting prediction
The final output of the prediction framework is an annual crop planting map, which is similar to the CDL product, together with its probability map based on the SoftMax function. Figure 9 compares the probability map (Figure 9a) and the crop planting map (Figure 9b) of the machine-learned prediction with the CDL data (Figure 9c) of Lancaster County. In general, the spatial distribution of most croplands in the predicted crop planting map is consistent with the CDL. Only a small number of mispredicted pixels can be visually discerned.
Figure 9: Comparison of the machine-learned prediction and the CDL for Lancaster County, Nebraska: (a) probability map of the machine-learned prediction; (b) crop planting map of the machine-learned prediction; (c) 2018 CDL. The probability map represents the spatial distribution of the highest probability from the SoftMax function (the brighter the pixel, the higher the probability). All pixels of the crop planting map are categorized as one of the land use classes.
By treating the prediction for the coming year as a new reference layer, crop planting maps for subsequent years can be predicted recursively. For example, by using the prediction result of 2019 as new reference data, we can predict the crop planting map of 2020. Figure 10 compares the crop planting maps from 2017 to 2021. Comparing the 2017 CDL, the 2018 CDL, and the prediction maps from 2019 to 2021, we can observe many corn-soybean rotations over time.
Figure 10: Crop planting maps from 2017 to 2021.
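The recursive multi-year prediction behind Figure 10 amounts to sliding the eight-year window forward and appending each prediction to the series. The sketch below shows this for a single pixel. The function `predict_years` is hypothetical, and `toy_rotation_rule` is a stand-in for the trained model (it simply repeats the crop planted two years earlier); neither comes from the paper.

```python
def predict_years(history, predict_next, n_years, window=8):
    """Recursively extend a per-pixel crop series by n_years.

    history: dict mapping year -> class code for one pixel.
    predict_next: callable taking the last `window` codes, returning a code.
    """
    series = dict(history)
    last = max(series)
    for year in range(last + 1, last + 1 + n_years):
        features = tuple(series[y] for y in range(year - window, year))
        series[year] = predict_next(features)  # prediction becomes new reference
    return {y: series[y] for y in range(last + 1, last + 1 + n_years)}


def toy_rotation_rule(features):
    """Stand-in for the trained model: repeat the crop of two years ago."""
    return features[-2]


# Corn (1) and soybean (5) alternating from 2011 to 2018.
history = {year: (1 if year % 2 else 5) for year in range(2011, 2019)}
print(predict_years(history, toy_rotation_rule, n_years=3))
```

Since each step consumes earlier predictions rather than observed CDL values, any error made for 2019 is carried into the 2020 and 2021 inputs, so later years should presumably be read with more caution.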
3.2. Crop planting prediction for the U.S. Corn Belt
Following the success of the county-level crop prediction, we scaled up the proposed machine learning method to the ASD level and predicted the annual crop planting maps for the U.S. Corn Belt. There are two main differences in configuration between the county-level experiment and the ASD-level experiment. The first difference is the number of prediction categories. The county-level experiment predicted 12 land use classes, most of which are not crops. To highlight the capability of the proposed approach for crop planting prediction, this experiment focused only on the major crop types (i.e., corn and soybean) over the U.S. Corn Belt. Each pixel was predicted as one of three categories: corn, soybean, or others. The second difference is the selection of training samples. The ASD-level prediction handles many more pixels than the county-level prediction. To improve the training efficiency, each ASD-level CDL data set was resized, and only one of every ten pixels was used as a training sample.
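The two configuration changes can be sketched as follows. The text does not say how the one-in-ten selection was carried out, so the uniform random sampling used here is an assumption, as are the function names and the placeholder code 0 for the "others" category.

```python
import random

CORN, SOYBEAN = 1, 5  # CDL values used in the text
OTHERS = 0            # hypothetical placeholder code for the merged category


def to_three_classes(code):
    """Collapse every non-corn, non-soybean value into the others category."""
    return code if code in (CORN, SOYBEAN) else OTHERS


def subsample(rows, labels, keep_one_in=10, seed=0):
    """Keep roughly one of every `keep_one_in` samples (assumed: uniform random)."""
    rng = random.Random(seed)
    n_keep = max(1, len(rows) // keep_one_in)
    picked = sorted(rng.sample(range(len(rows)), n_keep))
    return [rows[i] for i in picked], [labels[i] for i in picked]


# Toy table: 100 samples of eight-year sequences with their labels.
rows = [tuple(to_three_classes(c) for c in (1, 5, 36, 1, 5, 176, 1, 5))] * 100
labels = [to_three_classes(1)] * 100

sub_rows, sub_labels = subsample(rows, labels)
print(len(sub_rows), sub_rows[0], sub_labels[0])
```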
To further verify the proposed model, we applied the same ANN framework to generate the crop planting maps of three consecutive years (2016–2018) and evaluated the results with the 2016 CDL, 2017 CDL, and 2018 CDL, respectively. Moreover, an annual crop planting map of 2019 was created. Table 2 lists the full data set used for predicting the crop planting maps from 2016 to 2019. Because historical CDL data for Minnesota, Ohio, and South Dakota are unavailable before 2006, only two subsets (CDL 2007–2014 and CDL 2006–2013) were used to build the recursive training set of the 2016 prediction model for some ASDs.
Table 2: Data sets used for predicting the crop planting maps of 2016–2019.
Year   Training set (label set)      Testing set     Ground reference set
2016   CDL 2007-2014 (CDL 2015)      CDL 2008-2015   CDL 2016
       CDL 2006-2013 (CDL 2014)
       CDL 2005-2012 (CDL 2013)*
2017   CDL 2008-2015 (CDL 2016)      CDL 2009-2016   CDL 2017
       CDL 2007-2014 (CDL 2015)
       CDL 2006-2013 (CDL 2014)
2018   CDL 2009-2016 (CDL 2017)      CDL 2010-2017   CDL 2018
       CDL 2008-2015 (CDL 2016)
       CDL 2007-2014 (CDL 2015)
2019   CDL 2010-2017 (CDL 2018)      CDL 2011-2018   N/A
       CDL 2009-2016 (CDL 2017)
       CDL 2008-2015 (CDL 2016)
*Unavailable for Minnesota, Ohio, and South Dakota
3.2.1. Evaluation of ASD-level crop planting prediction
Figure 11 illustrates the OA and Kappa score of all ASD-level predicted crop planting maps from 2016 to 2018. The prediction results of most ASDs achieve a high OA (>80%) and a fair Kappa score (>0.7). However, the overall performance for some ASDs located in South Dakota, North Dakota, and Ohio was not satisfactory.
To measure the overall prediction performance over the entire study area, we calculated the highest/lowest value, mean, and median of the OA, Kappa score, precision, recall, and F1 score of all ASD-level crop planting maps for each target year. Figure 12 presents the overall performance as box plots. A noticeable feature of the result is the relationship between year and performance. Although the trend is not dramatic, the OA and Kappa (Figure 12a), the precision of soybeans (Figure 12b), the recall of corn and soybeans (Figure 12c), and the F1 score of corn and soybeans (Figure 12d) gradually increased and reached their highest values in 2018. We attribute this trend to the year-over-year improvement of the CDL data quality. Since the training and testing sets were built entirely from the historical CDL time series, the prediction model becomes more reliable over time, and the machine-learned prediction of crop planting maps becomes more accurate.
3.2.2. Mapping of ASD-level crop planting prediction
A side-by-side comparison of the ASD-level crop planting prediction maps and CDL was presented in
Figure 13. The predicted spatial distribution maps of corn and soybeans over the U.S. Corn Belt from 2016
to 2018 are displayed in Figure 13a, Figure 13c, and Figure 13e, respectively. The details of two ROIs,
one is located at the ASD of median OA and the other is located at the ASD of median Kappa value, are
demonstrated in each figure. As a control, the CDL maps of the corresponding year are given in Figure 13b,
Figure 13d, and Figure 13f, respectively.
In addition, we made the predicted annual crop planting map for 2019 based on the latest CDL data.
Similar to the above implementation, the prediction model of 2019 was trained using the recursive training
sets of CDL time series of 2008–2015 labeled with 2016 CDL, CDL time series of 2009–2016 labeled with
2017 CDL, and CDL time series of 2010–2017 labeled with 2018 CDL. Then the crop planting prediction
map of 2019 was created by applying CDL time series of 2011–2018 to the trained model. Figure 14 depicts
the predicted crop planting map of 2019. Due to the lack of in-season remote sensing data and in-field survey
13
C. Zhang et al. doi: 10.1016/j.compag.2019.104989
(a) 2016
(b) 2017
(c) 2018
Figure 11: Evaluation of the machine-learned crop planting maps at ASD level.
(a) Overall performance (b) Precision (c) Recall (d) F1 score
Figure 12: Box plot of the overall prediction performance from 2016 to 2018. The upper and lower bounds of the box represent
the first and third quartiles. The cross mark, bar in the box, and vertical line indicate the mean, median, and minimum-maximum
bound, respectively. The outliers of each cluster are shown as the solid dot.
14
C. Zhang et al. doi: 10.1016/j.compag.2019.104989
(a) Machine-learned crop planting map of 2016 (b) CDL 2016
(c) Machine-learned crop planting map of 2017 (d) CDL 2017
(e) Machine-learned crop planting map of 2018 (f) CDL 2018
Figure 13: Comparison of ASD-level prediction result and CDL. Yellow pixels represent corn, green pixels represent soybean.
ROIs in each figure are located at the ASD of median OA and the ASD of median Kappa value respectively.
15
C. Zhang et al. doi: 10.1016/j.compag.2019.104989
(a) Probability map of machine-learned crop planting (b) Crop planting map of machine-learned prediction
Figure 14: Machine-learned prediction of annual crop planting map in 2019.
data, in real agricultural practice it is difficult to evaluate the accuracy of the machine-learned prediction
before the beginning of a growing season. Based on the historical validation results, the
machine-learned crop planting map of corn and soybeans is expected to reach 88% OA.
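The recursive training sets described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names `stack_years` and `build_recursive_training_set`, the dictionary layout `cdl_by_year`, and the random toy rasters are all assumptions made for the example.

```python
import numpy as np


def stack_years(cdl_by_year, first_year, last_year):
    """Stack yearly rasters into an (n_pixels, n_years) feature matrix."""
    years = range(first_year, last_year + 1)
    return np.stack([cdl_by_year[y].ravel() for y in years], axis=1)


def build_recursive_training_set(cdl_by_year, label_years, window=8):
    """Concatenate one (features, labels) subset per label year."""
    features, labels = [], []
    for label_year in label_years:
        # e.g. label year 2016 with window 8 uses the 2008-2015 time series
        features.append(stack_years(cdl_by_year, label_year - window, label_year - 1))
        labels.append(cdl_by_year[label_year].ravel())
    return np.concatenate(features), np.concatenate(labels)


# Toy stand-in for CDL rasters: random class codes on a 4 x 5 grid (hypothetical data).
rng = np.random.default_rng(0)
cdl_by_year = {year: rng.integers(0, 3, size=(4, 5)) for year in range(2008, 2019)}

# Three recursive subsets labeled with 2016, 2017, and 2018.
X_train, y_train = build_recursive_training_set(cdl_by_year, label_years=[2016, 2017, 2018])

# Input for the 2019 prediction: the 2011-2018 time series.
X_2019 = stack_years(cdl_by_year, 2011, 2018)

print(X_train.shape, y_train.shape, X_2019.shape)
```

Each pixel contributes one row per label year, so the three subsets triple the number of training samples relative to a single labeled year.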
3.3. Crop acreage estimation from crop planting map
3.3.1. Validation of crop acreage using CDL
Crop acreage is one of the most critical pieces of information in agricultural decision making. The machine-learned
crop planting prediction map can potentially be used to predict future crop acreage. This experiment aims
to evaluate the accuracy of crop acreage calculated from the machine-learned crop planting map. Figure 15
illustrates the correlation between ASD-level crop acreages derived from the machine-learned crop planting
prediction map and CDL for the entire study area. The figure shows a high
correlation (R² > 0.9) in all observed years (2016 to 2018). Specifically, the R² values for corn
and soybeans increased each year and reached their highest value in 2018.
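As an illustration of how acreage can be derived from a classified raster, the sketch below counts pixels per crop and converts the count to acres. It is a hedged example under stated assumptions: a 30 m pixel size and the CDL class codes 1 (corn) and 5 (soybeans) are assumed here rather than taken from this section, and `crop_acreage` is a hypothetical helper name.

```python
import numpy as np

SQ_M_PER_ACRE = 4046.8564224             # international acre in square metres
PIXEL_SIZE_M = 30.0                      # assumed raster resolution
CROP_CODES = {"corn": 1, "soybeans": 5}  # assumed CDL class codes


def crop_acreage(crop_map, pixel_size_m=PIXEL_SIZE_M):
    """Return acres per crop by counting pixels of each class code."""
    acres_per_pixel = pixel_size_m ** 2 / SQ_M_PER_ACRE
    return {
        name: float(np.count_nonzero(crop_map == code)) * acres_per_pixel
        for name, code in CROP_CODES.items()
    }


# Toy prediction map for one region (hypothetical data): 3 corn pixels, 2 soybean pixels.
toy_map = np.array([[1, 1, 5],
                    [5, 0, 1]])
acres = crop_acreage(toy_map)
print(acres)
```

Applying the same function to the predicted map and to the CDL of the same year, region by region, gives the paired acreage values that the correlation analysis compares.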
3.3.2. Validation of crop acreage using official statistics
To validate the predicted crop acreage against the official statistics, we obtained ASD-level data on acres
planted from the USDA NASS Iowa Field Office, which can be accessed at:
https://www.nass.usda.gov/Statistics_by_State/Iowa/index.php. According to the statistics from USDA NASS, Iowa ranks 1st in
the U.S. in corn production and 2nd in soybean production. Over 12.9 million acres of corn and 9.94 million acres
of soybeans were harvested in 2018. The NASS Crops/Stocks survey collects detailed estimates of crop
acreage from farm and ranch operators four times per year. In this experiment, we use the acres-planted data
collected in June. Figure 16 illustrates the correlation between the predicted crop acreage and the official
statistics of acres planted in Iowa. The result shows that R² is higher than 0.95 for both corn
and soybeans in all observed years.
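The correlation measure used in this comparison can be sketched as follows. The example assumes that R² is the squared Pearson correlation of a simple linear fit between paired acreage values, which this section does not state explicitly. The acreage numbers are hypothetical placeholders, not values from the study.

```python
import numpy as np


def r_squared(predicted, reference):
    """Squared Pearson correlation between two paired acreage series."""
    r = np.corrcoef(predicted, reference)[0, 1]
    return float(r ** 2)


# Hypothetical ASD-level acreages (acres); placeholders for illustration only.
predicted_acres = np.array([1.50e6, 1.62e6, 1.38e6, 1.71e6, 1.45e6])
official_acres = np.array([1.53e6, 1.60e6, 1.42e6, 1.75e6, 1.44e6])

r2 = r_squared(predicted_acres, official_acres)

# Slope and intercept of the least-squares line through the scatter plot.
slope, intercept = np.polyfit(official_acres, predicted_acres, 1)
print(r2, slope, intercept)
```

A high R² indicates that the two series vary together across regions. It does not by itself rule out a systematic offset, so the fitted slope and intercept are worth inspecting as well.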
(a) CDL vs. machine-learned predicted crop area for Iowa in 2016
(b) CDL vs. machine-learned predicted crop area for Iowa in 2017
(c) CDL vs. machine-learned predicted crop area for Iowa in 2018
Figure 15: Comparison of CDL and machine-learned predicted crop area at ASD level.
(a) Official statistics vs. machine-learned predicted crop area for Iowa in 2016
(b) Official statistics vs. machine-learned predicted crop area for Iowa in 2017
(c) Official statistics vs. machine-learned predicted crop area for Iowa in 2018
Figure 16: Comparison of official statistics and machine-learned crop acreage prediction at ASD level.
Table 3 summarizes the total acreage of each crop type for Iowa. The machine-learned
crop acreage predictions for corn are very close to the CDL data. The machine-learned crop acreage estimates
for soybeans, on the other hand, are slightly lower than but still close to the CDL data. For soybeans, both the
machine-learned and CDL acreages are lower than the official statistics in all three years; for corn, this holds only in 2016.
Table 3: Summary of crop acreage estimates (acres) for Iowa from 2016 to 2018

        Corn                                                  Soybeans
Year    Machine-learned  CDL         Official Statistics      Machine-learned  CDL        Official Statistics
2016    13,607,650       13,628,727  13,900,000               8,971,998        9,232,107  9,500,000
2017    13,420,096       13,216,879  13,300,000               9,401,789        9,785,308  10,000,000
2018    13,704,565       13,537,935  13,200,000               9,535,296        9,959,717  9,996,000
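The relative differences implied by Table 3 can be checked with a short calculation. The values below are copied from the table; the dictionary layout `table3` and the helper name `pct_diff` are illustrative choices.

```python
# (machine-learned, CDL, official statistics) in acres, copied from Table 3
table3 = {
    2016: {"corn": (13_607_650, 13_628_727, 13_900_000),
           "soybeans": (8_971_998, 9_232_107, 9_500_000)},
    2017: {"corn": (13_420_096, 13_216_879, 13_300_000),
           "soybeans": (9_401_789, 9_785_308, 10_000_000)},
    2018: {"corn": (13_704_565, 13_537_935, 13_200_000),
           "soybeans": (9_535_296, 9_959_717, 9_996_000)},
}


def pct_diff(value, reference):
    """Signed difference of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference


for year, crops in table3.items():
    for crop, (ml, cdl, official) in crops.items():
        print(f"{year} {crop:9s} ML vs CDL: {pct_diff(ml, cdl):+.2f}%  "
              f"ML vs official: {pct_diff(ml, official):+.2f}%  "
              f"CDL vs official: {pct_diff(cdl, official):+.2f}%")
```

A negative sign means the first quantity is lower than the reference, which makes it easy to see in which years and for which crop the map-based estimates fall below the official statistics.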
4. Discussion
4.1. Applications of machine-learned crop planting map
This study provides a new perspective for crop planting prediction. Different from the in-season and
post-season crop mapping, the proposed approach can predict the spatial distribution of field-level crop
planting before the beginning of growing season. The machine-learned crop planting map can be used to
support many agricultural applications and decision makings, such as crop acreage estimation, crop yield
prediction, crop modeling, and agricultural commodity trading. More importantly, the predicted pixels
with high probability can be used as the field-level reference data to facilitate many early-season/in-season
LULC studies, especially the machine learning applications which require a large amount of reference data as
training labels. Also, the prediction of spatial distribution of crop planting will provide valuable information
for agriculture policymakers as well as the agriculture companies.
4.2. Why use 8-year recursive training sets?
The number of recursive subsets and the moving-window length of the
historical CDL time series can be combined in different ways. For example, we can choose either three recursive subsets with an 8-year moving window or
just one subset with a 10-year moving window. While designing the structure of the training set, multiple
combinations were considered. In our tests, the combination of three recursive subsets
with an 8-year moving window reached the best performance. This result could be caused by the following
reasons. First, the quality of the early-year CDL data is not as good as the latest CDL data. There are
many pixels misclassified or covered with cloud before 2005. If we build the training set based on a very long
time series, these pixels with incorrect information will affect the prediction performance. Second, different
cropland units might have different crop sequences. If the observed time series is not long enough, the pattern
of crop sequence cannot be well learned by the prediction model. Third, the recursive training set contains
crop planting information for the last three consecutive years. A prediction model trained with a single-year
labeled training set, by contrast, only refers to the last year's crop planting results. In short, the moving
window of the training set should be neither too long nor too short, but should cover the period of most crop rotation
patterns.
4.3. Advantages of the proposed framework
Here we summarize some advantages of the proposed machine learning framework. First, the proposed
framework is easy to implement: the only required input is the historical CDL time series data. Once the
process is done, a geo-referenced raster map of the machine-learned crop planting prediction is generated.
No remote sensing data or in-season crop planting information is needed.
The prediction model can learn many intricate crop sequence patterns. It is well known that many
croplands in the U.S. Corn Belt have specific crop rotation patterns such as monocropping or alternate
cropping, but many of them do not strictly follow one pattern over a long period. The neural network can
learn not only regular crop rotation patterns but also some irregular patterns.
The machine-learned crop planting prediction map is produced based on the probability distribution from
the softmax layer. By setting a threshold on the probability distribution, a crop planting prediction map with
a higher probability of agreement can be created. For example, if we set the probability threshold to 0.95,
the predicted pixels are expected to have a higher agreement with the CDL. The cost is that, as the threshold
increases, more pixels are filtered out.
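The trade-off between agreement and coverage can be illustrated with a small sketch. It assumes the model output is an array of per-pixel class probabilities with one row per pixel; the function name `threshold_prediction`, the no-data value of -1, and the toy probabilities are hypothetical.

```python
import numpy as np


def threshold_prediction(probabilities, threshold=0.95, nodata=-1):
    """Keep the most probable class only where its probability meets the threshold.

    probabilities: (n_pixels, n_classes) array whose rows sum to 1.
    Returns the label array and the fraction of pixels retained.
    """
    best_class = probabilities.argmax(axis=1)
    best_prob = probabilities.max(axis=1)
    keep = best_prob >= threshold
    labels = np.where(keep, best_class, nodata)
    return labels, float(keep.mean())


# Toy softmax output for four pixels and three classes (hypothetical values).
probs = np.array([
    [0.97, 0.02, 0.01],
    [0.60, 0.35, 0.05],
    [0.03, 0.96, 0.01],
    [0.50, 0.49, 0.01],
])

labels, retained = threshold_prediction(probs, threshold=0.95)
print(labels, retained)
```

Lowering the threshold retains more pixels at the price of including less certain predictions, which is the cost described above.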
Additionally, the proposed machine learning framework is flexible and extendable, and can be used for
the prediction of crop mapping products other than CDL. For example, by applying the time series of the Annual
Crop Inventory, the annual crop mapping product of Canada released by Agriculture and Agri-Food
Canada (AAFC), to the framework, a crop planting map of Canada can potentially be predicted.
4.4. Limitations and potential solutions
There are still some limitations in the current implementation of the proposed model. This study uses
the historical CDL to build the training and testing sets and to evaluate the predicted crop planting map. However,
CDL is not ground truth data, even though it reaches high accuracy (>95%) for classifying major crop
types over CONUS. Therefore, errors cannot be avoided in a prediction result based on the historical CDL
data. The experiments (Section 3.1 and Section 3.2) used CDL as reference data to evaluate the prediction
result because ground truth data were unavailable. Strictly speaking, the evaluation result is close to
the truth but not absolutely accurate. To further evaluate the prediction result, ground truth data are
required.
Although the prediction results for most ASDs were good, the overall performance for some specific
regions, such as North Dakota (ASD #3850, #3860, and #3890), was not satisfactory. Because North Dakota
has a more diverse landscape, the main crops over these ASDs are not only corn and soybeans but also many
other crops. The crop sequence of such croplands could be more intricate and diverse. To let the machine fully
learn the potential crop sequence patterns, a long historical observation is required. A potential solution is
to increase the moving window size of the training samples so that more historical crop planting information is
included in the training set.
Mixed pixels, which are located on the margin of each land unit, may provide wrong reference data in
both the training and testing sets. Even though these pixels account for a small proportion of all CDL pixels, they
might still affect the performance of the prediction result. Therefore, the performance of the prediction model
can be improved by removing these mixed pixels from the training set.
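One possible way to approximate this filtering is sketched below. It treats a pixel as interior only if all eight neighbors carry the same class code, so pixels on the margin of a land unit are flagged as potentially mixed. This neighborhood rule is an assumption made for illustration, not a procedure described in this study, and `interior_mask` is a hypothetical helper.

```python
import numpy as np


def interior_mask(crop_map):
    """True where a pixel and all eight of its neighbors share one class code."""
    rows, cols = crop_map.shape
    # Replicate edge values so pixels on the raster border are not flagged by default.
    padded = np.pad(crop_map, 1, mode="edge")
    mask = np.ones(crop_map.shape, dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            neighbor = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            mask &= neighbor == crop_map
    return mask


# Toy map with two adjacent fields (hypothetical data): class 1 on the left, class 5 on the right.
toy_map = np.array([[1, 1, 1, 5, 5, 5]] * 4)
mask = interior_mask(toy_map)

# Keep only interior pixels as training samples.
training_pixels = toy_map[mask]
print(mask.astype(int))
```

In this toy map the two columns along the field boundary are dropped and the remaining pixels are kept. On real data a one-pixel margin may be too strict for narrow fields, so the neighborhood size would need to be chosen with care.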
Since this study did not focus on the design and development of new deep learning algorithms, the current
ANN implementation can be further improved. To achieve the optimal performance, the hyperparameters,
such as learning rates, weights, optimizer, loss function, activation function, and batch size, of the prediction
model need to be specifically tuned for each ASD or county.
The proposed method only learns the pattern from the historical crop planting maps. Therefore, it is
still a challenge to correctly predict the crop type for land units that break the pattern in the coming year.
Reasons for breaking the patterns are complex, which could be related to many dynamic and uncertain factors,
such as market situation, government policy, socioeconomic factors, weather (rainfall and temperature), the
efficiency of irrigation systems, quality of crop seeds, soil quality, and natural hazards. To further optimize
the ANN framework, other information, such as historical and future agricultural commodity prices, needs to
be considered.
5. Conclusion and future works
This study explored the feasibility of using machine learning to predict the field-level annual crop planting
map based on the historical CDL data. An end-to-end machine learning framework based on multi-layer ANN
and recursive training set was developed and demonstrated. The experiment result for Lancaster County,
Nebraska shows the machine-learned crop planting map can reach 90% agreement with the CDL data.
By scaling up the proposed approach to the U.S. Corn Belt, we can conclude the ASD-level machine-learned
crop planting map is expected to reach 88% agreement with the future CDL. Meanwhile, the ASD-level crop
acreage calculated from the machine-learned result is highly correlated (R² > 0.9) with the CDL statistics.
According to the case study of Iowa, the ASD-level crop acreage of the machine-learned result is highly
correlated (R² > 0.95) with the official statistics. Considering the crop planting map is generated without any
in-season satellite images or in-field surveys, the performance of the proposed prediction model is satisfactory.
In the future, we will improve the proposed framework and test more state-of-the-art neural networks with
more crop types other than corn and soybeans. In addition, we will use the machine-learned crop planting
map as the reference data to conduct in-season crop type classification once the in-season satellite images
become available.
Acknowledgment
This research is supported by a grant from National Science Foundation INFEWS program (Grant #:
CNS-1739705, PI: Dr. Liping Di). The authors would like to thank two anonymous reviewers for their
valuable and constructive comments.
References
Aurbacher, J., Dabbert, S., 2011. Generating crop sequences in land-use models using maximum entropy
and Markov chains. Agricultural Systems 104, 470–479. doi: 10.1016/j.agsy.2011.03.004.
Belgiu, M., Csillik, O., 2018. Sentinel-2 cropland mapping using pixel-based and object-based time-weighted
dynamic time warping analysis. Remote Sensing of Environment 204, 509–523.
doi: 10.1016/j.rse.2017.10.005.
Bolton, D.K., Friedl, M.A., 2013. Forecasting crop yield using remotely sensed vegetation indices and crop
phenology metrics. Agricultural and Forest Meteorology 173, 74–84.
doi: 10.1016/j.agrformet.2013.01.007.
Boryan, C., Yang, Z., Mueller, R., Craig, M., 2011. Monitoring US agriculture: the US Department of
Agriculture, National Agricultural Statistics Service, Cropland Data Layer Program. Geocarto International
26, 341–358. doi: 10.1080/10106049.2011.562309.
Brown, M.E., 2016. Remote sensing technology and land use analysis in food security assessment. Journal
of Land Use Science 11, 623–641. doi: 10.1080/1747423X.2016.1195455.
Chen, Y.R., Chao, K., Kim, M.S., 2002. Machine vision technology for agricultural applications. Computers
and Electronics in Agriculture 36, 173–191. doi: 10.1016/S0168-1699(02)00100-X.
Cohen, J., 1960. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement
20, 37–46. doi: 10.1177/001316446002000104.
Corner, R.J., Dewan, A.M., Chakma, S., 2014. Monitoring and Prediction of Land-Use and Land-Cover
(LULC) Change, in: Dewan, A., Corner, R. (Eds.), Dhaka Megacity: Geospatial Perspectives on
Urbanisation, Environment and Health. Springer Netherlands, Dordrecht. Springer Geography, pp. 75–97.
doi: 10.1007/978-94-007-6735-5_5.
Dabrowska-Zielinska, K., Kogan, F., Ciolkosz, A., Gruszczynska, M., Kowalik, W., 2002. Modelling of crop
growth conditions and crop yield in Poland using AVHRR-based indices. International Journal of Remote
Sensing 23, 1109–1123. doi: 10.1080/01431160110070744.
Dahal, D., Wylie, B., Howard, D., 2018. Rapid Crop Cover Mapping for the Conterminous United States.
Scientific Reports 8, 8631. doi: 10.1038/s41598-018-26284-w.
Di, L., Yu, E.G., Kang, L., Shrestha, R., Bai, Y., 2017. RF-CLASS: A remote-sensing-based flood crop loss
assessment cyber-service system for supporting crop statistics and insurance decision-making. Journal of
Integrative Agriculture 16, 408–423. doi: 10.1016/S2095-3119(16)61499-5.
Doraiswamy, P.C., Moulin, S., Cook, P.W., Stern, A., 2003. Crop Yield Assessment from Remote Sensing.
Photogrammetric Engineering & Remote Sensing 69, 665–674. doi: 10.14358/PERS.69.6.665.
Dreyer, P., 1993. Classification of land cover using optimized neural nets on SPOT data. Photogrammetric
Engineering & Remote Sensing 59, 617–621.
Foody, G.M., 1995. Land cover classification by an artificial neural network with ancillary information.
International Journal of Geographical Information Systems 9, 527–542. doi: 10.1080/02693799508902054.
Grinblat, G.L., Uzal, L.C., Larese, M.G., Granitto, P.M., 2016. Deep learning for plant identification using
vein morphological patterns. Computers and Electronics in Agriculture 127, 418–424.
doi: 10.1016/j.compag.2016.07.003.
Halmy, M.W.A., Gessler, P.E., Hicke, J.A., Salem, B.B., 2015. Land use/land cover change detection and
prediction in the north-western coastal desert of Egypt using Markov-CA. Applied Geography 63, 101–112.
doi: 10.1016/j.apgeog.2015.06.015.
Han, W., Yang, Z., Di, L., Mueller, R., 2012. CropScape: A Web service based application for exploring and
disseminating US conterminous geospatial cropland data products for decision support. Computers and
Electronics in Agriculture 84, 111–123. doi: 10.1016/j.compag.2012.03.005.
Hao, P., Löw, F., Biradar, C., 2018a. Annual Cropland Mapping Using Reference Landsat Time Series—A
Case Study in Central Asia. Remote Sensing 10, 2057. doi: 10.3390/rs10122057.
Hao, P., Tang, H., Chen, Z., Liu, Z., 2018b. Early-season crop mapping using improved artificial immune
network (IAIN) and Sentinel data. PeerJ 6, e5431. doi: 10.7717/peerj.5431.
Hao, P., Wang, L., Niu, Z., 2015. Comparison of Hybrid Classifiers for Crop Classification Using Normalized
Difference Vegetation Index Time Series: A Case Study for Major Crops in North Xinjiang, China. PLOS
ONE 10, e0137748. doi: 10.1371/journal.pone.0137748.
Kamilaris, A., Prenafeta-Boldú, F.X., 2018. Deep learning in agriculture: A survey. Computers and
Electronics in Agriculture 147, 70–90. doi: 10.1016/j.compag.2018.02.016.
Kastens, J.H., Kastens, T.L., Kastens, D.L.A., Price, K.P., Martinko, E.A., Lee, R.Y., 2005. Image masking
for crop yield forecasting using AVHRR NDVI time series imagery. Remote Sensing of Environment 99,
341–356. doi: 10.1016/j.rse.2005.09.010.
Kaul, M., Hill, R.L., Walthall, C., 2005. Artificial neural networks for corn and soybean yield prediction.
Agricultural Systems 85, 1–18. doi: 10.1016/j.agsy.2004.07.009.
Kaya, A., Keceli, A.S., Catal, C., Yalic, H.Y., Temucin, H., Tekinerdogan, B., 2019. Analysis of transfer
learning for deep neural network based plant classification models. Computers and Electronics in Agriculture
158, 20–29. doi: 10.1016/j.compag.2019.01.041.
Khanal, S., Fulton, J., Shearer, S., 2017. An overview of current and potential applications of thermal remote
sensing in precision agriculture. Computers and Electronics in Agriculture 139, 22–32.
doi: 10.1016/j.compag.2017.05.001.
King, L., Adusei, B., Stehman, S.V., Potapov, P.V., Song, X.P., Krylov, A., Di Bella, C., Loveland, T.R.,
Johnson, D.M., Hansen, M.C., 2017. A multi-resolution approach to national-scale cultivated area
estimation of soybean. Remote Sensing of Environment 195, 13–29. doi: 10.1016/j.rse.2017.03.047.
Kingma, D.P., Ba, J., 2014. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 [cs].
Kussul, N., Lavreniuk, M., Skakun, S., Shelestov, A., 2017. Deep Learning Classification of Land Cover and
Crop Types Using Remote Sensing Data. IEEE Geoscience and Remote Sensing Letters 14, 778–782. doi:
10.1109/LGRS.2017.2681128.
Kussul, N., Skakun, S., Shelestov, A., Lavreniuk, M., Yailymov, B., Kussul, O., 2015. Regional scale
crop mapping using multi-temporal satellite imagery, in: ISPRS - International Archives of the
Photogrammetry, Remote Sensing and Spatial Information Sciences, Copernicus GmbH. pp. 45–52.
doi: 10.5194/isprsarchives-XL-7-W3-45-2015.
Kuwata, K., Shibasaki, R., 2015. Estimating crop yields with deep learning and remotely sensed data,
in: 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 858–861. doi:
10.1109/IGARSS.2015.7325900.
Lark, T.J., Mueller, R.M., Johnson, D.M., Gibbs, H.K., 2017. Measuring land-use and land-cover change using
the U.S. department of agriculture’s cropland data layer: Cautions and recommendations. International
Journal of Applied Earth Observation and Geoinformation 62, 224–235.
doi: 10.1016/j.jag.2017.06.007.
Lee, S.H., Chan, C.S., Wilkin, P., Remagnino, P., 2015. Deep-plant: Plant identification with convolutional
neural networks, in: 2015 IEEE International Conference on Image Processing (ICIP), pp. 452–456. doi:
10.1109/ICIP.2015.7350839.
Liaghat, 2010. A Review: The Role of Remote Sensing in Precision Agriculture. American Journal of
Agricultural and Biological Sciences 5, 50–55. doi: 10.3844/ajabssp.2010.50.55.
Lin, L., Di, L., Tang, J., Yu, E., Zhang, C., Rahman, M.S., Shrestha, R., Kang, L., 2019. Improvement
and Validation of NASA/MODIS NRT Global Flood Mapping. Remote Sensing 11, 205.
doi: 10.3390/rs11020205.
Liu, X., Chen, F., Barlage, M., Zhou, G., Niyogi, D., 2016. Noah-MP-Crop: Introducing dynamic crop growth
in the Noah-MP land surface model. Journal of Geophysical Research: Atmospheres 121, 13,953–13,972.
doi: 10.1002/2016JD025597.
Lu, L., Di, L., Ye, Y., 2014. A Decision-Tree Classifier for Extracting Transparent Plastic-Mulched Landcover
from Landsat-5 TM Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote
Sensing 7, 4548–4558. doi: 10.1109/JSTARS.2014.2327226.
McNairn, H., Kross, A., Lapen, D., Caves, R., Shang, J., 2014. Early season monitoring of corn and
soybeans with TerraSAR-X and RADARSAT-2. International Journal of Applied Earth Observation and
Geoinformation 28, 252–259. doi: 10.1016/j.jag.2013.12.015.
Mohanty, S.P., Hughes, D.P., Salathé, M., 2016. Using Deep Learning for Image-Based Plant Disease
Detection. Frontiers in Plant Science 7. doi: 10.3389/fpls.2016.01419.
Mondal, M.S., Sharma, N., Garg, P.K., Kappas, M., 2016. Statistical independence test and validation of
CA Markov land use land cover (LULC) prediction results. The Egyptian Journal of Remote Sensing and
Space Science 19, 259–272. doi: 10.1016/j.ejrs.2016.08.001.
Mulla, D.J., 2013. Twenty five years of remote sensing in precision agriculture: Key advances and remaining
knowledge gaps. Biosystems Engineering 114, 358–371. doi: 10.1016/j.biosystemseng.2012.08.009.
NASS, 2019. Cropland Data Layer Releases.
URL: https://www.nass.usda.gov/Research_and_Science/Cropland/Release/.
Nevavuori, P., Narra, N., Lipping, T., 2019. Crop yield prediction with deep convolutional neural networks.
Computers and Electronics in Agriculture 163, 104859. doi: 10.1016/j.compag.2019.104859.
Nguyen, L.H., Joshi, D.R., Clay, D.E., Henebry, G.M., 2018. Characterizing land cover/land use from multiple
years of Landsat and MODIS time series: A novel approach using land surface phenology modeling and
random forest classifier. Remote Sensing of Environment doi: 10.1016/j.rse.2018.12.016.
Osman, J., Inglada, J., Dejoux, J.F., 2015. Assessment of a Markov logic model of crop rotations for early crop
mapping. Computers and Electronics in Agriculture 113, 234–243. doi: 10.1016/j.compag.2015.02.015.
Pantazi, X.E., Moshou, D., Alexandridis, T., Whetton, R.L., Mouazen, A.M., 2016. Wheat yield prediction
using machine learning and advanced sensing techniques. Computers and Electronics in Agriculture 121,
57–65. doi: 10.1016/j.compag.2015.11.018.
Phalke, A.R., Özdoğan, M., 2018. Large area cropland extent mapping with Landsat data and a generalized
classifier. Remote Sensing of Environment 219, 180–195. doi: 10.1016/j.rse.2018.09.025.
Potgieter, A.B., Apan, A., Hammer, G., Dunn, P., 2010. Early-season crop area estimates for winter crops
in NE Australia using MODIS satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing
65, 380–387. doi: 10.1016/j.isprsjprs.2010.04.004.
Prasad, A.K., Chai, L., Singh, R.P., Kafatos, M., 2006. Crop yield estimation model for Iowa using remote
sensing and surface parameters. International Journal of Applied Earth Observation and Geoinformation
8, 26–33. doi: 10.1016/j.jag.2005.06.002.
Qu, C., Hao, X., 2018. Agriculture Drought and Food Security Monitoring Over the Horn of Africa (HOA)
from Space, in: 2018 7th International Conference on Agro-geoinformatics (Agro-geoinformatics), pp. 1–4.
doi: 10.1109/Agro-Geoinformatics.2018.8476128.
Quarmby, N.A., Milnes, M., Hindle, T.L., Silleos, N., 1993. The use of multi-temporal NDVI measurements
from AVHRR data for crop yield estimation and prediction. International Journal of Remote Sensing 14,
199–210. doi: 10.1080/01431169308904332.
Resop, J.P., Fleisher, D.H., Wang, Q., Timlin, D.J., Reddy, V.R., 2012. Combining explanatory crop models
with geospatial data for regional analyses of crop yield using field-scale modeling units. Computers and
Electronics in Agriculture 89, 51–61. doi: 10.1016/j.compag.2012.08.001.
Ritter, N.D., Hepner, G.F., 1990. Application of an artificial neural network to land-cover classification of
thematic mapper imagery. Computers & Geosciences 16, 873–880. doi: 10.1016/0098-3004(90)90009-I.
Roy, D.P., Yan, L., 2018. Robust Landsat-based crop time series modelling. Remote Sensing of Environment
doi: 10.1016/j.rse.2018.06.038.
Sahajpal, R., Zhang, X., Izaurralde, R.C., Gelfand, I., Hurtt, G.C., 2014. Identifying representative crop
rotation patterns and grassland loss in the US Western Corn Belt. Computers and Electronics in Agriculture
108, 173–182. doi: 10.1016/j.compag.2014.08.005.
Seelan, S.K., Laguette, S., Casady, G.M., Seielstad, G.A., 2003. Remote sensing applications for precision
agriculture: A learning community approach. Remote Sensing of Environment 88, 157–169.
doi: 10.1016/j.rse.2003.04.007.
Shan, J., Hussain, E., Kim, K., Biehl, L., 2010. Flood Mapping with Satellite Images and its Web Service.
Photogrammetric Engineering & Remote Sensing 76, 102–105.
Shrestha, R., Di, L., Yu, E.G., Kang, L., Shao, Y., Bai, Y., 2017. Regression model to estimate flood impact
on corn yield using MODIS NDVI and USDA cropland data layer. Journal of Integrative Agriculture 16,
398–407. doi: 10.1016/S2095-3119(16)61502-2.
Singh, S.K., Mustak, S., Srivastava, P.K., Szabó, S., Islam, T., 2015. Predicting Spatial and Decadal LULC
Changes Through Cellular Automata Markov Chain Models Using Earth Observation Datasets and
Geo-information. Environmental Processes 2, 61–78. doi: 10.1007/s40710-015-0062-x.
Skakun, S., Franch, B., Vermote, E., Roger, J.C., Becker-Reshef, I., Justice, C., Kussul, N., 2017. Early season
large-area winter crop mapping using MODIS NDVI data, growing degree days information and a Gaussian
mixture model. Remote Sensing of Environment 195, 244–258. doi: 10.1016/j.rse.2017.04.026.
Surkan, A.J., Di, L., 1989. Fast trainable pattern classification by a modification of Kanerva’s SDM model,
in: International 1989 Joint Conference on Neural Networks, pp. 347–349 vol.1.
doi: 10.1109/IJCNN.1989.118607.
Teal, R.K., Tubana, B., Girma, K., Freeman, K.W., Arnall, D.B., Walsh, O., Raun, W.R., 2006. In-Season
Prediction of Corn Grain Yield Potential Using Normalized Difference Vegetation Index. Agronomy Journal
98, 1488–1494. doi: 10.2134/agronj2006.0103.
Thenkabail, P., Lyon, J.G., Turral, H., Biradar, C., 2009. Remote Sensing
of Global Croplands for Food Security. CRC Press. doi: 10.1201/9781420090109.
Vaudour, E., Noirot-Cosson, P.E., Membrive, O., 2015. Early-season mapping of crops and cultural operations
using very high spatial resolution Pléiades images. International Journal of Applied Earth Observation
and Geoinformation 42, 128–141. doi: 10.1016/j.jag.2015.06.003.
Wiegand, C.L., Richardson, A.J., Escobar, D.E., Gerbermann, A.H., 1991. Vegetation indices in crop
assessments. Remote Sensing of Environment 35, 105–119. doi: 10.1016/0034-4257(91)90004-P.
Xiao, Y., Mignolet, C., Mari, J.F., Benoît, M., 2014. Modeling the spatial distribution of crop sequences at
a large regional scale using land-cover survey data: A case from France. Computers and Electronics in
Agriculture 102, 51–63. doi: 10.1016/j.compag.2014.01.010.
Xu, X., Du, Z., Zhang, H., 2016. Integrating the system dynamic and cellular automata models to predict
land use and land cover change. International Journal of Applied Earth Observation and Geoinformation
52, 568–579. doi: 10.1016/j.jag.2016.07.022.
Yagci, A.L., Di, L., Deng, M., 2015. The effect of corn–soybean rotation on the NDVI-based drought
indicators: a case study in Iowa, USA, using Vegetation Condition Index. GIScience & Remote Sensing
52, 290–314. doi: 10.1080/15481603.2015.1038427.
Yang, Z., Wu, W.b., Di, L., Üstündağ, B., 2017. Remote sensing for agricultural applications. Journal of
Integrative Agriculture 16, 239–241. doi: 10.1016/S2095-3119(16)61549-6.
Yoshida, T., Omatu, S., 1994. Neural network approach to land cover mapping. IEEE Transactions on
Geoscience and Remote Sensing 32, 1103–1109. doi: 10.1109/36.312899.
Zhang, C., Di, L., Lin, L., Guo, L., 2019a. Extracting Trusted Pixels from Historical Cropland Data Layer
Using Crop Rotation Patterns: A Case Study in Nebraska, USA, in: 2019 8th International
Conference on Agro-Geoinformatics (Agro-Geoinformatics), pp. 1–6.
doi: 10.1109/Agro-Geoinformatics.2019.8820236.
Zhang, C., Di, L., Yang, Z., Lin, L., Yu, E.G., Yu, Z., Rahman, M.S., Zhao, H., 2019b. Cloud Environment for
Disseminating NASS Cropland Data Layer, in: 2019 8th International Conference on Agro-Geoinformatics
(Agro-Geoinformatics), pp. 1–5. doi: 10.1109/Agro-Geoinformatics.2019.8820465.
Zhong, L., Hu, L., Zhou, H., 2019. Deep learning based multi-temporal crop classification. Remote Sensing
of Environment 221, 430–443. doi: 10.1016/j.rse.2018.11.032.
Zhong, L., Yu, L., Li, X., Hu, L., Gong, P., 2016. Rapid corn and soybean mapping in US Corn Belt and
neighboring areas. Scientific Reports 6, 36240. doi: 10.1038/srep36240.
28
... For instance, a boundary pixel containing different crops may produce distinctive spectral signatures and lead to misclassification depending on the thresholds used in the production of CDL 10 . This error is mainly because the mapping unit (30 m) is too coarse to capture the variation within a pixel 9,16 . To identify false classifications, we proposed a spatial-temporal decision tree algorithm. ...
... CDL codes and class names that were labeled as constant features in the algorithm9,13 . ...
Article
Full-text available
Space-based crop identification and acreage estimation have played a significant role in agricultural studies in recent years, due to the development of Remote Sensing technology. The Cropland Data Layer (CDL), which was developed by the U.S. Department of Agriculture (USDA), has been widely used in agricultural studies and achieved massive success in recent years. Although the CDL’s accuracy assessments report high overall accuracy on various crops classifications, misclassification is still common and easy to discern from visual inspection. This study is aimed to identify and resolve inaccurate crop classification in CDL. A decision tree method was employed to find questionable pixels and refine them with spatial and temporal crop information. The refined data was then evaluated with high-resolution satellite images and official acreage estimates from USDA. Two validation experiments were also developed to examine the data at both the pixel and county level. Data generated from this research was published online in two repositories, while both applications allow users to download the entire dataset at no cost.
... The agri-food supply chain can become uncertain because the collaboration along the supply chain involves complex factors such as an uncontrolled environment [47]. This can unbalance demand and supply, and when it becomes critical (e.g., by causing under or oversupply and demand), then the damage can be severe. ...
Article
Full-text available
An increasing complication due to the rise of dynamic trades and global industry causes a burden in decision-making. There is a need for multi-level perspective factors in supply chain management, such as short-long terms of demand and supply, and their impact on agricultural market dynamics. In this study, Big data is proposed as supply chain open data sensors for data digitization to deal with the problem. Although Big data supports comprehensive, real-time sources, and provides information about market functions, traditional machine learning technologies have proved insufficient for dealing with Big data characteristics. We then propose a time-series decomposition approach for extracting contexts about short-long term impacts to provide insights into Big data for determining market demand and supply. Our agri-big data digitization reveals the significant information about Big data with the better predictive ability and can support agri-big data analysis using any kind of machine learning model.
... First, crop type information is required when detecting crop phenology using SMF-S. Identifying crop types before the growing season may be possible only in agricultural areas with regular crop rotations, as the current-year crop types in these areas can be predicted from historical crop planting maps (Zhang et al., 2019). Recently, some studies have also conducted in-season crop classifications (e.g., Johnson and Mueller, 2021). ...
Article
Crop phenology provides important information for crop growth management and yield estimations. The popular shape model fitting (SMF) method detects crop phenology from vegetation index (VI) time-series data, but it has two limitations. First, SMF assumes the same “relative position” of phenological stages for the pixels of the same crop type. This assumption is valid only if all target pixels, relative to the shape model, display a synchronized increase (or decrease) in length between any two phenological stages, which is uncommon in practice. Second, the variance in the resulting phenology estimates for a particular phenological stage is related to the stage itself; this makes it challenging to simulate spatial and temporal variations in crop phenology using SMF. Here, we address both limitations by developing the shape model fitting by the Separate phenological stage method (“SMF-S”). SMF-S uses a modified fitting function and an iterative procedure to match the shape model with the VI time series for each phenological stage in an adaptive local window. Comparisons between SMF-S and SMF in simulation experiments show the superior performance of SMF-S in different scenarios, regardless of noise. Comparisons involving winter wheat field observations from the North China Plain showed that the RMSE values averaged over nine phenological stages were smaller for SMF-S (RMSE = 9.5 d) than for SMF (RMSE = 13.4 d) and one variant of SMF (the shape model with accumulated growing degree days (SM-AGDD); RMSE = 33.6 d). Moreover, SMF-S better described the spatial variations (i.e., variance) in the results and captured the temporal shifts in multiple phenological stages. In the derived regional phenology maps of winter wheat on the North China Plain, SMF-S generated more reasonable spatial patterns, whereas SMF underestimated (overestimated) the variance in the early (late) phenological stages. We expect that the improved crop phenology estimates obtained with SMF-S could benefit various agricultural activities.
... Geospatial information has been proven effective in supporting the understanding of, preparation for, response to, and recovery from natural disasters. The establishment of data, models, computing, services, and standards for enabling the integrated development of monitoring, rapid assessment, emergency communication, and decision-making is the key to improving efficiency and effectiveness in all phases of disaster emergency activities [10][11][12]. It helps disaster agencies and communities to enhance the current systems and policies, to better understand how relevant geospatial information can be qualified, integrated, and shared more efficiently, and to have an SDI already-in-place platform when disaster strikes [13][14][15]. ...
Article
Natural disaster response and assessment are key elements of natural hazard monitoring and risk management. Currently, the existing systems are not able to meet the specific needs of many regional stakeholders worldwide; traditional approaches with field surveys are labor-intensive, time-consuming, and expensive, especially for severe disasters that affect a large geographic area. Recent studies have demonstrated that Earth observation (EO) data and technologies provide powerful support for the natural disaster emergency response. However, challenges still exist in support of the entire disaster lifecycle—preparedness, response, and recovery—which build the gaps between the disaster Spatial Data Infrastructure (SDI) already-in-place requirements and the EO capabilities. In order to tackle some of the above challenges, this paper demonstrates how to facilitate typhoon-triggered flood disaster-ready information delivery using an SDI services approach, and proposes a web-based remote sensing disaster decision support system to facilitate natural disaster response and impact assessment, which implements on-demand disaster resource acquisition, on-the-fly analysis, automatic thematic mapping, and decision report release. The system has been implemented with open specifications to facilitate interoperability. The typhoons and floods in Hainan Province, China, are used as typical scenarios to verify the system’s applicability and effectiveness. The system improves the automation level of the natural disaster emergency response service, and provides technical support for regional remote-sensing-based disaster mitigation in China.
... Therefore, this study assumes that the additional selected eigenvectors in the database would improve the performance of the SVR model which will enable the model to address spatial autocorrelation. Apart from these two ML models, this study also explores the ANN model, which is very popular and widely used for modeling and predicting spatial and non-spatial data (Zhang, Di, Lin, & Guo, 2019). Finally, this study presents a comparative view of the relative predictive performance among ESF-based ML models, with the ESF-based spatial statistical model serving as benchmark. ...
Article
Spatial statistical models are highly effective for modeling geospatial data as they consider spatial information of geographic spaces and other non-spatial covariates, enabling them to minimize spatial autocorrelation by addressing spatial dependence. In contrast, machine learning (ML) models are highly effective for predicting non-spatial data, but they are not as effective for modeling and predicting geospatial data because of spatial autocorrelation issues. One of the frequently reported limitations of ML models for geospatial data modeling is that there is no standard method of incorporating spatial information of geographic space into the model, and consequently they cannot minimize spatial autocorrelation. In this study, we have presented a local spatial information-embedded ML method capable of minimizing spatial autocorrelation by addressing spatial dependence while predicting a geospatial phenomenon. Our study applied the eigenvector spatial filter method to extract approximated eigenvectors from spatial coordinates and embed them within ML models as a set of vectors along with the selected non-spatial covariates. We have also presented a comparison of relative prediction performance between traditional spatial statistical and ML-based models. The experiment demonstrates that incorporating spatially filtered eigenvectors to represent spatial information in ML model specification significantly improves the prediction performance.
Preprint
Spatial statistical models are highly effective for modeling geospatial data as they consider spatial information of geographic spaces and other non-spatial covariates, enabling them to minimize spatial autocorrelation by addressing spatial dependence. Contrarily, machine learning (ML) models are highly effective for predicting non-spatial data, but they are not as effective for modeling and predicting geospatial data because of spatial autocorrelation issues. One of the frequently reported limitations of ML models for geospatial data modeling is that there is no standard method of incorporating spatial information of geographic space into the model, and consequently, they cannot minimize spatial autocorrelation. In this study, we have presented a local spatial information embedded ML method capable of minimizing spatial autocorrelation by addressing spatial dependence while predicting a geospatial phenomenon. Our study applied the ESF method to extract approximated eigenvectors from spatial coordinates and embedded them with ML models as a set of vectors along with the selected non-spatial covariates. We have also presented a comparison of relative prediction performance between classical spatial statistical and ML-based models. The experiment demonstrates that incorporating spatially filtered eigenvectors to represent spatial information in ML model specification significantly improves the prediction performance.
... This could be achieved through geospatial applications and tools in combination with advanced remote sensing based on the extensive progress in remote sensing and earth observation technologies over the past two decades, as well as the increasing availability of satellite images with improved spectral, spatial, and temporal resolutions, and a growing number of data-driven image processing approaches. These methods and techniques have basically evolved from geocomputation methods into geospatial CI methods through simultaneous rapid advances in web service technologies, geospatial cloud mapping and information interoperability, high-performance computing, and geospatial big data analytics (Vitolo et al. 2015; Zhang et al. 2019). Google Earth Engine (GEE) is a cloud-based platform that has recently gained traction in this field because of its world-scale geospatial data analysis with different geospatial datasets and various types of ready-to-use application programming interfaces (Nascetti et al. 2017). ...
Article
With the recent advances in earth observation technologies, the increasing availability of data from more and more different satellite sensors as well as progress in semi-automated and automated classification techniques enable the (semi-) automated remote monitoring and analysis of large areas. Online platforms such as Google Earth Engine (GEE) bring data-driven techniques to the desktops of researchers while changing workflows and making excessive data downloads redundant. We present a study that utilizes machine learning algorithms on the GEE cloud computing platform for land use/land cover (LULC) mapping and change detection analysis using a Landsat satellite image time series. We applied different machine learning techniques to data from an environmentally sensitive area in Northern Iran. We tested their efficiency for LULC mapping and change detection analysis using the support vector machine (SVM), random forest (RF) and classification and regression tree (CART). We obtained LULC maps for the years 2000, 2005, 2010, 2015 and 2020. Training data was collected from field operations and historical datasets, and the respective LULC maps were validated using ground control points. In addition, we validated the reliability of the results through a spatial uncertainty analysis using Dempster-Shafer Theory (DST). The resulting accuracies of the classification outcomes varied significantly. SVM performed best with accuracies of 90.25%, 91.84%, 89.02%, 93.35% and 95.65% for 2000, 2005, 2010, 2015 and 2020, respectively. The spatial uncertainty analysis also validated the efficiency of SVM compared to RF and CART. The results confirm the potential of machine learning techniques for time series LULC mapping on the GEE platform while lowering the barriers to analyzing large amounts of satellite data. 
The results are also critical for decision-makers and authorities for analyzing the LULC changes and developing the respective environmental protection policies in Northern Iran.
Article
Mapping corn dynamics at a large scale and over multiple years is essential for global food security. Traditional mapping approaches that collect training samples from field surveys are labor-intensive, which hinders large-scale mapping of corn dynamics over the long term. This study developed an efficient approach to map large-scale corn dynamics in the main corn districts of the United States (US) using adaptive strategies for collecting high-quality training samples. First, this study proposed an automatic approach to collect stable and representative corn samples from the Cropland Data Layer (CDL) product. Then, this study improved the mapping performance of corn at a large scale and in earlier years by using adaptive strategies to collect limited (fewer than 500) but high-quality training samples. Finally, this study assessed and discussed the model performance across space and years using multi-source datasets from 2001 to 2020. Our results indicate that the proposed approach with adaptive strategies can generate robust classification models with good performance in mapping large-scale corn dynamics. In our study area, the mean overall accuracy (OA) of corn is about 88% when using the CDL data as a reference. In addition, the R² of corn areas at the county scale between our result and the surveyed acreage statistics is above 0.9, suggesting the proposed strategy is helpful for mapping corn dynamics across different years. The proposed approach demonstrated that limited but representative samples could map corn dynamics at a large scale with good performance, showing significant improvement compared to the traditional approach. It is feasible to map crop dynamics with multiple types by combining existing crop products with collected field samples, particularly in China, where crop products at the national scale are still lacking.
Article
Meeting an increasing demand for food while preserving the environment is one of the most important challenges of the 21st century. To meet this challenge, conservation agriculture can rely on the age-old practice of crop rotation. The objective of this article is to develop a methodology for predicting and visualizing crop rotations, supporting discussions between agronomists and producers. Based on crop history data, the 6-phase methodology uses Markov chains to predict the N most likely crops grown in year n + 1. Process mining and Directly-Follows Graphs (DFG) enable modelling and visualization of the results. Generalisation and filtering operations highlight the frequent behaviors of producers. Applied to the crop history of 10,376 fields from 409 field crop farms in Quebec, Canada, the methodology is competitive with various recurrent neural networks (LSTM, RNN, GRU), with a successful prediction rate that exceeds 90%, while keeping the results intelligible and the computation relatively simple.
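To make the Markov-chain idea in the abstract above concrete, the following is a minimal sketch, not the cited authors' implementation. It assumes a first-order chain (the abstract does not state the order), and the function names `fit_transitions` and `top_n_next` as well as the field histories are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(histories):
    """Count first-order transitions: crop in year t -> crop in year t + 1."""
    counts = defaultdict(Counter)
    for seq in histories:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def top_n_next(counts, last_crop, n=2):
    """Return the n most likely next crops with their estimated probabilities."""
    row = counts.get(last_crop)
    if not row:
        return []  # crop never observed as a predecessor
    total = sum(row.values())
    return [(crop, c / total) for crop, c in row.most_common(n)]


# Hypothetical field histories, oldest year first (illustrative data only).
histories = [
    ["corn", "soybeans", "corn", "soybeans"],
    ["corn", "soybeans", "wheat", "corn"],
    ["soybeans", "corn", "corn", "soybeans"],
]
counts = fit_transitions(histories)
prediction = top_n_next(counts, "corn", n=2)
print(prediction)
```

In this toy data, corn is followed by soybeans in four of five observed transitions, so soybeans is ranked first for a field whose last crop was corn. A higher-order chain would key the counts on the last several crops instead of only the last one.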
Article
Purpose: The emerging technologies of the Fourth Industrial Revolution are transforming various industries, including agriculture. Unaware, young male and female farmers leave the agriculture profession as they perform unsustainable practices. Precision agriculture using the Internet of Things (IoT) is a solution to sustainable agriculture. Extension professionals are at the heart of disseminating agricultural advisory services in India. The discourse on the IoT is entering the space of extension advisory services (EASs) and social sciences. Thus, the present paper seeks to review the application of IoT in Indian agriculture, its challenges and its effect on EASs. The conceptual framework is drawn from disruptive and surveillance capitalist theories. Design/methodology/approach: An online literature review was conducted on electronic e-book Ebsco, Google scholar, PubMed, Jane, j gate, research4life, springer journal and Mendeley databases for full-text repositories, textbooks, theses, web articles, newspaper articles, reports and blogs for the years 1990 to May 2021 using the keywords “IoT application in agriculture,” “emerging technologies in agriculture,” “challenges in IoT application,” “extension advisory services sources of information,” “big data and extension advisory,” and “IoT and extension advisory in India.” Only publications in the English language were included. Findings: IoT aids progressive farmers and small farmers alike. Drones, robotics, precision irrigation, livestock tracking and crop disease surveillance are examples of IoT applications in agriculture. Only large corporations and governments access IoT, and for them, big data storage is an issue. Privacy and security concerns demand upgrades in IoT systems. Solutions to the convergence of IoT with the cloud will leverage agricultural EASs, resulting in fast computing, precise and proactive up-to-date problem solving. Hence, the need for communication between firms and clients has ceased. Thus, the jobs of extension agents are replaced. Research limitations/implications: The competence of future human extension agents lies in reskilling as a “knowledge broker” of relationships and expertise, as s/he cannot have all multidisciplinary knowledge. Originality/value: Although IoT applications in agriculture are available from a technological standpoint, there remains an awareness gap regarding the impact of IoT applications in agricultural EASs. This study will aid in a better comprehension of IoT applications from current and prospective EASs.
Conference Paper
Generating a timely crop cover map over a large geographic area remains a challenge due to the lack of reliable ground truth early in the growing season. This paper introduces an efficient method to extract “trusted pixels” from historical Cropland Data Layer (CDL) data using crop rotation patterns; these pixels can replace actual ground truth in crop mapping and other agricultural applications. A case study in the state of Nebraska, USA, is demonstrated. The common crop rotation patterns of four major crop types (corn, soybeans, winter wheat, and alfalfa) are compared and analyzed. The experiment results show that a considerable number of pixels in CDL followed a specific crop sequence during the past decade. Each observed crop type has at least one reliable crop rotation pattern. Based on the reliable crop rotation patterns, a large proportion of pixels can be correctly mapped a year ahead of the release of the current-year CDL product. These trusted pixels can potentially be used to label training samples for crop type classification early in the growing season.
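The trusted-pixel idea described above can be sketched as a simple pattern match on each pixel's crop history. This is an illustrative sketch under stated assumptions, not the paper's procedure: the rotation patterns, the four-year window, the pixel identifiers, and the function name `trusted_pixels` are all hypothetical.

```python
def trusted_pixels(history, patterns):
    """Map pixel id -> predicted next crop for pixels whose most recent
    years match one of the given rotation patterns; others are omitted."""
    predicted = {}
    for pixel, seq in history.items():
        for pattern, next_crop in patterns:
            k = len(pattern)
            if len(seq) >= k and tuple(seq[-k:]) == pattern:
                predicted[pixel] = next_crop
                break  # first matching pattern wins
    return predicted


# Hypothetical rotation patterns: (recent crop sequence, expected next crop).
patterns = [
    (("corn", "soybeans", "corn", "soybeans"), "corn"),
    (("soybeans", "corn", "soybeans", "corn"), "soybeans"),
    (("alfalfa", "alfalfa", "alfalfa", "alfalfa"), "alfalfa"),
]

# Hypothetical per-pixel crop histories, oldest year first.
history = {
    "p1": ["corn", "soybeans", "corn", "soybeans"],
    "p2": ["corn", "corn", "soybeans", "winter wheat"],
    "p3": ["alfalfa", "alfalfa", "alfalfa", "alfalfa", "alfalfa"],
}

result = trusted_pixels(history, patterns)
print(result)
```

Pixels with irregular histories, such as `p2` here, receive no label, which is the point of the approach: only pixels that follow a stable rotation are treated as stand-ins for ground truth.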
Conference Paper
Cropland Data Layer (CDL) is an annual crop-specific land use map produced by the U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS). The CDL products are officially hosted on the CropScape website, which provides geospatial data visualization, retrieval, processing, and statistics capabilities based on open geospatial Web services. This study utilizes cloud computing technology to improve the performance of the CropScape application and Web services. A cloud-based prototype of CropScape is implemented and tested. The experiment results show that the performance of CropScape is significantly improved in the cloud environment. Compared with the original system architecture of CropScape, the cloud-based architecture provides a more flexible and effective environment for the dissemination of CDL data.
Article
Plant species classification is crucial for biodiversity protection and conservation. Manual classification is time-consuming, expensive, and requires experienced experts, whose availability is often limited. To cope with these issues, various machine learning algorithms have been proposed to support the automated classification of plant species. Among these machine learning algorithms, Deep Neural Networks (DNNs) have been applied to different data sets. However, DNNs have often been applied in isolation, and no effort has been made to reuse and transfer the knowledge of different applications of DNNs. Transfer learning in the context of machine learning implies the usage of the results of multiple applications of DNNs. In this article, the effect of four different transfer learning models for deep neural network-based plant classification is investigated on four public datasets. Our experimental study demonstrates that transfer learning can provide important benefits for automated plant identification and can improve low-performance plant classification models.
Article
The remote-sensing based Flood Crop Loss Assessment Service System (RF-CLASS) is a web service based system developed and managed by the Center for Spatial Information Science and Systems (CSISS). The system uses Moderate Resolution Imaging Spectroradiometer (MODIS)-based flood data, which was implemented by the Dartmouth Flood Observatory (DFO), to provide an estimation of crop loss from floods. However, due to the spectral similarity between water and shadow, a noticeable amount of false classification of shadow can be found in the DFO flood products. Traditional methods can be utilized to remove cloud shadow and part of mountain shadow. This paper aims to develop an algorithm to filter out noise from permanent mountain shadow in the flood layer. The result indicates that mountain shadow was significantly removed by using the proposed approach. In addition, the gold standard test indicated a small number of actual water surfaces were misidentified by the proposed algorithm. Furthermore, experiments also suggest that increasing the spatial resolution of the slope helped reduce more noise in mountains. The proposed algorithm achieved acceptable overall accuracy (>80%) in all different filters and higher overall accuracies were observed when using lower slope filters. This research is one of the very first discussions on identifying false flood classification from terrain shadow by using the highly efficient method.
Article
Mapping the spatial and temporal dynamics of cropland is an important prerequisite for regular crop condition monitoring, management of land and water resources, or tracing and understanding the environmental impacts of agriculture. Analyzing archives of satellite earth observations is a proven means to accurately identify and map croplands. However, existing maps of the annual cropland extent either have a low spatial resolution (e.g., 250–1000 m from Advanced Very High Resolution Radiometer (AVHRR) to Moderate-resolution Imaging Spectroradiometer (MODIS)) or, in the case of existing high-resolution maps (such as 30 m from Landsat), are not provided frequently (for example, on a regular, annual basis) because of the lack of in situ reference data, irregular timing of the Landsat and Sentinel-2 image time series, the huge amount of data for processing, and the need to have a regionally or globally consistent methodology. Against this backdrop, we propose a reference time-series-based mapping method (RBM), and create binary cropland vs. non-cropland maps using irregular Landsat time series and RBM. As a test case, we created and evaluated annual cropland maps at 30 m in seven distinct agricultural landscapes in Xinjiang, China, and the Aral Sea Basin. The results revealed that RBM could accurately identify cropland annually, with producer’s accuracies (PA) and user’s accuracies (UA) higher than 85% between 2006 and 2016. In addition, cropland maps by RBM were significantly more accurate than the two existing products, namely GlobaLand30 and Finer Resolution Observation and Monitoring of Global Land Cover (FROM–GLC).
Article
Over the last 20 years, substantial amounts of grassland have been converted to other land uses in the Northern Great Plains. Most of land cover/land use (LCLU) assessments in this region have been based on the U.S. Department of Agriculture - Cropland Data Layer (USDA - CDL), which may be inconsistent. Here, we demonstrate an approach to map land cover utilizing multi-temporal Earth Observation data from Landsat and MODIS. We first built an annual time series of accumulated growing degree-days (AGDD) from MODIS 8-day composites of land surface temperatures. Using the Enhanced Vegetation Index (EVI) derived from Landsat Collection 1's surface reflectance, we then fit at each pixel a downward convex quadratic model to each year's progression of AGDD (i.e., EVI = α + β × AGDD − γ × AGDD²). Phenological metrics derived from fitted model and the goodness of fit then are submitted to a random forest classifier (RFC) to characterize LCLU for four sample counties in South Dakota in three years (2006, 2012, 2014) when reference point datasets are available for training and validation. To examine the sensitivity of the RFC to sample size and design, we performed classifications under different sample selection scenarios. The results indicate that our proposed method accurately mapped major crops in the study area but showed limited accuracy for non-vegetated land covers. Although all RFC models exhibit high accuracy, estimated land cover areas from alternative models could vary widely, suggesting the need for a careful examination of model stability in any future land cover supervised classification study. Among all sampling designs, the “same distribution” models (proportional distribution of the sample is like proportional distribution of the population) tend to yield best land cover prediction. 
RFC models using only the eight most important variables (e.g., the three fitted parameter coefficients [α, β, and γ]; maximum modeled EVI; AGDD at maximum modeled EVI; the number of observations used to fit the CxQ model; and the number of valid observations) have slightly higher accuracy than those using all variables. By summarizing annual image time series through land surface phenology modeling, LCLU classification can embrace both seasonality and interannual variability, thereby increasing the accuracy of LCLU change detection.
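The quadratic model quoted in this abstract, EVI = α + β × AGDD − γ × AGDD², can be fitted per pixel by ordinary least squares. The sketch below is one way to do that in plain Python and is not the cited study's code. The function name `fit_quadratic`, the synthetic data, and the choice to express AGDD in thousands of degree-days (to keep the normal equations well conditioned) are assumptions made for illustration. Setting the derivative to zero gives the two derived metrics: AGDD at peak = β / (2γ) and maximum modeled EVI = α + β² / (4γ).

```python
def fit_quadratic(agdd, evi):
    """Least-squares fit of EVI = alpha + beta*AGDD - gamma*AGDD**2.

    Solves the 3x3 normal equations with Gauss-Jordan elimination and
    returns (alpha, beta, gamma).
    """
    # Power sums for the basis [1, x, x^2].
    s = [sum(x ** k for x in agdd) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(agdd, evi)) for k in range(3)]
    # Augmented matrix of the normal equations.
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    c0, c1, c2 = (a[i][3] / a[i][i] for i in range(3))
    return c0, c1, -c2  # gamma is the negated quadratic coefficient


# Synthetic, noise-free observations (AGDD in thousands of degree-days).
agdd = [0.5 * i for i in range(9)]
evi = [0.1 + 0.6 * x - 0.15 * x * x for x in agdd]

alpha, beta, gamma = fit_quadratic(agdd, evi)
peak_agdd = beta / (2 * gamma)            # AGDD at maximum modeled EVI
max_evi = alpha + beta ** 2 / (4 * gamma)  # maximum modeled EVI
print(alpha, beta, gamma, peak_agdd, max_evi)
```

With real, noisy observations the same fit applies; the fitted coefficients, the two peak metrics, and a goodness-of-fit measure would then form the kind of per-pixel feature vector that the abstract describes passing to a classifier.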
Article
Using remote sensing and UAVs in smart farming is gaining momentum worldwide. The main objectives are crop and weed detection, biomass evaluation and yield prediction. Evaluating machine learning methods for remote sensing based yield prediction requires availability of yield mapping devices, which are still not very common among farmers. In this study Convolutional Neural Networks (CNNs) – a deep learning methodology showing outstanding performance in image classification tasks – are applied to build a model for crop yield prediction based on NDVI and RGB data acquired from UAVs. The effect of various aspects of the CNN such as selection of the training algorithm, depth of the network, regularization strategy, and tuning of the hyperparameters on the prediction efficiency are tested. Using the Adadelta training algorithm, L2 regularization with early stopping and a CNN with 6 convolutional layers, mean absolute error (MAE) in yield prediction of 484.3 kg/ha and mean absolute percentage error (MAPE) of 8.8% was achieved for data acquired during the early period of the growth season (i.e., in June of 2017, growth phase <25%) with RGB data. When using data acquired later in July and August of 2017 (growth phase >25%), MAE of 624.3 kg/ha (MAPE: 12.6%) was obtained. Significantly, the CNN architecture performed better with RGB data than the NDVI data.
Article
This study aims to develop a deep learning based classification framework for remotely sensed time series. The experiment was carried out in Yolo County, California, which has a very diverse irrigated agricultural system dominated by economic crops. For the challenging task of classifying summer crops using Landsat Enhanced Vegetation Index (EVI) time series, two types of deep learning models were designed: one is based on Long Short-Term Memory (LSTM), and the other is based on one-dimensional convolutional (Conv1D) layers. Three widely-used classifiers were also tested for comparison, including a gradient boosting machine called XGBoost, Random Forest, and Support Vector Machine. Although LSTM is widely used for sequential data representation, in this study its accuracy (82.41%) and F1 score (0.67) were the lowest among all the classifiers. Among non-deep-learning classifiers, XGBoost achieved the best result with 84.17% accuracy and an F1 score of 0.69. The highest accuracy (85.54%) and F1 score (0.73) were achieved by the Conv1D-based model, which mainly consists of a stack of Conv1D layers and an inception module. The behavior of the Conv1D-based model was inspected by visualizing the activation on different layers. The model employs EVI time series by examining shapes at various scales in a hierarchical manner. Lower Conv1D layers of the optimized model capture small scale temporal variations, while upper layers focus on overall seasonal patterns. Conv1D layers were used as an embedded multi-level feature extractor in the classification model which automatically extracts features from input time series during training. The automated feature extraction reduces the dependency on manual feature engineering and pre-defined equations of crop growing cycles. This study shows that the Conv1D-based deep learning framework provides an effective and efficient method of time series representation in multi-temporal classification tasks.