This article was downloaded by: [PRIYADARSHI UPADHYAY]
On: 30 May 2015, At: 23:52
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Geocarto International
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/tgei20
Temporal MODIS data for identification of wheat crop using noise clustering soft classification approach
Priyadarshi Upadhyay (a), S.K. Ghosh (a) & Anil Kumar (b)
(a) Department of Civil Engineering, Indian Institute of Technology Roorkee, Roorkee, India
(b) Indian Institute of Remote Sensing, Indian Space Research Organization, Dehradun, India
Accepted author version posted online: 05 May 2015. Published online: 26 May 2015.
To cite this article: Priyadarshi Upadhyay, S.K. Ghosh & Anil Kumar (2015): Temporal MODIS data
for identification of wheat crop using noise clustering soft classification approach, Geocarto
International, DOI: 10.1080/10106049.2015.1047415
To link to this article: http://dx.doi.org/10.1080/10106049.2015.1047415
Temporal MODIS data for identification of wheat crop using noise clustering soft classification approach
Priyadarshi Upadhyay (a, *, 1), S.K. Ghosh (a) and Anil Kumar (b)
(a) Department of Civil Engineering, Indian Institute of Technology Roorkee, Roorkee, India; (b) Indian Institute of Remote Sensing, Indian Space Research Organization, Dehradun, India
(Received 4 January 2015; accepted 23 April 2015)
In this study, temporal MODIS-Terra MOD13Q1 data were used to identify wheat crop uniquely with the noise clustering (NC) soft classification approach. The study also optimises the date combination and the vegetation index used for classifying wheat. First, a separability analysis is used to optimise the date combination for each number of dates and each vegetation index. The selected scenes are then classified with the NC soft classifier. The resolution parameter (δ) of the NC classifier was optimised and found to be 1.6 × 10^4 for wheat crop identification. The classified outputs were assessed by receiver operating characteristic (ROC) analysis for sub-pixel detection. The highest area under the ROC curve was found for the soil-adjusted vegetation index with the data sets corresponding to three different phenological stages. The data sets corresponding to the Sowing, Flowering and Maturity phenological stages of wheat were found to be the most suitable for identifying it uniquely.
Keywords: MODIS; NC; separability; resolution parameter; phenology; ROC
1. Introduction
Wheat is a major agricultural crop and one of the most widely produced cereals in the world. Timely information on the wheat crop at regional scale is useful for wheat crop management and an organised economy. Early estimates of the wheat crop serve many users, including policy makers, the grain industry, disaster relief and drought declaration (Potgieter et al. 2010). Remote sensing has proven to be an effective tool for estimating the regional distribution of an individual crop (Lobell & Asner 2004; Pan et al. 2012). Since every crop has its own phenology during the growing season (Pan et al. 2012), remote sensing data acquired over the period of a crop's phenological changes can discriminate it more effectively from the other classes present in the data. These phenological changes cannot be exploited with a single-date remote sensing image. Further, because high spatial resolution images have a small swath, large-area crop mapping with them suffers from high cost and large data volumes. To avoid these problems, the focus at present is on satellite data sets with high temporal and low spatial resolution.
On the other hand, satellite data-based vegetation indices have been used for many purposes in remote sensing. Time series of MODIS normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) data have been widely used for crop mapping (Pringle et al. 2012) and for estimating crop phenology (Sakamoto et al. 2005). Potgieter et al. (2010) provided early-season information on crop area using multi-temporal MODIS 250 m EVI data. Their study aimed to meet the requirement for early estimates of net crop production before harvest. An unsupervised K-means algorithm was used for classification, and the early-season winter crop area was estimated 2 months before harvest. Pan et al. (2012) proposed a crop proportion phenology index (CPPI) to estimate the winter wheat crop area at sub-pixel level using MODIS EVI time series. They used phenological variables from October to June of the following year as input to calculate the CPPI. A regression model was used to estimate the winter wheat crop area, and the CPPI parameters were calculated from the inverted form of the regression model using the training pixels.
*Corresponding author. Email: priyadarshi.529@shooliniuniversity.com
1 Current affiliation: Shoolini University, Solan, India
© 2015 Taylor & Francis
The MOD13Q1 product from the MODIS instrument on the Terra satellite is a key tool for vegetation mapping because of its high temporal resolution (Pan et al. 2012). However, because of its moderate spatial resolution, mixed pixels produce significant errors in crop area estimation (Foody 2000; Shalan et al. 2003). Fuzziness can be incorporated in the classification of such pixels, so that they may have multiple and partial class memberships (Foody et al. 1997). Such a classification is known as soft classification. A soft classification technique assigns a pixel to different classes according to the proportion of the pixel area that each class occupies. It yields a number of fraction images equal to the number of land cover classes. The supervised fuzzy c-means (FCM) (Bezdek et al. 1984), supervised possibilistic c-means (PCM) (Krishnapuram & Keller 1993), noise clustering (Dave 1991; Dave & Krishnapuram 1997), artificial neural networks (Kanellopoulos et al. 1992) and mixture modelling (Kerdiles & Grondona 1996) are commonly used for the soft classification of satellite data. Amongst these, the first three are fuzzy set theory-based classifiers.
Fuzzy set theory-based soft classifiers such as FCM and PCM have been used in a number of earlier studies (Shalan et al. 2003; Ibrahim et al. 2005). To derive accurate estimates of sub-pixel land cover composition from FCM, it is necessary to have information on all classes in the training stage of classification (Foody 2000). In FCM, noisy points (i.e. outliers) are grouped with the information classes, because the memberships of every point must sum to one over those classes. Thus, the FCM classification algorithm suffers from the problem of noise (outliers). In the PCM algorithm, this problem can be resolved by providing one noise class per good cluster. In the noise clustering (NC) approach, however, noise is segregated into a separate information class, i.e. one noise class for all clusters (Dave & Krishnapuram 1997). Thus, unlike in FCM, the presence of untrained classes does not affect the classification outputs of PCM and NC (Foody 2000). Further, unlike the FCM classifier, NC and PCM do not force the memberships of the information classes to sum to one (the probability rule) and hence may be used for the identification of a single land cover class.
In the present research, the NC soft classification approach has been used for wheat crop estimation in a test site in India. Wheat production in India comes from a single growing season: the crop is sown during October–December and harvested during March–May (Krishna Kumar et al. 2004). Many studies have claimed that EVI is better for crop identification (Huete et al. 2002; Justice et al. 2002; Houborg et al. 2007), but its robustness is still to be proved (Wardlow & Egbert 2010). Therefore, in this study, different vegetation indices, namely simple ratio (SR), NDVI, soil-adjusted vegetation index (SAVI) and EVI, have been selected for wheat crop identification. The objectives of this study were to (i) study the effectiveness of the MODIS temporal indices data for wheat crop identification, (ii) optimise the temporal indices data sets, (iii) optimise the resolution parameter δ for the NC classification and (iv) assess the accuracy through receiver operating characteristic (ROC) analysis of sub-pixel detection.
2. Study area and data used
The study area lies between latitudes 29°5′00″N and 30°3′52″N and longitudes 76°19′52″E and 77°22′21″E in the state of Haryana, India. Important places within the study area are Panipat, Karnal and Asandh, as shown in Figure 1. Wheat, mustard and sugarcane are the main crops grown in this region during November–March, i.e. the Rabi season. Wheat is the dominant crop in this region and is grown in large, homogeneous fields. The other major crops are sugarcane, ground nut, paddy and maize.
The average annual temperature of the study area ranges from 22.5 to 25 °C. According to the Köppen climate classification system, the study area falls in the semi-arid climatic zone. Geologically, the study area belongs to the Indo-Gangetic plains, with rich alluvial soils. The tropical arid soil found in the study area varies from sandy loam to loam (Kaushal et al. 2009). The soil has been deposited by streams and rivers and is fertile. Average annual rainfall in the study area is about 610 mm (Ahlawat & Sharma 2012). Nearly 70% of the rainfall is brought by the south-west monsoon from July to September, and the remainder falls from December to February.
Figure 1. Study area boundaries shown from MODIS data.
The satellite data used in this research come from the MODIS instrument, which operates on both the Terra and Aqua satellites. It has a viewing swath width of 2330 km and views the entire surface of the Earth every 1–2 days. Its detectors measure 36 spectral bands and acquire data at three spatial resolutions: 250, 500 and 1000 m (https://lpdaac.usgs.gov/products/modis_overview). Of the 36 spectral bands of MODIS, only 7 are of primary use for vegetation and land surface studies. Different land products are produced at various temporal resolutions: daily, 8-day, 16-day, monthly, quarterly and yearly. In the present research, the MODIS-Terra MOD13Q1 level-3 product (https://lpdaac.usgs.gov/products/modis_products_table/mod13q1), with a spatial resolution of 250 m and a composite period of 16 days, has been used. The Red (B1: 620–670 nm), NIR (B2: 841–876 nm) and Blue (B3: 459–479 nm) bands of MOD13Q1 have been used to generate the different vegetation indices.
The classification method used in this study requires information about a few selected known locations in the satellite imagery as reference data. These known or reference locations can be used for training the classifier and for testing the outputs (Congalton 1991; Foody 2002; Congalton & Green 2009). Therefore, ground samples were collected for both the training and testing pixels. The training and testing fields were selected using a random sampling scheme, and the ground samples were acquired with the help of GPS-based survey data and existing land-use/land-cover maps.
As a general rule, if the training data are extracted from n bands, then the number of training pixels should be greater than 10n for each class (Jensen 1986). Further, Congalton (1991) suggested that a minimum of 75–100 samples per class is needed to assess the accuracy of remote sensing data. However, according to Congalton and Green (2009), there is no universally accepted standard for assessing accuracy. Ground truth collection involves the distribution of the phenomenon to be mapped and the sample size, number, type and frequency of collection (Congalton & Green 2009). For a particular confidence level and desired precision, the sample size for training and testing can be calculated using the formula given by Tortora (1978).
3. Methodology adopted
In India, wheat production comes from a single growing season. Wheat is sown during October–December and harvested during March–May (Krishna Kumar et al. 2004). For the same area, Verma et al. (2003) divided the Rabi season into seven phenological stages: (i) Crown root initiation stage (meteorological weeks 44–46), (ii) Tillering stage (meteorological weeks 47–49), (iii) Jointing stage (meteorological weeks 50–52), (iv) Flowering stage (meteorological weeks 1–3), (v) Milking stage (meteorological weeks 4–6), (vi) Dough stage (meteorological weeks 7–9) and (vii) Maturity stage (meteorological weeks 10–14). Based on this, the MODIS data were selected between November 2011 and April 2012.
One of the aims of this study was to acquire data at a regular interval of 16 days. However, because of cloud cover, only one scene, that of 1 November 2011, was available for November and December 2011. From January to April 2012, scenes were available for each 16-day composite period. This period contains information on all phenological stages of the wheat crop from the flowering stage up to the harvesting stage, and the major phenological changes occur during this interval. The details of the MODIS data sets are given in Table 1. The dates listed in Table 1 are not the actual observation dates but the composite dates of the 16-day MODIS window.
Figure 2 shows the flowchart of the methodology adopted in this research. A field visit was carried out on 30 and 31 December 2011 to collect ground control points so that the exact positions of the training and testing wheat fields could easily be found. The study area has a large extent of homogeneous wheat fields, which can easily be identified in a 250 m × 250 m pixel. The vegetation indices acquired over the period were used to generate the temporal spectral indices. Further, the Transformed Divergence (TD) feature selection method was applied to select the optimised set of temporal spectral indices. The optimised sets were then used to discriminate the specific crop using the NC soft classification. The classification was performed in an in-house Java-based Sub-pixel Multi-Spectral Image Classifier (Kumar et al. 2006). The ROC analysis was used to assess the soft classified outputs.

Table 1. Details for the MODIS data.
Day of the year   Date               Date number   Phenological stage
305               01 November 2011   1             Sowing
17                17 January 2012    2             Flowering
33                02 February 2012   3             Milking
49                18 February 2012   4             Dough
65                05 March 2012      5             Maturity
81                21 March 2012      6             Maturity
97                06 April 2012      7             Harvesting

Figure 2. Methodology adopted for wheat crop mapping using NC approach. [Flowchart boxes: MODIS composites (01 Nov 2011, 17 Jan 2012, 02 Feb 2012, 18 Feb 2012, 05 Mar 2012, 21 Mar 2012, 06 Apr 2012); field information; ground truthing / field data gathering; SR, NDVI, SAVI and EVI; preparation of temporal spectral indices; feature selection; creation of training data; NC soft classification; optimization of δ; FAR and TP calculation; 3D-ROC analysis for sub-pixel detection.]
3.1. Vegetation indices
The SR, also known as the ratio-based vegetation index, was described by Birth and McVey (1968). It is based on the fact that vegetation absorbs strongly in the red (visible) band and reflects very efficiently in the near-infrared band of the electromagnetic spectrum. It is computed as:
$$SR = \frac{\rho_{NIR}}{\rho_{R}} \tag{1}$$
where $\rho_{NIR}$ is the reflectance in the near-infrared (NIR) band and $\rho_{R}$ is the reflectance in the red band. SR values less than 1.0 are taken as non-vegetation, while values greater than 1.0 are considered vegetation. The SR reduces topographic effects. Its major drawback is division by zero, which gives an infinite SR value for pixels with a zero value in the red spectral band.
Rouse et al. (1973) introduced the NDVI in order to produce a spectral vegetation index that separates green vegetation from its background soil brightness using Landsat MSS digital data. The NDVI has been widely used in vegetation-related studies over the last few decades. It is computed as:
$$NDVI = \frac{\rho_{NIR} - \rho_{R}}{\rho_{NIR} + \rho_{R}} \tag{2}$$
The NDVI ranges from −1 to +1. Negative NDVI values represent non-vegetated areas, while positive values represent vegetated areas. Equation (2) can be improved by incorporating a soil adjustment factor or by including the blue band for atmospheric correction (Huete & Liu 1994). The soil adjustment factor is included to minimise the influence of the soil background. The resulting index is known as SAVI in several studies (Huete & Liu 1994; Running et al. 1994) and is expressed as:
$$SAVI = \frac{(\rho_{NIR} - \rho_{R})(1 + L)}{\rho_{NIR} + \rho_{R} + L} \tag{3}$$
where L is a canopy background adjustment factor and has a value of 0.5.
The MODIS Land Discipline Group has proposed EVI as a standard satellite vegetation product for MODIS Terra and Aqua. It is represented by Equation (4):
$$EVI = G\,\frac{\rho_{NIR} - \rho_{R}}{\rho_{NIR} + C_1\,\rho_{R} - C_2\,\rho_{B} + L} \tag{4}$$
where $\rho_{B}$ is the reflectance in the blue band. The EVI is a modified NDVI with a soil adjustment factor L, a gain factor G and two coefficients, $C_1$ and $C_2$, which describe the use of the blue band to correct the red band for atmospheric aerosol scattering. The coefficients $C_1$, $C_2$ and L are empirically determined as 6.0, 7.5 and 1.0, respectively, and G has a value of 2.5 (Huete & Liu 1994; Huete et al. 1997). This algorithm has improved sensitivity in high biomass regions and improves vegetation monitoring through a decoupling of the canopy background signal and a reduction in atmospheric influences.
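The four indices of Equations (1) to (4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Java implementation: the function names and the sample reflectance values are hypothetical, and the inputs are assumed to be surface reflectances on a 0 to 1 scale.

```python
def simple_ratio(nir, red):
    """SR = NIR / red (Equation 1); infinite when the red reflectance is zero."""
    return float("inf") if red == 0 else nir / red


def ndvi(nir, red):
    """NDVI = (NIR - red) / (NIR + red) (Equation 2)."""
    return (nir - red) / (nir + red)


def savi(nir, red, L=0.5):
    """SAVI with canopy background adjustment factor L (Equation 3)."""
    return (nir - red) * (1 + L) / (nir + red + L)


def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    """EVI with the coefficient values quoted in the text (Equation 4)."""
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)


# Hypothetical reflectances for a green canopy pixel (assumed values).
nir, red, blue = 0.45, 0.05, 0.03
for name, value in [
    ("SR", simple_ratio(nir, red)),
    ("NDVI", ndvi(nir, red)),
    ("SAVI", savi(nir, red)),
    ("EVI", evi(nir, red, blue)),
]:
    print(f"{name}: {value:.3f}")
```

Returning infinity for a zero red reflectance mirrors the drawback of SR noted above; a production workflow would more likely mask such pixels.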
3.2. Feature selection
Feature selection is an important step when classifying multi-temporal and multi-spectral remote sensing data. It is an effective method of selecting a desired number of features (or bands) from large sets of multi-spectral or temporal data (Bruzzone & Serpico 2000). The computational requirement and cost always depend on the number of input features required for the classification. Therefore, in this study, feature selection has been used to form suitable combinations of data sets so that maximum accuracy can be achieved with low cost and less labour.
In this study, the TD separability approach (Swain & Davis 1978) has been used. The TD is expressed as:
$$TD_{ij} = 2000\left[1 - \exp\left(-\frac{D_{ij}}{8}\right)\right] \tag{5}$$
where i and j are the two signatures (classes) being compared and $D_{ij}$ is the divergence, calculated as:
$$D_{ij} = \frac{1}{2}\,\mathrm{tr}\left[(C_i - C_j)\left(C_j^{-1} - C_i^{-1}\right)\right] + \frac{1}{2}\,\mathrm{tr}\left[\left(C_i^{-1} + C_j^{-1}\right)(u_i - u_j)(u_i - u_j)^{T}\right] \tag{6}$$
where $C_i$ is the covariance matrix of class i, $u_i$ is the mean vector of class i, tr is the trace function and T denotes the matrix transpose.
The TD values range from 0 to 2000. As a general rule given by Jensen (1986), if the TD values for all LU/LC classes for a pair of spectral bands are greater than 1900, the classes can be considered to have no overlap and the separation between two informational classes is good. If the values lie between 1700 and 1900, the separation is fairly good, and if the TD values are below 1700, the separation between informational classes is poor.
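A minimal NumPy sketch of Equations (5) and (6) is given below to make the computation concrete. It assumes that each class signature is summarised by a mean vector and an invertible covariance matrix; the function names and the two example signatures are hypothetical and are not taken from the study.

```python
import numpy as np


def divergence(mean_i, cov_i, mean_j, cov_j):
    """Divergence D_ij between two class signatures (Equation 6)."""
    cov_i = np.asarray(cov_i, dtype=float)
    cov_j = np.asarray(cov_j, dtype=float)
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    diff = (np.asarray(mean_i, dtype=float)
            - np.asarray(mean_j, dtype=float)).reshape(-1, 1)
    covariance_term = 0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
    mean_term = 0.5 * np.trace((inv_i + inv_j) @ diff @ diff.T)
    return float(covariance_term + mean_term)


def transformed_divergence(mean_i, cov_i, mean_j, cov_j):
    """TD_ij = 2000 * (1 - exp(-D_ij / 8)) (Equation 5)."""
    d_ij = divergence(mean_i, cov_i, mean_j, cov_j)
    return 2000.0 * (1.0 - np.exp(-d_ij / 8.0))


def separability_label(td):
    """Rule of thumb attributed to Jensen (1986) in the text."""
    if td > 1900:
        return "good"
    if td >= 1700:
        return "fairly good"
    return "poor"


# Hypothetical two-date signatures: equal covariances, means differ on date 2.
cov = 0.01 * np.eye(2)
td = transformed_divergence([0.2, 0.8], cov, [0.2, 0.4], cov)
print(round(td, 1), separability_label(td))
```

With equal covariances the first trace term vanishes, so the separability in this example comes entirely from the difference between the class means.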
3.3. NC soft classification approach
The idea of handling noisy points properly was first proposed by Ohashi (1984) (Dave & Krishnapuram 1997). Further, according to Dave (1991), noise points (or outliers) can be segregated from the core information classes (or clusters) so that they do not degrade the quality of the clustering analysis. The main concept of the NC algorithm is to introduce a single noise information class, the (c + 1)th class, that contains all noisy data points. NC was originally developed as an unsupervised classifier, but it can be used in supervised mode by providing the information class (or cluster) means directly from the training data set (Foody 2000).
The objective function of NC is obtained by adding a term for the (c + 1)th noise information class to the FCM objective function, as follows (Dave 1991):
$$J_{nc}(U, V) = \sum_{i=1}^{c}\sum_{k=1}^{N} (\mu_{ki})^{m}\, D(x_k, v_i) + \sum_{k=1}^{N} (\mu_{k,c+1})^{m}\, \delta \tag{7}$$
with
$$D(x_k, v_i) = d_{ki}^{2} = \lVert x_k - v_i \rVert_{A}^{2} = (x_k - v_i)^{T} A\, (x_k - v_i)$$
where $\delta > 0$ is a fixed parameter. The noise information class has no centre, and the dissimilarity $D_{k,c+1}$ between $x_k$ and this noise information class is expressed as (Miyamoto et al. 2008):
$$D_{k,c+1} = \delta$$
The constraints imposed on the objective function of Equation (7) are given by:
$$\mu_f = \left\{ U = (\mu_{ki}) : \sum_{j=1}^{c+1} \mu_{kj} = 1,\ 1 \le k \le N;\quad \mu_{ki} \in [0, 1],\ 1 \le k \le N,\ 1 \le i \le c + 1 \right\} \tag{8}$$
In Equations (7) and (8), U is the N × (c + 1) membership matrix, $V = (v_1, \ldots, v_c)$ is the collection of cluster centre vectors $v_i$, $\mu_{ki}$ is the membership value of pixel k in class i, $d_{ki}$ is the distance in feature space between $x_k$ and $v_i$, $x_k$ is the feature vector denoting the spectral response of pixel k, $v_i$ is the prototype vector denoting the cluster centre of class i, c and N are the total numbers of clusters and pixels, respectively, m is the weighting exponent, $\delta$ is the resolution parameter and A is the weight matrix.
The weight matrix A controls the shape of the optimal cluster (Bezdek et al. 1984). Generally, it takes one of the following forms:
$$A = I \qquad \text{(Euclidean norm)} \tag{9}$$
$$A = D_i^{-1} \qquad \text{(Diagonal norm)} \tag{10}$$
$$A = C_i^{-1} \qquad \text{(Mahalanobis norm)} \tag{11}$$
where I is the identity matrix, $D_i$ is the diagonal matrix whose diagonal elements are the eigenvalues of the covariance matrix, and $C_i$ is given by:
$$C_i = \sum_{k=1}^{N} (x_k - c_i)(x_k - c_i)^{T} \tag{12}$$
where
$$c_i = \sum_{k=1}^{N} x_k / N \tag{13}$$
From the objective function of NC, the membership values of the information classes and of the noise class are given by Equations (14) and (15), respectively, while Equation (16) gives the mean value of the information classes.
$$\mu_{ki} = \left[ \sum_{j=1}^{c} \left( \frac{D(x_k, v_i)}{D(x_k, v_j)} \right)^{\frac{1}{m-1}} + \left( \frac{D(x_k, v_i)}{\delta} \right)^{\frac{1}{m-1}} \right]^{-1}, \quad 1 \le i \le c \tag{14}$$
$$\mu_{k,c+1} = \left[ \sum_{j=1}^{c} \left( \frac{\delta}{D(x_k, v_j)} \right)^{\frac{1}{m-1}} + 1 \right]^{-1} \tag{15}$$
$$v_i = \frac{\sum_{k=1}^{N} (\mu_{ki})^{m}\, x_k}{\sum_{k=1}^{N} (\mu_{ki})^{m}}, \quad 1 \le i \le c \tag{16}$$
The noise class always remains at a constant distance from all data points. This constant distance is referred to as the noise distance and is represented by the parameter δ, also known as the ‘resolution parameter’. If δ is assigned a very small value, most of the points are classified as noise points, while for a large value of δ most of the points are classified into the information classes rather than the noise class (Dave 1991; Rehm et al. 2007). Thus, the role of the (c + 1)th class is to absorb the effect of outliers on the classification. The flowchart of the NC algorithm in supervised mode is shown in Figure 3.
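The membership computation of Equations (14) and (15) can be illustrated with a short pure-Python sketch. It is not the in-house classifier used in the study: it assumes the Euclidean norm (A = I), a weighting exponent m = 2 as a default, and a hypothetical function name and test pixel. Each weight is (1/D)^(1/(m−1)), and dividing it by the sum of all weights is algebraically the same as Equations (14) and (15).

```python
def nc_memberships(pixel, centres, delta, m=2.0):
    """Return (information-class memberships, noise membership) for one pixel.

    pixel   : feature vector x_k
    centres : list of class mean vectors v_i (supervised mode: from training data)
    delta   : resolution parameter (noise distance), delta > 0
    m       : weighting exponent, m > 1 (the default of 2 is an assumption)
    """
    # Squared Euclidean distances D(x_k, v_i), i.e. Equation (7) with A = I.
    dists = [sum((x - v) ** 2 for x, v in zip(pixel, centre)) for centre in centres]
    if 0.0 in dists:
        # Convenience guard (assumption): a pixel lying exactly on a centre
        # gets full membership in that class.
        hit = dists.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(dists))], 0.0
    power = 1.0 / (m - 1.0)
    weights = [(1.0 / d) ** power for d in dists]
    noise_weight = (1.0 / delta) ** power
    total = sum(weights) + noise_weight
    return [w / total for w in weights], noise_weight / total


# Hypothetical single-class case: one 'wheat' centre, pixel at squared distance 100.
for delta in (1.0, 10.0, 1e2, 1e3, 1e4):
    mu, noise = nc_memberships([10.0, 0.0], [[0.0, 0.0]], delta)
    print(f"delta={delta:g}  wheat={mu[0]:.4f}  noise={noise:.4f}")
```

In this toy case the wheat membership rises from near zero towards one as δ grows, which is the behaviour the text describes for small and large values of the resolution parameter.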
3.4. Receiver operating characteristic
The ROC, which is based on the Neyman–Pearson detection theory, is used to evaluate detection performance in signal processing (Chang et al. 2001; Chang 2010). It is used in the sense that an object is either detected or not, i.e. it illustrates the performance of a binary classifier. The detection power is measured by the area under the ROC curve. This area is denoted by $A_z$ and is bounded between ½ and 1; for better detection, it should be closer to 1 (Wang et al. 2005). The 2-D ROC curve is plotted with the false alarm rate (FAR) on the x-axis and the true positive (TP) rate on the y-axis. The 3-D ROC curve, on the other hand, is plotted with the FAR on the x-axis, the detection threshold (t) on the y-axis and the TP rate on the z-axis. The 2-D ROC can be used for the hard decisions produced by a classifier, whereas the 3-D ROC is used for soft decisions (Wang et al. 2005).

Figure 3. Flowchart for NC algorithm and optimization of δ. [Flowchart, NC in supervised mode: select δ > 0 and fix m, c, the type of A-norm and the training data; compute the mean values from the training data; input the image file (pixels k = 1 to N); compute the information class centres v_i for classes i = 1 to c; compute the distances from the information classes based on the A-norm; calculate the membership values of the information classes and of the noise class; if the membership equals 0.996, write the class proportions to file and end; otherwise increase the value of δ.]

The TP and FAR are defined as follows:
$$TP = \frac{\text{Total number of target pixels detected as target}}{\text{Total number of target pixels present in the sample}} \tag{17}$$
$$FAR = \frac{\text{Total number of background pixels detected as target}}{\text{Total number of background pixels present in the sample}} \tag{18}$$
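Equations (17) and (18) translate directly into code once a membership threshold is chosen. The sketch below is illustrative only: it assumes a pixel is counted as detected when its wheat membership is at or above the threshold, it anchors the 2-D ROC curve at (0, 0) and (1, 1) before applying the trapezoidal rule, and the sample memberships and function names are hypothetical.

```python
def tp_far(memberships, is_target, threshold):
    """TP rate and FAR (Equations 17 and 18) at one membership threshold."""
    detected = [mu >= threshold for mu in memberships]
    n_target = sum(is_target)
    n_background = len(is_target) - n_target
    tp = sum(d and t for d, t in zip(detected, is_target)) / n_target
    far = sum(d and not t for d, t in zip(detected, is_target)) / n_background
    return tp, far


def roc_area(memberships, is_target, thresholds):
    """Area under the 2-D ROC curve (FAR on x, TP on y) by the trapezoidal rule."""
    points = sorted(
        (far, tp)
        for tp, far in (tp_far(memberships, is_target, t) for t in thresholds)
    )
    points = [(0.0, 0.0)] + points + [(1.0, 1.0)]  # assumed end points
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical wheat memberships: first four pixels are wheat, last four background.
memberships = [0.95, 0.85, 0.70, 0.40, 0.65, 0.30, 0.20, 0.10]
is_target = [True] * 4 + [False] * 4
thresholds = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

for t in thresholds:
    tp, far = tp_far(memberships, is_target, t)
    print(f"threshold={t:.1f}  TP={tp:.2f}  FAR={far:.2f}")
print("area under ROC:", roc_area(memberships, is_target, thresholds))
```

Listing TP and FAR against every threshold gives the three coordinates needed for a 3-D ROC plot, while the area is computed from the 2-D projection.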
4. Results and discussion
4.1. Temporal variation of vegetation indices
The mean values of SR, NDVI, SAVI and EVI were calculated from a training sample of 90 wheat pixels. Figure 4 shows the mean values of NDVI, SAVI and EVI, while Figure 5 shows those of SR. EVI shows the variation in index value more distinctly than the other indices. Further, the index values are higher during the flowering, milking and dough stages, with the maximum occurring around the dough stage.
The temporal curves of all the indices have the same shape; however, the magnitude of the variation differs between indices for the wheat crop (Figures 4 and 5). The slope of the SR curve in Figure 5 is much steeper than those of the other index curves in Figure 4; therefore, SR is effective for identification of the wheat crop owing to its high temporal variation.

Figure 4. Variation of indices values for NDVI, SAVI and EVI for different phenological stages of wheat. [Plot: index value (0 to 2) against phenological stage (Sowing, Flowering, Milking, Dough, Maturity, Harvesting).]
4.2. Feature selection using the separability
The TD separability outputs were generated by varying the number of temporal data sets from one to seven. For NDVI and SAVI, the ‘Three’, ‘Four’ and ‘Five’ data set combinations have TD values equal to the saturation value of 2000. This means that any of these combinations should yield a good classification result. However, it is desirable to identify a single combination in order to economise the classification process.
From Table 2, it can be observed that for the ‘Three’, ‘Four’ and ‘Five’ data set combinations, the date combinations ‘1,2,6’, ‘1,2,3,6’ and ‘1,2,3,5,6’, respectively, are common to both NDVI and SAVI. Further, all these combinations include at least one date from each of the sowing, flowering and maturity stages of wheat phenology.
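One way to picture the search behind Table 2 is an exhaustive evaluation of every date subset of a given size, keeping the one with the highest TD. The sketch below is an assumption about how such a search could be organised, not a description of the authors' software: the data are synthetic, the function names are hypothetical, and only two classes (wheat and one other class) are compared.

```python
from itertools import combinations

import numpy as np


def transformed_divergence(a, b):
    """TD between two classes given as (samples x dates) arrays (Equations 5 and 6)."""
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(b, rowvar=False))
    inv_a, inv_b = np.linalg.inv(cov_a), np.linalg.inv(cov_b)
    diff = (mean_a - mean_b).reshape(-1, 1)
    d = (0.5 * np.trace((cov_a - cov_b) @ (inv_b - inv_a))
         + 0.5 * np.trace((inv_a + inv_b) @ diff @ diff.T))
    return float(2000.0 * (1.0 - np.exp(-d / 8.0)))


def best_date_combination(wheat, other, n_dates):
    """Return (1-based date numbers, TD) of the best subset with n_dates dates."""
    scored = [
        (transformed_divergence(wheat[:, list(cols)], other[:, list(cols)]), cols)
        for cols in combinations(range(wheat.shape[1]), n_dates)
    ]
    td, cols = max(scored)  # ties between saturated TD values are broken arbitrarily
    return tuple(i + 1 for i in cols), td


# Synthetic index values for 90 pixels x 7 dates (assumed, not the study data);
# the two classes are made to differ on dates 1 and 6 only.
rng = np.random.default_rng(0)
wheat = rng.normal(0.5, 0.05, size=(90, 7))
other = rng.normal(0.5, 0.05, size=(90, 7))
other[:, [0, 5]] += 0.2

for n in (1, 2, 3):
    print(n, best_date_combination(wheat, other, n))
```

Because TD saturates at 2000, several combinations can score almost identically once a few well-separated dates are included, which is consistent with the need, noted above, to pick one combination for economy.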
4.3. Optimization of resolution parameter (δ) and NC-based classification
To identify the wheat crop, NC soft classification outputs were generated by varying the resolution parameter δ from 1 to 10^5. The NDVI and SAVI data corresponding to the ‘Three’, ‘Four’ and ‘Five’ temporal dates selected in Section 4.2 were used for the NC classification. The output of the NC soft classification is a fraction image corresponding to the wheat crop. The output membership values of 75 known pixels were plotted against δ for both temporal NDVI and SAVI, as shown in Figure 6.
The membership value of wheat is close to zero when δ varies from 1 to 10, increases rapidly when δ varies from 10 to 10^3, and thereafter increases slowly until it attains a constant value of about 0.996 at a fixed point. Any further increase in δ increases the membership values of the classes that are not of interest, which increases the noise in the outputs. Thus, the value of the resolution parameter δ at this fixed point yields the best result for wheat classification. The optimised δ values for identification of the wheat crop using MODIS data are given in Table 3, and the NC soft classification outputs corresponding to these values are shown in Figure 7. A visual interpretation of the NC classifier output (Figure 7) suggests that the wheat crop has been highlighted almost equally for both temporal indices.

Figure 5. Variation of index value for SR for different phenological stages of wheat. [Plot: SR value (0 to 16) against phenological stage (Sowing, Flowering, Milking, Dough, Maturity, Harvesting).]

Table 2. The TD separability outputs for different temporal spectral indices.
Data-set      SR                      NDVI                    SAVI                    EVI
combination   Dates           TD      Dates           TD      Dates           TD      Dates           TD
One           5               1943    4               1836    4               1836    4               1809
Two           1,4             1998    1,2             1999    1,6             1999    1,2             1994
Three         1,4,5           1999    1,2,6           2000    1,2,6           2000    1,2,3           1996
Four          1,2,4,5         1999    1,2,3,6         2000    1,2,3,6         2000    1,2,3,6         1997
Five          1,2,4,5,7       1999    1,2,3,5,6       2000    1,2,3,5,6       2000    1,2,3,4,6       1997
Six           1,2,3,4,5,7     1999    1,2,3,4,5,6     1999    1,2,3,4,5,6     1999    1,2,3,4,5,6     1997
Seven         1,2,3,4,5,6,7   1997    1,2,3,4,5,6,7   1998    1,2,3,4,5,6,7   1998    1,2,3,4,5,6,7   1996

Figure 6. Optimization of resolution parameter (δ) for wheat crop using MODIS data: (a) NDVI, (b) SAVI. [Plots: membership value (0 to 1) against log10 δ (0 to 5) for the ‘Three’, ‘Four’ and ‘Five’ data set combinations.]

Table 3. Optimised value of log10 δ for wheat crop MODIS data.
Data-set combination   NDVI   SAVI
Three                  3.84   4.30
Four                   4.30   4.30
Five                   4.30   4.30

Figure 7. NC outputs for the wheat crop using MODIS data. [Fraction images for NDVI and SAVI with the ‘Three’, ‘Four’ and ‘Five’ data set combinations; membership value µ scaled from 0 to 1.]

From Table 3, the average value of log10 δ for identification of the wheat crop using MODIS data is about 4.2. This is equivalent to a δ value of 1.6 × 10^4, which has been taken as the optimised value for identification of the wheat crop using the MODIS data.
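As a quick check of this arithmetic, the following worked step assumes that the average is the plain mean of the six log10 δ entries in Table 3 (one value of 3.84 and five values of 4.30), which the text does not state explicitly:

$$\overline{\log_{10}\delta} = \frac{3.84 + 5 \times 4.30}{6} = \frac{25.34}{6} \approx 4.22 \approx 4.2, \qquad \delta = 10^{4.2} \approx 1.58 \times 10^{4} \approx 1.6 \times 10^{4}$$

Under that assumption, the rounded mean of 4.2 reproduces the reported optimised value of δ.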
4.4. Assessment of accuracy using the ROC
Accuracy was assessed by computing FAR and TP values from the NC classifier outputs.
Good detection (or identification) requires a low value of FAR together with a high
value of TP. Here, wheat was treated as the target class, while everything else was
assigned to background. A total of 120 sample points were selected on the basis of the
field trip and the satellite data. Of these, 75 points were wheat and the remaining 45
were background. FAR and TP values were calculated for both temporal NDVI and SAVI at
membership thresholds ranging from 0 to 1 in steps of 0.1, as shown in Table 4.
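The thresholding procedure can be sketched as below. The definitions used are an assumption that matches the endpoints of Table 4 (both rates equal 1 at threshold 0), not a restatement of the paper's formulae: TP is taken as the fraction of wheat reference points whose NC membership is at or above the threshold, and FAR as the same fraction for background points. The membership values and the function name `far_tp` are hypothetical.

```python
def far_tp(memberships, is_wheat, threshold):
    """Return (FAR, TP) for one membership threshold.

    Assumed definitions (not taken verbatim from the paper):
      TP  = wheat samples with membership >= threshold / all wheat samples
      FAR = background samples with membership >= threshold / all background samples
    """
    wheat = [m for m, w in zip(memberships, is_wheat) if w]
    background = [m for m, w in zip(memberships, is_wheat) if not w]
    tp = sum(m >= threshold for m in wheat) / len(wheat)
    far = sum(m >= threshold for m in background) / len(background)
    return far, tp


# Hypothetical NC membership values: 5 wheat samples, then 4 background samples.
memberships = [0.95, 0.85, 0.80, 0.60, 0.30, 0.70, 0.40, 0.10, 0.05]
is_wheat = [True] * 5 + [False] * 4

# Thresholds 1.0, 0.9, ..., 0.0, as in Table 4.
thresholds = [i / 10 for i in range(10, -1, -1)]
roc_table = [(t, *far_tp(memberships, is_wheat, t)) for t in thresholds]

for t, far, tp in roc_table:
    print(f"{t:.1f}  FAR={far:.2f}  TP={tp:.2f}")
```

With the study's 75 wheat and 45 background points, the same function would be applied to the membership value extracted at each reference location.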
The formulation of TP suggests that it is analogous to overall classification accuracy.
There is no universally applicable minimum acceptance rule for overall classification
accuracy (Foody 2008). However, taking 85% as the minimum acceptable value, as per
Anderson’s classification scheme, the threshold membership for the ‘Three’, ‘Four’ and
‘Five’ temporal data sets has been identified. It is observed that, for a TP of at least
85% with minimum FAR, the threshold membership value is 0.8 for all temporal
combinations except the ‘Three’ date combination of NDVI, for which it is 0.6 (Table 4).
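The selection rule can be illustrated on two columns of Table 4. This is a sketch under an assumed reading of the rule: among thresholds with TP of at least 0.85, take the one with the lowest FAR, breaking ties in favour of the higher threshold. The function name `select_threshold` is illustrative.

```python
# (threshold, FAR, TP) rows transcribed from Table 4 for two of the six cases.
TABLE_4 = {
    "NDVI Three": [
        (1.0, 0.00, 0.12), (0.9, 0.00, 0.65), (0.8, 0.00, 0.75),
        (0.7, 0.00, 0.82), (0.6, 0.00, 0.85), (0.5, 0.04, 0.92),
        (0.4, 0.11, 0.92), (0.3, 0.22, 0.97), (0.2, 0.51, 0.97),
        (0.1, 0.69, 0.98), (0.0, 1.00, 1.00),
    ],
    "SAVI Three": [
        (1.0, 0.00, 0.17), (0.9, 0.00, 0.83), (0.8, 0.07, 0.94),
        (0.7, 0.22, 0.98), (0.6, 0.47, 0.98), (0.5, 0.60, 1.00),
        (0.4, 0.69, 1.00), (0.3, 0.80, 1.00), (0.2, 0.87, 1.00),
        (0.1, 1.00, 1.00), (0.0, 1.00, 1.00),
    ],
}


def select_threshold(rows, min_tp=0.85):
    """Return the (threshold, FAR, TP) row with the lowest FAR among rows
    whose TP meets min_tp; ties in FAR go to the higher threshold."""
    eligible = [row for row in rows if row[2] >= min_tp]
    if not eligible:
        return None
    return min(eligible, key=lambda row: (row[1], -row[0]))


selected = {name: select_threshold(rows) for name, rows in TABLE_4.items()}
for name, (threshold, far, tp) in selected.items():
    print(f"{name}: threshold={threshold}, FAR={far}, TP={tp}")
```

Under this reading the rule picks 0.6 for the ‘Three’ date NDVI and 0.8 for the ‘Three’ date SAVI, in line with the thresholds reported in the text.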
Further, 3-D ROC curves showing the variation of TP with FAR across the different
membership thresholds have been plotted for each spectral index, as shown in Figure 8.
The 2-D ROC curves derived from the 3-D ROC curves are shown in Figure 9. For the
different date combinations, the variation of TP with FAR can be seen more clearly in
the 3-D curves. However, the 2-D ROC curves have been used for the calculation of the
area under the ROC curves. The area under the 2-D ROC
Table 4. FAR and TP values for identification of wheat crop using NC-based classified temporal indices of MODIS data.

             NDVI                                   SAVI
Threshold    Three        Four         Five         Three        Four         Five
membership   FAR   TP     FAR   TP     FAR   TP     FAR   TP     FAR   TP     FAR   TP
1.00         0.00  0.12   0.00  0.14   0.00  0.06   0.00  0.17   0.00  0.11   0.00  0.06
0.90         0.00  0.65   0.00  0.78   0.00  0.77   0.00  0.83   0.00  0.75   0.00  0.78
0.80         0.00  0.75   0.02  0.88   0.02  0.86   0.07  0.94   0.02  0.86   0.04  0.88
0.70         0.00  0.82   0.11  0.92   0.04  0.91   0.22  0.98   0.07  0.92   0.07  0.92
0.60         0.00  0.85   0.22  0.97   0.18  0.94   0.47  0.98   0.20  0.97   0.20  0.95
0.50         0.04  0.92   0.49  0.98   0.40  0.97   0.60  1.00   0.40  0.98   0.40  0.98
0.40         0.11  0.92   0.60  1.00   0.53  1.00   0.69  1.00   0.58  1.00   0.56  1.00
0.30         0.22  0.97   0.71  1.00   0.64  1.00   0.80  1.00   0.67  1.00   0.64  1.00
0.20         0.51  0.97   0.78  1.00   0.76  1.00   0.87  1.00   0.78  1.00   0.76  1.00
0.10         0.69  0.98   0.89  1.00   0.84  1.00   1.00  1.00   0.89  1.00   0.84  1.00
0.00         1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00   1.00  1.00
curve has been calculated using the trapezoid rule. For each spectral index, the area
under the ROC curve (Table 5) indicates the detection power of the method used. The
larger the area, the better the detection. From Table 5, it is observed
Figure 8. 3-D ROC curves for identification of wheat crop using NC-based classified temporal
indices of MODIS: (a) NDVI; (b) SAVI.
[Figure 9, panels (a) NDVI and (b) SAVI: TP (y-axis, 0 to 1) plotted against FAR (x-axis, 0 to 1) for the ‘Three’, ‘Four’ and ‘Five’ date combinations.]
Figure 9. 2-D ROC curves for identification of wheat crop using NC-based classified temporal
indices of MODIS.
Table 5. Area under the ROC curves (AUROC) for identification of wheat crop using NC-based classified temporal indices of MODIS data.

Data-set combination   NDVI    SAVI
Three                  0.968   0.982
Four                   0.974   0.977
Five                   0.970   0.974
that the areas under the ROC curves are above 0.5 for each index. This indicates that the
criterion of the Neyman–Pearson detection curve is fulfilled by each index.
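The trapezoid-rule area can be sketched from the rounded (FAR, TP) pairs in Table 4. The function name `auroc_trapezoid` is illustrative, and whether the authors added any extra anchor points to the curve is not stated. Because Table 4 is rounded to two decimals, the result should be close to, but not necessarily identical to, the values in Table 5.

```python
def auroc_trapezoid(points):
    """Area under a ROC curve by the trapezoid rule.

    points: iterable of (FAR, TP) pairs. They are sorted by FAR (then TP),
    so vertical segments at equal FAR contribute zero area.
    """
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# (FAR, TP) pairs transcribed from Table 4, thresholds 1.0 down to 0.0.
ROC_POINTS = {
    "NDVI Three": [
        (0.00, 0.12), (0.00, 0.65), (0.00, 0.75), (0.00, 0.82),
        (0.00, 0.85), (0.04, 0.92), (0.11, 0.92), (0.22, 0.97),
        (0.51, 0.97), (0.69, 0.98), (1.00, 1.00),
    ],
    "SAVI Three": [
        (0.00, 0.17), (0.00, 0.83), (0.07, 0.94), (0.22, 0.98),
        (0.47, 0.98), (0.60, 1.00), (0.69, 1.00), (0.80, 1.00),
        (0.87, 1.00), (1.00, 1.00), (1.00, 1.00),
    ],
}

areas = {name: auroc_trapezoid(pts) for name, pts in ROC_POINTS.items()}
for name, area in areas.items():
    print(f"{name}: AUROC = {area:.3f}")
```

A random detector traces the diagonal TP = FAR, whose area is 0.5, which is why 0.5 serves as the baseline in the comparison above.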
In the case of SAVI, the highest area under the ROC curve corresponds to the temporal
date combination ‘Three’, a combination of the sowing, flowering and maturity stages of
wheat phenology. Including the milking stage, as in the ‘Four’ date combination of SAVI,
increases the redundancy and hence decreases the accuracy marginally. The reason for
this decrease is that SAVI is more sensitive to soil information and, during the milking
stage, soil is visible in the background because the leaf area is small at this
phenological stage. When the ‘Five’ date temporal data are used, the accuracy reduces
further, because two dates corresponding to the same phenological stage (maturity) are
now included, leading to redundancy. Thus, it can be concluded that the ‘Three’ date SAVI
temporal data set is the most effective for identification of wheat crop from MODIS data
using the NC classification (Figure 10).
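The comparison across date combinations amounts to a lookup over Table 5, sketched here with the AUROC values transcribed from that table; the variable names are illustrative.

```python
# AUROC values transcribed from Table 5.
AUROC = {
    "NDVI": {"Three": 0.968, "Four": 0.974, "Five": 0.970},
    "SAVI": {"Three": 0.982, "Four": 0.977, "Five": 0.974},
}

# Best date combination for each spectral index.
best_per_index = {
    index: max(combos, key=combos.get) for index, combos in AUROC.items()
}

# Best (index, combination) pair overall.
best_overall = max(
    ((index, combo) for index, combos in AUROC.items() for combo in combos),
    key=lambda pair: AUROC[pair[0]][pair[1]],
)

print(best_per_index)
print(best_overall, AUROC[best_overall[0]][best_overall[1]])
```

The differences between combinations are in the second or third decimal place, so the ranking is best read as a marginal preference rather than a large gap.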
5. Conclusion
The phenological changes of every crop or vegetation type occur in a different manner.
The utility of these phenological changes for wheat identification can be readily
verified from MODIS temporal spectral index data sets. In this study, the NC
classification approach was used to classify the temporal spectral index data sets. The
resolution parameter (δ) for wheat crop identification using MODIS data was found to have
an average value of 1.6 × 10⁴ across the spectral indices. Amongst the indices used in
the study, NDVI gave its highest area under the ROC curve for the ‘Four’ date
combination, while SAVI gave its highest accuracy for the ‘Three’ date combination.
Further, it was observed that, in each case, the three phenological stages of sowing,
flowering and maturity are common, and these are hence found to be the most suitable for
wheat crop identification. The highest accuracy overall was found for SAVI with the
‘Three’ date combination, with an area under the ROC curve of 0.982. Thus, it can be
concluded that SAVI temporal data generated from MODIS, corresponding to the sowing,
flowering and maturity stages of wheat phenology, are effective for the identification
of wheat crop for the test site.
Acknowledgements
The authors are thankful to the reviewers for their critical comments and suggestions to improve
the manuscript.
[Figure 10: AUROC (axis range 0.88 to 1) for NDVI and SAVI.]
Figure 10. AUROC for wheat classification using NDVI and SAVI generated from MODIS data.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
Ahlawat JS, Sharma PD. 2012. District disaster management plan. Panipat: Government of
India.
Bezdek JC, Ehrlich R, Full W. 1984. FCM: the fuzzy C-means clustering algorithm. Comput
Geosci. 10:191–203.
Birth GS, McVey G. 1968. Measuring the color of growing turf with a reflectance
spectroradiometer. Agron J. 60:640–643.
Bruzzone L, Serpico SB. 2000. A technique for feature selection in multiclass problems. Int J
Remote Sens. 21:549–563.
Chang CI. 2010. Multiparameter receiver operating characteristic analysis for signal detection and
classification. IEEE Sens J. 10:423–442.
Chang CI, Ren H, Chiang SS, Ifarraguerri A. 2001. An ROC analysis for subpixel detection.
IEEE 2001 International Geoscience and Remote Sensing Symposium; Jul 24–28; Australia.
Congalton RG. 1991. A review of assessing the accuracy of classifications of remotely sensed
data. Remote Sens Environ. 37:35–46.
Congalton RG, Green K. 2009. Assessing the accuracy of remotely sensed data: principles and
practices. 2nd ed. Boca Raton (FL): Taylor & Francis Group.
Dave RN. 1991. Characterization and detection of noise in clustering. Pattern Recognit Lett.
12:657–664.
Dave RN, Krishnapuram R. 1997. Robust clustering methods: a unified view. IEEE Trans Fuzzy
Syst. 5:270–293.
Foody GM. 2000. Estimation of sub-pixel land cover composition in the presence of untrained
classes. Comput Geosci. 26:469–478.
Foody GM. 2002. Status of land cover classification accuracy assessment. Remote Sens Environ.
80:185–201.
Foody GM. 2008. Harshness in image classification accuracy assessment. Int J Remote Sens.
29:3137–3158.
Foody GM, Lucas RM, Curran PJ, Honzak M. 1997. Non-linear mixture modelling without
end-members using an artificial neural network. Int J Remote Sens. 18:937–953.
Houborg R, Soegaard H, Boegh E. 2007. Combining vegetation index and model inversion
methods for the extraction of key vegetation biophysical parameters using Terra and Aqua MODIS
reflectance data. Remote Sens Environ. 106:39–58.
Huete AR, Liu HQ. 1994. An error and sensitivity analysis of the atmospheric- and soil-correcting
variants of the NDVI for the MODIS-EOS. IEEE Trans Geosci Remote Sens. 32:897–905.
Huete AR, Liu HQ, Batchily K, van Leeuwen W. 1997. A comparison of vegetation indices over
a global set of TM images for EOS-MODIS. Remote Sens Environ. 59:440–451.
Huete A, Didan K, Miura T, Rodriguez EP, Gao X, Ferreira LG. 2002. Overview of the radiometric
and biophysical performance of the MODIS vegetation indices. Remote Sens Environ.
83:195–213.
Ibrahim MA, Arora MK, Ghosh SK. 2005. Estimating and accommodating uncertainty through
the soft classification of remote sensing data. Int J Remote Sens. 26:2995–3007.
Jensen JR. 1986. Introductory digital image processing: a remote sensing perspective. New Jersey
(NJ): Prentice Hall.
Justice CO, Townshend J, Vermote EF, Masuoka E, Wolfe RE, Saleous N, Roy DP, Morisette JT.
2002. An overview of MODIS Land data processing and product status. Remote Sens Environ.
83:3–15.
Kanellopoulos I, Varfis A, Wilkinson GG, Mégier J. 1992. Land cover discrimination in SPOT
HRV imagery using an artificial neural network: a 20-class experiment. Int J Remote Sens.
13:917–924.
Kaushal P, Dubey JK, Sankhyan HP, Sharma D, Thakur J. 2009. Study of survival rate for one
year old plantation of 2008–09 in panipat district. New Delhi: National Afforestation and
Eco-Development Board (Ministry of Environment and Forests, GOI).
Kerdiles H, Grondona MO. 1996. NOAA-AVHRR NDVI decomposition and sub-pixel classification
using linear mixing in the Argentinean Pampa. Int J Remote Sens. 16:1303–1325.
Krishnapuram R, Keller JM. 1993. A possibilistic approach of clustering. IEEE Trans Fuzzy Syst.
1:429–437.
Krishna Kumar K, Rupa Kumar K, Ashrit RG, Deshpande NR, Hansen JW. 2004. Climate
impacts on Indian agriculture. Int J Climatol. 24:1375–1393.
Kumar A, Ghosh SK, Dadhwal VK. 2006. Sub-pixel land cover mapping: SMIC system. ISPRS
International Symposium on Geospatial Databases for Sustainable Development; September
27–30; Goa, India.
Lobell DB, Asner GP. 2004. Cropland distributions from temporal unmixing of MODIS data.
Remote Sens Environ. 93:412–422.
Miyamoto S, Ichihashi H, Honda K. 2008. Algorithms for fuzzy clustering, studies in fuzziness
and soft computing, Vol. 229. Berlin: Springer; p. 65–66.
Ohashi Y. 1984. Fuzzy clustering and robust estimation in 9th Meet. Hollywood Beach, FL: SAS
User Grp. Int.
Pan Y, Li L, Zhang J, Liang S, Zhu X, Sulla-Menashe D. 2012. Winter wheat area estimation
from MODIS-EVI time series data using the Crop Proportion Phenology Index. Remote Sens
Environ. 119:232–242.
Potgieter AB, Apan A, Hammer G, Dunn P. 2010. Early-season crop area estimates for winter
crops in NE Australia using MODIS satellite imagery. ISPRS J Photogramm Remote Sens.
65:380–387.
Pringle MJ, Denham RJ, Devadas R. 2012. Identification of cropping activity in central and
southern Queensland, Australia, with the aid of MODIS MOD13Q1 imagery. Int J Appl Earth
Obs Geoinf. 19:276–285.
Rehm F, Klawonn F, Kruse R. 2007. A novel approach to noise clustering for outlier detection.
Soft Comput. 11:489–494.
Rouse JW, Haas RH, Schell JA, Deering DW. 1973. Monitoring vegetation systems in the Great
Plains with ERTS. In: Third ERTS Symposium, NASA SP-351(1); p. 309–317.
Running SW, Justice CO, Salomonson V, Hall D, Barker J, Kaufmann YJ, Strahler AH. 1994.
Terrestrial remote sensing science and algorithms planned for EOS/MODIS. Int J Remote
Sens. 15:3587–3620.
Sakamoto T, Yokozawa M, Toritani H, Shibayama M, Ishitsuka N, Ohno H. 2005. A crop phenology
detection method using time-series MODIS data. Remote Sens Environ. 96:366–374.
Shalan MA, Arora MK, Ghosh SK. 2003. An evaluation of fuzzy classifications from IRS 1C
LISS III imagery: a case study. Int J Remote Sens. 24:3179–3186.
Swain PH, Davis SM, editors. 1978. Remote sensing: the quantitative approach. New York (NY):
McGraw-Hill.
Tortora R. 1978. A note on sample size estimation for multinomial populations. Am Stat.
32:100–102.
Verma U, Ruhal DS, Hooda RS, Yadav M, Khera AP, Singh CP, Kalubarme MH, Hooda LS.
2003. Wheat yield modelling using remote sensing and agro meteorological data in Haryana
state. J Indian Soc Agric Stat. 56:190–198.
Wang J, Chang CI, Yang SC, Hsu GC, Hsu HH, Chung PC, Guo SM, Lee SK. 2005. 3D ROC
analysis for medical diagnosis evaluation. In: Proceedings of the 27th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), p. 7545–7548;
Sep 2005; Shanghai, China.
Wardlow BD, Egbert SL. 2010. A comparison of MODIS 250-m EVI and NDVI data for crop
mapping: a case study for southwest Kansas. Int J Remote Sens. 31:805–830.