Unsupervised Change Detection by Kernel Clustering
Michele Volpi (a), Devis Tuia (a,b), Gustavo Camps-Valls (b) and Mikhail Kanevski (a)
(a) Institute of Geomatics and Analysis of Risk, Université de Lausanne
Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
{michele.volpi,devis.tuia,mikhail.kanevski}@unil.ch
(b) Image Processing Laboratory, Universitat de València
Catedrático A. Escardino - 46980 Paterna, València, Spain
gcamps@uv.es
ABSTRACT
This paper presents a novel unsupervised clustering scheme to find changes in two or more coregistered remote
sensing images acquired at different times. This method is able to find nonlinear boundaries to the change
detection problem by exploiting a kernel-based clustering algorithm. The kernel k-means algorithm is used in
order to cluster the two groups of pixels belonging to the ‘change’ and ‘no change’ classes (binary mapping). In
this paper, we provide an effective way to solve the two main challenges of such approaches: i) the initialization
of the clustering scheme and ii) a way to estimate the kernel function hyperparameter(s) without an explicit
training set. The former is solved by initializing the algorithm on the basis of the Spectral Change Vector (SCV)
magnitude and the latter is optimized by minimizing a cost function inspired by the geometrical properties of
the clustering algorithm. Experiments on VHR optical imagery prove the consistency of the proposed approach.
Keywords: Unsupervised change detection, Kernel k-means, Clustering, Remote sensing, VHR imagery
1. INTRODUCTION
In recent years, the increasing number of Earth Observation satellites and the growing resolution of the
optical images they acquire have increased the interest of the remote sensing community in change detection.
Satellites with enhanced spatial (fine scale detection) and temporal resolution (near real time monitoring) provide
images particularly well suited to studying the evolution of the ground cover: the detection of changes between
images acquired at different times over the same geographical area has become a major research area.
The analysis of multitemporal images can be addressed by two main paradigms: supervised and unsupervised
(or clustering). The former requires a labeled set of examples provided by the user. It is particularly
well-suited when many classes of land cover evolution have to be detected and summarized in a map. The latter
does not require labeled information: it generally provides binary maps and is particularly adapted to real life
problems, where the influence of the user must be minimal (i.e. no fitting of parameters, no manual thresholding
and no training set definition) [1–3].
In the literature, many unsupervised change detection algorithms can be found. Several studies have been
carried out regarding the automatic analysis of the difference image [4]. An example is the algorithmic comparison
of the scale invariant Mahalanobis distance between the pixels of the difference image, in order to map a specific
typology of change [5]. The advent of high resolution images with a short revisit time created the need to study
the statistics of the multitemporal difference image accurately, so that these methods can be applied effectively.
Bayes decision rule and Markov random fields were introduced in order to deal with the automatic selection of
thresholds (exploiting the expectation maximization algorithm) and to consider contextual information
in the process [6]. Similar principles of estimation of the distribution are nowadays adopted in Change Vector
Analysis (CVA) [7–9], where Spectral Change Vectors (SCV) are computed by subtracting the corresponding
multidimensional pixels at different times and studying their magnitude (discriminating radiometric changes) and
angles (discriminating ground cover classes).
Further author information: Michele Volpi, IGAR, Bâtiment Amphipôle, Quartier UNIL-Sorge, CH-1015 Lausanne.
+41 21 692 3546
© SPIE Remote Sensing 2010, Toulouse (F) - downloaded from kernelcd.org
In parallel to unsupervised techniques, advanced machine learning techniques were introduced in the remote
sensing community. In particular, kernel methods [10] have shown accurate and robust behavior when applied to
remote sensing data [11–13]. Supervised change detection techniques exploiting these paradigms have shown their
relevance in many studies [14–16], thus opening interesting fundamental research areas between pattern recognition
and remote sensing image processing.
The rationale of the paper is to study the flexibility of kernel methods with respect to nonlinearities in the
context of the unsupervised change detection problem. Kernel methods build linear models in a (high dimensional)
feature space to which the data is mapped. The resulting solution in the input space is nonlinear. Classical
unsupervised partitioning methods are suboptimal at detecting binary changes because of the nonlinear nature of the
change: however, if the feature space spanned by the kernel function maximizes separability, a linear partitioning
algorithm can discover the correct partitioning. In order to exploit this intuition, the well known k-means algorithm
is adapted to find clusters in that higher dimensional space by using its kernel counterpart [10, 17, 18]. On one hand
the results are improved with respect to the classical explicitly linear algorithms, but on the other hand some
problems arise. Issues related to the initialization of the kernel k-means and to the optimization of the kernel
hyperparameter(s) are discussed, and effective ways to overcome these problems are proposed.
The rest of the paper is organized as follows. Section 2 introduces the kernel k-means algorithm. In Section 3
the change detection setting is introduced, discussing key problems and proposed solutions. Section 4 evaluates
the effectiveness of the proposed approach on a QuickBird pansharpened image. Section 5 concludes the paper
and discusses some future perspectives.
2. THE KERNEL K-MEANS
This section presents the kernel k-means algorithm starting from the well known k-means clustering technique [19].
This approach is very useful to discover a natural partitioning of the input patterns $X$ in their input space
$\mathcal{X}$ into $k$ groups. The algorithm assigns a cluster membership $k$ to the elements $\mathbf{x}_i \in X$
that minimizes the distance from its gravity center $\mathbf{m}_k$:
$$d^2(\mathbf{x}_i, \mathbf{m}_k) = \|\mathbf{x}_i - \mathbf{m}_k\|^2 \tag{1}$$
where $\mathbf{m}_k = \frac{1}{|\pi_k|} \sum_{j \in \pi_k} \mathbf{x}_j$, the $\pi_k$ are the elements assigned to
cluster $k$, and $|\pi_k|$ is their number. When all the patterns are assigned to their corresponding clusters, the
mean vectors $\mathbf{m}_k$ are updated by averaging the coordinates of the elements of the cluster, thus providing
a new gravity center. Then, the process is iterated until the centers stabilize and the algorithm converges to a
minimum of $d^2(\mathbf{x}_i, \mathbf{m}_k), \forall i, k$. Standard k-means is
particularly adapted to solve linear problems, i.e. the input space is organized in spherical clusters.
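The assignment and update steps of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the paper: the function name `kmeans`, the toy data and the way the initial centers are chosen are assumptions made for the example.

```python
import numpy as np

def kmeans(X, init_centers, n_iter=100):
    """Plain k-means: alternate nearest-center assignment and mean update."""
    centers = np.asarray(init_centers, dtype=float).copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Squared Euclidean distance of every sample to every center, Eq. (1).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable: converged
        labels = new_labels
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:  # keep the previous center if a cluster empties
                centers[k] = members.mean(axis=0)
    return labels, centers

# Toy data (hypothetical): two compact, well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
labels, centers = kmeans(X, init_centers=X[[0, -1]])
```

On spherical, well-separated clusters such as these, the procedure typically stabilizes after very few iterations.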
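The kernel version of k-means relies on the same principles, but instead of working in the input space
$\mathcal{X}$, it works in a higher dimensional feature space $\mathcal{H}$, in which non-spherical clusters in the
input space are mapped into spherical ones, and can consequently be detected correctly. This higher dimensional
space is usually induced by a mapping function $\varphi(\cdot)$, whose images $\varphi(\mathbf{x}_i)$ correspond to
mapped samples in $\mathcal{H}$. Using mapped samples, the k-means becomes:
$$d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) = \|\varphi(\mathbf{x}_i) - \mathbf{m}_k\|^2 \tag{2}$$
where
$$\mathbf{m}_k = \frac{1}{|\pi_k|} \sum_{j \in \pi_k} \varphi(\mathbf{x}_j). \tag{3}$$
This is equivalent to
$$d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) = \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_i) \rangle + \langle \mathbf{m}_k, \mathbf{m}_k \rangle - 2 \langle \varphi(\mathbf{x}_i), \mathbf{m}_k \rangle. \tag{4}$$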
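By plugging (3) into (4), and replacing the dot product $\langle \varphi(\cdot), \varphi(\cdot) \rangle$ by a proper
kernel function $k(\cdot, \cdot)$, the kernel k-means formulation [10, 17, 18] is obtained as:
$$
\begin{aligned}
d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k)
&= \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_i) \rangle
 + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} \langle \varphi(\mathbf{x}_j), \varphi(\mathbf{x}_m) \rangle
 - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle \\
&= k(\mathbf{x}_i, \mathbf{x}_i)
 + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} k(\mathbf{x}_j, \mathbf{x}_m)
 - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} k(\mathbf{x}_i, \mathbf{x}_j).
\end{aligned}
\tag{5}
$$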
Kernel functions are applied to overcome the problems related to the explicit computation of the mapping
function, which can be costly and difficult. With kernels, the value of the dot product in the feature space is
evaluated directly by using the values of the samples in the input space.
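As a sanity check of Eq. (5), the sketch below evaluates the feature-space distances from a kernel matrix only and compares them with the explicit distances of Eq. (1) for the linear kernel, where the mapping is the identity. The function name `kernel_distances` and the toy data are assumptions made for illustration.

```python
import numpy as np

def kernel_distances(K, labels, n_clusters):
    """Squared feature-space distance of each sample to each cluster mean, Eq. (5)."""
    d2 = np.empty((K.shape[0], n_clusters))
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        within = K[np.ix_(idx, idx)].sum() / len(idx) ** 2  # <m_k, m_k>
        cross = K[:, idx].sum(axis=1) / len(idx)             # <phi(x_i), m_k>
        d2[:, k] = np.diag(K) + within - 2.0 * cross
    return d2

# Toy data and an arbitrary two-cluster labeling (hypothetical values).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
labels = np.arange(20) % 2

# Linear kernel: k(x, y) = <x, y>, so Eq. (5) must reproduce Eq. (1).
d2_kernel = kernel_distances(X @ X.T, labels, n_clusters=2)

# Explicit computation with the cluster means in the input space.
means = np.stack([X[labels == k].mean(axis=0) for k in range(2)])
d2_explicit = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
```

With a nonlinear kernel the second, explicit computation is no longer available, but the first one is unchanged: only the kernel matrix differs.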
The kernel values can be interpreted as a similarity measure between samples, and thus the kernel k-means
can be seen as a clustering algorithm that first groups similar points and then separates them, working linearly
in a higher dimensional feature space. As for the linear version, the process is iterated until convergence, by
assigning the cluster membership $k$, and solving the following minimization problem:
$$
\arg\min_{\mathbf{m}_k} \left\{ d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) \right\}
= \arg\min_{\mathbf{m}_k} \left\{ k(\mathbf{x}_i, \mathbf{x}_i)
 + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} k(\mathbf{x}_j, \mathbf{x}_m)
 - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} k(\mathbf{x}_i, \mathbf{x}_j) \right\}.
\tag{6}
$$
Note that, since the mapping is not explicitly known, the exact coordinates of the cluster centers in
$\mathcal{H}$ cannot be computed explicitly. However, the explicit center coordinates are not needed to assign a
pattern to its cluster. When needed, the pixel closest to the center (the centroid or medoid) is considered to be
the center.
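The iteration of Eq. (6) and the medoid extraction can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the Gaussian RBF kernel is assumed to have the form exp(-||a - b||^2 / (2 sigma^2)), and the names `rbf_kernel`, `feature_space_d2` and `kernel_kmeans`, the toy data and the corrupted initial labels are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel (assumed form): exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def feature_space_d2(K, labels, n_clusters):
    """Eq. (5) for all samples and clusters; empty clusters get infinite distance."""
    d2 = np.full((K.shape[0], n_clusters), np.inf)
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        if len(idx) > 0:
            d2[:, k] = (np.diag(K) + K[np.ix_(idx, idx)].mean()
                        - 2.0 * K[:, idx].mean(axis=1))
    return d2

def kernel_kmeans(K, init_labels, n_clusters=2, n_iter=100):
    """Reassign every sample to its closest implicit center until labels are stable."""
    labels = np.asarray(init_labels).copy()
    for _ in range(n_iter):
        new_labels = feature_space_d2(K, labels, n_clusters).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    # Medoid of each cluster: the member closest to the implicit center.
    d2 = feature_space_d2(K, labels, n_clusters)
    medoids = []
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        medoids.append(int(idx[d2[idx, k].argmin()]) if len(idx) else -1)
    return labels, medoids

# Toy data (hypothetical): two blobs, with 10% of the initial labels corrupted.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
truth = np.repeat([0, 1], 50)
init = truth.copy()
init[:5], init[50:55] = 1, 0
labels, medoids = kernel_kmeans(rbf_kernel(X, X, sigma=1.0), init)
```

Note that the cluster centers never appear explicitly: only the label vector and the kernel matrix are updated or read.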
In terms of complexity, the kernel k-means scales as $O(n^2(\epsilon + m))$, where $n$ is the number of samples,
$\epsilon$ is the number of iterations and $m$ is the dimensionality. The classical k-means algorithm on the other
hand is less demanding, scaling as $O(\epsilon n m k)$, where $k$ is the number of clusters.
3. THE CHANGE DETECTION SETTING
As mentioned above, two main issues have to be solved in order to apply this clustering algorithm in a completely
unsupervised way. In this section, the problems of initialization and of kernel parameters estimation are detailed.
3.1 Overcoming bad initializations
The main issue of unsupervised algorithms is to find a proper initialization allowing the method to converge to
a global minimum (‘true’ clusters) or to a sufficiently low local minimum. This issue can be greatly alleviated
by choosing a near-optimal initialization, i.e. finding centers within or close enough to the correct clusters.
In this case, the idea is to initialize the kernel k-means with two subsets that belong with high probability to
their respective clusters. In order to estimate the ‘change’ and ‘no change’ class distributions from
which the centroids are computed, the Spectral Change Vector [7] magnitude is exploited. Change Vector
Analysis (CVA) has been widely used and, after [Bovolo and Bruzzone, 2006], a wide range
of applications has been reported (as initialization [20], as change detector itself [8] or for exploratory data
analysis [21]), and its behavior is now largely understood.
SCV analysis consists of computing the difference image and analyzing the distribution of magnitudes and angles in
order to discriminate changes. In this paper we exploit the magnitude vector computed as
$\delta = \|\mathbf{x}_i^{t_2} - \mathbf{x}_i^{t_1}\|$, where the $\mathbf{x}_i^{\{t_1, t_2\}}$ are the
multidimensional pixels at the two times. This distribution can be seen as a mixture of two
Gaussians, one for the unchanged pixels and another for the changed pixels. The interested reader can find more
details in the aforementioned papers.
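The magnitude computation can be written directly from the definition of $\delta$. The sketch below assumes that both images are stored as arrays of shape (rows, columns, bands) and are already coregistered and radiometrically normalized; the function name `scv_magnitude` and the tiny example arrays are hypothetical.

```python
import numpy as np

def scv_magnitude(img_t1, img_t2):
    """Per-pixel norm of the spectral change vector x^{t2} - x^{t1}."""
    diff = img_t2.astype(float) - img_t1.astype(float)
    return np.linalg.norm(diff, axis=-1)

# Tiny example: a 2 x 2 image with 3 bands where a single pixel changes.
img_t1 = np.zeros((2, 2, 3))
img_t2 = np.zeros((2, 2, 3))
img_t2[0, 1] = [3.0, 4.0, 0.0]
delta = scv_magnitude(img_t1, img_t2)
```

The resulting `delta` has one value per pixel, and its histogram is the distribution that the two-Gaussian mixture is meant to describe.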
Figure 1. Mixture of two Gaussians describing the two classes (horizontal axis: magnitude $\delta$; vertical axis:
$f(\delta)$), with the ‘no change’ region on the left, the ‘change’ region on the right and an overlapping zone
between $T - t$ and $T + t$. The initial centers are randomly picked from the ‘change’ and ‘no change’ regions.
$T$ corresponds to a near-optimal threshold, separating the ‘no change’ distribution (the left one) from the
‘change’ one (the right one).
This principle is illustrated in Figure 1, where two subsets can be randomly initialized from each distribution,
according to a threshold selected with the minimum-error decision rule. It is worth mentioning that at this point,
a change map can be produced by assigning pixels according to the threshold on the magnitude distribution
(the $T$ in Figure 1). This solution is not optimal for several reasons: as pointed out in many studies [21], such
an approach suffers greatly from residual registration misalignments and noise. Moreover, the separation of the
‘change’ and ‘no change’ clusters should be addressed by a nonlinear approach, due to the strong overlap between
the two class distributions. This is particularly true for high / very high geometrical resolution images, where
the class distributions strongly overlap and the images are affected by high variances.
In the approach proposed in this paper, the near-Gaussian distributions of Figure 1 are exploited in
order to estimate the cluster centroids (as a pseudo training set for the kernel k-means). Once a good
initialization is obtained by a correct thresholding, the convergence is also favored (within the limits dictated
by possible sensor noise or outliers in the pixel magnitude values). It is worth mentioning that the number
of samples needed for the estimation of the kernel parameter(s) is only marginally important, while the
description of the distribution should be complete in order to reproduce the variability of the data (i.e. the
extent of the clusters).
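The extraction of the pseudo training set can be sketched as below. The threshold `T` and the half-width `t` of the overlapping zone are taken as given inputs here (the paper selects the threshold with a minimum-error decision rule on the two-Gaussian mixture, which this sketch does not reproduce); the function name `pseudo_training_set` and the example magnitudes are assumptions.

```python
import numpy as np

def pseudo_training_set(delta, T, t, n_per_class, rng):
    """Draw a balanced sample of pixels lying outside the interval [T - t, T + t].

    Returns flat pixel indices and pseudo labels (0 = 'no change', 1 = 'change').
    """
    flat = np.asarray(delta, dtype=float).ravel()
    no_change = np.flatnonzero(flat < T - t)   # confidently unchanged pixels
    change = np.flatnonzero(flat > T + t)      # confidently changed pixels
    n = min(n_per_class, len(no_change), len(change))
    idx = np.concatenate([rng.choice(no_change, size=n, replace=False),
                          rng.choice(change, size=n, replace=False)])
    labels = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return idx, labels

# Hypothetical magnitudes; pixels between 0.5 and 1.5 fall in the overlapping zone.
delta = np.array([0.1, 0.2, 0.9, 1.0, 2.0, 2.1, 3.0])
idx, labels = pseudo_training_set(delta, T=1.0, t=0.5, n_per_class=2,
                                  rng=np.random.default_rng(0))
```

Drawing the two subsets at random from each region, rather than taking the most extreme magnitudes, is one way to keep the sample representative of the extent of each cluster.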
3.2 Learning the kernel parameter in an unsupervised way
The second big challenge is related to the fitting of the kernel parameters. Usually, such parameters are chosen
by evaluating the algorithm on some labeled examples (e.g. leave-one-out and cross validation) and retaining the
parameter set $\Theta$ that minimizes some predefined cost function. In this paper we propose an unsupervised and
geometrically-inspired cost function that automatically chooses a correct parameter set for the dataset at hand.
This cost function is formulated as:
$$
\arg\min_{\Theta} \left\{
\frac{\sum_{k} \frac{1}{|\pi_k|} \sum_{i \in \pi_k} d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k)}
     {\sum_{k \neq p} d^2(\mathbf{m}_k, \mathbf{m}_p)}
\right\},
\tag{7}
$$
where $\Theta$ is the set of parameters of the kernel function to be learned. The optimal geometrical distribution
of the samples is formulated in terms of intra-cluster and inter-cluster distances. The distances induced in the
feature space are used as an index to achieve the best possible description for kernel k-means. The minimization
in Eq. (7) can be seen as a maximization of the cluster separability: the minimization of the numerator favors
compact clusters in terms of distances to their centers, while the maximization of the denominator suggests a
kernel that maps samples into two clusters that have distant centers. Any search algorithm (e.g. line/grid search,
simulated annealing and others) can be used to estimate the cost generated by the elements of a given set of
parameters $\Theta$.
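The ratio in Eq. (7) can be evaluated from the kernel matrix alone, since both the sample-to-center and the center-to-center distances expand into averages of kernel values. The sketch below is a simplified illustration: it scores each candidate $\sigma$ with the cluster labels held fixed, whereas the paper re-clusters the pseudo training set for every candidate. The RBF form and the names `rbf_kernel`, `separability_cost` and `line_search_sigma` are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel (assumed form): exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def separability_cost(K, labels):
    """Eq. (7): summed mean intra-cluster distances over summed center distances."""
    clusters = [np.flatnonzero(labels == k) for k in np.unique(labels)]
    # G[k, p] = <m_k, m_p>, obtained as the average of a block of K.
    G = np.array([[K[np.ix_(a, b)].mean() for b in clusters] for a in clusters])
    intra = 0.0
    for k, idx in enumerate(clusters):
        d2 = np.diag(K)[idx] + G[k, k] - 2.0 * K[np.ix_(idx, idx)].mean(axis=1)
        intra += d2.mean()
    n_k = len(clusters)
    # Sum over ordered pairs k != p, as written in Eq. (7).
    inter = sum(G[k, k] + G[p, p] - 2.0 * G[k, p]
                for k in range(n_k) for p in range(n_k) if k != p)
    return intra / inter

def line_search_sigma(X, labels, sigmas):
    """Return the candidate sigma with the lowest cost, plus all costs."""
    costs = [separability_cost(rbf_kernel(X, X, s), labels) for s in sigmas]
    return sigmas[int(np.argmin(costs))], costs

# Hand-checkable case with the linear kernel: centers at 1 and 11,
# intra = 1 + 1, inter = 100 + 100, so the cost is 0.01.
X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])
cost_linear = separability_cost(X @ X.T, labels)

sigmas = [0.5, 1.0, 2.0, 4.0]
best_sigma, costs = line_search_sigma(X, labels, sigmas)
```

Replacing the fixed labels with the output of a kernel k-means run for each candidate would bring the sketch closer to the wrapper strategy described in the text.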
3.3 The change detection algorithm
Starting with two coregistered and equalized images, the proposed algorithm can be summarized in four steps,
illustrated by Figure 2.
Figure 2. The workflow of the proposed approach. Diagram blocks: Image t1 and Image t2; INITIALIZATION on the SCV
magnitude; PARAMETER estimation; CENTERS computation; CLUSTER assignments; binary CHANGE MAP.
1) Initialization: in order to apply image differencing, the scenes must first be preprocessed by histogram
matching and normalization of their values. Then the initialization based on the thresholding of the SCV magnitude
can be applied. The images are subtracted, and then the difference vectors (the SCV) are analyzed considering their
norm. The threshold and the confidence interval ($[T - t; T + t]$ in Figure 1) indicate where the pixels are mixed
in terms of magnitude: thus, outside this interval, the samples are more likely to belong to one of the two groups
and a pseudo training set can be extracted.
2) Parameter estimation: once the correct threshold is found, the kernel k-means algorithm is exploited as a wrapper
to choose the best parameter optimizing Eq. (7): the pseudo training set is clustered with different parameters
until a minimum in the cost function is found.
3) Centroids computation: the algorithm returns the centroids and the cluster assignment that correspond to the
best parameter. It is worth mentioning that the choice of computing the centroids only on a subset of pixels and
not on the whole image is justified by two reasons. First, the classes strongly overlap: computing the centers on a
subset yields unbiased centers of the two classes, and the pixels in the overlapping part of the distributions are
then assigned to the corresponding cluster (the closest in $\mathcal{H}$). Second, estimating the centers only on
a proper subset of the image reduces both the computational time (in terms of algorithm convergence) and the
computational complexity of the single iterations of kernel k-means. This is an important issue, especially taking
into account the computational cost of the partitioning algorithm.
4) Change detection: once the centroids are computed, each pixel in the difference image is assigned to the cluster
whose center is closest in $\mathcal{H}$. To do that, kernel k-means with the optimized parameters is applied to
the entire difference image.
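The final assignment of step 4 can be sketched as an out-of-sample use of Eq. (5): the implicit centers are defined by the clustered pseudo training set, and every pixel of the difference image is compared with them through the kernel. This is one plausible reading of the step, not a confirmed description of the authors' code. The RBF form, the chunked evaluation and the names `rbf_kernel` and `assign_pixels` are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel (assumed form): exp(-||a - b||^2 / (2 sigma^2))."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def assign_pixels(X_all, X_train, train_labels, sigma, n_clusters=2, chunk=10000):
    """Label every row of X_all with the closest implicit center in feature space."""
    groups = [np.flatnonzero(train_labels == k) for k in range(n_clusters)]
    K_tt = rbf_kernel(X_train, X_train, sigma)
    within = np.array([K_tt[np.ix_(g, g)].mean() for g in groups])  # <m_k, m_k>
    out = np.empty(len(X_all), dtype=int)
    for start in range(0, len(X_all), chunk):
        block = X_all[start:start + chunk]
        K_bt = rbf_kernel(block, X_train, sigma)
        cross = np.stack([K_bt[:, g].mean(axis=1) for g in groups], axis=1)
        # k(x, x) = 1 for the RBF kernel: constant across clusters, so omitted.
        out[start:start + chunk] = (within[None, :] - 2.0 * cross).argmin(axis=1)
    return out

# Hypothetical pseudo training set (two blobs) and two query pixels.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
train_labels = np.repeat([0, 1], 50)
queries = np.array([[0.0, 0.0], [3.0, 3.0]])
assigned = assign_pixels(queries, X_train, train_labels, sigma=1.0)
```

Processing the image in chunks keeps the memory footprint proportional to the chunk size times the pseudo training set size, rather than to the square of the number of pixels.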
4. DATA AND EXPERIMENTAL RESULTS
In this section, the proposed approach to unsupervised change detection is validated on a pansharpened QuickBird
image of the city of Zurich (Switzerland). The available images are shown in Figure 3. The results of the proposed
method are compared to simple thresholding of the histogram and to the linear k-means. Accuracies are evaluated in
terms of AUC (Area Under the ROC Curve) estimated on the basis of some available ground truth. Additionally, binary
confusion matrices are provided for a single experiment, and basic accuracy metrics are provided as well.
Figure 3. Images in (a) 2002 and (b) 2006.
A total of 15 experiments (corresponding to different initializations of the pseudo training set) were carried out
for the kernel k-means approach (with a Gaussian RBF kernel function) and for the linear k-means. The centers are
evaluated on the pseudo training set extracted on the basis of the given regions of the magnitude histogram: at
each iteration, a balanced set of 500 pseudo training samples is extracted and used for computing centroids
for both clustering approaches. In order to have a deterministic term of comparison, the CVA was carried out
in terms of thresholding of the magnitude distribution.
4.1 Results and discussion
The AUCs of the three approaches are reported in Table 1. The nonlinear solution improves on its linear
counterpart and on the unidimensional thresholding, reaching globally higher accuracies. It is interesting to see
that the proposed approach greatly reduces the false alarms produced by the k-means clustering on the difference
image (cf. ROCs in Figure 4 and confusion matrices in Table 2). Regarding the true changes (the true positives for
the ‘change’ class), the algorithms perform similarly, given the simplicity of the difference
image. On the other hand, the false positive rate is greatly reduced by the proposed approach. The averaged
ROC curves illustrated in Figure 4 and the AUC (for the k-means and the kernel k-means approach) show good
performance in terms of detection of true changes for all the algorithms, with a better performance for the
kernel approach.
CVA k-means kernel k-means
0.912 0.923 0.974
Table 1. Mean Area Under the ROC Curve (AUC). The averages are based on 15 independent experiments for the k-means
and for the kernel k-means; the CVA was carried out only once.
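For reference, the AUC can be computed without tracing the ROC curve, as the fraction of (change, no change) pixel pairs that a score ranks correctly, with ties counted as one half. The sketch below is generic: which per-pixel score was used to build the ROC curves is not specified here, so the `scores` array (for instance a magnitude or a difference of feature-space distances) is an assumption, as is the function name `auc`.

```python
import numpy as np

def auc(scores, truth):
    """Area under the ROC curve via pairwise ranking (1 = 'change', 0 = 'no change')."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth)
    pos, neg = scores[truth == 1], scores[truth == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Small hand-checkable example: 3 of the 4 (change, no change) pairs are ordered correctly.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The pairwise form uses memory proportional to the product of the two class sizes, so for full images a rank-based formulation would be preferable.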
Actual labels (columns) vs. predicted labels (rows, P)

              CVA                 k-means             kernel k-means
P             C        NC         C        NC         C        NC
C             11031    13241      12242    12987      12160    7766
NC            12778    190926     67       191180     149      196401

Basic accuracy metrics

              OA       κ          OA       κ          OA       κ
              93.25    0.57       93.96    0.62       96.34    0.74

Table 2. Confusion matrices and accuracy metrics (Overall Accuracy - OA; Cohen’s Kappa - κ) for three models
(randomly chosen). ‘C’ corresponds to the ‘change’ class and ‘NC’ to ‘no change’.
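The two accuracy metrics of Table 2 follow directly from a confusion matrix. As a worked check, the sketch below recomputes them for the kernel k-means matrix (rows: predicted, columns: actual); the function name `overall_accuracy_and_kappa` is an assumption.

```python
import numpy as np

def overall_accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                                  # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2    # chance agreement
    return po, (po - pe) / (1.0 - pe)

# Kernel k-means confusion matrix from Table 2.
cm_kernel = [[12160, 7766],
             [149, 196401]]
oa, kappa = overall_accuracy_and_kappa(cm_kernel)
print(f"OA = {100 * oa:.2f} %, kappa = {kappa:.2f}")  # OA = 96.34 %, kappa = 0.74
```

Because the ‘no change’ class dominates the scene, the chance agreement is high (about 0.86 here), which is why κ is far more sensitive to the false alarms than the overall accuracy.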
In Figure 5, the final binary change detection maps are illustrated. The black color corresponds to the ‘change’
class while the white color corresponds to the ‘no change’ class. Note that for the kernel k-means and for the
k-means approaches, the maps represent the number of hits of the clustering algorithms. Thanks to the proper
initialization, both algorithms converge to the correct solution in most of the experiments; only the k-means
has clustered unwanted pixels in one experiment (the light gray regions in Figure 5).
Figure 4. ROC curves (true positive rate against false positive rate) for CVA, k-means and kernel k-means.
Observing the results of the clustering in Figure 5, the kernel approach shows fewer false alarms, greatly reducing
the effect of the shadows on the change detection. The CVA approach is affected by both shadowed pixels and
remarkable differences in the reflective response of the ground, but the true positive ratio is high. The k-means
approach reduces the effect of the shadows, but is greatly affected by the differences in the reflectance of the
images and shows potential instability even if the centers are initialized on the magnitude. The kernel k-means
finally shows a reduced effect of both the principal sources of errors. The shadows and the shadow-related changes
are rarely assigned to the ‘change’ cluster. The radiometric differences between the images, even if less than with
the k-means scheme, still influence the false positive rate. Globally, in terms of true positive detection, the
k-means and the kernel k-means perform similarly, but the most noticeable difference is found in terms
of false alarms. These observations can be summarized by observing the given accuracy metrics. The κ in Table 2
supports this intuition: it grows for the kernel k-means, which greatly reduces the false alarm rate.
Figure 5. Change maps for (a) CVA, (b) k-means and (c) kernel k-means. For (b) and (c), white corresponds to
0 hits (100 % hits for the class ‘no change’) and black corresponds to 15 hits (100 % hits for the class ‘change’).
In this case the term ‘hits’ refers to the total number of times that a given pixel is assigned to a given cluster.
The Gaussian RBF kernel parameter was tuned by line search in the range $\sigma \in [0.01, 0.1, \ldots, 6]$. The
minimum of the function presented in Eq. (7) suggested average parameters in the interval [2.5, 3], corresponding
to the mean distance between the pixels in the pseudo training set (on average 2.9).
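Since the selected parameters are close to the mean distance between pseudo training samples, that quantity can be computed beforehand, for example to center the search range. The sketch below only computes the statistic; using it as a starting guess for $\sigma$ is a suggestion of this example, not a procedure described in the paper, and the function name `mean_pairwise_distance` is hypothetical.

```python
import numpy as np

def mean_pairwise_distance(X):
    """Mean Euclidean distance over all unordered pairs of distinct rows of X."""
    sq = (X**2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d = np.sqrt(np.maximum(d2, 0.0))       # clip tiny negatives from round-off
    upper = np.triu_indices(len(X), k=1)   # each pair counted once, no diagonal
    return d[upper].mean()

# Hand-checkable example: pairwise distances are 5, 10 and 5.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
sigma_guess = mean_pairwise_distance(X)
```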
5. CONCLUSIONS AND FUTURE WORK
The kernel clustering method shows great flexibility for the problem of change detection, finding nonlinear
solutions to the problem. The main issues of such an approach are discussed and solved: first, the initialization
was addressed by finding a threshold on the magnitude distribution; second, a geometrically inspired cost function
(which represents the ideal cluster geometry in the kernel induced feature space) has been proposed to estimate
the optimal kernel parameters (if any). Finally, the computational cost is kept low by controlling the number of
samples needed for estimating the centers (the label assignment step costs $O(n^2 m)$ for the kernel matrix
computation, where $n$ is the number of pixels and $m$ the number of variables). The proposed approach shows
improvements with respect to classical clustering techniques. Moreover, the unsupervised kernel clustering
introduces great potential in terms of flexibility (e.g. introducing kernels adapted to the data, or using
composite kernels for the fusion of information [12]) and thus seems to be a candidate for future research in
unsupervised (and semi-supervised and even active) change detection approaches.
ACKNOWLEDGMENTS
This work has been partly supported by the Swiss National Science Foundation projects no. 200021-126505/1
and PBLAP2-127713/1 and by the Spanish Ministry of Science and Innovation under projects AYA2008-05965-
C04-03 and CSD2007-00018.
REFERENCES
[1] Singh, A., “Digital change detection techniques using remote sensing data,” Int. J. Remote Sens. 10(6),
989–1003 (1989).
[2] Coppin, P., Jonckheere, I., Nackaerts, K., Muys, B., and Lambin, E., “Digital change detection methods in
ecosystem monitoring: a review,” Int. J. Remote Sens. 25(9), 1565 – 1596 (2004).
[3] Radke, R. J., Andra, S., Al-Kofahi, O., and Roysam, B., “Image change detection algorithms: A systematic
survey,” IEEE Trans. Image Process. 14(3), 294 – 307 (2005).
[4] Fung, T., “An assessment of TM imagery for land-cover change detection,” IEEE Trans. Geosci. Remote
Sens. 28(4), 681–684 (1990).
[5] Bruzzone, L. and Serpico, S. B., “Detection of changes in remotely-sensed images by the selective use of
multi-spectral information,” Int. J. Remote Sens. 18(18), 3883 – 3888 (1997).
[6] Bruzzone, L. and Prieto, D. F., “Automatic analysis of the difference image for unsupervised change detec-
tion,” IEEE Trans. Geosci. Remote Sens. 38(3), 1171–1182 (2000).
[7] Malila, W. A., “Change vector analysis: An approach for detecting forest changes with Landsat,” in [Proc.
LARS Mach. Process. Remotely Sensed Data Symp.], 326 – 335 (1980).
[8] Bovolo, F. and Bruzzone, L., “A theoretical framework for unsupervised change detection based on change
vector analysis in polar domain,” IEEE Trans. Geosci. Remote Sens. 45(1), 218–236 (2006).
[9] Bovolo, F. and Bruzzone, L., “A split-based approach to unsupervised change detection in large size multi-
temporal images: application to Tsunami-damage assessment,” IEEE Trans. Geosci. Remote Sens. 45(6),
1658–1671 (2007).
[10] Shawe-Taylor, J. and Cristianini, N., [Kernel Methods for Pattern Analysis], Cambridge University Press
(2004).
[11] Camps-Valls, G. and Bruzzone, L., [Kernel Methods for Remote Sensing Data Analysis], J. Wiley & Sons
(2009).
[12] Camps-Valls, G., Gómez-Chova, L., Muñoz-Marí, J., Rojo-Álvarez, J. L., and Martínez-Ramón, M., “Kernel-based
framework for multi-temporal and multi-source remote sensing data classification and change detection,”
IEEE Trans. Geosci. Remote Sens. 46(6), 1822–1835 (2008).
[13] Camps-Valls, G. and Bruzzone, L., “Kernel-based methods for hyperspectral image classification,” IEEE
Trans. Geosci. Remote Sens. 43(3), 1 – 12 (2005).
[14] Nemmour, H. and Chibani, Y., “Multiple support vector machines for land cover change detection: an
application for mapping urban extensions,” J. Photogr. Remote Sens. 61, 125–133 (2006).
[15] Bovolo, F., Camps-Valls, G., and Bruzzone, L., “A support vector domain method for change detection in
multitemporal images,” Pattern Recogn. Lett. 31(10), 1148–1154 (2010).
[16] Volpi, M., Tuia, D., Kanevski, M., Bovolo, F., and Bruzzone, L., “Supervised change detection in VHR
images: a comparative analysis,” in [IEEE International Workshop on Machine Learning for Signal Pro-
cessing], (2009).
[17] Girolami, M., “Mercer kernel-based clustering in feature space,” IEEE Trans. Neural Net. 13(3), 780 – 784
(2002).
[18] Dhillon, I., Guan, Y., and Kulis, B., “A unified view of kernel k-means, spectral clustering and graph cuts,”
Tech. Rep. UTCS Technical Report No. TR-04-25, University of Texas, Austin, Department of Computer
Science (2005).
[19] MacQueen, J., “Some methods for classification and analysis of multivariate observations,” in [Proc. 5th
Berkeley Symp. on Math. Statist. and Prob.], 281–297 (1967).
[20] Bovolo, F., Bruzzone, L., and Marconcini, M., “A novel approach to unsupervised change detection based
on a semisupervised SVM and a similarity measure,” IEEE Trans. Geosci. Remote Sens. 46(7), 2070 – 2082
(2008).
[21] Bovolo, F., Bruzzone, L., and Marchesi, S., “Analysis of the effects of registration noise in multitemporal
VHR images,” in [ESA-EUSC, ESRIN ], (2008).
© SPIE Remote Sensing 2010, Toulouse (F) - downloaded from kernelcd.org
... Clustering is an exploratory data analysis technique that aims at finding structural groups present in the data (Webb 2011). Although it does not provide category labels, clustering has proven to be well suited for unsupervised change detection in remote sensing (Celik 2009;Ding et al. 2015;Ghosh et al. 2011;Volpi et al. 2010Volpi et al. , 2012Zheng et al. 2014b). In the current study, semantic information on the desired classes (i.e., changed and unchanged buildings) is given implicitly by appropriate preparation of the input features (see Section 0). ...
... and Zheng et al. (2014b) apply k-means clustering to difference images of optical and SAR imagery, respectively. Kernel k-means is used for change detection in QuickBird images byVolpi et al. (2010), while the same group of authors extended their approach and applied it to SPOT and Landsat data inVolpi et al. (2012).Ghosh et al. (2011) employ two fuzzy clustering algorithms for binary change detection in Landsat difference images. A technique from a different clustering domain -sparse hierarchical clustering -is used byDing et al. (2015) for change detection in VHR imagery. ...
Thesis
Cities are hot spots of global change. Thus, highly detailed and up-to-date information is required, which can be delineated based on various earth observation sensors. This thesis aims at the development of a change detection approach based on very high resolution (VHR) optical remote sensing data and consequent exemplary application of the assessment of the ghost city phenomenon in the context of urban geography. The unsupervised object-based change detection method captures the construction of individual buildings with accuracy of 0.8 to 0.9 according to kappa statistics in the city of Dongying, China. The methodology utilizes object-based difference features based on existing building geometries for the delimitation of changed and unchanged buildings. It is capable of handling VHR data from different sensors with deviating viewing geometries which allows the utilization of all present and future available sources of VHR data at small spatial scale. The transferability of the approach is investigated with particular focus on the nature and effects of class distribution. For this purpose, a diagnostic framework is developed and consequently applied in two cities of different characteristics. Results showed that situations of imbalanced class distribution generally provide less reliable identification of changes compared to balanced situations. The assessment of the ghost city phenomenon is conducted as an exemplary application of urban geography in the city of Dongying, China. The conceptual framework replicates undercapacity with respect to the residential population as one of the key characteristics of a ghost city. A 4d functional city model is established based on VHR imagery for population capacity estimation of residential buildings and subsequently related to actual permanent residential population from census counts. A significant mismatch and thus, high likelihood for the emergence and presence of the ghost city phenomenon was found in Dongying.
... (Celik 2009) used k-means clustering on the principal component analysis (PCA) of the difference image to extract the change map. (Volpi et al. 2010) introduced a kernel k-means approach that addresses the nonlinearity of the data by representing the difference image in the feature space using kernel functions. Also, in order to obtain a soft clustering, fuzzy c-means (FCM) was used in (Li et al. 2015) to classify the log-ratio difference image. ...
... The leading idea of kernel methods is that nonlinear decision rules can be achieved by running a linear algorithm in a higher-dimensional feature space, the reproducing kernel Hilbert space (RKHS), where the solution is more likely to be linear (Volpi et al. 2010). The mapping function is commonly represented as φ(·). ...
Article
Change detection is one of the most important applications of Polarimetric Synthetic Aperture Radar (PolSAR) data in monitoring urban development and supporting urban planning, owing to the sensitivity of the SAR signal to the geometrical and physical properties of terrestrial features. In this paper, we propose an unsupervised change detection method using change indices extracted from PolSAR images. Kernel k-means clustering is then performed to extract changed areas. Kernel k-means is an unsupervised algorithm that maps the input features to a higher-dimensional Hilbert space by using a kernel function. To better represent changed areas, different change indices were generated. The method was applied to UAVSAR L-band SAR images acquired over an urban area in San Andreas, United States. We evaluated the change detection performance of the proposed approach based on kappa and overall accuracies and compared it with other well-known classic methods.
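The kernel k-means step described above can be sketched as follows. This is a minimal illustration, not the implementation used in any of the listed papers: the Gaussian (RBF) kernel, the function names `rbf_kernel` and `kernel_kmeans`, the bandwidth `sigma`, the synthetic feature vectors, and the initialization by thresholding the feature magnitude at its median are all assumptions made for the example. Distances to cluster means in feature space are computed from the kernel matrix alone, so the mapping itself is never evaluated.

```python
import numpy as np

def rbf_kernel(X, sigma):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def kernel_kmeans(K, init_labels, n_clusters=2, max_iter=100):
    """Kernel k-means on a precomputed kernel matrix K, starting from init_labels."""
    labels = np.asarray(init_labels).copy()
    n = K.shape[0]
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            n_c = mask.sum()
            if n_c == 0:
                continue  # empty cluster: its distance stays at infinity
            # ||phi(x_i) - m_c||^2 = K_ii - (2/n_c) sum_j K_ij + (1/n_c^2) sum_jl K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / n_c
                          + K[np.ix_(mask, mask)].sum() / n_c**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

# Synthetic "difference features": 100 unchanged pixels near the origin,
# 20 changed pixels far from it (illustrative data only).
rng = np.random.default_rng(0)
unchanged = rng.normal(0.0, 0.1, size=(100, 2))
changed = rng.normal(3.0, 0.1, size=(20, 2))
X = np.vstack([unchanged, changed])

# Assumed initialization: label 1 for pixels whose magnitude exceeds the median.
magnitude = np.linalg.norm(X, axis=1)
init = (magnitude > np.median(magnitude)).astype(int)

labels = kernel_kmeans(rbf_kernel(X, sigma=1.0), init)
```

The median threshold deliberately starts from a poor split (half of the pixels labelled as changed), which shows that the iterations, rather than the initialization, produce the final partition in this toy case. On real imagery the result depends strongly on both the initialization and the kernel parameter.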
... In Shah-Hosseini et al. (2015a), two approaches for an automatic CD framework are presented: the first method is based on the integration of the CVA method, kernel-based C-means clustering, and a kernel-based minimum distance classifier; in addition, an SVM-based CD method is presented and analyzed. One more non-linear approach is proposed in Volpi et al. (2010); here, an initialization routine is used in conjunction with an unsupervised cost function to optimize the kernel hyperparameters. ...
... There are some aspects to consider when using kernel-based CD approaches. These aspects have to do with the fact that non-automatic kernel-based methods require labeled samples for training a classifier (Shah-Hosseini et al., 2015b), otherwise optimal kernel parameters and precise training samples have to be defined (Volpi et al., 2010; Shah-Hosseini et al., 2015a); therefore, some of these methods could be inefficient as far as run time is concerned (Shah-Hosseini et al., 2015b). For instance, in all the tests presented in Bovolo et al. (2010), the initialization threshold value was obtained by a manual trial-and-error procedure; moreover, in order to reduce the computational load of the training stage, the resulting sets were randomly sub-sampled. ...
Article
This paper presents a new unsupervised change detection methodology for multispectral images applied to specific land covers. The proposed method compares each image against a reference spectrum, obtained from the spectral signature of the land-cover type to be detected. The method has been tested using multispectral images (SPOT5) of the Community of Madrid (Spain), and multispectral images (Quickbird) of an area over Indonesia that was impacted by the December 26, 2004 tsunami; here, the tests have focused on the detection of changes in vegetation. The image comparison is obtained by applying the Spectral Angle Mapper between the reference spectrum and each multitemporal image. Then, a threshold is applied to produce a single image of change, which corresponds to the vegetation zones. The results for each multitemporal image are combined through an exclusive or (XOR) operation that selects vegetation zones that have changed over time. Finally, the derived results were compared against a supervised method based on classification with the Support Vector Machine. Furthermore, the NDVI-differencing and the Spectral Angle Mapper techniques were selected as unsupervised methods for comparison purposes. The main novelty of the method is the detection of changes in a specific land cover type (vegetation); therefore, the best scenario is to compare it with methods that also aim to detect changes in a specific land cover type. This is the main reason for selecting the NDVI-based method and the post-classification method (SVM implemented in a standard software tool). To evaluate the improvements obtained by using a reference spectrum vector, the results are compared with the basic SAM method. For the SPOT5 image, the overall accuracy was 99.36% and the κ index was 90.11%; for the Quickbird image, the overall accuracy was 97.5% and the κ index was 82.16%.
Finally, the precision results of the method are comparable to those of a supervised method, supported by low detection of false positives and false negatives, along with a high overall accuracy and a high kappa index. On the other hand, the execution times were comparable to those of unsupervised methods of low computational load.
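The pipeline in this abstract (spectral angle against a reference spectrum, a threshold per date, then an XOR of the two masks) can be sketched as below. The function names, the four-band toy spectra, and the threshold `max_angle=0.1` radians are hypothetical values chosen for illustration; the abstract does not state the threshold or the reference spectrum the authors used.

```python
import numpy as np

def spectral_angle(image, reference):
    """Angle in radians between each pixel spectrum and the reference spectrum.

    image: array of shape (rows, cols, bands); reference: array of shape (bands,).
    """
    dot = image @ reference
    norms = np.linalg.norm(image, axis=-1) * np.linalg.norm(reference)
    cos = np.clip(dot / np.maximum(norms, 1e-12), -1.0, 1.0)
    return np.arccos(cos)

def cover_change_map(img_t1, img_t2, reference, max_angle=0.1):
    """Pixels that match the reference cover at exactly one of the two dates."""
    mask_t1 = spectral_angle(img_t1, reference) < max_angle
    mask_t2 = spectral_angle(img_t2, reference) < max_angle
    return np.logical_xor(mask_t1, mask_t2)

# Toy spectra (illustrative values, not measured signatures).
vegetation = np.array([0.05, 0.08, 0.04, 0.50])
soil = np.array([0.20, 0.25, 0.30, 0.35])

img_t1 = np.tile(vegetation, (2, 2, 1))        # all vegetation at the first date
img_t2 = np.tile(1.2 * vegetation, (2, 2, 1))  # brighter scene, same spectral shape
img_t2[0, 1] = soil                            # one pixel lost its vegetation

change = cover_change_map(img_t1, img_t2, vegetation)
```

Because the spectral angle ignores the overall brightness of a spectrum, the uniformly brighter second image is not flagged; only the pixel whose spectral shape moved away from the reference is.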
... In [13], rather than converting the difference image in the polar domain, local PCAs are used in sub-blocks of the image, followed by a binary k-means clustering to detect changed/unchanged areas locally. Kernel clustering has been also studied in [59,60], where kernel k-means with parameters optimized using an unsupervised ANOVA-like cost function is used to separate the two clusters in a fully unsupervised way. Finally, unsupervised neural networks have been considered for binary CD [40,41]. ...
Preprint
Anomalous change detection (ACD) is an important problem in remote sensing image processing. Detecting not only pervasive but also anomalous or extreme changes has many applications for which methodologies are available. This paper introduces a nonlinear extension of a full family of anomalous change detectors. In particular, we focus on algorithms that utilize Gaussian and elliptically contoured (EC) distribution and extend them to their nonlinear counterparts based on the theory of reproducing kernels' Hilbert space. We illustrate the performance of the kernel methods introduced in both pervasive and ACD problems with real and simulated changes in multispectral and hyperspectral imagery with different resolutions (AVIRIS, Sentinel-2, WorldView-2, and Quickbird). A wide range of situations is studied in real examples, including droughts, wildfires, and urbanization. Excellent performance in terms of detection accuracy compared to linear formulations is achieved, resulting in improved detection accuracy and reduced false-alarm rates. Results also reveal that the EC assumption may be still valid in Hilbert spaces. We provide an implementation of the algorithms as well as a database of natural anomalous changes in real scenarios http://isp.uv.es/kacd.html.
... Many authors have proposed different methods of supervised change detection such as wavelet-based methods (Chang and Jay Kuo 1993; Celik and Ma 2010), neural network methods (Christophe and Bisho 1995; Del Frate 2007), fuzzy rule-based methods (Bárdossy and Samaniego 2001), kernel-based methods (Volpi 2012), post-classification methods (Raja 2013; Volpi et al. 2010), active contour models (Li et al. 2015; Chen and Cao 2013), conditional random field models (Cao et al. 2016) and object-based methods (Zhang et al. 2018). These methods can recognize the types of land cover or land use transition and are robust to the varying atmospheric and lighting conditions at the different acquisition times. ...
Article
This paper presents a novel method for segmentation and change detection of multispectral images using proximal splitting-based clustering and a multiclass support vector machine (MSVM). Initially, the multitemporal satellite images are preprocessed and then textures are extracted using a Difference of Offset Gaussian filter. In general, traditional clustering methods use the Euclidean distance as the prime factor in the segmentation process. For multitextured images such as remotely sensed images, this metric provides inconsistent output. To achieve better segmentation results, a proximal splitting algorithm is proposed. This method is treated as the solution of an iterative minimization problem, which is required to find exact changes between the multitemporal images. The MSVM is chosen to group the segmented clusters into a fixed number of classes, since the clusters obtained from the proximal splitting algorithm are not independent of each other. Then, the classified images are subjected to image differencing to detect the changes. Experiments are performed with two real data sets of Landsat7 images, which show that the mean difference in area obtained by the proposed method is reduced by an average of 35.24% compared to the conventional system. The validity index obtained for data set 1 using the proposed algorithm is lower than that of the existing methods.
... Clustering is an exploratory data analysis technique that aims at finding structural groups present in the data (Webb, 2011). Although it does not provide category labels, clustering has proven to be well suited for unsupervised change detection in remote sensing (Celik, 2009; Ding et al., 2015; Ghosh et al., 2011; Volpi et al., 2010, 2012; Zheng et al., 2014). In the current study, semantic information on the desired classes (i.e. ...
Conference Paper
Observing and documenting change is one of the intrinsic capabilities of Earth observation. With the increasing availability of satellite imagery of very high geometric resolution, these capabilities are becoming ever more relevant for analysing dynamic and complex urban regions. Highly automated analysis algorithms for detecting these changes are therefore urgently needed. This talk presents a novel object-based method for automated change detection. The method makes it possible to capture changes to individual buildings in very high resolution optical remote sensing data (e.g. WorldView). In addition, techniques are shown with which the degree of surface sealing can be derived from very high resolution Earth observation data. Since these data sets are not yet available with full coverage for very large areas, such as entire federal states, the automated transfer by means of machine learning methods to imagery available over large areas from sensors such as Sentinel-2 is demonstrated. The methods presented allow cities and urban areas to be captured and characterised in detail with high accuracy. Changes to individual buildings can be captured automatically in order to document the dynamics of cities worldwide. Furthermore, modern remote sensing techniques enable the precise detection and cost-efficient monitoring of sealed surfaces over large areas.
... For example, [5] and [6] apply k-means clustering to difference images of optical and SAR imagery, respectively. Kernel k-means is used for change detection in QuickBird images by [7], while the same group of authors extended their approach and applied it to SPOT and Landsat data in [8]. Ghosh et al. [9] employ two fuzzy clustering algorithms for binary change detection in Landsat difference images. ...
Conference Paper
Remote sensing has proven to be an adequate tool for observation of changes to the Earth’s surface. Especially modern space-borne sensors with very-high spatial resolution offer new capabilities for monitoring of dynamic urban environments. In this context, clustering is a well suited technique for unsupervised and thus highly automatic detection of changes. In this study, seven partitioning clustering algorithms from different methodological categories are evaluated regarding their suitability for unsupervised change detection. In addition, object-based feature sets of different characteristics are included in the analysis assessing their discriminative power for classification of changed against unchanged buildings. In general, the most important property of favorable algorithms is that they do not require additional arbitrary input parameters except the number of clusters. Best results were achieved based on the clustering algorithms k-means, partitioning around medoids, genetic k-means and self-organizing map clustering with accuracies in terms of κ statistics of 0.8 to 0.9 and beyond.
... Clustering is an exploratory data analysis technique that aims at finding structural groups present in the data (Webb, 2011). Although it does not provide category labels, clustering has proven to be well suited for unsupervised change detection in remote sensing (Celik, 2009; Ding et al., 2015; Ghosh et al., 2011; Volpi et al., 2010, 2012; Zheng et al., 2014). In the current study, semantic information on the desired classes (i.e. ...
Conference Paper
Continuous monitoring of changes is one of the intrinsic capabilities of remote sensing. With respect to the increasing availability of very high resolution (VHR) remote sensing imagery, the capabilities become more and more relevant for rapidly changing complex urban environments. Therefore highly automatic concepts for analysis of changes are more and more required. In addition, appropriate unsupervised change detection approaches should be capable of handling VHR remote sensing data acquired by different sensors with possibly deviating viewing geometries and varying solar illumination angles. Especially concerning the high level of detail present in VHR imagery over urban areas, object-based methods facilitate change detection in this context. Another asset of the object-based analysis is that it inherently tackles discrepancies in exact spatial, spectral and radiometric matching of VHR image pairs. The aim of this paper is to present a novel object-based approach for unsupervised change detection with focus on individual buildings. The object-based paradigm allows the characterization of image objects by a large number of features that can be derived from the multi-temporal VHR image pairs. Modern VHR space-borne sensors like QuickBird, GeoEye, WorldView or Pléiades offer at least four multispectral image channels at spatial resolutions of approximately 50 centimeters. Different groups of features (e.g. 1st and 2nd order statistics of image channels) are compared regarding their discriminative power for building change detection. Principal component analysis is used as a feature extraction technique which compensates redundancies among features and enables proper data representation in the multi-dimensional feature space. For discrimination of changed and unchanged buildings, a comprehensive number of clustering algorithms from different methodological categories are evaluated regarding their capability of handling this two-class change detection problem. 
Overall, the proposed approach returned viable results which show the general suitability of clustering for object-based change detection. In detail, highest consistent accuracies were achieved using the algorithms k-means, partitioning around medoids, genetic k-means and the self-organizing map (SOM) clustering technique. We conclude that the proposed approach offers new benefits for building change detection particularly in rapidly changing urban settings, such as in Chinese cities.
... Clustering is an exploratory data analysis technique that aims at finding structural groups present in the data (Webb, 2011). Although it does not provide category labels, clustering has proven to be well suited for unsupervised change detection in remote sensing (Celik, 2009; Ding et al., 2015; Ghosh et al., 2011; Volpi et al., 2010, 2012; Zheng et al., 2014). In the current study, semantic information on the desired classes (i.e. ...
Article
Monitoring of changes is one of the most important inherent capabilities of remote sensing. The steadily increasing amount of available very-high resolution (VHR) remote sensing imagery requires highly automatic methods and thus, largely unsupervised concepts for change detection. In addition, new procedures that address this challenge should be capable of handling remote sensing data acquired by different sensors. Thereby, especially in rapidly changing complex urban environments, the high level of detail present in VHR data indicates the deployment of object-based concepts for change detection. This paper presents a novel object-based approach for unsupervised change detection with focus on individual buildings. First, a principal component analysis together with a unique procedure for determination of the number of relevant principal components is performed as a predecessor for change detection. Second, k-means clustering is applied for discrimination of changed and unchanged buildings. In this manner, several groups of object-based difference features that can be derived from multi-temporal VHR data are evaluated regarding their discriminative properties for change detection. In addition, the influence of deviating viewing geometries when using VHR data acquired by different sensors is quantified. Overall, the proposed workflow returned viable results in the order of κ statistics of 0.8–0.9 and beyond for different groups of features, which demonstrates its suitability for unsupervised change detection in dynamic urban environments. With respect to imagery from different sensors, deviating viewing geometries were found to deteriorate the change detection result only slightly in the order of up to 0.04 according to κ statistics, which underlines the robustness of the proposed approach.
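Several abstracts on this page report accuracy as κ statistics. As a reminder of what that number measures, here is a minimal sketch of Cohen's kappa computed from a reference labelling and a predicted labelling. The function name `cohen_kappa` and the ten toy labels are assumptions for illustration and are unrelated to the reported experiments.

```python
import numpy as np

def cohen_kappa(reference, predicted):
    """Cohen's kappa: agreement between two labellings, corrected for chance."""
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    classes = np.union1d(reference, predicted)
    # Confusion matrix: rows are reference classes, columns are predicted classes.
    cm = np.zeros((len(classes), len(classes)))
    for i, r in enumerate(classes):
        for j, p in enumerate(classes):
            cm[i, j] = np.sum((reference == r) & (predicted == p))
    n = cm.sum()
    p_observed = np.trace(cm) / n
    # Agreement expected by chance from the row and column marginals.
    p_expected = (cm.sum(axis=1) @ cm.sum(axis=0)) / n**2
    # Undefined (division by zero) when p_expected equals 1.
    return (p_observed - p_expected) / (1.0 - p_expected)

# Toy example: 1 = changed building, 0 = unchanged building.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(reference, predicted)
```

In this toy case nine of ten labels agree (overall accuracy 0.9), yet kappa is lower because part of that agreement is expected by chance given the class proportions. This is why kappa is usually reported alongside overall accuracy when the changed and unchanged classes are imbalanced.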
Article
Techniques based on multi-temporal, multi-spectral, satellite-sensor-acquired data have demonstrated potential as a means to detect, identify, map and monitor ecosystem changes, irrespective of their causal agents. This review paper, which summarizes the methods and the results of digital change detection in the optical/infrared domain, has as its primary objective a synthesis of the state of the art today. It approaches digital change detection from three angles. First, the different perspectives from which the variability in ecosystems and the change events have been dealt with are summarized. Change detection between pairs of images (bi-temporal) as well as between time profiles of imagery-derived indicators (temporal trajectories), and, where relevant, the appropriate choices for digital imagery acquisition timing and change interval length definition, are discussed. Second, pre-processing routines either to establish a more direct linkage between remote sensing data and biophysical phenomena, or to temporally mosaic imagery and extract time profiles, are reviewed. Third, the actual change detection methods themselves are categorized in an analytical framework and critically evaluated. Ultimately, the paper highlights how some of these methodological aspects are being fine-tuned as this review is being written, and we summarize the new developments that can be expected in the near future. The review highlights the high complementarity between different change detection methods.
Conference Paper
In this paper, a comparison between supervised change detection methods for very high geometrical resolution satellite images is considered. Methods commonly used for high and medium resolution are here confronted to the problem of exploiting very high resolution imagery, which is characterized by strong redundancy, high variances of information composing objects, collinearity and noise. Three supervised methods for change detection are compared: the post classification comparison, the direct multidate classification and the difference image analysis. Each method is built using support vector machines for the purpose of detecting urban changes between two QuickBird scenes of the city of Zurich, Switzerland. The benefits of adding spatial and contextual information are also studied. Comparison between the performance of the approaches, as well as considerations about the adaptability of such methods to very high geometrical resolution are reported.
Article
A variety of procedures for change detection based on comparison of multitemporal digital remote sensing data have been developed. An evaluation of results indicates that various procedures of change detection produce different maps of change even in the same environment. -Author
Article
Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning objective include ratio cut, normalized cut and ratio association. Our equivalence has important consequences: the weighted kernel k-means algorithm may be used to directly optimize the graph partitioning objectives, and conversely, spectral methods may be used to optimize the weighted kernel k-means objective. Hence, in cases where eigenvector computation is prohibitive, we eliminate the need for any eigenvector computation for graph partitioning. Moreover, we show that the Kernighan-Lin objective can also be incorporated into our framework, leading to an incremental weighted kernel k-means algorithm for local optimization of the objective. We further discuss the issue of convergence of weighted kernel k-means for an arbitrary graph affinity matrix and provide a number of experimental results. These results show that non-spectral methods for graph partitioning are as effective as spectral methods and can be used for problems such as image segmentation in addition to data clustering.
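For orientation, the weighted kernel k-means objective referred to in this abstract is usually written as below. The notation (points a_i, weights w_i, clusters π_c, mapping φ, kernel matrix K) is an assumption following common usage, since the abstract itself introduces no symbols.

```latex
% Weighted kernel k-means objective over a partition \{\pi_c\}_{c=1}^{k}
D\bigl(\{\pi_c\}_{c=1}^{k}\bigr)
  = \sum_{c=1}^{k} \sum_{a_i \in \pi_c} w_i \,\bigl\lVert \phi(a_i) - m_c \bigr\rVert^{2},
\qquad
m_c = \frac{\sum_{a_j \in \pi_c} w_j \,\phi(a_j)}{\sum_{a_j \in \pi_c} w_j}.

% Distance to a cluster mean, expanded with K_{ij} = \phi(a_i)^{\top}\phi(a_j)
\bigl\lVert \phi(a_i) - m_c \bigr\rVert^{2}
  = K_{ii}
  - \frac{2 \sum_{a_j \in \pi_c} w_j K_{ij}}{\sum_{a_j \in \pi_c} w_j}
  + \frac{\sum_{a_j, a_l \in \pi_c} w_j w_l K_{jl}}{\bigl(\sum_{a_j \in \pi_c} w_j\bigr)^{2}}.
```

The second expression only involves entries of the kernel matrix, which is what allows the algorithm to run on any affinity matrix supplied in place of K. With all weights equal to one it reduces to the ordinary kernel k-means distance.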
Article
In this Letter, an unsupervised algorithm for detecting changes in multi-spectral and multi-temporal remotely-sensed images is presented. Such an algorithm makes it possible to reduce the effects of 'registration noise' on the accuracy of change detection. In addition, it can be used to reduce the typologies of detected changes in order to better locate the changes under investigation.
Book
Kernel methods have long been established as effective techniques in the framework of machine learning and pattern recognition, and have now become the standard approach to many remote sensing applications. With algorithms that combine statistics and geometry, kernel methods have proven successful across many different domains related to the analysis of images of the Earth acquired from airborne and satellite sensors, including natural resource control, detection and monitoring of anthropic infrastructures (e.g. urban areas), agriculture inventorying, disaster prevention and damage assessment, and anomaly and target detection. Presenting the theoretical foundations of kernel methods (KMs) relevant to the remote sensing domain, this book serves as a practical guide to the design and implementation of these methods. Five distinct parts present state-of-the-art research related to remote sensing based on the recent advances in kernel methods, analysing the related methodological and practical challenges: Part I introduces the key concepts of machine learning for remote sensing, and the theoretical and practical foundations of kernel methods. Part II explores supervised image classification including Support Vector Machines (SVMs), kernel discriminant analysis, multi-temporal image classification, target detection with kernels, and Support Vector Data Description (SVDD) algorithms for anomaly detection. Part III looks at semi-supervised classification with transductive SVM approaches for hyperspectral image classification and kernel mean data classification. Part IV examines regression and model inversion, including the concept of a kernel unmixing algorithm for hyperspectral imagery, the theory and methods for quantitative remote sensing inverse problems with kernel-based equations, kernel-based BRDF (Bidirectional Reflectance Distribution Function), and temperature retrieval KMs.
Part V deals with kernel-based feature extraction and provides a review of the principles of several multivariate analysis methods and their kernel extensions. This book is aimed at engineers, scientists and researchers involved in remote sensing data processing, and also those working within machine learning and pattern recognition.
Article
The reliability of support vector machines for classifying hyper-spectral images of remote sensing has been proven in various studies. In this paper, we investigate their applicability for land cover change detection. First, SVM-based change detection is presented and performed for mapping urban growth in the Algerian capital. Different performance indicators, as well as a comparison with artificial neural networks, are used to support our experimental analysis. In a second step, a combination framework is proposed to improve change detection accuracy. Two combination rules, namely, Fuzzy Integral and Attractor Dynamics, are implemented and evaluated with respect to individual SVMs. Recognition rates achieved by individual SVMs, compared to neural networks, confirm their efficiency for land cover change detection. Furthermore, the relevance of SVM combination is highlighted.