
Unsupervised Change Detection by Kernel Clustering

Michele Volpi (a), Devis Tuia (a,b), Gustavo Camps-Valls (b) and Mikhail Kanevski (a)

(a) Institute of Geomatics and Analysis of Risk, Université de Lausanne
Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
{michele.volpi,devis.tuia,mikhail.kanevski}@unil.ch

(b) Image Processing Laboratory, Universitat de València
Catedrático A. Escardino - 46980 Paterna, València, Spain
gcamps@uv.es

ABSTRACT

This paper presents a novel unsupervised clustering scheme to find changes in two or more coregistered remote sensing images acquired at different times. The method finds nonlinear decision boundaries for the change detection problem by exploiting a kernel-based clustering algorithm: the kernel k-means algorithm is used to cluster the two groups of pixels belonging to the 'change' and 'no change' classes (binary mapping). We provide an effective way to solve the two main challenges of such approaches: i) the initialization of the clustering scheme and ii) the estimation of the kernel function hyperparameter(s) without an explicit training set. The former is solved by initializing the algorithm on the basis of the Spectral Change Vector (SCV) magnitude, while the latter is optimized by minimizing a cost function inspired by the geometrical properties of the clustering algorithm. Experiments on VHR optical imagery prove the consistency of the proposed approach.

Keywords: Unsupervised change detection, Kernel k-means, Clustering, Remote sensing, VHR imagery

1. INTRODUCTION

In recent years, the increasing number of Earth Observation satellites and the growing resolution of the acquired optical images have raised the interest of the remote sensing community in the change detection problem. Satellites with enhanced spatial (fine-scale detection) and temporal (near real-time monitoring) resolution provide images particularly suited to studying the evolution of the ground cover: the detection of changes between images acquired at different times over the same geographical area has become a major research area.

The analysis of multitemporal images can be addressed by two main paradigms: supervised and unsupervised (or clustering). The former requires a labeled set of examples provided by the user. It is particularly well suited when many classes of land cover evolution have to be detected and summarized in a map. The latter does not require labeled information: it generally provides binary maps and is particularly adapted to real-life problems, where the influence of the user must be minimal (i.e. no fitting of parameters, no manual thresholding and no training set definition).1–3

In the literature, many unsupervised change detection algorithms can be found. Several studies have been carried out regarding the automatic analysis of the difference image.4 An example is the algorithmic comparison of the scale-invariant Mahalanobis distance between the pixels of the difference image, in order to map a specific typology of change.5 The advent of high resolution images with a short revisit time urged the need to study the statistics of the multitemporal difference image accurately, in order for these methods to be effective. Bayes decision rule and Markov random fields were introduced to deal with automatic threshold selection (exploiting the expectation maximization algorithm) and to account for contextual information in the process.6 Similar principles of distribution estimation are nowadays adopted in Change Vector Analysis (CVA),7–9 where Spectral Change Vectors (SCV) are computed by subtracting the corresponding multidimensional pixels at different times and studying their magnitude (discriminating radiometric changes) and angles (discriminating ground cover classes).

Further author information: Michele Volpi, IGAR, Bâtiment Amphipôle, Quartier UNIL-Sorge, CH-1015 Lausanne. +41 21 692 3546

© SPIE Remote Sensing 2010, Toulouse (F) - downloaded from kernelcd.org

In parallel to unsupervised techniques, advanced machine learning techniques were introduced in the remote sensing community. In particular, kernel methods10 have shown accurate and robust behavior when applied to remote sensing data.11–13 Supervised change detection techniques exploiting these paradigms have shown their relevance in many studies,14–16 thus opening interesting fundamental research areas between pattern recognition and remote sensing image processing.

The rationale of this paper is to study the flexibility of kernel methods with respect to nonlinearities in the context of the unsupervised change detection problem. Kernel methods build linear models in a (high dimensional) feature space to which data are mapped; the resulting solution in the input space is nonlinear. Classical unsupervised partitioning methods are suboptimal at detecting binary changes because of the nonlinear nature of the change: however, if the feature space spanned by the kernel function maximizes separability, a linear partitioning algorithm can discover the correct partitioning. In order to exploit this intuition, the well known k-means algorithm is adapted to find clusters in that higher dimensional space by using its kernel counterpart.10, 17, 18 On one hand the results are improved with respect to classical explicitly linear algorithms, but on the other hand some problems arise. Issues related to the initialization of the kernel k-means and to the optimization of the kernel hyperparameter(s) are discussed, and effective ways to overcome these problems are proposed.

The rest of the paper is organized as follows. Section 2 introduces the kernel k-means algorithm. In Section 3 the change detection setting is introduced, discussing key problems and the proposed solutions. Section 4 evaluates the effectiveness of the proposed approach on a QuickBird pansharpened image. Section 5 concludes the paper and discusses some future perspectives.

2. THE KERNEL K-MEANS

This section presents the kernel k-means algorithm starting from the well known k-means clustering technique.19 This approach is very useful to discover a natural partitioning of the input patterns X in their input space \mathcal{X} into k groups. The algorithm assigns to each element \mathbf{x}_i \in X the cluster membership k that minimizes the distance to the cluster gravity center \mathbf{m}_k:

$$ d^2(\mathbf{x}_i, \mathbf{m}_k) = \|\mathbf{x}_i - \mathbf{m}_k\|^2, \quad (1) $$

where \mathbf{m}_k = \frac{1}{|\pi_k|} \sum_{j \in \pi_k} \mathbf{x}_j, \pi_k is the set of elements assigned to cluster k, and |\pi_k| is their number. When all the patterns are assigned to their corresponding clusters, the mean vectors \mathbf{m}_k are updated by averaging the coordinates of the elements of each cluster, thus providing a new gravity center. The process is iterated until the centers stabilize and the algorithm converges to a minimum of d^2(\mathbf{x}_i, \mathbf{m}_k), \forall i, k. Standard k-means is particularly adapted to linear problems, i.e. when the input space is organized in spherical clusters.
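This alternation between the assignment step of Eq. (1) and the gravity-center update can be sketched as follows (an illustrative NumPy re-implementation, not the authors' code; the deterministic farthest-point initialization is an assumption added here for reproducibility):

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Plain k-means: alternate cluster assignment (Eq. 1) and
    gravity-center update until the centers stabilize.
    Farthest-point initialization is used here for determinism."""
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # squared Euclidean distance of every sample to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

On two well-separated spherical clouds this converges in a few iterations; on the overlapping, non-spherical 'change'/'no change' clusters, the kernel counterpart becomes preferable.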

The kernel version of k-means relies on the same principles, but instead of working in the input space \mathcal{X}, it works in a higher dimensional feature space \mathcal{H}, in which non-spherical clusters in the input space are mapped into spherical ones, and can consequently be detected correctly. This higher dimensional space is usually induced by a mapping function \varphi(\cdot), whose images \varphi(\mathbf{x}_i) correspond to mapped samples in \mathcal{H}. Using mapped samples, the k-means objective becomes:

$$ d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) = \|\varphi(\mathbf{x}_i) - \mathbf{m}_k\|^2, \quad (2) $$

$$ \text{where} \quad \mathbf{m}_k = \frac{1}{|\pi_k|} \sum_{j \in \pi_k} \varphi(\mathbf{x}_j). \quad (3) $$

This is equivalent to

$$ d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) = \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_i) \rangle + \langle \mathbf{m}_k, \mathbf{m}_k \rangle - 2 \langle \varphi(\mathbf{x}_i), \mathbf{m}_k \rangle. \quad (4) $$


By plugging (3) into (4), and replacing the dot product \langle \varphi(\cdot), \varphi(\cdot) \rangle by a proper kernel function k(\cdot, \cdot), the kernel k-means formulation10, 17, 18 is obtained as:

$$ d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) = \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_i) \rangle + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} \langle \varphi(\mathbf{x}_j), \varphi(\mathbf{x}_m) \rangle - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} \langle \varphi(\mathbf{x}_i), \varphi(\mathbf{x}_j) \rangle $$
$$ = k(\mathbf{x}_i, \mathbf{x}_i) + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} k(\mathbf{x}_j, \mathbf{x}_m) - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} k(\mathbf{x}_i, \mathbf{x}_j). \quad (5) $$

Kernel functions are applied to overcome the problems related to the explicit computation of the mapping function, which can be costly and difficult. With kernels, the value of the dot product in the feature space is evaluated directly from the values of the samples in the input space.
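A tiny numerical check of this identity, using the homogeneous polynomial kernel of degree 2 (chosen here because its mapping \varphi can still be written down explicitly; the example is illustrative and not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Explicit route: map every sample to H with
# phi(x) = (x_a * x_b for all pairs a, b), then take dot products.
Phi = np.einsum('ia,ib->iab', X, X).reshape(len(X), -1)
G_explicit = Phi @ Phi.T

# Kernel route: evaluate k(x, z) = <x, z>^2 directly in the input
# space, without ever forming phi.
G_kernel = (X @ X.T) ** 2

# The two Gram matrices coincide
assert np.allclose(G_explicit, G_kernel)
```

For kernels such as the Gaussian RBF, the feature space is infinite dimensional and only the kernel route is available.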

The kernel values can be interpreted as a similarity measure between samples, and thus kernel k-means can be seen as a clustering algorithm that first groups similar points and then separates the groups, working linearly in a higher dimensional feature space. As for the linear version, the process is iterated until convergence, by assigning the cluster membership k that solves the following minimization problem:

$$ \arg\min_{\mathbf{m}_k} \{ d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k) \} = \arg\min_{\mathbf{m}_k} \Big\{ k(\mathbf{x}_i, \mathbf{x}_i) + \frac{1}{|\pi_k|^2} \sum_{j,m \in \pi_k} k(\mathbf{x}_j, \mathbf{x}_m) - \frac{2}{|\pi_k|} \sum_{j \in \pi_k} k(\mathbf{x}_i, \mathbf{x}_j) \Big\}. \quad (6) $$

Note that, since the mapping is not explicitly known, the exact coordinates of the cluster centers in \mathcal{H} cannot be computed explicitly. However, the explicit center coordinates are not needed to assign a pattern to its cluster. When needed, the pixel closest to the center (the medoid) is taken as the center.

In terms of complexity, kernel k-means scales as O(n^2(\epsilon + m)), where n is the number of samples, \epsilon the number of iterations and m the dimensionality. The classical k-means algorithm, on the other hand, is less demanding, scaling as O(\epsilon n m k), where k is the number of clusters.
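The update of Eqs. (5)-(6) can be carried out entirely from the Gram matrix, without ever touching coordinates in \mathcal{H}. A minimal sketch (illustrative only; the initial labels stand in for the SCV-based initialization of Section 3):

```python
import numpy as np

def kernel_kmeans(K, init_labels, n_iter=50):
    """Kernel k-means driven only by the Gram matrix K[i, j] = k(x_i, x_j).
    Cluster centers in H are never formed explicitly: the distance of
    Eq. (5) is expanded with the kernel trick."""
    labels = init_labels.copy()
    clusters = np.unique(labels)
    n = K.shape[0]
    for _ in range(n_iter):
        d2 = np.full((n, len(clusters)), np.inf)
        for c, kk in enumerate(clusters):
            mask = labels == kk
            nk = mask.sum()
            if nk == 0:
                continue  # empty cluster: leave its distances at +inf
            # Eq. (5): k(x_i, x_i) + sum_{j,m} k(x_j, x_m) / |pi_k|^2
            #          - 2 sum_j k(x_i, x_j) / |pi_k|
            d2[:, c] = (np.diag(K)
                        + K[np.ix_(mask, mask)].sum() / nk ** 2
                        - 2.0 * K[:, mask].sum(axis=1) / nk)
        new_labels = clusters[d2.argmin(axis=1)]
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Note that each iteration touches the full n × n Gram matrix, which is the source of the O(n^2) term discussed above.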

3. THE CHANGE DETECTION SETTING

As mentioned above, two main issues have to be solved in order to apply this clustering algorithm in a completely unsupervised way. In this section, the problems of initialization and of kernel parameter estimation are detailed.

3.1 Overcoming bad initializations

The main issue of unsupervised algorithms is to find a proper initialization allowing the method to converge to the global minimum ('true' clusters) or to a sufficiently deep local minimum. This issue can be greatly alleviated by choosing a near-optimal initialization, i.e. finding centers within or close enough to the correct clusters. Here, the idea is to initialize the kernel k-means with two subsets that belong with high probability to their respective clusters. In order to estimate the 'change' and 'no change' class distributions from which the centroids are computed, the Spectral Change Vector7 magnitude is exploited. Change Vector Analysis (CVA) has been widely used in many applications and, after Bovolo and Bruzzone,8 a wide range of applications has been reported (as initialization,20 as a change detector itself8 or for exploratory data analysis21), and its behavior is now largely understood.

The SCV analysis consists of computing the difference image and analyzing the distribution of magnitudes and angles in order to discriminate changes. In this paper we exploit the magnitude vector computed as \delta_i = \|\mathbf{x}_i^{t_2} - \mathbf{x}_i^{t_1}\|, where the \mathbf{x}_i^{\{t_1, t_2\}} are the multidimensional pixels at the two times. This distribution can be seen as a mixture of two Gaussians, one for the unchanged pixels and another for the changed pixels. The interested reader can find more details in the aforementioned papers.
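The magnitude computation and the extraction of the two confident pools can be sketched as follows. The function names are illustrative, and the 1-D two-means split is a simple stand-in for the minimum-error Bayes threshold used in the paper:

```python
import numpy as np

def scv_magnitude(img_t1, img_t2):
    """Per-pixel magnitude of the Spectral Change Vector,
    delta_i = ||x_i^{t2} - x_i^{t1}|| (images: H x W x bands)."""
    diff = img_t2.astype(float) - img_t1.astype(float)
    return np.sqrt((diff ** 2).sum(axis=-1))

def pseudo_training_sets(delta, t=0.5):
    """Split pixels into confident 'no change' / 'change' pools.

    The threshold T between the two modes is found here with a crude
    1-D two-means iteration (a stand-in for the minimum-error rule);
    the ambiguous overlap zone [T - t, T + t] is discarded."""
    d = delta.ravel()
    T = 0.5 * (d.min() + d.max())
    for _ in range(50):
        T_new = 0.5 * (d[d <= T].mean() + d[d > T].mean())
        if abs(T_new - T) < 1e-9:
            break
        T = T_new
    no_change = np.flatnonzero(d < T - t)  # confident 'no change' pixels
    change = np.flatnonzero(d > T + t)     # confident 'change' pixels
    return T, no_change, change
```

The two returned index pools play the role of the pseudo training set from which the centroids are computed.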


[Figure 1] Mixture of two Gaussians f(δ) describing the two classes along the SCV magnitude δ. The initial centers are randomly picked from the 'no change' region (left distribution) and the 'change' region (right distribution). T corresponds to a near-optimal threshold separating the two distributions, with an overlapping zone [T − t, T + t].

This principle is illustrated in Figure 1: two subsets can be randomly initialized from each distribution, according to a threshold selected with the minimum-error decision rule. It is worth mentioning that at this point a change map could already be produced by assigning pixels according to the threshold on the magnitude distribution (T in Figure 1). This solution is not optimal for several reasons: as pointed out in many studies,21 such an approach suffers greatly from residual registration misalignments and noise. Moreover, the separation of the 'change' and 'no change' clusters should be addressed by a nonlinear approach, due to the strong overlap between the two class distributions. This is particularly true for high / very high geometrical resolution images, where the class distributions strongly overlap and the images are affected by high variance.

In the approach proposed in this paper, the near-Gaussian distributions of Figure 1 are exploited in order to estimate the cluster centroids (as a pseudo training set for the kernel k-means). Once a good initialization is obtained by a correct thresholding, convergence is also favored (within the limits dictated by possible sensor noise or outliers in the pixel magnitude values). It is worth mentioning that the number of samples needed for the estimation of the kernel parameter(s) is only marginally important, while the description of the distribution should be complete in order to reproduce the variability of the data (i.e. the extent of the clusters).

3.2 Learning the kernel parameter in an unsupervised way

The second main challenge is the fitting of the kernel parameters. Usually, such parameters are chosen by evaluating the algorithm on some labeled examples (e.g. leave-one-out or cross validation) and retaining the parameter set Θ that minimizes some predefined cost function. In this paper we propose an unsupervised and geometrically-inspired cost function that automatically chooses a correct parameter set for the dataset at hand. This cost function is formulated as:

$$ \arg\min_{\Theta} \left\{ \frac{\sum_k \frac{1}{|\pi_k|} \sum_{i \in \pi_k} d^2(\varphi(\mathbf{x}_i), \mathbf{m}_k)}{\sum_{k \neq p} d^2(\mathbf{m}_k, \mathbf{m}_p)} \right\}, \quad (7) $$

where Θ is the set of parameters of the kernel function to be learned. The optimal geometrical distribution of the samples is formulated in terms of intra-cluster and inter-cluster distances. The distances induced in the feature space are used as an index to achieve the best possible description for kernel k-means. The minimization in Eq. (7) can be seen as a maximization of cluster separability: minimizing the numerator favors compact clusters in terms of distances to their centers, while maximizing the denominator favors a kernel that maps samples into two clusters with distant centers. Any search algorithm (e.g. line/grid search, simulated annealing and others) can be used to estimate the cost generated by the elements of a given parameter set Θ.
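Both terms of Eq. (7) expand in kernel values, since intra-cluster scatter is given by Eq. (5) and the squared distance between two implicit centers is \langle \mathbf{m}_k, \mathbf{m}_k \rangle + \langle \mathbf{m}_p, \mathbf{m}_p \rangle − 2\langle \mathbf{m}_k, \mathbf{m}_p \rangle. A sketch (illustrative; each pair of centers is counted once here, which only rescales the cost and does not move its minimum):

```python
import numpy as np

def clustering_cost(K, labels):
    """Unsupervised cost of Eq. (7): mean intra-cluster scatter in H
    divided by the inter-cluster center distances, evaluated purely
    through the Gram matrix K (the centers are never formed)."""
    clusters = np.unique(labels)
    intra = 0.0
    # kernel expansion of ||phi(x_i) - m_k||^2, averaged per cluster
    for kk in clusters:
        mask = labels == kk
        nk = mask.sum()
        Kc = K[np.ix_(mask, mask)]
        d2 = np.diag(Kc) + Kc.sum() / nk ** 2 - 2.0 * Kc.sum(axis=1) / nk
        intra += d2.mean()
    inter = 0.0
    # ||m_k - m_p||^2 = <m_k,m_k> + <m_p,m_p> - 2 <m_k,m_p>
    for a in range(len(clusters)):
        for b in range(a + 1, len(clusters)):
            ma, mb = labels == clusters[a], labels == clusters[b]
            na, nb = ma.sum(), mb.sum()
            inter += (K[np.ix_(ma, ma)].sum() / na ** 2
                      + K[np.ix_(mb, mb)].sum() / nb ** 2
                      - 2.0 * K[np.ix_(ma, mb)].sum() / (na * nb))
    return intra / inter
```

A line search over the RBF bandwidth then simply recomputes K for each candidate σ and keeps the σ with the smallest cost.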


3.3 The change detection algorithm

Starting with two coregistered and equalized images, the proposed algorithm can be summarized in four steps, illustrated by Figure 2.

[Figure 2] The workflow of the proposed approach: images t1 and t2 → initialization on the SCV magnitude → parameter estimation → centers computation → cluster assignment → binary change map.

1) Initialization: in order to apply image differencing, the scenes must first be preprocessed in terms of histogram matching and normalization of their values. Then the initialization based on the thresholding of the SCV magnitude can be applied. The images are subtracted, and the difference vectors (the SCV) are analyzed through their norm. The threshold and the confidence interval ([T − t, T + t] in Figure 1) indicate where the pixels are mixed in terms of magnitude: thus, outside this interval, the samples are more likely to belong to either group and a pseudo training set can be extracted.

2) Parameter estimation: once the correct threshold is found, the kernel k-means algorithm is exploited as a wrapper to choose the best parameter optimizing Eq. (7): the pseudo training set is clustered with different parameters until a minimum of the cost function is found.

3) Centroids computation: the algorithm returns the centroids and the cluster assignment corresponding to the best parameter. It is worth mentioning that the choice of computing the centroids only on a subset of pixels, and not on the whole image, is justified by two criteria. First, by the strong overlap of the classes: this way, unbiased centers of the two classes are computed, and the pixels in the overlapping part of the distributions are assigned to the corresponding cluster (the closest in \mathcal{H}). Second, estimating the centers only on a proper subset of the image reduces both the computational time (in terms of algorithm convergence) and the computational complexity of the single iterations of kernel k-means. This is an important issue, especially taking into account the computational cost of the partitioning algorithm.

4) Change detection: once the centroids are computed, each pixel in the difference image is assigned to the cluster whose center is closest in \mathcal{H}. To do so, kernel k-means with the optimized parameters is applied to the entire difference image.
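The four steps can be strung together as in the sketch below. All names, the fixed σ and the 1-D two-means threshold are illustrative stand-ins: the paper selects T with a minimum-error rule and σ by minimizing Eq. (7), and it assumes coregistered, radiometrically equalized inputs.

```python
import numpy as np

def detect_changes(img_t1, img_t2, t=0.5, sigma=1.0):
    """End-to-end sketch of the four-step workflow (illustrative)."""
    # 1) Initialization: SCV magnitude and confident pseudo training sets
    bands = img_t1.shape[-1]
    diff = (img_t2.astype(float) - img_t1.astype(float)).reshape(-1, bands)
    delta = np.sqrt((diff ** 2).sum(axis=1))
    T = 0.5 * (delta.min() + delta.max())
    for _ in range(50):  # crude 1-D two-means threshold (stand-in)
        T = 0.5 * (delta[delta <= T].mean() + delta[delta > T].mean())
    nc = np.flatnonzero(delta < T - t)  # confident 'no change'
    ch = np.flatnonzero(delta > T + t)  # confident 'change'

    # 2)-3) Centers defined by the pseudo sets (parameter search omitted)
    def rbf(A, B):
        d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def d2_to_center(S):
        # ||phi(x) - m||^2 from Eq. (5), dropping the constant k(x, x)
        return rbf(S, S).sum() / len(S) ** 2 - 2 * rbf(diff, S).mean(axis=1)

    # 4) Assign every SCV to the nearest implicit center in H
    change_map = d2_to_center(diff[ch]) < d2_to_center(diff[nc])
    return change_map.reshape(img_t1.shape[:2])
```

On a real scene, the pseudo sets would additionally be subsampled (e.g. 500 balanced pixels, as in Section 4) to keep the kernel evaluations tractable.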

4. DATA AND EXPERIMENTAL RESULTS

[Figure 3] QuickBird images acquired in (a) 2002 and (b) 2006.

In this section, the proposed approach to unsupervised change detection is validated on a pansharpened QuickBird image of the city of Zurich (Switzerland). The available images are shown in Figure 3. The results of the proposed method are compared to simple thresholding of the histogram and to linear k-means. Accuracies are evaluated in terms of AUC (Area Under the ROC Curve) estimated on the basis of the available ground truth. Additionally, binary confusion matrices are provided for a single experiment, together with basic accuracy metrics.

A total of 15 experiments (corresponding to different initializations of the pseudo training set) were carried out for the kernel k-means approach (with a Gaussian RBF kernel function) and for the linear k-means. The centers are evaluated on the pseudo training set extracted on the basis of the given regions of the magnitude histogram: at each iteration, a balanced pseudo training set of 500 pixels is extracted and used for computing the centroids of both clustering approaches. In order to have a deterministic term of comparison, the CVA was carried out by thresholding the magnitude distribution.


4.1 Results and discussion

The AUC for the three approaches are reported in Table 1. The nonlinear solution improves on its linear counterpart and on the unidimensional thresholding, reaching globally higher accuracies. It is interesting to see that the proposed approach greatly reduces the false alarms produced by k-means clustering on the difference image (cf. ROC curves in Figure 4 and confusion matrices in Table 2). Regarding the true changes (the true positives for the 'change' class), the algorithms are not far apart in terms of performance, given the simplicity of the difference image. On the other hand, the false positive rate is greatly reduced by the proposed approach. The averaged ROC curves illustrated in Figure 4 and the AUC (for the k-means and the kernel k-means approaches) show good performance in terms of detection of true changes for all the algorithms, with a better performance for the kernel approach.

      CVA     k-means   kernel k-means
AUC   0.912   0.923     0.974

Table 1. Mean Area Under the ROC Curve (AUC). The averages are based on 15 independent experiments for the k-means and the kernel k-means; the CVA was carried out only once.

Actual labels vs. predicted (P)

                CVA               k-means           kernel k-means
                C       NC        C       NC        C       NC
P    C          11031   13241     12242   12987     12160   7766
     NC         12778   190926    67      191180    149     196401

Basic accuracy metrics

OA (%)          93.25             93.96             96.34
κ               0.57              0.62              0.74

Table 2. Confusion matrices and accuracy metrics (Overall Accuracy - OA; Cohen's Kappa - κ) for three models (randomly chosen). 'C' corresponds to the 'change' class and 'NC' to 'no change'.

In Figure 5, the final binary change detection maps are illustrated. Black corresponds to the 'change' class, while white corresponds to the 'no change' class. Note that for the kernel k-means and the k-means approaches, the maps represent the number of hits of the clustering algorithms. Thanks to the proper initialization, both algorithms converge to the correct solution in most of the iterations; only the k-means clustered unwanted pixels in one experiment (the light gray regions in Figure 5).

[Figure 4] ROC curves (true positive rate vs. false positive rate) for CVA, k-means and kernel k-means.

Observing the results of the clustering in Figure 5, the kernel approach shows fewer false alarms, greatly reducing the effect of the shadows on the change detection. The CVA approach is affected by both shadowed pixels and remarkable differences in the reflective response of the ground, but its true positive ratio is high. The k-means approach reduces the effect of the shadows, but is greatly affected by the differences in reflectance between the images and shows potential instability even if the centers are initialized on the magnitude. The kernel k-means finally shows a reduced effect of both principal sources of error. The shadows and shadow-related changes are rarely assigned to the 'change' cluster. The radiometric differences between the images, even if less pronounced than with the k-means scheme, still influence the false positive rate. Globally, in terms of true positive detection, the k-means and the kernel k-means perform similarly, but the most noticeable difference is found in terms


[Figure 5] (a) CVA, (b) k-means and (c) kernel k-means. For (b) and (c), white corresponds to 0 hits (100% hits for the class 'no change') and black corresponds to 15 hits (100% hits for the class 'change'). Here the term 'hits' refers to the total number of times a given pixel is assigned to a given cluster.

of false alarms. These observations are summarized by the accuracy metrics: the κ in Table 2 grows for the kernel k-means, which greatly reduces the false alarm rate.

The Gaussian RBF kernel parameter was tuned by line search over the range σ ∈ [0.01, 0.1, ..., 6]. The minimum of the cost function in Eq. (7) suggested parameters in the interval [2.5, 3] on average, corresponding to the mean distance between the pixels of the pseudo training set (2.9 on average).

5. CONCLUSIONS AND FUTURE WORK

The kernel clustering method shows great flexibility for the change detection problem, finding nonlinear solutions to it. The main issues of such an approach have been discussed and solved: first, the initialization was addressed by finding a threshold on the magnitude distribution; second, a geometrically inspired cost function (representing the ideal cluster geometry in the kernel-induced feature space) has been proposed to estimate the optimal kernel parameters (if any). Finally, the computational cost is kept low by controlling the number of samples needed for estimating the centers (the label assignment step costs O(n^2 m) for the kernel matrix computation, where n is the number of pixels and m the number of variables). The proposed approach shows improvements with respect to classical clustering techniques. Moreover, unsupervised kernel clustering introduces great potential in terms of flexibility (e.g. introducing kernels adapted to the data, or using composite kernels for the fusion of information12) and thus seems to be a candidate for future research in unsupervised (and semi-supervised, or even active) change detection approaches.

ACKNOWLEDGMENTS

This work has been partly supported by the Swiss National Science Foundation projects no. 200021-126505/1 and PBLAP2-127713/1 and by the Spanish Ministry of Science and Innovation under projects AYA2008-05965-C04-03 and CSD2007-00018.


REFERENCES

[1] Singh, A., "Digital change detection techniques using remote sensing data," Int. J. Remote Sens. 10(6), 989–1003 (1989).
[2] Coppin, P., Jonckheere, I., Nackaerts, K., Muys, B., and Lambin, E., "Digital change detection methods in ecosystem monitoring: a review," Int. J. Remote Sens. 25(9), 1565–1596 (2004).
[3] Radke, R. J., Andra, S., Al-Kofahi, O., and Roysam, B., "Image change detection algorithms: A systematic survey," IEEE Trans. Image Process. 14(3), 294–307 (2005).
[4] Fung, T., "An assessment of TM imagery for land-cover change detection," IEEE Trans. Geosci. Remote Sens. 28(4), 681–684 (1990).
[5] Bruzzone, L. and Serpico, S. B., "Detection of changes in remotely-sensed images by the selective use of multi-spectral information," Int. J. Remote Sens. 18(18), 3883–3888 (1997).
[6] Bruzzone, L. and Prieto, D. F., "Automatic analysis of the difference image for unsupervised change detection," IEEE Trans. Geosci. Remote Sens. 38(3), 1171–1182 (2000).
[7] Malila, W. A., "Change vector analysis: An approach for detecting forest changes with Landsat," in [Proc. LARS Mach. Process. Remotely Sensed Data Symp.], 326–335 (1980).
[8] Bovolo, F. and Bruzzone, L., "A theoretical framework for unsupervised change detection based on change vector analysis in polar domain," IEEE Trans. Geosci. Remote Sens. 45(1), 218–236 (2006).
[9] Bovolo, F. and Bruzzone, L., "A split-based approach to unsupervised change detection in large size multitemporal images: application to Tsunami-damage assessment," IEEE Trans. Geosci. Remote Sens. 45(6), 1658–1671 (2007).
[10] Shawe-Taylor, J. and Cristianini, N., [Kernel Methods for Pattern Analysis], Cambridge University Press (2004).
[11] Camps-Valls, G. and Bruzzone, L., [Kernel Methods for Remote Sensing Data Analysis], J. Wiley & Sons (2009).
[12] Camps-Valls, G., Gómez-Chova, L., Muñoz-Marí, J., Rojo-Álvarez, J. L., and Martínez-Ramón, M., "Kernel-based framework for multi-temporal and multi-source remote sensing data classification and change detection," IEEE Trans. Geosci. Remote Sens. 46(6), 1822–1835 (2008).
[13] Camps-Valls, G. and Bruzzone, L., "Kernel-based methods for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens. 43(3), 1–12 (2005).
[14] Nemmour, H. and Chibani, Y., "Multiple support vector machines for land cover change detection: an application for mapping urban extensions," ISPRS J. Photogramm. Remote Sens. 61, 125–133 (2006).
[15] Bovolo, F., Camps-Valls, G., and Bruzzone, L., "A support vector domain method for change detection in multitemporal images," Pattern Recogn. Lett. 31(10), 1148–1154 (2010).
[16] Volpi, M., Tuia, D., Kanevski, M., Bovolo, F., and Bruzzone, L., "Supervised change detection in VHR images: a comparative analysis," in [IEEE International Workshop on Machine Learning for Signal Processing], (2009).
[17] Girolami, M., "Mercer kernel-based clustering in feature space," IEEE Trans. Neural Netw. 13(3), 780–784 (2002).
[18] Dhillon, I., Guan, Y., and Kulis, B., "A unified view of kernel k-means, spectral clustering and graph cuts," Tech. Rep. TR-04-25, University of Texas at Austin, Department of Computer Science (2005).
[19] MacQueen, J., "Some methods for classification and analysis of multivariate observations," in [Proc. 5th Berkeley Symp. on Math. Statist. and Prob.], 281–297 (1967).
[20] Bovolo, F., Bruzzone, L., and Marconcini, M., "A novel approach to unsupervised change detection based on a semisupervised SVM and a similarity measure," IEEE Trans. Geosci. Remote Sens. 46(7), 2070–2082 (2008).
[21] Bovolo, F., Bruzzone, L., and Marchesi, S., "Analysis of the effects of registration noise in multitemporal VHR images," in [ESA-EUSC, ESRIN], (2008).
