All content in this area was uploaded by Trevor J. Bihl on Jun 17, 2015


IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 12, NO. 8, AUGUST 2015 1725

Principal Component Reconstruction Error

for Hyperspectral Anomaly Detection

James A. Jablonski, Trevor J. Bihl, Member, IEEE, and Kenneth W. Bauer, Senior Member, IEEE

Abstract—In this letter, a reliable, simple, and intuitive approach for hyperspectral imagery (HSI) anomaly detection (AD) is presented. This method, namely, the global iterative principal component analysis (PCA) reconstruction-error-based anomaly detector (GIPREBAD), performs AD by computing the errors (residuals) associated with reconstructing the original image from PCA projections. PCA is a linear transformation and feature extraction process that is commonly used in HSI and is frequently applied prior to any AD task. PCA features represent a projection of the original data into a lower-dimensional subspace. An iterative approach is used to mitigate outlier influence on background covariance estimates. GIPREBAD results are provided using receiver-operating-characteristic curves for HSI from the hyperspectral digital imagery collection experiment. Results are compared against the Reed–Xiaoli (RX) algorithm, the linear RX (LRX) algorithm, and the support vector data description (SVDD) algorithm. The results show that the proposed GIPREBAD method performs favorably compared with RX, LRX, and SVDD and is both intuitively and computationally simpler than either RX or SVDD.

Index Terms—Anomaly detection (AD), dimensionality reduction (DR), hyperspectral imagery (HSI), hyperspectral imaging, object detection, principal component analysis (PCA), reconstruction error, remote sensing, residual analysis, support vector data description (SVDD).

I. INTRODUCTION

HYPERSPECTRAL imaging (HSI) systems collect both

spatial and spectral features and perform imaging spec-

troscopy [1]. HSI differs from multispectral imaging by collect-

ing ﬁnely sampled, as opposed to coarsely sampled, spectral

bands [1]. The ﬁne spectral sampling of HSI permits the

remote detection, examination, and identiﬁcation of materials

by analyzing differences in both spectral and spatial charac-

teristics. HSI is therefore used in three primary applications:

anomaly detection (AD), change detection, and spectral signature matching [1]. HSI AD is considered as the detection of pixels that are statistically different from the background of the image, with no a priori information about the content of the image [1].

Two general philosophies have emerged in HSI AD algorithm development: 1) highly accurate or tuned, but computationally expensive and/or complex algorithms, e.g., [2]–[6]; and 2) simple and fast algorithms, but possibly with lower accuracy [7]–[9]. The work herein focuses on the second philosophy to create a simple, very fast, and effective AD using principal component analysis (PCA).

Manuscript received October 14, 2014; accepted March 1, 2015. This work was supported in part by the U.S. Air Force Research Laboratory, Human Effectiveness Directorate, Info. Ops. and Applied Mathematics Branch (AFRL/RHXM).

The authors are with the Department of Operational Sciences, Air Force Institute of Technology, Wright-Patterson Air Force Base, OH 45433 USA (e-mail: james.jablonski@afit.edu; trevor.bihl@afit.edu; kenneth.bauer@afit.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LGRS.2015.2421813

Since HSI contains a large amount of data, dimensionality

reduction (DR) is frequently employed for feature selection/

extraction. PCA is commonly applied to HSI for DR with

further processing by an AD algorithm [1], [10]–[15]. In this

research, the DR step and the AD step are consolidated. The

reconstruction error (residual analysis) evident in PCA is used

to identify potential anomalies. The general philosophy follows

that a reconstruction error may be obtained from a DR by

reprojecting retained features into the data’s original space.

Naturally, since these projections are biased by the majority

class (considered as the background of an image), reconstruct-

ing the original data using the projections should yield poor

reconstructions for anomalies. By a simple extension of this

logic, such poorly reconstructed points are labeled as anoma-

lies. Reconstruction errors for AD, therefore, achieve AD as a

byproduct of data compression.

Image reconstruction has been considered for HSI noise

reduction [16], and reconstruction errors have been considered

in prior works for outlier detection, cf. [17]–[19], compressive

sensing [20], signal reconstruction [21], and replicator neural

networks [22]; however, none of these applications were for

remote sensing AD. Fowler and Du [23] considered the re-

construction error for signal recovery in HSI; however, they

employed a separate AD step to segment background and

anomalies into separate bins. Similar to the method presented

herein, Li and Du [24] presented an HSI reconstructive error

AD; however, it was based on a window approach. Gao et al.

[25] considered PCA as a second step for DR in compressive

sensing. In [3], manifold reconstructive errors were used to ﬁnd

regions of possible anomalies with Mahalanobis distance used

to discriminate points.

Herein, the authors suggest methods to create a PCA re-

constructive error AD that 1) uses PCA reconstructive error

as a measure in AD; 2) employs adaptive noise ﬁltering it-

eratively to reﬁne the solution [7]; and 3) uses zero-bin his-

tograms (ZBHs) [26] to select anomalies from a set of scores.

The authors create what is dubbed the global iterative PCA

reconstruction-error-based anomaly detector (GIPREBAD) for

HSI AD. This letter is organized as follows: Section II reviews

background material related to GIPREBAD. Section III poses

the GIPREBAD methodology, and Section IV presents our data,

comparison methods, performance results, and comparisons

with GIPREBAD. Section V concludes this letter.

1545-598X © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


II. BACKGROUND

A. PCA

PCA is a linear data transformation involving the eigenvalues and eigenvectors of the data covariance matrix [15], [27]. PCA considers the data under analysis, i.e., $X_{(n \times m) \times p}$, with $n \times m$ being the number of pixels ($n$ being rows and $m$ being columns) and $p$ being the number of features, spectral bands in this case [27]. PCA transformation then involves a projection of the data to

\[ Y_{(n \times m) \times k} = X^{S}_{(n \times m) \times p} V_{p \times k} \tag{1} \]

where $Y_{(n \times m) \times k}$ is a matrix of scores for the $k$ retained PCs, and

\[ X^{S}_{(n \times m) \times p} = \left( X_{(n \times m) \times p} - 1_{(n \times m) \times 1} \mu^{T}_{1 \times p} \right) D^{-1/2} \tag{2} \]

where $X^{S}_{(n \times m) \times p}$ represents the standardized data matrix, $V_{p \times k}$ are the eigenvectors of the data correlation matrix, $D$ is a diagonal matrix of variances, $\sigma_i^2$ is the variance of the $i$th variable, and $\mu_{1 \times p}$ is the sample mean row vector [27].
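The projection in (1) and the standardization in (2) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation; the function name `pca_scores` and the choice of the sample (ddof=1) variance are assumptions made here.

```python
import numpy as np

def pca_scores(X, k):
    """Standardize X (pixels x bands) as in (2) and project onto k PCs as in (1)."""
    mu = X.mean(axis=0)                    # sample mean of each band
    sigma = X.std(axis=0, ddof=1)          # sample standard deviation of each band
    Xs = (X - mu) / sigma                  # same as (X - 1 mu^T) D^{-1/2}
    R = np.corrcoef(X, rowvar=False)       # p x p data correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # reorder to descending variance
    V = eigvecs[:, order[:k]]              # p x k retained eigenvectors
    return Xs @ V, V, eigvals[order]
```

Because the data are standardized, the sample variance of each score column equals the corresponding eigenvalue of the correlation matrix, which gives a quick sanity check on the output.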

B. PCA and HSI

HSI data are reorganized into a matrix form for processing by

PCA. Although some disagreement exists on the applicability

and appropriateness of PCA for HSI [28], PCA is a commonly

used tool because it is simple to use and offers reasonable re-

sults [1]. Additionally, from an information theory perspective,

PCA is the most efﬁcient method of DR due to its accounting

for the most variance in a data set with the least number of

dimensions [29].
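The reorganization of an HSI cube into the pixel-by-band matrix used by PCA can be sketched as follows. The helper names `cube_to_matrix` and `matrix_to_image` are hypothetical, and a row-major cube of shape (n, m, p) is assumed.

```python
import numpy as np

def cube_to_matrix(cube):
    """Flatten an (n, m, p) cube into an (n*m, p) matrix, one row per pixel."""
    n, m, p = cube.shape
    return cube.reshape(n * m, p)

def matrix_to_image(values, n, m):
    """Fold a length n*m vector of per-pixel values back into an (n, m) image."""
    return np.asarray(values).reshape(n, m)
```

With this layout, the pixel at image position (i, j) is row i*m + j of the matrix, so per-pixel scores can be folded back into an image for display.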

In general, the ﬁrst PC of a given HSI image is interpreted

as broadband intensity variation across the spectra, with subse-

quent PCs providing primary global spectral differences across

the image [1]. Eismann [1] considers that rare features or

anomalies often dominate trailing components with low vari-

ance. Therefore, eliminating or ignoring trailing PCs, as done

in many PCA applications [30] including HSI [1], may reduce

AD performance. In this letter, PCA reconstruction error is

employed, and it can be shown to be a function of the linear sum

of the trailing principal components. This way, the GIPREBAD

method, espoused herein, accounts for information in the trail-

ing principal components and may offer advantages over other

detection algorithms that completely discard this information.
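The dependence on the trailing components can be made explicit with a short derivation. It is a sketch under two assumptions not spelled out in the text: the residual is taken on the standardized data $X^{S}$, and $V_t$ (a symbol introduced here) collects the $p-k$ discarded eigenvectors, so that $[V \; V_t]$ is orthonormal. Using the reconstruction and the statistic $Q_i$ defined in Section II-C:

```latex
\[
\begin{aligned}
X^{S} &= X^{S}\left(V V^{T} + V_t V_t^{T}\right) = \hat{X} + Y_t V_t^{T},
  \qquad Y_t = X^{S} V_t, \\
Q_i &= \left(x^{S}_i - \hat{x}_i\right)\left(x^{S}_i - \hat{x}_i\right)^{T}
     = y_{t,i} V_t^{T} V_t \, y_{t,i}^{T}
     = \sum_{j=k+1}^{p} y_{ij}^{2}.
\end{aligned}
\]
```

Under these assumptions, $Q_i$ is the sum of the squared scores of pixel $i$ on the discarded components, so the information in the trailing PCs is aggregated rather than thrown away.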

Kaiser’s criterion, a method of basic estimation of data

dimensionality based on the mean eigenvalue of the correlation

matrix [30], was used to assess the overall dimensionality of the

data. Kaiser’s criterion specifies the number of retained PCs, i.e., k, with a simple rule that k is the number of eigenvalues greater than the mean eigenvalue [30]. Kaiser’s criterion offers

reasonable performance for hyperspectral digital imagery col-

lection experiment (HYDICE) imagery, as seen in [31], and is

a simple method to implement.
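Kaiser's criterion as described above can be sketched as follows. The function name `kaiser_k` is an assumption made here; for a correlation matrix the mean eigenvalue equals 1, so the rule retains eigenvalues greater than 1.

```python
import numpy as np

def kaiser_k(X):
    """Number of correlation-matrix eigenvalues above the mean eigenvalue."""
    R = np.corrcoef(X, rowvar=False)      # p x p correlation matrix of the bands
    eigvals = np.linalg.eigvalsh(R)       # real eigenvalues of a symmetric matrix
    return int(np.sum(eigvals > eigvals.mean()))
```

For example, data built from two independent factors, each driving its own group of bands, should give k = 2 when the additive noise is small.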

C. PCA Reconstruction Error

PCA-based reconstruction error methods such as these can achieve compression via DR as well as perform AD. PCA can be used to “compress” multivariate data to a reduced set of components. The first $k$ PCs of the data yield a reconstruction of $X$ through

\[ \hat{X}_{(n \times m) \times p} = Y_{(n \times m) \times k} \left( V_{p \times k} \right)^{T} \tag{3} \]

with this prediction being essentially a reprojection of the data into the original feature space. A reconstruction error may be obtained by comparing the original values with the predicted values reprojected back into the data’s original space. Using residuals from $\hat{X}$ as a statistic to detect multivariate outliers was introduced in [17]. Interestingly, PCA reduction and the residual technique were suggested as early as 1957 [17].

In [32], a value for testing goodness-of-fit and multivariate quality control was defined as the statistic

\[ Q_i = \left( X_{i(1 \times p)} - \hat{X}_{i(1 \times p)} \right) \left( X_{i(1 \times p)} - \hat{X}_{i(1 \times p)} \right)^{T} \tag{4} \]

where, under the assumption of multivariate normality, $Q$ is a linear combination of independent and identically distributed chi-square random variables, and $X_{i(1 \times p)}$ and $\hat{X}_{i(1 \times p)}$ are row vectors. Theoretically, this method is robust to departures from normality as the number of discarded components increases [20], [32]. Exemplars whose reconstruction error exceeds a given threshold, set using probabilities from the cumulative normal distribution function, could then be considered anomalies or outliers in this method.
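The reconstruction in (3) and the statistic in (4) can be sketched together. The function name `q_scores` is hypothetical, and the sketch assumes the residual is computed on the standardized data, since the scores in (1) are defined on the standardized matrix.

```python
import numpy as np

def q_scores(Xs, V):
    """Per-pixel reconstruction error for standardized data Xs and retained PCs V."""
    Y = Xs @ V                                   # scores, as in (1)
    X_hat = Y @ V.T                              # reconstruction, as in (3)
    resid = Xs - X_hat                           # residual for every pixel
    return np.einsum("ij,ij->i", resid, resid)   # row-wise squared norm, as in (4)
```

When V holds orthonormal eigenvectors, the result equals the sum of squared scores on the discarded components, and retaining all p components drives every score to zero.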

D. Iterative Adaptive Noise (IAN) Filtering

Iterative adaptive noise (IAN) ﬁltering, consistent with [7],

is used to ﬁlter the Q-scores prior to anomaly declaration.

IAN ﬁlters more heavily in areas where the variance is close

to system noise while ﬁltering less in areas with signiﬁcant

signal [7]. Local image statistics, i.e., pixel-neighborhood characteristics, are estimated to achieve IAN filtering. Anomalies are thus largely unfiltered because their signal often exceeds the background noise signal [33].
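The behavior described above resembles a locally adaptive Wiener filter of the kind given in [33]. The sketch below is a stand-in under that assumption, not the IAN procedure of [7]: the function names, the 3 x 3 window, and the use of the mean local variance as the noise estimate are all choices made here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_stats(img, size=3):
    """Local mean and variance over a size x size neighborhood (size must be odd)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1)), windows.var(axis=(-2, -1))

def adaptive_noise_filter(img, size=3, n_iter=1):
    """Smooth strongly where local variance is near the noise level, weakly elsewhere."""
    out = np.asarray(img, dtype=float)
    for _ in range(n_iter):
        mean, var = local_stats(out, size)
        noise = var.mean()                         # assumed noise-variance estimate
        denom = np.maximum(var, max(noise, 1e-12)) # guard against division by zero
        gain = np.maximum(var - noise, 0.0) / denom
        out = mean + gain * (out - mean)           # gain near 1 keeps the pixel as is
    return out
```

In a Q-score image with a single strong pixel on a noisy background, the gain is close to 1 at that pixel and close to 0 in the background, so the background is smoothed while the strong pixel is left nearly untouched.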

E. Zero-Bin Histogram (ZBH)

Potential anomalies are nominated using a ZBH method, as

described in [26]. ZBH ﬁrst constructs histograms of scores for

each PC. The location, i.e., ϑ, of the ﬁrst histogram bin with a

frequency of zero is then identiﬁed in the Q-score histogram.

Pixels associated with scores greater than ϑare considered

anomalous. The ZBH method is very sensitive to the bin width,

i.e., ω, chosen during histogram construction. Wider bins will

reduce the sensitivity of the detector, and narrow bins will

increase the sensitivity and result in more false positives [7].
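The thresholding rule can be sketched as follows. The function name `zero_bin_threshold` is hypothetical, and the sketch assumes nonnegative scores (as Q-scores are) with bins starting at zero; the binning details in [26] may differ.

```python
import numpy as np

def zero_bin_threshold(scores, bin_width):
    """Left edge of the first empty histogram bin, for nonnegative scores."""
    scores = np.asarray(scores, dtype=float)
    # Extend the edges past the maximum so the top score never sits on the last edge.
    edges = np.arange(0.0, scores.max() + 2 * bin_width, bin_width)
    counts, edges = np.histogram(scores, bins=edges)
    empty = np.flatnonzero(counts == 0)
    if empty.size == 0:
        return np.inf                      # no gap found: nothing is flagged
    return edges[empty[0]]

scores = np.array([0.1, 0.2, 0.3, 0.45, 0.7, 0.8, 5.2, 6.3])
theta = zero_bin_threshold(scores, bin_width=0.5)
anomalous = scores > theta                 # flags the two scores above the gap
```

With a bin width of 0.5, the first empty bin starts at 1.0 and the two large scores are flagged; with a bin width of 10, all scores fall in one bin and nothing is flagged, which illustrates the loss of sensitivity with wider bins.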

III. GLOBAL ITERATIVE PCA RECONSTRUCTION ERROR

BASED ANOMALY DETECTOR (GIPREBAD)

The authors’ extension of the PCA reconstructive error to an HSI compressive sensing algorithm involves six steps:

1) standardize the data matrix, eqn. (2);

2) compute PCs, eqn. (1);

3) compute reconstruction from retained PCs, eqn. (3);

4) compute the Q-scores, eqn. (4), to find anomalies based on the nominal threshold;

5) iterate steps 1 through 4, removing the anomalies detected in step 4 from the covariance matrix computation;

6) apply ZBH to detect anomalies.

ZBH is not used during individual iterations as it slows the algorithm due to the high computational demands of repeated sorting. The data are renormalized (step 1) for each iteration to avoid biasing. Iterations stop after either no new anomalies are detected or a maximum iteration threshold is reached. In a manner consistent with [14], iterations of GIPREBAD are used to increase the total reconstruction error for anomalies and thus separate targets from the background for detection.

TABLE I. GIPREBAD SETTINGS USED

TABLE II. IMAGES UNDER ANALYSIS

Fig. 1. Images used in analysis: (left) forest and (right) desert.

Fig. 2. Sample forest image Q-scores.

Settings used for GIPREBAD are found in the last column of Table I; these are consistent with [34]. An exhaustive enumeration was performed on seven separate training images, and the settings with the maximum area under the receiver operating characteristic (ROC) curve were chosen, as described in [34], with a changing threshold on Q-scores used in step 4 to generate the ROC curves. Additional details on settings and step sizes are found in [34].
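Steps 1 through 5 of Section III can be sketched as a single loop. This is a minimal illustration, not the authors' code: the function name `giprebad_sketch`, the nominal threshold (mean plus z standard deviations of the background Q-scores), and the default values are assumptions, and the IAN filtering and the final ZBH declaration of step 6 are omitted.

```python
import numpy as np

def giprebad_sketch(X, z=3.0, max_iter=10):
    """Iterative PCA reconstruction-error scoring on X (pixels x bands)."""
    flagged = np.zeros(X.shape[0], dtype=bool)
    for _ in range(max_iter):
        bg = X[~flagged]                                  # current background estimate
        mu, sigma = bg.mean(axis=0), bg.std(axis=0, ddof=1)
        Xs = (X - mu) / sigma                             # step 1, background statistics only
        R = np.corrcoef(bg, rowvar=False)                 # background correlation matrix
        eigvals, eigvecs = np.linalg.eigh(R)
        k = int(np.sum(eigvals > eigvals.mean()))         # Kaiser's criterion
        V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]     # step 2, k leading eigenvectors
        resid = Xs - (Xs @ V) @ V.T                       # step 3, reconstruction residual
        Q = np.einsum("ij,ij->i", resid, resid)           # step 4, Q-scores for all pixels
        thresh = Q[~flagged].mean() + z * Q[~flagged].std()  # assumed nominal threshold
        new = (Q > thresh) & ~flagged
        if not new.any():                                 # stop when no new anomalies appear
            break
        flagged |= new                                    # step 5, exclude from next estimate
    return Q, flagged
```

The returned Q-scores come from the last iteration and would be passed to the final declaration step. With a fixed multiple of the standard deviation as the threshold, some background pixels in the upper tail are also excluded on each pass, which is one reason a separate declaration step such as ZBH is useful.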

IV. EXPERIMENTS AND RESULTS

The HSI images used herein (see Fig. 1 and Table II) are

from the Forest I and Desert II Radiance collections from

the HYDICE push broom sensor with spectral sampling of

10 nm [1], [35]; additionally, these are identical to the im-

ages used in [11]. Three HSI AD algorithms are used for

comparison to GIPREBAD: the Reed-Xiaoli (RX) detector

(window size = 25, 10 PCs) [1], [36], the linear RX (LRX) algorithm (line size = 2·n, 10 PCs) [14], and the SVDD detector (sigma = 905, Radial Basis Function) [8]. These algorithms comprise two primary philosophies in HSI AD: local (RX and LRX) and global (SVDD) statistics.

Fig. 3. Histogram of AD score.

Fig. 4. ROC curves. (a) Log-scale and (b) linear-scale x-axis, for the desert image comparing ten iterations of GIPREBAD with RX, LRX, and SVDD.

The RX detector assumes independence and multivariate

normality in the HSI data; however, these assumptions are

usually invalid for HSI [14], [28]. Despite this, the RX detector

is considered as a “benchmark” detector for comparison due to

its simplicity [1], [8]. SVDD is a global detector that ignores the

background probability density function and, instead, computes

“support regions” of the space where most of the data (assumed

to be background) lie; points not in this area are labeled

anomalous [8], [37]. SVDD has subsequently been used as a

benchmark subspace-based anomaly detector in many studies

[14], [37] and is included herein for comparison.
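For orientation, the core of the RX score is a Mahalanobis distance from the background mean. The sketch below is a global simplification written for illustration: the comparison in this letter uses a local window and 10 PCs, which are not reproduced here, and the function name `global_rx` is an assumption.

```python
import numpy as np

def global_rx(X):
    """Squared Mahalanobis distance of every pixel from the scene mean."""
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False)            # p x p sample covariance of the bands
    C_inv = np.linalg.pinv(C)              # pseudo-inverse tolerates near-singular C
    diff = X - mu
    return np.einsum("ij,jk,ik->i", diff, C_inv, diff)
```

Because the covariance must be estimated and inverted, and a local RX repeats this for each window position, the cost grows quickly with the number of bands, which is consistent with the use of PCA ahead of RX in the comparison settings.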

A. GIPREBAD Results

Fig. 2 illustrates the Q-scores computed from (4), by plotting

Q-score versus pixel number. Noticeably, pixels corresponding

to the known anomalies produce higher Q-scores compared

with the background. Relatively low Q-scores for many target

pixels could be explained by spectral mixing.

The detection threshold resulting from the ZBH method is shown in Fig. 3. This illustrates ZBH as a viable target separation method. The results illustrated in Figs. 2 and 3 readily confirm the logical assertion that anomalous pixels are poorly reconstructed from principal component subspace.

Fig. 5. ROC curves. (a) Log-scale and (b) linear-scale x-axis, for the forest image comparing ten iterations of GIPREBAD with RX, LRX, and SVDD.

TABLE III. PERFORMANCE COMPARISON

B. Classiﬁer Performance Comparison

A performance comparison between AD methods is made

using ROC curves, consistent with [38]. Figs. 4 and 5 illustrate

the performance of GIPREBAD versus both SVDD and RX

for the desert (see Fig. 4) and forest (see Fig. 5) images.

In both images, GIPREBAD noticeably outperforms both RX methods in target detection and provides performance comparable to or better than that of SVDD.

Table III presents each ROC curve’s area under the curve (AUC) along with computation time. GIPREBAD clearly offers both higher accuracy and far greater computational efficiency, as illustrated by the drastically lower processing times.
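An ROC curve and its AUC can be computed from per-pixel scores and a truth mask by sweeping the threshold over the sorted scores. The sketch below is illustrative; the function name `roc_curve_auc` is an assumption, both classes must be present, and tied scores are not merged into a single ROC point.

```python
import numpy as np

def roc_curve_auc(scores, truth):
    """False/true positive rates for a descending threshold sweep, plus trapezoidal AUC."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    order = np.argsort(-scores, kind="stable")     # highest score first
    hits = truth[order]
    tpr = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(~hits) / (~hits).sum()))
    auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)
    return fpr, tpr, auc
```

For scores (0.1, 0.4, 0.35, 0.8) with the last two pixels as targets, three of the four target/background pairs are ranked correctly, so the AUC is 0.75.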

V. CONCLUSION

Herein, a simple, intuitive, efﬁcient, and effective algo-

rithm, namely, GIPREBAD, for HSI AD has been presented.

GIPREBAD leverages PCA feature extraction for both DR and

AD. GIPREBAD results also compared favorably with RX,

LRX, and SVDD in detection performance and showed a sig-

niﬁcant improvement in computational time. Additionally, on

the presented images, GIPREBAD consistently outperformed

RX and LRX. Because GIPREBAD relies on commonly used HSI tools within a straightforward process, it would be easy to implement in many HSI applications. GIPREBAD

could also potentially see use in real-time compressive sensing

applications.


ACKNOWLEDGMENT

The views expressed in this letter are those of the authors and

do not reﬂect the ofﬁcial policy of the United States Air Force,

Department of Defense, or the U.S. Government.

REFERENCES

[1] M. T. Eismann, Hyperspectral Remote Sensing. Bellingham, WA,

USA: SPIE Press, 2012.

[2] B. Du and L. Zhang, “Target detection based on a dynamic subspace,”

Pattern Recog., vol. 47, no. 1, pp. 344–358, Jan. 2014.

[3] B. Du and L. Zhang, “A discriminative metric learning based anomaly

detection method,” IEEE Geosci. Remote Sens., vol. 52, no. 11,

pp. 6844–6857, Nov. 2014.

[4] Q. Shi, L. Zhang, and B. Du, “Semisupervised discriminative locally

enhanced alignment for hyperspectral image classiﬁcation,” IEEE Geosci.

Remote Sens., vol. 51, no. 9, pp. 4800–4815, Sep. 2013.

[5] F. M. Mindrup, T. J. Bihl, and K. W. Bauer, “Modeling noise in a frame-

work to optimize the detection of anomalies in hyperspectral imaging,”

Intell. Eng. Syst. through Artif. Neural Netw., vol. 20, pp. 517–524,

2010.

[6] M. J. Mendenhall and E. Merenyi, “Relevance-based feature extraction

for hyperspectral imagery,” IEEE Trans. Neural Netw., vol. 19, no. 4,

pp. 658–672, Apr. 2008.

[7] R. J. Johnson, J. P. Williams, and K. W. Bauer, “AutoGAD: An improved

ICA-based hyperspectral anomaly detection algorithm,” IEEE Geosci.

Remote Sens., vol. 51, no. 6, pp. 3492–3503, Jun. 2013.

[8] A. Banerjee, P. Burlina, and R. Meth, “Fast hyperspectral anomaly detec-

tion via SVDD,” in Proc. IEEE ICIP, 2007, pp. IV-101–IV-104.

[9] Y. Tarabalka, T. V. Haavardsholm, I. Kåsen, and T. Skauli, “Real-time anomaly detection in hyperspectral images using multivariate normal mixture models and GPU processing,” J. Real-Time Image Process., vol. 4, no. 3, pp. 287–300, Aug. 2009.

[10] M. D. Farrell and R. M. Mersereau, “On the impact of PCA dimension

reduction for hyperspectral detection of difﬁcult targets,” IEEE Geosci.

Remote Sens. Lett., vol. 2, no. 2, pp. 192–195, Apr. 2005.

[11] K. D. Friesen, T. J. Bihl, K. W. Bauer, and M. A. Friend, “Contextual

anomaly detection cueing methods for hyperspectral target recognition,”

Amer. J. Sci. Eng., vol. 2, no. 1, pp. 9–16, 2013.

[12] T. E. Smetek, “Hyperspectral imagery target detection using improved

anomaly detection and signature matching,” Ph.D. dissertation, Air Force

Inst. Technol., Wright-Patterson AFB, OH, USA, 2007.

[13] F. Tsai, E. K. Lin, and K. Yoshino, “Spectrally segmented principal

component analysis of hyperspectral imagery for mapping invasive plant

species,” Int. J. Remote Sens., vol. 28, no. 5, pp. 1023–1039, 2007.

[14] J. P. Williams, T. J. Bihl, and K. W. Bauer, “Towards the mitigation

of correlation effects in anomaly detection for hyperspectral imagery,”

J. Defense Model. Simul., vol. 10, no. 3, pp. 263–273, Feb. 2013.

[15] K.-J. Cheng, “Compression of hyperspectral images,” Ph.D. dissertation,

Ohio Univ., Athens, OH, USA, 2013.

[16] D. Cerra, R. Muller, and P. Reinartz, “Noise reduction in hyperspectral

images through spectral unmixing,” IEEE Geosci. Remote Sens. Lett.,

vol. 11, no. 1, pp. 109–113, Jan. 2014.

[17] E. J. Jackson and R. H. Morris, “An application of multivariate quality

control to photographic processing,” J. Amer. Statist. Assoc., vol. 52,

no. 278, pp. 186–199, Jun. 1957.

[18] W. Li, H. H. Yue, S. Valle-Cervantes, and S. J. Qin, “Recursive PCA

for adaptive process monitoring,” J. Process Control, vol. 10, no. 5,

pp. 471–486, Oct. 2000.

[19] V. Chatzigiannakis, G. Androulidakis, K. Pelechrinis, S. Papavassiliou,

and V. Maglaris, “Data fusion algorithms for network anomaly detection:

classiﬁcation and evaluation,” in Proc. 3rd ICNS, 2007, pp. 50–56.

[20] Q. Ding and E. D. Kolaczyk, “A compressed PCA subspace method for anomaly detection in high-dimensional data,” in Proc. Joint Statist. Meeting, Aug. 3, 2010.

[21] R. Machiraju and R. Yagel, “Reconstruction error characterization

and control: A sampling theory approach,” IEEE Trans. Vis. Comput.

Graphics, vol. 2, no. 4, pp. 364–378, Dec. 1996.

[22] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,”

Univ. Minnesota, Minneapolis, MN, USA, 2007.

[23] J. E. Fowler and Q. Du, “Anomaly detection and reconstruction from random projections,” IEEE Trans. Image Process., vol. 21, no. 1, pp. 184–195, Jan. 2012.

[24] W. Li and Q. Du, “Collaborative representation for hyperspectral

anomaly detection,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3,

pp. 1463–1474, Mar. 2015.

[25] J. Gao, Q. Shi, and T. S. Caetano, “Dimensionality reduction via compres-

sive sensing,” Pattern Recog. Lett., vol. 33, pp. 1163–1170, Jul. 2012.

[26] S.-S. Chiang, C.-I. Chang, and I. W. Ginsberg, “Unsupervised hyperspectral image analysis using independent component analysis,” in Proc. IEEE Geosci. Remote Sens. Symp., 2000, pp. 3136–3138.

[27] W. R. Dillon and M. Goldstein, Multivariate Analysis Methods and

Applications. New York, NY, USA: Wiley, 1984.

[28] S. Prasad and L. M. Bruce, “Limitations of principal components analysis

for hyperspectral target recognition,” IEEE Geosci. Remote Sens. Lett.,

vol. 4, pp. 625–629, Oct. 2008.

[29] E. Christophe, “Hyperspectral data compression tradeoff,” in Opti-

cal Remote Sensing, Advances in Signal Processing and Exploitation

Techniques, vol. 3. New York, NY, USA: Springer-Verlag, 2011,

pp. 9–29.

[30] D. A. Jackson, “Stopping rules in principal component analysis: a com-

parison of heuristical and statistical approaches,” Ecology, vol. 74, no. 8,

pp. 2204–2214, 1993.

[31] L. Fountanas, “Principal component analysis for hyperspectral image classification,” M.S. thesis, Naval Postgraduate School, Monterey, CA, USA, 2004.

[32] E. J. Jackson and G. S. Mudholkar, “Control procedures for residuals

associated with principal component analysis,” Technometrics, vol. 21,

no. 3, pp. 341–349, Aug. 1979.

[33] J. S. Lim, Two-Dimensional Signal and Image Processing. Englewood

Cliffs, NJ, USA: Prentice-Hall, 1990, pp. 546–549.

[34] J. A. Jablonski, “Reconstruction error and principal component based anomaly detection in hyperspectral imagery,” M.S. thesis, Air Force Inst. Technol., Wright-Patterson AFB, OH, USA, 2014.

[35] L. J. Rickard, R. Basedow, P. P. Silverglate, and E. E. Zalewski,

“HYDICE: An airborne system for hyperspectral imaging,” in Proc. SPIE,

vol. 1937, 1993, pp. 173–179.

[36] I. S. Reed and X. Yu, “Adaptive multiple-band CFAR detection of an

optical pattern with unknown spectral distribution,” IEEE Trans. Acoust.,

Speech, Signal Process., vol. 38, pp. 1760–1770, Oct. 1990.

[37] S. Matteoli, M. Diani, and G. Corsini, “A tutorial overview of anomaly

detection in hyperspectral images,” IEEE Aerosp. Electron. Syst. Mag.,

vol. 25, no. 7, pp. 5–28, 2010.

[38] T. Fawcett, “An introduction to ROC analysis,” Pattern Recog. Lett.,

vol. 27, pp. 861–874, Jun. 2006.