IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 12, NO. 8, AUGUST 2015
Principal Component Reconstruction Error
for Hyperspectral Anomaly Detection
James A. Jablonski, Trevor J. Bihl, Member, IEEE, and Kenneth W. Bauer, Senior Member, IEEE
Abstract—In this letter, a reliable, simple, and intuitive ap-
proach for hyperspectral imagery (HSI) anomaly detection (AD)
is presented. This method, namely, the global iterative principal
component analysis (PCA) reconstruction-error-based anomaly
detector (GIPREBAD), examines AD by computing errors
(residuals) associated with reconstructing the original image us-
ing PCA projections. PCA is a linear transformation and fea-
ture extraction process commonly used in HSI and frequently
appears in operation prior to any AD task. PCA features rep-
resent a projection of the original data into lower-dimensional
subspace. An iterative approach is used to mitigate outlier in-
fluence on background covariance estimates. GIPREBAD results
are provided using receiver-operating-characteristic curves for
HSI from the hyperspectral digital imagery collection experiment.
Results are compared against the Reed–Xiaoli (RX) algorithm,
the linear RX (LRX) algorithm, and the support vector data
description (SVDD) algorithm. The results show that the proposed
GIPREBAD method performs favorably compared with RX, LRX,
and SVDD and is both intuitively and computationally simpler
than either RX or SVDD.
Index Terms—Anomaly detection (AD), dimensionality reduc-
tion (DR), hyperspectral imagery (HSI), hyperspectral imaging,
object detection, principal component analysis (PCA), reconstruc-
tion error, remote sensing, residual analysis, support vector data
description (SVDD).
I. INTRODUCTION
HYPERSPECTRAL imaging (HSI) systems collect both
spatial and spectral features and perform imaging spec-
troscopy [1]. HSI differs from multispectral imaging by collect-
ing finely sampled, as opposed to coarsely sampled, spectral
bands [1]. The fine spectral sampling of HSI permits the
remote detection, examination, and identification of materials
by analyzing differences in both spectral and spatial charac-
teristics. HSI is therefore used in three primary applications:
anomaly detection (AD), change detection, and spectral signa-
ture matching [1]. HSI AD is considered as the detection of
pixels statistically different from the background of the image,
with no a priori information about the content of the image [1].
Two general philosophies have emerged in HSI AD algorithm
development: 1) highly accurate or tuned, but computation-
ally expensive and/or complex algorithms, e.g., [2]–[6]; and
2) simple and fast algorithms, but possibly with lower accuracy
[7]–[9]. The work herein focuses on the second philosophy to
create a simple, very fast, and effective AD using principal
component analysis (PCA).
Since HSI contains a large amount of data, dimensionality
reduction (DR) is frequently employed for feature selection/
extraction. PCA is commonly applied to HSI for DR with
further processing by an AD algorithm [1], [10]–[15]. In this
research, the DR step and the AD step are consolidated. The
reconstruction error (residual analysis) evident in PCA is used
to identify potential anomalies. The general philosophy follows
that a reconstruction error may be obtained from a DR by
reprojecting retained features into the data’s original space.
Naturally, since these projections are biased by the majority
class (considered as the background of an image), reconstruct-
ing the original data using the projections should yield poor
reconstructions for anomalies. By a simple extension of this
logic, such poorly reconstructed points are labeled as anoma-
lies. Reconstruction-error-based AD, therefore, is achieved as a
byproduct of data compression.
Image reconstruction has been considered for HSI noise
reduction [16], and reconstruction errors have been considered
in prior works for outlier detection, c.f. [17]–[19], compressive
sensing [20], signal reconstruction [21], and replicator neural
networks [22]; however, none of these applications were for
remote sensing AD. Fowler and Du [23] considered the re-
construction error for signal recovery in HSI; however, they
employed a separate AD step to segment background and
anomalies into separate bins. Similar to the method presented
herein, Li and Du [24] presented an HSI reconstructive error
AD; however, it was based on a window approach. Gao et al.
[25] considered PCA as a second step for DR in compressive
sensing. In [3], manifold reconstructive errors were used to find
regions of possible anomalies with Mahalanobis distance used
to discriminate points.
Herein, the authors suggest methods to create a PCA re-
constructive error AD that 1) uses PCA reconstructive error
as a measure in AD; 2) employs adaptive noise filtering it-
eratively to refine the solution [7]; and 3) uses zero-bin his-
tograms (ZBHs) [26] to select anomalies from a set of scores.
The authors create what is dubbed the global iterative PCA
reconstruction-error-based anomaly detector (GIPREBAD) for
HSI AD. This letter is organized as follows: Section II reviews
background material related to GIPREBAD. Section III poses
the GIPREBAD methodology, and Section IV presents our data,
comparison methods, performance results, and comparisons
with GIPREBAD. Section V concludes this letter.
II. BACKGROUND
A. PCA
PCA is a linear data transformation involving the eigenvalues and eigenvectors of the data covariance matrix [15], [27]. PCA considers the data under analysis, i.e., $X_{(n\times m)\times p}$, with $n\times m$ being the number of pixels ($n$ being rows and $m$ being columns) and $p$ being the number of features, spectral bands in this case [27]. PCA transformation then involves a projection of the data to
$$Y_{(n\times m)\times k} = X^{S}_{(n\times m)\times p}\, V_{p\times k} \quad (1)$$
where $Y_{(n\times m)\times k}$ is a matrix of scores for the $k$ retained PCs, and
$$X^{S}_{(n\times m)\times p} = \left(X_{(n\times m)\times p} - 1_{(n\times m)\times 1}\,\mu^{T}_{1\times p}\right) D^{-1/2} \quad (2)$$
where $X^{S}_{(n\times m)\times p}$ represents the standardized data matrix, $V_{p\times k}$ are the eigenvectors of the data correlation matrix, $D$ is a diagonal matrix of variances with $\sigma^{2}_{i}$ the variance of the $i$th variable, and $\mu_{1\times p}$ is the sample mean row vector [27].
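As an illustration only, the transformation in (1) and (2) can be sketched in a few lines of NumPy; the function name pca_scores and the eigendecomposition of the correlation matrix of the standardized data are expository assumptions, not the letter's implementation.

import numpy as np

def pca_scores(X, k):
    """Standardize X per (2) and project onto the leading k eigenvectors per (1).
    X is the (n*m) x p pixel-by-band matrix; k is the number of retained PCs."""
    mu = X.mean(axis=0)                       # sample mean row vector
    sigma = X.std(axis=0, ddof=1)             # square roots of diag(D)
    Xs = (X - mu) / sigma                     # standardized data matrix, eq. (2)
    R = np.cov(Xs, rowvar=False)              # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenpairs, ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals, V = eigvals[order], eigvecs[:, order]
    Y = Xs @ V[:, :k]                         # PC scores, eq. (1)
    return Y, V, mu, sigma, eigvals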
B. PCA and HSI
HSI data are reorganized into a matrix form for processing by
PCA. Although some disagreement exists on the applicability
and appropriateness of PCA for HSI [28], PCA is a commonly
used tool because it is simple to use and offers reasonable re-
sults [1]. Additionally, from an information theory perspective,
PCA is the most efficient method of DR due to its accounting
for the most variance in a data set with the least number of
dimensions [29].
In general, the first PC of a given HSI image is interpreted
as broadband intensity variation across the spectra, with subse-
quent PCs providing primary global spectral differences across
the image [1]. Eismann [1] considers that rare features or
anomalies often dominate trailing components with low vari-
ance. Therefore, eliminating or ignoring trailing PCs, as done
in many PCA applications [30] including HSI [1], may reduce
AD performance. In this letter, PCA reconstruction error is
employed, and it can be shown to be a function of the linear sum
of the trailing principal components. This way, the GIPREBAD
method, espoused herein, accounts for information in the trail-
ing principal components and may offer advantages over other
detection algorithms that completely discard this information.
Kaiser’s criterion, a method of basic estimation of data
dimensionality based on the mean eigenvalue of the correlation
matrix [30], was used to assess the overall dimensionality of the
data. Kaiser’s criterion specifies the number of retained PCs,
i.e., $k$, with a simple rule that $k$ is the number of eigenvalues
greater than the mean eigenvalue [30]. Kaiser’s criterion offers
reasonable performance for hyperspectral digital imagery col-
lection experiment (HYDICE) imagery, as seen in [31], and is
a simple method to implement.
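As a brief sketch of Kaiser's criterion under this notation, $k$ is simply the count of correlation-matrix eigenvalues exceeding their mean (which equals one for a correlation matrix); the helper below is illustrative.

import numpy as np

def kaiser_k(eigvals):
    """Kaiser's criterion: number of eigenvalues greater than the mean eigenvalue."""
    eigvals = np.asarray(eigvals)
    return int(np.sum(eigvals > eigvals.mean()))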
C. PCA Reconstruction Error
PCA-based reconstruction error methods can achieve both data compression via DR and AD. PCA can be used to “compress” multivariate data to a reduced set of components. The first $k$ PCs of the data yield a reconstruction of $X$ through
$$\hat{X}_{(n\times m)\times p} = Y_{(n\times m)\times k}\left(V_{p\times k}\right)^{T} \quad (3)$$
with this prediction being essentially a reprojection of the data into the original feature space. A reconstruction error may be obtained by comparing the original values to the predicted values reprojected back into the data’s original space. Using residuals from $\hat{X}$ as a statistic to detect multivariate outliers was introduced in [17]. Interestingly, PCA reduction and the residual technique were suggested as early as 1957 [17].
In [32], a value for testing goodness-of-fit and multivariate quality control was defined as the statistic
$$Q_{i} = \left(X_{i(1\times p)} - \hat{X}_{i(1\times p)}\right)\left(X_{i(1\times p)} - \hat{X}_{i(1\times p)}\right)^{T} \quad (4)$$
where, under the assumption of multivariate normality, $Q$ is a linear combination of independent and identically distributed chi-square random variables, and $X_{i(1\times p)}$ and $\hat{X}_{i(1\times p)}$ are row vectors. Theoretically, this method is robust to departures from normality as the number of discarded components increases [20], [32]. Exemplars exceeding a given threshold of reconstructive error, using probabilities from the cumulative normal distribution function, could then be considered anomalies or outliers in this method.
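A minimal sketch of (3) and (4), reusing the quantities returned by the pca_scores sketch above; computing the reconstruction in the standardized space is an expository simplification, not necessarily the letter's exact implementation.

import numpy as np

def q_scores(Xs, V, k):
    """Per-pixel reconstruction error, eqs. (3)-(4).
    Xs is the standardized data; V holds the eigenvectors; k is the retained PCs."""
    Y = Xs @ V[:, :k]                  # scores, eq. (1)
    X_hat = Y @ V[:, :k].T             # reconstruction, eq. (3)
    resid = Xs - X_hat                 # row-vector residuals
    return np.sum(resid**2, axis=1)    # Q_i = (X_i - X_hat_i)(X_i - X_hat_i)^T, eq. (4)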
D. Iterative Adaptive Noise (IAN) Filtering
Iterative adaptive noise (IAN) filtering, consistent with [7],
is used to filter the Q-scores prior to anomaly declaration.
IAN filters more heavily in areas where the variance is close
to system noise while filtering less in areas with significant
signal [7]. Local image statistics, based on pixel-neighborhood
characteristics, are estimated to achieve IAN filtering. Anomalies are
thus largely unfiltered due to their signal often exceeding the
background noise signal [33].
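The letter adopts the IAN filter of [7]; as an illustrative stand-in only, a Wiener-type adaptive filter in the spirit of [33], which smooths strongly where local variance approaches the noise level and weakly where signal dominates, could be sketched as follows. The window size and noise estimate below are assumptions, not the settings of [7].

import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_noise_filter(q_image, window=5, noise_var=None):
    """Adaptive (Wiener-type) smoothing driven by local pixel-neighborhood statistics."""
    local_mean = uniform_filter(q_image, window)
    local_sq = uniform_filter(q_image**2, window)
    local_var = np.maximum(local_sq - local_mean**2, 0.0)
    if noise_var is None:
        noise_var = local_var.mean()                   # crude noise estimate (assumption)
    gain = np.maximum(local_var - noise_var, 0.0) / np.maximum(local_var, 1e-12)
    return local_mean + gain * (q_image - local_mean)  # low gain -> heavy smoothing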
E. Zero-Bin Histogram (ZBH)
Potential anomalies are nominated using a ZBH method, as
described in [26]. ZBH first constructs histograms of scores for
each PC. The location, i.e., ϑ, of the first histogram bin with a
frequency of zero is then identified in the Q-score histogram.
Pixels associated with scores greater than ϑare considered
anomalous. The ZBH method is very sensitive to the bin width,
i.e., ω, chosen during histogram construction. Wider bins will
reduce the sensitivity of the detector, and narrow bins will
increase the sensitivity and result in more false positives [7].
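A minimal sketch of the ZBH thresholding just described; the histogram construction and the fallback when no empty bin exists are assumptions made for illustration.

import numpy as np

def zbh_threshold(scores, bin_width):
    """Location of the first zero-frequency histogram bin; scores above it are anomalies."""
    lo, hi = scores.min(), scores.max()
    nbins = max(int(np.ceil((hi - lo) / bin_width)), 1)
    counts, edges = np.histogram(scores, bins=nbins, range=(lo, lo + nbins * bin_width))
    empty = np.flatnonzero(counts == 0)
    if empty.size == 0:
        return hi                      # no empty bin found: flag nothing (assumption)
    return edges[empty[0]]             # left edge of the first zero-frequency bin

# anomalies = scores > zbh_threshold(scores, omega)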
III. GLOBAL ITERATIVE PCA RECONSTRUCTION ERROR
BASED ANOMALY DETECTOR (GIPREBAD)
The authors’ extension of the PCA reconstructive error to an
HSI compressive sensing algorithm involves six steps:
1) standardize the data matrix, eqn. (2);
2) compute PCs, eqn. (1);
3) compute reconstruction from retained PCs, eqn. (3);
4) compute the Q-scores, eqn. (4), to find anomalies based on the nominal threshold;
5) iterate between steps 1 and 4 by removing anomalies detected by step 4 from the covariance matrix computation;
6) ZBH to detect anomalies.
TABLE I
GIPREBAD SETTINGS USED
TABLE II
IMAGES UNDER ANALYSIS
Fig. 1. Images used in analysis: (left) forest and (right) desert.
ZBH is not used during individual iterations as it slows the
algorithm due to the high computational demands of repeated
sorting. The data are renormalized (step 1) for each iteration
to avoid biasing. Iterations stop after either no new anomalies
are detected or a maximum iteration threshold is reached. In a
manner consistent with [14], iterations of GIPREBAD are used to increase the total reconstruction error for anomalies and thus separate targets from the background for detection.
Fig. 2. Sample forest image Q-scores.
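Putting the six steps together, a compact sketch of the iterative loop is given below, reusing the pca_scores, q_scores, adaptive_noise_filter, and zbh_threshold helpers sketched in Section II; the nominal Q threshold, iteration cap, and bin width are placeholders rather than the tuned settings of Table I.

import numpy as np

def giprebad(X, image_shape, k, q_threshold, max_iter=10, bin_width=1.0):
    """Iterative PCA reconstruction-error AD (sketch). X is (n*m) x p; image_shape is (n, m)."""
    keep = np.ones(X.shape[0], dtype=bool)            # pixels currently treated as background
    q = np.zeros(X.shape[0])
    for _ in range(max_iter):
        _, V, mu, sigma, _ = pca_scores(X[keep], k)   # steps 1-2, background pixels only
        Xs = (X - mu) / sigma                         # restandardize all pixels each iteration
        q = q_scores(Xs, V, k)                        # steps 3-4: per-pixel reconstruction error
        new_keep = q <= q_threshold                   # drop nominal anomalies from the next fit
        if np.array_equal(new_keep, keep):            # step 5: stop when no new anomalies appear
            break
        keep = new_keep
    q = adaptive_noise_filter(q.reshape(image_shape)).ravel()  # IAN filtering (Section II-D)
    return q > zbh_threshold(q, bin_width)                     # step 6: ZBH anomaly declaration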
Settings used for GIPREBAD are found in the last column
of Table I; these are consistent with [34]. An exhaustive enu-
meration was performed on seven separate training images, and
the settings with the maximum area under receiver operating
characteristic (ROC) curves were chosen, as described in [34],
with a changing threshold on Q-scores used in step 4 to
generate the ROC curves. Additional details on settings and step
sizes are found in [34].
IV. EXPERIMENTS AND RESULTS
The HSI images used herein (see Fig. 1 and Table II) are
from the Forest I and Desert II Radiance collections from
the HYDICE push broom sensor with spectral sampling of
10 nm [1], [35]; additionally, these are identical to the im-
ages used in [11]. Three HSI AD algorithms are used for
comparison to GIPREBAD: the Reed–Xiaoli (RX) detector (window size = 25, 10 PCs) [1], [36], the linear RX (LRX) algorithm (line size = 2·n, 10 PCs) [14], and the SVDD detector (sigma = 905, radial basis function) [8]. These algorithms comprise two primary philosophies in HSI AD: local (RX and LRX) and global (SVDD) statistics.
Fig. 3. Histogram of AD score.
Fig. 4. ROC curves. (a) Log-scale and (b) linear-scale x-axis, for the desert image comparing ten iterations of GIPREBAD with RX, LRX, and SVDD.
The RX detector assumes independence and multivariate
normality in the HSI data; however, these assumptions are
usually invalid for HSI [14], [28]. Despite this, the RX detector
is considered as a “benchmark” detector for comparison due to
its simplicity [1], [8]. SVDD is a global detector that ignores the
background probability density function and, instead, computes
“support regions” of the space where most of the data (assumed
to be background) lie; points not in this area are labeled
anomalous [8], [37]. SVDD has subsequently been used as a
benchmark subspace-based anomaly detector in many studies
[14], [37] and is included herein for comparison.
A. GIPREBAD Results
Fig. 2 illustrates the Q-scores computed from (4), by plotting
Q-score versus pixel number. Noticeably, pixels corresponding
to the known anomalies produce higher Q-scores compared
with the background. Relatively low Q-scores for many target
pixels could be explained by spectral mixing.
The detection threshold resulting from the ZBH method is
shown in Fig. 3. This illustrates ZBH as a viable target sepa-
ration method. The results illustrated in Figs. 2 and 3 readily
confirm the logical assertion that anomalous pixels are poorly reconstructed from the principal component subspace.
Fig. 5. ROC curves. (a) Log-scale and (b) linear-scale x-axis, for the forest image comparing ten iterations of GIPREBAD with RX, LRX, and SVDD.
TABLE III
PERFORMANCE COMPARISON
B. Classifier Performance Comparison
A performance comparison between AD methods is made
using ROC curves, consistent with [38]. Figs. 4 and 5 illustrate
the performance of GIPREBAD versus both SVDD and RX
for the desert (see Fig. 4) and forest (see Fig. 5) images.
In both images, GIPREBAD noticeably outperforms both RX
methods in target detection and provides comparable or better
performance to SVDD.
Table III presents each ROC curve’s area under the curve
(AUC) along with computation time. GIPREBAD clearly offers
both higher accuracy and far greater computational efficiency,
as illustrated by the drastically lower processing times.
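For completeness, the ROC/AUC comparison can be reproduced from per-pixel detector scores and a ground-truth target mask with standard tooling; the scikit-learn calls below are illustrative and not part of the letter's implementation.

import numpy as np
from sklearn.metrics import roc_curve, auc

def detector_roc(scores, target_mask):
    """ROC curve and AUC from per-pixel anomaly scores and a boolean truth mask."""
    fpr, tpr, _ = roc_curve(target_mask.ravel().astype(int), scores.ravel())
    return fpr, tpr, auc(fpr, tpr)

# e.g., fpr, tpr, area = detector_roc(q_score_image, truth_mask)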
V. C ONCLUSION
Herein, a simple, intuitive, efficient, and effective algo-
rithm, namely, GIPREBAD, for HSI AD has been presented.
GIPREBAD leverages PCA feature extraction for both DR and
AD. GIPREBAD results also compared favorably with RX,
LRX, and SVDD in detection performance and showed a sig-
nificant improvement in computational time. Additionally, on
the presented images, GIPREBAD consistently outperformed
RX and LRX. Because GIPREBAD combines commonly used HSI
tools in a straightforward process, it would be easy to
implement in many HSI applications. GIPREBAD
could also potentially see use in real-time compressive sensing
applications.
ACKNOWLEDGMENT
The views expressed in this letter are those of the authors and
do not reflect the official policy of the United States Air Force,
Department of Defense, or the U.S. Government.
REFERENCES
[1] M. T. Eismann, Hyperspectral Remote Sensing. Bellingham, WA,
USA: SPIE Press, 2012.
[2] B. Du and L. Zhang, “Target detection based on a dynamic subspace,”
Pattern Recog., vol. 47, no. 1, pp. 344–358, Jan. 2014.
[3] B. Du and L. Zhang, “A discriminative metric learning based anomaly
detection method,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 11,
pp. 6844–6857, Nov. 2014.
[4] Q. Shi, L. Zhang, and B. Du, “Semisupervised discriminative locally
enhanced alignment for hyperspectral image classification,” IEEE Trans.
Geosci. Remote Sens., vol. 51, no. 9, pp. 4800–4815, Sep. 2013.
[5] F. M. Mindrup, T. J. Bihl, and K. W. Bauer, “Modeling noise in a frame-
work to optimize the detection of anomalies in hyperspectral imaging,”
Intell. Eng. Syst. through Artif. Neural Netw., vol. 20, pp. 517–524,
2010.
[6] M. J. Mendenhall and E. Merenyi, “Relevance-based feature extraction
for hyperspectral imagery,” IEEE Trans. Neural Netw., vol. 19, no. 4,
pp. 658–672, Apr. 2008.
[7] R. J. Johnson, J. P. Williams, and K. W. Bauer, “AutoGAD: An improved
ICA-based hyperspectral anomaly detection algorithm,” IEEE Trans.
Geosci. Remote Sens., vol. 51, no. 6, pp. 3492–3503, Jun. 2013.
[8] A. Banerjee, P. Burlina, and R. Meth, “Fast hyperspectral anomaly detec-
tion via SVDD,” in Proc. IEEE ICIP, 2007, pp. IV-101–IV-104.
[9] Y. Tarabalka, T. V. Haavardsholm, I. Kåsen, and T. Skauli, “Real-time
anomaly detection in hyperspectral images using multivariate normal
mixture models and GPU processing,” J. Real-Time Image Process.,
vol. 4, no. 3, pp. 287–300, Aug. 2009.
[10] M. D. Farrell and R. M. Mersereau, “On the impact of PCA dimension
reduction for hyperspectral detection of difficult targets,” IEEE Geosci.
Remote Sens. Lett., vol. 2, no. 2, pp. 192–195, Apr. 2005.
[11] K. D. Friesen, T. J. Bihl, K. W. Bauer, and M. A. Friend, “Contextual
anomaly detection cueing methods for hyperspectral target recognition,”
Amer. J. Sci. Eng., vol. 2, no. 1, pp. 9–16, 2013.
[12] T. E. Smetek, “Hyperspectral imagery target detection using improved
anomaly detection and signature matching,” Ph.D. dissertation, Air Force
Inst. Technol., Wright-Patterson AFB, OH, USA, 2007.
[13] F. Tsai, E. K. Lin, and K. Yoshino, “Spectrally segmented principal
component analysis of hyperspectral imagery for mapping invasive plant
species,” Int. J. Remote Sens., vol. 28, no. 5, pp. 1023–1039, 2007.
[14] J. P. Williams, T. J. Bihl, and K. W. Bauer, “Towards the mitigation
of correlation effects in anomaly detection for hyperspectral imagery,”
J. Defense Model. Simul., vol. 10, no. 3, pp. 263–273, Feb. 2013.
[15] K.-J. Cheng, “Compression of hyperspectral images,” Ph.D. dissertation,
Ohio Univ., Athens, OH, USA, 2013.
[16] D. Cerra, R. Muller, and P. Reinartz, “Noise reduction in hyperspectral
images through spectral unmixing,” IEEE Geosci. Remote Sens. Lett.,
vol. 11, no. 1, pp. 109–113, Jan. 2014.
[17] E. J. Jackson and R. H. Morris, “An application of multivariate quality
control to photographic processing,” J. Amer. Statist. Assoc., vol. 52,
no. 278, pp. 186–199, Jun. 1957.
[18] W. Li, H. H. Yue, S. Valle-Cervantes, and S. J. Qin, “Recursive PCA
for adaptive process monitoring,” J. Process Control, vol. 10, no. 5,
pp. 471–486, Oct. 2000.
[19] V. Chatzigiannakis, G. Androulidakis, K. Pelechrinis, S. Papavassiliou,
and V. Maglaris, “Data fusion algorithms for network anomaly detection:
classification and evaluation,” in Proc. 3rd ICNS, 2007, pp. 50–56.
[20] Q. Ding and E. D. Kolaczyk, “A compressed PCA subspace method
for anomaly detection in high-dimensional data,” in Proc. Joint Statist.
Meeting, Aug. 3, 2010.
[21] R. Machiraju and R. Yagel, “Reconstruction error characterization
and control: A sampling theory approach,” IEEE Trans. Vis. Comput.
Graphics, vol. 2, no. 4, pp. 364–378, Dec. 1996.
[22] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,”
Univ. Minnesota, Minneapolis, MN, USA, 2007.
[23] J. E. Fowler and Q. Du, “Anomaly detection and reconstruction from
random projections,” IEEE Trans. Image Process., vol. 21, no. 1,
pp. 184–195, Jan. 2012.
[24] W. Li and Q. Du, “Collaborative representation for hyperspectral
anomaly detection,” IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3,
pp. 1463–1474, Mar. 2015.
[25] J. Gao, Q. Shi, and T. S. Caetano, “Dimensionality reduction via compres-
sive sensing,” Pattern Recog. Lett., vol. 33, pp. 1163–1170, Jul. 2012.
[26] S.-S. Chiang, C.-I Chang, and I. W. Ginsberg, “Unsupervised hyperspectral
image analysis using independent component analysis,” in Proc. IEEE
Int. Geosci. Remote Sens. Symp., 2000, pp. 3136–3138.
[27] W. R. Dillon and M. Goldstein, Multivariate Analysis Methods and
Applications. New York, NY, USA: Wiley, 1984.
[28] S. Prasad and L. M. Bruce, “Limitations of principal components analysis
for hyperspectral target recognition,” IEEE Geosci. Remote Sens. Lett.,
vol. 4, pp. 625–629, Oct. 2008.
[29] E. Christophe, “Hyperspectral data compression tradeoff,” in Opti-
cal Remote Sensing, Advances in Signal Processing and Exploitation
Techniques, vol. 3. New York, NY, USA: Springer-Verlag, 2011,
pp. 9–29.
[30] D. A. Jackson, “Stopping rules in principal component analysis: a com-
parison of heuristical and statistical approaches,” Ecology, vol. 74, no. 8,
pp. 2204–2214, 1993.
[31] L. Fountanas, “Principal component analysis for hyperspectral image clas-
sification,” M.S. thesis, Naval Postgraduate School, Monterey, CA, USA,
2004.
[32] E. J. Jackson and G. S. Mudholkar, “Control procedures for residuals
associated with principal component analysis,” Technometrics, vol. 21,
no. 3, pp. 341–349, Aug. 1979.
[33] J. S. Lim, Two-Dimensional Signal and Image Processing. Englewood
Cliffs, NJ, USA: Prentice-Hall, 1990, pp. 546–549.
[34] J. A. Jablonski, “Reconstruction error and principal component based
anomaly detection in hyperspectral imagery,” M.S. thesis, Air Force Inst.
Technol., Wright-Patterson AFB, OH, USA, 2014.
[35] L. J. Rickard, R. Basedow, P. P. Silverglate, and E. E. Zalewski,
“HYDICE: An airborne system for hyperspectral imaging,” in Proc. SPIE,
vol. 1937, 1993, pp. 173–179.
[36] I. S. Reed and X. Yu, “Adaptive multiple-band CFAR detection of an
optical pattern with unknown spectral distribution,” IEEE Trans. Acoust.,
Speech, Signal Process., vol. 38, pp. 1760–1770, Oct. 1990.
[37] S. Matteoli, M. Diani, and G. Corsini, “A tutorial overview of anomaly
detection in hyperspectral images,” IEEE Aerosp. Electron. Syst. Mag.,
vol. 25, no. 7, pp. 5–28, 2010.
[38] T. Fawcett, “An introduction to ROC analysis,” Pattern Recog. Lett.,
vol. 27, pp. 861–874, Jun. 2006.