Principal component analysis in the wavelet domain: New features for
underwater object recognition
Gordon Okimoto^a,* and David Lemonds^b,**
^a Trex Enterprises, Inc., 3398 Manoa Road, Honolulu, HI 96822
^b ORINCON Corporation, 970 North Kalaheo Ave., Ste. C-215, Kailua, HI 96822
ABSTRACT
Principal component analysis (PCA) in the wavelet domain provides powerful features for underwater object recognition
applications. The multiresolution analysis of the Morlet wavelet transform (MWT) is used to pre-process echo returns from
targets ensonified by biologically motivated broadband signals. PCA is then used to compress and denoise the resulting
time-scale signal representation for presentation to a hierarchical neural network for object classification. Wavelet/PCA
features combined with multi-aspect data fusion and neural networks have resulted in impressive underwater object
recognition performance using backscatter data generated by simulated dolphin echolocation clicks and bat-like linear
frequency modulated (LFM) upsweeps. For example, wavelet/PCA features extracted from LFM echo returns have resulted
in correct classification rates of 98.6% over a six target suite, which includes two mine simulators and four clutter objects.
For the same data, ROC analysis of the two-class mine-like versus non-mine-like problem resulted in a probability of
detection (Pd) of 0.981 and a probability of false alarm (Pfa) of 0.032 at the "optimal" operating point. The wavelet/PCA
feature extraction algorithm is currently being implemented in VLSI for use in small, unmanned underwater vehicles
designed for mine-hunting operations in shallow water environments.
Keywords: principal component analysis; Morlet wavelet transform; hierarchical neural network; biomimetic systems;
underwater object recognition.
1. INTRODUCTION
Biomimetics is the attempt to emulate in hardware and software the natural biosonar of animals such as the bottlenose
dolphin and big brown bat. These creatures exhibit extraordinary echolocation capabilities in acoustically harsh
environments by exploiting structural cues in the acoustic backscatter generated by their broadband transmit signals [1].
Although the higher level processing responsible for the observed performance of these animals is not well understood, we
have nevertheless implemented a biomimetic system for underwater object recognition that uses dolphin-like echolocation
clicks and bat-like LFM upsweeps to ensonify targets of interest. In our approach, a compact set of features is extracted from
the echo return based on PCA compression in the wavelet transform domain. The resulting features are then presented to a
hierarchical neural network for aspect fusion and classification processing. Figure 1 illustrates the signal
processing that we have implemented using the MWT and PCA. In general, a set of features should be parsimonious and
faithful in order to alleviate the curse of dimensionality that plagues real-world pattern recognition systems [2]. In this study, we
show that PCA compression in the wavelet domain provides such features and results in a robust system that generalizes well
from training to test data (see Section 7).
This paper focuses on the specification and evaluation of signal features obtained by performing PCA in the wavelet
transform domain (wavelet/PCA features). In Section 2, we provide a brief description of the data that are used to evaluate
the impact of wavelet/PCA features on system performance. We note that wavelet/PCA features conform to the so-called
"expansion/compression" (E/C) paradigm that we describe in Section 3. Section 4 provides an overview of the Morlet
wavelet transform that implements the expansion phase of the E/C paradigm. Section 5 provides a discussion of PCA, which
is used to implement the compression phase of E/C. We discuss the evaluation of wavelet/PCA features using neural
networks in Section 6 and present performance results in Section 7. In Section 8 we present conclusions and ideas for future
work.
Figure 1. Biomimetic signal processing using wavelet/PCA features
*Correspondence: email: gokimoto@thermotrex; telephone: 808 988 7158
**Correspondence: email: lemonds@orinconhi.com; telephone: 808 254 1532
Part of the SPIE Conference on Detection and Remediation Technologies
for Mines and Minelike Targets IV • Orlando, Florida • April 1999
SPIE Vol. 3710 • 0277-786X/99/$10.00
Downloaded From: http://proceedings.spiedigitallibrary.org/ on 07/31/2016 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx
2. DATA
The data consist of echo returns from two distinct target suites, one of which was ensonified with a simulated dolphin
echolocation click and the other with an LFM upsweep. The dolphin-like data set was generated using the WAU1 biosonar
transducer, which ensonifies underwater targets using a simulated dolphin echolocation click. The targets are suspended in
the water column of a test tank facility located at the Hawaii Institute of Marine Biology (HIMB) on Coconut Island,
Kaneohe, HI. Each click is approximately 50 microseconds in duration and is highly impulsive and broadband, with a peak
frequency in excess of 100 kHz and a sound pressure level of around 200 dB (see Figure 2). Four targets of identical
dimensions but of differing material composition are ensonified by the WAU1 at a distance of 2 meters. The target types are:
1) foam-filled aluminum cylinder, 2) coral rock cylinder, 3) hollow aluminum cylinder and 4) hollow stainless steel
cylinder. Each target is rotated about the vertical axis from 0 degrees aspect (broadside) to 90 degrees aspect (left-end view) in
5 degree increments. Fifty echoes consisting of 1024 samples each are collected, range-gated and digitized at a rate of 1
MHz for each aspect angle. Each echo is then peak-centered using a matched filter based on the transmitted signal. An echo
consisting of 512 samples is then extracted based on the signal peak provided by the matched filter. Only the first 20 echoes
from each aspect are actually used. The echo data are generally of good quality, with signal-to-noise ratio (SNR) varying
significantly with aspect angle.
Figure 2. Dolphin-like data collection and pre-processing. Panel annotations: simulated dolphin click (spectrum axis 0 to
200 kHz); four cylindrical targets (1. foam-filled aluminum, 2. coral rock, 3. hollow aluminum, 4. stainless steel); 1 MHz
sample rate; 512-sample data record; 0 to 90 degrees in 5 degree increments; 1520 total samples; 800 samples for training (even).
The LFM backscatter data were provided by Dr. Gerald Dobeck of Coastal Systems Station, Dahlgren Division, NSWC,
Panama City, FL. These data consist of echo returns from a six-target suite which includes: 1) a bullet-shaped mine
simulator; 2) a conical mine simulator; 3) a water-filled fifty gallon drum; 4) an irregularly shaped limestone rock; 5) a
smooth granite rock; and 6) a water-soaked log. Each target was ensonified by a single 20 kHz to 60 kHz LFM upsweep at
5-degree increments from 0 to 355 degrees, resulting in 72 echo returns per target. An H52 hydrophone placed between the F33
LFM acoustic projector and the target was used to detect the backscatter signal, which was sampled at 2 MHz. Each echo
return was pre-processed to remove anomalous scatterers (e.g., target mount) and multipath reflections. The resulting time
series was then down-sampled to 1024 points at 250 kHz to reduce the amount of data (see Figure 3). Aspect dependent
reverberation was then added to each echo return at the 12 dB level with respect to a class-dependent reference amplitude [3].
The synthetic reverberation was modeled by convolving the LFM transmit signal with a Gaussian white noise process.
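The reverberation model described above can be sketched in a few lines of NumPy. This is an illustration only, not the authors' code: the 1 ms pulse length, the random seed, the helper names `lfm_upsweep` and `synthetic_reverb`, and the reading of "the 12 dB level" as an RMS level 12 dB below a reference amplitude are all assumptions that do not come from the paper.

```python
import numpy as np

def lfm_upsweep(f0, f1, duration, fs):
    """Linear FM upsweep from f0 to f1 Hz lasting `duration` seconds."""
    t = np.arange(int(round(duration * fs))) / fs
    k = (f1 - f0) / duration              # sweep rate in Hz/s
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))

def synthetic_reverb(pulse, n_samples, ref_amplitude, level_db, rng):
    """Convolve the transmit pulse with Gaussian white noise, then scale the
    result so its RMS sits `level_db` dB below `ref_amplitude` (assumed reading)."""
    noise = rng.standard_normal(n_samples)
    reverb = np.convolve(noise, pulse, mode="same")
    rms = np.sqrt(np.mean(reverb ** 2))
    target_rms = ref_amplitude * 10 ** (-level_db / 20)
    return reverb * (target_rms / rms)

fs = 250e3                                  # sample rate after down-sampling
pulse = lfm_upsweep(20e3, 60e3, 1e-3, fs)   # 1 ms pulse length is assumed
rng = np.random.default_rng(0)
reverb = synthetic_reverb(pulse, 1024, ref_amplitude=1.0, level_db=12.0, rng=rng)
```

Because the noise is filtered by the transmit pulse itself, the resulting background occupies the same 20 kHz to 60 kHz band as the echo, which is what makes it a harder test than additive white noise.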
Figure 3. LFM data collection and pre-processing. Panel annotations: transmit signal (linear FM upsweep, 20 kHz to 60 kHz);
pre-processing steps: remove biases, remove anomalous scatterers, add aspect-dependent reverb at 12 dB, downsample to
1024 points at 250 kHz.
3. THE EXPANSION/COMPRESSION PARADIGM [4]
The expansion/compression (E/C) paradigm is a general approach for extracting features for pattern recognition applications.
Essentially, the E/C paradigm first expands the input signal in some transform domain and then compresses the resulting
expansion for presentation to a classifier. The aim of the expansion phase is to better separate signal from noise and to
"pre-whiten" nonstationary and non-Gaussian noise backgrounds (e.g., fractal noise). Studies have shown that the orthogonal
wavelet transform of fractal noise is Karhunen-Loeve-like in terms of correlation structure [5]. This implies that the wavelet
transform can be used to pre-condition a signal to enhance signal-to-noise ratio (SNR) by concentrating signal information in
a small number of non-zero coefficients. In this study, we have implemented the expansion phase of the E/C paradigm as a
transformation to the time-scale domain using the Morlet wavelet transform (MWT). Although the MWT is not orthogonal,
it is multiresolution and still provides some degree of signal/noise separation and background equalization. Moreover, the
redundancy of the MWT provides a signal representation with features that are appealing and more easily interpretable than
those provided by an orthogonal wavelet transform.
Assuming that signal and noise are better conditioned in the wavelet domain, we expect to obtain better features by
compressing the wavelet transform of the signal rather than the signal itself. We have implemented the compression phase of
the E/C paradigm using standard PCA based on the singular value decomposition (SVD) of the wavelet data matrix. PCA is
a classical statistical technique that: 1) decorrelates the wavelet coefficients of the echo return; 2) removes the
pre-conditioned noise background; and 3) reduces the dimensionality of the wavelet feature vector that is presented to the neural
classifier. SVD was chosen to compute the wavelet/PCA features because it operates directly on the wavelet data matrix and
is more numerically stable than standard PCA, which forms the data covariance matrix explicitly.
We note that any number of signal transforms can be used to implement the expansion phase of the E/C paradigm.
Examples include the short-time Fourier transform, orthogonal wavelets such as those from the Daubechies family, adapted
wavelet packets or the Wigner-Ville transform. Similarly, any number of variations of PCA can be used to implement the
compression phase of E/C. These include PCA based on the Fisher Criterion, non-linear PCA, PCA neural networks and
independent component analysis. Consequently, any transform/PCA pair results in a distinct set of features that may be
useful for underwater object recognition applications.
[Figure 3 annotations: six targets (1. bullet mine, 2. cone-shaped mine, 3. 50 gal. drum, 4. limestone rock, 5. granite rock,
6. log); 2 MHz sample rate; 8192-sample data record; 0 to 355 degrees in 5 degree increments; 432 samples; 216 training
and 216 test samples.]
4. THE MORLET WAVELET TRANSFORM
Conventional signal processing based on the Fourier transform utilizes analysis windows of fixed size, and is hence scale
dependent. Wavelet signal processing, on the other hand, analyzes signals over a range of scales using wavelets of different
time durations, thus providing a multiresolution signal analysis that is scale independent [6]. For each scale, there
corresponds a time-domain wavelet and its associated filter in the frequency domain. The wavelet has a certain time
duration or scale of resolution and is used as a replica for correlation processing of the echo return in time. The associated
filter, which is the Fourier transform of the wavelet function, has center frequency and bandwidth dependent on the time
resolution of the wavelet. In this study, the higher scales correspond to the more compressed wavelets and hence, by the
uncertainty principle, to filters having higher center frequencies and wider bandwidth.
All wavelet functions are derived from a single mother wavelet. The Morlet mother wavelet $\psi$ is defined by equation (1)
[equation (1) is illegible in the source]
where the real parameter $k$ can be interpreted as the center frequency of the associated filter. Figure 4(a) shows the
Morlet mother wavelet $\psi$ defined by equation (1) for $k$ = [value illegible in the source], and Figure 4(b) shows the
corresponding filter $\hat{\psi}$ in the frequency domain.
Figure 4. (a) Morlet mother wavelet in the time domain, and (b) corresponding filter in frequency domain
A time-scale analysis of signals is achieved through wavelets defined by
$$\psi_{s,t}(u) = |s|^{-p}\,\psi\!\left(\frac{u-t}{s}\right) \qquad (2)$$
where $s$ is a continuous scale parameter, $t$ is a continuous translation point in time, and $p$ is an arbitrary real number.
Note that the wavelets $\psi_{s,t}(u)$ are dilated and translated versions of the Morlet wavelet $\psi$, and each wavelet
localizes signal in scale and time. The continuous MWT of a signal $f \in L^2(\mathbb{R})$ is defined as
$$\tilde{f}(s,t) = \int \bar{\psi}_{s,t}(u)\, f(u)\, du = \langle \psi_{s,t}, f \rangle \qquad (3)$$
where $\bar{\psi}$ is the complex conjugate of $\psi$ and $\langle \cdot,\cdot \rangle$ is the inner product in $L^2(\mathbb{R})$. By Parseval's identity we have that
$$\tilde{f}(s,t) = |s|^{1-p}\,\{\bar{\hat{\psi}}(s\omega)\,\hat{f}(\omega)\}^{\vee}(t) \qquad (4)$$
where $(\cdot)^{\vee}$ is the inverse Fourier transform of $(\cdot)$. That is, as a function of $t$, $\tilde{f}(s,t)$ is, up to the factor $|s|^{1-p}$, the inverse Fourier transform of
the product $\bar{\hat{\psi}}(s\omega)\hat{f}(\omega)$ with respect to the variable $\omega$, with $s$ playing the role of a parameter. This result illustrates how
the MWT operates in the frequency domain and allows us to efficiently compute $\tilde{f}(s,t)$ using the Fourier transform.
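The frequency-domain evaluation in equation (4) can be sketched with NumPy's FFT as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian filter shape `morlet_hat`, the centre-frequency value `k=6.0`, the choice `p=1.0`, the unit sampling period, and the test tone are all illustrative choices that do not come from the paper. Note also that the FFT product implements a circular convolution, so edge effects wrap around.

```python
import numpy as np

def morlet_hat(omega, k=6.0):
    """Assumed frequency response of a Morlet-type wavelet: a Gaussian
    centred at angular frequency k (normalisation constants omitted)."""
    return np.exp(-0.5 * (omega - k) ** 2)

def wavelet_transform_fft(f, scales, p=1.0, k=6.0):
    """Evaluate equation (4) one scale at a time: multiply the signal
    spectrum by the conjugated, dilated wavelet spectrum and invert."""
    n = len(f)
    omega = 2 * np.pi * np.fft.fftfreq(n)   # rad/sample, unit sampling period
    f_hat = np.fft.fft(f)
    rows = []
    for s in scales:
        filt = np.conj(morlet_hat(s * omega, k))
        rows.append(abs(s) ** (1 - p) * np.fft.ifft(filt * f_hat))
    return np.array(rows)                   # shape (len(scales), n), complex

n = 1024
tone = np.cos(2 * np.pi * 122 * np.arange(n) / n)   # about 0.75 rad/sample
coeffs = wavelet_transform_fft(tone, scales=[4.0, 8.0, 16.0])
```

With these illustrative values the tone sits near `k / 8`, so the row for scale 8 carries almost all of the response, while the rows for scales 4 and 16 are strongly attenuated.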
Because the signal representation provided by the MWT is highly redundant, it is possible to subsample equation (3) over
scale and time without losing the information that is necessary for the reconstruction of the original signal $f$. That is, we
dyadically subsample the continuous signal representation provided by equation (3) at the discrete lattice points $(q, p)$
where we set $s_q = 2^q$, $t_p = p\,\tau_0 2^q$ and $\tau_0$ is the sampling period of $f$ (which we now regard as a discrete signal). We
then use equation (4) to compute $\tilde{f}(s_q, t_p) = \tilde{f}(q,p)$ using wavelets of the form
$$\psi_{p,q}(u) = 2^{-q}\,\psi(2^{-q}u - p\,\tau_0). \qquad (5)$$
In practice, the scale and time indices, $q$ and $p$, have finite ranges since the signal $f$ is usually discrete and is assumed to
have finite time resolution.
For underwater object recognition, the dyadic subsampling of the Morlet filter bank described above is much too coarse in
the frequency domain and results in the loss of spectral information that could hurt classification performance. Our
implementation of the MWT spectrally decomposes a signal over three octaves with 16-20 filters per octave using a
technique known as voicing. This technique uses multiple "mother" wavelets to more densely cover the frequency domain in
order to prevent the loss of spectral information that would have otherwise occurred had we dyadically spaced the filter-bank
using just a single mother wavelet [7]. Specifically, given a mother wavelet $\psi$, the general form of the voices
$\psi^0, \psi^1, \psi^2, \ldots, \psi^{N-1}$ is shown in equation (6) below
$$\psi^j(u) = 2^{-j/N}\,\psi(2^{-j/N}u) \qquad (6)$$
for $j = 0, 1, 2, \ldots, N-1$. Each voice generates a discrete, constant-Q filterbank that is shifted in frequency but aligned in
time. Because the filterbank associated with each voice is time-aligned, we are able to capture all the lattice points for each
voice in the 2-dimensional time-scale domain that correspond to a given sample point in the signal. At the same time, we
have closed the gaps in the frequency domain. Figure 5 shows a sequence of filterbanks that cover three octaves in the
frequency domain for an LFM echo return (these octaves correspond to scales 6, 7, and 8, which is where the energy for both
the dolphin-like and LFM data resides). Note how the filters widen as frequency increases, illustrating the constant-Q nature
of the Morlet filterbank. Also note that had voicing not been used, we would have had just one filter to cover each octave,
which is clearly not sufficient for spectral coverage without gaps.
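To make the effect of voicing concrete, the following sketch lists filter centre frequencies for three octaves with 20 voices per octave. It is only an illustration of geometric (constant-Q) spacing: the function name and the top frequency `f_max=100e3` are placeholders, and the paper does not state the actual centre frequencies used.

```python
import numpy as np

def voiced_center_frequencies(f_max, n_octaves=3, n_voices=20):
    """Centre frequencies for `n_voices` filters per octave below f_max.
    Voice j of octave q sits at f_max * 2**-(q + j / n_voices)."""
    q = np.arange(n_octaves)[:, None]
    j = np.arange(n_voices)[None, :]
    return (f_max * 2.0 ** -(q + j / n_voices)).ravel()

fc = voiced_center_frequencies(f_max=100e3)   # f_max is a placeholder value
ratios = fc[:-1] / fc[1:]                     # constant ratio => constant Q
```

Every adjacent pair of filters is separated by the same ratio, `2 ** (1 / 20)`, so the 60 filters tile the three octaves without the one-filter-per-octave gaps of a purely dyadic bank.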
Figure 5. Morlet filterbank for an LFM echo return: (a) scale 6, (b) scale 7 and (c) scale 8
The magnitude of the output of each filter is sampled at every sample point of the signal resulting in a 60x1024 image
(assuming 20 filters per octave) that represents a highly redundant analysis of the signal over scale and time. This two-
dimensional representation is known as a scalogram (see Figure 6(b)). Each pixel of the scalogram represents the amount of
signal energy at a given time corresponding to the output of a filter associated with a wavelet at a given scale. Each
horizontal scan represents the signal's energy distribution over time with respect to a filter at a fixed scale and each vertical
scan represents the signal's energy distribution over the entire filter-bank with respect to a fixed time point of the signal.
Note also in Figure 6(b) how the MWT separates signal from reverberation background, especially at the higher resolution
scales.
Figure 6. (a) LFM echo return from 50 gallon drum, and (b) scalogram of echo return
The Morlet scalogram in Figure 6(b) is composed of 61,440 coefficients and is an extremely rich but dense signal
representation that is much too large for direct input to a neural classifier. To address this problem we must find a way to
compress the wavelet representation without losing important information. The first step we take is to bin-average in scale
and time to produce a 15 by 16 representation that is raster-scanned to a vector with 240 coefficients. Although the number
of coefficients has been reduced significantly, the dimensionality of the feature vector is still too high. One way to reduce
the dimensionality even further is to extract the principal components of the wavelet data matrix whose rows are equal to the
bin-averaged wavelet coefficients of the time series data. For example, after applying PCA compression to the wavelet
transform of the LFM backscatter data, we end up with feature vectors of length thirty.
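The bin-averaging step can be sketched as a block mean over the scalogram. The paper does not say how the bins are laid out, so equal, non-overlapping blocks (4 scales by 64 time samples) are an assumption here, as are the function name and the random stand-in scalogram.

```python
import numpy as np

def bin_average(scalogram, out_rows=15, out_cols=16):
    """Average non-overlapping blocks; dimensions must divide evenly."""
    rows, cols = scalogram.shape
    r, c = rows // out_rows, cols // out_cols
    assert rows == r * out_rows and cols == c * out_cols
    blocks = scalogram.reshape(out_rows, r, out_cols, c)
    return blocks.mean(axis=(1, 3))

# Random magnitudes stand in for a real 60 x 1024 scalogram.
scalogram = np.abs(np.random.default_rng(0).standard_normal((60, 1024)))
features = bin_average(scalogram).ravel()      # raster scan to a 240-vector
```

This reduces 61,440 coefficients to 240 before any PCA is applied, which also keeps the later eigen-decomposition small.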
5. PRINCIPAL COMPONENT ANALYSIS IN THE WAVELET DOMAIN
PCA is a classical statistical technique for characterizing the linear correlation that exists in a set of data [10]. It is closely
related to the Karhunen-Loeve transform (random processes), singular value decomposition (matrix diagonalization) and
factor analysis (correlation structure of multivariate stochastic observations). It has recently been getting much attention as a
means of extracting features for pattern recognition applications. The primary goal of PCA in pattern recognition is to find a
linear transformation that maps a vector of noisy, correlated time domain components into a much smaller vector of
denoised, uncorrelated feature components.
Let $A = [x_1, x_2, \ldots, x_K]^T$ be a data matrix whose rows are $K$ noisy data vectors $x_i$ of dimension $N$ with correlated
components (where superscript $T$ is the matrix transpose operator). We desire a linear transformation $P$ such that the
vector $y_i = P x_i$ has uncorrelated, denoised components and dimensionality $M$ much smaller than $N$ (i.e., $M \ll N$).
PCA states that there is an orthogonal matrix $V$ and a diagonal matrix $D$ such that $A^T A = V D V^T$. Note that $A^T A$ is
essentially the covariance matrix of the data $\{x_1, x_2, \ldots, x_K\}$. The columns of $V$ are eigenvectors for $A^T A$ and form an
orthonormal basis for $\mathbb{R}^N$ while the diagonal entries of $D$ are the eigenvalues $\lambda_j$ of $A^T A$ and are ordered so that
$\lambda_j \ge \lambda_{j+1}$ for $j = 1, 2, \ldots, N-1$. Now choose the eigenvectors of $V$ that correspond to the $M$ largest eigenvalues where
$M \ll N$ and form the matrix $V'$ whose columns are equal to these eigenvectors. Then $P : \mathbb{R}^N \to \mathbb{R}^M$ defined by
$P = (V')^T$ is the linear transformation we seek since it maps an $N$-dimensional vector into an $M$-dimensional vector where
$M \ll N$. The $M$ components of $y_i = P x_i$ are known as the principal components of $x_i$. The resulting feature vector
is also denoised due to the truncation of those eigenvectors of $V$ whose associated eigenvalues are below a certain
threshold. We note that the eigenvectors that were ignored may contain information that is useful for classification and one
needs to be careful that this information is not lost in the thresholding process. Usually though, a visual analysis of a plot of
the eigenvalues makes it clear where signal ends and noise begins and where the threshold should be set.
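The construction of $P$ described above can be sketched with NumPy's symmetric eigen-solver. The data here are synthetic (a rank-30 "signal" plus a weak noise floor) and only borrow the matrix sizes quoted for the LFM set; the function name and the noise level are assumptions for illustration.

```python
import numpy as np

def pca_projection(A, M):
    """Return P (M x N) built from the M leading eigenvectors of A^T A,
    together with all eigenvalues sorted in decreasing order."""
    eigvals, V = np.linalg.eigh(A.T @ A)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    return V[:, :M].T, eigvals

rng = np.random.default_rng(1)
K, N, M = 216, 240, 30                        # sizes quoted for the LFM data
A = rng.standard_normal((K, M)) @ rng.standard_normal((M, N))   # rank-M "signal"
A += 0.01 * rng.standard_normal((K, N))       # weak noise floor (synthetic)
A -= A.mean(axis=0)                           # demean the columns
P, eigvals = pca_projection(A, M)
Y = A @ P.T                                   # rows are y_i = P x_i
energy = eigvals[:M].sum() / eigvals.sum()    # fraction captured by M components
```

The rows of `Y` are uncorrelated by construction, since `Y.T @ Y` equals the diagonal matrix of the retained eigenvalues, and `energy` is the kind of cumulative fraction one would read off an eigenvalue plot when choosing the threshold.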
Figure 7 shows a plot of the 240 eigenvalues obtained for the LFM training data set. Note that the plot is essentially flat at
some nominal value starting at about the 30th eigenvalue, that is, about 85% of all the "energy" in the data is captured by the
first 30 principal components. The remaining principal components span the noise subspace. This implies that we use the
top 30 eigenvectors of $V$ to construct a transformation matrix that generates the principal wavelet components that
characterize the signal subspace for both the training and test data. To the extent that our training set characterizes the
universe of possibilities, retained eigenvectors will allow the neural network to interpolate over to the test cases. Similarly,
the eigenvalues for the dolphin-like training data can be plotted. The eigenvalues for the dolphin-like data level off at about
the 45th principal component, capturing a little over 90% of the variation in the training data. The eigenvectors associated
with these 45 eigenvalues are used to construct the PCA compression matrix for the dolphin-like training and test data.
Figure 7. Eigenvalue plot for LFM training data (even angles)
We have implemented PCA by taking the SVD of the wavelet data matrix whose rows are equal to the wavelet transform of
the echo returns in the training data [11]. Note that SVD operates directly on the wavelet data matrix, avoids the
computation of the data covariance matrix and is hence more numerically stable than standard PCA. Essentially, if $A$ is
an $M \times N$ matrix, then there are orthogonal matrices $U$ and $V$ and a diagonal matrix $\Sigma = \mathrm{diag}(\sigma_j)$ such that
$$A = U \Sigma V^T \qquad (7)$$
where $U$ is $M \times M$, $V$ is $N \times N$, and $\Sigma$ has the same dimensions as $A$. The columns of $U$ and $V$ are known as the left
and right singular vectors of $A$, respectively, while the diagonal elements $\sigma_j$ of $\Sigma$ are called the singular values of $A$.
We note that: 1) the orthogonal matrix $V$ is the same for both SVD and PCA; and 2) the eigenvalues of PCA are related to
the singular values of SVD by $\lambda_j = \sigma_j^2$. It follows that if $A$ is the demeaned wavelet data matrix, then the SVD of
$(1/\sqrt{K})A$ results in the PCA of $(1/K)A^T A$ (based on the covariance matrix of the training data). As shown above, we
can then construct a linear transformation $P$ that generates the principal components of the wavelet coefficients.
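The relation $\lambda_j = \sigma_j^2$ between the two routes can be checked numerically. The small random matrix below is synthetic and the sizes are arbitrary; only `numpy.linalg.svd` and `numpy.linalg.eigvalsh` are used.

```python
import numpy as np

rng = np.random.default_rng(2)
K, N = 50, 8
A = rng.standard_normal((K, N))
A -= A.mean(axis=0)                           # demeaned data matrix

# Route 1: eigen-decomposition of the covariance matrix (1/K) A^T A.
lam = np.linalg.eigvalsh(A.T @ A / K)[::-1]   # descending order

# Route 2: SVD of (1/sqrt(K)) A, never forming A^T A.
U, sigma, Vt = np.linalg.svd(A / np.sqrt(K), full_matrices=False)

M = 3
P = Vt[:M]                                    # rows are the leading right singular vectors
Y = A @ P.T                                   # principal components of each row of A
```

Forming `A.T @ A` squares the condition number of the problem, which is one common way to explain why working on `A` directly through the SVD tends to be the more stable route.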
Modern time-frequency signal representations such as the wavelet transform are highly redundant and are often unsuitable
for direct input to a classification algorithm. Indeed, the initial impetus for the development of the wavelet transform was to
produce a more intuitive and appealing way of presenting the essential features of signals for visual inspection. To the
human eye, the wavelet representation is quite informative in that signal, clutter and noise are often well separated and better
conditioned. But the price paid for this visual acuity is high dimensionality. PCA alleviates this curse of dimensionality by
compressing the wavelet representation into a few uncorrelated features that characterize the main features of the signal. The
effect of PCA compression is especially pronounced in the wavelet domain since the wavelet coefficients are often very well
conditioned in terms of SNR and noise equalization.
6. PERFORMANCE EVALUATION OF WAVELET/PCA FEATURES
The multilayer perceptron (MLP) is used as the baseline neural classifier for evaluation of wavelet/PCA features. The design
we have chosen uses hyperbolic tangent activation functions for the hidden nodes and logistic activations for the output
nodes. The net is trained using backpropagation to output a value of 0.9 for the node associated with a target class and 0.1
for the remaining output nodes. The node with the highest output value determines the class declaration of the network.
Neural networks with six, four and two output nodes were implemented to solve the six, four and two class problems,
respectively. A six-class net was implemented to classify backscatter from the six-target LFM target suite. A four-class net
was implemented for the dolphin-like data set based on the four target cylinders. Two-class nets were implemented for the
mine-like vs. non-mine-like and the man-made vs. non-man-made scenarios with respect to the LFM target suite. Here, the
two mine simulators were combined into the mine-like class while the remaining objects comprised the non-mine-like class.
For the man-made vs. non-man-made scenario, the two mine simulators and 50-gallon drum were combined to form the
man-made class while the other three objects comprised the non-man-made class. Finally, a two-node net was implemented
to address the man-made vs. non-man-made problem for the dolphin-like backscatter data. In this case, the coral rock
cylinder comprised the non-man-made class while the hollow aluminum, stainless steel and foam-filled aluminum cylinders
were grouped to form the man-made class.
All available data for a given scenario are divided into training and crossvalidation/test sets for neural network training.
When the mean-squared-error on the crossvalidation/test data set begins to increase, training is stopped. This approach
prevents over-training as the number of training exemplars is small, especially for the LFM data. As indicated above,
training data consist of exemplars based on even aspect angles, while crossvalidation/test exemplars are based on data
collected from odd aspect angles, i.e., the neural classifier is trained on even angles to maximize classification performance
on the odd angles. This training strategy was adopted to evaluate the interpolation capabilities of the net using wavelet/PCA
feature patterns.
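The even/odd aspect split and the 0.9/0.1 target coding can be sketched as follows for the LFM geometry (72 aspects at 5 degree spacing). The helper names are hypothetical, and "even" is read here as multiples of 10 degrees, which matches the angle ranges quoted in Section 7.

```python
import numpy as np

angles = np.arange(0, 360, 5)                 # 72 aspects per target
train_angles = angles[angles % 10 == 0]       # "even" aspects: 0, 10, ..., 350
test_angles = angles[angles % 10 == 5]        # "odd" aspects: 5, 15, ..., 355

def target_vector(class_index, n_classes, hi=0.9, lo=0.1):
    """Desired network output: `hi` at the true class node, `lo` elsewhere."""
    t = np.full(n_classes, lo)
    t[class_index] = hi
    return t

def declare_class(outputs):
    """Class declaration: the node with the highest output wins."""
    return int(np.argmax(outputs))

t = target_vector(2, 6)                       # desired output for class index 2 of 6
```

With six targets this split yields 6 x 36 = 216 training and 216 test exemplars.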
Multi-aspect data fusion was implemented by concatenating three exemplars that were separated by 30 degrees across all
aspect angles. This procedure increased the size of the input feature vector to the neural network threefold. But because the
PCA-compressed single-aspect LFM exemplars were only 30 components long, the final input vector had a length of only 90
components. Had we instead concatenated three uncompressed LFM exemplars of length 240, the input feature vector would
have been 720 components long. The greatly reduced size of the multi-aspect feature vector significantly improved LFM
classification performance (see Section 7). The same remarks also apply to the dolphin-like data. In general, PCA
compression in the wavelet domain enhances the positive effect that multi-aspect data fusion has on classification
performance because of the reduced size of the concatenated multi-aspect feature vector.
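One way to read the fusion step is sketched below: for each aspect, the feature vectors at that aspect and at the aspects 30 and 60 degrees further on are concatenated. The paper does not specify which three aspects are grouped or how the ends of the angular range are handled, so the forward offsets and the wrap-around (which only makes sense for the full-circle LFM data) are assumptions, as is the function name.

```python
import numpy as np

def fuse_aspects(features, step_deg=10, separation_deg=30, n_pings=3):
    """features: (n_aspects, n_features) for one target, rows ordered by aspect
    on a full circle. Returns (n_aspects, n_pings * n_features)."""
    shift = separation_deg // step_deg
    views = [np.roll(features, -i * shift, axis=0) for i in range(n_pings)]
    return np.concatenate(views, axis=1)

# Stand-in data: 36 training aspects (10 degree spacing) x 30 PCA features.
single = np.arange(36 * 30, dtype=float).reshape(36, 30)
fused = fuse_aspects(single)
```

With 30 PCA features per ping the fused vector has 90 components, which matches the 90-node input layer quoted for the six-class MLP.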
We note that the two-class results using wavelet/PCA features were obtained using hierarchical neural networks. That is, an
optimal, single-aspect 6-class neural network was trained using wavelet/PCA features. The 6-dimensional output vectors
generated by the trained net were then concatenated to form multi-aspect feature vectors of dimension 18 and input to a small
2-class neural network. This two-stage process resulted in the best ROC curves for the two-class problems involving the
LFM and dolphin-like data sets.
7. RESULTS
Confusion matrices and ROC curves summarizing classification performance are presented below. Results for the LFM data
are presented first, followed by results for the dolphin-like data. The entries in all confusion matrices are counts with respect
to the total number of test exemplars. All results use wavelet/PCA features together with 3-ping multi-aspect data fusion.
The average correct classification rate over the six target classes of the LFM dataset using wavelet/PCA features was 98.6%
(see Figure 8). The test data consisted of 36 exemplars from each target class corresponding to the odd angles that range
from 5 through 355 degrees for a total of 216 test exemplars. We note that the neural classifier was trained on the even
angles from 0 through 350 degrees for a total of 216 training exemplars. Hence the neural classifier was required to
interpolate between the even angles to identify echo returns at odd aspect angles. As Figure 8 shows, the combination of
wavelet/PCA features and multi-aspect data fusion enables the net to do a very good job of interpolating between the even
angles in order to classify the odd angles.
Figure 8. Six-class confusion matrix for LFM data. Annotations: 216 training and 216 test samples; peak-centered signal
extraction; 6-class MLP: 90x45x6 (nodes); targets: 1. bullet mine, 2. cone-shaped mine, 3. 50 gal. drum, 4. limestone rock,
5. granite rock, 6. log.
Figure 9 shows ROC curves for the mine-like vs. non-mine-like and the man-made vs. non-man-made problems using LFM
data. The "knee" of the mine-like vs. non-mine-like ROC curve of Figure 9(a) corresponds to a Pd of 0.982 and a Pfa of
0.032. The knee of the man-made vs. non-man-made ROC curve of Figure 9(b) corresponds to a Pd of 0.968 and a Pfa of
0.033. These performance results compare very well to other results obtained to date for this data set.
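For readers who want to reproduce this kind of operating-point summary, the sketch below computes (Pfa, Pd) pairs from classifier scores and picks a "knee". The paper does not define how its knee or "optimal" operating point was chosen, so the closest-to-the-ideal-corner rule used here is only one possible convention, and the scores and labels are made-up toy values.

```python
import numpy as np

def roc_points(scores, labels):
    """Pd and Pfa for every threshold taken from the sorted scores."""
    order = np.argsort(-scores)
    labels = np.asarray(labels, dtype=bool)[order]
    pd = np.cumsum(labels) / labels.sum()
    pfa = np.cumsum(~labels) / (~labels).sum()
    return pfa, pd

def knee(pfa, pd):
    """One possible 'knee': the point closest to the ideal corner (Pfa=0, Pd=1)."""
    i = int(np.argmin(pfa ** 2 + (1 - pd) ** 2))
    return pfa[i], pd[i]

# Toy example: 4 target exemplars (label 1) and 8 non-target exemplars (label 0).
scores = np.linspace(1.0, 0.0, 12)
labels = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0])
pfa, pd = roc_points(scores, labels)
best_pfa, best_pd = knee(pfa, pd)
```

Other conventions, such as maximizing Pd minus Pfa, can select a different point on the same curve, so the rule should be stated alongside any reported Pd and Pfa.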
Figure 9. ROC curves for LFM data: (a) mine-like vs. non-mine-like, and (b) man-made vs. non-man-made. Annotations:
216 training and 216 test exemplars; 2-class MLP: 18x10x2 (nodes); axes: Pd versus Pfa.
What follows are performance results for the dolphin-like data. Figure 10 shows the confusion matrix for the four-cylinder
target suite of the dolphin-like data set using wavelet/PCA features. An average correct classification rate of 96.7% over the
four target cylinders was recorded. We note a number of misses involving the foam-filled aluminum cylinder. The data
record length was 512 points, and we believe that a longer data record may help alleviate the "miss" problem for
the foam-filled aluminum cylinder. Still, the overall rate of correct classification is impressive considering that
discrimination is based on material composition alone.
Figure 10. Four-class confusion matrix for dolphin-like data
(Figure annotations: axis label "Truth"; class labels falum, coral, halum, SS; Avg Pcc = 0.967.)
Figure 11 shows the ROC curve for the man-made versus non-man-made problem using the dolphin-like data.
Figure 11(b) shows that the knee of the zoomed ROC curve corresponds to a Pd of 0.972 and a Pfa of 0.0058. A hierarchical
neural network design was used to fuse the multi-aspect outputs of an optimal single-aspect, four-class neural network. We
note the unusually low Pfa for this two-class problem coupled with a very respectable Pd.
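To make the idea of fusing single-aspect outputs concrete, the sketch below combines the four-class outputs of three pings. It deliberately substitutes a simple averaging rule for the second-stage neural network described above, so it illustrates only the data flow, not the authors' fusion network. The function name and the numbers are hypothetical.

```python
def fuse_pings(ping_outputs):
    """Average per-ping class scores; return (fused scores, winning class index).

    Stand-in for a trained fusion stage: the paper uses a hierarchical
    neural network here, not an average.
    """
    n_classes = len(ping_outputs[0])
    fused = [
        sum(ping[c] for ping in ping_outputs) / len(ping_outputs)
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: fused[c])
    return fused, best


# Hypothetical single-aspect outputs for three consecutive pings (4 classes).
pings = [
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.4, 0.4, 0.1, 0.1],
]

fused, label = fuse_pings(pings)
print(fused, label)
```

In this example the first ping alone would favor class 0, while the fused decision favors class 1. A trained second stage can learn weightings that a plain average cannot.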
Figure 11. ROC curve for the man-made vs. non-man-made scenario for dolphin-like data: (a) full ROC curve, (b) zoomed ROC curve
8. CONCLUSIONS AND FUTURE WORK
The principal conclusion based on the results of Section 7 is that wavelet/PCA features are capable of accurately and
robustly classifying different target types in a wide range of situations. The effectiveness of these features has been
demonstrated using two different data sets, each generated by a different transmit signal and target suite. We note
that one of the data sets (LFM) was contaminated with synthetic reverberation whose spectral distribution coincided with
that of the signal of interest, making these echo returns especially difficult to classify. We have also demonstrated the
effectiveness of combining wavelet/PCA features with multi-aspect data fusion using both data sets. Indeed, we observed
that the substantial dimensionality reduction achieved using PCA in the wavelet domain significantly enhances the
effectiveness of multi-aspect data fusion.
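The dimensionality reduction mentioned above can be sketched as a two-step pipeline: expand each echo with a Morlet wavelet transform, then compress the expanded representation with PCA. The code below is a minimal illustration under stated assumptions and not the authors' implementation. The wavelet parameter `w0`, the scales, the number of retained components, and the synthetic chirp data are all assumed for the example.

```python
import numpy as np


def morlet_cwt(signal, scales, w0=6.0):
    """Magnitude of a Morlet wavelet transform, one row per scale."""
    rows = []
    for s in scales:
        half = int(np.ceil(4 * s))          # truncate wavelet at +/- 4 scale units
        t = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(s)
        rows.append(np.abs(np.convolve(signal, wavelet, mode="same")))
    return np.vstack(rows)


def pca_fit(X, k):
    """Mean and top-k principal directions of the rows of X, via the SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


# Synthetic stand-in for echo returns: 20 noisy chirps of 128 samples each.
rng = np.random.default_rng(0)
n = np.arange(128)
echoes = [
    np.sin(2 * np.pi * (0.05 + 0.001 * n) * n) + 0.3 * rng.standard_normal(128)
    for _ in range(20)
]

scales = [2.0, 4.0, 8.0]
# Expansion: each echo becomes a 3 x 128 time-scale map, flattened to 384 values.
expanded = np.array([morlet_cwt(e, scales).ravel() for e in echoes])
# Compression: keep only the 5 leading principal components.
mean, components = pca_fit(expanded, k=5)
features = (expanded - mean) @ components.T

print(expanded.shape, features.shape)
```

Here each echo is reduced from 384 wavelet coefficients to 5 features. Small per-aspect feature vectors keep the classifier input manageable when the features from several pings are combined.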
The results presented in this paper suggest that we continue to exploit the E/C paradigm for new signal features by using
different signal expansions (e.g., biorthogonal wavelets, adapted wavelet bases, Wigner-Ville distribution) and various
nonlinear extensions of PCA compression (nonlinear PCA, PCA neural networks, independent component analysis). Also,
given the success of wavelet/PCA features using dolphin-like and LFM transmit signals, we intend to investigate the fusion
of features extracted from the echo returns generated by these two very different signals off a common target set for
enhanced object recognition. Finally, an effort to implement wavelet/PCA features and multi-aspect data fusion in VLSI is
currently underway for use in small, unmanned underwater vehicles designed for mine-hunting operations in very shallow
water environments. The challenge here is to realize the potential of wavelet/PCA features, multi-aspect data fusion and
similar novel signal processing approaches in real-world settings.
ACKNOWLEDGEMENTS
This work was supported by the Office of Naval Research (ONR) under Contract N00014-98-C-O1 1. The authors wish to
thank Dr. Harold Hawkins of ONR for his support and encouragement during this research effort.