Near-Optimal Bayesian Localization
via Incoherence and Sparsity
Volkan Cevher∗
Rice University
Petros Boufounos
MERL
Richard G. Baraniuk
Rice University
Anna C. Gilbert
University of Michigan
Martin J. Strauss
University of Michigan
ABSTRACT
This paper exploits recent developments in sparse approx-
imation and compressive sensing to efficiently perform lo-
calization in a sensor network. We introduce a Bayesian
framework for the localization problem and provide sparse
approximations to its optimal solution. By exploiting the
spatial sparsity of the posterior density, we demonstrate that
the optimal solution can be computed using fast sparse ap-
proximation algorithms. We show that exploiting the signal
sparsity can reduce the sensing and computational cost on
the sensors, as well as the communication bandwidth. We
further illustrate that the sparsity of the source locations can
be exploited to decentralize the computation of the source
locations and reduce the sensor communications even fur-
ther. We also discuss how recent results in 1-bit compressive
sensing can significantly reduce the amount of inter-sensor
communications by transmitting only the intrinsic timing in-
formation. Finally, we develop a computationally efficient
algorithm for bearing estimation using a network of sensors
with provable guarantees.
Categories and Subject Descriptors
C.2.1 [Computer-Communication Networks]: Distributed networks; G.1.6 [Numerical Analysis]: Optimization—Constrained optimization, convex programming, nonlinear programming; G.1.2 [Approximation]: Nonlinear approximation—Sparse approximation
∗Corresponding author {volkan@rice.edu}. This work is supported
by grants NSF CCF-0431150, CCF-0728867, CNS-0435425, and
CNS-0520280, DARPA/ONR N66001-08-1-2065, ONR N00014-
07-1-0936, 00014-08-1-1067, N00014-08-1-1112, and N00014-08-
1-1066, AFOSR A9550-07-1-0301, ARO MURI W311NF-07-1-
0185, and the Texas Instruments Leadership University Program.
General Terms
Theory, Algorithm, Performance.
Keywords
Sparse approximation, spatial sparsity, localization, bearing
estimation, sensor networks.
1. INTRODUCTION
Source localization using a network of sensors is a clas-
sical problem with diverse applications in tracking, habitat
monitoring, etc. A solution to this problem in practice must
satisfy a number of competing resource constraints, such as
estimation accuracy, communication and energy costs, sig-
nal sampling requirements and computational complexity.
A plethora of localization solutions exists with emphasis on
one or some of these constraints.
Unfortunately, most localization solutions either do not provide an end-to-end path from signals to location estimates with rigorous guarantees, or they are computationally inefficient. For instance, in a number of emerging
applications, such as localizing transient events (e.g., sniper
fire [21]) or sources hidden in extremely large bandwidths,
sampling source signals at Nyquist rate is extremely expen-
sive and difficult for resource-constrained sensors. Even if
the sources can be sampled at the Nyquist rate, in many
cases, we end up with too many samples and must com-
press to store or communicate them. Furthermore, even if
a processor in the network can receive such a large amount of data, the localization algorithms used to locate the targets are frequently slow, inefficient, and high-dimensional. These algorithms make it infeasible to distribute the required computation at the sensor level at reasonable hardware cost.
This paper re-examines the problem of target localiza-
tion in sensor networks and uses recent results in sparse
approximation and compressive sensing (CS) to provide a
fundamentally different approach with near Bayesian opti-
mality guarantees and highly efficient end-to-end algorithms
that are both rigorous theoretically and practical in real set-
tings. In particular, we show that (i) the Bayesian model or-
der selection formulation to determine the number of sources
naturally results in a sparse approximation problem, (ii) the
sparse localization solution naturally lends itself to decen-
tralized estimation, (iii) it is possible to reduce communi-
cation significantly by exploiting the spatial sparsity of the
sources as well as 1-bit quantization schemes, and (iv) a sim-
ple greedy (matching pursuit) algorithm provides provable
recovery guarantees for special localization cases.
We introduce a Bayesian framework for target localiza-
tion using graphical models in Sect. 3. Optimal localization
under this model is performed by computing the maximum
a posteriori (MAP) estimate of the number of sources and
the source locations. The model is quite general and can be
applied in a variety of localization scenarios.
We develop a discretization of the optimal Bayesian so-
lution that exploits the sparsity of the posterior density func-
tions in Sect. 5. This discretization enables the use of very
efficient optimization algorithms to jointly compute a near-
optimal MAP estimate of the number of sources and their
locations. As with the Bayesian framework itself, this dis-
cretization is quite general.
We exploit source signal sparsity, when available, in two ways in Sect. 6.1. First, we reduce the analog-to-digital sam-
pling requirements on the sensor, and therefore its cost, us-
ing CS techniques. Second, we reduce the amount of com-
munication required per sensor, and therefore its power con-
sumption, by compressing the signal at each sensor as it is
acquired. The latter allows us to efficiently transmit the data
to a central location to localize the sources.
We exploit source incoherence and spatial sparsity to ef-
ficiently decentralize the localization problem using CS in
Sect. 6.2. Each sensor can build a local representation (dic-
tionary) for the problem and use compressive measurements
of the remaining sensor network data to efficiently local-
ize the sources while using limited communication band-
width. Such decentralized processing does not require that
the source signals be sparse, only that there are few sources
distributed in space.
We capitalize on recent 1-bit CS results to further re-
duce the communication requirements for distributed local-
ization in Sect. 7. Specifically, by transmitting only the sign
of the compressive measurements we eliminate all the am-
plitude information from the sensor data and communicate
only phase and timing information. Thus, it is not neces-
sary that the amplitude gain of each sensor or the received-
signal-strength (RSS) be known when the localization is per-
formed. Timing information enables more robust and ac-
curate recovery of the source locations as compared with
communication-constrained approaches that transmit the re-
ceived signal strength at each sensor.
Our experiments with both real and simulated data in
Sect. 9 indicate that our theoretical approach has practical
significance. We show that in realistic situations, our new
Bayesian framework reduces the amount of data collection
and communication by a significant margin with graceful or
no degradation in the localization accuracy.
Prior Work. Related localization approaches have been
considered in [11, 18, 19, 7, 13, 6]. In [11], spatial spar-
sity is used to improve localization performance; however
the computational complexity of the presented algorithm is
high, since it uses the high-dimensional received signals. Di-
mensionality reduction through principal components analy-
sis was proposed in [18] to optimize a maximum likelihood
cost; however, this technique is contingent on knowledge of
the number of sources present for acceptable performance
and also requires the transmission of all the sensor data to
a central location to perform singular value decomposition.
Similar to [18], we do not assume the source signals are in-
coherent. In [19], along with the spatial sparsity assump-
tion, the authors assume that the received signals are also
sparse in some known basis and perform localization in near
and far fields; however, similar to [11], the authors use the
high-dimensional received signals and the proposed method
has high complexity and demanding communication require-
ments. Moreover, the approach is centralized and is not suit-
able for resource constrained sensor network settings. CS
was employed for compression in [7,13], but the method was
restricted to far-field bearing estimation. In [6], the authors extend the CS-based localization setting to near-field estimation with a maximum likelihood formulation, and examine the constraints necessary for accurate estimation: the number of measurements and sensors required, the allowable amount of quantization, the spatial resolution of the localization grid, and the conditions on the source signals. In this paper, we
build on the preliminary results in [6] by showing the opti-
mality of the sparse approximation approaches in Bayesian
inference. Compared to [6], we also provide a new 1-bit
framework and an efficient matching pursuit algorithm with
provable guarantees for the case where the sensors collaborate to calculate source bearings.
2. COMPRESSIVE SENSING BACKGROUND
Compressive sensing (CS) exploits sparsity to acquire
high-dimensional signals using very few linear measure-
ments [1, 9, 5]. Specifically, consider a vector θ in an N-dimensional space which is K-sparse, i.e., has only K non-zero components. Using compressive sensing, this vector can be sampled and reconstructed with only M = O(K log(N/K)) linear measurements:
\[ \chi = \Phi\theta + n, \qquad (1) \]
where Φ is an M × N measurement matrix, χ are the measurements, and n is the measurement noise.
The sparse vector θ can subsequently be recovered from the measurements using the following convex optimization:
\[ \widehat{\theta} = \arg\min_{\theta} \|\theta\|_1 + \lambda \|\chi - \Phi\theta\|_2^2, \qquad (2) \]
where the ℓ_p norm is defined as \(\|\theta\|_p = \left(\sum_i |\theta_i|^p\right)^{1/p}\), and λ is a relaxation parameter that depends on the noise variance. It can also be recovered using greedy algorithms, such as the ones in [23, 24] and references within.
In the absence of noise and under certain conditions on
Φ, both convex optimization and several greedy algorithms
exactly recover θ[5]. This formulation is robust even if the
vector is not sparse but compressible, i.e., has very few sig-
nificant coefficients and can be well approximated by a K-
sparse representation [1,9, 5].
A sufficient but not necessary condition on Φ to recover the signal using (2) is a restricted isometry property (RIP) of order 2K. This property states that there is a sufficiently small and positive δ such that for any 2K-sparse signal θ:
\[ (1-\delta)\|\theta\|_2^2 \;\le\; \|\Phi\theta\|_2^2 \;\le\; (1+\delta)\|\theta\|_2^2. \qquad (3) \]
Although in general it is combinatorially complex to verify the RIP on an arbitrary measurement matrix Φ, a surprising result in CS is that a randomly generated Φ with M = O(K log(N/K)) rows satisfies the RIP with overwhelming probability.
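As a concrete illustration of the recovery step in (2), the following sketch implements a basic iterative soft-thresholding (ISTA) loop in Python/NumPy. It is only a minimal example: the problem sizes, the Gaussian Φ, the step-size rule, and the choice of λ are illustrative assumptions rather than part of the framework above.

```python
import numpy as np

def ista(Phi, chi, lam, n_iter=300):
    """ISTA sketch for the relaxation in (2): min_theta ||theta||_1 + lam * ||chi - Phi @ theta||_2^2."""
    L = np.linalg.norm(Phi, 2) ** 2                       # squared spectral norm of Phi
    theta = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = theta - Phi.T @ (Phi @ theta - chi) / L       # gradient step on the data-fit term
        theta = np.sign(z) * np.maximum(np.abs(z) - 1.0 / (2 * lam * L), 0.0)  # soft threshold
    return theta

# Toy example: a K-sparse vector recovered from M = O(K log(N/K)) Gaussian measurements.
rng = np.random.default_rng(0)
N, K, M = 256, 5, 64
theta_true = np.zeros(N)
theta_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
chi = Phi @ theta_true + 0.01 * rng.standard_normal(M)
theta_hat = ista(Phi, chi, lam=20.0)
```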
The same framework applies if a vector is sparse in a sparsity-inducing basis or dictionary Ψ instead of the canonical domain. Specifically, if a = Ψθ, where a is the measured signal instead of θ, and Ψ is the sparsity-inducing dictionary, then (1) becomes
\[ \chi = \Phi a = \Phi\Psi\theta + n. \qquad (4) \]
Thus the problem is reformulated as the recovery of a sparse θ from χ, acquired using the measurement matrix ΦΨ.
If the sparsity-inducing basis is the Fourier basis, then
it is possible to sample and reconstruct the signal using ex-
tremely efficient Fourier sampling algorithms, such as the
ones presented in [10]. The advantage of these algorithms
is that they can operate with complexity that is sublinear in
the dimensionality of the signal, making them appropriate
for very large signals in computationally constrained envi-
ronments.
3. BAYESIAN INFERENCE FOR LOCALIZATION
3.1 Problem set up and notation
Our objective is to determine the locations of K sources in a known planar deployment area using the signals received by a network of Q sensors. We assume that neither the number of sources K nor the source signals are known. We denote the horizontal h and vertical v location of the q-th sensor (q = 1, ..., Q) by s_q = (s_{qh}, s_{qv})^⊤ with respect to a known origin. We assume that the sensor network is calibrated so that the location of each sensor is known across the network. We also assume that the local clocks of the sensors are synchronized within ±δ seconds.

We denote the received signal vector at the q-th sensor by z_q and its Fourier transform by Z_q = F z_q, where F is the Fourier transform operator. The time vector z_q is formed by concatenating T received signal samples z_q(t) at times t = t_1, ..., t_T. Similarly, the frequency vector is formed by concatenating T Fourier samples Z_q(ω) at frequencies ω = ω_1, ..., ω_T, corresponding to the time vector z_q. We then denote the unknown source signal vectors, their Fourier transforms, and locations by y_k, Y_k, and x_k = (x_{kh}, x_{kv})^⊤, respectively, for k = 1, ..., K. Finally, we represent the full sensor network data by the (QT) × 1 dimensional vector Z = (Z_1^⊤, ..., Z_Q^⊤)^⊤ in frequency and similarly by z in time.
3.2 Signal Propagation and the Sensor Observations
We denote A as the signal propagation operator, which takes a source signal y and its location x and calculates the observed signal z at a location s via
\[ z = \mathcal{A}_{x \to s}[y]. \qquad (5) \]
In an isotropic medium with a propagation speed of c, A is a linear operator, known as the Green's function, with a particularly distinguished form in the frequency domain:
\[ \mathcal{A}_{x \to s}: \; Z(\omega) = \frac{1}{\|x - s\|^{\alpha}}\, \exp\!\left(-j\omega\,\frac{\|x - s\|}{c}\right) Y(\omega), \qquad (6) \]
where j = √−1 and α is the attenuation constant that depends on the nature of the propagation.
In the sequel, we assume that A is the Green's function and α = 1 (spherical propagation). In this specific case, (5) can be represented with a linear matrix equation in the frequency domain due to (6). Hence, without loss of generality, we will discuss the localization problem and its solution in the frequency domain. In general localization problems, A must be learned or simulated to account for anisotropic media and multipath. Note that the algorithms in this paper can be modified to handle different operators as long as they are linear.
When A is the Green's function, the sensor network data Z can be written as a superposition of the source signals:
\[ Z = \sum_{k=1}^{K} \mathbf{A}(x_k)\, Y_k + N, \qquad (7) \]
where N is additive noise and A is the mixing matrix for the sensor network due to (6), with the following form:
\[ \mathbf{A}(x_k) = \begin{bmatrix} \mathbf{A}_1(x_k) \\ \vdots \\ \mathbf{A}_Q(x_k) \end{bmatrix}_{(QT \times T)}, \qquad (8) \]
\[ [\mathbf{A}_q(x_k)]_{lm} = \begin{cases} \dfrac{1}{\|x_k - s_q\|} \exp\!\left(-j\omega_l\, \dfrac{\|x_k - s_q\|}{c}\right), & l = m, \\[1ex] 0, & l \neq m, \end{cases} \qquad (9) \]
where l = 1, ..., T; m = 1, ..., T; and q = 1, ..., Q.
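For concreteness, the diagonal structure of (8)–(9) can be assembled directly from the sensor and grid geometry. The sketch below is our own illustration; the function names and the acoustic propagation speed c = 343 m/s are assumptions, not part of the paper.

```python
import numpy as np

def mixing_block(x_k, s_q, omegas, c=343.0):
    """Diagonal block A_q(x_k) of (9): per-frequency Green's function gains for one sensor."""
    d = np.linalg.norm(np.asarray(x_k, float) - np.asarray(s_q, float))   # ||x_k - s_q||
    gains = np.exp(-1j * omegas * d / c) / d                              # 1/d decay and phase delay
    return np.diag(gains)                                                 # T x T diagonal matrix

def mixing_matrix(x_k, sensor_positions, omegas, c=343.0):
    """Stack the Q per-sensor blocks into the (Q*T) x T matrix A(x_k) of (8)."""
    return np.vstack([mixing_block(x_k, s_q, omegas, c) for s_q in sensor_positions])
```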
Figure 1: A visual representation of the localization problem as a directed acyclic graph.
3.3 Graphical Model and the Inference Problem
We cast the localization problem as a latent variable es-
timation problem in Bayesian inference. To summarize the
inter-dependencies amongst the relevant variables, we vi-
sualize the localization problem in Fig. 1 with a directed
acyclic graphical model. In the graphical model, the dashed box denotes the set of Q sensor observations Z_q, which are assumed to be independent, in plate notation [2]. The shaded node within the dashed box represents the observed variables. The nodes in the solid box represent the latent variables, namely, the number of targets K, the k-th source signal Y_k, and its location x_k. The deterministic components of the problem are shown with solid dots, such as the sensor positions s_q and, as an example, the additive noise variance σ² at the sensors. In Fig. 1, arrows indicate the causal relationships, where the distribution of the variables at the head of an arrow depends on the variables at the tail.
Within the graphical model of Fig. 1, the latent variable pair (K, Y) defines a model M_K, labeled by K, where Y = (Y_1^⊤, ..., Y_K^⊤)^⊤. Note that the length of the vector of source signals Y depends on K. Each model M_K refers to a different probability density function (PDF) over the observed variables as we vary K and Y. The comparison of the posterior density of each model M_K for different values of K enables us to determine the number of targets and their source vectors.
Given the sensor network observations Z, the posterior density of M_K can be determined using Bayes' rule:
\[ p(\mathcal{M}_K \mid Z) \propto p(Z \mid \mathcal{M}_K)\, p(\mathcal{M}_K), \qquad (10) \]
where p(M_K) is the prior distribution of each model M_K, and p(Z|M_K) is the model evidence distribution. In the localization problem, we assume that the sources Y are unknown parameters that are uniformly distributed in their natural space. Hence, the model prior only depends on the number of targets K:
\[ p(\mathcal{M}_K) \propto p(K). \qquad (11) \]
This prior p(K) incorporates known information on the number of targets. In this paper we use an exponential prior, which penalizes large numbers of targets:
\[ p(K) \propto \exp(-\lambda K). \qquad (12) \]
In general, we do not have a direct expression for the model evidence p(Z|M_K). To determine p(Z|M_K), we marginalize the latent variables X = (x_1^⊤, ..., x_K^⊤)^⊤ via
\[ p(Z \mid \mathcal{M}_K) = \int p(Z \mid X, \mathcal{M}_K)\, p(X \mid \mathcal{M}_K)\, dX \qquad (13) \]
by using p(Z|X, M_K) and p(X|M_K), which are the probability density function (PDF) of the sensor network data and the prior distribution of the source locations, respectively.
The PDF p(Z|X, M_K) is determined by the physics of signal propagation and the sensor observations. Assuming i.i.d. zero-mean Gaussian noise with variance σ² at the sensors, we obtain via (7) that
\[ p(Z \mid X, \mathcal{M}_K) \sim \mathcal{N}\!\left( Z \,\Big|\, \sum_{k=1}^{K} \mathbf{A}(x_k) Y_k, \; \sigma^2 I \right), \qquad (14) \]
where N(µ, Σ) is shorthand notation for the Gaussian distribution with mean µ and covariance Σ.
On the other hand, the prior distribution on the target locations summarizes our prior knowledge of the locations. A quick inspection of the graphical model (Fig. 1) reveals that Y_k and the x_k's are independent. Hence, the prior distribution of the source locations has the following form:
\[ p(X \mid \mathcal{M}_K) = p(X \mid K). \qquad (15) \]
If we have prior information on the source locations, then it can be incorporated in the above. However, for the remainder of this paper we assume a uniform prior on the locations.
In the optimal Bayesian source location estimation, we first choose the single most probable model among the M_K alone to make a good prediction. Hence, we focus on the maximum a posteriori (MAP) estimate of the PDF in (10):
\[ \widehat{\mathcal{M}}_K = \arg\max_{\mathcal{M}_K}\; p(\mathcal{M}_K \mid Z). \qquad (16) \]
Then, given this MAP estimate of the model, localization becomes an inference problem from the posterior of the target locations. Specifically, the MAP estimate of the locations can be obtained as
\[ \widehat{X} = \arg\max_{X}\; p(X \mid Z, \widehat{\mathcal{M}}_K). \qquad (17) \]
We emphasize that the MAP estimate under the model assumptions can only be unique up to a permutation of the sources, since a re-indexing of the source locations x_k does not change the problem or the data. Therefore, (17) is symmetric to permutations of X.
4. THE PRICE OF OPTIMALITY: SAMPLING,
COMMUNICATION AND COMPUTATIONAL
CHALLENGES
To realize the Bayesian solution in a sensor network, we
must (i) sample the received signals at their Nyquist rate,
(ii) communicate the sensor observations Zto a collection
point, and (iii) solve the optimization problems correspond-
ing to (16) and (17). In this section, we discuss these issues
in detail and describe how they are traditionally handled.
In numerous localization applications, such as acoustic
vehicle tracking, human speaker localization, etc. [7, 8], the
necessary source Nyquist rate is typically quite low. Hence,
the cost and form factor of the required analog-to-digital con-
verter (ADC) hardware in each sensor are quite manageable.
However, in a number of emerging applications, such as lo-
calizing transient events (e.g., sniper fire [21]) or sources hid-
den in extremely large bandwidths, Nyquist rate sampling is
extremely expensive and difficult.
Even if the sources can be sampled at the Nyquist rate,
it is often necessary to compress the samples before storing
or communicating them. Compression reduces the storage
requirements of a T-dimensional signal by representing it in
a domain where most of the coefficients are zero or close to
zero, i.e., in a domain where the signal is sparse or compress-
ible, respectively (for example, the Fourier, DCT, or wavelet
domain). Classical compression then encodes only the mag-
nitude and location of the most significant coefficients. Com-
pressive sensing (CS) addresses the inefficiency of the classi-
cal sample-then-compress scheme by developing theory and
hardware to directly obtain a compressed representation of
sparse or compressible signals [5,9].
The sensor network communications necessary to
solve (16) and (17) in a centralized manner scale with the
product of the source sparsity and the number of sensors
(see [8] for an example application). The communication
requirements can still be quite demanding on the resources, for example, in a practical battery-operated or wireless sensor network. Hence, lossy compression of the observed sig-
nals is typically used. As an example, the received-signal-
strength (RSS) of the observed signals can be used as an ag-
gressive compression scheme. Such aggressive lossy com-
pression schemes focus on distilling the observed signals to
the smallest sketch possible but may result in significant ac-
curacy losses in the target location estimates. The focus of
Sect. 6 is on new signal compression schemes designed to
maintain information necessary for the localization problem.
Although required by the optimal Bayesian solution, centralized processing has many disadvantages, such as creating communication bottlenecks and catastrophic points of failure. Moreover, the resulting optimization problems
are high dimensional. Hence, approximate inference meth-
ods on graphical models such as (loopy) belief propaga-
tion, junction-tree algorithms, and variational methods based
on convex duality are often used to distribute the resulting
inference problem over the individual sensors of the net-
work [15, 12, 20]. As a result, the computation load of the
inference task is also distributed across the sensors.
The underlying message-passing mechanism of distributed (approximate) inference methods requires communication of local beliefs on the latent variables, whose size is proportional to the desired resolution of the latent variables.
However, further compression can be achieved by parameter-
izing the problem with certain kernel basis functions, fitting
Gaussian mixtures via fast Gauss transform, or variational
approximations [12,15,20]. The resulting approximation al-
gorithms inherit the estimation (or divergence) guarantees,
such as bounded distortion, of the underlying approximation
engine, as well as its disadvantages, such as an unknown number of communication loops.
5. APPROXIMATE INFERENCE FOR
LOCALIZATION
Section 3.3 demonstrated that the Bayesian solution to
the localization problem involves the optimization problems
(16) and (17). The first corresponds to a model order se-
lection problem which determines the number of targets K,
and the second corresponds to the location inference prob-
lem which determines the target locations given K. Unfor-
tunately both optimizations are difficult to solve analytically.
In this section we describe a computational approach
that uses a discretization of the source location grid to ef-
ficiently and accurately compute the optimal solution. Our
approach exploits the incoherence of the sources to factorize
the optimization problem and the sparsity of the posterior
density to compute its sparse approximation on the location
grid. Using this sparse approximation we jointly estimate
both the number of sources and their locations.
Although we can incorporate a variety of priors, for clarity of the derivations we only treat the case of a uniform prior on the source locations:
\[ p(X \mid K) \propto 1. \qquad (18) \]
Thus, the posterior distribution of the target locations is
\[ p(X \mid Z, \mathcal{M}_K) \propto p(Z \mid X, \mathcal{M}_K) \sim \mathcal{N}\!\left( Z \,\Big|\, \sum_{k=1}^{K} \mathbf{A}(x_k) Y_k, \; \sigma^2 I \right). \qquad (19) \]
5.1 Estimation of Source Signals
The Bayesian model order selection problem (16) is equivalent to the following optimization via (11):
\[ \widehat{\mathcal{M}}_K = \arg\max_{K}\; p(K)\, \max_{Y} \int p(Z \mid X, \mathcal{M}_K)\, dX. \qquad (20) \]
To solve (20), we make the following observations.

Observation 1: The maximization of the posterior PDF, given in (19), requires us to solve the following least squares problem:
\[ \widehat{Y} = \arg\min_{Y} E(X, Y), \quad \text{where} \qquad (21) \]
\[ E(X, Y) = Z^{*} Z - 2 Z^{*} \left( \sum_{k=1}^{K} \mathbf{A}(x_k) Y_k \right) + \left\| \sum_{k=1}^{K} \mathbf{A}(x_k) Y_k \right\|^2. \qquad (22) \]
If the sources satisfy the following factorization
\[ \left\| \sum_{k=1}^{K} \mathbf{A}(x_k) Y_k \right\|^2 \approx \sum_{k=1}^{K} Y_k^{*} \mathbf{A}^{*}(x_k) \mathbf{A}(x_k) Y_k, \qquad (23) \]
then it is easy to prove that the solution to (21) is given by
\[ \widehat{Y}_k = \mathbf{A}^{\dagger}(x_k)\, Z, \qquad (24) \]
where † denotes the pseudoinverse. When the sources (i) have fast-decaying autocorrelations and (ii) are sufficiently separated in space [6, 13], the factorization in (23) is quite accurate.
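Under the factorization (23), the per-source estimate (24) reduces to a pseudoinverse applied to the stacked data. A one-line sketch, reusing the hypothetical mixing_matrix() helper from the earlier snippet:

```python
import numpy as np

def source_estimate(x_k, sensor_positions, omegas, Z, c=343.0):
    """Least-squares source estimate (24): Y_hat_k = A^+(x_k) Z for a candidate location x_k."""
    A_k = mixing_matrix(x_k, sensor_positions, omegas, c)   # (Q*T) x T mixing matrix from (8)
    return np.linalg.pinv(A_k) @ Z                          # pseudoinverse solution
```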
Observation 2: The optimal source estimates Ŷ_k in (24) for k = 1, ..., K are independent of K given x_k. Then, the maximization operation with respect to Y in (20) can be moved into the integral. The resulting objective automatically ties the source signal estimates to the location estimates and modifies the model selection problem (20) to
\[ \widehat{K} = \arg\max_{K}\; p(K) \int p(Z \mid X, K, \widehat{Y})\, dX, \quad \text{where} \qquad (25) \]
\[ p(Z \mid X, K, \widehat{Y}) \sim \mathcal{N}\!\left( Z \,\Big|\, \sum_{k=1}^{K} \mathbf{A}(x_k) \mathbf{A}^{\dagger}(x_k) Z, \; \sigma^2 I \right). \qquad (26) \]
5.2 Discretization of the Source Locations
The Bayesian formulation in Sec. 3 defines the source locations as continuous random vectors in the 2-D plane. In this section, we discretize the plane using an N-point spatial grid. We assume the grid is sufficiently dense so that each target location x_k coincides with one of the N grid points. We then define an N-dimensional grid selector vector θ with components θ_i that are 1 or 0 depending on whether or not a source is present at grid point i. With this notation, note that the number of sources K is equal to the ℓ0 norm of θ, which is defined as the number of non-zero elements in the vector. Since θ is a vector of ones and zeros, the number of ones is also equal to its ℓ1 norm, defined as \(\|\theta\|_1 = \sum_{i=1}^{N} |\theta_i| = K\), where θ_i corresponds to the i-th grid point.

With a slight abuse of notation, we will use θ interchangeably with X in the sequel. By θ, we will refer to either the grid points or the actual physical locations corresponding to the nonzero elements of θ, depending on the context.
5.3 Joint Model Selection and Posterior Estimation
Using the discretization in Sec. 5.2, we define a dictionary Ψ whose i-th column is equal to A(θ_i) A^†(θ_i) Z. This column describes how the single source signal would be observed at the sensors if the source were located at grid point i. It is possible to show that if the source signals are uncorrelated with each other or have rapidly decaying autocorrelations, then the dictionary Ψ is incoherent and well conditioned for sparse approximation [6].

Via (26), the integral in (25) is then lower-bounded by
\[ \int p(Z \mid X, K, \widehat{Y})\, dX \;\ge\; \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left\{ -\frac{1}{2\sigma^2} \left\| Z - \Psi \widehat{\theta}_K \right\|^2 \right\}, \qquad (27) \]
where θ̂_K is the best K-sparse vector that minimizes ‖Z − Ψθ‖_2. The intuition behind this lower bound is straightforward. We first approximate the continuous integral by a discrete summation over the grid locations θ. Then, by keeping only the K heavy hitters of θ (e.g., the best K columns that maximize the joint posterior), we arrive at (27).

We use (27) and (25) to determine the model order:
\[ \widehat{K} \approx \arg\min_{K} \left\| Z - \Psi \widehat{\theta}_K \right\|^2 - 2\sigma^2 \log p(K). \qquad (28) \]
Using the exponential prior on K from (12) and substituting the ℓ1 norm of θ for K, this optimization becomes
\[ \widehat{\theta} \approx \arg\min_{\theta} \left\| Z - \Psi\theta \right\|^2 + 2\sigma^2 \lambda \|\theta\|_1, \qquad (29) \]
where θ is a vector of 0's and 1's. This optimization jointly solves for the number of sources K and their locations.

The discrete nature of θ makes (29) a combinatorial optimization problem. To solve it, we heuristically relax it and allow θ to take continuous positive values. Thus the optimization becomes a minimization problem easily solved using linear programming, basis pursuit, or greedy algorithms (for examples see [5] and references within). In practice we observe that this relaxation performs very well. There are guaranteed branch-and-bound methods to compute the combinatorial minimum using the relaxation results, but they are beyond the scope of this paper.
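As an illustration, the relaxed version of (29) with the nonnegativity constraint can be solved by the same iterative soft-thresholding idea sketched in Sect. 2, with the shrinkage clipped at zero. This is a minimal sketch, not the specific solver used in the experiments; the step-size rule and iteration count are assumptions.

```python
import numpy as np

def localize_relaxed(Psi, Z, sigma2, lam, n_iter=500):
    """Relaxation of (29): min_theta ||Z - Psi theta||^2 + 2*sigma2*lam*||theta||_1 with theta >= 0."""
    L = np.linalg.norm(Psi, 2) ** 2                       # squared spectral norm of the dictionary
    theta = np.zeros(Psi.shape[1])
    w = 2.0 * sigma2 * lam                                # weight on the l1 penalty
    for _ in range(n_iter):
        grad = np.real(Psi.conj().T @ (Psi @ theta - Z))  # gradient step (real part for complex data)
        z = theta - grad / L
        theta = np.maximum(z - w / (2 * L), 0.0)          # nonnegative soft threshold
    return theta  # peaks mark estimated source grid points; their count estimates K
```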
6. EXPLOITING COMPRESSED SENSING
This section examines how sparsity and compressive
sensing can be exploited in sensor networks for source local-
ization. Source signal sparsity, when available, reduces the
sensing cost and the communication burden for each sensor.
Spatial sparsity distributes the computation of the localiza-
tion algorithm and subsequently increases the robustness of
the network.
6.1 Signal Sparsity
When the signals are sparse in the frequency domain
(i.e., have very few significant frequency components), re-
cent results in CS enable the use of cheaper sensors for dig-
ital data acquisition. Two promising methods are random
demodulation and random sampling [17, 16]. Both methods
can be efficiently implemented in hardware. Furthermore,
random sampling enables very efficient greedy reconstruc-
tion algorithms that recover the signal with computational
complexity sublinear in the signal dimension.
Furthermore, if the source signals are sparse, then the
sensors do not need to communicate the entire received sig-
nals to the processing center. Communication resources can
be saved by transmitting only the significant frequency com-
ponents of the sensed data and their locations on the fre-
quency grid. If the signal is compressively sampled, then
the CS reconstruction algorithms provide these components
at their output. If a classical uniform Nyquist-rate sensor
is used instead, then the sparse components can be iden-
tified using a very low-cost FFT operation. In either case
‖Z|_ω − Ψ|_ω θ‖ can be used instead of ‖Z − Ψθ‖, where ω is the frequency support of the signals and |_ω selects only the vector or matrix rows in this frequency support.
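A sketch of this restriction, assuming the frequency bins are ordered identically within each sensor's block of the stacked vector Z (the helper name is ours):

```python
import numpy as np

def restrict_to_support(Z, Psi, support, Q, T):
    """Keep only the rows of Z and Psi that fall on the sparse frequency support.

    Z       : length Q*T stacked frequency-domain data (T bins per sensor)
    Psi     : (Q*T) x N localization dictionary
    support : indices (0..T-1) of the significant frequency bins
    """
    rows = np.concatenate([q * T + np.asarray(support) for q in range(Q)])  # same bins in every block
    return Z[rows], Psi[rows, :]
```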
6.2 Spatial Sparsity and Decentralized
Processing
Decentralize for Robustness. The minimization of (21)
with respect to the unknown sources Yrequires us to collect
the sensor network data Zat a central location, which is un-
desirable in some cases. To overcome the need for central-
ized processing, consider the following upper-bound to the
objective function in (21):
\[
\begin{aligned}
\min_{Y} \Big\| Z - \sum_{k} \mathbf{A}(x_k) Y_k \Big\|^2
&= \min_{Y} \sum_{q=1}^{Q} \Big\| Z_q - \sum_{k} \mathbf{A}_q(x_k) Y_k \Big\|^2 \\
&\le \min_{\forall q} \sum_{\forall i \setminus q} \Big\| Z_i - \sum_{k} \mathbf{A}_i(x_k) \mathbf{A}_q^{\dagger}(x_k) Z_q \Big\|^2
= \min_{\forall q} \big\| Z - \widehat{\Psi}_q \theta \big\|^2, \qquad (30)
\end{aligned}
\]
where the i-th column of Ψ̂_q is defined by A(θ_i) A_q^†(θ_i) Z_q
for each grid point i. The upper-bound in (30) is obtained by
(i) simply factoring the objective across the sensors, (ii) in-
dependently optimizing the individual factors at each sensor,
and (iii) choosing the minimum objective across all the sen-
sors. Since each factorization requires local data, the com-
putation is distributed across all the sensors.
Given that we can calculate approximate source estimates individually at each sensor, it is also natural to distribute the model order selection problem (28) among the sensors. The key idea is that when we plug the local signal estimates into the model order selection (28), the new objective function with respect to the source locations is still a surrogate for the original problem:
\[ \widehat{K} \approx \arg\min_{K} \big\| Z - \widehat{\Psi}_q \widehat{\theta}_K \big\|^2 - 2\sigma^2 \log p(K). \qquad (31) \]
The objective value of the optimization problem (31) provides a score with which to rank all the local solutions across the sensor network. The sensor network then chooses the minimum-score solution among all the sensors via (30).
Enter CS to reduce communication. Since we know the desired θ is sparse, we can use a Gaussian random matrix Φ for dimensionality reduction. Via the RIP, which was discussed in Sect. 2, we have
\[ \frac{1}{1+\delta} \|\Phi Z - \Phi\Psi\theta\|^2 \;\le\; \|Z - \Psi\theta\|^2 \;\le\; \frac{1}{1-\delta} \|\Phi Z - \Phi\Psi\theta\|^2. \qquad (32) \]
The number of random measurements required for this isometry is O(K log(N/K)), proportional to the desired spatial sparsity K of θ.
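A minimal sketch of this dimensionality reduction: the random seed stands in for whatever mechanism the network uses to share Φ, and the measurement count M is an assumption.

```python
import numpy as np

def compress(Z, M, seed=0):
    """Project the stacked sensor data onto M Gaussian random measurements for transmission."""
    rng = np.random.default_rng(seed)                        # shared seed so every sensor can rebuild Phi
    Phi = rng.standard_normal((M, Z.shape[0])) / np.sqrt(M)  # i.i.d. Gaussian measurement matrix
    return Phi, Phi @ Z

# Each sensor then solves the compressed analogue of (29)/(31), e.g. with the nonnegative
# ISTA sketch above, using (Phi @ Psi_q, Phi @ Z) in place of (Psi_q, Z).
```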
7. SUPPORT RECOVERY FROM
QUANTIZED MEASUREMENTS
Transmission of the network data Zrequires that the
continuous-amplitude values be quantized to a certain pre-
cision. In this section we describe how quantization to 1-bit
values can be effectively used to transmit the sensor data. A
1-bit quantization scheme was used in [7], using standard CS
reconstruction methods to reduce the communication band-
width. The approach we propose here also uses 1-bit quanti-
zation but is modeled after the 1-bit CS theory [3].
The proposed algorithm eliminates received-signal-
strength (RSS) information from the data and uses no com-
munication bandwidth to transmit them. Instead it transmits
only the more robust timing and phase information in the
signal. This enables significantly more accurate localization
even in far-field and bearing estimation configurations.
The 1-bit data also eliminate signal amplitude informa-
tion, which is not necessary for localization using our algo-
rithm. Thus it is not necessary to perform accurate sensor
calibration to have a common amplitude reference. Further-
more, no communication resources are used to transmit un-
necessary information.
7.1 1-bit Quantization
The 1-bit data transmitted through the network is the sign of the CS measurements, henceforth denoted by ζ:
\[ \zeta \equiv \mathrm{sign}(\Phi Z) = \mathrm{sign}(\Phi\Psi\theta), \qquad (33) \]
where sign(x) is a vector whose elements are +1 or −1 if the corresponding element of x is positive or negative, respectively. This vector, however, only indicates the sign of each measurement. Directly using it in the optimization (29) as a substitute for Z would result in suboptimal solutions. Instead, the localization algorithm should only ensure that the recovered location information θ is consistent with the measurements, assuming no measurement noise. Furthermore, the location information θ is a positive quantity, which should also be enforced as a constraint in the reconstruction algorithm.
7.2 1-bit Localization Algorithm
The 1-bit information eliminates the amplitude information of the signal since sign(ΦΨθ) = sign(cΦΨθ) for any positive constant c. Thus, determining the sparse location data by minimizing the ℓ1 norm consistent with the signs would result in a zero signal. An additional reconstruction constraint is necessary to recover the locations of the sources. Since the ℓ1 norm of the signal is used in (29) as a proxy for the source sparsity, [3] constrains the ℓ2 norm of the signal such that ‖θ‖_2 = 1. Thus, the location information is recovered by solving the following optimization:
\[ \widehat{\theta} = \arg\min_{\theta} \|\theta\|_1 \quad \text{s.t.} \quad \mathrm{sign}(\Phi\Psi\theta) = \zeta, \;\; \mathrm{sign}(\theta) = +1, \;\; \|\theta\|_2 = 1. \]
The resulting location vector can subsequently be scaled to have the desired properties.
Imposing the sign measurements as hard constraints in the reconstruction has the potential to generate infeasible problems in the presence of measurement noise or errors in the transmission. However, under the assumption of Gaussian measurement noise, it can be shown that the hard constraints can be softened and approximated by a one-sided quadratic function:
\[ f(x) = \begin{cases} 0, & x \ge 0, \\ x^2, & x < 0. \end{cases} \qquad (34) \]
Thus, the constrained optimization above can be relaxed to:
\[ \widehat{\theta} = \arg\min_{\theta} \|\theta\|_1 + \lambda_1 \sum_{l} f\big(\zeta_l (\Phi\Psi\theta)_l\big) + \lambda_2 \sum_{k} f\big(\theta_k\big) \quad \text{s.t.} \quad \|\theta\|_2 = 1, \qquad (35) \]
where λ1 and λ2 are the relaxation parameters. The optimization in (35) can be efficiently computed using the algorithm in [3]. Under this relaxation, the optimization in (35) is the 1-bit equivalent of the optimization in (29) and solves essentially the same problem when the data is quantized to 1-bit sign information.
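The following is a rough projected-gradient sketch of (35) using the one-sided quadratic (34); it is not the fixed-point algorithm of [3], and the step size, iteration count, initialization, and the assumption of a real-valued effective matrix are all illustrative choices.

```python
import numpy as np

def one_sided_quadratic(x):
    """f(x) from (34): zero for x >= 0, x^2 for x < 0 (penalizes constraint violations)."""
    return np.where(x < 0, x**2, 0.0)

def one_bit_localize(B, zeta, lam1, lam2, step=1e-2, n_iter=1000):
    """Projected-gradient sketch of (35). B is the (real-valued) effective matrix Phi @ Psi."""
    N = B.shape[1]
    theta = np.ones(N) / np.sqrt(N)                            # start on the unit l2 sphere
    for _ in range(n_iter):
        r = zeta * (B @ theta)                                 # sign-consistency terms, should be >= 0
        g = lam1 * (B.T @ (zeta * 2 * np.minimum(r, 0.0)))     # gradient of the consistency penalty
        g += lam2 * 2 * np.minimum(theta, 0.0)                 # gradient of the positivity penalty
        theta = theta - step * g
        theta = np.sign(theta) * np.maximum(np.abs(theta) - step, 0.0)  # soft threshold for the l1 term
        nrm = np.linalg.norm(theta)
        theta = theta / nrm if nrm > 0 else np.ones(N) / np.sqrt(N)     # project back to ||theta||_2 = 1
    return theta
```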
8. SPECIAL CASE:
BEARING ESTIMATION
The Bayesian localization framework was derived as-
suming we localize sources on a 2D-grid. The optimization
problems and the discussed solution algorithms exploited the
spatial sparsity of the sources on the location grid. This
framework is general enough that it includes localizing tar-
gets in 1D-grids as a special case, e.g., the bearing estimation
of sources using a network of sensors, which is the focus of
this section.
Note that in localization problems, it is customary to as-
sume that (i) the propagation medium is isotropic and (ii) the
sources are isotropic point sources. In reality, these two as-
sumptions are somewhat idealistic: we almost always have
a non-isotropic medium, and most sources are directional.
Hence, the source propagation can be assumed to be uni-
form only within a small cone, as illustrated in Fig. 2. When
the data from all of the sensors is fused for localization,
the directional nature of the signal propagation discrepan-
cies might cause estimation errors in localization. This ob-
servation motivates a specialized localization algorithm for a
collection of nearby sensors (e.g., Mica nodes [14]) when a
sensor management system can self-organize the sensors to
initiate bearing estimation.
8.1 Far-field localization
We assume that there are K targets that are sufficiently far away from the sensors so that they can be considered to lie on a ring of radius R2 concentric with the sensor disk (e.g., see Fig. 3). Hence, to localize these targets, it suffices to estimate their bearing with respect to the sensors. We assume that the sources are restricted to N equally-spaced grid points on the circle, where angles are measured with respect to the horizontal axis.
Without loss of generality, the K targets transmit at one frequency ω with a corresponding wavelength λ = c/ω. When the targets transmit at different frequencies, the problem is simpler since we can then solve for the bearings at each frequency separately. For analysis, we assume that Q sensors are placed uniformly at random within a concentric
Figure 2: Bearing estimation example: The source lo-
cation is marked with a star, and the sensor locations
are shown with circles. The sensors within the solid and
dashed triangles experience similar propagation for their
observed signals. The bearing of the source can be esti-
mated from these sensors using the Bayesian bearing es-
timation framework.
Figure 3: Far-field scenario in which targets (red) are placed arbitrarily on a ring at some large distance R2 from a field of sensors (blue), which are placed uniformly at random within a disk of radius R1. The yellow sensor is the query sensor.
sensor disk of radius R1 with polar coordinates (r_p, ψ_p). The Q sensors send their lists to a central processing unit, which builds a dictionary Ψ and then runs the Bearing Pursuit algorithm of Fig. 4. We assume that the central unit knows the locations of the other sensors so that it can use this information to build the dictionary.
Next, we describe the dictionary matrix Ψ which we need to form to determine the bearings. With the single-frequency assumption, the dictionary matrix Ψ can be built analytically. In fact, due to geometry, the (p, j) entry of Ψ is simply
\[ \Psi_{p,j} = e^{\,2\pi i\, (r_p/\lambda)\cos(\psi_p - \theta_j)}. \]
For any p, j, note that we have |Ψ_{p,j}| = 1. Then, the bearing estimation can be obtained via
\[ \Psi\theta = y. \qquad (36) \]
We note that the matrix Ψ is not the result of applying a Johnson-Lindenstrauss (JL) matrix to each sensor's observations.
Algorithm: Bearing Pursuit
Inputs: number K of sources, dictionary Ψ, and vector y of measurements
Output: list L of O(K) locations

T = O(K)                          // size of list maintained
r = 0                             // current representation
For each iteration j = 0, 1, ..., O(log K):
    form vector z = Ψ* y
    retain top T entries in z
    update representation r = r + z
    prune r to maintain list of size T
    update measurements y = y − Ψ r

Figure 4: Pseudocode for the Bearing Pursuit algorithm.
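For reference, a direct NumPy transcription of the pseudocode in Fig. 4 is given below; the retained-list size T = 2K, the iteration count, and the normalization by the number of sensors (matching the single-sensor estimator analyzed in the Appendix) are our own illustrative choices.

```python
import numpy as np

def bearing_pursuit(Psi, y, K):
    """Greedy bearing recovery following Fig. 4. Psi: Q x N bearing dictionary, y: length-Q data."""
    Q, N = Psi.shape
    T = 2 * K                                          # size of the retained list, O(K)
    r = np.zeros(N, dtype=complex)                     # current representation
    resid = y.astype(complex).copy()
    for _ in range(int(np.ceil(np.log2(K))) + 1):      # O(log K) iterations
        z = Psi.conj().T @ resid / Q                   # correlate the residual, averaged over sensors
        keep = np.argsort(np.abs(z))[-T:]              # retain the top T entries of z
        z_top = np.zeros_like(z)
        z_top[keep] = z[keep]
        r = r + z_top                                  # update the representation
        keep_r = np.argsort(np.abs(r))[-T:]            # prune r back to a list of size T
        pruned = np.zeros_like(r)
        pruned[keep_r] = r[keep_r]
        r = pruned
        resid = y - Psi @ r                            # update the measurements
    return r                                           # the largest entries mark the source bearings
```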
Figure 5: (a) Sensor network simulation topology for 2-D source localization. (b) Source signals used in the simulations.
The dictionary does, however, have both sufficient randomness and sufficient structure, which Bearing Pursuit can exploit to solve Equation (36) for the non-zero entries in θ. The Appendix proves that Bearing Pursuit correctly identifies the bearings of the sources.
9. EXPERIMENTS
9.1 Near-Optimal Spatial Localization
To demonstrate the localization framework and the op-
timization algorithms, we simulate a sensor network where
we use Q= 30 sensors, randomly deployed on a 150 ×150
m2deployment area, to localize two targets. In Fig. 5(a), the
target locations are shown with stars whereas the sensor loca-
tions are shown with circles. In this experiment, our sources
are actual recorded projectile shots, as shown in Fig. 5(b).
The power of the first source (top) is approximately one third
of the power of the second source (bottom).
We simulate the decentralized message passing scheme
as discussed in Sect. 6.2. We compress the dimensions of the
observed signals 50:1from their Nyquist rate using Gaussian
random matrices for transmission (2% compression) at each
sensor. Given the compressive measurements from other
sensors in the network, each sensor proceeds to solve (31) lo-
cally. Finally, each solution along with the objective score is
passed across the sensor network. Only the solution with the
minimum score among all the sensors is kept during trans-
mission.
Figure 6: (a) Estimated local scores for the localization solutions. (b) The localization solution corresponding to the best score (q = 19). The true target locations are marked with stars. (c) The localization solution corresponding to sensor #27. (d) The mean of all the localization vectors.
Figure 6 summarizes the estimation results. Note that
the heights of the peaks are approximately proportional to
the source powers. In Fig. 6(a), the locally computed data
score values are shown. The scores vary because the dictio-
naries are built using the observed signals themselves, which
include both sources. In 6(b), we illustrate the localization
result for the sensor with the best local score. Even in the
presence of additive noise in the observed signals and the
high amount of compression, the resulting location estimates
are satisfactory. In 6(c), we randomly selected sensor # 19
and plotted its localization output. Given the ground truth,
shown with stars in the plot, the localization output of sen-
sor # 19 is much better than the sensor with the best local
score. However, over Monte Carlo runs, we expect the data fusion scheme to perform better on average. For completeness, we show in Fig. 6(d) the average of all the localization outputs from the sensor network.
Finally, Fig. 7 summarizes the localization results by
solving the optimization problem (35). In contrast to the re-
sults in Fig. 6, the results in Fig. 7 only use the sign measure-
ments of the compressive samples. Hence, the compression
is 800:1. Unfortunately, we do not have a score function for the 1-bit support recovery results derived from the Bayesian framework. Therefore, we heuristically use the mean of all the sensor estimates, as shown in Fig. 7(a). The two targets appear in the solution as expected, along with some spurious peaks due to noise and the drastic compression rate.
Figures 7(b) and (c) show the localization results from two
Figure 7: 1-bit estimation results. (a) Average of the sparse solutions across the sensor network. (b) The localization vector of the sensor with the best score. (c) A randomly selected sensor output. The true target locations are marked with stars.
random sensors in the sensor network. Note that the less
powerful target is missed in Fig. 7(c).
9.2 Bearing Estimation
In this section, we demonstrate the bearing estimation
performance of the proposed algorithms in Secs. 7 and 8.
Our focus is to demonstrate the potential reductions in communication cost as well as the computational efficiency of obtaining bearing estimates in a wireless setting, rather than to argue about sampling efficiency. For this experiment, we collected acoustic vehicle data for a convoy of five vehicles traveling on an oval track. Since the spectral support of the vehicle signals usually lies in the range of 0–500 Hz, existing sampling hardware suffices for this task. Therefore, we collected data using 10 sensors, which uniformly sampled data at a rate of 4410 Hz. The network reports its bearing estimates twice per second. Hence, the number of signal samples per bearing estimate is 2205.
In the experiment, after the sensors collect Nyquist sam-
ples, they create local dictionaries as described in Sect. 6.2 and
calculate the random projections of their data with a pre-
stored Gaussian random matrix of dimensions 100 ×2205,
which is different at each sensor. At each sensor, we create
a uniform grid in the bearing domain [0,π)using N= 180
discrete locations.
Figures 8(a)–(d) summarize the results of four different
approaches. In Fig. 8(a) and (b), the sensors record the zero
crossing information of each measurement as 0/1bits, where
0corresponds to a negative signal sample and 1corresponds
to a positive signal sample. Then, to determine what each
bit corresponds to as a signal value, each sensor calculates the absolute values of its randomly projected measurements and estimates their mean, denoted by µ. The resulting number and its negative correspond to what the bits 1 and 0 encode, respectively.
In the intersensor transmissions, sensors transmit the 1-
bit information of 100 compressive samples as well as the
single number µ, which can be quantized up to the desired
level of accuracy. Note that this transmission bandwidth,
which is on the order of a hundred bits, is significantly smaller than what it would take to transmit the observed signal itself,
Figure 8: (a) Baseline bearing tracks. (b) Bearing pursuit results with the 1-bit messaging scheme described in Sect. 7. (c) Bearing pursuit results when the compressive samples of the source signals are used. (d) Result of the 1-bit CS optimization problem (35).
which is 2205-dimensional even if it is compressed. As an
example, with 10:1compression and 8-bit quantization on
the signal values, we would need at least 10×the communi-
cation to transmit the observed signals. Since the signal mes-
sages from multiple sensors need to be accumulated across
the sensors, this may create bottlenecks across the network.
In contrast, the 1-bit intersensor messages require a commu-
nication bandwidth that is on the order of bandwidths typi-
cally used by conventional RSS localization algorithms.
The result in Fig. 8(a) has been previously reported in [7] and serves as a baseline for the comparison since it uses the computationally costly Dantzig selector for recovery [4]. In Fig. 8(b), we use our bearing pursuit algorithm, which has the provable guarantees discussed in the appendix. Compared to the Dantzig-selector-based approach, the tracks of bearing pursuit show a small loss of accuracy. However, the upshot of our approach is that it can be easily implemented in simple sensor hardware, since it only requires the iteration of a single matrix multiplication and two sorting operations. Moreover, the number of iterations it requires to converge is on the order of the source sparsity. Note that the estimation performance of the bearing pursuit algorithm improves only slightly when the compressive measurements are transmitted directly without the 1-bit quantization, as shown in Fig. 8(c).
Finally, Fig. 8(d) illustrates the bearing tracks as a result
of solving the optimization problem (35). Note that for this
optimization problem, the encoding value µ for the 1-bit measurements is not needed. In this experiment, it is somewhat surprising to see that only the phase of the randomly projected measurements is sufficient to obtain the bearing tracks. This optimization-based approach is useful in scenarios where the sensors operate in clusters, where the cluster
head can build the source dictionary and the other sensors
can directly sample the compressive measurements of the ob-
servations. Since only zero crossing information needs to be
transmitted, the sensor hardware can be simplified.
10. CONCLUSIONS
In this paper, we have developed a Bayesian formulation
of the localization problem and posed it as a sparse recov-
ery problem. Our approach allows us to exploit sparsity in
several aspects of the network design: Signal sparsity, when
available, allows very efficient sensing and communication
of the source signals. Spatial sparsity allows decentralized
computation of the source locations and further reduction
in the communication cost, even if the source signals them-
selves are not sparse. It further allows the use of very effi-
cient 1-bit quantization and reconstruction methods that only
transmit timing information relevant for localization. In our
setting, the randomized compressive measurements that are
transmitted between sensor nodes act like fountain codes: As
long as “enough” measurements arrive at the receiver, we
can recover the required information about the signal. This
makes the measurements robust to packet drops. The mea-
surements are also progressive in the sense that each receiver
can choose to receive measurements until they can recover to
within a desired tolerance. In the special case of bearing es-
timation, the combination of sparsity and the incoherence of
the bearing problem also allows us to provide solid theoret-
ical guarantees on the performance of our algorithms. Our
experimental results with synthetic and field data verify and
validate our approach.
11. REFERENCES
[1] R. G. Baraniuk. Compressive Sensing. IEEE Signal Processing
Magazine, 24(4):118–121, 2007.
[2] C. M. Bishop. Pattern recognition and machine learning. Springer,
2006.
[3] P. Boufounos and R. G. Baraniuk. One-Bit Compressive Sensing. In
Conference on Information Sciences and Systems (CISS), Princeton,
NJ, Mar 2008.
[4] E. Candes and T. Tao. The Dantzig selector: statistical estimation when p is much larger than n. Annals of Statistics, 35(6):2313–2351, 2007.
[5] E. J. Candès. Compressive sampling. In Proc. International Congress
of Mathematicians, volume 3, pages 1433–1452, Madrid, Spain,
2006.
[6] V. Cevher, M. Duarte, and R. G. Baraniuk. Distributed Target
Localization via Spatial Sparsity. In European Signal Processing
Conference (EUSIPCO), Lausanne, Switzerland, Aug 2008.
[7] V. Cevher, A. C. Gurbuz, J. H. McClellan, and R. Chellappa.
Compressive wireless arrays for bearing estimation. In Proc. IEEE
Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP),
number 2497–2500, Las Vegas, NV, Apr 2008.
[8] J. Chen, L. Yip, J. Elson, H. Wang, D. Maniezzo, R. Hudson, K. Yao,
and D. Estrin. Coherent acoustic array processing and localization on
wireless sensor networks. Proceedings of the IEEE,
91(8):1154–1162, 2003.
[9] D. L. Donoho. Compressed Sensing. IEEE Trans. on Information
Theory, 52(4):1289–1306, 2006.
[10] A. Gilbert, M. Strauss, and J. Tropp. A tutorial on fast fourier
sampling. IEEE Signal Processing Magazine, 25(2):57–66, March
2008.
[11] I. F. Gorodnitsky and B. D. Rao. Sparse signal reconstruction from
limited data using FOCUSS: A re-weighted minimum norm
algorithm. IEEE Trans. Signal Processing, 45(3):600–616, 1997.
[12] C. Guestrin, P. Bodi, R. Thibau, M. Paski, and S. Madden.
Distributed regression: an efficient framework for modeling sensor
network data. In Proc. of the Third International Symposium on
Information Processing in Sensor Networks (IPSN), pages 1–10.
ACM Press New York, NY, USA, 2004.
[13] A. C. Gurbuz, V. Cevher, and J. H. McClellan. A compressive
beamformer. In Proc. IEEE Int. Conf. on Acoustics, Speech and
Signal Processing (ICASSP), Las Vegas, Nevada, Mar 30 –Apr 4
2008.
[14] J. Hill and D. Culler. Mica: a wireless platform for deeply embedded
networks. Micro, IEEE, 22(6):12–24, Nov/Dec 2002.
[15] A. T. Ihler. Inference in Sensor Networks: Graphical Models and
Particle Methods. PhD thesis, Massachusetts Institute of Technology,
2005.
[16] J. Laska, S. Kirolos, Y. Massoud, R. Baraniuk, A. Gilbert, M. Iwen,
and M. Strauss. Random sampling for analog-to-information
conversion of wideband signals. In IEEE Dallas Circuits and Systems
Workshop, Dallas, TX, 2006.
[17] J. N. Laska, S. Kirolos, M. F. Duarte, T. Ragheb, R. G. Baraniuk, and
Y. Massoud. Theory and implementation of an analog-to-information
conversion using random demodulation. In Proc. IEEE Int.
Symposium on Circuits and Systems (ISCAS), New Orleans, LA, May
2007. To appear.
[18] D. Malioutov, M. Cetin, and A. S. Willsky. A sparse signal
reconstruction perspective for source localization with sensor arrays.
IEEE Trans. Signal Processing, 53(8):3010–3022, 2005.
[19] D. Model and M. Zibulevsky. Signal reconstruction in sensor arrays
using sparse representations. Signal Processing, 86(3):624–638,
2006.
[20] M. Rabbat and R. Nowak. Distributed optimization in sensor
networks. In Proc. 3rd International Workshop on Inf. Processing in
Sensor Networks (IPSN), pages 20–27. ACM Press New York, NY,
USA, 2004.
[21] G. Simon, M. Maróti, Á. Lédeczi, G. Balogh, B. Kusy, A. Nádas,
G. Pap, J. Sallai, and K. Frampton. Sensor network-based
countersniper system. In Proc. of the 2nd international conference on
Embedded networked sensor systems, pages 1–12. ACM New York,
NY, USA, 2004.
[22] E. M. Stein. Harmonic analysis real-variable methods, orthogonality,
and oscillatory integrals. Princeton University Press, Princeton, NJ,
1993.
[23] J. Tropp and A. C. Gilbert. Signal recovery from partial information
via orthogonal matching pursuit. IEEE Trans. Info. Theory,
53(12):4655–4666, Dec. 2007.
[24] J. Tropp, D. Needell, and R. Vershynin. Iterative signal recovery from
incomplete and inaccurate measurements. In Information Theory and
Applications, San Diego, CA, Jan. 27–Feb. 1 2008.
Appendix: Correctness of Bearing Pursuit
This appendix proves that the Bearing Pursuit algorithm cor-
rectly identifies the bearing of the sources.
LEMMA 1. With R1 > K²Nλ and R2 > KNR1, we have, for any j ≠ j′,
\[ \left| \mathbb{E}_p\big( \Psi_{p,j}\, \bar{\Psi}_{p,j'} \big) \right| \;\le\; O\!\left( \frac{1}{K} \right). \qquad (37) \]
PROOF. Let us fix a circle with radius R1/2 < r ≤ R1 around which sensors are placed at uniformly random angles φ. Because of the scaling of R1, we have r = aλK²N. With a fixed radius r, we have
\[ \left| \mathbb{E}_p\big( \Psi_{p,j}\, \bar{\Psi}_{p,j'} \big) \right| = \left| \frac{1}{2\pi} \int_0^{2\pi} e^{\,2\pi i a K^2 N [\cos(\phi - \theta_j) - \cos(\phi - \theta_{j'})]} \, d\phi \right|.
\]
Using basic trigonometric identities, we note that
\[ \cos(\phi - \theta_j) - \cos(\phi - \theta_{j'}) = C_1 \cos(C_2 - \phi), \]
where \(C_1 = 2\sin\!\big(\tfrac{\theta_j - \theta_{j'}}{2}\big)\) and \(C_2 = \tfrac{\phi}{2} + \tfrac{\theta_j + \theta_{j'}}{2}\). Because the target bearings are separated by at least 2π/N (the hardest case), the constant C_1 is at least 4π/N. Then we can simplify the above expression and obtain
\[ \left| \mathbb{E}_p\big( \Psi_{p,j}\, \bar{\Psi}_{p,j'} \big) \right| \;\le\; \left| \frac{1}{2\pi} \int_0^{2\pi} e^{\,2\pi i C K^2 \cos(C_2 - \phi)} \, d\phi \right|, \]
where the constant C is independent of the other parameters. Finally, we observe that this integral satisfies the hypotheses of Proposition 2, Chapter 8 (Oscillatory integrals of the first kind) in [22], and apply the standard method of stationary phase to bound it. We arrive at
\[ \left| \frac{1}{2\pi} \int_0^{2\pi} e^{\,2\pi i C K^2 \cos(C_2 - \phi)}\, d\phi \right| \;\le\; O\!\left( \frac{1}{\sqrt{K^2}} \right) = O\!\left( \frac{1}{K} \right). \]
We claim that Equation (37) is a sufficient condition on Ψ to recover the bearings of the sources; i.e., to determine the locations of the K non-zero entries in θ.
THEOREM 1. If Ψ has O(K) independent rows (i.e., if we place O(K) sensors uniformly at random in the sensor disk), then each estimate of the form
\[ \tilde{\theta}_j = (\Psi^{*}\Psi\theta)_j \]
satisfies
\[ \mathbb{E}(\tilde{\theta}_j) = \theta_j \pm \frac{1}{K}\|\theta\|_1, \qquad (38) \]
\[ \mathrm{Var}(\tilde{\theta}_j) \le \frac{1}{K}\|\theta\|_2^2. \qquad (39) \]
PROOF. First, we check the expected value of the estimator \(\tilde{\theta}_j\), which corresponds to one sensor:
\[
\begin{aligned}
\mathbb{E}(\tilde{\theta}_j) &= \mathbb{E}_p\Big( \Psi^{*}_{p,j} \sum_{\ell=1}^{N} \Psi_{p,\ell}\, \theta_\ell \Big)
= \mathbb{E}_p\Big( \theta_j + \sum_{\ell \neq j} \Psi^{*}_{p,j} \Psi_{p,\ell}\, \theta_\ell \Big)
= \theta_j + \sum_{\ell \neq j} \mathbb{E}_p\big( \Psi^{*}_{p,j} \Psi_{p,\ell} \big)\, \theta_\ell \\
&\le \theta_j \pm \frac{1}{K}\|\theta - \theta_j\|_1
\le \theta_j \pm \frac{1}{K}\|\theta\|_1.
\end{aligned}
\]
If we average over K independent sensor estimators, we retain the same expected value bound. Furthermore, if θ is 1-sparse with support at position j, then \(\tilde{\theta}_j\) is approximately correct in expectation, while, if there is no source at position j, then \(\tilde{\theta}_j = \|\theta - \theta_j\|_1 / K\) is small. Note that this estimate produces a separation: large values of the estimator give us the correct position and small values give us an incorrect position.
Let us check the second moment as well (which is an upper bound on the variance in Equation (39)):
\[
\begin{aligned}
\mathbb{E}(\tilde{\theta}_j^2) &= \mathbb{E}_p\Big( \Psi^{*}_{p,j} \Psi_{p,j} \sum_{\ell,\ell'} \Psi^{*}_{p,\ell} \Psi_{p,\ell'}\, \theta_\ell \theta^{*}_{\ell'} \Big)
= \sum_{\ell=1}^{N} |\theta_\ell|^2 + \sum_{\ell \neq \ell'} \theta_\ell \theta^{*}_{\ell'}\, \mathbb{E}_p\big( \Psi^{*}_{p,\ell} \Psi_{p,\ell'} \big) \\
&\le \|\theta\|_2^2 + K \|\theta\|_2^2\, O\!\left(\frac{1}{K}\right)
\le O(1)\, \|\theta\|_2^2.
\end{aligned}
\]
Note that we use the AM-GM inequality to bound the product \(\theta_\ell \theta^{*}_{\ell'}\) by the norm \(\|\theta\|_2^2\) and that we have K such terms for a K-sparse position vector θ. So, we have a single instance of an estimator which produces an approximately correct answer and whose second moment we have bounded. If we repeatedly use this estimator for O(K) independent sensor positions (i.e., look at O(K) independent instances of the estimator), then we drive down the variance of \(\tilde{\theta}_j\) by a factor of 1/K, and we can estimate θ_j accurately for those positions j that satisfy, simultaneously, \(|\theta_j| \ge \tfrac{1}{K}\|\theta\|_1\) (from the bias in the expectation) and \(|\theta_j|^2 \ge \tfrac{1}{K}\|\theta\|_2^2\) (from the variance bound). In particular, we can correctly recover the largest-magnitude θ_j in a K-sparse vector θ. By making R1 and the number of sensors larger by the factor ε^{−O(1)}, we can recover positions with magnitude within the factor ε of the largest, which makes the algorithm more robust.
Once we estimate accurately such positions, we add
them to our current representation for θ, subtract their esti-
mates from the current set of measurements, and iterate. We
omit details.
Strictly speaking, the proof of Equation 37 restricted our
sensors to an annulus with inner and outer radii approxi-
mately R1(i.e., we use sensors towards the outside of the
inner disk). We note that if we place sensors uniformly at
random within the inner disk, they will be concentrated on
such an annulus and that we can disregard those towards
the inside, possibly placing additional sensors at random to
get O(K)in the outer annulus. This rejection sampling in-
creases the number of sensors necessary by a constant factor
and we simply include it in the factor O(K)without loss
of generality. We can, therefore, conclude that the Bearing Pursuit algorithm finds the bearings of the sources.
COROLLARY 11.1. With O(K) uniformly random sensors, the Bearing Pursuit algorithm of Fig. 4 correctly identifies the bearings of the sources (i.e., it correctly determines the non-zero entries in θ).