Quantifying eavesdropping vulnerability in sensor networks.
ABSTRACT With respect to security, sensor networks have a number of considerations that separate them from traditional distributed systems. First, sensor devices are typically vulnerable to physical compromise. Second, they have significant power and processing constraints. Third, the most critical security issue is protecting the (statistically derived) aggregate output of the system, even if individual nodes may be compromised. We suggest that these considerations merit a rethinking of traditional security techniques: rather than depending on the resilience of cryptographic techniques, in this paper we develop new techniques to tolerate compromised nodes and to even mislead an adversary. We present our initial work on probabilistically quantifying the security of sensor network protocols, with respect to sensor data distributions and network topologies. Beginning with a taxonomy of attacks based on an adversary's goals, we focus on how to evaluate the vulnerability of sensor network protocols to eavesdropping. Different topologies and aggregation functions provide different probabilistic guarantees about system security, and make different tradeoffs in power and accuracy.

Department of Computer & Information Science
Departmental Papers (CIS)
University of Pennsylvania
Year
Quantifying Eavesdropping Vulnerability
in Sensor Networks
Madhukar Anand∗
Zachary G. Ives†
Insup Lee‡
∗University of Pennsylvania, anandm@cis.upenn.edu
†University of Pennsylvania, zives@cis.upenn.edu
‡University of Pennsylvania, lee@cis.upenn.edu
Postprint version. Copyright ACM, 2005. This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2nd International VLDB Workshop on Data Management for Sensor Networks 2005 (DMSN 2005), pages 39.
Publisher URL: http://doi.acm.org/10.1145/1080885.1080887
This paper is posted at ScholarlyCommons.
http://repository.upenn.edu/cis papers/176
Quantifying Eavesdropping Vulnerability in Sensor
Networks∗
Madhukar Anand
Department of Computer and
Information Science
University of Pennsylvania
anandm@cis.upenn.edu
Zachary Ives
Department of Computer and
Information Science
University of Pennsylvania
zives@cis.upenn.edu
Insup Lee
Department of Computer and
Information Science
University of Pennsylvania
lee@cis.upenn.edu
ABSTRACT
With respect to security, sensor networks have a number of considerations that separate them from traditional distributed systems. First, sensor devices are typically vulnerable to physical compromise. Second, they have significant power and processing constraints. Third, the most critical security issue is protecting the (statistically derived) aggregate output of the system, even if individual nodes may be compromised. We suggest that these considerations merit a rethinking of traditional security techniques: rather than depending on the resilience of cryptographic techniques, in this paper we develop new techniques to tolerate compromised nodes and to even mislead an adversary. We present our initial work on probabilistically quantifying the security of sensor network protocols, with respect to sensor data distributions and network topologies. Beginning with a taxonomy of attacks based on an adversary’s goals, we focus on how to evaluate the vulnerability of sensor network protocols to eavesdropping. Different topologies and aggregation functions provide different probabilistic guarantees about system security, and make different tradeoffs in power and accuracy.
Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: Security and Protection
General Terms: Security
Keywords: Wireless Sensor Networks, Eavesdropping, Data
Streams, Probability Distribution.
1. INTRODUCTION
As sensor network technology advances, security and privacy concerns will increasingly move to the forefront. Many real-world settings in which sensors might be deployed (e.g., security systems, intelligent buildings, hospitals, automated warehouses) have significant need not only for privacy policies, but also for mechanisms for enforcing data security and confidentiality.
∗This work was funded in part by NSF grants IIS0477972 and CCR0209024 and ARO grants DAAD190110473 and W911NF0510182.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DMSN’05, August 29, 2005, Trondheim, Norway.
Copyright 2005 ACM 1595932062/05/0008 ...$5.00.
In the Aspenn (Abstraction-based Sensor Programming Environment from Penn) project, we focus on developing the infrastructure for such rich sensor applications, in which the sensing devices and networks may be heterogeneous (including smart card readers, video cameras, and mobile sensors) and the sensor network may interact with external data sources on the Internet. A major emphasis of our work lies in protecting application data from eavesdroppers and hackers.
With respect to security, the sensor network domain has several important characteristics that differentiate it from traditional distributed systems. First, sensor devices are frequently vulnerable to physical compromise or local eavesdropping, as they are embedded within an environment. Second, sensor devices have significant power and processing constraints, which often prevent them from running expensive encryption protocols, but which also limit the amount of “damage” they can do to the overall sensor network (e.g., by injecting spurious data or snooping on large volumes of messages). Third, sensor network applications are generally consensus- or aggregation-based, meaning that compromising one or a few nodes may not significantly affect the overall system.
To this point, security techniques have been adapted for the sensor network domain by reducing the computation requirements of cryptography (generally by pre-distributing keys [18] or reducing the key size [2]) in order to operate under the limited processing capabilities of sensor networks. However, cryptography is not the only means of providing security in a sensor network application. In fact, if an attacker has sufficient resources, cryptographic schemes with small key sizes may provide little protection. Moreover, such techniques do not consider the system-wide effects if an attacker compromises a few nodes.
We advocate a different approach, which takes advantage of the fact that any real-world attacker is limited by the properties of the system he or she is attempting to compromise. In this paper we present an initial framework, taxonomy, and methodology for quantifying the privacy and security of sensor network applications, under the assumption that some nodes may be compromised, and based on the networks’ size, protocols, and computations. Rather than providing all-or-nothing guarantees about privacy or security, our goal is to examine probabilistic guarantees with respect to compromise, and to understand and improve existing aggregation strategies with respect to these guarantees. Our focus in this paper is on the problem of eavesdropping, although we are currently generalizing to other types of attacks. Specifically, we make the following contributions:
• We propose a taxonomy of attack models for sensor networks, based on the goals of the attacker.
• We propose what we believe to be the first quantitative approach to assessing system-level confidentiality and security, under the possibility that some nodes are compromised.
• We show how our methods can be used to choose between different protocols and sampling strategies.
• We discuss how cryptographic and non-cryptographic techniques can be used to improve the confidentiality of a sensor network.
The remainder of the paper is organized as follows: in Section 2, we introduce a taxonomy of attacks in sensor networks. In the subsequent section, we develop a model for cost and accuracy in a sensor network. Section 4 discusses how we model an attacker’s ability to determine the output of a sensor network, and also her cost. Next, we identify and assess potential means of combating eavesdropping. We discuss related work in Section 6, and in Section 7 we conclude by highlighting avenues for future work.
2. TAXONOMY OF ATTACKER MODELS
By compromising nodes, eavesdropping, or spoofing, an adversary may attempt to violate the security of a sensor network application. In order to evaluate a sensor application’s security characteristics, we must first understand the potential goals of the adversary’s attack. We define a taxonomy of attack models for sensor networks, based on the goals of the adversary.
1. Eavesdropping. Here, the adversary (eavesdropper) aims to determine the aggregate data that is being output by the sensor network: it is attempting to see what the system is observing, e.g., to predict how the owner of the sensor network will react. The adversary either listens to messages transmitted by the nodes, or directly compromises those nodes. We further distinguish between two types of eavesdropping:
(a) Passive: The eavesdropper conceals her presence from the sensor nodes and uses only the broadcast medium to eavesdrop on all messages.
(b) Active: The eavesdropper actively attempts to discern information by sending queries to sensors or aggregation points, or by attacking sensor nodes.
2. Disruption. The intent of the adversary is to disrupt the sensor application. This can be a combination of two types of techniques:
(a) Semantic: The adversary injects messages, corrupts data, or changes values in order to render the aggregated data corrupt or useless.
(b) Physical: The adversary upsets sensor readings by directly manipulating the environment. For example, generating heat in the vicinity of sensors will result in erroneous values being reported.
3. Hijacking. This variation on the disruption model is a case in which the adversary attempts to direct the aggregated output of the sensor application towards a value of her choosing. If the adversary gains control of enough sensors, then this attack is the hardest to counter.
Our focus. In this paper, which forms the first step towards addressing the attack models of our taxonomy, we focus strictly on the case of eavesdropping. As stated above, we assume that the adversary’s goal is to ascertain the aggregated values output by the network: while subtly different from the alternative definition (attempting to precisely ascertain information about the sensed environment), we believe this is a more likely motivation for attacking a sensor network. In our definition, what we are trying to protect is what the system sees, and thus the ability to predict how the user of the system might react, as opposed to merely protecting information about the environment. We note that our methods can generalize to handling the latter case as well: the two definitions will essentially coincide if we constrain our sensor network application to return the most accurate information possible about the environment.
In the next two sections, we first define our network model and means for determining cost; then we discuss how we evaluate networks’ vulnerability to eavesdropping, first for height-two aggregation trees, and then for trees of arbitrary depth.
3. SENSOR NETWORK MODEL
We begin by introducing our model of a sensor network: we first examine how computation is performed, and then quantify the quality (accuracy) of the network and its cost. These factors, as well as the vulnerability of the network to eavesdropping (next section), will form the basis of assessing sensor networks.
3.1 Streams and Aggregation
Data from sensors is typically continuous and time-varying, as opposed to actually having discrete values; a formal stream model, similar to that of [1], is appropriate to capture this aspect of data.
DEFINITION 1. (Sensor Stream) A Sensor Stream $R$ is a possibly infinite sequence of elements, $\{\langle id, d, \tau, \rho \rangle_n\}_{n \geq 1}$, where $id \in \mathbb{Z}^+$ is an identifier for the sensor, $d$ is a sensor data structure, $\tau$ is the timestamp and $\rho$ is either $\emptyset$ or the location of the sensor. □
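As an illustration of Definition 1, a stream element can be represented as a small record type. The following is a minimal sketch in Python; the names `StreamElement` and `sensor_stream`, the use of a single float for `d`, a coordinate pair for `rho`, and the 5-second default period are assumptions made for illustration and are not part of the model above.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class StreamElement:
    """One element <id, d, tau, rho> of a sensor stream (hypothetical names)."""
    id: int                                    # positive sensor identifier
    d: float                                   # sensor data (one reading here)
    tau: float                                 # timestamp, in seconds
    rho: Optional[Tuple[float, float]] = None  # location, or None for "empty"

    def __post_init__(self):
        if self.id <= 0:
            raise ValueError("id must be a positive integer")


def sensor_stream(sensor_id: int, readings, period: float = 5.0) -> Iterator[StreamElement]:
    """Yield a possibly infinite sequence of elements from an iterable of readings."""
    for n, value in enumerate(readings):
        yield StreamElement(id=sensor_id, d=value, tau=n * period)


elements = list(sensor_stream(1, [4.82, 4.81, 4.82, 4.83]))
```

Because `sensor_stream` is a generator, it can wrap an unbounded source of readings without materializing the whole sequence.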
We reason about two orthogonal types of aggregation over streams: in-stream aggregation, which occurs over a single stream, generally over a time window, and multi-stream aggregation, which occurs across the values of multiple streams, either at the same time or over a time window.
In-stream aggregation can be thought of as aggregation over all data from a single sensor within some time window. We can also define aggregation over streams of data from different sensors within the same time window. We refer to this form of aggregation as multi-stream aggregation.
3.2 Hierarchical Aggregation
For purposes of formal analysis, we abstract away specific details of sensing, communication and computation and view the network from a pure data collection and aggregation perspective. The hierarchical aggregation tree is a recursive structure in which, at each level of the tree, groups of child nodes send their values to a parent node that aggregates their values. The base station is the intermediate point at the highest level. Our model is consistent with most proposed aggregation algorithms, e.g. [16, 23, 11].
Finally, we assume that the values observed at each sensor are not identical, but can be characterized according to some probabilistic data distribution. Data from a sensor network will typically consist of a number of observed attributes; a probability density function (pdf) can be used to assign a probability for each possible assignment to the attributes. Such a model can be learned from data collected over time, using algorithms such as those in [17]. Learning a model involves maintaining certain parameters, e.g., the mean and the variance, and coping with noise, outliers, etc. A significant literature exists on learning models of streams (e.g., [3, 5]).
Many sensor applications include multiple, dynamic attributes, and hence correlations and temporal aspects to the data distribution must also be considered. In [8], the authors used Markovian models to learn the time-varying effects of sensor readings. In their model, given the value of all attributes at time $t$, it is assumed that the values of the attributes at time $t+1$ are independent of those for any time earlier than $t$. This is generally sufficient to capture the dynamic nature of the sensor data. The same authors have extended their work in [7] to consider correlations between streams.
Our work assumes that such distributions are given (or can be reasonably approximated). Based on knowledge of the data distribution, we can provide specific probabilistic guarantees about sensing and eavesdropping. Additional information, such as the spatial distribution of sensors, is not assumed, although it can add to the precision of the metrics we present.
We illustrate an example aggregation tree in Figure 1, where nodes $s_0, \ldots, s_5$ are in a hierarchical group. Each of $s_1, \ldots, s_5$ performs aggregation of data in its subgroup and combines its own data with this before forwarding it to node $s_0$. Node $s_0$, in addition to recording its own sensor data, is also the final aggregator for all the data in the network.
[Figure 1: Sensor network model. Nodes shown: s0, s1, s2, s3, s4, s5.]
We consider the presence of a powerful adversary who has the
capability of listening to the messages in the sensor network, or of
compromising sensor nodes in an undetectable way, with a certain
probability. The higher a compromised node is in the aggregation
tree, the more power the attacker has.
Notation. We denote the set of all sensor data streams within a group in the hierarchical network with the symbol $S$. Some subset $S_C \subseteq S$ of these data values will be used to compute the stream aggregate $\sigma$ (this quantity considers the possibility of dropped messages, filtering, sampling, etc.). The adversary can eavesdrop on some set of nodes $S_A \subseteq S$, which may overlap with but differ from $S_C$.
EXAMPLE 1. Consider the situation depicted in Figure 1, where the top-level group of an activity monitoring sensor network has nodes $s_0, \ldots, s_5$. Assume the sensors $s_1, \ldots, s_5$ perform their local aggregation tasks and output their values to node $s_0$ once every 5 seconds. Also assume that the in-stream aggregation function $\sigma_1$ for all data streams is the mean of all the readings obtained at each node $s_i$ over the past 4 sampling intervals. Let the multi-stream aggregation ($\sigma_2$) be applied every 20 seconds, as the mean of the readings from $s_0, \ldots, s_5$.
If readings from $s_0$ are $\{4.82, 4.81, 4.82, 4.83\}$, then $\sigma_1(s_0) = 4.82$. Similarly, if $\sigma_1(s_1) = 4.93$, $\sigma_1(s_2) = 5.17$, $\sigma_1(s_3) = 4.92$, $\sigma_1(s_4) = 4.87$, $\sigma_1(s_5) = 5.04$ and we compute the mean over all streams, then $\sigma_2(S) = 4.96$. □
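The two aggregation steps of Example 1 can be sketched in a few lines of Python. The function names `in_stream_aggregate` and `multi_stream_aggregate` are hypothetical and chosen only for this illustration; the readings are the ones quoted in the example.

```python
from statistics import mean


def in_stream_aggregate(readings, window=4):
    """sigma_1: mean of the most recent `window` readings of one stream."""
    return mean(readings[-window:])


def multi_stream_aggregate(per_node_values):
    """sigma_2: mean across the aggregated values of several streams."""
    return mean(list(per_node_values.values()))


# Readings of s0 and the already-aggregated values of s1..s5 from Example 1.
s0_readings = [4.82, 4.81, 4.82, 4.83]
sigma1 = {
    "s0": in_stream_aggregate(s0_readings),
    "s1": 4.93, "s2": 5.17, "s3": 4.92, "s4": 4.87, "s5": 5.04,
}
sigma2 = multi_stream_aggregate(sigma1)

print(round(sigma1["s0"], 2))  # 4.82
print(round(sigma2, 2))        # 4.96
```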
3.3 Quality of the Sample
Given a model of the distribution of data readings in the environment, there are several possible metrics for estimating the quality (accuracy) of the sample. We assume that the readings used to produce a single aggregate stream element occur within some time window $[T, T + \Delta]$. The length of the window, $\Delta$, is application-specific, and it corresponds to the common notion of an epoch [16] during which computations are performed, but it allows readings to occur at any point within the window.
In statistics, goodness-of-fit is used to measure the distance between the data and the hypothesis. For example, if the underlying distribution is normal, then goodness-of-fit can be determined by using the standard $\chi^2$ test. We adopt a statistic that works better for small samples and is simple to compute, the Kolmogorov-Smirnov test [12]. To compare a data sample consisting of $N$ events whose cumulative distribution is $S_N(x)$ with a hypothesis function whose cumulative distribution is $\Phi(x)$, the value $\eta$ is calculated as $\eta = \max_x |S_N(x) - \Phi(x)|$. The Cramer-Smirnov-Von Mises test is often used to test that a one-dimensional data sample is compatible with being a random sampling from a given distribution: if the density function of the data is $f(x)$, then the test measures the goodness-of-fit by the measure $W^2$, which is given by $W^2 = \int_{-\infty}^{\infty} [S_N(x) - F(x)]^2 f(x)\,dx$. There are many alternative tests, depending on the distribution of data; for details we refer the reader to a standard textbook on statistics (e.g., [12]).
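The Kolmogorov-Smirnov statistic $\eta$ and a $\chi^2$-style sum of scaled squared deviations can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the sample is the set of per-node means from Example 1, and $N(5, 0.1)$ is read here as mean 5 and variance 0.1, which is an assumption about the notation.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """eta = max_x |S_N(x) - Phi(x)| for an empirical sample and a hypothesis cdf."""
    xs = sorted(sample)
    n = len(xs)
    eta = 0.0
    for k, x in enumerate(xs):
        phi = cdf(x)
        # The empirical cdf S_N jumps from k/n to (k+1)/n at x, so check both sides.
        eta = max(eta, abs((k + 1) / n - phi), abs(phi - k / n))
    return eta


def scaled_squared_deviation(values, center, scale):
    """Sum of (v - center)^2 / scale, a chi-square-style measure of fit."""
    return sum((v - center) ** 2 for v in values) / scale


# Per-node aggregates from Example 1; N(5, 0.1) is read as mean 5, variance 0.1.
node_means = [4.82, 4.93, 5.17, 4.92, 4.87, 5.04]
eta = ks_statistic(node_means, lambda x: normal_cdf(x, 5.0, math.sqrt(0.1)))
fit = scaled_squared_deviation(node_means, center=5.0, scale=0.1)
```

With deviations taken about the hypothesized mean 5, `fit` evaluates to about 0.911, the value reported in Example 2; taking them about the sample mean 4.96 instead gives about 0.807.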
EXAMPLE 2. If we assume that the data in Example 1 is distributed $N(5, 0.1)$ and use the $\chi^2$ test as the goodness-of-fit measure, we have $\Delta = 20s$ and $\sum_{i=0}^{5} \frac{(\bar{s}_i - 4.96)^2}{0.1} = 0.911$, which implies that we have a sample close to the actual model. □
3.4 Cost of Sensing
We can estimate the cost of producing a single output element in the sensor network by considering the cost of acquiring and communicating the sensor readings. Let the time window be $\mathcal{T} = [T, T + \Delta]$, the cost of acquiring a reading at sensor node $s$ be $c_a(s)$, and the cost of transmitting a message from sensor $s$ to the aggregating point $s_0$ be $c_t(s)$. Then the cost of acquiring the data to be aggregated is $C_a(\mathcal{T}, S) = \sum_{s \in S} c_a(s)$. For each intermediate node in the aggregation tree, the cost of transmitting sensor data is $C_t(\mathcal{T}, S) = \sum_{s \in S} c_t(s)$, where $c_t(s_0) = 0$. (This is because there is no transmission involved from $s_0$ to itself.) Let the reception cost for one reading at $s_0$ be $c_r$. Then the total cost of reception is $C_r(\mathcal{T}, S \setminus S_0) = |S \setminus S_0| \cdot c_r$, where $S_0$ is the set of readings obtained at $s_0$. Thus, the total cost for acquiring and aggregating the data is $C(\mathcal{T}, S) = C_a(\mathcal{T}, S) + C_t(\mathcal{T}, S) + C_r(\mathcal{T}, S \setminus S_0)$ for any set $S$ of nodes that share a single aggregation point $s_0$.
EXAMPLE 3. Let us assume that the cost of sensing an attribute is 0.015J and that transmitting and receiving data each take 0.025J of energy for all the sensors in Example 1. In one epoch, the sensors transmit $\frac{20}{5} = 4$ packets. Hence, $C(S) = 5 \times 4 \times (0.025 + 0.015)J + 4 \times 0.015J + 5 \times 4 \times 0.025J = 1.36J$. □
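The cost model of Section 3.4 can be checked against Example 3 with a short sketch. The function name `total_cost` is hypothetical, and the sketch assumes, as the example does, that every node has the same per-reading acquisition, transmission and reception costs.

```python
def total_cost(nodes, root, readings, c_a, c_t, c_r):
    """C = C_a + C_t + C_r for one group sharing the aggregation point `root`.

    `readings` is the number of readings each node acquires (and, for non-root
    nodes, transmits) per epoch; costs are in joules per reading or message.
    """
    senders = [s for s in nodes if s != root]
    cost_acquire = len(nodes) * readings * c_a      # C_a, includes the root
    cost_transmit = len(senders) * readings * c_t   # C_t, with c_t(root) = 0
    cost_receive = len(senders) * readings * c_r    # C_r = |S \ S_0| * c_r
    return cost_acquire + cost_transmit + cost_receive


nodes = ["s0", "s1", "s2", "s3", "s4", "s5"]
epoch_cost = total_cost(nodes, root="s0", readings=20 // 5,
                        c_a=0.015, c_t=0.025, c_r=0.025)
print(round(epoch_cost, 2))  # 1.36
```

Grouping the terms this way gives 0.36J for acquisition, 0.5J for transmission and 0.5J for reception, which sums to the same 1.36J as the expression in Example 3.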
4. MODELING EAVESDROPPING
We now consider the case of an adversary who has access to some of the sensor readings (either through eavesdropping or compromise), and who is trying to determine the aggregate value output by the sensor network.¹ We consider the confidentiality of the network, in terms of whether the adversary can estimate the output value within some small tolerance $\delta$. We compute the eavesdropping vulnerability based on several important parameters. First, there is the probability that a compromised set of sensor nodes, $S_A$, greatly resembles the set of nodes that our application is sampling, $S_C$. This probability is a function of the size of $S_C$, the specific aggregate function $\sigma$, and the data distribution of the sensors $S$. For example, if all sensors produce the same reading, then the adversary can compromise the system from a single reading. We formalize the probability based on these parameters.
¹ As described in Section 2, this definition is motivated by the fact that the eavesdropper is most likely to be interested in predicting the behavior of the person or application monitoring the sensor data.
DEFINITION 2. (Eavesdropping Vulnerability) The eavesdropping vulnerability ($\gamma$) relative to a set of compromised nodes is defined as $\gamma(\sigma, S, S_A, S_C, \delta) = p(|\sigma(S_C) - \sigma(S_A)| \leq \delta)$, where $\sigma$ is the aggregating function and $\delta$ the adversary’s error tolerance. □
Although we have considered a single aggregate computation here, the eavesdropping vulnerability can be generalized to support multiple aggregate computations over different attributes: the expected value of $\gamma$ can be obtained by conditioning on different parameters.
We can compute the expected eavesdropping vulnerability, in which the specific $S_A$ is unknown, as $\bar{\gamma} = \sum_{s} p(S_A = s) \cdot I(|\sigma(S_C) - \sigma(s)| \leq \delta)$, where $I$ is an indicator function that evaluates to 1 if the condition is true and 0 otherwise.
This relies on knowledge of the underlying sensor value distribution of $S$, and the specific aggregation function, $\sigma$. We now show the derivation of $\gamma$ values for the most common sensor aggregation functions (min, max, sum, avg and median) over single attributes with discrete distributions:
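• Min/Max: $I(|\min(S_C) - \min(S_A)| \leq \delta) = 1$ if $\min(S_A)$ lies in $[\min(S_C) - \delta, \min(S_C) + \delta]$. If $f$ is the probability density function (pdf) and $\Phi$ is the cumulative distribution function (cdf) of the distribution of $S$, then, for any $j$, $f(j)$ is the probability of obtaining a $j$ and $(1 - \Phi(j))$ is the probability that a reading is greater than $j$. Thus in a sample of size $i$, $j$ will be the minimum with probability $f(j)(1 - \Phi(j))^{i-1}$. Therefore:
$$\bar{\gamma} = \sum_{i=1}^{|S|} p(|S_A| = i) \sum_{j=\lceil \min(S_C) - \delta \rceil}^{\lfloor \min(S_C) + \delta \rfloor} f(j) \cdot (1 - \Phi(j))^{i-1} \qquad (1)$$
Using a similar argument for Max, we get:
$$\bar{\gamma} = \sum_{i=1}^{|S|} p(|S_A| = i) \sum_{j=\lceil \max(S_C) - \delta \rceil}^{\lfloor \max(S_C) + \delta \rfloor} f(j) \cdot \Phi(j)^{i-1} \qquad (2)$$
• Sum: $I(|\mathrm{sum}(S_C) - \mathrm{sum}(S_A)| \leq \delta) = 1$ if $\mathrm{sum}(S_A)$ lies in $[\mathrm{sum}(S_C) - \delta, \mathrm{sum}(S_C) + \delta]$. If $f_{S_A}$ is the pdf of the sum of variables and $\Phi_{S_A}$ is the cdf of the sum of variables, we get:
$$\bar{\gamma} = \sum_{i=1}^{|S|} p(|S_A| = i) \cdot \left(\Phi_{S_A}(u) - \Phi_{S_A}(l)\right) \qquad (3)$$
where $u = \mathrm{sum}(S_C) + \delta$ and $l = \mathrm{sum}(S_C) - \delta$.
• Avg: $I(|\mathrm{avg}(S_C) - \mathrm{avg}(S_A)| \leq \delta) = 1$ if $\mathrm{sum}(S_A)$ lies in $[|S_A|(\mathrm{avg}(S_C) - \delta), |S_A|(\mathrm{avg}(S_C) + \delta)]$. If $f_{S_A}$ is the pdf of the sum of variables and $\Phi_{S_A}$ is the cdf of the sum of variables, then with a similar argument as before, we get:
$$\bar{\gamma} = \sum_{i=1}^{|S|} p(|S_A| = i) \cdot \left(\Phi_{S_A}(u) - \Phi_{S_A}(l)\right) \qquad (4)$$
where $u = i(\mathrm{avg}(S_C) + \delta)$ and $l = i(\mathrm{avg}(S_C) - \delta)$.
• Median: $I(|\mathrm{med}(S_C) - \mathrm{med}(S_A)| \leq \delta) = 1$ if $\mathrm{med}(S_A)$ lies in $[\mathrm{med}(S_C) - \delta, \mathrm{med}(S_C) + \delta]$. If $f$ is the probability density function (pdf) and $\Phi$ is the cumulative distribution function (cdf) of the distribution of $S$, then, for any $j$, $f(j)$ is the probability of obtaining a $j$, $\Phi(j)$ is the probability that a reading is less than $j$, and $(1 - \Phi(j))$ is the probability that a reading is greater than $j$. Thus in a sample of size $i$, $j$ will be the median with probability
$$p(j) = \binom{i}{\lfloor i/2 \rfloor} \cdot f(j) \cdot \Phi(j)^{\lfloor i/2 \rfloor} \cdot (1 - \Phi(j))^{i - \lfloor i/2 \rfloor - 1}.$$
Therefore:
$$\bar{\gamma} = \sum_{i=1}^{|S|} p(|S_A| = i) \sum_{j=\lceil \mathrm{med}(S_C) - \delta \rceil}^{\lfloor \mathrm{med}(S_C) + \delta \rfloor} p(j) \qquad (5)$$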
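For small groups, the general expectation $\bar{\gamma} = \sum_s p(S_A = s) \cdot I(|\sigma(S_C) - \sigma(s)| \leq \delta)$ can also be evaluated by brute force, which is a useful sanity check on the closed forms above. The sketch below is not the derivation used in the paper: it conditions on one fixed set of readings instead of summing over the data distribution, and it assumes each node is eavesdropped independently with the same probability `p`. The function name and parameters are hypothetical.

```python
from itertools import combinations
from statistics import mean, median

AGGREGATES = {"min": min, "max": max, "sum": sum, "avg": mean, "median": median}


def expected_vulnerability(values, p, delta, aggregate="avg", sampled=None):
    """Brute-force gamma-bar = sum_s p(S_A = s) * I(|sigma(S_C) - sigma(s)| <= delta).

    values  : dict mapping node name -> reading (fixed, known readings)
    p       : probability that any one node is eavesdropped (independence assumed)
    sampled : node names in S_C; defaults to all nodes
    """
    sigma = AGGREGATES[aggregate]
    nodes = list(values)
    s_c = nodes if sampled is None else list(sampled)
    target = sigma([values[s] for s in s_c])
    n = len(nodes)
    gamma_bar = 0.0
    for k in range(1, n + 1):                     # the empty set reveals nothing
        weight = p ** k * (1.0 - p) ** (n - k)    # p(S_A = s) for any s of size k
        for subset in combinations(nodes, k):
            estimate = sigma([values[s] for s in subset])
            if abs(target - estimate) <= delta:
                gamma_bar += weight
    return gamma_bar


readings = {"s0": 4.82, "s1": 4.93, "s2": 5.17, "s3": 4.92, "s4": 4.87, "s5": 5.04}
for name in AGGREGATES:
    print(name, round(expected_vulnerability(readings, p=0.2, delta=0.1, aggregate=name), 4))
```

The enumeration visits every non-empty subset, so it is only practical for groups of a few dozen nodes at most; the closed forms avoid this blow-up by working with the distribution of the aggregate directly.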
EXAMPLE 4. To evaluate the expected value of $\gamma$ for the application in Example 1, let us assume that the probability of the adversary eavesdropping on a single node is 0.2 and the data is distributed as $N(5, 0.1)$. Also, let the tolerance $\delta = 0.1$. Noting that we have $\sigma_2(S) = 4.96$, we can use Equation (4) to evaluate the expected probability. We get $\bar{\gamma} = \sum_{i=1}^{5} p^i \cdot (\Phi(5.06) - \Phi(4.86))$,² which on evaluation yields $\bar{\gamma} = 0.2499$. This agrees with our intuition that if the adversary is able to compromise one node, then she is far from being able to estimate the aggregate of the network consisting of 5 nodes. □
4.1 Hierarchical Aggregation
Thus far, we have only considered aggregation within a group with a single aggregation point. We now generalize to eavesdropping over hierarchical groups: the goal is to consider how close the adversary gets to an aggregate value higher in the tree when she eavesdrops on data in the lower levels comprising that group. (If we assume that the adversary eavesdrops only at one level, then this problem is identical to the one considered above.) The higher the adversary listens, the closer she gets to the aggregate of the whole network.
An example scenario is depicted in Figure 1, where we assume that the adversary has eavesdropped on groups with nodes $s_1, \ldots, s_5$ as the nodes responsible for aggregation. Now, we want to know how close she gets to the aggregate at $s_0$.
The probability of the adversary learning the result of aggregation at a level $l$ is called the eavesdropping vulnerability over a hierarchy and is denoted by $\gamma_l$, where $l$ indicates the hierarchical level from which the adversary listens with the goal of compromising the overall system. As with $\gamma$, $\gamma_l$ will be a function of $S^l$, $S^l_A$, $S^l_C$, $\sigma$ and $\delta$. We consider the effect of a lower-level compromise on a higher-level node to be a “partial compromise” of the higher node, i.e., we define $S^l_A = \bigcup_i \sigma(S^{l-1}_{A_i})$, $l > 1$. Note that the adversary’s set at level $l$ is the union of sets $\sigma(S^{l-1}_{A_i})$, which accounts for the fact that the sensor values at level $l$ will be aggregates of values at level $l - 1$.
² These values can be found by converting it into standard normal form, for which $\Phi$ is well tabulated.
DEFINITION 3. (Eavesdropping Vulnerability over a Hierarchy) The eavesdropping vulnerability ($\gamma_l$) for the adversary over a hierarchy is defined as $\gamma_l(\sigma, S^l, S^l_A, S^l_C, \delta) = p(|\sigma(S^l_C) - \sigma(S^l_A)| \leq \delta)$, where $\sigma$ is the aggregating function and $\delta$ is the error in estimate, and $S^l_A = \bigcup_i \sigma(S^{l-1}_{A_i})$, $l \geq 1$. □
Note that with this definition, $\gamma = \gamma_0$. We can compute $\gamma_l$ by conditioning on various parameters. For example, knowing $\sigma$, $S^l$, $S^l_C$ and $\delta$, we can compute:
$$\gamma_l = \sum_{S^{l-1}_{A_1}, \ldots, S^{l-1}_{A_n}} p(S^{l-1}_{A_1}, \ldots, S^{l-1}_{A_n}) \cdot I(d \leq \delta) \qquad (6)$$
where $d = |\sigma(S^l_C) - \sigma(S^l_A)|$.
Computing $\gamma_l$, in general, involves knowing how much the data from different groups are related. If the data from different groups at level $l - 1$ are correlated, then computing $\gamma_l$ can be quite difficult. Correlations between groups are also undesirable because they can help the adversary make a good estimate by eavesdropping on only a few groups.
Although the exact computation of $\gamma_l$ is generally difficult, an approximate answer can be obtained by making some simplifying assumptions, such as simultaneous eavesdropping in all the groups. The example below illustrates this idea.
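When the group aggregates are independent and normally distributed, the distribution of their average has a simple closed form, which is what makes the approximation tractable. The sketch below works under exactly that independence assumption; the function names are hypothetical, and the group means mirror the values used in Example 5 with each second parameter read as a variance (an assumption about the notation).

```python
from statistics import NormalDist


def average_of_independent_normals(means, variances):
    """Distribution of the mean of independent normal variables.

    Returns (mean, variance): the mean of the means and sum(variances) / n^2.
    """
    n = len(means)
    return sum(means) / n, sum(variances) / n ** 2


def within_tolerance(mean, variance, target, delta):
    """P(|X - target| <= delta) for X ~ Normal(mean, variance)."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return dist.cdf(target + delta) - dist.cdf(target - delta)


# Six group aggregates with unit variance each (values as in Example 5).
group_means = [5.0, 4.9, 4.8, 4.8, 5.0, 5.2]
mu, var = average_of_independent_normals(group_means, [1.0] * 6)
prob = within_tolerance(mu, var, target=mu, delta=0.1)
```

Here `mu` is 4.95 and `var` is 6/36, about 0.167. If the groups were correlated, the variance formula would need covariance terms, which is the difficulty noted above.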
EXAMPLE 5. Consider the scenario in Example 1. Let us assume that each of the nodes $s_0, \ldots, s_5$ is itself aggregating data in its group and that the distribution in each group is as follows: $s_1 : N(4.9, 1)$, $s_2 : N(4.8, 1)$, $s_3 : N(4.8, 1)$, $s_4 : N(5, 1)$, $s_5 : N(5.2, 1)$, and the data from node $s_0$ is distributed $N(5, 1)$. If the data at this level is being averaged, the resulting average will be normally distributed with mean $\frac{5 + 4.9 + 4.8 + 4.8 + 5 + 5.2}{6}$ and variance $\frac{1 + 1 + 1 + 1 + 1 + 1}{36}$, which has distribution $N(4.95, 0.16)$. Now, if the probability of eavesdropping simultaneously in every group is 0.5, the eavesdropping vulnerability for $\delta = 0.1$ is $\sum_{i=1}^{5} (0.5)^i \cdot (\Phi(5.06) - \Phi(4.86)) = 0.4599$. □
4.2 Performance Ratio
The eavesdropping vulnerability $\gamma$ or $\gamma_l$ gives us the probability that an adversary can obtain a good estimate of the actual aggregate. Obviously, we would like to design sensor networks that minimize this probability; however, to do this, we will generally have to incur additional overhead.
If we use benefit to mean how close an estimate is to the target (in the case of our application, this is the “real” aggregate; in the case of the adversary, this is our network’s aggregate), we can define a performance ratio to compare different sensor network schemes. We define the performance ratio of the adversary relative to a set of compromised nodes, $\rho_A$, as: $\rho_A(\sigma, S, S_A, S_C, \delta, C) = \frac{\gamma(\sigma, S, S_A, S_C, \delta)}{C_r(S_A)}$. The increase in cost incurred to reduce $\gamma$ can be measured by $\frac{C(S)}{C'(S)}$. Here, $C'$ is the cost model for any eavesdropping-tolerant data protocol and $C$ is the cost model for the standard streaming model, as defined earlier. We can now define the performance ratio of a sensor network, $\rho$, as:
$$\rho(\sigma, S, S_A, S_C, \delta, C, C') = \frac{1}{\rho_A(\sigma, S, S_A, S_C, \delta, C)} \cdot \frac{C(S)}{C'(S)} \qquad (7)$$
We can calculate the expected value of $\rho$ by conditioning on various parameters. Ideally, we would like to design our data protocol to maximize $\rho$ as much as possible.
EXAMPLE 6. Consider the application in Example 1 with the cost as computed in Example 3. We assume that the probability of the adversary eavesdropping on a node is 0.2, yielding a cost of $0.025 \times 4\,\mathrm{J} = 0.1\,\mathrm{J}$. The adversary's performance ratio is then $\sum_{i=1}^{5} \frac{(\Phi(5.06) - \Phi(4.86)) \cdot p^i}{0.1 i} = 1.799$. $\bar{\rho}$ can now be computed as $\frac{1}{1.799} \cdot \frac{1.36}{1.36} = 0.5558$. Intuitively, higher cost for the adversary increases the ratio $\rho$. If we make it harder for the adversary to eavesdrop, say reducing the probability of eavesdropping on a single node to 0.1, then we will have $\bar{\rho} = 1.2248$. Techniques for increasing performance ratio are discussed in the next section.
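Equation (7) reduces to simple arithmetic once the adversary's ratio and the two costs are known. The following Python sketch is only an illustration: the function name `performance_ratio` and its parameter names are hypothetical, and the inputs are the values reported in Examples 6, 7, and 9 rather than quantities derived here.

```python
def performance_ratio(rho_adversary: float, cost_standard: float,
                      cost_tolerant: float) -> float:
    """Equation (7): rho = (1 / rho_A) * (C(S) / C*(S))."""
    if rho_adversary <= 0 or cost_tolerant <= 0:
        raise ValueError("rho_A and C*(S) must be positive")
    return (1.0 / rho_adversary) * (cost_standard / cost_tolerant)


# Baseline cost C(S) in joules, as it appears in the paper's examples.
BASELINE_COST_J = 1.36

# (rho_A, C*(S) in joules) as reported in the examples; nothing here is derived.
reported = {
    "baseline (Example 6)": (1.799, 1.36),
    "spurious data (Example 7)": (0.1492, 2.66),
    "voltage + spurious data (Example 9)": (0.1492, 1.76),
}

ratios = {name: performance_ratio(rho_a, BASELINE_COST_J, c_star)
          for name, (rho_a, c_star) in reported.items()}

for name, rho in ratios.items():
    print(f"{name}: rho = {rho:.4f}")
```

The printed values agree with the reported 0.5558, 3.4267, and 5.1791 up to rounding in the last digit.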
To increase the quality of the sample, we need more observations ($S_C$), which, however, increases both cost and (if the distribution of values remains the same) $\gamma$. Hence, we can identify a tradeoff between quality, cost, and eavesdropping vulnerability.
5. COUNTERMEASURES AGAINST EAVESDROPPING
Given our understanding of the factors that affect eavesdropping potential, we now present some general techniques to thwart adversaries. We distinguish between traditional, cryptographic techniques and non-cryptographic schemes.
5.1 Cryptographic techniques
Encryption and authentication using cryptographic techniques make a system significantly more secure against eavesdropping and other attacks. Encryption can be used to keep data secure from the adversary, and authentication can be used to safeguard against spurious data. In essence, these techniques attempt to ensure system-level confidentiality by protecting all links. For the sensor network environment, symmetric key techniques are most commonly used, but it is unclear how to manage keys and how to justify the overhead of encryption. Among the many prior works on cryptographic techniques for privacy in wireless sensor networks, [18] and [15] describe methods to achieve authenticity and confidentiality.
However, many approaches (e.g., [19, 9]) assume a pre-key distribution, which impedes network creation and makes dynamic membership difficult. In [4], Chan and Perrig argue that end-to-end encryption is not possible for sensor networks and foresee new methods as the solution. Moreover, encryption may not help if the nodes themselves can be compromised. Taking our cue from these points, we briefly suggest several alternatives below.
5.2 Non-cryptographic techniques
Non-cryptographic techniques make it harder to eavesdrop by reducing the chance that an adversary’s sensor data sample $S_A$ matches the system’s sample $S_C$.
Data Filtering or Compensation. One technique is to deliberately send spurious data (or data with spurious offsets) from the sensors, and to filter the noise at the aggregating point. After filtering, the resulting data set will comprise legitimate information about the underlying network. The adversary, who is not aware of this shared information, will see data that follows a different distribution.
One such idea, which we are investigating extensively, is termed confusion [6]. Under such a scheme, whenever the sensor wishes to transmit a message, it appends the shared secret (token) to the message. A set of confusion-generating nodes then could inject spurious data, which is indistinguishable to a third party, into the network. Such confusion messages could be generated either by third-party nodes or by a subset of the sensors themselves. At the receiving end, the secret can be used to separate the legitimate message from the noise. Yet while the aggregate node can filter out superfluous messages from confusers, an eavesdropper with incomplete knowledge cannot make such distinctions. Since the eavesdropper is not aware of which tokens belong to the sensors and which belong to the confusers, she cannot identify the legitimate messages. Thus, if she ends up accepting the “noise,” she will end up with a different distribution of the data in the network.
As with encryption techniques, a confusion-based technique assumes a shared secret unique to a sensor, but it may require less computational power per sensor node, it is tolerant to the compromise of a few nodes, and it is resistant to active eavesdropping. The savings on per-device power in the confusion-based approach come from the fact that there is no need for the expensive exponentiation operations involved in encryption. Confusion does require more message transmissions, but these can be amortized by adding greater numbers of devices.
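The filtering step at the aggregator can be sketched in a few lines of Python. This is a minimal illustration and not the confusion protocol of [6]: the `Message` type, the random 8-byte tokens, the decoy range, and the sample sizes are all assumptions made for the example.

```python
import random
import secrets
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Message:
    token: bytes   # hypothetical per-sender token appended to each reading
    value: float


rng = random.Random(7)  # fixed seed so the readings are repeatable

# Tokens shared between the aggregator and the legitimate sensors (assumed setup).
sensor_tokens = [secrets.token_bytes(8) for _ in range(5)]
# Confuser tokens look the same to a third party but are unknown to the aggregator.
confuser_tokens = [secrets.token_bytes(8) for _ in range(5)]

legit = [Message(t, rng.gauss(5.0, 0.1)) for t in sensor_tokens for _ in range(20)]
decoys = [Message(t, rng.uniform(5.5, 7.0)) for t in confuser_tokens for _ in range(20)]
channel = legit + decoys
rng.shuffle(channel)

# Aggregator: keep only messages whose token it recognises.
known = set(sensor_tokens)
aggregate = mean(m.value for m in channel if m.token in known)

# Eavesdropper: cannot tell the tokens apart, so she averages everything she hears.
eavesdropped = mean(m.value for m in channel)

print(f"aggregator: {aggregate:.2f}, eavesdropper: {eavesdropped:.2f}")
```

The aggregator's estimate stays close to the legitimate mean, while the eavesdropper's estimate is pulled toward the decoy values.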
EXAMPLE 7. Consider the application in Example 1. Suppose the sensors double their transmission rate by injecting a spurious value for every legitimate one. Assume that the legitimate data is distributed within the range $N(5, 0.1)$, while the spurious data ensures the adversary’s sample will be uniformly distributed in $[4, 7]$. Given the model of Example 6, the cost is $C(S) = 8 \times 5 \times (0.025 + 0.015) + 4 \times 0.015 + 8 \times 5 \times 0.025\,\mathrm{J} = 2.66\,\mathrm{J}$. $\sum_{i=1}^{5} \frac{(\Phi(5.06) - \Phi(4.86)) \cdot p^i}{0.1 i} = 0.1492$. $\bar{\rho}$ can now be computed as $\frac{1}{0.1492} \cdot \frac{1.36}{2.66} = 3.4267$. Clearly, this technique greatly reduces the vulnerability of the network, when compared to the baseline model’s $\bar{\rho} = 0.5558$.
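The cost figure in Example 7 can be checked with a few lines of arithmetic. The Python sketch below only mirrors the structure of the expression in the example. The meaning of each factor comes from the cost model of Example 3, which is not restated here, so the parameter names are hypothetical labels chosen for readability.

```python
def streaming_cost(rate: int, nodes: int = 5,
                   e_first: float = 0.025, e_second: float = 0.015,
                   fixed_units: int = 4) -> float:
    """Mirror of the Example 7 expression, in joules:
    rate*nodes*(e_first + e_second) + fixed_units*e_second + rate*nodes*e_first.
    The parameter names are assumed labels for the constants in the example."""
    variable = rate * nodes * (e_first + e_second)
    fixed = fixed_units * e_second
    extra = rate * nodes * e_first
    return variable + fixed + extra


cost_doubled = streaming_cost(rate=8)  # one spurious value per legitimate value
cost_single = streaming_cost(rate=4)   # the same expression at half the rate

print(f"C(S) with spurious data: {cost_doubled:.2f} J")
print(f"C(S) at half the rate:   {cost_single:.2f} J")
```

At half the rate the same expression evaluates to 1.36 J, which coincides with the baseline cost that appears in the ratio computations.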
Data cloaking [10] has been proposed as another approach to achieving privacy in sensor networks. Cloaking of data involves perturbing the data by a predefined offset. This has been used to achieve anonymity within a network. A similar idea can also be used to counter eavesdropping: 1) First, nodes are partitioned into disjoint subsets. 2) Then, based on a shared secret, each node within a partition is assigned an offset. This offset is added to the actual sensor reading before transmission. Ideally, this offset should be unique to a partition. 3) At the point of aggregation, the appropriate offset is subtracted from the reading before aggregation.
Although this scheme requires maintaining a node-to-offset mapping at the aggregating point, this requirement can easily be obviated by having all the nodes within a partition transmit within a time slot. With such a routing protocol, only the mapping of different time slots to the offset would have to be stored, and this information is modest compared to the original mapping.
The adversary, who has no information about the offset, will be readily misled by the transmitted information. Even if she manages to compromise a few nodes and learn the offset information, the damage is limited to members of the partition with the compromised nodes.
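The cloaking steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: offsets are derived from the shared secret with HMAC-SHA256 (the text does not specify a derivation), each partition is identified by its time slot, and all names and constants are hypothetical.

```python
import hashlib
import hmac
import random
from statistics import mean

SHARED_SECRET = b"demo-secret"  # assumption: provisioned out of band


def partition_offset(secret: bytes, slot: int, scale: float = 10.0) -> float:
    """Derive a per-partition offset in [0, scale) from the shared secret."""
    digest = hmac.new(secret, slot.to_bytes(4, "big"), hashlib.sha256).digest()
    return scale * int.from_bytes(digest[:8], "big") / 2**64


rng = random.Random(1)
# Step 1: partition the nodes into disjoint subsets, one time slot per partition.
readings = {slot: [rng.gauss(5.0, 1.0) for _ in range(50)] for slot in range(3)}

# Step 2: each node adds its partition's offset before transmitting.
transmitted = {slot: [r + partition_offset(SHARED_SECRET, slot) for r in vals]
               for slot, vals in readings.items()}

# Step 3: the aggregator subtracts the slot's offset, then aggregates.
recovered = mean(v - partition_offset(SHARED_SECRET, slot)
                 for slot, vals in transmitted.items() for v in vals)

true_mean = mean(r for vals in readings.values() for r in vals)
# An eavesdropper without the secret averages the cloaked values as transmitted.
observed = mean(v for vals in transmitted.values() for v in vals)
mean_offset = mean(partition_offset(SHARED_SECRET, slot) for slot in readings)

print(f"true {true_mean:.3f}, recovered {recovered:.3f}, eavesdropped {observed:.3f}")
```

Only the slot-to-offset rule has to be known at the aggregation point, and an adversary who learns one partition's offset gains nothing about the other partitions.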
EXAMPLE 8. Consider the scenario in Example 1. Let us assume that the nodes $s_1, \ldots, s_5$ are themselves aggregation points of their groups and their data is distributed as $N(5, 1)$. Also, let the data at node $s_0$ be also distributed $N(5, 1)$. If average is the aggregation function used, it will be normally distributed with mean $\frac{5 \times 6}{6} = 5$ and standard deviation $\frac{1 \times 6}{36} = 0.16$. If we assume that the probability of eavesdropping on a single message is 0.5, the eavesdropping vulnerability is $\sum_{i=1}^{5} (0.5)^i \cdot (\Phi(5.06) - \Phi(4.86)) = 0.9843 \times 0.4553 = 0.4482$.
Now, if we assume that each sensor $i$, $i \in \{0, \ldots, 5\}$, adds an offset $0.1i$, which is subtracted out at $s_0$, then the average will be normally distributed with mean $\frac{5+5.1+5.2+5.3+5.4+5.5}{6} = 5.25$ and standard deviation $\frac{1 \times 6}{36} = 0.16$. In this case, the eavesdropping vulnerability will be $\sum_{i=1}^{5} (0.5)^i \cdot (\Phi(5.06) - \Phi(4.86)) = 0.9843 \times 0.1101 = 0.1083$, which is a clear reduction in eavesdropping vulnerability from 0.4482 without using the offsets.
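The arithmetic in Examples 5 and 8 can be reproduced with a short Python script. The sketch treats 0.16 as the scale parameter of the normal distribution and sums six geometric terms, because these choices recover the factor 0.9843 and the reported results. Both choices are assumptions read off the numbers, and the function name is hypothetical.

```python
from statistics import NormalDist


def eavesdropping_vulnerability(mean: float, scale: float = 0.16,
                                lo: float = 4.86, hi: float = 5.06,
                                p: float = 0.5, terms: int = 6) -> float:
    """Geometric eavesdropping factor times the probability mass of [lo, hi]."""
    dist = NormalDist(mu=mean, sigma=scale)
    mass = dist.cdf(hi) - dist.cdf(lo)                 # Phi(hi) - Phi(lo)
    factor = sum(p ** i for i in range(1, terms + 1))  # 0.984375 for p = 0.5
    return factor * mass


example_5 = eavesdropping_vulnerability(mean=4.95)    # Example 5
no_offset = eavesdropping_vulnerability(mean=5.0)     # Example 8, no offsets
with_offset = eavesdropping_vulnerability(mean=5.25)  # Example 8, offsets 0.1*i

print(f"Example 5:               {example_5:.4f}")
print(f"Example 8, no offsets:   {no_offset:.4f}")
print(f"Example 8, with offsets: {with_offset:.4f}")
```

Shifting the mean of the transmitted values away from the interval around the true aggregate is what drives the vulnerability down, since the adversary's interval then captures far less probability mass.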
Attribute-value Correlation. Yet another possibility is to use correlations between different attributes. If the application at hand is temperature monitoring and a sensor’s temperature and voltage are correlated, then, for instance, the sensors might transmit voltages in certain cases, and temperatures the remainder of the time. If we assume that the adversary does not have the correlation model, then such data will be useless to her. Constructing correlations between attributes has been previously studied (e.g., [7]), with the objective of reducing the cost for the network. Here we use it as shared information. Importantly, it takes a considerable amount of time, energy, and node samples to learn this correlation model, meaning that an attacker would need to devote significant resources to compromising a large portion of the system.
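A minimal Python sketch of this idea follows. It assumes a linear temperature-voltage model and a pseudo-random attribute schedule that only the sensors and the aggregator share. The coefficients, the seed, and all names are made up for illustration, and the text does not prescribe a particular correlation model.

```python
import random
from statistics import mean

# Assumed shared model: temperature = A * voltage + B (coefficients are made up).
A, B = 20.0, -49.0
SCHEDULE_SEED = 42  # assumed shared secret that seeds the attribute schedule


def attribute_schedule(n, seed=SCHEDULE_SEED):
    """Return n booleans; True means 'send voltage in this slot'."""
    r = random.Random(seed)
    return [r.random() < 0.5 for _ in range(n)]


rng = random.Random(3)
temperatures = [rng.gauss(5.0, 0.1) for _ in range(200)]

# Sensor side: send the correlated voltage in scheduled slots, else the temperature.
plan = attribute_schedule(len(temperatures))
sent = [(t - B) / A if use_v else t for t, use_v in zip(temperatures, plan)]

# Aggregator: regenerate the schedule and map voltages back through the model.
decoded = [A * x + B if use_v else x
           for x, use_v in zip(sent, attribute_schedule(len(sent)))]
aggregate = mean(decoded)

# Eavesdropper: lacks the schedule and the model, so she averages the raw values.
eavesdropped = mean(sent)

print(f"aggregator: {aggregate:.3f}, eavesdropper: {eavesdropped:.3f}")
```

In this sketch the aggregator recovers the temperature average exactly up to floating-point error, while the eavesdropper's average mixes two attributes measured on different scales.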
EXAMPLE 9. Again consider Example 7, with the modification that with probability 0.5, the sensors send voltage readings. Further, they also output as many spurious messages as temperature readings, in order to ensure that the adversary’s distribution is uniformly distributed in $[4, 7]$. In this case, $C(S) = (\frac{1}{2} \times 8 + \frac{1}{2} \times 4) \times 5 \times (0.025 + 0.015) + 4 \times 0.015 + (\frac{1}{2} \times 8 + \frac{1}{2} \times 4) \times 5 \times (0.025) = 1.76\,\mathrm{J}$. $\bar{\rho}$ can now be computed as $\frac{1}{0.1492} \cdot \frac{1.36}{1.76} = 5.1791$, which is better than strictly using the filtering/compensation approach.
6. RELATED WORK
Prior works on sensor security [22, 14] present attack models, but our attack taxonomy is a more general classification based on the goals of the adversary, and our focus is on the security of the overall system even when individual nodes are compromised.
There is also a significant literature on quantifying security in a context-specific way. [13] presents a quantitative model of the security intrusion process based on attacker behavior: their model is based on empirical data collected from intrusion experiments. [20] quantifies security strength and risk using economic criteria. It should be noted that though these are general methods, their applicability to sensor networks is uncertain. Our approach, in contrast, is based on data models for different applications of sensor networks.
The idea of developing a probabilistic model for data aggregation in sensor networks was introduced in [8]. We can use the same techniques to learn a model from the data. However, our focus is on using the model to understand the security vulnerabilities of a sensor network, as opposed to minimizing power usage in computing aggregates. This slightly resembles the resilient techniques for data aggregation of [21], although we focus on quantitatively ascertaining robustness in the presence of an adversary.
7. CONCLUSIONS AND FUTURE WORK
We have presented an attacker taxonomy for sensor networks which has three main classes of attackers: eavesdropping, disruption, and hijacking. So far as we know, our work is the first to focus on quantifying system-level eavesdropping vulnerability. We first study a single-level aggregation tree ($\gamma$) and then a hierarchical network ($\gamma_l$), developing a probabilistic scheme for assessing their eavesdropping vulnerability. We then consider trading off power consumption versus security and data quality/accuracy. Finally, we propose a series of solutions using cryptographic techniques, data filtering, and attribute correlation.
This paper represents an initial step in a much broader plan. First, we are extending our model to the disruption and hijacking models. We are also developing a comprehensive characterization of common sensor network protocols and aggregation functions with respect to their robustness. We ultimately hope to consider a range of other issues, such as unreliable networks, temporary outages, and correlations between the values at different sensors.
8. REFERENCES
[1] A. Arasu, S. Babu, and J. Widom. The CQL continuous
query language: Semantic foundations and query execution.
Technical Report 2003-67, Stanford University, 2003.
[2] S. Avancha, J. L. Undercoffer, A. Joshi, and J. Pinkston.
Secure sensor networks for perimeter protection. Computer
Networks, 43(4):421–435, November 2003.
[3] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom.
Models and issues in data stream systems. In PODS ’02:
Proceedings of the twenty-first ACM
SIGMOD-SIGACT-SIGART symposium on Principles of
database systems, pages 1–16, New York, NY, USA, 2002.
ACM Press.
[4] H. Chan and A. Perrig. Security and privacy in sensor
networks. IEEE Computer Magazine, pages 103–105, 2003.
[5] F. Chu, Y. Wang, and C. Zaniolo. An adaptive learning
approach for noisy data streams. In ICDM, pages 351–354,
2004.
[6] E. Cronin, M. Sherr, and M. Blaze. On the reliability of
internet eavesdropping, February 2005. Personal
Communication.
[7] A. Deshpande, C. Guestrin, S. Madden, and W. Hong.
Exploiting correlated attributes in acquisitional query
processing. In ICDE 2005, 2005.
[8] A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein,
and W. Hong. Model-driven data acquisition in sensor
networks. In 30th VLDB Conference, 2004.
[9] W. Du, J. Deng, Y. S. Han, S. Chen, and P. Varshney. A key
management scheme for wireless sensor networks using
deployment knowledge. In Proceedings of The 23rd
Conference of the IEEE Communications Society, 2004.
[10] M. Gruteser, G. Schelle, A. Jain, R. Han, and D. Grunwald.
Privacy-aware location sensor networks. In Proceedings of
HotOS’03: 9th Workshop on Hot Topics in Operating
Systems, pages 163–168. USENIX, May 2003.
[11] J. M. Hellerstein, W. Hong, S. Madden, and K. Stanek.
Beyond average: Towards sophisticated sensing with queries.
In 2nd International Workshop on Information Processing in
Sensor Networks (IPSN ’03), March 2003.
[12] I. Miller and J. E. Freund. Probability and Statistics for
Engineers, 2nd edition. Prentice Hall, Inc., Englewood Cliffs,
NJ, 1977.
[13] E. Jonsson and T. Olovsson. A quantitative model of the
security intrusion process based on attacker behavior. IEEE
Trans. Softw. Eng., 23(4):235–245, 1997.
[14] C. Karlof and D. Wagner. Secure routing in wireless sensor
networks: Attacks and countermeasures. In IEEE Int’l
Workshop on Sensor Network Protocols and Applications,
pages 113–127, May 2003.
[15] Y. W. Law, S. Etalle, and P. H. Hartel. Assessing
Security-Critical Energy-Efficient sensor networks. In Conf.
on Security and Privacy in the Age of Uncertainty (SEC),
pages 459–463, May 2003.
[16] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong.
Design of an acquisitional query processor for sensor
networks. In SIGMOD 2003, pages 491–502, 2003.
[17] T. Mitchell. Machine Learning. McGraw Hill, 1997.
[18] A. Perrig, R. Szewczyk, V. Wen, D. E. Culler, and J. D.
Tygar. SPINS: security protocols for sensor networks. In
Mobile Computing and Networking, pages 189–199, 2001.
[19] B. Przydatek, D. Song, and A. Perrig. SIA: secure
information aggregation in sensor networks. In SenSys ’03,
pages 255–265, 2003.
[20] S. E. Schechter. Computer security strength & risk: A
quantitative approach. Harvard University Doctoral
Dissertation, 2004.
[21] D. Wagner. Resilient aggregation in sensor networks. In
SASN: Proc. Workshop on security of ad hoc and sensor
networks, pages 78–87, 2004.
[22] A. D. Wood and J. A. Stankovic. Denial of service in sensor
networks. Computer, 35(10):54–62, 2002.
[23] Y. Yao and J. Gehrke. Query processing for sensor networks.
In CIDR 2003, 2003.