All content in this area was uploaded by Khaled Elleithy on May 09, 2018
Efficient Unsupervised Learning to Secure
Communication for Wireless Sensor Network
Middleware
Remah Alshinina and Khaled Elleithy
Department of Computer Science and Engineering
University of Bridgeport
Bridgeport, CT, United States
ralshini@my.bridgeport.edu elleithy@bridgeport.edu
Abstract—Wireless sensor networks (WSNs) are deployed
in many applications such as those used in monitoring, by the
military, in healthcare, etc. Since these applications deal with
the transfer of sensitive data, they need protection from
various attacks and intrusions. From the current literature, we
observed that existing security algorithms are not suitable for
large-scale WSNs due to limitations in energy consumption,
throughput, and overhead. This paper introduces a new
algorithm based on a machine learning technique called
generative adversarial networks (GANs) to protect the
information of sensor nodes when transmitting data to and
from the base station. The GANs generate fake data that is
identical to the data from the sensor nodes to confuse the
adversary in differentiating between the two. This technique
completely eliminates the need for fake sensor nodes, which
consume more power and reduce throughput and the lifetime
of the network. Simulation results show that the proposed
technique provides higher throughput and increases successful
data rates with low energy consumption.
Keywords—GANs, fake data, middleware, generator,
discriminator, adversary, wireless sensor networks, unsupervised
learning, security.
I. INTRODUCTION
Wireless sensor networks (WSNs) have received an
enormous amount of attention in recent years in military,
industrial, surveillance security, and healthcare applications
[1-3]. WSNs consist of a large number of sensors deployed
randomly in harsh environments and are capable of
monitoring environmental temperature, humidity, and sound.
The sensor nodes are small and inexpensive and run on
non-rechargeable batteries. However, the major limitations of
WSNs are resource constraints such as battery power,
bandwidth, and memory capacity [4, 5]. The nodes gather
different types of information about the environment and
make decisions in real time, then transmit sensor data to the
base station for further processing.
Recently, middleware has emerged as an approach to address
WSN limitations and provide the services these applications
require [6, 7]. Generally, middleware bridges the gap between
applications and the low-level WSN, and allows an
application to communicate with different
types of nodes (hardware) [7]. The successful design and
development of middleware must address many challenges,
such as managing limited power and resources, scalability,
security, heterogeneity, data aggregation, and quality of
service [1, 6-8].
Machine learning (ML) algorithms have been used
extensively for WSN applications. ML algorithms are
categorized into supervised, unsupervised, and reinforcement
learning [9].
In this paper, we introduce an unsupervised learning
algorithm, generative adversarial networks (GANs), for
WSN middleware to provide end-to-end secure
communication. GANs were first introduced in 2014 by
Goodfellow et al. [10] as a class of artificial
intelligence used in unsupervised machine learning. A GAN
has two components: a generator (G) network and a
discriminator (D) network. The generator creates fake
data similar to the real samples and sends it to the
discriminator. The discriminator takes the fake data
from G along with samples from the real distribution as
inputs. G aims to learn the distribution of the real data,
and D aims to distinguish whether the input data is real or
comes from G. During training, G tries to maximize the
probability of D accepting its output as real data. The
interaction between the two networks during training is
expressed in equation 1.
$$\min_G \max_D \; \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \quad (1)$$
As shown in the equation, D takes real data (x) and
fake data from the generator, represented as G(z); its output,
D(x) or D(G(z)), is the probability of that data being real.
Thus, D is trained to increase the likelihood of identifying
real data and to lower the probability of accepting fake data
from the generator. The G network takes a random noise
vector (z) as the input. The first term corresponds to driving
the probability assigned to real data (x) close to 1, and the
second term corresponds to driving the probability assigned to
fake data (G(z)) close to zero [10-12].
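As a rough illustration of equation 1, the sketch below estimates the value function from a handful of discriminator outputs. The function name `gan_value` and the sample probabilities are assumptions made for illustration; they are not taken from the simulation in this paper.

```python
import math

def gan_value(d_real, d_fake):
    """Sample estimate of the value function in equation 1.

    d_real: discriminator outputs D(x) on real samples.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    All values must lie strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical outputs of a confident, correct discriminator:
# D(x) is close to 1 and D(G(z)) is close to 0.
strong_d = gan_value([0.99, 0.98], [0.01, 0.02])

# Hypothetical outputs of a fully fooled discriminator: 0.5 everywhere.
fooled_d = gan_value([0.5, 0.5], [0.5, 0.5])
```

The value is higher for the confident discriminator than for the fooled one, which matches the roles in equation 1: D pushes the value up, while G pushes it down by making D(G(z)) larger.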
The proposed technique provides anonymity to
sensor node data through the use of fake data. It enhances
the performance of the network and provides secure
transmission between sensor nodes and to and from the base
station. The rest of this paper is organized as follows:
Section II reviews related work on WSNs pertaining to
security and performance. Section III presents the detailed design
of the proposed method. Section IV describes the simulation
setup and the evaluation of the results. Finally, conclusions
and discussion are presented in Section V.
II. RELATED WORK
Wireless sensor networks have gained considerable
attention from researchers trying to address performance and
security challenges associated with large-scale networks.
One of the biggest challenges in WSN design is its
complexity and large number of sensor nodes. A variety of
complex applications calls for sensitive data transmission
between the sensor nodes and to and from the base station
wirelessly. Massive amounts of data, processed and
transmitted between the base station and the destination,
require more energy. There are different middleware
approaches which incorporate WSNs to increase
performance, as discussed in [1, 6, 7].
The main objective of research on WSNs is to improve
performance in terms of network lifetime and power
conservation. Current literature proposes standard security
algorithms to secure entire networks from different attackers
[13-15]. WSNs have unique characteristics such as node
failures, communication failures, mobility of nodes, and
dynamic network topology [5]. Furthermore, these
characteristics make secure communication of sensor nodes
within the network very challenging. Most of the existing
traditional security algorithms cannot be applied to WSNs
due to constrained energy and computation capability.
Machine learning (ML) algorithms have been applied to
WSNs and have the capability to significantly reduce energy
consumption and increase network lifetime. The strength of
existing ML algorithms, such as support vector machine
(SVM), neural networks (NN), K-nearest neighbors
(K-NN), and decision tree (DT), depends on the quality of the
training set (dataset). In practice, the training data is often
limited and biased (imbalanced), which leads to classification
problems where the classes are not represented equally.
One way to increase the security of WSNs is to introduce
fake sensor nodes that are capable of transmitting fake data
across the network. This approach, while increasing the
reliability of the network against malicious attacks,
significantly increases power consumption. With the sensor
nodes requiring much more power to prevent malicious
attacks, the lifetime of the network is subsequently reduced.
III. PROPOSED TECHNIQUE
This paper presents an intelligent machine learning
algorithm for developing secure wireless sensor networks
middleware (SWSNM). SWSNM provides an efficient,
secure communication between sensor nodes and the base
station with minimal power consumption, increased
probability of successful data delivery, and improved
network lifetime. The proposed approach eliminates the need
to use fake sensor nodes by introducing a machine learning
algorithm into WSNs.
To the best of our knowledge, this is the first generative
adversarial networks (GANs) algorithm applied to wireless
sensor network middleware to provide end-to-end security.
The GANs consist of two networks: the generator (G) and
the discriminator (D). The G network is used to generate fake
data very similar to data generated by the sensor nodes. The
D network is an intelligent network used to distinguish and
filter between real and fake data from sensors before
transmitting to end users [10, 16].
The presented approach is capable of addressing the
anonymity of real data communication by incorporating real
data (from sensor nodes) with fake data (generated from
Generator network) to confuse the adversary. The main goal
of the G network is to generate fake data very similar to the
real data from the sensor nodes, and then combine the fake
data into the real data before diffusing it to the base station.
A. Network Model
The network is composed of sensor nodes, the base
station, and fake data (from the generator network). The
nodes are distributed randomly with the same power,
resources, and computational capabilities. The nodes collect
information about an event and combine their data with fake
data before transmitting it to the base station. The fake data
that is generated from the generator network should be
identical to the real data from the sensor node. The base
station has a higher capacity in terms of power and resources
than other sensor nodes in the network.
B. Generating Fake Data
The generative adversarial networks (GANs) algorithm is
applied to generate fake data that is identical to real data,
and the D network is used to secure the network [16]; more
details about this technique are given in [16]. Injecting fake
data into the real data of each node during the lifetime of
the network, instead of using fake nodes to generate dummy
data, seems to have a positive impact on energy consumption
and network throughput. The real data is hidden within fake data that the
adversary cannot distinguish. By applying this technique, the
data is transmitted to the base station in a secure manner. The
discriminator network (D) should be able to distinguish
between the real data and fake data and filter it, before
sending it to the client or end user.
We evaluate the proposed algorithm by feeding the G
network data that can be either normal or attack data. The G
network generates fake data and then appends it to the real
data. The sensor node performs this step before
sending any data to its neighbor or the base station. Finally,
the data passes through D as shown in Fig 1. The D
network evaluates and filters the data (real and fake), even
when the two are very similar. After that, only the real
data is transmitted to the end user.
Fig 1. The scenario of our proposed GANs algorithm
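The flow in Fig 1 can be sketched in a few lines. This is only an illustration under strong assumptions: `mix_with_fake`, `filter_real`, and `toy_discriminator` are hypothetical names, the readings are made-up values, and the toy discriminator is a lookup that stands in for a trained D network rather than the model used in this paper.

```python
import random

def mix_with_fake(real, fake, seed=None):
    """Combine real and fake readings and shuffle them, so that
    position in the payload does not reveal which readings are real."""
    payload = list(real) + list(fake)
    random.Random(seed).shuffle(payload)
    return payload

def filter_real(payload, discriminator, threshold=0.5):
    """Keep only the readings that the discriminator scores as real."""
    return [r for r in payload if discriminator(r) >= threshold]

# Made-up temperature readings for illustration.
real_readings = [21.5, 22.0, 21.8]
fake_readings = [21.6, 22.1, 21.7]

# Hypothetical stand-in for a trained D network: it simply looks the
# reading up, which a real discriminator could not do.
known_real = set(real_readings)

def toy_discriminator(reading):
    return 1.0 if reading in known_real else 0.0

payload = mix_with_fake(real_readings, fake_readings, seed=7)
delivered = filter_real(payload, toy_discriminator)
```

An observer of `payload` sees six plausible readings, while `delivered` contains only the three real ones that would be forwarded to the end user.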
IV. SIMULATION AND RESULTS
In our simulation, the network covers a 1500 × 1500
m² area and is simulated using NS2. The WSN involves 150
sensor nodes with a transmission range of 40 meters. The
initial energy of the nodes is set to 6 joules. The maximum
power consumption of the sensor nodes for receiving (Rx)
and transmitting (Tx) data is set to 14 mW and 13.0
mW, respectively. Sensing and idle modes consume 10.2 mW
and 0.42 mW, respectively. The maximum simulation time
is 45 minutes, and the pause time is 20 seconds for phase
initialization before starting the simulation. During the
testing phase, the GAN takes about 20 seconds to
distinguish between real and fake data. The
experimental evaluation indicates that the
discriminator network can protect the
network from attackers or malicious nodes. It improves
the security of the network without compromising on
network delay.
The main objective of the simulation is to monitor the
network and secure data communication from internal and
external malicious data. The biggest challenge in WSNs is
when the attacker compromises the node by targeting the
network resources. For this purpose, our proposed algorithm
generates fake data identical to real data from the sensors in
the network area, and then joins the real and fake data
before sending it to the base station through routers.
The simulated network contains 12 mobile
nodes and 138 static nodes. We assume that the nodes that
drop all packets passing through them are malicious nodes.
When such packet drops are observed, the algorithm
assigns a malicious flag to those nodes. The location of each
malicious node within the network is calculated and
those nodes are replaced with static (normal) nodes. A
comparison of SWSNM with and without malicious nodes is
shown in Figures 2, 3, and 4. Figure 2 shows the average
amount of energy consumed by the nodes within the
network. When the malicious nodes are
replaced with new static nodes, the energy consumption of
the network is reduced. In the proposed approach, the
energy consumed during data transmission as well as in sleep
and idle modes is taken into account. The energy
consumption is obtained from equation 2. The energy
consumed by node j covers the bits of packets it transmits
and receives while the node is active, plus the energy spent
in sleep and idle modes; n is the total number of nodes in the
network.
$$\text{Energy consumption} = \frac{1}{n}\sum_{j=1}^{n} \text{Total energy consumed at node } j \quad (2)$$
Fig 2. Energy consumption with and without malicious nodes
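A minimal sketch of the averaging in equation 2 follows, using the per-mode power values from the simulation setup. The time spent in each mode is a hypothetical input chosen for illustration, and sleep mode is left out because no sleep power is listed.

```python
# Power draw per mode in mW, as listed in the simulation setup.
POWER_MW = {"tx": 13.0, "rx": 14.0, "sense": 10.2, "idle": 0.42}

def node_energy_joules(seconds_in_mode):
    """Energy of one node: power (mW) times time (s) gives mJ,
    summed over modes and converted to joules."""
    millijoules = sum(POWER_MW[mode] * t for mode, t in seconds_in_mode.items())
    return millijoules / 1000.0

def average_energy(nodes):
    """Equation 2: mean of the per-node energy totals over n nodes."""
    return sum(node_energy_joules(n) for n in nodes) / len(nodes)

# Hypothetical activity profiles (seconds per mode) for two nodes.
nodes = [
    {"tx": 100, "rx": 100, "sense": 50, "idle": 1000},  # busy node
    {"tx": 0, "rx": 0, "sense": 0, "idle": 1000},       # idle node
]
avg = average_energy(nodes)  # roughly 2.025 J for these inputs
```

With these made-up profiles the busy node uses about 3.63 J and the idle node about 0.42 J, so radio and sensing activity dominate the average.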
Throughput is defined as the amount of data that is
transmitted from source nodes to the destination (base
station) within certain time, obtained in (3). Figure 3 shows
the comparison of network throughput for each of the two
cases. As seen from Figure 3, the throughput of the network
without the malicious nodes is significantly higher than the
network with malicious nodes. Similarly, the probability of
successful data delivery ($P_{\text{success rate}}$) in an all-static node
network is much higher than that of a network with malicious nodes, as
shown in Figure 4 and obtained in (4).

$$\text{Throughput} = \frac{\text{Number of bytes received at base station}}{\text{Total number of bytes transmitted at source nodes}} \quad (3)$$

$$P_{\text{success rate}} = \frac{\text{Total delivered packets}}{\text{Generated packets}} \times 100 \quad (4)$$
Fig 3. Throughput with and without malicious nodes
Fig 4. Successful data delivery with and without malicious nodes
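Equations 3 and 4 are simple ratios, and the sketch below computes both from counters that a simulation trace would provide. The function names and the example counts are assumptions for illustration only.

```python
def throughput(bytes_received_at_bs, bytes_sent_by_sources):
    """Equation 3: bytes received at the base station divided by
    bytes transmitted at the source nodes."""
    if bytes_sent_by_sources <= 0:
        raise ValueError("bytes_sent_by_sources must be positive")
    return bytes_received_at_bs / bytes_sent_by_sources

def success_rate(delivered_packets, generated_packets):
    """Equation 4: delivered packets as a percentage of generated packets."""
    if generated_packets <= 0:
        raise ValueError("generated_packets must be positive")
    return 100.0 * delivered_packets / generated_packets

# Hypothetical counters, not values from the simulation.
example_throughput = throughput(900, 1000)
example_success = success_rate(450, 500)
```

A node that drops every packet it should forward lowers the numerator in both ratios, which is the effect Figures 3 and 4 compare.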
The end-to-end delay, obtained in (5), is another
important parameter to evaluate the performance of the
proposed approach. Figure 5 shows that the end-to-end delay
increases until a certain time (~32 minutes) and stays fairly
constant after that. It is noteworthy that while the trends are
similar, the SWSNM shows significantly lower end-to-end
delay when compared to that with the existence of malicious
nodes (SWSNM w/10 MN).
$$EED = \frac{1}{n}\sum_{j=1}^{n} D_j \quad (5)$$
where n is the total number of nodes in the network. The
delay of node j, $D_j$, is obtained in (6), where $D_{rec}^{p}$ represents the
arrival time at the destination for packet p and $D_{snd}^{p}$ is the
transmission time at the source node.
$$D_j = \frac{1}{\text{Number of packets by node } j}\sum_{p=1}^{pkt}\left(D_{rec}^{p} - D_{snd}^{p}\right) \quad (6)$$
Fig 5. End-to-End Delay with and without malicious nodes
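The two-level averaging in equations 5 and 6 can be sketched as follows. The trace layout (a list of send and receive timestamps per node) and the timestamps themselves are assumptions made for illustration.

```python
def node_delay(packets):
    """Equation 6: mean of (receive time - send time) over the
    packets of one node. Each packet is a (send, receive) pair."""
    return sum(rec - snd for snd, rec in packets) / len(packets)

def end_to_end_delay(per_node_packets):
    """Equation 5: mean of the per-node delays D_j over the n nodes."""
    delays = [node_delay(p) for p in per_node_packets]
    return sum(delays) / len(delays)

# Hypothetical trace: (send time, receive time) in seconds.
trace = [
    [(0.0, 0.5), (1.0, 1.25)],  # node 1: delays 0.5 and 0.25
    [(2.0, 2.125)],             # node 2: delay 0.125
]
eed = end_to_end_delay(trace)  # (0.375 + 0.125) / 2 = 0.25
```

Because each node is averaged first, a node with few packets weighs as much in EED as a node with many packets.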
Percentage differences for each of the four metrics are
shown in Table 1. For ease of comparison, one location
or time is selected for each metric. Table 1 shows
that the network without malicious nodes consumes 13.7%
less energy than the network that includes malicious
nodes. Similarly, throughput is more than 10%
higher when all malicious nodes are replaced in the
proposed design. The probability of successful data delivery
is almost 100% with no malicious nodes, compared with
about 90% with 10 malicious nodes in the network. Based
on the results, it can be inferred that if the proportion of
malicious nodes in the network is higher (for example 20%),
the percentage differences in the calculated metrics are
expected to be much larger.
TABLE 1. COMPARISON TABLE OF PROPOSED SWSNM APPROACH

| Metric | Location/Time | SWSNM | SWSNM w/ 10 MN | % Diff. |
|---|---|---|---|---|
| Energy Consumption | 140th Node | 2.638 J | 3.056 J | 13.7% |
| Throughput | 140th Node | 490.39 | 440.25 | 10.2% |
| Successful Data Delivery | 35 Minutes | 99.77 % | 90.82 % | 9.0% |
| End-to-End Delay | 35 Minutes | 0.04 | 0.053 | 24.5% |
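The percentage differences in Table 1 appear to be consistent with dividing the absolute difference by the larger of the two values. That reading is inferred from the numbers and is not a formula stated in the text; the sketch below recomputes each row under this assumption.

```python
def percent_diff(a, b):
    """Absolute difference as a percentage of the larger value.
    This definition is an assumption inferred from Table 1."""
    return abs(a - b) / max(a, b) * 100.0

# (SWSNM, SWSNM w/ 10 MN, reported % Diff.) taken from Table 1.
table1 = {
    "Energy Consumption": (2.638, 3.056, 13.7),
    "Throughput": (490.39, 440.25, 10.2),
    "Successful Data Delivery": (99.77, 90.82, 9.0),
    "End-to-End Delay": (0.04, 0.053, 24.5),
}

# Recompute each row and round to one decimal, as in the table.
recomputed = {
    metric: round(percent_diff(clean, with_mn), 1)
    for metric, (clean, with_mn, _reported) in table1.items()
}
```

Under this definition all four recomputed values match the reported column to one decimal place.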
V. CONCLUSION
This paper presents a unique middleware technique
based on the generative adversarial networks algorithm to
provide secure communication for wireless sensor networks.
Many existing wireless sensor network middleware approaches
lack support for security, energy efficiency, and scalability.
We present in this work a new security algorithm for WSNs.
The algorithm comprises the generator (G) and discriminator (D)
networks, which work together to secure the sensor data
from attackers. In our experiment, we compared
the generated data with real data by using the D network.
The results show that even if G generates data that closely
resembles real data, it can be detected by the D network. In this case, the D
network is capable of detecting attack data. Simulation
results demonstrate that the proposed approach provides
a stronger security mechanism by detecting and replacing
malicious nodes, which leads to lower energy consumption,
higher throughput, and increased probability of successful
data delivery to and from the base station. Ongoing work
will include estimations of error rates and the lifetime of the
network along with comparison of the proposed technique
with other contending approaches.
REFERENCES
[1] R. Alshinina and K. Elleithy, "Performance and
Challenges of Service-Oriented Architecture for Wireless
Sensor Networks," Sensors, vol. 17, no. 3, p. 536, 2017.
[2] Y. Wang, G. Attebury, and B. Ramamurthy, "A survey
of security issues in wireless sensor networks," IEEE
Communications Surveys & Tutorials, pp. 2-23, Second
Quarter 2006.
[3] X. Chen, K. Makki, K. Yen, and N. Pissinou, "Sensor
network security: a survey," IEEE Communications
Surveys & Tutorials, vol. 11, no. 2, 2009.
[4] H. Kantharaju and K. N. Murthy, "A Survey on
Enhancing System Performance of Wireless Sensor
Network by Secure Assemblage Based Data Delivery,"
in Recent Advances in Electronics and Communication
Technology (ICRAECT), 2017 International Conference
on, Bangalore, India, 2017, pp. 289-296: IEEE.
[5] Y. Zhang, N. Meratnia, and P. Havinga, "Outlier
Detection Techniques for Wireless Sensor Networks: A
Survey," IEEE Communications Surveys & Tutorials,
vol. 12, no. 2, pp. 159-170, 2010.
[6] A. Shchzad, N. Hung Quoc, S. Y. Lee, and L. Young-
Koo, "A comprehensive middleware architecture for
context-aware ubiquitous computing systems," in Fourth
Annual ACIS International Conference on Computer and
Information Science (ICIS'05), Jeju Island, South Korea,
14-16 July 2005, pp. 251-256.
[7] S. Hadim and N. Mohamed, "Middleware for Wireless
Sensor Networks: A Survey," in 1st International
Conference on Communication Systems Software &
Middleware, New Delhi, India, 8-12 Jan. 2006 pp. 1-7.
[8] J. Al-Jaroodi and A. Al-Dhaheri, "Security issues of
service-oriented middleware," International Journal of
Computer Science and Network Security, vol. 11, no. 1,
pp. 153-160, 2011.
[9] M. A. Alsheikh, S. Lin, D. Niyato, and H. P. Tan,
"Machine Learning in Wireless Sensor Networks:
Algorithms, Strategies, and Applications," IEEE
Communications Surveys & Tutorials, vol. 16, no. 4, pp.
1996-2018, 2014.
[10] I. Goodfellow et al., "Generative adversarial nets," in
Advances in neural information processing systems,
2014, pp. 2672-2680.
[11] J. T. Springenberg, "Unsupervised and semi-supervised
learning with categorical generative adversarial
networks," arXiv preprint arXiv:1511.06390, 2015.
[12] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A.
Radford, and X. Chen, "Improved techniques for training
gans," in Advances in Neural Information Processing
Systems, 2016, pp. 2234-2242.
[13] S. Roy, N. Maitra, J. Nath, S. Agarwal, and A. Nath,
"Ultra Encryption Standard (UES) Version-I: Symmetric Key
Cryptosystem using generalized modified Vernam
Cipher method, Permutation method and Columnar
Transposition method," in
Proceedings of IEEE sponsored National Conference on
Recent Advances in Communication, Control and
Computing Technology-RACCCT, 2012, pp. 29-30.
[14] A. A. Pirzada and C. McDonald, "Secure routing with
the AODV protocol," in Communications, 2005
Asia-Pacific Conference on, Perth, WA, Australia, 2005, pp.
57-61: IEEE.
[15] Y. W. Law, J. Doumen, and P. Hartel, "Benchmarking
block ciphers for wireless sensor networks," in Mobile
Ad-hoc and Sensor Systems, 2004 IEEE International
Conference on, Fort Lauderdale, FL, USA, 2004, pp.
447-456: IEEE.
[16] R. Alshinina and K. Elleithy, "A Highly Accurate
Machine Learning Approach for Developing Wireless
Sensor Network Middleware." unpublished.