A Machine Learning Approach to Predict the Average Localization Error With Applications to Wireless Sensor Networks

Received October 22, 2020, accepted November 12, 2020, date of publication November 17, 2020,
date of current version November 30, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3038645
ABHILASH SINGH 1, (Member, IEEE), VAIBHAV KOTIYAL 2, SANDEEP SHARMA 2,
JAIPRAKASH NAGAR 3, (Member, IEEE), AND CHENG-CHI LEE 4,5, (Member, IEEE)
1Fluvial Geomorphology and Remote Sensing Laboratory, Indian Institute of Science Education and Research at Bhopal, Bhopal 462066, India
2Department of Electronics and Communication Engineering, Gautam Buddha University, Greater Noida 201312, India
3Subir Chowdhury School of Quality and Reliability, IIT Kharagpur, Kharagpur 721302, India
4Research and Development Center for Physical Education, Health, and Information Technology, Department of Library and Information Science, Fu Jen Catholic
University, New Taipei City 242, Taiwan
5Department of Photonics and Communication Engineering, Asia University, Taichung 41354, Taiwan
Corresponding authors: Sandeep Sharma (sandeepsvce@gmail.com) and Cheng-Chi Lee (cclee@mail.fju.edu.tw)
ABSTRACT Node localisation is one of the significant concerns in Wireless Sensor Networks (WSNs). It is a process in which we estimate the coordinates of the unknown nodes using sensors with known coordinates, called anchor nodes. Several bio-inspired algorithms have been proposed for accurate estimation of the unknown nodes. However, the use of bio-inspired algorithms is highly time-consuming. Hence, finding optimal network parameters for node localisation during the network set-up process with the desired accuracy in a short time is still a challenging task. In this article, we propose an efficient way to evaluate the optimal network parameters that result in a low Average Localisation Error (ALE) using a machine learning approach based on the Support Vector Regression (SVR) model. We propose three methods (S-SVR, Z-SVR and R-SVR) based on feature standardisation for fast and accurate prediction of ALE. We consider the anchor ratio, transmission range, node density and number of iterations as features for training and prediction of ALE. These feature values are extracted from modified Cuckoo Search (CS) simulations. In doing so, we found that all the methods perform exceptionally well, with R-SVR outperforming the other two methods with a correlation coefficient of R = 0.82 and a Root Mean Square Error (RMSE) of 0.147 m.
INDEX TERMS ALE, modified CS algorithm, SVR model, GPR model, WSNs.
I. INTRODUCTION
A WSN consists of a set of miniature and inexpensive sensors that are spatially distributed over an area to measure physical parameters or monitor habitat conditions. WSNs have many practical areas of implementation, such as target tracking and precision agriculture [1]–[6]. In most applications, these sensors need to estimate their coordinates accurately with minimum resource requirements. The sensors can quickly determine their coordinates using an integrated Global Positioning System (GPS) receiver. However, it is not practically feasible to integrate GPS in all the sensors due to its size and cost. An alternative approach is to use localisation algorithms in which several anchor nodes (with integrated GPS) assist the unknown nodes in determining their coordinates accurately.

The associate editor coordinating the review of this manuscript and approving it for publication was Tie Qiu.
A large number of localisation algorithms have been introduced to solve different localisation problems [7]. These algorithms are expected to be flexible so that they work well in diverse indoor and outdoor scenarios and topologies. Localisation algorithms are divided into two categories, viz., range-based algorithms and range-free algorithms. In range-based algorithms, the location of the unknown nodes is computed with the help of the distance between the anchor and unknown sensor nodes. They utilise ranging metrics such as the angle of arrival, time of arrival, and the Received Signal Strength Indication (RSSI) [8]–[10]. In contrast, range-free algorithms such as the ad-hoc positioning system [11] and centroid [12] make use of simple connectivity-related operations to localise the unknown node. They only need the beacon signal transmitted by the anchor node to be present in the medium. Of the two, the range-based algorithms are more widely employed and preferred over the range-free algorithms [13]–[15].

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To design a less complicated algorithm, various bio-inspired algorithms have been proposed for the range-based approach [16]. Initially, Gopakumar and Jacob [17] presented a node localisation method based on Particle Swarm Optimisation (PSO) [18], which imitates the behaviour of a fish swarm searching for food. This algorithm showed good initial results, but the implementation tended to get caught in a local optimum, which results in premature convergence. In 2014, Goyal and Patterh [19] implemented CS for node localisation in WSNs. It showed noticeable results in minimising the localisation error. This is mainly because of the tuning parameters in the CS algorithm, which ease the calculation process. Recently, a modified version of CS was proposed by Cheng and Xia [20], which improved the convergence rate of the conventional CS algorithm. They modified the random walk step size and the mutation probability to improve the search process.
The ALE metric assesses the accuracy of these localisation algorithms. We select the algorithm that has the minimum ALE value. The major problem after selecting a bio-inspired algorithm for node localisation is the computational time. During any network set-up, we need to run the algorithm many times in order to find the optimal network parameters (such as anchor ratio, transmission range and node density) and to tune the ALE below the threshold for the desired scenario. To deal with this limitation, we have proposed an efficient machine learning approach for accurate and fast prediction of ALE in such a scenario. As far as we know, no other study has been conducted and published to address this issue.

In this article, we have presented three methods based on the SVR model. We have selected and extracted four features, namely anchor ratio, transmission range, node density and number of iterations, from the modified CS algorithm. We then used this data to train the SVR model and obtained the predicted ALE using the trained SVR model for all the three methods.
Further, we have organised the remainder of this article into six sections. In Section II, we discuss the related works. In Section III, we discuss the system model for the node localisation problem, along with the details of feature importance, hyper-parameter tuning and the SVR model. In Section IV, we discuss the simulation scenarios and parameters for the modified CS and SVR model. In Section V, we discuss the results of all the three methods for ALE prediction. Finally, in Sections VI and VII, we present the discussion and conclusion, respectively.
II. RELATED WORKS
In this section, we discuss several methods for improving node localisation accuracy. Several studies have been conducted to improve localisation accuracy using machine learning. Morelande et al. [21] introduced a Bayesian algorithm for node localisation in WSNs. The proposed algorithm is a refinement of a previous work referred to as progressive correction [22]. Both methods are compared in different scenarios, keeping the Cramér–Rao bound (CRB) as the benchmark. The proposed algorithm proved to be more accurate than its predecessor. Further, Ghargan et al. [23] presented an approach in which an Artificial Neural Network (ANN) is hybridised individually with three optimisation algorithms: Particle Swarm Optimisation (PSO), Backtracking Search Algorithm (BSA) and Gravitational Search Algorithm (GSA). The GSA-ANN hybrid outperformed the other methods with a mean absolute distance estimation error of 0.02 m and 0.2 m for outdoor and indoor scenarios, respectively. In a recent survey, Ahmadi and Bouallegue [24] compiled the different state-of-the-art machine learning techniques utilised in node localisation in WSNs. The survey compared the cumulative localisation error distribution curves of various techniques such as ANN, Support Vector Machine (SVM), Decision Tree (DT) and Naive Bayes (NB). It reported that NB outperformed all the other machine learning techniques based on their cumulative localisation error distributions. Bhatti et al. [25] developed an outlier detection algorithm named ‘‘iF_Ensemble’’ for an indoor localisation environment using a combination of different supervised, unsupervised, and ensemble machine learning methods. Here, the supervised learning techniques are K-nearest neighbour (KNN), Random Forest (RF) classifiers and SVM, whereas the unsupervised learning technique is isolation Forest (iForest). These techniques are used with stacking, which is an ensemble learning method. The model, including stacking, is compared with the individual performances of the machine learning algorithms involved. The stacking model provides a high localisation accuracy of 97.8% with the proposed outlier detection methods. Recently, Wang et al. [26] introduced a node localisation algorithm named Kernel Extreme Learning Machines based on Hop-count Quantization (KELM-HQ). The trained KELM computes the locations of the unknown nodes. The proposed algorithm reduces the localisation error by 34.6% when compared with fast-SVM, 19.2% when compared with the GADV-Hop algorithm, and 11.9% when compared with the DV-Hop-ELM algorithm.
Overall, this study aims to overcome the limitation
of localisation accuracy in previous studies by using a
regression-based machine learning approach.
III. SYSTEM MODEL
In this section, first, we have discussed the system architecture designed for the node localisation process. Then we have discussed the method to compute the distance between the anchor and unknown nodes. Afterwards, we have discussed the objective function formation and working of the modified CS algorithm for node localisation. Finally, we have discussed the details of the machine learning model used.
A. SYSTEM ARCHITECTURE
The sensor nodes are considered to be deployed randomly inside a region with an area of $X \times Y$ square units. The system consists of $M$ anchor nodes. These anchor nodes act as a reference for all $N$ unknown nodes of the network, which need to be localised. All the sensors can transmit/receive data within a transmission range of $R$ distance units. The anchors' positional information is utilised as a reference to evaluate the coordinates of all the localisable unknown nodes. An unknown node is considered localisable only if it has at least three anchor nodes inside its communication range.
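The localisability rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `is_localisable`, the coordinates, and the 20-unit range are assumptions chosen for the example.

```python
import math


def is_localisable(node, anchors, tx_range, min_anchors=3):
    """Return True if at least `min_anchors` anchors lie within `tx_range` of `node`."""
    in_range = sum(
        1 for ax, ay in anchors
        if math.hypot(node[0] - ax, node[1] - ay) <= tx_range
    )
    return in_range >= min_anchors


# Hypothetical anchor layout, for illustration only.
anchors = [(10.0, 10.0), (20.0, 15.0), (15.0, 25.0), (90.0, 90.0)]
print(is_localisable((15.0, 15.0), anchors, tx_range=20.0))  # three anchors in range
print(is_localisable((80.0, 80.0), anchors, tx_range=20.0))  # only one anchor in range
```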
B. DISTANCE CALCULATION AND OPTIMISATION
PROBLEM FORMATION
The RSSI is used by the unknown nodes to calculate their distances from the anchor nodes. Sensors experience a power loss during the exchange of information because of shadowing and multipath fading. This path loss is modelled as log-normal shadowing [27], which is expressed as shown in Eq. (1):

$$PL(d) = PL_0 + 10 \times \eta \times \log_{10}\left(\frac{d}{d_0}\right) + X_g \tag{1}$$

In Eq. (1), $PL(d)$, $PL_0$, and $d$ represent the total path loss (transmitted power − received power), the path loss at a reference distance $d_0$, and the distance between the transmitter and the receiver, respectively. Besides, $\eta$ denotes the path loss exponent, showing how the strength of the received signal decreases with the increase in distance between transmitter and receiver [28]. The value of $\eta$ relies on various parameters such as signal frequency, antenna height, and the propagation environment [27]. Generally, the value of $\eta$ lies in the range of 2–6 [29] and is higher than 4 for indoor or shadowed environments [30]. Furthermore, $\sigma$ represents the standard deviation of shadowing effects, and its value varies with the signal propagation environment and is generally higher than 4 dB [31]. $X_g$ is a Gaussian random value representing the attenuation caused by fading.
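As a rough illustration of Eq. (1), the sketch below evaluates the path loss for a given distance and inverts the model (with the shadowing term set to zero) to recover a distance estimate. The function names and the numerical values (a 40 dB reference loss at 1 m and a path loss exponent of 3) are assumptions for the example, not values taken from this study.

```python
import math


def path_loss_db(d, pl0_db, eta, d0=1.0, shadowing_db=0.0):
    """Eq. (1): PL(d) = PL0 + 10 * eta * log10(d / d0) + Xg."""
    return pl0_db + 10.0 * eta * math.log10(d / d0) + shadowing_db


def distance_from_path_loss(pl_db, pl0_db, eta, d0=1.0):
    """Invert Eq. (1) with Xg = 0 to obtain a distance estimate."""
    return d0 * 10.0 ** ((pl_db - pl0_db) / (10.0 * eta))


# Assumed example values: PL0 = 40 dB at d0 = 1 m, eta = 3.
pl = path_loss_db(d=10.0, pl0_db=40.0, eta=3.0)
print(pl)
print(distance_from_path_loss(pl, pl0_db=40.0, eta=3.0))
```

In practice the shadowing term is unknown at the receiver, so the inverted distance carries the ranging error discussed next.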
A ranging error is experienced as the result of log-normal shadowing. This ranging error follows a zero-mean Gaussian distribution. Its variance $\sigma^2$ is expressed in Eq. (2):

$$\sigma^2 = \gamma^2 \times D_{ij}^2 \tag{2}$$

where $\gamma$ represents the localisation error between the actual and measured Euclidean distance $D_{ij}$ between the $i$th node $(x_i, y_i)$ and the $j$th node $(x_j, y_j)$ and is known as Gaussian noise having mean zero and standard deviation one. We have considered the value of $\gamma$ equal to 0.1 as it is the most appropriate value used in the literature [20], [32]. Eq. (2) shows that the standard deviation of the ranging error varies linearly with the actual distance between two nodes. The real distance $D_{ij}$ can be calculated using Eq. (3):

$$D_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{3}$$
A circular disk model has been adopted to establish network connectivity: two nodes $i$ and $j$ can communicate with each other only if $D_{ij} \le R$, where $R$ is the transmission range of both the sensor nodes.

The measured distance is represented by $D'_{ij}$, and is given by the expression in Eq. (4):

$$D'_{ij} = D_{ij} + N_{ij} \tag{4}$$

where $N_{ij}$ is the ranging error between nodes $i$ and $j$.
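Eqs. (2) to (4) and the disk connectivity rule can be simulated as follows. This is a sketch under the stated model, assuming the ranging error is drawn as a zero-mean Gaussian with standard deviation γ times the true distance; the helper names are illustrative and not taken from the original simulation code.

```python
import math
import random


def true_distance(p, q):
    """Eq. (3): Euclidean distance between nodes p and q."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def measured_distance(p, q, gamma=0.1, rng=random):
    """Eq. (4): true distance plus a zero-mean Gaussian ranging error.

    Following Eq. (2), the error has standard deviation gamma * D_ij.
    """
    d = true_distance(p, q)
    return d + rng.gauss(0.0, gamma * d)


def can_communicate(p, q, tx_range):
    """Circular disk model: nodes are connected when D_ij <= R."""
    return true_distance(p, q) <= tx_range


rng = random.Random(42)  # fixed seed so the example is repeatable
print(true_distance((0.0, 0.0), (3.0, 4.0)))
print(measured_distance((0.0, 0.0), (3.0, 4.0), gamma=0.1, rng=rng))
print(can_communicate((0.0, 0.0), (3.0, 4.0), tx_range=20.0))
```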
While calculating the position of the unknown nodes, there always exists a ranging error. So, we need to evaluate the position of the unknown nodes as precisely as possible, considering this inevitable ranging error. To achieve this, we formulate an Optimisation Function (OF), which is the mean of the square of the error between the actual distance of evaluated node coordinates and the estimated distance of actual unknown node coordinates from the neighbouring anchor nodes. Let $(x_i, y_i)$ and $(x_j, y_j)$ be the positions of the $i$th unknown node and the $j$th anchor node, respectively. The OF is given in Eq. (5):

$$OF(x_i, y_i) = \frac{1}{M} \times \sum_{j=1}^{M} \left(D_{ij} - D'_{ij}\right)^2 \tag{5}$$

where $M \ge 3$, because an unknown node should have at least three anchor nodes within its transmission range to be considered localisable (trilateration rule). The $(x_i, y_i)$ corresponding to the minimum value of the OF is the evaluated position of the unknown node.
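A direct transcription of Eq. (5) is sketched below. The function name `objective` and the anchor layout are assumptions for illustration; the measured distances are generated without noise so that the true position gives an OF of zero.

```python
import math


def objective(candidate, anchors, measured):
    """Eq. (5): mean squared gap between candidate-to-anchor distances
    and the measured distances to the same anchors."""
    if len(anchors) < 3:
        raise ValueError("at least three anchors are required (trilateration rule)")
    total = 0.0
    for (ax, ay), d_measured in zip(anchors, measured):
        d_candidate = math.hypot(candidate[0] - ax, candidate[1] - ay)
        total += (d_candidate - d_measured) ** 2
    return total / len(anchors)


# Hypothetical scenario: three anchors and noise-free measurements.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_node = (3.0, 4.0)
measured = [math.hypot(true_node[0] - ax, true_node[1] - ay) for ax, ay in anchors]

print(objective(true_node, anchors, measured))   # zero at the true position
print(objective((5.0, 5.0), anchors, measured))  # positive elsewhere
```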
C. MODIFIED CS ALGORITHM FOR NODE LOCALISATION
Modified CS is a bio-inspired meta-heuristic algorithm [20] used for node localisation in WSNs. It estimates the coordinates of the unknown nodes in the network by initialising a random population of candidate solutions for every unknown node. Afterwards, it calculates the fitness value of each solution using the OF in Eq. (5). The worst of the candidate solutions are replaced by a new set of randomly allocated candidate solutions. This process continues over a predetermined number of iterations, after which the coordinates corresponding to the global best solution are selected as the coordinates of each unknown node in the network.
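The loop described above can be sketched as follows. This is a deliberately simplified stand-in and not the modified CS of [20]: it assumes Gaussian random-walk steps instead of the algorithm's own step rule, a fixed step size, and a fixed fraction `pa` of worst candidates replaced per iteration. All names and default values are illustrative.

```python
import math
import random


def objective(candidate, anchors, measured):
    """Eq. (5): mean squared gap between candidate-to-anchor and measured distances."""
    return sum(
        (math.hypot(candidate[0] - ax, candidate[1] - ay) - d) ** 2
        for (ax, ay), d in zip(anchors, measured)
    ) / len(anchors)


def simple_nest_search(anchors, measured, area=100.0, n_candidates=25,
                       n_iterations=100, step=1.0, pa=0.25, seed=0):
    """Simplified population search: local random walk plus replacement of the worst."""
    rng = random.Random(seed)

    def fitness(c):
        return objective(c, anchors, measured)

    def random_point():
        return (rng.uniform(0.0, area), rng.uniform(0.0, area))

    def clip(v):
        return min(max(v, 0.0), area)

    nests = [random_point() for _ in range(n_candidates)]
    best = min(nests, key=fitness)
    n_abandon = int(pa * n_candidates)
    for _ in range(n_iterations):
        # Random-walk move for every candidate; keep it only if fitness improves.
        for k, nest in enumerate(nests):
            trial = (clip(nest[0] + rng.gauss(0.0, step)),
                     clip(nest[1] + rng.gauss(0.0, step)))
            if fitness(trial) < fitness(nest):
                nests[k] = trial
        # Track the global best, then replace the worst candidates at random.
        nests.sort(key=fitness)
        best = min(best, nests[0], key=fitness)
        for k in range(n_candidates - n_abandon, n_candidates):
            nests[k] = random_point()
    return best


anchors = [(20.0, 30.0), (50.0, 40.0), (30.0, 70.0)]
true_node = (30.0, 40.0)
measured = [math.hypot(true_node[0] - ax, true_node[1] - ay) for ax, ay in anchors]
estimate = simple_nest_search(anchors, measured)
print(estimate, objective(estimate, anchors, measured))
```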
D. MACHINE LEARNING MODEL
Broadly, learning algorithms are divided into supervised and unsupervised learning. Further, supervised learning is classified into classification and regression learning, whereas unsupervised learning is classified into clustering and dimension reduction techniques [33].

In this article, our objective is to assess the potential of regression-based machine learning algorithms for estimating the node localisation error. The key objective of regression-based machine learning algorithms is to predict the predictand based on a mapping function. This mapping function is modelled by feeding a set of features and predictand data, known as the training data set. In doing so, we have selected the SVR algorithm. SVR is used in many applications such as image processing [34], [35], remote sensing [36], and blockchain [37]. It has superb generalisation competence along with high accuracy. Also, the computational complexity is independent of the input feature data set [38].

FIGURE 1. Predictor importance graph.
1) FEATURE IMPORTANCE
In this article, we have evaluated the feature importance using a regression ensemble approach. First, we trained a regression ensemble model. It contains the results of boosting one hundred regression trees (the number of ensemble learning cycles) using the LSBoost ensemble aggregation approach, the feature data and the predictand data. We used the regression tree as the weak learner, with a unity learning rate. After creating the ensemble, we calculated the estimate of the predictor (feature) importance by summing these estimates over all the weak learners in it. In doing so, we plotted the feature importance graph (Fig. 1). We found that, out of the four features, the node density is the most important feature, followed by the number of iterations. In contrast, the anchor ratio and the transmission range have nearly equal importance. Further, we have estimated the partial dependency of the features on the predictand (Fig. 2). In the same plot, we have also plotted the individual conditional expectation of each data point.
2) HYPER-PARAMETER OPTIMISATION
SVR is used to learn from data, indicating excellent performance in prediction and pattern recognition. It also benefits from the big data collected from onboard analysis. The hyper-parameters have a significant influence on SVR's predictive efficiency. The SVR's efficiency is determined by hyper-parameters such as $C$ and $\varepsilon$, which help in identifying the training error. If the residual is higher than the hyper-parameter $\varepsilon$, then the parameter $C$ penalises the training error. Thus, too small $C$ values lead to model under-fitting, while too large $C$ values lead to over-fitting and a higher computational cost.
In this article, we have used the grid search approach to optimise the hyper-parameters of the SVR model. In this study, we optimise the penalty factor, $C$, in the SVR model while keeping $\varepsilon$ constant. We have selected the Mean Square Error (MSE) function as the loss or objective function (Eq. (6)) for optimisation.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 \tag{6}$$
FIGURE 2. Partial Dependence Plot (PDP) and Individual Conditional Expectation (ICE) plots.
We have selected the $C$ value that corresponds to the minimum value of the objective function for all the three methods.
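The selection rule (pick the $C$ with the lowest MSE) can be sketched generically. To keep the example self-contained, a one-dimensional ridge regression through the origin stands in for the SVR, with $C$ acting as an inverse regularisation strength; this stand-in, the held-out validation split, and all names below are assumptions for illustration and do not reproduce the SVR training used in this study.

```python
def mse(y_true, y_pred):
    """Eq. (6): mean of squared differences between observed and predicted values."""
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)


def grid_search(candidates, fit, x_train, y_train, x_val, y_val):
    """Return (best_C, best_mse) over the candidate values of C."""
    best_c, best_err = None, float("inf")
    for c in candidates:
        predict = fit(x_train, y_train, c)
        err = mse(y_val, [predict(x) for x in x_val])
        if err < best_err:
            best_c, best_err = c, err
    return best_c, best_err


def fit_ridge_1d(xs, ys, c):
    """Stand-in model (not SVR): y = w * x with regularisation strength 1 / C."""
    w = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + 1.0 / c)
    return lambda x: w * x


x_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
x_val, y_val = [5.0, 6.0], [10.0, 12.0]
best_c, best_err = grid_search([0.01, 0.1, 1.0, 10.0, 100.0], fit_ridge_1d,
                               x_train, y_train, x_val, y_val)
print(best_c, best_err)
```

With a real SVR, `fit` would train the model for the given $C$ at fixed $\varepsilon$ and return its prediction function; the search loop itself is unchanged.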
3) SUPPORT VECTOR REGRESSION MODEL
SVR, initially proposed by Drucker et al., is a supervised learning technique based on the concept of Vapnik's support vectors [39], [40]. SVR aims at reducing the error by determining the hyperplane and minimising the range between the predicted and the observed values. Minimising the value of $w$ in Eq. (7) is similar to maximising the margin, as shown in Fig. 3.

$$\min \; \|w\|^2 + C \sum_{i}^{n} \left(\xi_i^{+} + \xi_i^{-}\right) \tag{7}$$

where $\sum_{i}^{n} \left(\xi_i^{+} + \xi_i^{-}\right)$ represents an empirical error. Hence, to minimise this error, Eq. (8) is used.
$$f(x) = \sum_{i}^{n} \left(\alpha_i^{*} + \alpha_i\right) K(x, x_i) + B \tag{8}$$

where $\alpha_i^{*}, \alpha_i \ge 0$ represent the Lagrange multipliers, $K(x, x_i)$ represents the kernel function and $B$ represents the bias term. In this study, we have used the polynomial kernel given by:

$$K(x, x_i) = \gamma \, (x \cdot x_i + 1)^d \tag{9}$$

where $d$ is the polynomial degree and $\gamma$ is the polynomial constant.
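For concreteness, the polynomial kernel of Eq. (9) can be evaluated as in the sketch below, reading the product of $x$ and $x_i$ as a dot product. The default values of `gamma` and `degree` are placeholders, not the settings used in this study.

```python
def polynomial_kernel(x, xi, gamma=1.0, degree=2):
    """Eq. (9): K(x, xi) = gamma * (x . xi + 1) ** degree."""
    dot = sum(a * b for a, b in zip(x, xi))
    return gamma * (dot + 1.0) ** degree


# Two hypothetical feature vectors.
print(polynomial_kernel((1.0, 2.0), (3.0, 4.0)))             # (11 + 1)^2 = 144.0
print(polynomial_kernel((1.0, 2.0), (3.0, 4.0), gamma=0.5))  # 72.0
```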
SVR offers better prediction performance than other algorithms such as Linear Regression, KNN and Elastic Net, due to the improved optimisation strategies for a broad set of variables. Moreover, it is also flexible in dealing with geometry, transmission, data generalisation and the additional functionality of the kernel [41]. This additional functionality enhances the model capacity for predictions by considering the quality of features [42].

The training samples influence the SVR model's fitting performance since the SVR algorithm is sensitive to interference in the training data. Besides, SVR is useful in resolving high-dimensional regression problems, and it functions well even if the number of features is larger than the sample size [43]. In this study, we have extracted four features, namely anchor ratio, transmission range, node density and the number of iterations, from the modified CS algorithm simulation.

FIGURE 3. Structure of support vector regression.
Feature scaling is essential for SVR because, when one feature has a greater magnitude than the others, that feature will dominate while measuring the distance. To avoid this, we have used various standardisation approaches, based on which we have proposed three methods, as shown in Fig. 4. Method I is S-SVR (Scaling SVR). In this method, we standardise the features using Eq. (10):

$$x_s = \frac{x}{\sigma} \tag{10}$$

where $x$ is the feature vector, $x_s$ is the standardised data, and $\sigma$ is the standard deviation of the feature vector. Method II is Z-SVR (Z-score SVR). In this method, we standardise the features using Eq. (11):

$$x_s = \frac{x - \bar{x}}{\sigma} \tag{11}$$

where $\bar{x}$ is the mean of the feature vector. Method III is R-SVR (Range SVR). In this method, we standardise the features using Eq. (12):

$$x_s = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{12}$$

Afterwards, we trained and tested the SVR models in a 70:30 ratio, as shown in Fig. 4. In this study, the dimension of each feature vector is 107 × 1. Hence, we have used 75 samples for training and the remaining 32 for testing.
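The three standardisation rules in Eqs. (10) to (12) and the 70:30 split can be sketched as follows. The source does not state whether the population or the sample standard deviation is used, so the population form (`statistics.pstdev`) is an assumption here, as are the function names and the example values.

```python
import statistics


def scale(values):
    """Eq. (10), method I: divide by the standard deviation."""
    sigma = statistics.pstdev(values)  # assumption: population standard deviation
    return [v / sigma for v in values]


def zscore(values):
    """Eq. (11), method II: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mean) / sigma for v in values]


def range_scale(values):
    """Eq. (12), method III: map the feature onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def split_sizes(n_samples, train_fraction=0.7):
    """Return (n_train, n_test) for a given training fraction."""
    n_train = round(train_fraction * n_samples)
    return n_train, n_samples - n_train


transmission_range = [10.0, 20.0, 30.0, 40.0]  # hypothetical feature values
print(scale(transmission_range))
print(zscore(transmission_range))
print(range_scale(transmission_range))
print(split_sizes(107))  # (75, 32), matching the split used in this study
```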
IV. SIMULATION EXPERIMENT
In this section, we have discussed the simulation environment
of the modified CS algorithm and the SVR model.
A. ALE SIMULATION USING MODIFIED CS ALGORITHM
For the calculation of ALE, we set up a simulation environment of 100 × 100 m², and we vary parameters such as the node density, anchor ratio and transmission range of each node to calculate ALE for different network configurations. Modified CS has tuning parameters such as the step size $\alpha$ and the mutation probability $P_a$, which lie in the ranges 0.9 to 1.0 and 0.05 to 0.25, respectively. The number of candidate solutions is fixed at 25. The maximum number of iterations allowed to localise each unknown node is set to 100.
B. SVR SIMULATION FOR ALE PREDICTION
For simulating the SVR model, we performed the hyper-parameter tuning through the grid search algorithm. In doing so, we fixed one of the hyper-parameters (i.e., $\varepsilon$ at 0.01) and applied the grid search algorithm to find the value of the other hyper-parameter. We created a 100 × 100 grid for the penalty factor, $C$. Each grid cell represents a specific value of $C$. The grid search algorithm finds the optimal grid cell, which corresponds to the minimum value of the MSE. The range of optimal $C$ for all the three methods, along with the other simulation parameter values, is given in Table 2.
FIGURE 4. Flowchart of the methodology.
TABLE 1. Simulation parameters for Modified CS algorithm.
TABLE 2. Simulation parameters for SVR model.
V. RESULTS
In this section, we present the results of methods I, II and III for ALE prediction in the respective subsections. We have plotted a linear regression line between the predicted ALE and the simulated ALE for comparison.
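The two metrics reported in the following subsections can be computed as in this sketch. It assumes that R denotes the Pearson correlation coefficient between simulated and predicted ALE, which the text does not state explicitly; the sample values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up ALE values in metres, for illustration only.
simulated_ale = [0.40, 0.55, 0.70, 0.90, 1.20]
predicted_ale = [0.45, 0.50, 0.75, 0.95, 1.10]
print(pearson_r(simulated_ale, predicted_ale))
print(rmse(simulated_ale, predicted_ale))
```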
A. PERFORMANCE OF THE METHOD I
We have compared the predicted ALE results obtained by method I with the simulated results of the modified CS algorithm. We found that the predicted results accorded well with the simulated results and gathered along the straight regression line with mild scattering (Fig. 5). The shaded grey region corresponds to the 95% Confidence Interval (CI) of the regression line and suggests that the predicted result has a strong positive correlation with R = 0.80 and RMSE = 0.23 m.

FIGURE 5. Prediction results for ALE using method I.
B. PERFORMANCE OF THE METHOD II
Once we calculated the predicted ALE through method II, we evaluated its performance against the simulated results of the modified CS algorithm. In doing so, we found a good agreement between the two with R = 0.81 and RMSE = 0.20 m (Fig. 6). However, some observed values lie outside the CI of the regression line due to the overestimation of the ALE value by the SVR model. The overestimation probably occurs due to a positive bias. This type of error is a systematic error, which is mainly due to the model or approach used.
FIGURE 6. Prediction results for ALE using method II.
C. PERFORMANCE OF THE METHOD III
We have compared the predicted ALE of method III with the simulated ALE obtained through the modified CS algorithm. In this case also, we found a strong correlation between the variables (Fig. 7). Here, we found a correlation of R = 0.82 with RMSE = 0.15 m.

FIGURE 7. Prediction results for ALE using method III.
FIGURE 8. Comparison of the computation time for method I, II, III with
different scenarios of modified CS algorithm.
VI. DISCUSSION
In this section, we first discuss the performance of all the three methods in terms of computational efficiency. In doing so, we have calculated the computational time required to predict or calculate the ALE. Further, to ensure a fair comparison of the proposed methods with the existing modified CS, we have compared the obtained results with the computational time of the modified CS simulations for three different configurations, i.e., the computational times for node densities of 100, 200 and 300 have been plotted for a transmission range of 20 m and 20 anchor nodes in a 100 × 100 m² area (Fig. 8). In this figure, the time axis is in log scale. The dotted line shows the computational time required by all the three methods when they are run in a single script.

TABLE 3. Comparison of the proposed methods with the benchmark.
On comparing, we found that the time taken by all the three methods is significantly lower than the time taken by the modified CS algorithm. Further, method III took the least time, followed by method II and method I, respectively.
Various other studies have been carried out to improve the localisation accuracy based on the Adaptive Neural Fuzzy Inference System (ANFIS) [44], with a Mean Absolute Error (MAE) of 0.283 m, and the backpropagation-based artificial neural network (BP-ANN) model [45], with a mean localisation error of 0.921 m. Both these studies have reported a high localisation accuracy. In this study, we have reported a minimal RMSE of 0.15 m. However, to ensure a fair evaluation of the proposed methods, we need to compare the results of SVR with other regression-based machine learning models. We have selected Gaussian Process Regression (GPR) for comparison because it is a widely used, robust and accurate model [46], [47]. In doing so, we have compared the obtained results with the corresponding variants of GPR. The three corresponding GPR variants are Scaling GPR (S-GPR), Z-score GPR (Z-GPR) and Range GPR (R-GPR), as illustrated in Table 3. We have used R, RMSE and computational time for comparing the results of all the methods. In doing so, we found that method III is the most effective among all the methods.
Although the proposed methods perform better than the corresponding variants of GPR, the SVR-based methods are susceptible to under-performance when dealing with noisy data. In such scenarios, GPR is more likely to perform better [48]. Also, the performance of the proposed methods depends on the choice of the kernel and features.
VII. CONCLUSION
In this article, we presented and investigated three SVR-based machine learning models for ALE prediction. These methods are defined based on the standardisation method used. In methods I, II and III, we have used the scaling, Z-score and range standardisation methods, respectively. Afterwards, we trained the SVR model with the polynomial kernel using the standardised data and evaluated its performance using the correlation coefficient and RMSE metrics. In doing so, we found that range standardisation (Eq. (12)) of the features (i.e., method III) results in the lowest RMSE in ALE prediction. Also, the correlation coefficient is highest in method III.

Further, we have also compared the performance of all the three models in terms of the computation time requirement. Again, method III performs better than the other two methods, as it requires less time. Hence, method III can be used for ALE prediction during the network set-up process to cut down the time requirements.
ACKNOWLEDGMENT
The authors would like to acknowledge IISER, Bhopal;
Gautam Buddha University, Greater Noida; IIT Kharagpur;
Fu Jen Catholic University, Taiwan; and Asia University,
Taiwan, for providing institutional support. They would like to thank the editor and all the anonymous reviewers for providing helpful comments and suggestions.
CODE AVAILABILITY
The code for this work will be made available on a reasonable
request to the corresponding authors.
REFERENCES
[1] I. Khan, F. Belqasmi, R. Glitho, N. Crespi, M. Morrow, and P. Polakos, ‘‘Wireless sensor network virtualization: A survey,’’ IEEE Commun. Surveys Tuts., vol. 18, no. 1, pp. 553–576, 1st Quart., 2016.
[2] M. Jouhari, K. Ibrahimi, H. Tembine, and J. Ben-Othman, ‘‘Underwater wireless sensor networks: A survey on enabling technologies, localization protocols, and Internet of underwater things,’’ IEEE Access, vol. 7, pp. 96879–96899, 2019.
[3] H. Xiong and M. L. Sichitiu, ‘‘A lightweight localization solution for small, low resources WSNs,’’ J. Sensor Actuator Netw., vol. 8, no. 2, p. 26, May 2019.
[4] J. Zheng and A. Dehghani, ‘‘Range-free localization in wireless sensor networks with neural network ensembles,’’ J. Sensor Actuator Netw., vol. 1, no. 3, pp. 254–271, Nov. 2012.
[5] A. Singh, S. Sharma, J. Singh, and R. Kumar, ‘‘Mathematical modelling for reducing the sensing of redundant information in WSNs based on biologically inspired techniques,’’ J. Intell. Fuzzy Syst., vol. 37, no. 5, pp. 6829–6839, Nov. 2019.
[6] J. Amutha, S. Sharma, and J. Nagar, ‘‘WSN strategies based on sensors, deployment, sensing models, coverage and energy efficiency: Review, approaches and open issues,’’ Wireless Pers. Commun., vol. 111, no. 2, pp. 1089–1115, Mar. 2020.
[7] M. Khelifi, S. Moussaoui, S. Silmi, and I. Benyahia, ‘‘Localisation algorithms for wireless sensor networks: A review,’’ Int. J. Sensor Netw., vol. 19, no. 2, pp. 114–129, 2015.
[8] P. Tarrío, A. M. Bernardos, and J. R. Casar, ‘‘Weighted least squares techniques for improved received signal strength based localization,’’ Sensors, vol. 11, no. 9, pp. 8569–8592, Sep. 2011.
[9] K. Whitehouse, ‘‘The design of calamari: An ad-hoc localization system for sensor networks,’’ Ph.D. dissertation, Univ. California Berkeley, Berkeley, CA, USA, 2002.
[10] C.-Y. Wen and F.-K. Chan, ‘‘Adaptive AOA-aided TOA self-positioning for mobile wireless sensor networks,’’ Sensors, vol. 10, no. 11, pp. 9742–9770, Nov. 2010.
[11] D. Niculescu and B. Nath, ‘‘Ad hoc positioning system (APS),’’ in Proc. IEEE Global Telecommun. Conf., vol. 5, Nov. 2001, pp. 2926–2931.
[12] N. Bulusu, J. Heidemann, and D. Estrin, ‘‘GPS-less low-cost outdoor localization for very small devices,’’ IEEE Pers. Commun., vol. 7, no. 5, pp. 28–34, Oct. 2000.
[13] A. E. Waadt, C. Kocks, S. Wang, G. H. Bruck, and P. Jung, ‘‘Maximum likelihood localization estimation based on received signal strength,’’ in Proc. 3rd Int. Symp. Appl. Sci. Biomed. Commun. Technol. (ISABEL), Nov. 2010, pp. 1–5.
VOLUME 8, 2020 208261
A. Singh et al.: Machine Learning Approach to Predict the ALE With Applications to WSNs
[14] A. Coluccia and F. Ricciato, ‘‘RSS-based localization via Bayesian ranging
and iterative least squares positioning,’’IEEE Commun. Lett., vol. 18, no. 5,
pp. 873–876, May 2014.
[15] A. Coluccia and A. Fascista, ‘‘Hybrid TOA/RSS range-based localization
with self-calibration in asynchronous wireless networks,’J. Sensor Actu-
ator Netw., vol. 8, no. 2, p. 31, May 2019.
[16] V. R. Kulkarni, V. Desai, and R. V. Kulkarni, ‘‘A comparative investigation
of deterministic and Metaheuristic algorithms for node localization in
wireless sensor networks,’Wireless Netw., vol. 25, no. 5, pp. 2789–2803,
Jul. 2019.
[17] A. Gopakumar and L. Jacob, ‘‘Localization in wireless sensor networks
using particle swarm optimization,’’ in Proc. IET Conf. Wireless, Mobile
Multimedia Netw., 2008, pp. 227–230.
[18] J. Kennedy and R. Eberhart, ‘‘Particle swarm optimization,’’ in Proc. IEEE
ICNN, vol. 4, Nov./Dec. 1995, pp. 1942–1948.
[19] S. Goyal and M. S. Patterh, ‘‘Wireless sensor network localization based
on cuckoo search algorithm,’Wireless Pers. Commun., vol. 79, no. 1,
pp. 223–234, Nov. 2014.
[20] J. Cheng and L. Xia, ‘‘An effective cuckoo search algorithm for node
localization in wireless sensor network,’Sensors, vol. 16, no. 9, p. 1390,
Aug. 2016.
[21] M. R. Morelande, B. Moran, and M. Brazil, ‘‘Bayesian node localisation in
wireless sensor networks,’’ in Proc. IEEE Int. Conf. Acoust., Speech Signal
Process., Mar. 2008, pp. 2545–2548.
[22] C. Musso, N. Oudjane, and F. Le Gland, ‘‘Improving regularised par-
ticle filters,’’ in Sequential Monte Carlo Methods in Practice. Cham,
Switzerland: Springer, 2001, pp. 247–271.
[23] S. Gharghan, R. Nordin, and M. Ismail, ‘‘A wireless sensor network with
soft computing localization techniques for track cycling applications,’
Sensors, vol. 16, no. 8, p. 1043, Aug. 2016.
[24] H. Ahmadi and R. Bouallegue, ‘‘Exploiting machine learning strategies
and RSSI for localization in wireless sensor networks: A survey,’’ in Proc.
13th Int. Wireless Commun. Mobile Comput. Conf. (IWCMC), Jun. 2017,
pp. 1150–1154.
[25] M. A. Bhatti, R. Riaz, S. S. Rizvi, S. Shokat, F. Riaz, and S. J. Kwon,
‘‘Outlier detection in indoor localization and Internet of Things (IoT)
using machine learning,’J. Commun. Netw., vol. 22, no. 3, pp. 236–243,
Jun. 2020.
[26] L. Wang, M. J. Er, and S. Zhang, ‘‘A kernel extreme learning machines
algorithm for node localization in wireless sensor networks,’IEEE Com-
mun. Lett., vol. 24, no. 7, pp. 1433–1436, Jul. 2020.
[27] S. Kurt and B. Tavli, ‘‘Path-loss modeling for wireless sensor networks: A
review of models and comparative evaluations.,’’ IEEE Antennas Propag.
Mag., vol. 59, no. 1, pp. 18–37, Feb. 2017.
[28] G. Mao, B. D. O. Anderson, and B. Fidan, ‘‘Path loss exponent estimation
for wireless sensor network localization,’Comput. Netw., vol. 51, no. 10,
pp. 2467–2483, Jul. 2007.
[29] T. Yadav and P. P. Bhattacharya, ‘‘Signal strength and system operating
margin estimation for vehicular ad-hoc networks in Rayleigh fading envi-
ronment,’Int. J. Comput. Sci. Mobile Comput., vol. 2, no. 3, pp. 41–45,
2013.
[30] A. Palaios, Y. Labou, and P. Mahonen, ‘‘A study on the forest radio
propagation characteristics in European mixed forest environment,’’ in
Proc. IEEE Mil. Commun. Conf., Oct. 2014, pp. 376–381.
[31] Y.-R. Tsai, ‘‘Sensing coverage for randomly distributed wireless sensor
networks in shadowed environments,’’ IEEE Trans. Veh. Technol., vol. 57,
no. 1, pp. 556–564, Jan. 2008.
[32] M. Aziz, M.-H. Tayarani-N, and M. R. Meybodi, ‘‘A two-objective
memetic approach for the node localization problem in wireless sensor net-
works,’Genetic Program. Evolvable Mach., vol. 17, no. 4, pp. 321–358,
Dec. 2016.
[33] D. Praveen Kumar, T. Amgoth, and C. S. R. Annavarapu, ‘‘Machine
learning algorithms for wireless sensor networks: A survey,’Inf. Fusion,
vol. 49, pp. 1–25, Sep. 2019.
[34] J. Jebadurai and J. D. Peter, ‘‘SK-SVR: Sigmoid kernel support vector
regression based in-scale single image super-resolution,’’Pattern Recognit.
Lett., vol. 94, pp. 144–153, Jul. 2017.
[35] K. S. Ni and T. Q. Nguyen, ‘‘Image superresolution using support vector
regression,’IEEE Trans. Image Process., vol. 16, no. 6, pp. 1596–1610,
Jun. 2007.
[36] X. Xiao, T. Zhang, X. Zhong, W. Shao, and X. Li, ‘‘Support vector
regression snow-depth retrieval algorithm using passive microwave remote
sensing data,’Remote Sens. Environ., vol. 210, pp. 48–64, Jun. 2018.
[37] Y. Peng, P. H. M. Albuquerque, J. M. Camboim de Sá, A. J. A. Padula, and
M. R. Montenegro, ‘‘The best of two worlds: Forecasting high frequency
volatility for cryptocurrencies and traditional currencies with support vec-
tor regression,’Expert Syst. Appl., vol. 97, pp. 177–192, May 2018.
[38] M. Awad and R. Khanna, ‘‘Support vector regression,’’ in Efficient Learn-
ing Machines. Cham, Switzerland: Springer, 2015, pp. 67–80.
[39] C. A. Callejas Pastor, I.-Y. Jung, S. Seo, S. B. Kwon, Y. Ku, and J. Choi,
‘‘Two-dimensional image-based screening tool for infants with positional
cranial deformities: A machine learning approach,’Diagnostics, vol. 10,
no. 7, p. 495, Jul. 2020.
[40] H. Drucker, C. J. Burges, L. Kaufman, A. J. Smola, and V. Vapnik,
‘‘Support vector regression machines,’’ in Proc. Adv. Neural Inf. Process.
Syst., 1997, pp. 155–161.
[41] B. Üstün, W. J. Melssen, and L. M. C. Buydens, ‘‘Facilitating the applica-
tion of support vector regression by using a universal pearson VII function
based kernel,’Chemometric Intell. Lab. Syst., vol. 81, no. 1, pp. 29–40,
Mar. 2006.
[42] F. Abbas, H. Afzaal, A. A. Farooque, and S. Tang, ‘‘Crop yield prediction
through proximal sensing and machine learning algorithms,’Agronomy,
vol. 10, no. 7, p. 1046, Jul. 2020.
[43] S. Weng, S. Yu, B. Guo, P. Tang, and D. Liang, ‘‘Non-destructive detection
of strawberry quality using multi-features of hyperspectral imaging and
multivariate methods,’’ Sensors, vol. 20, no. 11, p. 3074, May 2020.
[44] Z. Munadhil, S. K. Gharghan, A. H. Mutlag, A. Al-Naji, and
J. Chahl, ‘‘Neural network-based Alzheimer’s patient localization for
wireless sensor network in an indoor environment,’’ IEEE Access, vol. 8,
pp. 150527–150538, 2020.
[45] S. K. Gharghan, R. Nordin, A. M. Jawad, H. M. Jawad, and M. Ismail,
‘‘Adaptive neural fuzzy inference system for accurate localization of wire-
less sensor network in outdoor and indoor cycling applications,’IEEE
Access, vol. 6, pp. 38475–38489, 2018.
[46] T. Østergård, R. L. Jensen, and S. E. Maagaard, ‘‘A comparison of six
metamodeling techniques applied to building performance simulations,’
Appl. Energy, vol. 211, pp. 89–103, Feb. 2018.
[47] A. Kamath, R. A. Vargas-Hernández, R. V. Krems, T. Carrington, and
S. Manzhos, ‘‘Neural networks vs Gaussian process regression for repre-
senting potential energy surfaces: A comparative study of fit quality and
vibrational spectrum accuracy,’’ J. Chem. Phys., vol. 148, no. 24, Jun. 2018,
Art. no. 241702.
[48] A. McHutchon and C. E. Rasmussen, ‘‘Gaussian process training with
input noise,’’ in Proc. Adv. Neural Inf. Process. Syst., 2011, pp. 1341–1349.
ABHILASH SINGH (Member, IEEE) received the
integrated dual (B.Tech. and M.Tech.) degree in
electronics and communication engineering with
specialization in wireless communication and net-
works from Gautam Buddha University, Greater
Noida, India, in 2017. He is currently pursuing the
Ph.D. degree in the field of remote sensing with the
Indian Institute of Science Education and Research
at Bhopal, Bhopal, India.
Since 2018, he has been working on the
NASA-ISRO Synthetic Aperture Radar (NISAR) Project at IISER Bhopal.
He has been publishing research papers in peer-reviewed conferences
and internationally reputed journals. His current research interests include
microwave remote sensing, machine learning, bio-inspired algorithms, wire-
less sensor networks, and wireless communication.
Mr. Singh is a member of the European Geophysical Union (EGU), ISPRS,
and the Indian Radio Science Society (InRaSS). He was a recipient of the
Gold Medal Awards from the University for holding the first rank
throughout his UG and PG studies. He received the prestigious
“DST-INSPIRE” Fellowship from the Department of Science and
Technology (DST), Ministry of Science and Technology, India, to carry
out his Ph.D. degree. He also received the DAAD Fellowship to attend
a Summer School, in 2019.
VAIBHAV KOTIYAL was born in New Delhi,
India, in 1998. He received the integrated dual
(B.Tech. and M.Tech.) degree in electronics and
communication engineering with specialization
in wireless communication and networks from
Gautam Buddha University, Greater Noida, India,
in 2020.
He ranked third in the University in the course.
He is currently working as a Junior Research Fel-
low (JRF) with the Department of Industrial and
Management Engineering, IIT Kanpur, India. During his M.Tech. degree,
his research focused on node localization in wireless sensor networks
(WSNs). He trained with the Airports Authority of India (AAI) during his
summer training period, learning about the equipment used by the
organization to provide navigation to aircraft across Indian airspace.
His current research interests include machine vision, the IoT, and
machine learning applications to wireless sensor networks.
SANDEEP SHARMA received the B.Tech. degree
in electronics engineering from RGPV, Bhopal,
India, in 1997, the M.Tech. degree in digital com-
munication from Devi Ahilya University, Indore,
India, in 2005, and the Ph.D. degree in electron-
ics and communication engineering from Gautam
Buddha University, Greater Noida, India, in 2016.
Since 2010, he has been working as a Faculty
Member with the Electronics and Communication
Engineering Department, School of ICT, Gautam
Buddha University. He has published 22 research articles in reputed
international journals and more than 41 papers in international
conferences. His research interests include wireless sensor networks,
wireless network security, physical layer authentication, intrusion detection in
wireless networks, cross-layer design, and machine learning applications in
WSNs.
Mr. Sharma was a recipient of the Best Conference Paper award at the
international conference ICCCS, in 2016, and the Young Scientist Award
in 2019 for his research work. He is an Active Reviewer of IET Communications,
IEEE WIRELESS COMMUNICATIONS LETTERS, Journal of Information Technology
(Springer), Personal and Ubiquitous Computing (Springer), Multimedia
Tools and Applications (Springer), International Journal of Computer Appli-
cations in Technology (Inderscience), International Journal of Communica-
tion Systems (Wiley), Journal of The Institution of Engineers (India): Series
B, and the Journal of Intelligent and Fuzzy Systems.
JAIPRAKASH NAGAR (Member, IEEE) was born
in Nagla Vasdev, Mathura, Uttar Pradesh, India,
in 1991. He received the integrated B.Tech. (elec-
tronics and communication engineering) and
M.Tech. (wireless communication and networks)
degree from Gautam Buddha University, Greater
Noida, India, in 2015. He is currently pursu-
ing the Ph.D. degree with the Subir Chowdhury
School of Quality and Reliability, IIT Kharagpur,
India.
He has published six research articles in reputed SCI/Scopus-indexed
journals and international conferences (IEEE/Springer/Taylor and Francis).
His current research interests include analytical modeling of wireless
multihop networks, the Internet of Things (IoT), machine learning
techniques for the IoT, and blockchain implementation for real-life
applications.
CHENG-CHI LEE (Member, IEEE) received
the Ph.D. degree in computer science from
National Chung Hsing University (NCHU),
Taiwan, in 2007. He is currently a Distinguished
Professor with the Department of Library and
Information Science, Fu Jen Catholic University.
His current research interests include data security,
cryptography, network security, mobile communications and computing,
and wireless communications. He has published more than 200 articles
in the above research fields in international journals. He is a member of
the Chinese Cryptology and Information Security Association (CCISA),
the Library Association of The Republic of China, and the ROC Phi Tau
Phi Scholastic Honor Society. He has also served as a Reviewer for many
SCI-indexed journals, other journals, and conferences. He is also an
Editorial Board Member of some journals.
VOLUME 8, 2020 208263
... In comparison to real data, acquiring synthetic data is efficient and cost-effective. Due to this, the use of synthetic datasets to train machine learning models is increased in the past lustrum 21,[26][27][28][29] . ...
... Relative predictor importance. In machine learning, the choice of input predictors has a substantial control on its performance 28 . Predictor importance analysis is not restricted to any particular representations, tech- www.nature.com/scientificreports/ ...
Article
Full-text available
Momentous increase in the popularity of explainable machine learning models coupled with the dramatic increase in the use of synthetic data facilitates us to develop a cost-efficient machine learning model for fast intrusion detection and prevention at frontier areas using Wireless Sensor Networks (WSNs). The performance of any explainable machine learning model is driven by its hyperparameters. Several approaches have been developed and implemented successfully for optimising or tuning these hyperparameters for skillful predictions. However, the major drawback of these techniques, including the manual selection of the optimal hyperparameters, is that they depend highly on the problem and demand application-specific expertise. In this paper, we introduced Automated Machine Learning (AutoML) model to automatically select the machine learning model (among support vector regression, Gaussian process regression, binary decision tree, bagging ensemble learning, boosting ensemble learning, kernel regression, and linear regression model) and to automate the hyperparameters optimisation for accurate prediction of numbers of k-barriers for fast intrusion detection and prevention using Bayesian optimisation. To do so, we extracted four synthetic predictors, namely, area of the region, sensing range of the sensor, transmission range of the sensor, and the number of sensors using Monte Carlo simulation. We used 80% of the datasets to train the models and the remaining 20% for testing the performance of the trained model. We found that the Gaussian process regression performs prodigiously and outperforms all the other considered explainable machine learning models with correlation coefficient (R = 1), root mean square error (RMSE = 0.007), and bias = − 0.006. Further, we also tested the AutoML performance on a publicly available intrusion dataset, and we observed a similar performance. 
This study will help the researchers accurately predict the required number of k-barriers for fast intrusion detection and prevention.
... This study used the area of the RoI, sensing range, transmission range, and the number of sensors as the potential features and k-barriers as the predictand. In machine learning, the selection of input features significantly affects its performance (Singh et al., 2020). Hence, before training the machine learning model, ensemble aggregation method (i.e., r = 100), each with a learning rate of one (i.e., α = 1), and the classical decision tree (i.e., decision stumps) has been considered as a weak learner. ...
Preprint
Full-text available
Wireless Sensor Networks (WSNs) is a promising technology with enormous applications in almost every walk of life. One of the crucial applications of WSNs is intrusion detection and surveillance at the border areas and in the defense establishments. The border areas are stretched in hundreds to thousands of miles, hence, it is not possible to patrol the entire border region. As a result, an enemy may enter from any point absence of surveillance and cause the loss of lives or destroy the military establishments. WSNs can be a feasible solution for the problem of intrusion detection and surveillance at the border areas. Detection of an enemy at the border areas and nearby critical areas such as military cantonments is a time-sensitive task as a delay of few seconds may have disastrous consequences. Therefore, it becomes imperative to design systems that are able to identify and detect the enemy as soon as it comes in the range of the deployed system. In this paper, we have proposed a deep learning architecture based on a fully connected feed-forward Artificial Neural Network (ANN) for the accurate prediction of the number of k-barriers for fast intrusion detection and prevention. We have trained and evaluated the feed-forward ANN model using four potential features, namely area of the circular region, sensing range of sensors, the transmission range of sensors, and the number of sensor for Gaussian and uniform sensor distribution. These features are extracted through Monte Carlo simulation. In doing so, we found that the model accurately predicts the number of k-barriers for both Gaussian and uniform sensor distribution with correlation coefficient (R = 0.78) and Root Mean Square Error (RMSE = 41.15) for the former and R = 0.79 and RMSE = 48.36 for the latter. Further, the proposed approach outperforms the other benchmark algorithms in terms of accuracy and computational time complexity.
... This study used the area of the RoI, sensing range, transmission range, and the number of sensors as the potential features andbarriers as the predictand. In machine learning, the selection of input features significantly affects its performance (Singh, Kotiyal, Sharma, Nagar, & Lee, 2020). Hence, before training the machine learning model, we have evaluated the relative importance of each selected feature on the predictand. ...
Article
Full-text available
Wireless Sensor Networks (WSNs) is a promising technology with enormous applications in almost every walk of life. One of the crucial applications of WSNs is intrusion detection and surveillance at the border areas and in the defense establishments. The border areas are stretched in hundreds to thousands of miles, hence, it is not possible to patrol the entire border region. As a result, an enemy may enter from any point absence of surveillance and cause the loss of lives or destroy the military establishments. WSNs can be a feasible solution for the problem of intrusion detection and surveillance at the border areas. Detection of an enemy at the border areas and nearby critical areas such as military cantonments is a time-sensitive task as a delay of few seconds may have disastrous consequences. Therefore, it becomes imperative to design systems that are able to identify and detect the enemy as soon as it comes in the range of the deployed system. In this paper, we have proposed a deep learning architecture based on a fully connected feed-forward Artificial Neural Network (ANN) for the accurate prediction of the number of k-barriers for fast intrusion detection and prevention. We have trained and evaluated the feed-forward ANN model using four potential features, namely area of the circular region, sensing range of sensors, the transmission range of sensors, and the number of sensor for Gaussian and uniform sensor distribution. These features are extracted through Monte Carlo simulation. In doing so, we found that the model accurately predicts the number of k-barriers for both Gaussian and uniform sensor distribution with correlation coefficient (R = 0.78) and Root Mean Square Error (RMSE = 41.15) for the former and R = 0.79 and RMSE = 48.36 for the latter. Further, the proposed approach outperforms the other benchmark algorithms in terms of accuracy and computational time complexity.
... Singh et al. in [26] proposed a method that can predict the average localization accuracy in wireless sensor networks (WSNs). They used a support vector machine (SVM)-based method and called their opted process as support vector regression (SVR), which they further divided into three subcategories. ...
Article
Full-text available
This study aims to realize Sustainable Development Goals (SDGs), i.e., SDG 9: Industry Innovation and Infrastructure and SDG 14: Life below Water, through the improvement of localization estimation accuracy in magneto-inductive underwater wireless sensor networks (MI-UWSNs).The accurate localization of sensor nodes in MI communication can effectively be utilized for industrial IoT applications, e.g., underwater gas and oil pipeline monitoring, and in other important underwater IoT applications, e.g., smart monitoring of sea animals, etc. The most-feasible technology for medium- and short-range communication in IIoT-based UWSNs is MI communication. To improve underwater communication, this paper presents a machine learning-based prediction of localization estimation accuracy of randomly deployed sensor Rx nodes through anchor Tx nodes in the MI-UWSNs. For the training of ML models, extensive simulations have been performed to create two separate datasets for the two configurations of excitation current provided to the Tri-directional (TD) coils, i.e., configuration1-case1_configuration2-case1 (c1c1_c2c1) and configuration1-case2_configuration2-case2 (c1c2_c2c2). Two ML models have been created for each case. The accuracies of both models lie between 95% and 97%. The prediction results have been validated by both the test dataset and verified simulation results. The other important contribution of this paper is the development of a novel assembling technique of a MI-TD coil to achieve an approximate omnidirectional magnetic flux around the communicating coils, which, in turn, will improve the localization accuracy of the Rx nodes in IIoT-based MI-UWSNs.
... One of the several approaches such as distance mapping (Tan et al., 2020), distance vectors (Kanwar & Kumar, 2020), Amri et al. 2019 regression analysis (Madhumathi & Suresh, 2020;Singh et al., 2020) clustering has widely been implemented to overcome the node localization issues (Rani & Jayakumar, 2017;El Khediri et al., 2020). Distance mapping is based on distance from the transmitters present near neighbour nodes and distance vector is based on created vector matrix comprising of various parameters in addition to the distance such as residual energy, energy consumption, drop rate, transmission rate, etc. Further, regression analysis finds the best node based on transmitter node requirements among the neighbour nodes whereas clustering mechanism separates the nodes based on the similarity among the neighbour nodes and localize only appropriate nodes. ...
Article
Full-text available
Among multi‐hop technology, wireless sensor network (WSN) has been extensively investigated owing to its potential application in vivid fields. However, a key issue probing WSN is node location that is also the major area of interest in the present paper. The paper takes advantage of cuckoo search (CS) as the swarm intelligence technique used to address the issues of identification of malicious or unknown nodes within the network. The distance vector (DV)‐hop is used to determine the distance between the anchor sensor node and the unknown or the node with compromised nature. Then, artificial neural network architecture is used to distinguish the nodes based on the characteristics. This is followed by the evaluation of the proposed scheme to offer reliable data transmission using CS optimized data aggregation scheme. The simulation analysis over 1000 deployed nodes shows that CS significantly decreases the localization error to 0.494 and localization time to 0.058 s along with 15%–20% improvement in the throughput and packet delivery ratio. This shows that the proposed CS optimized architecture is successful in identifying the position of unknown nodes as well as compromised nodes that significantly improved the reliability of the data transmission.
... This is the problem where we can exploit machine learning algorithms to validate the proposed analytical models. Recently, several research articles have been published in which authors have applied machine learning algorithms to solve various WSN related issues such as sensor localisation [17,18]. ...
Article
Full-text available
Wireless Sensor Networks (WSNs) is one of the most widely employed technology because it has numerous applications in almost every walk of life. The analytical results available for large-scale WSNs cannot be utilised to estimate the performance of WSNs deployed in a finite region due to Boundary Effects (BEs). In addition, wireless channel characteristics are affected by diverse environmental phenomena such as the presence of impediments, interference, reflection, and refraction, etc. Therefore, we render an analytical model by considering BEs in the shadowed environments to estimate the κ-coverage metric of a WSN installed in a circular region (CR). Validation of the analytical models is a time-consuming and tedious task and requires hours. To overcome this problem, in this study, we proposed a framework based on feed-forward Artificial Neural Network (ANN) to map the κ-coverage probability using nodes, sensing range, the standard deviation of shadowing denoted by sigma, and required κ as features. These features were extracted through Monte Carlo simulations. We estimated the feature importance and performed the feature sensitivity analysis before training the ANN model. We trained two feed-forward ANN models for with and without BEs. We found sensing range is the most important feature in predicting the κ-coverage probability. Further, the proposed feed-forward framework performs equally well for both cases, with correlation coefficient (R) = 0.98 and Root Mean Square Error (RMSE) = 0.07. Furthermore, it also outperforms the results obtained through the Adaptive Neuro-Fuzzy Inference System (ANFIS).
... SVR is one of the support vector machine's (SVM) forms. This method's separation of input data is based on different lines or surfaces (kernels), as shown in Figure 2A (Singh et al., 2020), which are used for training the model for the following predictions. The structure of SVMs is demonstrated in Figure 2B ( Buyukyildiz et al., 2013). ...
Article
Full-text available
Researchers' concentration has been on hybrid systems that can fulfill economic and environmental goals in recent years. In this study, first, the prediction of CO 2 emission and electricity consumption of Saudi Arabia by 2040 is made by employing multi-layer perceptron (MLP) and support vector regression (SVR) methods to see the rate of CO 2 emission and electricity consumption. In this regard, the most important parameters such as gross domestic product (GDP), population, oil consumption, natural gas consumption, and renewable consumption are considered. Estimating CO 2 emission by MLP and electricity consumption by SVR showed 815 Mt/year and 475 TWh/ year, respectively, where R2 for MLP and SVR was 0.99. Prediction results showed a 31% and 39% increase in CO 2 emission and electricity consumption by 2040 compared to 2020. Second, the optimum combination of components for supplying demand load and desalination load in residential usages are found where 0% capacity shortage, 20-60$/t penalty for CO 2 emission, sell back to the grid, and both fixed and random grid outages are considered. Load demands were considered under two winter and non-winter times so that 4,266, 2,346, and 3,300 kWh/day for Aseer, Tabuk, and the Eastern Region were shown, respectively. Results show that 0.12, 0.11, and 0.12 (kW (PV))/(kWh/day(load)) and 0.1, 0.08, and 0.08 (kW(Bat))/(kWh/day(load)) are required under the assumption of this study for Aseer, Tabuk, and the Eastern Region, respectively. Also, COEs for the proposed systems are 0.0934, 0.0915, and 0.0910 $/kWh for Aseer, Tabuk, and the Eastern Region, respectively. Also, it was found that renewable fractions (RFs) between 46% and 48% for all of the case studies could have rational COE and NPCs and fulfill the increasing rate of CO 2 emission and electricity consumption. Finally, sensitivity analysis on grid CO 2 emission and its penalty, load and solar Global Horizontal Irradiance (GHI), PV, and battery prices
... The node deployment plays a critical role in the performance of the system. Node localized at the right place and with minimum localization error [22] substantially boosts the system's performance. If a node is not appropriately placed, potential intruders can enter the system and alleviate the attacks in the sensor network deployed system. ...
Article
Full-text available
Wireless communication networks have much data to sense, process, and transmit. It tends to develop a security mechanism to care for these needs for such modern-day systems. An intrusion detection system (IDS) is a solution that has recently gained the researcher's attention with the application of deep learning techniques in IDS. In this paper, we propose an IDS model that uses a deep learning algorithm, conditional generative adversarial network (CGAN), enabling unsupervised learning in the model and adding an eXtreme gradient boosting (XGBoost) classifier for faster comparison and visualization of results. The proposed method can reduce the need to deploy extra sensors to generate fake data to fool the intruder 1.2-2.6%, as the proposed system generates this fake data. The parameters were selected to give optimal results to our model without significant alterations and complications. The model learns from its dataset samples with the multiple-layer network for a refined training process. We aimed that the proposed model could improve the accuracy and thus, decrease the false detection rate and obtain good precision in the cases of both the datasets, NSL-KDD and the CICIDS2017, which can be used as a detector for cyber intrusions. The false alarm rate of the proposed model decreases by about 1.827%.
Article
Full-text available
The number of older adults with Alzheimer’s disease is increasing every year. The associated memory problems cause many difficulties for Alzheimer’s patients and their caretakers; patients may even become lost in familiar surroundings. In this paper, a proposed localization system based on a wireless sensor network (WSN) and backpropagation based artificial neural network (BP-ANN) was practically implemented to detect and determine the position of an Alzheimer’s patient in an indoor environment. The proposed system consisted of four ZigBee-based XBee S2C anchor nodes and one mobile node carried by the Alzheimer’s patient. The received signal strength indicator (RSSI) of the anchor nodes was collected by the mobile node using a laptop supported by X-CTU software. The obtained RSSI values were used as input for training, testing, and validation processes of the BP-ANN, while two-dimension (2D) locations (x and y) were used as the output of the ANN. The results showed that the obtained mean localization errors were 0.964 and 0.921m for validation and testing phases, respectively, after applying the ANN. Based on a comparison with state-of-the-art technology, we deduced that the proposed ANN method outperformed other techniques in previous studies in terms of mean localization error.
Article
Full-text available
Proximal sensing techniques can potentially survey soil and crop variables responsible for variations in crop yield. The full potential of these precision agriculture technologies may be exploited in combination with innovative methods of data processing such as machine learning (ML) algorithms for the extraction of useful information responsible for controlling crop yield. Four ML algorithms, namely linear regression (LR), elastic net (EN), k-nearest neighbor (k-NN), and support vector regression (SVR), were used to predict potato (Solanum tuberosum) tuber yield from data of soil and crop properties collected through proximal sensing. Six fields in Atlantic Canada, including three fields in Prince Edward Island (PE) and three fields in New Brunswick (NB), were sampled over two growing seasons (2017 and 2018) for soil electrical conductivity, soil moisture content, soil slope, normalized-difference vegetative index (NDVI), and soil chemistry. Data were collected from 39–40 locations (each 30 × 30 m) in each field, four times throughout the growing season, and yield samples were collected manually at the end of the growing season. Four datasets, namely PE-2017, PE-2018, NB-2017, and NB-2018, were then formed by combining data points from three fields to represent the province data for the respective years. Modeling techniques were employed to generate yield predictions assessed with different statistical parameters. The SVR models outperformed all other models for the NB-2017, NB-2018, PE-2017, and PE-2018 datasets, with RMSE of 5.97, 4.62, 6.60, and 6.17 t/ha, respectively. The performance of k-NN remained poor in three out of four datasets, namely NB-2017, NB-2018, and PE-2017, with RMSE of 6.93, 5.23, and 6.91 t/ha, respectively. The study also showed that large datasets are required to generate useful results using any of these models. This information is needed for creating site-specific management zones for potatoes, which form a significant component of food security initiatives across the globe.
Article
Full-text available
Positional cranial deformities are relatively common conditions, characterized by asymmetry and changes in skull shape. Although three-dimensional (3D) scanning is the gold standard for diagnosing such deformities, it requires expensive laser scanners and skilled maneuvering. We therefore developed an inexpensive, fast, and convenient screening method to classify cranial deformities in infants, based on single two-dimensional vertex cranial images. In total, 174 measurements from 80 subjects were recorded. Our screening software performs image processing and machine learning-based estimation related to the deformity indices of the cranial ratio (CR) and cranial vault asymmetry index (CVAI) to determine the severity levels of brachycephaly and plagiocephaly. For performance evaluations, the estimated CR and CVAI values were compared to the reference data obtained using a 3D cranial scanner. The CR and CVAI correlation coefficients obtained via support vector regression were 0.85 and 0.89, respectively. When the trained model was evaluated using the unseen test data for the three CR and three CVAI classes, an 86.7% classification accuracy of the proposed method was obtained for both brachycephaly and plagiocephaly. The results showed that our method for screening cranial deformities in infants could aid clinical evaluations and parental monitoring of the progression of deformities at home.
Article
Full-text available
In the Internet of Things (IoT), millions of devices are intelligently connected to provide smart services. Indoor localization in particular is one of the most important topics in smart cities, the Internet of Things, and wireless sensor networks. Many technologies are used for localization in indoor environments, and Wi-Fi using received signal strengths (RSSs) is one of them. Wi-Fi RSSs are sensitive to reflection, refraction, interference, and channel noise, which cause irregularity in signal strengths. Irregular and anomalous RSS values, when used in a Wi-Fi indoor localization environment, cannot define the location of an unknown node correctly. Therefore, this research has developed an outlier detection technique named iF_Ensemble for Wi-Fi indoor localization environments, which analyzes RSSs using a combination of supervised, unsupervised, and ensemble machine learning methods. In this research, isolation forest (iForest) is used as the unsupervised learning method. The supervised learning methods include support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF) classifiers, combined by stacking, which is an ensemble learning method. For evaluation, accuracy, precision, recall, F-score, and the ROC-AUC curve are used. The evaluation shows that the machine learning methods used provide a high accuracy of 97.8 percent with the proposed outlier detection method, and an improvement of almost 2 percent in the accuracy of the indoor localization process after eliminating outliers.
Article
Full-text available
Soluble solid content (SSC), pH, and vitamin C (VC) are considered key parameters for strawberry quality. Spectral, color, and textural features from hyperspectral reflectance imaging at 400–1000 nm were used to develop non-destructive detection approaches for SSC, pH, and VC of strawberries by integrating multivariate methods such as partial least-squares regression (PLSR), support vector regression, and locally weighted regression (LWR). SSC, pH, and VC of 120 strawberries were statistically analyzed to facilitate the partitioning of data sets, which helped optimize the model. PLSR, with spectral and color features, obtained the optimal prediction of SSC, with a determination coefficient of prediction ($R_p^2$) of 0.9370 and a root mean square error of prediction (RMSEP) of 0.1145. Using spectral features, the best prediction for pH was obtained by LWR, with $R_p^2$ = 0.8493 and RMSEP = 0.0501. The combination of spectral and textural features with PLSR provided the best results for VC, with $R_p^2$ = 0.8769 and RMSEP = 0.0279. Competitive adaptive reweighted sampling and uninformative variable elimination (UVE) were used to select important variables from the above features. Using these selected variables, the prediction accuracy for SSC, pH, and VC all improved. Finally, distribution maps of SSC, pH, and VC over time were generated, and the trend of the three quality parameters was observed. Thus, the proposed method can nondestructively and accurately determine SSC, pH, and VC of strawberries and is expected to inform the design and construction of simple sensors for these quality parameters.
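The two figures of merit reported above can be illustrated with a short sketch. It assumes the common definitions (coefficient of determination as one minus the ratio of residual to total sum of squares, and RMSE over the prediction set); some chemometrics work instead uses the squared correlation coefficient, and the abstract does not say which was used. All names and numbers are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical prediction-set values (not data from the study).
measured = [1.0, 2.0, 3.0, 4.0]
estimated = [1.0, 2.0, 3.0, 5.0]

rmsep = rmse(measured, estimated)      # sqrt(1 / 4) = 0.5
r2p = r_squared(measured, estimated)   # 1 - 1 / 5 = 0.8
```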
Article
Full-text available
Wireless sensor networks (WSNs) are growing rapidly in fields such as commerce, medicine, industry, agriculture, research, and meteorology, where they ease complicated tasks. The most active recent research areas in WSNs are deployment strategies, energy efficiency, and coverage. Besides energy harvesting, the network lifetime of the sensors can be increased by decreasing energy consumption, which is one of the most challenging aspects of using WSNs in practical applications. Deployment in WSNs directly influences the performance of the network. Using sensor nodes in large quantities in random deployment raises concerns about reliability and scalability. Coverage in WSNs measures how long the physical space is monitored by the sensors. Barrier coverage is a WSN problem that arises in security applications aimed at detecting intruders in a protected area. Much ongoing research focuses on energy efficiency and coverage in WSNs, and numerous schemes, algorithms, methods, and architectures have been proposed. Still, no comprehensive, universally applicable solution exists. Hence, this work provides a state-of-the-art classification of wireless sensor networks along different dimensions, such as types of sensors, deployment strategies, sensing models, coverage, and energy efficiency.
Article
Full-text available
Underwater communication via cables remains challenging, and the cost of underwater sensor network deployment is still very high. As an alternative, underwater wireless communication has been proposed and has received more attention in the last decade. Preliminary research indicated that Radio Frequency (RF) and Magneto-Inductive (MI) communication achieve higher data rates in near-field communication. Optical communication achieves good performance when limited to line-of-sight positioning. Acoustic communication allows long transmission range. However, it suffers from transmission losses and time-varying signal distortion due to its dependency on environmental properties, namely salinity, temperature, pressure, depth of transceivers, and the environment geometry. This study focuses on both acoustic and magneto-inductive communications, which are the most used technologies for underwater networking: acoustic communication is employed for applications requiring long communication range, while MI is used for real-time communication. Moreover, this paper highlights the trade-off between underwater properties, wireless communication technologies, and communication quality. This can help the research community by providing clear insight for further research.
Article
Full-text available
The paper addresses the problem of localization based on hybrid received signal strength (RSS) and time of arrival (TOA) measurements, in the presence of synchronization errors among all the nodes in a wireless network, and assuming all parameters are unknown. In most existing schemes, in fact, knowledge of the model parameters is postulated to reduce the high dimensionality of the cost functions involved in the position estimation process. However, such parameters depend on the operational wireless context, and change over time due to the presence of dynamic obstacles and other modifications of the environment. Therefore, they should be adaptively estimated “in the field”, with a procedure that must be as simple as possible in order to suit multiple real-time re-calibrations, even in low-cost applications, without requiring human intervention. Unfortunately, the joint maximum likelihood (ML) position estimator for this problem does not admit a closed-form solution, and numerical optimization is infeasible in practice due to the large number of nuisance parameters. To circumvent such issues, a novel two-step algorithm with reduced complexity is proposed: a first calibration phase exploits nodes in known positions to estimate the unknown RSS and TOA model parameters; then, in a second localization step, a hybrid TOA/RSS range estimator is combined with an iterative least-squares procedure to finally estimate the unknown target position. The results show that the proposed hybrid TOA/RSS localization approach outperformed state-of-the-art competitors and, remarkably, achieved almost the same accuracy as the joint ML benchmark but with a significantly lower computational cost.
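The second step described above, range estimation followed by iterative least squares, can be sketched in a simplified form. This is not the paper's hybrid TOA/RSS estimator: it uses RSS only, assumes a log-distance path-loss model whose parameters (`p0_dbm`, `path_loss_exp`) are already known rather than calibrated, and uses a plain Gauss-Newton iteration. All names and values are hypothetical.

```python
import math

def rss_to_range(rss_dbm, p0_dbm, path_loss_exp, d0=1.0):
    """Invert the assumed model rss = p0 - 10 * n * log10(d / d0)."""
    return d0 * 10 ** ((p0_dbm - rss_dbm) / (10.0 * path_loss_exp))

def least_squares_position(anchors, ranges, start, iterations=20):
    """Gauss-Newton minimisation of sum_i (range_i - ||p - anchor_i||)^2."""
    x, y = start
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0  # entries of J^T J and J^T residual
        for (ax, ay), r in zip(anchors, ranges):
            dx, dy = x - ax, y - ay
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            jx, jy = dx / dist, dy / dist      # gradient of the distance
            res = r - dist
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * res
            b2 += jy * res
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break  # degenerate anchor geometry
        x += (a22 * b1 - a12 * b2) / det
        y += (a11 * b2 - a12 * b1) / det
    return x, y

# Hypothetical noise-free scenario: four anchors, target at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
target = (3.0, 4.0)
p0_dbm, path_loss_exp = -40.0, 2.0  # assumed known model parameters

rss = [p0_dbm - 10 * path_loss_exp
       * math.log10(math.hypot(target[0] - ax, target[1] - ay))
       for ax, ay in anchors]
ranges = [rss_to_range(v, p0_dbm, path_loss_exp) for v in rss]
estimate = least_squares_position(anchors, ranges, start=(5.0, 5.0))
```

With noisy measurements the iteration would converge to a least-squares compromise rather than the exact position, and the starting point would matter more.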
Article
Node localization is one of the promising research issues in Wireless Sensor Networks (WSNs). A novel node localization algorithm termed Kernel Extreme Learning Machines based on Hop-count Quantization (KELM-HQ) is proposed. The proposed algorithm employs the real-number hop-counts between anchors and unknown nodes as the training inputs and the locations of the anchors as the training targets for KELM training. The proposed method also employs the real-number hop-counts between unknown nodes as the test samples to compute the locations of unknown nodes with the trained KELM. Simulation results demonstrate that the proposed KELM-HQ algorithm improves the accuracy of node localization and outperforms state-of-the-art localization methods.
Article
Wireless sensor networks (WSNs) have found applications in many diverse fields, from environment monitoring to machine health monitoring. The sensors in a WSN sense information. Sensing and transmitting this information consume most of the energy. This information also requires proper processing before final use. This paper deals with minimising the redundant information sensed by the sensors in WSNs to reduce unnecessary energy consumption and prolong the network lifetime. The redundant information is expressed in terms of the overlapping sensing area of the working sensor set. A mathematical model is proposed to find the redundant information in terms of the overlapping area. A combined meta-heuristic approach is used to achieve the optimal coverage, and the effect of the overlapping area is considered in the objective function to reduce the amount of redundant information sensed by the working sensor set. An improved genetic algorithm (IGA) and a binary ant colony algorithm (BACA) are used as heuristic tools to optimise the multi-objective function. The objective was to find the minimum number of sensors that cover a complete scenario with minimum overlapping sensing region. The results show that optimal coverage with the minimum working sensor set is achieved and that, by incorporating the concept of overlapping area in the objective function, sensing of redundant information is further reduced.
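To make the notion of overlapping sensing area concrete, the sketch below computes the overlap of two circular sensing regions using the standard circle-circle intersection formula. The abstract does not give the paper's mathematical model, so the disk sensing model, the restriction to a single pair of sensors, and the function name are all assumptions.

```python
import math

def overlap_area(c1, r1, c2, r2):
    """Area shared by two disks with centres c1, c2 and radii r1, r2."""
    d = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
    if d >= r1 + r2:
        return 0.0                           # disks do not intersect
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2    # one disk lies inside the other
    # Half-angles subtended by the common chord at each centre.
    a1 = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
    a2 = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
    # Sum of the two circular segments that make up the lens.
    return (r1 * r1 * (a1 - math.sin(2 * a1) / 2)
            + r2 * r2 * (a2 - math.sin(2 * a2) / 2))

# Two hypothetical unit-radius sensors placed one unit apart.
lens = overlap_area((0.0, 0.0), 1.0, (1.0, 0.0), 1.0)
```

An objective function of the kind described could penalise the sum of such pairwise overlaps, although regions covered by three or more sensors would then be counted more than once.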