RSSI-Based Machine Learning with Pre- and Post-Processing for Cell-Localization in IWSNs
Julian Karoliny, Thomas Blazek, Fjolla Ademaj, Hans-Peter Bernhard∗†, Andreas Springer
Silicon Austria Labs GmbH, 4040 Linz
Johannes Kepler University Linz, Institute for Communications Engineering and RF-Systems, 4040 Linz, Austria
Abstract—Industrial wireless sensor networks are becoming crucial for modern manufacturing. If the sensors in those networks are mobile, the position information, besides the sensor data itself, can be of high relevance. For example, this position information can increase the trustworthiness of a wireless sensor measurement by assuring that the sensor is not physically removed, off track, or otherwise compromised.
In certain applications, localization information at cell level, i.e., whether the sensor is inside or outside a room or cell, is sufficient. For this, localization using Received Signal Strength Indicator (RSSI) measurements is very popular, since RSSI values are available in almost all existing technologies and no direct interaction with the mobile sensor node and its communication in the network is needed. For this scenario, we propose methods to improve the robustness and accuracy of common machine learning classifiers by using features based on short-term moments and a second classification stage using Hidden Markov Models. With the data from an extensive measurement campaign, we show the applicability of our method and achieve a cell-level localization accuracy of 93.5 %.
Index Terms—IWSN, RSSI, Machine Learning, HMM, Bluetooth, Indoor Localization
I. INTRODUCTION
In industrial environments, sensors are traditionally connected through a wired communication network like field buses or Ethernet networks. However, wireless communication is becoming crucial to advanced manufacturing [1] and acts as an enabler for Industry 4.0. Industrial wireless sensor networks (IWSNs) must meet stringent reliability and latency requirements [2], but offer advantages like mobile operation, easy sensor replacement, flexible mounting, and often lower cost [3]. For many use cases, it is necessary to record the spatial position of the wireless sensor in addition to its measured value. As an example, we present an extension to an IWSN-based measurement system [4] for the emission certification of cars according to the Euro 6 standard, which traces the required measurements in time and position [5]. During these tests, cars are moved between differently conditioned areas, and for the position tracking a non-interfering add-on localization extends the wireless measurement system. Besides the information about the location itself, the position information of sensor nodes is used to verify the measurements, e.g., to assure that a sensor is not physically removed, off track, or otherwise compromised. For instance, a malicious sensor node from an unverified location can be identified and its measurement values are rejected.
This work is funded by the InSecTT project (https://www.insectt.eu/). InSecTT has received funding from the ECSEL Joint Undertaking (JU) under grant agreement No 876038. The JU receives support from the European Union's Horizon 2020 research and innovation programme and Austria, Sweden, Spain, Italy, France, Portugal, Ireland, Finland, Slovenia, Poland, Netherlands, Turkey. The document reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.
As the use of the Global Positioning System (GPS) is strongly limited in indoor environments, factory communication systems have to use alternative localization systems. In IWSNs, the main techniques for localization are based on Angle of Arrival (AoA), Time of Arrival (ToA), Time Difference of Arrival (TDoA) and Received Signal Strength Indicator (RSSI) [6]. Localization based on RSSI values is one of the most promising solutions for low-cost applications since the RSSI value is available in existing technologies like Bluetooth® Low Energy (BLE), Wireless Local Area Network (WLAN), ZigBee, etc. However, due to multipath fading, noise and limited dynamic range of the RSSI measurements, exact localization based on a path-loss model and multilateration becomes quite challenging. While many techniques in the literature focus on improving the accuracy of RSSI-based estimation, there are also many use cases in IWSNs where a coarse location of the sensor node is sufficient, such as measurement verification, security, and automotive testing. The main task in such use cases is to classify specific environments or regions, like the inside or outside of a room, and to determine whether a sensor node belongs to such a confined region. The authors in [7], [8] have studied so-called cell-level localization with RSSI values using supervised machine learning methods. A major challenge with these methods is the limited amount of training and validation data.
A. Contribution
In this work, we present an RSSI-based cell-level localization approach as an add-on to an existing IWSN. To acquire the RSSI measurements of all sensor values sent to the base station (BS), we use additional sensor nodes which only listen passively. We propose methods to improve the robustness and accuracy of common machine learning classifiers (MLCs) by introducing suitable input features and a subsequent second classification stage. Here, we take advantage of the fact that the RSSI measurements are highly correlated in time, i.e., two subsequent measurements are from similar positions because of the limited movement speed. Additionally, we conducted an extensive measurement campaign that allows us to test and verify the localization method we developed.
B. Notation
Scalars are written as $x$, while vectors and matrices are denoted as lower- and uppercase bold, respectively ($\mathbf{x}$ and $\mathbf{X}$). For matrices and vectors, the element-wise (Hadamard) multiplication is denoted with $\odot$. Time indices are indicated with a subscript, $x_t$, and a set of time-dependent measurements of size $T$ is written as $x_{1:T} = [x_1, x_2, \ldots, x_T]$.
II. EXPERIMENTAL SETUP AND DATA ACQUISITION
All measurements in this work were obtained at an automotive testbed¹, i.e., under real-world conditions. In our measurement scenario, we want to estimate the position of a wireless sensor node mounted on a car that is moving in and out of the testbed, cf. Fig. 1. This sensor node is referred to as the measurement node (MN). Additionally, ten sniffer nodes (SNs) using the same transceiver as the MN are placed inside and outside the testbed, referred to as I-x and O-x, respectively.
The SNs acquire the RSSI values of every communication packet sent from the MN to the BS. Additionally, two photo sensors (PSs) are placed at the doorstep and the car test-position, respectively, to automatically label the measurements. The PSs are only used during the measurement campaign for the labeling task. For the operation phase, the PSs would not be sufficient for the localization task, as the application requires localizing and distinguishing multiple cars in different clusters.
Figure 1 depicts the measurement setup with the positions of the SNs, the PSs, and the trajectory of the car. The labels and coordinates of each SN are listed in Table I.
[Figure 1 legend: photo sensor (PS), sniffer node (SN), base station (BS); node labels I-E, I-T1, I-T2, I-T3, I-DL, I-DR, O-DL, O-DR, O-M, O-E.]
Fig. 1. Measurement setup. The car is driving in and out of the automotive testbed.
A. Hardware and Protocol
We used the BLE physical (PHY) layer and combined it with the Energy and Power Efficient Synchronous Sensor Network (EPhESOS) protocol [9] to realize an IWSN with up to 100 nodes per BS. EPhESOS provides a deterministic media access control (MAC) layer using time division multiple access (TDMA), with a superframe (SF) length of 100 ms. As hardware platform for the measurements and all applications, the Nordic™ NRF52840 controller with integrated transceiver is used.
¹ Chassis dynamometer, AVL List GmbH in Graz, Austria.
TABLE I
LABEL AND POSITION OF SNIFFER NODES
Label  Description          x in m   y in m   z in m
I-E    inside end           -13.30     0.90     0.64
I-T1   inside top 1          -9.88    -0.35     3.92
I-T2   inside top 2          -6.58    -0.35     3.92
I-T3   inside top 3          -1.60    -0.35     3.92
I-DR   inside door right     -1.55     3.19     1.00
I-DL   inside door left      -1.55    -0.09     1.00
O-E    outside end           11.10     2.02     1.70
O-M    outside mid            6.80     2.32     1.84
O-DR   outside door right     2.21     3.10     0.99
O-DL   outside door left      2.21     0.00     1.00
B. Acquired Dataset
In the course of this work, a large number of measurements were collected, which are published as an open-source data set under the name SAL Autarkic Localization RSSI BLE Dataset (SAL-RB-Dataset) [10]. The acquired data set consists of: (i) two disjoint measurement sets, where a person walks inside and outside the automotive testbed, and (ii) eight disjoint measurement sets, where a car is driving in and out of the testbed as depicted in Fig. 1. For (ii), two different MNs were mounted on the car to investigate the effects of different hardware on the classification. Each individual set provides RSSI values of all ten SNs in a 100 ms interval. A missing link is denoted with -100 dBm. The labels of the RSSI values correspond to different localization cells and are defined as follows:
Label = 0: car is outside the testbed
Label = 1: car is completely inside the testbed
Label = 2: car is inside the testbed and on test-position
In total, the data set consists of more than 20 000 labeled RSSI measurements for each SN. Figure 2 shows example measurements of the RSSI values in dBm for the SNs {I-E, I-DR, O-M, O-DR} and the corresponding labels. To evaluate the findings of this work, six car measurement sets from the SAL-RB-Dataset are used.
III. RSSI-BASED MACHINE LEARNING CLASSIFIER
The MLC uses the sniffed RSSI values of data packets sent by the MN to identify its position, or more precisely the label of the corresponding cell in the testbed. The RSSI measurements from the ten SNs ($N = 10$) are synchronously recorded at time step $t$ and collected in the input vector $\mathbf{x}_t \in \mathbb{R}^N$, which is used to estimate the corresponding label $y_t \in \{0, 1, 2\}$. As usual, the MLC has an offline training phase and an online classifier phase. The training set $\{\mathbf{x}'_{1:T'}, y'_{1:T'}\}$ with $T'$ samples is used to train the classifier. Afterwards, the classification is performed online with the measured data to estimate the label $\hat{y}_t$. The performance of the classification is assessed by the accuracy score, which is calculated as the number of correctly identified labels divided by the number of all classifications.
[Figure 2: four panels of RSSI over time (0 to 300 s) for the SNs O-DR, O-M, I-DR, and I-E, and a bottom panel with the labels 0, 1, 2.]
Fig. 2. Exemplary RSSI measurements in dBm of four SNs with the corresponding labels.
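As a small illustration of the accuracy score defined above, the following Python sketch computes it for two label sequences. The function name `accuracy` and the example labels are made up for this illustration and are not taken from the data set.

```python
def accuracy(reference, predicted):
    """Number of correctly identified labels divided by the number of all classifications."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not reference:
        raise ValueError("label sequences must not be empty")
    correct = sum(1 for y, y_hat in zip(reference, predicted) if y == y_hat)
    return correct / len(reference)


# Hypothetical labels for eight superframes (illustrative values only).
reference = [0, 0, 1, 1, 2, 2, 1, 0]
predicted = [0, 1, 1, 1, 2, 1, 1, 0]
score = accuracy(reference, predicted)  # six of the eight labels agree
```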
A. Machine Learning Methods and Data Splitting
In the following, we focus on simple machine learning tech-
niques to enable the estimator implementation directly at node
level. We considered three widely used algorithms, namely K-
Nearest Neighbors (KNN), Random Forest (RF), and Support
Vector Machine (SVM). All of these algorithms fall into the
category of supervised machine learning techniques, where the
choice of measurement-sets for the training and subsequent
validation is crucial. A common procedure here is to perform a random split of the measurement data to obtain a subset for training and validation. However, since the RSSI values are measured continuously, subsequent measurements tend to be very similar. In this case, a random split would lead to a very good but unrealistic accuracy score, as the smaller validation set contains measurements nearly identical to those in the training set. To avoid this, we do not split or shuffle individual measurement sets, but keep each of them whole, either for training or for validation. This approach is also comparable to the real use case, since there the MLC would also be trained at the beginning and should then work for future measurements. Additionally, instead of using only a single training and validation set, all combinations consisting of three training data sets and one validation data set, without using the same set for both, are evaluated in this section. This ensures a fair comparison of the different MLCs, since it mitigates the problem that some approaches may be exceptionally good for some data set combinations.
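The whole-set splitting described above can be sketched as follows. This is a minimal sketch, assuming the six car measurement sets are addressed by the hypothetical identifiers `car_1` to `car_6`; the helper `whole_set_splits` is illustrative and not part of the published code. Choosing three sets for training and one of the remaining three for validation gives 60 combinations.

```python
from itertools import combinations


def whole_set_splits(set_ids, n_train=3):
    """Yield (training_sets, validation_set) pairs built from whole measurement sets.

    No measurement set is split or shuffled, and the validation set is never
    one of the training sets.
    """
    for train in combinations(set_ids, n_train):
        for val in set_ids:
            if val not in train:
                yield train, val


# Hypothetical identifiers for the six car measurement sets.
car_sets = ["car_1", "car_2", "car_3", "car_4", "car_5", "car_6"]
splits = list(whole_set_splits(car_sets))
```

Each classifier or feature variant would then be trained and scored once per entry of `splits`, and the scores averaged.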
The three algorithms are implemented using Scikit-learn, which is an open-source machine learning library for Python [11]. For the given classification task, all three algorithms performed similarly in terms of accuracy and robustness, though the SVM showed slightly higher accuracy. With the RSSI values directly as input, the SVM reached an average accuracy of 77 % for the given data set [10] and is used exclusively for all following evaluations.
The remaining misclassifications are caused by overlapping class-conditional distributions due to noise and the limited dynamic range of the measurements. These mainly occur at class transition regions in the testbed, e.g., the doorstep and test-position, and none of the considered MLCs could improve in these areas. In order to further improve the accuracy, especially at the class borders, we introduce more suitable input features for the MLC and a subsequent post-processing.
B. Selecting Features for Machine Learning Classifier
Regarding feature selection, two questions have to be an-
swered. Firstly, it is necessary to know whether the raw
data xtof one SF is sufficient for the classification, or if
including previous samples will improve the performance.
Secondly, it is essential to validate how the position, number, and combination of the SNs influence the accuracy. In this context, it is important to determine whether a subset of nodes is already sufficient for the classification task.
For this evaluation, too, the choice of measurement sets for the training and subsequent validation is crucial. To ensure a fair comparison of the different features, all combinations of three training data sets and one validation data set, without using the same set for both, are evaluated. This mitigates the problem that some features may be exceptionally good for some data set combinations.
1) Short-Term Moments as Features: The drawback of using the RSSI measurements of more than one SF as input features is that the feature space grows with each additional sample. Besides that, the selected MLC may not be able to model the relation of sequential input data, i.e., RSSI values are considered individually and the dynamic relation over time is not modeled.
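To avoid this, we propose to use short-term estimates of the first two moments over $L$ samples. Thus, we introduce
$$\mathbf{x}^{\mu\sigma}_t = \Bigg[ \underbrace{\frac{1}{L} \sum_{\tau=t-L+1}^{t} \mathbf{x}_\tau}_{\bar{\mathbf{x}}^L_t},\; \underbrace{\frac{1}{L-1} \sum_{\tau=t-L+1}^{t} \left(\mathbf{x}_\tau - \bar{\mathbf{x}}^L_t\right)^2}_{\mathbf{s}^L_t} \Bigg]. \tag{1}$$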
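The feature construction in (1) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the function name `short_term_moments` and the input layout (one list of per-SN RSSI values per SF) are hypothetical, the square in (1) is taken element-wise, and the first $L-1$ time steps, for which no full window exists, are simply dropped.

```python
from statistics import mean, variance


def short_term_moments(rssi, L):
    """Per-SN mean and sample variance over the last L superframes.

    rssi[t][n] is the RSSI in dBm of sniffer node n at time step t.
    Returns one feature vector [means..., variances...] per time step t >= L - 1.
    """
    if L < 2:
        raise ValueError("the sample variance in (1) needs L >= 2")
    features = []
    for t in range(L - 1, len(rssi)):
        window = rssi[t - L + 1 : t + 1]
        columns = list(zip(*window))  # one tuple of L values per sniffer node
        means = [mean(col) for col in columns]
        variances = [variance(col) for col in columns]  # divides by L - 1
        features.append(means + variances)
    return features


# Hypothetical RSSI values of two sniffer nodes over three superframes.
rssi = [[-60.0, -80.0], [-62.0, -80.0], [-66.0, -100.0]]
feats = short_term_moments(rssi, L=2)
```

For $N$ sniffer nodes, each feature vector has $2N$ entries regardless of $L$.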
The feature vector in (1) only doubles the feature space, independently of the number of used SFs. To get an indication of how many preceding samples benefit this approach in our scenario, we consider the coherence time of the channel. The coherence time $T_c$ is a statistical measure of the time duration over which two received signals have a certain minimum amplitude correlation, depending on the relative motion between the MN on the car and the SNs. It can be estimated according to [12] with
$$T_c = \sqrt{\frac{9}{16 \pi f_m^2}} = \frac{0.423}{f_m}, \tag{2}$$
where $f_m$ is the Doppler spread, which is upper bounded by the maximum occurring Doppler shift, calculated as $f_{\max} = v/\lambda$. With the given BLE center frequency $f_c = 2.44$ GHz and the average speed of the car $v = 1$ m/s, the Doppler shift in case of directly oncoming movement is about 8 Hz, which results in a lower bound of $T_c \approx 50$ ms. Since the time interval between two successive SFs is 100 ms, considering more than two RSSI measurements for (1) may not improve the result.
[Figure 3: accuracy score (0.76 to 0.88) over the short-term moment length L (1 to 21).]
Fig. 3. Accuracy score over increasing short-term moment length $L$ for the SVM. The raw RSSI measurements are indicated with $L = 1$.
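The numbers in this estimate can be reproduced with a few lines of Python. This is only a worked check of (2); the function name `coherence_time` is hypothetical, and the wavelength is derived from the carrier frequency as $\lambda = c / f_c$.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def coherence_time(speed_mps, carrier_hz):
    """Return (maximum Doppler shift in Hz, coherence time in s) according to (2)."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    f_max = speed_mps / wavelength
    t_c = math.sqrt(9.0 / (16.0 * math.pi * f_max ** 2))
    return f_max, t_c


# Values from the text: v = 1 m/s, f_c = 2.44 GHz.
f_max, t_c = coherence_time(1.0, 2.44e9)
```

With these inputs, `f_max` is about 8.1 Hz and `t_c` about 52 ms, roughly half of the 100 ms SF interval.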
To investigate this and to analyze the advantages of using short-term moments as input features, we use the SVM classifier and compare the results for an increasing number $L$ in (1). Figure 3 depicts the accuracy score of the classifier over $L$, where $L = 1$ denotes the result using the raw RSSI values. The depicted accuracy is the average accuracy over all 1023 possible SN combinations and 60 possible training and validation data set combinations. As mentioned before, this ensures a fair comparison, since we observed that the short-term moments showed a higher improvement for certain combinations. In contrast to what the calculated coherence time suggests, the accuracy further increases with $L$, though the highest relative gain is achieved by using one additional preceding measurement.
2) Node Selection Scheme: Due to noise and the limited dynamic range of the measurements, two SNs can provide similar RSSI values even though they are at different locations. Additionally, some SN positions may provide exceptionally good measurements for the given classification task, since the main effect that is exploited is not the free-space path loss, but the changes between line-of-sight and non-line-of-sight channel caused by the movement of the car. Not only the positions of the SNs are important, but also the combination of the individual RSSI measurements is crucial for the correct classification. Instead of evaluating all SN combinations and simply choosing the one with the highest accuracy, we first determine combinations that provide poor performance.
Figure 4 depicts the distribution of the accuracy score of all 1023 SN combinations averaged over all data set combinations using the raw RSSI values and the short-term moments in (1) with $L = 2$. Again, the advantages of the short-term moments can be observed, as all individual SN combinations show a higher accuracy. Additionally, they are clustered at higher percentages, which indicates improved robustness. Because the short-term moments are superior as input features, we will use them exclusively in the following.
[Figure 4: histogram of occurrences over the accuracy score (0.5 to 0.9) for the raw RSSI values and for the short-term moments with L = 2.]
Fig. 4. Accuracy score distribution of all SN combinations using the raw RSSI measurements and short-term moments for the SVM.
Most SN combinations show an accuracy between 80 % and 90 %, tending towards the higher end, while only very few combinations are below. On closer inspection of the few combinations that lead to poor results, we found an intuitive explanation. These combinations are composed of either SNs only outside, SNs only inside, or only a single SN. On the contrary, combinations that are composed of SNs equally spaced in the area of interest, including nodes at significant points, e.g., near the doorstep, lead to very good results. We found that about four SNs are sufficient for our task, for example with the combination {I-E, I-DR, O-M, O-DR}.
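The set of 1023 SN combinations and the poorly performing group described above can be enumerated with a short Python sketch. The helper names are hypothetical, and the filter `is_one_sided` only encodes the three cases named in the text (single SN, only inside, only outside); it is an illustrative heuristic, not the selection procedure used for the evaluation.

```python
from itertools import combinations

# Labels from Table I.
SNIFFER_NODES = ["I-E", "I-T1", "I-T2", "I-T3", "I-DR", "I-DL",
                 "O-E", "O-M", "O-DR", "O-DL"]


def all_sn_combinations(nodes):
    """Yield every non-empty subset of the sniffer nodes."""
    for size in range(1, len(nodes) + 1):
        yield from combinations(nodes, size)


def is_one_sided(combo):
    """True for a single SN or for SNs that are all inside or all outside."""
    inside = sum(1 for label in combo if label.startswith("I-"))
    return len(combo) == 1 or inside == 0 or inside == len(combo)


all_combos = list(all_sn_combinations(SNIFFER_NODES))
mixed = [combo for combo in all_combos if not is_one_sided(combo)]
```

With ten SNs there are $2^{10} - 1 = 1023$ non-empty subsets, of which 945 contain at least one inside and one outside node.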
C. Refining Cell Estimates Via Post-Processing
The physical cell borders in the x-dimension, as depicted in Fig. 1 with the PSs, are defined w.r.t. the given localization task and are not chosen optimally in terms of high differences in the measured RSSI values. As a result, noise and the limited dynamic range of the RSSI measurements lead to misclassifications that oscillate in time, especially near the physical borders of the cells, i.e., the doorstep and test-position. Misclassifications can also occur in the middle of the cell, e.g., inside the testbed at the far end, which is a particular problem for the mentioned industrial use case.
Therefore, we propose a second classification stage to mitigate this problem. Instead of considering SFs individually, we include a certain dynamic in the classification model, because the samples are highly correlated in time. For example, preceding measurements have a high probability of resulting in the same cell, and abrupt changes for a few SFs are physically not possible. The two-stage approach consists of (i) the MLC for the current input vector $\mathbf{x}_t$ or $\mathbf{x}^{\mu\sigma}_t$, respectively, with the corresponding output $\hat{y}_t$, and (ii) a filter to account for the dynamic in the cell transitions with the output $\hat{z}_t$. For (ii), we propose two approaches: a simple median filter and a filter based on Hidden Markov Models (HMMs) [13].
1) Median Filter: By applying a median filter, we mitigate the abrupt changes in the cell estimate. The output of the MLC $\hat{y}_t$ at the time step $t$ is filtered using $M$ past predictions with a windowed median filter
$$\hat{z}_t = \operatorname{med}\left\{\hat{y}_{t-M}, \ldots, \hat{y}_t\right\}, \tag{3}$$
where $\hat{y}_t \in \{0, 1, 2\}$. The median filter is only able to improve the classification if the first stage already provides a sufficiently good result with only a few errors. It also does not account for the probabilities of the individual cell transitions or whether these cell transitions are even possible, e.g., a change from 0 to 2 and vice versa is not possible in our scenario.
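A causal windowed median as in (3) could be sketched as follows. Two details are assumptions of this sketch and are not specified in the text: at the start of the sequence the window is shortened to the available predictions, and for windows with an even number of entries the lower of the two middle values is used (`statistics.median_low`), so that the output is always a valid label.

```python
from statistics import median_low


def median_filter(y_hat, M):
    """Median over the current prediction and up to M past predictions."""
    z_hat = []
    for t in range(len(y_hat)):
        window = y_hat[max(0, t - M) : t + 1]
        z_hat.append(median_low(window))
    return z_hat


# Hypothetical MLC output with two isolated outliers (index 2 and index 7).
predictions = [0, 0, 1, 0, 0, 1, 1, 2, 1, 1, 1]
smoothed = median_filter(predictions, M=2)
```

In this example both isolated outliers are removed, at the cost of a delayed transition from cell 0 to cell 1.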
2) Hidden Markov Model: Here, we assume that the observed cell estimates are corrupted versions of the true cell positions and impose probabilities both for the cell transitions and for corrupt observations. This allows us to define a model for the cell transitions instead of the purely empirical median filter approach. To limit the complexity of the post-processing, we assume that the Markov property is satisfied, that is, the cell or label at time $t+1$ is conditionally independent of the past, given the current cell estimate at time $t$, or
$$p(z_{t+1} \mid z_t, z_{t-1}, \ldots) = p(z_{t+1} \mid z_t), \tag{4}$$
with $z_t \in \{S_1, S_2, \ldots, S_n\}$. $S_i$ refers to the $i$th possible true cell location. In our case, we observe the original cells, but with possible flips between true and observed cells, hence $y_t \in \{S_1, S_2, \ldots, S_n\}$ holds as well. We use an HMM with $n = 3$ hidden states, which is defined by
• the transition matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ with the probabilities $a_{ij} = p(z_{t+1} = S_j \mid z_t = S_i)$ of transition from state $S_i$ to state $S_j$,
• the emission matrix $\mathbf{B} \in \mathbb{R}^{n \times n}$ with the probabilities $b_{ij} = p(\hat{y}_t = S_j \mid z_t = S_i)$ to observe $\hat{y}_t = S_j$ in the state $S_i$, and
• the state probability vector $\boldsymbol{\pi}_t \in \mathbb{R}^n$ with the probabilities $\pi_{t,i} = p(z_t = S_i)$ that $S_i$ is the cell location at time $t$. $\boldsymbol{\pi}_0$ denotes the initial state.
The HMM takes the sequence of estimates $\hat{y}_t$ of the MLC as observations and returns a sequence of cell estimates $\hat{z}_t$ as output. Finding suitable parameters $\mathbf{A}$, $\mathbf{B}$ and $\boldsymbol{\pi}_0$ is also referred to as the learning problem. Since the HMM should correct the misclassified samples, an intuitive approach is to use the mistakes from the MLC in the training phase, i.e., to calculate the transition and emission matrix with the training data $\{\mathbf{x}'_{1:T'}, y'_{1:T'}\}$ and the corresponding prediction $\hat{y}'_{1:T'}$. The transition matrix $\mathbf{A}$ describes the probability of each state transition, hence, we estimate the entries with the labels of the training data $y'_{1:T'}$ according to
$$\hat{a}_{ij} = \frac{\sum_{t=1}^{T'-1} \delta[y'_t, S_i]\, \delta[y'_{t+1}, S_j]}{\sum_{t=1}^{T'} \delta[y'_t, S_i]}, \tag{5}$$
where $\delta[a, b]$ is the Kronecker delta function, which equals 1 for $a = b$ and 0 otherwise. For the estimation of the emission matrix $\mathbf{B}$, we use the confusion matrix $\mathbf{C}$ of the predicted labels $\hat{y}'_{1:T'}$, where the entry $c_{ij}$ represents the number of samples where the original label of the training data is $S_i$, but $S_j$ was predicted. Since the entries of the emission matrix define the probabilities of a cell depending on an observation, we can directly use the normalized confusion matrix for the estimation with
$$\hat{b}_{ij} = \frac{c_{ji}}{\sum_{i=1}^{n} c_{ji}}. \tag{6}$$
[Figure 5: accuracy score (0.80 to 0.90) over the short-term moment length L (1 to 21) for no filter, the median filter with M = 1, M = 5, and M = 10, and the HMM.]
Fig. 5. Accuracy score over increasing short-term moment length $L$ with and without additional filtering for the SVM.
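The estimators (5) and (6) can be sketched in plain Python as follows. The function names and the toy label sequences are hypothetical. The sketch follows the indexing of (6) as written, so that row $i$ of the estimated emission matrix corresponds to the observed label and column $j$ to the true cell, which is the orientation needed when a row is selected by $\mathbf{e}_{y_t} \mathbf{B}$ in the forward recursion. Cells that never occur in the training labels get all-zero entries, which is an assumption of this sketch.

```python
def estimate_transition_matrix(labels, n_states):
    """Eq. (5): count transitions i -> j and divide by the occurrences of i."""
    counts = [[0] * n_states for _ in range(n_states)]
    occurrences = [0] * n_states
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    for label in labels:
        occurrences[label] += 1
    return [[counts[i][j] / occurrences[i] if occurrences[i] else 0.0
             for j in range(n_states)] for i in range(n_states)]


def confusion_matrix(true_labels, predicted_labels, n_states):
    """c[i][j]: number of samples with true label i that were predicted as j."""
    c = [[0] * n_states for _ in range(n_states)]
    for y, y_hat in zip(true_labels, predicted_labels):
        c[y][y_hat] += 1
    return c


def estimate_emission_matrix(confusion):
    """Eq. (6): b[i][j] = c[j][i] / sum_i c[j][i] (row: observed, column: true cell)."""
    n = len(confusion)
    row_sums = [sum(row) for row in confusion]
    return [[confusion[j][i] / row_sums[j] if row_sums[j] else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical training labels and MLC predictions for ten superframes.
y_train = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 2, 1, 1, 0]
A_hat = estimate_transition_matrix(y_train, 3)
B_hat = estimate_emission_matrix(confusion_matrix(y_train, y_pred, 3))
```

Because the denominator in (5) also counts the last sample, which has no successor, the row of `A_hat` belonging to the final label sums to slightly less than one; for long training sequences this effect is negligible.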
For the choice of the initial vector $\boldsymbol{\pi}_0$, we have two options assuming the initial state is unknown: (i) calculate the probability of each cell by counting the occurrences in the training data $y'_{1:T'}$, or (ii) assume an uninformative prior, where each state has the same probability. In this work, the more general case (ii) is chosen and the entries of the initial vector are estimated by
$$\hat{\pi}_{0,i} = \frac{1}{n}, \tag{7}$$
where $n$ is again the number of states.
In the so-called decoding problem, the learned HMM is used to find the most likely state sequence $\hat{z}_{1:T}$ of the model that produced the observation $\hat{y}_{1:T}$. This problem is usually solved using the Viterbi algorithm, with the drawback that it needs a sequence of observations for the prediction. In this work, the forward algorithm is used instead to perform cell predictions in an online fashion.
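The forward algorithm calculates the state probabilities $\boldsymbol{\pi}_t$ at a certain time step $t$ using the previous state probability and the current observation. The algorithm is initialized with $\boldsymbol{\pi}_1 = \boldsymbol{\pi}_0 \odot (\mathbf{e}_{y_1} \mathbf{B})$, where $\mathbf{e}_n$ is the canonical basis row vector with 1 in the $n$th element and 0 otherwise. For each following observation, the state probability is calculated with
$$\boldsymbol{\pi}_t = \frac{\boldsymbol{\pi}_{t-1} \mathbf{A}}{\sum_{i=1}^{n} \pi_{t-1,i}} \odot (\mathbf{e}_{y_t} \mathbf{B}). \tag{8}$$
The estimate of the cell $\hat{z}_t$ is given by the state with the highest probability in $\boldsymbol{\pi}_t$.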
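The recursion in (8) can be sketched in a few lines of Python. The function name `forward_filter` and the matrices below are made up for illustration and are not the values learned from the data set. As in (8), the row of $\mathbf{B}$ selected by the observation is multiplied element-wise with the predicted state probabilities. Falling back to the prior when all state probabilities become zero is an assumption of this sketch.

```python
def forward_filter(observations, A, B, pi0):
    """Online cell estimates from the forward recursion in (8).

    observations: labels from the first-stage classifier (0 .. n-1).
    B[y][j]: probability of observing label y when the true cell is j.
    """
    n = len(pi0)
    pi = None
    estimates = []
    for t, obs in enumerate(observations):
        likelihood = B[obs]  # e_y B selects row y of B
        if t == 0:
            prior = list(pi0)
        else:
            total = sum(pi)
            if total == 0.0:
                prior = list(pi0)  # assumption: restart from the prior
            else:
                prior = [sum(pi[i] * A[i][j] for i in range(n)) / total
                         for j in range(n)]
        pi = [prior[j] * likelihood[j] for j in range(n)]
        estimates.append(max(range(n), key=lambda j: pi[j]))
    return estimates


# Hypothetical model parameters (direct 0 <-> 2 transitions are impossible).
A = [[0.98, 0.02, 0.00],
     [0.01, 0.98, 0.01],
     [0.00, 0.02, 0.98]]
B = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.2],
     [0.0, 0.1, 0.8]]
pi0 = [1 / 3, 1 / 3, 1 / 3]

observed = [0, 0, 1, 0, 0, 1, 1, 1, 2, 1, 2, 2]
filtered = forward_filter(observed, A, B, pi0)
```

Only the previous state probability vector has to be stored between time steps, and isolated label flips in `observed` are suppressed in `filtered`.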
IV. CLASSIFIER EVALUATION
Both proposed post-processing approaches are compared using the average accuracy score similar to Section III-B1, and the results are depicted in Fig. 5 for increasing $L$ in (1). For the median filter, three different window lengths $M \in \{1, 5, 10\}$ are considered. It can be observed that both approaches are able to improve the accuracy of the classification. In contrast to the raw output of the MLC, the accuracy does not improve with increasing $L$ after the proposed filtering. The highest accuracy is achieved in both cases for $L = 2$, which matches the fact that more than 2 samples significantly exceed our calculated coherence time estimate.
[Figure 6: predicted and reference labels (0, 1, 2) over time (0 to 300 s) for (a) raw RSSI measurements and (b) short-term moments with L = 2 and HMM.]
Fig. 6. Predicted labels compared to ground truth reference using the SN combination {I-E, I-DR, O-M, O-DR} and the SVM as MLC.
Although the median filter shows slightly better results for a sufficient window length, the HMM is preferable, since it can be adapted easily to various other classification problems and for (8) only the state probability of the preceding sample is needed. The median filter requires storing more previous predictions, and the accuracy does not necessarily improve with increasing window length. It only smooths the output of the classifier by preventing abrupt cell changes; however, this might not be suitable for all applications.
In the following, the classification is performed with the SN combination {I-E, I-DR, O-M, O-DR} and a single data set combination. Figure 6 depicts the results of the MLC for: (a) the raw RSSI measurements with an accuracy of 86.2 %, and (b) the short-term moments with $L = 2$ and an additional HMM filtering with an accuracy of 93.5 %. Note that, in contrast to Fig. 3, the accuracy is higher since here an adequate SN combination was chosen.
V. CONCLUSION
We analyzed the performance of cell-level localization based on RSSI values, measured in an already existing IWSN. The evaluation showed that for this task the accuracies of the considered MLCs were comparable to each other, while the choice of good data pre- and post-processing was the key to higher accuracy. The introduced features based on short-term moments significantly increased the accuracy and robustness of the MLC by considering only one preceding RSSI measurement. To mitigate abrupt changes of the estimate at the output of the MLC, we added a second classification stage. Here, the HMM showed excellent cell-level localization results with an accuracy of 93.5 %. Due to the learning based on the confusion matrix and training data, it can be adapted to various other classification problems. Based on an extensive measurement campaign, we were able to test the algorithms in detail and also investigate the importance of SN position and combination. The proposed approach can be easily implemented at node level to directly label the measurements or verify them based on their location.
REFERENCES
[1] A. A. Kumar S., K. Ovsthus, and L. M. Kristensen., “An Industrial
Perspective on Wireless Sensor Networks — A Survey of Requirements,
Protocols, and Challenges,” IEEE Communications Surveys Tutorials,
vol. 16, no. 3, pp. 1391–1412, 2014.
[2] K. Montgomery, R. Candell, Y. Liu, and M. Hany, “Wireless user
requirements for the factory workcell,” Tech. Rep., National Institute of
Standards and Technology, jan 2020.
[3] S. Raza, M. Faheem, and M. Guenes, “Industrial wireless sensor and
actuator networks in industry 4.0: Exploring requirements, protocols, and
challenges—A MAC survey, International Journal of Communication
Systems, vol. 32, no. 15, pp. e4074, 2019.
[4] H.-P. Bernhard, J. Karoliny, B. Etzlinger, and A. Springer, “Work-
in-progress: Rssi-based presence detection in industrial wireless sensor
networks,” in 2020 16th IEEE International Conference on Factory
Communication Systems (WFCS), 2020, pp. 1–4.
[5] European Commission., “Commission regulation (eu) no 459/2012 of
29 may 2012 amending regulation (ec) no 715/2007 of the european
parliament and of the council and commission regulation (ec) no
692/2008 as regards emissions from light passenger and commercial
vehicles (euro 6)(1),” Off. J. Eur. Union, L: Legis., vol. 55, pp. 16–24,
2012.
[6] H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of Wireless Indoor
Positioning Techniques and Systems, IEEE Transactions on Systems,
Man, and Cybernetics, Part C (Applications and Reviews), vol. 37, no.
6, pp. 1067–1080, 2007.
[7] K. Lee and L. Lampe, “Indoor cell-level localization based on RSSI
classification,” in 2011 24th Canadian Conference on Electrical and
Computer Engineering(CCECE), 2011, pp. 000021–000026.
[8] S. Mahfouz, P. Nader, and P. E. Abi-Char, “RSSI-based classification
for indoor localization in wireless sensor networks,” in 2020 IEEE In-
ternational Conference on Informatics, IoT, and Enabling Technologies
(ICIoT), 2020, pp. 323–328.
[9] H. Bernhard, A. Springer, A. Berger, and P. Priller, “Life cycle of
wireless sensor nodes in industrial environments, in 2017 IEEE 13th
International Workshop on Factory Communication Systems (WFCS),
2017, pp. 1–9.
[10] J. Karoliny, T. Blazek, F. Ademaj, and H. Bernhard, “SAL-Autarkic-
Localization-RSSI-BLE-Dataset: SAL- RB-Dataset,” Distributed by
Zenodo https://doi.org/10.5281/zenodo.4073072, Oct. 2020.
[11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion,
O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander-
plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch-
esnay, “Scikit-learn: Machine learning in Python,” Journal of Machine
Learning Research, vol. 12, pp. 2825–2830, 2011.
[12] T. S. Rappaport et al., Wireless Communications: Principles and Practice,
vol. 2, Prentice Hall PTR, New Jersey, 1996.
[13] B. Esmael, A. Arnaout, R. K. Fruhwirth, and G. Thonhauser, “Improving
time series classification using hidden Markov models,” in 2012 12th
International Conference on Hybrid Intelligent Systems (HIS), 2012, pp.
502–507.