All content in this area was uploaded by H.-P. Bernhard on May 08, 2021

RSSI-Based Machine Learning with Pre- and Post-Processing for Cell-Localization in IWSNs

Julian Karoliny∗, Thomas Blazek∗, Fjolla Ademaj∗, Hans-Peter Bernhard∗†, Andreas Springer†

∗Silicon Austria Labs GmbH, 4040 Linz

†Johannes Kepler University Linz, Institute for Communications Engineering and RF-Systems, 4040 Linz, Austria

Abstract—Industrial wireless sensor networks are becoming

crucial for modern manufacturing. If the sensors in those networks are mobile, the position information, besides the sensor data itself, can be of high relevance. For example, this position information can increase the trustworthiness of a wireless sensor measurement by assuring that the sensor is not physically removed, off track, or otherwise compromised.

In certain applications, localization information at cell-level, i.e., whether the sensor is inside or outside a room or cell, is sufficient. For this, localization using Received Signal Strength Indicator (RSSI) measurements is very popular since RSSI values are available in almost all existing technologies and no direct interaction with the mobile sensor node and its communication in the network is needed. For this scenario, we propose methods to improve the robustness and accuracy of common machine learning classifiers by using features based on short-term moments and a second classification stage using Hidden Markov Models. With the data from an extensive measurement campaign, we show the applicability of our method and achieve a cell-level localization accuracy of 93.5%.

Index Terms—IWSN, RSSI, Machine Learning, HMM, Blue-

tooth, Indoor Localization

I. INTRODUCTION

In industrial environments, sensors are traditionally con-

nected through a wired communication network like ﬁeld

buses or Ethernet networks. However, wireless communication

is becoming crucial to advanced manufacturing [1] and acts

as an enabler for Industry 4.0. Industrial wireless sensor

networks (IWSNs) must meet stringent reliability and latency

requirements [2], but offer advantages like mobile operation,

easy sensor replacement, ﬂexible mounting, and often lower

cost [3]. For many use cases, it is necessary to record the spa-

tial position of the wireless sensor in addition to its measured

value. As an example, we present an extension to an IWSN-

based measurement system [4] for the emission certiﬁcation of

cars according to the Euro 6 standard which traces the required

measurements in time and position [5]. During these tests, cars are moved between differently conditioned areas and for the position tracking a non-interfering add-on-localization extends the wireless measurement system. Besides the information about the location itself, the position information of sensor nodes is used to verify the measurements, e.g. to assure that a sensor is not physically removed, off track, or otherwise compromised. For instance, a malicious sensor node from an unverified location can be identified and its measurement values are rejected.

This work is funded by the InSecTT project (https://www.insectt.eu/). InSecTT has received funding from the ECSEL Joint Undertaking (JU) under grant agreement No 876038. The JU receives support from the European Union's Horizon 2020 research and innovation programme and Austria, Sweden, Spain, Italy, France, Portugal, Ireland, Finland, Slovenia, Poland, Netherlands, Turkey. The document reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.

As the use of Global Positioning System (GPS) is strongly

limited in indoor environments, factory communication sys-

tems have to use alternative localization systems. In IWSNs,

the main techniques for localization are based on Angle of Ar-

rival (AoA), Time of Arrival (ToA), Time Difference of Arrival

(TDoA) and Received Signal Strength Indicator (RSSI) [6].

Localization based on RSSI values is one of the most promising solutions for low-cost applications since the RSSI value is available in existing technologies like Bluetooth® Low Energy (BLE), Wireless Local Area Network (WLAN), ZigBee, etc. However, due to multipath fading, noise and limited

dynamic range of the RSSI measurements, exact localization

based on a path-loss model and multilateration becomes quite

challenging. While in the literature many techniques focus on

improving the accuracy of RSSI-based estimation, there are

also many use cases in IWSNs, where a coarse location of the

sensor node is sufﬁcient, such as measurement veriﬁcation,

security, and automotive testing. The main task in such use

cases is to classify speciﬁc environments or regions like the

inside or outside of a room and determine whether a sensor

node belongs to such a conﬁned region. Authors in [7], [8]

have studied the so-called cell-level-based localization with

RSSI values using supervised machine learning methods. A

major challenge with these methods is the limited amount of

training and validation data.

A. Contribution

In this work, we present an RSSI-based cell-level localization

approach as an add-on to an existing IWSN. To acquire the

RSSI measurements of all sensor values sent to the base

station (BS), we use additional sensor nodes which only listen

passively. We propose methods to improve the robustness and

accuracy of common machine learning classiﬁers (MLCs), by

introducing suitable input features and a subsequent second

classiﬁcation stage. Here, we take advantage of the fact that

the RSSI measurements are highly correlated in time, i.e., two

subsequent measurements are from similar positions because

of the limited movement speed. Additionally, we conducted

an extensive measurement campaign that allows us to test and

verify the localization method we developed.

B. Notation

Scalars are written as $x$, while vectors and matrices are denoted as lower- and uppercase bold, respectively ($\mathbf{x}$ and $\mathbf{X}$). For matrices and vectors, the element-wise (Hadamard) multiplication is denoted with $\circ$. Time indices are indicated with a subscript, $x_t$, and a set of time-dependent measurements of size $T$ is written as $x_{1:T} \equiv [x_1, x_2, \ldots, x_T]$.

II. EXPERIMENTAL SETUP AND DATA ACQUISITION

All measurements in this work were obtained at an automotive testbed¹, i.e. under real-world conditions. In our measurement scenario, we want to estimate the position of a wireless sensor node mounted on a car that is moving in and out of the testbed, cf. Fig. 1. This sensor node is referred to as measurement node (MN). Additionally, ten sniffer nodes (SNs) using the same transceiver as the MN are placed inside and outside the testbed, referred to as I-x and O-x, respectively. The SNs acquire the RSSI values of every communication packet sent from the MN to the BS. Additionally, two photo sensors (PSs) are placed at the doorstep and the car test-position, respectively, to automatically label the measurements. The PSs are only used during the measurement campaign for the labeling task. For the operation phase, the PSs would not be sufficient for the localization task, as the application requires localizing and distinguishing multiple cars in different clusters.

Figure 1 depicts the measurement setup with the positions of the SNs, the PSs, and the trajectory of the car. The labels and coordinates of each SN are listed in Table I.

Fig. 1. Measurement setup with the sniffer nodes (SNs) I-E, I-T1, I-T2, I-T3, I-DR, I-DL, O-E, O-M, O-DR and O-DL, the photo sensors (PSs), and the base station (BS). The car is driving in and out of the automotive testbed.

A. Hardware and Protocol

We used the BLE physical (PHY) layer and combined it with the Energy and Power Efficient Synchronous Sensor Network (EPhESOS) protocol [9] to realize an IWSN with up to 100 nodes per BS. EPhESOS provides a deterministic media access control (MAC) layer using time division multiple access (TDMA), with a superframe (SF) length of 100 ms. As hardware platform for the measurements and all applications, the Nordic™ nRF52840 controller with integrated transceiver is used.

¹ Chassis dynamometer, AVL List GmbH in Graz, Austria

TABLE I
LABEL AND POSITION OF SNIFFER-NODES

Label   Description            x [m]    y [m]   z [m]
I-E     inside end            -13.30    0.90    0.64
I-T1    inside top 1           -9.88   -0.35    3.92
I-T2    inside top 2           -6.58   -0.35    3.92
I-T3    inside top 3           -1.60   -0.35    3.92
I-DR    inside door right      -1.55    3.19    1.00
I-DL    inside door left       -1.55   -0.09    1.00
O-E     outside end            11.10    2.02    1.70
O-M     outside mid             6.80    2.32    1.84
O-DR    outside door right      2.21    3.10    0.99
O-DL    outside door left       2.21    0.00    1.00

B. Acquired Dataset

In the course of this work, a large number of measurements were collected, which are published as an open-source data set under the name SAL Autarkic Localization RSSI BLE Dataset (SAL-RB-Dataset) [10]. The acquired data set consists of: (i) two disjoint measurement-sets, where a person walks inside and outside the automotive testbed, and (ii) eight disjoint measurement-sets, where a car is driving in and out of the testbed as depicted in Fig. 1. For (ii), two different MNs were mounted on the car to investigate the effects of different hardware on the classification. Each individual set provides RSSI values of all ten SNs in a 100 ms interval. A missing link is denoted with -100 dBm. The labels of the RSSI values correspond to different localization cells and are defined as follows:

• Label = 0: car is outside the testbed
• Label = 1: car is completely inside the testbed
• Label = 2: car is inside the testbed and on test-position

In total, the data set consists of more than 20 000 labeled RSSI measurements for each SN. Figure 2 shows example measurements of the RSSI values in dBm for the SNs {I-E, I-DR, O-M, O-DR} and the corresponding labels. To evaluate the findings of this work, six car measurement-sets from the SAL-RB-Dataset are used.

III. RSSI-BASED MACHINE LEARNING CLASSIFIER

The MLC uses the sniffed RSSI values of data packets sent by the MN to identify its position, or more precisely the label of the corresponding cell in the testbed. The RSSI measurements from the ten SNs ($N = 10$) are synchronously recorded at time step $t$ and collected in the input vector $\mathbf{x}_t \in \mathbb{R}^N$, which is used to estimate the corresponding label $y_t \in \{0, 1, 2\}$. As usual, the MLC has an offline training phase and an online classifier phase. The training set $\{\mathbf{x}'_{1:T'}, y'_{1:T'}\}$ with $T'$ samples is used to train the classifier. Afterwards the classification is performed online with the measured data to estimate the label $\hat{y}_t$. The performance of the classification is assessed by the accuracy score, which is calculated as the number of correctly identified labels divided by the number of all classifications.

Fig. 2. Exemplary RSSI measurements in dBm of four SNs (O-DR, O-M, I-DR, I-E) over time in s with the corresponding labels.

A. Machine Learning Methods and Data Splitting

In the following, we focus on simple machine learning techniques to enable the estimator implementation directly at node level. We considered three widely used algorithms, namely K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM). All of these algorithms fall into the category of supervised machine learning techniques, where the choice of measurement-sets for the training and subsequent validation is crucial. A common procedure here is to perform a random split of the measurement data to obtain a subset for training and validation. However, since the RSSI values are measured continuously, subsequent measurements tend to be very similar. In this case, a random split would lead to a very good, though unrealistic, accuracy score, as the smaller validation set contains measurements nearly identical to those in the training set. To avoid this, we do not split or shuffle individual measurement-sets, but keep them whole either for training or testing. This approach is also comparable to the real use-case, since here the MLC would also be trained at the beginning and should then work for future measurements. Additionally, instead of using only a single training and validation set, all combinations consisting of three training data sets and one validation data set, without using the same set for both, are evaluated in this section. This ensures a fair comparison of the different MLCs since it mitigates the problem that some approaches may be exceptionally good for some data set combinations.
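As a minimal sketch of this evaluation scheme, the following Python snippet enumerates all splits of whole measurement-sets into three training sets and one validation set. The set names are hypothetical placeholders and not the file names of the published data set; with the six car measurement-sets used in this work, the enumeration yields 60 combinations.

```python
from itertools import combinations


def dataset_splits(set_ids, n_train=3):
    """Yield (train_ids, val_id) pairs built from whole measurement-sets.

    A set is never used for training and validation at the same time,
    and no individual set is split or shuffled.
    """
    for train in combinations(set_ids, n_train):
        for val in set_ids:
            if val not in train:
                yield train, val


# Hypothetical identifiers for the six car measurement-sets.
set_ids = ["car_1", "car_2", "car_3", "car_4", "car_5", "car_6"]
splits = list(dataset_splits(set_ids))
print(len(splits))  # 20 training triples x 3 remaining validation sets = 60
```

Averaging the accuracy over all of these splits is what avoids favoring an approach that happens to work well on one particular combination.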

The three proposed algorithms are implemented using

Scikit-learn which is an open-source machine learning library

for Python [11]. For the given classiﬁcation task, all three

algorithms performed similarly in terms of accuracy and

robustness, though the SVM showed slightly higher accuracy.

With the RSSI values directly as input, the SVM reached an

average accuracy of 77 % for the given data set [10] and is

used exclusively for all following evaluations.
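The comparison of the three classifiers could look roughly like the following sketch. It is not the configuration used in this work: the data are synthetic stand-ins generated by the hypothetical helper `synthetic_set`, and all classifiers run with Scikit-learn default hyperparameters, which is an assumption. Only the structure (train on three whole sets, validate on a fourth, report the accuracy score) follows the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def synthetic_set(n_per_label=200, n_nodes=10, noise_db=6.0):
    """Hypothetical stand-in for one measurement-set (not real RSSI data)."""
    means = np.stack([
        np.linspace(-85.0, -55.0, n_nodes),  # label 0
        np.full(n_nodes, -70.0),             # label 1
        np.linspace(-55.0, -85.0, n_nodes),  # label 2
    ])
    y = np.repeat(np.arange(3), n_per_label)
    X = means[y] + rng.normal(0.0, noise_db, size=(y.size, n_nodes))
    return np.clip(X, -100.0, None), y  # -100 dBm marks a missing link


# Three whole sets for training, one separate set for validation.
train_sets = [synthetic_set() for _ in range(3)]
X_train = np.vstack([X for X, _ in train_sets])
y_train = np.concatenate([y for _, y in train_sets])
X_val, y_val = synthetic_set()

classifiers = {
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}
# score() returns the fraction of correctly classified samples.
scores = {name: clf.fit(X_train, y_train).score(X_val, y_val)
          for name, clf in classifiers.items()}
print(scores)
```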

The remaining misclassifications are caused by overlapping class-conditional distributions due to noise and limited dynamic range of the measurements. These mainly occur at class transition regions in the testbed, e.g., the doorstep and test-position, and none of the proposed MLCs could improve in these areas. In order to further improve the accuracy, especially at the class borders, we introduce more suitable input features for the MLC and a subsequent post-processing.

B. Selecting Features for Machine Learning Classiﬁer

Regarding feature selection, two questions have to be answered. Firstly, it is necessary to know whether the raw data $\mathbf{x}_t$ of one SF is sufficient for the classification, or if including previous samples will improve the performance. Secondly, it is essential to validate how the position, number, and combination of the SNs influence the accuracy. In this context, it is important to answer whether a subset of nodes is also sufficient for the classification task.

For this evaluation as well, the choice of measurement-sets for the training and subsequent validation is crucial. To ensure a fair comparison of the different features, all combinations of three training data sets and one validation data set, without using the same set for both, are evaluated. This mitigates the problem that some features may be exceptionally good for some data set combinations.

1) Short-Term Moments as Features: The drawback of using the RSSI measurements of more than one SF as input features is that it increases the feature space with each additional sample. Besides that, the selected MLC may not be able to model the relation of sequential input data, e.g. RSSI values are considered individually and the dynamic relation over time is not modeled.

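To avoid this, we propose to use short-term estimates of the first two moments over $L$ samples. Thus, we introduce

$$\mathbf{x}^{\mu\sigma}_t = \Bigg[\, \underbrace{\frac{1}{L} \sum_{\tau=t-L+1}^{t} \mathbf{x}_\tau}_{\bar{\mathbf{x}}^L_t},\; \underbrace{\frac{1}{L-1} \sum_{\tau=t-L+1}^{t} \left(\mathbf{x}_\tau - \bar{\mathbf{x}}^L_t\right)^2}_{\mathbf{s}^L_t} \,\Bigg]. \qquad (1)$$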

This only doubles the feature space, independently of the number of used SFs. To get an indication of how many preceding samples benefit this approach in our scenario, we consider the coherence time of the channel. The coherence time $T_c$ is a statistical measure of the time duration over which two received signals have a certain minimum amplitude correlation, depending on the relative motion between the MN on the car and the SNs. It can be estimated according to [12] with

$$T_c = \sqrt{\frac{9}{16 \pi f_m^2}} = \frac{0.423}{f_m}, \qquad (2)$$

where $f_m$ is the Doppler spread, which is upper bounded by the maximum occurring Doppler shift, calculated as $f_{\max} = v / \lambda$. With the given BLE center frequency $f_c = 2.44$ GHz and the average speed of the car $v = 1$ m/s, the Doppler shift in case of directly oncoming movement is about 8 Hz, which results in a lower bound of $T_c \approx 50$ ms. Since the time interval between two successive SFs is 100 ms, considering more than two RSSI measurements for (1) may not improve the result.

Fig. 3. Accuracy score over increasing short-term moment length $L$ for the SVM. The raw RSSI measurements are indicated with $L = 1$.
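The numbers in the coherence-time estimate can be reproduced with a few lines of Python. The function name `coherence_time` is chosen here for illustration; the snippet simply evaluates $f_{\max} = v/\lambda$ and (2) for the stated center frequency and speed.

```python
import math

C0 = 299_792_458.0  # speed of light in m/s


def coherence_time(f_c_hz, v_mps):
    """Return (maximum Doppler shift in Hz, coherence time in s) following (2)."""
    wavelength = C0 / f_c_hz
    f_max = v_mps / wavelength
    t_c = math.sqrt(9.0 / (16.0 * math.pi * f_max ** 2))
    return f_max, t_c


f_max, t_c = coherence_time(2.44e9, 1.0)
print(f"{f_max:.1f} Hz, {t_c * 1e3:.0f} ms")  # 8.1 Hz, 52 ms
```

The result of roughly 52 ms is consistent with the rounded value of 50 ms, i.e. about half of the 100 ms superframe interval.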

To investigate this and to analyze the advantages of using short-term moments as input features, we use the SVM classifier and compare the results for an increasing number $L$ in (1). Figure 3 depicts the accuracy score of the classifier over $L$, where $L = 1$ denotes the result using the raw RSSI values. The depicted accuracy is the average accuracy over all 1023 possible SN-combinations and 60 possible training and validation data set combinations. As mentioned before, this ensures a fair comparison since we observed that the short-term moments showed a higher improvement for certain combinations. In contrast to what the calculated coherence time suggests, the accuracy further increases with $L$, though the highest relative gain is achieved by using one additional preceding measurement.
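The feature vector in (1) can be sketched as follows, assuming the RSSI values are available as a NumPy array with one row per superframe and one column per SN. The function name `short_term_moments` is hypothetical; the sample variance uses the $1/(L-1)$ normalization of (1), which is why $L \geq 2$ is required here.

```python
import numpy as np


def short_term_moments(x, L):
    """Stack the sliding mean and sample variance over the last L samples.

    x: array of shape (T, N) with RSSI values in dBm.
    Returns an array of shape (T - L + 1, 2 * N); the first L - 1 time
    steps are dropped because they lack a full window.
    """
    if L < 2:
        raise ValueError("L >= 2 is needed for the 1 / (L - 1) normalization")
    x = np.asarray(x, dtype=float)
    feats = []
    for t in range(L - 1, x.shape[0]):
        window = x[t - L + 1: t + 1]
        feats.append(np.concatenate([window.mean(axis=0),
                                     window.var(axis=0, ddof=1)]))
    return np.array(feats)


rssi = np.array([[-60.0, -80.0],
                 [-62.0, -70.0],
                 [-70.0, -72.0]])
features = short_term_moments(rssi, L=2)
print(features)
# [[-61. -75.   2.  50.]
#  [-66. -71.  32.   2.]]
```

Regardless of $L$, each time step yields $2N$ features, which is the doubling of the feature space mentioned above.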

2) Node Selection Scheme: Due to noise and the limited dynamic range of the measurements, two SNs can provide similar RSSI values, though they are at different locations. Additionally, some SN positions may provide exceptionally good measurements for the given classification task, since the main effect that is exploited is not the free-space pathloss, but the changes between line-of-sight and non-line-of-sight channel caused by the movement of the car. Not only the positions of the SNs are important, but also the combination of the individual RSSI measurements is crucial for the correct classification. Instead of evaluating all SN combinations and simply choosing the one with the highest accuracy, we first determine combinations that provide poor performance.

Figure 4 depicts the distribution of the accuracy score of all 1023 SN-combinations averaged over all data set combinations using the raw RSSI values and the short-term moments in (1) with $L = 2$. Again the advantages of the short-term moments can be observed, as all individual SN-combinations show a higher accuracy. Additionally, they are clustered at higher percentages, which indicates improved robustness. Because the short-term moments are superior as input features, in the following we will use them exclusively.

Fig. 4. Accuracy score distribution (occurrences over accuracy score) of all SN combinations using the raw RSSI measurements and short-term moments with $L = 2$ for the SVM.

Most SN-combinations show an accuracy between 80 % and 90 %, tending to the higher end, while only very few combinations are below. On closer inspection of the few combinations that lead to poor results, we found an intuitive explanation. These combinations are composed of either SNs only outside, only inside, or only single SNs. In contrast, combinations that are composed of SNs equally spaced in the area of interest, including nodes at significant points, e.g. near the doorstep, lead to very good results. We found that about four SNs are sufficient for our task, for example the combination {I-E, I-DR, O-M, O-DR}.
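The set of SN-combinations and the observation about poorly performing ones can be illustrated with a short sketch. The predicate `is_plausible` is a hypothetical heuristic derived from the explanation above (it discards single SNs and inside-only or outside-only combinations); it is not a selection rule stated in this work and does not capture the spacing of the nodes.

```python
from itertools import combinations

SNIFFERS = ["I-E", "I-T1", "I-T2", "I-T3", "I-DR", "I-DL",
            "O-E", "O-M", "O-DR", "O-DL"]


def all_combinations(nodes):
    """Yield every non-empty subset of the sniffer nodes."""
    for k in range(1, len(nodes) + 1):
        yield from combinations(nodes, k)


def is_plausible(combo):
    """Heuristic: keep combinations with SNs both inside and outside."""
    has_inside = any(name.startswith("I-") for name in combo)
    has_outside = any(name.startswith("O-") for name in combo)
    return len(combo) > 1 and has_inside and has_outside


combos = list(all_combinations(SNIFFERS))
candidates = [c for c in combos if is_plausible(c)]
print(len(combos), len(candidates))  # 1023 945
```

The example combination {I-E, I-DR, O-M, O-DR} passes this filter, since it mixes inside and outside nodes and includes nodes near the door.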

C. Reﬁning Cell Estimates Via Post-Processing

The physical cell borders in the x-dimension, as depicted in Fig. 1 with the PSs, are defined w.r.t. the given localization task and are not chosen optimally in terms of high differences in the measured RSSI values. As a result, noise and limited dynamic range of the RSSI measurements lead to oscillating misclassifications in time, especially near the physical borders of the cells, i.e. doorstep and test-position. Misclassifications can also occur in the middle of the cell, e.g. inside the testbed at the far end, which in particular is a problem for the mentioned industrial use-case.

Therefore, we propose a second classification stage to mitigate this problem. Instead of considering SFs individually, we include a certain dynamic in the classification model, because the samples are highly correlated in time. For example, preceding measurements have a high probability to result in the same cell, and abrupt changes for a few SFs are physically not possible. The two-stage approach consists of (i) the MLC for the current input vector $\mathbf{x}_t$ or $\mathbf{x}^{\mu\sigma}_t$, respectively, and the corresponding output $\hat{y}_t$, and (ii) a filter to account for the dynamic in the cell transitions with the output $\hat{z}_t$. For (ii) we propose two approaches, a simple median filter and a filter based on Hidden Markov Models (HMMs) [13].

1) Median Filter: By applying a median filter we mitigate the abrupt changes in the cell estimate. The output of the MLC $\hat{y}_t$ at the time step $t$ is filtered using $M$ past predictions with a windowed median filter

$$\hat{z}_t = \operatorname{med}\left(\hat{y}_{t-M}, \ldots, \hat{y}_t\right), \qquad (3)$$

where $\hat{y}_t \in \{0, 1, 2\}$. The median filter is only able to improve the classification if the first stage already provides a sufficiently good result with only a few errors. It also does not account for the probabilities of the individual cell transitions or whether these cell transitions are even possible, e.g. a change from 0 to 2 and vice versa is not possible in our scenario.
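A causal median filter along the lines of (3) can be sketched as follows. Two details are assumptions made for this illustration: at the start of the sequence the window is shortened to the available predictions, and for windows with an even number of entries the lower median is taken so that the output remains a valid label.

```python
from statistics import median_low


def median_filter(y_hat, M):
    """Filter label predictions with the current and up to M past values."""
    z_hat = []
    for t in range(len(y_hat)):
        window = y_hat[max(0, t - M): t + 1]
        z_hat.append(median_low(window))
    return z_hat


y_hat = [0, 0, 1, 0, 0, 1, 1, 1, 2, 1, 1]
print(median_filter(y_hat, M=2))
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this toy sequence the isolated predictions of labels 1 and 2 are removed, while the genuine transition from 0 to 1 is reported one step late, which illustrates the smoothing and the delay of this approach.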

2) Hidden Markov Model: Here, we assume that the observed cell estimates are corrupted versions of the true cell positions and impose probabilities both for the cell transitions and for corrupt observations. This allows us to define a model for the cell transitions instead of the purely empirical median filter approach. To limit the complexity of the post-processing, we assume that the Markov property is satisfied, that is, the cell or label at time $t + 1$ is conditionally independent of the past, given the current cell estimate at time $t$, or

$$p(z_{t+1} \mid z_t, z_{t-1}, \ldots) = p(z_{t+1} \mid z_t), \qquad (4)$$

with $z_t \in \{S_1, S_2, \ldots, S_n\}$. $S_i$ refers to the $i$th possible true cell location. In our case, we observe the original cells, but with possible flips between true and observed cells, hence $y_t \in \{S_1, S_2, \ldots, S_n\}$ holds as well. We use an HMM with $n = 3$ hidden states, which is defined by

• the transition matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ with the probabilities $a_{ij} = p(z_{t+1} = S_j \mid z_t = S_i)$ of transition from state $S_i$ to state $S_j$,
• the emission matrix $\mathbf{B} \in \mathbb{R}^{n \times n}$ with the probabilities $b_{ij} = p(\hat{y}_t = S_j \mid z_t = S_i)$ to observe $\hat{y}_t = S_j$ in the state $S_i$, and
• the state probability vector $\boldsymbol{\pi}_t \in \mathbb{R}^n$ with the probabilities $\pi_{t,i} = p(z_t = S_i)$ that $S_i$ is the cell location at time $t$. $\boldsymbol{\pi}_0$ denotes the initial state.

The HMM takes the sequence of estimates $\hat{y}_t$ of the MLC as observations and returns a sequence of cell estimates $\hat{z}_t$ as output. Finding suitable parameters $\mathbf{A}$, $\mathbf{B}$ and $\boldsymbol{\pi}_0$ is also referred to as the learning problem. Since the HMM should correct the misclassified samples, an intuitive approach is to use the mistakes from the MLC in the training phase, i.e. calculate the transition and emission matrix with the training data $\{\mathbf{x}'_{1:T'}, y'_{1:T'}\}$ and the corresponding prediction $\hat{y}'_{1:T'}$. The transition matrix $\mathbf{A}$ describes the probability of each state transition, hence, we estimate the entries with the labels of the training data $y'_{1:T'}$ according to

$$\hat{a}_{ij} = \frac{\sum_{t=1}^{T'-1} \delta[y'_t, S_i]\, \delta[y'_{t+1}, S_j]}{\sum_{t=1}^{T'} \delta[y'_t, S_i]}, \qquad (5)$$

where $\delta[a, b]$ is the Kronecker delta function, which equals 1 for $a = b$ and 0 otherwise. For the estimation of the emission matrix $\mathbf{B}$, we use the confusion matrix $\mathbf{C}$ of the predicted labels $\hat{y}'_{1:T'}$, where the entry $c_{ij}$ represents the number of samples where the original label of the training data is $S_i$, but $S_j$ was predicted. Since the entries of the emission matrix define the probabilities of a cell depending on an observation, we can directly use the normalized confusion matrix for the estimation with

$$\hat{b}_{ij} = \frac{c_{ji}}{\sum_{i=1}^{n} c_{ji}}. \qquad (6)$$

Fig. 5. Accuracy score over increasing short-term moment length $L$ with and without additional filtering (no filter, median with $M = 1$, $M = 5$, $M = 10$, and HMM) for the SVM.

For the choice of the initial vector $\boldsymbol{\pi}_0$ we have two options, assuming the initial state is unknown: (i) calculate the probability of each cell by counting the occurrences in the training data $y'_{1:T'}$, or (ii) assume an uninformative prior, where each state has the same probability. In this work, the more general case (ii) is chosen and the entries of the initial vector are estimated by

$$\hat{\pi}_{0,i} = \frac{1}{n}, \qquad (7)$$

where $n$ is again the number of states.
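The learning step in (5), (6) and (7) can be sketched with NumPy as follows. The function name `learn_hmm` and the toy label sequences are hypothetical, labels are assumed to be the integers 0 to $n-1$, and every label is assumed to occur in the training data so that no division by zero arises. The emission matrix is indexed exactly as in (6), i.e. `B[i, j]` equals $c_{ji}$ divided by the $j$th row sum of the confusion matrix.

```python
import numpy as np


def learn_hmm(y_true, y_pred, n=3):
    """Estimate A, B and pi_0 from training labels and MLC predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # (5): count transitions S_i -> S_j and divide by the occurrences of S_i.
    A = np.zeros((n, n))
    for i, j in zip(y_true[:-1], y_true[1:]):
        A[i, j] += 1
    A /= np.bincount(y_true, minlength=n)[:, None]

    # Confusion matrix: c_ij counts samples with true S_i and predicted S_j.
    C = np.zeros((n, n))
    for i, j in zip(y_true, y_pred):
        C[i, j] += 1

    # (6): b_ij = c_ji / sum_i c_ji (row-normalize C, then transpose).
    B = (C / C.sum(axis=1, keepdims=True)).T

    # (7): uninformative prior.
    pi0 = np.full(n, 1.0 / n)
    return A, B, pi0


y_true = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 2, 2, 1, 0]
A, B, pi0 = learn_hmm(y_true, y_pred)
print(A)
print(B)
```

Because the denominator of (5) also counts the last training sample, the row of `A` belonging to the final label sums to slightly less than one; for long training sequences this effect is negligible.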

In the so-called decoding problem, the learned HMM is used to find the most likely state sequence $\hat{z}_{1:T}$ of the model that produced the observation $\hat{y}_{1:T}$. This problem is usually solved using the Viterbi algorithm, with the drawback that it needs a sequence of observations for the prediction. In this work, a simple forward approach is used to perform cell predictions in online fashion using the forward algorithm. The forward algorithm calculates the state probabilities $\boldsymbol{\pi}_t$ at a certain time step $t$ using the previous state probability and the current observation. The algorithm is initialized with $\boldsymbol{\pi}_1 = \boldsymbol{\pi}_0 \circ (\mathbf{e}_{y_1} \mathbf{B})$, where $\mathbf{e}_n$ is the canonical basis row vector with 1 in the $n$th element and 0 otherwise. For each following observation the state probability is calculated with

$$\boldsymbol{\pi}_t = \frac{\boldsymbol{\pi}_{t-1} \mathbf{A}}{\sum_{i=1}^{n} \pi_{t-1,i}} \circ (\mathbf{e}_{y_t} \mathbf{B}). \qquad (8)$$

The estimation of the cell $\hat{z}_t$ is given by the state with highest probability in $\boldsymbol{\pi}_t$.
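A minimal sketch of this online forward filtering is given below. The matrices `A` and `B` are hypothetical example values and not parameters learned from the data set; `B` is indexed as in (6), so that the row `B[y]` corresponds to $\mathbf{e}_{y} \mathbf{B}$ and holds the likelihood of observing label `y` in each state.

```python
import numpy as np


def forward_filter(y_hat, A, B, pi0):
    """Return the most probable cell per time step using (8)."""
    pi = pi0 * B[y_hat[0]]  # initialization: pi_1 = pi_0 o (e_{y_1} B)
    z_hat = [int(np.argmax(pi))]
    for y in y_hat[1:]:
        # (8): propagate, normalize by the previous sum, weight by observation.
        pi = (pi @ A) / pi.sum() * B[y]
        z_hat.append(int(np.argmax(pi)))
    return z_hat


# Hypothetical parameters: no direct transitions between cells 0 and 2.
A = np.array([[0.95, 0.05, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.05, 0.95]])
B = np.array([[0.80, 0.15, 0.00],
              [0.20, 0.70, 0.20],
              [0.00, 0.15, 0.80]])
pi0 = np.full(3, 1.0 / 3.0)

y_hat = [0, 0, 1, 0, 1, 1, 1]
print(forward_filter(y_hat, A, B, pi0))
# [0, 0, 0, 0, 0, 1, 1]
```

Only the previous state probability vector has to be stored, so the filter can run sample by sample. In the toy sequence, the isolated observation of label 1 is suppressed, while the persistent change to label 1 is accepted after two consecutive observations.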

IV. CLASSIFIER EVALUATION

Both proposed post-processing approaches are compared using the average accuracy score similar to Section III-B1 and the results are depicted in Fig. 5 for increasing $L$ in (1). For the median filter three different window lengths $M = \{1, 5, 10\}$ are considered. It can be observed that both approaches are able to improve the accuracy of the classification. In contrast to the raw output of the MLC, the accuracy does not improve with increasing $L$ after the proposed filtering. The highest accuracy is achieved in both cases for $L = 2$, which matches the fact that more than 2 samples significantly oversteps our calculated coherence time estimate.

Fig. 6. Predicted labels compared to ground truth reference over time in s using the SN combination {I-E, I-DR, O-M, O-DR} and the SVM as MLC: (a) raw RSSI measurements, (b) short-term moments $L = 2$ and HMM.

Although the median ﬁlter shows slightly better results for

a sufﬁcient window length, the HMM is favourable since it

can be adapted easily to various other classiﬁcation problems

and for (8) only the state probability of the preceding sample

is needed. The median ﬁlter requires storing more previous

predictions and the accuracy does not necessarily improve

with increasing window lengths. It only smooths the output

of the classiﬁer by preventing abrupt cell changes, however,

this might not be suitable for all applications.

In the following, the classiﬁcation is performed with the

SN combination {I-E, I-DR, O-M, O-DR}and a single data

set combination. Figure 6 depicts the results of the MLC

for: (a) the raw RSSI measurements with an accuracy of

86.2 %, and (b) the short-term moments with L= 2 and an

additional HMM ﬁltering with an accuracy of 93.5 %. Note

that in contrast to Fig. 3, the accuracy is higher since here an

adequate SN combination was chosen.

V. CONCLUSION

We analyzed the performance of cell-level localization based on RSSI values, measured in an already existing IWSN. The evaluation showed that for this task the accuracies of the used MLCs were comparable to each other, while the choice of good data pre- and post-processing was the key to higher accuracy. The introduced features based on short-term moments significantly increased the accuracy and robustness of the MLC by considering only one preceding RSSI measurement. To mitigate abrupt changes of the estimate at the output of the MLC, we added a second classification stage. Here the HMM showed excellent cell-level localization results with an accuracy of 93.5 %. Due to the learning based on the confusion matrix and training data, it can be adapted to various other classification problems. Based on an extensive measurement campaign we were able to test the algorithms in detail and also investigate the importance of SN position and combination. The proposed approach can be easily implemented at node-level to directly label the measurements or verify them based on their location.

REFERENCES

[1] A. A. Kumar S., K. Ovsthus, and L. M. Kristensen, “An Industrial

Perspective on Wireless Sensor Networks — A Survey of Requirements,

Protocols, and Challenges,” IEEE Communications Surveys Tutorials,

vol. 16, no. 3, pp. 1391–1412, 2014.

[2] K. Montgomery, R. Candell, Y. Liu, and M. Hany, “Wireless user

requirements for the factory workcell,” Tech. Rep., National Institute of

Standards and Technology, jan 2020.

[3] S. Raza, M. Faheem, and M. Guenes, “Industrial wireless sensor and

actuator networks in industry 4.0: Exploring requirements, protocols, and

challenges—A MAC survey,” International Journal of Communication

Systems, vol. 32, no. 15, pp. e4074, 2019.

[4] H.-P. Bernhard, J. Karoliny, B. Etzlinger, and A. Springer, “Work-in-progress: RSSI-based presence detection in industrial wireless sensor

networks,” in 2020 16th IEEE International Conference on Factory

Communication Systems (WFCS), 2020, pp. 1–4.

[5] European Commission, “Commission Regulation (EU) No 459/2012 of 29 May 2012 amending Regulation (EC) No 715/2007 of the European Parliament and of the Council and Commission Regulation (EC) No 692/2008 as regards emissions from light passenger and commercial vehicles (Euro 6),” Off. J. Eur. Union, L: Legis., vol. 55, pp. 16–24, 2012.

[6] H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of Wireless Indoor

Positioning Techniques and Systems,” IEEE Transactions on Systems,

Man, and Cybernetics, Part C (Applications and Reviews), vol. 37, no.

6, pp. 1067–1080, 2007.

[7] K. Lee and L. Lampe, “Indoor cell-level localization based on RSSI

classiﬁcation,” in 2011 24th Canadian Conference on Electrical and

Computer Engineering (CCECE), 2011, pp. 000021–000026.

[8] S. Mahfouz, P. Nader, and P. E. Abi-Char, “RSSI-based classiﬁcation

for indoor localization in wireless sensor networks,” in 2020 IEEE In-

ternational Conference on Informatics, IoT, and Enabling Technologies

(ICIoT), 2020, pp. 323–328.

[9] H. Bernhard, A. Springer, A. Berger, and P. Priller, “Life cycle of

wireless sensor nodes in industrial environments,” in 2017 IEEE 13th

International Workshop on Factory Communication Systems (WFCS),

2017, pp. 1–9.

[10] J. Karoliny, T. Blazek, F. Ademaj, and H. Bernhard, “SAL-Autarkic-Localization-RSSI-BLE-Dataset: SAL-RB-Dataset,” Distributed by Zenodo, https://doi.org/10.5281/zenodo.4073072, Oct. 2020.

[11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion,

O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander-

plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch-

esnay, “Scikit-learn: Machine learning in Python,” Journal of Machine

Learning Research, vol. 12, pp. 2825–2830, 2011.

[12] T. S. Rappaport et al., Wireless Communications: Principles and Practice, vol. 2, Prentice Hall PTR, New Jersey, 1996.

[13] B. Esmael, A. Arnaout, R. K. Fruhwirth, and G. Thonhauser, “Improving

time series classiﬁcation using hidden markov models,” in 2012 12th

International Conference on Hybrid Intelligent Systems (HIS), 2012, pp.

502–507.