All content in this area was uploaded by Georgios Meditskos on Nov 11, 2020
RSSI Fingerprinting Techniques for Indoor Localization Datasets
Angelos Chatzimichail, Athina Tsanousa, Georgios Meditskos, Stefanos Vrochidis
and Ioannis Kompatsiaris
Information Technologies Institute
Center for Research and Technology Hellas, Greece
{angechat, atsan, gmeditsk, stefanos, ikom}@iti.gr
Abstract—Indoor localization techniques using the Received Signal Strength
Indicator (RSSI) are attractive in the Internet of Things domain due to their simplicity
and cost-effectiveness. However, many different approaches have been proposed
in the literature, and there is no common, widely accepted solution in the research
community. This is mainly due to the limited number of publicly available
datasets and to the fact that the multi-effect signal phenomenon ties each dataset to
the testbed in which it was gathered. In this paper, we test several fingerprinting
methods on a publicly available dataset and compare them against the RSSI regression
approach, which is considered the most prominent one in certain domains,
such as indoor and outdoor localization.
Keywords—RSSI, fingerprinting, localization, Internet of Things.
1 Introduction
With the great advancements in wireless technologies, portable devices and
wearables, there is a need for smart algorithms that personalize services for users.
Location-based services, such as advertising and navigation, are among the most popular
types of services today; they make use of location, as well as user context, to provide
end-users with useful personalized information. One of the most important location
tracking methods is based on the Global Positioning System (GPS). However, GPS-based
approaches are not appropriate indoors because the position is determined from
GPS satellite signals [1], which are not accurate enough in indoor situations.
Most of the research on localization techniques has been based on the use of
short-range signals, such as Wi-Fi [2-3], Bluetooth [4], ultrasound [5], infrared [6], or
RFID [7]. All of the aforementioned technologies use the RSSI (Received Signal
Strength Indicator) parameter to estimate the position of the user. The RSSI-based
approach measures the strength of the received electromagnetic wave, which
depends on the distance from the signal source. RSSI measurements are typically
collected at many distances and places and stored in databases, so that a large
number of samples is available. Generally, there are two popular categories of indoor
localization methods based on RSSI measurements, namely triangulation and
fingerprinting.
The main idea of triangulation is to construct a function that relates RSSI to distance.
First, enough data are collected in different environments to
describe the relationship between the measurements and the distance. Most
researchers then use a fitting method to map this function [7], [8]. Finally, the
position is estimated using a triangulation algorithm.
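As a hedged illustration of such an RSSI-distance function, the sketch below fits the widely used log-distance path-loss model, RSSI(d) = a - 10 n log10(d), by ordinary least squares and then inverts it to estimate a distance. The choice of model, the function names and the synthetic readings are assumptions made for illustration; the works cited above may use other fitting functions.

```python
import math

def fit_log_distance(distances_m, rssi_dbm):
    """Least-squares fit of RSSI = a - 10 * n * log10(d).

    Returns (a, n): a is the expected RSSI at 1 m, n the path-loss exponent.
    """
    xs = [math.log10(d) for d in distances_m]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(rssi_dbm) / len(rssi_dbm)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rssi_dbm))
    slope = sxy / sxx
    a = mean_y - slope * mean_x
    n = -slope / 10.0
    return a, n

def rssi_to_distance(rssi, a, n):
    """Invert the fitted model to get a distance estimate in meters."""
    return 10 ** ((a - rssi) / (10.0 * n))

# Synthetic, noise-free readings generated with a = -60 dBm and n = 2
# (illustrative values, not taken from the paper's dataset).
distances = [1.0, 2.0, 4.0, 8.0]
readings = [-60.0 - 20.0 * math.log10(d) for d in distances]
a, n = fit_log_distance(distances, readings)
```

With real measurements the fitted exponent n varies with the environment, which is one reason a model fitted in one testbed does not transfer directly to another.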
For the fingerprinting method, creating an offline fingerprint database is crucial for
location estimation. Fingerprinting associates known positions with the RSSI
measurements gathered from access points at those positions. After the offline database
has been created, RSSI measurements gathered online are compared with the fingerprints
in the offline database to find the matching position. The fingerprinting approach falls
into the classification paradigm, and many different algorithms are used, such as KNN
(K-nearest Neighbors) [9], SVM (Support Vector Machines) [10] and RF (Random Forest)
[11].
The main motivation of our work is to use Bluetooth Low Energy (BLE)
technology for indoor and outdoor position estimation. BLE is widely
adopted in embedded systems and electronics in general, because it consumes
very little power and is easy to integrate. The need to compare different
localization techniques derives from the research challenges we face in estimating the
location of children at one of the most crowded Christmas events in Greece, which takes
place in December in Trikala (Mills of the Elves). Children wear a smart bracelet with
BLE technology that is wirelessly connected to the parent’s phone. In our case the
child’s bracelet is the RSSI transmitter, while the parent’s phone is the RSSI receiver.
To study the relationship between RSSI and distance we needed datasets; however,
there is a lack of relevant datasets for training machine learning techniques in our
setting. Therefore, we gathered many RSSI measurements from smart bracelets and created
our own dataset. After gathering the measurements, we performed curve fitting in order to
correlate the RSSI values with the distance; that is, we used the first of the two
RSSI approaches described above. In this case, the RSSI is the only value
available to the receiver for estimating the distance.
In this paper, we compare different fingerprinting techniques on our dataset, which is
publicly available (see footnote 1). In addition, we compare the fingerprinting results
with the curve fitting results that we obtained on the same dataset in a previous work,
in order to contrast the two approaches.
The rest of the paper is organized as follows: Section 2 reviews the related work.
Section 3 explains the RSSI measurement procedure, while Sections 4 and 5 describe the
data collection system and the offline dataset. Section 6 presents the experimental
results of the machine learning techniques, and Section 7 concludes the paper.
Footnote 1: http://desmos-project.gr/en/datasets-2
2 Related Work
The indoor positioning algorithms that are usually applied on indoor datasets are
either based on the RSSI distance method or on fingerprinting technology [8].
Examples of RSSI fingerprinting solutions [12, 13] include probabilistic approaches
[14], neural network-based approaches [15] and K-means algorithms [16]. The general
theory for fingerprinting techniques is presented in [17] where the correlation of many
parameters is presented such as the number of measurements, interference etc. Also,
there are works that apply deep learning in the fingerprinting indoor localization, such
as in [18] and [19]. Currently, the localization error reported in fingerprinting solutions
ranges from 3m to 10m using WiFi and BLE RSSI [20].
To construct the offline database, many researchers manually collect
fingerprints at multiple known locations in a building. This is a labor-intensive and
highly time-consuming procedure, even for small areas. To reduce that effort, in [21]
self-guided robots equipped with sensors are deployed in the building to collect the
data for the dataset. Another technique used to collect measurements easily
is crowdsourcing [22]. In this context, people gather random traces of measurements
through smartphone applications as they walk around the area under investigation. Another
approach is to collect the RSSI measurements from a gateway device that is in a fixed
place and listens for other devices. This method requires prior knowledge of the
positions of the other devices in order to label them in the training dataset.
In [13] the researchers discuss indoor Wi-Fi positioning technology, including the
various phases and processes of Wi-Fi fingerprinting, and classify the
methods used. In [10] a zoning indoor localization solution based on the theory of
SVM is applied to Wi-Fi RSSI measurements. Two real-world environments with different
architectures (flat/multi-floor) were used to create multiclass SVM models and
to test their performance. The experimental results show that the proposed solution
determines the location of new signals with a confidence of 97.31% for the
flat topology and 88.38% for the multi-floor topology.
A novel BLE RSSI ranking based fingerprinting method that uses Kendall Tau
Correlation Coefficient (KTCC) is presented in [23]. The aim is to correlate a new
signal position with the signal strength ranking of multiple low-power iBeacon devices
situated in a retail space. This offers a higher positioning accuracy and is supported in
recent smartphones.
Other research studies propose using data from the smartphone’s integrated sensors
together with RSSI values to filter the location data and to provide more features in the
dataset [24]. These applications need extra hardware and consume more battery. In our
study, we used only the RSSI values of BLE smart-bracelet devices to build
our dataset, as we explain in the following sections.
3 RSSI Measurement Procedure
As a first step, we consider the localization space to be two-dimensional. The
task of localization is the problem of learning the mapping from observations to the
respective locations. In fingerprinting, the localization task utilizes data collected from
the environment. Essentially, the fingerprinting database is filled with true
location-observation pairs, and the location function is learned so that new
observations can be mapped to a specific position.
The fingerprinting algorithm needs data in order to be trained. For the RSSI values,
an RF signal feature is collected multiple times in order to compensate for signal noise
effects. These RF signals have to be from the same signal bandwidth. In this work, it is
assumed that the entire area has BLE coverage. In particular, the area is covered
by multiple access points (APs). Some references suggest using multiple access points
in order to cover dead zones. An AP advertises its availability by broadcasting a
message (in JSON or another format) with its MAC address. At the receivers, which
are usually phones, the power of the RF signal received from each AP is measured as
RSSI. The receiver will not detect APs that are too far away from it.
During the offline procedure it is assumed that the area is discretized into a set of A
known locations A = {x_a | a = 1, …, A}, where x_a represents the 2-dimensional Cartesian
coordinate of location a. The RSSI is scanned for a certain period to receive multiple
records from every AP. Commonly, if there are N different APs, T different samples
are collected from each AP at each of the A locations. All these RSSI values are
collected in a three-dimensional matrix with dimensions A x N x T.
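The A x N x T layout can be sketched with nested lists as follows. This is only an illustration under stated assumptions: the record format (location index, AP index, RSSI), the function name and the sample values are hypothetical and not taken from the dataset.

```python
def build_fingerprint_tensor(records, num_locations, num_aps, num_samples):
    """Arrange raw scans into a nested list indexed as [location][ap][sample].

    `records` is an iterable of (location_index, ap_index, rssi) tuples in
    scan order; slots that receive no reading stay None.
    """
    tensor = [[[None] * num_samples for _ in range(num_aps)]
              for _ in range(num_locations)]
    next_slot = {}
    for loc, ap, rssi in records:
        t = next_slot.get((loc, ap), 0)
        if t < num_samples:  # ignore readings beyond the T requested samples
            tensor[loc][ap][t] = rssi
            next_slot[(loc, ap)] = t + 1
    return tensor

# Hypothetical scans: A = 2 locations, N = 2 APs, T = 2 samples each.
records = [(0, 0, -55), (0, 1, -70), (0, 0, -57), (0, 1, -72),
           (1, 0, -68), (1, 1, -60), (1, 0, -66), (1, 1, -61)]
tensor = build_fingerprint_tensor(records, 2, 2, 2)
```

Keeping missing readings as None makes it explicit when an AP was out of range at a location, instead of silently substituting a value.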
During the online phase, a receiver at a location x listens to all the APs in the area
and collects the RSSI measurements in a database with dimensions N x R. The
location function should be learned by mapping these new measurements to positions in
the space. In the most common fingerprinting algorithms, pattern matching
compares the similarity between the fingerprints of the offline database and the
fingerprints of the online procedure. One of the best-known algorithms for
fingerprinting, and for pattern matching in general, employs the Euclidean distance to
measure the similarity between fingerprints in the training and testing procedures.
This distance can be replaced by any kernel function. The whole process is completed
with post-processing methods such as k-nearest neighbors (kNN) or Support Vector
Machines (SVM).
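The Euclidean matching step can be sketched as below. The offline database is assumed here to hold one RSSI vector per location (for instance the per-AP mean of the offline samples); the labels and values are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equally long RSSI vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def match_location(online, offline_db):
    """Return the location whose stored fingerprint is closest to `online`.

    `offline_db` maps a location label to one RSSI vector (one value per AP).
    """
    return min(offline_db, key=lambda loc: euclidean(online, offline_db[loc]))

# Hypothetical offline fingerprints for three positions and three APs.
offline_db = {
    "desk_1": [-50.0, -72.0, -80.0],
    "desk_2": [-65.0, -60.0, -75.0],
    "desk_3": [-78.0, -70.0, -55.0],
}
estimate = match_location([-63.0, -62.0, -77.0], offline_db)
```

Here the online vector differs from the "desk_2" fingerprint by 2 dB on every AP, so that position is returned.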
4 Data Collection System
In this section, we describe the process that we followed to collect the RSSI values
from the BLE devices. Standard, widely adopted techniques and common practices
were followed. To ensure diversity, two different BLE devices were used to
compare the RSSI values in different environments.
We took the records in an office, in a hall and outside the building; a top
view is shown in Figure 1. The first indoor environment was a 10x4 m
office. It was selected due to its large number of wireless devices, which results in an
environment with a lot of noise interference. In the office there are 8 positions, 1.5 to
2 meters apart. Each position is assigned a different label in the
fingerprint dataset.
The second place was a 20x2 m hall outside the office. This place was selected
in order to correlate the distance with the RSSI values in a cleaner environment without
a lot of interference. In this environment, we used the different distances as labels.
Fig. 1. Building Map of RSSI measurements
In order to build our offline dataset, the set of BLE modules has to be fixed at specific
locations. For this purpose, we used two BLE smart sport bracelets: the
Xiaomi Mi Band 2 [25] and the Xiaomi Mi Band 3 [26]. The RSSI
tags were sent in JSON format, together with other information, by an Android application
on a smartphone that served as a fixed anchor station. The smartphone
scans for the BLE bracelets and sends the JSON tags to an MQTT server. Through the
MQTT server the JSON messages are delivered to an MQTT client application written in Java.
After being received, the JSON messages were stored in a MongoDB database for
analysis. All the measurements were performed at the same height, so as to have the
same interference from the surrounding environment. The receiving device knows which
bracelet each message comes from and inserts a label number for each office position.
To ensure a fair comparison between all the experiments, a similar transmit power and
measurement time interval were used in all the components. The time interval
was 5 seconds. The procedure is shown in Figure 2.
[Figure 2 diagram. Participants: :UserApp, :MQTT Broker, :Parent Smartphone,
:Child Smart Bracelet, :MQTT Client, Mongo DB. Labels: Get BLE info, Get GPS signal,
Subscribe to topic BLE, Get BLE json, Store BLE info, Store location info,
Set Smartphone Active, Set BLE active, Active Smartphones, Active BLEs,
msg: Search in the nearest personnel location, BLE info.]
Fig. 2. Data collection procedure
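The last step of the collection pipeline, turning a received JSON message into a labeled fingerprint record, can be sketched as follows. The paper does not list the message schema, so the field names ("mac", "rssi", "timestamp"), the function name and the sample payload are hypothetical; the MQTT transport and the MongoDB storage are left out.

```python
import json

def to_fingerprint_record(payload, label):
    """Parse one JSON scan message and attach the position label.

    The field names ("mac", "rssi", "timestamp") are hypothetical; the
    actual message schema is not given in the paper.
    """
    msg = json.loads(payload)
    return {
        "device": msg["mac"],
        "rssi": int(msg["rssi"]),
        "timestamp": msg["timestamp"],
        "label": label,
    }

# Hypothetical message from one bracelet, recorded at position 3.
payload = '{"mac": "AA:BB:CC:DD:EE:FF", "rssi": -67, "timestamp": 1562313600}'
record = to_fingerprint_record(payload, label=3)
```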
5 RSSI Offline Dataset Description
We reused a dataset collected in a project (see footnote 2), with some modifications in
order to treat fingerprinting as a classification problem with multiple classes. The
dataset has over 100 different RSSI values for each position. The measurements
cover two different environments: one office with many wireless devices and one
building hall. Figure 3 shows the scatter plot of all the RSSI measurements for
both Band 2 and Band 3 in the office. Figure 4 shows the same data for the hall.
The figures show that the RSSI values are better distributed in the office
than in the hall; that is, the different positions in the office are more clearly
distinguishable.
There are 8 different indoor office positions to observe. All the measurements were
carried out on the same day. The distance ranged from 0.2 to 6.0 meters. At each
position, between 100 and 250 recordings were taken. As we can see in Figure
3, the RSSI at each position fluctuates strongly over time.
Figure 4 depicts the scatter plots of the RSSI values for Band 2 and Band 3 in the
hall. In this case, we took measurements at seven different positions (1 to 7 meters,
in one-meter increments). More than 190 measurements were taken at each position.
Here we can see that there is little difference between the Band 2 and
Band 3 devices. Each label here corresponds to the real distance in meters. We can
again observe a large fluctuation at each position.
Footnote 2: Project name is omitted to follow the double-blind submission requirements.
Fig. 3. Scatter Diagram of the RSSI measurements in office
Fig. 4. Scatter Diagram of the RSSI measurements in hall
From the data above, we can conclude that the raw RSSI values are not reliable
enough for localization, as is also observed in most of the literature [15], [9]. In the
following, we use machine learning methods to evaluate the performance of
fingerprinting on our dataset. In addition, we compare these fingerprinting
techniques with the regression approach of our previous work.
6 Results and Findings
In this section we analyze the techniques we selected to perform fingerprinting. The
raw RSSI readings were split into training and test sets. All algorithms were developed
in Python. We used new measurements (around 40) as a test set for the
machine learning techniques.
6.1 K-nearest Neighbors Algorithm
The principle behind nearest neighbor methods is to find a predefined number of
training samples closest in distance to the new point, and predict the label from these.
The number of samples can be a user-defined constant (k-nearest neighbor learning),
or vary based on the local density of points (radius-based neighbor learning). The
algorithm implements learning based on the nearest neighbors of each query point,
where k is an integer value specified by the user.
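Both variants described above can be sketched in a few lines of plain Python. The sketch assumes a single RSSI feature per sample and uses made-up labels and values; it is not the implementation used for the experiments.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to `query`.

    `train` is a list of (rssi, label) pairs with a single RSSI feature.
    """
    nearest = sorted(train, key=lambda s: abs(s[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def radius_predict(train, query, radius):
    """Majority vote among all training samples within `radius` dB of `query`.

    Returns None when no training sample falls inside the radius.
    """
    inside = [label for rssi, label in train if abs(rssi - query) <= radius]
    if not inside:
        return None
    return Counter(inside).most_common(1)[0][0]

# Illustrative training set with two well-separated classes.
train = [(-50, "near"), (-52, "near"), (-55, "near"),
         (-70, "far"), (-72, "far"), (-75, "far")]
```

For example, `knn_predict(train, -53, 3)` votes among the readings at -52, -55 and -50 dBm, while `radius_predict(train, -60, 2)` finds no neighbor and returns None, which shows how the radius-based variant can abstain in sparse regions.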
Figure 5 shows the computed mean error in the office for the Band 3 bracelet using
k = 1 to 40 nearest neighbors. As we can see, the best results are obtained for k = 35.
We followed the same procedure for the other measurements as well (with Band 2 and in
the hall environment). For Band 2, the best k value is 21 in the office and 15 in the hall.
For Band 3 in the hall, the best k value is 3.
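A sweep over k of this kind can be sketched as follows, taking the mean error to be the fraction of misclassified held-out samples (an assumption; the paper does not define the error precisely). The data are synthetic and the helper names are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to `query`."""
    nearest = sorted(train, key=lambda s: abs(s[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def mean_error(train, test, k):
    """Fraction of held-out samples that kNN misclassifies for a given k."""
    wrong = sum(knn_predict(train, rssi, k) != label for rssi, label in test)
    return wrong / len(test)

def best_k(train, test, k_values):
    """Return (k, error) with the lowest mean error; ties go to the smaller k."""
    errors = {k: mean_error(train, test, k) for k in k_values}
    k = min(errors, key=lambda c: (errors[c], c))
    return k, errors[k]

# Tiny synthetic example; each label is a position index.
train = [(-50, 1), (-51, 1), (-60, 2), (-58, 1), (-61, 2), (-62, 2), (-59, 2)]
test = [(-52, 1), (-57, 1), (-60, 2), (-63, 2)]
k, err = best_k(train, test, range(1, len(train) + 1))
```

Selecting k on the same samples that are later used to report accuracy gives an optimistic estimate, so a separate validation split is preferable when enough data are available.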
Fig. 5. Mean Error diagram for Band3 in office
Table 1 shows the accuracy results for the K-nearest neighbors algorithm.
Table 1. Accuracy Results KNN
Environment   Band2   Band3
Office        0.45    0.44
Hall          0.28    0.30
The best results for both Band 2 and Band 3 are obtained in the office environment. This
is because the positions are better separated in the office than in the hall.
6.2 Decision Tree
Decision Trees (DTs) are a non-parametric supervised learning method used for
classification and regression. The goal is to create a model that predicts the value of a
target variable by learning simple decision rules inferred from the data features.
For the decision tree method, we tried to find the best function to measure the quality
of a tree split. We used two different functions, “gini” and “entropy”, to see which one
best fits the models. For Band 3 in the office, the best criterion is entropy with a
maximum tree depth of 5. For Band 2 in the office, the best criterion is gini with a
maximum tree depth of 9. In the hall environment, Band 2 results were better with the
gini criterion and a maximum tree depth of 6. For Band 3 in the hall, the best results
were obtained with the gini criterion and a maximum tree depth of 3.
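The two split-quality functions can be written out directly. The sketch below computes Gini impurity and Shannon entropy and uses either one to pick the best threshold on a single RSSI feature, which is the decision a tree makes at each node. The names "gini" and "entropy" match the `criterion` options of scikit-learn's DecisionTreeClassifier, although the paper does not name the library it used; the sample data are made up.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_c^2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_c * log2(p_c))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(samples, criterion):
    """Find the RSSI threshold that minimizes the weighted child impurity.

    `samples` is a list of (rssi, label) pairs; returns (threshold, impurity).
    """
    values = sorted({rssi for rssi, _ in samples})
    best = (None, float("inf"))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0  # candidate threshold between two readings
        left = [lab for rssi, lab in samples if rssi <= thr]
        right = [lab for rssi, lab in samples if rssi > thr]
        score = (len(left) * criterion(left)
                 + len(right) * criterion(right)) / len(samples)
        if score < best[1]:
            best = (thr, score)
    return best

samples = [(-50, "A"), (-52, "A"), (-54, "A"),
           (-66, "B"), (-68, "B"), (-70, "B")]
```

On this separable toy set both criteria choose the threshold halfway between -66 and -54 dBm; on overlapping RSSI distributions such as those in Figures 3 and 4 they can disagree, which is why both were tried.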
In Table 2 we present the results (average/total) of the Decision Tree technique for
the 4 different use cases. The Support column is the total number of true samples
across all classes.
Table 2. Decision Tree Results (average/total)
Environment/Band   Precision   Recall   F1-score   Support
Office/Band2       0.478       0.428    0.417      332
Office/Band3       0.421       0.416    0.382      332
Hall/Band2         0.219       0.224    0.213      286
Hall/Band3         0.226       0.313    0.260      316
Again, the results are much better in the office than in the hall, with no notable
difference between the two smart bracelets.
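How an "average/total" row of precision, recall, F1-score and support can be derived from per-class scores is sketched below. It assumes that the averages are weighted by class support, which the paper does not state; the labels and predictions are made up.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each true label."""
    scores = {}
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        support = sum(t == label for t in y_true)
        precision = tp / predicted if predicted else 0.0
        recall = tp / support
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1, support)
    return scores

def weighted_average(scores):
    """Support-weighted mean of precision, recall and F1, plus total support."""
    total = sum(s[3] for s in scores.values())
    avg = tuple(sum(s[i] * s[3] for s in scores.values()) / total
                for i in range(3))
    return avg + (total,)

# Illustrative labels for three positions.
y_true = [1, 1, 1, 2, 2, 3]
y_pred = [1, 1, 2, 2, 3, 3]
precision, recall, f1, support = weighted_average(per_class_scores(y_true, y_pred))
```

Under support weighting the averaged recall equals the overall accuracy (4 of 6 correct in this example), which makes such a row directly comparable with accuracy figures.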
6.3 Random Forest
A random forest is a meta estimator that fits several decision tree classifiers on
various sub-samples of the dataset and uses averaging to improve the predictive
accuracy and control over-fitting. Again, here we applied the Random Forest
technique to find the best results.
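The idea of fitting trees on sub-samples and combining them can be illustrated with a deliberately simplified sketch: depth-1 trees (stumps) on a single RSSI feature, each fitted on a bootstrap sample and combined by majority vote. A full random forest grows deeper trees and also samples features at each split; the function names and data here are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(samples):
    """Depth-1 tree on one RSSI feature: pick the threshold with fewest errors."""
    values = sorted({rssi for rssi, _ in samples})
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [lab for rssi, lab in samples if rssi <= thr]
        right = [lab for rssi, lab in samples if rssi > thr]
        l_lab, r_lab = majority(left), majority(right)
        errors = (sum(lab != l_lab for lab in left)
                  + sum(lab != r_lab for lab in right))
        if best is None or errors < best[0]:
            best = (errors, thr, l_lab, r_lab)
    if best is None:  # bootstrap sample with a single distinct RSSI value
        lab = majority([lab for _, lab in samples])
        return (float("inf"), lab, lab)
    return best[1:]

def fit_forest(samples, n_trees, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(samples) for _ in samples])
            for _ in range(n_trees)]

def forest_predict(forest, rssi):
    """Majority vote over the individual stump predictions."""
    votes = [l_lab if rssi <= thr else r_lab for thr, l_lab, r_lab in forest]
    return majority(votes)

samples = [(-50, "near"), (-52, "near"), (-54, "near"), (-56, "near"),
           (-70, "far"), (-72, "far"), (-74, "far"), (-76, "far")]
forest = fit_forest(samples, n_trees=25)
```

Because each stump sees a slightly different resampling of the data, their thresholds differ, and the vote smooths out the effect of individual noisy readings.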
Table 3. Random Forest Results (average/total)
Environment/Band   Precision   Recall   F1-score   Support
Office/Band2       0.407       0.407    0.393      332
Office/Band3       0.421       0.431    0.403      332
Hall/Band2         0.271       0.273    0.248      286
Hall/Band3         0.276       0.282    0.260      316
The results (average/total) in Table 3 are similar to those of the previous techniques;
there are no big differences between them. The best results were again observed for the
office.
6.4 Support Vector Machines (SVM)
SVM is a binary classifier introduced by Vapnik and Chervonenkis in 1963. SVM
aims at finding a linear classifier, i.e., a hyperplane that maximizes the margin
between two classes. It also has extensions to non-linear classifiers and non-separable
data. We used the Radial Basis Function (RBF) kernel in Python for the SVM technique,
because this kernel gave clearly better results than the other
kernels. The results are shown in Table 4.
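For reference, the RBF kernel is k(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it on a few illustrative RSSI samples; the value of gamma and the data are assumptions, since the kernel parameters used in the experiments are not reported.

```python
import math

def rbf_kernel(x, y, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(points, gamma):
    """Pairwise kernel values; symmetric with ones on the diagonal."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

# One-dimensional RSSI samples in dBm (illustrative values).
points = [[-50.0], [-52.0], [-70.0]]
K = kernel_matrix(points, gamma=0.05)
```

Readings 2 dB apart remain strongly similar under this kernel, while readings 20 dB apart have a similarity close to zero, so gamma controls how quickly fluctuating RSSI values stop counting as the same position.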
Table 4. SVM Results (accuracy-%)
Environment/Band   Precision (%)
Office/Band2       41.56
Office/Band3       43.97
Hall/Band2         22.72
Hall/Band3         29.74
The SVM results show that there is a better separation of the classes in the office
environment than in the hall environment, even though the hall has less interference
from the surrounding environment. Over the whole procedure, the best results were
observed with decision trees in the office environment with Band 2. Between the smart
bracelets, there is no big difference in the fingerprinting results.
As the results show, fingerprinting is not an appropriate
approach for our dataset. Compared to our previous work, in which we tested the
distance regression method for RSSI measurements, the localization performance here
is lower. In general, we need to define more discriminative labels
in order to classify them better. This means that we have to increase the distance
between the labels because of the fluctuation in the RSSI. However, this will decrease
the indoor localization accuracy.
7 Conclusion and Future Work
The main contribution of this paper is an examination of the fingerprinting approach with
RSSI recordings and their respective classes, suitable for positioning applications. The
recordings were collected in two different indoor environments. The variability of the
RSSI values in all the environments creates many difficulties for the fingerprinting
approach. Our experiments confirm the findings of the literature that raw RSSI data
are an inappropriate form to use for indoor localization. In such cases, we need either
to change the dataset and take more measurements, or to use the regression technique with
filtering methods. The results were much better in the office than in the hall
environment, because the hall labels are only 1 meter apart and the algorithms cannot
separate them. For future work, we will enrich the datasets with additional measurements
and settings. We will also investigate additional techniques, such as the Kalman filter,
in order to find the exact position of a walking human in a specific area.
8 Acknowledgement
This research has been co-financed by the European Union and Greek national funds
through the Operational Program Competitiveness, Entrepreneurship and Innovation,
under the call RESEARCH-CREATE-INNOVATE (project code: T1EDK-03487).
9 References
1. Yanying Gu, Anthony Lo, and Ignas Niemegeers. 2009. A survey of indoor positioning
systems for wireless personal networks. IEEE Communications Surveys & Tutorials, 11 (1),
2009.
2. Hui Liu, Houshang Darabi, Pat Banerjee, and Jing Liu. 2007. Survey of wireless indoor
positioning techniques and systems. IEEE Transactions on Systems, Man,
and Cybernetics, Part C (Applications and Reviews) 37, 6 (2007), 1067–1080.
3. L. B. Del Mundo, R. L. D. Ansay, C. A. M. Festin and R. M. Ocampo, "A comparison of
Wireless Fidelity (Wi-Fi) fingerprinting techniques," ICTC 2011, Seoul, 2011, pp. 20-25.
4. Germán Martín Mendoza-Silva, Miguel Matey-Sanz, Joaquín Torres-Sospedra, and Joaquín
Huerta. 2019. BLE RSS Measurements Dataset for Research on Accurate Indoor
Positioning. Data 4, 1 (2019), 12.
5. D. Hauschildt and N. Kirchhof, "Improving indoor position estimation by combining active
TDOA ultrasound and passive thermal infrared localization," 2011 8th Workshop on
Positioning, Navigation and Communication, Dresden, 2011, pp. 94-99.
6. K. Wang, A. Nirmalathas, C. Lim, K. Alameh, H. Li, and E. Skafidas, "Indoor infrared
optical wireless localization system with background light power estimation capability,"
Opt. Express 25, 22923-22931 (2017).
7. Lingling Zhu, Aolei Yang, Dingbing Wu, and Li Liu. 2014. Survey of indoor positioning
technologies and systems. In Life System Modeling and Simulation. Springer, 400–409.
8. Li, G., Geng, E., Ye, Z., Xu, Y., Lin, J., & Pang, Y. (2018). Indoor positioning algorithm
based on the improved RSSI distance model. Sensors, 18(9), 2820.
9. Petros Spachos, Ioannis Papapanagiotou, and Konstantinos N Plataniotis. 2018.
Microlocation for smart buildings in the era of the internet of things: A survey of
technologies, techniques, and approaches. IEEE Signal Processing Magazine 35, 5 (2018),
140–152.
10. W. Farjow, A. Chehri, M. Hussein and X. Fernando, "Support Vector Machines for indoor
sensor localization," 2011 IEEE Wireless Communications and Networking Conference,
Cancun, Quintana Roo, 2011, pp. 779-783.
11. Guo, Xiansheng, et al. "Indoor localization by fusing a group of fingerprints based on
random forests." IEEE Internet of Things Journal 5.6 (2018): 4686-4698.
12. V. Honkavirta, T. Perala, S. Ali-Loytty and R. Piche, "A comparative survey of WLAN
location fingerprinting methods," 2009 6th Workshop on Positioning, Navigation and
Communication, Hannover, 2009, pp. 243-251.
13. Xia, Shixiong, et al. "Indoor fingerprint positioning based on Wi-Fi: An overview." ISPRS
International Journal of Geo-Information 6.5 (2017): 135
14. Milioris, Dimitris, et al. "Low-dimensional signal-strength fingerprint-based positioning in
wireless LANs." Ad hoc networks 12 (2014): 100-114.
15. Guoquan Li, Enxu Geng, Zhouyang Ye, Yongjun Xu, Jinzhao Lin, and Yu Pang. 2018.
Indoor positioning algorithm based on the improved RSSI distance model. Sensors 18, 9
(2018), 2820.
16. Bai, Sidong, and Tong Wu. "Analysis of k-means algorithm on fingerprint based indoor
localization system." 2013 5th IEEE International Symposium on Microwave, Antenna,
Propagation and EMC Technologies for Wireless Communications. IEEE, 2013.
17. Tian, X.; Shen, R.; Liu, D.; Wen, Y.; Wang, X. Performance analysis of rss fingerprinting
based indoor localization. IEEE Trans. Mobile Comput. 2017, 16, 2847–2861
18. Nowicki, Michal R. and Jan Wietrzykowski. “Low-Effort Place Recognition with WiFi
Fingerprints Using Deep Learning.” AUTOMATION (2017).
19. L. Xiao, A. Behboodi and R. Mathar, "A deep learning approach to fingerprinting indoor
localization solutions," 2017 27th International Telecommunication Networks and
Applications Conference (ITNAC), Melbourne, VIC, 2017, pp. 1-7.
20. Yiu, Simon, et al. "Wireless RSSI fingerprinting localization." Signal Processing 131
(2017): 235-244.
21. Lun-Wu Yeh, Ming-Hsiu Hsu, Hong-Ying Huang, Yu-Chee Tseng, Design and
implementation of a self-guided indoor robot based on a two-tier localization architecture,
Pervasive Mob. Comput. 8 (2) (2012) 271–281.
22. C. Wu, Z. Yang, Y. Liu, W. Xi, Will: wireless indoor localization without site survey, in:
Proceedings of IEEE INFOCOM, IEEE, 2012, pp. 64–72
23. Z. Ma, S. Poslad, J. Bigham, X. Zhang and L. Men, "A BLE RSSI ranking based indoor
positioning system for generic smartphones," 2017 Wireless Telecommunications
Symposium (WTS), Chicago, IL, 2017, pp. 1-8.
24. Henri Nurminen, Anssi Ristimaki, Simo Ali-Loytty, Robert Piché, Particle filter and
smoother for indoor localization, in: Proceedings of International Conference on Indoor
Positioning and Indoor Navigation (IPIN), 2013, pp. 1–10
25. Mi Band 2. 2019. Specifications. https://www.mi.com/global/miband2/ [Last accessed 5
July 2019].
26. Mi Band 3. 2019. Specifications. https://www.mi.com/global/mi-band-3/ [Last accessed 5
July 2019].