Article
A Public Domain Dataset for Real-Life Human
Activity Recognition Using Smartphone Sensors
Daniel Garcia-Gonzalez *, Daniel Rivero, Enrique Fernandez-Blanco and Miguel R. Luaces

Department of Computer Science and Information Technologies, University of A Coruna, 15071 A Coruna, Spain; daniel.rivero@udc.es (D.R.); enrique.fernandez@udc.es (E.F.B.); miguel.luaces@udc.es (M.R.L.)
* Correspondence: d.garcia2@udc.es
Received: 13 March 2020; Accepted: 7 April 2020; Published: 13 April 2020


Abstract:
In recent years, human activity recognition has become a hot topic inside the scientific community. The reason for this attention is its direct application in multiple domains, like healthcare or fitness. Additionally, the current worldwide use of smartphones makes it particularly easy to get this kind of data from people in a non-intrusive and inexpensive way, without the need for other wearables. In this paper, we introduce our orientation-independent, placement-independent and subject-independent human activity recognition dataset. The information in this dataset is the measurements from the accelerometer, gyroscope, magnetometer, and GPS of the smartphone. Additionally, each measure is associated with one of the four possible registered activities: inactive, active, walking and driving. This work also proposes a support vector machine (SVM) model to perform some preliminary experiments on the dataset. Considering that this dataset was taken from smartphones in their actual use, unlike other datasets, the development of a good model on such data is an open problem and a challenge for researchers. By doing so, we would be able to close the gap between the model and a real-life application.
Keywords: HAR; human activity recognition; sensors; smartphones; dataset; SVM
1. Introduction
The accurate identification of different human activities has become a hot research topic, giving birth to the knowledge area called human activity recognition (HAR). This area tries to identify the action performed by a subject based on the data records from a set of sensors. The recording of these sensors is carried out while the subject performs a series of well-defined movements, such as nodding, raising the hand, walking, running or driving. In this sense, wearable devices, such as activity bracelets or smartphones, have become of great use as sources of this sort of data. These kinds of devices, especially the latter, provide a broad set of sensors in a convenient size which can be used relatively easily with high-grade performance and accuracy. Researchers use the information about people's behaviors gathered by these sensors to support the demands from domains like healthcare, fitness or home automation [1]. The result of the intersection between this widespread sensing all over the world, due to smartphones, and the models developed from that continuous recording is a research area that has attracted increasing attention in recent years [2].
The main challenges to be tackled are two: first, managing the vast amount of information that the devices can produce, as well as its temporal dependency, and, second, the lack of knowledge about how to relate this data to the defined movements. Some methods have achieved remarkable results in extracting information from these sensor readings [3,4]. However, it is relevant to note that in such studies, the devices have been modified to be carried in a particular way, attached to different body
parts, such as the waist or the wrist. Therefore, the success of those models can be biased by the use of data collected in such a controlled environment, with specific device orientations and a few activities. Regarding these orientations, this is far from the ideal scenario, as every person may use these devices, especially their smartphones, in many different ways. For the same individual, different clothes may vary the orientation and placement of the device. In the same way, for different individuals, their body shape, as well as their behavior, can make an enormous difference too. As a consequence, the artificial intelligence (AI) models proposed to date are highly dependent on orientation and placement. For that reason, they cannot be generalized to every kind of user, so there has not been a real transition to real life yet. Presently, the personalization of AI models in HAR for large numbers of people is still an active research topic [5,6], despite being actively researched for nearly a decade [7,8].
To address the aforementioned issues, this work presents a more realistic dataset which is independent of the device orientation and placement, while also keeping the independence of the user. Those are the main differences, in terms of data, with respect to other works developed so far. Additionally, with the implementation of a simple support vector machine (SVM) model, we present a first model as a proof of concept to detect the main activities in this more realistic dataset. In this way, we are laying the foundations for the transition of this type of system into real life.
Therefore, the main contributions of this paper can be summed up as follows:
- Provide and make publicly available a new HAR dataset closer to a realistic scenario (see the files in Supplementary Materials). This dataset is independent of the device orientation and placement, while it is also individual independent.
- The new dataset adds additional signals not widely explored until today, like the GPS and magnetometer sensor measurements.
- A first reference model is provided for this dataset, after applying a specific sliding window length and overlap.
- A study of the best architecture for longer-themed activities, such as those suggested in our work.
The organization of the rest of the paper is as follows. Section 2 shows some related works on HAR, as well as other datasets used in this field. Section 3 gives a thorough explanation of the dataset arrangement, as well as the data collection process. Section 4 presents and discusses the experimental results obtained with the SVM model we propose, using our custom dataset; finally, Section 5 contains the conclusions and future work lines.
2. Related Work
Inside the HAR knowledge area, other datasets have been previously published. The first one worth mentioning, because of its widespread use in different works and comparisons, is the UCI (University of California, Irvine) HAR dataset. Proposed in [9], the dataset contains data gathered while carrying a waist-mounted smartphone with embedded inertial sensors. The time signals, in this case, were sampled in sliding windows of 2.56 s with 50% overlap between them, as the activities researched are done in short intervals of time: standing, sitting, laying down, walking, walking downstairs and walking upstairs. In this work, they also created an SVM model to be exploited. With a total of 561 features extracted, they got particularly good results, with accuracies, precisions and recalls higher than 90%. However, it is a dataset taken in a laboratory environment, with a particular position and orientation. For that reason, in a realistic environment in which users could use their smartphones in their own way, the results obtained would not be trustworthy.
Apart from the UCI HAR dataset, there is the WISDM (Wireless Sensor Data Mining) one [10], which is also widely used. In this case, the sliding windows chosen were of 10 s, with apparently no overlap applied. They mention that they also worked with 20 s, but the results were much better in the first case. Here, the activities researched were: walking, jogging, ascending stairs, descending stairs, sitting and standing. In their work, they used some WEKA (Waikato Environment for Knowledge Analysis) algorithms like J48 or Logistic Regression to perform some predictions over their data, with quite good outcomes. Nonetheless, it has the same problem as the previous case, so its results could not be taken to a real-life environment either.
To highlight these differences, Table 1 shows a qualitative comparison between these two datasets and the one we propose in this paper.
Table 1. Comparison between datasets: UCI HAR, WISDM and the proposed one.

                                         UCI HAR         WISDM           Proposed
Type of actions studied                  Short-themed    Short-themed    Long-themed
Smartphone orientation and positioning   Fixed           Fixed           Free
Different individuals                    Yes             Yes             Yes
Fixed sensor frequency                   Yes             Yes             No
Sensors used                             Acc. and gyro.  Acc. and gyro.  Acc., gyro., magn. and GPS
In the literature, many works tested and validated these datasets. For example, in [11], they made a comparison between Convolutional Neural Network (CNN), Random Forest, Principal Component Analysis (PCA) and K-Nearest Neighbors (KNN) based algorithms. They concluded that CNN outperforms the rest of the ones they tested, apart from seeing that larger sliding windows did not necessarily improve their behavior. Also, they proposed some CNN architectures, making a comparison between different combinations of hyperparameters and the performance they achieved. Similarly, more recently, [12] also proposed a CNN model to address the HAR problem, with apparently slightly better results. On the other hand, [13] presented a combination of feature selection techniques and a deep learning method, concretely a Deep Belief Network (DBN), with some good results, higher than the ones achieved with SVM-based models, which had shown to be among the best algorithms to use in HAR problems. By contrast, in [14,15] they made comparisons between different feature selections for different widely used machine learning (ML) algorithms in the literature. Results showed that frequency-based features are more feasible, at least for algorithms like SVM or CNN, as they yield the best results.
Furthermore, many other works built their own dataset to carry out their research. One of the most interesting ones is [16]. In their work, they propose an online SVM model approach for nine different smartphone orientations. Regarding the data collection, they took it while carrying the mobile in a backpack. In addition, they made a comparison between their custom approach and some other generic classifiers, such as KNN, decision trees, and Naive Bayes. These methods, alongside some other techniques like SVM, CNN, Random Forest, and Gradient Boosting, proved to be valid for HAR with a reasonable amount of data. In the end, their approach outperformed the rest of the classifiers, but they suggested that the future of HAR would be in deep learning methods, as they seem to get better results in practice. More recent works, like [17,18], show similar results. In these cases, more sensors apart from the accelerometer and gyroscope were used, like GPS or magnetometer, showing their potential in longer-themed activities like walking or jogging.
Following the same line, other works made their datasets but applied purely deep learning methods. In [19], the results show that these methods might be the future for HAR, as their results are very promising, at least for non-stationary activities such as walking or running, while SVM still reigns in short-timed activities such as standing or laying down. More recently, works implementing LSTM (long short-term memory) models are arising. The principal advantage of these implementations is that they take into account past information and, being a deep learning-based technique, they do not need a prior feature extraction to perform the training. The downside is that they need big datasets to get reliable classification results, as well as more time to be trained and suitable stop criteria to avoid overfitting (and underfitting). For example, in [20,21] we can see this kind of model with particularly good results. In fact, in [20] they implemented a modification of LSTMs called Bi-LSTMs (bidirectional LSTMs). What makes this modification special is that these models can also learn from the future, achieving accuracies of around 95%.
However, as we already addressed in the introduction, all these works depend on a particular device orientation to get these successful results. In [22], the problem of different device orientations, as well as different smartphone models, was addressed. In this case, they got good results by transforming the phone's coordinate system to the earth coordinate system. Moreover, their results did not show remarkable decreases in accuracy when carrying different smartphone models, but only when the orientation changed. Even so, it does not address the problem that arises when the smartphone is put in different places and not only in the pocket (for example, a bag).
As can be seen, the systems and datasets developed so far in HAR suffer from a lack of realism and of applicability in real life. While the results of many of the models developed in this field are quite promising, their real-life application would probably not be as successful. Therefore, in our work, we set out to study these problems through the creation of our own, more realistic dataset. With a simple SVM model, we can measure the performance differences with respect to other works and overcome them in future developments, if there are any.
3. Materials and Methods
This part contains a step-by-step description of our work, divided into the following sections. First, Section 3.1 presents the procedure carried out to collect the data. Then, in Section 3.2, we describe how the data was prepared for use once the data collection was over, as well as the features extracted from it. Finally, Section 3.3 offers a summary of the classification algorithm applied.
The dataset and all the resources used in this paper are publicly available (see the files in
Supplementary Materials).
3.1. Data Collection
Data collection was carried out through an Android app developed by the authors that allowed easy recording, labeling and storage of the data. To do this, we organized an initial data collection that lasted about a month, to see what data we were getting and to be able to do some initial tests on it. Later, we carried out another, more intensive collection, over a period of about a week, to alleviate the imbalances and weaknesses found in the previous gathering. Each of the people who took part in the study was asked to set the activity they were going to perform at each moment, through that Android app, before starting the data collection. In this way, once the activity was selected, the gathering of such data started automatically, until the user indicated the end of the activity. Hence, each stored session corresponds to a specific activity, carried out by a particular individual. Regarding the activities performed, they were four:

- Inactive: not carrying the mobile phone. For example, the device is on the desk while the individual performs another kind of activity.
- Active: carrying the mobile phone, moving, but not going to a particular place. In other words, this means that, for example, making dinner, being at a concert, buying groceries or doing the dishes count as "active" activities.
- Walking: moving to a specific place. In this case, running or jogging count as a "walking" activity.
- Driving: moving in a means of transport powered by an engine. This would include cars, buses, motorbikes, trucks and any similar vehicle.
The data collected comes from four different sensors: accelerometer, gyroscope, magnetometer and GPS. We selected the accelerometer and gyroscope because they are the most used in the literature and the ones that showed the best results. We also added the magnetometer and GPS because we think they could be useful in this problem. In fact, in our case, GPS should be essential to differentiate the activities performed, by being able to detect the movement speed of the user who carries the smartphone.
We save the data of the accelerometer, the gyroscope and the magnetometer with their tri-axial values. In the case of GPS, we store the device's increments in latitude, longitude and altitude, as well as the bearing, speed and accuracy of the collected measurements. Also, for the accelerometer, we used the gravity sensor, subtracting the last reading of the latter from the observations of the first. In this way, we get clean accelerometer values (linear acceleration), as they are not affected by the smartphone's orientation. Therefore, we obtain a dataset independent of the place where the device is carried, as well as of the device's bearings.
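As an illustration of this gravity subtraction, a minimal sketch in Python follows, assuming tri-axial readings given as 3-element sequences (the actual app code is not reproduced here):

```python
def linear_acceleration(accelerometer, gravity):
    """Subtract the latest gravity reading from the raw accelerometer
    sample, yielding orientation-independent linear acceleration."""
    return [a - g for a, g in zip(accelerometer, gravity)]

# Example: a phone lying flat mostly measures gravity on the z-axis.
print(linear_acceleration([0.1, 0.2, 9.9], [0.0, 0.0, 9.81]))
# -> approximately [0.1, 0.2, 0.09]
```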
On the other hand, before saving the data locally, a series of filters are applied. In the case of the accelerometer and magnetometer, we use a low-pass filter to reduce the noise in these sensors' measurements. Concerning the gyroscope, to mitigate the well-known gyro drift, a high-pass filter was used instead.
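A first-order filter of the kind commonly applied to Android sensor streams could look like the sketch below; the smoothing factor is an assumption for illustration, not necessarily the constant used in our app:

```python
ALPHA = 0.8  # assumed smoothing factor; the app's actual constant may differ

def low_pass(sample, state):
    """First-order IIR low-pass: smooths accelerometer/magnetometer noise.
    `state` is the previous filtered value; returns the new filtered value."""
    return [ALPHA * s + (1 - ALPHA) * x for s, x in zip(state, sample)]

def high_pass(sample, state):
    """High-pass as the residual of the low-pass: suppresses slow gyro drift.
    Returns (filtered_sample, new_low_pass_state)."""
    low = low_pass(sample, state)
    return [x - l for x, l in zip(sample, low)], low
```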
Nevertheless, we also had to deal with Android's sensor frequency problem, as we cannot set the same frequency for each one of them. In our case, this is especially problematic, since we have to join data from very high-frequency sensors, such as the accelerometer, with data from a low-frequency sensor, such as the GPS. From the latter, we obtain new measurements approximately every ten seconds, compared to the ten, or even 50, measurements per second we can get from the accelerometer. Anyhow, given the inability to set a frequency in Android and having to take the values as they are offered by the system itself, there may be gaps in the measurements. These gaps are especially problematic in the case of GPS, where there may be cases in which no new measurements were obtained for more than a minute (although perhaps this is mainly due to the difficulty of getting a GPS signal in closed environments). Such gaps also occur in the case of the accelerometer, gyroscope or magnetometer, despite their offering about 10, 5 or 8 measurements per second, respectively, in the most stable cases. In these cases, the gaps are between 1 and 5 s, and occur mostly at the start of each data collection session, although much less frequently than with GPS. Table 2 shows the average number of recordings per second for each sensor and each activity measured, that is, the resulting average frequency; after each average value, we also show the standard deviation for each class. Please note that for moving activities such as "active" or "walking" there is an increase in these measurements, especially with the accelerometer. This is because the smartphone detects these movements and, to get the most information, its frequency is increased automatically to obtain the maximum number of measurements. This increase is even larger during the "driving" activity. Vibrations due to the vehicle's movement may be the cause, as they might also be detected by the sensors of the smartphone. Additionally, in "walking" and "active" activities there may be certain inactive intervals (like waiting for a traffic light to turn green or just standing doing something, respectively) that lower these averages.
Table 2. Average number of recordings per second for each sensor and each activity measured (mean ± std).

Activity   Accelerometer Hz   Gyroscope Hz   Magnetometer Hz   GPS Hz
Inactive   11.00 ± 16.38      4.66 ± 0.74    7.91 ± 11.72      0.13 ± 0.35
Active     32.55 ± 24.80      4.46 ± 1.44    9.13 ± 13.64      0.06 ± 0.23
Walking    31.24 ± 27.47      6.24 ± 11.86   8.16 ± 12.05      0.06 ± 0.23
Driving    51.16 ± 31.59      4.66 ± 2.42    17.00 ± 20.01     0.04 ± 0.20
In this way, the final distribution of the activities in our dataset is the one shown in Table 3. In this table, we report the total time recorded, the number of recordings, the number of samples and the percentage of data (the latter relative to the number of samples) for each of the activities we specified. Here, each recording refers to a whole activity session, from the moment the individual begins an action until they stop it, while each sample is a single sensor measurement. As can be seen, there are fewer samples for the "inactive" activity in proportion to the total time recorded. This is because the frequency of the sensors increases with activities that require more movement, as explained above, so in this case it remained at a lower value. Therefore, the total percentage of the data may give a wrong view of the total data distribution once the sliding windows are applied. This is because, by using these
windows on which to compute a series of features, the number of samples actually moves into second place, with the total time recorded being the most important value. The more total time recorded, the more sliding windows computed, and the more patterns for that class. Hence, there would be a much clearer imbalance in the dataset, where the "inactive" activity would have three times as many patterns as the "walking" one. Regarding the number of recordings made, there are far more for the "walking" activity than for the rest. Anyhow, we consider that the dataset remains useful and feasible for implementing models that could distinguish these activities. Moreover, the total number of individuals who participated in the study was 19. Therefore, the dataset also contains different kinds of behaviors that end up enriching the possible models developed later.
Table 3. Dataset distribution for each activity measured.
Activity Time Recorded (s) Number of Recordings Number of Samples Percentage of Data
Inactive 292,213 147 7,064,757 24.25%
Active 178,806 99 8,918,021 30.62%
Walking 98,071 200 4,541,130 15.59%
Driving 112,226 128 8,602,902 29.54%
Overall 681,316 574 29,126,810 100%
On the other hand, there is also another problem in Android, as not all devices contain a gyroscope or a magnetometer to this day. While it is mandatory to have an accelerometer and a GPS, a gyroscope or a magnetometer are not compulsory in older versions of Android. Because of this, some of our users took measurements without these sensors. In Tables 4 and 5, we show the number of samples that do not include a gyroscope, or a gyroscope and a magnetometer simultaneously, as the people who did not have a magnetometer did not have a gyroscope either. Something important to highlight in these tables is the difference in the relation between the number of samples and the time recorded compared to the one shown in Table 3. Here, the number of samples is much higher in relation to the time recorded. This may explain the strange data that we pointed out before in Table 2, as the accelerometer may increase its frequency further in general, becoming the only sensor to detect motion. On another note, the percentages shown in these tables are relative to the whole amount of data from Table 3. Fortunately, these percentages are quite low, and the dataset is not strongly affected by this problem. Anyhow, it will be something to keep in mind when preparing the data to be fed to a future AI model.
Table 4. Dataset distribution for each activity measured without gyroscope.
Activity Time Recorded (s) Number of Recordings Number of Samples Percentage of Data
Inactive 11,523 8 668,536 2.29%
Active 13,866 7 619,913 2.13%
Walking 4169 15 584,262 2.01%
Driving 25,718 23 3,776,468 12.97%
Overall 55,276 53 5,649,179 19.40%
Table 5. Dataset distribution for each activity measured without gyroscope and magnetometer.
Activity Time Recorded (s) Number of Recordings Number of Samples Percentage of Data
Inactive 5409 2 269,710 0.93%
Active 10,286 2 90,487 0.31%
Walking 0 0 0 0%
Driving 0 0 0 0%
Overall 25,695 4 360,197 1.24%
3.2. Data Preparation and Feature Extraction
After having collected the data, we proceed to prepare it for later introduction into the model. To do so, and taking into account the well-known time-series segmentation problem in HAR, we opted to use sliding windows of 20 s, with an overlap of 19 s (95%). We chose 20 s because it is the largest size we have seen used in this field. Moreover, we consider that our activities, being long-themed, need a large window size to be correctly detected. We thought an even greater size could be beneficial, but we decided to be conservative and see what happens with a smaller one. As for the overlap, we chose the maximum possible that would allow us comfortable handling of the data, as well as a higher number of patterns, with one second between windows. In this way, we get around half a million patterns, on a quite long time window, compared to previous works. Additionally, with this distribution, we hope to get reliable results for the movements we are analyzing, as they are long-themed (inactive, active, walking and driving).
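As a minimal sketch of this segmentation, assuming each session is stored as a pandas DataFrame with a sorted timestamp index (the released scripts may implement it differently):

```python
import pandas as pd

WINDOW = pd.Timedelta(seconds=20)  # window length
STEP = pd.Timedelta(seconds=1)     # 19 s overlap -> one new window per second

def sliding_windows(session: pd.DataFrame):
    """Yield 20 s windows with 95% overlap from a timestamp-indexed session."""
    start = session.index.min()
    last_start = session.index.max() - WINDOW
    while start <= last_start:
        yield session.loc[start:start + WINDOW]
        start += STEP
```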
However, to apply these windows, it is first necessary to pre-process the data. The algorithm implemented to do so consists of deleting rows that meet one or more of the following properties (a sketch of this filter is shown after the list):

1. GPS increments in latitude, longitude and altitude that are higher than a given threshold, obtained from a prior, and very conservative, data study. We detected occasional "jumps" in our GPS-related values, as some of these observations were outside the expected trajectory. For this reason, we decided to fix a threshold of 0.2 for the latitude and longitude increments, and 500 for the altitude ones. In this way, any value that is too far out of line is eliminated, keeping those that are closer to the expected trajectory.
2. Timestamps that do not match the defined structure (yyyy-MM-dd HH:mm:ss.ZZZ) or that do not correspond to an actual date (year 1970 values, for example).
3. Any misplaced value between the timestamp and the z-axis magnetometer columns, which appeared in a very few observations at the beginning of the project.
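The row filter could be sketched as follows; the thresholds are the ones reported above, while the column names (lat_inc, long_inc, alt_inc, timestamp) are assumptions about the file layout:

```python
from datetime import datetime

LAT_LON_MAX = 0.2  # thresholds from the conservative data study above
ALT_MAX = 500.0

def is_valid_row(row):
    """Return True if a row passes the three pre-processing checks."""
    if abs(row["lat_inc"]) > LAT_LON_MAX or abs(row["long_inc"]) > LAT_LON_MAX:
        return False
    if abs(row["alt_inc"]) > ALT_MAX:
        return False
    try:
        ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    except (ValueError, TypeError):
        return False
    return ts.year > 1970  # discard Unix-epoch placeholder dates
```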
Table 6 shows the mean and standard deviation values of each sensor for each of the activities studied, after the application of this algorithm. To correctly understand the values indicated in this table, it is important to explain what each of these sensors measures. The accelerometer values correspond to the acceleration force applied to the smartphone on the three physical axes (x, y, z), in m/s². The gyroscope measures, in rad/s, the smartphone's rotation speed around each of the three physical axes (x, y, z). The magnetometer measures the environmental geomagnetic field along the three physical axes (x, y, z) of the smartphone, in µT. As for the GPS, its values correspond, on the one hand, to the increments of the geographical coordinates, longitude and latitude, in which the smartphone is located, with respect to the previous measurement. Similarly, the increments in altitude, in meters, were also measured. Then, the values of speed, bearing and accuracy were also taken into account. Speed is measured in m/s and specifies the speed at which the smartphone is traveling. The bearing measures the horizontal direction of travel of the smartphone, in degrees. Finally, the accuracy values refer to the deviation from the actual smartphone location, in meters, where the smaller the value, the better the accuracy of the measurement. Going back to Table 6, in each cell, the mean value is shown first, followed by the standard deviation after the ± sign. Each group of rows corresponds to the set of values that forms each sensor. In the case of the accelerometer, gyroscope and magnetometer, these are the values related to their "X", "Y" and "Z" axes. As for the GPS, this set is formed by the latitude increments (Lat.), the longitude increments (Long.), the altitude increments (Alt.), the speed (Sp.), the bearing (Bear.) and the accuracy (Acc.) of every measurement. Here, it is worth noting some rare data, such as those relating to the GPS during the "inactive" activity, where the values are very high with respect to what is expected from such an action. In this case, we consider that these values are due to the fact that such activity is carried out in indoor environments, which are not so accessible for GPS. Even so, as can be seen, there are some clear differences between the activities, so the possibilities of identification with future models are more than feasible.
Table 6. Sensors' mean and standard deviation values for each activity measured (mean ± std).

Sensor          Value   Inactive                Active                  Walking                 Driving
Accelerometer   X       0.11761 ± 0.45934       0.01338 ± 1.30277       0.09425 ± 3.33422       0.04747 ± 0.83290
                Y       0.06136 ± 0.26764       0.07598 ± 1.45440       0.37604 ± 4.35808       0.12936 ± 0.93828
                Z       0.84318 ± 2.66926       0.13008 ± 1.70294       0.07353 ± 4.09859       0.18127 ± 1.24042
Gyroscope       X       0.00004 ± 0.03828       0.00001 ± 0.36806       0.00760 ± 1.31125       0.00080 ± 0.19224
                Y       0.00004 ± 0.04719       0.00102 ± 0.40959       0.00020 ± 0.89244       0.00277 ± 0.19835
                Z       0.00001 ± 0.03526       0.00055 ± 0.24528       0.00560 ± 0.53685       0.00243 ± 0.16678
Magnetometer    X       25.93805 ± 56.45617     6.03153 ± 30.00980      0.28182 ± 27.03210      5.96356 ± 46.08005
                Y       19.62683 ± 85.70343     0.02890 ± 28.76398      18.73800 ± 29.63926     10.73609 ± 40.46829
                Z       56.60425 ± 33.19593     9.56310 ± 39.76136      0.64541 ± 25.55331      2.93043 ± 29.45994
GPS             Lat.    0.00075 ± 0.00166       0.00112 ± 0.00234       0.00047 ± 0.00220       0.00175 ± 0.00365
                Long.   0.00125 ± 0.00285       0.00118 ± 0.00314       0.00056 ± 0.00300       0.00204 ± 0.00420
                Alt.    32.59169 ± 53.06269     30.77538 ± 48.65634     34.06931 ± 42.51933     41.59391 ± 54.74934
                Sp.     0.37222 ± 0.82495       0.12109 ± 0.81007       0.79924 ± 0.71835       10.82191 ± 11.82733
                Bear.   57.25005 ± 105.49576    14.69719 ± 56.00693     124.85103 ± 119.80663   118.88108 ± 118.78510
                Acc.    265.44485 ± 494.66499   214.57640 ± 429.81169   75.54539 ± 259.59907    192.90736 ± 508.87285
After applying the previous preprocessing, since data collection required the user to tap a button before performing the activity, we eliminated the first five seconds of each activity recording. In the same way, we did so with the final five seconds of each measurement. Hence, we prevent future models from ending up learning the movement that precedes the start or the end of the action, such as, for example, putting the smartphone in the pocket or pulling it out. While doing this, we also take each recorded activity and split it into windows of the previously defined length to prepare them for the next step. The remaining part of each period is discarded. In this way, Table 7 shows the final results after the application of this sliding window and overlap for the samples containing all the sensors. As we already addressed in the previous section, although at the sample level the data may appear lower for activities such as inactive or walking, at the final pattern level the results are much different.
Table 7. Number of patterns for the samples containing all the sensors with a sliding window of 20 s and 19 s overlap.

Inactive        Active          Walking        Driving        Overall
201,501 (40%)   137,407 (27%)   86,383 (17%)   77,852 (16%)   503,143
Later, we had to go through a transformation process to extract the features and provide all the information needed for the classification algorithm. Due to the GPS's low frequency, to carry out this feature extraction, it was necessary to previously replicate some of the data stored by this sensor, for each of the windows applied. To do this, if the difference between one observation and the next was longer than one second, the latter measurement was replicated, with a different timestamp. For this reason, all sessions that do not contain at least one GPS observation are removed from the list of valid ones for this process. We repeat this step until all the windows that may be in the middle are correctly filled. We selected one second as the amount of time between each sample, so there is always at least one observation in each of the windows applied, making the final feature extraction match the data obtained. After that, for each set of measurements, we computed six different types of features, each generating a series of inputs for the AI model. The features used were: mean, variance, median absolute deviation, maximum, minimum and interquartile range, all based on the time domain. All of them were used in previous works like [16], with remarkable results. In this way, we maintain the simplicity of the model, being able to extend it or change it in future works according to the results we achieve.
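The six time-domain features per sensor channel can be sketched with NumPy alone; the representation of a window as a mapping from channel name to array is our assumption:

```python
import numpy as np

def window_features(window):
    """Compute the six time-domain features for each sensor channel of a
    window given as a dict of column name -> NumPy array."""
    feats = {}
    for name, x in window.items():
        q75, q25 = np.percentile(x, [75, 25])
        med = np.median(x)
        feats[f"{name}_mean"] = x.mean()
        feats[f"{name}_var"] = x.var()
        feats[f"{name}_mad"] = np.median(np.abs(x - med))  # median absolute deviation
        feats[f"{name}_max"] = x.max()
        feats[f"{name}_min"] = x.min()
        feats[f"{name}_iqr"] = q75 - q25  # interquartile range
    return feats
```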
3.3. Classification Algorithm
As already indicated in the related work section, there are many kinds of models used in HAR. In our case, we chose to employ an SVM model. Although SVM showed excellent results with rather short-themed activities, we consider it interesting to test it as an initial model on our dataset. It is one of the most used models in HAR, applied in works such as [9,16] and, more recently, in [23], all with outstanding overall performance in this field, as well as being a simple and straightforward AI model.
An SVM is a supervised machine learning model that uses classification algorithms for two-group classification problems. After an SVM model is given labeled training data for each category, it can categorize new examples. To do this, the SVM looks for the hyperplane that maximizes the margins between the two classes. In other words, it looks for the hyperplane whose distance from the nearest element of each category is the largest. Here, non-linearity is achieved through kernel functions, which implicitly map the data to a higher-dimensional space where this linear approximation is applied. On the other hand, other hyperparameters such as C or gamma also affect the definition of this hyperplane. As for C, it marks the width of the margins of this hyperplane, as well as the number of errors that are accepted. Concerning gamma, it directly affects the curve of the hyperplane, making it softer or more accentuated, depending on the patterns that are introduced into the model.
While SVM is typically used to solve binary classification tasks, it can also be used in multi-class problems. To do this, it is necessary to use a one-vs-all or one-vs-one strategy. The first case models each class against all other classes independently, so a classifier is created for each class. The second case models each pair of classes separately, performing various binary classifications until a final result is found. In our case, we use a one-vs-all approach, as it is the most used one in the literature. We implemented it in Python, using the functions provided by Scikit-learn.
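A minimal Scikit-learn sketch of this classifier is shown below; the feature scaling step is our assumption (it is not specified above), and the hyperparameters correspond to the best grid-search configuration reported in Section 4:

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# One-vs-all SVM; RBF kernel, C=10 and gamma=0.01 match the best
# grid-search result reported in Section 4.
model = make_pipeline(
    StandardScaler(),  # scaling assumed; not detailed in the text
    OneVsRestClassifier(SVC(kernel="rbf", C=10, gamma=0.01, max_iter=1000)),
)
# model.fit(X_train, y_train); y_pred = model.predict(X_test)
```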
4. Results and Discussion
4.1. Results
To provide future users with reliable baseline results on this dataset, we conducted a series of experiments on it. For this purpose, we applied SVM classifiers, looking for the best kernel among the polynomial, RBF (Radial Basis Function) and linear ones. Also, we explored the optimal trade-off parameter C, the bandwidth γ for the RBF and polynomial kernels, as well as the degree of the latter, with the features discussed in the previous section. We selected these kernels because, on the one hand, the RBF kernel is one of the most used in the literature and, on the other hand, the linear and the polynomial ones provide a basis for comparison. To select the best configuration of the model, we followed the organization below:
1. First, with the whole combination of all sensors, we made a stratified 10-fold split, obtaining 10 sets with presumably the same proportion of patterns for each class.
2. Then, we took each of those folds and used them to perform a grid search on their corresponding dataset (a sketch of this search is shown after the list). To evaluate the resulting predictions, since we use a one-vs-all approach that will have unbalanced data in each sub-classifier, we chose the f1-score metric to minimize this problem. The f1-score is a measure of the test accuracy, based on the harmonic mean of the precision and the recall metrics:

   F1 = 2 × (Precision × Recall) / (Precision + Recall)

   With that in mind, it is closely linked to the correct classification of each pattern, not being so influenced by class imbalances. When imbalance occurs, accuracy might give an incorrect idea of the model's performance, whereas the f1-score gives a slightly smoother value that better represents the model, making it a good option for our grid search. We also set a maximum number of iterations (1000) as a stop criterion, given the high-dimensional data and the scaling problem of SVM. To carry out this process, we selected the following hyperparameters: as kernels, we chose the polynomial, the RBF and the linear ones, for the reasons addressed before. As for the parameter C, we selected values from the set {1, 10, 100, 1000, 10000}. For the γ parameter, specifically for the RBF and polynomial kernels, we chose values from the set {0.0001, 0.001, 0.01, 0.1, 1}. Concerning the degree parameter of the polynomial kernel, we selected values from the set {1, 2, 3, 4}.
3. Once the grid search was done, we evaluated the results and selected the best combination of hyperparameters for each fold. Then, we tested the best corresponding model.
4. Finally, we studied the impact of the gyroscope and magnetometer, taking advantage of the users who could not include these sensors in their measurements. For this purpose, we prepared three different sets: accelerometer + gyroscope + magnetometer + GPS (all users but the ones missing gyroscope and magnetometer), accelerometer + gyroscope + GPS (all users but the ones missing magnetometer) and accelerometer + GPS (all users).
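The grid search referenced in step 2 could be arranged as in the following sketch, assuming the feature matrix X and labels y have already been built; the exact scoring and fold handling in our scripts may differ:

```python
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

param_grid = [
    {"kernel": ["linear"], "C": [1, 10, 100, 1000, 10000]},
    {"kernel": ["rbf"], "C": [1, 10, 100, 1000, 10000],
     "gamma": [0.0001, 0.001, 0.01, 0.1, 1]},
    {"kernel": ["poly"], "C": [1, 10, 100, 1000, 10000],
     "gamma": [0.0001, 0.001, 0.01, 0.1, 1], "degree": [1, 2, 3, 4]},
]
search = GridSearchCV(
    SVC(max_iter=1000),           # stop criterion from step 2
    param_grid,
    scoring="f1_macro",           # f1-score to soften class imbalance
    cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
)
# search.fit(X, y); print(search.best_params_, search.best_score_)
```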
The first steps of the experiments yielded the results shown in Table 8. In this table, each cell shows the average test f1-score obtained, followed by its standard deviation. As can be seen, the best results correspond, in general, to the RBF kernel and, more specifically, to the cases where γ equals 0.01, especially in conjunction with C = 10. With this combination of hyperparameters, we managed to achieve an f1-score of 64.14%.
The average confusion matrix yielded by the third step of the experiments is the one shown in Table 9, along with its particular metrics (recall, precision and accuracy). This result corresponds to an accuracy of 67.22%. As can be seen, the model manages to correctly separate "inactive" events but struggles with the rest, especially with the "active" one. In this case, we think that this is due to the diffuse nature of this action, since it combines both moments of inactivity and movement, in which the user may walk from one place to another. On the other hand, we can also see that the activities of "walking" and "driving" are confused with each other. This was expected considering that most driving took place in an urban environment. In this scenario, there may be traffic jams or moments of less fluidity that may be quite similar, at a sensor level, to the data obtained while performing the "walking" activity, as well as the rest of the actions. Anyhow, the GPS is probably very influential in this confusion, and it would be interesting to change the related features used to see how they affect the final classification. Maybe greater sliding window sizes or some kind of feature related to the Fourier transform of the signal, to pick up its periodic component, could positively affect the final model.
Table 8. Mean f1-scores achieved for each combination of kernel, C, γ and degree hyperparameters in the grid search (mean ± std). The best result found is marked with an asterisk.

Kernel                 C = 1            C = 10             C = 100          C = 1000         C = 10000
Linear                 36.15% ± 15.45   31.41% ± 12.78     31.41% ± 12.78   31.41% ± 12.78   31.41% ± 12.78
RBF, γ = 0.0001        10.56% ± 13.25   4.57% ± 0.42       17.04% ± 9.20    40.72% ± 16.80   34.70% ± 13.68
RBF, γ = 0.001         20.67% ± 14.81   21.30% ± 19.99     39.71% ± 16.41   38.70% ± 20.79   46.70% ± 17.60
RBF, γ = 0.01          60.37% ± 12.76   64.14% ± 19.66 *   56.47% ± 15.95   57.20% ± 16.79   56.49% ± 14.14
RBF, γ = 0.1           51.76% ± 12.00   54.10% ± 14.91     57.09% ± 13.24   51.62% ± 14.97   51.36% ± 15.18
RBF, γ = 1             50.99% ± 12.84   41.16% ± 12.58     41.28% ± 12.65   41.28% ± 12.65   41.28% ± 12.65
Poly d=1, γ = 0.0001   18.09% ± 13.92   21.04% ± 18.97     41.00% ± 19.70   32.67% ± 10.93   37.12% ± 16.61
Poly d=1, γ = 0.001    16.09% ± 8.09    37.86% ± 16.86     37.82% ± 14.72   37.26% ± 18.32   32.01% ± 13.80
Poly d=1, γ = 0.01     37.73% ± 18.58   41.49% ± 17.97     36.16% ± 12.30   36.67% ± 12.98   36.67% ± 12.98
Poly d=1, γ = 0.1      33.36% ± 15.56   32.58% ± 13.87     34.11% ± 12.32   34.11% ± 12.32   34.11% ± 12.32
Poly d=1, γ = 1        36.15% ± 15.45   31.41% ± 12.78     31.41% ± 12.78   31.41% ± 12.78   31.41% ± 12.78
Poly d=2, γ = 0.0001   10.96% ± 2.27    6.27% ± 2.76       7.03% ± 5.52     9.34% ± 8.00     9.60% ± 9.07
Poly d=2, γ = 0.001    7.03% ± 5.52     9.10% ± 7.52       8.39% ± 6.12     10.62% ± 4.09    22.55% ± 6.75
Poly d=2, γ = 0.01     9.60% ± 9.07     10.55% ± 3.65      23.08% ± 7.16    24.34% ± 6.93    27.69% ± 7.74
Poly d=2, γ = 0.1      22.73% ± 6.26    23.46% ± 4.99      25.84% ± 6.67    25.82% ± 6.64    25.82% ± 6.64
Poly d=2, γ = 1        25.58% ± 8.47    25.59% ± 8.46      25.59% ± 8.46    25.59% ± 8.46    25.59% ± 8.46
Poly d=3, γ = 0.0001   6.11% ± 2.83     6.86% ± 3.19       10.61% ± 6.90    9.15% ± 5.78     11.16% ± 5.29
Poly d=3, γ = 0.001    9.15% ± 5.78     11.16% ± 5.29      6.04% ± 3.64     8.56% ± 4.89     19.86% ± 9.13
Poly d=3, γ = 0.01     8.32% ± 5.16     23.63% ± 7.98      23.18% ± 9.30    20.63% ± 6.19    30.29% ± 18.15
Poly d=3, γ = 0.1      21.79% ± 8.69    25.40% ± 15.24     27.70% ± 14.57   27.70% ± 14.57   27.70% ± 14.57
Poly d=3, γ = 1        23.11% ± 15.45   23.11% ± 15.45     23.11% ± 15.45   23.11% ± 15.45   23.11% ± 15.45
Poly d=4, γ = 0.0001   7.33% ± 5.60     8.20% ± 3.53       6.96% ± 3.13     4.78% ± 0.41     10.36% ± 7.03
Poly d=4, γ = 0.001    10.36% ± 7.03    7.63% ± 5.89       7.84% ± 5.79     13.20% ± 8.45    9.68% ± 8.53
Poly d=4, γ = 0.01     9.68% ± 8.61     9.54% ± 8.82       8.04% ± 5.00     7.11% ± 3.37     11.79% ± 8.47
Poly d=4, γ = 0.1      8.39% ± 3.41     12.34% ± 8.48      12.34% ± 8.48    12.34% ± 8.48    12.34% ± 8.48
Poly d=4, γ = 1        9.02% ± 5.63     9.02% ± 5.63       9.02% ± 5.63     9.02% ± 5.63     9.02% ± 5.63
Table 9. Average confusion matrix for the experiments conducted.

                           Ground Truth
Predicted    Inactive   Active   Walking   Driving   Precision
Inactive     15,887     1904     1165      1195      78.84%
Active       3226       6159     3134      1222      44.82%
Walking      259        1540     5863      976       67.88%
Driving      149        653      1073      5910      75.92%
Recall       81.38%     60.05%   52.19%    63.53%    67.22%
To a lesser extent, it is also important to note that there are some cases in which other activities are misclassified as the "inactive" action. This was also relatively expected, as every activity is subject to prolonged stoppages. For example, while "walking" or "driving", traffic lights that force the individual to stop may appear. In these situations, these pauses may be mistaken by the model for cases of pure inactivity. Perhaps the use of other, more specific features could improve the differentiation in all these cases, as well as the use of other types of AI algorithms and bigger sliding window sizes.
Regarding the fourth and last step, we also applied the same algorithm to the rest of the data sets formed, obtaining the results shown in Table 10. As in the other tables shown, each cell contains the average value, followed by the standard deviation. This comparison is made from the average of the test values yielded by the experiments conducted on each set. As can be seen, the combination of the four sensors performs better than the other two, especially compared with the set formed only by the accelerometer and GPS. Both the gyroscope and the magnetometer seem to have a quite important implication in the final classification. In the first case, the gyroscope seems to significantly improve the final accuracy, as in the other works that included it in their studies. However, it looks like what makes the biggest difference is adding this sensor on top of the magnetometer.
Table 10. Mean accuracies achieved for each set of data, with the best group result marked with an asterisk.

Acc. + GPS       Acc. + Magn. + GPS   Acc. + Gyro. + Magn. + GPS
60.10% ± 11.43   62.66% ± 11.68       67.22% ± 13.13 *
4.2. Discussion
Although the results obtained might not seem as good as those seen so far in the rest of the literature, we consider that they are promising given the problem addressed. The data used are very different from those of the other datasets that currently exist in the field, as well as being much less specific. Therefore, while the results may seem worse, they are actually not comparable. The data collected correspond to different profiles of people, each with their physical peculiarities and ways of using their smartphone. Moreover, the nature of each of the defined activities implies short periods of some of the other actions. For example, within the "active" activity, there are both moments of inactivity and moments of travel. Within the "walking" activity, there may be stops due to traffic lights or other obstacles encountered along the way. Furthermore, during the action of "driving", it is noteworthy that an urban environment has many peculiarities and stops that can complicate the final classification. Therefore, given these problems and the simplicity of the proposed model, we consider that these results are a relatively good first approximation of what they could be. We believe that the results could be improved considerably with other types of models also used in this field, such as Random Forest, or through the application of algorithms based on deep learning, such as LSTM, which have also shown exceptional performance in this domain. Hence, with this change in the model to be used and the addition of new metrics, we would surely get closer to the real-life environment we are seeking.
5. Conclusions and Future Work
In this paper, we presented a dataset for the HAR field. This dataset contains information from 19 different users, each with their own way of using their smartphone, as well as their physical peculiarities. The amount of data is enough to perform classifications on it, and the information gathered is realistic enough to be taken to a real-life environment.
Therefore, with the development of this dataset, we hope to alleviate the problems seen in other works. While it is true that the final results we got may not be as good as those seen to date, we believe that this will be the beginning of the road toward taking the models developed for HAR to real life. We also hope that the current confusions of the proposed model, between some of the defined activities, can be overcome in future research. In this way, it would be possible to implement a system capable of correctly detecting a person's movements or activities, regardless of the way they use their smartphone or their physical peculiarities. This could be very interesting for many companies or individuals that want to monitor or predict the activities performed by a particular individual.
For this reason, we will continue advancing along the same line of work, testing other techniques that have also had pretty good results in the field, such as Random Forest, CNN or LSTM. Also, the deletion or addition of features, such as those related to the Fourier transform, to search for possible periodic components in the stored signals, could positively affect the final model. In this way, we will be able to compare the results obtained, in search of the best model to solve this problem. In addition, we will also explore the real impact of the sensors used, as well as other possible greater sliding window sizes and combinations of hyperparameters, in search of improving the best configuration found so far.
Supplementary Materials: The complete dataset, as well as the scripts used in our experiments, are available online at http://lbd.udc.es/research/real-life-HAR-dataset. Similarly, they have also been uploaded to Mendeley Data [24].

Author Contributions: Conceptualization, D.G.-G., D.R. and E.F.-B.; data curation, D.G.-G.; formal analysis, D.G.-G., D.R. and E.F.-B.; funding acquisition, M.R.L.; investigation, D.G.-G.; methodology, D.R., E.F.-B. and M.R.L.; project administration, M.R.L.; resources, M.R.L.; software, D.G.-G.; supervision, D.R., E.F.-B. and M.R.L.; validation, D.G.-G.; visualization, D.G.-G.; writing—original draft preparation, D.G.-G.; writing—review and editing, D.G.-G., D.R. and E.F.-B. All authors have read and agreed to the published version of the manuscript.

Funding: This research was partially funded by Xunta de Galicia/FEDER-UE (ConectaPeme, GEMA: IN852A 2018/14), MINECO-AEI/FEDER-UE (Flatcity: TIN2016-77158-C4-3-R) and Xunta de Galicia/FEDER-UE (AXUDAS PARA A CONSOLIDACION E ESTRUTURACION DE UNIDADES DE INVESTIGACION COMPETITIVAS. GRC: ED431C 2017/58 and ED431C 2018/49).

Acknowledgments: First of all, we want to thank the CESGA for its support in executing the code related to this paper. Also, we would like to thank all the participants who took part in our data collection experiment.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Zhu, N.; Diethe, T.; Camplani, M.; Tao, L.; Burrows, A.; Twomey, N.; Kaleshi, D.; Mirmehdi, M.; Flach, P.; Craddock, I. Bridging e-health and the internet of things: The SPHERE project. IEEE Intell. Syst. 2015, 30, 39–46.
2. Lara, O.D.; Labrador, M.A. A survey on human activity recognition using wearable sensors. IEEE Commun. Surv. Tutorials 2012, 15, 1192–1209.
3. Attal, F.; Mohammed, S.; Dedabrishvili, M.; Chamroukhi, F.; Oukhellou, L.; Amirat, Y. Physical human activity recognition using wearable sensors. Sensors 2015, 15, 31314–31338.
4. Shoaib, M.; Bosch, S.; Incel, O.; Scholten, H.; Havinga, P. Complex human activity recognition using smartphone and wrist-worn motion sensors. Sensors 2016, 16, 426.
5. Ferrari, A.; Micucci, D.; Mobilio, M.; Napoletano, P. On the personalization of classification models for human activity recognition. IEEE Access 2020, 8, 32066–32079.
6. Solis Castilla, R.; Akbari, A.; Jafari, R.; Mortazavi, B.J. Using intelligent personal annotations to improve human activity recognition for movements in natural environments. IEEE J. Biomed. Health Inform. 2020, doi:10.1109/JBHI.2020.2966151.
7. Weiss, G.; Lockhart, J. The impact of personalization on smartphone-based activity recognition. In Proceedings of the Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–23 July 2012.
8. Lane, N.; Xu, Y.; Lu, H.; Hu, S.; Choudhury, T.; Campbell, A.; Zhao, F. Enabling large-scale human activity inference on smartphones using Community Similarity Networks (CSN). In Proceedings of the 13th International Conference on Ubiquitous Computing, Beijing, China, 17–21 September 2011; pp. 355–364.
9. Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. A public domain dataset for human activity recognition using smartphones. In Proceedings of the ESANN, Bruges, Belgium, 24–26 April 2013.
10. Kwapisz, J.R.; Weiss, G.M.; Moore, S.A. Activity recognition using cell phone accelerometers. ACM SIGKDD Explor. Newsl. 2011, 12, 74–82.
11. Ignatov, A. Real-time human activity recognition from accelerometer data using Convolutional Neural Networks. Appl. Soft Comput. 2018, 62, 915–922.
12. Sikder, N.; Chowdhury, M.S.; Arif, A.S.; Nahid, A.A. Human activity recognition using multichannel convolutional neural network. In Proceedings of the 5th International Conference on Advances in Electronics Engineering, Dhaka, Bangladesh, 26–28 September 2019.
13. Hassan, M.M.; Uddin, M.Z.; Mohamed, A.; Almogren, A. A robust human activity recognition system using smartphone sensors and deep learning. Future Gener. Comput. Syst. 2018, 81, 307–313.
14. Seto, S.; Zhang, W.; Zhou, Y. Multivariate time series classification using dynamic time warping template selection for human activity recognition. In Proceedings of the IEEE Symposium Series on Computational Intelligence, Cape Town, South Africa, 7–10 December 2015; pp. 1399–1406.
15. Sousa, W.; Souto, E.; Rodrigres, J.; Sadarc, P.; Jalali, R.; El-Khatib, K. A comparative analysis of the impact of features on human activity recognition with smartphone sensors. In Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web, Gramado, Brazil, 17–20 October 2017; pp. 397–404.
16. Chen, Z.; Zhu, Q.; Soh, Y.C.; Zhang, L. Robust human activity recognition using smartphone sensors via CT-PCA and online SVM. IEEE Trans. Ind. Inform. 2017, 13, 3070–3080.
17. Figueiredo, J.; Gordalina, G.; Correia, P.; Pires, G.; Oliveira, L.; Martinho, R.; Rijo, R.; Assuncao, P.; Seco, A.; Fonseca-Pinto, R. Recognition of human activity based on sparse data collected from smartphone sensors. In Proceedings of the IEEE 6th Portuguese Meeting on Bioengineering (ENBENG), Lisbon, Portugal, 22–23 February 2019; pp. 1–4.
18. Voicu, R.A.; Dobre, C.; Bajenaru, L.; Ciobanu, R.I. Human physical activity recognition using smartphone sensors. Sensors 2019, 19, 458.
19. Ronao, C.A.; Cho, S.B. Human activity recognition with smartphone sensors using deep learning neural networks. Expert Syst. Appl. 2016, 59, 235–244.
20. Hernández, F.; Suárez, L.F.; Villamizar, J.; Altuve, M. Human activity recognition on smartphones using a bidirectional LSTM network. In Proceedings of the XXII Symposium on Image, Signal Processing and Artificial Vision (STSIVA), Bucaramanga, Colombia, 24–26 April 2019; pp. 1–5.
21. Badshah, M. Sensor-Based Human Activity Recognition Using Smartphones. Master's Thesis, San Jose State University, San Jose, CA, USA, 2019.
22. Ustev, Y.E.; Durmaz Incel, O.; Ersoy, C. User, device and orientation independent human activity recognition on mobile phones: Challenges and a proposal. In Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, Zurich, Switzerland, 8–12 September 2013; pp. 1427–1436.
23. Ahmed, N.; Rafiq, J.I.; Islam, M.R. Enhanced human activity recognition based on smartphone sensor data using hybrid feature selection model. Sensors 2020, 20, 317.
24. Garcia-Gonzalez, D.; Rivero, D.; Fernandez-Blanco, E.; R. Luaces, M. A public domain dataset for real-life human activity recognition using smartphone sensors. Mendeley Data 2020, V1, doi:10.17632/3xm88g6m6d.1.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
... To the best of our knowledge, there has not been any experimental design in the literature for this type of data. Already published works usually focus either on acceleration, gyroscopic or magnetometer data [11,33], or on GPS data [9,34]. The purpose of this work is to identify walking activity phases in a data set measuring daily life activities. ...
... As such, they share a number of commonalities in their experimental design which we can take inspiration from. For instance, in the works of Ortiz [33], Anguita et al. [41], Ortiz [42], Kwapisz et al. [11], Garcia-Gonzalez et al. [34] and Beaufils et al. [9], we can grasp two key experimental setup features: (i) the type of activities are kept relatively simple among all types of activities that an individual can subject themselves to in a free living environment and (ii) it is important to mark a pause in-between two activities when building the training set because it facilitates time point labelling in the different activities. ...
Article
Full-text available
Solutions to assess walking deficiencies are widespread and largely used in healthcare. Wearable sensors are particularly appealing, as they offer the possibility to monitor gait in everyday life, outside a facility in which the context of evaluation biases the measure. While some wearable sensors are powerful enough to integrate complex walking activity recognition models, non-invasive lightweight sensors do not always have the computing or memory capacity to run them. In this paper, we propose a walking activity recognition model that offers a viable solution to this problem for any wearable sensors that measure rotational motion of body parts. Specifically, the model was trained and tuned using data collected by a motion sensor in the form of a unit quaternion time series recording the hip rotation over time. This time series was then transformed into a real-valued time series of geodesic distances between consecutive quaternions. Moving average and moving standard deviation versions of this time series were fed to standard machine learning classification algorithms. To compare the different models, we used metrics to assess classification performance (precision and accuracy) while maintaining the detection prevalence at the level of the prevalence of walking activities in the data, as well as metrics to assess change point detection capability and computation time. Our results suggest that the walking activity recognition model with a decision tree classifier yields the best compromise in terms of precision and computation time. The sensor that was used had purposely low computing and memory capacity so that reported performances can be thought of as the lower bounds of what can be achieved. Walking activity recognition is performed online, i.e., on-the-fly, which further extends the range of applicability of our model to sensors with very low memory capacity.
... Other popular human activity recognition datasets include UniMiB SHAR [14] containing accelerometer samples captured from a smartphone, Real-Life HAR [15] also collected from a smartphone but focusing on real-life situations (for example inactive, active or driving) rather than a laboratory setting, and OPPORTUNITY [16] that uses many sensors of different modalities. ...
... There are 14 subjects in the training set and six subjects in the testing set, representing approximately 77% and 23% of the total number of samples, respectively. Subjects number 5,15,17,18,19, and 20 have been chosen for the testing set since they have completed all activities. Moreover, these subjects have the lowest standard deviation on the percentage of samples for each class in the testing set. ...
Article
Full-text available
Human activity recognition can help in elderly care by monitoring the physical activities of a subject and identifying a degradation in physical abilities. Vision-based approaches require setting up cameras in the environment, while most body-worn sensor approaches can be a burden on the elderly due to the need of wearing additional devices. Another solution consists in using smart glasses, a much less intrusive device that also leverages the fact that the elderly often already wear glasses. In this article, we propose UCA-EHAR, a novel dataset for human activity recognition using smart glasses. UCA-EHAR addresses the lack of usable data from smart glasses for human activity recognition purpose. The data are collected from a gyroscope, an accelerometer and a barometer embedded onto smart glasses with 20 subjects performing 8 different activities (STANDING, SITTING, WALKING, LYING, WALKING_DOWNSTAIRS, WALKING_UPSTAIRS, RUNNING, and DRINKING). Results of the classification task are provided using a residual neural network. Additionally, the neural network is quantized and deployed on the smart glasses using the open-source MicroAI framework in order to provide a live human activity recognition application based on our dataset. Power consumption is also analysed when performing live inference on the smart glasses’ microcontroller.
... To date many datasets acquired from RF, vision and inertial sensors have been published, which are intended for a number of applications. These include WiFi CSI-based activity recognition [11][12][13][14][15] , sign language recognition 16 , fall detection 17 , device-to-device localization 18 , or UWB-based gesture recognition 19 , motion detection/recognition [20][21][22] , passive target localization 23 , people counting 24 , or active radar-based sensing [25][26][27][28] , as well as physical activity recognition using inertial/ wearable sensors [29][30][31][32][33][34] , while others have proposed action recognition datasets acquired from vision and motion capture systems [35][36][37][38] . However, most of these databases have some shortcomings in the layout and number of sensors, which cannot fully represent the human activity features. ...
Article
Full-text available
This paper presents a comprehensive dataset intended to evaluate passive Human Activity Recognition (HAR) and localization techniques with measurements obtained from synchronized Radio-Frequency (RF) devices and vision-based sensors. The dataset consists of RF data including Channel State Information (CSI) extracted from a WiFi Network Interface Card (NIC), Passive WiFi Radar (PWR) built upon a Software Defined Radio (SDR) platform, and Ultra-Wideband (UWB) signals acquired via commercial off-the-shelf hardware. It also consists of vision/Infra-red based data acquired from Kinect sensors. Approximately 8 hours of annotated measurements are provided, which are collected across two rooms from 6 participants performing 6 daily activities. This dataset can be exploited to advance WiFi and vision-based HAR, for example, using pattern recognition, skeletal representation, deep learning algorithms or other novel approaches to accurately recognize human activities. Furthermore, it can potentially be used to passively track a human in an indoor environment. Such datasets are key tools required for the development of new algorithms and methods in the context of smart homes, elderly care, and surveillance applications.
... A recent real-life human activity dataset was published by the University of A Coruna [97]. ey recorded about 189 hours of measurements from the accelerometer, gyroscope, magnetometer, and GPS of smartphones for 19 different subjects with no restriction for mobile position. ...
Article
Full-text available
Using artificial intelligence and machine learning techniques in healthcare applications has been actively researched over the last few years. It holds promising opportunities as it is used to track human activities and vital signs using wearable devices and assist in diseases’ diagnosis, and it can play a great role in elderly care and patient’s health monitoring and diagnostics. With the great technological advances in medical sensors and miniaturization of electronic chips in the recent five years, more applications are being researched and developed for wearable devices. Despite the remarkable growth of using smart watches and other wearable devices, a few of these massive research efforts for machine learning applications have found their way to market. In this study, a review of the different areas of the recent machine learning research for healthcare wearable devices is presented. Different challenges facing machine learning applications on wearable devices are discussed. Potential solutions from the literature are presented, and areas open for improvement and further research are highlighted.
Article
This paper proposes a graph theory approach to perform the human activity recognition. However, as the most common signal employed for performing the activity recognitions is the motion signal while the motion signal is well structured, ordered and independent one another, the graph theory cannot be applied directly. To address this issue, this paper proposes the correlation coefficient based method for generating the graph using the signals in the UCI-HAR dataset. Here, the predefined thresholds are used for determining whether the nodes are connected or not. The features are updated according to the activities via the multi-aggregation fusion approach. Finally, the random forest is used to classify these activities. To demonstrate the effectiveness of our proposed method, the percentage accuracy and the macro averaged F1 score yielded by our proposed method with the graph weights are compared to those without the graph weights as well as with the multi-aggregator are compared with the mean aggregator. Also, our proposed method is compared to some common methods such as those based on the CNN and SVM. It is found that our proposed method can achieve the percentage accuracy up to 98.74%, which significantly outperforms the existing methods.
Conference Paper
Today, portable devices like smartwatches and smartphones have made a great impact in human's wellbeing. From sleep monitoring to exercise scheduling, Human Activity Recognition had played a major role in the habits of the people. In this work, we exploit a Time Series dataset that describes a Human Activity Recognition signal. In the beginning, we extract the features oriented on Spectral, Statistical and Temporal domains. Then, we construct a dataset for each domain and we calculate the classification results using a number of different classifiers. In the sequel, we apply Active Learning techniques and calculate their classification accuracy performance using a small portion of the original datasets as initial labeled set. Finally, we compare the original results with the ones produced with Active Learning methods.
Article
Background Owing to low cost and ubiquity, human activity recognition using smartphones is emerging as a trendy mobile application in diverse appliances such as assisted living, healthcare monitoring, etc. Analysing this one-dimensional time-series signal is rather challenging due to its spatial and temporal variances. Numerous deep neural networks (DNNs) are conducted to unveil deep features of complex real-world data. However, the drawback of DNNs is the un-interpretation of the network's internal logic to achieve the output. Furthermore, a huge training sample size (i.e. millions of samples) is required to ensure great performance. Methods In this work, a simpler yet effective stacked deep network, known as Stacked Discriminant Feature Learning (SDFL), is proposed to analyse inertial motion data for activity recognition. Contrary to DNNs, this deep model extracts rich features without the prerequisite of a gigantic training sample set and tenuous hyper-parameter tuning. SDFL is a stacking deep network with multiple learning modules, appearing in a serialized layout for multi-level feature learning from shallow to deeper features. In each learning module, Rayleigh coefficient optimized learning is accomplished to extort discriminant features. A subject-independent protocol is implemented where the system model (trained by data from a group of users) is used to recognize data from another group of users. Results Empirical results demonstrate that SDFL surpasses state-of-the-art methods, including DNNs like Convolutional Neural Network, Deep Belief Network, etc., with ~97% accuracy from the UCI HAR database with thousands of training samples. Additionally, the model training time of SDFL is merely a few minutes, compared with DNNs, which require hours for model training. Conclusions The supremacy of SDFL is corroborated in analysing motion data for human activity recognition requiring no GPU but only a CPU with a fast- learning rate.
Article
Full-text available
Recently, a significant amount of literature concerning machine learning techniques has focused on automatic recognition of activities performed by people. The main reason for this considerable interest is the increasing availability of devices able to acquire signals which, if properly processed, can provide information about human activities of daily living (ADL). The recognition of human activities is generally performed by machine learning techniques that process signals from wearable sensors and/or cameras appropriately arranged in the environment. Whatever the type of sensor, activities performed by human beings have a strong subjective characteristic that is related to different factors, such as age, gender, weight, height, physical abilities, and lifestyle. Personalization models have been studied to take into account these subjective factors and it has been demonstrated that using these models, the accuracy of machine learning algorithms can be improved. In this work we focus on the recognition of human activities using signals acquired by the accelerometer embedded in a smartphone. The contributions of this research are mainly three. A first contribution is the definition of a clear validation model that takes into account the problem of personalization and which thus makes it possible to objectively evaluate the performances of machine learning algorithms. A second contribution is the evaluation, on three different public datasets, of a personalization model which considers two aspects: the similarity between people related to physical aspects (age, weight, and height) and similarity related to intrinsic characteristics of the signals produced by these people when performing activities. A third and last contribution is the development of a personalization model that considers both the physical and signal similarities. The experiments show that the employment of personalization models improves, on average, the accuracy, thus confirming the soundness of the approach and paving the way for future investigations on this topic.
Conference Paper
Full-text available
Human Activity Recognition (HAR) simply refers to the capacity of a machine to perceive human actions. HAR is a prominent application of advanced Machine Learning and Artificial Intelligence techniques that utilize computer vision to understand the semantic meanings of heterogeneous human actions. This paper describes a supervised learning method that can distinguish human actions based on data collected from practical human movements. The primary challenge while working with HAR is to overcome the difficulties that come with the cyclostationary nature of the activity signals. This study proposes a HAR classification model based on a two-channel Convolutional Neural Network (CNN) that makes use of the frequency and power features of the collected human action signals. The model was tested on the UCI HAR dataset, which resulted in a 95.25% classification accuracy. This approach will help to conduct further researches on the recognition of human activities based on their biomedical signals.
Article
Full-text available
Human activity recognition (HAR) techniques are playing a significant role in monitoring the daily activities of human life such as elderly care, investigation activities, healthcare, sports, and smart homes. Smartphones incorporated with varieties of motion sensors like accelerometers and gyroscopes are widely used inertial sensors that can identify different physical conditions of human. In recent research, many works have been done regarding human activity recognition. Sensor data of smartphone produces high dimensional feature vectors for identifying human activities. However, all the vectors are not contributing equally for identification process. Including all feature vectors create a phenomenon known as ‘curse of dimensionality’. This research has proposed a hybrid method feature selection process, which includes a filter and wrapper method. The process uses a sequential floating forward search (SFFS) to extract desired features for better activity recognition. Features are then fed to a multiclass support vector machine (SVM) to create nonlinear classifiers by adopting the kernel trick for training and testing purpose. We validated our model with a benchmark dataset. Our proposed system works efficiently with limited hardware resource and provides satisfactory activity identification.
Conference Paper
Full-text available
This paper proposes a method of human activity monitoring based on the regular use of sparse acceleration data and GPS positioning collected during smartphone daily utilization. The application addresses, in particular, the elderly population with regular activity patterns associated with daily routines. The approach is based on the clustering of acceleration and GPS data to characterize the user's pattern activity and localization for a given period. The current activity pattern is compared to the one obtained by the learned data patterns, generating alarms of abnormal activity and unusual location. The obtained results allow to consider that the usage of the proposed method in real environments can be beneficial for activity monitoring without using complex sensor networks.
Article
Full-text available
Because the number of elderly people is predicted to increase quickly in the upcoming years, “aging in place” (which refers to living at home regardless of age and other factors) is becoming an important topic in the area of ambient assisted living. Therefore, in this paper, we propose a human physical activity recognition system based on data collected from smartphone sensors. The proposed approach implies developing a classifier using three sensors available on a smartphone: accelerometer, gyroscope, and gravity sensor. We have chosen to implement our solution on mobile phones because they are ubiquitous and do not require the subjects to carry additional sensors that might impede their activities. For our proposal, we target walking, running, sitting, standing, ascending, and descending stairs. We evaluate the solution against two datasets (an internal one collected by us and an external one) with great effect. Results show good accuracy for recognizing all six activities, with especially good results obtained for walking, running, sitting, and standing. The system is fully implemented on a mobile device as an Android application.
Conference Paper
Full-text available
The recognition of users' physical activities through data analysis of smartphone inertial sensors has aided the development of several solutions in different domains such as transportation and healthcare. Mostly of these solutions have been supported by the cloud communication technologies due to the need of using accurate classification models. In an attempt to solve problems related to the smartphone orientation (e.g. landscape) in the user's body, new types of features classified as orientation independent have arisen in the last years. In this context, this paper presents an extensive comparative study between all the features mapped in literature derived from inertial sensors. A number of experiments were carried out using two databases containing data from 30 users. Results showed that the new orientation independent features proposed in literature cannot discriminate properly between the users' activities using the inertial sensors. In addition, this paper provides an extensive analysis of these type of features and a tool that implements all methodological process of human activity recognition based on smartphones.
Conference Paper
This paper presents a human activity recognition (HAR) system that uses accelerometer and gyroscope data obtained from a smartphone as inputs to a bidirectional long short-term memory (LSTM) network. Six human activities were recognized: sitting, standing, laying, walking, walking upstairs, and walking downstairs.
Article
Personal tracking algorithms for health monitoring are critical for understanding an individual's life-style and personal choices in natural environments (NE). In order to train such tracking algorithms in NE, however, annotated data is needed, particularly when tracking a variety of activities of daily living. These algorithms are often trained in laboratory settings, with expectations that they will perform equally well in NE, which is often not the case; they must be trained on annotated data collected in NE and wearable computers provide opportunities to collect such data, though the process is burdensome. Therefore, we propose an intelligent scoring algorithm that limits the number of user annotation requests through the confidence of predictions generated by the tracking algorithm and automatically annotating data with high confidence. We enhance our scoring algorithm by providing improvements in our tracking algorithm by obtaining context data from nearable sensors. Each specific context of a user bounds the set of activities that can likely occur, which in turn improves the tracking algorithm and confidence. Finally, we propose a hierarchical annotation approach, where repeated use allows us to ask for detailed annotations that differentiate fine-grained differences in ways individuals perform activities. We validate our approach in a diet monitoring case study. We vary the number of annotations requested per day to evaluate model accuracy; we improve accuracy in NE by 8% when restricting requests to 20 per day and improve F1-score of activities by 11% with hierarchical annotations, while discussing implementation, accuracy, and power consumption in real-time use.
Article
In last few decades, human activity recognition grabbed considerable research attentions from a wide range of pattern recognition and human-computer interaction researchers due to its prominent applications such as smart home health care. For instance, activity recognition systems can be adopted in a smart home health care system to improve their rehabilitation processes of patients. There are various ways of using different sensors for human activity recognition in a smartly controlled environment. Among which, physical human activity recognition through wearable sensors provides valuable information about an individual's degree of functional ability and lifestyle. In this paper, we present a smartphone inertial sensors-based approach for human activity recognition. Efficient features are first extracted from raw data. The features include mean, median, autoregressive coefficients, etc. The features are further processed by a kernel principal component analysis (KPCA) and linear discriminant analysis (LDA) to make them more robust. Finally, the features are trained with a Deep Belief Network (DBN) for successful activity recognition. The proposed approach was compared with traditional expression recognition approaches such as typical multiclass Support Vector Machine (SVM) and Artificial Neural Network (ANN) where it outperformed them.
Article
With a widespread of various sensors embedded in mobile devices, the analysis of human daily activities becomes more common and straightforward. This task now arises in a range of applications such as healthcare monitoring, fitness tracking or user-adaptive systems, where a general model capable of instantaneous activity recognition of an arbitrary user is needed. In this paper, we present a user-independent deep learning-based approach for online human activity classification. We propose using Convolutional Neural Networks for local feature extraction together with simple statistical features that preserve information about the global form of time series. Furthermore, we investigate the impact of time series length on the recognition accuracy and limit it up to 1. s that makes possible continuous real-time activity classification. The accuracy of the proposed approach is evaluated on two commonly used WISDM and UCI datasets that contain labeled accelerometer data from 36 and 30 users respectively, and in cross-dataset experiment. The results show that the proposed model demonstrates state-of-the-art performance while requiring low computational cost and no manual feature engineering.