Journal of Ambient Intelligence and Humanized Computing
https://doi.org/10.1007/s12652-021-03465-6
ORIGINAL RESEARCH
Subject variability insensor‑based activity recognition
Ali OlowJimale1,2 · Mohd HalimMohdNoor1
Received: 9 September 2020 / Accepted: 31 August 2021
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021
Abstract
Building classification models in activity recognition is based on the concept of exchangeability. When splitting the dataset into training and test sets, we assume that the training set is exchangeable with the test set and expect good classification performance. However, this assumption is invalid when the training and test sets exhibit subject variability arising from age differences, as happens when classification models are trained on an adult dataset and tested on an elderly dataset. This study investigates the effects of subject variability on activity recognition using inertial sensors. Two datasets, one collected locally from 15 elderly subjects and one public dataset collected from 30 adults, each covering eight types of activities, were used to evaluate the assessment techniques with ten-fold cross-validation. Three sets of experiments were conducted: experiments on the public dataset only, experiments on the local dataset only, and experiments using the public dataset for training and the local dataset for testing, with machine learning and deep learning classifiers including single classifiers (Support Vector Machine, Decision Tree, K-Nearest Neighbors), ensemble classifiers (Adaboost, Random Forest, and XGBoost), and a Convolutional Neural Network. The experimental results show a significant performance drop when activity recognition is performed across subjects of different age groups: on average, recognition accuracy drops by 9.75% and 12% for the machine learning and deep learning models respectively. This confirms that subject variability with respect to age is a real problem that degrades the performance of activity recognition models.
Keywords Activity recognition · Deep learning · Machine learning · Subject variability
1 Introduction
Increased life expectancy together with declining birth rates has led to an aged population structure. The population of the world is rapidly aging (Lee et al. 2020). Nearly all countries in the world are experiencing growth in the percentage of elderly people in their populations. For instance, the current number of elderly people (60 years and older) in the world is higher than the number of children younger than 5 years old. By 2050, it is expected that 1 in 6 people in the world will be over 65 years old (United Nations 2019). This increased longevity is a threat to the stability of every society due to its negative effects on elderly health and social care (Howdon and Rice 2018), including loss of physical, mental, and cognitive abilities, causing impaired actions and greater vulnerability to morbidity and mortality (Chang et al. 2019).
Aged people are vulnerable to many age-related problems including diabetes, stroke, Parkinson's disease, Alzheimer's disease, dementia, cardiovascular disease, osteoarthritis, and other chronic conditions (Vepakomma et al. 2015; Subasi et al. 2020). These diseases, together with the weakened cognitive and physical abilities of the elderly, prevent them from living independently and hinder them in performing daily activities (e.g. toileting, bathing, cooking) (Van Kasteren et al. 2010). To assist elderly people, family members and governments spend heavily on nursing and elderly care (Vepakomma et al. 2015; Yao et al. 2018). However, as the elderly population grows, caregivers' assistance is becoming scarce and caregivers are becoming overburdened with the responsibility of continuous monitoring (Piyathilaka and Kodagoda 2015; Richter et al. 2017).
* Mohd Halim Mohd Noor
halimnoor@usm.my

Ali Olow Jimale
eng.olow@simad.edu.so

1 School of Computer Sciences, Universiti Sains Malaysia, 11800 Pulau Pinang, Malaysia

2 Faculty of Computing, SIMAD University, Mogadishu, Somalia
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
A.Jimale, M.Mohd Noor
1 3
Therefore, there is a pressing need for a system that can detect gradual cognitive changes in the elderly early and automatically recognize their activities in order to monitor their health conditions and provide evidence-based nursing assistance (Nambu et al. 2000; Vijayaprabakaran et al. 2020). This has recently attracted many researchers, who have proposed activity recognition systems aimed at promoting and assisting the independent living of older people by developing techniques and systems that recognize the mobility, daily life activities, and physiological signs of elderly people (Khusainov et al. 2013). This is one reason why activity recognition is becoming a hot research area in sensor-rich and ubiquitous mobile devices (Zahin et al. 2019), especially in the elderly healthcare domain (Dinarević et al. 2019). Understanding the different kinds of human activities can also contribute extensively to solving other real-world problems such as security and military applications (Labrador and Yejas 2011; Lara and Labrador 2013), entertainment, surveillance, gaming, remote monitoring, intelligent environments (Hussain et al. 2019), health tracking and monitoring, rehabilitation and assisted living (Rezaie and Ghassemian 2018), home behavior analysis (Satapathy and Das 2016), gait analysis (Hammerla et al. 2016), gesture recognition (Kim and Toomajian 2016), assistive technologies, and manufacturing. It may fundamentally change the way we sense, monitor, recognize, and predict human physical activities and surrounding environments (Campbell et al. 2008; Chiang et al. 2019).
Activity recognition is the process of identifying predefined activities of interest performed by a human by monitoring human activities and/or the surrounding environment using sensors (Chiang et al. 2019). Most activity recognition systems follow four regular phases (data collection, data pre-processing, feature extraction, and training and activity classification or recognition) (Nweke et al. 2018; Straczkiewicz and Onnela 2019), with slight variations based on the model (machine learning vs. deep learning), application domain, and dataset. Typically, during training and activity classification, the classification models are evaluated using public datasets or locally collected datasets (Nweke et al. 2018).
The way human activities are performed and their durations vary from one person to another (Akbari and Jafari 2020). This variation, referred to as subject variability, can be caused by several factors such as age, sex, fitness level, and environmental state. It changes the pattern of the sensory data from one subject to another and limits the generalization of the classification models to new subjects, hence reducing the recognition accuracy of the models. Although subject variability is a real problem in activity recognition, it remains largely unexplored. This study focuses on the variation generated by age differences among subjects. It occurs whenever classification models are trained with sensor data from one particular age group, such as adults, and the trained model is tested with sensor data from a different age group, e.g. the elderly. The signals of elderly activities differ from the signals of adult activities even when the same activity is being performed. Typically, the acceleration magnitude is lower and the activity signals have a longer duration. This is because elderly people perform dynamic (e.g. walking, running, jogging) and transitional activities with lower intensity and static activities (e.g. standing, sitting) with less stability than adults. This variation originates from the fact that adults are stronger, more confident, and more active than elderly people in performing the activities. Consequently, a classification model that is trained on activity data collected from adults is not able to generalize to an elderly dataset.
This study aims to investigate the effects of subject variability generated by age differences on activity recognition. It is an assessment study that focuses on proving that subject variability is a real problem in activity recognition that contributes to performance decline. This is achieved by investigating whether activity recognition models trained and tested on subjects of similar ages perform better than models trained and tested across age groups, using adult and elderly datasets. The adult dataset is a public dataset and the elderly dataset was internally collected. The experiments are conducted in three stages, carried out in sequence: using the adult dataset only, using the elderly dataset only, and using the adult and elderly datasets as the training set and test set respectively. Machine learning and deep learning techniques are used for the activity recognition. The contributions of this research are summarized as follows:
a. To the best of our knowledge, we are the first to investigate the effects of subject variability generated by age differences on activity recognition.
b. We conduct comprehensive experiments to investigate
the effects of subject variability in activity recognition
using various machine learning and deep learning tech-
niques.
c. We discuss the performance degradation caused by sub-
ject variability contributed by age differences in activity
recognition.
The remainder of this study is organized as follows. Section 2 discusses the related work, Sect. 3 explains subject variability in detail, Sect. 4 provides the research methodology of this study, Sect. 5 contains the experimental results, and Sect. 7 concludes the paper.
2 Related work
Wearable sensors, including accelerometers and gyroscopes, have recently dominated activity recognition (Cornacchia et al. 2017). Although sensor-based activity recognition has recently achieved high performance, one of the challenges confronting it is subject variability generated by age differences, as explained in Sect. 3. Existing studies pay little attention to the misclassification problems that can be caused by this kind of subject variability. This oversight can be observed in the existing HAR studies.
For instance, Xu et al. (2020) proposed a new loss function named "harmonic loss" and a label replication technique to improve the classification performance of activity recognition using Long Short-Term Memory (LSTM) networks. They trained their model individually on two public HAR benchmarks, namely the OPPORTUNITY and UCI HAR datasets. The OPPORTUNITY dataset contains daily morning activities. It was collected from 12 subjects as part of a European research project called OPPORTUNITY. During the data collection, the participants performed 17 activities, including eating a sandwich, drinking coffee, cleaning, opening and closing a fridge, and opening and closing a dishwasher, for 6 h (Roggen et al. 2010).
The UCI HAR dataset was collected from 30 adults within an age bracket of 19–48 years. Participants performed six activities: standing, sitting, lying down, walking, walking downstairs, and walking upstairs. During the data collection, participants wore a smartphone embedded with accelerometer and gyroscope sensors on the waist. The experiments were also recorded so the data could be labeled manually. The dataset was sampled in fixed-width sliding windows of 2.56 s with 50% overlap (128 readings per window) (Anguita et al. 2013).
In addition, Hashim and Amutha (2020) proposed a dimensionality reduction technique called the "fast feature dimensionality reduction technique" to reduce the number of features used with the UCI HAR dataset at low time cost. Their work targets less powerful systems. They managed to reduce the number of features from 561 to 66. To recognize elderly activities, they also used a locally collected elderly dataset and managed to reduce its features by 76%. Their elderly dataset was collected from 10 elderly subjects aged 60+ years. The participants in the elderly dataset performed five activities: sitting, walking upstairs, walking downstairs, standing, and walking. Similar to the above studies, the activity recognition models in their study were trained and tested on the two datasets individually.
Other recent activity recognition approaches include the work of Khatun and Morshed (2018), who designed a transition activity recognition technique using a decision tree with an ensemble approach. They used the public Mobile Health (mHealth) dataset to evaluate the effectiveness of their work. The mHealth dataset consists of 12 daily activities and was collected with health applications in mind (Banos et al. 2014).
Another work that trained and tested its method on a single dataset is the study of Gil-Martín et al. (2020), who proposed improved physical activity recognition using a new CNN architecture and post-processing techniques. They trained their model on the PAMAP2 dataset, which was collected from nine adults performing 18 different activities, including lying, sitting, standing, walking, running, cycling, watching TV, and using computers (Reiss and Stricker 2012).
Xia etal. (2020) also evaluated their work on the same
age group subjects as those used to train the model. These
authors have developed a novel LSTM-CNN model for
activity recognition. They have experimented with their
work on the UCI-HAR, OPPORTUNITY, and WISDM data-
sets. WISDM dataset was collected from 29 young adults
at Mining Lab Fordham University using a single android
based mobile phone accelerometer sensor. During the data
collection, participants performed simple ambulatory activi-
ties including sitting, standing, and jogging (Kwapisz etal.
2011). Other activity recognition studies that make use of
the WISDM dataset for activity recognition include the work
of Zhang etal. (2020). The authors of this study proposed an
IoT-perspective activity recognition technique that utilizes
multi-head CNNs and the attention mechanism for a better
feature extraction and selection purpose.
Furthermore, Gani et al. (2019) used datasets from the same age group. They developed a computationally efficient activity recognition approach using dynamical systems and chaos theory, and evaluated its performance using a self-collected dataset and the public UCI HAR dataset. In their own data collection, they collected walking, walking upstairs, walking downstairs, running, sitting, standing, elevator-up, and elevator-down activities from ten adults using an accelerometer. They experimented with Decision Tree, KNN, SVM, SVM-Gaussian, weighted KNN, and bagged trees. All of the above studies indicate that the experiments were conducted with subjects from the same age group. The results show that the classification models perform well on subjects of similar age. However, it is not guaranteed that the classification models will generalize to new, unseen subjects of different age groups.
Among the few studies that have investigated variability in activity recognition is the work of Sakuma et al. (2019). As stated in their study, variability can be caused by the relationship between time and human activity. For instance, a lack of movement in the bedroom might indicate sleeping at night, but
trouble during the day. To address this kind of variability and reduce the computational cost of activity recognition, they proposed three contextual (spatio-temporal, spatial, and temporal) approaches as well as context-free, online CluStream, and offline mini-batch k-means techniques. The authors managed to reduce computational cost by approximately 20% and improve recognition accuracy by 20%. However, their focus is contextual variability rather than subject variability, and they trained their models on a dataset with a single age group. Mannini and Intille (2019) proposed supervised fine-tuning of classification layers and unsupervised retraining of feature extraction layers to study how uncertainties in activity recognition could be measured. Unlike this research, the authors focused on intra-subject variability, which occurs when the same person performs the same activity differently. Using individual datasets from adults and youth, they found that recognition accuracy varies from 80.8 to 96.5% in youth and from 70.7 to 95.0% in adults. Similar to the above study, the authors trained and tested their models with subjects from the same age group, once for adults and once for youth. In this study, we investigate the effects of subject variability on activity recognition. Three stages of experiments are performed: activity recognition using the adult dataset only, activity recognition using the elderly dataset only, and activity recognition using the adult dataset for training and the elderly dataset for testing. The purpose of these experiments is to prove that subject variability caused by age differences among subjects is a real problem in activity recognition that contributes to performance decline.
3 Subject variability in activity recognition
Subject variability in activity recognition refers to variations in the activity signals. Subject variability of human activities is generated by differences in the physical patterns of humans due to many factors, including age, sex, fitness level, and environmental state. There are two known types of subject variability: inter-subject variability and intra-subject variability.

Intra-subject variability occurs when a given activity performed by the same subject at different times shows variations. This can be caused by the mood and emotion of the person, such as sad versus happy or energetic versus tired. For example, the walking style of a sleepy or tired person is different from the walking style of the same person in an active and fresh mood. In other words, walking can be more dynamic in the morning after a good sleep than in the late evening after a full working day.

Inter-subject variability, also known as cross-subject variability, on the other hand, is observed when activities vary from one subject to another. For example, the jogging activity of one subject is normally different from the jogging activity of another subject. Subjects' physical activities can also vary across different age groups.

This subject variability is mainly due to variation in the body acceleration of the subjects while performing the activities, especially the dynamic and transitional activities. That is, elderly activities differ from adult activities in that the magnitude of the acceleration is relatively lower and the length of the signals is relatively longer.
To illustrate the subject variability due to age differences, visual comparisons of the signal variations (2 s) for an elderly subject and an adult subject are provided in Figs. 1, 2 and 3.

Fig. 1 Comparison of walking activity signals of (a) an elderly subject and (b) an adult subject (acceleration vs. timesteps)

Figure 1 illustrates two walking activity signals collected from an elderly and an adult subject. As can be seen in Fig. 1, the walking activity signal of the elderly subject has a lower acceleration than that of the adult subject: the peak-to-peak amplitude of the elderly walking signal is just one-third of the adult's. This is because the elderly subject walks more slowly than the adult, and the same applies to other dynamic activities, i.e. jogging, running and jumping.
Figure2 shows a comparison of transitional activity sig-
nals of the elderly and adult subjects. It shows that the length
of transitional activity signals (stand-to-sit) of the elderly (as
shown in Fig.2a) is longer compared to the length of transi-
tional activity signals of the adult (as shown in Fig.2b). The
elderly took approximately 4s (from 0 to 200) to complete
the activity while the adult took about 3.18s (from 0 to 159)
to do so. This applies to other transitional activities such as
sit-to-stand, sit-to-lie and lie-to-sit.
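The two quantities compared above, peak-to-peak amplitude and signal duration, can be computed directly from the recorded signals. The sketch below uses synthetic sinusoids as stand-ins for the real recordings; the amplitudes and lengths are chosen only to mirror the reported ratios (one-third amplitude, 4 s vs. 3.18 s at 50 Hz).

```python
import numpy as np

def peak_to_peak(signal):
    """Peak-to-peak amplitude of an acceleration signal."""
    return signal.max() - signal.min()

def duration_sec(signal, fs=50):
    """Signal length in seconds at the given sampling rate."""
    return len(signal) / fs

t = np.linspace(0, 4, 200)                       # 200 timesteps = 4 s at 50 Hz
elderly = 0.5 * np.sin(2 * np.pi * 1.0 * t)      # lower amplitude, longer signal
adult = 1.5 * np.sin(2 * np.pi * 1.5 * t[:159])  # 159 timesteps = 3.18 s

print(peak_to_peak(elderly), duration_sec(elderly))  # ~1.0, 4.0
print(peak_to_peak(adult), duration_sec(adult))      # ~3.0, 3.18
```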
Figure3 shows a comparison of standing activity signals
of the elderly and adult subjects. Normally, elderly people
perform less stable static activities such as standing due to
their weakened muscles, and other age-related problems.
This compromises their balance ability to remain steady on
their feet. So, the activity signals often contain more ran-
dom, irregular and sparse spikes compared to the ones gen-
erated by adults.
All the aforementioned issues create a data distribution gap between the training set and the test set, which causes a significant degradation in recognition performance when trained HAR models are tested on subjects not included in the training set. Although subject variability degrades recognition performance when a HAR model is trained with an adult dataset and tested with an elderly dataset, this kind of subject variability, which can undermine the classification performance of HAR models and reduce the generalization of a learning algorithm, is overlooked (Lv et al. 2020). Due to the absence of techniques for investigating subject variability, activity recognition for inter-age problems is still not robust enough for real-world deployment (Hossain and Roy 2019). This study investigates the performance degradation caused by subject variability contributed by age differences in activity recognition. The next section presents the research methodology and experimental setup of this study.
4 Methodology and experimental setup

This section presents the systematic plan implemented to conduct this research. The methodology of this study is divided into three main phases: experimental dataset collection and pre-processing, model training and testing, and classification and evaluation.
Fig. 2 Comparison of stand-to-sit activity signals of (a) an elderly subject and (b) an adult subject (acceleration vs. timesteps)
4.1 Experimental dataset collection and pre-processing

In this study, two activity recognition datasets are used: an internally collected (local) dataset and a public dataset.
4.1.1 Local dataset

The local dataset contains accelerometer and gyroscope measurements collected from fifteen (15) elderly people. The inertial sensors were configured to generate samples at 50 Hz. The mean age of the subjects is 64.8 years with a standard deviation of 8.79. Each subject wore three inertial sensors: one on the chest, one on the right waist and one on the right ankle. The subjects were asked to perform a set of activities such as walking, standing, sitting, lying down and the transitional activities in their own preferred style and pace. No specific instructions were given about how to perform the activities. The activities were performed continuously in a single trial in the subjects' homes. Previous studies have shown that the waist is the best location for single-sensor activity recognition because the acquired sensor data represent the major body movement (Attal et al. 2015). Therefore, only data from the sensor on the waist are used in the actual recognition; the other sensors are used as references for data labeling in the experimental analyses. Written informed consent was obtained prior to data collection in accordance with the approval of the human research ethics committee of Universiti Sains Malaysia (USM/JEPeM/18040205). Although this dataset was specifically designed for this project, it can be extended to future projects on elderly activity recognition. Table 1 shows the number of windows (samples) for each activity.
4.1.2 Public dataset
The Smartphone-Based Recognition of Human Activities and Postural Transitions (UCI HAPT) dataset of Reyes-Ortiz et al. (2016) has been chosen as the adult dataset in this study for several reasons. First, it is the first large activity recognition dataset collected using inertial sensors (accelerometer and gyroscope) embedded in
Fig. 3 Comparison of standing activity signals of (a) an elderly subject and (b) an adult subject (acceleration vs. timesteps)
Table 1 Number of windows (samples) per activity in the local dataset

Activity        Number of samples
Walking         48,428
Sitting         31,613
Standing        25,492
Lying down      17,413
Stand-to-sit    7724
Sit-to-stand    6082
Sit-to-lie      2703
Lie-to-sit      2321
mobile phones. Second, it has been used in recent state-of-the-art studies and is the only dataset that contains transitional activity signals. Finally, it is similar to our dataset in terms of the activities performed and the sensor placement (on the waist). Other related datasets, such as those discussed in the related work, are not used in this study since they contain only dynamic and static activities. The dataset was collected from 30 adults within an age bracket of 19–48 years. Participants performed 12 activities: walking, walking upstairs, walking downstairs, sitting, standing, lying down, stand-to-sit, sit-to-stand, sit-to-lie, lie-to-sit, stand-to-lie, and lie-to-stand. However, only walking, standing, stand-to-sit, sitting, sit-to-stand, sit-to-lie, lying down, and lie-to-sit are considered for model training, as the other four activities are not present in the local dataset. During the data collection, participants wore a smartphone embedded with an accelerometer and gyroscope on the waist. Table 2 shows the number of windows (samples) for each activity.
4.2 Model training and testing

This study applies machine learning and deep learning techniques for activity recognition. The single machine learning classifiers are Logistic Regression, Support Vector Machine (SVM), and Decision Trees. The ensemble classifiers are Random Forest (RF), Adaboost, and Extreme Gradient Boosting Trees (XGBoost). Both single and ensemble classifiers follow five universal phases: activity sensing, feature extraction, feature selection, model training, and activity recognition. We selected six statistical features for model training, namely mean, median, variance, maximum, minimum, and skewness. Table 3 contains the description of each feature. Each feature is extracted from each axis of the accelerometer (Ax, Ay and Az) and gyroscope (Gx, Gy and Gz) signals. The feature set is then reduced using random forest feature importance to select the most relevant features, and the selected features are used to train the classification models. The experiments are carried out in three stages. First, the adult dataset is used to train and test the classification models. In the second stage, only the elderly dataset is used to train and test the classification models. Finally, the classification models are trained on the adult dataset and tested on the elderly dataset. All of these experiments are performed to investigate the effects of subject variability on activity recognition.
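The feature extraction and importance-based selection steps can be sketched as follows. This is a sketch, not the authors' code: the skewness follows the Pearson form given in Table 3, the random data stands in for segmented sensor windows, and the mean-importance threshold is an illustrative assumption (the paper does not state the exact selection cutoff).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def window_features(window):
    """Six statistics per axis (Ax, Ay, Az, Gx, Gy, Gz), as in Table 3."""
    feats = []
    for axis in window.T:
        mu, med, sd = axis.mean(), np.median(axis), axis.std()
        feats += [mu, med, axis.var(), axis.max(), axis.min(),
                  3 * (mu - med) / sd]          # Pearson skewness, per Table 3
    return np.array(feats)

rng = np.random.default_rng(0)
windows = rng.normal(size=(100, 128, 6))        # stand-in for segmented sensor data
y = rng.integers(0, 8, size=100)                # eight activity classes
X = np.array([window_features(w) for w in windows])   # 6 features x 6 axes = 36

# Random-forest feature importance; keep features above the mean importance.
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
keep = rf.feature_importances_ > rf.feature_importances_.mean()
X_selected = X[:, keep]
print(X.shape, X_selected.shape)
```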
Deep learning techniques have recently been applied in activity recognition for automatic feature extraction and activity classification. The Convolutional Neural Network (CNN) is considered one of the state-of-the-art deep learning models in activity recognition. In this study, two CNNs with different architectures were used to investigate subject variability. As shown in Table 4, the first CNN contains four convolutional layers with rectified linear unit (ReLU) activations, each followed by dropout and max pooling, as the feature learning pipeline. The classification pipeline consists of a single dense layer with a softmax activation function. The Adam optimizer with a learning rate of 0.0002 and a categorical cross-entropy loss function was used to train the CNN model.
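A Keras sketch of this first CNN (CNN1) is given below. The filter counts, kernel sizes and dropout rate of the first layers follow Table 4; the input length of 100 timesteps is inferred from the listed output shapes, and the kernel sizes of the later convolutional layers are assumptions, since Table 4 is truncated in this copy.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn1(input_len=100, n_axes=6, n_classes=8):
    """CNN1 sketch: stacked Conv1D/Dropout/MaxPool feature learner + softmax head."""
    model = keras.Sequential([
        layers.Input(shape=(input_len, n_axes)),
        layers.Conv1D(16, 7, activation="relu"),   # -> (94, 16), per Table 4
        layers.Dropout(0.3),
        layers.MaxPooling1D(2),                    # -> (47, 16)
        layers.Conv1D(32, 5, activation="relu"),   # -> (43, 32)
        layers.Dropout(0.3),
        layers.MaxPooling1D(2),                    # -> (21, 32)
        layers.Conv1D(64, 3, activation="relu"),   # kernel size assumed
        layers.Conv1D(64, 3, activation="relu"),   # fourth conv layer assumed
        layers.Flatten(),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=2e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_cnn1()
print(model.output_shape)   # (None, 8)
```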
The second CNN architecture contains two convolutional layers, each followed by dropout and max pooling, as the feature learning pipeline. The classification pipeline consists of two dense layers with ReLU and softmax activation functions respectively. The Adam optimizer with a learning rate of 0.0002 and a categorical cross-entropy loss function was utilized to train
Table 2 Number of windows (samples) per activity in the UCI HAPT dataset

Activity        Number of samples
Walking         187,069
Lying down      137,296
Standing        135,896
Sitting         124,712
Sit-to-lie      15,728
Sit-to-stand    13,675
Lie-to-sit      12,209
Stand-to-sit    11,790
Table 3 Feature description (each feature is computed on every axis: Ax, Ay, Az, Gx, Gy, Gz)

Feature    Description                                  Equation
mean       Mean value of each activity signal           μ_X = X̄ = (1/n) Σ_{i=1}^{n} X_i
median     Median value of each activity signal         index i_m = (n+1)/2 if n is odd, n/2 if n is even
variance   Variance of each activity signal             σ²_X = (1/n) Σ_{i=1}^{n} X_i² − ((1/n) Σ_{i=1}^{n} X_i)²
max        Maximum value of each activity signal        −
min        Minimum value of each activity signal        −
skew       Skewness of each activity signal             skew = 3(μ − M)/σ, where M is the median
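The two non-trivial formulas in Table 3, the variance identity and Pearson's second skewness coefficient, can be checked numerically on a toy signal:

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 3.0, 7.0])   # toy activity signal
n = len(x)

# Variance as the mean of squares minus the squared mean, as in Table 3.
var_table = (x ** 2).sum() / n - (x.sum() / n) ** 2
assert np.isclose(var_table, np.var(x))    # agrees with the population variance

# Skewness as 3 * (mean - median) / std (Pearson's second coefficient).
skew_table = 3 * (x.mean() - np.median(x)) / x.std()
print(var_table, skew_table)
```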
the second CNN model. Table 5 shows the CNN2 model summary.
In the classification and evaluation phase, the eight physical activities are classified. The research findings are validated using recognition accuracy, and ten-fold cross-validation is used to avoid bias in the results.
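The fold-splitting step of ten-fold cross-validation can be sketched in a few lines. This is a minimal contiguous split; a real pipeline would typically shuffle and possibly stratify first:

```python
def kfold_indices(n_samples, k=10):
    """Split sample indices into k near-equal folds; each fold serves
    once as the test set while the rest form the training set."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = kfold_indices(25, k=10)
test_fold = folds[0]
train = [i for fold in folds[1:] for i in fold]
print(len(folds), len(test_fold) + len(train))  # 10 25
```

Every sample lands in exactly one test fold, so the reported accuracy is averaged over ten disjoint test sets.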
5 Experimental results

The experimental results confirm that subject variability is an issue in activity recognition, as explained in Sects. 5.1 and 5.2. They show that subject variability arising from age differences degrades the recognition performance of activity recognition models.
5.1 Activity recognition using machine learning

In general, the results of the activity recognition using machine learning indicate that there is a significant performance drop when two different datasets (UCI HAPT and local datasets) with different age groups are used for model training and testing respectively. Table 6 shows the recognition accuracy for the single classifiers, where each column represents one of the three stages of the experiments. Overall, activity recognition using the UCI HAPT dataset only achieved an average of 78%, whereas the average accuracy using the local dataset only is 79%. The performance of the classifiers drops significantly when the UCI HAPT dataset and local dataset are used for training and testing respectively: the average recognition accuracy is only 68.2%.
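The reported averages follow directly from the per-classifier accuracies in Table 6:

```python
# Accuracies from Table 6 in percent, in row order: logistic regression,
# linear SVM, kernel SVM, decision tree, KNN.
uci_only   = [79, 79, 78, 77, 77]
local_only = [79, 82, 77, 78, 79]
cross      = [64, 71, 72, 67, 67]   # trained on UCI HAPT, tested on local

avg = lambda xs: sum(xs) / len(xs)
print(avg(uci_only), avg(local_only), avg(cross))  # 78.0 79.0 68.2
```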
A similar performance drop is observed for the ensemble classifiers when two different datasets (UCI HAPT and local datasets) with different age groups are used for model training and testing respectively. Table 7 provides the recognition accuracy for the ensemble classifiers, where each column represents one of the three stages of the experiments. On
Table 4 CNN1 architecture
Layer (type) Configuration Output shape
1D convolution Filters = 16
Kernel size = 7
Activation = ReLU
94, 16
Dropout Rate = 0.3 94, 16
Max pooling Pool size = 2 47, 16
1D convolution Filters = 32
Kernel size = 5
Activation = ReLU
43, 32
Dropout Rate = 0.3 43, 32
Max pooling Pool size = 2 21, 32
1D convolution Filters = 64
Kernel size = 3
Activation = ReLU
19, 64
Dropout Rate = 0.2 19, 64
Max pooling Pool size = 2 9, 64
1D convolution Filters = 128
Kernel size = 3
Activation = ReLU
7, 128
Dropout Rate = 0.2 7, 128
Max pooling Pool size = 2 3, 128
Flatten 384
Dense Num_classes
Activation = softmax
8
Table 5 CNN2 model summary
Layer (type) Configuration Output shape
1D convolution Filters = 16
Kernel size = 5
Activation = ReLU
96, 16
Dropout Rate = 0.4 96, 16
Max pooling Pool size = 2 48, 16
1D convolution Filters = 32
Kernel size = 3
Activation = ReLU
46, 32
Dropout Rate = 0.4 46, 32
Max pooling Pool size = 2 23, 32
Flatten Not applicable 736
Dense Dense = 50
Activation = ReLU
50
Dense Num_classes
Activation = softmax
8
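The output shapes in Table 5 can be checked with a toy forward pass. The sketch below assumes a 100-sample window with six sensor channels (three-axis accelerometer plus gyroscope) and random weights, since only the shapes matter here:

```python
import numpy as np

def conv1d_valid(x, kernel, filters, rng):
    # "valid" 1D convolution followed by ReLU; x has shape (length, channels)
    n = x.shape[0] - kernel + 1
    w = rng.standard_normal((kernel, x.shape[1], filters)) * 0.01
    out = np.stack([np.einsum("kc,kcf->f", x[i:i + kernel], w)
                    for i in range(n)])
    return np.maximum(out, 0.0)

def max_pool1d(x, pool=2):
    n = x.shape[0] // pool
    return x[: n * pool].reshape(n, pool, -1).max(axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 6))             # assumed: 100 samples, 6 channels
h = max_pool1d(conv1d_valid(x, 5, 16, rng))   # 100 -> 96 -> 48
h = max_pool1d(conv1d_valid(h, 3, 32, rng))   # 48 -> 46 -> 23
print(h.shape, h.reshape(-1).shape)           # (23, 32) (736,)
```

The flattened size of 736 matches the input to the 50-unit dense layer in Table 5.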
Table 6 Activity recognition results using single classifiers

Models               UCI HAPT dataset (%)   Local dataset (%)   UCI HAPT and local datasets (%)
Logistic regression  79                     79                  64
Linear SVM           79                     82                  71
Kernel SVM           78                     77                  72
Decision tree        77                     78                  67
KNN                  77                     79                  67
Table 7 Activity recognition results using ensemble classifiers

Models         UCI HAPT dataset (%)   Local dataset (%)   UCI HAPT and local datasets (%)
Random forest  80                     79                  75
AdaBoost       77                     77                  69
XGBoost        85                     80                  70
average, the activity recognition using the UCI HAPT dataset only is 81%, using the local dataset only is 79%, and using the UCI HAPT dataset for training and the local dataset for testing is 71%. These results indicate that the average recognition accuracy when the UCI HAPT dataset is used for training and the local dataset for testing is 8 and 10 percentage points lower than the average accuracy of using the local dataset only or the UCI HAPT dataset only respectively.
In general, the ensemble classifiers perform better than the single classifiers. Table 8 shows that the average accuracy of the ensemble classifiers is 3 percentage points higher than that of the single classifiers in the experiments using the UCI HAPT dataset only and using the UCI HAPT and local datasets respectively. This is due to the superior prediction power of ensemble classifiers, which combine multiple classification models to make better decisions.
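The "combine multiple models" principle can be illustrated with hard majority voting. Note that Random Forest, AdaBoost, and XGBoost each use their own, more sophisticated weighted aggregation internally; this is only the simplest form of the idea, with hypothetical predictions:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label predictions by hard voting:
    for each sample, pick the label predicted most often."""
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*predictions)]

clf_a = ["walking", "sitting",  "standing"]
clf_b = ["walking", "standing", "standing"]
clf_c = ["sitting", "sitting",  "standing"]
print(majority_vote([clf_a, clf_b, clf_c]))
# ['walking', 'sitting', 'standing']
```

A wrong prediction by one base model (here clf_b on the second sample) is outvoted by the other two, which is the intuition behind the ensemble gain observed in Table 8.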
To better understand the performance of the classifiers, Tables 9, 10, and 11 show the confusion matrices of activity recognition (with linear SVM) using the UCI HAPT dataset only, the local dataset only, and the UCI HAPT and local datasets respectively. As shown in Table 9, most of the activities are well classified except for the sit-to-lie activity
Table 8 The average accuracy of machine learning models

Classifiers/experiments   UCI HAPT dataset (%)   Local dataset (%)   UCI HAPT and local datasets (%)
Single classifiers        78                     79                  68
Ensemble classifiers      81                     79                  71
Table 9 Confusion matrix of linear SVM using UCI HAPT only
Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 0.95 0.00 0.05 0.00 0.00 0.00 0.00 0.00
Standing 0.01 0.82 0.12 0.01 0.01 0.01 0.01 0.01
Stand-to-sit 0.05 0.20 0.72 0.00 0.00 0.01 0.01 0.01
Sitting 0.06 0.02 0.00 0.89 0.00 0.00 0.01 0.02
Sit-to-stand 0.22 0.12 0.10 0.00 0.50 0.03 0.00 0.03
Sit-to-lie 0.22 0.08 0.40 0.00 0.06 0.23 0.01 0.00
Lying down 0.19 0.03 0.00 0.23 0.04 0.00 0.50 0.01
Lie-to-sit 0.04 0.07 0.17 0.05 0.04 0.01 0.00 0.62
Table 10 Confusion matrix of linear SVM using local dataset only
Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Standing 0.00 0.79 0.20 0.01 0.00 0.00 0.00 0.00
Stand-to-sit 0.04 0.10 0.82 0.02 0.02 0.00 0.00 0.00
Sitting 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00
Sit-to-stand 0.36 0.07 0.04 0.00 0.30 0.19 0.00 0.04
Sit-to-lie 0.31 0.03 0.07 0.00 0.07 0.48 0.00 0.03
Lying down 0.00 0.00 0.00 0.67 0.00 0.00 0.22 0.11
Lie-to-sit 0.12 0.00 0.00 0.50 0.12 0.12 0.00 0.14
Table 11 Confusion matrix of linear SVM using UCI HAPT and local datasets
Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 0.94 0.00 0.02 0.00 0.01 0.00 0.01 0.02
Standing 0.00 0.79 0.09 0.04 0.00 0.00 0.00 0.08
Stand-to-sit 0.04 0.36 0.47 0.02 0.01 0.00 0.00 0.10
Sitting 0.00 0.19 0.00 0.77 0.00 0.00 0.01 0.03
Sit-to-stand 0.41 0.12 0.08 0.06 0.18 0.02 0.08 0.05
Sit-to-lie 0.25 0.05 0.08 0.13 0.14 0.16 0.06 0.13
Lying down 0.02 0.07 0.00 0.44 0.00 0.00 0.40 0.07
Lie-to-sit 0.11 0.11 0.00 0.45 0.04 0.04 0.06 0.19
and lying down. Specifically, walking, standing, and sitting achieved a true positive rate of 0.82 and above. It is observed that 0.40 of the sit-to-lie samples are misclassified as stand-to-sit. This is due to the fixed sliding window segmentation technique used and the limited number of samples of this activity. The fixed sliding window technique cannot produce perfect segmentation for transitional activity signals, as their length varies depending on the time taken to perform the activity.
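The row-normalized matrices in Tables 9, 10, and 11 can be produced with a few lines: entry [i][j] is the fraction of class-i samples predicted as class j, so each row sums to 1. The labels and predictions below are hypothetical:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Row-normalized confusion matrix as in Tables 9-11."""
    idx = {c: i for i, c in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[idx[t]][idx[p]] += 1
    # normalize each row by the number of true samples of that class
    return [[c / max(sum(row), 1) for c in row] for row in counts]

labels = ["walking", "sit-to-lie", "stand-to-sit"]
y_true = ["walking", "walking", "sit-to-lie", "sit-to-lie", "stand-to-sit"]
y_pred = ["walking", "walking", "stand-to-sit", "sit-to-lie", "stand-to-sit"]
cm = confusion_matrix(y_true, y_pred, labels)
print(cm[1])  # [0.0, 0.5, 0.5]
```

The diagonal entries are the true positive rates discussed in the text; off-diagonal mass shows where a class leaks, e.g. sit-to-lie being confused with stand-to-sit.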
In Table10, the classifier performs well on classifying
most of the activities except sit-to-stand and lie-to-sit activi-
ties using the local dataset only. The misclassification of
these transitional activities is contributed by two factors:
fixed sliding window segmentation and the short duration
characteristics of transitional activities. In particular, only a
smaller amount of transitional activity examples per activ-
ity samples can be collected from elderly people at a time.
As shown in Table11, the classification performance of
the classifier drops significantly for most activities using
UCI HAPT dataset for training and local dataset for testing.
For instance, the true positives of walking, sitting, stand-to-
sit, sit-to-stand, and sit-to-lie activity using the local dataset
only drop to 0.94, 0.79, 0.47, 0.18, and 0.16 respectively.
In particular, the results indicate that transitional activities
have less than 50% true positive rate. This is due to the fixed
sliding window segmentation technique that cannot perfectly
segment transitional activity signals since the length of tran-
sitional activity signals varies between elderly and adults.
Depending on the time to perform the transitional activi-
ties, the length of the activity signals for the elderly people
is relatively longer compared to the adults. For example,
the elderly took about 4s to complete stand-to-sit activ-
ity while the adult took about 3.18s to do so as shown in
Fig.2. This is reflected in Table11 where 0.36 of stand-to-
sit activity is misclassified as standing activity while 0.25 of
sit-to-lie activity is misclassified as walking and 0.45 of lie-
to-sit activity is misclassified as a sitting activity. The same
applies to other transitional activities. Table11 shows that
0.12 and 0.41 of sit-to-stand activity samples are misclassi-
fied as standing activity and walking activity respectively.
This confirms that subject variability (concerning age) per-
formance degradation applies to machine learning models.
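The effect of a fixed window on transitions of different durations can be sketched as follows. The window length, overlap, and 50 Hz sampling rate are assumptions for illustration, not values stated in this section:

```python
def fixed_windows(signal, window=100, step=50):
    """Fixed sliding-window segmentation with 50% overlap;
    a trailing partial window is dropped."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

# An elderly stand-to-sit taking ~4 s at an assumed 50 Hz spans ~200 samples,
# while an adult one taking ~3.18 s spans ~159 samples.
elderly = [0.0] * 200
adult = [0.0] * 159
print(len(fixed_windows(elderly)), len(fixed_windows(adult)))  # 3 2
# Every window has the same fixed length regardless of how long the
# transition actually took, so the tail of the slower elderly movement
# either spills into the next window or is cut off entirely.
```

This is why the same window configuration that works for adult transitions mis-segments the longer elderly ones.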
5.2 Activity recognition using deep learning

There is about a 10% performance drop when the deep learning models are trained and tested using the UCI HAPT dataset and local dataset respectively. Table 12 shows that activity recognition using the UCI HAPT dataset only achieved an average of 80.5%, while the average accuracy using the local dataset only is 82.5%. The average accuracy drops to 69.5% when the UCI HAPT dataset is used as the training set and the local dataset as the test set.
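The 80.5, 82.5, and 69.5% averages come straight from Table 12, and one way to arrive at the 12% drop quoted in the conclusions is the gap between the mean within-group accuracy and the cross-dataset accuracy (this reading of the figure is an assumption, as the paper does not spell out the formula):

```python
# Per-model accuracies from Table 12 (1D CNN1, 1D CNN2), in percent.
uci_only   = [80, 81]
local_only = [81, 84]
cross      = [69, 70]   # trained on UCI HAPT, tested on local

avg = lambda xs: sum(xs) / len(xs)
drop = (avg(uci_only) + avg(local_only)) / 2 - avg(cross)
print(avg(uci_only), avg(local_only), avg(cross), drop)
# 80.5 82.5 69.5 12.0
```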
To better describe the performance of the deep learning models, Tables 13, 14, and 15 show the confusion matrices (with CNN2) using the UCI HAPT dataset only, the local dataset only, and the UCI HAPT and local datasets respectively. As shown in Table 13, the recognition performance of the CNN2 model is high using the UCI HAPT dataset only: a true positive rate above 0.50 is observed for all activities except sit-to-lie. The classes with the best true positive rates include walking, sitting, standing, and sit-to-stand. Similarly, as shown in Table 14, the confusion matrix of the CNN2 model using the local dataset only shows few misclassification errors and high classification performance. All activity classes achieve a true positive rate of 0.50 and above. Walking (1.00), sitting (0.99), sit-to-lie (0.79), stand-to-sit (0.76), and lie-to-sit (0.75) have the best true positive rates, while the lying down activity has the worst at 0.50. This is due to the nature of static and transitional activities.
Conversely, the performance of the classification models drops when using the UCI HAPT dataset as the training set and the local dataset as the test set. Table 15 shows the confusion matrix of the CNN2 model in this setting. Although CNN performed better than the machine learning models, no great improvement is observed. As can be seen, walking, stand-to-sit, sitting, sit-to-stand, and lie-to-sit have relatively lower true positive rates compared to activity recognition using the UCI HAPT dataset only or the local dataset only. A similar misclassification pattern can be seen in the table, whereby the transitional activities have the highest misclassification rates. The fixed sliding window segmentation method failed to segment the transitional activity signals of the elderly due to their relatively longer length compared to those of the adults. This is reflected in Table 15, whereby 0.28 of the stand-to-sit activity is misclassified as standing, 0.24 of sit-to-stand is misclassified as walking, 0.21 of sit-to-lie is misclassified as walking, 0.27 of lying down is misclassified as sitting, and 0.30 of the lie-to-sit activity is misclassified as lying down.
Overall, the classification models did not achieve state-of-the-art performance. This is due to the limited size of the training set: the training samples were reduced after eliminating null classes and other non-related activities. The performance of the models was also
Table 12 Activity recognition results using deep learning

Models    UCI HAPT dataset (%)   Local dataset (%)   UCI HAPT and local datasets (%)
1D CNN1   80                     81                  69
1D CNN2   81                     84                  70
affected by the class imbalance present in both the benchmark and local datasets. Nevertheless, the results confirmed that subject variability occurs when classification models are trained and tested on datasets with different age groups. The results also indicate that subject variability degrades the performance of activity recognition.
We analyzed the activity signals of the elderly and adult subjects to determine the causes of the performance degradation. The degradation is mainly caused by the variation in the length of the transitional activity signals, which reflects the completion time of the activity, the speed of the walking activity, and the stability of the standing, sitting, and lying down activities between the elderly and adults. To analyze these three factors, we calculated the mean length of the transitional activities, while for the walking and static activities we calculated the variance of the magnitude of the three-axis acceleration within the window segmentation of all subjects in the local and UCI HAPT datasets. The analysis indicates that the mean lengths of the stand-to-sit and sit-to-stand activity signals of the elderly are approximately 0.16 and 0.095 s longer than those of the adults respectively. As a result, the fixed sliding window segmentation method failed to produce perfect segmentation due to the varying completion time of the activity. Table 16 shows the mean length of the stand-to-sit and sit-to-stand activity signals. The analysis also shows that the walking activity of adults generates higher acceleration than that of the elderly, as shown in Table 17, where the mean of the variance of the walking activity of adults is higher than that of the elderly. The mean of the variance also shows that the static activities of adults are more stable than those of the elderly, as shown in Table 18.
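The per-window statistic used in this analysis can be sketched as follows: the magnitude of the three-axis acceleration is computed sample by sample, and its variance within the window measures walking intensity and postural stability (the toy values below are illustrative, not measured data):

```python
import math

def magnitude_variance(ax, ay, az):
    """Variance of the three-axis acceleration magnitude within one window."""
    mag = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
    mean = sum(mag) / len(mag)
    return sum((m - mean) ** 2 for m in mag) / len(mag)

# A steadier signal (e.g. an adult holding a static posture) yields a
# lower magnitude variance than a spiky one.
steady = magnitude_variance([1.0, 1.01, 0.99, 1.0], [0.0] * 4, [0.0] * 4)
spiky  = magnitude_variance([1.0, 1.5, 0.6, 1.2], [0.0] * 4, [0.0] * 4)
print(steady < spiky)  # True
```

Averaging this quantity over all windows of a subject group gives the per-axis figures compared in Tables 17 and 18.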
Table 13 Confusion matrix of CNN2 using UCI HAPT dataset only
Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 0.95 0.00 0.05 0.00 0.00 0.00 0.00 0.00
Standing 0.01 0.84 0.08 0.01 0.03 0.01 0.01 0.01
Stand-to-sit 0.09 0.24 0.63 0.00 0.00 0.01 0.02 0.01
Sitting 0.06 0.02 0.00 0.88 0.00 0.00 0.02 0.02
Sit-to-stand 0.08 0.00 0.02 0.00 0.75 0.02 0.03 0.10
Sit-to-lie 0.25 0.12 0.31 0.00 0.04 0.24 0.00 0.04
Lying down 0.15 0.03 0.00 0.19 0.05 0.00 0.55 0.03
Lie-to-sit 0.01 0.09 0.17 0.02 0.02 0.00 0.01 0.68
Table 14 Confusion matrix of CNN2 using local dataset only

Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Standing 0.00 0.69 0.19 0.12 0.00 0.00 0.00 0.00
Stand-to-sit 0.07 0.09 0.76 0.00 0.04 0.02 0.02 0.00
Sitting 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.01
Sit-to-stand 0.17 0.04 0.00 0.00 0.68 0.04 0.07 0.00
Sit-to-lie 0.10 0.00 0.04 0.00 0.07 0.79 0.00 0.00
Lying down 0.00 0.00 0.00 0.38 0.12 0.00 0.50 0.00
Lie-to-sit 0.12 0.00 0.00 0.00 0.00 0.13 0.00 0.75
Table 15 Confusion matrix of CNN2 using UCI HAPT and local datasets
Walking Standing Stand-to-sit Sitting Sit-to-stand Sit-to-lie Lying down Lie-to-sit
Walking 0.84 0.00 0.01 0.01 0.00 0.00 0.04 0.10
Standing 0.00 0.75 0.07 0.05 0.00 0.00 0.06 0.07
Stand-to-sit 0.07 0.28 0.50 0.00 0.01 0.03 0.01 0.10
Sitting 0.00 0.07 0.00 0.85 0.00 0.00 0.02 0.06
Sit-to-stand 0.24 0.06 0.04 0.01 0.35 0.05 0.17 0.08
Sit-to-lie 0.21 0.03 0.02 0.03 0.16 0.36 0.11 0.08
Lying down 0.00 0.02 0.13 0.27 0.00 0.00 0.55 0.03
Lie-to-sit 0.02 0.04 0.04 0.12 0.09 0.11 0.30 0.28
The difference in the intensity of the walking activity and the presence of irregular, random, and sparse spikes in the static activity signals of the elderly caused the classification models trained on the UCI HAPT dataset to fail to generalize to the local dataset.
6 Conclusions

In this paper, we investigated the effect of subject variability on the performance of activity recognition. In the experiments, two datasets with different age groups were used to train and test machine learning and deep learning models. The experiments were carried out in three stages: activity recognition using the public dataset only, using the local dataset only, and using the public dataset for training and the local dataset for testing. The results show that subject variability due to age differences among the subjects significantly degrades the performance of the classification models. On average, the drop in recognition accuracy is 9.75 and 12% for machine learning and deep learning models respectively. The classification results showed that transitional activities such as stand-to-sit, sit-to-stand, sit-to-lie, and lie-to-sit suffer the worst performance decline compared to dynamic and static activities. This is due to the slower movements of elderly people compared to adults, which affects the window size: the fixed sliding window segmentation technique cannot adaptively adjust the window size to the time taken to perform the activity. The experimental results confirm that subject variability is an important issue that requires immediate attention. Future work will propose a method for solving the performance degradation caused by this kind of subject variability.
Acknowledgements This work was supported by the Universiti Sains
Malaysia and Ministry of Higher Education Malaysia under Fundamen-
tal Research Grant Scheme (Grant No. 203.PKOMP.6711798).
Funding This work was supported by the Universiti Sains Malay-
sia and Ministry of Higher Education Malaysia under Fundamental
Research Grant Scheme (Grant No. 203.PKOMP.6711798).
Availability of data and material Not applicable.
Code Availability Not applicable.
Declarations
Conflict of interest The authors declare that they have no conflicts of
interest to report regarding the present study.
Ethics approval Written informed consent was obtained prior to data
collection in accordance with the approval by the human research eth-
ics committee of Universiti Sains Malaysia (USM/JEPeM/18040205).
Consent to participate Informed consent was obtained from all par-
ticipants included in this study.
Consent for publication We give our consent for our paper to be pub-
lished in the Journal of Ambient Intelligence and Humanized Com-
puting.
Table 16 The mean length of transitional activity signals of elderly and adults
Category of subject Stand-to-sit (s) Sit-to-stand (s)
Elderly (Local dataset) 3.566 2.675
Adult (UCI HAPT) 3.408 2.580
Table 17 The mean of the variance of walking activity of elderly and adults

Category of subject       Ax      Ay      Az
Elderly (Local dataset) 0.021 0.035 0.034
Adult (UCI HAPT) 0.043 0.035 0.041
Table 18 The mean of the variance of static activity of elderly and adults

Category of subject       Ax      Ay      Az
Elderly (Local dataset) 0.284 0.244 0.072
Adult (UCI HAPT) 0.194 0.177 0.128
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Subject variability insensor-based activity recognition
1 3
Chiang TC, Bruno B, Menicatti R, Recchiuto CT, Sgorbissa A
(2019) Culture as a sensor? A novel perspective on human
activity recognition. Int J Soc Robot. https:// doi. org/ 10. 1007/
s12369- 019- 00590-3
Cornacchia M, Ozcan K, Zheng Y, Velipasalar S (2017) A survey on
activity detection and classification using wearable sensors. IEEE
Sens J 17(2):386–403. https:// doi. org/ 10. 1109/ JSEN. 2016. 26283
46
Dinarević EC, Husić JB, Baraković S (2019) Issues of human activity
recognition in healthcare. In:2019 18th International symposium
INFOTEH-JAHORINA, INFOTEH 2019-proceedings (March),
pp 20–22. https:// doi. org/ 10. 1109/ INFOT EH. 2019. 87177 49
Gani MO, Fayezeen T, Povinelli RJ, Smith RO, Arif M, Kattan AJ,
Ahamed SI (2019) A light weight smartphone based human activ-
ity recognition system with high accuracy. J Netw Comput Appl
141:59–72. https:// doi. org/ 10. 1016/j. jnca. 2019. 05. 001
Gil-Martín M, San-Segundo R, Fernández-Martínez F, Ferreiros-López
J (2020) Improving physical activity recognition using a new deep
learning architecture and post-processing techniques. Eng Appl
Artif Intell 92:103679. https:// doi. org/ 10. 1016/j. engap pai. 2020.
103679
Hammerla NY, Halloran S, Plötz T (2016) Deep, convolutional, and
recurrent models for human activity recognition using wearables.
In: IJCAI international joint conference on artificial intelligence
2016-Janua, pp 1533–1540
Howdon D, Rice N (2018) Health care expenditures, age, proximity
to death and morbidity: implications for an ageing population. J
Health Econ 57:60–74. https:// doi. org/ 10. 1016/j. jheal eco. 2017.
11. 001
Hussain Z, Sheng M, Zhang WE (2019) Different approaches for
human activity recognition: a survey. arXiv preprint arXiv
190605074, pp 1–28
Khatun S, Morshed BI (2018) Fully-automated human activity recog-
nition with transition awareness from wearable sensor data for
mHealth. In: IEEE international conference on electro information
technology 2018-May, pp 934–938. https:// doi. org/ 10. 1109/ EIT.
2018. 85001 35
Khusainov R, Azzi D, Achumba IE, Bersch SD (2013) Real-time
human ambulation, activity, and physiological monitoring: tax-
onomy of issues, techniques, applications, challenges and limita-
tions. Sensors (Switzerland) 13(10):12852–12902. https:// doi. org/
10. 3390/ s1310 12852
Kim Y, Toomajian B (2016) Hand gesture recognition using micro-
doppler signatures with convolutional neural network. IEEE
Access 4:7125–7130. https:// doi. org/ 10. 1109/ ACCESS. 2016.
26172 82
Kwapisz JR, Weiss GM, Moore SA (2011) Activity recognition
using cell phone accelerometers. ACM SIGKDD Explor Newsl
12(2):74. https:// doi. org/ 10. 1145/ 19648 97. 19649 18
Labrador MA, Yejas ODL (2011) Human activity recognition using
wearable sensors and smartphones. CRC, Cambridge
Lara ÓD, Labrador MA (2013) A survey on human activity recognition
using wearable sensors. IEEE Commun Surv Tutor 15(3):1192–
1209. https:// doi. org/ 10. 1109/ SURV. 2012. 110112. 00192
Lee G, Choi B, Jebelli H, Ahn CR, Lee SH (2020) Wearable biosensor
and collective sensing-based approach for detecting older adults
environmental barriers. J Comput Civ Eng 34(2):1–12. https:// doi.
org/ 10. 1061/ (ASCE) CP. 1943- 5487. 00008 79
Lv T, Wang X, Jin L, Xiao Y, Song M (2020) Margin-based deep
learning networks for human activity recognition. Sensors (Swit-
zerland). https:// doi. org/ 10. 3390/ s2007 1871
Mannini A, Intille SS (2019) Classifier personalization for activity
recognition using wrist accelerometers. IEEE J Biomed Health
Inform 23(4):1585–1594. https:// doi. org/ 10. 1109/ JBHI. 2018.
28697 79
Mohammed Hashim BA, Amutha R (2020) Human activity recognition
based on smartphone using fast feature dimensionality reduction
technique. J Ambient Intell Humaniz Comput. https:// doi. org/ 10.
1007/ s12652- 020- 02351-x
Nambu M, Nakajima K, Kawarada A, Tamura T (2000) The automatic
health monitoring system for home health care. In: Proceedings of
the IEEE/EMBS region 8 international conference on information
technology applications in biomedicine, ITAB, pp 79–82. https://
doi. org/ 10. 1109/ itab. 2000. 892353
Nweke HF, Teh YW, Al-garadi MA, Alo UR (2018) Deep learning
algorithms for human activity recognition using mobile and wear-
able sensor networks: State of the art and research challenges.
Expert Syst Appl 105:233–261. https:// doi. org/ 10. 1016/j. eswa.
2018. 03. 056
Piyathilaka L, Kodagoda S (2015) Human activity recognition for
domestic robots. Springer, Cham
Reiss A, Stricker D (2012) Introducing a new benchmarked dataset for
activity monitoring. In: Proceedings - international symposium
on wearable computers, ISWC (June 2012), pp 108–109. https://
doi. org/ 10. 1109/ ISWC. 2012. 13
Reyes-Ortiz JL, Oneto L, Samà A, Parra X, Anguita D (2016) Transi-
tion-aware human activity recognition using smartphones. Neuro-
computing 171:754–767. https:// doi. org/ 10. 1016/j. neucom. 2015.
07. 085
Rezaie H, Ghassemian M (2018) Comparison analysis of Radio_Based
and Sensor_Based wearable human activity recognition systems.
Wirel Pers Commun 101(2):775–797. https:// doi. org/ 10. 1007/
s11277- 018- 5715-4
Richter J, Wiede C, Dayangac E, Shahenshah A, Hirtz G (2017) Activ-
ity recognition for elderly care by evaluating proximity to objects
and human skeleton data. In: Lecture notes in computer science
(including subseries lecture notes in artificial intelligence and lec-
ture notes in bioinformatics) 10163 LNCS, pp 139–155. https://
doi. org/ 10. 1007/ 978-3- 319- 53375-9_8
Roggen D, Calatroni A, Rossi M, Holleczek T, Förster K, Tröster G,
Lukowicz P, Bannach D, Pirkl G, Ferscha A, Doppler J, Holz-
mann C, Kurz M, Holl G, Chavarriaga R, Sagha H, Bayati H,
Creatura M, Del R. Millàn J (2010) Collecting complex activity
datasets in highly rich networked sensor environments. In: INSS
2010–7th international conference on networked sensing systems,
pp 233–240. https:// doi. org/ 10. 1109/ INSS. 2010. 55734 62
Sajjad Hossain HM, Roy N (2019) Active deep learning for activity
recognition with context aware annotator selection. In: Proceed-
ings of the ACM SIGKDD international conference on knowl-
edge discovery and data mining, pp1862–1870. https:// doi. org/
10. 1145/ 32925 00. 33306 88
Sakuma Y, Kleisarchaki S, Gurgen L, Nishi H (2019) Exploring vari-
ability in IoT data for human activity recognition. In: IECON
Proceedings (industrial electronics conference) 2019-Octob, pp
5312–5318. https:// doi. org/ 10. 1109/ IECON. 2019. 89274 72
Satapathy SC, Das S (2016) PCA based optimal ANN classifiers for
human activity recognition using mobile sensors data. Springer,
Cham
Straczkiewicz M, Onnela J (2019) A systematic review of human activ-
ity recognition using smartphones. arXiv e-prints arXiv: 1910.
03970
Subasi A, Khateeb K, Brahimi T, Sarirete A (2020) Human activity
recognition using machine learning methods in a smart healthcare
environment. Elsevier Inc, Amsterdam
United Nations (2019) World population prospects 2019. Ten key find-
ings. https:// popul ation. un. org/ wpp/ Publi catio ns/ Files/ WPP20 19_
10Key Findi ngs. pdf
Van Kasteren TLM, Englebienne G, Kröse BJA (2010) An activity
monitoring system for elderly care using generative and discrimi-
native models. Pers Ubiquit Comput 14(6):489–498. https:// doi.
org/ 10. 1007/ s00779- 009- 0277-9
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
A.Jimale, M.Mohd Noor
1 3
Vepakomma P, De D, Das SK, Bhansali S (2015) A-Wristocracy: deep
learning on wrist-worn sensing for recognition of user complex
activities. In:2015 IEEE 12th international conference on wear-
able and implantable body sensor networks, BSN 2015, pp 1–6.
https:// doi. org/ 10. 1109/ BSN. 2015. 72994 06
Vijayaprabakaran K, Sathiyamurthy K, Ponniamma M (2020) Video-
based human activity recognition for elderly using convolutional
neural network. Int J Secur Priv Pervasive Comput. https:// doi.
org/ 10. 4018/ IJSPPC. 20200 10104
Xia K, Huang J, Wang H (2020) LSTM-CNN architecture for human
activity recognition. IEEE Access 8:56855–56866. https:// doi. org/
10. 1109/ ACCESS. 2020. 29822 25
Xu LI, He FX, Tian Z, Liu WEI (2020) Harmonic loss function for sen-
sor-based human activity recognition based on LSTM recurrent
neural networks. IEEE Access. https:// doi. org/ 10. 1109/ ACCESS.
2020. 30031 62
Yao L, Sheng QZ, Li X, Gu T, Tan M, Wang X, Wang S, Ruan W
(2018) Compressive representation for device-free activity recog-
nition with passive RFID signal strength. IEEE Trans Mob Com-
put 17(2):293–306. https:// doi. org/ 10. 1109/ TMC. 2017. 27062 82
Zahin A, Tan LT, Hu RQ (2019) Sensor-based human activity recog-
nition for smart healthcare: a semi-supervised machine learning.
Springer, Cham
Zhang H, Xiao Z, Wang J, Li F, Szczerbicki E (2020) A novel IoT-
perceptive human activity recognition (HAR) approach using mul-
tihead convolutional attention. IEEE Internet Things J 7(2):1072–
1080. https:// doi. org/ 10. 1109/ JIOT. 2019. 29497 15
Publisher’s Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
... The elderly dataset was sampled in fixed-width sliding windows of 2 seconds and 50% overlap (100 readings/window). A full description of the used dataset can be found in section 4.1.1 of [48]. Preprocessed data is used to train the FCGAN and compare it with the state-of-the-art Unified CGAN by Chan and Noor [26]. ...
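The segmentation described above (2 s fixed-width windows of 100 readings, implying a 50 Hz sampling rate, with 50% overlap) can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def sliding_windows(signal, win_len=100, overlap=0.5):
    """Segment a (samples, channels) signal into fixed-width windows.

    win_len=100 corresponds to 2 s at 50 Hz, as in the elderly dataset;
    the 50% overlap means consecutive windows share 50 readings.
    """
    step = int(win_len * (1 - overlap))
    windows = []
    for start in range(0, len(signal) - win_len + 1, step):
        windows.append(signal[start:start + win_len])
    if not windows:
        return np.empty((0, win_len, signal.shape[1]))
    return np.stack(windows)

# Example: 10 s of 3-axis accelerometer data at 50 Hz -> 9 overlapping windows
data = np.zeros((500, 3))
wins = sliding_windows(data)
```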
... The third evaluation technique, usability evaluation, aims to investigate the quality of the synthetic data generated by the proposed architecture in improving the performance of activity recognition classification models. First, the synthetic data generated by both FCGAN architecture and Unified GAN is preprocessed to perform four experiments on it together with the real data using the best-performing deep learning classifiers by Jimale & Mohd Noor [48]. These experiments are experiments on 70% real data and 30% synthetic data, experiments on 50% real data and 50% synthetic data, experiments on 30% real data and 70% synthetic data, and experiments on 100% synthetic data. ...
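The four real/synthetic mixing experiments can be sketched as a simple sampling routine (function and variable names are assumptions, not the authors' code):

```python
import numpy as np

def mix_training_set(real, synthetic, real_frac, rng):
    """Build a training set with the given fraction of real samples,
    mirroring the 70/30, 50/50, 30/70 and 0/100 experiments.
    Assumes equal-sized pools for simplicity."""
    n = min(len(real), len(synthetic))
    n_real = int(round(n * real_frac))
    n_syn = n - n_real
    idx_r = rng.choice(len(real), n_real, replace=False)
    idx_s = rng.choice(len(synthetic), n_syn, replace=False)
    return np.concatenate([real[idx_r], synthetic[idx_s]])

rng = np.random.default_rng(0)
real = np.zeros((200, 6))   # stand-in for real windows
syn = np.ones((200, 6))     # stand-in for CGAN-generated windows
train = mix_training_set(real, syn, 0.7, rng)  # the 70% real / 30% synthetic case
```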
Article
Full-text available
Conditional Generative Adversarial Networks (CGAN) have shown great promise in generating synthetic data for sensor-based activity recognition. However, one key issue concerning existing CGAN is the design of the network architecture that affects sample quality. This study proposes an effective CGAN architecture that synthesizes higher quality samples than state-of-the-art CGAN architectures. This is achieved by combining convolutional layers with multiple fully connected networks in the generator's input and discriminator's output of the CGAN. We show the effectiveness of the proposed approach using elderly data for sensor-based activity recognition. Visual evaluation, similarity measure, and usability evaluation are used to assess the quality of samples generated by the proposed approach and validate its performance in activity recognition. In comparison to the state-of-the-art CGAN, the visual evaluation and similarity measure demonstrate that the proposed model's synthetic data represents the actual data more accurately and with greater variation. The experimental stages of the usability evaluation, on the other hand, show a performance gain of 2.5%, 2.5%, 3.1%, and 4.4% over the state-of-the-art CGAN when using synthetic samples from the proposed architecture.
... The authors of [22,23,24] achieved promising classification results for the UCI-HAPT dataset. However, in this work, we did not use the proposed frequency-domain variables but instead extracted samples from the raw sensor readings. ...
Conference Paper
Full-text available
Human Activity Recognition (HAR) has recently come into the spotlight of scientific research due to the development and proliferation of wearable sensors. HAR has found applications in areas such as digital health, mobile medicine, sports, abnormal activity detection and fall prevention. Neural networks have recently become a widespread method for dealing with HAR problems due to their ability to automatically extract and select features from raw sensor data. However, this approach requires extensive training datasets to perform sufficiently under diverse circumstances. This study proposes a novel Deep Learning-based model, pre-trained on the KU-HAR dataset. The raw, six-channel sensor data was preliminarily processed using the Continuous Wavelet Transform (CWT) for better performance. Nine popular Convolutional Neural Network (CNN) architectures, as well as different wavelets and scale values, were tested to choose the best-performing combination. The proposed model was tested on the whole UCI-HAPT dataset and its subset to assess how it performs on new activities and different amounts of training data. The results show that using the pre-trained model, especially with frozen layers, leads to improved performance, smoother gradient descent and faster training on small datasets. Additionally, the model performed on the KU-HAR dataset with a classification accuracy of 97.48% and an F1-score of 97.52%, which is competitive compared to other state-of-the-art HAR models.
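A minimal sketch of turning one sensor channel into a CWT scaleogram image for a CNN, assuming a Ricker wavelet and hand-rolled convolution (the study tested several wavelets and scale values, which are not specified here):

```python
import numpy as np

def ricker(points, a):
    """Ricker ('Mexican hat') wavelet, a common choice for CWT scaleograms."""
    t = np.arange(points) - (points - 1) / 2.0
    amp = 2 / (np.sqrt(3 * a) * np.pi ** 0.25)
    return amp * (1 - (t / a) ** 2) * np.exp(-(t ** 2) / (2 * a ** 2))

def cwt_scaleogram(signal, scales):
    """Convolve one sensor channel with wavelets at several scales,
    producing a (scales, time) image suitable as CNN input."""
    out = np.empty((len(scales), len(signal)))
    for i, a in enumerate(scales):
        w = ricker(min(10 * int(a), len(signal)), a)
        out[i] = np.convolve(signal, w, mode="same")
    return out

sig = np.sin(np.linspace(0, 8 * np.pi, 300))     # toy 1-channel signal
img = cwt_scaleogram(sig, scales=np.arange(1, 31))
```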
... 5,6 However, the recognition task is difficult due to the vast number of sensor modalities, noisy data, variances in the spatial and temporal dimensions of the feature space between people, 7 and also, the variability when a subject or different subjects perform the same task at various times, among other factors. 8 Examples of wearable sensors include accelerometers, gyroscopes, magnetometers, and others. Various machine learning models have been proposed to recognize the activities collected using these sensors; an example of such can be seen in Sani et al. 9 where the authors used K-Nearest Neighbor for classification. ...
Article
With the development of deep learning, numerous models have been proposed for human activity recognition to achieve state-of-the-art recognition on wearable sensor data. Despite the improved accuracy achieved by previous deep learning models, activity recognition remains a challenge, often attributed to the complexity of some specific activity patterns. Existing deep learning models proposed to address this have often recorded high overall recognition accuracy, while low recall and precision are often recorded on some individual activities due to the complexity of their patterns. Models that have focused on tackling these issues are typically bulky and complex. Since most embedded systems have resource constraints in terms of processor, memory and battery capacity, it is paramount to propose efficient, lightweight activity recognition models that require limited resource consumption yet are still capable of achieving state-of-the-art recognition of activities with high individual recall and precision. This research proposes a high-performance, low-footprint deep learning model with a squeeze and excitation block to address this challenge. The squeeze and excitation block consists of a global average-pooling layer and two fully connected layers, which were placed to extract the flattened features in the model, with best-fit reduction ratios in the squeeze and excitation block. The squeeze and excitation block serves as channel-wise attention, adjusting the weight of each channel to build more robust representations, which enables the network to become more responsive to essential features while suppressing less important ones. By using the best-fit reduction ratio in the squeeze and excitation block, the parameters of the fully connected layer are reduced, which helps the model increase responsiveness to essential features.
Experiments on three publicly available datasets (PAMAP2, WISDM, and UCI-HAR) showed that the proposed model outperformed the existing state-of-the-art with fewer parameters and increased the recall and precision of some individual activities compared to the baseline and existing models.
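The squeeze-and-excitation block described above reduces, in essence, to global average pooling plus a two-layer bottleneck that rescales channels; a minimal NumPy sketch (the reduction ratio r = 4 is an assumed value, not the paper's best-fit ratio):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(features, w1, w2):
    """Channel-wise attention: global average pooling (squeeze), two fully
    connected layers with a reduction ratio (excite), then rescaling.

    features: (time, channels); w1: (channels, channels // r); w2: (channels // r, channels).
    """
    z = features.mean(axis=0)                 # squeeze: one value per channel
    s = sigmoid(np.maximum(z @ w1, 0) @ w2)   # excite: bottleneck FC + ReLU, then FC + sigmoid
    return features * s                       # reweight each channel by its attention score

rng = np.random.default_rng(1)
x = rng.normal(size=(128, 16))                # toy feature maps: 128 timesteps, 16 channels
r = 4                                         # reduction ratio (assumed value)
w1 = rng.normal(size=(16, 16 // r))
w2 = rng.normal(size=(16 // r, 16))
out = squeeze_excite(x, w1, w2)
```

Because the sigmoid scores lie in (0, 1), each channel is only ever attenuated, never amplified, which is what lets the block suppress less important features.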
... Accurate HAR in neurological populations requires diverse data from multiple participants with a broad range of ages, fitness levels, disease duration, mood, and health conditions to ensure inter-subject and intrasubject variability have minimal impact on recognition accuracy [47]. For example, people with different stroke types (e.g., ischemic, hemorrhagic) and post-stroke recovery durations may show different levels of impaired mobility during stair ascending/descending. Increasing the size of the dataset may also contribute to minimizing the impact of subject variability in classification models. ...
Article
Full-text available
Inertial sensor-based human activity recognition (HAR) has a range of healthcare applications as it can indicate the overall health status or functional capabilities of people with impaired mobility. Typically, artificial intelligence models achieve high recognition accuracies when trained with rich and diverse inertial datasets. However, obtaining such datasets may not be feasible in neurological populations due to, e.g., impaired patient mobility preventing the performance of many daily activities. This study proposes a novel framework to overcome the challenge of creating rich and diverse datasets for HAR in neurological populations. The framework produces images from numerical inertial time-series data (initial state) and then artificially augments the number of produced images (enhanced state) to achieve a larger dataset. Here, we used convolutional neural network (CNN) architectures utilizing image input. In addition, CNN enables transfer learning, which allows limited datasets to benefit from models trained with big data. Initially, two benchmarked public datasets were used to verify the framework. Afterwards, the approach was tested on limited local datasets of healthy subjects (HS), a Parkinson's disease (PD) population and stroke survivors (SS) to further investigate validity. The experimental results show that when data augmentation is applied, recognition accuracies increased in HS, SS and PD by 25.6%, 21.4% and 5.8%, respectively, compared to the no-augmentation state. In addition, data augmentation contributes to better detection of stair ascent and stair descent by 39.1% and 18.0%, respectively, in limited local datasets. Findings also suggest that CNN architectures with a small number of deep layers can achieve high accuracy. This study has the potential to reduce the burden on participants and researchers where limited datasets are accrued.
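Signal-level jittering and magnitude scaling is one common way to augment limited inertial datasets; the framework above augments image representations instead, so this sketch is only illustrative of the general idea, and its parameters are assumptions:

```python
import numpy as np

def augment(window, rng, sigma=0.05, scale_range=(0.9, 1.1)):
    """One common augmentation pair for inertial windows: additive Gaussian
    jitter plus per-channel magnitude scaling. Illustrative only; the cited
    framework augments image representations of the signals instead."""
    noise = rng.normal(0, sigma, size=window.shape)
    scale = rng.uniform(*scale_range, size=(1, window.shape[1]))
    return (window + noise) * scale

rng = np.random.default_rng(2)
win = np.zeros((100, 3))       # toy 2 s, 3-axis window
aug = augment(win, rng)        # a new, slightly perturbed training sample
```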
Article
Full-text available
Human activity recognition (HAR) has been a very popular field in both real practice and theoretical research. Over the years, a number of many-vs-one Long Short-Term Memory (LSTM) models have been proposed for the sensor-based HAR problem. However, how to utilize their sequence outputs to improve HAR performance has not been studied seriously. To solve this problem, we present a novel loss function named harmonic loss, which is utilized to improve the overall classification performance of HAR based on baseline LSTM networks. First, a label replication method is presented to duplicate true labels at each sequence step in many-vs-one LSTM networks, so that each sequence step can generate a local error and a local output. Then, considering the importance of different local errors and inspired by the Ebbinghaus memory curve, the harmonic loss is proposed to give unequal weights to different local errors based on the harmonic series equation. Additionally, to improve the overall classification performance of HAR, integrated methods are utilized to exploit the sequence outputs of LSTM models based on harmonic loss and an ensemble learning strategy. Finally, based on the LSTM model construction and hyper-parameter setting, extensive experiments are conducted. A series of experimental results demonstrate that our harmonic loss significantly achieves higher macro-F1 and accuracy than strong baselines on two public HAR benchmarks. Compared with previous state-of-the-art methods, our proposed methods can achieve competitive classification performance.
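The harmonic weighting idea can be sketched as follows; the weights grow toward later sequence steps via a harmonic series, though the paper's exact equation may differ from the 1/(T − t) form assumed here:

```python
import numpy as np

def harmonic_weights(T):
    """Harmonic-series step weights, normalized to sum to 1. Later steps
    (which have seen more of the sequence) get larger weight: w_t is
    proportional to 1/(T - t) for t = 0..T-1 (assumed form)."""
    w = 1.0 / (T - np.arange(T))
    return w / w.sum()

def harmonic_loss(step_losses):
    """Weighted sum of per-step (local) losses produced by label
    replication in a many-vs-one LSTM."""
    w = harmonic_weights(len(step_losses))
    return float(np.dot(w, step_losses))

# Equal local losses at every step -> the weighted sum equals that loss
loss = harmonic_loss(np.array([1.0, 1.0, 1.0, 1.0]))
```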
Article
Full-text available
Human activity recognition aims to identify the activities carried out by a person. Recognition is possible by using information retrieved from numerous physiological signals by attaching sensors to the subject's body. Lately, sensors such as the accelerometer and gyroscope are built into the smartphone itself, which makes activity recognition much simpler. For an activity recognition system to work properly on a power-constrained smartphone, it is essential to use an optimization technique that can reduce the number of features used in the dataset with little time consumption. In this paper, we have proposed a dimensionality reduction technique called the fast feature dimensionality reduction technique (FFDRT). A dataset (UCI HAR repository) available in the public domain is used in this work. Results from this study show that the fast feature dimensionality reduction technique reduced the number of features from 561 to 66 while maintaining activity recognition accuracy at 98.72% using a random forest classifier, and the time consumed in the dimensionality reduction stage using FFDRT is well below that of state-of-the-art techniques.
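FFDRT's actual procedure is not described in the abstract; as a generic stand-in, a correlation-filter reduction illustrates the kind of feature pruning involved (threshold and names are assumptions):

```python
import numpy as np

def drop_correlated(X, threshold=0.95):
    """Generic correlation-filter reduction (a stand-in sketch, not FFDRT
    itself): drop any feature that is highly correlated with an earlier
    kept feature, keeping one representative per redundant group."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return X[:, keep], keep

rng = np.random.default_rng(3)
a = rng.normal(size=(50, 1))
# Column 1 is a linear copy of column 0, so it should be pruned
X = np.hstack([a, a * 2.0 + 1e-6, rng.normal(size=(50, 1))])
Xr, kept = drop_correlated(X)
```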
Article
Full-text available
Human activity recognition (HAR) is a popular and challenging research topic, driven by a variety of applications. More recently, with significant progress in the development of deep learning networks for classification tasks, many researchers have made use of such models to recognise human activities in a sensor-based manner, which have achieved good performance. However, sensor-based HAR still faces challenges; in particular, recognising similar activities that only have a different sequentiality and similarly classifying activities with large inter-personal variability. This means that some human activities have large intra-class scatter and small inter-class separation. To deal with this problem, we introduce a margin mechanism to enhance the discriminative power of deep learning networks. We modified four kinds of common neural networks with our margin mechanism to test the effectiveness of our proposed method. The experimental results demonstrate that the margin-based models outperform the unmodified models on the OPPORTUNITY, UniMiB-SHAR, and PAMAP2 datasets. We also extend our research to the problem of open-set human activity recognition and evaluate the proposed method’s performance in recognising new human activities.
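One common form of margin mechanism subtracts a margin from the true-class logit before the softmax, forcing larger inter-class separation; a sketch (the paper evaluates several network variants, so this additive-margin form and its margin value are assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def margin_cross_entropy(logits, label, margin=0.35):
    """Cross-entropy with an additive margin: the true-class logit is
    lowered by `margin` before softmax, so the model must score the true
    class well above the others to keep the loss small."""
    z = logits.copy()
    z[label] -= margin
    p = softmax(z)
    return -np.log(p[label])

z = np.array([2.0, 0.5, 0.1])
plain = margin_cross_entropy(z, 0, margin=0.0)    # ordinary cross-entropy
with_m = margin_cross_entropy(z, 0, margin=0.35)  # margin penalizes small gaps
```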
Article
Full-text available
In the past years, traditional pattern recognition methods have made great progress. However, these methods rely heavily on manual feature extraction, which may hinder the generalization performance of a model. With the increasing popularity and success of deep learning methods, using these techniques to recognize human actions in mobile and wearable computing scenarios has attracted widespread attention. In this paper, a deep neural network that combines convolutional layers with long short-term memory (LSTM) is proposed. This model can extract activity features automatically and classify them with few model parameters. LSTM is a variant of the recurrent neural network (RNN), which is more suitable for processing temporal sequences. In the proposed architecture, the raw data collected by mobile sensors is fed into a two-layer LSTM followed by convolutional layers. In addition, a global average pooling layer (GAP) is applied to replace the fully connected layer after convolution to reduce model parameters. Moreover, a batch normalization layer (BN) is added after the GAP layer to speed up convergence, yielding clear improvements. The model performance was evaluated on three public datasets (UCI, WISDM, and OPPORTUNITY). The overall accuracy of the model is 95.78% on the UCI-HAR dataset, 95.85% on the WISDM dataset, and 92.63% on the OPPORTUNITY dataset. The results show that the proposed model has higher robustness and better activity detection capability than some previously reported models. It can not only adaptively extract activity features, but also has fewer parameters and higher accuracy.
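The parameter saving from replacing a flatten-plus-dense head with global average pooling (GAP) can be made concrete; sizes below are illustrative, not the paper's actual layer dimensions:

```python
import numpy as np

def gap_head(feature_maps, w_cls):
    """GAP classification head: average each feature map over time, then
    apply one small dense layer, instead of flattening all timesteps into
    a large fully connected layer."""
    pooled = feature_maps.mean(axis=0)   # (channels,) - one value per map
    return pooled @ w_cls                # (n_classes,) logits

# Parameter comparison for 128 timesteps x 64 channels and 6 classes:
fc_params = 128 * 64 * 6     # flatten + dense over every timestep
gap_params = 64 * 6          # dense over the pooled vector only

rng = np.random.default_rng(4)
logits = gap_head(rng.normal(size=(128, 64)), rng.normal(size=(64, 6)))
```

Here GAP cuts the head's weight count by a factor equal to the number of timesteps (128x in this toy example), which is why it suits resource-constrained deployment.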
Article
Full-text available
A typical healthcare application for elderly people involves monitoring daily activities and providing them with assistance. Automatic analysis and classification of an image by the system is difficult compared to human vision. Several challenging problems for activity recognition from surveillance video involve the complexity of scene analysis under irregular lighting and low-quality frames. In this article, the authors' system uses machine learning algorithms to improve the accuracy of activity recognition. The system presents a convolutional neural network (CNN), a machine learning algorithm used for image classification, and aims to recognize and assist human activities of elderly people using input surveillance videos. The RGB images in the dataset are used for training, which requires more computational power for image classification. By using the CNN network for image classification, the authors obtain 79.94% accuracy in the experiments, which shows that the model achieves good accuracy for image classification when compared with other pre-trained models.
Article
Full-text available
In this rapidly aging society, the mobility of older adults is critical for the prosperity and well-being of communities. Despite such importance, various types of environmental barriers (e.g., steep slopes and uneven sidewalks) have limited their mobility. Recent wearable biosensors have shown the potential to less invasively, less laboriously, and continuously detect environmental barriers by measuring stress in older adults’ daily trips. However, stress alone could not be indicative of environmental barriers because various stress stimuli (e.g., emotions and physical fatigue) are mixed up in their daily trips. To fill this gap, the authors propose and test a computational approach that spatially identifies stress resulting from environmental barriers by aggregating multiple people’s physiological and location data. The proposed approach measures stress commonly sensed from multiple people in a specific location (collective stress) as an indication of environmental barriers, applying wearable biosensors, signal processing, and geocoding. To test the feasibility of the proposed approach, collective stress was compared between locations with and without environmental barriers based on 2 weeks of field data collected from the daily trips of 16 subjects. As a result, the collective stress was statistically higher in the locations with environmental barriers than without. This result shows that the proposed approach is feasible to compute collective stress measures that are indicative of environmental barriers. This finding contributes to the body of knowledge by confirming the feasibility of a new computational approach that understands locational stress-inducing factors by spatially aggregating multiple people’s physiological signals using wearable biosensors, signal processing, and geocoding. 
Given the feasibility of the proposed approach to detect environmental barriers, future studies can generate and validate a less invasive, less laborious, and continuous method to detect environmental barriers, which can facilitate mobility improvement.
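The aggregation step above can be sketched as binning geocoded stress readings from multiple subjects into grid cells and averaging per cell, so stress shared across people at one location stands out; the cell size and record format are assumptions:

```python
import numpy as np
from collections import defaultdict

def collective_stress(records, cell=0.001):
    """Aggregate geocoded stress readings from many subjects into grid
    cells (cell size in degrees is an assumed parameter) and average,
    yielding a collective-stress value per location."""
    cells = defaultdict(list)
    for lat, lon, stress in records:
        key = (round(lat / cell), round(lon / cell))  # snap to grid cell
        cells[key].append(stress)
    return {k: float(np.mean(v)) for k, v in cells.items()}

# Two subjects stressed at the same spot, one calm reading elsewhere
recs = [(40.0001, -75.0001, 0.9),
        (40.0001, -75.0002, 0.8),
        (40.0100, -75.0100, 0.2)]
agg = collective_stress(recs)
```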
Article
Full-text available
Recognizing activities of daily living (ADL) provides vital contextual information that enhances the effectiveness of various mobile health and wellness applications. The development of wearable motion sensors along with machine learning algorithms offers a great opportunity for ADL recognition. However, the performance of ADL recognition systems may significantly degrade when they are used by a new user due to inter-subject variability. This issue limits the usability of these systems. In this paper, we propose a deep learning assisted personalization framework for ADL recognition with the aim of maximizing personalization performance while minimizing solicitation of inputs or labels from the user to reduce the user's burden. The proposed framework consists of unsupervised retraining of automatic feature extraction layers and supervised fine-tuning of classification layers through a novel active learning model based on a given model's uncertainty. We design a Bayesian deep convolutional neural network with stochastic latent variables that allows us to estimate both aleatoric (data-dependent) and epistemic (model-dependent) uncertainties in the recognition task. In this study, for the first time, we show how distinguishing between the two aforementioned sources of uncertainty leads to more effective active learning. The experimental results show that our proposed method improves the accuracy of ADL recognition for a new user by 25% on average compared to using a model for a new user with no personalization, with an average final accuracy of 89.2%. Moreover, our method achieves higher personalization accuracy while significantly reducing the user's burden in terms of soliciting inputs and labels compared to other methods.
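The aleatoric/epistemic split can be sketched with the standard entropy decomposition over stochastic forward passes; this is a common formulation, and the paper's Bayesian CNN details may differ:

```python
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def uncertainty_decomposition(mc_probs):
    """Split predictive uncertainty from stochastic forward passes (e.g.
    MC sampling of latent variables) into parts: total = entropy of the
    mean prediction, aleatoric = mean per-pass entropy, and epistemic =
    their difference (the mutual information)."""
    total = float(entropy(mc_probs.mean(axis=0)))
    aleatoric = float(np.mean(entropy(mc_probs)))
    return total, aleatoric, total - aleatoric

# Passes that disagree confidently -> high epistemic (model) uncertainty
disagree = np.array([[0.99, 0.01], [0.01, 0.99]])
t1, a1, e1 = uncertainty_decomposition(disagree)
# Passes that agree on a flat prediction -> purely aleatoric (data) uncertainty
agree = np.array([[0.5, 0.5], [0.5, 0.5]])
t2, a2, e2 = uncertainty_decomposition(agree)
```

Active learning can then solicit labels only where epistemic uncertainty is high, since aleatoric noise cannot be reduced by more labels.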
Article
This paper proposes a Human Activity Recognition system composed of three modules. The first one segments the acceleration signals into overlapped windows and extracts information from each window in the frequency domain. The second module detects the performed activity at each window using a deep learning structure based on Convolutional Neural Networks (CNNs). The first part of this structure has several layers associated to each sensor independently and the second part combines the outputs from all sensors in order to classify the physical activity. The third module integrates the window-level decision in longer periods of time, obtaining a significant performance improvement (from 89.83% to 96.62%). These are the best classification results on the PAMAP2 dataset with a Leave-One-Subject-Out (LOSO) evaluation.