All content in this area was uploaded by Hasala Marakkalage on Jan 03, 2019
Understanding the Lifestyle of Older Population:
Mobile Crowdsensing Approach
Sumudu Hasala Marakkalage, Serhad Sarica, Billy Pik Lik Lau, Sanjana Kadaba Viswanath,
Thirunavukarasu Balasubramaniam, Chau Yuen, Senior Member, IEEE, Belinda Yuen, Jianxi Luo,
and Richi Nayak
Abstract—In this paper, we present a mobile crowdsensing
approach to understand the daily lifestyle of older population
in Singapore. By implementing novel clustering, sensor fusion,
and user profiling techniques to analyze the multi-sensor data
(location, noise, light etc.) collected from a smartphone applica-
tion, we identified the travel patterns at several points of interest
(POI), the impact of travel frequency for certain POI, and three
main user profiles. The results show that older adults mostly
spend time at food courts and community centers in their home
neighbourhood, but they travel away from neighbourhood for
healthcare and religious purposes. We found that POIs have more
visits if they are easily accessible (in terms of travel time from
home) regardless of the distance from home.
Index Terms—Crowdsensing, Human Mobility, Sensor Fusion,
Urban Computing, User Profiling.
I. INTRODUCTION
THE ubiquitous nature of smartphones nowadays has
made mobile crowdsensing a more convenient method
to collect data from the masses, with the emerging concepts of
the internet of people (IoP) [1] and smart communities [2].
Since smartphones have various types of built-in sensors, there
is potential to collect more data more easily through
smartphone-based applications than through traditional data
collection methods such as custom-built sensor nodes [3], [4],
surveys, and questionnaires. The portability of smartphones
enables the collection of user-centric data such as the user's
location and information about the user's environment (e.g., the
noise and light levels around the device).
The content presented in this paper aims to identify popular
regions and places among older adults in Singapore, and their
travel patterns, by analysing the smartphone data collected
from a group of older adults who live in the Bukit Panjang
area of Singapore. In this paper, we introduce a few key
insights. First, we identify regions of interest (ROI) and points
of interest (POI) [5], [6] among older adults in Singapore.
Next, we present the impact of travel frequency for certain
POI, based on the distance and travel time from user’s home
location. Then, we introduce a technique for profiling
older adults based on their daily travel patterns, time spent at stay
points, and work status (i.e., working or non-working). It
allows us to categorize the older adults into a few groups such
that people who have a similar lifestyle fall into the same group,
and people with different lifestyles fall into different groups.
We present our results of profiling the older adults into a few
groups, based on the above-mentioned criteria.
S. H. Marakkalage, S. Sarica, B. P. L. Lau, S. K. Viswanath, C. Yuen,
and J. Luo are with the Engineering Product Development Pillar, Singapore
University of Technology and Design, Singapore. Corresponding author
e-mail: marakkalage@mymail.sutd.edu.sg
B. Yuen is with the Lee Kuan Yew Centre for Innovative Cities, Singapore
University of Technology and Design, Singapore.
T. Balasubramaniam and R. Nayak are with the Queensland University of
Technology, Australia.
Digital Object Identifier 10.1109/TCSS.2018.2883691
A. Older Population and Smartphone Data Collection
The older population is an important category in any society.
Singapore has a rapidly increasing number of people who are
aged 65 and above. In fact, the resident Old-Age Support
Ratio (people aged 20-64 years per older adult aged 65
years and above) fell from 9 in 2000 to 5.4 in 2015 [7].
Prior research found that older adults spend their leisure time
travelling around [8]. Therefore, it is important to identify the
places that older adults prefer to visit more often. To identify such
places, the conventional method is to approach the target study
group and acquire information about the places they
visit by communicating with them. Older adults, however,
might not remember or keep track of every place
they visit and how much time they spend in those places.
Hence, a more convenient approach is to collect the data
through a smartphone application, instead of conducting verbal
or written surveys. Older adults in Singapore find smartphones
useful and entertaining, and they have a positive attitude towards
smartphones [9]. Therefore, a smartphone application
is a convenient and effective way to collect data
about the places older adults visit, and it makes it easier to
understand their daily lifestyle.
B. Related Work
We conducted an extensive review of related work to
understand what previous researchers have done to study
the lifestyle of the older population. Prior work has tracked
user locations using smartphone applications [10]. Hence,
smartphone-application-based location tracking is well established
in the research arena. Tourist tracking through smartphone
location data [11] has been implemented to identify tourist travel
sites, along with surveys to collect feedback on them.
Previous work has also been conducted in the context of smart homes,
to support older adults who stay at home. To name a few:
proactive environment change for the elderly [12], real-time
cloud-based activity tracking [13] to help remotely monitor older
adults at home [14], and radio-frequency identification (RFID)
based location tracking to ensure the safety of older adults
in smart homes [15]. To help older adults use smartphones
conveniently, an adaptable interface for smartphones
was developed in [16]. Accessible urban path recommendation
using crowdsourced information was presented in [17] for disabled
people with limited mobility. An interesting application
transfers knowledge from the older generation to the younger
generation by using mobile terminals such as smartphone apps,
emails, and phone calls [18]. A web-based platform to improve
the healthy living conditions of older adults was developed in [19].
To the best of our knowledge, no previous work has been carried
out to understand the daily lifestyle of the older population
by collecting smartphone application data, and to profile
older adults based on their daily travel patterns, time spent at
stay points, and work status. Therefore, our contributions in
this study are listed below.
1) Introducing Validation based Stay Points Detection
(VSPD) algorithm to identify the popular regions and
points of interest.
2) Identifying the spatial and temporal patterns of POI
visited by older adults.
3) Identifying the impact of the travel frequency to a certain
POI, based on the distance and travel time from the
home location.
4) Profiling the older adults to identify 3 main profiles,
based on their travel pattern, time spent at stay points,
and work status as features.
The subsequent sections of the paper are structured as
follows. Section II discusses the methodology we used to
acquire data from older adults via a smartphone-based mobile
application, and gives an overview of the implemented system.
Section III describes the data analysis methods, including
stay point extraction and sensor fusion based environment
classification. Section IV presents the key insights discovered
from the processed data. Section V explains the techniques for
profiling the older population into a few groups, based on their
daily travel patterns, gender, age, and work status, and presents
the key results of profiling. Section VI concludes the paper.
II. METHODOLOGY
A. System Overview
We use a novel mobile crowdsensing approach to collect
data from the older population due to the wide availability of
smartphones among the older adults in Singapore.
1) Mobile Application Development: We designed and developed
a user-friendly mobile application, which runs on
the Android platform, to collect information from the smartphones
of older adults. The app runs as a background service
once the phone is turned on, and hence it needs minimal user
interaction. A user interface lets the user view past trips with an
animation of the trajectory. The app is available to download
on Google Play [20].
2) Database Model: The data collected from smartphones
are transferred to and stored on a cloud server. The database model
is shown in Figure 1. The database consists of 4 tables, namely
User Info, Location Info, Environment Info, and Activity Info.
The types of data stored in each table are as follows.
• User Info: This table contains basic information about
the older adult, such as age (years), gender
(male/female), ethnicity (Chinese, Malay, Indian, or
Other), and work status (working/non-working).
• Location Info: This table contains the location information
collected through the Android location application
programming interface (API). It gives location parameters
such as latitude, longitude, accuracy (in meters),
and the timestamp at which the location was acquired.
• Environment Info: This table contains basic device
information and information about its surrounding environment,
with parameters such as environment noise level (dB), device
battery percentage, light level (Lux), and the timestamp
at which the data was acquired.
• Activity Info: This table contains the activity information
received from the Google API [21]. 4 types of activities
are stored in this table: 'Still', 'On Foot', 'On
Bicycle', and 'In Vehicle'.
Fig. 1: Database Model
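The four tables described above can be pictured as simple record types. The following is only an illustrative sketch: the class and field names are hypothetical, and the use of a device ID as the field linking the tables is an assumption based on the device-ID-based collection described in this section, not the schema of the deployed system.

```python
from dataclasses import dataclass


@dataclass
class UserInfo:
    device_id: str        # assumed linking key (auto-generated device ID)
    age: int              # years
    gender: str           # "male" / "female"
    ethnicity: str        # "Chinese", "Malay", "Indian", or "Other"
    work_status: str      # "working" / "non-working"


@dataclass
class LocationInfo:
    device_id: str
    timestamp: int        # Unix seconds (assumed representation)
    latitude: float
    longitude: float
    accuracy_m: float     # confidence radius in meters


@dataclass
class EnvironmentInfo:
    device_id: str
    timestamp: int
    noise_db: float
    battery_percent: float
    light_lux: float


@dataclass
class ActivityInfo:
    device_id: str
    timestamp: int
    activity: str         # "Still", "On Foot", "On Bicycle", or "In Vehicle"


# Example record with made-up values
fix = LocationInfo("device-001", 1_500_000_000, 1.38, 103.76, 25.0)
print(fix.accuracy_m)
```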
3) Elderly Data Collection: The target participant group
for our study is the older population in Singapore who use
Android smartphones. We recruited people who
reside near the Bukit Panjang area of Singapore, from different
ethnic groups such as Chinese, Malay,
and Indian. We explained to every participant what data we
collect and assured them that we do not reveal their identity in
our analysis. The data collection and analysis are done through
a unique auto-generated device ID to ensure the privacy of the
participants.
Data from each participant are collected for a minimum of
one month, starting from the day they install the mobile
application on their smartphone. After one month, we examined
the data to see whether they were consistent over the month.
In this paper, we utilize the data from 50 selected older adults
whose data quantity is consistent enough for a robust analysis.
Figure 2 shows an overview of our approach.
B. Data Pre-processing
The raw data collected from smartphones contain
duplicate records and outliers. For example, the Android API gives
an accuracy parameter for every GPS location it acquires. In
some cases, the GPS accuracy value for a particular location
record can be more than 1000 m, which means that the mobile
device can be anywhere within a 1000 m radius of the
given GPS coordinates. Such outliers occur due to
poor GPS signals and lead to substantial inaccuracy
in the analysis. Therefore, it is necessary to apply
pre-processing techniques to clean up the collected raw data
as a first step before conducting further analysis. Data
pre-processing consists of 3 major stages, namely De-noise,
Time-sync, and Noise normalization.
Fig. 2: System Overview
1) De-noise: Removing duplicate records and outliers is
done in this stage of data pre-processing to clean the data
collected from smartphone application.
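A minimal sketch of the de-noise stage is given below. It is illustrative only: the record layout and function name are hypothetical, and the 1000 m cutoff is an assumption borrowed from the example of a poor fix given above, since the exact outlier criterion is not stated here.

```python
from typing import Iterable, List, NamedTuple


class LocationRecord(NamedTuple):
    timestamp: int      # Unix seconds
    lat: float
    lon: float
    accuracy_m: float   # confidence radius reported by the location API


def denoise(records: Iterable[LocationRecord],
            max_accuracy_m: float = 1000.0) -> List[LocationRecord]:
    """Drop exact duplicates and fixes whose confidence radius is too large."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec in seen:
            continue                      # duplicate record
        seen.add(rec)
        if rec.accuracy_m > max_accuracy_m:
            continue                      # outlier caused by a poor GPS signal
        cleaned.append(rec)
    return cleaned


raw = [
    LocationRecord(0, 1.38, 103.76, 20.0),
    LocationRecord(0, 1.38, 103.76, 20.0),     # duplicate of the first record
    LocationRecord(60, 1.38, 103.76, 1500.0),  # very large confidence radius
]
print(len(denoise(raw)))
```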
2) Time-sync: As shown in Figure 1, the 3 tables (Location
Info, Environment Info, and Activity Info) do not receive
data samples at the same rate. The number of records in the
Environment Info table is greater than that of the Location Info
and Activity Info tables. The Activity Info table receives the
fewest records due to the
configuration of the Google API. Therefore, records in
different tables carry different timestamps, and it is necessary
to synchronize the timestamps before analyzing the data. To
achieve this, 5-minute time slots are created for each
participant's data on each day. Next, each data record from
each table is synchronized with those 5-minute time slots.
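The slotting idea can be sketched as follows. This is an assumption-laden illustration rather than the deployed implementation: rows are modelled as (timestamp, value) pairs, timestamps are taken to be seconds, and the function names are hypothetical.

```python
from collections import defaultdict

SLOT_SECONDS = 5 * 60  # 5-minute time slots, as described in the text


def slot_index(timestamp_s: int, day_start_s: int) -> int:
    """Index of the 5-minute slot (0 = first slot of the day) for a timestamp."""
    return (timestamp_s - day_start_s) // SLOT_SECONDS


def time_sync(day_start_s, location, environment, activity):
    """Group (timestamp, value) rows from the three tables into shared slots."""
    slots = defaultdict(lambda: {"location": [], "environment": [], "activity": []})
    tables = (("location", location),
              ("environment", environment),
              ("activity", activity))
    for name, rows in tables:
        for ts, value in rows:
            slots[slot_index(ts, day_start_s)][name].append(value)
    return dict(slots)


synced = time_sync(
    0,
    location=[(10, "fix-1")],
    environment=[(20, 4.2), (310, 5.0)],
    activity=[(290, "Still")],
)
print(sorted(synced))  # slot indices that received at least one record
```

After this step, records from different tables that fall into the same slot can be analyzed together even though their original timestamps differ.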
3) Noise Normalization: The smartphone application collects
the noise level of the environment surrounding the mobile
device. Due to different hardware configurations in different
mobile devices (even devices of the same brand and
model can capture different ranges of noise levels),
it is necessary to bring the noise
data onto the same scale (in this study, a 0-10
scale) to conduct a proper analysis. Noise normalization is done
according to Equation (1), where $N_{norm}$ is the normalized
noise, $s$ is the current noise record value, and $s_{max}$ and $s_{min}$ are the
maximum and minimum noise levels captured from the particular
participant's device, respectively.
$$N_{norm} = 10 \left( \frac{s - s_{min}}{s_{max} - s_{min}} \right) \tag{1}$$
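Equation (1) is a per-device min-max scaling onto a 0-10 range. A short sketch follows; the function name is hypothetical, and returning zeros when a device reports a single constant level is an assumption, since that case is not discussed in the text.

```python
def normalize_noise(samples):
    """Scale one device's noise readings onto a 0-10 range, following Eq. (1)."""
    s_min, s_max = min(samples), max(samples)
    if s_max == s_min:
        # Assumption: a constant signal is mapped to 0 to avoid division by zero.
        return [0.0 for _ in samples]
    return [10.0 * (s - s_min) / (s_max - s_min) for s in samples]


print(normalize_noise([30.0, 60.0, 90.0]))  # [0.0, 5.0, 10.0]
```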
III. DATA ANALYSIS
In this section, we explain the methods we used to extract
information from the pre-processed data of Section II. We
utilize the pre-processed data to extract the stay points of users
and identify their trajectory patterns. The following subsections
present the stay point extraction and environment classification
techniques we used to identify the popular regions and points
of interest among older adults in our study group.
TABLE I: Location parameters and their description
Symbol    Description
ϕ         Latitude
λ         Longitude
α         Google activity
β         Battery level
η         Noise level
A. Stay Points Extraction and Clustering
We introduce the Validation based Stay Points Detection
(VSPD) algorithm [22] to detect stay points, and adopt Density-Based
Spatial Clustering of Applications with Noise (DBSCAN)
[23] as the clustering technique to form a POI if a user stayed
longer than the designated time threshold. A validation function
is added to remove outlier data that might confuse the
system, since we do not possess ground truth for all the data
collected.
First, we define the set of location data collected from
the mobile devices as $L = \{P_1, P_2, ..., P_N\}$, where each $P_i$ is
denoted as $P_i = \{\phi_i, \lambda_i, a_i, t_i\}$. The latitude and longitude
are represented by $\phi$ and $\lambda$ respectively, which are part of the
GPS information $P_i$; $t$ represents the timestamp at which the data
were obtained and $a$ denotes the accuracy of the GPS data. Environment
information is denoted by $Q = \{E_1, E_2, ..., E_N\}$, where each
$E_i$ is denoted as $E_i = \{\alpha_i, \beta_i, \eta_i\}$, and $\alpha$, $\beta$, and $\eta$
represent Google activity information, battery level, and noise
information respectively. Algorithm 1 is used to perform POI
extraction.
Algorithm 1 Overall System Algorithm
Data: {L, Q} where i = 1, 2, ..., N
begin
    preprocess({L, Q})
    stayPoints = ValidationStayPointAlgorithm(P)
    clusteredData = DBSCAN(stayPoints)
    T_x = Trajectory(stayPoints, P, clusteredData)
    timeSync({L, Q})
    SFEC({L, Q})
end
Initial testing and validation results for the proposed POI
extraction methods are verified with ground truth data in [22].
The model used for extracting the POI can be divided into
two parts, namely VSPD and DBSCAN.
1) Validation based Stay Points Detection (VSPD): The VSPD
algorithm is an extension of the stay point detection algorithm
[24], to which an additional function is added to cope with the
indoor GPS accuracy issue. Due to poor indoor GPS accuracy, the
confidence radius of a particular location increases, and this
may wrongly classify a point as a stay point even if the user has
left that particular area.
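We define the stay point $SP$, which consists of the following
information: $SP_i = \{\phi_i, \lambda_i, a_i, t_i\}$. The distance between a
location pair $\{\phi_1, \lambda_1\}$ and $\{\phi_2, \lambda_2\}$ can be calculated using
the Haversine formula as shown in Eqn (2):
$$\mathrm{hav}\left(\frac{d}{r}\right) = \mathrm{hav}(\phi_2 - \phi_1) + \cos(\phi_1)\cos(\phi_2)\,\mathrm{hav}(\lambda_2 - \lambda_1) \tag{2}$$
where the Haversine function hav() originates from [25] and $d$ is the
distance between the two points, taking into account the radius
of the earth, $r$. The hav() function used in Eqn (2) is defined as follows:
$$\mathrm{hav}(\theta) = \sin^2\left(\frac{\theta}{2}\right) = \frac{1 - \cos(\theta)}{2} \tag{3}$$
To obtain the distance between the two points, we apply the
inverse Haversine function, expressed through the arcsine
function, as in Equation (4), where $h$ denotes the right-hand side of Eqn (2):
$$d = r \times \mathrm{hav}^{-1}(h) = 2r \arcsin\left(\sqrt{h}\right) \tag{4}$$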
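Equations (2) to (4) translate directly into code. The sketch below is illustrative: the function names are hypothetical, coordinates are assumed to be given in degrees, and the mean Earth radius of 6,371 km is an assumed value for $r$ that is not specified in the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in meters


def hav(theta: float) -> float:
    """Haversine function of Eq. (3); theta is in radians."""
    return math.sin(theta / 2.0) ** 2


def haversine_distance(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_M):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    lam1, lam2 = math.radians(lon1), math.radians(lon2)
    # Right-hand side of Eq. (2)
    h = hav(phi2 - phi1) + math.cos(phi1) * math.cos(phi2) * hav(lam2 - lam1)
    # Eq. (4); min() guards against rounding slightly above 1
    return 2.0 * r * math.asin(math.sqrt(min(1.0, h)))


# Two made-up points a short distance apart
print(haversine_distance(1.3800, 103.7600, 1.3850, 103.7700))
```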
Subsequently, we consider the time difference $\Delta t$ between
two stay points, which is another essential component for
deciding stay time. A time threshold is used to decide whether
a particular user is staying or travelling. We formulate $\Delta t$
as follows:
$$\Delta t = t_{i+1} - t_i \tag{5}$$
where $t_{i+1}$ is the timestamp of the current sample and $t_i$ is
the timestamp of the previous sample.
After defining all the key elements, we perform the VSPD
algorithm as shown in Algorithm 2. The validation function
validates both time and distance before classifying a particular
point as a stay point. It is denoted as:
$$\mathrm{validity}(d, \Delta t) = (\Delta t < \Theta_t) \text{ and } (d < \Theta_d) \tag{6}$$
where $\Theta_t$ is the threshold for time and $\Theta_d$ is the threshold for
distance.
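Equations (5) and (6) amount to a small predicate on a pair of consecutive samples. The sketch below is illustrative only: the function names and the two threshold values are hypothetical, since the values of $\Theta_t$ and $\Theta_d$ are not given in this section.

```python
def time_difference(t_prev: float, t_curr: float) -> float:
    """Eq. (5): elapsed time between the previous and the current sample."""
    return t_curr - t_prev


def validity(d: float, dt: float, theta_d: float, theta_t: float) -> bool:
    """Eq. (6): both the time gap and the distance must be below their thresholds."""
    return dt < theta_t and d < theta_d


# Hypothetical thresholds, chosen only for this example
THETA_T_S = 30 * 60   # seconds
THETA_D_M = 200.0     # meters

dt = time_difference(1_000, 1_600)                   # 600 s between two samples
print(validity(50.0, dt, THETA_D_M, THETA_T_S))      # small move, short gap
print(validity(5_000.0, dt, THETA_D_M, THETA_T_S))   # large jump between samples
```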
In general, quantities derived from temporally sequential location
samples, such as distance, GPS accuracy, and duration, are calculated
before being evaluated by the validation function. A new stay point can
be formed when a user decides to leave the place. If
the user does not leave within the time window, we consider the
user to stay at the place until the end of the day.
After extracting the stay points, the trajectory of a user can
be formed as $T_x = \{SP_1, SP_2, ..., SP_x\}$ for further analysis.
2) Density-Based Spatial Clustering of Applications with
Noise: For spatial data, DBSCAN performs well as it offers
clustering of arbitrary shapes and the capability of detecting
outliers. In DBSCAN, two parameters need to
be predetermined, namely the distance eps and the minimum number
of points minPts. Here, minPts is fixed at the value of 1
since we do not treat any POI as an outlier at the moment.
In our model, we use the accuracy of the GPS location as eps.
Algorithm 2 Validation based Stay Point (VSPD) algorithm
Data: P_i where i = 1, 2, ..., N
Result: SP_x
for k ← 1 to N do
    while j ← (k + 1) to N do
        Check(a_{i,i+1,...,j}, threshold(a)) using Equation (7)
        Calculate d using Equation (4)
        if d > a then
            Calculate ∆t using Equation (5)
            if ∆t > threshold(t) then
                if Validity(d, ∆t) then
                    addStayPoint(P_i, P_j)
                else
                    break
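It can be denoted as:
$$eps = \begin{cases} a_i + a_{i-1} & \text{if } a_i + a_{i-1} < \Theta_l \\ \Theta_l & \text{if } a_i + a_{i-1} \geq \Theta_l \end{cases} \quad \text{where } minPts = 1 \tag{7}$$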
where the confidence radius is considered when determining
the validity of the stay points. The main reason for doing so is
that the GPS signal can have bad reception at some places. For example,
if a user enters an underground tunnel while driving a car,
the GPS location will remain fixed until the next
GPS location is refreshed at the end of the tunnel. The traditional
method treats this period as stay time, because the location of the
user does not vary much until a new GPS location is obtained.
Our validation model treats the period inside the tunnel as
part of travelling, because the time taken and distance travelled are
considered not valid. In Equation (7), a threshold of $\Theta_l = 200$
is included to prevent the GPS confidence radius from becoming too
large and forming a false cluster outside the actual location.
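Equation (7) is equivalent to capping the sum of two consecutive confidence radii at $\Theta_l$. A brief sketch, with a hypothetical function name and the cap value of 200 taken from the text:

```python
THETA_L = 200.0  # cap on the clustering radius, from the text


def dbscan_eps(a_curr: float, a_prev: float, theta_l: float = THETA_L) -> float:
    """Eq. (7): sum of two consecutive GPS confidence radii, capped at theta_l."""
    return min(a_curr + a_prev, theta_l)


print(dbscan_eps(30.0, 45.0))    # 75.0, below the cap
print(dbscan_eps(150.0, 120.0))  # 200.0, capped
```

The resulting value would then be passed as the eps parameter of a DBSCAN implementation together with minPts = 1.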
B. Sensor Fusion based Environment Classification (SFEC)
Multi-sensor information collected from smartphones [26]
gives us the potential to extract high-level information about the
devices' environment. The stay points are classified as indoor
vs. outdoor and private vs. public, using sensor fusion based
environment classification. An indoor environment is one where
the average GPS accuracy value for the POI is above a certain
threshold, while an outdoor environment is one where the average
GPS accuracy value for the POI is below that threshold (here,
the GPS accuracy is the value returned by the Android API; a higher
returned value means a lower GPS accuracy).
A private environment is one where the
noise level for the POI is below a certain threshold,
while a public environment is one where the noise level for the
POI is above that threshold. Since we cannot determine the type of
environment of a POI with 100% certainty,
we assign percentage confidence levels to the determination.
This information is useful when profiling the older adults in
Section V.
The classifier requires multi-sensor data (i.e., GPS accuracy,
noise, battery level, light, etc.) from the start to the end time of the
POI. The duration of the POI is divided into 5-minute slots, and
each of those slots is given a confidence percentage for each type
of the above-mentioned environments. Each type of environment
is labeled as one of 4 different categories, namely
indoor, outdoor, private, and public, which are encoded as
$\{1, 2, 3, 4\}$ respectively.
The total percentage confidence level for a particular type of
environment is aggregated over the slots using
Equation (8), where $n$ is the number of 5-minute slots in the POI,
$P_c$ is the confidence percentage, $S^c_k$ is the percentage of the $k$th slot being
type $c$, and $c$ is the type of environment, with $c \in \{1, 2, 3, 4\}$ and
$1 \leq k \leq n$. If there is no data in a slot for the classification,
we use $P_0$ to indicate the environment type 'Unclassified'.
$$P_c = \frac{1}{n} \times \sum_{k=1}^{n} S^c_k; \quad \text{if } n > 0 \tag{8}$$
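Equation (8) is the mean of the per-slot confidences for one environment type. A minimal sketch follows; the function name is hypothetical, and returning `None` when there are no slots is an assumption standing in for the 'Unclassified' case.

```python
def total_confidence(slot_confidences):
    """Eq. (8): mean of the per-slot confidences S_k^c for one environment type c."""
    n = len(slot_confidences)
    if n == 0:
        return None  # assumption: no slots means the POI stays unclassified
    return sum(slot_confidences) / n


# Made-up confidences for three 5-minute slots of one POI
print(total_confidence([0.9, 0.6, 0.3]))
```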
The confidence level calculations for the environment types
Indoor and Outdoor are presented in Equation (9) and
Equation (10) respectively.
For $P_1$, the percentage contributions from sensors are 90% by
GPS accuracy ($G$), 5% by battery level ($\beta$), where $\beta = 1$ if the
battery is charging and $\beta = 0$ otherwise, and 5% by Activity =
'Still' (denoted by $\alpha_s$, where $\alpha_s \in \{0, 1\}$), which is returned
by the location API. In Equation (9) and Equation (10), $Th_G$ is the
threshold GPS accuracy, and $x$ is the average GPS accuracy in
the slot.
$$P_1 = \left[ \frac{x - Th_G}{Th_G} \times 0.9 + (\beta + \alpha_s) \times 0.05 \right]; \quad \text{if } x > Th_G \tag{9}$$
For $P_2$, the percentage contributions from sensors are 90% by
$G$ and 10% by light level ($l$), where $l = 1$ if the light level
is above the threshold level $Th_l$ and $l = 0$ otherwise. $Th_G = 30$,
$Th_N = 5$, and $Th_l = 1000$, based on empirical studies verified
with the ground truth data presented in [22].
$$P_2 = \left[ \frac{Th_G - x}{Th_G} \times 0.9 + l \times 0.1 \right]; \quad \text{if } x < Th_G \tag{10}$$
The confidence level calculations for the environment types
Private and Public are presented in Equation (11) and Equation (12)
respectively, where $Th_N$ is the threshold noise level and $y$ is the
average normalized noise level in the slot.
For $P_3$, the percentage contributions from sensors are 90% by
noise level and 10% by Activity = 'Still' ($\alpha_s$).
$$P_3 = \left[ \frac{Th_N - y}{Th_N} \times 0.9 + \alpha_s \times 0.1 \right]; \quad \text{if } y < Th_N \tag{11}$$
For $P_4$, the percentage contributions from sensors are 90% by
noise level and 10% by Activity = 'Walking' (denoted by $\alpha_w$,
where $\alpha_w \in \{0, 1\}$).
$$P_4 = \left[ \frac{y - Th_N}{Th_N} \times 0.9 + \alpha_w \times 0.1 \right]; \quad \text{if } y > Th_N \tag{12}$$
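Equations (9) to (12) can be written as four small functions of the per-slot sensor averages. The sketch below is illustrative: the function names are hypothetical, the thresholds are the values quoted in the text, and returning 0.0 when the side condition of an equation does not hold is an assumption, since that case is left open. Note that the GPS and noise terms are not bounded by the formulas themselves; for example, the first term of Equation (9) exceeds 0.9 once $x > 2\,Th_G$.

```python
TH_G = 30.0     # GPS accuracy threshold, from the text
TH_N = 5.0      # normalized noise threshold, from the text
TH_L = 1000.0   # light level threshold, from the text


def indoor_confidence(x: float, charging: bool, still: bool) -> float:
    """Eq. (9): x is the average GPS accuracy value in the slot."""
    if x <= TH_G:
        return 0.0  # assumption: no indoor evidence when the condition fails
    return (x - TH_G) / TH_G * 0.9 + (int(charging) + int(still)) * 0.05


def outdoor_confidence(x: float, light: float) -> float:
    """Eq. (10): the light flag l is 1 when the light level exceeds TH_L."""
    if x >= TH_G:
        return 0.0
    return (TH_G - x) / TH_G * 0.9 + int(light > TH_L) * 0.1


def private_confidence(y: float, still: bool) -> float:
    """Eq. (11): y is the average normalized noise level in the slot."""
    if y >= TH_N:
        return 0.0
    return (TH_N - y) / TH_N * 0.9 + int(still) * 0.1


def public_confidence(y: float, walking: bool) -> float:
    """Eq. (12)."""
    if y <= TH_N:
        return 0.0
    return (y - TH_N) / TH_N * 0.9 + int(walking) * 0.1


# Made-up slot: poor GPS fix, phone charging and still, quiet surroundings
print(indoor_confidence(45.0, charging=True, still=True))
print(private_confidence(2.5, still=True))
```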
IV. RESULTS
Identifying the places of interest that older adults prefer to
visit is necessary to understand their lifestyle, and it is useful
for city planners in preparing for an ageing population.
Place analysis is done in two stages: Regions of Interest
(ROI) and Points of Interest (POI).
A. Regions of Interest (ROI)
We used Voronoi zones to identify the ROI of older adults. The ROI
analysis has 3 major parts. First, we try to understand
information about participants' home stay. Next, we
gather insights about participants' lifestyle across their
neighbourhood region. Finally, we obtain an understanding of
participants' lifestyle across the Singapore region.
1) Home Stay Duration: Knowing the percentage of time
the participants spend at their homes is useful for understanding
the lifestyle of the older population. The percentage of home
stay duration for a particular participant is calculated according
to Equation (13), where $P_H$ is the percentage of home stay
duration, $T_{Home}$ is the total time spent at the home cluster during
the data collection period, and $T_{Total}$ is the total time spent
at all the clusters (including the home cluster) during the data
collection period.
$$P_H = \frac{T_{Home}}{T_{Total}} \tag{13}$$
Fig. 3: Percentage home stay duration, gender, and ages of the
50 participants (axes: Participants, Percentage, Age (Years); legend: Female, Male, Age)
Figure 3 shows the percentage home stay duration, gender,
and age of each participant in our study. The oldest
participant is 75 years old, and the average age of participants
is 60 years. 28% of the participants are male and 72% are female.
It can be seen that a large number of participants
spend a large amount of their time at home. In particular, 20%
of the participants stayed at their homes for more than 19.2
hours a day on average, 34% stayed at home
between 14.4 and 19.2 hours a day on average, 30%
stayed at home between 7.2 and 14.4
hours a day on average, and 16% stayed at
home less than 7.2 hours a day on average.
It can be noticed that participant IDs 1, 12, 17, 21, 22, 25,
and 39 have a home stay duration smaller than 30%
(approximately 7.2 hours), which is substantially low.
The reason is that those participants
occasionally turn off their phones at night.
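The relation between Equation (13) and the hour figures quoted above can be checked with a few lines of code. This is a sketch with hypothetical function names; it assumes the ratio of Equation (13) is multiplied by 100 to express it as a percentage, and that a percentage is converted to hours per day against a 24-hour day.

```python
def home_stay_percentage(t_home: float, t_total: float) -> float:
    """Eq. (13) expressed as a percentage (the factor of 100 is an assumption)."""
    if t_total <= 0:
        raise ValueError("t_total must be positive")
    return 100.0 * t_home / t_total


def hours_per_day(percentage: float) -> float:
    """Average daily hours corresponding to a percentage of a 24-hour day."""
    return 24.0 * percentage / 100.0


# The bin edges used in the text: 30%, 60%, and 80% of a day
for pct in (30.0, 60.0, 80.0):
    print(pct, hours_per_day(pct))
```

Under these assumptions the bin edges of 7.2, 14.4, and 19.2 hours correspond to 30%, 60%, and 80% of a day.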
2) Across Neighbourhood: We divided the Bukit Panjang area
into 12 Voronoi zones in order to understand the
participants' home region, since all of them are residents of
Bukit Panjang. The Voronoi zone distribution, zone numbers,
and number of participant homes in each Voronoi zone are
shown in Figure 4.
Fig. 4: Voronoi zones and number of participant homes in each
zone across neighbourhood
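Assigning a stay point to a Voronoi zone is equivalent to finding the nearest seed. The sketch below illustrates this under stated assumptions: the seed coordinates are made up, the function name is hypothetical, and a local planar (equirectangular) approximation is used for distances, which is a simplification chosen for this example rather than the method used in the study.

```python
import math


def nearest_zone(lat: float, lon: float, seeds: dict) -> int:
    """Return the id of the Voronoi seed closest to (lat, lon).

    seeds maps a zone id to a (lat, lon) pair in degrees.
    """
    def squared_distance(seed):
        s_lat, s_lon = seed
        # Equirectangular approximation, adequate over a small area
        dx = math.radians(lon - s_lon) * math.cos(math.radians((lat + s_lat) / 2.0))
        dy = math.radians(lat - s_lat)
        return dx * dx + dy * dy

    return min(seeds, key=lambda zone_id: squared_distance(seeds[zone_id]))


# Hypothetical seeds for two zones
seeds = {1: (1.380, 103.760), 2: (1.385, 103.770)}
print(nearest_zone(1.381, 103.761, seeds))
```

Counting the stay points (or summing their durations) per returned zone id gives the per-zone visit and duration shares discussed in this subsection.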
Figure 5a shows the percentage of visits in each region
and Figure 5b shows the percentage of durations in each
region. In both figures, the inner circles display the Voronoi
zone number and the percentage of visits/duration accordingly.
The outer rings show the share of the aforementioned
percentages contributed by non-home zone participants.
From the figures, it can be observed that visits and durations
in zone number 2 (Senja Cashew CC) and zone number
7 (Junction 10) come 100% from non-home zone participants,
which means that all the participants who visit and spend time in
those two zones reside outside them. In particular, none
of the participant homes are located in zone number 7. Zone
number 2 also shows the highest percentage duration, while
zone number 3 (Masjid Al-Iman) shows the highest percentage
of visits.
Another interesting observation is that zone number 3 has a
higher percentage of visits than percentage of
duration. The reason is that
the Bangkit light rail transit (LRT) station is located inside
zone number 3. In terms of percentage duration, zones 2 and
6 are popular among non-home zone participants, while zones 5
and 9 are popular among home zone participants. The reason
for such popularity is that those zones include POI such
as Bukit Panjang Community Center (CC) (zone 6), Senja
Cashew CC (zone 2), and Fajar Market (zone 9).
3) Across Singapore: The map of Singapore is divided into
9 Voronoi zones in order to understand which regions across
Singapore are of interest to participants in terms of number of visits
and duration of time spent. The Voronoi seeds are selected to
cover the main regions of the whole of Singapore. The Voronoi zone
distribution and zone names are shown in Figure 6a.
Figure 6b shows the percentage of visits and
durations for each region across Singapore. It can be observed
from the figure that region number 3 (Bukit Panjang) is the
most popular among participants, in terms of both visits and
duration. In fact, 53.5% of visits and 68.6% of durations
belong to that region. The reason for such popularity is that
all the participants reside in the Bukit Panjang area and
they tend to visit places near their homes. Apart from that,
regions 1, 4, 5, and 9 are popular regions. However, regions
1 and 9 tend to have a lower share of duration than of visits.
4) ROI Results Comparison with Telco Data: We compared
the above ROI analysis results with the DataSpark Mobility
Intelligence API [27], which contains telco user mobility data.
The stay points analysis API provided by DataSpark is used to
compare the trend in the number of visits with our data (across
Bukit Panjang and across Singapore). DataSpark data are
filtered for users who are above 50 years old and whose
home locations are inside the Bukit Panjang planning area. The
API supports 3 location levels in Singapore, namely Planning
Regions, Planning Areas, and Subzones. We grouped the
Voronoi zones in Figure 6a according to the planning
regions they belong to (there are 6 planning regions in total).
The trend of visits across Singapore is shown in Figure 7a. Our
data and DataSpark data follow almost the same pattern.
Hence, the comparison validates our study, although our data give
more detailed insights about the mobility of older adults.
We grouped the Voronoi zones in Figure 4 according to the
subzones they belong to in the Bukit Panjang planning area. The
trend of visits across Bukit Panjang is shown in Figure 7b. Like the
trend across Singapore, the trend across Bukit Panjang follows the
same pattern.
In conclusion, the ROI analysis in 3 major stages (home
stay duration, across neighbourhood, and across Singapore)
shows that older adults spend most of their time in their home
region. By comparing the ROI visit distributions with the telco data
distributions, we can verify that the proposed method of data
collection and analysis provides the capability
to better understand the daily lifestyle of older adults in
Singapore.
TABLE II: POI Ranking based on Number of Participants Checked In
No.  Point of Interest                                     Percentage of           Average Hours Spent
                                                           Participants Visiting   Per Participant Per Visit
1    Bukit Panjang Hawker Centre                           68%                     3.7
2    Bukit Panjang Plaza                                   56%                     2.9
3    Bukit Panjang Community Centre                        34%                     1.9
4    Fajar Shopping Centre                                 30%                     2.5
5    Senja Cashew Community Club                           28%                     4.4
6    Holland Bukit Panjang Town Council                    28%                     2.2
7    Block 125 Neighborhood Stores                         26%                     3.0
8    Lot One Shopping Mall                                 24%                     1.7
9    National Healthcare Group Pharmacy - Choa Chu Kang    22%                     1.7
10   Chinese Temple at Bencoolen Link                      22%                     0.9
11   Greenridge Shopping Centre                            20%                     2.5
12   Ten Mile Junction Mall                                20%                     2.2
13   Plaza Singapura                                       18%                     2.6
14   Causeway Point                                        18%                     1.6
15   Block 172 Neighborhood Stores                         16%                     2.4
16   Zheng hua Community Club                              16%                     2.0
18   Teck Whye Ave Neighborhood Stores                     16%                     1.3
19   Temple Street at China Town                           16%                     1.2
20   Phoenix Road Shop Lots                                14%                     2.6
Orange - Within Bukit Panjang Area, Blue - Within 5KM range of Bukit
Panjang, Purple - 5KM away from Bukit Panjang
(a) Percentage of visits (b) Percentage of durations
Fig. 5: Voronoi zones in Bukit Panjang area and percentages of non-home zone visits and durations
(a) Voronoi zones across Singapore
(b) Percentage number of visits and durations across Singapore (x-axis: Regions 1-9; y-axis: Percentage; legend: Visits, Duration)
Fig. 6: Voronoi zones across Singapore and percentages of
visits and durations
B. Points of Interest (POI)
After analyzing at the region scale, we look into the series of
POI extracted using the aforementioned stay points extraction
algorithm and the clustering technique used to group POIs accordingly.
The clustering distance is fixed at 100 m. In this context,
user privacy is preserved by excluding home and work
places. After initial filtering, we obtained 980
POIs from all the valid users. Three different studies are
conducted to identify the common POIs among participants,
the time of daily visits, and participants' attributes (gender and
occupation).
(a) Across Singapore
(b) Across Bukit Panjang
Fig. 7: Comparison of percentage number of visits with our
data and DataSpark data
Fig. 8: POI heat maps across Singapore
1) Common POI Among Participants: A heat map of POIs across Singapore is generated based on the number of visits, as shown in Fig. 8, to identify common POIs among older adults. Common POIs are highlighted in red, whereas yellow represents the areas with fewer visits. The areas surrounding the participants’ homes are excluded from the map for better visualization of the other areas. The most common areas are located a few kilometers away from the participants’ homes and are identified as parks, sports halls, and shopping malls. Other common places in the central area mostly consist of shopping districts or shopping malls (such as Bugis, Marina Square, China Town, and City Plaza). Hence, this is an indication that older adults are most likely to travel for shopping, as shown in the previous subsection.
We rank the common POIs near the Bukit Panjang area in 3 distance levels, as shown in Table II. It can be observed from the table that frequently visited places are located within 5 km of Bukit Panjang. In addition, since residential blocks can be multi-purpose (hosting, for example, commercial stores, a clinic, or a kindergarten), Block 125, at rank 7, became one of the places frequently visited by older adults. Other locations mostly consist of shopping centers (Bukit Panjang Plaza, Fajar Shopping Center, Greenridge Shopping Center, and Ten Mile Junction Mall).
Results show that the majority of older adults visit nearby places for food, shopping, and social activities. However, when it comes to medical or religious purposes, older adults travel far from their neighbourhood.
2) Temporal Studies on POIs: Based on the previous section, five common POIs are studied, namely Bukit Panjang Community Center (CC), Senja Cashew CC, Bukit Panjang Plaza, Fajar Shopping Center, and Bukit Panjang Hawker Center. Two categories of data are studied: the check-in time by hour of the day, and the duration spent at each POI by day of the week. The results are presented in Fig. 9.
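The two temporal summaries described above (check-ins per hour of day and total time spent per day of week) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `summarize_visits`, the input layout of (check-in, check-out) pairs, and the choice to attribute a whole stay to its check-in day are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import datetime

DAYS = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]  # datetime.weekday() order


def summarize_visits(visits):
    """Summarize visits to one POI.

    visits: iterable of (check_in, check_out) datetime pairs.
    Returns (check-ins per hour of day, total hours spent per day of week).
    """
    check_ins_per_hour = Counter()
    hours_per_weekday = defaultdict(float)
    for check_in, check_out in visits:
        check_ins_per_hour[check_in.hour] += 1
        # Simplification: the whole stay is attributed to the check-in day.
        day = DAYS[check_in.weekday()]
        hours_per_weekday[day] += (check_out - check_in).total_seconds() / 3600
    return dict(check_ins_per_hour), dict(hours_per_weekday)


if __name__ == "__main__":
    demo = [(datetime(2018, 1, 1, 8, 30), datetime(2018, 1, 1, 10, 0))]
    print(summarize_visits(demo))
```

Plotting the first dictionary as a histogram and the second as a bar chart gives the two panel types of Fig. 9 for one POI.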
Bukit Panjang Hawker Center (one of the top common POIs) is considered a morning place (8 am to 10 am) compared to the others: it is popular because of its market, and older adults often socialize at the nearby coffee shop. However, we noticed that Mondays have fewer visits because of the operational hours of the market, where most stores are closed on Mondays.
Shopping malls such as Bukit Panjang Plaza (BPP) and Fajar Shopping Center (FSC) show different patterns: the BPP check-in distribution resembles a Gaussian, whereas two obvious peaks can be observed at FSC. The higher afternoon peak at BPP is correlated with the nearby transportation hub, which contributes to the crowd flow during lunch hour.
Community centers (CC) are another type of common place among the participants. We noticed that both CCs (Bukit Panjang and Senja Cashew) have quite similar check-in patterns. However, when the number of hours spent is considered, Bukit Panjang CC tends to have more check-ins on Monday and Sunday. This variation is due to the different recreational activities offered by each CC.
Overall, older adults are more likely to visit Bukit Panjang Hawker Center and the CCs in the mornings. However, the check-in pattern of a CC relies on the recreational activities offered. In contrast, the check-in pattern of a shopping mall is related to transportation accessibility and the location itself.
3) User Attribute Studies: Two user attributes (gender and work status) are examined in this case study. The results are shown in Fig. 10.
The active user groups are highlighted in red bold boxes. It can be observed that non-working females are more active than non-working males. Most of the common places among non-working females are shopping districts or malls such as Bugis, Marina Square, China Town, and Orchard Road. This also agrees with the previous finding that most of the common places are shopping districts. On the other hand, non-working males have a higher tendency to stay in the area near their home.
However, in the working group, working males are more active than working females, which is correlated with the nature of the work they do. In contrast, working females seem to have a more fixed routine between their residential area and the town center. One speculation is that this is related to shopping districts and malls.
In conclusion, the frequently visited POIs are located within 5 km of Bukit Panjang (the participants’ home region). In particular, older adults are more likely to visit Bukit Panjang Hawker Center and the Community Center. Moreover, working males are more active than working females, and non-working females are more active than non-working males.
[Figure 9 panels: (a-1) to (e-1) histograms of check-in hours (x-axis: time of day, from 0000 to 2359 hours; y-axis: number of check-ins); (a-2) to (e-2) total time spent by day of week (x-axis: day of week, SUN to SAT; y-axis: total time spent in hours)]
Fig. 9: Check In Time and Time Spent for the following POI (Top to Bottom, Left to Right): (a) Bukit Panjang Community Centre, (b) Senja Cashew Community Centre, (c) Bukit Panjang Plaza, (d) Fajar Shopping Centre, and (e) Bukit Panjang Hawker Centre and Market. Each colour of the bar represents a unique user.
[Figure 10 panels: Working - female; Working - male; Non working - female; Non working - male]
Fig. 10: Active area based on gender and work status
C. POI Travel Frequency Analysis
This subsection presents an analysis of travel frequency to certain POIs. We try to understand how the distance from participants’ home to a particular POI, and the time taken to reach that POI from participants’ home, impact the frequency of travel (the number of visits) to that POI.
Figure 11a shows the distance from participants’ home to each POI (within 5000 m) vs. the number of visits from all the participants in the study. According to the figure, the number of visits decreases as the distance from participants’ home increases, except at a few distances such as 500 m, 900 m, 1200 m, 1900 m, 3600 m, and 4000 m. These exceptions are due to a few of the top POIs that participants prefer to visit, namely Bukit Panjang Plaza, Bukit Panjang Town Council, and Bukit Panjang Hawker Centre.
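The distance analysis above amounts to binning POIs by their distance from home and summing visits per bin. The sketch below shows one way to do that and to flag bins that break the decreasing trend. The 100 m bin width, the function names, and the input layout are assumptions, since the paper does not state how the figure was binned.

```python
from collections import Counter


def visits_by_distance(poi_records, bin_m=100, max_m=5000):
    """Sum visits per distance bin.

    poi_records: iterable of (distance_from_home_m, number_of_visits).
    POIs farther than max_m are ignored, matching the 5000 m range in the text.
    Returns {bin_start_m: total_visits}, ordered by distance.
    """
    counts = Counter()
    for distance_m, visits in poi_records:
        if distance_m > max_m:
            continue
        bin_start = int(distance_m // bin_m) * bin_m
        counts[bin_start] += visits
    return dict(sorted(counts.items()))


def rising_bins(binned):
    """Bins with more visits than the previous non-empty bin.

    These are the exceptions to the 'visits decrease with distance' trend.
    """
    keys = list(binned)
    return [k for prev, k in zip(keys, keys[1:]) if binned[k] > binned[prev]]
```

The same two functions apply unchanged to travel time if the first element of each record is a duration in minutes and `bin_m` and `max_m` are read as minutes.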
Figure 11b shows the time taken to reach each POI from participants’ home vs. the number of visits from all the participants in the study. It can be seen from the figure that the previously identified top POIs (Bukit Panjang Plaza, Bukit Panjang Town Council, and Bukit Panjang Hawker Centre) are easily accessible (it takes less than 60 minutes to reach them from participants’ home).
In conclusion, even if the distance from participants’ home to a POI is large (more than 1 km), older adults prefer to visit such a POI if they can access it easily. There are three POIs outside the participants’ neighbourhood of Bukit Panjang, namely Chinatown Point, Plaza Singapura, and Causeway Point, where the number of visits is substantially high. Since those three POIs are shopping malls, we can say that older adults prefer to go shopping at specific malls even though it takes more than an hour to reach them from their homes.
V. USER PROFILING
We used the collected data, described in the preceding sections, to define possible features that can help us profile older adults into groups based on their visiting behaviors and smartphone sensor fusion data. The smartphone data were used to create 10 features, whereas the collected demographic data were used to create 3 additional features. These features are given in Table III along with their data sources. The whole profiling process is summarized in Figure 12, and its steps are discussed in detail in the following subsections.
(a) Distance from Home (km) to POI vs. Number of Visits
(b) Travel Time taken from home to POI vs. Number of Visits
Fig. 11: POI Travel Frequency Analysis
TABLE III: Features used for profiling (Bold - Important
features in main clusters, Underline - Important features in
sub clusters, Bold and underline - Important features in both
cluster types)
ID Feature
1 POI visit ratio
2 Home stay ratio
3 Indoor POI stay ratio
4 Outdoor POI stay ratio
5 Private POI stay ratio
6 Public POI stay ratio
7 Healthcare POI visit ratio
8 Food court POI visit ratio
9 Shopping POI visit ratio
10 Weekday distinct POI visit ratio
11 Work status (Full time, part time, not working)
12 Age
13 Gender
A. Input Data Pre-Processing
The pre-processed data from the previous steps are passed to the profiling process, where they are pre-processed again to make them ready for profiling. First, location data that return points of interest outside Singapore are removed from the dataset. Second, if a user stays for too long (more than 3 days) at a stay point, or has a single travel duration of more than 12 hours, that entry is eliminated since it would bias the time-based features.
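The two filtering rules above can be written as a short sketch. The record layout (dictionaries with `lat`, `lon`, `stay`, and `travel` keys), the function names, and in particular the rectangular bounding box used to approximate "inside Singapore" are assumptions made for illustration; the paper does not say how the boundary check was done, and a rectangle also covers some neighbouring territory.

```python
from datetime import timedelta

MAX_STAY = timedelta(days=3)      # threshold stated in the text
MAX_TRAVEL = timedelta(hours=12)  # threshold stated in the text


def in_singapore(lat, lon):
    """Rough bounding-box test (assumed coordinates, not an exact boundary)."""
    return 1.15 <= lat <= 1.48 and 103.59 <= lon <= 104.10


def clean_records(records):
    """Drop entries outside Singapore or with implausibly long durations.

    records: list of dicts with keys 'lat', 'lon', 'stay', 'travel',
    where 'stay' and 'travel' are timedelta objects.
    """
    kept = []
    for r in records:
        if not in_singapore(r["lat"], r["lon"]):
            continue  # point of interest outside Singapore
        if r["stay"] > MAX_STAY or r["travel"] > MAX_TRAVEL:
            continue  # would bias the time-based features
        kept.append(r)
    return kept
```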
Fig. 12: Profiling Process
B. Feature Extraction and Standardization
The processed data is used to create the features listed in Table III for every user. One issue is calculating the total time of data collection. The time between the first and last data collection cannot be used, since some user datasets have large gaps between data collection windows (which could be due to various reasons such as the phone being turned off or no GPS data for a long time). If we sum only the stay point intervals, we miss the travel time between stay points. Therefore, the total time is calculated by adding up the time spent at stay points and the travel durations. As a last step, all continuous features are standardized to have zero mean and unit variance. Standardization prevents the possible dominance of a feature whose unstandardized variance is much higher than that of the other features.
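As a small worked sketch of the two steps above, the total observed time is the sum of stay and travel durations, and each continuous feature column is then rescaled to zero mean and unit variance. The function names are illustrative, and the use of the population standard deviation is an assumption (the paper does not say which estimator was used).

```python
from statistics import mean, pstdev


def total_observed_hours(stay_hours, travel_hours):
    """Total observation time: time at stay points plus travel durations."""
    return sum(stay_hours) + sum(travel_hours)


def standardize(column):
    """Z-score a feature column: subtract the mean, divide by the std. dev."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (assumed choice)
    if sigma == 0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(x - mu) / sigma for x in column]


if __name__ == "__main__":
    total = total_observed_hours([12.0, 8.0], [1.5, 2.5])
    home_stay_ratio = 12.0 / total  # example ratio feature for one user
    print(total, home_stay_ratio)
    print(standardize([0.2, 0.5, 0.8]))
```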
C. Clustering, Best Features and Visualization
If we consider profiling as a machine learning problem, it is an unsupervised learning problem in our case. For the specific profiling problem and features at hand, it can be defined as a clustering problem, where the features indicate each user’s position in feature space. We used two clustering algorithms, namely k-means [28] and hierarchical clustering (specifically agglomerative clustering), and measured their performance on our feature dataset. Since we do not know how many clusters there are in our dataset (and have no ground truth for clustering), we ran a systematic experiment on the feature dataset to find the best clustering, varying the number of clusters to be found from 2 to 10 for both algorithms. Since the k-means algorithm randomizes the initial centroids, we ran k-means 50000 times for each number of clusters. As performance metrics, the Silhouette coefficient [29] and the Calinski-Harabasz score [30] are employed to measure how distinct the found clusters are from each other. Comparing the results, we found that for each clustering algorithm the best performance metrics are obtained when the number of clusters is set to 3. Table IV shows the performance of the clustering algorithms on our dataset. Since the performance of k-means is slightly better than that of hierarchical clustering, from this point onwards the clusters obtained by k-means are discussed.
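The model-selection loop described above can be sketched in plain Python. This is a simplified illustration, not the authors' experiment: it uses basic Lloyd's k-means with random initial centroids (not k-means++ seeding), scores only with the mean silhouette coefficient (the Calinski-Harabasz score and hierarchical clustering are omitted), and uses a small number of restarts. All function names and default values are assumptions.

```python
import math
import random


def kmeans(points, k, rng, iters=100):
    """Lloyd's algorithm from k randomly chosen initial centroids; returns labels."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient, (b - a) / max(a, b) averaged over points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_clustering(points, k_values, restarts=30, seed=0):
    """Try each k with several random restarts; keep the best silhouette."""
    rng = random.Random(seed)
    best = (-2.0, None, None)  # (score, k, labels)
    for k in k_values:
        for _ in range(restarts):
            labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if score > best[0]:
                best = (score, k, labels)
    return best
```

For a dataset of standardized feature vectors (one tuple per user), `best_clustering(points, range(2, 11))` mirrors the 2 to 10 sweep in the text; the restarts are needed because a single random initialization can converge to a poor local optimum.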
TABLE IV: Number of clusters found and performance metrics for the used clustering methods
Method                   Number of Clusters  Silhouette coefficient  Calinski-Harabasz score
k-means                  3                   0.396                   17.842
Hierarchical Clustering  3                   0.383                   16.166
The clustering by k-means consists of 3 clusters, which are populated by 39, 8, and 3 people. For visualization of these clusters, principal component analysis (PCA) is used to reduce the dimensionality of the feature vectors to 3. Figure 13 shows the scatter of users, divided into clusters. As can be seen, there is a clear distinction between the clusters.
Fig. 13: Visualization of user feature vectors with dimensions reduced to 3 using PCA
It is important for future studies to learn which features are most important and helpful for achieving the clustering obtained here. To reveal these features, we fed the cluster membership labels found by the k-means algorithm to the system illustrated in Figure 14 to find the best features, i.e., the features that create the highest variation for the given division of clusters. Since decision trees generally fit the data well, we passed the dataset under examination to decision tree algorithms and extracted the feature importance information from the classifier. We used different decision tree implementations and repeated the same process 10000 times for each implementation. The entries in bold text in Table III are the significant features that discriminate the clusters.
Fig. 14: Method for revealing best features
Figure 15 shows the mean of the normalized scores for the 3 important features of the distinct clusters obtained by the k-means algorithm. Those features are POI visit ratio, outdoor stay ratio, and home stay ratio. It can be observed from Figure 15 that the people in cluster 1 prefer to spend more time at outdoor POIs when compared to people from cluster 2 and cluster 3. People in cluster 2 prefer to spend more time at home when compared to people from cluster 1 and cluster 3. People in cluster 3 prefer to visit more POIs when compared to people from cluster 1 and cluster 2. A summary of each cluster is shown in Table V.
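The per-cluster profile described above (mean normalized score of each feature within each cluster) can be sketched as follows. The paper does not state which normalization was used for the figure; min-max scaling to the range 0 to 1 is assumed here purely for illustration, and the function names and input layout are hypothetical.

```python
def minmax(column):
    """Scale a feature column to [0, 1] (assumed normalization)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant feature
    return [(x - lo) / (hi - lo) for x in column]


def cluster_profiles(features, labels):
    """Mean normalized score of each feature within each cluster.

    features: dict mapping feature name -> list of values, one per user.
    labels: list of cluster ids, one per user, in the same order.
    Returns {cluster_id: {feature_name: mean_normalized_score}}.
    """
    profiles = {}
    for name, column in features.items():
        scaled = minmax(column)
        for cluster in set(labels):
            vals = [v for v, l in zip(scaled, labels) if l == cluster]
            profiles.setdefault(cluster, {})[name] = sum(vals) / len(vals)
    return profiles
```

Comparing the resulting values across clusters for one feature gives the kind of contrast discussed in the text, for example a higher home stay score in one cluster than in the others.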
TABLE V: Summary of clusters found by k-means algorithm
Cluster ID Number of People Summary
1 03 Working female, the most outdoor POI visits
2 39 Mixed work status and gender, the most time spent at home
3 08 Working male, the most POI visits
[Figure 15 plot: x-axis: Features (POI Visit Ratio, Outdoor Stay Ratio, Home Stay Ratio); y-axis: Normalized score (0 to 1); series: Cluster 1, Cluster 2, Cluster 3]
Fig. 15: Distinct profiles for 3 main clusters
Cluster 2, which has 39 out of 50 people, is a very large cluster that might consist of sub-clusters. We applied k-means clustering again to the feature dataset of these 39 people, and this large cluster is divided into 4 sub-clusters. The same process illustrated in Figure 14 is repeated here to find the best features, which create higher variance in the cluster 2 dataset. These features are underlined in Table III. Figure 16 shows the mean of the normalized scores for the 6 important features of the distinct sub-clusters obtained by the k-means algorithm. Those features are the ratios of home stay, healthcare POI visits, POI visits, weekday POI visits, weekend POI visits, and food POI visits. According to the partitioning found in this last step, the sub-clusters can be summarized qualitatively, as shown in Table VI, by observing the data distributions of the best features.
In conclusion, we can identify 3 main profiles of older adults, namely outdoor users, home users, and users with more POI visits. Based on the results of profiling, we can better understand the behaviour and needs of the different profiles, and such information can be used to design policies, incentives, activities, and even points of interest that could enhance the quality of life of older adults.
VI. CONCLUSIONS
This paper presents a novel mobile crowdsensing approach to understand the daily lifestyle of the older population in Singapore. We use a smartphone-based mobile application to collect data such as location information and environment information (e.g. noise, light, etc.), and analyze those data to demonstrate various travel patterns of older adults at several points of interest (POI), based on the number of visits and the amount of time spent at each POI. Results show that by using the proposed system it is possible to identify the travel patterns and the impact of travel frequency for certain POIs, based on the distance and travel time from participants’ home location. We observed that the amount of time taken to reach a certain POI from a person’s home is an important factor, regardless of the distance to that POI, when it comes to the number of visits a person makes to it. Utilizing the collected multi-sensor data and the travel patterns, it is also possible to profile older adults into a few groups based on their POI visit ratio, outdoor stay ratio, and home stay ratio. Moreover, we identified three main categories of older adults based on their daily life. In future work, we are looking to recruit a larger population to test the proposed method. Moreover, we plan to utilize WiFi information to analyze indoor mobility patterns, which we believe can better keep track of a person even in an indoor environment, and we hope it can better capture the social relationships among our participants.
ACKNOWLEDGMENT
We would like to express our sincere gratitude to all the
participants in the study who allowed us to collect their
location information. We are thankful to everyone who spared
their time to help us in the process of recruiting older adults for
this study. This research was supported by the Lee Kuan Yew
Centre for Innovative Cities under Lee Li Ming Programme
in Aging Urbanism, and in part by the NSFC 61750110529.
REFERENCES
[1] M. Conti, A. Passarella, and S. K. Das, “The internet of people (iop):
A new wave in pervasive mobile computing,” Pervasive and Mobile
Computing, vol. 41, pp. 1–27, 2017.
[2] F. Xia, N. Y. Asabere, A. M. Ahmed, J. Li, and X. Kong, “Mobile
multimedia recommendation in smart communities: A survey,” IEEE
Access, vol. 1, pp. 606–624, 2013.
[3] B. P. L. Lau, T. Chaturvedi, B. K. K. Ng, K. Li, M. S. Hasala, and
C. Yuen, “Spatial and temporal analysis of urban space utilization
with renewable wireless sensor network,” in Proceedings of the 3rd
IEEE/ACM International Conference on Big Data Computing, Appli-
cations and Technologies, pp. 133–142, ACM, 2016.
[4] B. P. L. Lau, N. Wijerathne, B. K. K. Ng, and C. Yuen, “Sensor fusion
for public space utilization monitoring in a smart city,” IEEE Internet
of Things Journal, vol. 5, no. 2, pp. 473–481, 2018.
[5] S. H. Marakkalage, B. P. L. Lau, V. S. Kadaba, T. Balasubramaniam,
C. Yuen, B. Yuen, and R. Nayak, “Identifying points of interest for
elderly in singapore through mobile crowdsensing,” in 6th International
Conference on Smart Cities and Green ICT Systems (SMARTGREENS
2017), (Porto, Portugal), April 2017.
[6] S. H. Marakkalage, B. P. L. Lau, S. K. Viswanath, C. Yuen, and B. Yuen,
“Real-time data analysis using a smartphone mobile application,” in
Ageing and the Built Environment in Singapore, pp. 221–240, Springer,
2019.
[7] Department of Statistics, Singapore, Resident Old-Age Support Ratio,
2017 (accessed June 01, 2017). http://www.singstat.gov.sg/statistics/visualising-data/charts/old-age-support-ratio.
[8] L. M. Capella and A. J. Greco, “Information sources of elderly for
vacation decisions,” Annals of Tourism Research, vol. 14, no. 1, pp. 148–
151, 1987.
[9] N. Pang, X. Zhang, S. Vu, and S. Foo, “Smartphone use by older adults
in singapore,” Gerontechnology, vol. 13, no. 2, p. 270, 2014.
[Figure 16 plot: x-axis: Features (Home stay ratio, Healthcare POI visit ratio, POI visit ratio, Weekday POI visit ratio, Weekend POI visit ratio, Food POI visit ratio); y-axis: Normalized score (-0.5 to 1); series: Sub cluster 1, Sub cluster 2, Sub cluster 3, Sub cluster 4]
Fig. 16: Distinct profiles for 4 sub clusters
TABLE VI: Summary of sub-clusters found in giant component by k-means algorithm
Sub cluster Members Summary
1 14 Work full time, similar pattern to sub cluster 2, except for less food POI visits
(probably due to their full time work nature)
2 15 Work part time, more food POI visits
3 03 Mixed work status, more weekday/less weekend POI, and more healthcare POI visits
4 07 Mixed work status, less home stay and more weekend POI
[10] N. Wijerathne, S. K. Viswanath, M. S. Hasala, V. Beltran, C. Yuen,
and H. B. Lim, “Towards comfortable cycling: A practical approach to
monitor the conditions in cycling paths,” in IEEE 4th World Forum on
Internet of Things (WF-IoT) 2018, IEEE, 2018.
[11] S. K. Viswanath, C. Yuen, X. Ku, and X. Liu, “Smart tourist-passive
mobility tracking through mobile application,” in International Internet
of Things Summit, pp. 183–191, Springer, 2014.
[12] S. Helal, B. Winkler, C. Lee, Y. Kaddoura, L. Ran, C. Giraldo,
S. Kuchibhotla, and W. Mann, “Enabling location-aware pervasive
computing applications for the elderly,” in Pervasive Computing and
Communications, 2003.(PerCom 2003). Proceedings of the First IEEE
International Conference on, pp. 531–536, IEEE, 2003.
[13] M. Fahim, I. Fatima, S. Lee, and Y.-K. Lee, “Daily life activity
tracking application for smart homes using android smartphone,” in
Advanced Communication Technology (ICACT), 2012 14th International
Conference on, pp. 241–245, IEEE, 2012.
[14] A. Yassin, Y. Nasser, M. Awad, A. Al-Dubai, R. Liu, C. Yuen,
R. Raulefs, and E. Aboutanios, “Recent advances in indoor localization:
A survey on theoretical approaches and applications,” IEEE Communi-
cations Surveys Tutorials, vol. 19, pp. 1327–1346, Secondquarter 2017.
[15] S.-C. Kim, Y.-S. Jeong, and S.-O. Park, “Rfid-based indoor location
tracking to ensure the safety of the elderly in smart home environments,”
Personal and ubiquitous computing, vol. 17, no. 8, pp. 1699–1707, 2013.
[16] F. Arab, Y. Malik, and B. Abdulrazak, “Evaluation of phonage: an
adapted smartphone interface for elderly people,” in IFIP Conference
on Human-Computer Interaction, pp. 547–554, Springer, 2013.
[17] S. Mirri, C. Prandi, P. Salomoni, F. Callegati, and A. Campi, “On
combining crowdsourcing, sensing and open data for an accessible
smart city,” in 2014 Eighth International Conference on Next Generation
Mobile Apps, Services and Technologies, pp. 294–299, IEEE, 2014.
[18] A. Hiyama, Y. Nagai, M. Hirose, M. Kobayashi, and H. Takagi,
“Question first: Passive interaction model for gathering experience and
knowledge from the elderly,” in Pervasive Computing and Communi-
cations Workshops (PERCOM Workshops), 2013 IEEE International
Conference on, pp. 151–156, IEEE, 2013.
[19] G. Ghiani, M. Manca, F. Paternò, and C. Santoro, “Towards an architecture
supporting social, adaptive and persuasive services for active
elderly,” in CASFE, pp. 36–41, 2013.
[20] SUTDDev, “City app,” in City - Location tracking application, Google
Play, 2015.
[21] Google, DetectedActivity - Android API, 2016 (accessed March
28, 2016). http://developers.google.com/android/reference/com/google/android/gms/location/DetectedActivity.
[22] B. P. L. Lau, M. S. Hasala, V. S. Kadaba, T. Balasubramaniam, C. Yuen,
B. Yuen, and R. Nayak, “Extracting point of interest and classifying
environment for low sampling crowd sensing smartphone sensor data,”
in IEEE International Conference on Pervasive Computing and Commu-
nication Workshops (IEEE PerCom 2017), (Kona, Big Island, Hawaii),
pp. 1–6, IEEE, 2017.
[23] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, et al., “A density-based
algorithm for discovering clusters in large spatial databases with noise.,”
in Kdd, vol. 96, pp. 226–231, 1996.
[24] Q. Li, Y. Zheng, X. Xie, Y. Chen, W. Liu, and W.-Y. Ma, “Mining
user similarity based on location history,” in Proceedings of the 16th
ACM SIGSPATIAL international conference on Advances in geographic
information systems, p. 34, ACM, 2008.
[25] C. Veness, “Calculate distance, bearing and more between
latitude/longitude points,” not dated,
http://www.movable-type.co.uk/scripts/latlong.html, 2010.
[26] R. Liu, C. Yuen, T. N. Do, and U.-X. Tan, “Fusing similarity-based
sequence and dead reckoning for indoor positioning without training,”
IEEE Sensors Journal, vol. 17, pp. 4197–4207, July 2017.
[27] DataSpark, Dataspark Mobility Intelligence API, 2017 (accessed De-
cember 03, 2017). https://apis.datasparkanalytics.com/documentation/
getting-started.
[28] D. Arthur and S. Vassilvitskii, “k-means++: The advantages of careful
seeding,” in Proceedings of the eighteenth annual ACM-SIAM sympo-
sium on Discrete algorithms, pp. 1027–1035, Society for Industrial and
Applied Mathematics, 2007.
[29] P. J. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and
validation of cluster analysis,” Journal of computational and applied
mathematics, vol. 20, pp. 53–65, 1987.
[30] T. Caliński and J. Harabasz, “A dendrite method for cluster analysis,”
Communications in Statistics-theory and Methods, vol. 3, no. 1, pp. 1–27,
1974.
Sumudu Hasala Marakkalage received the Bach-
elor of Engineering (Hons) degree in Electronic
Engineering from the Sheffield Hallam University,
UK, in 2011. During the period of 2011 to 2015,
he worked at the Dialog - University of Moratuwa
Mobile Communications Research Laboratory, Sri
Lanka as a Research Engineer, where he was in-
volved in developing innovative mobile application
platforms for Dialog Axiata, which is a leading telco
in Sri Lanka. He is currently pursuing the Ph.D.
degree with the Engineering Product Development
Pillar, Singapore University of Technology and Design (SUTD), Singapore.
His current research interests include mobile crowdsensing, sensor fusion,
internet of things, and big data.
Serhad Sarica received his BSc and MSc in Electrical & Electronics Engineering from Middle East Technical University, Turkey in 2007 and 2011 respectively. During the period 2007 to 2016, he worked as a senior system designer at Aselsan Co., Turkey, where he was involved in and led several naval communication system design projects. He is currently pursuing the PhD degree with the Engineering Product Development Pillar, Singapore University of Technology and Design (SUTD), Singapore. His current research interests include change propagation in complex systems, semantic relations in technology and innovation space, and utilization of NLP methods for design ideation.
Billy Pik Lik Lau received a degree in computer science and an M. Phil degree in computer science from Curtin University in 2010 and 2014 respectively. He is currently pursuing a PhD at Singapore University of Technology and Design (SUTD) under Dr Yuen Chau’s supervision. He previously worked on improving the cooperation rate between agents in multi-agent systems during his master studies, and his current research focus includes smart city, internet of things, big data analysis, data discovery, and unsupervised machine learning.
Sanjana Kadaba Viswanath (sanjana@sutd.edu.sg)
received her M.Sc in computing from Imperial Col-
lege London in 2011. She is currently a research
associate at Singapore University of Technology and
Design (SUTD). Her research interests include IoT
and big data.
Thirunavukarasu Balasubramaniam received the B.E degree from Anna University, India in 2015 and is currently pursuing a Ph.D. at Queensland University of Technology (QUT), Australia. Before joining QUT, he was a Research Assistant at Singapore University of Technology and Design in 2016. During 2015-2016 he was a visiting researcher at SUTD-MIT International Design Centre, Singapore. He was a Research Intern at CCMP Lab, Kyungpook National University, South Korea during 2014-2015. His research interests include Machine Learning, Tensor mining, and Recommender Systems.
Chau Yuen (yuenchau@sutd.edu.sg) received the
B.Eng. and Ph.D. degrees from Nanyang Techno-
logical University, Singapore, in 2000 and 2004,
respectively. He was a postdoctoral fellow with
Lucent Technologies Bell Labs, Murray Hill, New
Jersey, in 2005. He was a visiting assistant professor
with Hong Kong Polytechnic University in 2008.
From 2006 to 2010, he worked at I2R, Singapore,
as a senior research engineer. Currently, he is an
associate professor with the Singapore University
of Technology and Design. He is an editor of
IEEE Transactions on Communications and IEEE Transactions on Vehicular
Technology. In 2012, he received the IEEE Asia-Pacific Outstanding Young
Researcher Award.
Belinda Yuen is professorial fellow and research
director at the Lee Kuan Yew Centre for Innovative
Cities, Singapore University of Technology and De-
sign where she leads the Lee Li Ming Programme
in Ageing Urbanism.
Jianxi Luo is director of the Data-Driven Innova-
tion Lab at SUTD (http://ddi.sutd.edu.sg). He holds
B.S. and M.S. degrees in Mechanical Engineering
from Tsinghua University and a S.M. degree in
Technology & Policy and a Ph.D. degree in En-
gineering Systems from Massachusetts Institute of
Technology (MIT). He is currently assistant profes-
sor of engineering product development at Singa-
pore University of Technology and Design (SUTD)
and the associate director of the SUTD Technology
Entrepreneurship Programme (STEP). His research
is focused on developing artificial intelligences and data science methods
and tools to enhance innovation in engineering. His teaching is focused on
entrepreneurship and innovation. He was a faculty member at New York
University, a visiting scholar at Columbia University and the University of
Cambridge, and a chair emeritus of the Technology, Innovation Management
and Entrepreneurship Section of INFORMS. In practice, he is a co-founder
and advisor of several startups, an innovation consultant and public speaker.
Richi Nayak is Associate Professor of Computer Science at the Queensland University of Technology (QUT), Australia. She is an internationally recognised expert in data mining and web intelligence. Her particular research interests are machine learning, and in recent years she has concentrated her work on text mining, personalization, automation, and social network analysis. She has published high-quality conference and journal articles and is highly cited in her research field. She has been successful in attaining over $1 million in competitive external research funding over the past five years in the area of data mining. She has received a number of awards and nominations for teaching, research and service activities.