Continuous Trajectory Pattern Mining
for Mobility Behaviour Change Detection
David Jonietz and Dominik Bucher
Abstract With the emergence of ubiquitous movement tracking technologies, devel-
oping systems which continuously monitor or even influence the mobility behav-
iour of individuals in order to increase its sustainability is now possible. Currently,
however, most approaches do not move beyond merely describing the status quo of
the observed mobility behaviour, and require an expert to assess possible behaviour
changes of individual persons. Especially today, automated methods for this assess-
ment are needed, which is why we propose a framework for detecting behavioural
anomalies of individual users by continuously mining their movement trajectory data
streams. For this, a workflow is presented which integrates data preprocessing, com-
pleteness assessment, feature extraction and pattern mining, and anomaly detection.
In order to demonstrate its functionality and practical value, we apply our system to
a real-world, large-scale trajectory dataset collected from 139 users over 3 months.
Keywords Mobility · Trajectory mining · Anomaly detection · Sustainability · Behavior change
D. Jonietz (✉) · D. Bucher
Institute of Cartography and Geoinformation, ETH Zurich, Stefano-Franscini-Platz 5,
8093 Zurich, Switzerland
e-mail: jonietzd@ethz.ch
D. Bucher
e-mail: dobucher@ethz.ch
© Springer International Publishing AG 2018
P. Kiefer et al. (eds.), Progress in Location Based Services 2018, Lecture Notes
in Geoinformation and Cartography, https://doi.org/10.1007/978-3-319-71470-7_11
1 Introduction
Human mobility is ubiquitous in modern societies and represents an integral part
of our daily behavioural routines. At the same time, however, there are numerous
undesirable effects, such as traffic jams or increased fossil fuel consumption (Taaffe
1996). In Switzerland, for instance, roughly half of the total CO2 emissions are
contributed by the transportation sector (including international aviation), with
motorized individual mobility being responsible for around two thirds of these
emissions (Bundesamt fuer Umwelt 2014). If no major changes occur in the
transport system, these numbers are widely expected to rise in the coming decades
(Boulouchos et al. 2017).
Recently, the significance of emerging technologies which enable ubiquitous
monitoring as well as real-time regulation and management of human mobility has
been emphasized as a potentially game-changing factor for increasing the sustainability
of travel behaviour (Boulouchos et al. 2017). Indeed, current developments in the
field of location-acquisition technologies such as Global Navigation Satellite Systems
(GNSS), Wireless Local Area Networks (WLAN), or Global System for Mobile
Communications (GSM) make it possible to monitor and record human movement at an
exceptional level of detail and at relatively low cost and effort (Feng and Zhu 2016). Due
to the widespread use of modern smart phones, as well as a general trend towards
digitalization in the transportation and mobility sector, Big Mobility Data are now
widely available and ready to be utilized for gaining unprecedented insights into the
fundamental mechanisms that guide human mobility (Brunauer and Rehrl 2016).
In fact, since the late 1990s, human movement trajectories, i.e. series of chrono-
logically ordered x, y-coordinate pairs with time stamps (Andrienko et al. 2016),
have increasingly been used for travel surveys (Shen and Stopher 2017). Apart from
notable exceptions (e.g. Schlich and Axhausen 2003; Stopher et al. 2013), however,
these studies have mainly applied a snapshot approach (e.g. Schüssler 2008; Kohla
and Meschik 2013), with the center of interest being put on inter-personal variabil-
ity (differences in the behaviour of different persons) rather than intra-personal vari-
ability (different behaviour of one person from day to day) (Schlich and Axhausen
2003). What has often been neglected, therefore, is analysing the dynamic dimension
of mobility behaviour, i.e. behaviour changes such as trying out new travel alterna-
tives, or forming new mobility habits.
Especially today, however, it would be worthwhile to be able to automatically
detect and analyse such changes in mobility behaviour. On the one hand, in con-
trast to merely surveying mobility behaviour, there are now systems which move
further by aiming to directly influence people’s mobility behaviour towards more
sustainable transport alternatives (cf. Banister 2008), e.g. by using mobile applica-
tions which continuously record the movements of users, stream the data to a server,
and utilize them to provide their users with feedback or even suggest more sustainable
travel options (Froehlich et al. 2009; Montini et al. 2015). To the best of our
knowledge, currently none of these systems apply strategies for automatically detecting
behaviour change, but instead require manual checking of the data for evaluating
the effectiveness of the conducted persuasive measures. A fully automated system
which continuously monitors movement behaviour based on a stream of trajectory
data, and detects behavioural changes, however, could take over this tedious task
and even trigger dynamic responses to users based on their behavioural changes, e.g.
encouraging adaptations towards more sustainable mobility behaviour and discouraging
adaptations in the opposite direction. On the other hand, apart from application
scenarios where behaviour change
is actively induced, the development of methods for detecting such variations in
movement data would also be useful for general transportation research and planning
purposes. Thus, for instance, insights are still needed in terms of evaluating and
predicting people’s reactions to today’s novel mobility options, such as shared mobility,
mobility as a service, electric mobility and autonomous vehicles. Being confronted
with these, one can expect numerous people to adapt their mobility behaviour, e.g.
by testing novel alternatives and even forming new travel habits (Boulouchos et al.
2017). In order to accurately understand these behavioural changes, travel surveys
are needed which involve tracking numerous participants over a long period of time.
In addition, a set of suitable methods is necessary to analyse the collected data.
For developing such methods, however, a practical problem is posed by insuffi-
cient data quality. It is especially data incompleteness which represents a critical
challenge for GNSS-based travel surveys, since it comprises missing records for
parts of trips, one or more full trips, or even one or more full days of the record-
ing period (Hecker et al. 2010). These gaps can have various causes, e.g. the cold
start problem at the start of movement, bad signal reception, participants leaving
the device switched off, or other technological problems (Shen and Stopher 2017).
While shorter gaps can often be handled by means of map matching techniques
(see Sect. 2.1), longer ones can heavily distort or bias the results of the following
analyses. In the context of automated behaviour change detection, for instance, the
occurrence of missing movement data could lead to misleading calculations, e.g.
drastically lower values for CO2 emissions produced during the respective week of
recording. In this case, a system might erroneously interpret this drop in numbers as
a behaviour change, whereas it is in fact merely the result of missing data. To avoid
such misdetection of behaviour changes, methods need to be sensitive to recording
gaps, i.e. distinguish them from cases where observed changes are actually due to
changed mobility patterns.
Against this background, this study proposes a method for identifying and evaluating
changes in human mobility behaviour by first detecting and quantifying spatio-temporal
recording gaps in a stream of movement trajectory data, and then continuously mining
it for anomalies with regard to various mobility features, i.e. a subset of variables
which can be extracted from movement data and which describe selected aspects of
mobility behaviour (e.g. average speed, travelled distances). Focussing on
sustainable mobility as the application scenario, we simulate a real-time data stream
using a real trajectory dataset collected from 139 users over 3 months in Switzerland.
This paper is structured as follows: First, in Sect. 2, background information is
provided, starting with a brief review of available methods for surveying human mobility
behaviour on the basis of movement trajectory datasets. Then, the focus is shifted
to the potential of similar techniques for inducing and analysing changes in mobility
behaviour. In the following Sect. 3, our concept is presented and discussed with
regard to data preprocessing, completeness assessment, feature extraction and pattern
mining, and finally anomaly detection. In Sect. 4, the framework is applied to a
test dataset, before the results are discussed and the paper is concluded in Sect. 5.
2 Related Work
In the context of this study, relevant prior work applies one of two distinct perspec-
tives on mobility behaviour and movement data analysis, and is briefly reviewed in
this section:
1. Assessing the present state of mobility behaviour, i.e. where, when and how
a person travels. This is normally achieved by means of GNSS-assisted travel
surveys.
2. Aiming to change existing mobility behaviour in order to increase its sustainability,
e.g. by means of mobile applications which provide both tracking and user
feedback functionalities.
2.1 Movement Trajectories for Surveying Human Mobility
Behaviour
Before the rise of position tracking technologies, the traditional ways of gaining
insights about the mobility behaviour of people were face-to-face interviews, mail-
out/mail-back or telephone surveys. Since the late 1990s, however, GNSS-assisted
travel surveys emerged as a novel method, and gradually replaced these approaches
due to numerous advantages, such as a relatively high accuracy in recording time and
position, low cost (especially with modern smartphones), and fewer problems with
regard to trip misreporting by respondents (Shen and Stopher 2017). Nowadays,
exemplary approaches are manifold, and have spread from pilot studies undertaken
in the USA (Wagner 1997) to a range of other countries, including Switzerland (Shen
and Stopher 2017).
After recording the movements of test persons, the data require extensive process-
ing in order to extract relevant mobility features, in particular places that have been
visited for a certain purpose and the travelled routes between these places. With
regards to the former category, stay points are typically detected based on various
clustering techniques (e.g. Palma et al. 2008), or the movement speed (e.g. Li et al.
2008). With regards to the travelled routes, via map matching, the exact path taken
through a road network can be inferred from the tracking points, e.g. by simple point-
to-curve snapping (e.g. White et al. 2000) or advanced techniques such as evolution-
ary algorithms (Quddus and Washington 2015). Apart from the routes, numerous
studies have proposed approaches to infer the used traffic mode, for instance based on
identifying walking transitions between mode changes (Zheng et al. 2010), analysing
a range of movement descriptors (Sester et al. 2012), or the underlying transportation
network (Stenneth et al. 2011).
In order to describe a person’s mobility behaviour based on trajectory data, these
(and other) mobility features need to be further analysed to extract patterns, i.e.
observable regularities in movement behaviour such as habits or long-lasting pref-
erences and restrictions. Thus, one can calculate general statistics over certain time
intervals, such as the average duration and length of trips, the modal split, or the
usual times of travel (Axhausen and Frick 2005), but also more use-case specific
aspects such as frequently visited places other than home or the work location (Siła-
Nowicka et al. 2015) or the location of regularly performed activities like eating,
shopping or physical exercise (e.g. Zheng et al. 2010; Furletti et al. 2013). When
being properly interpreted, mobility features and their regular patterns can serve as
indicators for higher-level attributes, such as the sustainability of mobility behaviour.
In this context, for instance, Nicolas et al. (2003) formulated a set of potential
sustainability indicators which can be extracted from travel survey data. Among others
which refer to the aggregate city level, those which could be extracted from trajec-
tory data include the daily number of trips, the structure of trip purposes (e.g. com-
muting versus leisure), the daily average time budget spent for travelling, the modal
split (especially the share of slow mobility, i.e. walking and cycling), the average
distance travelled daily, and the average movement speed. Other relevant indicators
which have been formulated in the literature include the amount of CO2 emissions
and the degree to which trips are intermodally integrated, i.e. use different traffic
modes in combination (World Business Council 2015).
Naturally, the validity of the results computed for mobility features depends to a
large degree on the quality of the input trajectory data, in particular the completeness
of the recorded movement. Missing trips or even full-day gaps will lead to erroneous,
in some cases even heavily biased, results (Hecker et al. 2010), yet they are a
regularly occurring issue in travel surveys (Shen and Stopher 2017). Although this issue is
frequently discussed in the literature (e.g. Shen and Stopher 2017; Wolf et al. 2003),
only few studies propose solutions, such as evaluating the intrinsic trajectory data
quality based on the spatial and temporal resolution (Prelipcean et al. 2015), a statisti-
cal approach to detect dependencies between mobility behaviour, socio-demography
and missing data (Hecker et al. 2010), or imputation, the process of inferring the
missing trips based on observed data using statistical relationships (Polak and Han
1997). Another popular option to improve and ensure the completeness and correct-
ness of the movement data in travel surveys are prompted recall (PR) methods, in
which during the tracking phase, respondents are regularly asked to manually val-
idate and complete their recorded movements, for instance at the end of each day
(e.g. Bucher et al. 2016).
In traditional travel surveys, the focus is usually put on analysing the status quo
of mobility behaviour, since, as Schlich and Axhausen (2003) argue, there is a general
assumption that travel behaviour mainly consists of highly habitual routines,
and remains relatively static over time. Thus, in most cases, mobility features are
calculated once on the basis of the entire available data in order to assess the present
state of transportation system usage (e.g. Schüssler 2008; Kohla and Meschik 2013)
rather than analysing its temporal dynamics. Additionally, this snapshot approach is
often caused by practical limitations with regards to the available movement data,
with durations of the tracking period rarely exceeding two weeks (Shen and Sto-
pher 2017). There are, however, also examples of longitudinal analyses of travel
behaviour (e.g. Hanson and Huff 1988; Schlich and Axhausen 2003; Stopher et al.
2013; Gonzalez et al. 2008; Song et al. 2010). These studies were mostly concerned
with detecting day-to-day variations, stability measures, and statistical properties of
mobility behaviour from movement data of various kinds, such as those obtained
with GSM or GPS, or traditional travel survey methods. While GSM data typically
cover long durations and large numbers of users, transport surveys and GPS recordings
stem from far fewer persons over the course of merely a few weeks. Gonzalez
et al. (2008), for instance, developed an aggregated model of human mobility
based on extensive mobile phone data, and found strong inter-personal regularities,
but did not distinguish between individual users or temporal changes. Schlich and
Axhausen (2003) report on different mobility indicators, and how they can be used
to compute similarity measures between mobility behaviour on two different days.
2.2 Inducing Change in Human Mobility Behaviour
Apart from merely monitoring and analysing the status quo of mobility behaviour,
other studies have built on similar analytical methods to actively influence users
in order to make them travel in a more environmentally sustainable way. For this,
mobile applications and a feedback loop were used, with examples including Ubi-
Green (Froehlich et al. 2009), PEACOX (Montini et al. 2015), or GoEco! (Bucher
et al. 2016). In some cases, apart from merely summarizing the recorded mobil-
ity behaviour, the provided feedback also included the proposal of more sustainable
travel alternatives. At present, however, most approaches suffer from either short
study periods (Hamari et al. 2014), or from basing their feedback and suggestions for
more sustainable mobility on a single snapshot, for example data which was recorded
during a pre-study or a baseline-tracking phase. This shortcoming hinders the
development of long-running applications that continuously monitor mobility behaviour
and are thus able to provide feedback based on detected changes in current behavioural
patterns relative to past ones.
Thus, a system able to automatically detect changes in behaviour would be worthwhile.
Based on established models of behavioural change processes, it could then select
actions to be taken to support (in case of increased sustainability) or prevent
(in the opposite case) the observed behaviour change. A
commonly used psychological conceptualization is the Transtheoretical Model (Prochaska
and Velicer 1997), which separates behaviour change into precontemplation,
contemplation, preparation, action and maintenance phases. Upon detecting
a change in mobility, one could for instance infer that a user started contemplating
new behaviour, and support a transition towards this behaviour by supplying her with
information (e.g. Tulusan et al. 2012; Taniguchi et al. 2003), rewarding further good
choices (e.g. Ben-Elia and Ettema 2011), dissuading unsustainable behaviour (e.g.
Schade and Schlag 2003), or otherwise engaging and motivating her to move to the
preparation or action stage (Weiser et al. 2015). Alternatively, for users without
changes in mobility (one could argue they are in a precontemplation or maintenance
phase), a system might foster self-experience of travel alternatives (e.g. Abou-Zeid
et al. 2012; Bamberg et al. 2003; Bamberg 2006) in order to make them try out new
and more sustainable transport options.
Automatically exposing behaviour change is closely related to anomaly detec-
tion, the identification of deviations from a certain norm (Chandola et al. 2009). In
contrast to filtering out noise, in this case the focus of interest is usually placed on
the nature of the abnormalities themselves. In the transportation domain, researchers
have been interested in detecting anomalies in large collective mobility datasets (cf.
Souto and Liebig 2016; Yang and Liu 2011) for urban traffic applications and emer-
gency management. Another line of research considers (geometrical) pattern match-
ing on trajectory data (e.g. Florescu et al. 2012; Du Mouza et al. 2005), for example
by building a higher-order Markov model of a user’s transitions from one mobile
phone cell to another (Sun et al. 2004). The authors encode the individual patterns
in a mobility trie, which they in turn use to search for anomalies by computing
distances between previous and new, potentially anomalous patterns. They explicitly
note the importance of dynamically updating “normal behaviour”, and of weighting
recent patterns higher than ones which occurred longer ago. However, all these
approaches are based on a relatively crude assessment of mobility, which either only
considers transitions from one region to another, or aggregates data from many users
to get a complete view of the current traffic situation. For detecting individual
behaviour change over time, however, a method is needed which works with a continuous
stream of non-aggregated movement data on an individual level, and tests multiple
dimensions of mobility behaviour for anomalies by comparing them to the user’s
past behaviour.
3 Method
In this section, we present a system for detecting mobility behaviour change based
on a continuous stream of movement data from individual users. The proposed workflow
is illustrated in Fig. 1. We assume that a user’s raw movement trajectories,
recorded via a smartphone application or a similar device, are constantly streamed
to a server, and logged in a database. After a certain time period has passed (we
propose one week), the data recorded in this interval are fed into a data processing
engine, where they pass through four processing steps: first, the trajectories are pre-
processed, i.e. filtered, segmented, annotated with the traffic mode, and matched to
the road network. Then, the available data for this time period are tested for com-
pleteness in order to evaluate their sufficiency for the following analytical processes.
If found insufficiently complete, the data are discarded; if rated appropriate, however,
they are fed to the next module, which extracts a range of mobility features and
mines for patterns. The results are stored in a database, and provide the input for
an anomaly detection sub-process, which identifies behaviour change and triggers
an appropriate reaction. As can be seen on the far right of Fig. 1, this may involve
sending out notifications to the users or analysts, triggering a response (e.g.
encouraging or discouraging the observed behaviour change), logging the occurrence of
the anomaly, or providing information to an expert for decision support. The exact
nature of these system reactions, however, is beyond the scope of this paper. Instead,
since our focus is put on the data processing engine, its four sub-modules will be
further described in this section.
Fig. 1 Workflow
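The weekly control flow described in this section can be summarised in a short sketch. It is an illustration only, not the authors' implementation: the function name, the four callables and the two threshold parameters are hypothetical, and the text above does not state which completeness thresholds are used.

```python
from typing import Any, Callable, List, Optional, Tuple


def process_week(
    raw_trackpoints: list,
    preprocess: Callable[[list], Any],
    completeness: Callable[[Any], Tuple[float, float]],
    extract_features: Callable[[Any], dict],
    detect_anomalies: Callable[[dict, List[dict]], list],
    history: List[dict],
    max_gdur: float = 0.5,   # assumed threshold, not taken from the paper
    max_gdist: float = 0.5,  # assumed threshold, not taken from the paper
) -> Optional[list]:
    """Run one week of data through the four processing steps.

    Returns the detected anomalies, or None if the week was discarded
    because the recording was rated insufficiently complete.
    """
    # Step 1: filtering, segmentation, mode detection, map matching.
    week = preprocess(raw_trackpoints)

    # Step 2: completeness assessment; incomplete weeks are discarded.
    gdur, gdist = completeness(week)
    if gdur > max_gdur or gdist > max_gdist:
        return None

    # Step 3: mobility feature extraction and pattern mining.
    features = extract_features(week)

    # Step 4: compare the current week against the stored past weeks.
    anomalies = detect_anomalies(features, history)

    # Store the results so that later weeks can be compared against them.
    history.append(features)
    return anomalies
```

Passing the steps in as callables mirrors the remark in Sect. 3.1 that the individual methods could be replaced by other solutions.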
3.1 Data Preprocessing
As it has been described, movement data are continuously streamed to a server, and
logged in a database. In order to evaluate behavioural changes, however, it is neces-
sary to define discrete time intervals (in the following: one week), which will serve
as atomic units for later temporal analysis. Thus, after all available data for a full
week have been stored in the database, they are fed into the data processing engine
(cf. Fig. 1), and further analysed. In a first step, the data need to be preprocessed,
which involves the sub-processes noise filtering, stay point detection, segmentation,
mode detection, and map matching (Zheng 2015). Please note that whereas exem-
plary methods for these preprocessing steps are proposed in the following, they could
also be replaced by other solutions which are better suited to the respective study
aims or data characteristics.
In the beginning, the data are cleaned by removing noisy trackpoints based on
a set of filter functions such as a spatial query with a certain study area, or plausi-
bility checks with regards to speed constraints (Zheng 2015). Then, the stay points
are detected in the remaining trackpoints, e.g. by means of a clustering technique
(Palma et al. 2008). The next preprocessing step detects the traffic mode(s) used,
e.g. by computing and analysing various movement descriptors such as the speed or
acceleration (Sester et al. 2012). Finally, map matching needs to be performed for all
points using one of the available techniques, e.g. evolutionary algorithms (Quddus
and Washington 2015).
Fig. 2 The different layers of movement data aggregation used in this study. Note that
in contrast to “home” and “work”, the transition points between train, bus and tram are
not considered activities. Basemap © Mapbox.com
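As an illustration of the noise-filtering step, the following sketch applies a plausibility check based on a speed constraint. The speed limit of 70 m/s, the `TrackPoint` structure and the function names are assumptions made for this example. The sketch also assumes that the first trackpoint is valid, which a real filter would have to verify.

```python
import math
from dataclasses import dataclass


@dataclass
class TrackPoint:
    t: float    # time stamp in seconds
    lat: float  # latitude in degrees (WGS84)
    lon: float  # longitude in degrees (WGS84)


def haversine_m(a: TrackPoint, b: TrackPoint) -> float:
    """Great-circle distance between two trackpoints in metres."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def filter_by_speed(points, max_speed_ms=70.0):
    """Drop trackpoints that imply an implausible speed from the last kept point."""
    kept = []
    for p in points:
        if not kept:
            kept.append(p)  # assumption: the first point is trusted
            continue
        dt = p.t - kept[-1].t
        if dt <= 0:
            continue  # duplicate or out-of-order time stamp
        if haversine_m(kept[-1], p) / dt <= max_speed_ms:
            kept.append(p)
    return kept
```

Comparing each point with the last kept point rather than with its direct predecessor prevents a single outlier from also removing the valid point that follows it.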
After basic preprocessing, it is necessary to structure the movement data into
meaningful units. Inspired by prior approaches (Axhausen and Frick 2005), we pro-
pose to distinguish between the following elements: At the most fundamental level,
trajectories (the complete trace of a user’s movement over a given time frame) are
made up of trackpoints. In a first layer of aggregation, trackpoints are grouped into
trip legs based on the used transport mode. Finally, a trip consists of one or more
legs, and describes the journey from one ‘activity’ to another. A stay point simply
denotes a location where someone spent longer than a certain time span, and can
qualify as an activity if it represents an actual destination of travel (e.g. work, home
or a shop), and not merely a location where a user spent time waiting for a bus or
stuck in a traffic jam. Figure 2 shows an exemplary trip with its constituting elements.
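The layered structure described above (trackpoints, trip legs, trips) could be represented as follows. This is a minimal sketch under the assumption that every trackpoint already carries a mode label from the mode detection step. The class and field names are chosen for illustration and do not appear in the paper.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class TrackPoint:
    t: float   # time stamp in seconds
    x: float   # projected coordinate in metres
    y: float   # projected coordinate in metres
    mode: str  # detected transport mode (assumed to be available per point)


@dataclass
class TripLeg:
    mode: str
    points: List[TrackPoint]


@dataclass
class Trip:
    origin: str        # activity at which the trip starts, e.g. "home"
    destination: str   # activity at which the trip ends, e.g. "work"
    legs: List[TripLeg]


def split_into_legs(points: List[TrackPoint]) -> List[TripLeg]:
    """Group consecutive trackpoints with the same mode into trip legs."""
    return [TripLeg(mode, list(group))
            for mode, group in groupby(points, key=lambda p: p.mode)]
```

Because `itertools.groupby` only merges consecutive items, a walk, tram, walk sequence yields three legs rather than two, which matches the definition of a trip leg as a mono-modal segment.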
3.2 Data Completeness Assessment
After preprocessing, the available data for the current week are tested for their com-
pleteness. As has been discussed in Sect. 2.1, missing trips or other gaps in recording
can have negative effects on downstream analysis processes (e.g. Shen and Stopher
2017; Wolf et al. 2003). In our case, for instance, missing data, if not identified and
filtered previously, might result in misdetections of behaviour changes due to drasti-
cally altered values for mobility features. Please note that in this step, we assume the
norm to be continuous tracking over the whole study period, as is often the case
in related surveys (e.g. Montini et al. 2015; Bucher et al. 2016).
As a first step, we distinguish between different types of recording gaps:
Temporal gaps: the duration with no recorded data between the last recorded time
stamp of a trip leg or stay point and the first recorded time stamp of the subsequent
trip leg or stay point. The spatial distance between the position of the last track
point of the former and the first track point of the latter trip leg or stay point is
smaller than an expected GPS error (e.g. 250 m).
Spatio-temporal gaps: gaps for which the spatial distance between the last track
point of the former and the first track point of the latter trip leg or stay point is
larger than an expected GPS error.
This distinction is motivated by the fact that in the first case, chances are high that
no mobility behaviour has been missed since the user might simply have remained
stationary during the recording gap, whereas in the second case, the user’s change
in position proves that movement has certainly taken place but was not recorded.
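The two gap types defined above can be told apart with a simple pass over the time-ordered trip legs and stay points. The sketch below assumes projected coordinates in metres and uses the 250 m example value as the expected GPS error. The `Segment` and `Gap` structures and the `min_duration_s` parameter are hypothetical. A distance exactly equal to the threshold is treated as a temporal gap here, a case which the definitions leave open.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """A trip leg or stay point, reduced to what gap detection needs."""
    t_start: float                 # first recorded time stamp (s)
    t_end: float                   # last recorded time stamp (s)
    start_xy: Tuple[float, float]  # first track point (projected, m)
    end_xy: Tuple[float, float]    # last track point (projected, m)


@dataclass
class Gap:
    duration_s: float
    distance_m: float
    kind: str  # "temporal" or "spatio-temporal"


def find_gaps(segments: List[Segment], gps_error_m: float = 250.0,
              min_duration_s: float = 0.0) -> List[Gap]:
    """Classify the recording gaps between subsequent segments."""
    ordered = sorted(segments, key=lambda s: s.t_start)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        duration = nxt.t_start - prev.t_end
        if duration <= min_duration_s:
            continue  # segments are contiguous, nothing is missing
        distance = math.dist(prev.end_xy, nxt.start_xy)
        kind = "spatio-temporal" if distance > gps_error_m else "temporal"
        gaps.append(Gap(duration, distance, kind))
    return gaps
```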
Both types of gaps can be easily extracted from the database by calculating the
time differences as well as spatial distances between the start and end points of
subsequent pairs of trip legs and stay points. The data completeness for the current time
interval can then be evaluated based on two index values:
\[ \mathit{gdur}_i = \frac{\Delta g_i}{\Delta t_i} \]
\[ \mathit{gdist}_i = \frac{\mathrm{dist}(g_i)}{\mathrm{dist}(\mathit{triplegs}_i)} \]
where $\mathit{gdur}_i$ is the ratio of the summed durations $\Delta g_i$ of all temporal and
spatio-temporal gaps $g_i$ and the total duration $\Delta t_i$ of week $i$. In the second index,
$\mathit{gdist}_i$ is the ratio of the summed distances $\mathrm{dist}(g_i)$ of all spatio-temporal
gaps $g_i$ and the summed distances $\mathrm{dist}(\mathit{triplegs}_i)$ of all trip legs
$\mathit{triplegs}_i$ recorded within week $i$. In combination, these index values express the
temporal extent of recording gaps, as well as the relative magnitude of missed mobility
behaviour. For instance, in a week in which a user has travelled relatively less compared
to others, recording gaps of similar temporal length can be rated as less critical, since
less travelled distance, i.e. mobility behaviour, might be missing in the data.
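A minimal sketch of how the two index values could be computed from already extracted gaps and trip legs is given below. The function and parameter names are illustrative. The handling of a week without any recorded trip legs is an assumption, since this case is not covered by the definition above.

```python
import math

WEEK_SECONDS = 7 * 24 * 3600  # total duration of one week


def completeness_indices(gap_durations_s, st_gap_distances_m,
                         tripleg_distances_m, interval_s=WEEK_SECONDS):
    """Return (gdur, gdist) for one time interval.

    gap_durations_s:     durations of all temporal and spatio-temporal gaps
    st_gap_distances_m:  distances bridged by the spatio-temporal gaps only
    tripleg_distances_m: distances of all recorded trip legs
    """
    gdur = sum(gap_durations_s) / interval_s

    missed = sum(st_gap_distances_m)
    recorded = sum(tripleg_distances_m)
    if recorded > 0:
        gdist = missed / recorded
    else:
        # Assumption: without recorded trip legs the ratio is undefined, so
        # any missed distance is rated as maximally critical.
        gdist = math.inf if missed > 0 else 0.0
    return gdur, gdist
```

For example, a week with gaps of 24 h and 18 h yields a duration index of 42 h / 168 h = 0.25, and a missed distance of 5 km against 20 km of recorded trip legs yields a distance index of 0.25.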
3.3 Mobility Feature Extraction and Pattern Mining
After the available data has been confirmed to be of sufficient completeness, selected
mobility features can be extracted. Of course, these will depend to a large degree on
the study aims. As our focus is on sustainability, we compute durations, distances,
speed, and produced CO2 emissions for each trip leg to serve as a basis for computing
the indicators listed in Sect. 2.1. Next, in addition to segmenting the movement
trajectories based on their semantics (e.g. trip legs by traffic mode, trips between
activities), as described in Sect. 3.1, we also induce a temporal structure by group-
ing all movement on a daily basis. Of course, the pre-defined discrete time interval at
which the data is processed (here: one week) provides a further temporal analytical
unit.
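The daily grouping can be sketched as follows. The example assigns each trip leg to the calendar day on which it starts, which is one possible way of handling trips that extend beyond the temporal delimitation. This rule, the input format (pairs of start time stamp and trip leg) and the use of UTC are assumptions of this sketch.

```python
from collections import defaultdict
from datetime import datetime, timezone


def group_by_day(legs, tz=timezone.utc):
    """Group (start_timestamp, leg) pairs by the calendar day of their start.

    start_timestamp is given in seconds since the Unix epoch; tz is the time
    zone used to decide where one day ends and the next begins.
    """
    days = defaultdict(list)
    for start_ts, leg in legs:
        day = datetime.fromtimestamp(start_ts, tz).date()
        days[day].append(leg)
    return dict(days)
```

In practice, the time zone of the study area rather than UTC would be passed in, so that the day boundaries match the users' local midnight.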
Table 1 Units of analysis for deriving mobility features and patterns
Analysis unit | Delimiting factor | Description
Trip leg | Transport mode/vehicle | Mono-modal trip segment between two points without changing mode or vehicle
Trip | Purpose | Trip between two locations for a certain purpose; consists of one or more trip legs
Day | Time | All trips within 24 h; contains one or more complete or incomplete travels (incomplete: beyond temporal delimitation)
Week | Time | All trips within 7 consecutive days
Table 2 Mobility features
Descriptor Day Week
Total number of trips x
Average number of trip legs per trip x
Total distance travelled x
Total distance travelled (per trip purpose) x
Total distance travelled (per traffic mode) x
Average distance travelled x x
Total duration spent travelling x
Total duration spent travelling (per trip purpose) x
Total duration spent travelling (per traffic mode) x
Average duration spent travelling x x
Total CO2 emissions x
Average travel speed x
Average travel speed (per traffic mode) x
Frequently visited places x
The resulting analytical units for computing mobility features are summarized in
Table 1.
For assessing the sustainability of the user’s mobility behaviour within the week,
we compute a set of indicators (Nicolas et al. 2003; World Business Council 2015)
as listed in Table 2.
Whereas the first three indicators can be easily extracted from the preprocessed
data, several others require a classification of the stay points and their related trips
according to their purpose. Purpose and activity detection can either be achieved by
computational methods, e.g. based on visited POI (e.g. Furletti et al. 2013), or by
simply asking the users to annotate the data manually in the course of an accompanying
PR survey. The total CO2 emissions produced by travelling depend primarily
on the modal split, and can for instance be calculated based on the Mobitool con-
sumption and emission factors (Tuchschmid and Halder 2010), which provide the
consumption and emissions of the full life-cycle of a mode of transport per single
kilometre travelled in Switzerland.
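As a minimal sketch of how some of the weekly indicators in Table 2 could be derived, the following Python snippet aggregates trip legs into totals per mode and converts the distances into CO2 emissions. The record layout (`legs`, `mode`, `distance_km`, `duration_h`), the function name `weekly_features` and the emission factors are assumptions made for illustration. In particular, the factors are placeholders and not the Mobitool values. Average speed is taken here as total distance divided by total duration, which is only one of several possible definitions.

```python
from collections import defaultdict

# Placeholder emission factors in g CO2 per km. These are NOT Mobitool
# values; they are invented so that the example runs.
EMISSION_FACTORS_G_PER_KM = {"car": 200.0, "bus": 100.0, "bicycle": 0.0, "walk": 0.0}


def weekly_features(trips, factors=EMISSION_FACTORS_G_PER_KM):
    """Aggregate one week of trips into a dictionary of mobility features.

    `trips` is a list of dicts, each with a "legs" list; every leg has a
    "mode", a "distance_km" and a "duration_h" (hypothetical layout).
    """
    legs = [leg for trip in trips for leg in trip["legs"]]

    distance_by_mode = defaultdict(float)
    duration_by_mode = defaultdict(float)
    for leg in legs:
        distance_by_mode[leg["mode"]] += leg["distance_km"]
        duration_by_mode[leg["mode"]] += leg["duration_h"]

    total_distance = sum(distance_by_mode.values())
    total_duration = sum(duration_by_mode.values())

    # Emissions depend on the modal split: distance per mode times a factor.
    # A mode without a factor raises KeyError rather than being ignored.
    co2_kg = sum(km * factors[mode] for mode, km in distance_by_mode.items()) / 1000.0

    return {
        "total_trips": len(trips),
        "avg_legs_per_trip": len(legs) / len(trips) if trips else 0.0,
        "total_distance_km": total_distance,
        "distance_km_by_mode": dict(distance_by_mode),
        "total_duration_h": total_duration,
        "duration_h_by_mode": dict(duration_by_mode),
        "avg_speed_kmh": total_distance / total_duration if total_duration else 0.0,
        "total_co2_kg": co2_kg,
    }
```

Features per trip purpose could be obtained in the same way by grouping on a purpose attribute of each trip instead of on the mode of each leg.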
Finally, although not being directly related to sustainability, the frequently vis-
ited places are nevertheless included in the list of mobility features. This is due to
the fact that this attribute allows for drastic changes in the personal circumstances
to be detected (e.g. moving to a different city). Thus, if other indicators such as the
CO2 emissions change, but the visited places remain unaltered, this could indicate
that a user is testing new travel options (e.g. taking the bicycle to work) while her
circumstances remain the same. For mining the frequently visited places in a way
which allows them to be compared to the results obtained for previous weeks, we
choose a clustering approach. Using the DBSCAN algorithm (Sander et al. 1998),
we cluster all activities found during the week. Even if a user visited the same
place as in the week before, the recorded activities and their associated point
geometries will not coincide exactly. We therefore compute a minimum bounding
geometry of the points based on
their cluster membership. In order to avoid creating multiple instances of the “same”
place in the database, the resulting polygon is tested for overlaps with already exist-
ing places in the database. If an overlap is found, no new place instance is created,
but rather the id of the overlapping place in the database is extracted and stored in
a list of frequently visited places for the current week. If no overlap with already
existing places is detected, a new place instance is created in the database, and a new
id is assigned. Figure 3 shows an example of activities and the overlapping cluster
geometries from different weeks for one user. Since they all overlap, only the first
occurrence would be created as an instance and assigned an id. For all the other
clusters, only the information that the place has been visited frequently enough to
be detected as a cluster would be stored together with its id. After computation, the
results for all indicators are stored in a database (see Fig. 1).
Fig. 3 The activities are shown on top of the overlapping minimum bounding polygons, as derived
from the point clusters at different weeks
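The place-matching step described above can be sketched as follows. The snippet assumes that cluster labels are already available (e.g. from DBSCAN) and, as a simplification, uses axis-aligned bounding boxes in place of a general minimum bounding geometry. The names `Box`, `bounding_box` and `match_places`, as well as the in-memory dictionary standing in for the database, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box, used here instead of a general polygon."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def overlaps(self, other):
        # Two boxes overlap if their extents intersect on both axes.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


def bounding_box(points):
    """Smallest axis-aligned box containing all (x, y) points of a cluster."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def match_places(clusters, places):
    """Return the ids of the places visited frequently in the current week.

    `clusters` maps a cluster label to its list of (x, y) activity points.
    `places` maps a place id to its stored Box and is updated in place,
    standing in for the place table in the database.
    """
    visited = []
    for points in clusters.values():
        box = bounding_box(points)
        # Reuse the id of the first stored place that overlaps this cluster.
        place_id = next(
            (pid for pid, known in places.items() if box.overlaps(known)), None
        )
        if place_id is None:
            # No overlap found: create a new place instance with a new id.
            place_id = len(places) + 1
            places[place_id] = box
        visited.append(place_id)
    return visited
```

Calling `match_places` once per week with the same `places` dictionary yields, for each week, a list of place ids that can be compared across weeks.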
3.4 Anomaly Detection
At the present stage of the workflow, mobility features and patterns have been
detected and stored for the current week. Now, it is possible to load similar data
computed for the previous weeks from the database, and assess potential anomalies
in mobility behaviour (see Fig. 1). Numerous algorithms available for anomaly detec-
tion simply classify individual data points (in our case, aggregations of all mobility
features for the current week) as anomalous or normal, without allowing further
insight into which feature exactly caused the data point to be classified as anom-
alous (cf. Chandola et al. 2009). This knowledge, however, is critical for our pur-
poses since merely knowing that an anomaly occurred is not sufficient, but rather the
results should allow deeper interpretation of the detected behaviour change. Thus, to
decide which system action should be triggered as a reaction, it is critical to explicitly
identify the mobility features which have changed, i.e. were detected as anomalous.
For instance, an increase in bicycling distance could trigger encouraging feedback,
whereas an increase in CO2 production could lead to a discouraging response. There
is work on explaining anomalies in more detail after their detection (e.g. Pevný and
Kopp 2014), which could therefore be used in combination with any anomaly detection
algorithm. For our purpose, we found this unnecessary and rather detect anomalies
for each feature individually.
For each mobility feature $f_i$ (except the frequently visited places, which will be
explained separately) we compute the mean $\mu_i$ and standard deviation $\sigma_i$ of the $n$
weeks preceding the week currently under investigation, where $n$ is a tunable window
size (set to 5 weeks in our tests). Comparing the values computed for the current
week, it is now possible to assess if an existing deviation should be considered a normal
fluctuation or an anomaly. This is controlled by another parameter $\lambda$, i.e. a feature
$f_i$ is considered anomalous if $|f_i - \mu_i| > \lambda \sigma_i$. Accordingly, if the feature re-centred
around zero has a deviation larger than what can be expected given previous feature
values, it is treated as anomalous. We found a value of $\lambda = 3$ to yield reasonable
results.
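The per-feature test can be illustrated with a short Python sketch. The function names and the dictionary layout of the weekly feature values are assumptions, as is the use of the population standard deviation (the text does not state which estimator is used). The numbers in the usage example are invented.

```python
from statistics import mean, pstdev


def is_anomalous(current, history, lam=3.0):
    """Return True if |f_i - mu_i| > lambda * sigma_i.

    `history` holds the values of the same feature in the preceding weeks.
    The population standard deviation is an assumption of this sketch.
    """
    mu = mean(history)
    sigma = pstdev(history)
    return abs(current - mu) > lam * sigma


def anomalous_features(current_week, previous_weeks, n=5, lam=3.0):
    """Names of the features of `current_week` flagged as anomalous.

    `current_week` maps feature names to values; `previous_weeks` is a
    chronologically ordered list of such dictionaries. Only the last `n`
    weeks form the reference window.
    """
    window = previous_weeks[-n:]
    if not window:
        # Nothing to compare against in the very first week.
        return []
    flagged = []
    for name, value in current_week.items():
        history = [week[name] for week in window]
        if is_anomalous(value, history, lam):
            flagged.append(name)
    return flagged


# Usage with invented values: only the bus distance deviates strongly.
previous = [
    {"bus_km": 7.0, "walk_km": 18.0},
    {"bus_km": 6.5, "walk_km": 19.0},
    {"bus_km": 7.5, "walk_km": 18.5},
    {"bus_km": 7.2, "walk_km": 18.2},
    {"bus_km": 6.8, "walk_km": 18.9},
]
flagged = anomalous_features({"bus_km": 33.2, "walk_km": 18.6}, previous)
```

Because the comparison is written without dividing by $\sigma_i$, a feature that was constant over the whole window is flagged as soon as it changes at all, and is not flagged if it stays the same.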
To compute if a set of frequently visited places within a week should be consid-
ered anomalous, a similar approach is applied. We encode the presence of a certain
place in a given week with a 1, and its absence with a 0. For every place, this results
in a list of binary digits, e.g. the sequence (0,0,1,0,1) encodes a place being visited
in weeks 3 and 5, but not in any other week. Using this numerical representation, we
can compute if the appearance of an individual place in any week should be consid-
ered anomalous or not by using a similar technique as above. However, as this results
in every place being an additional mobility feature (which results in frequent cases
with a large number of anomalous features), we sum the number of anomalous places
in every week, and perform another anomaly detection process on the resulting val-
ues. For example, a person frequently travelling for work purposes will constantly
yield high numbers of anomalous places (i.e. first time visits at new places), a fact
which is not particularly useful in terms of behaviour change detection. If, however,
this number drops suddenly, and the visited places show a more regular pattern,
it signals a behavioural change (which could be due to holidays, a job change, etc.).
Summarizing anomalies in the frequently visited places as described allows us to
handle them as a single mobility feature, and to report their anomalies for further
interpretation by an automated system or an expert.
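The two-stage procedure for places can be sketched in Python as follows. The function names and the dictionary of 0/1 sequences are hypothetical, and the population standard deviation is again an assumption. The first stage counts, for every week, the places whose presence or absence deviates from the preceding window; the second stage applies the same test to the resulting counts.

```python
from statistics import mean, pstdev


def is_anomalous(value, history, lam=3.0):
    """Same test as for the other features: |f - mu| > lambda * sigma."""
    return abs(value - mean(history)) > lam * pstdev(history)


def anomalous_place_counts(visits, n=5, lam=3.0):
    """Number of anomalous places per week.

    `visits` maps a place id to a list of 0/1 flags, one per week and all of
    equal length (1 = place detected as frequently visited in that week).
    """
    n_weeks = len(next(iter(visits.values())))
    counts = []
    for week in range(n_weeks):
        start = max(0, week - n)
        count = 0
        for flags in visits.values():
            history = flags[start:week]
            # The first week has no history and therefore no anomalies.
            if history and is_anomalous(flags[week], history, lam):
                count += 1
        counts.append(count)
    return counts


def place_feature_is_anomalous(counts, n=5, lam=3.0):
    """Second stage: is the latest count unusual compared to earlier counts?"""
    history = counts[-(n + 1):-1]
    return bool(history) and is_anomalous(counts[-1], history, lam)


# Invented example: in the last week the office is skipped and a new place appears.
visits = {
    "home": [1, 1, 1, 1, 1, 1],
    "office": [1, 1, 1, 1, 1, 0],
    "new_place": [0, 0, 0, 0, 0, 1],
}
counts = anomalous_place_counts(visits)
```

In this sketch, with $\lambda = 3$ and $n = 5$, a place is only flagged when its presence breaks a constant history within the window, since any mixed 0/1 history yields a value of $3\sigma_i$ that exceeds the largest possible deviation.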
4 Case Study
We implemented the described method as a Python application (using a PostgreSQL
database with the PostGIS extension for all spatial operations), and evaluated it on
a large dataset collected over a period of three months, from approximately mid-December
2016 to March 2017. 139 people used a smartphone tracking app, which
passively recorded all their journeys, inferred a transport mode, and allowed them to
change it in case the proposed one was wrong. The dataset consists of 52’370’797
trackpoints, which are divided into 125’759 trip legs and 71’099 trips.
Using these data, we simulated a continuous data stream by feeding data for
each week subsequently into the data processing engine. Below, the results for our
mobility behaviour change detection process are provided for two exemplary users.
Figures 4 and 5 show the detected anomalies for these users per week. The blue dots
indicate the number of anomalies for each week, while the yellow dots show the
number of anomalies with regards to frequently visited places. Please note that this
does not correspond to the total number of places visited by a user, but only to those
that were unexpectedly visited or skipped in the respective week. Not surprisingly,
the place-related anomalies are relatively more frequent in the first weeks, which is
due to the cold start problem, i.e., sparse data making it difficult to assess whether
a frequently visited place should represent an anomaly. Weeks which are missing
values were filtered out previously, due to insufficient data completeness. For this,
Fig. 4 All (blue) and only place-related (yellow) anomalies for user A of our test sample. In weeks
2016-50, 2016-52, and 2017-02, the data completeness was found insufficient to reliably assess
mobility behaviour patterns
Fig. 5 All (blue) and only place-related (yellow) anomalies for user B of our test sample. In week
2017-07, the data completeness was found insufficient to reliably assess mobility behaviour patterns
we defined threshold values so that data for weeks were only further analysed if their
gduri0.25 and gdisti0.25.
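Replaying a pre-recorded dataset as a weekly stream can be sketched as below. The snippet assumes that week labels such as 2017-06 denote ISO calendar weeks and that each record carries a `timestamp` field; both are assumptions of this example, as are the function names.

```python
from collections import defaultdict
from datetime import datetime


def week_key(ts):
    """ISO year and week of a timestamp, formatted like '2017-06'."""
    iso = ts.isocalendar()
    return f"{iso[0]}-{iso[1]:02d}"


def replay_by_week(records):
    """Yield (week label, records of that week) in chronological order.

    `records` is any iterable of dicts with a datetime under "timestamp".
    """
    buckets = defaultdict(list)
    for record in records:
        buckets[week_key(record["timestamp"])].append(record)
    # Zero-padded labels sort chronologically as plain strings.
    for key in sorted(buckets):
        yield key, buckets[key]


records = [
    {"timestamp": datetime(2017, 2, 8, 8, 30)},
    {"timestamp": datetime(2016, 12, 15, 17, 0)},
    {"timestamp": datetime(2017, 2, 9, 7, 45)},
]
weeks = list(replay_by_week(records))
```

Each yielded batch would then pass through completeness assessment, feature extraction and anomaly detection before the next week is read.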
The mobility behaviour of user A, whose anomalies are shown in Fig. 4, remains
rather constant up until calendar week 2017-06, where several anomalies are detected.
Whereas in that week, only the average walking speed is noticeably higher compared
to preceding weeks, in the following week 2017-07 we detect an increase in the distance
($\mu_d = 7.0$ km → $f_d = 33.2$ km) and duration ($\mu_t = 19$ min → $f_t = 1$ h 41 min)
of travels made by bus. In week 2017-08, one can observe an additional increase in
distance and duration of both walking (18.6 km → 58.4 km; 2 h 38 min → 9 h 43 min)
and bicycling (1.9 km → 31.2 km; 5 min → 1 h 37 min). Since, in contrast
to these anomalies, the frequently visited places still remain largely unchanged
compared to the weeks before, we can conclude that this user indeed changed her
mobility behaviour by increasingly using slow mobility (walking and bicycling) and
public transport. An automated feedback system as described previously could now
trigger reinforcing measures for this behaviour, e.g., by providing incentives, and
thus assisting the user to transition to a phase where this new mobility behaviour is
internalized and does not require further motivation.
The results for user B are shown in Fig. 5. Here, changes in mobility behaviour
can be observed between weeks 2017-05 and 2017-08, which in this case, however,
originate from increases in the total distance travelled (e.g., 690 km → 1’836 km),
the average speed (41.4 km/h → 97.1 km/h), the distance covered by car (307 km →
1’091 km), bike (1.1 km → 14.5 km) and walking (12.3 km → 34.1 km), as well as the
related durations (plus the duration spent travelling by tram in week 2017-06). Based
on the observation of such a general increase in mobility activities (not just one spe-
cific mode of transport), and set in combination with the occurrence of several place-
related anomalies in weeks 2017-06 and 2017-08, one can interpret this pattern as
an exceptional change of behaviour likely caused by altered personal circumstances,
e.g., a holiday or business trip, rather than the gradual formation of a new habit.
Indeed, when analysing the movement data for this user in more detail, we found
several long distance car journeys with destinations outside of Switzerland during
the respective weeks. Furthermore, in the user’s home canton, the weeks 2017-07
and 2017-08 are usually winter holidays. This would also explain the observed data
incompleteness in week 2017-07, since the smartphone tracking method deployed in
this study relies on a mobile data connection, which is often unavailable when travel-
ling abroad. In this case, an automated system reaction could be to rate the detected
changes as likely temporary, and ignore them for the time being.
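The reasoning applied to users A and B can be condensed into a simple rule, sketched here in Python. This is not part of the implemented system; the function name, the return labels and the strict two-way split are assumptions that merely restate the interpretation used in the two examples.

```python
def interpret_week(feature_anomalies, place_anomaly):
    """Rough interpretation of one week's anomaly report.

    `feature_anomalies` is a list of anomalous feature names (excluding
    places); `place_anomaly` tells whether the summarised place feature was
    flagged. The labels are illustrative only.
    """
    if not feature_anomalies and not place_anomaly:
        return "no change"
    if feature_anomalies and not place_anomaly:
        # Travel changed while the visited places stayed the same (user A).
        return "possible behaviour change"
    # Visited places changed as well (user B): likely altered circumstances,
    # e.g. a holiday or business trip, which may be temporary.
    return "possible change in circumstances"
```

For instance, `interpret_week(["bus_km", "bicycle_km"], False)` yields "possible behaviour change", which could trigger reinforcing feedback, whereas a report with place-related anomalies could be ignored for the time being.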
5 Discussion and Conclusion
In this study, we proposed a framework for continuously mining streams of movement
trajectory data of users for detecting mobility behaviour changes. As discussed,
after data preprocessing, the completeness of the available movement
recordings needs to be assessed in order to avoid misdetections of behavioural
anomalies in the later steps of the analysis process. For this purpose, we presented a
solution for quantifying recording gaps, thereby distinguishing between purely temporal
and spatio-temporal gaps. Furthermore, we calculated a list of mobility features
to serve as sustainability indicators, and proposed a method to compute and evalu-
ate frequently visited places. Finally, the anomaly detection process was described
which yields detailed results with regards to the exact mobility feature causing the
anomaly occurrence. By applying the framework to a simulated stream based on
a pre-recorded large-scale trajectory dataset, and evaluating the plausibility of the
results obtained for two exemplary users, we could demonstrate its functionality and
practical value.
In our view, this work provides a first step towards the development of person-
alized, automated mobility support systems which provide adaptive intervention
strategies for gradually changing people’s mobility behaviour towards greater sustainability.
The proposed framework, however, is not restricted to this application
domain, but could be applied for other purposes as well, e.g. for general monitoring
of mobility behaviour and computing descriptive statistics, or for detecting anom-
alies in the movements of animals or even automated vehicles or drones. A practical
advantage of our approach worth mentioning is the fact that whereas the derived
mobility feature values are stored for every week (feature and pattern log in Fig. 1),
the actual movement data (movement data log in Fig. 1) can be deleted immediately
after processing. This not only reduces the resources necessary for data storage, but
also addresses privacy concerns, since the most sensitive data are deleted regularly.
There are, however, still some limitations to our approach. Thus, although the
most sensitive movement data can be deleted after analysis, there still remain con-
cerns with regards to location privacy. With mobile devices constantly gaining in
computation and storage capabilities, however, a potential solution could be to shift
critical parts of the analytical process to the client, and simply transmit the computed
index values to the server for anomaly detection. Moreover, the list of used sustain-
ability indicators is not exhaustive, and more complex values, e.g. incorporating car
occupancy, would increase the realism with which sustainability is quantified in our
study. These restrictions, however, largely depend on the quality and level of detail of
the available data. Furthermore, in the exemplary application of our system, we could
clearly observe problems for the first iterations due to the cold start problem, which
is a usual challenge for user profiling and sequence mining applications. The useful-
ness of our system would therefore be reduced to a certain degree in the first phase of
application. In addition, it would certainly be worthwhile to include more detailed
mobility features, e.g. the usual times of travel, distinguish between the weekend
and working days, or incorporate contextual information (e.g. the weather) for better
results. However, special care needs to be taken with correlated features (e.g.,
distance and duration), as they would be flagged as anomalous in the same weeks,
thus leading to a wrong assessment of behaviour change. At the same time, it can be
expected that an increase in the number of features could complicate their semantic
interpretation. Decision support, e.g. in the form of automated feature classification
could therefore be worthwhile. Finally, due to the fact that at the current stage of this
study, we have no access to ground truth data with regards to the behavioural anom-
alies (e.g. in the form of user interviews), a systematic evaluation of the proposed
method must be regarded as future work.
Apart from testing and evaluating the model with a subset of users who can pro-
vide additional information with regards to their mobility behaviour, it is planned to
refine the list of mobility features and develop a prototype of an expert system capa-
ble of interpreting the detected behavioural changes. It would also be interesting to
apply a semantic perspective to the interpretation of place-related anomalies, e.g. by
incorporating POI from additional data sources to assess the type of places visited.
Acknowledgements This research was supported by the Swiss National Science Foundation (SNF)
within NRP 71 “Managing energy consumption” and by the Commission for Technology and Inno-
vation (CTI) within the Swiss Competence Center for Energy Research (SCCER) Mobility.
References
Abou-Zeid M, Witter R, Bierlaire M, Kaufmann V, Ben-Akiva M (2012) Happiness and travel mode
switching: findings from a Swiss public transportation experiment. Transport Policy 19(1):93–
104
Andrienko G, Andrienko N, Fuchs G (2016) Understanding movement data quality. J Loc Based
Serv 10(1):31–46
Axhausen KW, Frick M (2005) Nutzungen—Strukturen—Verkehr
Bamberg S (2006) Is a residential relocation a good opportunity to change peoples travel behavior?
results from a theory-driven intervention study. Env Behav 38(6):820–840
Bamberg S, Rölle D, Weber C (2003) Does habitual car use not lead to more resistance to change
of travel mode? Transportation 30(1):97–108
Banister D (2008) The sustainable mobility paradigm. Transport policy 15(2):73–80
Ben-Elia E, Ettema D (2011) Changing commuters behavior using rewards: a study of rush-hour
avoidance. Trans Res Part F Traffic Psychol Behav
Boulouchos K, Cellina F, Ciari F, Ciari F, Cox B, Georges G, Hirschberg S, Hoppe M, Jonietz
D, Kannan R, Kovacz N, Küng L, Michl T, Raubal M, Rudel R, Schenler W (2017) Towards
an energy efficient and climate compatible future swiss transportation system. SCCER mobility
working paper
Brunauer R, Rehrl K (2016) Big data in der mobilität–FCD modellregion salzburg. In: Big Data,
pp 235–267. Springer
Bucher D, Cellina F, Mangili F, Raubal M, Rudel R, Rizzoli RE, Elabed O (2016) Exploiting fit-
ness apps for sustainable mobility-challenges deploying the Goeco! app. ICT for sustainability
(ICT4S)
Bundesamt fuer Umwelt (BAFU), Treibhausgasemissionen der Schweiz 1990–2014
Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM Comput Surv
(CSUR) 41(3):15
Du Mouza C, Rigaux P, Scholl M (2005) Efficient evaluation of parameterized pattern queries. In:
Proceedings of the 14th ACM international conference on information and knowledge manage-
ment, pp 728–735. ACM
Feng Z, Zhu Y (2016) A survey on trajectory data mining: techniques and applications. IEEE Access
4:2056–2067
Florescu S, Körner C, Mock M, May M (2012) Efficient mobility pattern stream matching on mobile
devices. In: Proceedings of the ubiquitous data mining workshop (UDM 2012), pp 23–27
Froehlich J, Dillahunt T, Klasnja P, Mankoff J, Consolvo S, Harrison B, Landay JA (2009) Ubi-
green: investigating a mobile tool for tracking and supporting green transportation habits. In:
Proceedings of the sigchi conference on human factors in computing systems, pp 1043–1052.
ACM
Furletti B, Cintia P, Renso C, Spinsanti L (2013) Inferring human activities from GPS tracks. In:
Proceedings of the 2nd ACM SIGKDD international workshop on urban computing—13. Asso-
ciation for Computing Machinery (ACM)
Gonzalez MC, Hidalgo CA, Barabasi A-L (2008) Understanding individual human mobility pat-
terns. Nature 453(7196):779–782
Hamari J, Koivisto J, Pakkanen T (2014) Do persuasive technologies persuade?-a review of empir-
ical studies. In: International conference on persuasive technology, pp 118–136. Springer
Hanson S, Huff OJ (1988) Systematic variability in repetitious travel. Transportation 15(1):111–135
Hecker D, Stange H, Korner C, May M (2010) Sample bias due to missing data in mobility surveys.
In: 2010 IEEE International conference on data mining workshops, Dec, pp 241–248
Kohla B, Meschik M (2013) Comparing trip diaries with gps tracking: Results of a comprehensive
austrian study. In: Transport survey methods: best practice for decision making, pp 305–320.
Emerald Group Publishing Limited
Li Q, Zheng Y, Xie X, Chen Y, Liu W, Ma W-Y (2008) Mining user similarity based on location
history. In: Proceedings of the 16th ACM SIGSPATIAL international conference on advances
in geographic information systems, p 34. ACM
Montini L, Prost S, Schrammel J, Rieser-Schüssler N, Axhausen KW (2015) Comparison of travel
diaries generated from smartphone data and dedicated GPS devices. Trans Res Proc 11:227–241
Nicolas J-P, Pochet P, Poimboeuf H (2003) Towards sustainable mobility indicators: application to
the lyons conurbation. Transport Policy 10(3):197–208
Palma AT, Bogorny V, Kuijpers B, Alvares LO (2008) A clustering-based approach for discover-
ing interesting places in trajectories. In: Proceedings of the 2008 ACM symposium on Applied
computing, pp 863–868. ACM
Pevný T, Kopp M (2014) Explaining anomalies with sapling random forests. In: Information
technologies - applications and theory workshops, posters, and tutorials (ITAT 2014)
Polak J, Han X (1997) Iterative imputation based methods for unit and item non-response in travel
surveys. In: 8th meeting of the international association of travel behaviour research. Austin,
Texas
Prelipcean AC, Gidofalvi G, Susilo YO (2015) Comparative framework for activity-travel diary
collection systems. In: 2015 International conference on, models and technologies for intelligent
transportation systems (MT-ITS), pp. 251–258. IEEE
Prochaska JO, Velicer WF (1997) The transtheoretical model of health behavior change. Am J
Health Promotion 12(1):38–48
Quddus M, Washington S (2015) Shortest path and vehicle trajectory aided map-matching for low
frequency gps data. Trans Res Part C Em Technol 55:328–339
Sander J, Ester M, Kriegel H-P, Xu X (1998) Density-based clustering in spatial databases: the
algorithm gdbscan and its applications. Data Mining Knowl Discovery 2(2):169–194
Schade J, Schlag B (2003) Acceptability of urban transport pricing strategies. Trans Res Part F
Traffic Psychol Behav 6(1):45–61
Schlich R, Axhausen KW (2003) Habitual travel behaviour: evidence from a six-week travel diary.
Transportation 30(1):13–36
Schüssler N (2008) Processing GPS raw data without additional information
Sester M, Feuerhake U, Kuntzsch C, Zhang L (2012) Revealing underlying structure and behaviour
from movement data. KI—Künstliche Intelligenz 26(3):223–231
Shen L, Stopher PR (2017) Review of GPS travel survey and GPS data- processing methods. Trans
Rev 1–19
Siła-Nowicka K, Vandrol J, Oshan T, Long JA, Demšar U, Fotheringham AS (2015) Analysis of
human mobility patterns from GPS trajectories and contextual information. Int J Geograph Inf
Sci 30(5):881–906
Song C, Qu Z, Blumm N, Barabási A-L (2010) Limits of predictability in human mobility. Science
327(5968):1018–1021
Souto G, Liebig T (2016) On event detection from spatial time series for urban traffic applications.
In: Solving large scale learning tasks. Challenges and algorithms, pp 221–233. Springer
Stenneth L, Wolfson O, Yu PS, Xu B (2011) Transportation mode detection using mobile phones
and GIS information. In: Proceedings of the 19th ACM SIGSPATIAL international conference
on advances in geographic information systems—GIS 11. Association for Computing Machinery
(ACM)
Stopher PR, Moutou CJ, Liu W (2013) Sustainability of voluntary travel behaviour change initia-
tives: a 5-year study
Sun B, Yu F, Wu K, Leung V (2004) Mobility-based anomaly detection in cellular mobile networks.
In: Proceedings of the 3rd ACM workshop on Wireless security, pp 61–69. ACM
Taaffe EJ (1996) Geography of transportation. Morton O’Kelly, New Jersey, USA
Taniguchi A, Hara F, Takano S, Kagaya S, Fujii S (2003) Psychological and behavioral effects of
travel feedback program for travel behavior modification. Trans Res Record J Trans Res Board
1839:182–190
Tuchschmid M, Halder M (2010) Mobitool-grundlagenbericht: Hintergrund, methodik & emis-
sionsfaktoren. Tuchschmid und M, Halder im Auftrag von SBB, Swisscom, BKW und ÖBU
Tulusan J, Steggers H, Fleisch E, Staake T (2012) Supporting eco-driving with eco-feedback tech-
nologies: recommendations targeted at improving corporate car drivers’ intrinsic motivation to
drive more sustainable. In: 14th ACM international conference on ubiquitous computing (Ubi-
Comp), p 18. Ubicomp
Wagner DP (1997) Lexington area travel data collection test: GPS for personal travel surveys. Final
report, office of highway policy information and office of technology applications. Federal High-
way Administration, Battelle Transport Division, Columbus, pp 1–92
Weiser P, Bucher D, Cellina F, De Luca V (2015) A taxonomy of motivational affordances for
meaningful gamified and persuasive technologies. In: Proceedings of the 3rd international con-
ference on ICT for sustainability (ICT4S), ser. Adv Comput Sci Res 22, pp 271–280. Atlantis
Press, Paris
White CE, Bernstein D, Kornhauser AL (2000) Some map matching algorithms for personal navi-
gation assistants. Trans Res Part C Em Technol 8(1):91–108
Wolf J, Loechl M, Thompson M, Arce C (2003) Trip rate analysis in GPS-enhanced personal travel
surveys. In: Transport survey quality and innovation. Emerald Group Publishing Limited, pp
483–498
World Business Council for Sustainable Development (WBCSD) (2015) Methodology and indica-
tor calculation method for sustainable urban mobility. WBCSD, Geneva, Switzerland
Yang S, Liu W (2011) Anomaly detection on collective moving patterns: a hidden Markov model
based solution. In: Internet of things (iThings/CPSCom), 2011 international conference on and
4th international conference on cyber, physical and social computing, pp 291–296. IEEE
Zheng VW, Cao B, Zheng Y, Xie X, Yang Q (2010) Collaborative filtering meets mobile recom-
mendation: a user-centered approach. In: Proceedings of the twenty-fourth AAAI conference on
artificial intelligence, ser. AAAI’10, pp 236–241. AAAI Press
Zheng Y (2015) Trajectory data mining. TIST 6(3):1–41
Zheng Y, Chen Y, Li Q, Xie X, Ma W-Y (2010) Understanding transportation modes based on GPS
data for web applications. ACM Trans Web 4(1):1:1–1:36
Article
Neural machine translation is a prominent field in the computational linguistics domain. By leveraging the recent developments of deep learning, it gave birth to powerful algorithms for translating text from one language to another. This study aims to assess the feasibility of transferring the neural machine translation approach into a completely different context, namely human mobility and trajectory analysis. Building a conceptual parallelism between sentences (sequences of words) and motion traces (sequences of locations), we aspire to translate individual trajectories generated by a certain category of users into the corresponding mobility traces potentially generated by a different category of users. The experiment is inserted in the background of tourist mobility analysis, with the goal of translating the motion behavior of tourists belonging to a specific nationality into the motion behavior of tourists belonging to a different nationality. The model adopted is based on the seq2seq approach and consists of an encoder–decoder architecture based on long short-term memory (LSTM) neural networks and neural embeddings. The encoder turns an input location sequence into a corresponding hidden vector; the decoder reverses the process, turning the vector into an output location sequence. The proposed framework, tested on a real-world large-scale dataset, explores an effective attempt of motion transformation between different entities, arising as a potentially powerful source of mobility information disclosure, especially in the context of crowd management and smart city services.
... Trajectory data are of the utmost importance. Travel recommendation [1], urban planning [2], traffic control [3], behavioral studies [4,100], and social friendship analysis [5] are some of their primary application areas. Commercial companies like Uber, Ola, and Didi exploit trajectory data to meet customer needs and gain a competitive edge. ...
Article
Recent explorative growth in telecommunication and telepathy technology has flooded the market with location-based data, which paves the way for location-aware prediction services. These applications have a vast influence across domains such as route navigation, recommendation systems, traffic-congestion control, ecological studies, climatological forecasting, and many more. Research efforts have been devoted to building an overall picture of location prediction from trajectory data. This survey offers an extensive overview of location prediction, covering basic definitions and concepts, data sources, approaches, and applications. Moreover, spatial–temporal pattern-based prediction models are discussed, highlighting the advantages and disadvantages of each. Advances in sequential, periodic, and frequent pattern mining are noted. This paper presents a recent deep learning methodology for extracting features of a large trajectory. Distributed big data models using the Hadoop and MapReduce frameworks are recorded. Location prediction using social media platforms is mentioned. Content-based and semantic mining models are studied. Tables and diagrams are provided to give an at-a-glance view and facilitate understanding. Furthermore, applications and challenges related to next-location prediction are addressed. The overall conclusion of the survey and future directions are also listed.
... The rise of motion data availability has boosted the interest in human mobility analysis, establishing various methods for trajectory data mining [27,28] to either describe the observable motion behavior [29] or to predict future activities [30]. ...
Article
The increasing availability of trajectory recordings has led to the mining of a massive amount of historical track data, allowing for a better understanding of travel behaviors by revealing meaningful motion patterns. In the context of human mobility analysis, the problem of motion prediction assumes a central role and is beneficial for a wide range of applications, including for touristic purposes, such as personalized services or targeted recommendations, and sustainability studies related to crowd management and resource redistribution. This paper tackles a particular case of the trajectory prediction problem, focusing on large-scale mobility traces of short-term foreign tourists. These sparse trajectories, short and non-repetitive, lack spatial and temporal regularity, making prediction analysis based on individual historical motion data unreliable. To face this issue, we hereby propose a deep learning-based approach, taking into account the collective mobility of tourists over the territory. The underlying semantics of motion patterns are captured by means of a long short-term memory (LSTM) neural network model trained on pre-processed location sequences, aiming to predict the next visited place in the trajectory. We tested the methodology on a real-world big dataset, demonstrating its higher feasibility with respect to traditional approaches.
... The second category is human activity recognition using location information in both continuous and non-continuous trajectories. (1) For continuous trajectories, these studies usually use conditional random fields, support vector machines, or Bayesian methods to identify travel modes from spatiotemporal features of trajectory data [18][19][20][21]. An algorithm was developed for automatically annotating raw trajectories with the activities performed, analyzing users' stop points to infer the POI (Point of Interest) that the user has visited [22]. ...
Article
The human daily activity category represents individual lifestyle and pattern, such as sports and shopping, which reflect personal habits, lifestyle, and preferences and are of great value for human health and many other application fields. Currently, compared to questionnaires, social media as a sensor provides low-cost and easy-to-access data sources, providing new opportunities for obtaining human daily activity category data. However, there are still some challenges to accurately recognizing posts because existing studies ignore contextual information or word order in posts and remain unsatisfactory for capturing the activity semantics of words. To address this problem, we propose a general model for recognizing the human activity category based on deep learning. This model not only describes how to extract a sequence of higher-level word phrase representations in posts based on the deep learning sequence model but also how to integrate temporal information and external knowledge to capture the activity semantics in posts. Considering that no benchmark dataset is available in such studies, we built a dataset that was used for training and evaluating the model. The experimental results show that the proposed model significantly improves the accuracy of recognizing the human activity category compared with traditional classification methods.
... Once they get accustomed to this piece of information, individuals start contemplating change. Even though ideas and methods for automatically determining the time at which this happens (from passively recorded mobility data) exist [38], they were not advanced enough for use in the GoEco! project. Therefore, GoEco! assumes that people start contemplating change after they have been supplied with such mobility feedback. ...
Article
The present urban transportation system, mostly tailored for cars, has long shown its limitations. In many urban areas, public transportation and soft mobility would be able to effectively satisfy many travel needs. However, they tend to be neglected, due to a deep-rooted car dependency. How can we encourage people to make sustainable mobility choices, reducing car use and the related CO2 emissions and energy consumption? Taking advantage of the wide availability of smartphone devices, we designed GoEco!, a smartphone application exploiting automatic mobility tracking, eco-feedback, social comparison and gamification elements to persuade individual modal change. We tested the effectiveness of GoEco! in two regions of Switzerland (Cantons Ticino and Zurich), in a large-scale, one year long randomized controlled trial. Notwithstanding a large drop-out rate experienced throughout the experiment, GoEco! was observed to produce a statistically significant impact (a decrease in CO2 emissions and energy consumption per kilometer) for systematic routes in highly car-dependent urban areas, such as the Canton Ticino. In Zurich, instead, where high quality public transport is already available, no statistically significant effects were found. In this paper we present the GoEco! experiment and discuss its results and the lessons learnt, highlighting practical difficulties in performing randomized controlled trials in the field of mobility and providing recommendations for future research.
Chapter
Urban mobility and the transport of people have been increasing in volume inexorably for decades. Despite the advantages and opportunities mobility has brought to our society, there are also severe drawbacks such as the transport sector’s role as one of the main contributors to greenhouse-gas emissions and traffic jams. In the future, an increasing number of people will be living in large urban settings, and therefore, these problems must be solved to assure livable environments. The rapid progress of information and communication, and geographic information technologies, has paved the way for urban informatics and smart cities, which allow for large-scale urban analytics as well as supporting people in their complex mobile decision making. This chapter demonstrates how geosmartness, a combination of novel spatial-data sources, computational methods, and geospatial technologies, provides opportunities for scientists to perform large-scale spatio-temporal analyses of mobility patterns as well as to investigate people’s mobile decision making. Mobility-pattern analysis is necessary for evaluating real-time situations and for making predictions regarding future states. These analyses can also help detect behavioral changes, such as the impact of people’s travel habits or novel travel options, possibly leading to more sustainable forms of transport. Mobile technologies provide novel ways of user support. Examples cover movement-data analysis within the context of multi-modal and energy-efficient mobility, as well as mobile decision-making support through gaze-based interaction.
Conference Paper
The large interest in analyzing one's own fitness led to the development of more and more powerful smartphone applications. Most are capable of tracking a user's position and mode of locomotion, data that do not only reflect personal health, but also mobility choices. A large field of research is concerned with mobility analysis and planning for a variety of reasons, including sustainable transport. Collecting data on mobility behavior using fitness tracker apps is a tempting choice, because they include many of the desired functions, most people own a smartphone and installing a fitness tracker is quick and convenient. However, as their original focus is on measuring fitness behavior, there are a number of difficulties in their usage for mobility tracking. In this paper we describe the various challenges we faced when deploying GoEco! Tracker (an app using the Moves® fitness tracker to collect mobility measurements), and provide an analysis on how to best overcome them. Finally, we summarize findings after one month of large scale testing with a few hundred users within the GoEco! living lab performed in Switzerland.
Chapter
Over the last decades, the availability and granularity of location-based data have been growing rapidly. Besides the proliferation of smartphones and location-based social networks, crowdsourcing and volunteered geographic data have also led to highly granular mobility data, maps and street networks. As a result, location-aware, smart environments are created. The trend towards personal self-optimization and monitoring, known as the ‘quantified self’, will speed up this ongoing process. Citizens, in conjunction with their surrounding smart infrastructure, turn into ‘living sensors’ that monitor all aspects of urban living (traffic load, noise, energy consumption, safety and many others). “Big Data”-based intelligent environments and smart cities require algorithms that process these massive amounts of spatio-temporal data. This article provides a survey on event processing in spatio-temporal data streams with a special focus on urban traffic.
Article
With the increasing popularity of location tracking services such as GPS, more and more mobile data are being accumulated. Based on such data, a potentially useful service is to make timely and targeted recommendations for users on places where they might be interested to go and activities that they are likely to conduct. For example, a user arriving in Beijing might wonder where to visit and what she can do around the Forbidden City. A key challenge for such recommendation problems is that the data we have on each individual user might be very limited, while to make useful and accurate recommendations, we need extensive annotated location and activity information from user trace data. In this paper, we present a new approach, known as user-centered collaborative location and activity filtering (UCLAF), to pull many users’ data together and apply collaborative filtering to find like-minded users and like-patterned activities at different locations. We model the user-location-activity relations with a tensor representation, and propose a regularized tensor and matrix decomposition solution which can better address the sparse data problem in mobile information retrieval. We empirically evaluate UCLAF using a real-world GPS dataset collected from 164 users over 2.5 years, and showed that our system can outperform several state-of-the-art solutions to the problem.
Chapter
Purpose — In order to analyse applicability, comparability and limitations of GPS technology in travel surveys, different mobility survey techniques were tested in an Austrian pilot study. Methodology/approach — Four groups of voluntary respondents recorded their travel behaviour over a time period of three consecutive days. The groups were assigned to three different and combined methods of data collection: Paper–pencil trip diaries, passive GPS tracking, active GPS tracking and prompted recall interviews. Findings — The resulting mobility parameters show that self-reported paper–pencil surveys yield accurate sociodemographic information on the respondents as well as trip purposes and modes of transportation, although too few trips are reported. Passive GPS-based methods minimize the strain for respondents. Methods that combine GPS-based data collection and questionnaire provide the most reliable mobility data at the moment. Research limitations/implications — Due to funding restrictions the sample sizes had to be relatively small (235 participants). Further development in research methodology will increase the effectiveness of automated data analysis, for example more accurate detection of activities and transport modes. The usefulness of GPS-based data collection in large-scale surveys is planned to be tested in the next Austrian national travel survey. Originality/value of paper — The pilot study allows a detailed comparison of traditional and GPS-based travel survey methods for the first time, due to data collection combined with prompted recalls.
Chapter
Viewed as a system, mobility is multi-layered, highly dynamic, and complex. Owing to a variety of influencing factors, the system is subject to constant change and is difficult to understand and control. This article describes how questions in the field of mobility can be investigated and better understood with the help of big data. On the one hand, this concerns access to, and the exploitation of, suitable data sources that describe the "mobility" system; on the other hand, it concerns how the data must be prepared in order to serve as a basis for decisions on current and future questions. The former will show that, above all, the integration of widely differing data sources permits new, previously unconsidered perspectives on mobility. The latter addresses the question of how mobility information can be extracted from the multitude of data sources available today and in the future, information that is subsequently used in different ways by different stakeholders. For mobility service providers, mobility decision-makers, and mobility researchers, big data primarily means a shift in thinking from model-based to (more) data-driven methods of system description. For mobility participants, big data ranges between the optimised and simpler fulfilment of mobility needs and total surveillance. Using concrete examples, the contribution shows that big data in mobility is not the goal but the logical consequence of advancing digitalisation. From today's perspective, digitalisation and data integration appear to benefit all stakeholders, yielding value both for the individual and for society.
Article
Understanding of data quality is essential for choosing suitable analysis methods and interpreting their results. Investigation of quality of movement data, due to their spatio-temporal nature, requires consideration from multiple perspectives at different scales. We review the key properties of movement data and, on their basis, create a typology of possible data quality problems and suggest approaches to identify these types of problems.
Article
Rapid advance of location acquisition technologies boosts the generation of trajectory data, which track the traces of moving objects. A trajectory is typically represented by a sequence of timestamped geographical locations. A wide spectrum of applications can benefit from the trajectory data mining. Bringing unprecedented opportunities, large-scale trajectory data also pose great challenges. In this paper, we survey various applications of trajectory data mining, e.g., path discovery, location prediction, movement behavior analysis, and so on. Furthermore, this paper reviews an extensive collection of existing trajectory data mining techniques and discusses them in a framework of trajectory data mining. This framework and the survey can be used as a guideline for designing future trajectory data mining solutions.
Article
The escalating rate of energy consumption underpins the need to set goals that promote a reduction in CO2 emissions. In 2011 the transport sector contributed 23% to the total EU CO2 emissions; road transport alone was responsible for 71% of this 23% compared to 12% from aviation transport. Corporate car drivers drive, on average, three times more than private car users in Europe (21,500 miles). An improvement in their fuel efficiency, by encouraging sustainable driving using eco-feedback technologies, has the potential to reduce CO2 emissions and promote fuel cost savings of 1% to 8%. This paper evaluates, through an explorative structured analysis, these findings further by defining recommendations for an organization intending to use eco-feedback technologies to reduce the overall corporate fleet's CO2 emissions. The theoretical analysis of these findings, through the lens of the Feedback Intervention Theory and appraisal of corporate car drivers' extrinsic and intrinsic motivation, revealed that it is imperative to raise drivers' awareness of their fuel consumption. Drivers' concerns regarding management monitoring leading to control and punishment, if their fuel efficiency has not improved, must be addressed. It is essential that an organizational roll-out is not associated with punishments, but focused on motivating employees by providing extrinsic motivation through realistic goal setting and constructive feedback.
Article
The increasing number of mobile phones that are equipped with localization technology offers a great opportunity for the collection of mobility data. These data can be used for detecting mobility patterns. Matching mobility patterns in streams of spatiotemporal events implies a trade-off between efficiency and pattern complexity. Existing work deals either with patterns of low expressiveness, which can be evaluated efficiently, or with very complex patterns on powerful machines. We propose an approach which solves the trade-off and is able to match flexible and sufficiently complex patterns while delivering a good performance on a resource-constrained mobile device. The supported patterns include full regular expressions as well as relative and absolute time constraints. We present the definition of our pattern language and the implementation and performance evaluation of the pattern matching on a mobile device, using a hierarchy of filters which continuously process the GPS input stream.
Article
During the early part of the first decade of the 2000s, a number of localities in Australia introduced Voluntary Travel Behaviour Change (VTBC) initiatives, otherwise known as TravelSmart. These initiatives were all monitored in the short-term and suggested that there were reductions in person kilometres of travel (PKT) on the order of 6 to 18 percent. Beginning in 2007, the Institute of Transport and Logistics Studies (ITLS) was asked to undertake a 5-year study to determine if the effects of TravelSmart were sustained in the longer term. This paper describes the study methodology, which was a rotating panel drawn from the Australian Capital Territory, Queensland, South Australia, and Victoria, with panel members asked to carry a portable GPS device with them wherever they went for a period of 15 days in the September-November time period each year from 2007 to 2012, providing a total of six waves of panel data. All members of sampled households over the age of 14 were provided with a GPS device and asked to carry it with them. The paper reports on the inevitable panel attrition and the process of make up for attrition. The panel covered roughly 120 households per year, with approximately 40 households that had not participated in TravelSmart (the control group) and 80 households that had participated. Make up for attrition maintained this approximate split between the treatment and control groups. Details of the sampling procedures are provided in the paper. The sample provided data on about 3,600 person days of travel in each wave or a total of about 20,000 person days of travel over the six waves of the study. Each year, estimates were made on a state-by-state basis and for the entire sample of the change in daily average PKT for each of the control and treatment groups. The paper reports on these year-by-year averages for each of the two groups and for each state and overall.
It was found that, while there was some variation from year to year, in general, the treatment group continued to show lower PKT than the control group, suggesting that the changes were sustained over the study period. To our knowledge, this is the first time that a longer-term monitoring of the effects of a VTBC has been undertaken, and is certainly the first one to use GPS measurements of travel to do this. The conclusions of the study suggest that the noncoercive policy of VTBC is effective for at least 5 to 8 years after the intervention is undertaken. This study also shows that a small sample of households using GPS for a two-week period provides an adequate basis for monitoring and evaluating the long-term effects of such an intervention.