THIS IS A COPY OF THE POST-PRINT MANUSCRIPT.
THE PUBLISHED VERSION (PREFERRED CITATION):
Prelipcean, A.C., Gidófalvi, G. and Susilo, Y.O. (2018) MEILI: A Travel
Diary Collection, Annotation and Automation System. Computers,
Environment and Urban Systems, ahead-of-print, 1-11, found at
https://doi.org/10.1016/j.compenvurbsys.2018.01.011
MEILI: A Travel Diary Collection, Annotation and
Automation System
Adrian C. Prelipcean*,**, Győző Gidófalvi* and Yusak O. Susilo**
* Division of Geoinformatics, Department of Urban Planning and Environment, KTH,
Sweden; ** Division of Transport and Location Analysis, Department of Transport
Science, Sweden
Abstract
The increased interest in the automation of travel diary collection,
together with the ease of access to new artificial intelligence methods,
has led scientists to explore the prerequisites to the automatic generation
of travel diaries. One of the most promising methods for this automation
relies on collecting GPS traces of multiple users over a period of
time, followed by asking the users to annotate their collected data by
specifying the base entities for a travel diary, i.e., trips and triplegs.
This has led scientists down one of two paths: either develop an in-house
solution for data collection and annotation, which is usually an
undocumented prototype implementation limited to a few users, or
contract an external provider for the development, which results in
additional costs. This paper provides a third path: an open-source highly
modular system for the collection and annotation of travel diaries of
multiple users, named MEILI. The paper discusses the architecture
of MEILI with an emphasis on the data model, which allows scien-
tists to implement and evaluate their methods of choice for the de-
tection of the following entities: trip start/end, trip destination, trip
purpose, tripleg start/end, and tripleg mode. Furthermore, the open
source nature of MEILI allows scientists to modify the MEILI solu-
tion in compliance with their legal and ethical specifications. MEILI
was successfully trialed in multiple case studies in Stockholm and
Gothenburg, Sweden between 2014 and 2017.
Keywords Travel Diaries; Destination, Purpose and Travel Mode Infer-
ences; Travel Diary Collection System; Open Source; System Design and
Architecture
1 Introduction
Understanding how people behave has been the bedrock of social science
since its inception. As with most branches of science, an objective and
non-biased analysis is dependent on the prowess of the analysts and on
the quality and representativeness of the data that is analyzed. As such,
scientists have focused on developing both analysis methods and data
collection tools in an effort towards a more reliable understanding of how
people behave.
In particular, transportation science studies people’s travel behavior by
analyzing how people make the travel choices to fulfill their daily sched-
ule. The most common way to obtain the information on how people allo-
cate their travels throughout the day has been via the collection of travel
diaries. A travel diary is a sequential description of what a traveler has
been doing during a predefined time frame (of usually one day), where
a respondent describes her trips and triplegs. A trip describes an activity
and, as such, contains information about: 1) the start and stop time of a
trip, 2) the origin and destination of a trip, 3) the length of a trip, 4) the
purpose of the trip, and 5) how the user traveled between the origin and
the destination. Similarly, a tripleg represents the part of a trip that was
traveled solely via one travel mode, and contains information about: 1) the
start and stop time of a tripleg, 2) the start and stop place of a tripleg, 3) the
length of a tripleg, and 4) the travel mode. While the amount of informa-
tion contained by a tripleg is undeniably useful, it is difficult to design
an effective survey that can prompt users to specify all tripleg related in-
formation without increasing the survey fill-in burden and, subsequently,
the drop rate (Axhausen, 2008; Prelipcean et al., 2017a; Richardson et al.,
1995). As such, it is common practice for surveys to limit the questions
regarding triplegs to the sequence of travel modes or to the main travel
mode.
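As an illustration only, the trip and tripleg attributes listed above could be represented as two record types. The following Python sketch is not part of MEILI: the class names, the field names, the (latitude, longitude) tuple and all example values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple

# Assumed representation of a place: (latitude, longitude) in degrees.
Coordinate = Tuple[float, float]


@dataclass
class Tripleg:
    """Part of a trip travelled with a single travel mode."""
    start_time: datetime
    stop_time: datetime
    start_place: Coordinate
    stop_place: Coordinate
    length_m: float
    travel_mode: str


@dataclass
class Trip:
    """Travel between an origin and a destination, made for one purpose."""
    start_time: datetime
    stop_time: datetime
    origin: Coordinate
    destination: Coordinate
    length_m: float
    purpose: str
    triplegs: List[Tripleg] = field(default_factory=list)

    def mode_sequence(self) -> List[str]:
        """Sequence of travel modes, the reduced form many surveys ask for."""
        return [leg.travel_mode for leg in self.triplegs]


# Made-up example: walk to a bus stop, then take the bus to work.
home, stop, office = (59.330, 18.060), (59.334, 18.063), (59.350, 18.070)
commute = Trip(
    start_time=datetime(2017, 3, 1, 8, 0),
    stop_time=datetime(2017, 3, 1, 8, 30),
    origin=home,
    destination=office,
    length_m=2900.0,
    purpose="work",
    triplegs=[
        Tripleg(datetime(2017, 3, 1, 8, 0), datetime(2017, 3, 1, 8, 7),
                home, stop, 500.0, "walk"),
        Tripleg(datetime(2017, 3, 1, 8, 10), datetime(2017, 3, 1, 8, 30),
                stop, office, 2400.0, "bus"),
    ],
)
print(commute.mode_sequence())
```

A survey that only asks for the sequence of travel modes would, in this representation, collect the result of `mode_sequence()` and leave the remaining tripleg fields unspecified.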
One of the main drawbacks of using the traditional ways of filling in a
travel survey is the response rate, which has been shown to steadily and
consistently drop during the last decades (Richardson et al., 1995). This
drawback is also accompanied by a declaration bias, which is partly in-
duced by respondents forgetting to declare some trips (Stopher, 1992), and
partly induced by using terms in the travel diary survey that are not com-
mon for non-experts, e.g., trips and triplegs. These drawbacks, accompa-
nied by the surge of devices that allow for a seamless collection of data at
a low cost (Prelipcean et al., 2014) have prompted scientists to investigate
new ways of deriving the same information as offered by travel diaries but
with a higher response rate and lower declaration bias. These new promising
options try to complement or replace the traditional declaration-based
travel diary collection with methods that extract travel diary specific in-
formation from trajectories and auxiliary datasets.
However, since these new methods are seldom tested more than once,
there is a lack of convergence towards a widely accepted framework for
collecting data and attaching travel diary specific semantics to the data.
This downside is accompanied by a lack of thorough software and hard-
ware documentation, as well as a lack of specifications that guided the
development process, which makes most previous research difficult to re-
produce.
This paper introduces MEILI¹ – a travel diary collection, annotation
and automation system – and describes how MEILI was used during multiple
case studies: three case studies in Stockholm, Sweden (Prelipcean et
al., 2017a) and one in Gothenburg (Allström et al., 2016). The modular
design of MEILI allows those interested in travel diary automation from
sequences of GPS locations describing the users’ movement (trajectories)
fused with accelerometer readings, to modify and adapt the source code
to their own needs, e.g., embedding the readings from other sensors into
the trajectories, using spatial datasets that are region-specific, etc. Most
components of MEILI are released as copyleft open source products as an
effort to form a community around MEILI that encompasses the interests
of both research and industry communities.
The remainder of this paper is structured as follows: Section 2 presents
the literature review relevant for systems that attempt to collect travel di-
aries (manual, semi-automatic and automatic), Section 3 describes the sys-
tem architecture of MEILI and its modular design, Section 4 illustrates the
performance of MEILI during a real-world case study where MEILI was
used to collect travel diaries from 171 respondents, Section 5 concludes the
paper, and, finally, Section 6 presents the future work in this field.
¹ According to http://mythology.wikia.com/wiki/Meili, MEILI is the Norse god of
travel.
2 Literature Review
The traditional methods to collect travel behaviour data rely either on
collecting travel diaries or on collecting activity travel diaries, which slightly
differ in terms of what the focus of the questions is, but ultimately result in
the same output (Clarke et al., 1981; Stopher, 1992). While these two types
of travel behaviour collection methods are the most widely used, there
have been attempts at increasing the response rate (Murakami & Wagner,
1999), eliminating the response over time bias (Golob & Meurs, 1986), and
diminishing the number of forgotten trips (Pierce et al., 2003) by augmenting
the collection with computer assisted telephone interviews (CATI),
computer assisted personal interviews (CAPI), and computer-assisted
self-interviews (CASI). For a detailed overview of these technologies, the reader
is directed to Wermuth, Sommer, & Kreitz (2003) and J. Wolf (2006).
However, the pervasiveness of mobile phones together with the declining
response rate to travel diaries has led scientists to investigate whether
travel diaries can be complemented or replaced by methods that rely on
collecting trajectories from users and annotating / inferring travel diary
semantics for the aforementioned trajectories (J. Wolf et al., 2004; Prelipcean et
al., 2015; Stopher et al., 2008; J. Wolf et al., 2001, 2003). As such, a new
research focus is found in deriving travel diaries from trajectories, although
existing methods do not achieve, at the date of this writing, the automated
generation of travel diaries from trajectories (Prelipcean et al., 2016). This is
mostly due to the fact that it is difficult to develop and set up a system that
allows for the collection of trajectories and their annotation into travel diaries.
Furthermore, the difficulty drastically increases with the scale at which the
system is supposed to collect data, both size-wise (with regards to the number of
users) and region-wise.
Previous approaches rely on hiring external developers (Bohte & Maat,
2009; Greene et al., 2016), designing in-house tools whose source code is
not released, whose license is not specified, and which are seldom reused for
multiple case studies (Cottrill et al., 2013; Greene et al., 2016; Montini et al., 2015;
Nitsche et al., 2012; Safi et al., 2017; Wang et al., 2017), as well as manually
visualizing and analysing the data using different Computer Aided Design
techniques (J. Wolf et al., 2004; Stopher et al., 2008; J. Wolf et al., 2001,
2003). While numerous systems have been developed commercially (e.g.,
J. L. Wolf, 2000; Kim, Kim, Estrin, & Srivastava, 2010; Cottrill et al., 2013;
Berger & Platzer, 2015; Geurs, Thomas, Bijlsma, & Douhou, 2015; Montini
et al., 2015; Sense.DAT - DAT.Mobility, n.d.; rMove, n.d.), to the authors’
knowledge, there is no open source solution for collecting travel
diaries that is freely available for everyone to use, which impedes research
progress due to vendor lock-in.
This paper proposes the MEILI system, which is designed for collecting
and annotating trajectories from large groups of users and is released
under an open-source license, which allows scientists to collaborate
on maintaining and improving the system.
Figure 1: MEILI system’s architecture. The two main parts of the archi-
tecture, i.e., the client and server side, are presented within dashed rect-
angles. The client side contains the MEILI Mobility Collector, which is the
application that continuously runs in the background on a user’s smart-
phone and the MEILI Travel Diary, which is the web page that the user
can use to view and annotate her data. The server contains an API that al-
lows the communication between the client component and the database,
a database that stores raw data, user annotations and interaction logs, and
the Artificial Intelligence (AI) API that partially annotates the user’s data.
The flow of actions is denoted by numbered arrows and is explained in the
text.
3 System Architecture
MEILI is an open-source system designed for the collection and annota-
tion of travel diaries of multiple users. The design and implementation
of MEILI has undergone multiple iterations, but the philosophy behind
developing MEILI remained constant, i.e., MEILI should be a system that
fulfills the following criteria: 1) can collect GPS locations fused with ac-
celerometer readings in a battery efficient manner, from a large number
of users, 2) allows users to annotate their collected data into travel di-
aries, 3) uses different machine learning techniques to aid users during
data annotation stages, 4) is available to the research community, and 5) is
simple to deploy without specific expertise.
With regards to the aforementioned criteria, MEILI was designed as
a typical three-tier, Model-View-Controller system that has two types of clients:
a data collection component named Mobility Collector (which is further
explained in Section 3.3.1) and a data annotation component named Travel
Diary (which is further explained in Section 3.3.5). The primary task of
the data collection component is to collect movement information from a
user’s smartphone in a seamless and battery efficient fashion. The primary
task of the data annotation component is to allow users to annotate their
trajectories with travel semantics (i.e., trips, triplegs, travel modes, trip
destinations and purposes) and to display the inferred travel semantics.
MEILI has two main cycles: the data collection cycle (arrows 1 to 5
in Figure 1) and the user annotation cycle (arrows 6 to 11 in Figure 1).
To start using MEILI, the user first needs to install the Mobility Collector
(Prelipcean et al., 2014) on her personal smartphone and then register a
username and password (arrow 1). After the registration, the user’s smart-
phone can start collecting data in a battery efficient way (this is briefly
described in Section 3.3.1). After the collection is started, the Mobility
Collector periodically uploads the collected data to the central server (arrow
3). Any upload of a batch of locations to the database prompts a middleware
to form a stream of all the locations belonging to the user that are not
within an inferred trip, and to push it forward to the AI (Artificial
Intelligence) component (arrow 4). The AI module then segments the stream of
locations into trips and triplegs. The segmented trips and triplegs are then
inserted into the database and made available to the user when logging in
to the Travel Diary (arrow 5, and the second cycle).
To annotate the collected data, the user logs in via the MEILI Travel Di-
ary (arrow 6), which prompts the database to be queried for the retrieval
of the user’s least recent unannotated trips and triplegs, together with a
list of all available travel modes (for triplegs), a list of nearby destinations
and a list of all available purposes (for trips), as shown in arrows 7 to
10. To aid the users in the annotation process, the AI module orders the
elements of each list based on their likelihood (according to the logic de-
scribed in Section 3.3.4) before displaying them to the users as options to
choose from (arrows 10 and 11). However, if the AI module does not have
sufficient history / ground truth data to perform the ordering based on
likelihood, the elements of the returned lists are ordered alphabetically in-
stead (arrow 11). Alternatively, when the AI module has sufficient history
/ ground truth data and computes a likelihood greater than a specified
threshold value, the returned list also preselects the element whose
likelihood exceeds the threshold when displaying the options to the user
(arrow 11). Whenever the user annotates her data (arrow 6), the AI module
parameters can be re-calibrated, since MEILI uses an active learning
process; the re-calibration can be done either synchronously or
asynchronously at pre-defined time intervals (arrows 4, 5). MEILI relies on
the database for ACID properties (Atomicity, Consistency, Isolation,
Durability) by broadcasting each interaction on the client side to the database.
In case of failure, i.e., the MEILI Travel Diary enters an inconsistent state
that prevents any annotation being made, a rollback on the last annotated
trip is performed by removing all its annotations, which prompts the user
to redo that trip’s annotation.
Figure 2: The relational database design that describes the data model
used by MEILI. The “General Entities” contains the raw data collected from
the users, together with user credentials and a log of how users interact
with MEILI. The “Trips and relevant entities” and “Triplegs and relevant
entities” parts attach travel diary semantics on top of the raw data.
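The ordering, fallback and preselection rules for the option lists described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the MEILI implementation: the function name `order_options`, the dictionary of likelihoods and the threshold value of 0.8 are all assumptions.

```python
from typing import Dict, List, Optional, Tuple


def order_options(
    options: List[str],
    likelihoods: Optional[Dict[str, float]],
    preselect_threshold: float = 0.8,  # assumed value, not taken from MEILI
) -> Tuple[List[str], Optional[str]]:
    """Return (ordered options, preselected option or None)."""
    if not likelihoods:
        # Not enough history / ground truth: fall back to alphabetical order.
        return sorted(options), None
    # Most likely option first; ties are broken alphabetically.
    ordered = sorted(options, key=lambda o: (-likelihoods.get(o, 0.0), o))
    best = ordered[0] if ordered else None
    if best is not None and likelihoods.get(best, 0.0) > preselect_threshold:
        return ordered, best
    return ordered, None


modes = ["walk", "bus", "car"]
print(order_options(modes, None))
print(order_options(modes, {"bus": 0.9, "walk": 0.07, "car": 0.03}))
print(order_options(modes, {"bus": 0.5, "walk": 0.3, "car": 0.2}))
```

In the first call no likelihoods are available, so the list is alphabetical and nothing is preselected; in the second call the most likely mode exceeds the threshold and is preselected; in the third call the list is ordered by likelihood but the user still has to choose.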
3.1 Data Model
This section describes the data collected for the travel diary automation
and the annotations that attach travel diary semantics to the raw data.
With regards to Figure 2, there are three main entity types that MEILI
uses:
• “General Entities”, which contains information about the users that
  are using MEILI as part of an on-going case study and their credentials,
  the location data collected from the users’ smartphones, and an
  interaction log which describes how users interact with the system.
• “Trips and relevant entities”, which contains the trip-level semantics
  needed to form travel diaries from raw GPS trajectories. There are
  two entities associated with trips: the “Purpose” entity, which is an
  exhaustive list of the purposes a trip can have (the list varies between
  case studies) and the “POI” entity, which is a non-exhaustive
  list of Points of Interest (POIs) with spatial attributes. There are two
  types of “Trip” entities: “Trips Inf”, which represent the trips inferred
  by different mechanisms (e.g., heuristic rules, machine learning),
  and “Trips GT”, which represent the ground truth trips that
  have been verified and modified by the users who performed them.
• “Triplegs and relevant entities”, which contains the tripleg-level
  semantics needed to form travel diaries from raw GPS trajectories.
  There are two entities associated with triplegs: the “Travel Mode”
  entity, which is an exhaustive list of the travel modes available to all
  users (the list varies between case studies) and the “Transport POI”
  entity, which is a non-exhaustive list of transportation related POIs
  with spatial attributes, i.e., the location of transportation stations and
  parking places. There are two types of “Tripleg” entities: “Triplegs Inf”,
  which represent the triplegs inferred by different mechanisms (e.g.,
  heuristic rules, machine learning), and “Triplegs GT”, which represent
  the ground truth triplegs that have been verified and modified
  by the users who performed them.
As shown in Figure 2, the relationship between the entities is represented
by foreign keys. As such, a trip cannot have a purpose that is
not present in the “Purpose” table, a tripleg cannot belong to a trip that
is not present in one of the “Trip” tables, etc. One of the advantages of
this type of relational modeling is the ease of data extraction for analysis
tasks. It is important to note that all “Trip” and “Tripleg” entities contain
a “type of period” attribute. This is a binary indicator that describes
whether the trip / tripleg is associated with a movement period, where
the type is equal to 1, or a stationary period, where the type is equal to
0. By using this indicator, it is easy to extract the time spent waiting between
triplegs (e.g., the time spent when transitioning between walking
and taking the bus), or the time a user spent while performing an activity
(e.g., the time the user spent between arriving at the gym and going
towards the restaurant). Any two consecutive entities of type 1 are separated
by an entity of type 0, even though the stationary period can have a
duration of 0, to ensure temporal and logical continuity between entities.
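To make the role of the foreign keys and of the “type of period” indicator concrete, the following sketch builds a drastically reduced version of the tripleg part of the data model in an in-memory SQLite database. It is an illustration under assumptions: the table names, the column names and the use of SQLite are hypothetical and are not taken from the MEILI schema, and the timestamps are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE trips (id INTEGER PRIMARY KEY);
    CREATE TABLE triplegs (
        id INTEGER PRIMARY KEY,
        trip_id INTEGER NOT NULL REFERENCES trips(id),
        start_s INTEGER NOT NULL,   -- seconds since an arbitrary epoch
        stop_s INTEGER NOT NULL,
        type_of_period INTEGER NOT NULL CHECK (type_of_period IN (0, 1))
    );
    """
)
conn.execute("INSERT INTO trips (id) VALUES (1)")
conn.executemany(
    "INSERT INTO triplegs (trip_id, start_s, stop_s, type_of_period) VALUES (?, ?, ?, ?)",
    [
        (1, 0, 600, 1),      # movement: walking
        (1, 600, 900, 0),    # stationary: waiting at the bus stop
        (1, 900, 2100, 1),   # movement: bus
        (1, 2100, 2100, 0),  # stationary period of duration 0 keeps the alternation
        (1, 2100, 2400, 1),  # movement: walking
    ],
)

# Total waiting time within the trip: sum over the stationary periods.
waiting_s = conn.execute(
    "SELECT SUM(stop_s - start_s) FROM triplegs WHERE trip_id = 1 AND type_of_period = 0"
).fetchone()[0]
print(waiting_s)


def missing_trip_is_rejected() -> bool:
    """Try to insert a tripleg that belongs to a trip that does not exist."""
    try:
        conn.execute(
            "INSERT INTO triplegs (trip_id, start_s, stop_s, type_of_period) "
            "VALUES (99, 0, 1, 1)"
        )
    except sqlite3.IntegrityError:
        return True
    return False


print(missing_trip_is_rejected())
```

Because stationary periods are stored as rows of their own, the waiting time is a plain aggregate over rows with type 0, and the foreign key prevents a tripleg from referring to a trip that is not in the trip table.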
It is important to note that there are two types of information that
MEILI uses, primary and auxiliary. The primary information consists of
the data that are required to generate travel diaries, i.e., the GPS locations
and the schemas used for travel mode and purpose. Similarly, the auxil-
iary information consists of data that are not required to generate travel
diaries, but that greatly improve the user experience, i.e., destination and
transportation POIs, which are either displayed for the user as options to
choose from, or are used by the AI module to infer travel modes, purposes
and / or destinations.
3.2 Modular Design
As shown in Figure 1, there are two main components: client-side and
server-side. However, there are other components associated with each of
the main components, namely: the Mobility Collector and the Travel Diary
for the client-side, and the Database, API, and AI for the server-side.
The system has been designed in a modular fashion to allow different
contributors to focus on their interest without having to modify the entire
code base. As such, if a person is interested in using MEILI, and the only
extra functionality that she needs is adding a survey to the MEILI Travel
Diary component that is displayed whenever a new trip is annotated, then
the modifications will be done on the MEILI Travel Diary (for prompting
the user to fill in the survey) and the API and Database to ensure that
the results are persistent in the database, if needed. Similarly, if a person
wants to test different AI methods, then her focus can be only on the AI
component. This offers a major advantage to the scientific community
since it replaces the considerable amount of time that would have been
allocated to designing a similar system with the time it would take a user
to get accustomed to MEILI by reading the documentation.
To ensure that a community of developers and open source enthusi-
asts can be built around MEILI, the following components are released
under a GNU Affero General Public License (freely share and modify, but
the modifications have to be made available to the community): MEILI
Mobility Collector, MEILI Travel Diary, MEILI API, and MEILI AI. Due
to restrictions on how the data are stored, encrypted, and collected, the
Database component is released under an Open Data Commons Attribu-
tion License (freely share and modify). The MEILI AI module is embedded
as part of the MEILI API (for the segmentation of trajectories into trips and
triplegs) and MEILI Database (for the inference of travel modes, purposes
and destinations).
3.3 Main Components
This section describes the components of MEILI, with regards to Figure 1.
3.3.1 Data Collection Component - Mobility Collector
The MEILI system has a component that is dedicated to data collection,
named MEILI Mobility Collector (Prelipcean et al., 2014). The Mobility
Collector captures the sequence of GPS points that describes the
users’ route and the accelerometer readings that describe the users’ physical
movement. One of the problems with accelerometer readings is the collection
frequency, which is very high (usually, once every 200 ms) and
makes it difficult to store every accelerometer reading. To overcome
this, the Mobility Collector temporarily stores all the accelerometer readings
between two consecutive locations and generates descriptive statistics and
other useful features from them: (1) the mean, minimum,
maximum and the standard deviation values of the accelerometer readings
per axis, (2) the number of steps, and (3) whether the object is moving
or stationary. The aforementioned features are then stored in the database,
in the Location table.
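The per-axis descriptive statistics in item (1) can be illustrated with a short Python sketch. It is not the Mobility Collector code: the function name, the (x, y, z) tuple layout and the choice of the population standard deviation are assumptions, and the step count and the moving / stationary flag are left out because the text does not describe how they are derived.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

# Assumed layout of one accelerometer reading: (x, y, z) acceleration.
Reading = Tuple[float, float, float]


def summarise_readings(readings: List[Reading]) -> Dict[str, Dict[str, float]]:
    """Collapse the readings buffered between two locations into per-axis statistics."""
    features: Dict[str, Dict[str, float]] = {}
    # zip(*readings) turns the list of (x, y, z) tuples into one tuple per axis.
    for axis_name, values in zip("xyz", zip(*readings)):
        features[axis_name] = {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            "std": pstdev(values),  # population standard deviation (an assumption)
        }
    return features


# Made-up buffer of readings collected between two consecutive GPS locations.
buffer = [(0.0, 0.0, 9.8), (2.0, 0.0, 9.8), (4.0, 0.0, 9.8)]
features = summarise_readings(buffer)
print(features["x"])
```

With one reading every 200 ms and, say, one location per minute, roughly 300 readings would collapse into twelve numbers per location, which is the storage saving the aggregation is meant to achieve.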
To make the collection as non-intrusive as possible, the MEILI Mobility
Collector makes use of adaptive sampling and indoor detection algorithms
as strategies for battery efficiency. Furthermore, the Mobility Collector
automatically uploads the collected raw data (GPS locations fused
with accelerometer readings) to the central server every hour when the
smartphone has access to an Internet connection.
The MEILI Mobility Collector life cycle is represented by arrows 1 and
2 in Figure 1. Users can register and log in via the MEILI Mobility Col-
lector, and start the collection. The details of the MEILI Mobility Collector
are discussed in depth by Prelipcean et al. (2014).
3.3.2 Backend Component - Database
The relational database design for the MEILI Database is shown in Figure 2,
and the data model has been discussed in Section 3.1. The main
purposes of the database are to: 1) persistently store the data collected by
the Mobility Collector, 2) integrate with the API to allow users to annotate
their data, and 3) communicate with the Artificial Intelligence component
for storing the inferred entities “Trips Inf” and “Triplegs Inf”.
As can be seen in Figure 1, the arrows 3, 4, 5, 8 and 9 indicate that
the Database is central to the MEILI system, since it is the hub of all the
operations. As such, the main problem with the Database Component is
the response time for particular API inquiries, which can slow the MEILI
system down to an unresponsive state.
3.3.3 Backend Component - API
The API component provides the functionality that supports the user
interaction with the MEILI Travel Diary, which results in annotated travel
diaries. The API component broadcasts the users’ interactions to the database,
which ensures the persistence of the annotations. In Figure 1, the API com-
ponent actions are depicted by arrows 3, 8 and 9.
The functions that are supported by the API to allow the users to an-
notate their data are part of the Create Read Update Delete (CRUD) list
of operations. The CRUD operations provided by MEILI are described
in Table 1. While certain operations such as retrieving the destinations
within a buffer of a trip’s end, or retrieving a list of probable purposes can
be viewed as being of the Read type as opposed to the Create type, these
operations are only called when new trips and triplegs are created, and
do not provide independent functionality, which is why they have been
labeled with C, as opposed to R.
Table 1: CRUD operations provided by the MEILI API, and the entities
that are modified by the operations.

Entity affected   Type (CRUD)   Description
Trip              C             Insert new trip
Destination       C             Get close-by destinations for inserted trip
Purpose           C             Get probable purposes for inserted trip
POI               C             Insert new POI
Tripleg           C             Insert new tripleg
Transition        C             Get close-by destinations for inserted tripleg
Location          C             Insert location to enrich route geometry
Next              R             Pagination – retrieve next trip
Previous          R             Pagination – retrieve previous trip
Trip              U             Update the inserted trip
POI               U             Update the inserted POI
Tripleg           U             Update the inserted tripleg
Location          U             Update the inserted location
Trip              D             Delete the inserted trip
Tripleg           D             Delete the inserted tripleg

3.3.4 Middleware Component - Artificial Intelligence
The role of the Artificial Intelligence module is two-fold: first, it acts
as a trigger and it segments incoming streams of locations into trips and
triplegs (Figure 1, arrow 4), and second, it acts as an on-demand classifier
and provides ordered lists (based on the probability / likelihood of
inferences) of travel modes (for triplegs), destinations and purposes (for
trips) to users when they are annotating their data (Figure 1, arrow 5).
Applying AI and ML methods for automatically inferring the aforementioned
entities and attributes is an active field of research (see Prelipcean
et al., 2014, 2016 for automatic travel mode detection, Bohte & Maat, 2009
for purpose inference and Prelipcean, 2016; J. Wolf, 2006 for destination
inference) that does not have any widely accepted methods to perform
the inferences or to measure the performance of the methods (Prelipcean
et al., 2016, 2017b). In its current form, MEILI uses standard AI and ML
algorithms, i.e., a variation of a clustered nearest neighbor classifier for
travel mode detection (which obtained an initial precision of 53.5% for 15
travel modes and increased to 75% precision after each user annotated the
triplegs of their first days), and a Naive-Bayes classifier for destination
and purpose inference (a precision of 54.7% for 13 purposes, and a
precision of 41.9% for destination inference²). The details of the AI and ML
classifiers have been thoroughly discussed by Prelipcean, 2016, along with
a critique of widely used performance measures (Prelipcean et al., 2016).
However, taking into account that developing new AI and ML classifiers
for trajectory segmentation and travel mode, destination and purpose in-
ference is an active research area, MEILI provides the interfaces for any AI
methods to be embedded within the MEILI system, both as trigger-based
synchronous methods that classify data on the fly and as periodic asyn-
chronous methods that classify data at predefined time periods.
² The precision values for purpose and destination are generated for the case of using
the type of POI as a classification feature with regards to Prelipcean, 2016.
3.3.5 Frontend Component - MEILI Travel Diary
The MEILI Travel Diary is designed to allow people to annotate their collected
trajectories into travel diaries, which corresponds to arrows 6, 7, 10
and 11 in Figure 1. The main entities described by a travel diary are trips
and triplegs, and MEILI allows people to annotate different data types to
describe trips / triplegs, with regards to their:
1) spatial aspect – this contains information about the route that the user
chose for the trip / tripleg, the origin³ and destination of the trip (the
POI where the trip started and the POI where the trip ended) / trip-
leg (the POI / transportation stop where the tripleg started and the
POI / transportation stop where the tripleg ended), and the length
of the trip, which can be computed as the length of the polyline that
describes the route or the Euclidean distance between the origin POI
and the destination POI.
2) temporal aspect – this contains information about when the user
started and ended the trip / tripleg and how long the trip / trip-
leg lasted, which is either computed from the start and end time, or,
in the absence of start and end time, is declared by the user. Wait-
ing time can be computed for each tripleg as the time a user waited
when transitioning in between triplegs and then it can be aggregated
to find out the total waiting time or average waiting time per trip.
Similarly, one can compute the waiting time in between consecutive
trips to identify the time spent at the origin POI.
3) descriptive aspect – this contains information about why the user
performed a certain trip (i.e., trip purpose) or about the transporta-
tion modes the user employed for a tripleg (i.e., tripleg travel mode).
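The two ways of computing the length of a trip mentioned under the spatial aspect can be compared with a small Python sketch. It rests on assumptions: points are taken to be (latitude, longitude) pairs in degrees, and the great-circle (haversine) distance on a spherical Earth is used in place of the Euclidean distance named in the text, since a planar distance would first require projected coordinates. The function names and the example route are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt
from typing import List, Tuple

Point = Tuple[float, float]  # assumed: (latitude, longitude) in degrees
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def route_length_m(route: List[Point]) -> float:
    """Length of the polyline that describes the route."""
    return sum(haversine_m(p, q) for p, q in zip(route, route[1:]))


def straight_line_m(route: List[Point]) -> float:
    """Distance between the first (origin) and last (destination) point."""
    return haversine_m(route[0], route[-1])


# Made-up L-shaped route: north first, then east.
route = [(59.33, 18.06), (59.34, 18.06), (59.34, 18.08)]
print(round(route_length_m(route)), round(straight_line_m(route)))
```

The polyline length is never shorter than the origin-destination distance, so the choice between the two matters most for routes with detours.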
3.4 User Interaction with MEILI Web Diary
This section presents how users can interact with their collected data via
a web interface, and discusses the main implemented procedures for the
interaction.
The interface offered to users for annotating their trips is conceptually
described in Figure 3a, and has two main components: a map compo-
nent that displays the sequence of points associated with a trip on top of
a basemap, and a timeline component that describes the sequence of the
inferred / annotated triplegs that belong to a trip. This interface is linked
to the CRUD operations supported by the API component on the entities
that are relevant for travel diaries, such as:
³ To avoid redundant modeling, it is sufficient to specify that the destination of a trip
is the origin of the trip that follows it.
Figure 3: The conceptual design and implementation of the MEILI Travel
Diary: (a) conceptual design of the MEILI Travel Diary; (b) implementation
of the MEILI Travel Diary. To aid the user's understanding of the trip, the
Travel Diary displays the two important dimensions of a trip: its geometry,
on top of a basemap (left side), and its timeline, i.e., a verbose summary
of the sequence of triplegs within the trip, as well as relevant information
regarding the previous trip and the next trip (if any).
• GPS points - the user can currently delete GPS points or add new
points to enrich the geometry of a trip (in case of insertion, the timestamp
of the inserted point is interpolated based on the timestamp of
its two neighbors, the distance between the inserted point and each
of its neighbors, and the distance between its neighbors),
• triplegs - the user can update a tripleg by modifying its start and
end boundaries (i.e., enlarge or shrink the tripleg), split one tripleg
into several, in case the AI failed to detect a tripleg, and merge two
triplegs into one, in case the AI oversegmented a tripleg,
• trips - the user can update a trip by modifying its start and end
boundaries (i.e., enlarge or shrink the trip), split one trip into several,
in case the AI failed to detect a trip, and merge two trips into
one, in case the AI oversegmented a trip, and
• POIs (transportation and destination POIs) - the user can declare a
new POI in the absence of any representative POI in the vicinity of a
destination / transition, the user can delete a trip's association with
a POI, and the user can edit destination POIs that have been inserted
by herself and any transportation POIs that have been inserted by
other users.
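The timestamp interpolation mentioned for inserted GPS points can be sketched as follows. This is a minimal example under one plausible reading of the description: the inserted point is assumed to lie on the path between its two neighbors, and the elapsed time is split in proportion to the distances to each neighbor. The function name and parameters are hypothetical, and the exact formula used by MEILI (which also takes the distance between the neighbors into account) may differ.

```python
from datetime import datetime, timedelta


def interpolate_timestamp(t_prev: datetime, t_next: datetime,
                          d_prev: float, d_next: float) -> datetime:
    """Estimate the timestamp of a point inserted between two neighbors.

    d_prev and d_next are the distances (in any common unit) from the
    inserted point to the previous and to the next neighbor.
    """
    if d_prev < 0 or d_next < 0:
        raise ValueError("distances must be non-negative")
    total = d_prev + d_next
    if total == 0:
        # All three points coincide; keep the earlier timestamp.
        return t_prev
    fraction = d_prev / total  # share of the path already travelled
    return t_prev + (t_next - t_prev) * fraction


# A point 25 m after the previous fix and 75 m before the next one,
# where the two fixes are 100 seconds apart.
inserted_at = interpolate_timestamp(
    datetime(2015, 11, 2, 8, 0, 0), datetime(2015, 11, 2, 8, 1, 40), 25.0, 75.0
)
```

Because the result always falls between the two neighboring timestamps, the temporal order of the trajectory is preserved after the insertion.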
Figure 4: The deployment pipeline used by MEILI. The defined sequence
of operations extracts the transportation and destination POIs from OSM
inside the bounding box specified as input, retrieves the most recent
version of the MEILI Database, sets up the database with the connection
parameters specified as input, populates it with test data to run the
integration and unit tests that verify the setup, and populates it with
the POI data downloaded from OSM.
3.5 A strategy for the deployment of MEILI to any region
One of the main challenges that open source projects face occurs when
onboarding non-technical contributors and users. This varies according
to the maturity of a project, but in most cases the difficulty of getting a
project up and running acts as a bottleneck and limits the number of users
for a system. To overcome this bottleneck, a parameterized deployment
pipeline was implemented to automatically perform most of the steps that
users would have to take when setting up MEILI on their own. The flow
of operations and parameters used in the deployment pipeline are shown
in Figure 4.
As shown on the left side of Figure 4, the user specifies two different
types of information: database connection information, which is used to
set up, connect to and import data inside the MEILI database, and ge-
ographical information, which contains the coordinates of the bounding
box of the region where MEILI is being deployed to.
As a first step, MEILI extracts the auxiliary data (i.e., transportation
and destination POI datasets) within the specified geographical area from
OpenStreetMap (OSM), a widely available crowd-sourced geographical
data repository (Haklay & Weber, 2008), using the Overpass API (Overpass
API, n.d.). OSM was chosen as the main source of auxiliary data for the
following reasons: 1) it has good coverage since data are maintained by
a large number of users (approximately 3,100,000 users adding or editing
OSM specific data in July 2016 according to Stats - OpenStreetMap Wiki,
n.d.), 2) data are licensed under an open-source friendly license (i.e., the
Open Data Commons Open Database License; Open Database License (ODbL) v1.0,
n.d.), and 3) it makes the data accessible via well documented APIs and
software tools, which can be embedded within the MEILI deployment
pipeline.
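To make the extraction step more concrete, the sketch below builds an Overpass QL query that selects OSM nodes with given tags inside a bounding box. The tag filters (`highway=bus_stop`, `railway=station`, `amenity=parking`) are ordinary OSM tags chosen for illustration; which tags the MEILI pipeline actually requests is not stated here, and the helper name is hypothetical. The example only constructs the query and the request URL and does not contact the server.

```python
from typing import Iterable, Tuple
from urllib.parse import urlencode

# Public Overpass API endpoint (see Overpass API, n.d.).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_poi_query(south: float, west: float, north: float, east: float,
                    tag_filters: Iterable[Tuple[str, str]]) -> str:
    """Build an Overpass QL query for nodes matching any of the tag filters.

    Overpass expects the bounding box in (south, west, north, east) order.
    """
    if not (south < north and west < east):
        raise ValueError("bounding box must satisfy south < north and west < east")
    bbox = f"({south},{west},{north},{east})"
    statements = "".join(f'node["{key}"="{value}"]{bbox};' for key, value in tag_filters)
    # The parentheses form a union of all node statements.
    return f"[out:json][timeout:60];({statements});out body;"


# Illustrative bounding box roughly around Stockholm and illustrative tags.
transport_query = build_poi_query(
    59.2, 17.8, 59.5, 18.3,
    [("highway", "bus_stop"), ("railway", "station"), ("amenity", "parking")],
)
request_url = OVERPASS_URL + "?" + urlencode({"data": transport_query})
```

Validating the bounding box before building the query catches swapped coordinates early, which is a common mistake when a region is supplied as a pipeline parameter.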
Second, after the data are downloaded from OSM, the pipeline retrieves
the most recent (stable) version of the MEILI database deployment scripts
from the MEILI Database repository, which ensures that all available
performance improvements and security fixes are used, and initializes the
database with the specified connection parameters. After the database is
initialized, a script populates it with test data and performs unit and
integration tests to confirm a successful setup.
Finally, the data downloaded from OSM (i.e., transportation and des-
tination POIs) are inserted in the database using osm2pgsql (Osm2pgsql,
n.d.) and mapped to the Transport POI and POI tables (Figure 2). At this
stage, the setup is complete and the script prints out the final instructions
on which parameters should be modified in the MEILI Travel Diary web
app to connect to the setup database.
This pipeline produces a fully functional MEILI system that can be
used to collect data in any geographical area, using the transportation and
destination POIs that are available in OSM. One of the last remaining steps
for the full automation of the deployment of MEILI is the compilation and
deployment of the Mobility Collector apps to users, which is part of future
work.
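The overall shape of such a parameterized pipeline can be sketched as an ordered list of steps that share one configuration object and stop at the first failure. Everything in this example is hypothetical: the `DeployConfig` fields, the step names and the placeholder actions only mirror the sequence described above and do not reproduce the MEILI deployment scripts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DeployConfig:
    """Hypothetical pipeline input: database connection and bounding box."""
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    bbox: Tuple[float, float, float, float]  # south, west, north, east


Step = Tuple[str, Callable[[DeployConfig], None]]


def run_pipeline(config: DeployConfig, steps: List[Step]) -> List[str]:
    """Run the steps in order; stop and report at the first failing step."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action(config)
        except Exception as exc:
            raise RuntimeError(f"step '{name}' failed after {completed}") from exc
        completed.append(name)
    return completed


def _placeholder(config: DeployConfig) -> None:
    """Stand-in for a real step such as a download or a database script."""


# Step order follows the description in the text; the actions are placeholders.
DEPLOYMENT_STEPS: List[Step] = [
    ("extract POIs from OSM", _placeholder),
    ("retrieve database scripts", _placeholder),
    ("initialize database", _placeholder),
    ("load test data and run tests", _placeholder),
    ("import OSM POIs", _placeholder),
]
```

Stopping at the first failed step matters here because later steps, such as importing the POIs, assume a database that has already passed its tests.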
3.6 Description of Deployed Components
The following implementations have been chosen for the MEILI System,
and they are also available as an open-source project at
https://github.com/Badger-MEILI:
• MEILI Mobility Collector is implemented on both iOS⁴ and Android⁵
platforms.
• MEILI Travel Diary⁶ (see Figure 3b) is implemented using different
HTML and JavaScript libraries as part of a Node.js (Node.js, n.d.) and
Express (Express - Node.js web application framework, n.d.) project.
• MEILI API⁷ is implemented using the routing capabilities of Express
and a database connector as part of a Node.js and Express project.
• MEILI Database⁸ is implemented using PostgreSQL 9.3 (PostgreSQL: The
world's most advanced open source database, n.d.) and PostGIS 2.0 (PostGIS
– Spatial and Geographic Objects for PostgreSQL, n.d.).
• MEILI AI⁹ is implemented as a simple nearest neighbor classifier
that computes feature similarity between candidates and class aggregates
(to minimize noise). The AI module is retrained with every
new available annotation.
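A classifier of the kind described for the AI component can be sketched as follows. This is a minimal example under stated assumptions: each class aggregate is taken to be the mean feature vector of the annotated examples of that class, similarity is measured with the Euclidean distance, and the two features (average and maximum speed) are invented for the illustration. The actual features and similarity measure are discussed in Prelipcean, 2016, and may differ.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence


class NearestAggregateClassifier:
    """Nearest neighbor classification against one aggregate per class."""

    def __init__(self) -> None:
        self._sums: Dict[str, List[float]] = {}
        self._counts: Dict[str, int] = defaultdict(int)

    def add_annotation(self, features: Sequence[float], label: str) -> None:
        """Update the aggregate of a class with one new annotated example."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
        self._sums[label] = [s + f for s, f in zip(self._sums[label], features)]
        self._counts[label] += 1

    def ranked_labels(self, features: Sequence[float]) -> List[str]:
        """Return all known labels, closest class aggregate first."""
        centroids = {
            label: [s / self._counts[label] for s in sums]
            for label, sums in self._sums.items()
        }
        return sorted(centroids, key=lambda label: dist(features, centroids[label]))


# Hypothetical features: (average speed, maximum speed) in km/h.
classifier = NearestAggregateClassifier()
classifier.add_annotation((5.0, 7.0), "walk")
classifier.add_annotation((4.0, 6.0), "walk")
classifier.add_annotation((25.0, 50.0), "bus")
suggestions = classifier.ranked_labels((6.0, 8.0))
```

Keeping running sums and counts per class means that each new annotation updates the model in constant time, which fits the idea of retraining with every available annotation, and comparing against aggregates rather than individual examples dampens the effect of single noisy annotations.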
⁴ The source code of the iOS Mobility Collector app is available at https://github.com/Badger-MEILI/MEILI-Mobility-Collector-iOS
⁵ The source code of the Android Mobility Collector app is available at https://github.com/Badger-MEILI/MEILI-Mobility-Collector-Android
⁶ The source code of MEILI Travel Diary is available at https://github.com/Badger-MEILI/MEILI-Travel-Diary
⁷ The source code of the MEILI API is available at https://github.com/Badger-MEILI/MEILI-Travel-Diary
⁸ The source code of the MEILI Database is available at https://github.com/Badger-MEILI/MEILI-Database.
⁹ The source code for the trajectory segmentation part of the AI module is available
at https://github.com/Badger-MEILI/MEILI-Travel-Diary and the source
code for the travel mode, destination and purpose inference is available at
https://github.com/Badger-MEILI/MEILI-Database.
4 Case Study
The MEILI System was deployed during the same period of time as the
National Travel Survey collection performed in Stockholm, Sweden, in
November 2015. This section contains a general overview of the case
study, the implementation details of the deployed MEILI system, how
users interacted with the MEILI system and a benchmark performed on
the CRUD operations.
4.1 General Overview
The MEILI system was deployed and collected data between 2 and 9 November
2015, a period that overlaps the Swedish National Travel Survey
of 2015. The invitation that asked users to participate in using MEILI was
sent together with the Swedish National Travel Survey recruitment letter.
Out of all the reached users, 415 showed interest, and 171 used the MEILI
system for a period of at least one day. The median age of the users
is 42 years, and the MEILI system collected 2142 trips and 5961 triplegs
from the participating users, with a schema of 16 different travel modes
and 13 different purposes, using a POI set containing 21953 POIs and 6610
transportation POIs (i.e., transportation stations and parking places). During
the period of the case study when travel diaries were collected both
with MEILI and via the traditional declarative web interfaces, MEILI
captured 166 trips that were missed by the declarative method (usually
short trips with a median duration of 12 minutes), and missed
112 trips that were collected by the declarative method (lengthier
trips performed with a high speed travel mode). However, as the focus of
this paper is on benchmarking MEILI, for further information on the quality
of the collected travel diaries and a comparison to traditional declarative
methods of collecting travel diaries, the reader is directed towards
Allström et al., 2016. Furthermore, for other work that made use of MEILI
for data collection and reported on data quality, the reader is directed
towards Susilo et al., 2016; Allström et al., 2016; Prelipcean et al., 2015, 2016,
2017a; Prelipcean, 2016.
The user response rate decreased from day to day, which can be
explained by three main factors: 1) the MEILI Travel Diary annotation
experience was considered cumbersome, 2) unexpected errors prompted
users to stop using the system, and 3) the lack of a functional incentive
did not keep the users motivated enough to annotate through the whole
period.
4.2 Data usage
In order to understand how users interacted with MEILI, the logged annotations
and API calls of the users that successfully annotated data for at
least a whole week, i.e., 51 users, have been analyzed.
Figure 5: The distribution of the CRUD operations during the case study.
(a) Daily distribution of CRUD operations: the number of MEILI operations and
the percentage of each type of CRUD operation, by days since started annotating (1 to 9).
(b) Time of day distribution of operations: the percentage of operations per
day of the week and four-hour period, as follows.
            Monday  Tuesday  Wednesday  Thursday  Friday  Saturday  Sunday
8PM-12AM    42%     19%      9%         32%       0%      0%        47%
4PM-8PM     38%     2%       12%        12%       13%     100%      40%
12PM-4PM    15%     16%      40%        28%       5%      0%        9%
8AM-12PM    3%      44%      34%        3%        81%     0%        0%
4AM-8AM     0%      13%      3%         2%        0%      0%        0%
12AM-4AM    0%      3%       0%         20%       0%      0%        2%
First, the variability of the number of interactions with the system in
relation to the number of days since the annotations started is depicted
on the left side of Figure 5a. There are two peaks, on the fourth and
seventh day since the annotations started, which
correspond to an email sent to the registered users informing them that
a prize would be randomly awarded to the participants that annotate at least
five days of data (on the fourth day), and an email reminding the
registered users to annotate their data to be eligible for the prize (on
the seventh day). The low number of operations during the first day can
be explained by the fact that people forgot to start annotating their data,
encountered difficulties or did not collect enough data to be able to view
their trips in the MEILI Travel Diary. Similarly, the last annotation day
has a low number of operations since most users finished annotating their
data during the seventh or eighth day. Finally, the low number of annotations
for the fifth day might be due to the fact that the day in question was a
Friday and the users were not preoccupied with data annotations.
Second, the distribution of the CRUD operations was analyzed throughout
the nine annotation days to investigate whether there are days during
which people interact with the MEILI system in a drastically different
manner (right side of Figure 5a). The distribution of the CRUD operations
across the days is almost static, with the majority of the operations being
Update and Create (insert) operations, which was expected. The Read
operations correspond to pagination operations, which users seldom use
more than once to navigate to the next trip that requires annotation.
The Delete operations are mostly due to wrong inferences of trips
and triplegs by the AI component, and the low overall percentage of Delete
operations (as seen in Figure 5a, 5-15% of all operations are Delete operations,
compared with roughly 40% Update operations and 40% Create operations)
suggests that the AI performs relatively well at
segmenting trajectories into trips and triplegs (in particular with regard
to the issue of oversegmentation; Prelipcean et al., 2016).
Finally, the temporal distribution of the MEILI operations with regards
to day of the week and hour of day is depicted in Figure 5b. The peak
interaction with the MEILI system occurs during morning and noon from
Tuesday to Friday, with Thursday as an exception, where the peak also
occurs during night time, and during evening and night from Saturday to
Monday.
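The aggregation behind such a day-of-week and time-of-day view can be sketched as below. The four-hour period labels follow Figure 5b; normalizing the counts so that each day sums to 100% is an assumption about how the figure was built, and the function names and sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple

PERIODS = ["12AM-4AM", "4AM-8AM", "8AM-12PM", "12PM-4PM", "4PM-8PM", "8PM-12AM"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def period_of(timestamp: datetime) -> str:
    """Map a timestamp to its four-hour period of the day."""
    return PERIODS[timestamp.hour // 4]


def share_by_period(timestamps: Iterable[datetime]) -> Dict[Tuple[str, str], float]:
    """Percentage of each day's operations that fall in each four-hour period."""
    stamps = list(timestamps)
    cell_counts = Counter((DAYS[ts.weekday()], period_of(ts)) for ts in stamps)
    day_totals = Counter(DAYS[ts.weekday()] for ts in stamps)
    return {
        (day, period): 100.0 * count / day_totals[day]
        for (day, period), count in cell_counts.items()
    }


# Four logged operations on Tuesday, 3 November 2015.
logged_operations = [
    datetime(2015, 11, 3, 9, 0),
    datetime(2015, 11, 3, 10, 30),
    datetime(2015, 11, 3, 13, 0),
    datetime(2015, 11, 3, 21, 0),
]
shares = share_by_period(logged_operations)
```

With per-day normalization, days with few operations can show extreme values, so the absolute number of operations per day should be read alongside the percentages.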
This information suggests that there are both days and periods
of a day during which people interact with the system more often, which
raises concerns regarding the scalability of the system, i.e., whether the system
can handle the operations load during peak periods (e.g., after sending an
email reminder to users). This led to the need to benchmark the CRUD
operations to identify which operations are the most expensive in terms
of execution time, i.e., the bottlenecks. The benchmarking is discussed in the
following section.
4.3 Benchmarking
The benchmark was performed on a personal laptop that has the following
specifications:
• Processor: Intel(R) Core(TM) i7-4550U CPU @ 1.50GHz
• Memory: 2 x 4 GB DDR3 Synchronous 1600 MHz Random Access Memory
• Storage: Samsung MZNTD256HAGL-00000 256GB Solid State Drive
It is important to note that the benchmark is done on the database
operations that the API calls and not on the API calls themselves, since
benchmarking the API is difficult and unreliable due to numerous external
factors such as general server latency, automatic scalability of different
components, shared server resources, etc. Furthermore, the storage
unit is a Solid State Drive (SSD), which performs better than a Hard
Disk Drive (HDD).
Figure 6: CRUD operations with and without indexing. Indexes offer a
significant improvement in terms of execution time to most operations.
Each panel shows the execution time (ms) of the non-indexed and indexed
database per operation, labeled with its share of the operations of that type:
(a) Create operations: Destination (0.1%), Location (0.2%), POI (1.8%), Purpose (6.9%), Trip (21.7%), Transition (34.6%), Tripleg (34.6%).
(b) Read operations: Previous (14.7%), Next (85.3%).
(c) Update operations: Location (0.3%), POI (1.4%), Trip (48.6%), Tripleg (49.7%).
(d) Delete operations: Trip (44.5%), Tripleg (55.5%).
4.3.1 Measurements
Figure 6 shows the execution time of all CRUD operations grouped by
their type of operation. As seen in Figure 5, the operations that constitute
the majority of calls are of type Create and Update, which implies that
extra attention should be allocated to these types of operations.
The least expensive operations in terms of execution time are the Update
and Delete types, which have an average execution time on the order
of 1 ms. The Create operations are an order of magnitude more expensive
than the Update and Delete operations, where the operations that extract
the most probable purpose and the closest POIs to a destination or a tripleg
are the most expensive (the reader is directed towards Prelipcean, 2016,
for a thorough discussion on the most probable purpose extraction). The
most expensive operations are the Read ones, which are three orders of
magnitude more expensive than the Update and Delete operations, and two
orders of magnitude more expensive than the Create ones.
To reduce the execution time of these bottlenecks, the database has
been indexed and the tests have been re-run.
4.3.2 Identified Bottlenecks and Improvements
The database was indexed using the Postgres implementations of B-tree
indexes for primary keys, and an R-tree index built on top of a generalized
search tree (GiST) for geometries. Indexing the database has reduced the
execution time of the Create operations to the order of milliseconds, of
the Read operations to the order of hundreds of milliseconds, and of the
Update and Delete operations to hundreds of nanoseconds.
The indexing scheme did not affect the Update POI and Update Location
operations, which can be explained either by the fact that the queries
associated with these operations do not benefit from the new indexes, or
by the fact that these operations are mainly based on the primary keys of the
tables, which can fit and be cached in the main memory.
The indexing scheme greatly improved the execution time of the pre-
viously problematic Create operations that extract the most probable pur-
pose and the closest POIs to a destination or a tripleg.
Finally, the execution time of the Read operations has been greatly reduced
but is still high compared to the other types of operations, which
indicates that the pagination queries should be rewritten or new strategies for
pre-loading unannotated data into the main memory should be explored.
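The effect of an index on how a lookup is executed can be observed with a small, self-contained experiment. The sketch below uses SQLite from the Python standard library purely as a stand-in, since it needs no server; MEILI itself runs on PostgreSQL with PostGIS, where the query planner, the index types and the timings differ. The table and column names are hypothetical.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return the planner's description of how a query would be executed."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tripleg (tripleg_id INTEGER PRIMARY KEY, trip_id INTEGER, mode TEXT)"
)
conn.executemany(
    "INSERT INTO tripleg (trip_id, mode) VALUES (?, ?)",
    [(i // 3, "walk") for i in range(3000)],
)

lookup = "SELECT tripleg_id FROM tripleg WHERE trip_id = ?"

# Without an index on trip_id the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_tripleg_trip_id ON tripleg (trip_id)")

# With the index the planner can search for the matching rows directly.
plan_after = query_plan(conn, lookup, (42,))

matching_triplegs = conn.execute(lookup, (42,)).fetchall()
```

In PostgreSQL the equivalent inspection is done with `EXPLAIN`, and a spatial index on a PostGIS geometry column is created with `CREATE INDEX ... USING GIST (...)`.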
4.4 The effect of active learning for travel mode detection
While the number of tasks that are needed for transforming GPS trajectories
into travel diaries is substantial, this section only focuses on travel
mode inference. This particular task was chosen because of the active
learning process MEILI uses to continuously improve its predictions of
travel mode. Please note that this is just an example of the implementation;
for a thorough discussion and analysis of the employed algorithms
and selected features, see Prelipcean, 2016.
The current case study specifications required collecting travel diaries
with 14 different travel modes for triplegs: bicycle, bus, car as driver, car as
passenger, commuter train, ferryboat, flight, moped, other, subway, taxi,
train, tram, and walk. While this is a difficult task due to the inherent
similarity of different travel mode features (e.g., car as driver, car as
passenger and taxi), the active learning process allowed MEILI to achieve
75% precision and recall according to the new error measures proposed
by Prelipcean et al., 2016. The learning power of the classifier from
annotated data is shown in Figure 7.
Figure 7: The effect of active learning on precision and recall. The average
precision and recall increase, while the minimum and maximum converge
towards the average values and the standard deviation decreases over time.
(a) Precision improvement over time: precision (%) against the minimum number of triplegs, showing the MinMax range, Avg +/- SD and Avg.
(b) Recall improvement over time: recall (%) against the minimum number of triplegs, showing the MinMax range, Avg +/- SD and Avg.
Whereas the common research focus is on inferring travel modes (Gong et
al., 2017; Prelipcean et al., 2014, 2016, 2017b; Shafique & Hato, 2017; Zhou
et al., 2017) or destinations and purposes (Gong et al., 2016, 2017; Prelipcean,
2016; Su et al., 2014; Usyukov, 2017) after the completion of the
case study, the main issue with that approach is the lack of focus on user
experience improvements. Embedding active learning as part of the annotation
process improves the user experience since it minimizes the amount
of data the users have to annotate and gives the users the opportunity to
correct the output of the classifiers. In this case, the purpose of a classifier
shifts from providing the most likely travel mode (or any other inferred
entity) to providing a list of travel modes ordered by their likelihood of
occurrence. This shift of focus also calls for the consideration of different
performance measures, such as the top-k precision, i.e., the percentage
of instances in which a classifier proposes the correct target within its
first k predictions (in the list ordered by probability of occurrence). An in-depth
analysis of the performance of the classifiers used by MEILI is provided
by Prelipcean, 2016, where the classifier achieves a top-3
precision of 82.1% for 14 different travel modes. For further discussions
on travel mode detection from an interdisciplinary perspective, the reader
is directed towards Prelipcean et al., 2017b.
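The top-k precision defined above can be computed with a few lines of Python. This is a minimal sketch of the measure as described in the text; the function name and the sample predictions are hypothetical and are not results from the case study.

```python
from typing import List, Sequence


def top_k_precision(ranked_predictions: Sequence[Sequence[str]],
                    true_labels: Sequence[str], k: int) -> float:
    """Percentage of instances whose true label is among the first k predictions.

    Each element of ranked_predictions is a list of labels ordered from the
    most to the least likely.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return 100.0 * hits / len(true_labels)


# Invented example: four triplegs with ranked travel mode suggestions.
sample_predictions: List[List[str]] = [
    ["bus", "car as driver", "walk"],
    ["walk", "bus", "car as driver"],
    ["car as driver", "taxi", "bus"],
    ["subway", "tram", "train"],
]
sample_truth = ["car as driver", "walk", "bus", "bus"]
```

For a travel diary interface, a high top-k precision for small k means that the correct mode is usually among the first few options offered to the user, even when the single most likely suggestion is wrong.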
5 Conclusions
The main intention of this paper is to provide a readily available, easy to
improve and modify system that makes the study of travel diary extrac-
tion, destination or purpose inference, trajectory segmentation, and any
other type of research that uses trajectory data or travel diaries, widely ac-
cessible to the research community. To do so, this paper proposes an alter-
native to the two most widely used methods for data collection (complete
in-house development and contracting developers) by providing MEILI,
an open source travel diary collection, annotation and automation system.
The system’s architecture and data model are explained, and a benchmark
is provided to show the capabilities of MEILI.
The MEILI system has multiple components (i.e., Mobility Collector,
Travel Diary, API, Database and AI), each of which is a stand-alone
entity that communicates with other entities to provide the MEILI system
experience. The main advantage of the modular architecture is
the ease of improvement of the MEILI system, where improvements
for each component can be suggested without affecting the rest of the
system. For example, changing the data model in the database can be done
independently of the other modules as long as the database functions that
are linked to the API provide the same output as before. This advantage
mostly translates into minimized development efforts, allowing backend
developers to focus on the Database and API components, frontend
developers on the Travel Diary component, and mobile developers on the
Mobility Collector component.
The MEILI system was trialed during a nine-day period in November
2015, in Stockholm, and the users' interactions with the system were
logged. These data were analyzed to identify when people are more
likely to spend time annotating their data into travel diaries, which allows
administrators to be pro-active about resource management by allocating
more resources during peak periods, as well as to estimate the cost
of running MEILI at a certain throughput level. Furthermore, the
number of operations per day decreases on Fridays and increases
after an informative email about data annotation is sent to the user group.
Finally, after identifying the peak annotation periods, the paper provides
a benchmark on the CRUD operations supported by MEILI and
identifies the Read operations as bottlenecks. The database was then
indexed, and the execution time of the Read operations was an order of
magnitude lower. However, efforts should be made towards decreasing
the execution time to the same order of magnitude as that of the Create,
Update and Delete operations.
6 Future Work
While it is clear that the scope of the future work regarding the exploration
of the data collected with MEILI is very broad, since it encompasses mul-
tiple active research areas and disciplines, this section describes the future
work on the MEILI system, and not on the potential of the data collected
with MEILI.
One of the future work priorities of the authors is to further decrease
the execution time of the CRUD operations, with an emphasis on the ex-
ecution time of the Read operations. While the execution time might not
be problematic for hundreds of active users, it is a bottleneck that can pre-
vent MEILI from scaling to thousands of users. Furthermore, while the
indexing strategy used to decrease the execution time of the CRUD oper-
ations is suitable for the current data size, it is worth exploring whether it
would also be suitable when increasing the data size by multiple orders of
magnitude.
Another priority of the authors is to perform realistic simulations to
identify the real-time limits of the MEILI Database and of the whole MEILI
system as the number of active users increases. This can give further in-
sight regarding how MEILI scales with an increase of the number of users,
and can raise discussions on whether the newer trend of non-relational
database models (Leavitt, 2010) would be more suitable than the current
relational model.
As stated throughout the paper, one of the intentions behind making
MEILI widely available and open source is to kindle the possibility of hav-
ing a group of experts actively developing and maintaining the MEILI
code base. This would allow scientists to focus on using MEILI as a sys-
tem for collecting trajectories fused with accelerometer readings that are
semantically enriched into travel diaries, as opposed to focusing on developing
a data collection system that might only be used once. The future
work along this direction consists of the initial establishment of a development
community around MEILI.
Finally, one of the interests of the main author is to use MEILI for different
data collection ventures, not limited to travel diary collection, in different
geographical places to understand how the usage of the system varies
between users from different geographical regions, age groups, interests,
etc. This would allow for the proposal of efficient strategies for communicating
information to the users, as well as the identification of good
incentives for people to use MEILI outside of the scope of organized case
studies (e.g., identifying services that could be offered by MEILI such as
notifying the users when their regular route towards the next destination
is congested).
Acknowledgments
This work was partly supported by Trafikverket (Swedish Transport
Administration) under Grant “TRV 2014/10422”.
References
Allström, A., Gidófalvi, G., Kristoffersson, I., Prelipcean, A. C., Rydergren,
C., Susilo, Y. O., & Widell, J. (2016). Experiences from smartphone
based travel data collection – system development and evaluation.
Retrieved from https://www.kth.se/files/view/yusak/5836ed8912b51d1042597bac/SPOT final report.pdf/
Allström, A., Prelipcean, A. C., Gejdebäck, M., & Skoglund, T. (2016).
Erfarenheter från försök med smartphone-baserad resdatainsamling i
Göteborg. (In Swedish)
Axhausen, K. W. (2008). Social networks, mobility biographies, and travel:
survey challenges. Environment and Planning B: Planning and Design,
35(6), 981–996.
Berger, M., & Platzer, M. (2015). Field evaluation of the smartphone-based
travel behaviour data collection app smartmo. Transportation
Research Procedia, 11, 263–279.
Bohte, W., & Maat, K. (2009). Deriving and validating trip purposes and
travel modes for multi-day GPS-based travel surveys: A large-scale
application in the Netherlands. Transportation Research Part C: Emerging
Technologies, 17(3), 285–297.
Clarke, M., Dix, M., & Jones, P. (1981). Error and uncertainty in travel
surveys. Transportation, 10(2), 105–126.
Cottrill, C., Pereira, F., Zhao, F., Dias, I., Lim, H., Ben-Akiva, M., & Zegras,
P. (2013). Future mobility survey: Experience in developing a
smartphone-based travel survey in Singapore. Transportation Research
Record: Journal of the Transportation Research Board, (2354), 59–67.
Express - Node.js web application framework. (n.d.). https://expressjs.com/.
(Accessed: 2017-06-27)
Geurs, K. T., Thomas, T., Bijlsma, M., & Douhou, S. (2015). Automatic trip
and mode detection with move smarter: First results from the Dutch
mobile mobility panel. Transportation Research Procedia, 11, 247–262.
Golob, T. T., & Meurs, H. (1986). Biases in response over time in a seven-day
travel diary. Transportation, 13(2), 163–181.
Gong, L., Kanamori, R., & Yamamoto, T. (2017). Data selection in machine
learning for identifying trip purposes and travel modes from longitudinal
GPS data collection lasting for seasons. Travel Behaviour and
Society.
Gong, L., Yamamoto, T., & Morikawa, T. (2016). Comparison of activity
type identification from mobile phone GPS data using various machine
learning methods. Asian Transport Studies, 4(1), 114–128.
Greene, E., Flake, L., Hathaway, K., & Geilich, M. (2016). A seven-day
smartphone-based GPS household travel survey in Indiana 2. In
Transportation Research Board 95th annual meeting.
Haklay, M., & Weber, P. (2008). OpenStreetMap: User-generated street
maps. IEEE Pervasive Computing, 7(4), 12–18.
Kim, D. H., Kim, Y., Estrin, D., & Srivastava, M. B. (2010). Sensloc: sensing
everyday places and paths using less energy. In Proceedings of the 8th
ACM conference on embedded networked sensor systems (pp. 43–56).
Leavitt, N. (2010). Will NoSQL databases live up to their promise? Computer,
43(2).
Montini, L., Prost, S., Schrammel, J., Rieser-Schüssler, N., & Axhausen,
K. W. (2015). Comparison of travel diaries generated from smartphone
data and dedicated GPS devices. Transportation Research Procedia,
11, 227–241.
Murakami, E., & Wagner, D. P. (1999). Can using global positioning system
(GPS) improve trip reporting? Transportation Research Part C: Emerging
Technologies, 7(2), 149–165.
Nitsche, P., Widhalm, P., Breuss, S., & Maurer, P. (2012). A strategy on
how to utilize smartphones for automatically reconstructing trips in
travel surveys. Procedia - Social and Behavioral Sciences, 48, 1033–1046.
Node.js. (n.d.). https://nodejs.org/en/. (Accessed: 2017-06-27)
Open Database License (ODbL) v1.0. (n.d.). https://opendatacommons.org/licenses/odbl/1.0/.
(Accessed: 2017-06-27)
Osm2pgsql. (n.d.). https://github.com/openstreetmap/osm2pgsql.
(Accessed: 2017-06-27)
Overpass API. (n.d.). http://overpass-api.de/. (Accessed: 2017-06-27)
Pierce, B., Casas, J., & Giaimo, G. (2003). Estimating trip rate under-reporting:
preliminary results from the Ohio household travel survey.
In Transportation Research Board 82nd annual meeting, National
Research Council, Washington, DC.
PostGIS – Spatial and Geographic Objects for PostgreSQL. (n.d.). http://postgis.net/.
(Accessed: 2017-06-27)
PostgreSQL: The world's most advanced open source database. (n.d.).
https://www.postgresql.org/. (Accessed: 2017-06-27)
Prelipcean, A. C. (2016). Capturing travel entities to facilitate travel behaviour
analysis: A case study on generating travel diaries from trajectories fused
with accelerometer readings (Unpublished doctoral dissertation). KTH
Royal Institute of Technology.
Prelipcean, A. C., Gidófalvi, G., & Susilo, Y. O. (2014). Mobility collector.
Journal of Location Based Services, 8(4), 229–255.
Prelipcean, A. C., Gidófalvi, G., & Susilo, Y. O. (2015). Comparative framework
for activity-travel diary collection systems. In Models and technologies
for intelligent transportation systems (MT-ITS), 2015 international
conference on (pp. 251–258).
Prelipcean, A. C., Gidófalvi, G., & Susilo, Y. O. (2016). Measures of transport
mode segmentation of trajectories. International Journal of Geographical
Information Science, 30(9), 1763–1784.
Prelipcean, A. C., Gidófalvi, G., & Susilo, Y. O. (2017a). A series of three case
studies on the semi-automation of activity travel diary generation using
smartphones (Tech. Rep.).
Prelipcean, A. C., Gidófalvi, G., & Susilo, Y. O. (2017b). Transportation
mode detection – an in-depth review of applicability and reliability.
Transport Reviews, 37(4), 442–464.
Richardson, A. J., Ampt, E. S., & Meyburg, A. H. (1995). Survey methods for
transport planning. Eucalyptus Press Melbourne.
rmove. (n.d.). https://rmove.rsginc.com/. (Accessed: 2017-12-15)
Safi, H., Assemi, B., Mesbah, M., & Ferreira, L. (2017). An empirical comparison
of four technology-mediated travel survey methods. Journal
of Traffic and Transportation Engineering (English Edition), 4(1), 80–87.
Sense.dat - dat.mobility. (n.d.). http://www.dat.nl/en/products/sensedat/.
(Accessed: 2017-12-15)
Shafique, M. A., & Hato, E. (2017). Classification of travel data with multiple
sensor information using random forest. Transportation Research
Procedia, 22, 144–153.
Stats - OpenStreetMap Wiki. (n.d.). http://wiki.openstreetmap.org/wiki/Stats.
(Accessed: 2017-06-27)
Stopher, P. (1992). Use of an activity-based diary to collect household
travel data. Transportation, 19(2), 159–176.
Stopher, P., FitzGerald, C., & Zhang, J. (2008). Search for a global positioning
system device to measure person travel. Transportation Research
Part C: Emerging Technologies, 16(3), 350–369.
Su, X., Tong, H., & Ji, P. (2014). Activity recognition with smartphone
sensors. Tsinghua Science and Technology, 19(3), 235–249.
Susilo, Y. O., Prelipcean, A. C., Gidofalvi, G., Allstr ¨
om, A., Kristoffers-
son, I., & Widell, J. (2016). Lessons from a trial of meili, a smart-
phone based semi-automatic activity-travel diary collector, in stock-
holm city, sweden.
Usyukov, V. (2017). Methodology for identifying activities from gps data
streams. Procedia Computer Science,109, 10–17.
Wang, B., Gao, L., & Juan, Z. (2017). A trip detection model for individ-
ual smartphone-based gps records with a novel evaluation method.
Advances in Mechanical Engineering,9(6), 1687814017705066.
Wermuth, M., Sommer, C., & Kreitz, M. (2003). Impact of new technologies
in travel surveys. In Transport survey quality and innovation (pp. 455–
481). Emerald Group Publishing Limited.
Wolf, J. (2006). Applications of new technologies in travel surveys.
In Travel survey methods: Quality and future directions (pp. 531–544).
Emerald Group Publishing Limited.
Wolf, J., Guensler, R., & Bachman, W. (2001). Elimination of the travel
diary: Experiment to derive trip purpose from global positioning
system travel data. Transportation Research Record: Journal of the Trans-
portation Research Board(1768), 125–134.
Wolf, J., Oliveira, M., & Thompson, M. (2003). The impact of trip underre-
porting on vmt and travel time estimates: preliminary findings from
the california statewide household travel survey gps study. Trans-
portation Research Record,1854, 189–198.
Wolf, J., Sch ¨
onfelder, S., Samaga, U., Oliveira, M., & Axhausen, K. (2004).
Eighty weeks of global positioning system traces: approaches to en-
riching trip information. Transportation Research Record: Journal of the
Transportation Research Board(1870), 46–54.
Wolf, J. L. (2000). Using gps data loggers to replace travel diaries in the collection
of travel data (Unpublished doctoral dissertation). School of Civil and
Environmental Engineering, Georgia Institute of Technology.
Zhou, C., Jia, H., Gao, J., Yang, L., Feng, Y., & Tian, G. (2017). Travel
mode detection method based on big smartphone global position-
ing system tracking data. Advances in Mechanical Engineering,9(6),
1687814017708134.
30