An Early Event Detection Technique with Bus GPS Data
Shunsuke Aoki
shunsuka@andrew.cmu.edu
Carnegie Mellon University
Pittsburgh, Pennsylvania
Kaoru Sezaki
sezaki@iis.u-tokyo.ac.jp
The University of Tokyo
Tokyo, Japan
Nicholas Jing Yuan
Xing Xie
nicholas.yuan@microsoft.com
xing.xie@microsoft.com
Microsoft Research Asia
Beijing, China
ABSTRACT
The analysis and study of the relationship between a geo-spatial event and human mobility in an urban area is very significant for improving productivity, mobility, and safety. In particular, to alleviate serious road congestion, traffic jams, and stampedes, it is essential to predict and be informed about the occurrence of an event as early as possible. When an event is known in advance, some of those who are not interested in it might change their plans or take a detour to avoid getting caught in heavy congestion. In this context, this paper presents an early event detection technique using GPS trajectories collected from periodic-cars, which are vehicles periodically traveling on a pre-scheduled route with a pre-determined departure time, such as a transit bus, shuttle, garbage truck, or municipal patrol car. Using these trajectories, which provide real-time and continuous traffic flow and speed, our technique detects large-scale events in advance without incurring any privacy invasion. The behavior of periodic-cars shows a sign of a large-scale event before attendees gather around a venue, because traffic can slow down around the venue before the event occurs. We evaluated our method using data from over 7,000 buses in Beijing from January to May 2015, which we compared with check-in data collected from a social network service.
CCS CONCEPTS
• Information systems → Spatial-temporal systems; Information integration;
KEYWORDS
Urban computing, event detection, GPS trajectory, location knowledge
ACM Reference format:
Shunsuke Aoki, Kaoru Sezaki, Nicholas Jing Yuan, and Xing Xie. 2017. An
Early Event Detection Technique with Bus GPS Data. In Proceedings of
SIGSPATIAL’17, Los Angeles Area, CA, USA, November 7–10, 2017, 4 pages.
https://doi.org/10.1145/3139958.3139959
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
SIGSPATIAL’17, November 7–10, 2017, Los Angeles Area, CA, USA
©2017 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-5490-5/17/11.
https://doi.org/10.1145/3139958.3139959
Figure 1: Example of an anomalous traffic signal. (a) Usual traffic flow. (b) Anomalous traffic flow. (c) Temporal changes.
1 INTRODUCTION
Large-scale events that attract many participants in urban areas have a strong negative impact on productivity, mobility, comfort, and safety [2, 3]. For example, within the last few years, a large number of serious accidents caused by congestion resulting from football games, religious ceremonies, festivals, and similar events have occurred worldwide. To alleviate traffic jams and congestion, people need to know in advance that an event will occur. If an event is known beforehand, some of those who are not interested in it might change their plans or take a detour to avoid getting caught in heavy congestion.
Many research studies have focused on event detection and extraction based mainly on data generated in the cyber world, that is, on social media platforms such as Twitter, Instagram, and Foursquare [1, 4]. However, these data are poorly suited to detecting a geo-spatial event before it occurs, because users tend to share the location of an event on social media only after arriving at the venue. In contrast, the traffic flow around the venue might indicate a large-scale event before the attendees gather there.
In this context, we present an early event detection technique using GPS trajectories collected from periodic-cars, such as transit buses, shuttles, garbage trucks, or municipal patrol cars, which periodically travel on a pre-scheduled route with a pre-determined departure time. Since periodic-cars are not allowed to take an alternative route even when an anomalous traffic jam occurs, their trips are adversely affected by traffic jams and congestion. An example is described in the following.
Example: In Figure 1 (a), a transit bus, one of the periodic-cars, runs on a pre-scheduled route that comprises a round trip between point A and point B. When a large-scale event occurs at location V1, as shown in Figure 1 (b), it affects traffic speeds in the surroundings before the event starts, because people are gathering at location V1 in their own vehicles, on bicycles, or in taxicabs. Therefore, the traffic flow heading toward V1 from its surrounding area increases considerably and traffic congestion occurs around V1. Even under this congestion, transit buses have to keep traveling on their fixed routes from point A to point B. Figure 1 (c) shows the temporal behavior of a bus: it is caught in a massive traffic jam caused by the event at V1, and its speed decreases dramatically. A driver may attempt to recover from the delay, but typically never causes a delay intentionally. Figures 1 (a) and 1 (b) show one bus line for simplicity, but in practice a city has many pre-determined lines on which many transit buses run routinely.
The main benefits of our research are two-fold. First, our event detection technique uses the GPS trajectories of periodic-cars, which do not contain any private information. Second, since periodic-cars run on pre-defined routes, their speed is much more sensitive to congestion than that of other vehicles: taxis and private cars with no interest in the event might change their plans and take a detour [5, 6].
The key idea of our research is to evaluate the road status continuously by monitoring periodic-cars. The periodic-car is defined as follows.
Periodic-car: A car running on a pre-defined route at a constant period, such as a transit bus, school bus, garbage truck, or municipal patrol car.
The contribution of our paper lies in two aspects.
Network-based Event Detection: We design an early event detection algorithm with a time-dependent network using the features of periodic-cars.
Real Data Evaluation: We evaluate our method using a large-scale set of real GPS trajectories generated by over 7,000 buses in Beijing from January to May 2015.
2 SPATIO-TEMPORAL EVENT DETECTION
In this section, we present a spatio-temporal event detection technique using a Time-dependent Congestion Network (TCN) and a Spatio-Temporal Event Likelihood (STEL). Our technique detects anomalous traffic speeds, connects the affected road segments into the TCN, and uses the STEL to estimate the event venues where people are gathering.
2.1 Time-dependent Congestion Network (TCN)
The Time-dependent Congestion Network (TCN) is composed of anomalous road segments where the traffic speed is much slower than usual. With the TCN, we can recognize an event occurrence by measuring the size of the TCN and by monitoring its multiple sub-networks. To develop the TCN, we process the data in three steps: (i) Converting to Traffic Speed, (ii) Network Mapping, and (iii) Edge Anomaly Detection.
First, since periodic-cars run on pre-defined routes and do not take detours, we can estimate the road traffic speed from the GPS trajectory data. We calculate the trip distance between consecutive samples using the map information, and estimate the traffic speed from the calculated distance and the time taken.
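The speed conversion step can be illustrated with a minimal Python sketch. It assumes each GPS sample has already been map-matched to a distance travelled along the bus route; the `(timestamp_s, route_offset_m)` tuple layout and the function name are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def segment_speeds_kmh(samples: List[Tuple[float, float]]) -> List[float]:
    """Estimate the speed between consecutive samples of one bus.

    Each sample is (timestamp_s, route_offset_m), where route_offset_m is the
    distance travelled along the fixed route (hypothetical field, assumed to
    be produced by map matching).
    """
    speeds = []
    for (t0, d0), (t1, d1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order samples
        speeds.append((d1 - d0) / dt * 3.6)  # m/s -> km/h
    return speeds


# Three samples taken 15 seconds apart.
samples = [(0.0, 0.0), (15.0, 75.0), (30.0, 105.0)]
speeds = segment_speeds_kmh(samples)
```

Because the route is fixed, the along-route offset is enough to obtain a distance; a vehicle that may detour would need full path reconstruction instead.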
Algorithm 1: Edge Anomaly Detection Technique
  Input: Real-time data $r^{\tau(D,T)}_{(n_i,n_j)}$
  Output: TRUE (congested) or FALSE (not congested)
  Extract $\Theta_{(n_i,n_j)} = \{\theta_{1(n_i,n_j)}, \theta_{2(n_i,n_j)}, \cdots\}$ from accumulated data;
  Select $\Theta^{\tau(D,T)}_{(n_i,n_j)} = \{\theta^{\tau(D,T)}_{1(n_i,n_j)}, \theta^{\tau(D,T)}_{2(n_i,n_j)}, \cdots\}$;
  $\Upsilon$ = Average($\Theta^{\tau(D,T)}_{(n_i,n_j)}$);
  if $r^{\tau(D_k,T_l)}_{(n_i,n_j)} / \Upsilon > \Gamma$ then
    return TRUE;
  else
    return FALSE;
  end
Second, our technique maps the calculated speed data onto the road network. In the road network, each intersection is regarded as a node and each road as an edge. To monitor the traffic speed in each direction, we treat the network as a directed graph.
Third, we monitor the traffic speed on each edge and compare the real-time data with the accumulated data using a threshold $\Gamma$. In addition, our anomaly detection technique accounts for time dependency: it compares data collected in the same time category $\tau$, which is determined by two factors, the day type ($D$) and the time period ($T$). For the day type, we distinguish weekdays and holidays, represented as $D = D_w$ and $D = D_h$, respectively. For the time period, we use three categories: morning rush hour (7-10 o’clock; $T = T_m$), daytime (10-17 o’clock; $T = T_d$), and evening rush hour (17-20 o’clock; $T = T_e$). That is, all of the data are categorized into 6 types, and the real-time data are compared with the accumulated data stored in the same category.
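As a rough illustration of the six categories, the following Python sketch maps a timestamp to a (day type, time period) pair. Several details are assumptions rather than statements from the paper: weekends are treated as holidays, public holidays are passed in through a hypothetical `is_holiday` flag, the hour ranges are taken as half-open intervals, and timestamps outside 7-20 o’clock are left uncategorized.

```python
from datetime import datetime
from typing import Optional, Tuple


def time_category(ts: datetime, is_holiday: bool = False) -> Optional[Tuple[str, str]]:
    """Return (day type, time period), or None outside the monitored hours."""
    # Assumption: Saturdays and Sundays count as holidays (D_h).
    day_type = "Dh" if (is_holiday or ts.weekday() >= 5) else "Dw"
    hour = ts.hour
    if 7 <= hour < 10:
        period = "Tm"  # morning rush hour
    elif 10 <= hour < 17:
        period = "Td"  # daytime
    elif 17 <= hour < 20:
        period = "Te"  # evening rush hour
    else:
        return None
    return (day_type, period)


sunday_evening = time_category(datetime(2015, 3, 15, 18, 30))
monday_morning = time_category(datetime(2015, 3, 16, 8, 0))
```

Keeping the category as a plain tuple makes it easy to use as a dictionary key when grouping accumulated speed data per edge.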
The algorithm for the edge anomaly detection is presented in Algorithm 1. The real-time traffic speed from node $n_i$ to node $n_j$, represented as $r^{\tau(D,T)}_{(n_i,n_j)}$, is used as the input. $\Theta_{(n_i,n_j)}$ represents the collection of the accumulated data for the edge from node $n_i$ to node $n_j$, and $\theta_{b(n_i,n_j)}$ ($b = 1, 2, \cdots$) represents a single data point for that edge. Once the algorithm detects an edge as anomalous, the edge is added to the TCN.
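A minimal Python sketch of the edge test follows. Algorithm 1 compares the real-time value with the category average $\Upsilon$ through the threshold $\Gamma$; the sketch assumes that an edge is flagged when its real-time speed falls below the fraction `gamma` of that average. This direction of the comparison is an interpretation chosen to match the statement that anomalous segments are much slower than usual, and the function name and the value `gamma = 0.5` are hypothetical.

```python
from statistics import mean
from typing import Sequence


def is_congested(realtime_speed: float, history: Sequence[float], gamma: float) -> bool:
    """Flag an edge whose speed is well below its historical average.

    `history` holds accumulated speeds for the same edge and the same
    (day type, time period) category. Assumption: congestion means
    realtime_speed < gamma * average(history).
    """
    if not history:
        return False  # no baseline yet, so nothing can be called anomalous
    baseline = mean(history)
    return realtime_speed < gamma * baseline


history = [20.0, 21.0, 19.0, 22.0]  # illustrative speeds in km/h
slow = is_congested(7.76, history, gamma=0.5)
normal = is_congested(18.0, history, gamma=0.5)
```

Edges for which the function returns `True` would be the ones added to the TCN.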
2.2 Event Venue Estimation
The venue estimation technique is composed of the following three steps: (i) Calculating a Cascading Level, (ii) Estimating Human Gathering, and (iii) Calculating a Spatio-Temporal Event Likelihood.
First, we assign a Cascading Level $\phi_{n_l}$ to each node to determine where the network is congested and how large the congestion is. The Cascading Level $\phi_{n_l}$ describes how many continuous upstream links the node $n_l$ has in the TCN. The collection of the Cascading Levels is denoted $\Phi = \{\phi_{n_1}, \phi_{n_2}, \cdots\}$.
The algorithm for calculating the Cascading Level $\phi_{n_l}$ is presented in Algorithm 2, where the TCN is represented as $G_0 = (V(G_0), E(G_0))$. Here, $deg(n_l)$ represents the indegree of the node $n_l$, which is the number of edges leading to that node.
Figure 2: Time-dependent Congestion Network (TCN).
Algorithm 2: Calculating the Cascading Levels $\Phi$
  foreach $n_l \in V(G_0)$ do
    if $deg(n_l) = 0$ then
      $\phi_{n_l} = 0$;
    else
      $\phi_{n_l} = 1$;
    end
  end
  while not reach stable $\Phi$ do
    $k = 1$;
    foreach $n_l \in V(G_0)$ do
      foreach $n_u \in n_l^{parent}$ do
        if $\phi_{n_u}$ […] $k$ then
          $\phi_{n_l}$++;
        end
      end
    end
    $k$++;
  end
Algorithm 3: Estimating the Human Gathering
  foreach $n_l \in V(G_0)$ do
    if $\phi_{n_l} \ge$ […] then
      $w_0 = n_l$;
      for $k = 0 : (\phi_{n_l} - 1)$ do
        foreach $w_{k+1} \in w_k^{parent}$ do
          if $\phi_{w_{k+1}} < \Psi$ then
            $M_t$.add($w_{k+1}$, $n_l$)
          end
        end
      end
    end
  end
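The following Python sketch shows one way to compute such levels. It does not transcribe Algorithm 2 line by line; it assumes that the Cascading Level of a node equals the length of the longest chain of congested edges leading into it, and that the TCN contains no directed cycle. Both are interpretations of "how many continuous upstream links", and the function and variable names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Node = Hashable


def cascading_levels(edges: Iterable[Tuple[Node, Node]]) -> Dict[Node, int]:
    """Level 0 for nodes with indegree 0, else 1 + the largest parent level."""
    parents: Dict[Node, List[Node]] = {}
    nodes = set()
    for u, v in edges:  # congested edge u -> v
        nodes.update((u, v))
        parents.setdefault(v, []).append(u)

    # Initialisation as in Algorithm 2: 0 if indegree is 0, otherwise 1.
    levels = {n: (1 if parents.get(n) else 0) for n in nodes}

    # Repeat until the levels are stable; an acyclic graph needs at most
    # len(nodes) passes, which also bounds the loop.
    for _ in range(len(nodes)):
        changed = False
        for n in nodes:
            if parents.get(n):
                candidate = 1 + max(levels[p] for p in parents[n])
                if candidate > levels[n]:
                    levels[n] = candidate
                    changed = True
        if not changed:
            break
    return levels


# Congestion chain a -> b -> c -> d with a side branch e -> c.
levels = cascading_levels([("a", "b"), ("b", "c"), ("c", "d"), ("e", "c")])
```

On a simple chain the level of a node is just its distance from the start of the congestion, so a high level marks the downstream end of a long congested stretch.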
After assigning the Cascading Level to each node, our technique estimates the human mobility using the TCN with the Cascading Levels. The algorithm for estimating the direction of human gathering is presented in Algorithm 3, where $M_t$ represents the collection of human mobility directions at time $t$.
To monitor the human mobility, the algorithm uses two thresholds: […] and $\Psi$. The first threshold is applied to the Cascading Level to extract the events that have a considerable impact on human mobility. The second threshold, $\Psi$, is used to identify the origin point of the human movement.
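A minimal Python sketch of the upstream walk in Algorithm 3 is given here. The parameter `min_level` stands in for the first threshold and `psi` for $\Psi$; both names are hypothetical. When a node has several parents, the sketch follows all of them level by level, which is an assumption about how the walk continues.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable


def estimate_gathering(
    levels: Dict[Node, int],
    parents: Dict[Node, List[Node]],
    min_level: int,
    psi: int,
) -> Set[Tuple[Node, Node]]:
    """Return (origin, destination) pairs describing movement toward congestion."""
    pairs: Set[Tuple[Node, Node]] = set()
    for n_l, phi in levels.items():
        if phi < min_level:
            continue  # congestion too small to indicate an event
        frontier = {n_l}
        for _ in range(phi):  # k = 0 .. phi - 1
            next_frontier = set()
            for w in frontier:
                for p in parents.get(w, []):
                    next_frontier.add(p)
                    if levels[p] < psi:
                        pairs.add((p, n_l))  # p is treated as an origin
            frontier = next_frontier
    return pairs


# Chain a -> b -> c -> d with levels 0, 1, 2, 3.
levels = {"a": 0, "b": 1, "c": 2, "d": 3}
parents = {"b": ["a"], "c": ["b"], "d": ["c"]}
pairs = estimate_gathering(levels, parents, min_level=3, psi=1)
```

In this example only node `d` passes the level threshold, and walking three steps upstream reaches `a`, the single node whose level is below `psi`.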
Finally, our technique calculates a Spatio-Temporal Event Likelihood (STEL) from the estimated human movements, as shown in Algorithm 4. We assign the STEL value $\rho$ to each area that is likely to host an event, and extract the areas whose value exceeds $\rho_{event}$, the threshold for the STEL value.
In the algorithm, we assume that the event venue is very close to the end of the congestion and that the venue is probably ahead of the congested road segments. When multiple sub-networks of the TCN indicate the congestion, the STEL value becomes high.
In Figure 3, we present a simple example of the collection $M_t$ and the STEL value calculation. Here, $M_t$ has four pairs $\{q^O_1, q^D_1\}$, $\{q^O_2, q^D_2\}$, $\{q^O_3, q^D_3\}$, and $\{q^O_4, q^D_4\}$, and each pair gives the STEL value to the corresponding area. The area with the red shadow has the highest STEL value in the scenario. To estimate the event venue location with high accuracy, we have to set an appropriate value for $\rho_{event}$.
Algorithm 4: Estimating the Event Venue using STEL value
  foreach $(q^O, q^D) \in M_t$ do
    Make the circle centered at $q^D$ with the radius of $\alpha$;
    Give STEL value $z$ to the semicircle closer to $q^O$;
    Give STEL value $c \cdot z$ to the semicircle farther from $q^O$ ($c > 1$);
  end
  Extract the area having STEL value $\rho > \rho_{event}$;
Figure 3: STEL value and probable event location.
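The STEL accumulation of Algorithm 4 can be sketched in Python as a function that scores a single point. The sketch assumes planar coordinates, splits each circle by the line through the destination perpendicular to the origin-to-destination direction, and sums the contributions of all pairs; the default values `z = 1.0` and `c = 2.0`, the treatment of points exactly on the dividing line as belonging to the closer semicircle, and all names are illustrative assumptions.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def stel_value(
    p: Point,
    pairs: Iterable[Tuple[Point, Point]],
    alpha: float,
    z: float = 1.0,
    c: float = 2.0,
) -> float:
    """Sum the STEL contributions of all (origin, destination) pairs at p."""
    total = 0.0
    for (ox, oy), (dx, dy) in pairs:
        vx, vy = p[0] - dx, p[1] - dy
        if vx * vx + vy * vy > alpha * alpha:
            continue  # p lies outside the circle around the destination
        # A positive dot product means p is ahead of the destination, i.e.
        # in the semicircle farther from the origin.
        ahead = vx * (dx - ox) + vy * (dy - oy) > 0
        total += c * z if ahead else z
    return total


pairs = [((0.0, 0.0), (10.0, 0.0)), ((12.0, 10.0), (12.0, 2.0))]
ahead_of_both = stel_value((11.0, 0.0), pairs, alpha=3.0)
behind_first = stel_value((9.0, 0.0), pairs, alpha=3.0)
far_away = stel_value((20.0, 0.0), pairs, alpha=3.0)
```

Evaluating the function on a grid of points and keeping those whose value exceeds $\rho_{event}$ would give the candidate venue area; a point that lies ahead of several congestion fronts scores highest.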
3 EXPERIMENTAL EVALUATION
In this section, we implement our early event detection technique and evaluate its detection accuracy for geo-spatial events. We first collect information on 209 geo-spatial events to use as ground-truth data. The event information was collected from different sources, such as flyers, news reports, posts on social network services, and web pages. Some of the event information was available before the events occurred, but some was not.
3.1 Data Set
Route-bus trajectories
We use the GPS trajectories of fixed-route buses as mobility data, whose statistics are shown in Table 1¹. There are over 7,000 sensor-equipped buses. However, they sometimes rest in the garage, and therefore the number of buses operating at any one time is smaller than 5,000.
Map data
We have the road networks of Beijing, the statistics of which are shown in Table 1. Each intersection is regarded as a node in the road networks. In addition, we have all of the bus-route information in Beijing. The dataset contains 466 bus lines, and different bus lines can use the same road.
Check-in data
We use the check-in data of SinaWeibo². SinaWeibo provides location-based services such as check-ins. The check-in data in SinaWeibo contain a location ID, a user ID, and a timestamp. We use the check-in data to understand the similarities and differences between traffic flow changes and social media responses.
3.2 Detection Accuracy
This section evaluates the reliability of our road-network-based event detection algorithm. As a parameter of the detection algorithm, the time span of the calculation is set to 15 minutes.
As shown in Table 2, we focus on the events happening in the surroundings of the Workers’ Stadium and the LeSports Center.
¹ The data was collected by crawling the Beijing Real-time Buses. https://itunes.apple.com/us/app/bei-jing-shi-shi-shi-gong-jiao/id703306506?mt=8
² www.weibo.com
Table 1: Statistics of dataset (data duration: Jan-May 2015).

Category       Statistic                      Value
Trajectories   # of route-buses               7,131
               # of bus lines                 466
               # of effective days            106
               # of data points               158M
               Minimum sampling rate          15 (sec)
Roads          # of road segments             162,246
               # of road nodes                121,771
Social media   Avg. # of tweets per day       17,770
               Avg. # of check-ins per day    12,088
Table 2: Event detection accuracy.

Venue              Metric                          Value
Workers’ Stadium   # of Target Events              17
                   Precision (w/ interpolation)    0.727
                   Recall (w/ interpolation)       0.941
                   F-measure (w/ interpolation)    0.820
LeSports Center    # of Target Events              33
                   Precision (w/ interpolation)    0.737
                   Recall (w/ interpolation)       0.848
                   F-measure (w/ interpolation)    0.789
The ground-truth events at the Workers’ Stadium are 13 football games and 4 music concerts, and those at the LeSports Center are 9 basketball games, 22 music concerts, and 2 international conferences. As shown in Table 2, the Precision, Recall, and F-measure are sufficiently high (F-measures of 0.820 and 0.789), and our algorithm can detect the geo-spatial events happening in the city area. The events missed by our algorithm are music concerts that started in the morning, which is reasonable because the traffic impact of an event is small during the morning rush hour.
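As a quick consistency check of Table 2, the F-measure can be recomputed from the reported Precision and Recall. The sketch below assumes the standard definition, the harmonic mean of the two values; the function name is hypothetical.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


workers_stadium = f_measure(0.727, 0.941)   # approximately 0.820
lesports_center = f_measure(0.737, 0.848)   # approximately 0.789
```

Both values agree with the F-measures reported in Table 2 to three decimal places.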
3.3 Statistical Analysis
This evaluation assesses the statistical significance of our method using 209 geo-spatial events in Beijing. The events that we evaluated include sports games, music concerts, school festivals, exhibitions, and so on. These events were held at various venues, such as a sports center, football stadium, university, or exhibition center. To compare the data generated around the venues on an event day with those on an ordinary day, we focus on the mobility data collected in a circle with a radius of 1 km centered at the venue. The physical impact of each event is shown in Figure 4. The figure describes the speed changes of the route-buses running near the venue, comparing the traffic data collected half an hour before the event with the data collected in the same time period on an ordinary day.
In addition, we test for an average difference using the paired t-test. In the statistical data presented in Figure 4, the mean and standard deviation of the differences are 2.231 and 3.267, respectively. With 189 degrees of freedom, the test statistic is 9.315, and this value indicates that the event occurrence certainly impacts the traffic speed of the surroundings, because the p-value in this case is much smaller than 0.01.
3.4 Case Study
We further explore our approach using a case study. In the case study, we focus on the CBA playoff basketball game shown in Figure 5. The venue for the game is the LeSports Center, located in the center of Beijing, and the game started at 19:35. The mobility data used are those generated in a circle with a radius of 1 km centered at the venue. On the event day, the average speed between 18:00 and 19:00 fell to 7.76 km/h, compared with 20.47 km/h on a typical Sunday. Figure 5(b) shows the impact on the physical world and the cyber world, that is, the traffic impact and the number of check-ins, respectively. The sign in the transit bus data appears earlier than that in the microblog check-in data. In addition, the audience tends to check in at the beginning or at the end of the game.
Figure 4: Physical impacts by each event.
Figure 5: Traffic impacts by CBA basketball games. (a) Basic Information. (b) Traffic impacts and check-ins.

(a) Basic Information
Date                                           Mar 15 (Sun)
Location                                       LeSports Center
Start time                                     19:35
Daily avg. speed (Statistics, Sun)             19.57 km/h
Daily avg. speed (Event day)                   18.41 km/h
Avg. speed of 18:00-19:00 (Statistics, Sun)    20.47 km/h
Avg. speed of 18:00-19:00 (Event day)          7.76 km/h
4 CONCLUSION
In this paper, we presented an early event detection technique that uses GPS trajectories of periodic-cars, which routinely travel in the urban area. Since periodic-cars have pre-scheduled routes and departure times, their behavior reflects the movement of people around the event venue. Our method is promising and will be applicable to many other existing transportation services, such as school buses, garbage trucks, and municipal patrol cars.
REFERENCES
[1] Hila Becker, Mor Naaman, and Luis Gravano. 2011. Beyond Trending Topics: Real-World Event Identification on Twitter. ICWSM 11 (2011), 438–441.
[2] Liang Hong, Yu Zheng, Duncan Yung, Jingbo Shang, and Lei Zou. 2015. Detecting urban black holes based on human mobility data. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 35.
[3] Yoshihide Sekimoto, Ryosuke Shibasaki, Hiroshi Kanasugi, Tomotaka Usui, and Yasunobu Shimazaki. 2011. PFlow: Reconstructing people flow recycling large-scale social survey data. IEEE Pervasive Computing 10, 4 (2011), 27–35.
[4] Chaolun Xia, Raz Schwartz, Ke Xie, Adam Krebs, Andrew Langdon, Jeremy Ting, and Mor Naaman. In Proceedings of the companion publication of the 23rd international conference on World wide web companion (WWW).
[5] Jing Yuan, Yu Zheng, Xing Xie, and Guangzhong Sun. 2013. T-drive: enhancing driving directions with taxi drivers’ intelligence. Knowledge and Data Engineering, IEEE Transactions on 25, 1 (2013), 220–232.
[6] Yu Zheng, Yanchi Liu, Jing Yuan, and Xing Xie. In Proceedings of the 13th International Conference on Ubiquitous Computing. New York, NY, USA.
... These systems can be classified by their topic, sensing unit and sensing device (Table I). [5], [6], [7] Other City Sensing [8] [9], [10], [11] Rider/Driver Assessment [12] [13] [14], [15] General Purpose [16], [17], [18], [19] A. City Sensing ...
... As for other city sensing projects, Biketastic [8] is a platform about sensing the bike route from the mobile phone and facilitating the route exchange among bikers for better biking. BusBeat collects the sensor data to monitor urban environments using public transportation services [10]. Cruisers [9] is a sensing platform for cities using garbage collecting trucks. ...
... anomaly( p test ) = min(anomaly(s test ), anomaly(t test )) (7) where min() is a function returning the smaller of the two values. The intuition behind function (7) is that a trajectory that takes more time and drives a longer distance is more likely to detour maliciously, which means higher anomaly(s test ) and anomaly(t test ). ...
Article
Researchers have proposed many novel methods to detect abnormal taxi trajectories. However, most of the existing methods usually adopt a counting-based strategy, which may cause high false positives due to imprecisely identifying diverse trajectories as anomalies and therefore, they need the support of large-scale historical trajectories to work properly. To improve detection precision and efficiency, in this article, we propose STR, an online abnormal taxi trajectory detection method based on spatio-temporal relations. The basic principle behind STR is that given the displacement from the source point to a testing point, if the driving time and driving distance are not within the normal ranges, the point is identified as anomalous. To learn the two normal ranges for driving time and driving distance, STR defines two spatio-temporal models which characterize the relationship between displacement and driving distance/driving time. To improve detection efficiency, STR reduces the number of models that need to be learned by making full use of the similarity of transportation modes in different time periods and neighboring areas. The effectiveness and performance of STR are evaluated on real-world taxi trajectories. The experiment results show that compared with counting-based methods, STR achieves greater precision by reducing false positives. Furthermore, STR is more efficient than its counterparts and is suitable for online detection.
... We discussed vehicle collisions and shared road segments in Sections 3, 4, 5, 6 and 8, but we will have to consider city-scale traffic management [141,142,143] and navigation [144,145] at the same time, in order to decrease the opportunity that multiple vehicles encounter each other on public roads. By reducing the chances of encounters on roads, ...
Thesis
Full-text available
With advances in Cyber-Physical System (CPS) technologies, Connected and Automated Vehicles (CAVs) are becoming increasingly feasible. This advent of CAVs presents new opportunities to improve road safety, traffic throughput and energy efficiency. In fact, the National Highway Traffic Safety Administration (NHTSA) points out that more than 35,000 people die in motor vehicle-related crashes every year in the US. Automation technologies have the great potential to reduce that number because of one critical fact: more than 90 % of serious crashes occur due to human error. In addition, due to traffic congestion and resulting delays, Americans waste an extra 8.8 billion hours and an extra 3.3 billion gallons of fuel in 2017. NHTSA anticipates that CAVs alleviate traffic congestion and free up nearly 50 minutes each day for the average commuter. As for energy, the U.S. Energy Information Administration (EIA) reports that the U.S. net import of petroleum is approximately equal to 11 % of U.S. petroleum consumption. CO2 emissions due to motor gasoline and diesel fuel consumption comprise over 30 % of total U.S. energy-related CO2 emissions. Motivated by these datapoints, we observe that vehicle cooperation is key to the safety and energy efficiency of CAVs on public roads. In this dissertation, we introduce system-design methodologies, frameworks and tools to enable the design of safe, cooperative and energy-efficient connected and automated vehicles. First, we present a family of vehicle cooperation mechanisms for collision- and deadlock-free traffic management around shared road segments, such as road intersections, merge points, construction zones and single-track lanes. We also propose a framework and tool for eco-autonomous driving strategies that incorporate static and dynamic data sources to generate driving profiles and to define vehicle behavior goals. 
In these methodologies and frameworks, CAVs utilize a variety of information from local on-board sensors, V2X (Vehicle-to-Everything) communications, pre-loaded vehicle-specific lookup tables, and map database. Our proposed solutions are implemented and tested in the CMU self-driving software to achieve safe and energy-efficient automation.
... Traffic congestion and accidents can also be detected from vehicle GPS traces [4]. Aoki et al. [1] proposed using bus GPS traces to predict events such as traffic congestion before the events become more severe. Dong et al. [6] proposed using mobile phone call data record to predict unusual crowded events. ...
Conference Paper
Social media, traffic sensors, GPS trajectories, and location-based social network data provide diverse Spatio temporal information sources that help to detect and analysis Spatio temporal events. Nowadays, bike sharing systems are active all over the world in major cities, and collecting a large amount of data regarding trips taken by users and status of the stations. Through analysis of the data aggregated by bike sharing systems, one can gain an understanding of crowd/commuter movements and behaviors. However, no one has used only the bike sharing data for generic event detection. In this paper, we propose a clustering-based detection method to identify Spatio temporal events that deviate from normal or regular everyday life using publicly available bike sharing data. In particular, we apply spectral clustering on bike station and bike flow data as evolving graphs and monitor changes of the bike share network (edge/node values) over time. Our proposed method decides whether a cluster is expected or anomalous (unusual). When a cluster is anomalous, there is an unusual event occurring at that time instance. Preliminary results on 6-months of data from Philadelphia and Washington DC are used to show the feasibility of our proposed method. In particular, our preliminary results show that some signatures of local (and less prominent) events (e.g., university events/activities in an urban area) can show up when bike sharing data is utilized for generic event detection.
... For example, [52] uses taxi trajectory data to detect flawed urban planning, and [48] recommends driving directions based on patterns mined from historical taxi trajectory data. Besides, data from taxi and other transportation services have also been used in traffic speed prediction [42], event detection [3], city structure discovery [41], human mobility [6,[34][35][36], safe driving [47], crowd management [11,49], taxi ride-sharing [21,30], trajectory clustering [31], mobile crowd sensing [45], route planning [28] and other urban computing topics [40,43,44]. ...
Article
Full-text available
Ride-on-demand (RoD) services use dynamic prices to balance the supply and demand to benefit both drivers and passengers, as an effort to improve service efficiency. However, dynamic prices also create concerns for passengers: the “unpredictable” prices sometimes prevent them from making quick decisions at ease. It is thus necessary to give passengers more information to tackle this concern, and predicting dynamic prices is a possible solution. We focus on fine-grained dynamic price prediction – predicting the price for every single passenger request. Price prediction helps passengers understand whether they could get a lower price in neighboring locations or within a short time, thus alleviating their concerns. The prediction is performed by learning the relationship between dynamic prices and features extracted from multi-source urban data. There are linear or non-linear models as candidates for learning, and using different models leads to varying implications on accuracy, interpretability, model training procedures, etc. We train one linear and one non-linear model as representatives, and evaluate their performance from different perspectives based on real service data. In addition, we interpret feature contribution, at different levels, based on both models and figure out what features or datasets contribute the most to dynamic prices. Finally, based on evaluation results, we provide discussions on model selection under different circumstances, and propose a way to combine the two models. Our hope is that the study not only serves as an accurate prediction for passengers, but also provides concrete guidance on how to choose between models to improve the prediction.
Article
Intelligent city transportation systems are one of the core infrastructures of a smart city. The true ingenuity of such an infrastructure lies in providing the commuters with real-time information about citywide transport like public buses, allowing them to pre-plan their travel. However, providing prior information for transportation systems like public buses in real-time is inherently challenging because of the diverse nature of different stay-locations where a public bus stops. Although straightforward factors like stay duration extracted from unimodal sources like GPS at these locations look erratic, a thorough analysis of public bus GPS trails for 1335.365 km at the city of Durgapur, a semi-urban city in India, reveals that several other fine-grained contextual features can characterize these locations accurately. Accordingly, we develop BuStop , a system for extracting and characterizing the stay-locations from multi-modal sensing using commuters’ smartphones. Using this multi-modal information BuStop extracts a set of granular contextual features that allow the system to differentiate among the different stay-location types. A thorough analysis of BuStop using the collected in-house dataset indicates that the system works with high accuracy in identifying different stay-locations like regular bus stops, random ad-hoc stops, stops due to traffic congestion, stops at traffic signals, and stops at sharp turns. Additionally, we also develop a proof-of-concept setup on top of BuStop to analyze the potential of the framework in predicting expected arrival time, a critical piece of information required to pre-plan travel at any given bus stop. Subsequent analysis of the PoC framework, through simulation over the test dataset, shows that characterizing the stay-locations indeed helps make more accurate arrival time predictions with deviations less than 60 seconds from the ground-truth arrival time.
Article
The detection of anomalies in spatiotemporal traffic data is not only critical for intelligent transportation systems and public safety but also very challenging. Anomalies in traffic data often exhibit complex forms in two aspects: (i) spatiotemporal complexity (i.e. we need to associate individual locations and time intervals to form a panoramic view of an anomaly) and (ii) multi-source complexity (i.e. we need an algorithm that can model the anomaly degree of multiple data sources of different densities, distributions and scales). To tackle these challenges, we propose a three-step method that uses factor analysis to extract features, then uses a goodness-of-fit test to obtain the anomaly score of a single data point, and then uses a one-class support vector machine to synthesize the anomaly scores. Finally, we conduct extensive experiments on real-world trip data, including taxi and bike data, which demonstrate the effectiveness of our proposed approach.
Preprint
Intelligent city transportation systems are one of the core infrastructures of a smart city. The true ingenuity of such an infrastructure lies in providing commuters with real-time information about citywide transport like public buses, allowing them to pre-plan their travel. However, providing prior information for transportation systems like public buses in real-time is inherently challenging because of the diverse nature of the different stay-locations where a public bus stops. Although straightforward factors like stay duration, extracted from unimodal sources like GPS, look erratic at these locations, a thorough analysis of public bus GPS trails for 720 km of bus travel in the city of Durgapur, a semi-urban city in India, reveals that several other fine-grained contextual features can characterize these locations accurately. Accordingly, we develop BuStop, a system for extracting and characterizing the stay-locations from multi-modal sensing using commuters' smartphones. Using this multi-modal information, BuStop extracts a set of granular contextual features that allow the system to differentiate among the different stay-location types. A thorough analysis of BuStop using the collected dataset indicates that the system works with high accuracy in identifying different stay-locations like regular bus stops, random ad-hoc stops, stops due to traffic congestion, stops at traffic signals, and stops at sharp turns. Additionally, we develop a proof-of-concept setup on top of BuStop to analyze the potential of the framework in predicting expected arrival time, a critical piece of information required to pre-plan travel, at any given bus stop. Subsequent analysis of the PoC framework, through simulation over the test dataset, shows that characterizing the stay-locations indeed helps make more accurate arrival time predictions with deviations less than 60 s from the ground-truth arrival time.
Article
Understanding people flow on a macroscopic scale requires reconstructing it from various forms of existing fragmentary spatiotemporal data. This article illustrates a process for reconstructing such data using existing person-trip survey data.
Conference Paper
Many types of human mobility data, such as flows of taxicabs, card swiping data of subways, bike trip data and Call Detail Records (CDR), can be modeled by a Spatio-Temporal Graph (STG). An STG is a directed graph in which vertices and edges are associated with spatio-temporal properties (e.g. the traffic flow on a road and the geospatial location of an intersection). In this paper, we instantly detect interesting phenomena, termed black holes and volcanos, from an STG. Specifically, a black hole is a subgraph (of an STG) whose overall inflow is greater than its overall outflow by a threshold, while a volcano is a subgraph whose overall outflow is greater than its overall inflow by a threshold (detecting volcanos from an STG is proved to be equivalent to detecting black holes). The online detection of black holes/volcanos can reflect anomalous events, such as disasters and catastrophic accidents, in a timely manner, and therefore helps maintain public safety. The patterns of black holes/volcanos and the relations between them reveal human mobility patterns in a city, and thus help formulate better city planning or improve a system's operational efficiency. Based on a well-designed STG index, we propose a two-step black hole detection algorithm: the first step identifies a set of candidate grid cells to start from; the second step expands an initial edge in a candidate cell to a black hole and prunes other candidate cells after a black hole is detected. Then, we adapt this detection algorithm to a continuous black hole detection scenario. We evaluate our method based on Beijing taxicab data and bike trip data in New York, finding urban anomalies and human mobility patterns.
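The black hole and volcano definitions in the abstract above can be illustrated with a minimal sketch. It only checks the threshold condition for a given vertex set and is not the paper's indexed two-step detection algorithm. The function names, the edge-to-flow dictionary, and the choice to count only edges that cross the subgraph boundary are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source vertex, target vertex); hypothetical representation


def net_inflow(flows: Dict[Edge, float], subgraph: Set[str]) -> float:
    """Inflow minus outflow across the boundary of `subgraph`.

    Assumption: only edges crossing the boundary count; edges with both
    endpoints inside (or both outside) the subgraph are ignored.
    """
    inflow = sum(f for (u, v), f in flows.items()
                 if u not in subgraph and v in subgraph)
    outflow = sum(f for (u, v), f in flows.items()
                  if u in subgraph and v not in subgraph)
    return inflow - outflow


def classify(flows: Dict[Edge, float], subgraph: Set[str], threshold: float) -> str:
    """Label a vertex set as a black hole, a volcano, or neither."""
    diff = net_inflow(flows, subgraph)
    if diff > threshold:       # inflow exceeds outflow by more than the threshold
        return "black hole"
    if -diff > threshold:      # outflow exceeds inflow by more than the threshold
        return "volcano"
    return "neither"


# Toy flows on a four-vertex directed graph (made-up numbers).
example_flows = {("a", "b"): 10.0, ("b", "c"): 2.0, ("c", "a"): 1.0, ("d", "b"): 5.0}
```

With these toy flows, the vertex set `{"b"}` receives 15 units and emits 2, so it qualifies as a black hole for any threshold below 13. Swapping the roles of inflow and outflow gives the volcano test, which mirrors the equivalence noted in the abstract.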
Article
This paper presents a smart driving direction system leveraging the intelligence of experienced drivers. In this system, GPS-equipped taxis are employed as mobile sensors probing the traffic rhythm of a city and taxi drivers' intelligence in choosing driving directions in the physical world. We propose a time-dependent landmark graph to model the dynamic traffic pattern as well as the intelligence of experienced drivers so as to provide a user with the practically fastest route to a given destination at a given departure time. Then, a Variance-Entropy-Based Clustering approach is devised to estimate the distribution of travel time between two landmarks in different time slots. Based on this graph, we design a two-stage routing algorithm to compute the practically fastest and customized route for end users. We build our system based on a real-world trajectory dataset generated by over 33,000 taxis in a period of 3 months, and evaluate the system by conducting both synthetic experiments and in-the-field evaluations. As a result, 60-70% of the routes suggested by our method are faster than the competing methods, and 20% of the routes share the same results. On average, 50% of our routes are at least 20% faster than the competing approaches.
Conference Paper
User-contributed messages on social media sites such as Twitter have emerged as powerful, real-time means of information sharing on the Web. These short messages tend to reflect a variety of events in real time, making Twitter particularly well suited as a source of real-time event content. In this paper, we explore approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages. Our approach relies on a rich family of aggregate statistics of topically similar message clusters. Large-scale experiments over millions of Twitter messages show the effectiveness of our approach for surfacing real-world event content on Twitter.
Hila Becker, Mor Naaman, and Luis Gravano. 2011. Beyond Trending Topics: Real-World Event Identification on Twitter. ICWSM 11 (2011), 438-441.