Geospatial Data Aggregation and Reduction in Vehicular Sensing Applications: the Case of Road Surface Monitoring

Valerio Freschi, Saverio Delpriori, Lorenz Cuno Klopfenstein,
Emanuele Lattanzi, Gioele Luchetti, Alessandro Bogliolo
DiSBeF, University of Urbino, Urbino, Italy
Email: {valerio.freschi, saverio.delpriori, lorenz.klopfenstein, emanuele.lattanzi, gioele.luchetti, alessandro.bogliolo}@uniurb.it
Abstract—Mobile devices present several features which make
them attractive as enabling technology for crowdsensing systems.
In particular, their spectrum of sensing capabilities, together with
consolidated diffusion and ease of use, contributes to an increasing
adoption in different mobility-based sensing scenarios. On the
other hand, the availability of massive volumes of geospatial
data provided by large-scale distributed sensing systems prompts
the need for innovative approaches to efficient data gathering
and processing. Data reduction strategies are often necessary in
order to cope with challenges posed by these volumes, for instance
when dealing with real-time visualization of query results. In this
paper we present a data reduction and aggregation approach for
mitigating the impact of data size in a vehicular sensing application
aimed at monitoring the roughness of road surfaces. Data
collected by smartphones on board of vehicles is progressively
thinned at different levels of the proposed architecture through
sampling and spatial/temporal aggregation. Preliminary results
show that the proposed methodology provides substantial benefits
in terms of reduced impact of data while, at the same time, it
enables full exploitation of statistical error compensation.
I. INTRODUCTION
The increasing diffusion of mobile embedded sensing
devices with wireless communication capabilities opens the
way to unprecedented opportunities in the development of
large scale crowdsensing systems [1]. Current smartphones,
in particular, are commonly equipped with sensors that can
continuously monitor several physical quantities (e.g. acceleration).
This provides, in combination with location coordinates
available through GPS or other localization systems, a rich
source of geo-referenced information. Moreover, the pervasiveness
of these devices makes them the enabling technology
for designing large scale mobile distributed systems aimed at
massive sensing, either volunteer or incentive-based [2].
This perspective also poses significant challenges to the
research community. Building systems capable of efficiently
and accurately collecting, processing and making available
this wealth of data raises issues ranging from system
architecture to algorithmic design, and from communication
protocols to databases. An additional dimension is represented by the
massive nature of geo-referenced data to be handled by these
hardware/software systems, especially when vehicular sensing
applications are foreseen, which entail further peculiar issues
to be addressed [3]. As such, a common feature of several
research studies in recent scientific literature is represented by
efforts made towards effectively scaling systems for storage
and analysis of big geospatial sensing data.
In this paper we introduce a system for reduction and
aggregation of geospatial data in vehicle-based monitoring
applications. In particular, we describe a novel approach to
manage data produced in a crowdsensing application for road
surface quality control by means of spatial and temporal
aggregation techniques. We demonstrate that the proposed
distributed architecture is suitable to reduce the burden of
large scale geo-referenced data volumes produced by sens-
ing devices mounted on common smartphones. This system
architecture provides the opportunity for fine grained data
gathering and batch processing while, at the same time, it
enables effective visualization and real-time analysis. Progressive
spatial and temporal aggregation of data is performed at
different levels of the proposed architecture (from user mobile
devices on vehicles, up to the cloud), resulting in a significant
reduction (w.r.t. raw data logged from smartphones) of the
amount of data to be analyzed and visualized.
The paper is organized as follows: in Section II we summarize
the state of the art of related scientific literature. In
Section III we describe the proposed crowdsensing architecture
and data aggregation strategy. In Section IV we discuss
experimental results. In Section V we draw final conclusions.
II. RELATED WORK
A huge body of literature has flourished in the last decade
around the vast field of mobile sensing information systems.
In this section we summarize what are, in our opinion,
some of the main trends related to the topics within the
scope of this paper.
Crowdsensing is an increasingly popular paradigm for
gathering significant amounts of data from active communities
of users (i.e. participatory sensing) or agents opportunistically
carrying out sensing tasks (i.e. opportunistic sensing) [1]. Data
is usually sensed by mobile devices whose location can be
tracked with a given precision so that useful geo-referenced
information can be obtained and geographic information systems
(GIS) can be exploited for analytics extraction. The ever
increasing diffusion of commodity smartphones
and the availability of several sensors (e.g. accelerometers,
GPS, ambient light, microphones, cameras, etc.) on board of
them make these devices the ideal candidate sensing platform
for many large scale mobile monitoring tasks [1], [2].
A. Vehicle-based sensing system architectures
Eriksson et al. proposed in 2008 CarTel, a system for road
surface monitoring focused on pothole detection by means of
embedded accelerometers and GPS sensors mounted in cars
equipped with embedded microprocessors [4]. Mohan et al.
developed Nericell, a smartphone-based mobile sensing system
aimed at detecting traffic conditions, bumps, and honking
events by integrating audio and acceleration data from microphones
and triaxial accelerometers mounted on smartphones
[5]. Vtrack is a system that enables road traffic delay estimation
using mobile phones, with emphasis on energy consumption
and noise compensation [6]. A follow-up paper from the same
research group described an approach to trajectory mapping
from cellular GSM fingerprints instead of WiFi and GPS traces
[7]. A prominent example of large-scale system based on
mobile sensing is represented by OpenSense, a system aimed at
monitoring air pollution by means of sensor stations deployed
on public transport vehicles and through participative sensing
from citizens equipped with ad hoc pocket sensors or enhanced
smartphones [8]. A system for collaborative road surface monitoring,
called SmartRoadSense, has been recently introduced
[9]. SmartRoadSense is a mobile/cloud architecture designed
for continuous monitoring of road surface quality conditions,
estimated by means of a roughness index computed on board
of smartphones and stored/processed/visualized in the cloud.
B. Big geospatial data analysis
Mobile crowdsensing inherently implies dealing with large
expected volumes of data that call for efficient and
scalable solutions both at system and at algorithmic level.
The growing research field of so-called spatial BigData
mainly refers to the development of novel methodologies
and approaches to address the issues related to massive
geospatial datasets. Within this framework, some recent works
highlighted the need for new flexible approaches and, at the
same time, pointed out the inadequacy of more traditional
approaches rooted in database research [3], [10]. Moreover,
while modern database management systems routinely face
problems related to efficient storage, search, and processing of
data, visualization systems need to be re-designed in order to
keep pace with BigData. According to this perspective, Keller
et al. introduced Vizzly, a middleware designed for interactive
browsing of large data sets in sensor network applications,
which has been integrated in the OpenSense project framework
[11]. Battle et al. stressed the lack of thorough support for
larger scales in visualization systems. In order to overcome
some of the related challenges they proposed ScalaR, a system
for dynamic resolution reduction to be applied when the results
of a query are expected to be too big to be handled by
standard database management systems (DBMS). Reduction
is achieved through a chain of aggregation, sampling, and
filtering operations [12].
III. PROPOSED ARCHITECTURE
This section presents the architecture of SmartRoadSense
and the solutions adopted for data gathering, aggregation,
reduction, and visualization. Scalability issues to be faced at
each stage are then discussed in the next section.
The algorithmic pipeline is distributed at different levels
of SmartRoadSense, a cloud-based system for collaborative
road quality monitoring designed for estimating the surface
roughness of roads by means of a smartphone’s triaxial
accelerometers.
Fig. 1. System overview.
The architecture is based on three main components,
schematically represented in Figure 1: i) a mobile application
running on Android devices which reads the data provided by
the embedded GPS and accelerometers and computes every
second a geo-tagged estimate of the roughness of the road
surface; ii) a server that gathers roughness indexes from all
the smartphones running the SmartRoadSense application and
makes use of OpenStreetMap [13] to perform spatio-temporal
aggregation and reduction; and iii) a cloud-based front-end for
graphical visualization.
Our approach to data reduction is obtained through dif-
ferent algorithmic strategies developed at different levels of
this system. Figure 2 represents the various phases of the
implemented algorithmic pipeline, labelled (a), (b), (c), (d) and
(e). In the following we sketch the main tasks performed at
each level of the pipeline.
Phase (a). During phase (a), synthetic numerical
values (called roughness indexes, RI) are computed
in real time on board of smartphones from accelerations
sensed by the devices. These values, which provide a
reasonable estimate of the roughness of the underlying
road monitored by the vehicle, represent the result
of a first level of data reduction. In fact, sampling
sensed data according to GPS sampling capabilities
and summarizing the information coming from three
axes into a single numerical estimate value provides
sizeable data compression w.r.t. raw data.
Phase (b). This step consists of data serialization
and storage of roughness indexes (with geographical
coordinates and timestamp) on the memory of the
smartphones. Batches of stored data are periodically
transmitted to a remote server through GSM channels.
Phase (c). This phase, implemented on the cloud,
performs consistent spatial aggregation of points received
by the back-end. Each point is mapped onto
a map database and aggregated according to specific
geometric constraints. This makes it possible to consistently
map the sensed physical quantities of several
adjacent points into a single aggregate, providing
further reduction of the number of points which are the
final target of a visualization task and also smoothing
outliers thanks to statistical compensation.
Fig. 2. Algorithmic pipeline.
Phase (d). Phase (d) regards temporal aggregation.
Weighted average values of the monitored quantities
are periodically computed, resulting in manifold
benefits. The database is kept updated with the latest
significant changes (incrementally down-weighting older
points), data to be (also visually) analyzed is further
thinned, and statistical robustness deriving from multiple
measurements associated to a given location can
be exploited.
Phase (e). This last step entails the visualization of
data (namely, the aggregated roughness index) pro-
vided as output by previous steps, by means of a
graphical front-end.
In the next subsections, we further detail some aspects
of the implementation of data reduction and aggregation
algorithms, referring them to the three components of the
SmartRoadSense architecture.
A. Smartphone level
The first layer of the system architecture consists of an
Android application which is in charge of gathering data from
the smartphone’s sensors, namely GPS and triaxial accelerometers,
and implements phases (a) and (b) of the algorithmic
pipeline of Figure 2. Since the sampling frequency of GPS
mounted on current smartphones is much lower than that of
triaxial accelerometers on the same devices (typically 1 Hz
and 100 Hz, respectively), the former represents a constraint
on the spatial resolution that can be exploited for a first-cut
reduction of the data to be collected. In fact, the developed
mobile application works on windows of 100 samples (i.e. 1
second, taking the above mentioned sampling frequencies)
and computes, for each window, an aggregated roughness
index RI. RI represents the average value of the power of the
prediction errors (named PPEs) computed when a prediction
filter is applied [9]. Prediction errors are computed for each
time window and along each of the three axial components
of the acceleration. Given the powers of the prediction errors
$PPE_x$, $PPE_y$, and $PPE_z$ computed by applying a Linear
Predictive Coding algorithm (LPC for short) to the collected
samples, the roughness of the road surface upon which the
vehicle travels is estimated as their arithmetic average,
$RI = (PPE_x + PPE_y + PPE_z)/3$.
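The computation of RI described above can be illustrated with a minimal sketch. It is not the filter of [9]: the predictor order, the autocorrelation method with the Levinson-Durbin recursion, and all function names are assumptions made here for illustration only. The sketch computes the residual power of a linear predictor on each axis of a one-second window and averages the three values.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag] of a sample window."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def prediction_error_power(x, order=4):
    """Mean power of the residual of an order-`order` linear predictor.

    Uses the Levinson-Durbin recursion on the autocorrelation sequence.
    The order is an illustrative assumption, not a value from the paper.
    """
    r = autocorrelation(x, order)
    if r[0] == 0.0:
        return 0.0  # all-zero window: nothing to predict
    a = [1.0] + [0.0] * order  # predictor polynomial coefficients
    err = r[0]                 # residual energy, starts at signal energy
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err         # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:
            return 0.0         # perfectly predictable (or round-off)
    return err / len(x)


def roughness_index(ax, ay, az, order=4):
    """RI as the arithmetic average of the three per-axis PPE values."""
    ppe = [prediction_error_power(axis, order) for axis in (ax, ay, az)]
    return sum(ppe) / 3.0


# One-second windows of 100 samples per axis (100 Hz accelerometer).
rng = random.Random(0)
smooth = [math.sin(2 * math.pi * 2 * i / 100) for i in range(100)]
rough = [rng.gauss(0.0, 1.0) for _ in range(100)]
ri_smooth = roughness_index(smooth, smooth, smooth)
ri_rough = roughness_index(rough, rough, rough)
print(ri_smooth, ri_rough)
```

In this synthetic example the slowly varying signal is largely predictable and yields a small RI, whereas the noise-like signal is not and yields a larger one, which is the behaviour the roughness index relies on.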
This estimate provides significant information on the quality
of road surface, given the capability of the LPC algorithm
of filtering out (up to a certain degree) spurious components of
the acceleration signals (engine vibrations, gravitation, inertial
forces, etc.). RI values represent a compact sketch that can
be usefully exploited in a collaborative setting. In fact, the
contribution of many roughness indexes can be taken into
account to represent the quality of a given road in a specific
geographical position, thus providing a wealth of meaningful
information that can be properly averaged. RI values annotated
with a track identification code and with time and position
references are stored in memory according to the Java serialization
format and periodically transmitted in batches to a remote server
through a GSM connection. The data payload is encoded in JSON
and the HTTP protocol is used for data transfer to the cloud.
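As an illustration of the JSON encoding step, the following sketch serializes a batch of annotated RI values. The field names, the track identifier, and the coordinates are hypothetical and do not reproduce the actual SmartRoadSense message format. The HTTP transfer itself is omitted.

```python
import json


def build_batch_payload(track_id, samples):
    """Serialize a batch of roughness samples to a JSON string.

    `samples` is a list of (timestamp_s, latitude, longitude, ri) tuples.
    All field names below are hypothetical, chosen for illustration.
    """
    body = {
        "track_id": track_id,
        "values": [
            {"t": t, "lat": lat, "lon": lon, "ri": ri}
            for (t, lat, lon, ri) in samples
        ],
    }
    # Compact separators avoid spending bits on whitespace.
    return json.dumps(body, separators=(",", ":"))


# Arbitrary example values, one sample per second.
batch = [
    (1400000000, 43.7262, 12.6366, 0.41),
    (1400000001, 43.7263, 12.6368, 0.38),
]
payload = build_batch_payload("track-001", batch)
bits_per_sample = 8 * len(payload.encode("utf-8")) / len(batch)
print(payload)
print(bits_per_sample)
```

A textual encoding of this kind is convenient for a RESTful interface but costs more bits per sample than a binary one, which is why encoding is considered separately from information content in the scalability analysis.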
B. Cloud back-end level
A server application has been implemented (the YouSense
server of Figure 1) that exposes a set of application programming
interfaces (in particular RESTful APIs) in order to allow
authorized users to upload data. The collection back-end has
been designed using PostgreSQL with the PostGIS extension
(for geospatial processing) as database. This layer of the
system architecture is in charge of the spatial and temporal
data aggregation corresponding to phases (c) and (d) of Figure
2. First of all, it makes use of a map matching algorithm in
order to map each newly received point (composed of spatio-temporal
coordinates, RI and metadata) to its closest road.
Road cartography is provided by OpenStreetMap and map
matching is currently implemented by associating points to
the geometrically closest road segments. Artifacts are removed by
a simple post-processing step that takes new data points sorted by
timestamp. The list of these points is analyzed using a window
of 3 points, $p_1$, $p_2$, $p_3$. If $p_1$ and $p_3$ are matched to the same
road, while $p_2$ is associated to another road, $p_2$ is matched
back to the road of $p_1$ and $p_3$, since it is assumed that the road
change was misdetected. Several alternatives
could be taken into consideration to enhance the accuracy of
mapping [6].
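The three-point post-processing rule can be sketched as follows. The function name and the road identifiers are hypothetical. The sketch also assumes that corrections are applied in sequence, so that an already corrected point is used when evaluating the next window; the text does not specify this detail.

```python
def remove_matching_artifacts(road_ids):
    """Fix isolated road changes in a timestamp-ordered list of matches.

    For each window (p1, p2, p3), if p1 and p3 share a road and p2 does
    not, p2 is re-assigned to that road. Corrections are applied in
    sequence (an assumption of this sketch).
    """
    fixed = list(road_ids)
    for i in range(1, len(fixed) - 1):
        p1, p2, p3 = fixed[i - 1], fixed[i], fixed[i + 1]
        if p1 == p3 and p2 != p1:
            fixed[i] = p1
    return fixed


# Road identifiers of consecutive points, already sorted by timestamp.
matches = ["via_A", "via_A", "via_B", "via_A", "via_B", "via_B"]
cleaned = remove_matching_artifacts(matches)
print(cleaned)
```

Only the isolated match to `via_B` is corrected; the later change, which is confirmed by the following point, is kept as a genuine road change.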
1) Spatial aggregation: Spatial aggregation is obtained by
uniformly sampling road segments matched by points during
the map-matching phase. Given an input parameter (termed
Spatial Sampling Factor, or SSF) we sample each road uni-
formly every SSF meters obtaining a set of landmark points.
We then track all points falling within a circle of given radius
(called Coverage Circle Radius, CCR) centered around each
landmark point. The average RI value to be associated to
the landmark is then computed as the average value of RIs
associated to points falling within the circle. Data values are
weighted by their distance from the landmark point (i.e. from
the center of the circle) using an inverse exponential function
and annotated with their timestamp.
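A minimal sketch of this spatial aggregation is given below under simplifying assumptions: coordinates are planar and expressed in meters, a road is a polyline, and the inverse exponential weight is taken as exp(-d/CCR), where the decay constant is a guess since the text does not state it. Function names are hypothetical.

```python
import math


def sample_landmarks(polyline, ssf):
    """Landmarks every `ssf` meters along a polyline of (x, y) vertices."""
    landmarks = [polyline[0]]
    next_at = ssf   # arc length at which the next landmark is due
    walked = 0.0    # arc length at the start of the current segment
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and next_at <= walked + seg:
            t = (next_at - walked) / seg
            landmarks.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += ssf
        walked += seg
    return landmarks


def aggregate(landmarks, points, ccr):
    """Distance-weighted average RI of the points within `ccr` meters.

    `points` holds (x, y, ri) tuples. The weight exp(-d / ccr) is an
    assumed form of the inverse exponential function. Landmarks with
    no point inside their coverage circle get None.
    """
    result = []
    for lx, ly in landmarks:
        num = den = 0.0
        for px, py, ri in points:
            d = math.hypot(px - lx, py - ly)
            if d <= ccr:
                w = math.exp(-d / ccr)
                num += w * ri
                den += w
        result.append(num / den if den > 0 else None)
    return result


road = [(0.0, 0.0), (100.0, 0.0)]            # a straight 100 m road
points = [(0.0, 0.0, 2.0), (10.0, 0.0, 4.0)]  # (x, y, RI)
landmarks = sample_landmarks(road, ssf=20.0)
values = aggregate(landmarks, points, ccr=40.0)
print(landmarks)
print(values)
```

With SSF = 20 and CCR = 40 the coverage circles of neighbouring landmarks overlap, so a single measurement contributes to several landmarks with different weights.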
2) Temporal aggregation: Temporal aggregation (corresponding
to phase (d) of the pipeline reported in Figure 2)
is achieved by aggregating all roughness values computed
for the same position on the same road. Values contributing
to the same point are sorted by descending timestamp. The
contribution of each roughness value decreases exponentially
in time, thus the latest computed value has the highest weight,
while older values are steeply down-weighted. This exponential
decay is simply implemented by updating the temporal
estimate (a daily estimate is a reasonable time horizon in road
quality monitoring) as the average between the current value
and the previous aggregated estimate (regardless of the days
elapsed since last update). This corresponds to an exponential
decay if new estimates are provided every day. If there are
gaps, they are implicitly filled by assuming the daily value
equals previous estimates.
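The update rule described above amounts to $E_t = (v_t + E_{t-1})/2$, i.e. an exponential moving average with smoothing factor 1/2. A minimal sketch follows; the function names are hypothetical, and values are processed from oldest to newest, which is a choice of this sketch.

```python
def update_estimate(previous, current):
    """Average the new daily value with the previous aggregated estimate."""
    if previous is None:
        return current  # first observation initializes the estimate
    return 0.5 * (current + previous)


def temporal_aggregate(daily_values):
    """Fold a time-ordered series (oldest first) into a single estimate."""
    estimate = None
    for value in daily_values:
        estimate = update_estimate(estimate, value)
    return estimate


series = [8.0, 4.0, 2.0]  # oldest to newest daily roughness values
print(temporal_aggregate(series))
```

For three values the weights are 1/4, 1/4 and 1/2 from oldest to newest: each update halves the weight of everything already aggregated, so only the previous estimate needs to be stored rather than the whole time series.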
C. Front-end level
The SmartRoadSense graphical front-end is based on CartoDB,
a cloud service for visualization tasks of geographical
maps and associated overlays [14]. The service offers web
APIs that allow the back-end to upload updated roughness
values for each geographic point. It also provides functions for
retrieving a list of roughness points for all roads inside a given
geographic area. This is used to populate a map of roughness
points and to display it as an overlay to a geographical map
(e.g. Google Maps [15]), thus implementing the functionalities
associated to phase (e) of Figure 2. Each point is graphically
displayed and filled according to a linear color map that
represents the RI values (green to red, from lowest to highest),
thus providing useful visual information about the roughness
experienced by vehicles travelling along a given road segment.
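A linear green-to-red color map of this kind can be sketched as follows. This is not the CartoDB styling actually used by the front-end; the function name, the RGB interpolation, and the clamping of out-of-range values are assumptions made for illustration.

```python
def ri_to_rgb(ri, ri_min, ri_max):
    """Map a roughness index to an (R, G, B) color, green (low) to red (high).

    Values outside [ri_min, ri_max] are clamped to the nearest bound.
    """
    if ri_max <= ri_min:
        raise ValueError("ri_max must be greater than ri_min")
    t = (ri - ri_min) / (ri_max - ri_min)
    t = min(1.0, max(0.0, t))
    return (round(255 * t), round(255 * (1.0 - t)), 0)


print(ri_to_rgb(0.2, 0.0, 1.0))
```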
IV. SCALABILITY ANALYSIS
A full-fledged prototype of SmartRoadSense was developed
at the University of Urbino and systematically used for
one month to monitor the roughness of the roads along a
path of 275 km traveled by public transport twice a day. The
selected path went through 744 roads of the OpenStreetMap
DB adopted in the test bed. Buses were equipped with Android
smartphones (namely, Motorola Moto G) running the mobile
SmartRoadSense application. The roughness index values
computed on board of mobile devices once per second were
opportunistically transmitted to the server exploiting either
Wi-Fi or m2m 3G connections when available. Connection
attempts were automatically performed by the application
every 15 minutes.
Spatial aggregation was performed by the server with a
sampling step of 20 m along the travelled roads, corresponding
to a coverage circle radius (CCR) of 40 m. Aggregated data
were re-computed every day and stored as geo-tagged time
series. Temporal aggregation was then performed at each
sampling point to obtain a scalar value to be graphically
represented on the map as the combination of all the elements
of the roughness time series associated with that particular
point.
Scalability is analyzed and discussed in the following
subsection, focusing on each single step of the data flow.
A. Data size
In order to investigate the impact of the different phases of
the proposed data flow we studied how the size of the payload
changes at each step, taking into account both the information
content and the encoding adopted. Moreover, we analyzed the
overhead introduced either by the protocols or by the data base
management system for performance optimization.
Figure 3 summarizes the results of this analysis. The figure
is divided into three conceptual sections, from left to right.
On the left a bar plot is used to represent data size (split
into payload, protocol overhead, and DBMS overhead) at each
distinct phase starting from raw sensor data (at the bottom)
up to CartoDB visualization (at the top). The effects of data
serialization, JSON encoding, and HTTP transmission are also
represented. Data size is expressed in bits per sample (bps),
which corresponds to bits per second in the first 5 steps and
to bits per point after spatial aggregation. The phases of the
data flow, labelled (a), (b), (c), (d), and (e) as in Figure 2,
are graphically represented in the middle, pointing out the
element of the architecture involved at each step (smartphones,
cloud, front-end) and the communication among them. Finally,
multiplicative scale factors are reported on the right, using
circles to denote the steps affected by each of them. The
scale factors are: the absolute number of seconds of activity
of the system (secs), the average number of simultaneous
users collecting roughness indexes at each second (users), the
absolute number of days of activity of the system (days) which
corresponds to the length of time series associated with each
sampling point, and the total length of the monitored streets
(length) which is upper bounded by the total length of the
streets of the underlying OpenStreetMap DB. White circles
are used to mark phases which are not critically affected by
scale factors in terms of performance. For instance, the number
of users doesn’t impact processing steps carried on board of
smartphones, since we have one device per user.
In the following we detail the results reported in the bar
plot of Figure 3. For what concerns the size of the payload, it is
worth noticing that it depends on the amount of information to
be conveyed but also on the type of encoding, thus determining
different space requirements even without processing steps.
1) Raw sensor data: Payload of raw sensor data is composed
of: three values encoding the accelerations (32 bits
each), two values encoding longitude and latitude (64 bits
each) and two values representing other data (termed bearing
and accuracy) to be possibly used for post-processing (each
parameter being 32 bits long). This results in 9856 bits for
each sample of data.
2) Roughness index: After on board-processing, the result-
ing roughness index needs only 320 bits per sample for its
encoding because we need only to keep a single RI value
instead of three acceleration values.
Fig. 3. Data size and scaling factors.
3) Java binary serialization: Since roughness indexes are
written to the memory of mobile devices for batch transmission,
we need to take into account the overhead due to
this process (called Java binary serialization). In particular,
the size of the payload is increased because of the encoding
of additional required information (an ID number, start time,
duration, etc.) amounting to 704 bits per sample, while the
overall protocol accounts for 6504 bits of overhead.
4) JSON over HTTP: In order to be sent over HTTP data
packets are JSON encoded. The resulting payload is 1368 bits
long, while the contribution of protocols to overhead is 1016
bits (JSON), 22 bits (HTTP) and 5 bits (JSON over HTTP):
in total 1043 bits per sample.
5) PostgreSQL storage: When data is received on the
cloud, it is stored in PostgreSQL format. While 1024 bits are
sufficient for encoding the payload, the DBMS introduces a
significant overhead of 4736 bits, which are to be added to the
1760 bits needed by the indexing structures.
6) PostgreSQL processed: The impact of aggregation at
this phase of the algorithmic pipeline is apparent, since the
payload reduces to 512 bits for each sample, while the over-
head decreases to 802 bits.
7) Final CartoDB data: Data needed by the visualization
front-end consists of 320 bits (needed for encoding geometry
and roughness index) of payload and remaining 2528 bits
required by CartoDB as database overhead.
The effectiveness of data reduction and aggregation strate-
gies is evident from the results reported in Figure 3, previously
described. It is also worthwhile to notice that scalability analy-
sis also clearly points out the impact of database management
systems on the data size. For instance, CartoDB makes use
of indexes in order to provide effective visualization support.
Nonetheless, the adoption of these optimization structures
leads to substantial increase of data dimension (up to 88.7%
of the whole size).
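The per-step figures listed above can be tabulated with a short script that computes the total size per sample and the share due to overhead. The numbers are those reported in the text; treating raw sensor data and the roughness index as having no overhead is an assumption of this sketch, since the text gives a single figure for each.

```python
# (step, payload bits, overhead bits) per sample, as listed in the text.
# Zero overhead for the first two steps is an assumption of this sketch.
STEPS = [
    ("Raw sensor data", 9856, 0),
    ("Roughness index", 320, 0),
    ("Java binary serialization", 704, 6504),
    ("JSON over HTTP", 1368, 1016 + 22 + 5),
    ("PostgreSQL storage", 1024, 4736 + 1760),
    ("PostgreSQL processed", 512, 802),
    ("Final CartoDB data", 320, 2528),
]


def summarize(steps):
    """Return {step: (total bits, overhead share of the total)}."""
    summary = {}
    for name, payload, overhead in steps:
        total = payload + overhead
        summary[name] = (total, overhead / total)
    return summary


summary = summarize(STEPS)
for name, (total, share) in summary.items():
    print(f"{name:28s} {total:6d} bits  overhead {100 * share:5.1f}%")
```

The table makes explicit that, after aggregation, most of the per-point size in the final steps is due to DBMS and indexing overhead rather than to the payload itself.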
V. CONCLUSION
The increasing diffusion of mobile devices with sensing
and communication capabilities (i.e. smartphones) provides
the opportunity for capillary crowdsensing applications. The
enormous amount of data potentially produced in these settings
raises the question of how to handle it for building
efficient analytics frameworks which are the ultimate target of
these systems. In this paper we introduced a data reduction
and aggregation approach aimed at mitigating the impact
of geospatial BigData in vehicular sensing applications. Our
approach consists of a sequence of geometric sampling and
temporal aggregation steps implemented at different levels
of SmartRoadSense, a system architecture that supports road
quality monitoring. Experimental results show that the proposed
methodology is beneficial in terms of the reduced impact
of data size in geospatial applications, while it provides full
support to exploit statistical robustness in a massive distributed
sensing environment.
REFERENCES
[1] R. K. Ganti, F. Ye, and H. Lei, “Mobile crowdsensing: current state and
future challenges,” Communications Magazine, IEEE, vol. 49, no. 11,
pp. 32–39, 2011.
[2] N. D. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, and
A. T. Campbell, “A survey of mobile phone sensing,” Communications
Magazine, IEEE, vol. 48, no. 9, pp. 140–150, 2010.
[3] S. Shekhar, V. Gunturi, M. R. Evans, and K. Yang, “Spatial big-data
challenges intersecting mobility and cloud computing,” in Proceedings
of the Eleventh ACM International Workshop on Data Engineering for
Wireless and Mobile Access, ser. MobiDE ’12. New York, NY, USA:
ACM, 2012, pp. 1–6.
[4] J. Eriksson, L. Girod, B. Hull, R. Newton, S. Madden, and
H. Balakrishnan, “The pothole patrol: using a mobile sensor network for road
surface monitoring,” in Proceedings of the 6th international conference
on Mobile systems, applications, and services. ACM, 2008, pp. 29–39.
[5] P. Mohan, V. N. Padmanabhan, and R. Ramjee, “Nericell: rich
monitoring of road and traffic conditions using mobile smartphones,” in
Proceedings of the 6th ACM conference on Embedded network sensor
systems. ACM, 2008, pp. 323–336.
[6] A. Thiagarajan, L. Ravindranath, K. LaCurts, S. Madden,
H. Balakrishnan, S. Toledo, and J. Eriksson, “Vtrack: accurate, energy-aware road
traffic delay estimation using mobile phones,” in Proceedings of the
7th ACM Conference on Embedded Networked Sensor Systems. ACM,
2009, pp. 85–98.
[7] A. Thiagarajan, L. Ravindranath, H. Balakrishnan, S. Madden, and
L. Giraud, “Accurate, low-energy trajectory mapping for mobile
devices,” in Proceedings of the NSDI, 2011, pp. –.
[8] K. Aberer, S. Sathe, D. Chakraborty, A. Martinoli, G. Barrenetxea,
B. Faltings, and L. Thiele, “Opensense: open community driven sensing
of environment,” in Proc. of the 1st Intl Workshop on GeoStreaming
(IWGS 10), 2010, pp. 39–42.
[9] G. Alessandroni, L. Klopfenstein, S. Delpriori, M. Dromedari,
G. Luchetti, B. Paolini, A. Seraghiti, E. Lattanzi, V. Freschi, A. Carini,
and A. Bogliolo, “Smartroadsense: Collaborative road surface condition
monitoring,” in Proc. of Eighth International Conference on Mobile
Ubiquitous Computing, Systems, Services and Technologies, Ubicomm
2014. IARIA, 2014, p. accepted to.
[10] B. Simion, D. N. Ilha, A. D. Brown, and R. Johnson, “The price of
generality in spatial indexing,” pp. 8–12, 2013.
[11] M. Keller, J. Beutel, O. Saukh, and L. Thiele, “Visualizing large sensor
network data sets in space and time with vizzly,” in Local Computer
Networks Workshops (LCN Workshops), 2012 IEEE 37th Conference
on. IEEE, 2012, pp. 925–933.
[12] L. Battle, M. Stonebraker, and R. Chang, “Dynamic reduction of
query result sets for interactive visualizaton,” in Big Data, 2013 IEEE
International Conference on. IEEE, 2013, pp. 1–8.
[13] “Openstreetmap,” 2014. [Online]. Available:
http://www.openstreetmap.org
[14] Vizzuality, “Cartodb,” 2013. [Online]. Available: http://cartodb.com
[15] Google, “Google maps,” 2014. [Online]. Available:
http://maps.google.com
... Due to its inherent ability, such as high mobility, scalability and costeffectiveness, it has been widely used in many sensing applications. Related research has investigated the application of urban space, including the urban environment monitoring [2], mobile social recommendations [3], public safety [4], traffic control and planning [5], geospatial information collection [6] and large-scale industrial environment issues [7]. ...
... where the set of the nearest K neighbors of the target user i is expressed as |K neighbors of i|; Sim(i, j) is the similarity between user i and user j, which can be calculated from formula (6). U i and U j expressed potential feature vector of user i and user j. ...
Article
Full-text available
Aiming at the problems of low data quality and high incentive costs caused by the low enthusiasm of participants in mobile crowd sensing, a new task recommendation framework is proposed in this paper. First, the participants' historical behaviors are analyzed, assuming that user behaviors can be quantified as the user's willingness to participate, and the cosine similarity theorem is used to calculate the similarity between participants, thereby constructing a user-hybrid model. Secondly, probabilistic matrix factorization is developed to predict the willingness of participants, and a ranking model is obtained through a learn-to-rank algorithm. Finally, a task recommendation list is generated according to the ranking model, which serves as the target participant's preferred task list for sensing task recommendation. The experiment in this paper is carried out on the MATLAB platform based on two real check-in datasets, Gowalla and Brightkite. The results show that the average allocation precision rate can reach 96%, and the sensing user participation rate is about 97%. Meanwhile, the user's mobile cost is reduced, and the overall goal of maximizing accuracy and minimizing perceived cost is achieved.
... The authors provide an overview of existing localization approaches and discuss the implied research challenges. Freschi et al. [10] present a data reduction and aggregation approach for use in a vehicular crowdsensing application aimed at monitoring the roughness of road surfaces. The methodology proposed by the authors made it possible to reduce data size in geospatial applications quite efficiently. ...
Conference Paper
Owing to the wide geographic distribution or the existence of extensive road infrastructure, it is challenging to ensure that mobile data operators provide adequate service along highways. An automated methodology for describing and evaluating mobile network signal coverage on highways can help governments and operators efficiently plan actions to improve the quality of service provided by operators. In this work, we propose an automated methodology to describe and enable the evaluation of mobile network signal coverage on highways using vehicle tracking data. The proposed approach aggregates, using a spatio-temporal criterion, data samples obtained through a vehicle tracking system. The output of this aggregation process is a map of the highways with a gradient color scheme indicating signal availability and the confidence of this information. Since traffic is dynamic and traffic data do not follow a uniform distribution, the uncertainty measure associated with the map indicates the confidence of the information provided. As a case study, we used a dataset of trips in southern Brazil collected through a private vehicle tracking service.
... Data falling within an area of fixed radius are reduced to a single value, smoothing the effect of possible outliers. Temporal aggregation of these quantities is achieved by periodically calculating the weighted average of points over time, incrementally down-weighting older data [27]. SmartRoadSense also implements efficient algorithms that detect and correct mapping artefacts and significantly enhance the accuracy of map-matching of user-supplied geospatial data in crowdsensing applications [28]. ...
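The spatial and temporal aggregation described in this excerpt can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SmartRoadSense implementation: the function names (`haversine_m`, `spatial_aggregate`, `temporal_update`), the use of a plain mean inside the radius, the haversine distance and the `decay` parameter are all hypothetical choices made for the example.

```python
import math
from statistics import mean

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def spatial_aggregate(center, samples, radius_m):
    """Reduce all samples within radius_m of center to a single value.

    samples is an iterable of (lat, lon, roughness) tuples. A plain mean is
    used here as an assumption; the excerpt only says the points are reduced
    to one value. Returns None when no sample falls inside the radius.
    """
    lat0, lon0 = center
    inside = [v for lat, lon, v in samples
              if haversine_m(lat0, lon0, lat, lon) <= radius_m]
    return mean(inside) if inside else None


def temporal_update(previous, current, decay=0.5):
    """Blend the stored value with the newest one, down-weighting older data.

    decay (hypothetical parameter) is the weight kept by the older value, so
    repeated updates shrink the influence of old data geometrically.
    """
    if previous is None:
        return current
    if current is None:
        return previous
    return decay * previous + (1 - decay) * current
```

Calling `temporal_update` once per aggregation period keeps only one stored value per location, which is one simple way to obtain the incremental down-weighting of older data mentioned in the excerpt.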
Article
Mobile crowdsensing (MCS) is a well-established paradigm that leverages mobile devices’ ubiquitous nature and processing capabilities for large-scale data collection to monitor phenomena of common interest. Crowd-powered data collection is significantly faster and more cost-effective than traditional methods. However, it poses challenges in assessing the accuracy and extracting information from large volumes of user-generated data. SmartRoadSense (SRS) is an MCS technology that utilises sensors embedded in mobile phones to monitor the quality of road surfaces by computing a crowdsensed road roughness index (referred to as $P_{PE}$). The present work performs statistical modelling of $P_{PE}$ to analyse its distribution across the road network and elucidate how it can be efficiently analysed and interpreted. Joint statistical analysis of open datasets is then carried out to investigate the effect of both internal and external road features on $P_{PE}$. Several road properties affecting $P_{PE}$ as predicted are identified, providing evidence that SRS can be effectively applied to assess road quality conditions. Finally, the effect of road category and the speed limit on the mean and standard deviation of $P_{PE}$ is evaluated, incorporating previous results on the relationship between vehicle speed and $P_{PE}$. These results enable more effective and confident use of the SRS platform and its data to help inform road construction and renovation decisions, especially where a lack of resources limits the use of conventional approaches. The work also exemplifies how crowdsensing technologies can benefit from open data integration and highlights the importance of making coherent, comprehensive, and well-structured open datasets available to the public.
... Xu et al. [36] proposed a new data aggregation scheme called PAVS. Freschi et al. [37] showed an alternative data aggregation method to monitor the roughness of road surfaces. In addition, a series of data aggregation schemes [38]-[42] have been proposed. ...
Article
Spatial data on a cellular network load can be used to develop commercial and public services. However, such data is calculated based on individual users' behavior and can contravene their privacy rights. Moreover, direct tracking of individual devices violates the European Union's regulations. To solve this issue, we propose to use data aggregated in individual cells of the public land mobile network without tracking an individual mobile device in the entire process. To prove that the proposed data collection method is useful, we compared the obtained results with a closed-circuit television system in an estimation of the number of people. The proposed system is sensitive enough to detect untypical global events in an urban area and distinguish transport demand zones of various types as we showed on real data from the City of Warsaw.
... The research project SmartRoadSense analyzes smartphone acceleration data in the inner cabin of vehicles. In addition to the actual calculation of road roughness [22], the project addresses speed dependencies for sensor signals [23], [24] and geospatial aggregation techniques of crowd-sensing data [25]. Despite the elegance of using smartphones, which come with a great penetration rate on the roads, this approach relies on a collaborative approach, as the data of a single device is subject to uncertainty in the position and cushioning (e.g. ...
Conference Paper
In the past years, automated driving has become one of the most important research fields in the automotive industry. A key component for a successful substitution of human driving by vehicles is a real-time model of the current environment including the traffic situation, the guide-way, and the road itself. Although most of the information for the environment model is provided via in-vehicle generated data based on camera, LIDAR, and RADAR sensors, we propose a solution for classifying road quality within the spring-damper system of the vehicle. In this paper, we utilize the Vehicle Level Sensor (VLS), which is a standard component in modern vehicles, for road condition assessment. We present a simulation of the Quarter Vehicle Model (QVM) for road elevation measurement to enable each connected vehicle to provide valid data for a potential crowd sensing approach where every vehicle contributes data for past segments and consumes data for upcoming segments. The generated data is capable of providing the environment model with real-time data of upcoming road segments. The simulation results are validated on a test bench including a review of the errors.
... In the research project SmartRoadSense [15], the acceleration sensors of smartphones are used to analyze road roughness or defects from the inner cabin. In addition to the actual calculation of road roughness [15], the project focuses on speed dependencies for the sensor signals [16]-[20] and geospatial aggregation techniques of crowd-sensing data [21]. The use of smartphones comes with a great penetration rate on the roads. ...
... The research project SmartRoadSense analyzes smartphone acceleration data in the inner cabin of vehicles. In addition to the actual calculation of road roughness [24], the project addresses speed dependencies for sensor signals [25], [26] and geospatial aggregation techniques of crowd-sensing data [27]. Despite the elegance of using smartphones, which have a large penetration rate on roads, this approach is collaborative, as data from a single device are subject to uncertainty with regard to the position and cushioning (e.g., seats, beverage storage area) of the smartphone in the inner cabin and the intermediate spring-damper system of different vehicles. ...
Article
In recent years, automated driving has become one of the most important research fields in the automotive industry. A key component for a successful substitution of human driving by vehicles is a real-time model of the current environment including the traffic situation, the guide-way, and the road itself. We propose a solution for measuring road conditions within the spring-damper system of the vehicle. In this paper, we utilize a Vehicle Level Sensor (VLS) and an Acceleration Sensor (AS), both of which are standard components in modern vehicles, for road condition monitoring. Our model-based approach therefore consists purely of additional software. We present a calculation of the Quarter Vehicle Model (QVM) for road elevation measurements to enable each connected vehicle to provide valid data for a potential crowd-sensing approach, where every vehicle contributes past data and consumes data for upcoming segments. The generated data are capable of providing the environment model with real-time data. Our calculations are first validated in a laboratory setup, representing a down-scaled Quarter-Vehicle. The knowledge gained is then applied to a real vehicle. For this purpose, the measurement setup is explained, the model-based calculation and the parameters are adjusted, and the results are compared.
Chapter
The process of context recognition in dynamically changing transportation processes is quite complex. The complexity arises from many aspects of understanding the concept of “context”. When developing computer-based systems, it is important to define what is included and how the concept of context is understood. Of course, a fairly wide range of ICT infrastructure components can be integrated into the digitization of contextual data. Quite a variety of devices and software can be included to help capture environmental data: different wireless channels, heterogeneous wireless sensor networks (WSNs), different flow management and e-document management systems, and various monitoring systems. Heterogeneity arises when we want to include different sensors and different means of communication. Vehicles can also be equipped with a wide variety of specialized apparatus and systems. Roads are equipped with special tools and road infrastructure software components. One of the goals of this part of the study is the desire to convey a wide range of context recognition processes. Artificial intelligence (AI) methods can be used for context recognition and such components form the basis of intelligent service systems. The provision of smart services should be based on a wide range of management needs in freight transport processes. How primary data sources are incorporated into all possible interconnected infrastructures is analysed in this section.
Keywords: Context awareness; Information Communication Technology (ICT); Information Management System (IMS); Processing methods; Data aggregation
Conference Paper
Monitoring of road surface conditions is a critical activity in transport infrastructure management. Many research solutions have been proposed in order to automatically control and check the quality of road surfaces. Most of them make use of expensive sensors embedded in vehicles or mainly focus on detection of specific anomalies during monitoring activity. In this paper, we describe the design of a system for collaborative monitoring of road surface quality. The overall architecture encompasses the integration of a custom mobile application, a georeferenced database system and a visualization front-end. Road surface condition is summarized through a roughness parameter computed using signal processing algorithms running on mobile devices. The roughness values computed are subsequently transmitted and stored into a back-end geographic information system enabling processing of aggregated traces and visualization of road conditions. The proposed approach introduces a thoroughly integrated system suitable for monitoring applications in a scalable, crowdsourcing collaborative setting.
Conference Paper
Efficient indexing can significantly speed up the processing of large volumes of spatial data in many BigData applications. Many new emerging spatial applications (e.g., biomedical imaging, genome analysis, etc.) have varying indexing requirements, thus, a unified indexing infrastructure for implementing new indexing schemes without requiring knowledge of database internals is beneficial. However, designing a generic indexing framework is a challenging task. We study the issues with general indexing schemes, such as the GiST (used in PostGIS) and expose the tradeoff between generality and performance, showing that generality can be severely detrimental to performance if the abstractions are not carefully designed. Our experiments indicate that the GiST framework, as implemented in PostgreSQL/PostGIS, performs 4.5-6x slower for filtering records through the index, compared to a custom R-tree implementation. We also isolate the GiST-specific overhead by implementing the framework outside the DBMS, showing that the GiST-based R-tree is up to 2x slower than the raw R-tree algorithm that it uses internally. We conclude that although a generic framework for a wide range of spatial BigData application domains is desirable, implementers of new frameworks need to be careful in designing the abstractions to avoid paying a hefty performance penalty.
Conference Paper
This paper presents Vizzly, a middleware for the interactive browsing of large sensor network data sets. The provided map and line plot widgets allow users to visualize structured data from mobile and static sensors. A user is completely free in selecting sensor data based on time and location; suitable levels of temporal and spatial detail are automatically chosen by the Vizzly server. Vizzly automatically adapts to user interactions: new data is automatically loaded when query parameters change. Request response times are significantly reduced by the use of caching techniques; most requests are served from already pre-computed data that is stored in the memory of the Vizzly server. Vizzly has already been successfully integrated into the PermaSense and OpenSense projects; a single instance is currently handling more than 550 million data points.
Conference Paper
Modern database management systems (DBMS) have been designed to efficiently store, manage and perform computations on massive amounts of data. In contrast, many existing visualization systems do not scale seamlessly from small data sets to enormous ones. We have designed a three-tiered visualization system called ScalaR to deal with this issue. ScalaR dynamically performs resolution reduction when the expected result of a DBMS query is too large to be effectively rendered on existing screen real estate. Instead of running the original query, ScalaR inserts aggregation, sampling or filtering operations to reduce the size of the result. This paper presents the design and implementation of ScalaR, and shows results for an example application, displaying satellite imagery data stored in SciDB as the back-end DBMS.
Article
Increasingly, location-aware datasets are of a size, variety, and update rate that exceeds the capability of spatial computing technologies. This paper addresses the emerging challenges posed by such datasets, which we call Spatial Big Data (SBD). SBD examples include trajectories of cellphones and GPS devices, vehicle engine measurements, temporally detailed road maps, etc. SBD has the potential to transform society via next-generation routing services such as eco-routing. However, the envisaged SBD-based next-generation routing services pose several significant challenges for current routing techniques. SBD magnifies the impact of partial information and ambiguity of traditional routing queries specified by a start location and an end location. In addition, SBD challenges the assumption that a single algorithm utilizing a specific dataset is appropriate for all situations. The tremendous diversity of SBD sources substantially increases the diversity of solution methods. Newer algorithms may emerge as new SBD becomes available, creating the need for a flexible architecture to rapidly integrate new datasets and associated algorithms.
Article
CTrack is an energy-efficient system for trajectory mapping using raw position tracks obtained largely from cellular base station fingerprints. Trajectory mapping, which involves taking a sequence of raw position samples and producing the most likely path followed by the user, is an important component in many location-based services including crowd-sourced traffic monitoring, navigation and routing, and personalized trip management. Using only cellular (GSM) fingerprints instead of power-hungry GPS and WiFi radios, the marginal energy consumed for trajectory mapping is zero. This approach is non-trivial because we need to process streams of highly inaccurate GSM localization samples (average error of over 175 meters) and produce an accurate trajectory. CTrack meets this challenge using a novel two-pass Hidden Markov Model that sequences cellular GSM fingerprints directly without converting them to geographic coordinates, and fuses data from low-energy sensors available on most commodity smartphones, including accelerometers (to detect movement) and magnetic compasses (to detect turns). We have implemented CTrack on the Android platform, and evaluated it on 126 hours (1,074 miles) of real driving traces in an urban environment. We find that CTrack can retrieve over 75% of a user's drive accurately in the median. An important by-product of CTrack is that even devices with no GPS or WiFi (constituting a significant fraction of today's phones) can contribute and benefit from accurate position data.
Article
Mobile phones or smartphones are rapidly becoming the central computer and communication device in people's lives. Application delivery channels such as the Apple AppStore are transforming mobile phones into App Phones, capable of downloading a myriad of applications in an instant. Importantly, today's smartphones are programmable and come with a growing set of cheap powerful embedded sensors, such as an accelerometer, digital compass, gyroscope, GPS, microphone, and camera, which are enabling the emergence of personal, group, and community-scale sensing applications. We believe that sensor-equipped mobile phones will revolutionize many sectors of our economy, including business, healthcare, social networks, environmental monitoring, and transportation. In this article we survey existing mobile phone sensing algorithms, applications, and systems. We discuss the emerging sensing paradigms, and formulate an architectural framework for discussing a number of the open issues and challenges emerging in the new area of mobile phone sensing research.
Conference Paper
We consider the problem of monitoring road and traffic conditions in a city. Prior work in this area has required the deployment of dedicated sensors on vehicles and/or on the roadside, or the tracking of mobile phones by service providers. Furthermore, prior work has largely focused on the developed world, with its relatively simple traffic flow patterns. In fact, traffic flow in cities of the developing regions, which comprise much of the world, tends to be much more complex owing to varied road conditions (e.g., potholed roads), chaotic traffic (e.g., a lot of braking and honking), and a heterogeneous mix of vehicles (2-wheelers, 3-wheelers, cars, buses, etc.). To monitor road and traffic conditions in such a setting, we present Nericell, a system that performs rich sensing by piggybacking on smartphones that users carry with them in normal course. In this paper, we focus specifically on the sensing component, which uses the accelerometer, microphone, GSM radio, and/or GPS sensors in these phones to detect potholes, bumps, braking, and honking. Nericell addresses several challenges including virtually reorienting the accelerometer on a phone that is at an arbitrary orientation, and performing honk detection and localization in an energy efficient manner. We also touch upon the idea of triggered sensing, where dissimilar sensors are used in tandem to conserve energy. We evaluate the effectiveness of the sensing functions in Nericell based on experiments conducted on the roads of Bangalore, with promising results.
Conference Paper
This paper investigates an application of mobile sensing: detecting and reporting the surface conditions of roads. We describe a system and associated algorithms to monitor this important civil infrastructure using a collection of sensor-equipped vehicles. This system, which we call the Pothole Patrol (P2), uses the inherent mobility of the participating vehicles, opportunistically gathering data from vibration and GPS sensors, and processing the data to assess road surface conditions. We have deployed P2 on 7 taxis running in the Boston area. Using a simple machine-learning approach, we show that we are able to identify potholes and other severe road surface anomalies from accelerometer data. Via careful selection of training data and signal features, we have been able to build a detector that misidentifies good road segments as having potholes less than 0.2% of the time. We evaluate our system on data from thousands of kilometers of taxi drives, and show that it can successfully detect a number of real potholes in and around the Boston area. After clustering to further reduce spurious detections, manual inspection of reported potholes shows that over 90% contain road anomalies in need of repair.
Conference Paper
Traffic delays and congestion are a major source of inefficiency, wasted fuel, and commuter frustration. Measuring and localizing these delays, and routing users around them, is an important step towards reducing the time people spend stuck in traffic. As others have noted, the proliferation of commodity smartphones that can provide location estimates using a variety of sensors (GPS, WiFi, and/or cellular triangulation) opens up the attractive possibility of using position samples from drivers' phones to monitor traffic delays at a fine spatio-temporal granularity. This paper presents VTrack, a system for travel time estimation using this sensor data that addresses two key challenges: energy consumption and sensor unreliability. While GPS provides highly accurate location estimates, it has several limitations: some phones don't have GPS at all, the GPS sensor doesn't work in "urban canyons" (tall buildings and tunnels) or when the phone is inside a pocket, and the GPS on many phones is power-hungry and drains the battery quickly. In these cases, VTrack can use alternative, less energy-hungry but noisier sensors like WiFi to estimate both a user's trajectory and travel time along the route. VTrack uses a hidden Markov model (HMM)-based map matching scheme and travel time estimation method that interpolates sparse data to identify the most probable road segments driven by the user and to attribute travel times to those segments. We present experimental results from real drive data and WiFi access point sightings gathered from a deployment on several cars. We show that VTrack can tolerate significant noise and outages in these location estimates, and still successfully identify delay-prone segments, and provide accurate enough delays for delay-aware routing algorithms. We also study the best sampling strategies for WiFi and GPS sensors for different energy cost regimes.