International Symposium on Geoinformatics for Spatial Infrastructure Development in Earth and Allied Sciences 2010
goGPS: ACCURATE ROAD MAPPING USING
LOW-COST GPS RECEIVERS
Lisa Pertusini1, Eugenio Realini2, Mirko Reguzzoni1
1 DIIAR, Politecnico di Milano, Como Campus, via Valleggio 11, 22100 Como, Italy
Email: lisa.pertusini@mail.polimi.it; mirko@geomatica.como.polimi.it
2Graduate School for Creative Cities, Osaka City University
3-3-138 Sugimoto, Sumiyoshi-ku, Osaka 558-8585, Japan
Email: realini@gscc.osaka-cu.ac.jp
ABSTRACT
Road mapping is traditionally performed with high-end, high-cost instrumentation such as
photogrammetric digital cameras, dual-frequency GPS receivers, inertial measurement units, etc. However,
recent developments in GPS data analysis make it possible to obtain good results with low-cost devices as well,
when they are supported by software such as goGPS. This software enhances GPS positioning with low-cost receivers
mainly by exploiting the principle of relative positioning with respect to a master station located in the area
where the survey is performed. Typically the goGPS output is a sequence of estimated points at a 1 second sampling
interval, leading to a huge amount of data and, consequently, of arcs when these data are used for road mapping. In
this paper we implement an algorithm that reduces the number of arcs by first selecting the nodes with an
agglomerative clustering procedure and then fitting the resulting polyline to the original GPS dataset by
least-squares adjustment. In this way we aim at automating the production of road networks, simplifying the
procedure currently used, for example, in OpenStreetMap. This could be particularly useful for developing
countries, where the availability of high-cost professional instrumentation is often limited. The proposed method
is applied in a test scenario to evaluate its performance.
1. INTRODUCTION
A problem of great interest in the field of GPS navigation, especially in developing
countries, is how to estimate a road network with low-cost instrumentation. Despite the
increasing popularity of route guidance systems, current digital maps are still inadequate in
developing countries for many advanced applications. The main drawbacks are
the insufficient accuracy of the road geometry and the lack of fine-grained information, such as
lane positions.
In this paper, an approach to derive high-precision road maps by using low-cost GPS
receivers in differential mode is presented. The procedure is based on the goGPS software,
which is introduced in Section 2, and consists of the successive steps described in Section 3.
An example of how the procedure works with real GPS data is provided in Section 4, which also
discusses future developments.
2. goGPS OPEN SOURCE SOFTWARE FOR GPS NAVIGATION
Low-cost GPS devices typically use small patch or helix antennas and single-frequency
(L1) receivers operating in stand-alone mode; moreover, they are often highly
sensitive to the GPS signal (even if degraded) in order to ensure positioning under poor sky
visibility conditions, as in dense urban environments or under forest canopy. Nevertheless,
the positioning accuracy of this kind of device is generally low, not only because of the
classical errors induced by atmospheric and clock delays, but also because of the low quality
of the hardware involved and the multipath resulting from the high sensitivity.
goGPS (http://www.gogps-project.org) is an open source software package designed to
perform relative positioning with code and phase observations (Real Time Kinematic, RTK) using
low-cost GPS instrumentation, provided that the receiver outputs raw data (GPS observations): in this way
atmospheric and clock errors are removed or made negligible. Moreover, goGPS includes a
Kalman filter applied to GPS observations, which reduces the estimation error by modelling
the receiver dynamics, by weighting observations on the basis of their quality (i.e. signal-to-noise
ratio and satellite elevation), by integrating additional information such as digital terrain
models or road networks, and by making the best use of the information provided by the L1 phase,
managing its ambiguities when changes of satellite configuration or cycle slips occur.
goGPS can work either in RTK or stand-alone mode and it can be used either for real-time or
post-processing tasks. Observations coming from a reference GPS station are needed for the
RTK mode, thus a mobile Internet connection has to be available when using goGPS in real-
time. On the other hand, for post-processing tasks, reference station data in RINEX format
can be used.
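The statement that relative positioning removes clock errors can be illustrated with the textbook double-difference equations. The symbols below are assumptions introduced only for illustration (rover $r$, master $m$, satellites $j$ and $k$, geometric range $\rho$, receiver and satellite clock offsets $dt_r$ and $dt^j$, ionospheric and tropospheric delays $I$ and $T$, carrier wavelength $\lambda$, integer ambiguity $N$, noise $\varepsilon$); they are not taken from the goGPS documentation.

```latex
\begin{align*}
P_r^j &= \rho_r^j + c\,(dt_r - dt^j) + I_r^j + T_r^j + \varepsilon_P \\
\lambda\,\Phi_r^j &= \rho_r^j + c\,(dt_r - dt^j) - I_r^j + T_r^j + \lambda N_r^j + \varepsilon_\Phi \\
P_{rm}^{jk} &= (P_r^j - P_m^j) - (P_r^k - P_m^k)
             = \rho_{rm}^{jk} + I_{rm}^{jk} + T_{rm}^{jk} + \varepsilon_P^{jk} \\
\lambda\,\Phi_{rm}^{jk} &= \rho_{rm}^{jk} - I_{rm}^{jk} + T_{rm}^{jk} + \lambda N_{rm}^{jk} + \varepsilon_\Phi^{jk}
\end{align*}
```

The difference between receivers cancels the satellite clock offset, and the difference between satellites cancels the receiver clock offset. For a master station close to the survey area the residual atmospheric terms are small, which is the sense in which these errors are removed or made negligible.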
3. ARC-SEGMENTATION ALGORITHM
The simplest and probably most common solution to the problem of road mapping with
low-cost GPS receivers is to plot the computed trajectory directly on the map of
interest. However, the path returned by GPS receivers (e.g. in NMEA format)
is nothing but a sequence of points, which can be extremely long if the sampling interval of
the GPS observations is 1 second. For example, no simplification is applied in the case
of a straight line, which could be equivalently described by a single arc instead of many points.
Hence a simple network topology (polyline), based on a limited number of arcs and nodes,
should be preferred for modelling the trajectory. To this aim, many solutions have been
studied (see e.g. Douglas and Peucker, 1973); here we propose an algorithm based on the
following steps:
1. automatic selection of significant nodes chosen from several GPS points, i.e. among the
positions estimated by goGPS at every second;
2. assignment of each GPS point to a certain arc (which is a segment joining two nodes);
3. least-squares adjustment of the node positions, in such a way that the resulting polyline
correctly interpolates the GPS points.
The nodes estimated at step 3 are provided as input to step 2 and the procedure is
iterated to obtain the final polyline. Let us now describe the individual steps in more detail.
3.1 Node selection by AGNES
The identification of significant nodes and consequently the determination of their
number is performed using the agglomerative classification algorithm AGNES (Kaufman and
Rousseeuw, 1990).
At first all the GPS points are considered as nodes. Then the angle between each pair of
consecutive arcs is computed; to be more precise, calling A, O and B three generic
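consecutive nodes of coordinates $\mathbf{x}_A = (x_A, y_A)$, $\mathbf{x}_0 = (x_0, y_0)$ and $\mathbf{x}_B = (x_B, y_B)$, the angle is
calculated as:
$$\widehat{AOB} = \arccos\left[\frac{(\mathbf{x}_A - \mathbf{x}_0)\cdot(\mathbf{x}_B - \mathbf{x}_0)}{|\mathbf{x}_A - \mathbf{x}_0|\,|\mathbf{x}_B - \mathbf{x}_0|}\right], \qquad 0 \le \widehat{AOB} \le 180^\circ \qquad (1)$$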
Among all the computed angles, the largest is detected, i.e. the closest to 180 degrees.
In other words, the two consecutive arcs with the smallest change in direction are identified.
If this maximum angle is larger than a certain threshold (e.g. 170 degrees), the node O, which
is common to both arcs AO and OB, is eliminated, resulting in a single arc AB. After
recomputing the angles with respect to this new arc, the algorithm is repeated iteratively until
the maximum angle is below the threshold (see Figure 1).
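Figure 1. Scheme of how the agglomerative algorithm works for the selection of nodes.
Panels, in order: nodes N1 to N5, maximum angle greater than the threshold, eliminate N2;
nodes N1, N3, N4, N5, maximum angle greater than the threshold, eliminate N4;
nodes N1, N3, N5, maximum angle less than the threshold, STOP.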
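The node selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the goGPS implementation: the function names `angle_deg` and `select_nodes` are hypothetical, points are assumed to be planar `(x, y)` tuples, and coincident consecutive points are treated as collinear (an assumption, since the text does not cover that case).

```python
import math


def angle_deg(a, o, b):
    """Angle AOB in degrees, in [0, 180], following equation (1)."""
    ax, ay = a[0] - o[0], a[1] - o[1]
    bx, by = b[0] - o[0], b[1] - o[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0.0 or nb == 0.0:
        # Assumption: a node coinciding with a neighbour carries no
        # change of direction, so it is treated as collinear.
        return 180.0
    c = (ax * bx + ay * by) / (na * nb)
    # Clamp to [-1, 1] to protect acos from rounding errors.
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def select_nodes(points, threshold_deg=170.0):
    """Repeatedly drop the interior node with the largest angle."""
    nodes = list(points)
    while len(nodes) > 2:
        angles = [angle_deg(nodes[k - 1], nodes[k], nodes[k + 1])
                  for k in range(1, len(nodes) - 1)]
        k_max = max(range(len(angles)), key=angles.__getitem__)
        if angles[k_max] <= threshold_deg:
            break  # maximum angle not above the threshold: stop
        del nodes[k_max + 1]  # angles[k] belongs to nodes[k + 1]
    return nodes
```

Recomputing every angle at each iteration keeps the sketch short at the price of quadratic cost; only the two angles adjacent to the removed node actually change.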
The procedure is applied to the goGPS solution based on code and phase double
differences, typically overweighting the dynamic model of the Kalman filter. In this way the
sequence of GPS points draws a smooth path, which prevents the agglomerative algorithm from
keeping too many nodes.
Despite this expedient, when the receiver is kept stationary during the measurement,
the Kalman filter returns a cloud of points around the true position with sudden changes of
direction between two consecutive arcs (Brovelli et al. 2008). This causes the selection of a
large number of nodes in a very restricted area. To avoid this, a further selection is
implemented downstream of the agglomerative method, verifying that the distance between
two consecutive nodes is greater than a certain threshold. If this condition is not satisfied,
only the first of the two nodes is selected. The test is repeated iteratively until the condition is
true for all pairs of adjacent nodes (see Figure 2).
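Figure 2. Scheme of the elimination of consecutive nodes too close to each other.
Panel labels: circle centered in N2, with radius equal to the threshold;
all nodes after N2 that fall in the circle are eliminated.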
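The distance test can be sketched as a single pass over the selected nodes. The function name `drop_close_nodes` is hypothetical, and the single-pass formulation is an assumption: it compares each candidate with the last node kept, which yields a node list in which every pair of consecutive nodes is farther apart than the threshold.

```python
import math


def drop_close_nodes(nodes, min_dist):
    """Keep a node only if it is farther than min_dist from the last kept one."""
    if not nodes:
        return []
    kept = [nodes[0]]
    for p in nodes[1:]:
        # When two consecutive nodes are too close, only the first survives.
        if math.dist(kept[-1], p) > min_dist:
            kept.append(p)
    return kept
```

Note that with this sketch the last node of the path is dropped if it is too close to the previously kept node; whether the end point should be preserved is not stated in the text.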
3.2 Data classification using bounding boxes
Once the nodes have been selected, the arcs are implicitly defined by connecting
consecutive nodes. However, the fact that the selected nodes are a subset of the original GPS
points implies that the corresponding arcs do not necessarily optimally interpolate the other
GPS points (see Figure 3, on the left). Moreover, as mentioned above, the selection procedure
of the nodes is done on a smoothed trajectory, while the final polyline should be calculated
on a trajectory as faithful as possible to reality.
For these reasons, the GPS points are re-estimated using goGPS, but this time by
properly weighting the dynamic model of the Kalman filter with respect to the measurement
errors of code and phase observations. The obtained GPS points have to be assigned to the
different arcs, in order to subsequently optimize the interpolating polyline. In other words
each arc is named with a certain label (a number from 1 to n, where n is the number of
segments) and one and only one label is assigned to each GPS point (including the label 0 if
the GPS point is not assigned to any arc).
In order to perform this classification, a rectangular buffer (i.e. a bounding box) of size
Δ is built around the line connecting two consecutive nodes (see Figure 3, on the right). The
following classification rules are implemented:
• if a GPS point does not fall into any buffer, it is labelled with 0 and it is excluded
from the subsequent process of least-squares interpolation;
• if a GPS point falls into one and only one buffer, it is marked by the corresponding
arc label;
• if a GPS point falls into more than one buffer (an event that frequently occurs near
the nodes if the value of Δ is too high), the sum of the distances between the GPS
point and the two nodes is computed for each arc involved; the label of the arc for
which this sum is minimum is attributed to the GPS point.
Finally, it may occur that no GPS point, or only a single GPS point, falls in the buffer
of a certain arc. In this case the arc cannot be re-estimated by the subsequent least-squares
interpolation, so the arc itself is eliminated by deleting one of its two nodes.
Figure 3. Example of arc non-optimally interpolating the GPS points (on the left);
definition of the corresponding bounding box of size Δ (on the right).
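The classification rules can be sketched as follows. The exact geometry of the bounding box is not fully specified in the text, so this sketch assumes a rectangle aligned with the arc that extends by Δ on each side of it and by Δ beyond each node; the names `in_buffer` and `classify` are hypothetical.

```python
import math


def in_buffer(p, a, b, delta):
    """True if p falls in the rectangle around arc a-b (assumed geometry)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return False  # degenerate arc: no buffer is defined
    ux, uy = dx / length, dy / length      # unit vector along the arc
    rx, ry = p[0] - a[0], p[1] - a[1]
    along = rx * ux + ry * uy              # coordinate along the arc
    across = -rx * uy + ry * ux            # signed distance from the arc line
    return -delta <= along <= length + delta and abs(across) <= delta


def classify(points, nodes, delta):
    """Label each point with an arc number (1..n), or 0 if unassigned."""
    labels = []
    for p in points:
        best, best_sum = 0, math.inf
        for k in range(len(nodes) - 1):
            a, b = nodes[k], nodes[k + 1]
            if in_buffer(p, a, b, delta):
                # Tie-break rule: smallest sum of distances to the two nodes.
                s = math.dist(p, a) + math.dist(p, b)
                if s < best_sum:
                    best, best_sum = k + 1, s
        labels.append(best)
    return labels
```

Arcs that end up with fewer than two labelled points would then be removed by deleting one of their nodes, as described above; that step is left out of the sketch.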
3.3 Least-squares adjustment of the nodes
The idea is to interpolate the GPS points falling into the same bounding box with a line,
namely with the following model:
$$y - y_0 = m\,(x - x_0) \qquad (2)$$
where $x_0$ is an arbitrary value set a priori (for example, the midpoint of the x coordinates of
the GPS points), while the corresponding value $y_0$ and the angular coefficient $m$ of the line
are the two unknowns of the problem. Note that the GPS points are “observed” with an error
in both coordinates, so we can write:
$$x_{0i} = x_i + v_{x_i}, \qquad y_{0i} = y_i + v_{y_i}, \qquad \text{with } i = 1, 2, \ldots, N \qquad (3)$$
where the variances of the errors $\sigma_{x_i}^2$ and $\sigma_{y_i}^2$ are provided by the Kalman filter implemented
in goGPS, while the covariance is neglected for simplicity. Assuming that the angular
coefficient $m$ can be written as $m = \tilde{m} + \delta m$ and that the approximation $m\,v_{x_i} \cong \tilde{m}\,v_{x_i}$ holds,
the observation equation of the least-squares problem results:
$$y_{0i} - v_{y_i} - y_0 - m\,(x_{0i} - x_0) + \tilde{m}\,v_{x_i} = 0 \qquad (4)$$
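where the approximate value of the angular coefficient is generally estimated as:
$$\tilde{m} = \frac{1}{N}\sum_{i=1}^{N}\frac{y_{0i} - \bar{y}}{x_{0i} - \bar{x}}, \quad \text{where} \quad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_{0i}, \quad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_{0i}. \qquad (5)$$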
The target function of the least-squares problem can be written as:
$$\phi = \frac{1}{2}\sum_{i=1}^{N}\left(\frac{v_{y_i}^2}{\sigma_{y_i}^2} + \frac{v_{x_i}^2}{\sigma_{x_i}^2}\right) \qquad (6)$$
which has to be minimized subject to the condition (4) using Lagrange multipliers. The
resulting system from which y0 and m are estimated can be written in matrix notation as:
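$$\begin{bmatrix} \sum_i q_i & \sum_i q_i\,(x_{0i} - x_0) \\ \sum_i q_i\,(x_{0i} - x_0) & \sum_i q_i\,(x_{0i} - x_0)^2 \end{bmatrix} \begin{bmatrix} y_0 \\ m \end{bmatrix} = \begin{bmatrix} \sum_i q_i\,y_{0i} \\ \sum_i q_i\,y_{0i}\,(x_{0i} - x_0) \end{bmatrix} \qquad (7)$$
where $q_i = \left(\sigma_{y_i}^2 + \tilde{m}^2\,\sigma_{x_i}^2\right)^{-1}$.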
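The solution of the 2x2 system (7) for a single arc can be sketched in plain Python. The function name `fit_arc` and its arguments are hypothetical; the sketch takes the mean of the x coordinates as $x_0$ by default, evaluates the approximate slope as the average ratio of deviations from the mean values read from equation (5), and skips points whose x coordinate equals the mean to avoid a division by zero (an assumption not discussed in the text).

```python
def fit_arc(xs, ys, var_x, var_y, x0=None):
    """Estimate (y0, m) of the line y - y0 = m (x - x0) for one arc."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    if x0 is None:
        x0 = x_bar  # x0 is arbitrary in the text; the mean is one choice

    # Approximate slope, equation (5); points at the mean x are skipped.
    ratios = [(y - y_bar) / (x - x_bar) for x, y in zip(xs, ys) if x != x_bar]
    m_tilde = sum(ratios) / len(ratios)

    # Weights q_i = 1 / (sigma_y^2 + m_tilde^2 * sigma_x^2).
    q = [1.0 / (vy + m_tilde ** 2 * vx) for vx, vy in zip(var_x, var_y)]
    dx = [x - x0 for x in xs]

    # Entries of the normal system (7).
    s = sum(q)
    sx = sum(qi * d for qi, d in zip(q, dx))
    sxx = sum(qi * d * d for qi, d in zip(q, dx))
    sy = sum(qi * y for qi, y in zip(q, ys))
    sxy = sum(qi * y * d for qi, y, d in zip(q, ys, dx))

    # Closed-form solution of the 2x2 system.
    det = s * sxx - sx * sx
    y0 = (sxx * sy - sx * sxy) / det
    m = (s * sxy - sx * sy) / det
    return y0, m, x0
```

The line model (2) cannot represent an arc parallel to the y axis, and this sketch fails in that case (all x coordinates equal); how such arcs are handled is not stated in the text.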
Since the values of y0 and m are independently estimated arc by arc, it is necessary to
recompute the intersection of two consecutive lines, thus determining the new location of the
intermediate node, which varies not only with respect to the y coordinate but also with respect
to the x coordinate (Figure 4, on the left). However if the new intersection is too far from the
previous node, i.e. if it is beyond a certain distance threshold defined by the user, the x
coordinate of the node is not changed, while the y coordinate is computed as the average of
the two values of the two lines at the x coordinate (Figure 4, on the right). The resulting
sequence of nodes and arcs defines the network topology (polyline).
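Figure 4. Determination of new nodes after least-squares interpolation of the arcs.
Left: the new node is obtained by intersection of the new arcs, which falls inside the threshold circle.
Right: the intersection of the new arcs falls outside the circle, so the new node keeps the x coordinate of the old node.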
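The node update can be sketched as follows, representing each fitted line by a hypothetical tuple `(y0, m, x0)` for the model of equation (2). The function names and the handling of parallel lines (treated like a rejected intersection) are assumptions of this sketch.

```python
import math


def line_y(line, x):
    """Evaluate y = y0 + m (x - x0) for a line given as (y0, m, x0)."""
    y0, m, x0 = line
    return y0 + m * (x - x0)


def update_node(old_node, line_a, line_b, max_shift):
    """New position of the node shared by two consecutive fitted arcs."""
    y0a, ma, x0a = line_a
    y0b, mb, x0b = line_b
    if ma != mb:
        # Intersection of the two lines.
        x = (y0b - y0a + ma * x0a - mb * x0b) / (ma - mb)
        candidate = (x, line_y(line_a, x))
        if math.dist(candidate, old_node) <= max_shift:
            return candidate
    # Fallback: keep the old x and average the two lines at that x.
    x_old = old_node[0]
    return (x_old, 0.5 * (line_y(line_a, x_old) + line_y(line_b, x_old)))
```

For example, with the lines y = x and y = 10 - x the intersection is (5, 5); an old node at (5.5, 4.8) with a 2 m threshold moves there, while an old node at (9, 3) keeps x = 9 and takes the average height 5.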
4. TESTS AND FUTURE DEVELOPMENTS
Various tests were performed to check the performance of the proposed segmentation
algorithm. In this paper we illustrate only one of them, representative of the
current state of the algorithm. The test dataset was surveyed by mounting a low-cost
GPS device (u-blox AEK-4T) on the rooftop of a car, driven for about 50 minutes
through the streets of Izumi and Sakai cities, part of the Osaka metropolitan area. The path
covered approximately 18 km, with a total number of 3047 surveyed points (at 1 Hz
measurement rate). The average speed on straight roads was about 40 km/h, while it was
about 20 km/h on curves. The parameters of the segmentation algorithm were manually
adjusted to obtain a good balance between data reduction and detail level, especially for
representing curves. The node selection was performed by setting the AGNES angular threshold
to 178 degrees and the minimum distance between consecutive nodes to 12 m. The bounding
box size was set to 20 m for the first iteration and 15 m for the second iteration (smaller sizes
led to the deletion of too many arcs). The distance threshold to accept or reject new nodes
was 2 m for both iterations. These values allowed a very good simplification of the trajectory
on straight paths, nevertheless preserving a good level of smoothness for curves (Figure 5).
Figure 5. Arc-segmentation example on a path with smooth curves
(path overlaid on Google Earth).
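For reference, the parameter values reported for this test can be collected in one place, together with a quick consistency check on the survey figures. The dictionary keys are hypothetical names chosen for this sketch, and the path length is the approximate value quoted in the text.

```python
# Parameter values reported for the test (key names are illustrative).
params = {
    "agnes_angle_threshold_deg": 178.0,
    "min_node_distance_m": 12.0,
    "bounding_box_size_m": [20.0, 15.0],  # first and second iteration
    "node_shift_threshold_m": 2.0,        # same for both iterations
}

path_length_m = 18_000.0  # "approximately 18 km"
n_points = 3047           # one point per second (1 Hz)

duration_min = n_points / 60.0              # survey duration in minutes
mean_spacing_m = path_length_m / n_points   # average distance between points
mean_speed_kmh = mean_spacing_m * 3.6       # m/s to km/h at 1 Hz sampling
```

The point count corresponds to about 51 minutes of driving, in line with the stated duration, and to an average spacing of about 6 m between points, so the 12 m minimum node distance is roughly two average point spacings.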
The parameterization obviously depends on the type of survey, therefore adjusting
parameters manually could be very cumbersome for users. To address this issue, future
developments include the automatic tuning of parameters according to the path features.
Another limitation of the current implementation of the arc-segmentation algorithm is
that it works only on non-intersecting paths. This issue will be addressed by
identifying sub-paths composed of non-intersecting segments, which could then be processed
independently. The computation of a full network topology is planned for the future. Further
developments include also the comparison of the algorithm proposed in this paper with the
widely used Douglas-Peucker algorithm for line simplification (Douglas and Peucker, 1973)
and the testing of the resulting polyline with OpenStreetMap map drawing tools.
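For readers who want to try the comparison mentioned above, a minimal recursive version of the Douglas-Peucker line simplification is sketched here. It is the standard textbook algorithm, not part of the method proposed in this paper; the function names and the tolerance argument `tol` are choices made for this sketch.

```python
import math


def _perp_dist(p, a, b):
    """Distance of p from the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.dist(p, a)
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length


def douglas_peucker(points, tol):
    """Simplify a polyline, keeping points farther than tol from the chord."""
    if len(points) < 3:
        return list(points)
    d_max, k_max = 0.0, 0
    for k in range(1, len(points) - 1):
        d = _perp_dist(points[k], points[0], points[-1])
        if d > d_max:
            d_max, k_max = d, k
    if d_max <= tol:
        return [points[0], points[-1]]
    # Split at the farthest point and simplify both halves.
    left = douglas_peucker(points[:k_max + 1], tol)
    right = douglas_peucker(points[k_max:], tol)
    return left[:-1] + right  # avoid duplicating the split point
```

Unlike the procedure proposed here, this algorithm keeps a subset of the original points and does not adjust their positions. For very long tracks an iterative formulation may be preferable, since the recursion depth can grow with the number of points.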
5. ACKNOWLEDGEMENTS
We thank the student Befkadu Nigussie Alemu, who based his master thesis on this
research, for his collaboration.
6. REFERENCES
Brovelli, M.A., Realini, E., Reguzzoni, M., Visconti, M.G., 2008. Comparison of the
performance of medium and low level GNSS apparatus, with and without reference
networks. International Archives of Photogrammetry, Remote Sensing and Spatial
Information Sciences, vol. XXXVI, part 5/C55, 54-61.
Douglas, D.H., Peucker, T.K., 1973. Algorithms for the reduction of the number of points
required to represent a digitized line or its caricature. Cartographica: The International
Journal for Geographic Information and Geovisualization, 10(2), 112-122.
Kaufman, L., Rousseeuw, P.J., 1990. Finding groups in data. An introduction to cluster
analysis. John Wiley & Sons, New York.