Exploratory Completeness Analysis of Mapillary for Selected Cities in Germany and Austria

Levente Juhasz and Hartwig Hochmair
University of Florida, Fort Lauderdale/USA · levente.juhasz@ufl.edu
Full paper double blind review
Abstract
Mapillary, founded in early 2014, is a Web 2.0 application that allows voluntary users to
contribute street level photographs from all over the world. This paper analyses the growth
in uploaded data and contributor numbers over time, and assesses data completeness by
comparing Mapillary with Google Street View and OpenStreetMap data for selected cities
in Germany and Austria. Results show that, as of now, Google Street View generally provides
better coverage in the cities in which it is offered, but that Mapillary has the potential
to reach higher coverage along off-road segments, such as footpaths and bicycle trails, or
even railroads.
1 Introduction
Street level photographs are an important data source for a variety of analysis tasks,
including the identification of road features (e.g., crosswalks, traffic signs) and the
assessment of wheelchair accessibility of sidewalks. Besides commercial products such as
Google Street View or Mapjack, Mapillary is the first platform offering a street level
photograph service based on crowd-sourced data. It therefore provides a unique addition to
the currently available data resources of Volunteered Geographic Information (VGI)
(GOODCHILD 2007). Mapillary is run by a company located in Malmö, Sweden, and began
providing services at the beginning of 2014. Imagery is available (at
http://www.mapillary.com) to the public for any kind of purpose under an open license,
CC BY-SA 4.0.
This study provides an overview of the development of Mapillary data within its first year,
including the number of individual contributors and data growth. Mapillary data has already
been collected for all five continents, with Europe showing the most contributions. The
geographic focus of this study is the 40 largest cities in Germany, and the 4 largest cities in
Austria. Since half of the analysed German cities offer the Google Street View service,
which provides the same type of data as Mapillary, we can use these cities to analyse the
potential effect of the presence of Google Street View data on the completeness of
Mapillary data. Google Street View dates back to 2007 and is currently the most prominent
provider of street level photographs. It allows users to access 360° panorama images from
selected roads all over the world.
Contributors to Mapillary can either upload photos through an application on a GPS-enabled
smartphone, or manually via the Mapillary Website, as long as the photograph comes with
EXIF metadata containing geographic coordinates. Google Street View focuses on main
roads, since this facilitates fast data capture from cars, and therefore frequent data
updates. Google has also developed solutions to capture Street View photos along off-road
trails, including a special bike and even a backpack. Mapillary contributors typically use
mobile phones to upload data from wherever they are traveling. This frequently includes
off-road paths, such as trails in parks.
GI_Forum Journal for Geographic Information Science, 1-2015.
© Herbert Wichmann Verlag, VDE VERLAG GMBH, Berlin/Offenbach. ISBN 978-3-87907-558-4.
© ÖAW Verlag, Wien. ISSN 2308-1708, doi:10.1553/giscience2015s535.
Fig. 1(a) provides a screenshot of how a collected street level picture is shown on the
Mapillary Website. The lines visualize GPS tracks, along which pictures were taken, and
the icon with the arrow indicates the image location together with the orientation of the
camera. This picture was taken from a scenic viewpoint at an off-road location in Salzburg,
Austria.
2 Previous Work
Previous studies report on the use of Google Street View imagery as an effective method
for neighbourhood audits that eliminates in-person fieldwork (CLARKE et al. 2010, RUNDLE
et al. 2011, GRIEW et al. 2013, VANWOLLEGHEM et al. 2014). Google Street View imagery has
also been analysed using computer vision to determine the geographic position of other
photos, or to identify road features (ZAMIR & SHAH 2010, HARA et al. 2013). Previous
research has, however, not analysed the spatial coverage and completeness of Google Street
View. This will be addressed in our paper, at least for selected cities, through assessing the
overlap of Google Street View geometries with OpenStreetMap (OSM) road features. One
related study compared the completeness of Google Maps road feature data (not Google
Street View, though) with those from Bing Maps and OSM for five cities in Ireland,
concluding that none of these datasets was substantially better than another (CIPELUCH
et al. 2010).
For the part of this study that assesses the completeness of Mapillary and Google Street
View data, OSM network data is used as a reference dataset. Although OSM is not governed
by an authoritative agency that guarantees certain quality standards, OSM was found to
provide good coverage for roads in analysed urban areas (GIRRES & TOUYA 2010,
HAKLAY 2010, ZIELSTRA & ZIPF 2010). OSM coverage of road segments for non-motorized
traffic (e.g., walking or cycling trails) was also shown to be of high quality. For
example, for seven cities in the US and Europe it was found that the length of off-road
bicycle trails mapped in OSM nearly doubled, or grew even more, between 2009 and 2013 in
all cities except London (HOCHMAIR et al. 2015). Another study showed that, for selected
cities in Germany, the total length of mapped footpaths was higher in OSM than for a
proprietary data provider (Tele Atlas back then, now TomTom) (ZIELSTRA & HOCHMAIR 2011).
For example, in Berlin OSM had 3.2 times as much pedestrian related network data as Tele
Atlas. For Munich this ratio was as high as 5.6. These numbers give evidence that OSM
provides a robust reference dataset for determining Mapillary and Google Street View data
completeness, also for off-road data analysis.
3 Study Design
3.1 Study Area
Only the 20 largest German cities have Google Street View coverage, in addition to
some smaller towns, such as Oberstaufen. For our analysis we selected the 40 largest cities
in Germany, which include 20 cities with and 20 cities without Google Street View
imagery. This split allows us to study the potential effect of the availability of Street
View imagery on Mapillary coverage in different cities. Furthermore, we also examined the
Mapillary coverage in the four largest Austrian cities. The effect of Google Street View
imagery on Mapillary coverage could not be assessed for Austrian cities since this Google
service is not available in Austria. The German and Austrian cities comprising the study
area are shown in Fig. 1(b) (green and blue areas). Red lines indicate Mapillary coverage
outside the selected cities. As can be seen, the data collection efforts of the Mapillary
community have so far focused on urban areas.
Fig. 1: Street level image as shown on the Mapillary Website (a) and study area (b)
3.2 Data Preparation
3.2.1 Tile System
The completeness analysis of this study is based on raster tiles. To facilitate the comparison
of Google Street View data with other datasets, we adhered to the tile specifications used in
Google Street View. Tiles of the Street View coverage are accessible via the Google Maps
API and provided as 256 x 256 pixel PNG images. In this system, the world is divided into
tiles corresponding to zoom levels. In each zoom level, tiles are indexed by X (column) and
Y (row) values starting from the top-left corner. Zoom level 0 covers the whole world in
one tile. The number of tiles in each zoom level is $2^{2z}$, where z is the zoom level.
Logically, this system can be considered a hierarchy of folders and files, where each zoom
level is a folder, each X coordinate is a subfolder and each Y coordinate is a PNG image
file. This so-called XYZ tile scheme provided by Google became a de facto standard in Web mapping,
and is used by other map providers including Bing Maps, Yahoo Maps, OSM, and Mapbox.
Using this scheme allows one to convert geographic coordinates to tile coordinates and
vice versa. As a preliminary step in our analysis, we converted all data sources into
this scheme.
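The conversion between geographic and tile coordinates can be sketched as follows. This is an illustrative example based on the widely used Web Mercator XYZ convention, not the authors' script; the function names are assumptions of this sketch.

```python
import math


def tiles_at_zoom(z: int) -> int:
    """Total number of tiles at zoom level z: 2**z columns times 2**z rows."""
    return 2 ** (2 * z)


def lonlat_to_tile(lon: float, lat: float, z: int) -> tuple[int, int]:
    """Convert WGS84 longitude/latitude in degrees to XYZ tile (column, row).

    Columns (X) count from the west, rows (Y) from the north, so tile (0, 0)
    is the top-left tile of the zoom level.
    """
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp to the valid index range so that lon = 180 does not overflow.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_lonlat(x: int, y: int, z: int) -> tuple[float, float]:
    """Return longitude/latitude of the north-west corner of tile (x, y)."""
    n = 2 ** z
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lon, lat


if __name__ == "__main__":
    # Tile containing a point in central Berlin at zoom level 13.
    print(lonlat_to_tile(13.405, 52.52, 13))
    print(tiles_at_zoom(13))
```

Calling `tile_to_lonlat` with `(x + 1, y + 1)` yields the opposite (south-east) corner, so the two corners together give the bounding box of a tile.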
3.2.2 Google Street View
Google does not provide information on how its Street View tiles are generated. For this
study, zoom level 13 tiles were chosen for the analysis, but Street View coverage tiles are
highly generalized by default (Fig. 2a) at that level. To make them comparable to the
Mapillary dataset, tiles were regenerated (Fig. 2b). To do so, a vector version of Street
View lines was extracted based on the Street View coverage, which was downloaded at
zoom level 17 (i.e. further zoomed in), using a client-side script in September 2014. As a
result of this step, all PNG tiles appeared in the Web browser’s cache. Individual images of
zoom level 17 were then extracted from the cache and stored within the XYZ tile folder
structure. This zoom level corresponds to a ground pixel size of approximately 1.2 m. After
geocoding each tile, the PNG raster tiles were loaded into GRASS GIS, where they were
patched together, a thinning algorithm was applied, and the 1 pixel wide raster images were
vectorised. All extracted lines were then uploaded to a PostgreSQL database with line
geometries.
Fig. 2: Original (a) and regenerated (b) Street View tiles at zoom level 13
As the last step, the TileMill and Mapnik toolkits were used to render tiles at zoom level
13. Background pixels had a value of 0. The pixel size at this zoom level is ~19 m, which
conceals GPS positioning errors occurring during the data collection process.
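The pixel sizes quoted in this section can be checked with the usual Web Mercator ground resolution formula. The sketch below is illustrative only; the function names and the WGS84 equatorial radius constant are assumptions of this example, not part of the original workflow. At the equator it gives roughly 19 m per pixel at zoom level 13 and roughly 1.2 m at zoom level 17, and the value shrinks with the cosine of the latitude.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius used by Web Mercator
TILE_SIZE_PX = 256          # tile edge length in pixels


def ground_resolution(z: int, lat_deg: float = 0.0) -> float:
    """Approximate ground size of one pixel in metres at zoom z and latitude."""
    circumference = math.tau * EARTH_RADIUS_M
    return circumference * math.cos(math.radians(lat_deg)) / (TILE_SIZE_PX * 2 ** z)


def tile_extent_km(z: int, lat_deg: float = 0.0) -> float:
    """Approximate edge length of one tile in kilometres."""
    return ground_resolution(z, lat_deg) * TILE_SIZE_PX / 1000.0


if __name__ == "__main__":
    for zoom in (13, 15, 17):
        print(zoom, round(ground_resolution(zoom), 2), round(tile_extent_km(zoom), 2))
```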
3.2.3 Mapillary
Mapillary offers various methods to download data via their JSON API. For this study,
image sequences were used, which are LineStrings of coherent images and their attributes.
Images are taken one after another by walking, driving, or riding a bike. Although
geometries and additional attributes of each sequence can be downloaded, the Mapillary team
provided us with a database dump that included some additional information, including user
ID, timestamp, and geometry of each sequence. The lines are GPS trajectories, where each
node is the position of an image. Individual GPS trajectories are mapped as separate lines
even if they were taken on the same road. Using map tiles with a zoom level of 13 (pixel
size ~19 m) was found to be an efficient method to avoid double counting sequences on the
same road. This means that segments of multiple lines are counted as only one line if they
fall within the same pixel. Mapillary tiles were then rendered with the same parameters
used to render Street View tiles. We used a dataset that contains LineStrings of photos
uploaded to Mapillary up until November 18, 2014.
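The effect of rasterising sequences at zoom level 13 can be illustrated with a small sketch. It is not the rendering pipeline used in the study (which relied on TileMill and Mapnik); it is a simplified stand-in that converts sequence nodes to global pixel coordinates and interpolates between them, so that overlapping sequences collapse into the same pixels. All names are hypothetical.

```python
import math


def lonlat_to_pixel(lon: float, lat: float, z: int, tile_px: int = 256) -> tuple[int, int]:
    """Global Web Mercator pixel (column, row) of a WGS84 point at zoom z."""
    n = tile_px * 2 ** z
    px = int((lon + 180.0) / 360.0 * n)
    py = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return px, py


def covered_pixels(sequences, z: int = 13) -> set:
    """Set of pixels touched by any sequence; shared pixels are stored once.

    Each sequence is a list of (lon, lat) nodes. Segments are filled by
    simple linear interpolation in pixel space, a crude stand-in for a
    proper line rasteriser.
    """
    pixels = set()
    for seq in sequences:
        pts = [lonlat_to_pixel(lon, lat, z) for lon, lat in seq]
        pixels.update(pts)
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0))
            for i in range(1, steps):
                t = i / steps
                pixels.add((round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0))))
    return pixels


if __name__ == "__main__":
    # Two passes along the same stretch of road, in opposite directions.
    first = [(13.70, 51.05), (13.71, 51.05)]
    second = list(reversed(first))
    print(len(covered_pixels([first])), len(covered_pixels([first, second])))
```

Because the result is a set, the covered length is estimated from the number of distinct pixels rather than from the summed length of all trajectories.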
3.2.4 OpenStreetMap
An OSM database dump was downloaded from Geofabrik in October 2014. All roads were
extracted with Osmosis using a highway=[key] filter, and uploaded into a spatially-enabled
PostgreSQL database. Since this study uses certain OSM road categories, additional queries
were formulated to extract the following road categories:
• Main roads: connect settlements and cities
• Residential roads: minor, lower level roads with moderate traffic
• Pedestrian/Cycle roads: minor elements of the road network used by pedestrians or
  cyclists for daily routine or recreational purposes
Inaccessible roads, sidewalks, road crossings, tunnels, and indoor features were excluded
based on their tags. Further, all pedestrian and bicycle highway features within 25 m from
main and residential roads were removed to be able to assess completeness of off-road
pedestrian and bicycle features in Mapillary and Google Street View. In a final step, OSM
map tiles were generated for zoom level 13 in the same way as previously done for Google
Street View and Mapillary data.
3.3 Determination of Relative Completeness Between Data Sources
3.3.1 Mapillary vs. Google Street View
Twenty of the 40 analysed cities in Germany have Google Street View service, which
allows a comparison between Mapillary and Street View coverage. Since Mapillary is a
relatively new service, it can be expected that Street View has better coverage in most
analysed areas. However, since Street View is mostly bound to car accessible roads, it can
also be expected that Mapillary exceeds Street View coverage in some off-road areas, e.g. at
recreational sites inaccessible to cars.
A self-developed Python script compared tiles from the Mapillary and Street View datasets
within predefined boundaries. The script loaded each tile from the two datasets and counted
the non-zero pixels within the tile. As a result, each unique tile identifier could be
associated with the count values of Mapillary and Street View pixels of that tile. The
geographic extent of a tile at zoom level 13 is ~4.9 km x 4.9 km. Each tile was
subsequently divided into 16 squares with a spatial resolution of ~1.2 km x 1.2 km. This
provided enough detail to illustrate local differences in mapping completeness at the city
level. This new reference system is identical to the tile system at zoom level 15. Results
were uploaded into a PostgreSQL database with polygon vector geometries. A relative
completeness difference that ranges between -1 and 1 was calculated (Equation 1a). A value
of 1 indicates that a tile contains only Street View coverage but no Mapillary data,
whereas -1 means the opposite.
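The counting step can be sketched in a few lines of Python. This is an illustration rather than the authors' script: tiles are represented as plain nested lists of pixel values, the function names are hypothetical, and the relative completeness difference is assumed to be (SV - Map) / (SV + Map), which matches the stated range of -1 to 1. Squares without coverage in either dataset are returned as `None`, which is a choice of this sketch.

```python
def count_blocks(tile, blocks: int = 4):
    """Count non-zero pixels in each of blocks x blocks sub-squares of a tile.

    `tile` is a square list of rows; its size is assumed to be divisible by
    `blocks` (256 / 4 = 64 pixels per square for a standard tile).
    """
    size = len(tile)
    step = size // blocks
    counts = [[0] * blocks for _ in range(blocks)]
    for row in range(size):
        for col in range(size):
            if tile[row][col] != 0:
                counts[row // step][col // step] += 1
    return counts


def relative_difference(sv: int, mapillary: int):
    """Relative completeness difference; None if neither dataset has pixels."""
    total = sv + mapillary
    if total == 0:
        return None
    return (sv - mapillary) / total


def compare_tiles(sv_tile, mapillary_tile, blocks: int = 4):
    """Relative difference for every sub-square of one tile pair."""
    sv_counts = count_blocks(sv_tile, blocks)
    map_counts = count_blocks(mapillary_tile, blocks)
    return [
        [relative_difference(sv_counts[r][c], map_counts[r][c]) for c in range(blocks)]
        for r in range(blocks)
    ]
```

Splitting a 256 x 256 pixel tile into 4 x 4 squares corresponds to the 16 squares per tile described above.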
3.3.2 Completeness Relative to OpenStreetMap
The method to calculate Mapillary and Street View completeness relative to predefined
OSM road categories is similar. The difference is how pixels are counted. Instead of
counting all pixels within a tile, computing completeness requires identifying only those
Mapillary and Street View pixels that overlap with pixels of selected OSM road categories.
Since the reference tile system is the same for all datasets, it is sufficient to identify
the position of OSM road pixels in a tile, and then check whether that specific pixel has a
value of one in the other datasets or not. Results were again uploaded to a PostgreSQL
database. Overall completeness of each city can then be calculated by selecting tiles
within the city boundary and computing the fraction of Mapillary or Street View pixels
overlapping with OSM road pixels in the selected tiles (Equation 1b). Calculations were
performed on all predefined OSM road categories for Mapillary and Street View.
,0, 0

,0
rr
r
Where d is the relative completeness differ-
ence, SV is the count of Street View pixels and
Map is the count of Mapillary pixels
Where Cr is the completeness of Mapil-
lary or Street View on a road category,
Pxr is a pixel overlapping with a pixel
from the OSM road category r and OSMr
is a pixel of the OSM road category r.
(a) (b)
Equation 1: Relative completeness difference (a) and computation of completeness (b)
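The completeness computation of Equation 1b can be illustrated as follows. This is a minimal sketch under the assumption that the OSM, Mapillary and Street View tiles are aligned pixel grids given as nested lists, with non-zero values marking coverage; the function name is hypothetical and the code is not taken from the study.

```python
def completeness(osm_tiles, data_tiles):
    """Fraction of OSM road pixels that are also covered in the other dataset.

    `osm_tiles` and `data_tiles` are equally long sequences of aligned tiles
    (lists of rows) for one city and one OSM road category. Returns None if
    the selected tiles contain no OSM road pixel at all.
    """
    osm_pixels = 0
    overlapping = 0
    for osm_tile, data_tile in zip(osm_tiles, data_tiles):
        for osm_row, data_row in zip(osm_tile, data_tile):
            for osm_px, data_px in zip(osm_row, data_row):
                if osm_px != 0:
                    osm_pixels += 1
                    if data_px != 0:
                        overlapping += 1
    if osm_pixels == 0:
        return None
    return overlapping / osm_pixels


if __name__ == "__main__":
    # Toy 2 x 2 "tiles"; real tiles are 256 x 256 pixels.
    osm = [[[1, 1], [0, 1]], [[1, 0], [0, 0]]]
    mapillary = [[[1, 0], [1, 1]], [[0, 0], [0, 0]]]
    print(completeness(osm, mapillary))
```

Pixels of the compared dataset that do not coincide with an OSM road pixel, such as the lower-left pixel of the first toy tile, do not contribute to the result.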
4 Results
4.1 User Contributions
Fig. 3(a) shows the number of Mapillary users actively contributing in Germany and Austria
for each month. The majority of users contribute on a regular basis (returning users).
By November 2014 the total number of contributing users had reached 388. Mappers covered
sequences of more than 46,500 km, although it must be noted that some of the sequences
were taken on the same roads. Fig. 3(b) shows the distribution of data contributors for
various total distance ranges. The most active contributor uploaded pictures along almost
6,100 km to Mapillary. Fig. 3(b) also expresses inequalities in data contributions among
volunteers, which have previously been identified for other VGI data sources, such as OSM
(NEIS & ZIELSTRA 2014) or drone images (HOCHMAIR & ZIELSTRA 2014). In those VGI data
sources, as with Mapillary, few users contribute most of the data, whereas the majority of
users make only a few data contributions. In the case of Mapillary, more than 60% (205) of
the users contributed less than 10 km. Conversely, only 5% (19) mapped 500 km or more.
Another aspect of the contribution pattern is the number of cities a user mapped in.
Fig. 3(c) shows that the largest group of users (40%) contributed in only one city, as
opposed to only 4% of users who contributed in 5 or more study cities, which again reflects
contribution inequality.
Fig. 3: User contributions to Mapillary
4.2 Completeness
4.2.1 Relative Completeness Difference Between Mapillary and Street View
The relative completeness difference between Mapillary and Google Street View data was
determined for the 20 largest cities in Germany. This computation was not possible for
Austria, where Street View is not provided. First, all tiles within a city were selected, then
all relevant pixels within those tiles were used to calculate the relative completeness
difference based on Equation 1a. Values range between 0.38 (Dresden) and 0.98 (Düsseldorf).
The values are all positive, meaning that Google Street View provides better overall
coverage in the analysed cities.
The median of relative completeness differences is 0.84. An outlier (Dresden) with a value
of 0.38 indicates that this city has high Mapillary coverage compared to other cities.
Fig. 4(a) and 4(b) visualize patterns of relative completeness difference for Dresden and
Berlin (the latter with a value of 0.65). Purple tiles (without borders) show areas where
more roads are mapped in Google Street View, whereas orange tiles (with borders) show areas
with higher Mapillary coverage. The low relative completeness difference for Dresden matches
the visual appearance of the tile map with its large portions of orange areas in the suburbs
and the town centre. In contrast, Berlin shows fewer orange areas. Whereas Street View
imagery mostly relies on the road network, Mapillary photos are often taken off-road or on
minor roads. Typical examples can be found on riversides. Mapillary is more complete in a
park in Dresden (Fig. 4(c)) or along a river in Frankfurt (Fig. 4(d)). In general, differences
between both data sources are smaller in city centres or other populated areas, indicated
through lighter colours.
Fig. 4: Spatial distribution of relative completeness difference (a-b) and examples of
better Mapillary coverage (c-d)
4.2.2 Completeness Relative to OpenStreetMap Features
At the city level, Street View is more complete than Mapillary, but it is limited to the 20
biggest German cities, with some exceptions. Completeness values relative to OSM are
shown in Table 1 for some selected German cities and the largest four cities in Austria.
Table 1: Completeness relative to selected OSM road categories (%)

              OSM Main             OSM Residential      OSM Ped./Cycle       Street
City          Google   Mapillary   Google   Mapillary   Google   Mapillary   View
Austria
Vienna         0       13.93        0        2.26        0        0.80
Graz           0       32.72        0       16.25        0        7.00
Linz           0        8.82        0        2.39        0        0.82
Salzburg       0       67.73        0       13.47        0        5.68
Germany
Berlin        91.44    42.41       81.44     8.10        1.67     1.86       X
Hamburg       83.35    15.36       72.80     1.84        2.15     0.54       X
Munich        90.08    22.11       82.81     3.76        2.43     1.52       X
Frankfurt     82.72    14.95       63.66     1.37        2.62     3.63       X
Düsseldorf    87.97     1.32       74.44     0.19        3.30     0.19       X
Dortmund      85.72    20.82       79.02     2.51        3.76     0.65       X
Bremen        85.33     6.46       75.49     0.45        1.85     0.28       X
Dresden       86.77    59.01       78.10    18.75        2.18     4.65       X
Nuremberg     92.10    22.94       80.19     3.21        2.96     0.54       X
Duisburg      88.84     4.31       83.99     0.07        3.49     0.12       X
Bielefeld     77.11    20.50       58.23     1.50        1.72     0.67       X
Major roads are well covered in Google Street View, with completeness ranging between
77% (Bielefeld) and 92% (Nuremberg). Numbers are also high for residential roads, ranging
between 58% (Bielefeld) and 84% (Duisburg), but much smaller for off-road
pedestrian/cycle paths, with a maximum of 4%. Completeness is considerably smaller for
Mapillary in all categories, with a few exceptions in the pedestrian/cycle category. For the
latter category it is possible that some of the positive percentage numbers in Google Street
View stem from images taken by cars along roads in the vicinity of pedestrian or cycling
paths. This situation can occur near roads that were not previously excluded from the
analysis (only main or residential roads were removed), but are still driveable, such as
“track” or “service” roads in OSM. Salzburg provides the most complete coverage of main
roads in Mapillary among all cities, with a value of 68%. Considering all 44 selected
cities, main roads in Mapillary are mapped most completely (18.6%), followed by residential
roads (3.2%) and pedestrian or cycling paths (1.1%).
No association could be identified between city size and data completeness in Mapillary.
For example, some of the larger cities, such as Düsseldorf (pop. ~ 590,000), Bremen (pop.
~ 550,000), or Duisburg (pop. ~ 490,000) are poorly covered in Mapillary. Previous
research suggests that the availability of free datasets from other sources in a region
diminishes the community’s motivation for voluntary data collection efforts, e.g. in
the case of OSM (ZIELSTRA & HOCHMAIR 2011). However, a comparison of mean Mapillary
completeness between cities with and without Google Street View did not reveal
statistically significant differences between both groups of cities.
4.2.3 Data Growth
Fig. 5(a) shows the growth of average Mapillary completeness values for the three OSM
road categories and the selected 44 cities. Completeness grows more rapidly for main roads
than for the other road categories. Fig. 5(b)-(d) show the completeness growth curves for
five selected cities in the three road categories. Dresden and Salzburg reveal the highest
current completeness in all three categories. A very active mapping period can be
identified between May and July 2014 in Salzburg. For the other four selected cities,
growth slowed down in August 2014. However, this trend cannot be observed when considering
all 44 selected cities (Fig. 5(a)).
Fig. 5: Completeness growth in Mapillary data: Average for three road categories in
44 cities (a), and completeness in five selected cities for three road categories
(b)-(d)
5 Conclusion and Future Work
This study analysed the completeness and development of Mapillary data, which presents
an alternative data source of street level photographs to commercial datasets. This study
focused on 40 German and 4 Austrian cities. A comparison with Google Street View
showed that Google Street View provides better coverage than Mapillary, with a few
exceptions. Since Mapillary is not bound to professional equipment that needs to be moved
by car, it could become a complementary data source of street level imagery to Google
Street View for footpaths and bicycle trails, even in those cities where Google Street View
is present.
For future work we plan to extend the spatio-temporal analysis of Mapillary data beyond
Germany and Austria. It remains to be seen whether privacy concerns of local residents will
hamper the continuous growth of Mapillary, as was the case for Google Street View in some
European countries. Another aspect of future work is to analyse to what extent Mapillary is
used as a data source for OSM. Tags in the source field of OSM features indicate that OSM
mappers are already using Mapillary imagery to add landmark point features, such as bus
stops, which can be visually identified on Mapillary street level photographs. As with
Mapillary, Germany was also one of the most active regions during the beginning of the OSM
project. We therefore plan to analyse whether the same group of users that pushed the
OSM project is also actively contributing to the Mapillary project, or whether Mapillary
reaches out to a new crowd of voluntary mappers.
References
CLARKE, P., AILSHIRE, J., MELENDEZ, R., BADER, M. & MORENOFF, J. (2010), Using
Google Earth to conduct a neighborhood audit: Reliability of a virtual audit instrument.
Health & Place, 16 (6), 1224-1229.
GIRRES, J. F. & TOUYA, G. (2010), Quality assessment of the French OpenStreetMap
dataset. Transactions in GIS, 14 (4), 435-459.
GOODCHILD, M. F. (2007), Citizens as Voluntary Sensors: Spatial Data Infrastructure in the
World of Web 2.0 (Editorial). International Journal of Spatial Data Infrastructures
Research (IJSDIR), 2, 24-32.
GRIEW, P., HILLSDON, M., FOSTER, C., COOMBES, E., JONES, A. & WILKINSON, P. (2013),
Developing and testing a street audit tool using Google Street View to measure
environmental supportiveness for physical activity. International Journal of Behavioral
Nutrition and Physical Activity, 10 (1), 103.
HAKLAY, M. (2010), How good is Volunteered Geographical Information? A comparative
study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B:
Planning and Design, 37 (4), 682-703.
HARA, K., LE, V., SUN, J., JACOBS, D. & FROEHLICH, J. E. (2013), Exploring Early
Solutions for Automatically Identifying Inaccessible Sidewalks in the Physical World
Using Google Street View. Human Computer Interaction Consortium 2013.
HOCHMAIR, H. H. & ZIELSTRA, D. (2014), Analysing User Contribution Patterns of Drone
Pictures to the dronestagram Photo Sharing Portal. Journal of Spatial Science.
HOCHMAIR, H. H., ZIELSTRA, D. & NEIS, P. (2015), Assessing the Completeness of Bicycle
Trail and Designated Lane Features in OpenStreetMap for the United States.
Transactions in GIS, 19 (1), 63-81.
NEIS, P. & ZIELSTRA, D. (2014), Recent developments and future trends in volunteered
geographic information research: The case of OpenStreetMap. Future Internet, 6 (1),
76-106.
RUNDLE, A. G., BADER, M. D. M., RICHARDS, C. A., NECKERMAN, K. M. & TEITLER, J. O.
(2011), Using Google Street View to Audit Neighborhood Environments. American
Journal of Preventive Medicine, 40 (1), 94-100.
VANWOLLEGHEM, G., DYCK, D. V., DUCHEYNE, F., BOURDEAUDHUIJ, I. D. & CARDON, G.
(2014), Assessing the environmental characteristics of cycling routes to school: a study
on the reliability and validity of a Google Street View-based audit. International Journal
of Health Geographics, 13 (19).
ZAMIR, A. R. & SHAH, M. (2010), Accurate Image Localization Based on Google Maps
Street View. In: DANIILIDIS, K., MARAGOS, P. & PARAGIOS, N. (Eds.), Computer Vision
– ECCV 2010. Springer, Berlin/Heidelberg, 255-268.
ZIELSTRA, D. & HOCHMAIR, H. H. (2011), A Comparative Study of Pedestrian Accessibility
to Transit Stations Using Free and Proprietary Network Data. Transportation Research
Record: Journal of the Transportation Research Board, 2217, 145-152.
... The explicit and homogeneous data quality makes it easier for researchers to assess the applicability of the data for a particular task. However, the costly data collection approach also results in spatial unavailability for some cities/countries at the macroscale (e.g., for policy reasons), as well as at the microscale along pavements, cycle tracks, and walkways (Juhasz and Hochmair 2015;Li 2021;Ki, Park, and Chen 2023). Furthermore, it is not common for SVIs collected using the traditional and costly methods to have limited temporal resolution due to low update frequency and limited access to older imagery (d' Andrimont et al. 2018). ...
... Notably, despite large-scale data imports from governments (e.g., VicRoads) or other agencies (e.g., Bing Maps), VSVI as a whole still suffers from quality uncertainty due to inconsistencies in the quality requirements of these power contributors and the fact that the contribution of a single source tends to be concentrated on specific regions and road types. Such novel SVI datasets have the potential to provide images in districts and along minor roads (e.g., cycle tracks and walkways) where traditional SVI collectors are more difficult to cover (Juhasz and Hochmair 2015;Biljecki and Ito 2021;Ding, Fan, and Gong 2021;Zheng and Amemiya 2023). Furthermore, access to a more complete dataset, including older images, enhances the ability of VSVI to provide data with higher temporal and spatial resolution (d' Andrimont et al. 2018;Tsutsumida and Funada 2023). ...
... Nevertheless, the use of these user-generated data in environmental audit studies is currently limited, partly owing to the extensive coverage and popularity of traditional SVI (especially GSV) and, more importantly, the inherent spatial heterogeneity of VSVI contributions (Juhasz and Hochmair 2015;Juhász and Hochmair 2016;d'Andrimont et al. 2018;Ma et al. 2019;Mahabir et al. 2020;Seto and Nishimura 2022) and the unknown mechanisms governing the quality control of these data. Understanding the mechanisms by which contribution activities yield quality improvements to VSVI can confirm the feasibility of this contribution method in building high-quality datasets and provide an easy-to-calculate trust measure as a proxy for quality assessment, especially in contexts where authoritative data are unavailable (Haklay et al. 2010;Antoniou and Skopeliti 2015). ...
Article
Full-text available
Street View Imagery (SVI) is crucial for urban environmental audits. Although Volunteered Street View Imagery (VSVI) has the potential to provide higher spatial and temporal resolution than traditional SVI, the use of such user‐generated data is currently limited by the inherent spatial heterogeneity in contributions and unclear quality control. To explore the underlying mechanisms, this study focused on the validity of Linus’ law in VSVI quality improvement in terms of spatial and temporal qualities related to environmental audits, using Mapillary as an example. It is assumed that as the number of street revisits for a road increases, so does the level of data quality. Results from regression and correlation analyses show that Linus’ law applies to VSVI quality improvement with different relationships depending on quality elements and road types. Furthermore, this study determines the number of revisits required to exceed the traditional SVI or achieve a higher quality level.
... The difference between the VSVI and reference data results may be explained by unavoidable factors, such as plant growth in different months ( Figure 10) and years (Figure 11), as well as different shooting positions (Figure 12). VSVI has received increasing attention from the research domain in recent years [35,[41][42][43][44][45] and has been used in various surveying tasks such as road information extraction [46,47], building observation [48], and crop monitoring [49]. Using full-free open VSVI data in SGV surveys can help to expand the types of useful data in streetscape monitoring research and facilitate the democratization of urban audit surveys using big data. ...
... Only 96 intersections (49 after the filtering process) out of 272 have available data, and only a small number of intersections have pictures from all road directions. Even though the amount of Mapillary imagery data in Japan has surpassed 40 million [35] and is growing at an exponential rate, complete coverage of roads and directions remains difficult for VSVI data contributed by the crowd, VSVI has received increasing attention from the research domain in recent years [35,[41][42][43][44][45] and has been used in various surveying tasks such as road information extraction [46,47], building observation [48], and crop monitoring [49]. Using full-free open VSVI data in SGV surveys can help to expand the types of useful data in streetscape monitoring research and facilitate the democratization of urban audit surveys using big data. ...
Article
Street greenness visibility (SGV) is associated with various health benefits and positively influences perceptions of landscape. Lowering the barriers to SGV assessments and measuring the values accurately is crucial for applying this critical landscape information. However, the verified available street view imagery (SVI) data for SGV assessments are limited to the traditional top-down data, which are generally used with download and usage restrictions. In this study, we explored volunteered street view imagery (VSVI) as a potential data source for SGV assessments. To improve the image quality of the crowdsourced dataset, which may affect the accuracy of the survey results, we developed an image filtering method with XGBoost using images from the Mapillary platform and conducted an accuracy evaluation by comparing the results with official data in Shinjuku, Japan. We found that the original VSVI is well suited for SGV assessments after data processing, and the filtered data have higher accuracy. The discussion on VSVI data applications can help expand useful data for urban audit surveys, and this full-free open data may promote the democratization of urban audit surveys using big data.
... Comparing these results with GSV, they further reported that GSV consistently provided greater completeness. Similar results were also reported by Juhasz and Hochmair [51] assessing the completeness of Mapillary for cities in Germany and Austria. Building on this work, Juhasz and Hochmair [52] further examined the cross-linkage between Mapillary and OSM and reported that most Mapillary tags used within OSM relate to changesets (i.e., group of edits) compared to individually edited features. ...
Article
Over the last decade, Volunteered Geographic Information (VGI) has emerged as a viable source of information on cities. During this time, the nature of VGI has been evolving, with new types and sources of data continually being added. In light of this trend, this paper explores one such type of VGI data: Volunteered Street View Imagery (VSVI). Two VSVI sources, Mapillary and OpenStreetCam, were extracted and analyzed to study road coverage and contribution patterns for four US metropolitan areas. Results show that coverage patterns vary across sites, with most contributions occurring along local roads and in populated areas. We also found that a few users contributed most of the data. Moreover, the results suggest that most data are being collected during three distinct times of day (i.e., morning, lunch and late afternoon). The paper concludes with a discussion that while VSVI data is still relatively new, it has the potential to be a rich source of spatial and temporal information for monitoring cities.
... Despite the wide use of Street View imagery in geospatial applications and research, the spatial coverage of Street View, i.e., data completeness and geographic extent, has so far not been discussed in the literature. An earlier version of the study presented in this paper revealed that Street View provides fairly complete coverage where service is offered in a city (Juhász and Hochmair 2015). ...
Thesis
Volunteered Geographic Information (VGI) refers to spatial data generated by ordinary citizens through their online activities. The era of Web 2.0 experienced a rapid growth both in the number of available VGI platforms and in the number of users who engage in such activities. VGI is already proven to be useful for a number of applications, ranging from the creation of traditional map data to extracting traffic signs from voluntarily shared photographs. The geographic information science (GIScience) community also started to excessively utilize VGI to answer a variety of research questions, for example to better understand human mobility through geotagged social media messages, or to provide ground information for first responders during natural or man-made crises. However, as VGI is often generated by non-professionals, or for reasons other than scientific research, its quality can be questionable. Naturally, this dissertation has a community focus since people who generate VGI are key to understanding the characteristics of data they may provide. Therefore, this work carries out a set of investigations to illustrate recent trends in contribution behavior, especially on new data platforms, which in turn will help understand data quality issues associated with VGI. First, the emergence of new platforms is illustrated by analyzing and characterizing user contributions to Mapillary, a street level photograph application, both on the country and on the individual level. A completeness assessment and comparison to a commercial platform is also given to illustrate how effective contributors are in reaching the goals of a new mapping platform. In the next chapters, using Mapillary and OpenStreetMap as examples, the data interplay is discussed between these two platforms. The cross-fertilization of data between VGI platforms was only observed in recent years. This research describes how user generated data from one platform is being used to improve another, and analyzes the spatio-temporal characteristics of this behavior using various techniques. Lastly, the final part of the dissertation conducts an experiment with VGI users and assesses the effectiveness of outreach techniques in triggering community growth. In addition, the behavior of different VGI user groups is also described with special emphasis on the motivation of users.
... Despite the widespread use of Street View imagery in geospatial applications and research, the spatial coverage of Street View, i.e. data completeness and geographic extent, has so far not been discussed in the literature. An earlier version of the study presented in this article revealed that Street View provides fairly complete coverage where service is offered in a city (Juhász and Hochmair 2015). For the 20 largest German cities Street View covers 77-92% of the main roads, 58-84% of the residential roads, but fewer than 4% of off-road pedestrian and bicycle network segments. ...
Article
Mapillary is a Web 2.0 application which allows users to contribute crowdsourced street level photographs from all over the world. In the first part of the analysis this article reviews Mapillary data growth for continents and countries as well as the contribution behavior of individual mappers, such as the number of days of active mapping. In the second part of the analysis the study assesses Mapillary data completeness relative to a reference road network dataset at the country level. In addition, a more detailed completeness analysis is conducted for selected urban and rural areas in the US and part of northern Europe for which the completeness of Mapillary data will also be compared with that of Google Street View. Results show that Street View provides generally a better coverage on almost all road categories with some exceptions for pedestrian and cycle paths in selected cities. However, Mapillary data can be conveniently collected from any mobile device that is equipped with a photo camera. This gives Mapillary the potential to reach better coverage along off-road segments than Google Street View.
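As a rough illustration of the kind of completeness measure this abstract refers to, the sketch below computes the share of reference road length that has imagery, per road category. The segment tuples, the category names, and the `covered` flag are hypothetical stand-ins; how the study actually matched imagery to the reference network is not described here, and the numbers are made up.

```python
from collections import defaultdict


def completeness_by_category(segments):
    """Return {category: covered_length / total_length}.

    Each segment is a (category, length_m, covered) tuple, where `covered`
    is a hypothetical flag marking whether street-level imagery exists
    along that reference segment.
    """
    total = defaultdict(float)
    covered = defaultdict(float)
    for category, length_m, is_covered in segments:
        total[category] += length_m
        if is_covered:
            covered[category] += length_m
    # Skip categories with zero total length to avoid division by zero.
    return {c: covered[c] / total[c] for c in total if total[c] > 0}


# Made-up example data, not results from the study.
segments = [
    ("main", 1000.0, True),
    ("main", 500.0, False),
    ("residential", 800.0, True),
    ("footpath", 200.0, False),
]
ratios = completeness_by_category(segments)
```

Weighting by segment length rather than segment count keeps long and short segments from contributing equally, which is one plausible design choice for a road-network completeness ratio.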
Article
Abstract: The availability of area-wide sidewalk data plays an important role in, among other things, route planning and navigation for pedestrians and people with limited mobility. However, the coverage of sidewalk data in potential data sources, such as the free geodata service OpenStreetMap, is comparatively low. So-called street-level images are another potential source for extracting additional sidewalk data. This paper presents an approach for assessing the reliability of crowdsourced sidewalk detection in street-level images. To this end, a sample of street-level images of Heidelberg from the Mapillary platform is evaluated by crowdsourcing participants with regard to whether sidewalks are visible in the image. Reference data on the actual presence of sidewalks are then derived from official sidewalk and road data for the study area. These are compared with the results of the crowdsourced sidewalk detection to assess its reliability. The results show that the method is well suited to detecting existing sidewalks: they are interpreted correctly with an accuracy of 88%. However, missing sidewalks, that is, road segments that should be excluded from pedestrian route planning, are detected only poorly. In the discussion, we present ideas for improving the method.
Article
User-generated content (UGC) platforms on the Internet have experienced a steep increase in data contributions in recent years. The ubiquitous usage of location-enabled devices, such as smartphones, allows contributors to share their geographic information on a number of selected online portals. The collected information is oftentimes referred to as volunteered geographic information (VGI). One of the most utilized, analyzed and cited VGI-platforms, with an increasing popularity over the past few years, is OpenStreetMap (OSM), whose main goal it is to create a freely available geographic database of the world. This paper presents a comprehensive overview of the latest developments in VGI research, focusing on its collaboratively collected geodata and corresponding contributor patterns. Additionally, trends in the realm of OSM research are discussed, highlighting which aspects need to be investigated more closely in the near future.
Article
This article assesses the completeness of bicycle trail and on-street lane features in OpenStreetMap (OSM). Comparing OSM cycling features with reference data from local planning agencies for selected US Urbanized Areas shows that OSM bicycle trails tend to be more completely mapped than bicycle lanes. Manual evaluation of mapped cycling features in OSM and Google Maps for selected test areas within the Central Business Districts of Portland (OR) and Miami (FL) through comparison with governmental datasets, satellite imagery, and Google Street View, shows that the Bicycle layer in Google Maps can help to identify some missing or erroneously mapped OSM cycling links. However, Google Maps was also found to have some gaps in its data layers, suggesting that consultation of current trail and lane data from local planning authorities, if available, should be considered as an additional data source for bicycle related planning projects.
Article
Background: Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra-, inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Methods: Parents (n = 52) of 11-to-12-year old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child’s cycling route to school on a street map. Fifty cycling routes of 11-to-12-year olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Results: Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. Conclusions: EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
Article
Background: Walking for physical activity is associated with substantial health benefits for adults. Increasingly research has focused on associations between walking behaviours and neighbourhood environments including street characteristics such as pavement availability and aesthetics. Nevertheless, objective assessment of street-level data is challenging. This research investigates the reliability of a new street characteristic audit tool designed for use with Google Street View, and assesses levels of agreement between computer-based and on-site auditing. Methods: The Forty Area STudy street VIEW (FASTVIEW) tool, a Google Street View based audit tool, was developed incorporating nine categories of street characteristics. Using the tool, desk-based audits were conducted by trained researchers across one large UK town during 2011. Both inter and intra-rater reliability were assessed. On-site street audits were also completed to test the criterion validity of the method. All reliability scores were assessed by percentage agreement and the kappa statistic. Results: Within-rater agreement was high for each category of street characteristic (range: 66.7%-90.0%) and good to high between raters (range: 51.3%-89.1%). A high level of agreement was found between the Google Street View audits and those conducted in-person across the nine categories examined (range: 75.0%-96.7%). Conclusions: The audit tool was found to provide a reliable and valid measure of street characteristics. The use of Google Street View to capture street characteristic data is recommended as an efficient method that could substantially increase potential for large-scale objective data collection.
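The two agreement measures named in this abstract, percentage agreement and the kappa statistic, can be sketched as follows. This is a minimal, unweighted Cohen's kappa for two raters; the abstract does not say which kappa variant the study used, and the ratings below are invented for illustration.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of items on which the two raters gave the same rating."""
    return sum(x == y for x, y in zip(ratings_a, ratings_b)) / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    labels = set(ratings_a) | set(ratings_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_o - p_e) / (1.0 - p_e)


# Invented ratings for four street segments (e.g. "pavement present?").
rater_a = ["yes", "yes", "no", "no"]
rater_b = ["yes", "no", "no", "no"]
agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
```

Here the raters agree on three of four items (0.75), but half of that agreement is expected by chance given their marginal frequencies, so kappa comes out at 0.5. This is why kappa is usually reported alongside raw percentage agreement.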
Article
The concept of Volunteered Geographic Information (VGI) has recently emerged from the new Web 2.0 technologies. The OpenStreetMap project is currently the most significant example of a system based on VGI. It aims at producing free vector geographic databases using contributions from Internet users. Spatial data quality becomes a key consideration in this context of freely downloadable geographic databases. This article studies the quality of French OpenStreetMap data. It extends the work of Haklay to France, provides a larger set of spatial data quality element assessments (i.e. geometric, attribute, semantic and temporal accuracy, logical consistency, completeness, lineage, and usage), and uses different methods of quality control. The outcome of the study raises questions such as the heterogeneity of processes, scales of production, and the compliance to standardized and accepted specifications. In order to improve data quality, a balance has to be struck between the contributors' freedom and their respect of specifications. The development of appropriate solutions to provide this balance is an important research issue in the domain of user-generated content.
Article
Drones, also known as unmanned aerial vehicles, are nowadays frequently used to supplement traditional airborne data collection methods such as aerial photography and satellite imagery. Dronestagram, launched in July 2013, is one of the first Web 2.0 projects that share georeferenced drone pictures, providing a valuable source of VGI image data. This paper analyses spatial patterns of contributions to dronestagram world-wide and for two selected regions. Results show that the number of uploaded pictures is associated with the socioeconomic development of a country and the presence of geographical features, and that pictures are clustered in sub-regions.
Article
Availability of a transit service is a key factor in a traveler's choice of transportation mode. Transit service is a realistic option only if the service is available at or near locations when a person plans to travel. Whereas various measures exist for transit availability such as service frequency, the focus of this study was on the spatial aspect of pedestrian accessibility to transit stations, that is, on service coverage. Service areas are commonly used to visualize accessibility for pedestrians to transit systems and to analyze the potential ridership. Because the service area for a station is defined over the maximum network walking distance from a transit station, a complete street network that includes pedestrian segments, that is, shortcuts, is highly important for a realistic assessment of service areas. Whereas most proprietary geodata providers concentrate solely on car-related geodata, public domain street data and volunteered geographic information, such as OpenStreetMap, provide a potential valuable source for pedestrian data. The authors compared the amount of pedestrian-related data between freely available sources (OpenStreetMap or TIGER or both) and proprietary providers (Tele Atlas or NAVTEQ or both). The effect on modeling transit accessibility for pedestrians was analyzed for five U.S. and four German cities, and differences between these two countries were identified.
Conference Paper
Finding an image's exact GPS location is a challenging computer vision problem that has many real-world applications. In this paper, we address the problem of finding the GPS location of images with an accuracy which is comparable to hand-held GPS devices. We leverage a structured data set of about 100,000 images built from Google Maps Street View as the reference images. We propose a localization method in which the SIFT descriptors of the detected SIFT interest points in the reference images are indexed using a tree. In order to localize a query image, the tree is queried using the detected SIFT descriptors in the query image. A novel GPS-tag-based pruning method removes the less reliable descriptors. Then, a smoothing step with an associated voting scheme is utilized; this allows each query descriptor to vote for the location its nearest neighbor belongs to, in order to accurately localize the query image. A parameter called Confidence of Localization which is based on the Kurtosis of the distribution of votes is defined to determine how reliable the localization of a particular image is. In addition, we propose a novel approach to localize groups of images accurately in a hierarchical manner. First, each image is localized individually; then, the rest of the images in the group are matched against images in the neighboring area of the found first match. The final location is determined based on the Confidence of Localization parameter. The proposed image group localization method can deal with very unclear queries which are not capable of being geolocated individually.
Article
Research indicates that neighborhood environment characteristics such as physical disorder influence health and health behavior. In-person audit of neighborhood environments is costly and time-consuming. Google Street View may allow auditing of neighborhood environments more easily and at lower cost, but little is known about the feasibility of such data collection. This study assessed the feasibility of using Google Street View to audit neighborhood environments by comparing neighborhood measurements coded in 2008 using Street View with neighborhood audit data collected in 2007. The sample included 37 block faces in high-walkability neighborhoods in New York City. Field audit and Street View data were collected for 143 items associated with seven neighborhood environment constructs: aesthetics, physical disorder, pedestrian safety, motorized traffic and parking, infrastructure for active travel, sidewalk amenities, and social and commercial activity. To measure concordance between field audit and Street View data, percentage agreement was used for categoric measures and Spearman rank-order correlations were used for continuous measures. The analyses, conducted in 2009, found high levels of concordance (≥80% agreement or ≥0.60 Spearman rank-order correlation) for 54.3% of the items. Measures of pedestrian safety, motorized traffic and parking, and infrastructure for active travel had relatively high levels of concordance, whereas measures of physical disorder had low levels. Features that are small or that typically exhibit temporal variability had lower levels of concordance. This exploratory study indicates that Google Street View can be used to audit neighborhood environments.
Article
Within the framework of Web 2.0 mapping applications, the most striking example of a geographical application is the OpenStreetMap (OSM) project. OSM aims to create a free digital map of the world and is implemented through the engagement of participants in a mode similar to software development in Open Source projects. The information is collected by many participants, collated on a central database, and distributed in multiple digital formats through the World Wide Web. This type of information was termed ‘Volunteered Geographical Information’ (VGI) by Goodchild, 2007. However, to date there has been no systematic analysis of the quality of VGI. This study aims to fill this gap by analysing OSM information. The examination focuses on analysis of its quality through a comparison with Ordnance Survey (OS) datasets. The analysis focuses on London and England, since OSM started in London in August 2004 and therefore the study of these geographies provides the best understanding of the achievements and difficulties of VGI. The analysis shows that OSM information can be fairly accurate: on average within about 6 m of the position recorded by the OS, and with approximately 80% overlap of motorway objects between the two datasets. In the space of four years, OSM has captured about 29% of the area of England, of which approximately 24% are digitised lines without a complete set of attributes. The paper concludes with a discussion of the implications of the findings to the study of VGI as well as suggesting future research directions.