Klonner et al.
Quality Improvement of VGI
WiPe Paper Prevention and Preparation
Proceedings of the 14th ISCRAM Conference Albi, France, May 2017
Tina Comes, Frédérick Bénaben, Chihab Hanachi, Matthieu Lauras, Aurélie Montarnal, eds.
Quality Improvement of Remotely Volunteered Geographic Information via Country-Specific Mapping Instructions
Carolin Klonner
Institute of Geography, Heidelberg Academy
of Sciences and Humanities (HAW),
Heidelberg, Germany
c.klonner@uni-heidelberg.de
Melanie Eckle
Institute of Geography,
Heidelberg, Germany
eckle@uni-heidelberg.de
Tomás Usón
Institute of Geography, Heidelberg Academy
of Sciences and Humanities (HAW),
Heidelberg, Germany
uson@uni-heidelberg.de
Bernhard Höfle
Institute of Geography, Heidelberg Academy
of Sciences and Humanities (HAW),
Heidelberg Center for the Environment
(HCE), Heidelberg, Germany
hoefle@uni-heidelberg.de
ABSTRACT
Volunteered geographic information can be valuable data for various applications, for instance in disaster
management. OpenStreetMap data, for example, are mainly contributed by remote mappers based on satellite
imagery and have increasingly been used in response actions to various disasters. Yet, the quality often
depends on the local and country-specific knowledge of the mappers, which is required for performing the mapping
task. Hence, the question arises whether remote mappers can be trained with country-specific
mapping instructions in order to improve the quality of OpenStreetMap data. An experiment is conducted with
Geography students to evaluate the effect of additional material that is provided in wiki format. Furthermore, a
questionnaire is used to collect participants’ socio-demographic information, mapping experience and feedback
about the material. This pre-study gives hints for the future design of country-specific mapping instructions as well
as for the experiment design itself.
Keywords
OpenStreetMap, country-specific mapping instructions, VGI, quality, disaster.
INTRODUCTION
The number of people affected and the damage caused by natural hazards such as floods have been increasing in
recent decades (EM-DAT, 2016). This can mainly be attributed to changing climate conditions, urban expansion into
risk areas due to a rapidly growing world population, and human impact on nature (Ebert et al., 2009).
Thus, disaster management plays an important role in dealing with such events. However, official data,
e.g., map material for routing or the location of buildings, are often unavailable or out of date. Therefore, in order to
react quickly and efficiently, volunteered geographic information (VGI) is already used in many cases
within natural hazard analysis (Horita et al., 2015; Klonner et al., 2016).
The map project OpenStreetMap (OSM) is very useful for such applications, and since its first use in a
crisis setting in 2010 after the earthquake in Haiti, OSM has been applied in many use cases (Soden et al., 2014b).
Further, the Humanitarian OSM Team (HOT), which evolved from the Haiti earthquake response, develops
mapping tasks in cooperation with humanitarian aid organizations. In addition, projects like Missing Maps¹ highlight
the need for map material for preparedness so that current map data are already available before a
disaster occurs. However, the quality has to be evaluated before data from a collaborative project like OSM, which
are collected, edited and shared by volunteers from all over the world, can be applied in disaster management
(Barron et al., 2014; Fan et al., 2014). This is especially important for crisis maps because most of the mappers
work remotely on the basis of satellite imagery and therefore might neither be familiar with the local
conditions nor have country-specific (geographic) knowledge (Eckle et al., 2015).
An experiment conducted by Eckle et al. (2015) tackled this issue by comparing data mapped remotely by
Geography students to the results of a local mapper from Kathmandu. They focused on Kathmandu in Nepal because
this area is earthquake prone and the local Kathmandu Living Labs team² supports the mapping of the area
in OSM since official data are very sparse (Soden et al., 2014a). The experiment of Eckle et al. (2015) can be
considered an initial study as it had only eight participants for the remote mapping. They state that their methods
should be tested with a larger sample size, and their results suggested that errors made by the remote mappers
could be reduced by providing material explaining country-specific features. Moreover, See et al. (2013) showed
in their study that, with specific training and individual feedback, volunteers were able to improve more than
domain experts.
Therefore, the objective of this study is to analyse the quality of remote mapping and to evaluate the
effect of additional material about country-specific features in a larger experiment setting, which is based on the
Solomon Four Group Design with 72 Geography students as participants. A wiki page provides the additional
country-specific instructions for the remote mapping, and the experiment concludes with a questionnaire. The analysis
of the results is based on correctness and completeness, determined by comparing the data contributed by the students
with OSM reference data.
STUDY AREA AND DATASETS
In general, during or after a disaster, remote mapping data are contributed to OSM mainly in areas where little or
no official map material is available. Therefore, there is generally a lack of independent
reference data for evaluating the mapping quality of the volunteers.
Figure 1. Study area in Kathmandu, Nepal.
The research area of this quality examination is located in the city centre of Kathmandu, Nepal (Figure 1),
which offers the following advantages over other remotely mapped areas. In 2012, a project aiming at the
seismic resilience of the Kathmandu Valley was launched by the World Bank, the Global Facility for Disaster
¹ http://www.missingmaps.org/ (accessed: 15.01.2017)
² http://www.kathmandulivinglabs.org/ (accessed: 14.01.2017)
Reduction and Recovery (GFDRR) and the Government of Nepal³ (Soden et al., 2014a). No complete
database of schools and health facilities combined with coordinates and information about construction type was
available for a disaster risk model (ibid.). The collection of these data by locals within the Open Cities Kathmandu
project was taken over after the end of the project by the Nepalese NGO Kathmandu Living Labs (ibid.). These
efforts led to a rich OSM database, which is constantly updated. Moreover, during and after the earthquake in
Nepal in 2015, OSM was updated by thousands of remote and local mappers, including many experienced ones. This
suggests high data quality, and the OSM data are therefore used as a reference for the evaluation of the experiment
data. To avoid any influence due to new satellite imagery, the reference is downloaded from the OSM database
on the day of the experiment (15.12.2015). The students’ mapping is based on the same Bing satellite imagery,
and for their mapping they use JOSM⁴, a Java-based open source editor for OSM. In addition, Kathmandu is chosen
as research area following the previous experiment conducted by Eckle et al. (2015), in order to have comparable
conditions.
METHODOLOGY
Experiment Design
An experiment can be seen as a form of empirical study that aims at identifying cause-effect relations in
order to explain social phenomena (Eifler, 2014). Processes are actively evoked by the experiment leader to identify
the real cause of a phenomenon, and it is therefore important to control factors that could also be causes in order to
avoid alternative explanations (ibid.).
Further, Eifler (2014) states that a control group makes it possible to control the factors which influence the internal
validity and that specific controlling techniques can be applied, such as eliminating disruptive elements or keeping
them constant, as well as selecting the groups randomly or in parallel. Accordingly, the Solomon Four Group Design
is used as it aims at avoiding the influence of a pre-test, which is the mapping of a test area in this
study, on the mapping results of the post-test, i.e. the mapping of the study area in the experiment. Since this
method makes it possible to exclude influences on the internal validity (ibid.), it is chosen for conducting the experiment.
Participants
A Cartography lecture of 90 minutes was chosen as the experiment setting, and 72 students, mostly in their first
semester, took part. It was assumed that these students represent, to a certain extent, the mapping community
of new users. Moreover, this setting provides a similar set of participants regarding age and education and
enough participants to have about 20 participants per group. As Geography is a subject that usually has
a fairly equal gender distribution, the final group of participants consists of 36 female and 36 male students.
Tasks
All participants attended the same introductory session with information about the experiment itself and
explanations of the tool the students were going to use for the mapping (JOSM editor). Afterwards, they were
distributed into four different groups (Figure 2). For practical reasons and due to the setting, the distribution
was not completely random in the experiment described here, but it was made as random as possible
by handing out the same number of task sheets for each group to the students. The task sheet included specific
instructions about what to map and whether to read an additional wiki page (a link was provided on the sheet). Each
task on the sheet was allotted a certain amount of time. The instructors of the experiment also
reminded the participants of the individual groups about the time. Moreover, the outlines of the test area and the
study area were provided on an online learning platform for download into JOSM, which allowed the students to
easily identify the area to be mapped in the satellite image. After the mapping part, the students had to upload their
results to the online platform. The task sheet gave further information about the folders on the online learning
platform and the way they should save their data. The last task for all participants was the questionnaire, for
which a link was provided on the sheets.
Half of the participants had to do a pre-test, which is the test area in the experiment. One part (group 1) had to read the
additional material on the wiki page first, before starting to map the test area. They were given a certain amount of time
to do that. Group 2 could directly start with the test area. After mapping the test area for 10 minutes, both groups
started the post-test, which is the study area in our experiment. The other half of the participants had no pre-test and again
³ http://www.opencitiesproject.org/cities/kathmandu/ (accessed: 19.01.2017)
⁴ https://josm.openstreetmap.de/ (accessed: 19.01.2017)
one group had additional material (group 3) in contrast to the control group (group 4). These two groups only
mapped the post-test (study area).
Figure 2. Experiment design based on the Solomon Four Group Design.
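The group layout described above can be written down compactly. The following Python lines are only a minimal sketch and not the procedure used in the lecture: the names GROUPS and assign_groups are hypothetical, and the shuffled list of task sheets is an assumed stand-in for handing out equally many sheets per group.

```python
import random
from collections import Counter

# The four groups as described in the text:
# group 1: wiki material + pre-test, group 2: pre-test only,
# group 3: wiki material only, group 4: control group.
GROUPS = {
    1: {"wiki_material": True, "pre_test": True},
    2: {"wiki_material": False, "pre_test": True},
    3: {"wiki_material": True, "pre_test": False},
    4: {"wiki_material": False, "pre_test": False},
}


def assign_groups(participant_ids, seed=None):
    """Hand one shuffled task sheet to each participant.

    The stack of sheets holds (as far as possible) the same number of
    sheets per group, so group sizes differ by at most one.
    """
    rng = random.Random(seed)
    group_ids = sorted(GROUPS)
    sheets = [group_ids[i % len(group_ids)] for i in range(len(participant_ids))]
    rng.shuffle(sheets)
    return dict(zip(participant_ids, sheets))


if __name__ == "__main__":
    assignment = assign_groups(list(range(72)), seed=2015)
    print(Counter(assignment.values()))
```

With 72 participants, this yields 18 sheets per group. The group sizes of 16 to 18 reported in the results refer to the valid submissions, not to the sheets handed out.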
Wiki Page
A wiki page is used for the experiment because, especially within educational contexts, this format has proved
to be a successful means of collaborative knowledge building (Kump et al., 2013). Culture-specific information for
remote mappers can be provided, and the content is based on the experience and local knowledge of many people.
Thus, a wiki enables, for example, the large OSM community in general, or crisis mappers in particular, to contribute
and share their local knowledge about a certain area with remote mappers. In urgent disaster cases, a wiki page for
the specific use case can be added. Moreover, it is possible to share the information in different languages. Some
of the content of the wiki page applied in the experiment was already in use after the earthquake in Nepal in 2015
and proved to be useful for remote mapping. This shows both the importance and applicability of such methods
and the urgent need to evaluate these tools in order to make statements about the quality of the resulting
mapping and the possibilities of improving the presentation of additional material.
The wiki of the experiment⁵ comprises characteristics of buildings of this region with specific mapping hints. It
covers buildings that are irregular in elevation and therefore might appear as a set of houses, rows of houses
that may look like one big house, and complex building structures like multi-polygons (Figure 3). Further, the
correct mapping of the actual layout of a building is explained. The focus is on buildings as they are mostly the
features new mappers start with. Moreover, especially for disaster management, the exact outline and the number
of buildings are important for population estimation or for identifying the type of usage of a building, e.g., elements
at risk like schools or hospitals often have a certain shape.
Figure 3. Wiki page with additional information about specific features. Example of multi-polygon creation⁶.
⁵ https://wiki.openstreetmap.org/wiki/Nepal_remote_mapping_guide_Experiment (accessed: 15.01.2017)
⁶ Ibid.
Questionnaire
The experiment consists of a mapping part followed by a questionnaire in order to gain background information
about the students such as their age, gender, semester, and experience with geoinformation, remote mapping and
OSM. Another issue is their knowledge about Kathmandu. Additionally, information about the mapping material
and the applicability is inquired. Furthermore, they are asked about the time for finishing the task and the language
issues. Finally, the motivation of the students for further mapping is evaluated. A link for the questionnaire is
included in the task sheet.
Analysis
The overall hypothesis of the experiment is that additional country-specific mapping instructions improve the
mapping quality of remote mappers. Therefore, the following section presents analysis methods to measure the
quality of the data mapped by the students in comparison to the OSM reference data.
Different indicators can be used in order to evaluate the data created by the students. The ISO standard
19157:2013 (International Organization for Standardization) provides a set of quality indicators for geographic
information, which are also described in the work of van Oort (2006) and Haklay (2010). The ISO standard defines
correctness or positional accuracy as “the accuracy of the position of features within a spatial reference system”,
while completeness is referred to as “the presence and absence of features, their attributes and relationships” (ISO
19157:2013). In the context of this analysis, correctness indicates the accuracy of the features mapped by the
students, whereas completeness reveals excess and missing data in the student dataset (cf. Rutzinger et al., 2009).
Therefore, the indicators of correctness and completeness can be used to evaluate the quality of the mapping results
of the students.
Klonner et al. (2015) applied two comparative methods for the work with OSM data in urban areas, namely the
centroid and the overlap method. They conclude that the overlap method achieves more realistic results in areas
with terraced houses and blocks of buildings. The urban area of Kathmandu has this kind of building structure,
and therefore this method is chosen for the study at hand and applied for the comparison of the students’
mapping results to the reference OSM data. Figure 4 shows an example of a mapped building within the student
data (orange outline) and the reference data (red outline) as well as the area of their intersection. This intersection
of the two building polygons represents the overlapping area (Klonner et al., 2015). In other words, the
overlapping area shows the correctly mapped student data. In the example (Figure 4), the overlapping area (light
orange) covers the entire reference building, which indicates that the student even overestimated the building area.
Figure 4. Example of a building in the reference data and mapped by a student. The overlapping area can be used for
a comparison.
The overlapping area of the houses of the two datasets can be compared to the overall area of the building polygons
mapped by the students and to that of the reference buildings. These calculations can be used to assess the quality
indicators. Completeness refers to the ratio of the overlapping area of all buildings to the sum of all areas of the
buildings in the reference data. Correctness stands for the ratio of the overlapping area of all buildings to the area
mapped by the students (cf. Rutzinger et al., 2009).
completeness = overlapping area of all buildings / area of buildings in the reference data
correctness = overlapping area of all buildings / area mapped by the students
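The two ratios can be illustrated with a small Python sketch. It is not the implementation used in the study: as a simplifying assumption, buildings are axis-aligned rectangles on an integer grid and areas are counted in unit cells, whereas the study works with real building polygons. The helper names rasterize and quality are hypothetical.

```python
def rasterize(rectangles):
    """Return the set of unit grid cells covered by rectangles (x0, y0, x1, y1).

    Using a set means cells covered by several rectangles count only once,
    so the result is the area of the union of all buildings.
    """
    cells = set()
    for x0, y0, x1, y1 in rectangles:
        for x in range(x0, x1):
            for y in range(y0, y1):
                cells.add((x, y))
    return cells


def quality(student_rectangles, reference_rectangles):
    """Compute (completeness, correctness) from the overlapping area."""
    student = rasterize(student_rectangles)
    reference = rasterize(reference_rectangles)
    if not student or not reference:
        raise ValueError("both datasets must contain at least one building")
    overlap = student & reference  # cells mapped in both datasets
    completeness = len(overlap) / len(reference)
    correctness = len(overlap) / len(student)
    return completeness, correctness


if __name__ == "__main__":
    # Toy case resembling Figure 4: the student polygon covers the whole
    # reference building and extends beyond it.
    reference = [(0, 0, 4, 4)]  # 16 cells
    student = [(0, 0, 5, 4)]    # 20 cells, 16 of them overlapping
    print(quality(student, reference))  # (1.0, 0.8)
```

In this toy case the overestimated outline leaves completeness at 1.0 but lowers correctness to 0.8, which is why both indicators are reported together.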
RESULTS
In the following, the results of the mapping by the students during the experiment are evaluated based on the
indicators of completeness and correctness. Moreover, the outcomes of the specific tasks such as the mapping of
a multi-polygon and a row of houses are presented. The final part gives further information provided by the students
within the questionnaire.
Mapping
The calculations of completeness and correctness can be done for the whole study area as well as for the test area.
In this way percentages for all student datasets can be evaluated. The following section shows the results of a
qualitative analysis due to the available sample size. The final number of student submissions (69) differs from
the participant number (72) because three of the student datasets had to be excluded due to invalid data.
Figure 5 shows an example from a student of group 1. The map shows the study area with the reference building area
(red outline), the building polygons mapped by the student (orange outline) and the overlapping area (light orange
area). This student achieved a mapping completeness of 86% and a correctness of 62%.
Figure 5. Buildings in the reference data (red) and mapped by a student (orange) as well as the overlapping area (light
orange) in the study area of Kathmandu, Nepal.
The sample size of each group ranges from 16 to 18 participants, and thus the following statistical analyses are not
representative but can give some insights for further experiment designs. First of all, the average correctness values
of group 1 and group 3 are compared in order to see whether there is an influence of the test area. A corresponding
t-test yields a non-significant outcome (p = 0.84). The same comparison for completeness yields a
p-value of 0.53 and is therefore also not significant. Hence, it can be assumed that there is no influence of the test
area. Future experiments can therefore use only 2 groups instead of 4, which allows for larger group
sizes and more representative evaluations.
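The text does not state which variant of the t-test was applied. As a hedged illustration only, the following Python sketch computes the statistic of Welch's unequal-variance t-test, which is an assumption made here, for two lists of per-student correctness values. The function name welch_t and the numbers in the usage lines are hypothetical and are not data from the experiment.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) of Welch's t-test.

    Both samples need at least two values with non-zero spread.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical correctness values, NOT taken from the study.
    group_1 = [0.62, 0.58, 0.71, 0.66, 0.60]
    group_3 = [0.64, 0.59, 0.69, 0.61, 0.63]
    print(welch_t(group_1, group_3))
```

Turning the statistic into a p-value requires the t distribution. If SciPy is available, scipy.stats.ttest_ind(group_1, group_3, equal_var=False) returns the statistic together with the p-value.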
The assessment of the impact of additional information is based on the analysis of the country-specific features
portrayed in the wiki and how the students succeeded in their mapping. In the following, the focus is on the mapping
of a multi-polygon and a row of houses. The results of these analyses can give hints for the improvement of
the wiki page for the follow-up experiment with only 2 groups and a larger sample size.
Moreover, the feature of the multi-polygon (Figure 6) can be used as an indicator of whether the students really used
the additional material. In the experiment of Eckle et al. (2015), it was not identifiable whether the participants read
the material or mapped without reading it. The addition of the multi-polygon makes this possible as new mappers
usually do not apply such specific features.
Table 1. Multi-polygon
                        Group 1          Group 2         Group 3         Group 4
Use of multi-polygon    14 of 18 (78%)   1 of 18 (6%)    9 of 16 (56%)   2 of 17 (12%)
Figure 6. Example of a multi-polygon mapped by a student of group 1 (left) and not mapped by a student of group 2
(right).
The results show that the students of group 1 and group 3 read the material and used the multi-polygon in contrast
to only a small number of mapped multi-polygons in the groups without the additional material (Table 1).
Table 2. Row of houses
                                                                 Group 1   Group 2   Group 3   Group 4
Row of houses as several single houses AND correctly mapped
(7±1 polygons and >80% completeness)                             3         3         1         0
Row of houses as several single houses BUT <80% of the area of
the houses overlapped (completeness) and/or not 7 (±1) polygons  8         13        11        15
Row of houses mapped as 1 single building                        4         0         2         2
Row of houses not mapped                                         3         2         2         0
For the evaluation of the mapping of the row of houses, different categories have to be applied because there are
several ways to map this specific feature (Table 2). Some students did not map the row of houses at all, while
others made one single large building. For the students who mapped single buildings a threshold was applied: The
row of houses was only counted as mapped correctly when the number of polygons was 7±1 (of the 7 reference
polygons) and the completeness of this specific feature was >80% (Figure 7).
Figure 7. Mapping of a row of houses. From left to right: reference data with 7 single polygons, correctly mapped data
by the student, several buildings mapped by student but under the threshold of completeness, only one large building
mapped for the row of houses.
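The categories of Table 2 follow from the stated threshold and can be expressed as a short decision rule. The Python function below is a sketch under assumptions: the name classify_row_of_houses and the category labels are hypothetical, and completeness is expected as a fraction between 0 and 1 for this specific feature.

```python
def classify_row_of_houses(n_polygons, completeness,
                           n_reference=7, tolerance=1, min_completeness=0.80):
    """Assign a student's mapping of the row of houses to a Table 2 category.

    n_polygons   -- number of polygons the student mapped for the row
    completeness -- overlapping area / reference area of the row (0..1)
    """
    if n_polygons == 0:
        return "not mapped"
    if n_polygons == 1:
        return "one single building"
    # Several single houses: both conditions must hold to count as correct.
    count_ok = abs(n_polygons - n_reference) <= tolerance
    if count_ok and completeness > min_completeness:
        return "several houses, correct"
    return "several houses, below threshold"


if __name__ == "__main__":
    print(classify_row_of_houses(7, 0.91))  # several houses, correct
    print(classify_row_of_houses(5, 0.95))  # several houses, below threshold
    print(classify_row_of_houses(1, 0.88))  # one single building
```

The comparison is strict, so a completeness of exactly 80% falls below the threshold, in line with the ">80%" wording of the text.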
On the wiki, the instruction for a row of houses also includes the remark that mappers who are not sure should map
it as one big building. The comparison of groups 1 and 2 shows that the students might have followed this advice
as 4 students of group 1 made one big building in comparison to no one in group 2.
The overall results of the mapping of a row of houses of all groups indicate that the correct mapping is still very
difficult for beginners. This has to be taken into consideration when using OSM data for disaster management or
other applications.
Questionnaire
Only one of the students had been in Kathmandu before the experiment. The results of this student (group 4) are
above the average of all groups, which might be a hint to the usefulness of local knowledge.
Moreover, the questionnaire reveals that some students had problems regarding the language. While this is
true for only 15%, it might still indicate that future material provided for crisis management, by HOT for example,
might lead to even more accurate mapping results if it is translated into different languages, which is technically
straightforward in a wiki.
Regarding the time the students had available for the mapping task, the questionnaire shows that 24 students had
no time problem while 44 either had to hurry to finish the task or did not have enough time to finish it. Four did not
respond. This time issue is due to the experiment setting because in real mapping situations the students could take
their time for mapping.
DISCUSSION AND CONCLUSION
The experiment can be seen as a first step towards the evaluation of the impact of additional country-specific
material provided for remote mappers. In this paper, only a small part of the analysis is presented as the work is still in
progress. An extensive analysis of the questionnaire and the mapping of the remaining specific features will also
show how the mapping results differ with respect to time constraints and previous knowledge, and how the students
themselves rate their improvement due to the mapping material.
Additionally, this pre-study gives hints about the experiment design and that the follow-up experiment does not
need to use the Solomon Design anymore. The evaluation shows that there is no significant influence of the test
area. Thus, only 2 groups are necessary for another mapping experiment allowing for larger sample sizes.
Moreover, regarding the wiki page, this experiment also reveals valuable information for the improvement of a future
experiment setting as there are still some issues that need to be taken into consideration, such as language barriers.
Furthermore, on the one hand, the mapping might have been influenced by motivation: although the experiment
was close to a real mapping situation, the students still had a different motivation than regular OSM mappers. On
the other hand, the tasks of the experiment did not clearly state whether the overall aim was correct mapping
(i.e. no time constraints) or mapping as much as possible (with time pressure). In a follow-up test, the exact
aim of the mapping, whether completeness or correctness is required, should be made clear at the
beginning in order to have the same setting for all.
Future work may also include further quality measures in order to face the complexity of crisis mapping. Semantic
accuracy, for example, plays an important role especially regarding critical infrastructure. It gives insights into the
link between the (geometric) representation of an object and the intended interpretation. Moreover, temporal
accuracy can provide information about updates of the dataset regarding changes in the real world (van Oort, 2006;
Haklay, 2010).
As remotely contributed VGI is gaining more and more importance in the disaster management sector (Eckle et
al., 2016; Horita et al., 2015; Klonner et al., 2016), tools are needed for enhancing the mapping quality already
during the production phase. The study reveals that students use guides such as the description of the mapping
of multi-polygons. However, for more complex issues like the outline of a row of houses, there is no clear difference
between mappers who used additional material and mappers without specific information. Therefore, it has
to be kept in mind that remotely mapped OSM data are very useful for crisis mapping but cannot give all the
details that local mappers can provide and that might also be required for efficient disaster management. This shows
that it is important to select the mapper type (remote or local) according to the task that has to be supported. In
addition, the management of the input of such heterogeneous groups needs to be taken into consideration in future
work. Further, it is also advisable to include local communities in the creation of more specific mapping instructions
because it is necessary to find out the needs of local actors or specific use case requirements. In the experiment
case, the material had already proved to be useful during the crisis mapping of the Nepal earthquake in 2015.
ACKNOWLEDGEMENTS
This work was supported by the WIN-Kolleg of the Heidelberg Academy of Sciences and Humanities (HAW).
The authors would like to express gratitude to the students of the Cartography lecture for participating in the
experiment. Special thanks to René Westerholt for his advice on the statistical evaluation.
REFERENCES
Barron, C., Neis, P. and Zipf, A. (2014) A Comprehensive Framework for Intrinsic OpenStreetMap Quality
Analysis. Transactions in GIS, 18, 6, 877-895.
Ebert, A., Banzhaf, E. and McPhee, J. (2009) The influence of urban expansion on the flood hazard in Santiago
de Chile, Proceedings of the Joint Urban Remote Sensing Event, Shanghai, China.
Eckle, M. and de Albuquerque, J. P. (2015) Quality Assessment of Remote Mapping in OpenStreetMap for
Disaster Management Purposes, Proceedings of the 12th International Conference on Information Systems
for Crisis Response and Management (ISCRAM), Kristiansand, Norway.
Eckle, M., Herfort, B., de Albuquerque, J. P., Leiner, R., Wolff, R., Jacobs, C. and Zipf, A. (2016) Leveraging
OpenStreetMap to support flood risk management in municipalities: A prototype decision support system,
Proceedings of the 13th International Conference on Information Systems for Crisis Response and
Management (ISCRAM), Rio de Janeiro, Brazil.
Eifler, S. (2014) Experiment, In Baur and Blasius (Eds.), Handbuch Methoden der empirischen Sozialforschung.
Wiesbaden: Springer Fachmedien, 195-209.
EM-DAT. (2016) The OFDA/CRED International Disaster Database. Accessed: 14.01.2017,
http://www.emdat.be/disaster_trends/index.html.
Fan, H., Zipf, A., Fu, Q. and Neis, P. (2014) Quality assessment for building footprints data on OpenStreetMap.
International Journal of Geographical Information Science, 28, 4, 700-719.
Haklay, M. (2010) How good is volunteered geographical information? A comparative study of OpenStreetMap
and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37, 4, 682-703.
Horita, F. E. A., Albuquerque, J. P. de, Degrossi, L. C., Mendiondo, E. M. and Ueyama, J. (2015) Development
of a spatial decision support system for flood risk management in Brazil that combines volunteered
geographic information with wireless sensor networks. Computers & Geosciences, 80, 84-94.
International Organization for Standardization. ISO 19157:2013. Geographic Information. Data Quality.
Klonner, C., Barron, C., Neis, P. and Höfle, B. (2015) Updating digital elevation models via change detection
and fusion of human and remote sensor data in urban environments. International Journal of Digital Earth, 8,
2, 153-171.
Klonner, C., Marx, S., Usón, T., Porto de Albuquerque, J. and Höfle, B. (2016) Volunteered Geographic
Information in Natural Hazard Analysis: A Systematic Literature Review of Current Approaches with a
Focus on Preparedness and Mitigation. ISPRS International Journal of Geo-Information, 5, 7, 103.
Kump, B., Moskaliuk, J., Dennerlein, S. and Ley, T. (2013) Tracing knowledge co-evolution in a realistic course
setting: A wiki-based field experiment. Computers & Education, 69, 60-70.
Rutzinger, M., Rottensteiner, F. and Pfeifer, N. (2009) A Comparison of Evaluation Techniques for Building
Extraction From Airborne Laser Scanning. IEEE Journal of Selected Topics in Applied Earth Observations
and Remote Sensing, 2, 1, 11-20.
See, L., Comber, A., Salk, C., Fritz, S., van der Velde, M., Perger, C., Schill, C., McCallum, I., Kraxner, F. and
Obersteiner, M. (2013) Comparing the Quality of Crowdsourced Data Contributed by Expert and Non-
Experts. PLoS ONE, 8, 7, e69958.
Soden, R., Budhathoki, N. R. and Palen, L. (2014a) Resilience-Building and the Crisis Informatics Agenda:
Lessons Learned from Open Cities Kathmandu, Proceedings of the 11th International Conference on
Information Systems for Crisis Response and Management (ISCRAM), University Park, Pennsylvania, USA.
Soden, R. and Palen, L. (2014b) From Crowdsourced Mapping to Community Mapping: The Post-Earthquake
Work of OpenStreetMap Haiti, Proceedings of the 11th International Conference on the Design of
Cooperative Systems (COOP), Nice, France.
van Oort, P. (2006) Spatial Data Quality: From Description to Application. PhD thesis, Wageningen University.