Mapathons versus automated feature extraction: a comparative analysis for strengthening immunization microplanning

Abstract

Background: Social instability and logistical factors like the displacement of vulnerable populations, the difficulty of accessing these populations, and the lack of geographic information for hard-to-reach areas continue to serve as barriers to global essential immunizations (EI). Microplanning, a population-based, healthcare intervention planning method, has begun to leverage geographic information system (GIS) technology and geospatial methods to improve the remote identification and mapping of vulnerable populations to ensure inclusion in outreach and immunization services, when feasible. We compare two methods of accomplishing a remote inventory of building locations to assess their accuracy and similarity to currently employed microplan line-lists in the study area.
Methods: The outputs of a crowd-sourced digitization effort, or mapathon, were compared to those of a machine-learning algorithm for digitization, referred to as automatic feature extraction (AFE). The following accuracy assessments were employed to determine the performance of each feature generation method: (1) an agreement analysis of the two methods assessed the occurrence of matches across the two outputs, where agreements were labeled as “befriended” and disagreements as “lonely”; (2) true and false positive percentages of each method were calculated in comparison to satellite imagery; (3) counts of features generated from both the mapathon and AFE were statistically compared to the number of features listed in the microplan line-list for the study area; and (4) population estimates for both feature generation methods were determined for every structure identified assuming a total of three households per compound, with each household averaging two adults and five children.
Results: The mapathon and AFE outputs detected 92,713 and 53,150 features, respectively. A higher proportion (30%) of AFE features were befriended compared with befriended mapathon points (28%). The AFE had a higher true positive rate (90.5%) of identifying structures than the mapathon (84.5%). The difference in the average number of features identified per area between the microplan and mapathon points was larger (t = 3.56) than that between the microplan and AFE (t = −2.09) (alpha = 0.05).
Conclusions: Our findings indicate AFE outputs had higher agreement (i.e., befriended), slightly higher likelihood of correctly identifying a structure, and were more similar to the local microplan line-lists than the mapathon outputs. These findings suggest AFE may be more accurate for identifying structures in high-resolution satellite imagery than mapathons. However, both methods had advantages, and the ideal approach would use them in tandem.
Keywords: Feature extraction, Mapathon, Essential immunization, Population estimates, Microplanning, Satellite imagery, Building footprints
Mendesetal. Int J Health Geogr (2021) 20:27
https://doi.org/10.1186/s12942-021-00277-x
RESEARCH
Mapathons versusautomated feature
extraction: acomparative analysis
forstrengthening immunization microplanning
Amalia Mendes1*, Tess Palmer1, Andrew Berens1, Julie Espey1, Rhiannan Price2, Apoorva Mallya3,
Sidney Brown3, Maureen Martinez4, Noha Farag4 and Brian Kaplan4
© The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and
the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material
in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material
is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the
permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/)
applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Open Access
International Journal of Health Geographics
*Correspondence: opf9@cdc.gov
1 Division of Toxicology and Human Health Sciences, Agency for Toxic
Substance and Disease Registry, 4770 Buford Hwy NE, Atlanta, GA 30341,
USA
Full list of author information is available at the end of the article
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Background
Of the 20 million children across the world with incom-
plete or no essential immunization (EI) for vaccine-
preventable diseases, nearly half live in countries with
conflicts and population displacement (e.g., Afghanistan,
Central African Republic, Iraq, Mali, Nigeria, Pakistan,
and Somalia) [1]. Conflicts and regional instabilities gen-
erally lead to poor vaccination coverage and interrupted
vaccine schedules [2] due to disruption of health systems
and impeded access to care resulting in vaccine delivery
inequities. Currently, the barriers to vaccine preventable
disease control are less about pathogen biology and more
about the identification of sub-populations missed by
the Expanded Programme on Immunization and there-
fore left without equitable access to interventions like
essential immunization and supplementary vaccination
campaigns [3, 4]. Immunization programs miss or under-
serve hard-to-reach sub-populations for various reasons
including geographic inaccessibility, irregular population
migration due to regional instabilities, and nomadic life-
styles. For this reason, it remains imperative to employ
innovative and effective technologies to improve remote
identification of hard-to-reach sub-populations, thereby
allowing service delivery during periods of accessibility.
Understanding the geographic distribution of target
populations for health interventions is a critical com-
ponent of microplanning—an epidemiologic database
aimed at delivering health-care interventions like child-
hood essential immunizations by addressing the imple-
mentation demands of a specific setting [5]. Microplans
critically inform decisions regarding appropriate deliv-
ery strategies (i.e., fixed-post, outreach, or mobile) and
logistics needed to reach children targeted for the inter-
vention (i.e., target populations) [6]. Each microplan is
composed of a line-list where every row represents data
pertaining to the geographic unit of analysis being studied,
while columns hold variables containing demographic
information (e.g., children under 5 years
of age, number of households to be visited, estimates of
total resources needed, etc.). Despite the utility of cur-
rent microplans, arguments have been made for updated
methods of microplanning that leverage Geographic
Information Systems (GIS) and satellite imagery to gen-
erate high quality and up-to-date maps of target popu-
lation distributions and maps of built features such as
residential structures and settlements [7, 8]. In their
Reach Every District (RED) strategy for essential immu-
nization, the World Health Organization (WHO) and the
United Nations Children’s Fund (UNICEF) recognized
the need for these updated methods and outlined new
GIS-enhanced microplanning tactics for improved loca-
tion surveillance of some populations.
In some situations, GIS-based microplanning incurs
higher costs than traditional, non-GIS based micro-
planning; however, this does not necessarily imply cost
ineffectiveness. A recent cost-effectiveness analysis con-
ducted in two Nigerian states determined that increased
cost for GIS-based microplanning was mostly due to pur-
chasing additional vaccines for populations previously
uncounted and unreached by traditional microplanning
methods [7]. Not only does GIS-based microplanning
save resources when executed appropriately, it also pro-
tects the lives of field workers in settings where conflict
could compromise their security by reducing the need
for deployment to high-risk areas [6]. When in-person
access is safe and feasible, having field workers physically
present in the region of interest allows for ground-truth-
ing which is needed to validate maps generated remotely
(i.e., generated using imagery and without physical access
to the area of interest). Supplementing microplanning
methods with the integration of GIS technologies could
further support other public health interventions, such
as spraying insecticides for mosquito abatement and
malaria prevention [9, 10] and the provision of maternal
and child health care services [7].
To support the integration of GIS technology in pub-
lic health planning, researchers take advantage of
high- or very high-resolution (VHR) satellite imagery
generated by satellites like GeoEye, QuickBird, RapidEye,
and WorldView. Sub-meter resolution imagery from
these satellites allows analysts to digitize features such as
buildings, rooftops, roads, nomadic camps, and informal
settlements. The size of a population can even be modeled
from these footprints.
Large-scale feature digitization (e.g., digitization of
individual structures across multiple districts or prov-
inces) from imagery without automated methods is very
time-consuming for a small group of analysts, especially
when the features of interest are sparse in the imagery.
Consequently, a method of participatory data acquisition
has gained popularity: the “mapathon”, a
time-limited, crowd-sourced effort by a group of trained
participants with or without formal geospatial analysis
backgrounds. Participants, used in this paper to describe
both the group of contributors and validators, gener-
ate spatial data of features like residential structures or
informal settlements within a specific area of interest by
using GIS platforms, such as OpenStreetMap and Arc-
GIS Online. Generally there is no financial incentive for
contributions made during a mapathon [11] and anyone
with a computer and internet connection can contrib-
ute. Consequently, humanitarian efforts frequently rely
on mapathons to identify mobile populations and unde-
tected settlements [11, 12]. Similarly, data generated
from mapathons are useful for detecting and enumerating
populations missed during immunization campaigns,
thereby optimizing immunization campaign microplans.
Mapathons also provide data that are used to map health
facility catchment areas when merged with other key
information [12].
An alternative method to using mapathons is auto-
mated feature extraction (AFE), a type of model-based
feature generation, which can be semi- (i.e., some human
support) or fully automated (i.e., no human support).
After an initial time investment to manually develop
training data using selected examples of features of inter-
est (e.g., man-made structures) and examples of features
not of interest (e.g., large boulders), AFE does not require
time-consuming and labor-intensive steps such as identi-
fying structures and placing points or drawing polygons
manually on a computer. AFE relies on computer algo-
rithms and models to learn patterns, edges, and shapes
of features (e.g., rooftops or settlement footprints) to
digitize and categorize. Machine learning algorithms are
designed to enhance performance by effectively teaching
the computer how to extract the desired spatial data from
imagery with both precision and accuracy. AFE has been
leveraged for a myriad of purposes, such as mapping
agricultural land use [13–16] and water boundaries [17,
18], estimating human and livestock populations [19, 20],
extracting road features [21, 22] and building features
[23–29], and supporting disaster relief efforts [30,
31].
Like mapathons, AFE relies on high-resolution imagery
for optimal performance, but image collection param-
eters can be refined to account for cloud cover, thick veg-
etation, and low spectral resolution. Additionally, using a
time-series of images can improve the accuracy of feature
detection by minimizing false-positives [14, 18] and is
especially helpful when analyzing pre- and post-disaster
impacts to roads [30] and facilities [31].
There is currently no information on how results from
participatory mapping compare to the results from AFE;
if researchers determine AFE to be as accurate and pre-
cise as mapathons but faster at generating spatial data,
increasing its use could save valuable resources and time
for public health programs without compromising qual-
ity. Additionally, as geospatial professionals gain a deeper
understanding of the strengths of each method, future
projects can more optimally combine the two to comple-
ment and enhance their end-products.
Disparities in equitable access to health services will
decrease when additional sub-populations are identified
in microplans and serviced by EI campaigns and other
public health interventions. Here, we seek to explore and
compare the accuracy of two methods of feature genera-
tion—mapathons and AFE—to provide evidence for the
suitability of each method in identifying hard-to-reach
populations vulnerable to vaccine-preventable diseases
in inaccessible areas and whether the two methods can
work in a complementary or synergistic way.
Methods
Both feature generation events (i.e., mapathon and AFE)
studied here used the same satellite imagery. The study
area comprises two districts in Central Asia that were
inaccessible to EI at the time of the study. To protect the
security of populations living in our study region, the
specific geographic areas will not be disclosed. The varied
terrain of the urban and rural study areas included
rocky and forested mountainous regions, low plateau
areas with desert terrain, and some fertile plains used for
farmland. The climate in the study area is arid to semiarid
with low rainfall in most areas of the region. Permanent
and temporary housing structures were visible in the sat-
ellite imagery used for the study and included small mud
free-standing structures, larger mud-brick and stone
compounds surrounded by walls, modern free-standing
structures and housing complexes in urban areas, small
cliff dwellings, and temporary tents or yurts.
Mapathon
We conducted the mapathon for this project under the
guidance of the WHO and the Geospatial Research, Anal-
ysis, and Services Program (GRASP) at the Centers for
Disease Control and Prevention (CDC). The mapathon
coordinators created a dedicated ArcGIS Online (ESRI,
Redlands, CA) hub page for enrolling and training both
novice and experienced mappers. This mapathon resource
repository included registration information, tutorials
on digitizing and application use, a real-time monitoring
dashboard to analyze participant progress, links to com-
munication channels, and sections on frequently asked
questions. Mapathon participants from the CDC and
WHO were recruited via emails and hardcopy informa-
tional posters. Participants logged into an ArcGIS Online
web application with basic data editing functionality to
view current, high-resolution (0.3–0.5 m) satellite imagery
downloaded from DigitalGlobe (DG) with the goal of
identifying structures inside the two study districts.
The imagery for both districts covered a total area of 6146
square kilometers. The entire area of interest was divided
into 1 km × 1 km grid cells for the contributors, who then
digitized features of interest within one cell at a time. For
each cell, contributors placed one spatially linked point on
the center of any man-made structure that was larger than
9 m² in the image (Fig. 1). If multiple structures existed
within a compound (i.e., several structures surrounded by a
common wall), the contributors digitized each eligible structure
within the compound rather than counting the compound
grouping of structures as just one point. Digitized
features could be structures used for any purpose. Addi-
tionally, contributors were instructed to place points on
structures that seemed to be under construction, regard-
less of shape, while avoiding those that appeared to be
destroyed. Structures partially within the grid were treated
as within the grid and were digitized. When a contributor
marked a cell as complete, all man-made structures larger
than 9 m2 and visible in the imagery should have been digi-
tized as point feature class data (a discrete location repre-
sented by longitude and latitude coordinates). GIS experts
served as validators within the mapathon coordination
team, using a separate ArcGIS Online web application to
validate any cells marked as complete by the contributors.
Validators did not evaluate the quality of digitized points
submitted by each contributor but ensured that all features
of interest in the underlying satellite image were correctly
digitized by contributors and made edits as needed before
finalizing each cell. Because the mapathon contributors
and validators had little-to-no knowledge about the setting,
they did not make any classifications regarding the current
use of the buildings they digitized.
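The gridding step used to assign work to contributors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grid_cells` is hypothetical, coordinates are assumed to be in a projected system measured in meters, and the study itself used ArcGIS Online rather than this code.

```python
import math
from typing import List, Tuple

Cell = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in meters


def grid_cells(xmin: float, ymin: float, xmax: float, ymax: float,
               size: float = 1000.0) -> List[Cell]:
    """Cover a bounding box with square cells of the given edge length.

    Cells on the upper and right edges may extend past the bounding box;
    they are not clipped in this sketch.
    """
    cols = math.ceil((xmax - xmin) / size)
    rows = math.ceil((ymax - ymin) / size)
    return [
        (xmin + c * size, ymin + r * size,
         xmin + (c + 1) * size, ymin + (r + 1) * size)
        for r in range(rows)
        for c in range(cols)
    ]
```

Each returned cell could then be handed to one contributor at a time, mirroring the one-cell-at-a-time workflow described for the mapathon.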
Automated feature extraction
The alternative method of acquiring spatial data for this
project leveraged semi-automated feature extraction
(Fig. 2) using the results from machine-learning deployments
on millions of structures across various developing
countries. The results gathered from previously conducted
deployments supported Ecopia Tech’s (Toronto,
ON, Canada) proprietary machine-learning models in
generating building footprints for structures of interest
detected in the imagery.
To support the models in extracting building footprints,
relevant imagery was broken into a grid of 256 × 256-pixel
chips. Within each chip, a classifier ran through every pixel
and assigned each one a probability score for containing a
feature of interest, using a variety of textural feature data
from neighboring pixels in its calculations. The classifier
algorithm used is a proprietary algorithm developed by
Ecopia which measures the shearing of pixels along with
color gradients to determine the likelihood that a structure
falls within a pixel. Shearing in straight and/or circular
lines can be indicative of man-made materials. If shearing,
texture and contrast scores exceeded Ecopia’s internal
threshold of 1 then the pixels were classified as likely containing
or being a part of a structure. The classifier algorithm
then digitized each structure’s boundary, using the
confidence scores previously generated for each pixel. Any
chips that did not contain structures were removed from
the algorithm’s output. A team of former geospatial professionals
and remote sensing enthusiasts who are expert
annotators then reviewed the resulting data sets, manually
corrected any errors, and provided any necessary updates
to the classifier algorithm. Using a “CrowdRank” algorithm
[32] we were able to classify users who perform better
when compared against other users completing the same
task. Users who regularly fall below a pre-defined bench-
mark are removed from the project in an iterative fashion
to promote the highest accuracy possible. Informed by the
updated and improved data, the classifier algorithm then
iteratively reproduced the process to increase overall accu-
racy. During these iterations, the annotators continued to
manually revise any incorrectly generated vector edges and
updated the classifier algorithm accordingly. Furthermore,
the annotators manually digitized obscured structures to
reflect accurate footprints of structures of interest. Prior to
this deployment, Ecopia Tech developed an AFE algorithm
capable of classifying footprints and partnered with Maxar
Technologies (Westminster, CO) to utilize their very high-
resolution imagery mosaics and guidance on categorizing
the building footprint outputs. To accurately categorize
footprints as commercial, compound, or residential, expert
imagery analysts from Ecopia manually identified exam-
ples of each from the imagery to use in training data for
the machine-learning model. Guided by discussions with
local consultants, Ecopia defined a compound as typically
including several structures along with a yard surrounded
by a wall. Non-walled, free-standing structures were then
categorized as either commercial or residential depend-
ing on other contextual factors, such as the proximity
and presence of latrines, farmlands, vehicles, and other
indicators of human activity (Fig. 3). The AFE algorithm
excluded structures that were round and smaller than 9 m²
to ensure boulders were not mistaken for structures. The
outputs of the model were polygons drawn around the
perimeter of all free-standing structures larger than 9 m²
(whether categorized as commercial or residential) and
all residential compounds (Fig. 1). Unlike the mapathon,
compounds, regardless of the number of structures con-
tained inside, were treated as one polygon feature.
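As a rough illustration of the chip-and-threshold step described for the AFE workflow, the Python sketch below splits an image into 256 × 256-pixel chips and keeps only chips with at least one pixel scoring above a threshold. The Ecopia classifier is proprietary and is not reproduced here; the per-pixel score grid, the function names, and the threshold passed in are all placeholders assumed for illustration.

```python
from typing import Iterator, List, Tuple

CHIP = 256  # chip edge length in pixels, as stated in the text

Bounds = Tuple[int, int, int, int]  # (row0, row1, col0, col1), end-exclusive


def chip_bounds(height: int, width: int, chip: int = CHIP) -> Iterator[Bounds]:
    """Yield the bounds of every chip needed to cover the image."""
    for r in range(0, height, chip):
        for c in range(0, width, chip):
            yield r, min(r + chip, height), c, min(c + chip, width)


def chips_with_structures(scores: List[List[float]], threshold: float,
                          chip: int = CHIP) -> List[Bounds]:
    """Keep chips that contain at least one pixel scoring above threshold.

    `scores` is a hypothetical per-pixel score grid standing in for the
    output of a classifier; chips with no qualifying pixel are dropped.
    """
    height, width = len(scores), len(scores[0])
    kept = []
    for r0, r1, c0, c1 in chip_bounds(height, width, chip):
        if any(scores[r][c] > threshold
               for r in range(r0, r1) for c in range(c0, c1)):
            kept.append((r0, r1, c0, c1))
    return kept
```

Edge chips are simply truncated at the image boundary in this sketch; how the production pipeline handles partial chips is not stated in the text.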
An estimate of the population inhabiting the structures
captured by each method was calculated employing the
assumption that a compound includes three households,
where each household has an average of 2 adults and 5
children. Therefore, it was estimated that each compound
housed an average of 21 individuals.
Accuracy assessment
The study areas were selected for two reasons: their
geographic heterogeneity despite the low spectral
diversity (e.g., deserts, arid mountainous, alluvial
plains, etc.) and the inaccessibility of local ground-
truth data due to continuous insecurity.
We employed the following accuracy assessment
techniques to determine how well each feature genera-
tion method—mapathon or AFE—captured the actual
location of features of interest.
Fig. 1 Feature generation using two methods: mapathon (point) and automated feature extraction algorithm (polygon)
Assessment 1
We conducted an agreement analysis of the two feature
classes to assess matches across the two outputs. To
ensure a uniform comparison across both sets of fea-
tures, we only considered the non-compound AFE fea-
tures and the mapathon points that were not part of a
compound. A simple ‘select by location’ query was used
within ArcGIS Pro, whereby both feature types, point
and polygon, were analyzed together.
To allow for small shifts in geographic location when
comparing mapathon points and AFE polygons, a point and a
polygon within 5 m of each other (measured from the polygon’s
perimeter) were considered a match and labeled as “befriended” (Fig. 4). If a point
from the mapathon did not fall within an AFE polygon or
have an AFE polygon within 5 m of it, we labeled that point
as “lonely”. Similarly, if a polygon from AFE did not have a
corresponding mapathon point within 5 m of the polygon’s
edge, we labeled that polygon as “lonely”. Five-meter buffers
were applied to points and polygons separately, rather than
simultaneously, such that a potential
5 m shift in location was analyzed
first for one of the feature types and then the other.
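The befriended/lonely labeling in Assessment 1 can be approximated with plain planar geometry. The Python sketch below is not the ArcGIS Pro ‘select by location’ query used in the study; it assumes coordinates in a projected system measured in meters, simple (non-self-intersecting) polygons given as vertex lists, and hypothetical function names throughout.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = Sequence[Point]  # ring of vertices; the last vertex joins the first


def _dist_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance from point p to segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _inside(p: Point, poly: Polygon) -> bool:
    """Ray-casting point-in-polygon test."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def is_match(p: Point, poly: Polygon, tol: float = 5.0) -> bool:
    """True if p lies inside poly or within tol meters of its perimeter."""
    if _inside(p, poly):
        return True
    n = len(poly)
    return any(_dist_to_segment(p, poly[i], poly[(i + 1) % n]) <= tol
               for i in range(n))


def label_features(points: List[Point], polygons: List[Polygon],
                   tol: float = 5.0) -> Tuple[List[str], List[str]]:
    """Label every point and every polygon as 'befriended' or 'lonely'."""
    point_labels = ["befriended" if any(is_match(p, g, tol) for g in polygons)
                    else "lonely" for p in points]
    polygon_labels = ["befriended" if any(is_match(p, g, tol) for p in points)
                      else "lonely" for g in polygons]
    return point_labels, polygon_labels
```

This brute-force comparison checks every point against every polygon, which is adequate for a demonstration; at the scale of tens of thousands of features a spatial index would normally be used.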
Assessment 2
We conducted a subset analysis of the data from both
feature generation methods. Two GIS experts who were
not part of the mapathon validation team independently
analyzed the same set of 100 random, lonely points and
100 random, lonely polygons against the same high-resolution
imagery. The subset analysis was limited to
lonely points and lonely polygons because lonely features were
not considered to be matches in Assessment 1. The
GIS experts did not have access to ground truth due to
security reasons in the study region. As an alternative,
high-resolution satellite imagery was used as the source
of verification for their assessment. They classified points
and polygons correctly corresponding to a structure as
true positives (TP) based on verification against satellite
imagery and classified the remaining features as false
positives (FP), also based on verification against satellite
imagery. Finally, the true positive percentage was calculated
by averaging the number of TP and FP yielded by
the two GIS experts.
Fig. 2 Sequence of steps employed for automated feature extraction (AFE)
Fig. 3 Example of free-standing structures in the study area, categorized as commercial (left) and residential (right) (© 2020 Maxar Technologies)
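The averaging step in Assessment 2 amounts to simple arithmetic, sketched below in Python. The function name and the reviewer counts in the usage line are made up for illustration and are not the study's data.

```python
from typing import Sequence


def true_positive_percentage(tp_counts: Sequence[int], sample_size: int) -> float:
    """Average the reviewers' true-positive counts and express as a percentage.

    tp_counts holds one true-positive count per reviewer for the same
    sample; sample_size is the number of features each reviewer assessed.
    """
    mean_tp = sum(tp_counts) / len(tp_counts)
    return 100.0 * mean_tp / sample_size
```

For example, two hypothetical reviewers marking 80 and 90 of the same 100 features as true positives would give `true_positive_percentage([80, 90], 100)`, i.e. 85.0; the false positive percentage is the complement.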
Fig. 4 Comparing the results of two feature generation methods: match assessments categorized as “befriended” or “lonely”. Notes: (1) For illustrative
purposes only. (2) AFE = automated feature extraction. (3) Five-meter buffers were measured around each point and from the edges of each
polygon. For clarity of illustration, 5-m polygon buffers are only shown for the lonely polygon
Assessment 3
The third accuracy assessment involved statistically comparing
the features generated from both the mapathon
and AFE to a microplan, considered the gold-standard
data, shared by the local-level team from one of the
study districts. The microplan was developed by the local
health authorities by listing the known
settlements in the areas targeted for vaccination and
estimating the number of households that vaccination
teams should expect to find in each settlement. The study
district consisted of 44 operational sub-districts called
clusters, and one vaccination team was assigned to work
in each cluster. The microplan included cluster names,
number of households per cluster, number of vaccination
teams, the population aged 0–59 months (i.e., target age
for vaccination), and total population.
To account for differences in feature extraction tech-
niques and parameters, we analyzed residential or
compound polygons and mapathon points. Because
mapathon points captured structures of any use while
the AFE and microplan indicated household structures
of residential use only, the count of mapathon points
per cluster was recalculated to better approximate the
number of households in the cluster. To count only one
mapathon point per compound, we first removed mapa-
thon points that fell inside of AFE polygons categorized
as compounds from the analysis. We then multiplied the
percentage of AFE compounds containing at least one
mapathon point (83%) with the number of compounds
in each of the 44 clusters and added that number to the
mapathon points for each cluster, thus creating a more
comparable dataset to the AFE polygons and microplan.
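One reading of the per-cluster adjustment in Assessment 3 is sketched below in Python: mapathon points falling inside AFE compounds are dropped, and 83% of the cluster's compound count is added back. The function name and the input counts are hypothetical, and the interpretation itself is an assumption based on the description in the text.

```python
def adjusted_mapathon_count(points_outside_compounds: int,
                            n_compounds: int,
                            share_with_point: float = 0.83) -> float:
    """Approximate household-like features in one cluster.

    points_outside_compounds: mapathon points left after removing those
        that fall inside AFE compound polygons.
    n_compounds: AFE compounds in the cluster.
    share_with_point: share of AFE compounds containing at least one
        mapathon point (83% in the text).
    """
    return points_outside_compounds + share_with_point * n_compounds
```

With made-up inputs of 120 remaining points and 200 compounds, the adjusted count is 120 + 0.83 × 200 = 286.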
The null hypothesis for this assessment assumed no significant
differences between the average number of features
per cluster in the microplan in comparison with the average
number of features per cluster obtained from each feature
generation method. To test this assumption, we conducted
2-sample t-tests: (a) comparing the average number of
points per cluster from the mapathon to the microplan and
(b) comparing the average number of polygons per cluster
from AFE to the microplan. T-statistics indicated whether
there were significant differences (p < 0.05) between each
feature generation method and the microplan.
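A two-sample t-statistic of the kind used in Assessment 3 can be computed with the Python standard library, as sketched below. The text does not say whether a pooled-variance or unequal-variance test was used; this sketch assumes Welch's unequal-variance form, and the function name is hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's two-sample t-statistic for the difference mean(a) - mean(b).

    Uses sample variances (n - 1 denominator). The sign depends on the
    order of the arguments.
    """
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error
```

Here `a` would hold the per-cluster feature counts from one method and `b` the per-cluster household counts from the microplan. Converting the statistic to a p-value additionally requires the t distribution with the appropriate degrees of freedom, which is left out of this sketch.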
Assessment 4
Population in the study area was estimated by apply-
ing the following assumptions to the mapathon and
AFE data—A compound consists of 3 households and a
household consists of 7 individuals. This assumption was
based on advice from in-country colleagues. Therefore,
population estimates were calculated based on the num-
ber of free-standing (7 individuals) residences and the
number of compound residences (21 individuals).
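The population assumptions in Assessment 4 reduce to a short calculation, sketched here in Python. The constants come from the text (7 individuals per household, 3 households per compound); the function name and the example counts are illustrative only.

```python
PEOPLE_PER_HOUSEHOLD = 7      # 2 adults and 5 children, per the text
HOUSEHOLDS_PER_COMPOUND = 3   # per the text


def estimate_population(n_free_standing_residences: int, n_compounds: int) -> int:
    """Estimate population from counts of residences and compounds.

    Each free-standing residence is treated as one household; each
    compound as three households (21 individuals).
    """
    per_compound = PEOPLE_PER_HOUSEHOLD * HOUSEHOLDS_PER_COMPOUND
    return (n_free_standing_residences * PEOPLE_PER_HOUSEHOLD
            + n_compounds * per_compound)
```

For instance, 10 free-standing residences and 5 compounds (made-up counts) would give 10 × 7 + 5 × 21 = 175 individuals.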
Results
Descriptive statistics ofmapathon andvalidation
e mapathon took place in August 2018 over five days
and recruited 107 participants. Seven organizers spent
approximately 840h, or approximately 120h per person,
preparing for and conducting the event. e contribu-
tors and validators captured a total of 92,713 valid indi-
vidual structures across an area of 6146 km2 during the
mapathon. e total number of individual structures did
not take into account, the adjustments made by valida-
tors where mapathon points of insufficient quality were
deleted. e number of digitized features differed widely
between participants, with a minimum point count of 1
and a maximum of 10,134. Participants spent a total of
98h digitizing, averaging 748 features per person.
Of the 92,713 structures digitized during the mapa-
thon, the vast majority (n = 79,640, 85.9%) required no
revision by a validator, a sizeable proportion (n = 12,608,
13.6%) were uniquely generated by the validators because
contributors missed these structures entirely, and < 0.5%
were edited by the validators or contributors themselves.
Descriptive statistics ofautomated feature extraction
e AFE process required a total of nine days to com-
plete, costing $25,000. is cost included image mosaic
preparation, training data development, model deploy-
ment and iterations, and quality checks for a total area
of 6146 km2. e semi-automated method identified
53,150 individual structures and compounds. e com-
bined use of Maxar satellite imagery processing and
Ecopia algorithms enabled the generation of build-
ing footprints and consequent categorization of those
footprints. Due to the difference in methodologies, it
is expected that the AFE method would result in fewer
features than the mapathon. e AFE classifier catego-
rized 80.7% (n) of the building footprints as compound
structures, 16.4% (n) as commercial structures, and 2.9%
(n) as residential structures. e average area of all AFE
polygons, representing the footprints of compounds,
was 808.2 m2, while the average area for commercial and
residential building footprints were 24.4 m2 and 65.3 m2,
respectively.
Assessment 1: Comparing mapathon and AFE
Based on the matches assessed across the two feature generation outputs (Fig. 4), 30% (n/N) of the non-compound AFE-identified structures intersected or were within 5 m of a mapathon point, while 70% (n/N) were not. Comparatively, 28% (n/N) of the mapathon points that were not part of a compound fell inside of or were within 5 m of an AFE polygon, while 72% (n/N) did not. A slightly higher proportion of AFE features were befriended (30%) than the proportion of mapathon points
that were befriended (28%). In other words, the share of AFE polygons corroborated by a mapathon point was two percentage points higher than the share of mapathon points corroborated by a polygon.
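The matching rule described above (a point and a polygon agree when the point falls inside the polygon or lies within 5 m of it) can be sketched in plain Python. This is an illustrative sketch only: it assumes coordinates in a projected system with metre units, the function names are hypothetical, and it is not the GIS workflow used in the study.

```python
from math import hypot


def point_in_polygon(p, poly):
    """Ray-casting test; poly is a list of (x, y) vertices in order."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_point_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_point_polygon(p, poly):
    """0 if p is inside the polygon, else distance to the nearest edge."""
    if point_in_polygon(p, poly):
        return 0.0
    n = len(poly)
    return min(dist_point_segment(p, poly[i], poly[(i + 1) % n]) for i in range(n))


def label_matches(points, polygons, tolerance_m=5.0):
    """Flag each point and each polygon as befriended (True) or lonely (False)."""
    point_flags = [False] * len(points)
    polygon_flags = [False] * len(polygons)
    for i, p in enumerate(points):
        for j, poly in enumerate(polygons):
            if dist_point_polygon(p, poly) <= tolerance_m:
                point_flags[i] = True
                polygon_flags[j] = True
    return point_flags, polygon_flags


# Hypothetical data: two square footprints and three digitized points.
polygons = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(100, 100), (110, 100), (110, 110), (100, 110)],
]
points = [(5.0, 5.0), (13.0, 5.0), (50.0, 50.0)]  # inside, 3 m away, far away

point_flags, polygon_flags = label_matches(points, polygons)
print(point_flags)    # [True, True, False]
print(polygon_flags)  # [True, False]
```

The two befriended percentages need not be equal, because several points can match one polygon and each percentage uses a different denominator. The nested loop is adequate for a sketch; tens of thousands of features would call for a spatial index.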
Assessment 2: Subset analysis
The subset analysis demonstrated that the AFE method, including residential and commercial structures, had a higher true positive percent (90.5%) than the mapathon (84.5%) in identifying structures (see Appendix 1).
Assessment 3: Comparing average number of features per cluster against the microplan
When compared against the 25,141 total features included in the microplan, the mapathon identified an additional 20,804 features. The difference between the microplan and the AFE results was smaller (8142 additional features). Figure 5 shows the variation in feature counts resulting from the three different techniques in one district of the study area.
The average number of features per cluster in the microplan was 571.39. The average number of mapathon points per cluster was statistically significantly higher (mean = 1044.20, t = 3.56, p < 0.001), as was the average number of AFE polygons per cluster (mean = 756.43, t = 2.09, p = 0.04). Both comparisons indicate either that the microplans were missing structures in the clusters reviewed or that both methods overestimated the number of structures in the microplan clusters. The p-value was significant (alpha = 0.05) for both t-tests, providing sufficient support to reject the null hypothesis of no significant difference between the average number of features obtained through each extraction method and the microplan (Table 1).
Assessment 4: Estimating population
The population in the study area estimated from mapathon results was 648,991 and the population based on AFE results was 911,302.
Fig. 5 Count of features across all clusters by feature generation or listing technique (chart title: Feature Variation - District A). Bar values: total number of households in microplan, 25,141; total number of polygons (resident + compound), 33,283; total number of points (mapathon), 45,945.
Table 1 Two-sample t-test results comparing features per cluster (t-test: two-sample assuming unequal variances)

                              Mapathon points  Microplan   AFE features  Microplan
Mean                          1044.20          571.39      756.43        571.39
Variance                      717,621.79       60,285.87   284,076.30    60,285.87
Observations                  44               44          44            44
Hypothesized mean difference  0                            0
df                            50                           60
t Stat                        3.56                         2.09
P(T<=t) two-tail              0.00083                      0.04071
t Critical two-tail           2.01                         2.00
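The t statistics and degrees of freedom in Table 1 can be checked from the reported means, variances, and cluster counts alone. The sketch below assumes the standard unequal-variance (Welch) formulas, which is what the table heading indicates; the function name is hypothetical, and the p-value step is omitted because it requires a t-distribution CDF that the Python standard library does not provide.

```python
from math import sqrt


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se1_sq = var1 / n1  # squared standard error of the first mean
    se2_sq = var2 / n2  # squared standard error of the second mean
    t_stat = (mean1 - mean2) / sqrt(se1_sq + se2_sq)
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t_stat, df


# Summary statistics as reported in Table 1: (mean, variance, observations).
microplan = (571.39, 60_285.87, 44)
mapathon = (1044.20, 717_621.79, 44)
afe = (756.43, 284_076.30, 44)

for label, method in (("Mapathon", mapathon), ("AFE", afe)):
    t_stat, df = welch_t(*method, *microplan)
    print(f"{label} vs microplan: t = {t_stat:.2f}, df = {df:.2f}")
```

Rounded, these give t = 3.56 with about 50 degrees of freedom for the mapathon and t = 2.09 with about 60 for AFE, in line with the table.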
Discussion
The results obtained from both feature generation methods were compared to estimates from the current field-level source, a microplan line-list. Even though the study compared differing methodologies for feature generation, measures were taken to ensure a uniform comparison, including the exclusion of commercial structures and the recalculation of mapathon points to better approximate the number of households in each cluster. Results of the t-tests indicated statistically significant differences for each technique in comparison with the microplan: the average number of features per cluster was larger than in the microplan for both techniques and, in the case of the mapathon, almost twice the microplan average. The AFE was found to be closer to the microplan than the mapathon when looking at absolute feature counts. While the accuracy of the field-level microplan itself is unknown, it was the best dataset available to the authors for comparison with the mapathon and AFE results. AFE outputs had a higher true positive percent (90.5%) than the mapathon (84.5%), meaning the AFE was slightly better at correctly identifying a structure in the satellite imagery as a structure. The two techniques could be optimized to detect structures more accurately, as both were subject to false positives and an unknown number of false negatives. Large boulders and trees were mistakenly digitized as structures in the mapathon, which could be avoided by using various indices that enhance spectral diversity and by conducting more training with the participants. Likewise, numerous structures amidst cliffs and hilly terrain were not captured by the AFE technique.
Population estimates in inaccessible regions are often difficult to ascertain due to dynamic population changes and the labor-intensive enumeration process [12]. This has important consequences for planning immunization campaigns and estimating vaccination coverage for EI. Acquiring precise population estimates translates into improved vaccine delivery programs once areas become accessible and more accurate evaluations of campaign coverage [33]. This analysis was able to produce rough population estimates for the study area derived from each feature generation method, based on structure-to-population ratio assumptions supplied by country-level partners. Our overarching purpose in comparing both feature generation methods was to determine which method more accurately identified structures in high-resolution satellite imagery and how the two methods might best complement one another. The most accurate population estimates result from optimum accuracy in structure identification.
As populations and population movements continue to fluctuate across large geographic areas, the availability of up-to-date information on the distribution of human settlements constantly changes [34]. These are circumstances in which AFE, which can be rerun and retrained quickly, could make valuable contributions to data availability compared to mapathons, which require time-consuming manual inspections for updates. This AFE process required two days for mosaic preparation and seven days for training data curation, model deployment iterations, and quality checks. Algorithms like the ones used here are useful for expediting work while maintaining or enhancing quality; however, these algorithms are costly. The cost of the pilot AFE project was $25,000 across 6146 km² of the study area, and the project required highly specialized technical expertise (Table 2). As this technology becomes more commonly used and explored, it is expected that the cost could decrease over time, making it more accessible. In contrast, participatory and collaborative mapping efforts like mapathons require an extensive amount of manpower and time, which is much harder to translate into monetary costs. They are therefore most valuable when timeliness is less of a priority, the geographic scale of the study is limited, and current, high-resolution imagery is available (Table 2).
The use of mapathons for public health interventions has increased meaningfully in recent years [35]. Mapathons can promote effective community engagement, creating a sustainable mechanism for generating geographic data that local immunizers can use during campaigns to ensure the inclusion of all settlements. This collaborative style of mapping can recruit a range of expertise and be conducted mostly free of cost; however, the two main methodological challenges are the uncertain quality of the data generated by participants and the number of hours it takes to organize and conduct a mapathon.
Manual and model-driven feature generation are also useful in tandem, exploiting the merits of each to develop a product superior to what either method would create alone. For example, smaller scale mapathon efforts are an efficient means of creating training data for AFE. Additionally, AFE footprints can be added to an online application to assist mapathon validators in assessing incoming results during a mapathon event. Both AFE footprints and mapathon points, or a combination of the two, can be used as inputs for population estimation models in which structures are a proxy for the population. The researchers suggest that parameters such as terrain type, delivery deadline, budget, human resources, computing resources, imagery availability, and requirements of the data output should be considered to strike an appropriate balance in using these two methods together or to decide whether one method is more favorable than the other for a particular project. Although
assessing whether the mapathons together with AFE
provide more accurate results was outside of the scope of
this paper, future work might include an analysis of how
these two methods could be used in tandem.
Limitations
The strengths of this study included the use of current, equivalent satellite imagery across the two feature generation methods, multiple assessments to understand how the results of each method were similar and different, and a thorough logistical comparison to inform the choice of method when only one can be employed. The study also had limitations. An important caveat for interpreting our findings is the lack of a true gold standard, in the form of ground-truthed data collection, which made it impossible to calculate the false-negative rate for each method. Furthermore, due to the insecurity of the area, building footprints and points generated through the study could not be validated in the field for potential inaccuracies. Instead, the findings were compared to a microplan line-list developed by the country’s local teams and considered closest to ground truth. However, microplans are also limited because they capture known areas of settlement and may not reflect newly established or abandoned settlements. Finally, the over- and underestimation of structures extracted through both techniques cannot be investigated on the ground until security conditions allow access to the region. The authors suggest replicating this study in accessible areas to evaluate and compare findings. Another inherent challenge in this study is the use of different feature extraction techniques. The mapathon participants were instructed to place one point per unique rooftop, including rooftops within compounds, while the AFE grouped numerous structures into one feature (as shown in Fig. 1) when the polygon feature represented a compound. This resulted in a considerably lower count of individual structures (39,563 fewer) with the AFE technique, which can be an issue if individual structure counts are used to estimate population.
Another important limitation is the lack of equivalent, constant oversight and the introduction of human bias by the mapathon participants in comparison with AFE. While mapathon participants were provided with training resources and had constant access to GIS experts through an online chat application, mapathon coordinators were not able to monitor every point placed by novice contributors. Mapathon validators did confirm the accuracy of each digitized point and made any necessary edits following submission by the contributors, but human error could still be present in this validation process. The AFE method in this study also used human input as part of the validation process after the classification algorithm was run, but the possibility of human error is lower than for the mapathon because much less human input was involved.
Conclusions
We presented results comparing two feature extraction methods with the objective of determining how accurately each method identified settlements in hard-to-reach areas for the purpose of improving EI efforts. The findings suggest that the AFE is more robust in detecting structures when compared to the mapathon; however, the results need to be validated in the field when feasible in order to calculate sensitivity and specificity compared to the gold standard of data collected on the ground (i.e., ground truthing). AFE could be particularly useful for essential immunization efforts because it generates spatial data from imagery rapidly and has the potential to be more accurate than mapathons. The geographic data obtained from this study will be used to improve existing microplans with the intent of increasing the EI coverage rate in our study area. Future comparison studies must consider a consistent methodological framework across both feature extraction techniques to improve the findings presented in this study. Although both feature generation techniques could be improved further, this study is a step towards strengthening the understanding of potential methods of mapping population distribution in inaccessible areas to support public health interventions.

Table 2 Comparison of feature generation techniques

Indicator                                 Mapathon                                      Automated feature extraction
A. Generic indicators
Time                                      60 days                                       9 days
Cost (a) to conduct project               Cost of organizational ArcGIS Online          $25,000 for 6146 km²
                                          licenses (Creator license = $1000/year,
                                          Editor license = $200/year). Labor cost
                                          for coordinators (based on salary of
                                          coordinators). Participants were unpaid
                                          volunteers
Skill level                               Specialized application development           Specialized machine-learning
                                          expertise required of coordinators            expertise required
Area (best suited for)                    Smaller geographic regions                    Larger geographic areas
B. Performance indicators
Number of structures identified           92,713                                        53,150
Difference in structures identified       +20,804                                       +8142
compared to microplan
Non-compound match rate (% befriended)    28%                                           30%
Number of compounds identified            30,904                                        43,395
True positive percent                     84.50%                                        90.50%
Estimated population (b)                  648,991                                       911,302

(a) Cost does not take cost of imagery into account, as imagery did not have a stand-alone procurement fee for the specific event studied here
(b) Population was estimated by applying the following assumptions to the mapathon and AFE data: a compound consists of 3 households and a household consists of 7 individuals
Appendix1
Assessment 2: Subset analysis results.
GIS Analyst Mapathon points AFE polygons
True
positive False
positive True
positive False
positive
A 91 9 94 9
B 78 22 87 13
Total 169 31 181 22
Average
percent 84.5% 15.5% 90.5% 11%
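The "Average percent" row can be reproduced by averaging the two analysts' counts, each taken as a percentage of the features reviewed. The sketch below assumes each analyst reviewed 100 features per method; that sample size is an assumption and is not stated in this section. The variable and function names are hypothetical.

```python
from statistics import mean

# Assumption: each analyst reviewed 100 features per method.
SAMPLES_PER_ANALYST = 100

# Counts per analyst (A, B) as listed in the appendix table.
counts = {
    "Mapathon points": {"true_positive": [91, 78], "false_positive": [9, 22]},
    "AFE polygons": {"true_positive": [94, 87], "false_positive": [9, 13]},
}


def average_percent(per_analyst_counts, n=SAMPLES_PER_ANALYST):
    """Mean across analysts of each analyst's count as a percent of n."""
    return mean(100 * c / n for c in per_analyst_counts)


for method, c in counts.items():
    tp = average_percent(c["true_positive"])
    fp = average_percent(c["false_positive"])
    print(f"{method}: true positive {tp}%, false positive {fp}%")
```

Under this assumption the percentages are computed against the fixed sample size rather than against the sum of true and false positives in each row.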
Abbreviations
AFE: Automatic feature extraction; CDC: Centers for Disease Control and
Prevention; DG: DigitalGlobe; EI: Essential immunization; FP: False positives;
GIS: Geographic information system; GRASP: Geospatial Research, Analysis,
and Services Program; RED: Reach every district; TP: True positives; UNICEF:
United Nations Children’s Fund; VHR: Very high-resolution; WHO: World Health
Organization.
Acknowledgements
The authors would like to thank all the mapathon volunteers from World
Health Organization and the Centers for Disease Control and Prevention.
Disclaimers
The findings and conclusions in this report are those of the author(s) and
do not necessarily represent the official position of the Centers for Disease
Control and Prevention, the Agency for Toxic Substances and Disease Registry,
or other institutions to which the authors belong.
This research was supported in part by an appointment to the Research
Participation Program at the Centers for Disease Control and Prevention
administered by the Oak Ridge Institute for Science and Education, Training
Programs in Epidemiology and Public Health Interventions Network (TEPHI-
NET) and DRT Strategies through an interagency agreement between the U.S.
Department of Energy and CDC.
Authors’ contributions
TP participated in study design and coordination, drafted the manuscript, and
contributed to the data collection, analysis, and paper conceptualization. AB
contributed to the paper conceptualization. AM participated in coordination,
drafted manuscript, and contributed to the data collection, analysis, and paper
conceptualization. JE participated in coordination, drafted the manuscript,
conducted analyses and contributed to the paper conceptualization. BK
conceived and designed study and contributed to data collection and coordi-
nation. RP conceived of study, conducted analyses, and participated in study
design. AM contributed to the paper conceptualization. SB contributed to the
paper conceptualization. MM conceived of study and participated in coordi-
nation and study design. NH contributed to the paper conceptualization. All
authors reviewed and provided feedback to the manuscript. All authors read
and approved the final manuscript.
Funding
This research received no specific grant from any funding agency in the pub-
lic, commercial, or not-for-profit sectors.
Availability of data and materials
The datasets generated during and/or analyzed during the current study
are not publicly available to protect the security of populations living in our
study region but are available from the corresponding author on reasonable
request.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Author details
1 Division of Toxicology and Human Health Sciences, Agency for Toxic
Substances and Disease Registry, 4770 Buford Hwy NE, Atlanta, GA 30341,
USA. 2 Sustainable Development Practice, Maxar Technologies, 1300 W 120th
Avenue, Westminster, CO 80234, USA. 3 Polio Program, Bill & Melinda Gates
Foundation, 500 5th Ave N, Seattle, WA 98109, USA. 4 Global Immunization
Division, Centers for Disease Control and Prevention, 1600 Clifton Rd, Atlanta,
GA 30333, USA.
Received: 8 March 2021 Accepted: 9 May 2021
References
1. WHO. National Immunization Coverage Scorecards Estimates for 2018; 2018. https://www.who.int/docs/default-source/immunization/pertussis/gvap-national-immunization-coverage-scorecards-estimates-2018.pdf?sfvrsn=46a24831_2. Accessed 13 Dec 2019.
2. Sodha S, Dietz V. Strengthening routine immunization systems to improve global vaccination coverage. Br Med Bull. 2015;113:5–14.
3. WHO. Global Vaccine Action Plan 2011–2020; 2013. https://www.who.int/immunization/global_vaccine_action_plan/GVAP_doc_2011_2020/en/. Accessed 13 Dec 2019.
4. GPEI. Polio Endgame Strategy 2019–2023: Eradication, integration, certification and containment; 2019. http://polioeradication.org/wp-content/uploads/2019/06/english-polio-endgame-strategy.pdf. Accessed 17 Dec 2019.
5. WHO. Global Polio Eradication Initiative: Best Practices in Microplanning for Polio Eradication; 2018. http://polioeradication.org/wp-content/uploads/2018/12/Best-practices-in-mircoplanning-for-polio-eradication.pdf. Accessed 19 Dec 2019.
6. WHO. Microplanning for immunization service delivery using the Reaching Every District (RED) strategy; 2009. https://www.who.int/immunization/sage/9_Final_RED_280909.pdf. Accessed 19 Dec 2019.
7. Ali D, Levin A, Abdulkarim M, Tijjani U, Ahmed B, Namalam F, et al. A cost-
effectiveness analysis of traditional and geographic information system-
supported microplanning approaches for routine immunization program
management in northern Nigeria. Vaccine. 2020;38:1408–15.
8. WHO. Global routine immunization strategies and practices (GRISP): a companion document to the Global Vaccine Action Plan (GVAP); 2016. https://apps.who.int/iris/bitstream/handle/10665/204500/9789241510103_eng.pdf. Accessed 7 Jan 2020.
9. Kamanga A, Renn S, Pollard D, Bridges DJ, Chirwa B, Pinchoff J, et al.
Open-source satellite enumeration to map households: planning and
targeting indoor residual spraying for malaria. Malar J. 2015;14(1):345.
10. Kelly G, Seng CM, Donald W, Taleo G, Nausien J, Batarii W, et al. A spatial
decision support system for guiding focal indoor residual spraying inter-
ventions in a malaria elimination zone. Geospat Health. 2011;6:21–31.
11. See L, Mooney P, Foody G, Bastin L, Comber A, Estima J, et al. Crowdsourc-
ing, citizen science or volunteered geographic information? The current
state of crowdsourced geographic information. ISPRS Int J Geo Inf.
2016;5(5):55.
12. de Albuquerque JY, G.; Pitidis, V.; Ulbrich, P. Towards a participatory
methodology for community data generation to analyse urban health
inequalities: a multi-country case study. In: 52nd Hawaii International
Conference on System Sciences; Hawaii; 2019.
13. Brown M, McCarty J. Remote sensing data and methods for identifying
urban and peri-urban smallholder agriculture in developing countries
and in the United States. 2017.
14. Debats SR, Luo D, Estes LD, Fuchs TJ, Caylor KK. A generalized computer
vision approach to mapping crop fields in heterogeneous agricultural
landscapes. Remote Sens Environ. 2016;179:210–21.
15. Ellis P, Griscom B, Walker W, Gonçalves F, Cormier T. Mapping selective
logging impacts in Borneo with GPS and airborne lidar. For Ecol Manag.
2016;365:184–96.
16. North HC, Pairman D, Belliss SE. Boundary delineation of agricultural fields
in multitemporal satellite imagery. IEEE J Select Topics Appl Earth Obs
Remote Sens. 2019;12(1):237–51.
17. Rishikeshan CR, An H. automated mathematical morphology driven
algorithm for water body extraction from remotely sensed images. ISPRS
J Photogramm Remote Sens. 2018;146:11–21.
18. Zimba H, Kawawa B, Chabala A, Phiri W, Selsam P, Meinhardt M, et al.
Assessment of trends in inundation extent in the Barotse Floodplain,
upper Zambezi River Basin: a remote sensing-based approach. J Hydrol
Reg Stud. 2018;15:149–70.
19. Kellenberger B, Marcos D, Tuia D. Detecting mammals in UAV images:
best practices to address a substantially imbalanced dataset with deep
learning. Remote Sens Environ. 2018;216:139–53.
20. Wania A, Kemper T, Tiede D, Zeil P. Mapping recent built-up area changes
in the city of Harare with high resolution satellite imagery. Appl Geogr.
2014;46:35–44.
21. Miao Z, Shi W, Gamba P, Li Z. An object-based method for road network
extraction in VHR satellite images. IEEE J Select Topics Appl Earth Obs
Remote Sens. 2015;8(10):4853–62.
22. Nunes DM, Medeiros ND, Santos AD. Semi-automatic road net-
work extraction from digital images using object-based classifica-
tion and morphological operators. Boletim de Ciências Geodésicas.
2018;24(4):485–502.
23. Arun P, Katiyar S. An intelligent approach towards automatic shape
modelling and object extraction from satellite images using cellular
automata-based algorithms. GIScience Remote Sens. 2013;50(3):337–48.
24. Dumitru CO, Cui S, Schwarz G, Datcu M. Information content of very-
high-resolution SAR images: semantics, geospatial context, and ontolo-
gies. IEEE J Select Topics Appl Earth Obs Remote Sens. 2014;8(4):1635–50.
25. Hung CLJ, James LA, Hodgson ME. An automated algorithm for mapping
building impervious areas from airborne LiDAR point-cloud data for flood
hydrology. GIScience Remote Sens. 2018;55(6):793–816.
26. Konstantinidis D, Stathaki T, Argyriou V, Grammalidis N. Building detection
using enhanced HOG–LBP features and region refinement processes.
IEEE J Select Topics Appl Earth Obs Remote Sens. 2016;10(3):888–905.
27. Manno-Kovács A, Ok AO. Building detection from monocular VHR images
by integrated urban area knowledge. IEEE Geosci Remote Sens Lett.
2015;12(10):2140–4.
28. Sedaghat A, Ebadi H. Distinctive order based self-similarity descriptor
for multi-sensor remote sensing image matching. ISPRS J Photogramm
Remote Sens. 2015;108:62–71.
29. Yousefi B, Mirhassani SM, AhmadiFard A, Hosseini MM. Hierarchical
segmentation of urban satellite imagery. Int J Appl Earth Obs Geoinf.
2014;30:158–66.
30. Coulibaly I, Spiric N, Lepage R, St-Jacques M. Semiautomatic road
extraction from VHR images based on multiscale and spectral angle in
case of earthquake. IEEE J Select Topics Appl Earth Obs Remote Sens.
2017;11(1):238–48.
31. Dubois D, Lepage R. Fast and efficient evaluation of building damage
from very high resolution optical satellite images. IEEE J Select Topics
Appl Earth Obs Remote Sens. 2014;7(10):4167–76.
32. Maxar. Technologies M, editor; 2013. [cited 2021]. https://blog.maxar.com/earth-intelligence/2013/moore. Accessed 4 May 2021.
33. Kamadjeu R. Tracking the polio virus down the Congo River: a case study on the use of Google Earth™ in public health planning and mapping. Int J Health Geogr. 2009;8(1):4.
34. Esch T, Heldens W, Hirner A, Keil M, Marconcini M, Roth A, et al. Breaking
new ground in mapping human settlements from space—the Global
Urban Footprint. ISPRS J Photogramm Remote Sens. 2017;134:30–42.
35. Coetzee SM, Minghini M, Solis P, Rautenbach V, Green C. Towards
understanding the impact of mapathons - reflecting on Youthmappers
experiences. Int Arch Photogramm Remote Sens Spat Inf Sci. 2018.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in pub-
lished maps and institutional affiliations.
... Herfort et al. (2019) observed that integrating crowdsourcing with deep learning outperformed a crowdsourcing-only approach and reduced volunteer effort at all studied sites by at least 80 percent. Mendes et al. (2021), using machine learning, compared mapathon results with automated feature extraction from satellite imagery, reporting a slightly higher likelihood of correctly identifying structures using an automated approach. Nassozi (2022) reported that mappers using high-quality AI-generated building suggestions could map 2500-3000 buildings per day, compared to 1000-1500 buildings per day without AI assistance. ...
Article
AI-assisted mapping is an innovative approach to data production in OpenStreetMap (OSM), designed to add new buildings to maps using advanced editing tools based on deep learning techniques and recently released global-scale building datasets derived from satellite imagery. However, the identification of OSM data derived from AI-generated datasets remains challenging without a comprehensive global overview of the scale, magnitude, and impact of AI-assisted mapping in OSM. The present study examines the evolution of spatiotemporal mapping of buildings in OSM, applying the ohsome framework, a high-performance data analysis platform for full-history OSM data analysis. The study’s findings indicate that tags recommended by data providers are effective in identifying AI-generated buildings, and that the spatial distribution of AI-assisted mapping is highly uneven, with over 50 percent of all AI-generated buildings in OSM located in the United States and 75 percent concentrated in just five countries. A positive correlation is observed between the prevalence of AI-generated buildings in maps and both population size and natural disaster mortality rates per 100,000 people. In most countries, AI-generated buildings are modified less frequently than non-AI-generated buildings. A case study of a selected location to verify the quality of AI-generated buildings is also presented.
... It involves examining and judging how insights from other disciplines can extraction' and 'automated pattern recognition.' Artificial intelligence and machine learning algorithms would automatically extract any targeted (micro-scaled or macro-scaled) landforms and geomorphological features from RS data (Sofia 2020; Mendes et al. 2021), for example by identifying river networks, delineating slope gradients, detecting landforms, characterizing landscape elements, and reforming the mapping and classification processes. In addition, machine learning algorithms such as 'Convolutional Neural Networks' (CNNs) can recognize and classify geomorphological patterns and landforms (Du et al. 2019; Catani 2021; van der Meij et al. 2022). ...
Chapter
The concept of a paradigm shift in applied geomorphology has gained considerable attention in recent years, as researchers and practitioners recognize the need to reassess traditional approaches and embrace new perspectives. The introductory chapter aims to explore the question of whether a paradigm shift is necessary in applied geomorphology and, if so, to what extent it is accepted within the scientific community. Through a comprehensive review of relevant literature and case studies, this chapter examines the challenges and opportunities associated with a paradigm shift in applied geomorphology. It highlights the limitations of conventional methodologies and theories, emphasizing the need for innovative approaches that integrate interdisciplinary knowledge and emerging technologies. Furthermore, the chapter analyzes the factors that contribute to the acceptability of a paradigm shift in the field. It discusses the role of scientific consensus, stakeholder engagement, and the ability to address pressing societal and environmental challenges. The chapter also explores the resistance and barriers encountered in implementing a paradigm shift, including institutional inertia and disciplinary boundaries. Ultimately, this chapter advocates for raising the agenda of a paradigm shift in applied geomorphology. It argues that embracing new conceptual frameworks and methodologies can enhance individuals’ understanding of landscape dynamics, improve hazard assessment and mitigation strategies, and promote sustainable land management practices. By addressing the question of acceptability, this chapter aims to stimulate further discussion and encourage the adoption of transformative approaches in the field of geomorphology.
... In this paper, we combine the characteristics of sports action, consider the comprehensiveness of feature extraction and the description of local features, and use the eight-star model and Zernike moments commonly used in sports posture multifeature extraction to extract sports action multifeatures. In order to reduce the redundancy of features and the dimensionality of the feature vector, this paper uses a genetic algorithm-based approach to fuse the above extracted features [15]: ...
Article
A sports-assisted education method based on a support vector machine (SVM) is proposed to address the problem that complex and variable sports actions lead to ghosting in target detection and to high-dimensional feature extraction, which reduces the accuracy of sports action recognition. The ViBe target detection algorithm is improved by using the Wronskian function and the “4-linked” seed filling algorithm, which effectively solves the ghosting problem and obtains clearer human sports targets. By using a genetic algorithm to fuse the eight-star model with sports action features extracted by Zernike moments, redundant features are reduced and differentiability between classes is ensured. Sports action classification was achieved by using a one-to-one construction of an SVM classifier. The results show that the proposed method can effectively recognize sports movements with an average recognition accuracy of more than 96%, which can assist physical education and has practical application value.
Preprint
Background The increasing availability globally of building footprint datasets has brought new opportunities to support a geographic approach to health programme planning. This is particularly acute in settings with high disease burdens but limited geospatial data available to support targeted planning. The comparability of building footprint datasets has recently started to be explored, but the impact of utilising a particular dataset in analyses to support decision making for health programme planning has not been studied. Here, we quantify the impact of utilising four different building footprint datasets in analyses to support health programme planning, with an example of malaria vector control initiatives in Zambia. Methods Using the example of planning indoor residual spraying (IRS) campaigns in Zambia, we identify priority locations for deployment of this intervention based on criteria related to the area, proximity and counts of building footprints per settlement. We apply the same criteria to four different building footprint datasets and quantify the count and geographic variability in the priority settlements that are identified. Results We show that nationally the count of potential priority settlements for IRS varies by over 230% with different building footprint datasets, considering a minimum threshold of 25 sprayable buildings per settlement. Differences are most pronounced for rural settlements, indicating that the choice of dataset may bias the selection to include or exclude settlements, and consequently population groups, in some areas. Conclusions The results of this study show that the choice of building footprint dataset can have a considerable impact on the potential settlements identified for IRS, in terms of (i) their location and count, and (ii) the count of building footprints within priority settlements. The choice of dataset potentially has substantial implications for campaign planning, implementation and coverage assessment. 
Given the magnitude of the differences observed, further work should more broadly assess the sensitivity of health programme planning metrics to different building footprint datasets, and across a range of geographic contexts and health campaign types.
Chapter
The diagnosis of human filarial infections, despite important advances in recent years, remains in need of more practical and more informative improvements. Accurate diagnosis and assessment of these infections is vital for the medical management of individuals who become infected with a filarial parasite. However, it is also currently extremely important for the initiation, monitoring, and evaluation of the major elimination programs that are underway in endemic countries across the globe targeting the two most clinically significant human filarial diseases. Identification and assessment of these infections have often been inhibited by clinically silent periods before pathognomonic presentations occur in an individual, thus placing emphasis on the need for increased specific and sensitive biomarkers as indicators of infection. In addition, valid and practical evaluation methods for monitoring large filariasis endemic populations are central to the road to success in global efforts to eliminate the transmission of onchocerciasis and eliminate lymphatic filariasis as a public health problem. This chapter discusses aspects of diagnosis and assessment from a practical context, addresses both the needs and challenges that are faced in the development of functional diagnostic tools for filarial infections, and makes suggestions as to potential approaches for research in this area. This discussion is not intended to be a comprehensive review of all aspects of this wide and diverse subject; rather, it emphasizes the need to consider the biology of these parasites in developing new tests, the locations in which they are to be used, and sampling procedures that are acceptable and practical for the assessment of filariasis‐endemic populations.
Conference Paper
This paper presents results from the application of a methodological framework developed as part of an ongoing research project focused on understanding inequalities in the healthcare access of slum residents of cities in four countries: Bangladesh, Kenya, Pakistan and Nigeria. We employ a systematic approach to produce, curate and analyse volunteered geographic information (VGI) on urban communities, based on a combination of collaborative satellite-imagery digitization and participatory mapping, which relies upon geospatial open-source technologies and the collaborative mapping platform OpenStreetMap. Our approach builds upon and extends humanitarian mapping practices, in order to address the twofold challenge of achieving equitable community engagement whilst generating spatial data that adheres to quality standards to produce rigorous and trusted evidence for policy and decision making. Findings show that our method generated promising results both in terms of community engagement and the production of high-quality data on communities to analyse urban inequalities.
Article
Effective RI microplanning requires accurate population estimates and maps showing health facilities and locations of villages and target populations. Traditional microplanning relies on census figures to project target populations and on community estimates of distances, while GIS microplanning uses satellite imagery to estimate target populations and spatial analyses to estimate distances. This paper estimates the cost-effectiveness of geographical information systems (GIS)-based microplanning for routine immunization (RI) programming in two states in northern Nigeria. For our cost-effectiveness analysis, we captured the cost of all inputs for both approaches to capture the incremental cost of GIS over traditional microplanning and present the incremental cost-effectiveness ratios for each vaccine-preventable illness, death, and disability-adjusted life year (DALY) averted. We considered two scenarios for estimating vaccine requirements for each microplanning approach, one based on administrative vaccination coverage rates and one based on National Nutrition and Health Survey rates. With the administrative rates, GIS microplanning projected approximately 194,000 and 157,000 more required vaccinations than traditional microplanning in Bauchi and Sokoto States; with the survey rates, the additional number of vaccinations required was nearly 113,000 in Bauchi and about 47,000 in Sokoto. For each state under each scenario, we present numbers of and costs per measles and pertussis cases, deaths, and DALYs averted by the additional vaccinations, as well as annual costs. As expected, GIS-based microplanning incurs higher costs than traditional microplanning, due mainly to the additional vaccinations required for populations previously unreached. 
Our estimates of cost per DALY averted suggest, however, that GIS microplanning is more cost-effective than traditional microplanning in both states under both coverage scenarios and that it is worth adopting despite its higher costs.
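The incremental cost-effectiveness ratio mentioned in this abstract divides the extra cost of one approach by the extra health effect it achieves. A minimal Python sketch follows; the function name and every figure in it are hypothetical placeholders chosen for illustration, not values from the paper.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per additional unit of effect (e.g. per DALY averted)."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("no incremental effect; the ratio is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical figures for illustration only (not taken from the study):
# the new approach costs 50,000 more and averts 200 more DALYs.
example_icer = icer(cost_new=150_000.0, cost_old=100_000.0,
                    effect_new=700.0, effect_old=500.0)
print(example_icer)
```

A lower ratio means each additional unit of health gain costs less, which is how a costlier approach can still be judged the more cost-effective one.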
Article
Agricultural land-use statistics are more informative per-field than per-pixel. Land-use classification requires up-to-date field boundary maps potentially covering large areas containing thousands of farms. This kind of map is usually difficult to obtain. We have developed a new, automated method for deriving closed polygons around fields from time-series satellite imagery. We have been using this method operationally in New Zealand to map whole districts using imagery from several satellite sensors, with little need to vary parameters. Our method looks for boundaries (either step edges or linear features) surrounding regions of low variability throughout the time series. Local standard deviations from all image dates are combined, and the result is convolved with a series of extended directional edge filters. We propose that edge linearity over a long distance is a more important criterion than spectral difference for separating fields, so edge responses are thresholded primarily by length rather than strength. The resulting raster edge map (combined from all directions) is converted to vector (GIS) format and the final polygon topology is built. The method successfully segments parcels containing different crops and pasture, as well as those separated by boundaries such as roads and hedgerows. Here we describe the technique and demonstrate it for an agricultural study site (4000 km²) using SPOT satellite imagery. We show that our result compares favorably with that from existing segmentation methods in terms of both quantitative quality metrics and suitability for land-use classification.
Article
The demand for geospatial data on road networks is constant, owing to the wide variety of applications that need this type of data. Such data are important in cartographic update cycles and can be obtained using automated feature extraction from digital images, which is more accurate, faster, and less costly than traditional methods. This work aimed to extract the road network from RapidEye satellite imagery by developing a hybrid methodology that combines object-based image classification and morphological operators. The methodology was tested at three different sites, with images acquired on distinct dates, and the extraction process was evaluated through metrics obtained from the linear matching procedure. The proposed extraction process achieved correctness and completeness values of 92.23% and 85.15% for test site 1, 79.16% and 81.06% for test site 2, and 82.05% and 92.22% for test site 3, respectively. The results showed that the proposed methodology performed well for semi-automatic road network extraction from RapidEye images, representing an alternative for acquiring and updating auxiliary road network databases.
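Correctness and completeness, as commonly defined for linear feature matching, can be sketched in a few lines of Python. The definitions below are an assumption, since the abstract does not spell out the paper's exact procedure. The function names and sample lengths are hypothetical.

```python
def completeness(matched_reference_length, reference_length):
    """Percentage of the reference network that the extraction recovered."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return 100.0 * matched_reference_length / reference_length


def correctness(matched_extracted_length, extracted_length):
    """Percentage of the extracted network that matches the reference."""
    if extracted_length <= 0:
        raise ValueError("extracted_length must be positive")
    return 100.0 * matched_extracted_length / extracted_length


# Hypothetical lengths in kilometres, for illustration only
site_completeness = completeness(matched_reference_length=85.0, reference_length=100.0)
site_correctness = correctness(matched_extracted_length=46.0, extracted_length=50.0)
print(site_completeness, site_correctness)
```

Completeness penalizes missed roads, while correctness penalizes false detections, so the two are usually reported together.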
Article
YouthMappers is a global network of student chapters actively engaged in collaborative mapping efforts, such as OpenStreetMap mapathons. Many questions have been raised about the impact of mapathons on open map data and on the participating mappers. For example, how can the social gathering and event format encourage productivity and quality, while also contributing to community building? Because YouthMappers chapters regularly host mapathons, there are frequent opportunities to investigate the impact of mapathons. In this paper, three universities involved in the YouthMappers network, located in Europe, North America and Africa, describe how mapathons are conducted at their respective universities. Incorporating mapathons into the curriculum encourages students to contribute much-needed open geospatial data for humanitarian projects. At the same time, students get practical experience in data capturing with open source tools and awareness is raised of humanitarian challenges in other parts of the world, thus nurturing socially engaged citizens for the future. The experiences at the three universities are diverse and richly contextual to the specific character of the campus and its students. These differences underscore the challenge of a common means to formally assess the impact of such events in general. Based on this exploratory research, three themes for assessing the impact of mapathons are proposed: the volume and quality of open geographic data produced during mapathons; the social and personal growth of the students attending the mapathons; and the changes in university programs and curricula introduced as a result of the mapathons.
Preprint
Knowledge of the number of animals in large wildlife reserves is vital for park rangers in their efforts to protect endangered species. Manual animal censuses are dangerous and expensive, hence Unmanned Aerial Vehicles (UAVs) with consumer-level digital cameras are becoming a popular alternative tool to estimate livestock. Several works have been proposed that semi-automatically process UAV images to detect animals, some of which employ Convolutional Neural Networks (CNNs), a recent family of deep learning algorithms that has proved very effective for object detection in large computer vision datasets. However, the majority of works related to wildlife focus only on small datasets (typically subsets of UAV campaigns), which might be detrimental when presented with the sheer scale of real study areas for large mammal census. Methods may yield thousands of false alarms in such cases. In this paper, we study how to scale CNNs to large wildlife census tasks and present a number of recommendations for training a CNN on a large UAV dataset. We further introduce novel evaluation protocols that are tailored to censuses and model suitability for subsequent human verification of detections. Using our recommendations, we are able to train a CNN that reduces the number of false positives by an order of magnitude compared to the previous state of the art. With the requirement set at 90% recall, our CNN reduces the amount of data requiring manual verification by a factor of three, thus making it possible for rangers to screen all the acquired data efficiently and to detect almost all animals in the reserve automatically.
Article
Buildings, as impervious surfaces, are an important component of total impervious surface areas that drive urban stormwater response to intense rainfall events. Most stormwater models that use percent impervious area (PIA) are spatially lumped models and do not require precise locations of building roofs, as in other applications of building maps, but do require accurate estimates of total impervious areas within the geographic units of observation (e.g. city blocks or sub-watershed units). Two-dimensional mapping of buildings from aerial imagery requires laborious efforts from image analysts or elaborate image analysis techniques using high spatial resolution imagery. Moreover, large uncertainties exist where tall, dense vegetation obscures the structures. Analyzing LiDAR point-cloud data, however, can distinguish buildings from vegetation canopy and facilitate the mapping of buildings. This paper presents a new building extraction approach that is based on and optimized for estimating building impervious areas (BIA) for hydrologic purposes and can be used with standard GIS software to identify building roofs under tall, thick canopy. Accuracy assessment methods are presented that can optimize model performance for modeling BIA within the geographic units of observation for hydrologic applications. The Building Extraction from LiDAR Last Returns (BELLR) model, a 2.5D rule-based GIS model, uses a non-spatial, local vertical difference filter (VDF) on LiDAR point-cloud data to automatically identify and map building footprints. The model includes an absolute difference in elevation (AdE) parameter in the VDF that compares the difference between mean and modal elevations of last-returns in each cell. The BELLR model is calibrated using an extensive inner-city, highly urbanized small watershed in Columbia, South Carolina, USA that is covered by tall, thick vegetation canopy that obscures many buildings. 
The calibration of BELLR used a set of building locations compiled by photo-analysts, and validation used independent building reference data. The model is applied to two residential neighborhoods: one is a residential area within the primary watershed, and the other, used as a validation site, is a younger suburban neighborhood with a less well-developed tree canopy. Performance results indicate that the BELLR model is highly sensitive to the concavity setting in the lasboundary tool of LAStools® and that those settings are highly site specific. The model is also sensitive to cell size and the AdE threshold values. However, when the model was properly calibrated, the BIA for the two residential sites could be estimated within 1% error in optimized experiments. To examine results in a hydrologic application, the BELLR-estimated BIAs were tested using two different types of hydrologic models to compare BELLR results with results using the National Land Cover Database (NLCD) 2011 Percent Developed Imperviousness data. The BELLR BIA values provide more accurate results than the 2011 NLCD PIA data in both models. The VDF developed in this study to map buildings could be applied to LiDAR point-cloud filtering algorithms for feature extraction in machine learning or to mapping other planar surfaces in more broad-based land-cover classifications.
Article
Study region The annually flooded Barotse Floodplain in the upper Zambezi River Basin in the Western Province of Zambia, Southern Africa. Study focus Discharge variability plays a significant role in inundation extent and thus controls habitat conditions of river channels and the linked wetlands. The linkage between discharge and inundation extent in the Barotse Floodplain allowed us to analyse trends in extent over time using MODIS optical satellite imagery. The Desert Flood Index, a surface water extraction algorithm, was used to generate time series of inundation extent. To validate the inundation extent we used a flood mask extracted from a supervised classification land cover map based on Landsat imagery. The land cover map was validated using the error matrix method with ground-truthed data. The estimated inundation extent time series enabled us to test the correlation of inundation with discharge and water level using Pearson's r, a parametric statistical test. Based on the established correlation we used the Mann–Kendall test, a non-parametric test, to analyse trends in the inundation extent, discharge and water level time series, from which we made inferences on the direction of the historical trend in inundation extent. New hydrological insights for the region The results revealed observable inter-annual variability in inundation extent in the Barotse Floodplain, with prominent differences in both the flood ascending/peak and receding periods. For the period 2003–2013 the results indicated a rising trend in inundation extent with a Mann–Kendall Z statistic of 1.71 and increase in magnitude of 33.1 km² at significance level alpha of 0.05. Strong correlations between inundation extent and water level and between inundation extent and discharge with correlation coefficients of determination of 0.86 and 0.89 respectively were observed. 
For the period 2000–2011 water level time series showed a rising trend with the Mann–Kendall Z statistic of 2.97 and increase in magnitude of 0.1 m at significance level alpha of 0.05. Overall, during the period 1952–2004 discharge in the floodplain showed a declining trend with Mann–Kendall Z statistics of −2.88 and −3.38 at the inlet and outlet of the floodplain respectively. By correlation inference, the overall inundation extent trend in the floodplain was in a downward movement. Rainfall and discharge variability, high evapotranspiration and the changes in the land cover-use in the catchment of the floodplain are largely the factors affecting the observed variability and trends in inundation extent in the floodplain. The presented remote sensing based approach significantly reduces the need for the expensive and time limiting traditional physical field based wetland inundation mapping methods that form a limitation for achieving progress in wetland monitoring especially in open and sparsely gauged floodplains such as the Barotse.
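The Mann–Kendall Z statistic reported in this abstract can be computed from a time series with the standard normal approximation. The sketch below is a generic textbook formulation with a tie correction, not the authors' implementation. The function name and sample series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall_z(values):
    """Mann-Kendall Z statistic (normal approximation, corrected for ties)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts later-minus-earlier pairs: +1 for an increase, -1 for a decrease
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S, reduced by a correction term for each group of tied values
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0:
        return 0.0  # all values identical: no trend information

    # Continuity correction moves S one step toward zero
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


# Hypothetical annual series for illustration only
z_rising = mann_kendall_z([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(z_rising, 4))
```

A positive Z indicates a rising trend and a negative Z a declining one. Significance is judged by comparing Z with the standard normal critical value for the chosen alpha.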
Article
Road extraction offers great potential for research initiatives because of the complexity arising from its great topological variability. The use of remote sensing imagery to accomplish this mapping is an interesting option. Indeed, satellite images can be acquired shortly after an event and cover a large area of territory. We hope to produce a mapping of the present facilities from very high resolution images shortly after a disaster. The availability of very high spatial resolution images brings added value to the study and mapping of urban areas. However, increasing the spatial resolution generates noise, which makes extraction difficult, especially in the event of an earthquake in an urban context. This problem increases false alarm rates and generally degrades the performance of road extraction algorithms in detecting the linear features used to locate and extract roads in such images. During major disasters, short deadlines demand an effective response in terms of updating the mapping of affected areas. Our aim is to improve road extraction quality by adapting Lowe’s scale-invariant feature transform descriptors jointly with spectral angle algorithms. An illustration is performed on three high-resolution images representing, respectively, a rural, a suburban, and an urban disaster area, captured by the QuickBird satellite. Our approach significantly reduces the false detection rate and shows an increase in overall quality of up to nearly 30% in some cases compared with results reported in the literature.