Machine Learning for
Comparative Urban Planning at
Scale: An Aviation Case Study
1 San Francisco Airport before and after U-Net segmentation showing class labels.
Ahmed Meeran
Master's Student,
Singapore University of
Technology and Design
Sam Conrad Joyce
Meta Design Lab,
Singapore University of
Technology and Design
ABSTRACT
Aviation is in flux. It experienced 5.4% yearly growth over the last two decades, but was
hit hard by COVID-19, and this, along with its contribution to global warming, has led to
louder calls to limit its use. This situation puts emphasis on how urban planners and
technologists could contribute to understanding and responding to this change.
This paper explores a novel workflow for performing image-based Machine Learning (ML)
on satellite images of over 1000 world airports, which were algorithmically collated using
the European Space Agency Sentinel2 API. From these, the top 350 US airports were analysed,
with land use parameters extracted around each airport using Computer Vision and
mapped against passenger footfall numbers.
The results demonstrate a scalable approach to identifying how easy and beneficial it would
be for certain airports to expand or contract, and how this would impact the surrounding
urban environment in terms of pollution and congestion. The generic nature of this workflow
makes it possible to extend the method to any large infrastructure, and to compare and
analyse specific features across a large number of images while also tracking the same
feature through time. This is critical in answering key typology-based urban design
challenges at a higher level, without the need for on-ground studies, which can be
expensive and time consuming.
TOPIC (ACADIA team will fill in)
INTRODUCTION
Aviation has been one of the most consistently growing
industries over the last two decades, corresponding to a
15-year doubling period for passenger numbers (ICAO 2018)
and resulting in many airports being built. However, in 2020,
aviation was one of the industries worst hit by the global
pandemic (IATA 2020), and even before that it faced increasing
calls to limit its use due to its contribution to global
warming, urban congestion and noise pollution. This calls
for potential urban reform targeting the larger pieces of
infrastructure that make up the urban ecosystem.
Airport planning and infrastructure construction should
be a critical element of this rethink. The construction of
airports uses large amounts of land, often near cities, and
can be disruptive, expensive and time consuming; once in
operation, airports expose their surroundings to noise and
environmental degradation; yet the connections they enable
enrich cities, society, culture and economies. The lack of readily
available data for airport planning and design can be a
problem for governments and decision-makers, especially
when comparing airports in different countries.
GIS and satellite imagery have been among the key tools
aiding planners and designers in their workflows. Satellite
images are not just pleasing to look at (Figure 2); they
also hold large amounts of data. From explaining variation
in geographies and visualising the change of seasons to
understanding changing urban density, satellite images
are useful data for urban designers and architects.
Currently, there is no consistent and scalable method to
extract useful data from raw satellite images, to the point
where complex analyses are performed manually or, in
certain cases, compromised altogether. This is where the
potential of AI and Machine Learning could be leveraged: to
build a system that helps understand changing land
use patterns in real time and, in turn, supports higher
level planning decisions at the very early stages of design
conceptualisation.
METHODOLOGY
Data-driven urban planning has been important and
successful in recent sustainable planning and policy
decision making (Sadik-Khan 2017). This work attempts to
build on urban design tools by developing an AI supported
approach to collect, model, process and synthesize meaningful
planning data, leveraging trained image recognition
on publicly available satellite image sources. The aim is to
enable designers and stakeholders to look at the variables from
a higher level of abstraction and, in this case, to better
understand the impact of aviation on urban land use,
specifically by analysing many instances of the same typology
so that holistic policy decisions, as well as specific but
relative decisions about individual entities, might be
supported.
2 Aerial imagery over the city of Chicago, USA, captured by ESA Sentinel2,
showing the coastline and the urban sprawl.
Image Collection
We used open access satellite imagery sourced
from the Sentinel2 satellite, which was launched in 2015
for earth observation as part of the European Space
Agency’s (ESA) Copernicus mission. This satellite and others like
it might change the way we work with earth observation,
since satellite data have traditionally faced strong
accessibility barriers and are often gatekept for national
security reasons. Also, since the revisit time is just 5 days,
it is possible to closely monitor natural earth phenomena,
adding great value to areas such as geology and
meteorology.
The images are rendered at 3 different resolutions (60m,
20m, 10m); at 10m, each square pixel of a tile covers
10m x 10m in the real world. A tile in
Sentinel terminology is a single capture, usually a
100km x 100km square of a region. Sentinel2 is on a
Sun-synchronous orbit moving pole to pole, capturing the
whole of the earth’s surface under similar daylighting. The
satellite also captures other bands such as Vegetation
Index, Near Infrared, Ultra-blue Aerosol and Shortwave
Infrared, although these bands are available only at lower
resolutions (20m and 60m). Another key aspect of this
mission is that the images can be accessed not just through
the web interface (European Space Agency 2014) but
also through a Python API for Sentinel2, which
gives a high degree of control in defining the parameters for
returned images.
3 Spatial boundary showing the 10km square Area of Interest over Atlanta
International Airport (ATL), Georgia, USA.
4 The general automated process workflow to collect, compile and
analyse aerial imagery from the ESA Sentinel2 Satellite.
5 Sentinel image of Tokyo, Japan visualised across all 4 seasons.
Machine Learning for Comparative Urban Planning at Scale: Meeran, Joyce
The API also supports other keyword arguments, a few of
which are listed below:
• Datetime – A range of dates can be passed as a string
to query the database for available Sentinel tiles. The
range was adjusted by hemisphere to get the best possible
Spring-Summer image, since seasons are inverted
between the hemispheres.
• Cloudcoverpercentage – A number indicating the
preferred cloud coverage for the tile.
• Limit – The number of tiles returned in a single search.
• Contains – This ensures that the tile strictly contains the
shapefile (100% inclusion).
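The keyword arguments listed above can be sketched as a small query builder. This is a minimal sketch, not the authors' code: it assumes the keyword names of the `query` method in the Sentinelsat package (`area`, `date`, `platformname`, `cloudcoverpercentage`, `area_relation`, `limit`), and the seasonal date windows, cloud threshold and function names are hypothetical choices made for illustration.

```python
from datetime import date


def growing_season_window(latitude, year):
    """Return a (start, end) date pair covering local spring-summer.

    Assumption: April to September north of the equator and October to
    March south of it; the paper does not state the exact dates used.
    """
    if latitude >= 0:
        return date(year, 4, 1), date(year, 9, 30)
    return date(year, 10, 1), date(year + 1, 3, 31)


def build_query(footprint_wkt, latitude, year, max_cloud=10, limit=5):
    """Assemble keyword arguments for a Sentinelsat-style query call."""
    return {
        "area": footprint_wkt,                      # WKT polygon of the area of interest
        "date": growing_season_window(latitude, year),
        "platformname": "Sentinel-2",
        "cloudcoverpercentage": (0, max_cloud),     # accepted range, in percent
        "area_relation": "Contains",                # tile must fully contain the footprint
        "limit": limit,                             # maximum number of tiles returned
    }


def search_tiles(api, footprint_wkt, latitude, year):
    """Run the query; `api` is assumed to be an authenticated SentinelAPI object."""
    return api.query(**build_query(footprint_wkt, latitude, year))
```

Keeping the argument construction separate from the network call means the hemisphere correction can be checked without credentials or a connection.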
The focus here is on extracting imagery of the world’s top 1000
airports (based on passenger footfall in 2018) and exploring
key typology-based questions, specifically: airport
expandability, land encroachment, and the interaction between urban
zones and airport activity. The approach is a scalable
methodology that can be applied to many airport sites
to explore, through comparative data, the current impact of
airports on existing land usage, whether airports could expand or
contract, and how beneficial or challenging it would be to
do so. The approach and ML training are general for large
scale land use analysis but are applied here to understand
the relationship between airports and cities. For the sake
of demonstration we chose to analyse the top airports in
the USA, mainly because American airports are
approaching saturation in terms of passenger
footfall numbers, making this the right time for the
authorities to address the million dollar question:
“Is it viable for airports to expand or contract?”
One of the primary motives of this study is to treat
the US airports as datapoints and rethink, at a general macro
planning level, whether they should ‘contract’ or
be ‘removed’, based on a few key parameters such as land use,
size of the airport and passenger traffic.
To collect any kind of aerial imagery, we need the
geo-coordinates (latitude and longitude) of the
desired location. We sourced this data from ‘Aviation
Fanatic’ (Tóth 2011) and compiled it into a CSV file. This was
then cross-checked against the IATA’s own open-access dataset of
world airports to validate the Aviation Fanatic data.
A spatial boundary around each airport
was constructed (Figure 3) using Pygc, a Python spatial
projection library. A square boundary of side 10km was
chosen, keeping the terminal building at the centre, and the
shapefile produced was passed to Sentinelsat, the Python
API, to query the satellite’s database for tiles matching the
spatial boundary.
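The boundary construction can be illustrated with a short, self-contained sketch. It does not use Pygc; instead it swaps in the standard spherical destination-point formula from the Python standard library, so it is an approximation under a spherical-earth assumption (and it ignores the poles and the antimeridian). The function names and the coordinates in the usage note are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius; spherical approximation


def destination(lat, lon, bearing_deg, distance_km):
    """Point reached from (lat, lon) along a bearing on a sphere."""
    lat1, lon1 = math.radians(lat), math.radians(lon)
    bearing = math.radians(bearing_deg)
    d = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)


def square_boundary_wkt(lat, lon, side_km=10.0):
    """WKT polygon for a square of side `side_km` centred on (lat, lon)."""
    half = side_km / 2.0
    north = destination(lat, lon, 0, half)[0]
    south = destination(lat, lon, 180, half)[0]
    east = destination(lat, lon, 90, half)[1]
    west = destination(lat, lon, 270, half)[1]
    # WKT uses "longitude latitude" order and a closed ring.
    ring = [(west, south), (east, south), (east, north),
            (west, north), (west, south)]
    coords = ", ".join(f"{x:.6f} {y:.6f}" for x, y in ring)
    return f"POLYGON(({coords}))"
```

For example, `square_boundary_wkt(33.64, -84.43)` gives a 10km square around an approximate location for ATL; the resulting WKT string is the kind of footprint a tile search would accept.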
The API returned a list of “Products” (each a 100km x 100km tile),
each of which has a string-based identifier and contains
other metadata, including its date of capture and orbit
number. The list of Products was then sorted to find the
least clouded tile, which was downloaded using
its unique identifier, and the corresponding 10m true
colour tile was extracted computationally from the file
structure. This was performed in series and in batches to
preserve the repeatability of the process, since the volume
of data involved was in the order of hundreds of gigabytes.
Batching also acts as a failsafe mechanism: in the
event of a program crash, the failure can be traced to the
nearest batch number, saving time. The tiles were then
converted to world projection EPSG:4326 (WGS 84) using the
GDAL library (Frank, Even and others 1998). This was a critical
step, as Sentinel collects imagery across the globe and each
country uses a different EPSG code which best represents
its aerial map. For scalability reasons, all the satellite
images collected were transformed to the Web Mercator
standard for visualisation later.
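The tile selection and batching logic described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the search result is assumed to map product identifiers to metadata dictionaries with a `cloudcoverpercentage` key, and the checkpoint file format, batch size and function names are hypothetical.

```python
import json
import os


def least_clouded(products):
    """Return the identifier of the product with the lowest cloud cover."""
    return min(products, key=lambda pid: products[pid]["cloudcoverpercentage"])


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(airports, process, checkpoint_path, batch_size=25):
    """Process airports batch by batch, resuming after the last finished batch."""
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)["batches_done"]
    for index, batch in enumerate(batches(airports, batch_size)):
        if index < done:
            continue  # this batch was completed in an earlier run
        for airport in batch:
            process(airport)
        # Record progress only once the whole batch has succeeded.
        with open(checkpoint_path, "w") as fh:
            json.dump({"batches_done": index + 1}, fh)
```

Writing the checkpoint after each completed batch means a crash costs at most one batch of repeated work, which is the failsafe behaviour the text describes.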
The Pygc spatial boundary constructed earlier was used to
clip the Area of Interest, the 10km by 10km square around
the terminal and runways, out of the larger projected tile.
The clipped areas were then stored to file as RGB images, and
their metadata (date, time, cloud cover and satellite order)
was retained for further analysis.
Image Recognition
The next step was to devise a methodology to understand
the land use features in the airport images. For this we
defined 4 unique classes of land use (vegetated land, arid
land, built-up land and water), which our ML model was
trained to predict across our entire dataset.
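As a minimal sketch of how four such classes might be represented for a segmentation model, the snippet below maps label colours to class indices and then to one-hot vectors. The palette is hypothetical: the text only associates red with built-up land and yellow with arid land, so the green and blue values, the class order and the function name are assumptions made for illustration.

```python
# Hypothetical label palette; only red (built-up) and yellow (arid)
# are named in the text, the other colours are assumed.
PALETTE = {
    (0, 128, 0): 0,    # vegetated land
    (255, 255, 0): 1,  # arid land
    (255, 0, 0): 2,    # built-up land
    (0, 0, 255): 3,    # water
}
NUM_CLASSES = len(PALETTE)


def one_hot_mask(rgb_mask):
    """Convert rows of RGB label pixels into per-pixel one-hot vectors.

    Pixels whose colour is not in the palette (for example black,
    unlabelled areas) become an all-zero vector.
    """
    encoded = []
    for row in rgb_mask:
        out_row = []
        for pixel in row:
            vector = [0] * NUM_CLASSES
            class_index = PALETTE.get(tuple(pixel))
            if class_index is not None:
                vector[class_index] = 1
            out_row.append(vector)
        encoded.append(out_row)
    return encoded
```

Giving each class its own channel keeps the categories unordered, so the model is not led to treat, say, water as numerically "further" from vegetation than arid land.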
We used the U-Net Architecture (Ronneberger, Fischer
and Brox 2015) for feature extraction from our images
for a number of reasons. Firstly, it is fully convolutional and
symmetric in nature, which ensures that the resolution of
both the images and the features is preserved. Secondly,
U-Net is known for its high convergence rate with small
training datasets, reducing human labelling time.
Traditionally, dataset sizes in Machine Learning practice
are in the order of 10^4, but in our case the dataset is much
smaller and U-Net was able to work with it. Thirdly,
its supervised nature of learning enabled us to manually
provide area labels in the training dataset, which we want
the model to predict over the whole set. U-Net has already
been used effectively in geology (Karchevskiy, Ashrapov and
Kozinkin 2018) and diagnostic radiology (Dong, et al. 2017).
Initially, twenty-five airport images were selected by hand
to approximate the entire dataset in terms of diversity
in level of urbanisation, cloudiness, proximity to the sea and
other natural features. These images were then manually
segmented in Photoshop, using the four classes as a key,
to produce the ground truth labels. The black pixels in the
left-most panel of Figure 7 are features which did not fall into the
four basic classes we defined initially; for future
versions of this work we recommend fine-tuning the manual
labelling process for better prediction accuracy.
6 Schematic visual of the U-Net Architecture.
The label images were stored along with their original RGB
variants, and the pairs were fed together into the U-Net. Initially we
found that the U-Net was not able to predict images that
had higher cloud coverage, sometimes classifying
clouds as desert (arid). Hence a second round of sampling
was carried out with twenty-five more images (fifty in total)
to include wider cases such as cloud-ridden and
semi-arid scenes. The masks were converted into a one-hot
encoding to preserve the land use classes for more
accurate prediction. The model was then trained on an
Nvidia GTX 1060 GPU and converged in 4 hours with an
accuracy of about 70% at 300 epochs. It is notable that
U-Net provided reasonably accurate predictions with just
fifty training samples, which further strengthened our
confidence in using this architecture.
The prediction masks were then stored to file using a
naming convention similar to that of the RGB airport
images. At this point, it is worth noting that the method
can be scaled up as well as adapted, owing to its
open source nature. The flexibility of
this workflow could allow us to perform longitudinal
analyses on a single location over time to monitor phenomena
such as land use change, area lost to forest fire,
land reclamation or construction progress. At
the same time, cross-sectional or comparative analyses
can be performed across the entire dataset based on
one or more parameters to generate meaningful insights
pertaining to urban planning and design.
7 (From left) Ground truth mask showing multiple urban land uses, the RGB Sentinel2 image of O’Hare Airport (ORD), Chicago used for identification, and
land use predictions from the trained U-Net.
8 RGB images visualised along with their U-Net multiclass predictions for 150 of the top 350 airports in the USA.
ANALYSIS
Once the predictions of the 4 land use classes had been
extracted from the U-Net model, we proceeded with per-pixel
analytics based on the ground truth parameters to extract
analytical metrics from the images. A visual representation
(Figure 9) shows the geographic context as well as the
number of airports per state. This helped us understand
which states are potentially more vulnerable in terms of
the impact on the surrounding urban fabric in the
form of passenger traffic congestion and pollution.
To understand the level of urbanisation around the airports,
it is crucial to first understand the land use distribution
derived from the U-Net. A pie chart showing the land
use composition was plotted at each airport
location on a map (Figure 11), showing the share of each
land use class against the total available land area. The
data shows that airports located in coastal regions
had a fair share of water represented in the plot,
whereas the share of arid (yellow) land began to rise in
the dry Mid-Western region. This suggests that the
model provided reasonably accurate results,
especially in level of detail, considering the size of the
dataset involved in training our model and the resolution of
the images supplied.
For this relatively simple study we defined two key
variables which enable us to compare the ground reality
with the results predicted by the ML system. The ground reality
is given by the passenger footfall numbers (2018 data), and
the urbanisation ratio is given by taking the total of the ‘red’
pixels from the image predictions, which correspond to
the built-up pixels in the true colour RGB images. Both
variables were normalised by dividing each datapoint
by the maximum value found across the entire dataset.
The two variables were then plotted against each other
(Figure 12) to understand whether there is a correlation between
passenger footfall and urbanisation in general, but
primarily to explore which airports are near saturation in
terms of urbanisation, passenger traffic, or both.
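The two normalised variables can be illustrated with a small sketch. The class index for built-up pixels, the toy masks, the airport codes and the passenger counts below are all hypothetical values chosen for illustration; they are not data from the study.

```python
BUILT_UP = 2  # hypothetical class index standing in for the 'red' class


def built_up_pixels(class_mask, built_up=BUILT_UP):
    """Count the pixels of a predicted class mask labelled as built-up."""
    return sum(1 for row in class_mask for label in row if label == built_up)


def normalise_by_max(values):
    """Divide each value by the largest value found in the dataset."""
    peak = max(values)
    if peak == 0:
        return [0.0 for _ in values]
    return [value / peak for value in values]


# Toy 2x2 predicted masks and passenger counts (illustrative only).
masks = {"AAA": [[2, 2], [0, 3]], "BBB": [[2, 0], [1, 3]]}
footfall = {"AAA": 40_000_000, "BBB": 10_000_000}

codes = sorted(masks)
urbanisation = normalise_by_max([built_up_pixels(masks[c]) for c in codes])
traffic = normalise_by_max([footfall[c] for c in codes])

# One (traffic, urbanisation) point per airport, both on a 0 to 1 scale.
points = dict(zip(codes, zip(traffic, urbanisation)))
```

Dividing by the dataset maximum puts both axes on a common 0 to 1 scale, so the airport with the highest value on each axis sits at 1.0 and every other airport is expressed relative to it.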
RESULTS & DISCUSSION
9 Distribution density of the US airports by state visualised at
their approximate respective locations.
10 Airport Passenger Traffic in the US Airports in 2018.
One of the main motives of this research was to find out whether
airports have been impacting the surrounding urban fabric
and how this can be quantified to an extent where higher level
urban planning decisions can be taken based on analytics.
Based on the scatter plot (Figure 12), the initial and more
general trend identified was that more congested airports
tend to cause a greater negative impact on the surrounding
urban context. This was further confirmed by the leading
diagonal of the scatter plot. There were also cases where
airports were relatively less crowded but comparatively
more urbanised, as well as airports that were
semi-urban or arid in nature but highly crowded. To
account for these cases, we devised an action strategy with
4 unique categories based on the current
congestion-urbanisation ratio of each airport. These categories are:
• Safe to Expand – Airports that are congested but
semi-urban in nature
• Remove – Airports that are relatively less congested
but highly urbanised
• Monitor – Airports that are both more congested and
more urbanised, requiring possible attention in the
future
• Leave – Low impact airports which do not require
any intervention now
11 Land use distribution around the airport based on the four different
class labels.
12 Level of urbanisation plotted against passenger footfall
numbers with each airport represented by its corresponding aerial
image.
These categories were determined using the scatterplot
(Figure 13) as an index, by assigning cut-off values for
each axis. This is a basic measure using only two
factors, and in both cases the cut-off value is chosen
arbitrarily. The categories could be reinforced by adding more
variables, such as consumer travel patterns and the
domestic-international share of flights, to offer a more precise
output, and by linking the variables to actual observed
airports which are borderline over or under acceptable limits
as defined by government or the local population. However, we
reserve this for a future version of this work.
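The four-way action strategy can be written as a simple threshold rule on the two normalised axes. This is a sketch under stated assumptions: the cut-off values of 0.5 are placeholders (the text notes that the actual cut-offs were chosen arbitrarily and does not give them), and the function name is hypothetical.

```python
def action_category(footfall, urbanisation, footfall_cut=0.5, urban_cut=0.5):
    """Assign an airport to one of the four action categories.

    Both inputs are normalised to the 0 to 1 range; the default
    cut-offs are placeholder values, not those used in the study.
    """
    congested = footfall >= footfall_cut
    urbanised = urbanisation >= urban_cut
    if congested and urbanised:
        return "Monitor"         # busy airport in a dense urban setting
    if congested:
        return "Safe to Expand"  # busy airport with semi-urban surroundings
    if urbanised:
        return "Remove"          # quiet airport in a dense urban setting
    return "Leave"               # low impact on both axes
```

For instance, `action_category(0.8, 0.2)` falls in the "Safe to Expand" quadrant, while `action_category(0.2, 0.8)` falls in "Remove". Exposing the cut-offs as parameters makes it easy to test how sensitive the classification is to those choices.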
The results obtained from the scatterplot are interesting
in and of themselves, although the findings are indicative
and should not be applied on the basis of this study alone.
As expected, we found at least 10 airports which had been causing
serious impact to their surroundings, as highlighted in red.
The majority of the airports fell under the “Monitor” category,
which could be justified considering the general rising
trend in air passenger traffic numbers in the
US combined with the optimism around building
and expanding cities. These airports need to be closely
monitored, and we feel that governments and planning
authorities should consider this band of airports more
seriously and plan for them with greater emphasis on their
sustainability. The green band should
be treated carefully to avoid misinterpretation: though these
airports might be safe to expand, the feasibility and sustainability
of doing so must be given top priority. The rest of the airports
currently have no pressure on them in terms of passenger
numbers or urban congestion. Nonetheless, the
results obtained here must be treated cautiously in general,
since the number of parameters used is relatively small
and the study is therefore simplistic in nature.
13 Urban congestion vs airport passenger traffic with axis cut-off values for
action strategy.
14 Macro level planning recommendation for US Airports.
CONCLUSION
Technology driven urban planning could be the key towards
the greater goal of rethinking individual and network
planning in the aviation industry. In this study we have explored
the current situation of US airports in terms of two key
variables, one of which was the ground truth (passenger
traffic data) while the other was predicted by the ML
system, which was trained to identify the different types
of land use from satellite images. This approach has
shown that it is computationally possible to longitudinally analyse
many airports for key high-level parameters based on ML
summaries of complex raw aerial data. The open source
nature of the methodology makes it highly scalable and
repeatable to suit varied requirements, with the major
beneficiaries being country and state governments, especially
those that are developing rapidly without up-to-date mapping. The
ability to extract complex and critical information related
to land use patterns and to perform per-pixel analytics from
a simple aerial image is remarkable, and we feel that this
could potentially change the way the aviation industry
foresees its challenging adaptation path ahead. This could
mean a lot for urban planners, policy makers and
environmentalists, especially with the growing concern around the
ever expansive nature of aviation and urban infrastructure,
with direct impacts on the living population through noise
pollution and urban congestion, and on nature through shifting
land-use patterns.
We strongly feel that our methodology provides an
interface to compare and analyse the different parameters
contributing to such impacts at a higher level,
at the early stages of design conceptualisation. The
generality of this approach makes it applicable
to similar large infrastructure of concern, such
as transport hubs and seaports, to name a few.
Investigations in this novel domain are ongoing, especially
in design space exploration through generative methods and the use
of Meta Parametric models (Ibrahim and Joyce 2019) at
the building level, which we feel, combined with this study, offer
a remarkable means of exploring alternative design options at the
early stage of architectural design through analytics at the
building, site and urban levels.
LIMITATIONS & FURTHER WORK
Given that this study was predominantly focused on airport
expansion purely in terms of land availability around
the terminal building, we are aware that variables such
as noise pollution, cost and traveller preferences were
not explicitly taken into consideration, and we leave this
for future work. As researchers, we are also aware of
ML taking the driver’s seat in streamlining
processes across multiple industries, with new
applications being discovered each day. AI in design has
immense potential to reinforce subjective human decision
making, which is crucial and a norm in the field of design,
and we feel this study is one of the first steps towards it by
introducing ML to the understanding of land-use patterns using
satellite imagery. Even though this work may not offer a
detailed and holistic understanding of airports and the
aviation industry, the ability to leverage Artificial Intelligence in
Urban Planning and Design workflows is itself a milestone
which shouldn’t be regarded lightly.
REFERENCES
[Conference Paper] Dong, Hao, Guang Yang, Fangde Liu, Yuanhan Mo, and Yike Guo. 2017. “Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks.” Annual Conference on Medical Image Understanding and Analysis. Edinburgh, UK. 506-517.
[Web API] European Space Agency. 2014. API Hub. https://scihub.copernicus.eu/twiki/do/view/SciHubWebPortal/APIHubDescription.
[Web API] Frank, Warmerdam, Rouault Even, and others. 1998. GDAL. https://gdal.org/index.html.
[Press Report] IATA. 2020. Pressroom. 24 April. https://www.iata.org/en/pressroom/pr/2020-04-24-01/.
[Conference Paper] Ibrahim, Nazim, and Sam Joyce. 2019. “User Directed Meta Parametric Design for Option Exploration.” Association for Computer Aided Design in Architecture. Austin: ACADIA.
[Report] ICAO. 2018. Long-Term Traffic Forecasts Passenger and Cargo. International Civil Aviation Organization.
[Journal Article] Karchevskiy, Mikhail, Insaf Ashrapov, and Leonid Kozinkin. 2018. “Automatic salt deposits segmentation: A deep learning approach.” ArXiv.
[Report] OECD. 2008. The Impacts of Globalisation on International Air Transport Activity. Guadalajara: OECD.
[Journal Article] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. 2015. “U-Net: Convolutional Networks for Biomedical Image Segmentation.” LNCS. 9351. 234-241.
[Book] Sadik-Khan, Janette. 2017. Streetfight: Handbook for an Urban Revolution. Penguin Books.
[Website] Tóth, Bálint. 2011. Aviation Fanatic. www.aviationfanatic.com.
[Report] World Bank. 2020. Urban Development Overview. 20 April. https://www.worldbank.org/en/topic/urbandevelopment/overview.
IMAGE CREDITS
All images and graphics are produced by the authors.
Ahmed Meeran is a Master's student at the Singapore University
of Technology and Design, pursuing his degree in 'Engineering
Innovation by Design'. He earned his bachelor's degree in
Architectural Technology (First class Honours) from the Indian
Institute of Technology Kharagpur with a minor specialisation in
MS. Economics (First class Honours). Ahmed aims to use
technology to streamline mundane processes in design workflows and
is currently exploring the use of Artificial Intelligence systems and
Generative frameworks for early stage design option
exploration. He aims to broaden his knowledge in leveraging large scale
systems to collect, process and analyse Big Data to take meaningful
policy decisions that impact people in their everyday lives. Ahmed
also works on data visualization with a special emphasis on urban
geography and morphology to understand our cities and built
environment in a useful way.
Sam Conrad Joyce is assistant professor at the Singapore
University of Technology and Design, in the Architecture and
Sustainable Design pillar. He explores possibilities at the
intersection of technology driven research and design practice, having
previously worked at Foster + Partners on projects such as the Mexico
City Airport. He heads up The Meta Design Lab, an interdisciplinary
group conceiving, developing, and testing future
architectural capabilities, specifically how A.I. and Big Data can find
design insight and generate novel solutions, with the ultimate goal
that humans and computers might work together as collectively
superior co-creators.