Limits to Applied ML in Planning and Architecture
Understanding and defining extents and capabilities
Sam Conrad Joyce1, Ibrahim Nazim2
1,2Singapore University of Technology and Design
1,2{sam_joyce|nazim_ibrahim}@sutd.edu.sg
There has been an exponential increase in Machine Learning (ML) research in design. Specifically, with Deep Learning becoming more accessible, frameworks like Generative Adversarial Networks (GANs), which are able to synthesise novel images, are being used in the classification and generation of designs in architecture. While many of these explorations successfully demonstrate the 'magic' and potential of these techniques, their limits remain unclear, with only a few, but crucial, discussions on the underlying fundamental limits and sensitivities of ML. This is a gap in our understanding of these tools, especially within the complex context of planning and architecture. This paper seeks to discuss what limits ML in design as it exists today, by examining the state of the art and the mechanics of ML models relevant to design tasks, aiming to help researchers focus on productive uses of ML and avoid areas of over-promise.
Keywords: Machine Learning, Artificial Intelligence, Creativity
INTRODUCTION
At the current epoch, it seems an undeniable fact that
AI will have a significant impact on human activity.
Whilst the resultant effect has been hypothesised to
range from the ubiquitous and benign (Jack Ma) to
the potentially catastrophic (Elon Musk, Bill Gates),
there are clear indicators that this technology will impact many industries and ways of life (Frey and Osborne 2017). Whilst this report indicates that creative design-related roles are some of the least likely to succumb to automation, discussions around current practice paint a different picture. Thom Mayne, head of Morphosis Architects, said that “I can have an office instead of 80 people there'll be eight of us… I'm producing 100 variations more [with AI] than I could do with 10 times the amount of people” [10], and
“There are going to be less and less people needed
for the pure technical mechanical stuff” [2], with more
and more published work showing the breadth and
depth of applied AI already infiltrating the practice.
However, technology is not omnipotent. ML
techniques, which have progressed the field of AI
the most in the last decade, are reliant on a set of
fundamental concepts and paradigms and are inher-
ently limited by them. Deep Learning algorithms,
which have enabled GAN and style transfer (New-
ton 2019), simultaneously defines certain fundamen-
tal limits of the system as well (Heaven 2019, Olenik
2019), much like Turing’s famous NP-completeness
defines the limits of computing upon which ML is de-
pendent.
The aim of this paper is to trace this limit, which
in some places is concrete and well defined by the
mechanics of ML and in other places is ambiguous
and linked to human issues of perception, quality, and experience in architecture. Drawing this
boundary is inherently limited to human perception
and prediction based on current ML techniques and
whilst it extrapolates and hypothesises about future
capabilities, it cannot predict revolutionary new ap-
proaches or hardware. However, drawing such a boundary is intended to be useful so that designers may know where they can rely on AI for support, where design might be beyond its reach, and where humans and AI might find an appropriate, comfortable, and productive mix of capabilities.
During this consideration of limits, we start with the clearer aspects of design: level of detail and scale, representation, input data, and resolved-ness, and work up to more ephemeral considerations such as creativity, autonomy, and design quality. We consider these aspects in relation to their perceived potential to change architectural design and ask what limitations are present due to processing power, algorithmic capability, and human/design issues, paying particular attention to design representation/encoding and the type of interaction with the user.
SCALE LIMITS
Architectural design works across multiple scales and
levels of detail, from whole urban blocks to interiors
and construction details. Within a specific scale, de-
sign can be represented at various levels of detail,
typically increasing as the design gets closer to real-
isation. A floor plan can be represented as simple as
a layout diagram or can be detailed to include furni-
ture and finishes. Similarly, a 3D model of a building
might be as simple as a massing model or as detailed as a Building Information Model (BIM) (Figure 1).
Currently, deep learning models mostly use
array-based tensors which work well with 2D data, so
it has become a standard to use images such as floor
plans to train and generate architectural representa-
tions (Figure 2). At this basic level, we encounter our first issue: if we want to represent a wider area we need a bigger picture, and if we want more detail we need more pixels. Currently, much of ML is done on images smaller than 300×300 pixels to reduce memory use and increase processing speed. Algorithms like StyleGAN (Karras et al. 2019) use 'progressive growing' to allow for improvement in details, but these are not globally aware and instead are simply experts in enhancing local detail without considering global effects. This can result in issues such as asymmetric representations, for example, earrings or eye shapes on the two sides not matching in generated images of faces (Tolosana et al. 2020).
Figure 1
3D models of different levels of detail (LOD) (source: simplebim.com)
Figure 2
Floor plans
generated from
labelled images
(Huang and Zheng
2018)
However, problems are exponential when extending
these capabilities to 3D (Ahmed et al. 2019). Cur-
rent deep learning techniques all require 3D data to
be translated into tensor/array-based data structures,
with voxel-based encoding being the most widely
used. These are the 3D equivalents of a pixel, repre-
senting a point over a regular 3D grid, encoded as a
binary value (0 or 1) for material or not. While possi-
ble, accurately representing architectural and build-
ing geometry comes at a high resource expense due
to scale. As Galileo understood, if you increase the
linear dimension you increase the volume cubically.
Thus for an image, if you go from 100px to 500px on each side you go from a total of 10,000px to 250,000px, 25 times larger, whereas for a volume you go from 1 million voxels, already a lot, to 125 million, 125 times larger. As it stands currently, we are only able to rep-
resent crude geometric models and are not able to in-
tegrate data-rich models such as BIM. Thus, with this representation, even with compression, storage and processing become non-trivial. A small residential site of 20x20x10m would require roughly 4 billion voxels even with a coarse voxel size of 1cm, and approaching 4 trillion at 1mm.
Arguably this issue can be solved by higher computing power and intelligent uses thereof; however, compute power according to Moore's law doubles roughly every two years (Moore 1998). Thus, assuming current state-of-the-art models operate on 1000px-square images, we would still need around 24 years of doubling to reach centimetre resolution on even the small residential site discussed, and over 40 years to reach millimetre resolution.
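To make the scaling argument concrete, the following is a minimal back-of-envelope sketch in Python, using the 1000×1000-pixel baseline, the 20×20×10 m site, and the two-year doubling period assumed above:

```python
# Back-of-envelope for the scaling argument above (assumed figures: a 1000x1000
# pixel image as the current model baseline, a 20x20x10 m site, one Moore's-law
# doubling every two years).
import math

def voxel_count(x_m, y_m, z_m, cell_m):
    """Cells in a regular grid covering a box at the given resolution."""
    return round(x_m / cell_m) * round(y_m / cell_m) * round(z_m / cell_m)

baseline = 1000 * 1000                     # ~1e6 cells in a large training image

for cell in (0.01, 0.001):                 # 1 cm and 1 mm voxels
    cells = voxel_count(20, 20, 10, cell)
    doublings = math.log2(cells / baseline)
    print(f"{cell*100:g} cm voxels: {cells:.1e} cells, "
          f"~{doublings:.0f} doublings (~{2*doublings:.0f} years)")
# -> 1 cm voxels: 4.0e+09 cells, ~12 doublings (~24 years)
# -> 0.1 cm voxels: 4.0e+12 cells, ~22 doublings (~44 years)
```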
REPRESENTATIONAL LIMITS
Additionally, unlike pixels in images, 3D data has
no direct representation or consistent universal format, and therefore exists in many forms: point clouds, polygons, meshes, B-splines, constructive solid geometry, etc., as well as combined formats such as DXF, Revit and IFC models (Figure 3). All of these are
effectively object lists or hierarchical-tree data struc-
tures, devised as they are more compact and aligned
with how buildings have been represented (drawing
lines) or constructed (floors, walls). However, these
structures do not readily capture important ML fea-
tures such as neighbouring relationships. Just as hu-
mans need software to process 3D models into im-
ages to comprehend and compare the actual design,
existing ML would need this converted into tensors
or similar in order to understand it. In terms of ML image recognition and generation, this essentially has to do with correlating pieces of data that are spatially, and thus logically, close together. However, in the case of relational/hierarchical data structures this is non-trivial, with few current ML algorithms developed to operate on them (Nauata 2020). This is exacerbated by the fact that many other application areas are satisfied with tensor/image/voxel type representations, such as video tracking, image recognition, oncology and fintech.
However, there are encouraging developments
in recent advanced models such as GPT-3 [7] and
Image-GPT [1] which are trained on text and tree-
based data structures like websites (Brown et al. 2020). Whilst these are not practical for one individual to train fully, they have shown commendable capability to operate at higher levels of abstraction and structured representation, such as making whole websites [16] or generating images based on text inputs (Ramesh et al. 2021). Equally, we are seeing some progress in turning the focus away from tensor/array/grid type representations: Nauata et al. (2020) demonstrate a graph-centric approach, building on a form of representation that has been used since the dawn of computational representations of space, both interior and exterior (Hillier 1996).
Figure 3
Different
representations of
3D forms: Mesh,
Point cloud
(Fujimura 2018),
Voxels [19] and
Hamiltonian cycles
(Soler, Retsin and
Garcia 2017)
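As an illustration of this graph-centric alternative, the sketch below encodes a small, invented apartment as the kind of room-type and adjacency graph that approaches such as House-GAN (Nauata et al. 2020) take as a constraint; the room categories and adjacencies are placeholders for illustration, not drawn from that work:

```python
# Illustrative only: a small apartment encoded as a room-adjacency graph of the
# kind graph-constrained generators consume. Room categories and adjacency
# pairs are invented for the example.
ROOM_TYPES = ["living", "bedroom", "kitchen", "bathroom", "balcony"]

rooms = ["living", "bedroom", "kitchen", "bathroom"]     # nodes
adjacency = [(0, 1), (0, 2), (0, 3)]                      # edges: living connects to the rest

# One-hot node features plus an edge list is already a tensor-friendly encoding
# that preserves the neighbouring relationships lost in plain object lists.
node_features = [[1 if t == r else 0 for t in ROOM_TYPES] for r in rooms]
edge_index = [[a for a, b in adjacency] + [b for a, b in adjacency],
              [b for a, b in adjacency] + [a for a, b in adjacency]]  # undirected

print(node_features)   # 4 x 5 one-hot matrix of room types
print(edge_index)      # [[0, 0, 0, 1, 2, 3], [1, 2, 3, 0, 0, 0]]
```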
LEVELS OF INPUT DATA
Most developments in ML research in recent years have been model-centric, focussing primarily on making better models and algorithms trained on common public datasets [9]. This has its benefits: with many researchers using a small number of known datasets, issues and improvements in algorithms can be easily benchmarked and shared (Phillips et al. 2000), helping the field progress faster. However, these are typically well-curated and large datasets, usually consisting of ten thousand to ten million images. Whilst this expands the potential and accuracy of models, it also exerts a wider pressure on researchers to devise algorithms that work best for problems with large amounts of data.
In architecture, the current ability to build
equally large data sets represents a major challenge.
There are far fewer buildings, or even photos of them, than there are photos of people's faces. For some typologies, such as opera houses, there may be fewer than 1,000 in the whole world, and many are unlikely to have a CAD representation. Even for general buildings with a digital model representation, there would be different layers and representations based on when and by whom they were made. The first 3D Revit model was only made in 2000. Thus, training sets are smaller, and a lot more varied, than those typically used.
An alternative approach might be to generate
data based on mapping and sensors. Open Street
Map and Microsoft ML have been broadly effective in
getting basic 2D building outlines [18]. Other sources
like LIDAR and photogrammetry may be able to get
3D shapes of buildings, but capturing the inside and
floor plans, let alone construction details of buildings, at the data scales required to train quality Deep Learning models, is a much larger challenge. Even if
possible, this kind of training will only enable a model
to learn general configurations and thus categorise
and generate ‘typical buildings’, of which there are
many. While some problems in architecture can be considered general problems, this excludes much of the work that architects do on exceptional buildings.
However, there are new algorithms specifically aimed at operating on small(er) datasets (Ronneberger et al. 2015, Koch et al. 2015). This trend is likely to continue, as even in conventional fields large datasets are proving controversial due to privacy concerns [12], so there is pressure to work with less data; but it is unclear how well such methods generalise and operate with smaller data [13]. One encouraging approach is ML that is first trained generally on a wide body of data and then trained again on a specific problem. This increases complexity and again moves the skill away from the individual model, but this is how GPT-3 functions and it has been shown to be very effective
(Brown et al. 2020). Arguably this is also how ar-
chitects are trained, by looking at many classes and
ranges of buildings, but when working on a specific
building they focus on a specific location, scale, and
typology.
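A minimal sketch of this general-then-specific training pattern is shown below, using a publicly available ImageNet-pretrained backbone from torchvision and a hypothetical small set of labelled floor-plan images; the class count, learning rate and dataset are assumptions for illustration only:

```python
# Sketch of "train generally, then retrain on the specific problem": an
# ImageNet-pretrained backbone fine-tuned on a small, hypothetical set of
# labelled floor-plan images (requires a recent torchvision).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5                                   # e.g. five plan typologies (assumed)

model = models.resnet18(weights="IMAGENET1K_V1")  # general visual features, learned elsewhere
for param in model.parameters():
    param.requires_grad = False                   # freeze the general knowledge
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new head for the small dataset

optimiser = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One update on a small batch of domain-specific examples."""
    optimiser.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimiser.step()
    return loss.item()
```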
LEVEL OF RESOLVED-NESS
Architecture is an iterative process that starts from
conceptual design ideas and explorations and devel-
ops to a more detailed and refined design. At every
stage, the designer seeks to resolve functional, practi-
cal as well as aesthetic issues concerned at that stage
so that the design reaches a level of resolved-ness be-
fore it moves on to the next. This process requires
a general understanding of design and functionality
as well as specific requirements of the given context
and a mechanism for dynamic modification and re-
finement.
In image-based learning, such dynamic refine-
ment has been used to generate realistic-looking
photographs of humans [14]. As explained earlier, some of the success there is based on reprocessing low-resolution generated images to 'shore them up' and make them more rational. However, again, this is a very localised process that does not understand the real purpose or context.
Figure 4
Example GAN
image output from
thisrentaldoesnotexist.com [6]
If we consider a similar output from an architectural example [6] using StyleGAN, but less well refined, we see in Figure 4 fundamental issues such as non-functional chairs, chairs that clip through tables, kitchen cupboards with no depth, very narrow beds, or simply nonsensical wall arrangements. These are a product of the ML simply reproducing an image in the style of its inputs without knowing how the space is supposed to function. This is a critical issue in architecture for two reasons. Firstly, and ultimately linked to Plato's Allegory of the Cave (Plato 2000), the image of the room is all that the ML sees and knows, so the room is not inherently a 3D space to the ML; any image it reproduces carries no understanding of what is physically possible in 3D, as the ML is simply trying to make a passable image in 2D. Secondly, whilst more data and better training would reduce such artefacts, we are still left with the critical issue that the ML cannot comprehend that the design/output
needs to be functional. Ultimately ML is an expert at imitation, but this is no guarantee of functionality. Unlike approaches such as Evolutionary Optimisation (Machairas et al. 2014), most ML techniques have no explicit measure of a generated solution's performance. They are concerned not with the solution being right or wrong but with it appearing right or wrong.
This might be solved in one of two ways. The first is to reinforce functional solutions during generation, or alternatively to provide a layer that selects only those outputs identified as functional; this would require explicit analysis to realise. The second is for the quality of the imitations to be sufficiently good that these errors are minimised in the first place. The latter pre-supposes not only enough data to train on, but also enough angles/sections/plans, or full 3D data, so that the functional needs would be reproduced. Even then the output would be confined to the technology of the day: as Figure 5 shows, for example, a plan trained on just Baroque buildings results in a structure that is less space-efficient because it requires more supporting material. If we instead wanted to use real performance metrics, we would require advanced, potentially 3D, representations so that the training results can be analysed.
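The first strategy, a selection layer that admits only outputs passing explicit functional checks, might look something like the following sketch; the layout format and thresholds are invented for illustration, and real checks would come from proper analysis tools:

```python
# Minimal example of filtering generated layouts with explicit functional rules.
# The layout format and rule values are invented for illustration.
MIN_ROOM_AREA = 6.0      # m^2, assumed minimum habitable room size
MIN_DOOR_COUNT = 1

def is_functional(layout):
    """layout: dict with 'rooms' -> list of {'area': float, 'doors': int}."""
    return all(room["area"] >= MIN_ROOM_AREA and room["doors"] >= MIN_DOOR_COUNT
               for room in layout["rooms"])

def filter_functional(generated_layouts):
    """Discard ML outputs that merely *look* plausible but fail explicit checks."""
    return [layout for layout in generated_layouts if is_functional(layout)]

candidates = [
    {"rooms": [{"area": 12.0, "doors": 1}, {"area": 9.5, "doors": 1}]},   # passes
    {"rooms": [{"area": 12.0, "doors": 1}, {"area": 2.0, "doors": 0}]},   # fails
]
print(len(filter_functional(candidates)))   # -> 1
```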
Beyond the details of this discussion, there is a further consideration about how functional the output of such systems should be. In fact, early architectural sketches, and indeed the whole conceptual design process, may be antithetical to direct functional needs: posing designs (initially) unhindered by functional needs provides more expressive and creative freedom, and also perhaps challenges other members of the team (structural engineers, etc.) to realise things that have hitherto not been done. This leads into the next aspect, creativity.
Figure 5
Sample floor plan
as interpreted in
the Baroque style
(Chaillou 2020)
LEVEL OF CREATIVITY
As discussed in the section above, Deep Learning models produce novel results, and this is intrinsically linked to human ideas of creativity. However, within a computational framework, there are degrees of creativity. Being a subjective quality, there is no universal measure; paradoxically, the easier it is to measure a change, the less novel that change tends to be, as truly novel changes arise from radical ideas never considered before and are thus unlikely to be captured by existing measures.
A case in point is Brown and Mueller's work (2019) on parametric variation, where 'novelty' is reduced to a normalised vector of the input 'slider' values of the parametric model. In parametric models, this limited number of inputs makes the novelty/change tractable, with one study showing that most parametric models in a corpus of 2,002 contained between 1 and 11 parameters (Davis 2013). Whilst optimisation techniques might refer to each unique slider input vector as a new design, in reality most designs are only minor relative changes and are more 'evolution' than 'revolution'; it is the underlying parametric code that enables, but also tethers, the output to quite similar results. Systems such as 'Meta-Parametric Design' (Joyce and Ibrahim 2017, Harding et al. 2013) can be used to modify the components and topology of a parametric model in addition to its sliders. However, ultimately, this and other generative design approaches still have their own limitations, as described by Davis [8].
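In the spirit of this slider-vector reduction, a novelty measure over a parametric model's normalised input space might be sketched as follows; the exact metric used by Brown and Mueller may differ, and the parameter bounds here are assumed:

```python
# Sketch of novelty as distance in a parametric model's normalised input space.
# Parameter names, bounds, and example designs are invented for illustration.
import math

def normalise(values, bounds):
    """Map each slider value into [0, 1] given its (min, max) bounds."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(values, bounds)]

def novelty(candidate, existing, bounds):
    """Distance from the candidate to its nearest existing design."""
    c = normalise(candidate, bounds)
    return min(math.dist(c, normalise(e, bounds)) for e in existing)

bounds = [(0, 30), (2.4, 6.0), (0, 90)]            # e.g. span, height, roof angle (assumed)
existing_designs = [[10, 3.0, 15], [12, 3.2, 20]]
print(novelty([11, 3.1, 18], existing_designs, bounds))   # small: an 'evolution'
print(novelty([28, 5.8, 80], existing_designs, bounds))   # large: closer to a 'revolution'
```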
Simple levels of variability do not directly translate to creativity. In many optimisation methods, the number of possible design states can be considered as the amount of data, and thus of possible design information, in the model. This may be described by Shannon's entropy: enough data is required to convey a given amount of information, so a large building (a lot of information) cannot be described by very few lines (little data). Thus data is not information, and information is not creativity. As Brown and Mueller identified, the amount of change is relative, showing that a design's level of change is judged in relation to what already exists (Gero 2002). Current Deep Learning models are
trained on historical data to statistically extract the
features of a known solution space. This space essen-
tially encodes the design ‘experience’ of the ML and
also focuses the output of any generative ML. The ML
is therefore bounded by the limits of that space and inherently limited to producing things similar to those that exist within it, whereas when we speak of novel and original ideas we tend to describe things that depart from the existing in some significant way.
Change is a challenge for ML: a GAN trained on buildings will not spontaneously produce coherent images of chairs. Even ML that has been explicitly built to change something is still bound to its input. For example, with style transfer (Figure 6) we see that it is possible to transfer an image to a different artistic style, sometimes with radical changes if the style is very abstract. However, even then this requires the input of an existing style; it can only transfer styles that already exist, or alternatively blend between them, which although novel is still ultimately derivative.
Figure 6
Photograph of Germany in the style of four different iconic paintings: The Shipwreck of the Minotaur, The Starry Night (left top and bottom), The Scream, Seated Nude (right top and bottom) [15]
However, this mechanism inherent in ML (training)
raises questions about the share of creativity be-
tween algorithm and human. The user stops being the direct author and instead becomes more of an 'inspirational curator', feeding a model to define the kind of output produced. To use an art analogy, the user becomes like Peggy Guggenheim, a curator who influences the output but does not make it directly. Lauded architects like Wolf Prix and Thom Mayne are already using this approach to generate alternatives (DigitalFUTURES 2020): Prix inputting all of the images of existing outputs from the Coop Himmelb(l)au office to produce a range of images of possible future forms for his own inspiration, whereas Thom Mayne shows its use in a more constrained way, as a generator of similar alternatives produced from a single handmade work (Figure 7).
These lauded office leaders are also discussing this impact on their practice. Mayne directly states that where a few options used to be hand-modelled in five days, now 100-200 options can be generated in 8-24 hours. This is a very real sign that automation will have an impact even within the creative domain, especially in practices with one central figure, whose capabilities may be augmented by this technology and who may then need fewer assistants.
Others are celebrating the aesthetic of ML. An example is Matias Del Campo's winning entry 24 Highschool in Shenzhen (Del Campo 2021), where images produced using an AttnGAN converted text related to the aspirations of the building, such as “A building with four walls and tomorrow inside Self discipline and social commitment”, into an image that was then translated into a floor plan or projected onto facades. Perhaps this is not surprising: human creativity has often been born out of impoverishment or limitation. In this case the ML style becomes a strength, as it offers present-day uniqueness and a link to new technology. However, this is perhaps better termed an artistic medium through which humans exercise creativity. Maybe the creativity is not from the ML but from the user, in the same way that it is not the drips but Jackson Pollock that is creative, despite the drip mechanism making most of the detailed marks.
The current computer-generated art may or may
not represent something more long-lived than a sin-
gle fad. ML will need to be creative enough to stop people getting bored of it. Much like the representational limitation on novel designs that comes from simple parametric variation, many of the outputs from GANs or Deep Dream [5] show limited, not universal, variability. Indeed, whilst Deep Dream images
these images don’t excite people as much now, per-
haps because these alien scenes have little relation to
wider human aesthetics or issues.
This is fundamentally linked to important issues of what architecture is. If we consider it a purely technical practice, simply creating liveable habitation, then creativity may be considered the capability to solve problems in unique ways or adapt to new situations and sites, though possibly this is better characterised as being 'innovative'. However, architects see themselves as part of culture. In this context we return to the issue of artistic creativity: here, producing the same design, even if it is one's own work, is unacceptable; copying or self-similar reproduction is in some way not creative. ML is able to produce a huge number of options and is thus able to sidestep this issue, but ultimately it would need self-reflection, and perhaps even cultural reflection and feedback, to truly represent a creative agent.
Figure 7
Top: Examples of
closely similar
generated designs
by Morphosis,
Bottom: GAN
generated images
from Coop
Himmelb(l)au,
produced by
training on images
of their previous
work
(DigitalFutures
2020).
LEVEL OF AUTONOMY
Architecture is primarily for humans and driven by
human needs. The role of the designer is essen-
tially to understand those needs and articulate them
through space and form. The complexity of plan-
ning and architecture arises from the evolving and
ill-defined nature of these needs (Rittel and Webber 1973). Design therefore requires the active input and involvement of a human, both to describe the needs and to interpret whether a design solution addresses them. It is consequently unrealistic to expect AI design agents that are completely autonomous, unless they too become the target users of the building, or AI gains enough capability to build a theory of the human mind and be empathetic to human needs.
Assuming a human remains in the loop, at least as the client or target user, the question then is in which tasks AI can be autonomous and in what capacity it can operate freely without the active engagement of a human. This also points to the design tasks that are most replaceable by an AI, and in turn which tasks might be secure.
With traditional computational design, design
approaches, rules, requirements, and conventions
are encoded as rigid logic and parameters within the
models and the automation happens in the explo-
ration of these parameters. This ensures that generated designs are predictable, in that they follow the design logic set up by the designer. Coupled with meaningful evaluation parameters, these methods have proven useful in practice, but they are ultimately deterministic and rely on significant human programming input. Thus, due to this human effort, these automated systems are geared either towards generic automation or towards highly specialised (single-building) automation.
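A toy sketch of this conventional pattern, with a designer-encoded rule and automation limited to a sweep of the parameter space, is given below; the rule and parameter ranges are invented for illustration:

```python
# Sketch of conventional computational design: design logic lives in explicit,
# human-written rules, and the automation is only the sweep over parameters.
# The corridor rule and the parameter ranges are invented placeholders.
from itertools import product

MIN_CORRIDOR_WIDTH = 1.2   # m, an explicit, designer-encoded convention

def generate(bay_width, bay_count, corridor_width):
    """Deterministic generator: the same parameters always give the same design."""
    if corridor_width < MIN_CORRIDOR_WIDTH:
        return None                         # rule violated -> no design produced
    return {"footprint": bay_width * bay_count * (6.0 + corridor_width),
            "bays": bay_count, "corridor": corridor_width}

designs = [d for d in (generate(w, n, c)
                       for w, n, c in product([3.0, 3.6], [4, 6, 8], [1.0, 1.5, 2.0]))
           if d is not None]
print(len(designs))   # only parameter combinations satisfying the encoded rules survive
```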
However, with current Deep Learning the outcomes are almost the opposite. Depending on the inputs, results are often unpredictable, imprecise, and generated with an inexplicable rationale (a 'black box'), yet such models are able to produce outputs that are reasonably flexible and fast once trained. Depending on the input data, they can implicitly encode design conventions, but cannot be expected to be 100% compliant due to their statistical nature. However, they are more likely to produce something adaptive in previously unseen or untrained edge cases, and are robust enough to attempt a response even if proposing a solution is challenging, although the quality of that solution is likely questionable.
As a result, it is hard to expect that a design produced by ML will be used without some human checking. We can expect more conventional post-processing of ML output to rationalise raster image output into lines, BIM families, etc., or to throw out malformed designs, and with this some further common-sense and algorithmic refinement may be possible, in the same way that noise reduction or image sharpening is applied to other media now. Ultimately a human will have to check the results, to ensure that the output is sensible and for the previously mentioned appraisal in terms of human and perhaps functional aspects, if those cannot be analysed automatically; although ML will most likely be able to estimate this if the relevant data can be captured as a training set. Arguably the most significant utility of ML in design will be its ability to automate or co-opt the actual spatial design, extending automation beyond existing rules-based systems used for defining door schedules, panellisation planning, etc., into subjective but fundamentally rational design activities like furniture and room layout.
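As a sketch of such conventional post-processing, the following uses OpenCV to rationalise a generated raster plan into simplified outlines; the thresholds and simplification tolerance are illustrative and would need tuning against real model output:

```python
# Sketch of post-processing a generated raster plan into line work.
# Threshold value and simplification tolerance are illustrative only.
import cv2
import numpy as np

def raster_to_outlines(plan_image: np.ndarray):
    """plan_image: HxW uint8 greyscale output from a generative model."""
    _, walls = cv2.threshold(plan_image, 127, 255, cv2.THRESH_BINARY_INV)  # dark pixels = walls
    contours, _ = cv2.findContours(walls, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Simplify each contour into a coarse polyline that downstream CAD/BIM tools could use
    return [cv2.approxPolyDP(c, 2.0, True) for c in contours]

# Usage (file name is a placeholder):
# outlines = raster_to_outlines(cv2.imread("generated_plan.png", cv2.IMREAD_GRAYSCALE))
```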
The exciting challenge will be in relation to in-
terfaces between ML and users. Ultimately the ef-
fectiveness of any ML beyond basic generative capa-
bility will be contingent on its ability to understand
and thus provide for human spatial needs. Practi-
cally this will be in the form of responding to hu-
man input to change a design. Due to a trained ML model's fast response, there is potential for systems to engage in repeated interchanges, unlike typical analysis-reliant Multi-Objective Optimisation (MOO) methods, where options are generated over a long time and a solution is chosen in one shot. ML-based interfaces could be more multidirectional. As demonstrated by Chaillou (2020), they can be designed to allow fast sketching and modification by a user in response to the ML-generated design, or to have autocomplete/fill-type functionality as shown in Image GPT. Sites like 'Picbreeder' [11] show interfaces for interactive generative design, and Ibrahim and Joyce (2019) demonstrate a similar approach on meta-parametric models, where a generative tree of options can be interactively navigated, similar to how human-generated options are explored (Fujimura 2018) or managed in a GitHub-like parametric model versioning system (Cristie and Joyce 2019, Mohiuddin et al. 2017).
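A toy sketch of such a fast, iterative exchange, in the spirit of Picbreeder-style interactive exploration, is given below; the `decode` function stands in for any trained generative model and is assumed rather than real:

```python
# Toy interactive loop: the user repeatedly picks a favourite and the system
# generates the next batch of variants near it in latent space. `decode` is a
# placeholder for a trained generator, not a real model.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 64

def decode(z):
    """Placeholder: map a latent vector to some design representation."""
    return f"design({z[:2].round(2)}...)"

def next_generation(parent_z, n=6, step=0.3):
    """Small mutations of the chosen parent: fast to produce, easy to compare."""
    return [parent_z + step * rng.standard_normal(LATENT_DIM) for _ in range(n)]

z = rng.standard_normal(LATENT_DIM)          # an initial random starting point
for round_i in range(3):                     # three rounds of user feedback
    options = next_generation(z)
    print([decode(o) for o in options])      # show options to the designer
    chosen = 0                               # stand-in for the designer's pick
    z = options[chosen]                      # explore around the chosen design next
```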
Short of the direct high-bandwidth brain-machine connections envisioned by Elon Musk's 'Neuralink', this lack of intellectual expediency in the to-and-fro of design might be one of the biggest barriers to quality design production with ML, as currently a relatively general ML model requires significant interaction to tune its output to fit a designer's or user's needs. But examples linking images and text, such as DALL-E [4] and CLIP [3], point to more expedient high-level guidance of ML output.
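As a sketch of this kind of high-level guidance, the following uses OpenAI's open-source CLIP package to rank candidate design images against a written brief; the brief, file names, and the ranking role itself are assumptions for illustration:

```python
# Rank candidate design images by how well they match a written brief, using
# OpenAI's open-source CLIP package (https://github.com/openai/CLIP).
# The brief and image file names below are placeholders.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

brief = clip.tokenize(["a daylit double-height atrium with timber structure"]).to(device)
candidates = ["option_a.png", "option_b.png", "option_c.png"]

with torch.no_grad():
    text_feat = model.encode_text(brief)
    text_feat /= text_feat.norm(dim=-1, keepdim=True)
    scores = []
    for path in candidates:
        image = preprocess(Image.open(path)).unsqueeze(0).to(device)
        img_feat = model.encode_image(image)
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        scores.append((path, float((img_feat @ text_feat.T).item())))

print(sorted(scores, key=lambda s: s[1], reverse=True))  # best-matching option first
```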
LEVEL OF DESIGN QUALITY
The previous section considered the use of ML for
inspirational means and the sections before consid-
ered ML as a useful functional helper. What we may
consider as ’good/great’ architecture often goes be-
yond fulfilling basic functional requirements or even
pure aesthetics and style. Most of a singular build-
ing is driven by the high-level needs of the client. At
the same time, buildings as a whole can be an ex-
pression and self-reflection of the values and ethos
of the designer(s). Many of these ideas may not be explicit or captured formally. Currently, ML has no past, nor any 'ambitions' beyond what is programmed into it. However, in some respects, larger
General AI is arguably beginning to share some as-
pects of humans. For example, GPT-3 uses a large corpus of data including all of Wikipedia, which represents only around 6% of the data it uses. After initial training, estimated to cost $4.6M in compute alone (Li 2020), it can then learn quickly with limited input, able to extend existing texts, mimic different authors' styles, and even write in other languages. It is a 'pattern machine' able to abstract existing text patterns using its 175 billion parameters. In this way the model may be said to be aggregating 'experience': humans learn linearly over time, but ML can learn in a highly parallel way to shorten that time, at the cost of significant power consumption.
Figure 8
Mapping of ML tools based on the different issues which contribute to their input requirements and output quality

Whilst this experience (the training datasets) flavours ML output, in a similar way that a designer might produce architecture as a product of who they worked with or what buildings they saw, we still see a lack of self-motivated activity. Unless an ML model becomes active in pursuing, and also preferential and critical about, its input data, there is no sense of 'self' or 'self-actualisation', limiting its scope from designer to instructed assistant.
If, instead, we focus more on the 'tool' nature of ML, we also come up against limits. Currently, ML appears to be more effective when trained on lots of data and with high levels of complexity. Assuming there are no shortcuts, we then encounter a limitation related to being able to afford, and ultimately own, ML tools. Whilst high-quality models might be very useful, as GPT-3 shows they are not cheap, nor are they easy to understand or work with. Tools like Lobe.ai [17] show human-usable GUI interfaces for these systems and, with correct automatic tuning, ML may become relatively easy for non-technical people to use and train. However, even then there is still much work to do in the realm of 'Explainable AI' to help users understand what is going on inside the black box; trust is less of an issue, assuming that all designs are vetted by a human, which is still a reasonable assumption for buildings.
A commercial issue also exists, especially with General AI: the high cost means such models are likely to be controlled by larger software companies. If they become more useful, there is a danger that our ability as designers will be limited by access to them, which may be controlled by the commercial interests of those able to afford them, and few architecture firms are likely to be able to do so. AEC has already been affected by similar issues in relation to data exchange, with Revit restricted by its creator Autodesk, whereas the earlier text-based .dxf format (also by Autodesk) was truly open. Perhaps what we are seeing is the difference between low-level TensorFlow-style tools (early stage) and high-level GPT-3-style tools (late stage). How the industry navigates this issue may prove important, especially if architecture wants to maintain its autonomy.
CONCLUSION
In some ways, this paper echoes Rittel and Webber's (1973) 'Wicked Problems', which highlighted the limits of solving planning problems with a conventional design process, but did so at the cost of under-representing the beneficial power inherent in design. This paper suffers similarly, emphasising the 'wickedness' of applying ML to design while not highlighting its strengths. We believe, however, that ML is not at a loss for supporters, with abundant explorations in applying individual ML tools to design. What is lacking are studies that evaluate and cross-compare different ML tools for their limits and applicability in design. Such studies can help to map the world of ML and determine the most project-appropriate tools for the level of intervention or output detail required. In Figure 8, we present our attempt at organising and mapping some of these tools against the issues discussed in this paper. We hope this may provide a starting point for others to expand on, and that this paper, like Rittel and Webber's, helps those who are working with these tools to focus their talents on solving the real challenges and avoid the pitfalls of this powerful computing paradigm.
REFERENCES
Ahmed, E, Saint, A, Shabayek, AER, Cherenkova, K, Das, R,
Gusev, G, Aouada, D and Ottersten, B 2019 'A Survey
on Deep Learning Advances on Different 3D Data
Representations’, arXiv:1808.01462 [cs]
Brown, TB, Mann, B, Ryder, N, Subbiah, M, Kaplan, J, Dhariwal, P, Neelakantan, A et al. 2020 'Language Models Are Few-Shot Learners', arXiv:2005.14165 [cs]
Brown, NC and Mueller, CT 2019, ’Quantifying Diversity
in Parametric Design: A Comparison of Possible Met-
rics’, AI EDAM, 33, pp. 40-53
Del Campo, M 2021 ’ARCHITECTURE, LANGUAGE AND
AI:Language, Attentional Generative Adversarial
Networks (AttnGAN) and Architecture Design’, Proc.
CAADRIA 2021
Chaillou, S 2020, ’ArchiGAN: Artificial Intelligence x Ar-
chitecture’, in Yuan, PF, Xie, M, Leach, N, Yao, J and
Wang, X (eds) 2020, Architectural Intelligence: Se-
lected Papers from CDRF 2019, Springer, Singapore
Cristie, V and Joyce, SC 2019 ’GHShot’, Proc. eCAADe/SI-
GraDi 2019
Davis, D 2013, Modelled on Software Engineering: Flexible Parametric Models in the Practice of Architecture, Ph.D. Thesis, RMIT
Frey, CB and Osborne, MA 2017, ’The Future of Employ-
ment: How Susceptible Are Jobs to Computerisa-
tion?’, Technological Forecasting and Social Change,
114, pp. 254-280
Fujimura, R 2018, The Form Of Knowledge, The Prototype
Of Architectural Thinking And Its Application, Toto
Gero, JS 2002 ’Computational Models of Creative De-
signing Based on Situated Cognition’, Proc. 4th Conf.
on Creativity & Cognition
Harding, J, Joyce, S, Shepherd, P and Williams, C 2013,
’Thinking Topologically at Early Stage Parametric
Design’, Advances in Architectural Geometry 2012,
2012, pp. 67-76
Heaven, D 2019, ’Why Deep-Learning AIs Are so Easy to
Fool’, Nature, 574, pp. 163-166
Ibrahim, N and Joyce, S.C 2019 ’User Directed Meta Para-
metric Design for Option Exploration’, Proc. ACADIA
2019.
Joyce, S.C and Ibrahim, N 2017 ’Exploring the Evolution
of Meta Parametric Models’, Proceedings of ACADIA
2017
Karras, T, Laine, S and Aila, T 2019 ’A Style-Based Gen-
erator Architecture for Generative Adversarial Net-
works’, Proc. IEEE/CVF Conf. Comp. Vision and Pattern
Recogn.
Koch, G, Zemel, R and Salakhutdinov, R 2015 ’Siamese
Neural Networks for One-shot Image Recognition’,
Proc. of 32nd Int. Conf. on Mach. Learning
Machairas, V, Tsangrassoulis, A and Axarli, K 2014, ’Al-
gorithms for Optimization of Building Design: A Re-
view’, Renewable and Stust. Energy Rev., 31, pp. 101-
112
Mohiuddin, A, Woodbury, R, Ashtari, N, Cichy, M and
Mueller, V 2017 ’A Design Gallery System: Prototype
and Evaluation’, Proc. ACADIA 2017
Nauata, N, Chang, KH, Cheng, CY, Mori, G and Furukawa,
Y 2020 ’House-GAN: Relational Generative Adversar-
ial Networks for Graph-Constrained House Layout
Generation’, arXiv:2003.06988 [cs]
Newton, D 2019, ’Generative Deep Learning in Architec-
tural Design’, Technology|Architecture+ Design, 3, pp.
177-189
Oleinik, A 2019, ’What Are Neural Networks Not Good at?
On Artificial Creativity', Big Data & Soc., 6, pp. 1-13
Phillips, PJ, Moon, H, Rizvi, S and Rauss, P 2000,
’The FERET Evaluation Methodology for Face-
Recognition Algorithms’, IEEE Trans. on Patt. Analysis
and Mach. Int., 22, pp. 1090-1104
Ramesh, A, Pavlov, M, Goh, G, Gray, S, Voss, C, Radford, A,
Chen, M and Sutskever, I 2021, ’Zero-Shot Text-to-
Image Generation’, arXiv: [cs], 2102.12092, pp. 1-20
Rittel, HWJ and Webber, M 1973, ’Dilemmas in a General
Theory of Planning', Policy Sciences, 4, pp. 155-169
Ronneberger, O, Fischer, P and Brox, T 2015, 'U-Net: Con-
volutional Networks for Biomedical Image Segmen-
tation’, arXiv: [cs], 1505.04597, pp. 234-241
Soler, V, Retsin, G and Garcia, MJ 2017 'A Generalized Ap-
proach to NonLayered Fused Filament Fabrication’,
Proc. ACADIA 2017
Tolosana, R, Vera-Rodriguez, R, Fierrez, J, Morales, A and
Ortega-Garcia, J 2020, ’Deepfakes and beyond: A
Survey of face manipulation and fake detection’, Inf.
Fusion, 64, pp. 131-148
[1] https://openai.com/blog/image-gpt/
[2] https://www.archdaily.com/938693
[3] https://openai.com/blog/clip/
[4] https://openai.com/blog/dall-e/
[5] http://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
[6] https://thisrentaldoesnotexist.com/
[7] https://lambdalabs.com/blog/demystifying-gpt-3/
[8] https://www.danieldavis.com/generative-design-doomed-to-fail/
[9] https://youtu.be/06-AZXmwHjo
[10] https://youtu.be/OlvYzmWuMsU
[11] http://picbreeder.org/
[12] https://www.ft.com/content/7d3e0d6a-87a0-11e9-a028-86cea8523dc2
[13] https://www.technologyreview.com/2020/10/16/1010566
[14] https://thispersondoesnotexist.com/
[15] https://github.com/cysmith/neural-style-tf
[16] https://www.technologyreview.com/2020/07/20/1005454
[17] https://lobe.ai/
[18] https://www.nytimes.com/interactive/2018/10/12/us/map-of-every-building-in-the-united-states.html
[19] http://transnatural.org/wp-content/uploads/2011/09/EZCT_Booklet-Screen.pdf