An Academy of Spatial Agents
Generating spatial configurations with deep reinforcement learning
Pedro Veloso, Ramesh Krishnamurti
Carnegie Mellon University
pedroveloso13@gmail.com, ramesh@andrew.cmu.edu
Agent-based models rely on decentralized decision making instantiated in the
interactions between agents and the environment. In the context of generative
design, agent-based models can enable decentralized geometric modelling,
provide partial information about the generative process, and enable fine-grained
interaction. However, the existing agent-based models originate from
non-architectural problems and it is not straightforward to adapt them for
spatial design. To address this, we introduce a method to create custom spatial
agents that can satisfy architectural requirements and support fine-grained
interaction using multi-agent deep reinforcement learning (MADRL). We focus on
a proof of concept where agents control spatial partitions and interact in an
environment (represented as a grid) to satisfy custom goals (shape, area,
adjacency, etc.). This approach uses double deep Q-network (DDQN) combined
with a dynamic convolutional neural network (DCNN). We report an experiment
where trained agents generalize their knowledge to different settings, consistently
explore good spatial configurations, and quickly recover from perturbations in
the action selection.
Keywords: space planning, agent-based model, interactive generative systems,
artificial intelligence, multi-agent deep reinforcement learning
INTRODUCTION
In this paper we investigate a generative model for
spatial design based on learning agents, which sup-
ports fine-grained and real-time human-computer
interaction.
Typically, computational agents are used for modeling natural or artificial phenomena that unfold over time based on interactions with a shared environment (agent-based modeling, ABM) (Wilensky and Rand, 2015). In the context of multi-agent
space planning (MASP) (Veloso, Rhee and Krishna-
murti, 2019), an agent can represent a spatial en-
tity with local control, interweaving individual per-
ception and action to decide how space should be
shaped, occupied, or partitioned. The ‘environment’ comprises the elements of the space that are independent of the agents and that support the different interactions. During a simulation, agents receive signals from the environment, neighboring agents, or even the designer, and make decisions to change the spatial configuration. This is in contrast to many of
the prevailing generative methods.
Overall, agent-based models support many char-
acteristics that are beneficial for design exploration.
They enable the control of configurations that are not
feasible with a centralized modeling strategy. Agents
can produce patterns or behaviors that are not nec-
essarily predictable from the perspective of their in-
ternal programs. They operate on a shared envi-
ronment and use a local action space, facilitating a
fine-grained interaction with the designer. The dis-
tributed decision making of multiple spatial agents in
a simulation results in partial design states with valu-
able information about design trade-offs and oppor-
tunities for local changes.
Gap in multi-agent space planning
Agent-based models used in spatial synthesis include
swarm algorithms (flocking and pheromone navi-
gation), cellular automata, reaction-diffusion, and
physics simulation (Herr and Ford, 2015; Veloso, Rhee
and Krishnamurti, 2019). Cellular automata enable
the emergence of patterns and have been successfully
applied to conceptual form generation, urban mor-
phology (Koenig, 2011), and even to building design
(Araghi and Stouffs, 2015), but the cell-based com-
putation imposes strict restrictions for the satisfac-
tion of architectural requirements. Physics simula-
tion (rigid or soft body) is a more general technique
that conciliates simple control with intuitive interac-
tion and can be adapted for different spatial prob-
lems and objectives. However, physics-based agents
are reactive agents that approximately follow laws
of physics. They do not have any sophisticated policy (i.e. a probability distribution over actions in every state) to manage spatial conflicts or to explore the design space. Bio-inspired models, such as swarm algorithms, simulate exogenous phenomena (e.g. social navigation) in which the units can move and interact in space.
While these models enable the incorporation of
certain architectural requirements, such as area or
adjacency, it is usually harder to adapt them to pro-
duce conventional architectural forms or satisfy other
requirements, without resorting to some form of de-
structive technique to improve their configurations.
The challenge for agent-based, generative models is
to develop control strategies that incorporate spe-
cific architectural requirements and preserve the fine
granularity of the simulation.
Developing self-learning agents
Reinforcement Learning (RL) is a viable alternative to
develop control strategies that incorporate specific
architectural requirements and to preserve the fine
granularity of the simulation. With RL, agents can
learn how to build spaces by interacting with the en-
vironment and improving their performance with respect to certain goals. After training, designers can
interact with the spatial agents in real-time to explore
design alternatives.
In CAAD, RL has been recently applied to varied
topics, such as intelligent adaptive building control
(Smith and Lasch, 2016), autonomous robots (Hos-
mer and Tigas, 2019), fire egress evaluation (Jabi et
al., 2019), and machine feedback (Xu et al., 2018). For
the generation of spatial configurations, RL has been
adopted for automatic decision-making with shape
grammars (Ruiz-Montiel et al., 2013) and a natural se-
lection algorithm has been used to improve the pol-
icy network of competing teams of agents that create
clusters of building blocks on a grid (Narahara, 2017).
In this paper, through a proof of concept we de-
scribe an instance of our method to train agents to
generate spatial configurations by interacting in an
environment. We briefly describe the representation
of the environment, agents, actions and objectives.
Then, we focus on the description of the algorithm,
which uses RL and spatial objectives provided by the
designer to train the agents. Particularly, we use cus-
tom RL techniques based on deep learning for the
multi-agent setting − which is referred to as multi-
agent deep reinforcement learning (Nguyen, Nguyen
and Nahavandi, 2018). Finally, we evaluate the per-
formance of our proof of concept in an architectural
application.
SPATIAL REPRESENTATION OF ENVIRON-
MENT AND AGENTS
Environment and agents
Information is organized as stacked grids, which we refer to as arrays. For instance, the environment is an array of shape (d_env, h, w). The width (w) and height (h) define the size of the grid, and d_env defines the number of layers of information. The basic layer of the environ-
ment contains two sets of cells: empty (0) and obsta-
cle (-1). Only empty cells can be occupied by agents.
More information can be added to the array as addi-
tional layers.
The agent is an array with shape (d_agent, h, w). It represents a spatial partition, such as a building, a room, or an area for a certain activity. In this proof of
concept, the spatial partition is a polyomino with no
holes (PnH, see figure 1, row 1). A polyomino is a set
of grid cells that share edges. By avoiding the exis-
tence of holes, the PnH has certain interesting prop-
erties. For instance, it forms a polygon with orthogo-
nal edge boundaries and avoids containment of one
agent by another. Therefore, it can form custom di-
agrams of floorplans, comprehending organizations
such as tight packing and loose packing. The shapes
can approximate any polygon, such as orthogonal
and non-orthogonal rectangles, or free forms such as
amoeba-like shapes. The resulting polyominoes can
be post-processed to depict bubble diagrams or sup-
port the development of a parametric model. In our
setting, each PnH induces different types of cells (see
figure 1, row 2).
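As an illustration of this representation, the following sketch (ours, with hypothetical names and sizes, not the authors' code) holds the environment and one agent as NumPy arrays and checks the "no holes" property of a polyomino with flood fill:

```python
# Sketch: grid state as stacked arrays and a PnH test (hypothetical example).
from collections import deque
import numpy as np

h, w = 16, 16
env = np.zeros((1, h, w))    # basic environment layer: 0 = empty, -1 = obstacle
env[0, 0:4, 0:4] = -1        # hypothetical obstacle block

agent = np.zeros((1, h, w))  # occupancy layer of a single agent
agent[0, 5:8, 5:7] = 1       # a 3x2 polyomino

def is_pnh(mask):
    """True if the occupied cells form one edge-connected polyomino with no holes."""
    filled = mask.astype(bool)
    if not filled.any():
        return True

    def flood(free, start):
        seen, queue = {start}, deque([start])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < free.shape[0] and 0 <= nc < free.shape[1]
                        and free[nr, nc] and (nr, nc) not in seen):
                    seen.add((nr, nc))
                    queue.append((nr, nc))
        return seen

    # edge-connectivity: every occupied cell is reachable from the first one
    start = tuple(np.argwhere(filled)[0])
    if len(flood(filled, start)) != filled.sum():
        return False
    # no holes: every empty cell is reachable from the padded border
    empty = np.pad(~filled, 1, constant_values=True)
    return len(flood(empty, (0, 0))) == empty.sum()

assert is_pnh(agent[0])
```

The hole test exploits the property that, in a PnH, every empty cell remains reachable from the border of the grid.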
Action space
The actions available to the agents are (1) single-
cell expansion, (2) single-cell retraction, and (3) no-
action. The set of legal cells for an agent’s expansion or retraction consists of cells that are not blocked by an obstacle of the environment, preserve the PnH-ness of all the agents, and are inside the action grid. It is important to notice that the mask or action grid (see figure 1, row 3) has shape (h_action, w_action) < (h, w) and is placed on the environment based on the current centroid of the agent’s PnH. Therefore, as the agents expand the PnH in a certain direction, they also move the centroid and, consequently, the action grid.
actions are building blocks that can be combined to
form complex interplays such as blocking, pushing,
pulling, or attraction. For example, if the agent elim-
inates all the cells of its PnH using retraction, it then
can jump to any legal cell inside the current action
grid, using a single expansion. The agent also has the
option of not executing any action this turn.
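A minimal sketch of the legal-cell computation under our assumptions (function and parameter names are hypothetical, blocked marks obstacles and cells occupied by other agents, and the test that a move preserves PnH-ness is omitted for brevity):

```python
# Sketch: candidate expansion cells inside the centroid-centred action grid.
import numpy as np

def action_window(mask, h_action, w_action):
    """Top-left corner of the action grid, centred on the PnH centroid."""
    h, w = mask.shape
    rows, cols = np.nonzero(mask)
    r0 = int(round(rows.mean())) - h_action // 2
    c0 = int(round(cols.mean())) - w_action // 2
    # clamp the window to the environment
    return max(0, min(r0, h - h_action)), max(0, min(c0, w - w_action))

def legal_expansions(mask, blocked, h_action=5, w_action=5):
    """Free cells edge-adjacent to the PnH and inside the action grid."""
    h, w = mask.shape
    r0, c0 = action_window(mask, h_action, w_action)
    cells = set()
    for r, c in np.argwhere(mask):
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (r0 <= nr < r0 + h_action and c0 <= nc < c0 + w_action
                    and 0 <= nr < h and 0 <= nc < w
                    and not mask[nr, nc] and not blocked[nr, nc]):
                cells.add((int(nr), int(nc)))
    return sorted(cells)
```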
Spatial objectives
Our proof of concept provides for the definition of a
variety of objectives. In our first experiments, we fo-
cused on simple objectives based on neighborhood
information, such as smooth adjacency, or strictly
based on local information, such as area and shape −
a metric that stimulates shapes with few folds, such
as rectangle, L or U (for more details, see Veloso and
Krishnamurti, in press). For each of these objectives,
there is a utility function and a spatial representa-
tion. The utility functions (u_adj, u_area, u_shape) return a value in the unit interval representing the performance of the agent in that state with respect to a goal. The spatial functions (g_adj, g_area, g_shape) return an array of size (h, w) containing spatial hints about the performance of an agent, which is intended to ease training and to enable parametrization by the user (see Figure 1, rows 4-6). The total utility (u_total) of an agent state is based on a combination of the selected goals.
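The exact utility functions are defined in Veloso and Krishnamurti (in press); purely as an illustrative stand-in, the sketch below shows one plausible ratio-based shape for u_area and the combination used later in formula 3 (the forms are our assumptions, not the paper's definitions):

```python
def u_area(area, target):
    """Hypothetical area utility in the unit interval:
    1.0 at the target area, decaying as the ratio diverges."""
    if area == 0 or target == 0:
        return 0.0
    return min(area / target, target / area)

def u_total(u_adj, u_area_val, u_shape):
    """Combination of the selected goals, following formula 3."""
    return 0.5 * u_adj * (u_area_val + u_shape)

print(u_area(12, 16))          # 0.75
print(u_total(1.0, 1.0, 1.0))  # 1.0 (maximum)
```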
A structured representation
The state representation is composed of the different
layers of information. In our proof of concept, we or-
ganized an array (7, h, w) with the obstacles (environment), types of cells, utility, mask, folds (g_fold), area (g_area), and adjacency (g_adj) (see Figure 1).
The organization of the state as a fixed array condi-
tions the learning and enables the use of efficient
computer vision algorithms and machine learning
models.
Figure 1: Row 1: left: simplified visualization of agents in the grid with obstacles; middle: closest neighbors; right: adjacency goals. Row 2: classification of the cells: structure (dark blue), surface (medium blue), and offset (light blue). Row 3: mask with the action space of the agent. Row 4: g_fold: indication of folding cells: L-folds (light blue), U-folds (dark blue). Row 5: g_area: the agents have the respective (area / target area): 0: (17/2), 1: (11/4), 2: (15/8), 3: (13/16), 4: (12/25). Row 6: g_adj: soft adjacency values. Row 7: the current utility of the agent (u_total) represented in the internal cells of the agent.
LEARNING
Deep Reinforcement Learning
In Reinforcement Learning, agents learn by interact-
ing with the environment and discovering behav-
iors that can maximize their performance with respect to certain goals. The mathematical frame-
work for this interaction is a Markov decision process
(MDP) (Sutton and Barto, 2018, pp. 47-57).
In our proof of concept, the MDP consists of: a
set of states defined by the configuration of the agent
and of the environment; a set of actions (expansions,
retractions, or no-action) that can change the state of
the agent; and a set of reward signals based on the change in the combined utility function (u_total) between two states. An agent interacts with the environment by
observing its current state (s) and taking an action (a)
in the MDP. The environment returns the next state
(s’) and an evaluative feedback in the form of a re-
ward signal (r).
To train our agent, we use an off-policy, model-free approach called Q-learning, which relies on estimating the future cumulative rewards for the state-action pairs (Q-values) of the MDP. The agent interacts with the environment using a policy π derived from the Q-value estimates. Given an observation (s, a, r, s′), the algorithm updates the estimate Q(s, a) towards a target defined by the recursive relationship between subsequent optimal Q-values in the MDP (see formula 1). This target includes a discount rate γ ∈ [0, 1] to avoid infinite returns.

r + γ max_a′ Q(s′, a′)  (1)
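For intuition, formula 1 can be turned into a minimal tabular update (a sketch of ours, not the paper's implementation; α is a step size):

```python
from collections import defaultdict

Q = defaultdict(float)   # maps (state, action) pairs to estimated returns
alpha, gamma = 0.1, 0.99

def q_update(s, a, r, s_next, actions):
    """One Q-learning step toward the target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
```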
Conventional Q-learning algorithms store every Q-
value in a tabular data structure. However, to ad-
dress the large state-action spaces of our setting, we
estimate the Q-values using neural networks. More
specifically, we use the Double Deep Q-Network al-
gorithm (DDQN) (van Hasselt, Guez and Silver, 2016). DDQN uses two similar deep neural networks − an online network Q_θ and a target network Q_θ′ − to estimate the Q-values.
In our version of DDQN, the agent interacts with
the environment in multiple episodes, where the ob-
jective parameters, environment values, and PnH are
randomly initialized. It relies on the alternate execu-
tion of two steps: (1) interaction with the environ-
ment and (2) update of the network estimates.
In step 1, the agent interacts with the environment using an ε-greedy policy to generate observations (s, a, r, s′) − i.e. it selects a random action with probability ε or, otherwise, an action that maximizes the expected cumulative reward according to the online Q-network (Q_θ). At the beginning of a training session, ε is a large value, so the policy explores random trajectories in the environment. As ε is reduced over training, the agent switches to the exploitation of the best estimates. The observations are augmented with rotation and reflection, then added to a large memory buffer D, from which they are later retrieved for training. The buffer randomizes the data, removing the correlation between observation samples and reducing the variance of the updates.
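A sketch of such a buffer with symmetry augmentation, under our assumptions: states are (d, h, w) arrays, so np.rot90 and np.flip act on the grid axes; the corresponding remapping of actions, which a real implementation must apply consistently, is omitted here.

```python
# Sketch: replay buffer storing rotated/reflected copies of each observation.
import random
from collections import deque
import numpy as np

buffer = deque(maxlen=100_000)   # hypothetical capacity

def symmetries(s):
    """Yield the 8 symmetries of a (d, h, w) state: 4 rotations x reflection."""
    for k in range(4):
        rot = np.rot90(s, k, axes=(1, 2))
        yield rot
        yield np.flip(rot, axis=2)

def store(s, a, r, s_next):
    # note: a must also be transformed under each symmetry (omitted here)
    for s_sym, sn_sym in zip(symmetries(s), symmetries(s_next)):
        buffer.append((s_sym.copy(), a, r, sn_sym.copy()))

def sample(batch_size=32):
    return random.sample(buffer, batch_size)
```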
In step 2, the algorithm updates the current estimate of Q_θ with a batch of observations (S, A, R, S′) retrieved from D. Q_θ is trained with a gradient descent method using the target:

R + γ Q_θ′(S′, argmax_a Q_θ(S′, a))  (2)

Then, Q_θ′ is slowly updated to approximate Q_θ using a step size τ, which reduces the correlation between the Q estimates and the target values.
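A condensed PyTorch sketch of step 2 (our assumptions: q_online and q_target are networks with identical architecture, S, A, R, S_next is a sampled batch, and terminal-state handling is omitted):

```python
import torch
import torch.nn.functional as F

def ddqn_update(q_online, q_target, optimizer, S, A, R, S_next,
                gamma=0.99, tau=0.01):
    with torch.no_grad():
        # formula 2: the online network selects the action,
        # the target network evaluates it
        a_star = q_online(S_next).argmax(dim=1, keepdim=True)
        target = R + gamma * q_target(S_next).gather(1, a_star).squeeze(1)
    q_sa = q_online(S).gather(1, A.unsqueeze(1)).squeeze(1)
    loss = F.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # soft update: move the target network toward the online network by tau
    for p_t, p_o in zip(q_target.parameters(), q_online.parameters()):
        p_t.data.mul_(1 - tau).add_(tau * p_o.data)
```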
Multi-agent deep Reinforcement Learning
In the previous section we described the algorithm
for a single agent. However, our prototype requires
multi-agent deep reinforcement learning (MADRL)
methods to address the problem of the interaction
of multiple spatial agents that compete and cooper-
ate to improve individual and collective rewards. In
this context, the formalization of the MDP general-
izes to a Markov Game, where the state contains the
information of all the agents and the action space is
defined by joint actions (Nguyen, Nguyen and Naha-
vandi, 2018, p. 10). MADRL imposes several addi-
tional challenges to RL, such as exponential growth
of the joint-action space, non-stationarity due to the
observation of other agents, heterogeneous agents,
partial observability, and credit assignment of global
rewards.
Our approach to the multi-agent setting relies
on a dynamic convolutional neural network (DCNN)
to approximate the Q-values of the multiple agents.
DCNN estimates the Q-values after receiving three arrays as input:
• An array (n_agents, 7, h, w) with the state information (see section “A structured representation”).
• An array (n_agents, m) with the indices of the m closest neighbors, used to reconfigure DCNN.
• An array (n_agents, k) with the indices of the adjacency goals, used to reconfigure DCNN.
Each agent is represented in a strand and there are
five main network blocks (see Figure 2).
Over the sequence of blocks, the network cap-
tures a larger and larger receptive field, defined by
neighbors and neighbors of neighbors. The strands
of the network for each agent not only share weights
but also gradients. Therefore, the backpropaga-
tion step of DDQN enables cooperative adjustment
of weights between agents. As a result, while the
number of closest neighbors (m) and maximum con-
nected agents (k) should be fixed, the number of
agents in the model can be changed during execu-
tion, by adding or removing strands.
The first block receives the state array of each
agent and uses a convolutional neural network to
create a richer representation. The second and third blocks use convolutional neural networks on the re-
sulting arrays of the closest neighbors and connected
agents. The fourth block uses a convolutional neu-
ral network on the combination of the resulting ar-
rays from previous blocks. Finally, the fifth block is
a Q-network that receives the resulting array from
the previous block associated with the mask of the
agents and computes the respective Q-values using
fully connected layers.
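The following PyTorch sketch compresses the five blocks into a reduced strand to make the weight sharing concrete. It is a simplification under our assumptions: blocks 2-4 are collapsed into a single mean aggregation over the m closest neighbors, and the layer sizes follow the caption of Figure 2.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out=64):
    # every CNN block: 2 conv layers, 64 filters, 3x3, stride 1, padding 1
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=1, padding=1), nn.LeakyReLU(),
        nn.Conv2d(c_out, c_out, 3, stride=1, padding=1), nn.LeakyReLU(),
    )

class DCNNStrand(nn.Module):
    def __init__(self, d_state=7, h=16, w=16, n_actions=26):
        super().__init__()
        self.encode = conv_block(d_state)   # block 1: the agent's own state
        self.combine = conv_block(64 * 2)   # stand-in for blocks 2-4
        self.q_head = nn.Sequential(        # block 5: fully connected Q-network
            nn.Flatten(),
            nn.Linear(64 * h * w, 104), nn.LeakyReLU(),
            nn.Linear(104, 52), nn.LeakyReLU(),
            nn.Linear(52, n_actions), nn.Tanh(),
        )

    def forward(self, state, neighbor_states):
        # state: (batch, 7, h, w); neighbor_states: (batch, m, 7, h, w)
        own = self.encode(state)
        b, m, d, h, w = neighbor_states.shape
        nbr = self.encode(neighbor_states.reshape(b * m, d, h, w))
        nbr = nbr.reshape(b, m, 64, h, w).mean(dim=1)  # aggregate neighbors
        return self.q_head(self.combine(torch.cat([own, nbr], dim=1)))
```

Because the same modules encode every strand, gradients from all agents flow into the shared weights, and strands can be added or removed at run time without retraining.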
EXPERIMENT
Table 1: Configuration of training.
Our approach to interactive generative design is to
train spatial agents that build custom spatial config-
urations and adapt to changes devised by real-time
interactions with the designer. This interaction in-
cludes not only changes in the parameters of the sim-
ulation (e.g. number of agents, area, adjacency, and
amount of randomness in action selection) but also
direct changes to the PnH configuration of the agents
and of the environment.
In this section, we report an experiment that
evaluates the quality of the spaces generated by the
agents, and their capacity to generalize their knowl-
edge to unknown situations and to react to perturba-
tions in the cell configuration.
In the training setting (see Table 1), six agents are randomly initialized and interact in randomly generated environments during multiple episodes of 64 steps. The reward signals are derived from the following utility function:

u_total = 0.5 · u_adj · (u_area + u_shape)  (3)

Figure 2: Simplified graphical representation of the layers, operations, and arrays of DCNN. Every CNN is composed of 2 convolutional layers with 64 filters of size 3x3, stride 1, padding 1, and a Leaky ReLU activation. The final FC network has 3 linear layers with 104, 52, and 26 neurons. Its first two layers use Leaky ReLU activation, while the last uses a hyperbolic tangent function (Tanh).

In the experimental setting we use twelve agents to represent the spaces of a generic two-bedroom house (see Figure 3). There are two custom environments: a terrain on the top of a hill with irregular boundaries and a terrain with a bridge over a stream. For each environment, the agents interact for 6500 steps, which are divided into three stages separated by two phases of perturbations:
• Stage 1 (steps 0-1999): the agents start at random positions and use the learned policy 95% of the time and random actions 5% of the time − i.e. an ε-greedy policy with ε = 0.05.
• Perturbation 1 (steps 2000-2099): actions are selected randomly.
• Stage 2 (steps 2100-3099): agents use an ε-greedy policy with ε = 0.05.
• Perturbation 2 (steps 3100-3499): actions are selected randomly.
• Stage 3 (steps 3500-6499): agents use an ε-greedy policy with ε = 0.05.

Figure 3: Program of the house for the experiment. The leading number is used to identify the spaces in the next images. The numbers in parentheses indicate area. The connections indicate adjacency. The colored grouping indicates spatial integration between spaces.
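As a compact summary, the ε schedule of the stages listed above reduces to a simple lookup (a sketch; the values are taken directly from the text):

```python
def epsilon_at(step):
    """Epsilon per simulation step: 1.0 during perturbations, 0.05 otherwise."""
    if 2000 <= step < 2100 or 3100 <= step < 3500:
        return 1.0   # perturbations 1 and 2: actions selected randomly
    return 0.05      # stages 1-3: epsilon-greedy with epsilon = 0.05
```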
The cell configuration of the obstacles and agents
is the input of a parametric model which generates
the geometry for visualization, with floors, walls, win-
dows, and openings for doors. The openings are
inserted randomly in the intersection between adjacent spaces. The windows are inserted in
one of the external edges of the PnH (bedrooms and
kitchens) or in one of the external edges of a cell of
the PnH (bathrooms). Some partitions are integrated
to create open spaces, such as in the dining and living
room or in the bedroom and closet. The social spaces
have glass walls, the patio does not have any walls,
and the remaining spaces have walls on the edges
without doors or windows.
Figure 4 and Figure 5 show some snapshots of
the resulting configuration for the two environments
over 6500 steps. The agents generated multiple con-
figurations with high scores and adapted to the ob-
stacles. Overall, the agents improved the perfor-
mance of the global configuration either from ran-
dom initialization or from a random perturbation in
all the scenarios. After the perturbations, the agents
were able to re-organize in different spatial config-
urations, which shows the potential for direct inter-
action with the designer. Also, interesting adaptive
spatial patterns that were not part of the objectives
emerge from the interaction of the agents with local
D1.T7.S1. COGNIZANT ARCHITECTURE - WHAT IF BUILDINGS COULD THINK? - Volume 2 - eCAADe 38 |197
Figure 4
Rows 1-6:
simulation of 12
agents in a terrain
on the top of a hill
with irregular
boundaries. Every
pair of rows contain
the first 4 steps in a
stage of the
simulation and 4
steps selected
visually. Row 7:
graph with the
average
performance of the
agents (y axis) over
the episode (x axis).
The areas between
dotted lines
indicate the periods
where action was
selected randomly
(perturbation).
198 |eCAADe 38 - D1.T7.S1. COGNIZANT ARCHITECTURE - WHAT IF BUILDINGS COULD THINK? - Volume 2
Figure 5
Rows 1-6:
simulation of 12
agents in a terrain
with a bridge over a
stream. Every pair
of rows contain the
first 4 steps in a
stage of the
simulation and 4
steps selected
visually. Row 7:
graph with the
average
performance of the
agents (y axis) over
the episode (x axis).
The areas between
dotted lines
indicate the periods
where action was
selected randomly
(perturbation).
D1.T7.S1. COGNIZANT ARCHITECTURE - WHAT IF BUILDINGS COULD THINK? - Volume 2 - eCAADe 38 |199
situations. For example, an agent created a protuber-
ance to increase the area and to adapt to local obsta-
cles (Figure 4, patio at 3898 or kitchen at 5128) or a
small courtyard emerged between the sectors of the
house (Figure 5, at 6216, and 6219).
CONCLUSION
In this paper, we introduced a workflow to train
agents for spatial diagramming and planning. The
agents successfully learned how to generate spaces,
addressed specific spatial goals, preserved fine-
grained interaction with the representation and were
able to react to perturbations in the simulation. This
algorithm will be improved and integrated into a game
engine, so designers can use custom tools to interact
with the model by changing parameters of the sim-
ulation and the PnH configurations of the environ-
ment and agents in real-time. By promoting the con-
versation between computational agents and the de-
signer, we expect to support improvisation and cyclic
reasoning in generative design.
ACKNOWLEDGMENT
This research was supported in part by funding from
the Carnegie Mellon University Frank-Ratchye Fund
for Art @ the Frontier as well as a PhD scholarship
granted by the Brazilian National Council for Scien-
tific and Technological Development (CNPq).
REFERENCES
Araghi, SK and Stouffs, R 2015, ’Exploring cellular au-
tomata for high density residential building form
generation’, Automation in Construction, 49, pp. 152-
162
van Hasselt, H, Guez, A and Silver, D 2016 ’Deep Rein-
forcement Learning with Double Q-learning’, Pro-
ceedings of the Thirtieth AAAI Conference on Artificial
Intelligence, Phoenix, pp. 2094-2100
Herr, CM and Ford, RC 2015 ’Adapting Cellular Automata as Architectural Design Tools’, Emerging Experience in Past, Present and Future of Digital Architecture: Proceedings of the 20th CAADRIA conference, Daegu, pp. 169-178
Hosmer, T and Tigas, P 2019 ’Deep Reinforcement Learn-
ing for Autonomous Robotic Tensegrity’, Ubiquity
and Autonomy: Proceedings of the 39th ACADIA con-
ference, Austin, pp. 16-29
Jabi, W, Chatzivasileiadi, A, Wardhana, NM, Lannon, S and Aish, R 2019 ’The synergy of non-manifold topology and reinforcement learning for fire egress’, Architecture in the Age of the 4th Industrial Revolution: Proceedings of the XXXVII eCAADe and XXIII SIGraDi Conference, Porto, pp. 85-94
Koenig, R 2011, ’Generating Urban Structures: a Method for Urban Planning Supported by Multi-Agent Systems and Cellular Automata’, Przestrzeń i Forma, 16, pp. 353-376
Narahara, T 2017 ’Collective Construction Modeling and
Machine Learning: Potential for Architectural De-
sign’, Sharing Computational Knowledge!: Proceed-
ings of the 35th eCAADe Conference, Rome, pp. 593-
600
Nguyen, TT, Nguyen, ND and Nahavandi, S 2018, ’Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications’, arXiv:1812.11794 [cs, stat]
Ruiz-Montiel, M, Boned, J, Gavilanes, J, Jiménez, E,
Mandow, L and Pérez-de-la-Cruz, JL 2013, ’Design
with shape grammars and reinforcement learning’,
Advanced Engineering Informatics, 27(2), pp. 230-
245
Smith, SI and Lasch, C 2016 ’Machine Learning Integra-
tion for Adaptive Building Envelopes: An Experi-
mental Framework for Intelligent Adaptive Control’,
Proceedings of the 36th ACADIA Conference: Posthu-
mans Frontiers, Ann Arbor, pp. 98-105
Sutton, RS and Barto, AG 2018, Reinforcement learning:
An introduction, The MIT Press, Cambridge
Veloso, P and Krishnamurti, R (in press) ’Self-learning
Agents for Spatial Synthesis’, Proceedings of the 5th
International Symposium on Formal Methods in Archi-
tecture (5FMA)
Veloso, P, Rhee, J and Krishnamurti, R 2019 ’Multi-agent
Space Planning: a Literature Review (2008-2017)’,
Hello, Culture!: Proceedings of 18th CAAD Futures con-
ference, Daejeon, Korea, pp. 52-74
Wilensky, U and Rand, W 2015, An Introduction to Agent-Based Modeling: Modeling Natural, Social, and Engineered Complex Systems with NetLogo, The MIT Press, Cambridge
Xu, T, Wang, D, Yang, M, You, X and Huang, W 2018
’An Evolving Built Environment Prototype’, Learning,
Adapting and Prototyping: Proceedings of the 23rd
CAADRIA conference, Beijing, pp. 207-215