Kyawetal. Architectural Intelligence (2024) 3:11
https://doi.org/10.1007/s44223-024-00053-4
RESEARCH ARTICLE Open Access
© The Author(s) 2024. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.
Human–machine collaboration using gesture recognition in mixed reality and robotic fabrication
Alexander Htet Kyaw1, Lawson Spencer2 and Leslie Lok1*
Abstract
This research presents an innovative approach that integrates gesture recognition into a Mixed Reality (MR) interface for human–machine collaboration in the quality control, fabrication, and assembly of the Unlog Tower. MR platforms enable users to interact with three-dimensional holographic instructions during the assembly and fabrication of highly custom and parametric architectural constructions without the necessity of two-dimensional drawings. Previous MR fabrication projects have primarily relied on digital menus and custom buttons within the interface for user interaction between virtual and physical environments. Despite this approach being widely adopted, it is limited in its ability to allow for direct human interaction with physical objects to modify fabrication instructions within the virtual environment. The research integrates user interactions with physical objects through real-time gesture recognition as input to modify, update, or generate new digital information. This integration facilitates reciprocal stimuli between the physical and virtual environments, wherein the digital environment is generative of the user's tactile interaction with physical objects, thereby providing the user with direct, seamless feedback during the fabrication process. Through this method, the research has developed and presents three distinct Gesture-Based Mixed Reality (GBMR) workflows: object localization, object identification, and object calibration. These workflows utilize gesture recognition to enhance the interaction between virtual and physical environments, allowing for precise localization of objects, intuitive identification processes, and accurate calibrations. The results of these methods are demonstrated through a comprehensive case study: the construction of the Unlog Tower, a 36' tall robotically fabricated timber structure.
Keywords Human–Computer Interaction, Mixed reality, Gestural Tracking, Feedback Based Fabrication, Robotic
Fabrication
1 Introduction
Mixed Reality (MR) serves as a bridge between the tangi-
ble physical environments and immersive virtual environ-
ments. Within the field of architecture, fabrication, and
construction, this convergence holds significant potential
for human–machine collaboration. Using MR, architects
and designers can overlay digital blueprints directly onto
physical geometries, enabling real-time instruction visu-
alization (Rezvani et al., 2023). This paper explores
the collaboration between humans and machines to pre-
sent novel opportunities for fabrication efficiency, accu-
racy, and experience. The symbiosis of human expertise
and machine feedback through MR processes presents
a future that leads to new integrated workflows between
human input, robotic fabrication, and machine feedback
within an immersive and phygital realm.
*Correspondence:
Leslie Lok
wll36@cornell.edu
1 Rural-Urban Building Innovation Lab (RUBI), College of Architecture, Art, and Planning, Cornell University, Ithaca, NY 14853, USA
2 Robotic Construction Laboratory (RCL), College of Architecture, Art, and Planning, Cornell University, Ithaca, NY 14853, USA
The term Mixed Reality encompasses both Augmented
Reality (AR) and Virtual Reality (VR) within the Reality-
Virtuality (RV) Continuum. This continuum serves as a con-
nection between real-world experiences and immersive
virtual environments (Milgram & Kishino, 1994). With
the advancement of immersive technology and 3D user
interfaces (3DUIs) in industry and academic research,
the understanding of MR, as defined by Milgram and
Kishino, has continuously evolved (Skarbez et al., 2021).
In recent research, MR is often described as an environ-
ment-aware overlay of digital content on the physical
world, enabling users to interact seamlessly with both
environments (Speicher et al., 2019). To facilitate this
interaction, MR systems employ an array of techniques,
including spatial mapping, hand-tracking, eye-tracking,
and auditory recording, collecting vital environmental
and human physiological data. This amalgamation of the
digital and physical in MR environments is further sup-
ported by advanced MR-enabled devices like the Micro-
soft HoloLens 2 and Meta Quest Pro, equipped with
sensors, microphones, and cameras, enabling real-time
monitoring of user behavior and changes in the physical
environment (Microsoft, 2022).
Previous research using AR and MR workflows in the
area of architectural fabrication has increased exponen-
tially (Song et al., 2021b). Projects such as Woven Steel,
Timber De-Standardized, Code-Bothy, and many more
have explored human interaction with digital instruc-
tions in MR through digital interfaces such as buttons,
menus and/or fiducial markers such as QR codes and
AruCo markers (Jahn etal., 2018a; Lee, 2022; Lok & Bae,
2022). These MR fabrication projects have focused on
using human interactions with digital interfaces as the
primary means to update the 3DUIs with new informa-
tion. However, there exists an opportunity to directly
incorporate human interaction with physical objects to
update the 3DUI without needing digital interfaces.
The research integrates tactile interactions with physi-
cal objects through real-time gesture recognition as input
to modify and update information in the digital environ-
ment. Through gesture recognition, the user's touch of
a physical object could modify, update, or generate new
digital information, creating seamless stimuli between
the physical and the virtual environments. By record-
ing user gestures as they interact with physical objects,
the three-dimensional user interface can automatically
provide new information in real time. As a result, the
virtual environment could respond dynamically to deter-
mine the real-time location of physical objects in the
digital environment. is human machine collaboration
can generate information such as localizing robotic tool
paths, recognizing components, or measuring inaccura-
cies between the physical and the digital model. e real
time generative data in the MR 3DUI allows the user to
quickly respond to previous actions. e real time, feed-
back-based MR environment represents a cybernetic sys-
tem whereby the outcome of interacting with a physical
object(s) is taken as input for further action, thus creating
a feedback loop until a desired condition is achieved.
The relationships between MR, gestural movement,
digital twins, cybernetics, and human–computer inter-
action are used to help define systems of interaction
between user and machine. From these relationships, the
research presents three distinct Gesture-Based Mixed
Reality (GBMR) fabrication workflows: a) object locali-
zation—registers the location of a physical object in the
digital space, b) object identification—differentiates phys-
ical components using their digital parameters, c) object
calibration—measures discrepancies between the physi-
cal object and associated digital geometry. Each of these
three methods were used in six different tasks to con-
struct the Unlog Tower (Fig. 1). The workflows derived
from this research present new opportunities for human–
machine co-creation within physical and virtual envi-
ronments through MR in architecture and fabrication
industries. The integration of tactile interactions plays a
crucial role in allowing users to engage with digital data
in a hands-on manner, effectively blending the physical
and the virtual environments.
2 State oftheart
Previous research projects have explored AR for Robotic
fabrication to facilitate human–robot collaboration.
“Implementation of an Augmented Reality AR Workflow
for Human Robot Collaboration in Timber Prefabrica-
tion” proposes a user-friendly AR interface to visualize
and manipulate robotic joint orientations, allowing users
to send commands through a menu interface (Kyjanek
et al., 2019). Pop Up Factory, employs an AR interface
that allows users to manipulate digital control points of
a wall assembly, thereby effecting the design of the 3D
model used for subsequent robotic fabrication (Betti
etal., 2019). Lastly, [AR]OBOT, employs an AR interface
to visualize robotic operations in bricklaying applica-
tions. Users can plan the robotic movements by tapping
on digital models of individual bricks (Song etal., 2021a).
These projects have demonstrated the use of AR and MR
interfaces for effective communication in robotic fabrica-
tion. However, these projects have primarily used AR and
MR interfaces for the robotic fabrication of standardized
work materials such as foam blocks, bricks, or dimen-
sion lumber. These projects use AR and MR to engage with
digital control points or menu interfaces. This paper
demonstrates the potential to leverage gestural inputs for
direct interaction with physical objects, providing spatial
data as parameters to enhance collaboration with robotic
fabrication of both standard and non-standard materials.
Innovative fabrication research projects such as Holo-
graphic Construction, Code-Bothy, Woven Steel, Bent, and
Timber De-Standardized 2.0, use interactive “buttons”
for users to toggle between different sets of digital geom-
etry which is visible in the 3DUI (Jahn etal., 2018a, 2019,
2020a; Lee, 2022; Lok & Bae, 2022). Though each of these
projects uses a Microsoft HoloLens with Fologram's plug-in
for Rhino3D and Grasshopper, the "buttons" can equally
be interacted with through one's mobile device. In each of these
precedents, the "button" is a custom, pre-defined clickable
digital object (either mesh or poly-surface). Thereby,
any change in the virtual interface is dependent on the
user interacting with the select, pre-defined “buttons”
or otherwise manipulating other digital geometry. Holo-
graphic Construction and Code-Bothy use digital “but-
tons” to toggle up and down between rows of bricks as
they are laid (Jahn etal., 2020a; Lee, 2022). Code-Bothy
has the added effect of color coordinating the amount
of rotation per brick (Lee, 2022). Woven Steel and Bent
exhibited several buttons to aid in the complex bending
of tube steel and sheet metal (Jahn etal., 2018a, 2019).
Timber De-Standardized 2.0 developed a menu list to visu-
alize different aspects of an inventory of scanned irregu-
lar log meshes as well as cataloging and designing with
the members through operations of slicing, indexing,
baking, and isolating (Lok & Bae, 2022). Though these
precedents offer an interaction between the user and the
digital geometry, the interactions are limited to digital
menus and buttons.
Other research projects such as Timber De-Standard-
ized 1.0, Augmented Feedback, and Augmented Vision
use various methods of AruCo markers for tracking,
physics simulation, and real-time scanning to create an
active responsive environment between digital and physi-
cal objects (Lok etal., 2021; Goepel & Crolla, 2022; Jahn
etal., 2022). In Augmented Feedback, AruCo makers were
placed at nodal intersections of a bending-active bam-
boo grid-shell structure (Goepel & Crolla, 2022). AruCo
marker tracking allowed users to digitize the locations of
the markers and provide graphic feedback for all active
users through the head mounted display (HMD). Timber
De-Standardized 1.0 utilized a physics simulation for fab-
ricators to visualize and virtually “drop” irregular scanned
meshes of logs till they found their resting point, which
allowed for a precise alignment with its associated physi-
cal log (Lok etal., 2021). Finally, Augmented Vision uses
the Hololens 2 to track and scan the user’s environment
then display such information to inform the progress of
constructing a minimal surface with strips of paper and/
or bark (Jahn et al., 2022). These projects have demon-
strated the capabilities of feedback-based MR using addi-
tional systems such as AruCo markers, scanned meshes,
and simulation.
Additionally, the accuracy of AR/MR platforms pre-
sents a significant challenge in many of these AR/MR
fabrication workflows. e accuracy of the fabrication
instructions provided to users depends on the preci-
sion of the system. As a result, several studies have been
conducted to assess the accuracy of AR/MR systems.
Researchers have investigated the use of AR for assem-
bling metal pipes (Jahn etal., 2018b), weaving bamboo
structures (Goepel & Crolla, 2020), and constructing
complex wall systems with bricks within a tolerance
of ± 20 mm (Jahn et al., 2020b). Moreover, there have
been research efforts aimed at improving the accuracy of
AR/MR systems. e paper, “Augmented Reality for High
Fig. 1 The Unlog Tower, Photo by Cynthia Kuo
Page 4 of 18
Kyawetal. Architectural Intelligence (2024) 3:11
Precision fabrication of Glued Laminated Timber Beams”,
has explored the use of multiple QR codes to achieve a
tolerance below 2 mm with the Microsoft HoloLens 2
(Kyaw etal., 2023). e results of this study indicate that
AR/MR systems have the potential to be used for high
precision applications, such as assisting in robotic fabri-
cation and accurate quality control.
3 Aim andObjectives
e research presented in this paper investigates various
applications where several GBMR workflows can lever-
age tactile feedback to enrich the user experience when
interacting with both physical and virtual items. e
paper demonstrates how tactile interactions can be used
to visually enhance the user’s perception with additional
digital information when manipulating physical objects.
e research exhibits how the three described GBMR
workflows can create a more immersive and fluid interac-
tion methodology that capitalizes on the human’s natural
sense of touch, enabling users to physically feel and inter-
act with the virtual environment in a tangible way. While
previous MR projects have focused on using menus,
AruCo markers, scanned meshes, and simulations to
interact with digital geometries, this project investigates
the potential of incorporating user’s tactile interaction
with physical objects as an input to update the 3DUI.
This research has developed six experiments to test three
GBMR fabrication workflows to enhance tactile interac-
tions by generating geometry relative to physical objects,
localizing robotic tool paths, recognizing discrete com-
ponents according to parameters such as height and
length, and measuring inaccuracies between the physi-
cal and the digital models. The paper will first present
the tools and software of the method, which will then be
followed by the three GBMR workflows used to fabricate
the Unlog Tower: a) object localization, b) object iden-
tification, and c) object calibration. Object localization
was used to determine the log geometry work object and
the toolpath placement for robotic fabrication (Method
4.1) (Fig. 2). Object identification is utilized to identify
physical components and display intuitive step-by-step
assembly instructions (Method 4.2). Object calibration is
employed to ensure the adjustment of jigs and the con-
nection of panels match the digital model (Method 4.3).
Each of these workflows will demonstrate new meth-
ods in MR research whereby physical stimuli can become
a generative tool to interact and inform MR fabrication
in real-time. rough gestural interaction, our research
endeavors to redefine the boundaries between the physi-
cal and virtual environments. By showcasing their appli-
cation in the construction of the Unlog Tower, these
workflows demonstrate the potential to optimize fabrication
processes and enhance assembly efficiency and instruction,
thereby contributing to advancement within the field of
building construction.
4 Methods
Through computer vision and a gesture recognition algorithm, the following studies were conducted with a Microsoft HoloLens 2 and Fologram, an AR/MR plug-in for Rhino3D and Grasshopper (Fologram Pty Ltd, 2021; Robert McNeel & Associates, 2022; Rutten, 2022). The
near depth sensing camera on the Microsoft HoloLens 2
is used for articulated hand tracking (AHAT). AHAT is a
computer vision algorithm that tracks the movement and
gestures of the user’s hand, independent from the visible
light cameras (VLCs) used for simultaneous localization and
mapping (SLAM). The articulated hand tracking system
recognizes and records twenty-five 3D joint positions
and rotations, including the wrist, metacarpals, proximal,
Fig. 2 Workflow diagram outlining the various assembly and fabrication process
Page 5 of 18
Kyawetal. Architectural Intelligence (2024) 3:11
distal, and fingertip joints (Ungureanu et al., 2020). This
data is live streamed from the HoloLens 2 device to Rhi-
no3D and Grasshopper via Wi-Fi. The Microsoft AHAT
API provides access to the built-in gestural recognition
algorithm of the HoloLens 2, enabling the utilization of
its advanced capabilities for hand tracking purposes. The
joint configuration and orientation obtained from AHAT
can facilitate the estimation of hand poses, such as pinch-
ing, tapping, or poking (Taylor etal., 2016).
This study focuses on the use of pinching as the pri-
mary mode of gestural interaction by the user. The pinch-
ing gesture is recognized when the thumb tip and index
fingertip are in close proximity (Fig. 3). Therefore, a
device capable of AHAT programming is imperative for
gesture recognition and therefore is integral to the GBMR
workflows. Gestural recognition plays an important role
in tracking tactile interactions and serves as the input for
human–machine collaboration in GBMR workflows.
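To make the gesture input concrete, the following minimal Python sketch illustrates how a pinch event could be inferred from streamed joint positions by comparing the thumb tip and index fingertip against a proximity threshold. The coordinates, the 15 mm threshold, and the function names are illustrative assumptions; the project itself relied on the HoloLens 2 gesture recognition accessed through Fologram and Grasshopper rather than on this code.

```python
import math

# Hypothetical joint positions streamed from the headset (millimeters).
# In the built workflow these arrive in Grasshopper via Fologram; here they
# are plain (x, y, z) tuples so the sketch runs on its own.
PINCH_THRESHOLD_MM = 15.0  # assumed proximity threshold, not specified in the paper


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def is_pinching(thumb_tip, index_tip, threshold=PINCH_THRESHOLD_MM):
    """Report a pinch when the thumb tip and index fingertip are close together."""
    return distance(thumb_tip, index_tip) < threshold


# Example frame: fingertips roughly 8 mm apart, so a pinch is registered
# and a point would be recorded at that moment.
thumb = (102.0, 245.0, 310.0)
index = (108.0, 247.0, 305.0)
if is_pinching(thumb, index):
    print("Pinch detected: record a point for the GBMR workflow.")
```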
4.1 Object localization
e Unlog Tower exhibits robotically kerfed timber round
woods that have been stretched along two threaded rods
to form panels through a similar method exhibited at
the Unlog pavilion at University of Virginia (Lok etal.,
2023). Logs are irregular geometries that are comprised
of knots and are sometimes curved, but can nonethe-
less be abstracted to a cylinder in most cases.
Fig. 3 Digital twin of HoloLens 2 headset location, joint configuration, and orientation from AHAT (Articulated Hand Tracking); visualized through headset (left); visualized through Rhino3D and Grasshopper (right)
Six ash logs with minor deformations were used to construct the tower; each log was first cut in half and then robotically
kerfed. To localize the robot targets and cut the log in
half using a 6-axis robotic arm with a 5hp bandsaw end-
effector, the object localization method was employed. The
user placed three points at both ends of the log to create
two individual circles that generated a cylindrical mesh
which was superimposed with the physical log (Fig.4).
Each point was created by the user pinching their right-
hand index finger to their thumb. This feedback mecha-
nism provides the user with a visual confirmation of the
digitization process by displaying a point for each gesture
recorded. From the cylindrical mesh, a surface was gener-
ated in the middle of the cylinder whereby the robot tool
path could be derived from the robot targets at either end
of the surface using Robot Components (Deetman etal.,
2023), a robot programming plug-in for ABB robots in
Grasshopper that is then copied into Robot Studio, an
ABB software for programming ABB robots (ABB, 2023).
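As an illustration of the geometric step, the sketch below fits a circumscribed circle through three gesture points at each end of the log and derives the cylinder axis and mid-plane origin from the two circle centers. The point values and helper functions are assumptions made for a self-contained example; in the project this construction was carried out in Grasshopper, with Robot Components generating the actual robot targets.

```python
import math

def sub(u, v): return (u[0] - v[0], u[1] - v[1], u[2] - v[2])
def add(u, v): return (u[0] + v[0], u[1] + v[1], u[2] + v[2])
def scale(u, s): return (u[0] * s, u[1] * s, u[2] * s)
def dot(u, v): return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def circle_from_three_points(p1, p2, p3):
    """Circumscribed circle (center, radius) through three pinched points."""
    a, b = sub(p1, p3), sub(p2, p3)
    axb = cross(a, b)
    denom = 2.0 * dot(axb, axb)                      # zero if the points are collinear
    num = cross(sub(scale(b, dot(a, a)), scale(a, dot(b, b))), axb)
    center = add(p3, scale(num, 1.0 / denom))
    return center, math.dist(center, p1)

# Three gesture points at each end of the log (units and values illustrative).
center_a, radius_a = circle_from_three_points((10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (-10.0, 0.0, 0.0))
center_b, radius_b = circle_from_three_points((10.0, 0.0, 96.0), (0.0, 10.0, 96.0), (-10.0, 0.0, 96.0))
axis = sub(center_b, center_a)                        # cylinder axis between the two circle centers
mid_plane_origin = add(center_a, scale(axis, 0.5))    # mid-surface from which robot targets are derived
print(radius_a, radius_b, mid_plane_origin)
```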
Once the log was cut in half, one half of the log was
rotated 90° and remounted in the robot cell. According
to the structural requirements for the Unlog Tower, the
cross section of each board was to be no less than 5 inches by
0.75 inches. Figure 5 demonstrates the process whereby the
user would locate the half log in the robot cell by placing
three points: two at one side of the half log to determine
the diameter and one at the opposite end to determine
the length of the half log (Fig. 5). After the log geometry
was defined, the user set the location of the cut geometry
by placing a point on the profile of the log (Fig. 6).
The MR workflow offered the user ongoing feedback
throughout the process by performing a validation to
determine whether the cut geometry falls within the
boundary of the log. In the event that the cut geometry
was placed outside the log or was situated too close to
the log mount, a red notation with a cross mark was
displayed within the 3DUI (Fig. 7a and b). The user
responded to the alert and adjusted the location of the
cut geometry until a satisfactory outcome was achieved,
represented by a green notation (Fig. 7c). The fabricator
was to check the location of the cut surfaces within the
log to ensure that the boards met the minimum cross-
sectional requirements without any of the cut surfaces
colliding with the 4 × 4 log mounts. The object locali-
zation workflow allows users to define points in the
digital space that represent the physical log for work-
object localization during robotic fabrication (Fig. 8).
Fig. 4 Object localization is used to generate the location of a cylinder according to the diameter(s) of the log to automate the placement
of the robotic toolpaths
Fig. 5 Object localization is used to determine the work object placement for robotic fabrication
An ABB IRB 6700 on a 4200mm external linear track
was used to cut each half log into robotically kerfed,
bending-active panels (Fig.9).
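The ongoing red/green validation described above can be summarized as a simple containment check, sketched below under assumed parameter names and an assumed minimum mount clearance; the project performed the equivalent test in Grasshopper against the localized log geometry.

```python
def cut_placement_status(cut_offset, log_radius, mount_clearance,
                         min_clearance=1.0):
    """Return a traffic-light status for a proposed cut location.

    cut_offset      -- distance of the cut plane from the log axis
    log_radius      -- radius of the localized cylinder
    mount_clearance -- distance from the cut to the 4 x 4 log mount
    min_clearance   -- assumed minimum allowable mount clearance (inches)
    """
    if abs(cut_offset) >= log_radius:
        return "red: cut geometry lies outside the log"
    if mount_clearance < min_clearance:
        return "red: cut geometry too close to the log mount"
    return "green: cut placement accepted"


# Example check: a cut 2 inches off-axis in an 8-inch-radius log, well clear of the mount.
print(cut_placement_status(cut_offset=2.0, log_radius=8.0, mount_clearance=6.0))
```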
4.2 Object identication
Object identification was used to differentiate between
self-similar physical components and display intuitive
step-by-step assembly instructions.
Fig. 6 Gestural inputs are used to register the location of a physical object in the digital space for robotics
Fig. 7 Object localization is used to determine the placement of the toolpath for robotic fabrication
Fig. 8 Object localization system diagram describing how user interactions with physical objects are used to create digital data through gestural recognition
After the half logs have been robotically kerfed, they are set aside and pre-
pared for finger jointing. The finger joint template not
only includes an outline for the finger joints, but also
an outline for the hole that the threaded rod would pass
through. Because of the parametric design of the kerfed
timber panels for the Unlog Tower, the finger joint loca-
tions are staggered between adjacent boards within each
half log (Fig.10).
In order to correctly mark the location of the finger
joints and the location for the threaded rod holes in each
board layer, GBMR was employed for object identifica-
tion. Each board layer had a defined thickness of 0.75
inches. erefore, the height of the virtual templates
were at intervals of 0.75 inches (e.g., Layer 1: 0.75 inches,
Layer 2: 1.5 inches, Layer 3: 2.25 inches, and so on).
Object identification was specifically used to identify the
board layer that the user was working on to display the
corresponding virtual template location. e workflow
determines the corresponding virtual template to display
by comparing the height of the user-defined point with
the height of the virtual templates from the ground plane
(Fig.11). For instance, if the user specifies a gestural point
positioned 1.43 inches above the ground, the system will
match this value with the nearest layer height within
a virtual template. In this scenario, the system will pre-
sent layer 2 as the closest match, positioned at 1.5 inches
above the ground, which closely corresponds to the input
of 1.43 inches. e virtual template had an added nota-
tion that visually communicated to the user which layer
they were working on, so that the user could be sure that
the physical template was appropriately placed. e fin-
ger joints were cut with an oscillating saw and drill, while
the holes for the threaded rods were drilled with a hole
saw (Fig.12). is object identification workflow allows
for fluid transition between the physical world and the
digital overlays, where users can simultaneously navigate
digital instruction and fabricate physical geometries.
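A minimal sketch of this layer-matching logic is given below, assuming board layers at the 0.75-inch intervals described above; the function name and layer count are illustrative.

```python
LAYER_THICKNESS = 0.75  # inches, per the board layer definition


def identify_layer(point_height, layer_count=5, thickness=LAYER_THICKNESS):
    """Match a gestural point height to the nearest board layer.

    Layer n sits at n * thickness above the ground plane, so the closest
    layer is found by comparing the input height against each layer height.
    """
    layers = [(n, n * thickness) for n in range(1, layer_count + 1)]
    return min(layers, key=lambda layer: abs(layer[1] - point_height))


# A point 1.43 inches above the ground matches layer 2 at 1.5 inches.
layer, height = identify_layer(1.43)
print(f"Display template for layer {layer} ({height} in)")
```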
Additionally, object identification was used to index and
coordinate between self-similar parts. Through gestural
recognition, tactile interactions with physical geometries
were recorded as digital points.
Fig. 9 6-axis robotic arm with a 5hp bandsaw end-effector cutting a log after object localization
Fig. 10 Staggered board layers depending on kerf panel geometry and parameter
These points were sorted in the order of registration to calculate the distance
between each gesture. This distance parameter was used
to match the corresponding digital instruction for the
user. is human–machine collaboration was exhibited
in the fabrication of the reciprocal tube steel frames in
the Unlog Tower. To brace the kerfed wood panels, the
interior of the tower exhibited 3 sets of steel tube frames.
Due to the custom design of the steel tube frames, there
were nine unique tube lengths amongst 54 total steel
tubes (Fig.13). Seven of the nine steel tube lengths were:
17.27 inches, 18.82 inches, 22.28 inches, 23.20 inches,
24.83 inches, 27.72 inches, and 32.93 inches. After the
steel tubes were cut to length, object identification was
employed to index the tube steel according to their length
and communicate the location of the tube steel in the dig-
ital model(s) (Fig.14a and c). By placing a point at either
end of the tube steel through gesture recognition,
the user would define the length of the object, which was
checked against a list of tube steel lengths predetermined
in the digital model. If the value between the user defined
length and a predefined length was within tolerance (see
Table 2 in the Results), the 3DUI displayed the corre-
sponding digital information to the user through nota-
tion and two coordination models that visually indicated
the location of the tube steel within the overall struc-
ture and highlighted the selected member from blue to
red. The coordination model on the left (Fig. 14b and d)
illustrated at 1:1 scale the tube steel location within the
associated tube steel frame and the coordination model
on the right (Figs.14a, b, 15c, and 14d) illustrated at 1:10
scale a virtual model of the Unlog Tower with the location of the tube steel within the whole model.
Fig. 11 Object identification is utilized to identify physical components and display intuitive step-by-step assembly instructions
Fig. 12 Robotically kerfed logs with finger joints and threaded rod holes
By using
predetermined distances and gestural recognition, Object
Identification was used to pair digital assembly instruc-
tions with the identified physical object (Fig.15).
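The length-based identification can be sketched as a nearest-match lookup against the nominal tube lengths listed in Table 2; the dictionary structure, example points, and function name below are assumptions rather than the project's Grasshopper definition.

```python
import math

# Nominal tube lengths from the digital model (inches); Table 2 lists the
# gestural input tolerance range associated with each type.
TUBE_LENGTHS = {
    "Type A": 17.27, "Type B": 18.82, "Type C": 22.28, "Type D": 23.20,
    "Type E": 24.83, "Type F": 27.72, "Type G": 32.93,
}


def identify_tube(p_start, p_end, catalog=TUBE_LENGTHS):
    """Identify a steel tube from two gesture points placed at its ends."""
    measured = math.dist(p_start, p_end)
    name, nominal = min(catalog.items(), key=lambda kv: abs(kv[1] - measured))
    return name, nominal, measured - nominal  # deviation, compared against the tolerance


# Two pinch points roughly 23.05 inches apart map to Type D (23.20 inches).
print(identify_tube((0.0, 0.0, 0.0), (23.05, 0.0, 0.0)))
```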
4.3 Object calibration
In order for the kerfed logs to splay out into panels, the
threaded rods had to have pre-located hex nuts appro-
priately placed to ensure that each board member would
be in the correct location. In the GBMR workflow, object
calibration was employed to place the hex nut locator
correctly along a plywood jig. The hex nut locator was
3D printed with PLA to firmly hold each hex nut when
it was screwed into the plywood board. A digital twin
was created for each hex nut locator. This 3D printed hex
nut locator had a handle that protruded 0.25 inches with
a thickness of 0.125 inches. When the user pinched the
handle on the hex nut locator, object calibration would
use gesture recognition to continuously track this move-
ment, thereby synchronizing the digital geometry with
the physical.
Fig. 13 Reciprocally framed tube steel in the Unlog Tower, photo by Cynthia Kuo
Fig. 14 Object identification is utilized to identify physical components and display part-to-whole assembly instructions
As the physical object moved closer to the goal position, the notation would transform from red to yellow to green once the physical object was properly located
(Fig.16).
This workflow represented a cybernetic system in
which the adjustment of the physical locator position
would generate new virtual feedback for the user, thus
creating a feedback loop until the desired condition was
attained. The desired condition was achieved when the
digitized physical location of the hex nut locator was
within a tolerance of 0.125 inches. This was indicated
to the user via the notation system where the red or yel-
low cross turned into a green tick. The MR system would
instruct the user to move onto the next hex nut loca-
tor only after the previous hex nut locator was correctly
placed via gesture recognition. After all the hex nut loca-
tors were properly placed, a threaded rod was screwed
through the jig (Fig.17).
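The color-coded feedback rule can be sketched as follows; the 0.125-inch tolerance is taken from the text, while the intermediate yellow band and the naming are assumptions.

```python
import math

TOLERANCE = 0.125  # inches, per the hex nut locator tolerance


def calibration_status(physical_point, goal_point, tolerance=TOLERANCE,
                       warn_band=0.5):
    """Traffic-light feedback as a tracked locator approaches its goal.

    warn_band is an assumed distance (inches) at which the notation turns
    from red to yellow; the paper only specifies the red/yellow/green order.
    """
    deviation = math.dist(physical_point, goal_point)
    if deviation <= tolerance:
        return "green"   # locator placed, advance to the next hex nut
    if deviation <= warn_band:
        return "yellow"  # close to the goal position
    return "red"         # keep moving the locator


# A locator 0.1 inches from its goal is within tolerance -> "green".
print(calibration_status((10.1, 4.0, 0.0), (10.0, 4.0, 0.0)))
```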
For the panel assembly, the robotically kerfed logs were
splayed out along two threaded rods with pre-located
hex nuts as was done in the Unlog pavilion (Lok et al., 2023) (Fig. 18).
Fig. 15 Object identification system diagram describing how digital assembly is filtered through object identification via gestural recognition
Fig. 16 Object calibration is employed to ensure the hex nut locators are adjusted to match the digital model. As the physical hex nut locator moves closer to its digital position, the notation would transform from red to yellow to green
Fig. 17 After all the hex nut locators were properly placed, a threaded rod is screwed through the jig
Temporary custom slip washers were
placed between the hex nut and the board to ensure that
the boards would keep their position until joined into
larger prefab components with steel slip washers. Once
the panels were joined together in larger prefab compo-
nents, object calibration was used to check the location of
each board as they were fixed into place (Fig. 19). This
quality control step aligned a digital model of the goal
geometry to the physical panel using the placement of a
QR code. The physical locations of the boards were deter-
mined by using GBMR to place a point at the center of
the finger joint location on each board, which was auto-
matically checked against the closest digital board from
the 3D model. The deviation between the GBMR input
board location and the digital board was allowed a 0.125
inch tolerance. A red cross notation indicated that the devia-
tion was outside the tolerance; otherwise, a green check
notation would appear to indicate that the board was cor-
rectly placed.
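A comparable sketch of the panel quality-control check is shown below; only the 0.125-inch tolerance is taken from the text, and the board identifiers, coordinates, and function name are hypothetical.

```python
import math

PANEL_TOLERANCE = 0.125  # inches


def check_board(gesture_point, digital_boards, tolerance=PANEL_TOLERANCE):
    """Compare a pinched finger-joint center against the closest digital board.

    digital_boards maps board ids to their finger-joint center points from
    the aligned 3D model (ids and coordinates here are illustrative).
    """
    board_id, target = min(digital_boards.items(),
                           key=lambda kv: math.dist(kv[1], gesture_point))
    deviation = math.dist(target, gesture_point)
    return board_id, deviation, "green check" if deviation <= tolerance else "red cross"


boards = {"board_07": (12.00, 3.75, 48.0), "board_08": (12.00, 4.50, 48.0)}
# A gesture point within about 0.07 inches of board_07 passes the check.
print(check_board((12.05, 3.70, 48.02), boards))
```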
Object Calibration, as a quality control step, ensured
that the parametrically defined wall panels were prop-
erly calibrated into larger prefab wall elements that were
then transported to the site for assembly (Fig. 20). The
utilization of gestural recognition allowed the machine
to record the user's tactile interactions with physical objects.
By measuring the distance between the physical and the
digital objects, the machine can understand the fabrica-
tion tolerances in real time and provide immediate
visual feedback to the user (Fig. 21).
5 Results andDiscussion
e implementation of gesture recognition for GBMR
was incredibly useful for the fabrication of irregular and
parametrically defined building components exhibited in
the construction of the Unlog Tower. e prefab wall pan-
els were attached to the tube steel reciprocal frames on
site and lifted onto the foundation with a boom forklift
(Fig.22). e Unlog Tower was on display for 6months
until it was deinstalled in March of 2023.
Gestural recognition in MR fabrication workflows
allowed users to define physical objects without the
arduous placement of AruCo markers. The object local-
ization workflow demonstrates that gesture recogni-
tion can be employed to locate robot work object data
(Fig.8). However, the utilization of gesture recognition
assumes a certain level of dexterity on the part of the
user, as the data is dependent on the fidelity and accu-
racy of the user’s fingers. During the experiment, no
issues were encountered regarding the fidelity of the
user’s finger. Since robotic fabrication was utilized for
kerfing logs, the workflow achieves its intended out-
come as long as the work object remains within the width of the robotic bandsaw.
Fig. 18 Transformable material system at two phases: collapsed kerf log (left) and stretched kerf log (right)
Fig. 19 Object calibration is employed for quality control of prefab wall components
Fig. 20 Aerial of the kerfed panels assembled into larger wall components, photos by Cynthia Kuo
Fig. 21 Object calibration is employed to ensure the adjustment of jigs and the connection of panels match the digital model
Fig. 22 Aerial photograph of the Unlog Tower lifted on the foundation pad with a boom forklift, photo by Cynthia Kuo
However, robotic fabri-
cation processes such as milling might require higher
accuracy. Future studies will investigate how the object
localization workflow can be modified for robotic fab-
rication procedures that require tighter tolerances.
Alternatively, improvements in the AHAT, articulated
hand tracking, on the Microsoft HoloLens 2 would also
increase the accuracy of the overall system and the res-
olution of the work object placement.
The research also describes the potential of using ges-
tural tracking for object identification whereby the user’s
hands can be intuitively used to index and coordinate
assembly of self-similar parts based upon predefined
parameters (Fig.15). e allowable range of a user posi-
tioned points through gesture recognition is defined as
the gestural input tolerance. As object parameters are rela-
tive to one another, the gesture input tolerance is also rela-
tive to adjacent parameters within a list, so the lower limit
of the gesture input tolerance for a specific object x_n can be
found by calculating the midpoint between the predefined
parameters of the preceding object x_(n-1) and object x_n. The
upper limit of this range can be determined by calculating
the midpoint between the predefined parameters of the sub-
sequent object x_(n+1) and object x_n (Eq. 1).

\left[ \frac{x_{n-1} + x_n}{2},\ \ \frac{x_n + x_{n+1}}{2} \right]    (1)
In the first Object Identification experiment, gestural
input was used for board layer identification. In this
context, the gesture input tolerance refers to the accept-
able range within which a user’s gestural inputs must fall
for the system to accurately identify the corresponding
board layer. (Table 1). For example, the gesture input
tolerance for layer 2 is between 1.175 and 1.825 inches.
is means any gestural input falling below the lower
limit of 1.175 inches will correspond to the virtual tem-
plate of layer 1, while any input above the upper limit
corresponds with layer 3. The lower limit of the gesture
input tolerance for layer 2 is calculated by finding the
midpoint between the heights of layers 1 and 2, while the
upper limit is the midpoint between layers 2 and 3.

Table 1 Gestural Input Tolerance and Identification Threshold for Uniform Board Layer Identification

Board Layer No.   Layer Height    Gestural Input Tolerance   Identification Threshold
Layer 1           0.75 inches     0 to 1.175 inches          ±0.375 inches
Layer 2           1.5 inches      1.175 to 1.825 inches      ±0.375 inches
Layer 3           2.25 inches     1.825 to 2.25 inches       ±0.375 inches
Layer 4           3.0 inches      3.0 to 3.37 inches         ±0.375 inches
Layer 5           3.75 inches     3.37 to 4.5 inches         ±0.375 inches
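Equation 1 can be transcribed directly into code. The sketch below applies it to the nominal tube lengths later analyzed in Table 2, where the computed midpoints reproduce the tabulated tolerance bounds; treating the first and last items as open-ended is an assumption of the sketch.

```python
def input_tolerance(params, n):
    """Gesture input tolerance for item n (0-indexed) per Eq. 1: the range is
    bounded by the midpoints with the preceding and subsequent parameters."""
    lower = (params[n - 1] + params[n]) / 2 if n > 0 else float("-inf")
    upper = (params[n] + params[n + 1]) / 2 if n < len(params) - 1 else float("inf")
    return lower, upper


tube_lengths = [17.27, 18.82, 22.28, 23.20, 24.83, 27.72, 32.93]  # inches, Types A-G
print(input_tolerance(tube_lengths, 3))  # Type D -> (22.74, 24.015), matching Table 2
```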
Another value that was used to measure the robustness
of the system is the identification threshold. The identifi-
cation threshold represents the smallest allowable devia-
tion the user’s gestural input can have before the system
identifies the wrong object. The identification threshold of
object x_n can be calculated by finding the lesser difference
between the geometry parameter of object x_n and that
of its preceding object x_(n-1) and subsequent object x_(n+1)
(Eq. 2).

f(x_n) =
\begin{cases}
-\frac{1}{2}\min(|x_{n-1} - x_n|,\ |x_n - x_{n+1}|), & \text{if } |x_{n-1} - x_n| < |x_n - x_{n+1}| \\
+\frac{1}{2}\min(|x_{n-1} - x_n|,\ |x_n - x_{n+1}|), & \text{if } |x_{n-1} - x_n| > |x_n - x_{n+1}| \\
\pm\frac{1}{2}\min(|x_{n-1} - x_n|,\ |x_n - x_{n+1}|), & \text{if } |x_{n-1} - x_n| = |x_n - x_{n+1}|
\end{cases}
(2)

The identification threshold is negative if the preceding object (x_(n-1)) has a smaller
difference. The identification threshold is positive if the subsequent object (x_(n+1))
has a smaller difference. If the two values are equal, then the identification threshold
has both a positive and a negative value. In this experiment, the identification threshold for
all board layers is ±0.375 inches. This means any gestural
input deviating by more than 0.375 inches from the
object's layer height will result in a misidentification. Dur-
ing the board layer identification experiments, the system
was able to accurately identify all corresponding layers
without any identification errors.
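The signed threshold of Eq. 2 can be sketched in the same way. The half factor reflects the reported values (±0.375 inches for the board layers and −0.46 inches for Type D), and the handling of the first and last items in the list is an assumption of the sketch.

```python
def identification_threshold(params, n):
    """Signed identification threshold for item n per Eq. 2: half the smaller gap
    to an adjacent parameter, negative when the preceding item is closer and
    positive when the subsequent item is closer."""
    prev_gap = abs(params[n] - params[n - 1]) if n > 0 else float("inf")
    next_gap = abs(params[n + 1] - params[n]) if n < len(params) - 1 else float("inf")
    magnitude = min(prev_gap, next_gap) / 2
    if prev_gap < next_gap:
        return -magnitude
    return magnitude  # equal gaps: the threshold applies in both directions (±)


layer_heights = [0.75, 1.5, 2.25, 3.0, 3.75]                      # board layers, inches
tube_lengths = [17.27, 18.82, 22.28, 23.20, 24.83, 27.72, 32.93]  # Types A-G, inches
print(identification_threshold(layer_heights, 2))                  # 0.375 (±) for a middle layer
print(identification_threshold(tube_lengths, 3))                   # approximately -0.46 for Type D
```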
The second experiment in Object Identification recognizes
distinct tube steel types by utilizing varying lengths of the
members as geometry parameters. In contrast to the ini-
tial experiment, which focused on incremental differences
in layer height, this experiment involves tube steel length
variations with non-uniform differences among individual
members. Due to this non-uniform variation, the gesture
input tolerance between each member was drastically dif-
ferent. For example, Type D has a gestural input tolerance
from 22.74 inches to 24.015 inches, which is a range of
1.275 inches, and Type G has a gestural input tolerance
from 30.325 to 35.535 inches, which is a range of 5.21
inches (Table 2). As a result, it is more likely for a user's
gestural input to fall out of bounds for Type D compared to
Type G. However, Type D can be misidentified as either
Type C or Type E. The identification threshold indicates whether the system
is more likely to misidentify Type D as Type C or as Type E. The identification threshold of Type D is
-0.46 inches. In reference to Eq. 2, the negative value of the identi-
fication threshold was attributed to the smaller difference with
the preceding object. Therefore, the likelihood of the system
misclassifying Type D as Type C was higher. Throughout
the experiment, there were two instances of error recorded
during the five documentation trials. Both of these errors
occurred when the system mistook Type D as Type C.

Table 2 Gestural Input Tolerance and Identification Threshold for Steel Tube Identification

Steel Tube Type   Steel Tube Length   Gestural Input Tolerance    Identification Threshold
Type A            17.27 inches        16.495 to 18.045 inches     +0.775 inches
Type B            18.82 inches        18.045 to 20.55 inches      -0.775 inches
Type C            22.28 inches        20.55 to 22.74 inches       +0.46 inches
Type D            23.20 inches        22.74 to 24.015 inches      -0.46 inches
Type E            24.83 inches        24.015 to 26.275 inches     -0.815 inches
Type F            27.72 inches        26.275 to 30.325 inches     -1.445 inches
Type G            32.93 inches        30.325 to 35.535 inches     -2.605 inches
When comparing the two object identification experi-
ments, the identification threshold in the first experiment
had a consistent value of ±0.375 inches. While this value is
smaller than the identification threshold of Type D in the
second experiment, there was no error recorded in the first
experiment. However, it is also important to note that in
the first experiment, the user only needed to input one
gestural point for the system to read the layer height. In
the second experiment, the user needed to input two ges-
tural points to register the tube steel length. Registering
two points means that the identification through gestural
recognition could have an increased possibility of error.
Future research will conduct a precision study on how
the number of gestural points can lead to a higher dis-
crepancy. e results also indicate that type of geometry
parameters has a significant role in the performance of an
object identification workflow using the GBMR method.
Currently, the object identification method utilizes the
varying lengths and heights of components as the param-
eter. Future studies could incorporate other geometric
parameters such as the boundary geometry or volume in
the workflow.
The research underscores a critical aspect of visual
feedback of human–machine collaboration by devel-
oping visualization strategies for various fabrication
tasks. For true collaboration to exist, there must also be
a mutual understanding between the user and the sys-
tem. e machine must be able to comprehend the user’s
input, and the human must also be able to understand
the machine’s outputs. Utilizing gestural recognition, the
machine is capable of capturing and processing interac-
tions initiated by users. Subsequently, the machine gener-
ates outputs that enhance the user’s tactile interaction by
providing real-time visual feedback.
In the case of the object localization workflow, the
accuracy of the gesture recognition is limited to the user’s
finger precision. e tactile interaction is enhanced with
visual feedback by displaying a sphere at the location of
the placement point to verify the physical input. Prelimi-
nary experiments have recorded users recalling their tac-
tile interactions when they notice discrepancies displayed
in the visual feedback. is visual feedback enhancement
enables users to see errors between physical action and
the digital output.
Integrating visual perception also plays a crucial role
in the object identification workflows, where 3D draw-
ings and instructions are dynamically updated based on
the user’s tactile interactions. During the kerf panel fab-
rication, we noticed that it was challenging to identify
whether a task had been registered without clear labeling on each panel
layer. Specific labels and colors have been added as a form
of visual feedback to draw attention to updated informa-
tion. During the steel frame fabrication, the change in
color highlighting the selected member allows the user to
confirm that their object identification was successful.
Finally, the object calibration workflow showcases a
synchronized method for users to link physical objects
with their digital twins (Fig. 21). The threaded rod test
was unique in that the user could pinch the hex nut loca-
tor while moving the physical object. Visual feedback
was used to enhance tactile interaction through
color coordination. For example, the instructions can
shift colors from red, to yellow, to green in response to
the user's physical inputs, effectively helping the user antici-
pate when they are close to the goal location. Users
have reported that the visual feedback provides them
with more confidence in their actions during the fabri-
cation process. Through the employment of this workflow,
all 24 threaded rods of the Unlog Tower were success-
fully fabricated as intended. The second object calibration
experiment with the panel quality control demonstrated
that some objects are too heavy or cumbersome to pinch
while moving. For that reason, the second test demon-
strated the use of gesture recognition to iteratively define
critical points until the physical geometry aligned with
the digital model.
With the development of Gesture-Based Mixed Real-
ity workflows for object localization, identification, and
calibration, the research advances current fabrication
processes by enabling real-time feedback through tactile
interaction. By enabling direct interaction with three-
dimensional holographic instructions, the need for two-
dimensional drawings in other fabrication processes is
eliminated, allowing for a more interactive and tactile
engagement with the fabrication tasks. Without relying
on physical measurement tools such as measuring tapes
or rulers associated with common fabrication practices,
the method can handle complex, parametric, and irregu-
lar geometries while accounting for fabrication errors.
This workflow can also have a drastic impact on the
industry and the manpower involved in the fabrica-
tion process. By changing the nature of how fabrication
drawings and technical documentation are produced, the
workflow makes it easier for teams to understand and fol-
low complex fabrication instructions. Previously, reading
technical drawings would be limited to those with spe-
cialized training in architecture or construction. While
using a mixed reality headset still requires training, it
presents a lower barrier to entry for certain fabrication tasks.
The use of interactive fabrication instructions and real-
time feedback opens up opportunities for experts and
nonexperts to fabricate highly customized and unique
geometries. The research also presents opportunities for
fabricators to develop future projects that employ this
method to coordinate and educate subcontractors on the
construction of parametric components with discretized
or self-similar parts. The use of gesture recognition and
MR in fabrication projects is not just about improving
human–machine collaboration; it’s also about enhancing
human–human collaboration.
6 Conclusion
e future potential of using gesture recognition in
MR fabrication projects is enormous. The presented
research not only demonstrates that real time feedback
through gesture recognition is imperative for advanced
MR fabrication projects, but it can also be used in
robotics, geometry creation, object indexing, model
coordination, interactive digital twin, and complex
quality control. In the age of automation, the research
highlights the importance of integrating human
interaction into machine processes. The research pre-
sents a concurrent bi-directional human–machine col-
laboration workflow. The focus is not solely on humans
giving commands to machines or machines directing
humans. Instead, it is about fostering a deeper under-
standing and synergy between both entities, working
collaboratively to improve and optimize outcomes.
The integration of tactile interaction and gesture rec-
ognition embodies this collaboration, enabling users
to not only interface with the digital environment but
also to effectively collaborate with machine generated
information.
The insights gained from the experiments conducted
in this study pave the way for future explorations, offer-
ing innovative approaches to integrate physical stim-
uli as generative tools for MR fabrication in real-time.
Future investigations will seek to improve the accuracy
of this method for high precision fabrication projects and
explore the potential of incorporating a wider range of
gestures, such as "tap", "poke", and "pinch". Additionally,
the development of a user-controlled interface to manage
recognized gestures, enabling actions such as enable/dis-
able or undo, will further refine the collaborative dynam-
ics between the user and the system.
This research demonstrates how gesture-based mixed
reality workflows can provide a tangible interface to
simultaneously interact with both physical objects and
digital content within mixed reality environments. By
leveraging tactile interactions, the workflow redefines
the boundaries between the physical and digital domains,
ultimately pushing the limits of immersive technol-
ogy for feedback-based human–machine collaboration in
construction and related fields. The three GBMR work-
flows exhibited in this paper demonstrate the various
applications for the real-time feedback-based fabrication
and assembly of the Unlog Tower. This phygital experi-
ence offers a whole series of future applications and inves-
tigations in the field of Mixed Reality fabrication and
Human–Machine co-creation.
Acknowledgements
This research was conducted as part of the project, Unlog Tower, exhibited at
the 2022 Cornell Biennial which was curated by Timothy Murry. The authors of
this research would like to thank Tim Murry and Tina DuBois for their generos-
ity, encouragement, and patience through the realization of this research.
Special recognition for the invaluable contributions provided by the project
collaborators, Sasa Zivkovic for the collaborative conceptualization, design,
and construction of the tower, Kurt Jordan for the research of regional long-
house typologies in the exhibition component, and Matthew Reiter for the
structural engineering of the tower. The authors would like to acknowledge
the contributions by research assistants Shihui Xie, and the assembly team:
Yuxuan Xu, Andrea Zvonar, Cook Shaw, Sahil Adnan, and Benjamin Ezquerra.
Authors’ contributions
Conceptualization and Methodology: Alexander Htet Kyaw, Lawson Spencer,
and Leslie Lok; Formal analysis and investigation: Alexander Htet Kyaw; Princi-
pal Investigator and Funding acquisition: Leslie Lok; Co-writing: Leslie Lok and
Lawson Spencer.
Funding
The research was funded by the Cornell Council for the Arts (CCA) grant. The
grant was awarded to the participating designer Leslie Lok. Additional partial
funding was provided by Cornell University College of Architecture, Art, and
Planning.
Availability of data and materials
The raw data supporting the conclusions of this article will be made available
by the authors upon request.
Declarations
Ethics approval and consent to participate
Not Applicable.
Consent for publication
Upon review and approval from the scientific peer review committee, the cor-
responding author of this paper consents to the publication of this article.
Competing interests
On behalf of the authors, the corresponding author states that the research was con-
ducted in the absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.
Received: 31 October 2023 Accepted: 20 February 2024
References
ABB. (2023). RobotStudio [RAPID; Windows]. ABB.
Betti, G., Aziz, S., & Ron, G. (2019). Pop Up Factory: Collaborative Design in
Mixed Rality Interactive live installation for the makeCity festival, 2018
Berlin. Blucher Design Proceedings, 115–124. https:// doi. org/ 10. 5151/
proce edings- ecaad esigr adi20 19_ 425.
Deetman, A., Wannemacher, B., & Rumpf, G. (2023). Robot Components (1.5.1)
[C++; Windows]. Robot Studio.
Fologram Pty Ltd. (2021). Fologram (Version 2020/02/15) [Windows]. Folo-
gram Pty Ltd.
Goepel, G., & Crolla, K. (2020). Augmented Reality-based Collaboration—
ARgan, a bamboo art installation case study. Proceedings of the 25th
International Conference of the Association for Computer-Aided Architec-
tural Design Research in Asia, 313–322. https:// doi. org/ 10. 52842/ conf.
caadr ia. 2020.2. 313.
Goepel, G., & Crolla, K. (2022). Augmented Feedback: A case study in
Mixed-Reality as a tool for assembly and real-time feedback in bamboo
construction. In K. Dörfler, S. Parasho, J. Scott, B. Bogosian, B. Farahi, J.
L. García del Castillo y López, J. A. Grant, & V. A. A. Noel (Eds.), ACADIA
2021: Toward Critical Computation (pp. 232–237). ACADIA. https:// doi.
org/ 10. 52842/ conf. acadia. 2021. 232
Jahn, G., Newnham, C., & Beanland, M. (2018a). Making in Mixed Reality.
Holographic design, fabrication, assembly and analysis of woven steel
structures. Proceedings of the 38th Annual Conference of the Association
for Computer Aided Design in Architecture., 88–97. https:// doi. org/ 10.
52842/ conf. acadia. 2018. 088.
Jahn, G., Newnham, C., & Beanland, M. (2018b). Making in Mixed Reality.
Holographic design, fabrication, assembly and analysis of woven steel
structures. Proceedings of the 38th Annual Conference of the Association
for Computer Aided Design in Architecture., 88–97. https:// doi. org/ 10.
52842/ conf. acadia. 2018. 088.
Jahn, G., Newnham, C., & Berg, N. (2022). Depth Camera Feedback for
Guided Fabrication in Augmented Reality. In Dr. D. Aviv, H. Jamelle, R.
Stuart-Smith, & Dr. M. Akbarzadeh (Eds.), Proceedings of the 42nd Annual
Conference of the Association for Computer Aided Design in Architecture
(ACADIA). ACADIA. https:// papers. cumin cad. org/ cgi- bin/ works/ paper/
acadi a22_ 684
Jahn, G., Newnham, C., van den Berg, N., Iraheta, M., & Wells, J. (2020a).
Holographic Construction. In C. Gengnagel, O. Baverel, J. Burry, M.
Ramsgaard Thomsen, & S. Weinzierl (Eds.), Impact: Design With All Senses
(pp. 314–324). Springer International Publishing. https:// doi. org/ 10.
1007/ 978-3- 030- 29829-6_ 25
Jahn, G., Newnham, C., van den Berg, N., Iraheta, M., & Wells, J. (2020b).
Holographic Construction. In C. Gengnagel, O. Baverel, J. Burry, M.
Ramsgaard Thomsen, & S. Weinzierl (Eds.), Impact: Design With All Senses
(pp. 314–324). Springer International Publishing. https:// doi. org/ 10.
1007/ 978-3- 030- 29829-6_ 25
Jahn, G., Wit, A. J., & Pazzi, J. (2019). [Bent] Holographic handcraft in large-
scale steam-bent timber structures. Proceedings of the 39th Annual
Conference of the Association for Computer Aided Design in Architecture
(ACADIA), 438–447. https:// papers. cumin cad. org/ data/ works/ att/ acadi
a19_ 438. pdf
Kyaw, A. H., Xu, A., Jahn, G., Berg, N., Newnham, C., & Zivkovic, S. (2023).
Augmented Reality for High Precision Fabrication of Glue Laminated
Timber Beams. Automation in Construction. https:// doi. org/ 10. 1016/j.
autcon. 2023. 104912
Kyjanek, O., Al Bahar, B., Vasey, L., Wannemacher, B., & Menges, A. (2019). Implementation of an Augmented Reality (AR) Workflow for Human Robot Collaboration in Timber Prefabrication. 36th International Symposium on Automation and Robotics in Construction, Banff, AB, Canada. https://doi.org/10.22260/ISARC2019/0164
Lee, G. (2022). Code-Bothy: Mixed reality and craft sustainability. Frontiers of Architectural Research. https://doi.org/10.1016/j.foar.2022.05.002
Lok, L., & Bae, J. (2022). Timber De-Standardized 2.0: Mixed Reality Visualizations and User Interface for Processing Irregular Timber. In J. van Ameijde, N. Gardner, H. Hyun, D. Luo, & U. Sheth (Eds.), Proceedings of the 27th CAADRIA Conference (pp. 121–130). CAADRIA. https://doi.org/10.52842/conf.caadria.2022.2.121
Lok, L., Samaniego, A., & Spencer, L. (2021). Timber De-Standardized: A Mixed-Reality Framework for the Assembly of Irregular Tree Log Structures. In B. Farahi, B. Bogosian, J. Scott, J. L. García del Castillo y López, K. Dörfler, J. A. Grant, S. Parasho, & V. A. A. Noel (Eds.), Proceedings of the 40th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 222–231). ACADIA. https://doi.org/10.52842/conf.acadia.2021.222
Lok, L., Zivkovic, S., & Spencer, L. (2023). UNLOG: A Deployable, Lightweight, and Bending-Active Timber Construction Method. Technology|Architecture + Design, 7(1), 95–108.
Microsoft. (2022). What is mixed reality? - Mixed Reality. https://learn.microsoft.com/en-us/windows/mixed-reality/discover/mixed-reality
Milgram, P., & Kishino, F. (1994). A Taxonomy of Mixed Reality Visual Displays.
IEICE Trans. Information Systems, E77-D(12), 1321–1329.
Rezvani, M., Lei, Z., Rankin, J., & Waugh, L. (2023). Current and Future Trends of Augmented and Mixed Reality Technologies in Construction. In R. Gupta, M. Sun, S. Brzev, M. S. Alam, K. T. W. Ng, J. Li, A. El Damatty, & C. Lim (Eds.), Proceedings of the Canadian Society of Civil Engineering Annual Conference 2022 (pp. 19–39). Springer International Publishing. https://doi.org/10.1007/978-3-031-34593-7_2
Robert McNeel & Associates. (2022). Rhino3d (7.17) [Windows]. Robert
McNeel & Associates.
Rutten, D. (2022). Grasshopper (1.0.0007) [Windows]. Robert McNeel & Associates. https://www.grasshopper3d.com/
Skarbez, R., Smith, M., & Whitton, M. (2021). Revisiting Milgram and Kishino's Reality-Virtuality Continuum. Frontiers in Virtual Reality, 2. https://doi.org/10.3389/frvir.2021.647997
Song, Y., Koeck, R., & Luo, S. (2021a). [AR]OBOT: The AR-Assisted Robotic Fabrication System for Parametric Architectural Structures. Blucher Design Proceedings, 1115–1126. https://doi.org/10.5151/sigradi2021-4
Song, Y., Koeck, R., & Luo, S. (2021b). Review and analysis of augmented reality (AR) literature for digital fabrication in architecture. Automation in Construction, 128, 103762. https://doi.org/10.1016/j.autcon.2021.103762
Speicher, M., Hall, B. D., & Nebeling, M. (2019). What is Mixed Reality? Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–15. https://doi.org/10.1145/3290605.3300767
Taylor, J., Bordeaux, L., Cashman, T., Corish, B., Keskin, C., Soto, E., Sweeney, D., Valentin, J., Luff, B., Topalian, A., Wood, E., Khamis, S., Kohli, P., Sharp, T., Izadi, S., Banks, R., Fitzgibbon, A., & Shotton, J. (2016). Efficient and Precise Interactive Hand Tracking through Joint, Continuous Optimization of Pose and Correspondences. ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH 2016, 35. https://www.microsoft.com/en-us/research/publication/efficient-precise-interactive-hand-tracking-joint-continuous-optimization-pose-correspondences/
Ungureanu, D., Bogo, F., Galliani, S., Sama, P., Duan, X., Meekhof, C., Stühmer, J., Cashman, T. J., Tekin, B., Schönberger, J. L., Olszta, P., & Pollefeys, M. (2020). HoloLens 2 Research Mode as a Tool for Computer Vision Research. arXiv, Computer Vision and Pattern Recognition. https://doi.org/10.48550/arXiv.2008.11239
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.