A Novel Approach for Classification of Structural Elements
in a 3D Model by Supervised Learning
Gizem Yetiş1, Ozan Yetkin2, Kongpyung Moon3, Özkan Kılıç4
1METU Design Factory, Middle East Technical University
2,3Building Science, Department of Architecture, Middle East Technical University
4Department of Computer Engineering, Yildirim Beyazit University
1,2{yetis.gizem|ozan.yetkin}@metu.edu.tr 3kpmoon@gmail.com 4okilic@ybu.edu.tr
The development of Computer Aided Design (CAD) has driven a transition from 2D to 3D architectural representation, and today designers work directly with 3D digital models from the initial design process onward. While these digital models are being developed, layering and labelling the 3D geometries in a model becomes crucial for the detailed design phase. However, as the number of geometries increases, labelling and layering turn into repetitive manual labor. Hence, this paper proposes automating the labelling and layering of segmented 3D digital models based on architectural elements. In various parametric design environments (Rhinoceros, Grasshopper, Grasshopper Python and Grasshopper Python Remote), a training set is generated and used to train supervised learning algorithms that label architectural elements. Automating the labelling and layering of 3D geometries not only improves the workflow performance of the design process but also enables a wider range of classification with simple features. Additionally, this research identifies advantages and disadvantages of alternative classification algorithms for such an architectural problem.
Keywords: Automation, Classification, Grasshopper Python, Layering,
Labelling, Supervised Learning
INTRODUCTION
The development of design technologies has increased the complexity of building design. 3D modelling plays a critical part in the design process of massive and complicated buildings. As buildings become more sophisticated, the number of architectural elements increases accordingly. During the initial design phase of schematic drawings, architects generally draw individual geometries without labelling their architectural attributes in order to model faster. Moreover, at the concept phase of the design process, designers generally sort each geometry into semantically correct layers only after the draft model is finished. Layering and labelling geometries individually is time consuming, and the task does not require profession-specific knowledge. However, it is necessary in order to proceed to detailed drawing/modelling for the construction drawing phase. Therefore, the aim of this research is to automate labelling and layering in order to improve architects' and designers' work performance.
AI FOR DESIGN AND BUILT ENVIRONMENT - Volume 1 - eCAADe 36 |129
In this paper, Rhinoceros 3D is chosen as the development environment for two reasons. First, it is widely used by designers. Second, it has a plug-in, Grasshopper, which has a Python code editor with a graphical interface. In parallel with Rhinoceros' increasing number of users [1], this research is accessible not only to architects but also to other designers, so that the automation of labelling and layering benefits a wide spectrum of professions in the design field.
RELATED WORK
Semantic labelling has long been an active research topic. Studies working with the Princeton Segmentation Benchmark shed light on machine learning approaches for the segmentation of mesh-based objects (Chen, Golovinskiy & Funkhouser, 2009; Kalogerakis, Hertzmann & Singh, 2010; Lv, Chen, Huang & Bao, 2012); however, the lack of architectural elements in that benchmark directed us to search for other solutions.
When architecture-related works are examined in detail, it can be observed that works based on point clouds and on images with depth information differ from our work in various aspects. Unlike in this study, vectors or voxels are mainly used to classify multiple categories, and proximity and adjacency become key factors in analyzing objects. For example, research conducted at Stanford University (Armeni et al., 2016) proposes an unsupervised detection-based parsing method and labels semantic elements in large-scale point cloud data. While the large-scale point cloud data covers a broader task, from void and mass detection to architectural element and furniture detection, our research focuses on automating the labelling and layering of structural elements only, using their X, Y, Z dimensions.
On the other hand, for Hyperspectral Image (HSI) datasets, supervised machine learning algorithms such as Support Vector Machines (SVM), Artificial Neural Networks (ANN) or Sparse Representation-based Classification (SRC) are generally utilized. However, the research of He, Liu, Wang and Hu (2017) employs Generative Adversarial Networks (GANs), a semi-supervised method, for HSI classification. Compared to supervised classification, a GAN can be applied to both limited training datasets and abundant unlabeled datasets. Spectral-spatial features are extracted by a 3D Bilateral Filter (3DBF). The similarity between the study of He et al. and this paper is that both training datasets can be manipulated. 3DBF can perform better with downsampling and upsampling; similarly, the size of our parametrically generated training dataset can be changed, which slows down or speeds up computation according to the number of elements in the dataset.
Studies working with 3D objects in either mesh or surface format are the most similar to our research. Even though their main aim is not labelling architectural elements, most of them rely on it as a backbone. In Hua's research (2014), for example, a design synthesis algorithm takes models from the Google Warehouse repository and classifies architectural elements inductively according to the mesh information of the models. First, the smallest meshes are found. Then, they are merged incrementally to label the objects as columns, walls, stairs etc. Therefore, adjacency plays an important role. However, using a repository may be problematic for training and/or testing on bigger datasets because of the limited variety of examples and of adjacent components. Hence, generating a training set from scratch for the current study can prevent such problems.
The research of Bassier, Vergauwen and Van Genechten (2016) differs from Hua's (2014) in terms of dataset and problem definition. It takes point-cloud data and transforms it into meshes with Pointfuse, and then into surfaces with Grasshopper. During these transformations, noise reduction is applied so that a clean dataset can be obtained. The dataset, which includes vertical and horizontal architectural elements in surface format, is utilized to classify the architectural components of a new test set. By doing so, they achieve an as-built Building Information Model without dealing with complex vector elements in point-cloud data. Their working environment, Grasshopper, inspired our research. Moreover, as in their research, working with surfaces and their features is a key point of our classification problem.
MATERIAL AND METHOD
For this research, segmented building geometries were first introduced to Grasshopper. Then machine learning algorithms, trained on a generated dataset of architectural elements, were developed in Grasshopper Python (GhPython) [2] and Grasshopper Python Remote (GhPython Remote) [3] to label the geometries and bake them into the Rhinoceros 3D environment with architectural labels and colored layers.
GENERATING DATASET
There is existing research on generating 3D model datasets for the development of classification algorithms, for example the Shape Retrieval Contest (SHREC), which updates its datasets annually, as well as ShapeNet and Princeton ModelNet. However, there are no datasets composed of architectural elements such as walls, columns, beams etc. Therefore, for this research we generated our own dataset, which is composed of walls, columns, beams and slabs.
Six hundred elements are generated within defined ranges of lengths in order to capture the geometrical characteristics of architectural elements. The lengths along the X, Y and Z axes specify the range of dimensions of each edge of an architectural element. The wall dataset is composed of 200 pieces: 100 pieces in one direction and the other 100 pieces in the perpendicular direction. The X-axis dimension (width) ranges between 10 cm and 30 cm, the Y-axis dimension (length) between 500 cm and 1000 cm, and the Z-axis dimension (height) between 250 cm and 600 cm. The minimum height is set to 250 cm, since anything below this value would not give an acceptable floor height (Figure 1).
Figure 1. Generated training set (upper left: walls; upper right: columns; bottom left: beams; bottom right: slabs)
The beam dataset is composed of 200 pieces: 100 pieces in one direction and the other 100 in the perpendicular direction. The X-axis dimension (width) ranges between 20 cm and 50 cm, the Y-axis dimension (length) between 250 cm and 1000 cm, and the Z-axis dimension (height) between 40 cm and 80 cm.
The column dataset is composed of 100 pieces. The X-axis dimension (width) ranges between 20 cm and 50 cm, the Y-axis dimension (length) between 20 cm and 50 cm, and the Z-axis dimension (height) between 250 cm and 1000 cm. The X and Y dimensions of the columns and the X dimension of the beams share the same range (20 cm to 50 cm), since these elements are in direct contact with one another.
Figure 2. Data point representation in Python
Figure 3. 80 different 3D models from the students' work
Lastly, the slab dataset is composed of 100 pieces. The X-axis dimension (width) ranges between 250 cm and 1000 cm, the Y-axis dimension (length) between 250 cm and 1000 cm, and the Z-axis dimension (height) between 10 cm and 30 cm. As shown in Figure 2, plotting the points on the X, Y and Z axes shows graphically how the generated training data are distributed within the ranges that represent each architectural element.
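The dimension ranges given for the four element types can be turned into a small parametric generator. The following is a minimal sketch in plain Python rather than the original Grasshopper definition. It assumes uniform random sampling within each range (the sampling distribution is not stated in the text) and models the perpendicular direction by swapping the X and Y dimensions. The names `RANGES`, `COUNTS` and `generate_training_set` are illustrative only.

```python
import random

# Dimension ranges in cm, taken from the text: (X range, Y range, Z range).
RANGES = {
    "wall":   ((10, 30),    (500, 1000), (250, 600)),
    "beam":   ((20, 50),    (250, 1000), (40, 80)),
    "column": ((20, 50),    (20, 50),    (250, 1000)),
    "slab":   ((250, 1000), (250, 1000), (10, 30)),
}

# Pieces per class, and whether half of them lie in the perpendicular direction.
COUNTS = {
    "wall": (200, True),
    "beam": (200, True),
    "column": (100, False),
    "slab": (100, False),
}


def generate_training_set(seed=0):
    """Return a list of ((x, y, z), label) samples, 600 in total."""
    rng = random.Random(seed)  # assumption: uniform sampling, fixed seed
    samples = []
    for label, (count, both_directions) in COUNTS.items():
        x_range, y_range, z_range = RANGES[label]
        for i in range(count):
            x = rng.uniform(*x_range)
            y = rng.uniform(*y_range)
            z = rng.uniform(*z_range)
            # The second half of walls and beams is rotated by 90 degrees,
            # which for an axis-aligned box simply swaps width and length.
            if both_directions and i >= count // 2:
                x, y = y, x
            samples.append(((x, y, z), label))
    return samples
```

Because the set is generated from parameters, its size can be changed by editing `COUNTS`, which is the kind of manipulation of the training data that the text refers to.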
TEST & CONTROL DATA
To evaluate the classification, test and control datasets without prior labels are provided. The test set is taken from the works of architecture students in the Digital Media Course at Middle East Technical University (METU) (Figure 3). There are 80 models, and they share more or less the same characteristics in terms of structure and physical appearance. Hence, this dataset helps in identifying the problematic parts of the different algorithms discussed in the following section.
A control set that is more complex than the test set is also used to enrich the variety of examples and to observe the algorithms' behavior under a more complex task. This set includes 3D models of METU Housing, the METU Department of Architecture and building models found online (Figure 4).
PROPOSED WORKFLOW
For this research, the Rhinoceros and Grasshopper environments are chosen to implement the labelling and layering solution, as mentioned before. GhPython and GhPython Remote are the two key plug-ins. While GhPython allows the user to develop algorithms for geometry-based problems with Python and RhinoScriptSyntax, GhPython Remote provides access to the libraries required for machine learning techniques (Figure 5).
For this classification problem, different approaches are tested. Since the problem is supervised and the aim is to predict non-numeric (categorical) data, Logistic Regression, K-Nearest Neighbours (K-NN), Linear SVM, Kernel SVM, Naïve Bayes and Decision Tree algorithms are chosen. The reason for choosing different approaches is to understand their different potentials and to find the most suitable one for solving such a problem. These approaches can be compared by accuracy and speed. Kernel SVM is oriented towards achieving the best accuracy rate, while Logistic Regression, K-NN, Linear SVM, Naïve Bayes and Decision Tree are oriented towards fast training. Decision Tree is not good at multi-feature classification and shows better results on binary decisions. However, Logistic Regression, K-NN, Linear SVM and Naïve Bayes are good at handling multi-class classification.
Figure 4. The control set examples (in order from top: METU Department of Architecture, METU Housing, online examples)
Figure 5. An example of the Grasshopper interface with Python components including machine learning algorithms
Once the dataset is introduced into Grasshopper, each of these approaches can be developed separately inside a GhPython component. First, the training set is used to teach the algorithms the specifications (in this case the X, Y, Z dimensions) of the architectural components. Then, the test set is connected to predict the classes and assign labels. After labelling is done, another algorithm automates baking the labelled elements into the Rhinoceros environment and creating new layers. The results are compared to each other in detail in the following section.
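The label-then-layer step can be sketched outside Rhinoceros as follows. This is a minimal illustration in plain Python, not the authors' GhPython component: geometries are represented as lists of vertex coordinates, the three features are the extents of the axis-aligned bounding box, and `classify` stands for any trained model. The names `bounding_box_dimensions` and `assign_layers` are hypothetical. Inside Rhinoceros, the equivalent steps would presumably rely on rhinoscriptsyntax functions such as `BoundingBox`, `AddLayer` and `ObjectLayer`; that mapping is an assumption, since the component code is not reproduced here.

```python
from collections import defaultdict


def bounding_box_dimensions(vertices):
    """Return the (X, Y, Z) extents of the axis-aligned bounding box."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def assign_layers(geometries, classify):
    """Group geometry names by predicted label, one layer per label.

    geometries: dict mapping a geometry name to its list of (x, y, z) vertices.
    classify:   callable taking (dx, dy, dz) and returning a label string.
    """
    layers = defaultdict(list)
    for name, vertices in geometries.items():
        label = classify(bounding_box_dimensions(vertices))
        layers[label].append(name)
    return dict(layers)
```

Keeping feature extraction separate from the classifier means the same layering step can be reused with each of the compared algorithms.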
COMPARISON AND RESULTS
Multi-class classification is a common problem in the field of machine learning, and there are various algorithms that use different strategies to define the classes in a dataset. For this research, the aim is not only to classify the main structural elements but also to compare different machine learning models for further steps.
Firstly, logistic regression from the scikit-learn library developed by Pedregosa et al. (2011) is applied to the dataset. Despite its name, logistic regression is a model for classification rather than regression. In this model, the probabilities of the possible outcomes of a single trial are modelled with the logistic function shown below, and the cost function is minimized accordingly. After the implementation, the logistic regression model classified 83.42% of the given data correctly.
$$f(x) = \frac{L}{1 + e^{-k(x - x_0)}} \qquad (1)$$
• e = the natural logarithm base (Euler's number),
• x_0 = the x-value of the sigmoid's midpoint,
• L = the curve's maximum value,
• k = the steepness of the curve.
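Equation (1) can be evaluated directly. The sketch below is a plain Python transcription of the formula with the same parameter names; the default values L = 1, k = 1 and x_0 = 0 are an assumption chosen to give the standard sigmoid, whose output can be read as a probability.

```python
import math


def logistic(x, L=1.0, k=1.0, x0=0.0):
    """Logistic function of equation (1): L / (1 + e^(-k (x - x0)))."""
    return L / (1.0 + math.exp(-k * (x - x0)))
```

At the midpoint x = x_0 the function returns L / 2, and it approaches L as x grows.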
Table 1. Comparison of model accuracies
Secondly, the k-nearest neighbours algorithm (K-NN) from the same library is applied to the dataset. It is an instance-based learning method that stores instances of the training data instead of constructing a general internal model. Classification is basically achieved by a majority vote of the nearest neighbours of each point. In more detail, a query point is assigned to the data class that has the most representatives among its k nearest neighbours according to the Euclidean distance shown in the formula below. When implemented on our dataset, the K-NN model is able to classify 90% of the test data correctly.
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2} \qquad (2)$$
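The majority vote over Euclidean distances can be written in a few lines. This is a minimal pure-Python sketch of the technique, not the scikit-learn implementation used in the study; the function name `knn_predict` and the default k = 5 are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=5):
    """Label a query point by majority vote among its k nearest neighbours.

    train: list of ((x, y, z), label) samples.
    query: tuple (x, y, z) of the element to classify.
    """
    # math.dist computes the Euclidean distance of equation (2).
    neighbours = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

Since the distance mixes all three dimensions in centimetres, the largest dimension tends to dominate; scaling the features beforehand is a common remedy, although the text does not state whether it was applied.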
As the third method, the support vector machine (SVM) from the scikit-learn library is implemented on our dataset. An SVM constructs a hyperplane or a set of hyperplanes in a high-dimensional space that has the largest distance to the nearest training data points of any class, using the decision function shown in the formula below. It is also possible to change the kernel, which fundamentally changes how the SVM divides the space. For this research, both a linear SVM and an SVM with a Radial Basis Function (RBF) kernel are implemented. The linear SVM classifies 74.02% of the test data correctly, while the RBF SVM has an accuracy of 90.23%.
$$\operatorname{sgn}\left(\sum_{i=1}^{n} y_i \alpha_i K(x_i, x) + \rho\right) \qquad (3)$$
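To make the roles of the symbols in equation (3) concrete, the sketch below evaluates the decision function for an already trained binary SVM with an RBF kernel. It does not perform training: the support vectors x_i, labels y_i in {-1, +1}, coefficients α_i and offset ρ are assumed to be given, and the value of `gamma` is an arbitrary illustration. A four-class problem like the one in this paper needs several such binary classifiers combined, for example one-vs-one.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def svm_decision(x, support_vectors, labels, alphas, rho, kernel=rbf_kernel):
    """Evaluate equation (3): the sign of sum_i y_i * alpha_i * K(x_i, x) + rho."""
    score = sum(
        y * alpha * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    ) + rho
    return 1 if score >= 0 else -1
```

Replacing `rbf_kernel` with a plain dot product gives the linear SVM, which is the only difference between the two variants compared here.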
For the fourth method, the Naïve Bayes classifier from scikit-learn is applied to the dataset. It is an algorithm based on Bayes' theorem with the 'naive' assumption of independence between every pair of features. Given a class variable y and a dependent feature vector x_1 through x_n, the algorithm uses the classification rule in the formula below.
$$P(y \mid x_1, \ldots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \qquad (4)$$
$$\Rightarrow \hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y) \qquad (5)$$
When implemented on our dataset, the Naïve Bayes model is able to label 82.66% of the given data correctly.
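The classification rule of equations (4) and (5) can be sketched as follows. The text does not say which Naïve Bayes variant was used, so this example assumes Gaussian likelihoods P(x_i | y), a common choice for continuous features such as dimensions. It works with logarithms, so the product in equation (5) becomes a sum. The function names are illustrative.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples):
    """Estimate the prior and the per-feature mean and variance of each class.

    samples: list of ((x, y, z), label). Returns {label: (prior, [(mean, var), ...])}.
    """
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(samples), stats)
    return model


def predict_gaussian_nb(model, x):
    """Return the class maximizing log P(y) + sum_i log P(x_i | y)."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)
```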
Finally, the decision tree model from the same library is used, which is a non-parametric learning method for classification. The aim of a decision tree is to create a model that predicts the value of a target variable by learning simple decision rules derived from the features of the given data, and to classify the data according to the classification criterion shown in the formula below. After the implementation, the decision tree model classifies only 22.61% of the given data correctly, since its mathematical model depends on binary decisions and orthogonal divisions, which seemingly makes it an unsuitable model for this case.
$$p_{mk} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = k) \qquad (6)$$
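Equation (6) gives the proportion of class k among the N_m observations that fall into the region R_m of tree node m. The short sketch below computes these proportions for the labels collected in one node and returns the majority class; it illustrates the criterion only and is not a full tree-building algorithm. The function names are hypothetical.

```python
from collections import Counter


def class_proportions(node_labels):
    """Return {k: p_mk} for the labels of the samples in one tree node."""
    n_m = len(node_labels)
    return {k: count / n_m for k, count in Counter(node_labels).items()}


def node_prediction(node_labels):
    """Predict the class with the largest proportion in the node."""
    proportions = class_proportions(node_labels)
    return max(proportions, key=proportions.get)
```

A node whose proportions are close to 1 for a single class is nearly pure, which is what the splitting rules of a decision tree try to achieve.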
To conclude, each model classifies the given data using a different mathematical method, and consequently their performance on this classification task differs. The table and figures below show a comparison of their performance in terms of the percentage of correct labels, a visualization of their space division methods, and the categorized elements of the test and control datasets (Table 1, Figure 6, Figure 7, Figure 8).
Figure 6. Visualization of different models (from left to right: Logistic Regression, Nearest Neighbors, Linear SVM, Kernel SVM, Naïve Bayes, Decision Tree) [4]
Figure 7. Categorized elements from the test set
Figure 8. Categorized elements from the control set
CONCLUSION AND FUTURE WORK
In conclusion, by generating a dataset from scratch and using it to train supervised machine learning models, it is possible to automate the classification and layering process. Multi-class labelling can be accomplished with only 3 features, the X, Y, Z dimensions of the architectural elements' boundary edges, which is a much simpler approach than using meshes or point clouds. Since the proposed method is simple, it can be controlled, customized and extended in future research. For example, it can lead to work on tilted and/or non-orthogonal geometries by using a sequence of bounding boxes instead of a single bounding box. Also, optimization of the dimensions of the training set by genetic algorithms (e.g. Galapagos) to achieve higher accuracy in different classification tasks can be tested. Last but not least, new features such as the position or material of architectural elements can be included in the training data to extend the tasks towards the construction industry. In short, this small task and simple approach show that the intersection of architecture and machine learning has great research potential to be studied and extended further in the future. As Turing, the pioneer of theoretical computer science and artificial intelligence, said: "We can only see a short distance ahead, but we can see plenty there that needs to be done." (1950).
ACKNOWLEDGEMENTS
We would like to express our sincere thanks to Mehmet Koray Pekeriçli from the METU Department of Architecture; the METU Digital Media Course students; and last but not least, Pierre Cuvilliers from MIT Digital Structures for his scientific contribution to Grasshopper Python Remote.
REFERENCES
Armeni, I, Sener, O, Zamir, AR, Jiang, H, Brilakis, I, Fischer, M and Savarese, S 2016, '3d semantic parsing of large-scale indoor spaces', Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, pp. 1534-1543
Bassier, M, Vergauwen, M and van Genechten, B 2016, 'Automated Semantic Labelling of 3D Vector Models for Scan-to-BIM', Proceedings of the 4th Annual International Conference on Architecture and Civil Engineering (ACE2016), Singapore, pp. 93-100
Chen, X, Golovinskiy, A and Funkhouser, T 2009, 'A Benchmark for 3D Mesh Segmentation', ACM Transactions on Graphics (TOG), 28(3), p. 73
He, Z, Liu, H, Wang, Y and Hu, J 2017, 'Generative Adversarial Networks-Based Semi-Supervised Learning for Hyperspectral Image Classification', Remote Sensing, 9(10), p. 1042
Hua, H 2014, 'A Case-Based Design with 3D Mesh Models of Architecture', Computer-Aided Design, 57, pp. 54-60
Kalogerakis, E, Hertzmann, A and Singh, K 2010, 'Learning 3D Mesh Segmentation and Labeling', ACM Transactions on Graphics (TOG), 29(4), p. 102
Lv, J, Chen, X, Huang, J and Bao, H 2012, 'Semi-supervised Mesh Segmentation and Labeling', Computer Graphics Forum, 31(7), pp. 2241-2248
Pedregosa, F, Varoquaux, G, Gramfort, A, Michel, V, Thirion, B, Grisel, O, Blondel, M, Prettenhofer, P, Weiss, R, Dubourg, V, Vanderplas, J, Passos, A, Cournapeau, D, Brucher, M, Perrot, M and Duchesnay, E 2011, 'Scikit-learn: Machine learning in Python', Journal of Machine Learning Research, 12, pp. 2825-2830
Turing, AM 1950, 'Computing Machinery and Intelligence', Mind, 59(236), pp. 433-460
[1] http://www.food4rhino.com/stats
[2] http://www.food4rhino.com/app/ghpython
[3] https://pypi.python.org/pypi/gh-python-remote/1.0.4
[4] http://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html