Custom Digital Workflows with User-Defined
Data Transformations Via Property Graphs
Patrick Janssen, Rudi Stouffs, Andre Chaszar, Stefan Boeykens,
and Bianca Toth
Abstract Custom digital workflows aim to allow diverse, non-integrated design
and analysis applications to be custom linked in digital workflows, created by a
variety of users, including those who are not expert programmers. With the
intention of introducing this in practice, education and research, this paper focuses
on critical aspects of overcoming interoperability hurdles, illustrating the use of
property graphs for mapping data between AEC software tools that are not
connected by common data formats and/or other interoperability measures. A
brief exemplar design scenario is presented to illustrate the concepts and methods
proposed, and conclusions are then drawn regarding the feasibility of this approach
and directions for further research.
Introduction
The persistent lack of integration in building design, analysis and construction calls
for new approaches to information exchange. We argue that bottom-up, user-
controlled and process-oriented approaches to linking design and analysis tools,
as envisaged by pioneers of CAD [1,2], are indispensable as they provide degrees
of flexibility not supported by current top-down, standards-based and model-
oriented approaches.
P. Janssen (*)
National University of Singapore, Singapore, Singapore
e-mail: patrick.janssen@nus.edu
R. Stouffs
National University of Singapore, Singapore, Singapore
Delft University of Technology, Delft, Netherlands
A. Chaszar
Delft University of Technology, Delft, Netherlands
S. Boeykens
KU Leuven, Leuven, Belgium
B. Toth
Queensland University of Technology, Brisbane, Australia
© Springer International Publishing Switzerland 2015
J.S. Gero, S. Hanna (eds.), Design Computing and Cognition '14,
DOI 10.1007/978-3-319-14956-1_29
We propose a platform to support a design method where designers can compose
and execute automated workflows that link computational design tools into com-
plex process networks [3,4]. By allowing designers to effectively link a wide
variety of existing design analysis and simulation tools, such custom digital
workflows support the exploration of complex trade-offs between multiple
conflicting performance criteria. For such explorations, these trade-offs often
present a further set of questions rather than a final set of answers, so the method
is envisaged as a cyclical process of adaptive workflow creation followed by
iterative design exploration.
The adaptive-iterative design method requires a platform for designers to effec-
tively and efficiently create and execute workflows. The remainder of this paper
first gives a general overview of the proposed platform and then focuses on critical
aspects of overcoming interoperability hurdles, specifically the creation of mapping
procedures that map data between tools with incompatible data representations. We
explore the feasibility of a data mapping approach that allows end-users to define
their own customized mappers, applying it to an example scenario focusing on a
digital design-analysis workflow linking parametric design tools to performance
analysis tools.
The research method consists of building a test workflow comprising a geo-
metric design model and analysis tools to evaluate lighting and thermal perfor-
mance, and applying customized data mappings between these applications via
property graphs. The data collected from this experiment includes observations of
the types of data mappings required, the complexity of the mappings, and the
modifiability of the mappings when editing of the workflow is needed. We
conclude that linking design tools via customized data mappers is a feasible
approach that can complement other existing mapping approaches, and we dis-
cuss future research directions.
Adaptive-Iterative Design Platform
In order to support the adaptive-iterative design method, a platform for creating and
executing workflows is proposed. This platform is based on existing scientific
workflow systems that enable the composition and execution of complex task
sequences on distributed computing resources [5].
Scientific workflow systems exhibit a common reference architecture that con-
sists of a graphical user interface (GUI) for authoring workflows, along with a
workflow engine that handles invocation of the applications required to run the
solution [6,7]. Nearly all workflow systems are visual programming tools in that
they allow processes to be described graphically as networks of nodes and wires
that can be configured and reconfigured by users as required [8]. Nodes perform
some useful function; wires support the flow of data, linking an output of one node
to an input of another node.
Each workflow system acts to accelerate and streamline the workflow process,
but each system also varies greatly in specific capabilities [9]. We aim to identify
the functionality needed to develop a flexible, open and intuitive system for design-
analysis integration, building on the capabilities exhibited by scientific workflow
systems, and further capitalizing on recent advances in cloud computing.
Actor Model
The proposed platform is a type of scientific workflow system using an actor model
of computation [10]. Nodes are actors, and the data that flows between nodes is
encapsulated in distinct data sets, referred to as data objects. The actor model
allows for a clear separation between actor communication (dataflow) and overall
workflow coordination (orchestration).
We consider three types of actors: process actors, data actors and control actors.
Process actors define procedures that perform some type of simulation, analysis, or
data transformation. They have a number of input and output ports for receiving and
transmitting data objects, as well as meta-parameters that can be set by the user to
guide task execution. Data actors define data sources and data sinks within the
workflow, including the data inputs and data output for the workflow as a whole.
Control actors provide functionality related to workflow initiation, execution and
completion. We focus here on the role of process actors in workflows, and the
development of an approach to support custom data mapping procedures.
Process actors can be further classified into tool actors and mapping actors. Tool
actors define procedures that wrap existing applications to make their functionality
and data accessible to the workflow; while mapping actors define procedures that
transform data sets in order to map the output from one tool actor to the input for
another.
Figure 1 shows a conceptual diagram of an example network in which a
parametric CAD system actor is connected to three evaluation actors: Revit for
cost estimating; EnergyPlus for heating/cooling load simulation; and Radiance for
daylighting simulation. The CAD system actor encapsulates a procedure that starts
the CAD system, loads a specified input model and a set of parameter values, and
then generates two output models. One of the models is generated as an IFC file,
which is directly consumed by the Revit actor (using the Geometry Gym Revit IFC
plugin [11]), while the other model is generated as a text file. This latter model then
undergoes two separate transformations that map it into both EnergyPlus and
Radiance compatible formats. The simulation actors then read in this transformed
data, run their respective simulations, and generate output data consisting of
simulation results.
The output results may be merged and displayed to the user within a graphical
dashboard in an integrated way in order to support decision making. Furthermore,
such a dashboard may also allow users to manipulate the input parameters for the
workflow. Each time such inputs are changed, the execution of the network will be
triggered, resulting in a new set of results. It is envisaged that more advanced data
analytics tools could be developed in which multiple sets of results from the
adaptive-iterative process could be analyzed and compared.
Platform Architecture
In order to support both the collaborative creation of workflows and their parallel
execution, the proposed platform can be implemented as a web application
deployed on a cloud or grid based infrastructure. Workflows (such as the one
shown in Fig. 1) can be interactively developed by users within the web browser
interface. When the user runs a workflow, it is uploaded to a workflow server as a
workflow job consisting of a set of computational tasks, where it is executed in a
distributed manner.
For scalability, both the procedures and the associated data objects are stored in
an online distributed key-value database. Key-value databases are highly optimized
for speed and scalability when storing and retrieving many documents. Although
such documents may use a standardized format (e.g. JSON), they are typically used
as containers for storing other data, with no restrictions on data models and data
schemas. For example, a document may contain any number of serialized objects,
text files or binary blobs. Both data objects and actor procedures are stored in
documents in the key-value database. No restrictions are imposed on the types of
data objects and actor procedures that can be stored.
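A rough sketch of this storage scheme, using an in-memory dictionary in place of the distributed database (the document structure shown here is illustrative rather than the platform's actual layout), is:

import json
import uuid

store = {}  # stand-in for the distributed key-value database: key -> document

def put_document(doc):
    """Store any JSON-serializable document under an opaque key."""
    key = str(uuid.uuid4())
    store[key] = json.dumps(doc)
    return key

# a data object: a container with no imposed data model or schema
data_key = put_document({"kind": "data_object",
                         "payload": "...serialized geometry, text or binary blob..."})

# an actor procedure is stored in exactly the same way
proc_key = put_document({"kind": "actor_procedure",
                         "language": "python",
                         "source": "def fire(inputs, params): ..."})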
Fig. 1 Example network of actors. A parametric CAD system is linked to Revit, Radiance and
EnergyPlus via various mappers (M). End users contribute white components, while actor devel-
opers build grey components and wrap black components representing existing tools
Data Models for Data Mappers
Data mappings aim to overcome interoperability problems between tools that
cannot be linked using existing approaches. We examine two common approaches
for creating such mappers, and propose a complementary approach.
The various approaches to overcoming interoperability problems rely on three
distinct levels of data representation: the data model, the data schema, and the data
file. A data model organizes data using generic constructs that are domain inde-
pendent. Due to this generic nature, the range of data that can be described is very
broad. It offers a way of defining a data structure that is very flexible but relies on
human interpretation of semantic meaning. For example, Tsichritzis and
Lochovsky [12] distinguish seven types: relational, network, hierarchical, entity-
relationship, binary, semantic network, and infological. Data models will often be
coupled with highly generic languages for querying and manipulating data, vari-
ously called data query languages and data manipulation languages.
A data schema represents domain-specific information using semantic constructs
related to a particular domain. Due to the highly specific nature of the constructs,
the type of information that can be described tends to be relatively narrow.
However, this manner of representing information supports automated interpreta-
tion of semantic meaning. The data schema is often built on top of a data model, by
formally defining constraints that describe a set of allowable semantic entities and
semantic relationships specific to a particular domain. This data schema is defined
using a specialized language, variously called a data definition language, a data
description language, or a data modeling language. Note that the data schema itself
is distinct from the language used for describing the schema.
A data file uses syntactic constructs to describe how data is stored, either in a file
or any other form of persistent storage. Going from the data file to the data schema
is referred to as parsing, while the reverse is known as serializing. For any given
data schema, there may be many different types of data files.
For example, consider the Industry Foundation Classes (IFC) representation for
the exchange of Building Information Modeling data in the AEC industry [13]. In
this case, the data model is an entity-relationship model, the data schema is an IFC
schema defined in the EXPRESS modeling language, and the two main data files
use a STEP or XML based file format. Note that the use of XML for one of the file
formats does not mean that an XML-based data model is being used for data
manipulation. The data file is only used as a convenient way of storing the data.
Tool Interoperability
To link two tools with incompatible data representations, a formalized data map-
ping needs to take place [14]. The mapping is defined as a one-way transformation
process, where the data representation of the source tool is mapped to the data
representation of the target tool. In existing scientific workflow systems, this
mapping process is called shimming [15].
In most cases, the incompatibility exists at all levels: at the data model level, the
data schema level, and the data file level. One approach in overcoming this type of
incompatibility is to create direct file translators for each pair of tools to be linked.
An alternative approach is to create indirect file translators that read and write to an
intermediate representation. Within the AEC industry, these are the two main
interoperability approaches being pursued, which we call direct file translation
and indirect file translation.
The direct file translation approach has two key weaknesses. First, since separate
data file translators need to be created for each pair of tools, the number of translators
required grows quadratically with the number of tools (up to n(n-1) one-way translators
for n tools). Second, the
required generality of the translator means that no assumptions can be made
regarding the sets of data to be processed, so these translators are complex to
develop and maintain (as file formats are continuously updated and changes are
often undocumented), and tend to be plagued by bugs and limitations.
With the indirect file translation approach, a range of different levels exist
depending on the type of data schema. At the low-level end of the spectrum, data
schemas may be limited to describing only geometric entities and relationships,
while at the more high-level end of the spectrum, schemas may also describe
complex real-world entities and relationships. The AEC industry has been using
low-level geometric translators for many decades, based on file formats such as
DXF, IGES and SAT. Such low-level geometry based file translation approaches
are clearly limited, since they do not allow for the transfer of non-geometric
semantic data.
At the more high-level end of the spectrum, the AEC research community has,
since the 1970s, been working on ontological schemas that combine geometric
information with higher level semantic constructs that describe entities and rela-
tionships in the AEC domain. Until recently, this approach was often conceptual-
ized as a single centralized model containing all project data and shared by all
project participants. However, this is now increasingly seen as being impractical for
a variety of reasons, and as a result such high-level ontologies are now being
promoted as a 'smart' form of file translation [16]. The latest incarnation of these
efforts is the STEP-based IFC standard for BIM data storage and exchange.
Open Interoperability
A number of platforms are being developed that enable users to apply both direct
and indirect file translators, focusing in particular on BIM and IFC. For example,
the SimModel aims to allow BIM and energy simulation tools to be linked via a new
XML based model aligned with IFC [17]; D-BIM workbench aims to allow a
variety of tools for analyzing differing performance aspects to be tightly integrated
with BIM tools [18].
However, there are many sets of tools for which such translators have either not
yet been developed or are no longer up-to-date. We therefore propose a more
flexible and open type of platform that also supports linking tools for which direct
and indirect file translators are not available. Such tools are linked using an
approach that allows users to define their own customized data mappers tailored
to the data exchange scenarios they require. Figure 2 shows the three approaches to
linking tools. Note that this figure omits approaches that do not rely on file
translation at all, e.g., agent-based interoperability [19]. The proposed
platform will allow users to create automated workflows that mix-and-match all
three tool-linking approaches, applying the most expedient approach for each pair
of tools. For example, in cases where IFC-based mappers are already available, the
user may prefer to simply apply such mappers, while in cases where no translators
are available, customized data mappers can be developed.
End users will need to be provided with various tools and technologies in order
to support them in this endeavor of developing mappers. This research focuses on
communities of users with limited programming skills, with the goal to develop a
mapping approach that allows such non-programmers to easily create and share
custom mappings in a collaborative manner. Since users are focused on one specific
data exchange scenario, these mappers can potentially be restricted to only a subset
of the data schemas of the design tools being linked, thereby resulting in mappers
that are orders of magnitude smaller and simpler when compared to general-
purpose file translators.
To achieve this goal, the mapping approach must be both flexible and user-
friendly. It must be flexible so that users can apply the same mapping strategies to a
wide range of data exchange scenarios. It must also be user-friendly so that it
supports users with limited programming skills in the process of creating and
debugging mappers.
Fig. 2 Interoperability approaches. Linking individual tools via file translation (left); via a shared
data schema (middle); or via a shared mapping process (right)
The complexity of creating such mappings is to a significant extent related to the
levels of data incompatibility between the source tool and target tool. So far, the
assumption has been that for most tools, data incompatibility will exist at all three
levels: data model, data schema, and data file. This creates many difficulties for
end-users attempting to create mappers. In particular, the user is required to deal
with two different data models, each of which will have its own languages for data
query, data manipulation, and schema definition. However, a simpler data exchange
scenario can be imagined, where the input and output data for the mapping both use
the same data model. In such a case, the user would be able to develop a mapping
within one coherent data representation framework, including the option of using a
single model based language for querying and manipulating data. In addition, given
a single data model, it then becomes feasible to provide visual tools for developing
such mappers, as will be discussed in more detail in the next section.
The simpler data exchange scenario with a common data model is clearly more
desirable. The proposed approach is therefore to transform data exchange scenarios
with incompatible data models into the simpler type of data exchange scenario
where both input and output data sets have a common data model. With this
approach, input data sets are created that closely reflect the data in the source
data file, and output data sets are created that closely reflect the data in the target
data file. The mapping task is then reduced to the simpler task of mapping from
input data sets to output data sets using a common data model. Figure 3 shows the
relationship between the mapping procedure and the source and target data files.
These input and output data sets may have data schemas, either informally
defined in help documentation or formally defined using a schema definition
language. In the latter case, the schema may be used to support visual mapping
tools. In most cases, these schemas will be reduced schemas in that they will only
cover a sub-set of the data that might be represented in the source and target data
files. For example, while the source and target tools may each have large and
complex schemas, a user may decide that for the task at hand, they will only be
using a small sub-set of the entities and relationships defined in those schemas. As a
result, these schemas may typically be small and highly focused on the task at hand.
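Informally, such a reduced schema might amount to little more than a listing of the entity types, required properties and edge types actually used in a given exchange; the concrete content below is purely illustrative:

# Illustrative reduced schema: only the entities and properties this particular
# exchange uses, not the full schemas of the tools being linked.
REDUCED_SCHEMA = {
    "polygon": {"required_properties": ["uid", "type", "group"],
                "outgoing_edges": ["boundary_is"]},
    "group":   {"required_properties": ["uid"],
                "outgoing_edges": ["contains"]},
}

def conforms(entity_type, properties):
    """Check a vertex against the reduced schema before mapping."""
    spec = REDUCED_SCHEMA.get(entity_type)
    return spec is not None and all(p in properties for p in spec["required_properties"])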
Fig. 3 The mapping of a source data file to a target data file involves three stages of data
transformation: parsing, mapping, and serializing
The input and output data sets can be generated in various ways. One approach is
to use a parser to convert the source data file into the input data set and a serializer to
convert the output data set into the target data file. In both reading and writing the
data files, the data is converted with the minimum amount of change, and neither
the parser nor the serializer performs any actual data mapping. As long as a generic
and flexible mapping data model is selected, the conversions between data file and
data model should be relatively straightforward. In some cases, it may even be
possible to automatically generate parsers and serializers. We focus here on the data
mapping procedure, which requires more in-depth interpretation.
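Put as a sketch (the function names are placeholders, not an actual API), the whole arrangement reduces to a three-step chain in which only the middle step carries scenario-specific logic:

# Sketch of the parse -> map -> serialize chain around a common data model.
# The parser and serializer mirror the files into and out of the data model;
# only the mapping step is defined by the end-user.
def run_mapping(source_path, target_path, parse, mapping, serialize):
    input_data = parse(source_path)       # source data file -> input data set
    output_data = mapping(input_data)     # user-defined transformation
    serialize(output_data, target_path)   # output data set -> target data file

# Example wiring, assuming a parser and serializer are available from a library:
# run_mapping("model.geo", "model.idf", parse_geo, geo_to_idf_mapping, serialize_idf)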
User Defined Data Mappings
Given a pair of automated parsing and serializing procedures, the task for the user is
then to define the data mapping. This mapping procedure needs to be able to process
any data set that adheres to the input data schema, and produce a data set that
adheres to the output data schema. For creating such mapping procedures, a number
of techniques can be identified, differing in complexity and power. For the simplest
type of mappings, which we refer to as declarative equivalency mapping, user-
friendly visual tools already exist, while for more complex types of mappings,
which we refer to as scripted mappings, the user is forced to write code. In this
research, we propose a powerful intermediate approach using flexible and highly
generic visual programming tools. This intermediate level we refer to as procedural
query mapping.
With declarative equivalency mappings, the input and output data schemas are
formally defined, and the user defines a list of semantically equivalent entities
between these schemas. Based on this user-defined information, a mapping proce-
dure can then be automatically generated that will transform the input data set to the
output data set. In some cases, it may be possible to define such mappings using
visual tools [20]. One key problem with this approach is that only fairly simple
mappings can be created using direct semantic mappings. More complex mappings
may require a number of input entities to be processed in some way in order to be
able to generate another set of target entities.
With procedural query mappings, the user creates data mapping rules using
visual programming languages specialized for particular types of data models.
These specialized languages include data query languages and data manipulation
languages. The former are used for retrieving data from a data set, and the latter for
inserting and deleting data in a data set. In many cases, the same language can be
used for both querying and manipulation. A popular example is the Structured
Query Language (SQL), which is used for both retrieving and manipulating data in
relational databases. Other generic languages for retrieving and manipulating data
include XQuery/XPath for data stored in XML data models, SPARQL for data
stored in Resource Description Framework (RDF) data models, and jaql for data
stored in JSON data models. Although such languages are specialized for certain
data models, the languages themselves are semantically still highly generic.
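For instance, in the relational case one language covers both roles; the following fragment (the table and values are invented purely for illustration) uses SQL via Python's standard library to first manipulate and then query the same data set:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE polygon (uid TEXT, type TEXT, grp TEXT)")

# data manipulation: insert entities ('grp' is used because GROUP is an SQL keyword)
conn.executemany("INSERT INTO polygon VALUES (?, ?, ?)",
                 [("w1", "wall", "zone1"), ("c2", "ceiling", "zone1")])

# data query: retrieve a filtered subset of the data
ceilings = conn.execute("SELECT uid FROM polygon WHERE type = 'ceiling'").fetchall()
print(ceilings)  # [('c2',)]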
With scripted mappings, the user develops mapping scripts using a programming
language. Such scripted mapping may be appropriate in cases where complex
mappings need to be defined. Consider the example in Fig. 1. The output from
the CAD system cannot be easily mapped to either the input for EnergyPlus or input
for Radiance. Instead, an additional step is required that performs Boolean opera-
tions on the polygons. For EnergyPlus, surface polygons need to be sliced where
there are changes in boundary conditions (as each surface can only have one
boundary condition attribute), and the boundary conditions must then be inferred,
i.e. internal, external or ground contact. For Radiance, surface polygons need to
have holes cut where there are windows. These additional steps may have to be
performed by a scripted mapper, the PolygonSlicer (Fig. 1).
For creating user-defined mappings within workflows, either the declarative
equivalency mapping or the procedural query mapping approach is seen as more
appropriate, since these approaches do not require advanced programming skills.
However, if more complex types of mappings are required, then scripted mappers
can be created. Ideally, such scripted mappers should be developed to apply to a
wide variety of situations and contexts, so as to be easily reusable.
For declarative equivalency mappings, a number of tools already exist, and for
scripted mappers, query and manipulation languages abound. However, visual tools
for procedural query mapping are rare. In addition, this approach is seen as being
particularly crucial, since it is not subject to the limitations of the simpler declar-
ative equivalency mapping approach, while at the same time it does not require the
more advanced programming skills needed for the more complex scripted mapping
approach. This research therefore specifically focuses on the development of a set
of visual tools for developing procedural query mappings.
Data Models for Mapping
It is envisaged that these various tools, parsers, serializers, and mappers could be
developed and shared through an online user community. Users could download
diverse sets of actors developed by different groups from a shared online repository,
and then string these together into customized workflows (e.g., Fig. 1). This process
would ideally emerge in a bottom-up manner with minimal restrictions being
placed on developers of actors. It is therefore important that no specific data
model is imposed, but instead that actor developers and other users are able to
choose preferred data models. For a particular pair of tool actors, various parser-
serializer pairs may be developed allowing users to choose to generate mappings
based on alternative data models. For example, one parser-serializer pair might use
a hierarchical XML data model, allowing users to create mappings with their
preferred declarative equivalency mapping tool, while another might use a
relational data model, allowing users to create a mapping by writing SQL scripts.
Ideally, a diverse ecosystem of actors would emerge.
For procedural query mapping, various data models and query languages were
considered from the point of view of applicability and ease of use. Below, an
example scenario is described in which the property graph data model was used
as the mapping data model. A property graph is a directed graph data structure
where edges are assigned a direction and a type, and both vertices and edges can
have attributes called properties. This allows property graphs to represent complex
data structures with many types of relationships between vertices. In graph theoretic
language, a property graph is known as a directed, attributed, multi-relational
graph. The query language used for querying and manipulating data in the property
graphs is called Gremlin [21].
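As a concrete illustration of the data model itself (not of any particular graph database or of Gremlin), a property graph can be held as a set of vertices and a set of directed, labelled edges, both carrying property dictionaries; the entity names used here follow the example scenario described below:

# Minimal in-memory property graph: vertices and edges both carry properties,
# and every edge has a direction and a type (label).
class PropertyGraph:
    def __init__(self):
        self.vertices = {}  # vertex id -> {property name: value}
        self.edges = []     # (out vertex id, label, in vertex id, {property: value})

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, out_id, label, in_id, **props):
        self.edges.append((out_id, label, in_id, props))

    def out(self, vid, label):
        """Follow outgoing edges of a given type: a single traversal step."""
        return [i for (o, l, i, _) in self.edges if o == vid and l == label]

g = PropertyGraph()
g.add_vertex("c2", Entity="polygon", Type="ceiling")
g.add_vertex("f7", Entity="polygon", Type="floor")
g.add_edge("c2", "boundary_is", "f7")
print(g.out("c2", "boundary_is"))  # ['f7']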
Example Scenario
In order to demonstrate the feasibility of the proposed approach, we have
implemented part of the example scenario shown in Fig. 1 using Kepler [22], an
open-source workflow system based on an actor-oriented software framework
called Ptolemy II [23]. Kepler workflows can be nested, allowing complex
workflows to be composed from simpler components, and enabling workflow
designers to build reusable, modular sub-workflows that can be saved and used
for many different applications.
Kepler is used to create a workflow connecting various tools, including SideFX
Houdini as a parametric CAD system to generate a building model and EnergyPlus
and Radiance as energy analysis simulation program and lighting analysis program,
respectively, to evaluate building performance. Any other (parametric) CAD soft-
ware and simulation tools could also be considered for this purpose.
A simplified design is used for testing the workflow, consisting of only two
spaces stacked on top of each other, each with a window in one wall. The Kepler
workflow is shown in Fig. 4; here the actors created wrap the Houdini application,
the EnergyPlus program, and the two Radiance programs (using Python as the
programming language). In total there are 14 polygons, and each polygon is
assigned a set of attributes that are used for generating the property graphs. The
model is shown in Fig. 5, together with key attributes defined for each polygon.
Fig. 4 The Kepler workflow. See Fig. 8 for the contents of the Mapper3 actor
Workflow Mappers
For all the mappers, property graphs are used as the mapping data model. For
Houdini, EnergyPlus, and Radiance, parsers and serializers are created for stepping
between the native file formats and the property graphs.
The workflow contains two scripted mappers and five composite mappers
containing sub-networks of mapping actors. The two scripted mappers are actually
two instances of the PolygonSlicer mapper, the scripted mapper performing the
polygon slicing using Boolean operations. This has been implemented in a gener-
alized way to be used for various slicing operations by users who do not have the
necessary programming skills to implement such a mapper. In this case, Radiance
and EnergyPlus require the polygons to be sliced in different ways. The
PolygonSlicer mapper has a set of parameters that allows users to define the
polygons to be sliced, and the Boolean operation to be performed. It also allows
users to specify certain other constraints, such as the maximum number of points
per polygon. Finally, it also identifies adjacency relationships between polygons,
for example if two polygons are a mirror image of one another. The input and output
files for this mapper use a standard JSON property graph file format, and the parser
and serializer use the existing GraphSON library.
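For orientation, a JSON property graph file of this kind has roughly the following shape when written out from Python (the field names below approximate, rather than reproduce exactly, the GraphSON conventions):

import json

# Approximate shape of a JSON property graph document: lists of vertices and
# edges, each with an identifier and arbitrary properties. Field names are
# indicative only and do not claim to match the GraphSON library exactly.
graph_doc = {
    "vertices": [
        {"_id": "c2", "Entity": "polygon", "Type": "ceiling"},
        {"_id": "f7", "Entity": "polygon", "Type": "floor"},
    ],
    "edges": [
        {"_outV": "c2", "_label": "boundary_is", "_inV": "f7"},
    ],
}

with open("sliced_model.json", "w") as f:
    json.dump(graph_doc, f, indent=2)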
The five composite mappers each contain sub-networks that perform a proce-
dural query type of mapping operation. These sub-networks are defined using
Kepler mapping actors that provide a set of basic functions for mapping graph
vertices and graph edges. Users are able to build complex customized mapping
networks by wiring together these basic mapping actors. Each mapping actor has
various parameters that allow the mapping function to be customized. When these
actors are executed, Gremlin mapping scripts are automatically generated based on
these parameter settings.
Fig. 5 CAD model of 14 polygons, each with three attributes (uid, type and group)
Three highly generic mapping actors are defined that can be used to create a wide
variety of mappings. The Merge actor is used for merging input graphs into a single
output graph; the Spawn actor is used for adding new vertices to the input graph;
and the Iterate actor is used for iteration over vertices and edges in the input graph
while at the same time generating vertices and edges in the output graph. This last
actor is powerful and very flexible, as it allows for copying, modifying, or deleting
vertices and edges from the input to the output. A parameter called select allows
users to specify a Gremlin selection filter on the input graph. For each entity (i.e.,
vertex or edge) in the filtered input graph, a particular action is triggered, which
could be the creation or modification of vertices or edges.
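The behaviour of the Iterate actor can be paraphrased in a few lines of Python (this is a sketch of the idea, not the actual Kepler actor or the Gremlin it generates); the select filter and the vertex-generating action are supplied by the user:

# Sketch of an Iterate-style step: for each entity in the filtered input graph,
# trigger an action that creates an entity in the output graph.
def iterate(input_vertices, select, action):
    return [action(v) for v in input_vertices if select(v)]

# input graph vertices as plain property dictionaries
input_vertices = [
    {"uid": "c2", "Entity": "polygon", "Type": "ceiling"},
    {"uid": "w3", "Entity": "polygon", "Type": "wall"},
]

output_vertices = iterate(
    input_vertices,
    select=lambda v: v["Entity"] == "polygon" and v["Type"] == "ceiling",
    action=lambda v: {"Name": v["uid"], "Surface Type": "Ceiling"})
print(output_vertices)  # [{'Name': 'c2', 'Surface Type': 'Ceiling'}]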
A Mapping Example
In order to understand the mapping process, the mapping from Houdini to
EnergyPlus will be described in more detail. The first step is for the user to create
the parametric model of the design together with a set of parameter values. The
Houdini wrapper will trigger Houdini to generate a model instance and will save it
as a geometry data file (using Houdini's .geo format, though other formats could be
used too, e.g., .obj).
Two mappers are then applied. The first mapper maps the output from Houdini
to the input of the PolygonSlicer, and the second mapper maps the output of the
PolygonSlicer to the input of EnergyPlus. For the first mapper, a parser is provided
for stepping from the Houdini geometry file to the property graph, and a serializer is
provided for stepping from the property graph to the JSON graph file. For the
second mapper, a parser is provided for stepping from the JSON graph file to the
property graph, and a serializer is provided for stepping from the property graph to
an EnergyPlus input file (using the EnergyPlus .idf format). As already mentioned
above, these parsers and serializers just mirror between the data file and the data
model, and as a result they can be implemented in a way that is highly generic.
Although implementing these parsers and serializers will require someone with
programming skills, it needs to be done only once, after which end-users can simply
select the required parsers and serializers from a library. Implementing the parser
and serializer for the JSON graph files is trivial since a library already exists. For
Houdini and EnergyPlus, the ASCII data files have a clear and simple structure,
resulting in a straightforward implementation for the parser and serializer.
Given a library of parsers and serializers, the task for the end-user is then
reduced to the transformation of the input property graph into the output property
graph using the three Kepler mapping actors. In anticipation of this mapping
process, the user can define additional attributes in the geometry model that can
be used during the mapping. In this scenario, the user knows that in order to map to
EnergyPlus, surfaces will need to be assigned different types and will also need to
be grouped into zones. In this case, the polygons in the parametric model are each
assigned three attributes: a uid attribute is used to define a unique name for each
polygon, a type attribute is used to define the type for each polygon, and a group
attribute is used to define the group to which the surface belongs, with groups
corresponding to zones (see Fig. 5). When the parser reads the geometry data file, it
will convert these attributes into properties, so that a polygon with attributes in the
geometry file will become a graph vertex with properties in the property graph.
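A parser of this kind does little more than mirror each polygon record into a graph vertex whose properties are the polygon's attributes. The sketch below assumes, purely for illustration, a simplified one-record-per-line layout rather than Houdini's actual .geo syntax:

# Minimal parser sketch: each polygon record (here assumed to be a line of the
# form "uid type group ...") becomes one vertex with those attributes as properties.
def parse_polygons(path):
    vertices = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            uid, ptype, group = fields[0], fields[1], fields[2]
            vertices[uid] = {"Entity": "polygon", "uid": uid,
                             "type": ptype, "group": group}
    return vertices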
The user then needs to create the graph mappers using the graph mapping nodes.
Figure 6 shows the overall structure of these input and output property graphs, and
Fig. 7 shows the properties associated with three of the vertices in each property
graph. The PolygonSlicer and the EnergyPlus actors both have input graph schemas
that specify the required structure of the graph and the required properties of the
vertices. The task for the user is therefore to create mappings that generate graphs
that adhere to these schema constraints.
In the first mapping, where the output of Houdini is mapped to the input of the
PolygonSlicer, two new vertices are added for the two groups, and edges are added
from the new group vertices to the polygon vertices. These vertices and edges are
created using an Iterate actor. The PolygonSlicer then transforms its input graph by
dividing the surfaces for the ceiling of the lower zone (c2) and the floor of the
upper zone (f7) so as to ensure that each surface has a homogeneous boundary
condition. The PolygonSlicer also detects the relationships between the floors and
ceilings, between the floors and the ground, and between windows and walls.
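In outline, that first mapping amounts to grouping the polygon vertices by their group property and wiring each new group vertex to its member polygons (the edge label used here is illustrative):

# Sketch of the first mapping: add one vertex per group and connect it to the
# polygons that belong to it.
def add_group_vertices(polygon_vertices):
    group_vertices = {}
    edges = []
    for uid, props in polygon_vertices.items():
        group = props["group"]
        if group not in group_vertices:
            group_vertices[group] = {"Entity": "group", "uid": group}
        edges.append((group, "contains", uid))  # group vertex -> polygon vertex
    return group_vertices, edges

polygons = {"f1": {"Entity": "polygon", "group": "zone1"},
            "c2": {"Entity": "polygon", "group": "zone1"},
            "f7": {"Entity": "polygon", "group": "zone2"}}
groups, edges = add_group_vertices(polygons)
print(sorted(groups), edges)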
In the second mapping, where the output of the PolygonSlicer is mapped to the
input of the EnergyPlus simulator, additional properties are added to the existing
vertices in the input graph, and a number of additional vertices are also added to
define a set of other objects required in the EnergyPlus input file. The Kepler network
for this mapper is shown in Fig. 8. The Spawn actor is used to create the additional
vertices, and Iterate actors are used to copy and modify existing vertices. The groups
are mapped to EnergyPlus zones, and the polygons to EnergyPlus surfaces. In the
process of mapping, the Iterate actor also transforms the edges that existed in the
input graph into properties in the output graph. The output graph becomes a simple
list of vertices under the idf root vertex. For example, in the input graph the window
is connected to the wall with an edge, while in the output graph the window is no
longer connected but instead has a property that specifies the wall name.
Fig. 6 Simplified diagrammatic representation of the property graphs (point data is not shown in order to reduce the complexity of the diagrams; before the PolygonSlicer there are 24 points, while afterwards there are 28 points)
For each different surface type, a separate Iterate actor is defined. For example,
consider the Iter_V_Ceilings actor in Fig. 8. This node generates the ceilings of
the two zones. Table 1 shows the two main parameters for the actor. The Select
parameter filters the input graph so that the remaining vertices all have an Entity
property with a value of polygon and a Type property with a value of ceiling,
and in addition have an outgoing boundary_is edge that points to another polygon
(i.e., the floor above). The Vertex properties parameter then defines a set of name-value
property pairs. For each polygon in the filtered input graph, the iterator actor
will generate a vertex in the output graph with the specified properties.
Fig. 7 An example of the property data for a few of the vertices in the property graphs. Typically, the property graphs will undergo a process of information expansion, where data is gradually added to the model as needed
Fig. 8 The Kepler mapper that maps the output of the PolygonSlicer actor to the input of the EnergyPlus actor. See the Mapper3 actor in Fig. 4
Note that when the user is specifying the property values, they can insert
Gremlin commands to extract these values from the input graph, thus ensuring
that the values can be dynamically generated. Figure 6 shows changes for a number
of vertices in the property graph as data is mapped and transformed. When the
Iter_V_Ceilings actor iterates over the c2.2 polygon in the input graph, it
generates the c2.2 EnergyPlus surface.
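The effect of these two parameters can be pictured as the following filter-and-emit step, written here as plain Python rather than the Gremlin that the actor generates; the EnergyPlus field names are indicative only:

# Paraphrase of the Iter_V_Ceilings behaviour: select polygon vertices of type
# "ceiling" that have an outgoing boundary_is edge to another polygon, and emit
# one EnergyPlus surface vertex for each of them.
def select_ceilings(vertices, edges):
    with_boundary = {out_id for (out_id, label, in_id) in edges
                     if label == "boundary_is" and vertices[in_id]["Entity"] == "polygon"}
    return [v for vid, v in vertices.items()
            if v["Entity"] == "polygon" and v["Type"] == "ceiling" and vid in with_boundary]

vertices = {"c2.2": {"Entity": "polygon", "Type": "ceiling", "uid": "c2.2"},
            "f7.1": {"Entity": "polygon", "Type": "floor", "uid": "f7.1"}}
edges = [("c2.2", "boundary_is", "f7.1")]

for v in select_ceilings(vertices, edges):
    print({"Name": v["uid"], "Surface Type": "Ceiling"})  # the generated output vertex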
Adaptive-Iterative Exploration
Once a workflow has been developed, it can be used to iteratively explore design
variants. The example workflow can be used without modification to evaluate any
design variants that consist of a set of stacked zones. If other spatial configurations
of zones, not limited to stacking, need to be evaluated, then the workflow may need
to be adjusted in minor ways. For example, if zones are configured so that they
are adjacent to one another, then the Iter_V_Walls actor in Fig. 8 would be
modified to allow common boundary walls between zones.
Conclusions
In order to support a bottom-up, user-controlled and process-oriented approach to
linking design and analysis tools, a data mapping approach is required that allows
designers to create and share custom mappings. To achieve this goal, the data
Table 1 The parameter names and values for the Iter_V_Ceilings actor. Gremlin code is shown in
italics, and makes use of two predefined local variables: g refers to the input graph, and x refers to
the entity being iterated over (which in this case is a vertex)
mapping approach should be both flexible in that it can be applied to a wide variety
of tools, and user-friendly in that it supports non-programmers in the process of
easily creating and debugging mappers. The use of common data models simplifies
the process for end-users to develop customized mappings. The example scenario
demonstrated how designers with minimal scripting skills would be able to set up
complex digital workflows that enable the fluid and interactive exploration of
design possibilities in response to custom performance metrics.
The next stage of this research will explore the scalability of the user-defined
graph mapping approach when working with larger data sets and more complex
data schemas (such as the IFC schema). In the current demonstration, the data sets
and data schemas are small, and as a result the graph mappers are relatively simple.
However, if data sets grow and the number of entity and relationship types is very
large, then the graph mappers could potentially become more difficult to construct.
In order to deal with this increased complexity, we foresee that the user will require
additional data management and schema management tools. The data management
tools could enable users to visualize, interrogate and debug property graph data
during the mapping process [24]. Schema management tools could let actor devel-
opers define formal graph schemas for input and output data for their actors. This
could then let end-users identify and isolate subsets of large schemas relevant to
their particular design scenario.
Acknowledgments The approach proposed in this research emerged out of discussions at the
“Open Systems and Methods for Collaborative BEM (Building Environment Modeling)” work-
shops held at the CAAD Futures Conference in July 2011 and at the eCAADe Conference in 2013.
We would like to thank the participants of these workshops for their contributions. Additionally,
we would like to thank the reviewers for their valuable comments.
References
1. Coons SA (1963) An outline of the requirements for a computer-aided design system. In:
Proceedings of the AFIPS, ACM, pp 299–304
2. Sutherland I (1963) Sketchpad: a man-machine graphical communication system. In: Proceedings
of the Spring Joint Computer Conference, Detroit, pp 329–346
3. Janssen P, Stouffs R, Chaszar A, Boeykens S, Toth B (2012) Data transformations in custom
digital workflows: property graphs as a data model for user defined mappings. In: Proceedings
of the Intelligent Computing in Engineering Conference ICE2012, Munich
4. Toth B, Janssen P, Stouffs R, Boeykens S, Chaszar A (2013) Custom digital workflows: a new
framework for design analysis integration. Int J Archit Comput 10(4):481–500
5. Deelman E, Gannon D, Shields M, Taylor I (2008) Workflows and e-science: an overview of
workflow system features and capabilities. Futur Gener Comput Syst 25:528–540
6. Yu J, Buyya R (2005) A taxonomy of scientific workflow systems for grid computing.
SIGMOD Rec 34:44–49
7. Curcin V, Ghanem M (2008) Scientific workflow systems – can one size fit all? In: CIBEC.
Cairo, pp 1–9
8. McPhillips T, Bowers S, Zinn D, Ludaescher B (2009) Scientific workflow design for mere
mortals. Futur Gener Comput Syst 25:541–551
9. Janssen P, Chen KW (2011) Visual dataflow modelling: a comparison of three systems. In:
Proceedings of the 14th International Conference on Computer Aided Architectural Design
Futures, Liege, Belgium, pp 801–816
10. Bowers S, Ludäscher B (2005) Actor-oriented design of scientific workflows. In: Proceedings
of the International Conference on Conceptual Modeling (ER), pp 369–384
11. Mirtschin J (2011). Engaging generative BIM workflows. In: Proceedings of LSAA 2011
Conference Collaborative Design of Lightweight Structures, Sydney
12. Tsichritzis DC, Lochovsky FH (1982) Data models. Prentice-Hall, Englewood Cliffs
13. Eastman CM, Teichholz P, Sacks R, Liston K (2011) BIM handbook – a guide to building
information modeling for owners, managers, designers, engineers, and contractors, 2nd edn.
Wiley, Hoboken
14. Kilian A (2011) Design innovation through constraint modeling. Int J Archit Comput 4:87–105
15. Altintas I (2011) Collaborative provenance for workflow-driven science and engineering. PhD
thesis, University of Amsterdam, Amsterdam
16. Bos P (2012) Collaborative engineering with IFC: new insights and technology. In:
Gudnason G, Scherer R (eds) eWork and eBusiness in architecture, engineering and construc-
tion, 9th ECPPM Conference Proceedings, pp 811–818
17. O'Donnell J, See R, Rose C, Maile T, Bazjanac V, Haves P (2011) SimModel: a domain data
model for whole building energy simulation. In: Proceedings of the 12th Conference of
International Building Performance Simulation Association, Sydney
18. Srinivasan, R, Kibert C, Thakur S, Ahmed I, Fishwick P, Ezzell Z, Lakshmanan J (2012)
Preliminary research in dynamic-BIM (D-BIM) workbench development. In: Proceedings of
the 2012 Winter Simulation Conference (WSC), pp 1–12
19. Kannengiesser U (2007) Agent-based interoperability without product model standards.
Comput Aided Civ Infrastruct Eng 22(2):96–114
20. Bellahsene Z, Bonifati A, Duchateau F, Velegrakis Y (2011) On evaluating schema matching
and mapping. In: Bellahsene Z, Bonifati A, Rahm E (eds) Schema matching and mapping.
Springer, Berlin, pp 253–291
21. Gremlin (2014) Retrieved 1 Apr 2014 from https://github.com/tinkerpop/gremlin/wiki
22. Ludäscher B, Altintas I, Berkley C, Higgins D, Jaeger E, Jones M, Lee A, Tao J, Zhao Y (2006)
Scientific workflow management and the Kepler system. Concurr Comput Pract Experience 18
(10):1039–1065
23. Eker J, Janneck J, Lee EA, Liu J, Liu X, Ludvig J, Sachs S, Xiong Y (2003) Taming
heterogeneity: the Ptolemy approach. Proc IEEE 91(1):127–144
24. Stouffs R (2001) Visualizing information structures and its impact on project teams: an
information architecture for the virtual AEC company. Build Res Inf 29(3):218–232
528 P. Janssen et al.
... Still, potential reductions in control stem from issues of system reliability, viruses, etc. (vs. physical damage to analog representations) and from questions about the applicability of approaches adopted from manufacturing when applied to more open design processes --although these can be mitigated to some extent by recourse to more open design space exploration tools, 'custom digital workflows', and approaches to interoperability [5,6,7,8] as discussed further below. Given these points, design representation with digital means on the whole seems to come out on the side of greater control, in balance. ...
... Another associated risk is information overload, "losing sight of the forest for the trees": a possible excess of choices and consequent disorientation in balancing conflicting requirements. These may be offset to an extent with good data querying and visualisation (see below) [5,8,9,10]. Optimisation builds upon analysis and simulation above, with additional power and control exercised through multiple iterations of goal-directed design revisions, greatly increasing the number of design variations examinable. ...
... This typically requires a less constrained and more interactive approach than optimisation, with potentially more collaboration, consequently more negotiation (therefore less command), and more interest in comprehending all of the (design) data produced in order to gain a better appreciation of the design situation and potentials. Research on 'custom digital workflows' [7,8] and 'multivariate interactive visualisation' [9] address some of these issues by on the one hand enabling the construction of more open interative loops for design generation and evaluation, and on the other hand more effective comprehension of the copious data resulting from such processes. Designers can thus gain more control over the choice of software to use in design and analysis, and also more control in the face of information overload (noted above also as a potential problem with automated optimisation). ...
Conference Paper
Issues of control and command in architecture are considered here via reflections on recent and current research projects concerning digital technologies. The projects’ topics cover a range of scales and approaches, from the planning and design of urban ensembles to the detailing of panels for constructing freeform building envelopes. Additional topics on this spectrum include methods to support open-ended design explorations, goal-driven optimisations, participatory design and the internet-of-things. In each of these the possibilities and methods for controlling the design process and the resulting artifacts and systems are addressed in different ways, which consequently influence the roles of architects in different ways. Overall we see that while digital technologies do indeed enhance architects’ control in some cases, some applications require sharing of control with others, while still others may result in loss of control either to other parties due to transferability of skills, for example or altogether due to complexity and feedbacks. Awareness of these different possibilities may aid better use of the technologies.
... For instance, data acquisition, urban modelling, simulation and visualization tools can be linked so that a completely interactive exploration of design alternative becomes a reality. As Janssen et al. [3] mentions, this integration has to happen in a user friendly and flexible way to make this useful technology available to wider range of urban planners. He proposes a workflow configuration engine to manage tool collaborations within an urban planning pipeline (Fig. 2). ...
... As shown in the related work section, there are quite a few middleware systems focusing on urban planning [3,[7][8][9]. ...
... With the presented concept of LUCI as described in Sect. 3 we are able to answer our initial question, how can we couple different simulations that are written in various programming languages and that can run on different platforms. The second question about the functioning and performance of LUCI was positively answered by the dynamic load balancing experiments in Sect. 4 as well as by the use cases presented in Sect. 5. ...
Article
Full-text available
In this paper we introduce LUCI, a Lightweight Urban Calculation Interchange system, designed to bring the advantages of calculation and content co-ordination system to small planning and design groups by the means of an open source middle-ware. The middle-ware focuses on problems typical to urban planning and therefore features a geo-data repository as well as a job runtime administration, to coordinate simulation models and its multiple views. The described system architecture is accompanied by two exemplary use cases, that have been used to test and further develop our concepts and implementations.
... For instance, data acquisition, urban modelling, simulation and visualization tools can be linked so that a completely interactive exploration of design alternative becomes a reality. As Janssen et al. [3] mentions, this integration has to happen in a user friendly and flexible way to make this useful technology available to wider range of urban planners. He proposes a workflow configuration engine to manage tool collaborations within an urban planning pipeline (Fig. 2). ...
... As shown in the related work section, there are quite a few middleware systems focusing on urban planning [3,[7][8][9]. ...
... With the presented concept of LUCI as described in Sect. 3 we are able to answer our initial question, how can we couple different simulations that are written in various programming languages and that can run on different platforms. The second question about the functioning and performance of LUCI was positively answered by the dynamic load balancing experiments in Sect. 4 as well as by the use cases presented in Sect. 5. ...
... As shown in the related work section, there are quite a few middleware systems focusing on urban planning, see also Al-Jaroodi et al, 2003, Borst et al., 2009, Romualdo-Suzuki, 2013and Janssen et al., 2014 We introduce an open geospatial middleware for urban planning that includes a geo-data repository and an asynchronous job management engine that allows a supervised execution of modeling, simulation and visualization tasks in a network. What our solution focuses on is simple configurability and usability, parallel computation of urban simulations and flexible web integration. ...
Conference Paper
Full-text available
In this paper we introduce LUCI, a Lightweight Urban Calculation Interchange system, designed to bring the advantages of a calculation and content coordination system to small planning and design groups by the means of an open source middle-ware. The middle-ware focuses on problems typical to urban planning and therefore features a geo-data repository as well as a job runtime administration, to coordinate simulation models and its multiple views. The described system architecture is accompanied by two exemplary use cases that have been used to test and further develop our concepts and implementations. Figure 1. Transition Workshop at FCL in Singapore in April 2015 showing several views of the CPlan tool backed by LUCI imitating a planning workshop.
Conference Paper
Full-text available
This paper investigates the state-of-the-art with respect to simulation-based planning support systems in order to draw a set of requirements and best practices for an urban planning and design framework that enables multiple stakeholders with differing perspectives to systematically explore design options, leveraging the latest analysis and simulation techniques. From these requirements and best practices, the foundations and structure of such an urban planning and design framework are developed. A number of technological and methodological challenges are identified for future investigation.
Chapter
Full-text available
Executive SummaryIntroductionTypes of Construction FirmsInformation Contractors Want from BIMProcesses to Develop a Contractor Building Information ModelReduction of Design Errors Using Clash DetectionQuantity Takeoff and Cost EstimatingConstruction Analysis and PlanningIntegration with Cost and Schedule Control and Other Management FunctionsUse for Offsite FabricationUse of BIM Onsite: Verification, Guidance, and Tracking of Construction ActivitiesImplications for Contract and Organizational ChangesBIM Implementation
Conference Paper
Full-text available
This paper presents a Simulation Domain Model (SimModel) -a new interoperable XML-based data model for the building simulation domain. SimModel provides a consistent data model across all aspects of the building simulation process, thus preventing information loss. The model accounts for new simulation tool architectures, existing and future systems, components and features. In addition, it is a multi-representation model that enables integrated geometric and MEP simulation configuration data. The SimModel objects ontology moves away from tool-specific, non-standard nomenclature by implementing an industry-validated terminology aligned with Industry Foundation Classes (IFC). The first implementation of SimModel supports translations from IDD, Open Studio IDD, gbXML and IFC. In addition, the EnergyPlus Graphic User Interface (GUI) employs SimModel as its internal data model. Ultimately, SimModel will form the basis for a new IFC Model View Definition (MVD) that will enable data exchange from HVAC Design applications to Energy Analysis applications. Extensions to SimModel could easily support other data formats and simulations (e.g. Radiance, COMFEN, etc.).
Conference Paper
Full-text available
Visual programming languages enable users to create computer programs by manipulating graphical elements rather than by entering text. The difference between textual languages and visual languages is that most textual languages use a procedural programming model, while most visual languages use a dataflow programming model. When visual programming is applied to design, it results in a new modelling approach that we refer to 'visual dataflow modelling' (VDM). Recently, VDM has becoming increasingly popular within the design community, as it can accelerate the iterative design process, thereby allowing larger numbers of design possibilities to be explored. Furthermore, it is now also becoming an important tool in performance-based design approaches, since it may potentially enable the closing of the loop between design development and design evaluation. A number of CAD systems now provide VDM interfaces, allowing designers to define form generating procedures without having to resort to scripting or programming. However, these environments have certain weaknesses that limit their usability. This paper will analyse these weaknesses by comparing and contrasting three VDM environments: McNeel Grasshopper, Bentley Generative Components, and Sidefx Houdini. The paper will focus on five key areas: * Conditional logic allow rules to be applied to geometric entities that control how they behave. Such rules will typically be defined as if-then-else conditions, where an action will be executed if a particular condition is true. A more advanced version of this is the while loop, where the action within the loop will be repeatedly executed while a certain condition remains true. * Local coordinate systems allow geometric entities to be manipulated relative to some convenient local point of reference. These systems may be either two-dimensional or three-dimensional, using either Cartesian, cylindrical, or spherical systems. Techniques for mapping geometric entities from one coordinate system to another also need to be considered. * Duplication includes three types: simple duplication, endogenous duplication, and exogenous duplication. Simple duplication consists of copying some geometric entity a certain number of times, producing identical copies of the original. Endogenous duplication consist of copying some geometric entity by applying a set of transformations that are defined as part of the duplication process. Lastly, exogenous duplication consists of copying some geometric entity by applying a set of transformations that are defined by some other external geometry. * Part-whole relationships allow geometric entities to be grouped in various ways, based on the fundamental set-theoretic concept that entities can be members of sets, and sets can be members of other sets. Ways of aggregating data into both hierarchical and non-hierarchical structures, and ways of filtering data based on these structures need to be considered. * Spatial queries include relationships between geometric entities such as touching, crossing, overlapping, or containing. More advanced spatial queries include various distance based queries and various sorting queries (e.g. sorting all entities based on position) and filtering queries (e.g. finding all entities with a certain distance from a point). For each of these five areas, a simple benchmarking test case has been developed. 
For example, for conditional logic, the test case consists of a simple room with a single window, subject to the condition that the window should always be in the longest north-facing wall. If the room is rotated or its dimensions changed, then the window must re-evaluate itself and possibly move to a different wall. For each benchmarking test case, visual programs are implemented in each of the three VDM environments. The visual programs are then compared and contrasted, focusing on two areas. First, the types of constructs used in each of these environments are compared and contrasted. Second, the cognitive complexity of the visual programming task in each of these environments is compared and contrasted.
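As an indication of the logic such a benchmark exercises, independent of any particular VDM environment, the following minimal Python sketch re-evaluates which wall should host the window. The footprint representation and the assumption that +y points north are illustrative choices, not taken from the paper.

import math

def wall_normal(p1, p2):
    # Outward normal and length of a wall edge, assuming a
    # counter-clockwise footprint.
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    return (dy / length, -dx / length), length

def longest_north_facing_wall(footprint):
    # footprint: list of (x, y) corners in counter-clockwise order;
    # +y is assumed to point north.
    best = None
    for i, p1 in enumerate(footprint):
        p2 = footprint[(i + 1) % len(footprint)]
        (nx, ny), length = wall_normal(p1, p2)
        if ny > 0 and (best is None or length > best[1]):
            best = ((p1, p2), length)
    return best[0] if best else None

# Re-evaluating after a rotation or resize simply means applying the rule again.
room = [(0, 0), (8, 0), (8, 4), (0, 4)]
print(longest_north_facing_wall(room))  # the 8 m wall facing +y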
Conference Paper
The concept of working in one central Building Information Model (BIM) is becoming increasingly popular. Reports in the literature as well as industry practitioners describe the lack of good implementations of IFC import/export in current software tools. The so-called ‘round tripping’ of an IFC model cannot be performed without data loss, making the merging of data into one central data repository impractical. This feeds discussions about the workability of IFC in relation to a homogeneous software environment. In the Netherlands several experiments were conducted to investigate whether IFC still meets the needs of the AEC industry. The observations and opinions from users refute current theories and perceptions on collaboration using IFC. This paper describes a collaboration process called ‘reference models’. User opinions from the research indicate that the use of IFC, in a suitable collaboration process, meets the needs of the industry even better than homogeneous proprietary software environments.
Article
Flexible information exchange is critical to successful design-analysis integration, but current top-down, standards-based and model-oriented strategies impose restrictions that contradict this flexibility. In this article we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. We then discuss how a shared mapping process that is flexible and user-friendly supports non-programmers in creating these custom connections. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We then discuss potential challenges and opportunities for its development as a flexible, visual, collaborative, scalable and open system.
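The kind of user-defined mapping described above can be pictured with a minimal Python sketch in which model data is held as a simple property graph and a short user-written rule maps it onto the flat records an analysis tool might expect. The node properties, edge labels and target record format here are illustrative assumptions, not an actual tool interface.

# Model data as a simple property graph: nodes with key-value properties
# and labelled edges. Names and values are illustrative only.
graph = {
    "nodes": {
        "wall-1": {"type": "Wall", "area": 12.0, "u_value": 0.35},
        "win-1":  {"type": "Window", "area": 2.5, "g_value": 0.6},
    },
    "edges": [
        ("wall-1", "hosts", "win-1"),
    ],
}

def windows_hosted_by(graph, wall_id):
    # Follow 'hosts' edges from a wall to its windows.
    return [t for (s, label, t) in graph["edges"]
            if s == wall_id and label == "hosts"]

def to_energy_input(graph):
    # Map the graph onto the flat records an (assumed) energy tool expects.
    records = []
    for node_id, props in graph["nodes"].items():
        if props["type"] == "Wall":
            glazed = sum(graph["nodes"][w]["area"]
                         for w in windows_hosted_by(graph, node_id))
            records.append({"id": node_id,
                            "opaque_area": props["area"] - glazed,
                            "u_value": props["u_value"]})
    return records

print(to_energy_input(graph))

In a workflow platform of the kind proposed, such a rule would be authored by the user to suit a particular pair of tools, rather than fixed in advance by a standard schema.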
Conference Paper
Flexible information exchange is critical to successful design integration, but current top-down, standards-based and model-oriented strategies impose restrictions that are contradictory to this flexibility. In this paper we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We discuss potential challenges and opportunities for its development as a flexible, visual, collaborative, scalable and open system.
Conference Paper
Past and ongoing research efforts toward seamless integration of building design and analysis have established a strong foothold in the building community. Yet, there is a lack of seamless connectivity between Building Information Modeling (BIM) and building performance tools. The D-BIM Workbench provides an essential framework to conduct integrated building performance assessments within BIM, an environment familiar to all stakeholders. With tighter tool integration within BIM, this open-source Workbench can be tailored to specific analyses such as the energy, environmental and economic impact of buildings. The Workbench, currently under development, will enable on-the-fly simulations of building performance tools to design, operate and maintain a low/Net Zero Energy (NZE) built environment and beyond. This paper discusses preliminary research in the development of the D-BIM Workbench, including the Workbench architecture, its open-source environment, and other efforts currently in progress such as the integration of 3D heat transfer in the Workbench.