Data transformations in custom digital workflows:
Property graphs as a data model for user-defined mappings
PATRICK JANSSEN1, RUDI STOUFFS1,2, ANDRE CHASZAR2,
STEFAN BOEYKENS3 and BIANCA TOTH4
1 National University of Singapore, Singapore
patrick@janssen.name, stouffs@nus.edu.sg
2 Delft University of Technology, Netherlands
r.m.f.stouffs@tudelft.nl, a.t.chaszar@tudelft.nl
3 KU Leuven, Belgium
stefan.boeykens@asro.kuleuven.be
4 Queensland University of Technology, Australia
bianca.toth@qut.edu.au
Abstract. This paper describes the use of property graphs for mapping data between AEC
software tools, which are not linked by common data formats and/or other interoperability
measures. The intention of introducing this in practice, education and research is to facilitate the
use of diverse, non-integrated design and analysis applications by a variety of users who need to
create customised digital workflows, including those who are not expert programmers. Data model
types are examined by way of supporting the choice of directed, attributed, multi-relational graphs
for such data transformation tasks. A brief exemplar design scenario is also presented to illustrate
the concepts and methods proposed, and conclusions are drawn regarding the feasibility of this
approach and directions for further research.
1. Introduction
There is a clear and urgent need for new approaches to information exchange that address the
persistent lack of interoperability and integration in building design, analysis and
construction. The continuing and active discourse amongst AEC practitioners and researchers
alike highlights current limitations in both process and technology that commonly challenge
design collaboration. We propose that bottom-up, user-controlled and process-oriented
approaches to linking design and analysis tools are more appropriate than current top-down,
standards-based and model-oriented strategies, because they provide degrees of flexibility
critical to the process(es) of design (Coons 1963). This approach emerged out of discussions
at the “Open Systems and Methods for Collaborative BEM (Building Environment
Modelling)” workshop held at the CAAD Futures Conference in July 2011, and was further
developed into a proposal for a platform (Toth et al., 2012). This paper does not intend to give
a comprehensive overview of the proposed platform, but instead focuses on critical aspects of
overcoming interoperability hurdles.
The proposed platform is based on existing scientific workflow systems that enable the
composition and execution of complex task sequences on distributed computing resources
(Deelman et al., 2009). These systems exhibit a common reference architecture and consist of
a graphical user interface (GUI) for authoring workflows, along with a workflow engine that
handles invocation of the applications required to run the solution (Curcin and Ghanem,
2008). Workflows are represented as networks of nodes and wires that can be configured and
reconfigured by users as required. Nodes perform some useful function; wires support the
flow of data, linking an output of one node to an input of another node. We consider three
types of nodes: process nodes, input/output (IO) nodes and control nodes. Process nodes
perform data analysis and transformation procedures. They have a number of typed input and
output ports for receiving and transmitting data files, as well as meta-parameters that can be
set by the user to guide task execution. IO nodes and control nodes are simpler types of nodes
that are not associated with specific design-task-related computational procedures, but instead
provide functionality related to workflow initiation, execution and completion. In this paper
we focus primarily on the role of process nodes in workflows, and the development of a
strategy to support custom data transformation procedures.
Process nodes can be further classified into tool nodes and mapper nodes. Tool nodes wrap
existing applications to make their functionality and data accessible to the workflow, while
mapper nodes apply transformation procedures to data sets in order to map the output from
one tool node to the input for another. Figure 1 shows a conceptual diagram of an example
network in which a parametric CAD system node is connected via a set of mapper nodes
(denoted by ‘M’) to EnergyPlus [1] and Radiance [2] simulation nodes. The CAD system
node encapsulates a procedure that starts the CAD system, loads a specified model, and then
generates a model instance by setting certain parameter values. The resulting geometric output
undergoes two separate transformations that map it into both EnergyPlus and Radiance
compatible formats. The simulation nodes then read in this transformed data, run their
respective simulations, and generate output data in the form of simulation results.
Figure 1: Example network of nodes. A parametric CAD system is linked to Radiance and
EnergyPlus via a set of mappers (M). End users may contribute the white components, node
developers will build the grey components, and the black components are existing tools.
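As a rough sketch of these concepts, the following Python fragment models process nodes with typed ports and meta-parameters and wires them into the network of Figure 1. All class, port and parameter names here are illustrative assumptions rather than the platform's actual API; in the proposed platform, execution itself would be handled by a workflow engine.

class ProcessNode:
    """Illustrative process node with typed ports and meta-parameters."""
    def __init__(self, name, inputs, outputs, meta=None):
        self.name = name          # node name
        self.inputs = inputs      # port name -> expected data type
        self.outputs = outputs    # port name -> produced data type
        self.meta = meta or {}    # user-set meta-parameters guiding execution

class Workflow:
    """Illustrative container for nodes and the wires connecting their ports."""
    def __init__(self):
        self.nodes, self.wires = {}, []

    def add(self, node):
        self.nodes[node.name] = node
        return node

    def wire(self, src, out_port, dst, in_port):
        # A wire links an output of one node to an input of another; here we
        # only check that the declared port types agree.
        assert src.outputs[out_port] == dst.inputs[in_port], 'port type mismatch'
        self.wires.append((src.name, out_port, dst.name, in_port))

# The network of Figure 1 (node and port names are invented for illustration).
wf = Workflow()
cad = wf.add(ProcessNode('ParametricCAD', {'params': 'json'}, {'geometry': 'graph'},
                         meta={'model': 'two_zone_box'}))
m_e = wf.add(ProcessNode('MapperToEnergyPlus', {'in': 'graph'}, {'out': 'graph'}))
m_r = wf.add(ProcessNode('MapperToRadiance', {'in': 'graph'}, {'out': 'graph'}))
ep  = wf.add(ProcessNode('EnergyPlus', {'model': 'graph'}, {'results': 'table'}))
rad = wf.add(ProcessNode('Radiance', {'scene': 'graph'}, {'results': 'table'}))

wf.wire(cad, 'geometry', m_e, 'in'); wf.wire(m_e, 'out', ep, 'model')
wf.wire(cad, 'geometry', m_r, 'in'); wf.wire(m_r, 'out', rad, 'scene')
print(len(wf.nodes), 'nodes,', len(wf.wires), 'wires')  # 5 nodes, 4 wires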
Data mapping can be approached in a number of ways, ranging from an ontological approach,
where a common ontological data structure is imposed on data exchanges, to an open-world
approach, where the user is expected to resolve data format issues manually - a process
known as ‘shimming’ (Altıntaş, 2011). In the AEC industry, the most prominent solution to
this issue is the current evolution of Building Information Modelling (BIM), which tends
toward the all-encompassing ontological end of the spectrum, with the Industry Foundation
Classes (IFC) as its embodiment. However, IFC, like any standardised ontology, has
significant epistemological, practical and technical limitations (Bowker and Star 1999,
Kiviniemi 2006, Pazlar and Turk 2008). Therefore, rather than reading and writing to a
common representational structure, we propose that tool nodes be coupled more flexibly
through mapper nodes that allow direct data exchange via any data or file format.
The applicability of a particular mapping approach may also depend on the specific design
scenario and, more importantly to us, on the skills of the designer. In this research, we are
focusing on communities of users with limited programming skills, with the goal of developing a
mapping approach that allows such non-programmers to easily create and share custom
mappings in a collaborative manner. To achieve this goal, the mapping approach must be both
flexible and user-friendly. It must be flexible so that users can apply the same mapping
approach to any type of data. This approach must therefore be semantically agnostic. It must
also be user-friendly to the extent that it supports users with limited programming skills in the
process of creating and debugging mappers.
There will, however, always be cases where complex mappings need to be defined that require
more advanced programming skills. Consider the example in Figure 1. The output from the
CAD system cannot be easily mapped to either the input for EnergyPlus or the input for
Radiance. Instead, an additional step is required that performs Boolean operations on the
polygons. For EnergyPlus, surface polygons need to be sliced wherever the boundary
conditions change (as each surface can only have one boundary condition attribute), and these
boundary conditions must then be inferred, i.e. internal, external or ground contact. For
Radiance, surface polygons need to have holes cut where there are windows. These additional
steps may have to be performed by a scripted mapper, here denoted the PolygonSlicer.
The need for this additional mapper illustrates the difference between mappers that perform
relatively straightforward manipulations of the data structure and data content, and mappers
that perform more complex data transformations requiring specific computational functions.
While we aim to support users of different skill levels to create these mappers, this may not
always be possible with scripted mappers, which would typically need to be created by
developers with programming skills. Ideally, such scripted mappers should be developed to
apply to a wide variety of situations and contexts, so as to be easily reusable.
2. Data models for data mappers
With regard to creating mappers, a useful distinction can be made between three steps: data
reading, data transformation, and data writing. Reading data from and writing data to specific file
formats and databases are functions that can largely be generated using existing technology,
specifically parser generators and serialiser generators. Therefore, we focus here on the data
transformation step, which requires more in-depth interpretation. An explicit model for data
transformation is needed to relate constructs in the source representation to constructs in the
target representation (Kilian, 2006). This may involve deriving new data as well as discarding
data that is superfluous in the target format. The aim of the transformation is to create a target
data set that is as close as possible to the target file format, so that serialisation is reduced to a
very simple formatting procedure, which can be fully automated.
When transforming data, the data model provides ways of organising information at an
abstract level, including defining the data structure, the data constraints, and the data
operations. Two contrasting approaches to data modelling are the general-purpose approach and the
domain-specific approach. A general-purpose model organises data using generic constructs
that are often highly domain independent. Due to this generic nature, the range of data that
can be described tends to be very broad. It offers a way of defining a data structure that is
very flexible but relies on human interpretation of semantic meaning.
A domain-specific model represents domain-specific information using semantic constructs
related to a particular domain. The model may be defined using either a Data Definition
Language (DDL) or a Domain Specific Modelling Language (DSML). Due to the highly
specific nature of the constructs, the type of information that can be described tends to be
relatively narrow. However, this manner of representing information supports automated
interpretation of semantic meaning.
In many cases, domain-specific models are defined on top of a general-purpose one by
specifying additional constraints on the general-purpose model. An example is the wide range of
XML schemas defined on top of the XML data model using the XML Schema DDL.
Data mappings will typically need to transform both the data content and the data structure. In
the case of domain-specific models, when the semantic schemas for both the source data
model and the target data model are available, the definition by the user of a list of
semantically equivalent entities between these schemas may be sufficient. Based on this user-
defined information, a mapping procedure can then be automatically generated that will
transform the source data set to the target data set. In some cases, it may be possible to define
such mappings using visual tools such as Altova's MapForce [3] (Altova 2005), which
generates XSLT/XQuery code based on a mapping between elements in two XML schemas.
We denote this approach a declarative equivalency mapping. One key limitation of this
approach is that direct semantic correspondences can only express fairly simple mappings.
More complex mappings may require a number of source entities to be processed in some
way in order to be able to generate another set of target entities.
Alternatively, the user may create data transformation rules using languages specialised for
particular types of data models. We denote this approach a procedural query mapping. These
specialised languages include data query languages and data manipulation languages. The
former are used for retrieving data from a data set, and the latter for inserting and deleting
data in a data set. In many cases, the same language can be used for both querying and
manipulation. A popular example is the Structured Query Language (SQL), which is used for
both retrieving and manipulating data in relational databases. Other languages for retrieving
and manipulating data include XQuery / XPath for data stored in XML data models, and
SPARQL for data stored in Resource Description Framework (RDF) data models. Although
each of these languages is specialised for a particular data model, the languages themselves are
still semantically highly generic.
For creating user-defined mappings within workflows, the procedural query mapping
approach is seen as being more appropriate since it is semantically agnostic and therefore
highly flexible. In such a case, the input data and output data for each of the tools could be
made to adhere to the same general-purpose model, and this commonality would allow the
tools to be more easily sharable. The user could download diverse tools (and possibly
mappers) developed by different groups from a shared online repository, and then string these
together into customised workflows (e.g., see the workflow in Figure 1). Each tool developer
would specify a domain-specific model for the input data, referred to as the input schema (and
optionally also a domain-specific model for the output data, referred to as the output schema).
The user’s task would then be to write mappers, where necessary, that generate data sets that
adhere to the input schemas of the selected tools.
So far, the tools that have been considered have been design tools such as parametric
modelling software and simulation programs. However, any existing tools that can be
wrapped could be included in the tool library. Two types of tools that would likely be
desirable are spreadsheet tools (such as Microsoft Excel) and data analysis and visualisation
tools (such as Tableau Desktop [4]). In addition, existing data mapping tools could also be
leveraged for creating specialised types of mappings. One example already mentioned above
is Altova’s Mapforce tool. Another example would be FME Desktop by SafeSoftware [5]
(SafeSoftware 2008), which allows users to visually construct data mappings for geo-spatial
data types by selecting and configuring predefined sets of data transformers. Although such
mapping tools may be useful in certain cases, a more general approach to creating mappings
would nevertheless still be needed.
With respect to the procedural query mapping approach, we have considered various general-
purpose data models and query languages from the point of view of applicability and ease of
use. The three main data models that were considered are relational data models using SQL,
XML data models using XQuery, and property graphs using Gremlin [6].
Relational data models organise data into table structures consisting of rows and columns,
XML data models organise data into hierarchical trees consisting of elements and attributes,
and property graphs organise data into network structures consisting of vertices and edges. Of
the three, the property graph is the least well known: it is a graph data structure in which each
edge is assigned a direction and a type (label), and both vertices and edges can have
attributes called properties. This allows property graphs to represent complex data structures
with many types of relationships between vertices. In graph-theoretic terms, a property
graph is a directed, attributed, multi-relational graph. Many other graph data models,
such as RDF graphs, can be viewed as special kinds of property graphs. Gremlin is a domain-
specific graph traversal language for navigating such graphs.
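To make the structure concrete, the following minimal Python sketch stores a property graph as vertices and directed, labelled edges that both carry key-value properties. The class, its methods and the sample identifiers are assumptions made for illustration only, not part of any existing library.

class PropertyGraph:
    """Minimal sketch of a directed, attributed, multi-relational graph."""
    def __init__(self):
        self.vertices = {}   # vertex id -> dict of properties
        self.edges = []      # (out_vertex, label, in_vertex, dict of properties)

    def add_vertex(self, vid, **props):
        self.vertices[vid] = dict(props)

    def add_edge(self, out_v, label, in_v, **props):
        self.edges.append((out_v, label, in_v, dict(props)))

    def out(self, vid, label):
        # Follow outgoing edges with the given label (a one-step traversal).
        return [i for o, l, i, _ in self.edges if o == vid and l == label]

    def in_(self, vid, label):
        # Follow incoming edges with the given label.
        return [o for o, l, i, _ in self.edges if i == vid and l == label]

# A fragment of the example model: one zone containing a ceiling polygon.
g = PropertyGraph()
g.add_vertex('zone1', Entity='group')
g.add_vertex('c2', Entity='polygon', Type='ceiling')
g.add_edge('c2', 'is_in_group', 'zone1')
print(g.in_('zone1', 'is_in_group'))   # ['c2']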
Typically, modelling and simulation of design problems requires data structures with highly
complex relationship networks. Relational data models and XML data models are not able to
represent such complex networks in an elegant way. Furthermore, the query languages used
for these models become cumbersome when working with complex relationship networks. This is
especially problematic for users with little or no programming skills, who must nevertheless be able
to understand these complex relationship networks. In the case of SQL, "join" clauses are required to relate
data in multiple tables, while in XQuery, “idref” functions have to be used to relate data from
different parts of the hierarchical tree. In contrast, property graphs use an inherently
networked data structure and therefore do not suffer from these drawbacks. Below, an
example scenario is described in which the property graph data model was used as the
underlying general-purpose data model, and Gremlin as the query and manipulation language.
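The difference can be illustrated with a small, self-contained comparison: relating a window to its zone via its parent wall requires chained joins over relational tables, whereas on a networked structure it is a two-step traversal. The tables, identifiers and edge labels below are invented for illustration.

# Relational form: separate tables related through foreign keys (invented data).
walls   = [{'id': 'w1', 'zone_id': 'z1'}]
windows = [{'id': 'w1.win', 'wall_id': 'w1'}]
zones   = [{'id': 'z1', 'name': 'lower zone'}]

# Equivalent of an SQL double join:
#   SELECT z.name FROM windows wi
#     JOIN walls w ON wi.wall_id = w.id
#     JOIN zones z ON w.zone_id = z.id
joined = [z['name']
          for wi in windows
          for w in walls if w['id'] == wi['wall_id']
          for z in zones if z['id'] == w['zone_id']]
print(joined)  # ['lower zone']

# Graph form: the same relationships stored as labelled, directed edges.
out_edges = {('w1.win', 'is_in_wall'): ['w1'], ('w1', 'is_in_group'): ['z1']}
names = {'z1': 'lower zone'}

# A Gremlin-style two-step traversal: window -> wall -> zone.
zone_ids = [z for w in out_edges[('w1.win', 'is_in_wall')]
              for z in out_edges[(w, 'is_in_group')]]
print([names[z] for z in zone_ids])  # ['lower zone']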
3. Example scenario
In order to demonstrate the feasibility of our approach, we have implemented part of the
example scenario shown in Figure 1. A parametric CAD system is used to generate a model
of a small building and EnergyPlus and Radiance are used to evaluate building performance.
In this scenario, we have used an existing workflow system called Kepler [7] (McPhillips et
al., 2009) to connect the various tools together. The Kepler workflow is shown in Figure 2.
Figure 2: The Kepler workflow. The contents of the Mapper3 node are shown in Figure 5.
For this demonstration, the model is highly simplified, consisting of only two spaces stacked
on top of each other, each with a window in one wall (Figure 3). In total there are 14
polygons, and each polygon is assigned a set of attributes that are used for generating the
property graphs.
SideFX Houdini [8] is used as the parametric CAD system, but various other (parametric)
CAD software could also be used. The main requirement for this system would be the ability
to create a customised procedure to output the geometric data as a simple JSON file (using the
GraphSON library [9]). Existing parsers can then be used to generate a property graph model
from such a file. Similarly, EnergyPlus is used as the energy analysis simulation program and
Radiance is used as the lighting analysis program, but other simulation tools can also be
considered for this purpose. The Houdini application, the EnergyPlus program, and the two
Radiance programs have been wrapped in Python wrappers, and scripted mapper nodes have
been created to transform the input and output files into GraphSON files.
Figure 3: The CAD model consisting of 14 polygons, each with three attributes (one unique ID called
“uid”, one property called “prop_type”, and one relationship called “rel_is_in_group”).
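As an indication of what the wrapper's output might look like, the fragment below builds a simplified, GraphSON-style JSON structure for one polygon of this model and its zone. The exact field names and identifiers emitted by the actual wrapper and the GraphSON library may differ; they are shown here as assumptions.

import json

# Simplified, GraphSON-style serialisation of one polygon and its zone (ids,
# property keys and the edge label are assumed for illustration).
graphson = {
    "graph": {
        "mode": "NORMAL",
        "vertices": [
            {"_id": "c2", "_type": "vertex", "Entity": "polygon"},
            {"_id": "zone1", "_type": "vertex", "Entity": "group"},
        ],
        "edges": [
            {"_id": "e1", "_type": "edge", "_label": "is_in_group",
             "_outV": "c2", "_inV": "zone1"},
        ],
    }
}
print(json.dumps(graphson, indent=2))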
In order to define the graph mappings, a set of nodes was created in Kepler to provide basic
functions for mapping graph vertices and graph edges. These nodes have
various parameters that allow users to customise the mappings. When these nodes are
executed, Gremlin mapping scripts are automatically generated from the parameter
settings.
Three different types of mapping nodes were created: graph manipulation nodes, vertex
creation nodes, and iterator nodes. Graph manipulation nodes are used for merging, splitting,
and filtering graphs. Vertex creation nodes are used for creating new vertices from scratch.
Iterator nodes are used for iterating over the contents of an input graph and triggering certain
actions. For each type of node, a parameter called “select” allows users to specify a Gremlin
selection filter on the input graph. For each entity (i.e. vertex or edge) in the filtered input
graph, a particular action is triggered, which could be the creation or modification of vertices
or edges.
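The sketch below suggests how such a node might assemble a Gremlin script from its parameter settings. The function name, the parameter structure and the generated Gremlin/Groovy code are assumptions made for illustration and do not reproduce the actual Kepler nodes.

def build_iterator_script(select, vertex_properties):
    # Compose a Gremlin-Groovy script that iterates over the entities matched by
    # the "select" filter in the input graph 'g' and, for each match 'x', creates
    # a vertex with the given properties in an output graph 'out'.
    assignments = '\n  '.join(
        "v.setProperty('{}', {})".format(name, expr)
        for name, expr in vertex_properties.items())
    return ("{select}.each {{ x ->\n"
            "  def v = out.addVertex(null)\n"
            "  {assignments}\n"
            "}}").format(select=select, assignments=assignments)

script = build_iterator_script(
    select="g.V.has('Entity','polygon').has('Type','ceiling')",
    vertex_properties={'Object': "'BuildingSurface:Detailed'", 'Name': 'x.Name'})
print(script)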
The mapping process from Houdini to EnergyPlus will be described in more detail. The first
graph mapper maps the output from Houdini to the input of the PolygonSlicer. The second
mapper maps the output of the PolygonSlicer to the input of EnergyPlus. In both mappers, the
different GraphSON files are automatically parsed into and serialised from property graphs by
existing tools and, therefore, the user only needs to focus on the transformation of the
property graphs. Figure 4 shows the overall structure of the property graphs, and Figure 6
shows the properties associated with three of the vertices in each property graph.
The first step is for the user to define the parametric model of the design together with a set of
parameter values. The Houdini wrapper will trigger Houdini to generate a model instance and
will retrieve the geometric data for that instance. The geometric data will then be
automatically restructured as a property graph and saved as a GraphSON file. Points and
polygons in the model will be mapped to vertices in the property graph. In addition, the user
can influence the restructuring process by defining a set of attributes for the polygons in the
parametric model. An attribute with the name “uid” (i.e. unique ID) is used to define the node
name, attributes with names that start with “prop_” will result in properties being added to
the vertices, and attributes with names that start with “rel_” will result in relationships being
generated between vertices.
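A minimal Python sketch of this naming convention is given below, using plain dictionaries in place of the wrapper's actual restructuring code. The helper name, the output structures and the zone identifiers are assumptions; the attribute names follow Figure 3.

def polygons_to_graph(polygons):
    # "uid" names the vertex, "prop_*" attributes become vertex properties, and
    # "rel_*" attributes become labelled edges to the named target vertex.
    vertices, edges = {}, []
    for attrs in polygons:
        uid = attrs['uid']
        vertices.setdefault(uid, {})
        for name, value in attrs.items():
            if name.startswith('prop_'):
                vertices[uid][name[len('prop_'):]] = value
            elif name.startswith('rel_'):
                vertices.setdefault(value, {})          # target vertex, e.g. a zone
                edges.append((uid, name[len('rel_'):], value))
    return vertices, edges

# Two of the 14 polygons of Figure 3 (zone identifiers are invented).
polygons = [
    {'uid': 'c2', 'prop_type': 'ceiling', 'rel_is_in_group': 'zone1'},
    {'uid': 'f7', 'prop_type': 'floor', 'rel_is_in_group': 'zone2'},
]
print(polygons_to_graph(polygons))
# ({'c2': {'type': 'ceiling'}, 'zone1': {}, 'f7': {'type': 'floor'}, 'zone2': {}},
#  [('c2', 'is_in_group', 'zone1'), ('f7', 'is_in_group', 'zone2')])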
In this scenario, the user knows that in order to map to EnergyPlus, surfaces will need to be
assigned different types and will also need to be grouped into zones. The user has therefore
defined certain attributes in order to simplify this mapping process. Each polygon has been
given a “uid” attribute, a “prop_type” attribute that will define a type for each surface, and a
“rel_is_in_group” attribute that groups surfaces according to zones (see Figure 3).
The user then needs to create the graph mappers using the graph mapping nodes. The
PolygonSlicer and the EnergyPlus simulator both have input graph schemas that specify the
required structure of the graph and the required properties of the vertices. The task for the
user is therefore to create mappings that generate graphs that adhere to these schema
constraints. Figure 4 shows the overall structure of the property graphs at each stage.
Figure 4: Simplified diagrammatic representation of the property graphs at each stage. (Point data is not shown
in order to reduce the complexity of the diagrams. In the actual graphs, there are also 24 points, each
with x, y and z properties.)
In the first mapping, where the output of Houdini is mapped to the input of the PolygonSlicer,
the number of vertices remains constant, but the edges between the vertices need to be
reversed and the edge label changed from “is_in_group” to “has_a”. Iterator nodes are used
to reverse the edges and to add “type” properties to the nodes. The PolygonSlicer then
transforms its input graph by dividing the surfaces for the ceiling of the lower zone (“c2”) and
the floor of the upper zone (“f7”) so as to ensure that each surface has a homogeneous
boundary condition. The PolygonSlicer also detects the relationships between the floors and
ceilings, between the floors and the ground, and between windows and walls.
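A compact sketch of the edge-reversal step in this first mapping is shown below. In the implemented workflow this is expressed as generated Gremlin rather than Python, and the simple edge-list representation and identifiers are assumptions for illustration.

def reverse_grouping_edges(edges):
    # Reverse each "is_in_group" edge and relabel it "has_a", so that zones
    # point to the surfaces they contain; all other edges are left unchanged.
    result = []
    for out_v, label, in_v in edges:
        if label == 'is_in_group':
            result.append((in_v, 'has_a', out_v))   # zone -> polygon
        else:
            result.append((out_v, label, in_v))
    return result

edges = [('c2', 'is_in_group', 'zone1'), ('f7', 'is_in_group', 'zone2')]
print(reverse_grouping_edges(edges))
# [('zone1', 'has_a', 'c2'), ('zone2', 'has_a', 'f7')]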
In the second mapping, where the output of the PolygonSlicer is mapped to the input of the
EnergyPlus simulator, additional properties are added to the existing vertices in the input
graph, and a number of additional vertices are also added to define a set of other objects
required in the EnergyPlus input file. Vertex creation nodes are used to create the additional
vertices, and iterator nodes are used to copy and modify existing vertices (Figure 5). The
groups are mapped to EnergyPlus zones, and the polygons to EnergyPlus surfaces. In the
process of mapping, the iterator node also transforms the edges that existed in the input graph
into properties in the output graph. The output graph becomes a simple list of vertices under
the “idf” root node. For example, in the input graph the window is connected to the wall with
an edge, while in the output graph the window is no longer connected but instead has a
property that specifies the wall name.
Figure 5: The Kepler mapper that maps the output of the PolygonSlicer node to the input of the
EnergyPlus node. See the “Mapper3” node in Figure 2.
For each different surface type, a separate iterator node is created. For example, consider the
“Iter_V_Ceilings” node in Figure 5. This node generates the ceilings of the two zones. Table
1 shows the two main parameters for the node. The “select” parameter filters the input graph
so that the remaining vertices all have an “Entity” property with a value of “polygon” and a
“Type” property with a value of “ceiling”, and in addition have an outgoing “boundary_is”
edge that points to another polygon (i.e., the floor above). The node then has a set of
parameters that define name-value property pairs. For each polygon in the filtered input
graph, the iterator node will generate a vertex in the output graph with the specified
properties.
Table 1: The parameter names and values for the Iter_V_Ceilings node. Gremlin code is shown in
italics, and makes use of two predefined local variables: ‘g’ refers to the input graph, and ‘x’ refers to
the entity being iterated over (which in this case is a vertex).
Parameter: Select (filter graph)
Parameter value:
g.V.has('Entity', 'polygon').has('Type', 'ceiling').as('result')
  .out('boundary_is').has('Entity', 'polygon').back('result')

Parameter: Vertex properties (generate vertices)
Parameter value:
Object : 'BuildingSurface:Detailed'
Name : x.Name
Surface_Type : 'CEILING'
Construction_Name : 'light ceiling'
Zone : x.in('group_contains').Name
Outside_Boundary_Cond : 'SURFACE'
Outside_Boundary_Cond_Object : x.out('boundary_is').Name
Sun_Exposure : 'NOSUN'
Wind_Exposure : 'NOWIND'
Points : x.Points
Note that when specifying the property values, the user can insert Gremlin commands
that extract these values from the input graph, thereby ensuring that the values can be
dynamically generated. Figure 6 shows the changes for a number of vertices in the property
graph as the data is mapped and transformed. When the “Iter_V_Ceilings” node iterates over
the “c2.2” polygon in the input graph, it will generate the “c2.2” EnergyPlus surface.
Figure 6: An example of the property data for a few of the vertices in the property graphs. Typically,
the property graphs will undergo a process of information expansion, where data is gradually added to
the model as needed.
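To make the behaviour of the iterator node concrete, the sketch below evaluates a property template of the kind shown in Table 1 for one matched ceiling polygon. The small helper class stands in for the embedded Gremlin expressions (x.Name, x.in('group_contains').Name, x.out('boundary_is').Name), and all identifiers and data values are invented for illustration.

class VertexContext:
    # Wraps one matched vertex 'x' together with its neighbours, so that the
    # template below can mimic Gremlin's x.Name, x.in(label).Name and
    # x.out(label).Name expressions.
    def __init__(self, props, in_neighbours, out_neighbours):
        self.props = props
        self.in_neighbours = in_neighbours     # edge label -> neighbour properties
        self.out_neighbours = out_neighbours

    def prop(self, name):
        return self.props[name]

    def in_(self, label, name):
        return self.in_neighbours[label][name]

    def out(self, label, name):
        return self.out_neighbours[label][name]

def ceiling_surface(x):
    # The property template of the Iter_V_Ceilings node (Table 1), with the
    # embedded Gremlin replaced by calls on the context object.
    return {
        'Object': 'BuildingSurface:Detailed',
        'Name': x.prop('Name'),
        'Surface_Type': 'CEILING',
        'Construction_Name': 'light ceiling',
        'Zone': x.in_('group_contains', 'Name'),
        'Outside_Boundary_Cond': 'SURFACE',
        'Outside_Boundary_Cond_Object': x.out('boundary_is', 'Name'),
        'Sun_Exposure': 'NOSUN',
        'Wind_Exposure': 'NOWIND',
        'Points': x.prop('Points'),
    }

x = VertexContext(
    props={'Name': 'c2.2', 'Points': [(0, 0, 3), (5, 0, 3), (5, 5, 3), (0, 5, 3)]},
    in_neighbours={'group_contains': {'Name': 'zone1'}},
    out_neighbours={'boundary_is': {'Name': 'f7.2'}})
print(ceiling_surface(x))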
4. Future work
In order to support a bottom-up, user-controlled and process-oriented approach to linking
design and analysis tools, a data mapping approach is required that allows designers to create
and share custom mappings. To achieve this goal, the data mapping approach should be both
flexible in that it can be applied to a wide variety of tools, and user-friendly in that it supports
non-programmers in the process of easily creating and debugging mappers. The use of a
general-purpose data model ensures that the mapping approach is highly flexible, and in
addition also allows for improved sharability. The example scenario demonstrated how
designers with minimal scripting skills would be able to set up complex digital workflows that
enable the fluid and interactive exploration of design possibilities in response to custom
performance metrics.
The next stage of this research will explore the scalability of the user-defined graph mapping
approach when working with larger data sets and more complex data schemas (such as the
IFC schema). In the current demonstration, the data sets and data schemas are small, and as a
result the graph mappers are relatively simple. However, if data sets increase in size and the
number of entity and relationship types is very large, then the graph mappers could potentially
become more difficult to construct. In order to deal with this increased complexity, we foresee
that the user will require additional data management and schema management tools. The data
management tools could enable users to visualise, interrogate and debug property graph data
during the mapping process (Stouffs, 2001). An example is Gephi [10], a rich interactive tool
for analysing and visualising complex graph data sets (Bastian et al., 2009). Schema
management tools could enable node developers to define formal graph schemas for the input
and output data for their nodes. This could in turn allow end-users to identify and isolate
subsets of large schemas relevant to their particular design scenario.
Acknowledgements
The authors would like to acknowledge and thank the participants of the “Open Systems and
Methods for Collaborative BEM (Building Environment Modelling)” workshop held at the
CAAD Futures 2011 Conference in Liège, Belgium, 4 July 2011, and of the LinkedIn Group
sharing the same name, for their contributions to the discussions leading to the ideas
presented and described in this paper. We invite interested parties to contribute to the
development of these ideas and to join the discussions in the LinkedIn Group.
References
Altıntaş, İ.: 2011, Collaborative Provenance for Workflow-driven Science and Engineering, PhD Thesis,
University of Amsterdam, Amsterdam.
Altova Inc.: 2005, Data Integration: Opportunities, challenges, and Altova MapForce™ 2005, Whitepaper
available at: http://www.altova.com/whitepapers/mapforce.pdf
Bastian, M., Heymann, S. and Jacomy, M.: 2009, Gephi: an open source software for exploring and manipulating
networks, International AAAI Conference on Weblogs and Social Media.
Bowker, G.C. and Star, S.L.: 1999, Sorting Things Out: Classification and Its Consequences. MIT Press,
Cambridge, MA.
Coons, S.A.: 1963, An outline of the requirements for a computer-aided design system, Proceedings AFIPS,
ACM, 299-304.
Curcin, V. and Ghanem, M.: 2008, Scientific workflow systems - can one size fit all?, CIBEC 2008, Cairo, 1-9.
Deelman, E., Gannon, D., Shields, M. and Taylor, I.: 2009, Workflows and e-science: An overview of workflow
system features and capabilities, Future Generation Computer Systems, 25(5), 528-540.
Kilian, A.: 2006, Design innovation through constraint modeling, International Journal of Architectural
Computing, 4(1), 87–105.
Kiviniemi, A.: 2006, Ten years of IFC development - why we are not there yet. Proceedings CIB-W78,
Montreal.
McPhillips, T., Bowers, S., Zinn, D. and Ludaescher, B.: 2009, Scientific workflow design for mere mortals, Future
Generation Computer Systems, 25(5), 541-551.
Pazlar, T. and Turk, Z.: 2008, Interoperability in practice: geometric data exchange using the IFC standard,
ITcon, 13, 362-380.
SafeSoftware: 2008, FME Desktop Under the Hood, Whitepaper available at:
http://cdn.safe.com/resources/whitepapers/FME-Desktop-Under-the-Hood.pdf
Stouffs, R.: 2001, Visualizing information structures and its impact on project teams: an information architecture
for the virtual AEC company, Building Research & Information, 29(3), 218–232.
Toth, B., Boeykens, S., Chaszar, A., Janssen, P. and Stouffs, R.: 2012, Custom digital workflows: a new
framework for design analysis integration, Beyond Codes & Pixels: Proceedings of the 17th CAADRIA Conference,
Chennai, India, 163-172.
[1] http://apps1.eere.energy.gov/buildings/energyplus/
[2] http://radsite.lbl.gov/radiance/
[3] http://www.altova.com/mapforce/xml-mapping.html
[4] http://www.tableausoftware.com/products/desktop
[5] http://www.safe.com/fme/fme-technology/fme-desktop
[6] http://gremlin.tinkerpop.com
[7] https://kepler-project.org/
[8] http://sidefx.com
[9] https://github.com/tinkerpop/blueprints/wiki/GraphSON-Reader-and-Writer-Library
[10] https://gephi.org/