Data-Centric Engineering (2022), 1–37
doi:10.1017/dce.2020.7
RESEARCH ARTICLE
An Approach for System Analysis with MBSE and Graph
Data Engineering
Florian Schummer1* and Maximillian Hyba1
1Chair of Astronautics, Technical University of Munich, Boltzmannstr 15, Bavaria, Garching, 85748, Germany
*Corresponding author. E-mail: f.schummer@tum.de
Received: TBD; Revised: ;Accepted:
Keywords: MBSE; SysML; Graph Databases; SysML Graph Schema; Anomaly Resolution
Abstract
Model-Based Systems Engineering aims at creating a model of a system under development that covers the complete
system at a level of detail sufficient to define and understand its behavior and to define any interface
and work package based on the model. Once such a model is established, further benefits can be reaped, such as
the analysis of complex technical correlations within the system. Various insights can be gained by representing the
model as a formal graph and querying it. To enable such queries, a graph schema needs to be designed that
allows the model to be transferred into a graph database. In this paper, we discuss the design of a graph
schema and MBSE modelling approach that enable in-depth system analysis and anomaly resolution in complex
embedded systems. The schema and modelling approach are designed to answer questions such as: What happens
if there is an electrical short in a component? Which other components are now offline and which data can no longer be
gathered? If a condition cannot be met, which alternative routes can be established to reach a certain
state of the system? We build on the use case of qualification and operations of a small spacecraft. Structural and
behavioral elements of the MBSE model are transferred to a graph database, where analyses are conducted on the
system. The schema is implemented by an adapter from MagicDraw to Neo4j. A selection of complex analyses is
shown using the example of the MOVE-II space mission.
Impact Statement
The proposed schema to transfer SysML Models to Labelled Property Graphs and the related modelling strat-
egy open a wide range of system analyses for designers and operators of complex embedded systems, such
as tracking data paths, finding probable causes of anomalies or retrieving every possibility to reach a certain
state within the system from the current one. It can also be used to analyze which input losses components
should be resilient against to achieve a maximum level of robustness and may help modelers in choosing what
information to include in their model. By parametrizing graph-queries, they can be reused for any system and
thereby increase the efficiency of model analysis. The requirements imposed on the analyzed SysML model
are few and do not require a holistically complete model. Thereby the approach can also be applied to system
models still in the making or where parts of the system are black-boxed.
© The Author(s), 2020. This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium,
provided the original work is properly cited.
arXiv:2201.06363v1 [cs.SE] 17 Jan 2022
1. Introduction
1.1. Obstacles in the Introduction of Model Based Systems Engineering
In 2007, the International Council on Systems Engineering defined Model Based Systems Engineering
as "the formalized application of modeling to support system requirements, design, analysis, verifi-
cation and validation activities beginning in the conceptual design phase and continuing throughout
development and later life cycle phases." (International Council on Systems Engineering,2007, p. 15)
In 2019, the National Aeronautics and Space Administration published a survey with 50 participants
from industry, academia, US government agencies and vendors of systems engineering tools
on the current state of systems engineering at their own institution or as perceived at partnering
entities (cf. Gerald Pawlikowski (2019)). The study identified an increased application of Model
Based Systems Engineering (MBSE) as the third most important factor for improving overall performance, after
improving the training of systems engineers and improving their domain-specific knowledge. Asked about
the expected benefits, 63% of the participants expect a reduction of 30% to 50% in development cycle time.
While encouraging the use of MBSE, the study also found that the majority of participants report less
than 25% adoption of MBSE in projects at their workplace. 33% of the participants across all disciplines
see insufficiency of MBSE tools as a key weakness in their application of systems engineering.
The only factor named more often is cultural issues with 46%, which the authors describe as people
being reluctant to change from document-based processes (Gerald Pawlikowski (2019)).
We chose this study for the introduction to this paper for the following reasons: According to the study,
systems engineers across industry, public agencies and academia have high expectations of
MBSE. At the same time, they report that the tools are inadequate. People do not accept tools that they
view as inferior to their current working environment, which leads to a lower rate of MBSE adoption
than the expected improvement through its application would suggest.
The successful introduction of new products and processes always requires understanding the users’
needs. Roughly around the same time as NASA’s study was conducted, the European Space Agency
started its "Model Based for Systems Engineering" initiative, which aims at increasing the usability
and interoperability of MBSE and MBSE tools within the European space sector. A report on perceived
user needs published by the initiative in 2020 sums up the current state of interoperability between tools
as low, resulting in unnecessary duplicate work. The report identifies 15 key user needs that MBSE
should deliver. One of the key user needs is to "ensure the consistency, completeness and feasibility
of requirements and design", noting that "especially the functional complexity must be kept in [sic]
control" (European Space Agency, 2020b, p. 11). Another key user need is to "structure the knowledge
about the system in such a way that the relations between the knowledge elements are established and
traceable" (European Space Agency, 2020b, p. 11).
Yet another survey on the adoption of MBSE, the report Benchmarking the Benefits and
Current Maturity of Model-Based Systems Engineering across the Enterprise by the Systems Engineering
Research Center, asked 240 individuals across academia and industry what they see as the largest
obstacles to MBSE adoption. "MBSE methods and processes" is cited as the number one obstacle by the
report (Thomas McDermott, Nicole Hutchison et al., 2020, p. 20). M. Chami, J. Bruel (2018) support
these findings; 88% of the participants in their survey identified "Purpose and Scope Definition" as a
major challenge in the adoption of MBSE.
Combining the results of the three reports gives insight into some obstacles to rolling out MBSE on
a wider scale: MBSE tools shall enable users to ensure the completeness, consistency and feasibility of
their development, yet a third of the participants in NASA’s study identify insufficiency of MBSE tools
as a key weakness in their application of MBSE. It stands to reason that the tools are not yet fully
up to the task. Combined with the findings of Thomas McDermott, Nicole Hutchison et al. (2020), the
reports show tools and modelling strategies to be major factors for a broader acceptance of MBSE. In our
experience, part of the tools’ inadequacy is their inability to process complex queries on models. Starting
at any given element of a model, one can usually only follow it for one hop, i.e. to the end of any of
its direct relations, or along a single relation type. Although any consistent model contains complex
modelling patterns, these usually cannot be followed within standard commercial tools.
1.2. Goals and Outline
This paper shows how graph databases can be used for specific MBSE analyses that are cumbersome
and inefficient in common modelling tools such as MagicDraw. First, typical analysis tasks during
spacecraft development are outlined and formalized as questions. We then explore how graph technology
enables such analyses. A graph schema is a prerequisite for any graph-based analysis. Hence, a graph
schema transforming SysML into a labelled property graph is developed, together with a software
implementation that automatically transfers standard SysML models into the graph database. A further
goal of this paper is to enable others to conduct such analyses as well. Therefore, a complementary set
of modelling guidelines is presented, which is kept to a bare minimum to achieve high compatibility
with any prevailing modelling strategy. Customizations of the modelling guidelines are discussed,
depending on which analyses are to be carried out. Finally, a selection of the previously defined
analysis tasks is translated into actual queries for the graph database and performed on the example of
an actual spacecraft. Before getting into the details of analysis tasks, graph schemata and modelling
rules, some background is provided.
Therefore, Section 2 gives an overview of various graph databases after summarizing the current
state of MBSE implementation as gathered in recent surveys. It concludes with a short introduction
to the use case of the MOVE-II spacecraft. Section 3 outlines typical tasks in system analysis, with a
special focus on later project phases such as Assembly Integration and Testing and Operations of a
system. Building on the outlined tasks, possible solutions for a graph schema are discussed and a solution
suitable to the analysis tasks is selected. In Section 4, modelling guidelines for the MBSE model are
presented and their effect on which analyses can be conducted is discussed. Section 5 provides a
short overview of the actual code implementation, which translates MagicDraw SysML models into
a labelled property graph. In Section 6, the proposed graph schema and modelling guidelines are verified
against an operational space mission: the MOVE-II space mission designed and operated by the
Technical University of Munich’s Chair of Astronautics. The model used for the analysis covers the
complete structure of the spacecraft, including hardware and software, as well as any systems and
components of the ground station and operations equipment. Section 7 compares the results of the paper to
the state of the art and Section 8 presents an outlook on future topics of research in the area.
2. Background
This section is divided into three parts. The first part covers the state of the art on the implementation of
MBSE and the challenges thereof as perceived in the literature. No previous work that compares different
studies on the implementation of MBSE could be found. Also, no previous work was found on
the application of MBSE in later phases of a spacecraft project, such as Assembly Integration and Testing
or Operations of the spacecraft. The comparison is conducted here to provide a better impression
of the challenges people are dealing with when implementing MBSE and to gain insight into why
MBSE seems not to be applied to later life cycle phases. The second part introduces graph databases
and the underlying concept of graph schemata. The third part gives a short introduction to the MOVE-II
project used as the implementation showcase.
2.1. Recent Surveys on Model-Based Systems Engineering
The report Benchmarking the Benefits and Current Maturity of Model-Based Systems Engineering
across the Enterprise by Thomas McDermott, Nicole Hutchison et al. (2020) was published in March
2020. Studies lying too far in the past are seen as less relevant for this work, as the presented studies show
that the application of MBSE has increased drastically in recent times, with new challenges and
problems arising while others become less relevant. Supported by the United States’ Department
of Defense, the report sets out to explore the current state of adoption of MBSE, identifying challenges
and enablers in the adoption as well as the skill sets necessary for a successful MBSE adoption. The
survey is based on data gathered from November 2019 to January 2020 with a total of 240 participants
from industry, academia and governmental institutions, and is thereby not only the most recent but
also the largest survey. While Thomas McDermott, Nicole Hutchison et al. (2020)
and Gerald Pawlikowski (2019) were the most recent surveys we could find, other publications address
the same topic and should not be omitted.
M. Chami, J. Bruel (2018) report on a survey conducted on MBSE adoption in industry with 42
participants. The survey focuses on MBSE adoption challenges. In contrast to Gerald Pawlikowski
(2019), all participants had an industrial background. The participants came from a broader field, with
roughly a third working in consultancy and training, and the rest split across aerospace, medical,
railway, defense, computing and IT engineering, and automotive (ordered by decreasing participation).
The main findings are consistent with Gerald Pawlikowski (2019): "Awareness and change resistance"
is the most frequently named challenge, with 88% of participants either agreeing or strongly agreeing
on the topic. Interestingly, "purpose and scope definition" has the same level of total agreement
(88%) but a lower percentage of strongly agreeing participants. The third and fourth most mentioned
problems are "method definition and extension" (84%) and "tool dependency and integration" (83%).
The higher percentages in agreement can be traced to the surveying method. While Gerald Paw-
likowski (2019) built upon telephone interviews with rather free questions and consolidated the
responses afterwards into groups, M. Chami, J. Bruel (2018) used a predefined set of perceived chal-
lenges in MBSE adoption and asked the survey participants about their agreement with each challenge.
However, it should be mentioned that "method definition and extension", as well as "purpose and scope
definition" came up as additional challenges that were not in the focus of Gerald Pawlikowski (2019).
This can either be attributed to the broader field of participants or the different surveying method, which
might bring up focus on topics that do not immediately spring to mind otherwise.
B. Morris et al. (2016) focused on the earlier phases of a system’s lifecycle by limiting their two
surveys from 2014 and 2015 to conceptual design work. What makes the surveys especially worth
reading is the open-minded approach to the question of whether the employment of MBSE brought benefits
or exacerbated existing problems. Interestingly, the study found "solutioneering" (described as
stakeholders pressing for a specific solution without understanding the problem first) and
"lack of Stakeholder engagement" as main topics. Topics mentioned in Gerald Pawlikowski (2019) and
M. Chami, J. Bruel (2018) such as "purpose and scope definition", "method definition and extension"
or "tool dependency and integration" are not present in the report of Morris et al. Likely reasons for
the difference between the former two studies and B. Morris et al. (2016) are the limited number
of participants in any of the surveys, the four to five years that lie between them, or the different
methodologies and specific questions employed by the survey conductors. Table 1 gives a short
overview of key numbers of all four publications.
Moving on from the topic of MBSE challenges to specific gaps in the functionality of MBSE tools,
A. Hazle, J. Towers (2020) summarize literature on the verification and validation of SysML models.
They specifically point out reviews, analysis and simulation as the three main techniques of verification
and validation for SysML models and describe formal model checking of SysML models as rarely used
and often reserved for high risk aspects due to the significant effort attributed to it. They describe the
necessity to transform the model into another formal language prior to formal analyses, such as Petri
Nets or directed graphs. They further state the formal verification of models with Petri Nets or directed
graphs is limited to the behavior aspects of a SysML model, omitting the structural aspects. As for the
analysis of requirements and structural aspects of the model, a wide variety of work has been published
on the topic of automatic analysis of requirements in SysML (see J. Bankauskaite, A. Morkevicius
(2018); Petnga (2019); Morkevicius and Jankevicius (2015)).

Table 1. Comparison of surveys on the state of Model Based Systems Engineering
| Year | Title | Authors | No. Part. | Method |
|---|---|---|---|---|
| 2014/2015 | Issues in conceptual design and MBSE successes: Insights from the model-based conceptual design surveys | B. Morris et al. | 39 / 40 | Open questions |
| 2018 | A Survey on MBSE Adoption Challenges | M. Chami, J. Bruel | 42 | Level of agreement to preselected statements |
| 2019 | Independent Assessment of Perception From External/non-NASA Systems Engineering (SE) Sources | G. Pawlikowski et al. | 50 | Open questions |
| 2020 | Benchmarking the Benefits and Current Maturity of Model-Based Systems Engineering across the Enterprise | T. McDermott et al. | 240 | Combined |

What all presented publications have in common is their focus on the early project phases and formal
reviews. Going back to the original definition of MBSE, later life cycle phases such as the qualification
and operation of a system should not be omitted but are rarely studied. We found only one publication
that can be linked to MBSE employment in qualification and operation of a spacecraft. In 2020, the
European Space Agency issued an invitation to tender on reverse-engineering of the satellite OPS-SAT.
The satellite is built as a generic flying laboratory that allows new experiments to be uploaded
and conducted while in orbit. The invitation to tender states the following intent: "Users could greatly
benefit from understanding of the system via a formal system model also for experiment integration
and feasibility analysis" (European Space Agency, 2020a, p. 6). While the aim of this activity is the
employment of MBSE to assist in the operational phase, the activity is ongoing and no publications on
its progress have been issued yet. The lack of publications on MBSE in the operational phase and during
assembly integration and testing leads to the conclusion that activities in this field of application are rare.
Given the current rate of MBSE employment, this is no surprise. Gerald Pawlikowski (2019)
found a majority of interview partners reporting less than 25% of MBSE on current projects, with a peak
reporting between 5% and 10%. MBSE is easier to employ in earlier project phases, where the system’s
design still has a lower level of detail and hence requires a lower degree of complexity from the model.
It stands to reason that, due to the overall low level of adoption, the focus lies on the earlier project
phases. Additionally, reaping any benefits of MBSE for Assembly Integration and Testing (AIT) activities
requires a detailed model of the spacecraft. Such a detailed model implies that the project
successfully implemented MBSE throughout the project’s timeline, including any subcontractors. As
protection of intellectual property is a concern in spacecraft developments, subcontractors are naturally
reluctant to provide a detailed model with their subsystems. For the application of MBSE in Assembly
Integration and Testing as well as Operations this leads to three conclusions:
1. Any modelling guidelines built here to enable analyses via graph databases should be compliant
with pre-existing modelling strategies, as ideally the model has been maintained since early project
phases and has a pre-existing modelling strategy.
2. In case no previous model exists and MBSE and graph analyses shall be employed in a later
project phase, building the model should be as time efficient as possible.
3. Any analysis should be able to cope with black-boxed parts of the system.
2.2. Graph Databases
2.2.1. Comparison and Selection of a Graph Database
I. Robinson et al. (2015, p. 5) provide the following definition of graph databases: "A graph
database management system (henceforth, a graph database) is an online database management system
with Create, Read, Update, and Delete (CRUD) methods that expose a graph data model". They further
specify graph data models as the underlying concept of how the relations between entities in the graph
database are stored. According to I. Robinson et al. (2015), the property graph, the hypergraph and
Resource Description Framework (RDF) triples are the dominant graph data models.
Fernandes and Bernardino (2018) compared five graph database implementations that together
cover the various graph data models. Table 2 summarizes them.
Table 2. Summary of the database comparison conducted by Fernandes and Bernardino (2018) and
supplemented by information from Franz Inc. (2021); ?); Objectivity Inc. (2021).
| Database | Graph Data Model | Query Language | Open Source |
|---|---|---|---|
| AllegroGraph | RDF | SPARQL, Prolog | no |
| ArangoDB | multi-model | ArangoDB Query Language | community edition |
| InfiniteGraph | Property Graph Model | "DO" | no |
| Neo4j | Property Graph Model | Cypher | community edition |
| OrientDB | multi-model | Gremlin, SQL | yes |

As Table 2 shows, graph databases are not yet in a state of development where one query language
has emerged as prevalent over the others.
The databases in closer consideration for this work were OrientDB, ArangoDB and Neo4j, as they are
the only databases available as open source. Additional requirements for the choice were
• available support and examples,
• available documentation and guidebooks,
• available drivers for Python,
• ease of use and ease of installation on Windows, MacOS and Linux,
• capable graphical viewing tools,
• ability to handle up to a few million elements efficiently.
Regarding efficiency, all of the above databases are up to the task. Examples and documentation
are also sufficiently available for all of them. As for guidebooks, Neo4j stands out with
a variety of books made available for free on its website that help beginners get started with the
database (see J. Webber, R.v.Bruggen (2020); M. Needham, A. Hodler (2018); Hodler and Needham
(2019); Rik Van Bruggen (2014); Neo4j Inc. (2019)). This, as well as the ease of use, ease of installation
and large number of publicly available examples built with the database, was seen as especially important,
since a low effort of getting started with the analysis may open the idea to a broader audience.
Since there is also a variety of tools available to query and view Neo4j graphs, we chose Neo4j,
although we do not see any obstacles to trying out other graph databases for the same kind of analyses.
2.2.2. The Labelled Property Graph Model
How the database stores information and makes it available to users influences the specification
of the graph schema. Therefore, informational constructs may require slight adaptations of the graph
schema from one graph database to another. Neo4j employs the so-called Labelled Property Graph
Model, which is briefly described in the following (Hodler and Needham (2019)).
Like any other graph data model, the main components of the Labelled Property Graph Model
are nodes and edges. In contrast to other graph data models, the Labelled Property Graph Model allows
key-value pairs to be stored as properties directly on nodes and edges, whereas other graph data models such
as RDF require a separate node for every property that shall be stored and do not allow for properties of
edges (T. Neumann, G. Weikum (2011); Hodler and Needham (2019)). Edges in the Labelled Property
Graph Model carry a single type, describing the relation formed between the nodes. If multiple types
shall be assigned to a connection between two nodes, the nodes are connected with multiple edges.
The edges are directed, i.e. every edge has a dedicated source node and a dedicated target node. An
edge can also only ever connect exactly two nodes. If more than two nodes shall be connected, a so-called
hypernode can be employed, to which all nodes that share the relation are connected (Angles
and Gutierrez (2018)). Apart from key-value based properties, nodes can carry multiple labels, which
define the type of a node. The application we have in mind is to investigate SysML models with the
help of graph queries. The necessary semantics can be taken from the Systems Modeling Language and
the specific model under investigation. While the semantics could be styled more elaborately in other
graph models such as RDF, the gentle learning curve established by a simpler graph schema allows for
easier understanding and thereby facilitates the application of the schema.
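To make the description above more tangible, the following minimal Python sketch models the building blocks of a labelled property graph: nodes with several labels and key-value properties, and directed edges with exactly one type and their own properties. It is an illustration only and does not reflect how Neo4j stores data internally. The class names, labels (`Block`, `Component`), relation types (`SUPPLIES_POWER`, `SENDS_TELEMETRY`) and property values are hypothetical and not taken from the schema developed in this paper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    """A node carries any number of labels plus key-value properties."""
    node_id: str
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed edge connects exactly two nodes and has exactly one type."""
    source: str  # id of the source node
    target: str  # id of the target node
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical example elements (illustrative values only).
eps = Node("eps", {"Block", "Component"}, {"name": "Electrical Power System"})
cdh = Node("cdh", {"Block", "Component"}, {"name": "Command and Data Handling"})

# Two relation types between the same pair of nodes require two edges.
edges: List[Edge] = [
    Edge("eps", "cdh", "SUPPLIES_POWER", {"voltage_V": 3.3}),
    Edge("eps", "cdh", "SENDS_TELEMETRY"),
]


def relation_types(all_edges: List[Edge], source: str, target: str) -> List[str]:
    """Return the sorted types of all edges pointing from source to target."""
    return sorted(e.rel_type for e in all_edges
                  if e.source == source and e.target == target)
```

Calling `relation_types(edges, "eps", "cdh")` lists both relation types, whereas the reverse direction yields an empty list because edges are directed.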
2.3. Introduction to the Show Case Application
We use the Munich Orbital Verification Experiment II (MOVE-II) as showcase application. The MOVE-II
spacecraft is a 10 x 10 x 10 cm satellite following the CubeSat form factor (see Puig-Suari (2014)).
The mission was founded in 2015 with the goal of providing hands-on practice for students looking for a
career in the aerospace sector (Langer et al. (2015)). Its payload consists of novel four-junction solar cell
prototypes, whose degradation due to space radiation shall be measured over time (Rutzinger et al.
(2016)). Figure 1 shows the spacecraft in its deployed configuration, including the solar cell payload in
the middle of the spacecraft and the four solar generators surrounding it.
Figure 1. Photo of the MOVE-II Spacecraft showing the solar array with the payload solar cells in the
middle.
After a successful launch in December 2018, the spacecraft has now been in orbit for over 2.5 years
and is performing well (Rückerl, S. et al (2019); Roberts and Hadaller (2019)). The mission is a typical
case of small spacecraft development: parts of the system such as the Command and Data Handling
software, the payload, the structure and mechanisms or the Attitude Determination and Control System
were designed and built in house by students, while other subsystems, such as the hardware for the
Command and Data Handling system or the Electrical Power System, were acquired from external vendors
(Langer et al. (2017)). The SysML model employed for the analyses in this paper covers the structure
of the whole spacecraft, including any data flows and electrical flows, any sensor measurements and the
satellite’s software. It furthermore covers the structure of the mission’s ground segment, i.e. everything
necessary to operate the satellite, such as the ground station with radio equipment and antennas or the
servers and software of the operations interface and telemetry database.
3. Development of the Graph Schema
This section provides an overview of the development of the graph schema and modelling guidelines. In
order to analyze a system by combining SysML and graph analysis, a graph schema has to be defined,
with the help of which the SysML Model can be transferred from its editing tool into a labelled property
graph. A graph schema explains how nodes in a graph can be labelled and which relation types can exist
between specific nodes.
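One way to picture such a schema is as a set of allowed node labels plus a set of permitted (source label, relation type, target label) triples against which a graph can be checked. The Python sketch below illustrates this idea under simplifying assumptions: every node has exactly one label, and all labels and relation types are hypothetical placeholders rather than the schema proposed in this paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical schema: allowed labels and allowed relations between labels.
ALLOWED_LABELS: Set[str] = {"Block", "Port", "Signal"}
ALLOWED_RELATIONS: Set[Tuple[str, str, str]] = {
    ("Block", "HAS_PART", "Block"),
    ("Block", "HAS_PORT", "Port"),
    ("Port", "CONVEYS", "Signal"),
}

Edge = Tuple[str, str, str]  # (source node, relation type, target node)


def schema_violations(node_labels: Dict[str, str],
                      edges: Iterable[Edge]) -> List[str]:
    """Return a description of every node or edge that breaks the schema."""
    problems: List[str] = []
    for node, label in node_labels.items():
        if label not in ALLOWED_LABELS:
            problems.append(f"node {node!r} has unknown label {label!r}")
    for source, rel_type, target in edges:
        triple = (node_labels.get(source), rel_type, node_labels.get(target))
        if triple not in ALLOWED_RELATIONS:
            problems.append(
                f"edge {source!r}-[{rel_type}]->{target!r} is not permitted")
    return problems


# Hypothetical example graph; the last edge points from a Port to a Block
# with HAS_PART, which the schema above does not allow.
nodes = {"satellite": "Block", "radio": "Block", "rf_out": "Port"}
edges = [
    ("satellite", "HAS_PART", "radio"),
    ("radio", "HAS_PORT", "rf_out"),
    ("rf_out", "HAS_PART", "radio"),
]
violations = schema_violations(nodes, edges)
```

A check of this kind is useful when transferring a model automatically, because queries written against the schema silently return nothing for elements that were stored with unexpected labels or relation types.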
3.1. Existing Graph Schemata for SysML
Two schemata are mentioned in the literature, which shall be discussed briefly before progressing to
the questions posed to SysML models for graph analysis.
Petnga (2019) proposes a schema focusing on requirements analysis. The goal of the schema is to assess
completeness, consistency and correctness of requirements in a SysML model built in MagicDraw. The
author provides a detailed schema for the requirements-related elements: blocks, test cases and requirements.
The schema does not consider any other element of the Systems Modeling Language. In addition,
the questions on which the schema builds are kept quite simple and do not exploit the strengths of a
graph database. The analyses are centered on:
•What percentage of requirements are not completely defined?
•What percentage of requirements are not satisfied or not verified?
•How many elements with duplicate names exist?
None of these questions requires a graph traversal of more than one relation, i.e. they could be answered
just as well by reading the elements into a table-based database or by employing table methods provided
by MagicDraw. Petnga (2019) further brings up the idea of applying graph algorithms to the model,
which would allow finding critical elements in the system, or elements that have the largest influence on
others. Consequently, he applies the betweenness centrality algorithm to the requirements of the system
to find out which requirements have the largest influence on others. While this for the first time yields
information that could not be obtained as easily by other means, he does not define which routes
and elements should be selected for the algorithm, which makes a major difference in the results.
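To illustrate why the three questions listed above do not need a graph database, the sketch below answers the second and third of them from a flat list of rows in plain Python, without any traversal. The requirement identifiers, names and the `satisfied_by` and `verified_by` columns are invented for illustration and do not come from Petnga (2019).

```python
from collections import Counter
from typing import List, Optional, Tuple

# One row per requirement: (id, name, satisfied_by, verified_by).
# All entries are hypothetical example data.
Row = Tuple[str, str, Optional[str], Optional[str]]

requirements: List[Row] = [
    ("R1", "Downlink rate", "Transceiver", "TC-01"),
    ("R2", "Battery capacity", "Battery", None),
    ("R3", "Pointing accuracy", None, None),
    ("R4", "Downlink rate", "Antenna", "TC-02"),
]


def percent_unsatisfied_or_unverified(rows: List[Row]) -> float:
    """Share of requirements lacking a satisfying or verifying element."""
    open_rows = [r for r in rows if r[2] is None or r[3] is None]
    return 100.0 * len(open_rows) / len(rows)


def duplicate_names(rows: List[Row]) -> List[str]:
    """Names that are used by more than one requirement."""
    counts = Counter(name for _, name, _, _ in rows)
    return sorted(name for name, n in counts.items() if n > 1)


open_share = percent_unsatisfied_or_unverified(requirements)
duplicates = duplicate_names(requirements)
```

Each answer only inspects the direct relations of a single element, which is exactly the one-hop case that table-based tools already handle.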
The second graph schema found in the literature is maintained by the company Intercax as part of
their Syndeia software suite (see Intercax LLC (2021)). First proposed in Bajaj et al. (2011), the idea
behind Syndeia is to generate a "Total System Model", which they specify as a model that allows
information to be linked across various information sources in a graph database. The MBSE model is one of the
information sources integrated into the total system model, alongside product lifecycle management,
Computer Aided Design systems, databases such as MySQL or Neo4j, simulation tools or application
lifecycle management systems such as Jira or git (Intercax LLC (2021)). The software is commercially
distributed and was for example applied on the Large Synoptic Survey Telescope for Verification and
Validation purposes (Selvy et al. (2018)). Multiple publications were made on the software, its applications
and involved challenges (see Bajaj et al. (2016,0); Fisher et al. (2014)). However, no detailed
specification of the graph model itself or data of the use case systems is provided in any of the publications,
making it difficult to reproduce any of the results described in them. In Bajaj et al.
(2017) they show detailed graph query results, but do not provide the schema that allows the system
to be queried. While the lack of a detailed specification may be attributed to the commercial distribution of
the software, a second gap in the publications on Syndeia is a SysML modelling strategy to go with the
graph schema. As with Petnga (2019), the queries presented by Bajaj et al. (2017) on the SysML
graph are of a simple nature and do not exploit the potential of the graph database. The queries range
from which elements are connected to a specific element, to which behaviors are attributed to a specific
model element, and which paths exist between a specific pair of elements. The last query is the only one
in the paper that exploits the potential of a graph database compared to relational databases. Combining
a graph schema with an appropriate modelling strategy allows for far deeper analyses and a higher
quality of the information drawn from the model, as is shown in the following.
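As a rough illustration of what the question "which paths exist between a specific pair of elements" involves, the following Python sketch enumerates all simple paths in a small directed graph by depth-first search. The element names and connections are hypothetical. In Neo4j, such a question would typically be expressed as a variable-length path pattern in Cypher rather than as hand-written traversal code.

```python
from typing import Dict, Iterator, List, Optional

# Hypothetical data flow between elements: element -> directly reachable elements.
graph: Dict[str, List[str]] = {
    "sensor": ["adc"],
    "adc": ["obc"],
    "obc": ["radio", "storage"],
    "storage": ["radio"],
    "radio": [],
}


def simple_paths(graph: Dict[str, List[str]], start: str, goal: str,
                 _visited: Optional[List[str]] = None) -> Iterator[List[str]]:
    """Yield every path from start to goal that visits no element twice."""
    visited = (_visited or []) + [start]
    if start == goal:
        yield visited
        return
    for successor in graph.get(start, []):
        if successor not in visited:  # skip cycles
            yield from simple_paths(graph, successor, goal, visited)


paths = list(simple_paths(graph, "sensor", "radio"))
```

Here the traversal finds two routes from `sensor` to `radio`, one direct via `obc` and one via `storage`. Answering the same question with joins in a relational database would require one self-join per hop, with the number of hops unknown in advance.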
3.2. Analyses for a SysML Graph Schema
According to Van Bruggen (2014) the aim of a graph schema should be to enable answering all questions
that can be foreseen to be put to the graph with a minimum of required syntax. Therefore, this
section defines questions on specific aspects of SysML models, such as structural or behavioral
information. For the sake of brevity, requirements and use cases are not addressed in this publication. The
questions provided in the following are the result of five years of personal experience in systems
engineering, testing and operations of small spacecraft. The list is of course not
exhaustive, but sufficient to help draft the graph schema and modelling guidelines.
3.2.1. Analyzing the Structural Part of a SysML Model
SysML diagrams can be separated into three groups: structural diagrams, behavioral diagrams and
requirements modelling. The following questions can be posed to the structural part (i.e. Block Definition
Diagrams, Internal Block Diagrams and Parametric Diagrams) of a SysML model:
1. What is component X composed of?
2. What types of ports are used over a certain range of equipment?
3. What elements belong to a certain class?
4. What systems employ a certain type of component?
5. What component supplies system X with power?
6. How is information Y processed within a certain subsystem?
7. How is information Y processed globally?
8. Which components draw power from a certain supply component?
The following questions show the power of a graph analysis, as a complete fallout analysis can be
performed with the same type of queries:
9. What is the source of telemetry Y and what could influence its measurement?
10. Given an anomaly on a specific subset of a system’s telemetry, which components are most likely
to have caused it? Which components can be ruled out?
11. Given a failure of component X, are there any alternative ways of acquiring data usually
processed by component X?
12. What happens if component X breaks?
(a) Which systems will not work nominally anymore as they process data coming from
component X?
(b) Which components will be offline in case of an electrical short in component X?
(c) Which components will suffer from a loss of input, as they depend on data processed by
any of the components offline due to the electrical short in component X?
3.2.2. Analyzing the Behavioral Part of a SysML Model
Behavioral diagrams such as state machines, activity diagrams and sequence diagrams describe the
response of a system to certain events and conditions. They are used to model expected behavior,
required inputs and task sequences to reach certain states of the system or to produce a certain output
(compare Friedenthal et al. (2011)). Accordingly, the following questions can be answered by analyzing
behavioral diagrams:
13. Which conditions lead to state A?
14. What is the shortest path from state A to state B?
(a) While condition C cannot be met?
(b) Without changing the state of equipment D?
(c) Which conditions have to be fulfilled for this path?
(d) Which activities are performed along this path?
15. Which functions/activities require object E?
16. Starting at activity F, is there a way to reach activity G? Which activities are on that route and
which alternatives could be taken?
17. Which is the shortest route from activity F to activity G?
18. Which conditions have to be met on the route from activity F to G and which inputs provided?
19. Which activities lead to the production of object E?
20. Which inputs are required to produce object E?
21. In case condition C cannot be met, which states of the system cannot be reached anymore?
22. In case condition C cannot be met, which outputs cannot be acquired anymore?
3.3. Proposed Graph Schema
The proposed graph schema is, like the previous subsection, structured according to the main aspects of
SysML: structure and behavior. The design goal of the schema is to enable answering the questions
defined in Section 3.2 by using Cypher queries performed on a Neo4j database.
Each paragraph describes the definition of node labels and relation types as well as properties
stored on relations and nodes. The labels were chosen with a focus on readability. The idea
behind this is to maintain a shallow learning curve and keep to Cypher’s declarative nature.
Representations that read like a sentence are easier to understand and remember. For example,
(Mobile Charger) -[:CLASS]-> (Power Converter) does not make it apparent
which is the class and which the element. (Mobile Charger) -[:IS_OF_TYPE]-> (Power
Converter) makes it clear that the Mobile Charger is an element of the class Power Converter.
3.3.1. Graph Schema for Structural Aspects of a SysML Model
Since SysML is a graphical modelling language, most concepts can be transferred straightforwardly
(compare Object Modelling Group (2019)):
• SysML Blocks become nodes with the label :BLOCK.
• SysML Ports become nodes with the label :PORT.
• Instances of Blocks become nodes with the labels :BLOCK and :INSTANCE.
• Generalizations become :IS_OF_TYPE relations.
• Aggregations and Compositions, i.e. the associations that structure SysML Blocks hierarchically,
become :IS_PART_OF relations.
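The mapping rules above can be tried out without a database. The following Python sketch is not the MagicDraw-to-Neo4j adapter described in this paper, and it uses plain Python instead of Cypher so that it runs on its own. The data layout, the helper names (add_node, add_rel, parts_of, elements_of_type), the assignment of ports to blocks and the property value "composition" are assumptions made for illustration. It stores nodes with label sets and typed relations, and answers Question 1 (composition) and Question 3 (class membership) by following :IS_PART_OF and :IS_OF_TYPE relations.

```python
# Minimal in-memory property graph; layout and helper names are illustrative assumptions.
nodes = {}   # node name -> set of labels
rels = []    # (source, relation type, target, properties)

def add_node(name, *labels):
    nodes.setdefault(name, set()).update(labels)

def add_rel(source, rel_type, target, **props):
    rels.append((source, rel_type, target, props))

# Blocks and ports of a small example (port ownership is assumed here).
for block in ("A", "B", "C", "D"):
    add_node(block, "BLOCK")
for port, owner in (("bPort", "B"), ("cPort", "C"), ("cPort2", "C"), ("dPort", "D")):
    add_node(port, "PORT")
    add_rel(port, "IS_PART_OF", owner)

# Aggregations and compositions share one relation type; the kind is kept as a
# property (assumed to be a composition in this example).
for part in ("B", "C", "D"):
    add_rel(part, "IS_PART_OF", "A", type="composition")

# A generalization becomes an IS_OF_TYPE relation.
add_node("Mobile Charger", "BLOCK")
add_node("Power Converter", "BLOCK")
add_rel("Mobile Charger", "IS_OF_TYPE", "Power Converter")

def parts_of(whole, label=None):
    """Return all nodes that are directly or transitively part of `whole`."""
    found, frontier = set(), [whole]
    while frontier:
        current = frontier.pop()
        for source, rel_type, target, _ in rels:
            if rel_type == "IS_PART_OF" and target == current and source not in found:
                found.add(source)
                frontier.append(source)
    if label is not None:
        found = {name for name in found if label in nodes[name]}
    return found

def elements_of_type(cls):
    """Return all nodes with a direct IS_OF_TYPE relation to `cls`."""
    return {s for s, rel_type, target, _ in rels
            if rel_type == "IS_OF_TYPE" and target == cls}
```

Calling parts_of("A", "BLOCK") returns only the blocks below A, while parts_of("A", "PORT") returns the ports reached through those blocks. This mirrors the idea that one relation type combined with node labels is enough to separate the two questions.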
One of the ideas behind the graph schema for SysML is to avoid the need to know every aspect
of the system that was ever modelled, but to be able to retrieve any information via queries. Since
aggregations and compositions are both used to structure a set of blocks hierarchically, they both get
the same relation type :IS_PART_OF. Details on the relation type can be retrieved via the property
type. The :IS_PART_OF relation is also used to connect ports to their respective blocks.
Figure 2. bdd to Figure 3. [Block definition diagram of the package Graph Schema Example with the
blocks A, B, C, D and X and the ports bPort, cPort, cPort2 and dPort.]
Figure 3. Example of an itemflow in an internal block diagram. [Internal block diagram of block A with
the instances B : B, second B : B, C : C and D : D; both connectors carry the flowitem X.]
Instantiations in SysML are not always straightforward. While every block in the diagram becomes
instantiated at the moment an internal block diagram is built, the ports connected to the blocks are
not instantiated. Furthermore, the block that hosts the internal block diagram and is represented by the
diagram frame is not instantiated when an internal block diagram is built.
This becomes relevant when looking at SysML Connectors. Connectors are used to depict connections
between instances of blocks, over which physical or non-physical items can be exchanged, such
as electrical power, data or physical momentum. A concept closely related to connectors is the itemflow.
An itemflow describes what items are transmitted via a certain connection. The transmitted items
themselves are modelled as Blocks (see Object Modelling Group (2019)). To distinguish them, they get
the additional label :FLOWITEM. Figure 3 shows a simple itemflow in SysML, while Figure 2 shows
the corresponding block definition diagram. The diagram depicts four block instances, of which three
are connected over respective ports. The connectors both carry the flowitem X. Note that the instance
"second B", which is an instance of Block B, does not interface with any connector.
The graph schema has to depict the information in Figures 2 and 3 unambiguously. Figure 4 shows
the representation of the block definition diagram in Figure 2 in the graph. Figure 5 shows the relation
of instances and blocks. Figure 6 shows the graph transformation for the internal block diagram in
Figure 3. The design is explained in the following and weighed against alternatives.
Figure 4. Graph Representation to Figure 2. Blocks are depicted in brown, ports in green.
Figure 5. Graph Representation of the instances for Blocks. Blocks are depicted in brown, Instances in
blue.
The itemflows shall be traceable through the whole system. One possibility to translate the
information into a graph is to simply create a connector relation from block to block and add the
flowitem’s name and ID as properties on the connector. However, this would lead to the following
complications: (a) the node of the flowitem has no connection to the flow, as the connector runs from
source block to target block and cannot be connected to a third node, and (b) the flowitem would not be
traceable anymore if a block containing the flowitem is used to describe the itemflow instead of the
flowitem itself. The next simplest solution would be to directly use the flowitem as a node between the
ports it flows from and to. However, this only works until a second connector carries the same flow, as
it would become untraceable which relation belongs to which connection. The solution in Figure 6 does
not have these drawbacks, as a connector node is created for every SysML connector. This construct is
often referred to as a hypernode, which is why the node carries the label :HYPERNODE.
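The difference between the rejected alternative and the hypernode construct can be made concrete with two connectors that both carry the flowitem X, as in Figure 3. The Python sketch below is illustrative only. The direction of the flows, the relation directions, the hypernode names conn1 and conn2 and the helper names are assumptions, since the exact layout is defined by the figures of the schema. It shows that with the flowitem as a shared intermediate node the pairing of ports is lost, whereas one hypernode per connector preserves it.

```python
# Rejected alternative: the flowitem X sits directly between the ports.
direct = [
    ("bPort", "FLOWS", "X"), ("X", "FLOWS", "cPort"),     # first connector
    ("cPort2", "FLOWS", "X"), ("X", "FLOWS", "dPort"),    # second connector
]

def pairs_via_flowitem(rels, item):
    """Every source/target port pair consistent with the flowitem-as-node graph."""
    sources = [s for s, t, o in rels if t == "FLOWS" and o == item]
    targets = [o for s, t, o in rels if t == "FLOWS" and s == item]
    return {(s, o) for s in sources for o in targets}

# Hypernode variant: one connector node per SysML connector (names are hypothetical).
hyper = [
    ("bPort", "FLOWS", "conn1"), ("conn1", "FLOWS", "cPort"), ("X", "FLOWS_IN", "conn1"),
    ("cPort2", "FLOWS", "conn2"), ("conn2", "FLOWS", "dPort"), ("X", "FLOWS_IN", "conn2"),
]

def pairs_via_hypernodes(rels, item):
    """Source/target port pairs of exactly those connectors that carry `item`."""
    pairs = set()
    for s, t, node in rels:
        if t == "FLOWS_IN" and s == item:
            sources = [a for a, rt, b in rels if rt == "FLOWS" and b == node]
            targets = [b for a, rt, b in rels if rt == "FLOWS" and a == node]
            pairs.update((a, b) for a in sources for b in targets)
    return pairs
```

In the first variant the query cannot tell whether X flows from bPort to cPort or from bPort to dPort, so it returns all four combinations. In the hypernode variant only the two modelled connections are returned.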
Figure 6. Graph Representation of the information related to Figures 2 and 3. Blocks are depicted in
brown, instances of blocks in blue, hypernodes in purple, ports and port instances in green, flowitems
in red. Note: As the graph itself contains all information, this is merely an excerpt showing specific
information but not bound to the limits of any SysML diagram type.
Figure 7. Proposed graph schema for Structural SysML Diagrams.
Figure 8. Simple activity diagram and its graph transformation.
Similarly, the simplest solution to handle ports would be to omit port instantiation and only store the
relation between port and block, as well as between port and block instance. In the example shown in
Figures 3 and 6, however, this approach would lose the information whether the instance "B" or
"second B" carries out the connection, as both would be connected to the port "bPort". Hence, in addition
to the elements taken from the MagicDraw model, port instances are created during the translation from
MagicDraw to Neo4j to discern which port instance belongs to which block instance. While ports can
formally be instantiated in MagicDraw, requiring users to conduct this instantiation and to relate
all port instances by hand would mean extra modelling effort that can be avoided. Figure 7 shows
the complete graph schema for SysML structure diagrams, with which the transformation of the internal
block diagram in Figure 3, shown in Figure 6, complies. The graphic is interpreted as follows:
Nodes are depicted as rectangles, while relations are presented as connections between the rectangles.
The graphic defines which combinations of labels and relations between node types are allowed by
the schema. For nodes with multiple labels (for example :BLOCK:INSTANCE), all relations defined for
the node with the single label (in this case :BLOCK) are also allowed. The relation types are written on
the connections. Arrows originating at the node they point to show relations defined between two nodes of
the same type. In addition to the labels and relation types defined in Figure 7, all nodes carry an id
and a name property, allowing for unique identification and easy reference. The names are taken over
from the SysML model. Where no name is provided in SysML, NULL is entered in the graph.
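The creation of port instances during translation can be sketched as follows. This is not the code of the adapter. The function name instantiate_ports, the naming pattern for port instances, the assignment of ports to blocks and the use of :IS_INSTANCE_OF to link a port instance to its port are assumptions made for illustration.

```python
def instantiate_ports(block_ports, instance_of):
    """Create one port instance per (block instance, port of its block) pair.

    block_ports: block name -> list of port names
    instance_of: block instance name -> block name
    Returns a list of (source, relation type, target) triples.
    """
    rels = []
    for instance, block in instance_of.items():
        for port in block_ports.get(block, []):
            port_instance = f"{instance}.{port}"   # hypothetical naming pattern
            rels.append((port_instance, "IS_INSTANCE_OF", port))
            rels.append((port_instance, "IS_PART_OF", instance))
    return rels

# Example data (assumed): two instances of block B share the port definition bPort.
block_ports = {"B": ["bPort"], "C": ["cPort", "cPort2"], "D": ["dPort"]}
instance_of = {"B": "B", "second B": "B", "C": "C", "D": "D"}
port_rels = instantiate_ports(block_ports, instance_of)

def owner_of(port_instance):
    """Block instance that a port instance belongs to."""
    return next(o for s, t, o in port_rels if s == port_instance and t == "IS_PART_OF")
```

Because "B" and "second B" each receive their own instance of bPort, a connector attached to one of these port instances identifies the block instance that carries out the connection.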
3.3.2. Graph Schema for Behavioral Aspects of a SysML Model
Behavior is modelled in SysML via Activity Diagrams, State Machines, Sequence Diagrams and Use
Case Diagrams (Object Modelling Group (2019)). This section will focus on Activity Diagrams, State
Machines and Sequence Diagrams.
Graph Schema for Activity Diagrams
Figure 8 shows a simple activity diagram, consisting of a Start and an End node and two activities named
Action 1 and Action 2, as well as its translation into a graph.
The information contained in the control flows (dashed arrows in the SysML act diagram, green
arrows in the graph transformation, see Figure 8) is stored as properties of the :CONTROL_FLOW
relations; these properties are not depicted in Figure 8. Any activity diagram can be seen as an
activity by itself. Therefore, the node to the very left of the graph transformation in Figure 8
represents the diagram itself. This becomes a necessity when dealing with nested activities. Figure 9
shows the activity simple behavior from Figure 8 nested in another activity with the title reused
activities. Semantically, this means simple behavior is executed in the reused activities diagram and
the complete content of simple behavior is run through at the time the activity is called.
Figure 9. Nested activities.
Figures 10 and 11 show different possibilities for the diagram’s transformation. Figure 10 shows a
simpler transformation that is closer to SysML’s own syntax. The execution simple execution of simple
behavior is created as a single node and connected via the corresponding control flows as depicted in
Figure 9. Figure 11 shows a more complex transformation. Each activity and node contained in the simple
behavior diagram is created again together with the respective control flows. The control flows that were
connected to simple execution are redirected to the Start and End node of simple execution. Thereby the
complete activity flow is generated and can be queried to understand which activities have to be
performed in order to achieve a certain result.
While both transformation schemata have their advantages, we decided in favor of the variation
shown in Figure 11, as it makes the complete flow of activities easier to understand and keeps queries
simpler.
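The effect of the second alternative on queries can be illustrated with a small Python sketch. It only tracks control flows and leaves out the :IS_PART_OF and :IS_EXECUTION_OF relations. The order of the nodes in both diagrams, the prefix used to name the copied nodes and the helper names expand_execution and route are assumptions for illustration, not the adapter's implementation.

```python
from collections import deque

def expand_execution(outer_flows, execution, inner_flows, start="Start", end="End"):
    """Replace one execution node by a fresh copy of the called activity's control flows."""
    def copy(node):
        # Prefix inner nodes so they stay distinct from equally named outer nodes.
        return f"{execution}/{node}"

    expanded = []
    for source, target in outer_flows:
        if target == execution:
            target = copy(start)        # incoming flow now enters the inner Start node
        if source == execution:
            source = copy(end)          # outgoing flow now leaves the inner End node
        expanded.append((source, target))
    expanded.extend((copy(s), copy(t)) for s, t in inner_flows)
    return expanded

def route(flows, origin, goal):
    """Breadth-first search along control flows; returns one shortest route or None."""
    queue, seen = deque([[origin]]), {origin}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for source, target in flows:
            if source == path[-1] and target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None

# Assumed node order of the two example activities.
simple_behavior = [("Start", "Action 1"), ("Action 1", "Action 2"), ("Action 2", "End")]
reused_activities = [("Start", "Action 0"), ("Action 0", "simple execution"),
                     ("simple execution", "End")]
flows = expand_execution(reused_activities, "simple execution", simple_behavior)
```

On the unexpanded flows, route(reused_activities, "Start", "End") passes through the single node simple execution, so the inner actions stay hidden. On the expanded flows, the same traversal lists Action 1 and Action 2 without any additional lookup, which is the simplification argued for above.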
Another construct of SysML Activity Diagrams is the use of blocks as inputs and outputs of activities,
as depicted in Figure 12. Blocks are depicted in activity diagrams as rectangles with pointed corners.
They are connected to activities in the diagram via object flows (solid arrows in Figure 12).
Figure 13 shows the diagram’s translation into the graph. To realize the usage of blocks in activity
diagrams, two additional relation types are introduced: :OBJECT_FLOW, which is a direct translation
of its SysML counterpart, and :IS_USED_IN, which represents the connection between the respective
block (A) and the :ACTNODEs (in, buffer, out) of the object flow. The :ACTNODE label is also used
for Initial and Final nodes, as well as Decision nodes, Fork nodes and Merge nodes. To discern the
different types, the property nodetype is defined, which carries the respective information.
This brings us to the declaration of node labels for activity graphs. Figure 14 summarizes the graph
schema for activity diagrams. Discerning activities from other nodes such as Initial, End, Fork or Buffer
nodes in an activity diagram is necessary to answer Questions 16 to 20 in Section 3.2.2. Therefore,
in addition to the :ACTNODE label, an :ACTIVITY label is defined, which is used for the
actual activities. To reduce the effort of building queries, nodes with the label :ACTNODE or :ACTIVITY
also carry the label :ACT, which marks any node from an Activity Diagram. For any execution, the
:EXECUTION label is provided, together with the relation :IS_EXECUTION_OF. The :BLOCK label
used in Figure 14 is the same one as in the graph schema for block definition diagrams and links the
two diagram types.
Figure 10. Nested activities transformation Alternative 1.
Graph Schema for State Machines
The second important part of behavioral modelling is state machines. Figure 15 shows a simple
state machine with three states (state 1, state 2 and state 3) and a set of transitions. According to
Object Modelling Group (2019, p. 162), transitions can carry triggers, guards and activities. Triggers and
guards specify when the transition becomes active, while the activity simply states that an activity
shall be performed when the transition activates. Figure 16 shows a possible transformation of the
state machine in Figure 15 into a graph. However, the schema applied in Figure 16 does not allow
for complex queries on transition conditions and behavior such as Question 14 in Section 3.2.2. This
is resolved by a :HYPERNODE construct, equivalent to the :HYPERNODE in Section 3.3.1, which is
linked via :TRANSITION relations with the state that is left and the state that is entered. Triggers
for state transitions, such as condition 1, 2 and 3 in Figure 15, carry the label :TRIGGER, which may
be added in addition to any other label the conditioning node carries. :TRIGGERS relations link the
:TRIGGER with the :HYPERNODE.
In addition to the trigger events, so-called guards can be specified to refine the conditions under
which a state transition is activated. The definition of trigger events as separate nodes is, however,
sufficient to answer the questions defined in Section 3.2.2. Therefore, guards are for now treated as
properties of the :HYPERNODE (guard property). For an exemplary transformation of Figure 15, see Figure 17.
Note how condition 1 is used in both transitions and the activity simple behavior is also linked to the
transition, enabling us to answer questions such as Question 14 a, c (What is the shortest path from state
A to state B while condition C cannot be met, and which conditions have to be fulfilled for this path?),
21 (In case condition C cannot be met, which states of the system cannot be reached anymore?) or 22
(In case condition C cannot be met, which outputs cannot be acquired anymore?).
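How separate trigger nodes help with Questions 14 and 21 can be sketched in a few lines of Python. The state machine below is hypothetical and is not the one shown in Figure 15. Each tuple stands for one hypernode together with the state that is left, the state that is entered and the triggers linked to it. The function names, the tuple layout and the rule that a transition is unusable as soon as one of its triggers cannot be met are assumptions made for illustration.

```python
from collections import deque

# Hypothetical state machine, one hypernode per transition:
# (hypernode id, state left, state entered, triggers linked via TRIGGERS relations)
transitions = [
    ("t0", "Initial", "state 1", set()),
    ("t1", "state 1", "state 2", {"condition 1"}),
    ("t2", "state 2", "state 3", {"condition 2"}),
    ("t3", "state 1", "state 3", {"condition 3"}),
    ("t4", "state 3", "End", {"condition 1"}),
]

def shortest_path(start, goal, unmet=frozenset()):
    """Breadth-first search over transitions whose triggers are not in `unmet`.

    Returns (list of states, set of conditions needed along the path) or None.
    """
    queue, seen = deque([([start], set())]), {start}
    while queue:
        states, conditions = queue.popleft()
        if states[-1] == goal:
            return states, conditions
        for _, source, target, triggers in transitions:
            if source == states[-1] and target not in seen and not (triggers & set(unmet)):
                seen.add(target)
                queue.append((states + [target], conditions | triggers))
    return None

def unreachable_states(start, unmet):
    """States that cannot be reached from `start` if the conditions in `unmet` fail."""
    all_states = {s for _, s, _, _ in transitions} | {t for _, _, t, _ in transitions}
    return {s for s in all_states if shortest_path(start, s, unmet) is None}
```

In this example, shortest_path("state 1", "state 3") uses the direct transition t3, while passing unmet={"condition 3"} forces the detour via state 2 and reports condition 1 and condition 2 as the conditions to be fulfilled. unreachable_states("Initial", {"condition 1"}) corresponds to Question 21.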
Figure 11. Nested activities transformation Alternative 2. Note: :IS_INSTANCE_OF relations and
respective nodes are omitted in this view for better readability.
Figure 12. Activity Diagram with Blocks.
Figure 13. Graph Transformation of the Activity Diagram with Blocks in Figure 12. Yellow:
:ACTNODE, blue: :ACTIVITY, beige: :BLOCK.
Figure 14. Graph Schema for Activity Diagrams.
Figure 15. Simple State Machine.
Figure 16. Possible graph transformation of the state machine in Figure 15. :STATE nodes in orange,
:PSEUDOSTATE nodes in green, :TRANSITION relations in blue, :IS_PART_OF relations in
brown.
Figure 17. Graph transformation of the state machine in Figure 15. :STATE nodes in orange,
:PSEUDOSTATE nodes in green, :HYPERNODEs in brown, :ACTIVITY nodes in purple,
:TRIGGER nodes in red. Note: The node "simple stm" is not shown for better readability.
Figure 18 and Figure 20 show more possibilities to be considered for the transformation. Figure 18
depicts state machines with substates in two variations. This opens up possibilities for the flow of
transitions, as the transition entering superstate can be drawn to superstate, to Start Superstate (or
Entry), or to both. The same applies to the transitions leaving the superstate. Arguments can be made for
and against all three variations:
• Drawing the transition from Initial to superstate allows answering the question "How can I enter or
leave superstate?", but makes it more cumbersome to follow the flow of transitions through the
project, as for every state we now need to query whether it has a set of substates and, if so, include
these in the list of states and transitions. Which level of detail is appropriate may be difficult to
answer, especially since the graph queries shall be usable without prior knowledge of the
complete SysML model.
• Drawing the transition from Initial to Entry allows following the transitional flow through the whole
diagram without ever querying any other relation than :TRANSITION. A disadvantage of this
approach is that it becomes more difficult to query how to enter and leave the superstate.
• Drawing both transitions leaves us without any cumbersome queries for either way of modelling,
but results in a graph that is harder to understand, since the transitions in the graph would show that
you are either in the superstate or in substate 1 or substate 2.
Following a network of transitions through a set of state machines is a more complex task than answering
the question of possible entries and exits to a single state. Therefore, we decided on the second
option above. Figure 19 shows the graph transformation of the state machines in Figure 18.
Figure 18. State machine with substates in two variations.
Figure 19. Graph Transformation of the state machine with substates in Figure 18. Orange nodes carry
a :STATE label, green nodes a :PSEUDOSTATE label and beige nodes a :HYPERNODE label. State
transitions are shown as blue relations, whereas :IS_PART_OF relations are shown in beige.
An interesting feature of state machines in SysML is their connection to activities. Activities can
be set to be performed on transitions, or on entry, do [while] or exit of a state (Object Modelling Group,
2019, p. 161). Such activities may, for example, log the information "state was left" or save the results
at the end of a measurement accumulation. Such a state machine is displayed on the right of Figure 20.
Employing activities is a basic feature of state machines, and depicting this kind of crosscutting
connection from one diagram type to the other is one of the strengths of graph databases. State
2 in Figure 20 shows activities at various positions within a state (entry, do, exit). The activities used
in the state machine have to be defined elsewhere and are then only used in the state. Using a set of
activities within a state implicitly creates a new set of control flows between the entry, do and exit
activities. The graph schema covers this implicit connection by explicitly linking the three elements
with :CONTROL_FLOW relations (see Section 3.3.2). Another relation is necessary between the state
and the activities performed within it. This is covered by an :IS_PART_OF relation. While it could
be argued that a new relation type would make sense instead of reusing a relation from the structural
aspects of SysML, the goal of the schema is to keep it as simple as possible. Creating new relation types
for the same kind of information as in the structural part of SysML is not necessary, as the context
provided by the node labels (:BLOCK in the structural part of the schema, :ACTIVITY and :STATE in the
behavioral part) resolves any ambiguity.
The diagram on the left side of Figure 20 shows the reused state machine reused execution: sim-
ple stm. The concept of nesting and reusing defined behavioral elements was already described in the
beginning of this section for activities. The same logic applies here. Porting the execution of the stm
that was already created by MagicDraw over into the graph allows for clear and understandable seman-
tics. Therefore, executions of states are referred to with the label :EXECUTION additionally to the
:STATE label. Apart from the :STATE nodes, other elements of the state machine also have to be
transferred into the graph, such as initial nodes and end nodes, decision nodes, etc. For these elements
the :PSEUDOSTATE label is provided.
Figure 21 shows the graph schema for state machines in detail. Nodes of the type :STATE
and :PSEUDOSTATE can be connected to :HYPERNODEs via the :TRANSITION relation.
:STATE:EXECUTION nodes are connected to :STATE nodes via the :IS_EXECUTION_OF relation
type. States can also be part of other states and can therefore, like :ACTIVITY nodes, be
connected to states via :IS_PART_OF relations. :TRIGGER nodes define the conditions which are
necessary to conduct a transition and relate via the :TRIGGERS relation to :HYPERNODEs.
Figure 20. Left: State machine with reference to another stm. Right: State machine with activities.
Figure 21. Graph Schema for SysML state machines.
4. Modelling Guidelines
One goal of the graph schema is to enable queries on a model that is not yet complete, thereby
enabling graph queries on SysML models from an early stage of development or on systems that are
only partially modelled in later stages.
One goal of this set of modelling guidelines is to show how to build a SysML model that is effective
for employing graph queries. A second goal of the guidelines is to explain how to trim the guidelines,
i.e. what happens if a certain rule is not followed. When trimming these guidelines and working with
incomplete models, consider that the graph can only relay information that is available in the model.
This is an important principle for understanding which information should be included in the SysML model
and guides us through the following section. The last goal of this set of modelling guidelines is to be
useful, which requires it to be simple and efficient to apply and to yield results fast, while touching
as little as possible on other modelling principles that may be followed by the modeler. Therefore,
examples illustrating a guideline are presented at various points.
The structure of this section follows the same principle as the previous sections, starting with
structural diagrams and progressing to behavioral diagrams.
4.1. Modelling Guidelines for Structural Aspects of a SysML-Model
Any question number cited in the following refers to the questions defined in Section 3.
4.1.1. Modelling of Structural Associations
Recalling the questions defined in Section 3.2.1, Questions 1 (detailing the composition of a block)
to 4 (detailing systems that employ a certain component) require a sound use of associations between
blocks. That is, every Block that is instantiated should be part of a structure of SharedAssociations and
PartAssociations which starts at the system of systems and reaches any component used in the system.
This is also required for Question 8 (querying components dependent on a certain power supply) or
Question 12 (querying the fallout caused by a broken component). As the graph schema unifies the
concepts of part associations and shared associations into :IS_PART_OF relations, no distinction is
made in the guideline between the two association types.
The same applies to flowitems such as data: modelling flowitems as a set of associated blocks
saves modelling effort when pursuing questions such as "How is information Y processed globally?"
(Question 7) or searching for likely culprits in case of an anomaly on a specific subset of telemetry
(Question 10).
4.1.2. Generalizations
Questions 2 (querying the types of ports used over a certain range of equipment), 3 (querying
the elements belonging to a certain class) and 4 (querying which systems employ a certain type of
component) are of interest when the failure of a certain component can be traced back to its working
principle and corresponding changes need to be made across the whole system. Furthermore, generalizations
can be used to classify flowitems in the model into categories such as currents and voltages or data. This
distinction is necessary to answer questions regarding how a certain piece of information is processed
(Questions 6, 7, 9, 10 and 11) or regarding power paths (Questions 8 and 12 (a) to (c)).
All of these questions require the use of generalizations as defined in Object Modelling Group
(2019). A hierarchy in which port types, component types and data types are defined is therefore
sensible, but can be limited to the level of detail that allows the above questions to be answered. On
some systems this may require refining them to the point where the protocol of the port is defined (for
example CAN, USB, Ethernet), while for other systems a simple differentiation between analog and digital
ports may suffice.
As a guideline, generalizations shall be used to classify blocks, flowitems and ports by important
terms and concepts used throughout the development. This concept especially comes into focus when
dealing with anomaly detection and failure detection, identification and recovery. For example,
Questions 12 (concerning the fallout of a broken component) and 12b (analyzing components offline due
to an electrical short in a component) would benefit from employing the following two concepts:
• a fuse class, which shows which fuse is the next one upstream on the power path and is triggered by
the short, and
• a voltage or power class, which can be used to discern the power paths from data paths or other
physical values, for example.
In the same way, a data class would be beneficial to answer Question 12c (concerning components
suffering from a loss of input, as they depend on data processed by any of the components offline due
to the electrical short in a specified component), which depends on being able to discern a telemetry
value from the physical flow it measures.
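The benefit of the fuse, power and data classes for Questions 12b and 12c can be illustrated with a small Python sketch. The example system, all component and flowitem names, the flat tuple representation of flows and the helper names are invented for illustration and do not stem from the MOVE-II model. The sketch also assumes that a tripped fuse takes every component downstream of it on the power path offline.

```python
# Hypothetical example system; all component and flowitem names are invented.
is_of_type = {
    "Fuse 1": "Fuse", "Fuse 2": "Fuse",          # fuse class
    "5V": "Power", "Payload TM": "Data",         # power and data classes for flowitems
}
# (source component, flowitem, target component)
flows = [
    ("Battery", "5V", "Fuse 1"), ("Fuse 1", "5V", "Radio"), ("Fuse 1", "5V", "Payload"),
    ("Battery", "5V", "Fuse 2"), ("Fuse 2", "5V", "Computer"),
    ("Payload", "Payload TM", "Computer"),
]

def neighbours(component, item_class, upstream=False):
    """Direct neighbours of `component` along flows whose flowitem is of `item_class`."""
    return {(s if upstream else t) for s, item, t in flows
            if is_of_type.get(item) == item_class and (t if upstream else s) == component}

def next_upstream_fuse(component):
    """Walk the power path upstream until a node of the fuse class is found."""
    frontier, seen = [component], {component}
    while frontier:
        current = frontier.pop(0)
        if is_of_type.get(current) == "Fuse":
            return current
        for source in neighbours(current, "Power", upstream=True) - seen:
            seen.add(source)
            frontier.append(source)
    return None

def downstream(starts, item_class):
    """All components reachable from `starts` along flows of `item_class`."""
    found, frontier = set(), list(starts)
    while frontier:
        for target in neighbours(frontier.pop(), item_class) - found:
            found.add(target)
            frontier.append(target)
    return found

fuse = next_upstream_fuse("Radio")            # fuse triggered by a short in the Radio
offline = downstream({fuse}, "Power")         # Question 12b
starved = downstream(offline, "Data") - offline   # Question 12c
```

Without the class information, the traversal could not tell the power flow from the telemetry flow and would either miss the Computer's loss of input or wrongly report it as unpowered.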
4.1.3. Internal Block Diagrams and Modelling of Flowitems
The purpose of Internal Block Diagrams is to show the connections between blocks and thereby define
which paths flowitems can take within the model.
Answering questions such as Question 5 (querying the component that supplies a system with power),
Question 6 (concerning how a certain piece of information is processed within a certain subsystem),
Question 7, Question 8 (querying which components draw power from a certain supply component),
Question 9 (querying the source of and possible influences on a certain telemetry value) or Question
10 requires detailed information on the data and power flows within a system. Therefore, blocks used
as flowitems should be modelled to the same level of detail at which analyses are to be performed
later. In particular, the associations between flowitems on different levels, such as "Temperature
X1 is part of the telemetry of Panel X, the telemetry of Panel X is part of Subsystem Y’s telemetry",
should be modelled to the same depth that is afterwards required for analyses. As the analyses
often start with anomalies seen on a single telemetry value, this usually requires modelling down to
the measurement of every single sensor.
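The containment chain described above can be sketched in a few lines of Python. The mapping and the function `containers` are hypothetical; they only illustrate why each level of the flowitem hierarchy has to be present in the model if a single sensor value is to be traced upwards.

```python
# Hypothetical containment: flowitem -> flowitems it is directly part of.
PART_OF = {
    "Temperature X1": ["Panel X Telemetry"],
    "Panel X Telemetry": ["Subsystem Y Telemetry"],
}


def containers(flowitem, part_of=PART_OF):
    """Return the flowitem itself plus every flowitem containing it (breadth-first)."""
    result, queue = [flowitem], [flowitem]
    while queue:
        current = queue.pop(0)
        for parent in part_of.get(current, []):
            if parent not in result:
                result.append(parent)
                queue.append(parent)
    return result
```

If the intermediate level "Panel X Telemetry" were missing from the model, the sensor value could no longer be related to the subsystem telemetry it travels in.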
While the flowitems should be modelled to great depth, the itemflows modelled in internal block
diagrams do not generally require the same level of detail. However, if a certain telemetry value
shall be traceable all the way back to the sensor generating it, the same level of detail also has to
be provided in the internal block diagram. Most modern spacecraft developments rely on commercial
off-the-shelf parts to some degree. The suppliers of these components typically do not provide the
information necessary to model at a level of detail that allows tracking every data path within their
subsystem. Therefore, the graph analysis has to be able to cope with black-boxed subsystems, following
the paradigm that only information available in the model can be queried. For example, if only the
specification of the telemetry provided by the subsystem is available, but no information on its
internal connections, modelling the subsystem as a black box that transmits a single block containing
all of its telemetry suffices.
Figure 22 illustrates this in an example. The upper diagram in Figure 22 shows the telemetry
flowitems of Subsystem A and Subsystem B. The middle diagram provides a structural breakdown of
the spacecraft. The bottom diagram shows the telemetry connections within the spacecraft in the form
of an internal block diagram. Note that Subsystem A, for which no further breakdown is provided,
simply transmits its telemetry, which according to the upper diagram contains the values A1, A2 and A3.
Depending on the system to be modelled, discerning between physical values and their measurements
as flowitems can be a useful addition to the modelling guidelines. An example of this would be
an electrical power system, where currents, voltages and the measurements thereof are transmitted.
In such a case, generalizing the flowitems to the two blocks physical value and measurement resolves
any ambiguity. This is especially recommended for answering questions such as Question 12, as it
allows backtracking power paths to the next fuse and discerning between power paths and data paths.
[Figure 22 consists of three diagrams. Upper: block definition diagram "telemetry hierarchy" with the blocks Subsystem A Telemetry, Subsystem B Telemetry, Value A1, Value A2, Value A3, Value B1 and Value B2. Middle: block definition diagram "Spacecraft Breakdown" with the blocks Spacecraft, Subsystem A, Subsystem B, Sensor B1, Sensor B2 and On Board Computer. Bottom: internal block diagram "Spacecraft TLM paths" with the parts SysA : Subsystem A, Sys B : Subsystem B, SB1 : Sensor B1, SB2 : Sensor B2 and OBC : On Board Computer, connected via data ports carrying Value B1, Value B2, Subsystem A Telemetry and Subsystem B Telemetry.]
Figure 22. Exemplary application of the modelling guidelines for structural SysML aspects.
4.2. Modelling Guidelines for Behavioral Aspects of a SysML-Model
Answering the questions set in Section 3.2.2 also requires some modelling rules. Compared to the
structural part of SysML, however, these are rather sparse. Modelling state machine conditions with
signals, as shown in Figures 15 and 17, makes it possible to answer Question 14a and 14c (What is the
shortest path from state A to state B while condition C cannot be met? Which conditions have to be
fulfilled for this path?) and Question 21 (In case condition C cannot be met, which states of the
system cannot be reached anymore?).
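The kind of analysis enabled by modelling conditions as signals can be sketched as a breadth-first search that ignores every transition requiring an unavailable signal. The state names, signal names and the function `shortest_path` below are hypothetical and are not taken from the MOVE-II model.

```python
from collections import deque

# Hypothetical transitions: (source state, target state, required signal).
TRANSITIONS = [
    ("Safe", "Nominal", "battery_ok"),
    ("Safe", "Detumble", "rates_high"),
    ("Detumble", "Nominal", "rates_low"),
]


def shortest_path(start, goal, blocked, transitions=TRANSITIONS):
    """Breadth-first search that skips transitions requiring a blocked signal.

    Returns (states, signals) for a shortest path, or None if the goal is unreachable.
    """
    queue = deque([(start, [start], [])])
    visited = {start}
    while queue:
        state, states, signals = queue.popleft()
        if state == goal:
            return states, signals
        for src, dst, signal in transitions:
            if src == state and signal not in blocked and dst not in visited:
                visited.add(dst)
                queue.append((dst, states + [dst], signals + [signal]))
    return None
```

The returned signal list corresponds to the conditions that have to be fulfilled along the path, and a result of None identifies a state that cannot be reached anymore.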
Connecting state machines can be achieved via reusing state machines or employing signals. Both
of these practices are encouraged to achieve a higher level of interrelated, queryable information in
the model. The same concepts apply to Activity Diagrams. When modelling activities, reusing existing
activities to connect the diagrams can be of help to later retrieve information via the graph database.
4.3. Summary
Overall, the modelling guidelines to transfer SysML models to Neo4j in a way that enables such com-
plex queries as described in Section 3.2 are few. This was intended from the beginning, as the method
described in this paper should be combinable with any other modeling strategy. Parts of the rules
described above can even be neglected, if the specific questions related to these rules are not of interest.
5. Implementation
The translation of MagicDraw SysML models into the graph is implemented in Python 3.8, with
two scripts and a library of common functions (see Figure 23).
Figure 23 provides an overview of the implementation. The retrieve_SysML_model.py script
reads the SysML model generated in MagicDraw 19.0 (.mdxml file) and extracts SysML components
and relations as lists of Python dict variables. The lists are stored as .json files. This provides a
separation of concerns and allows testing the retrieve functions separately from the insert functions,
which insert the extracted SysML components into Neo4j. It also enables future developers to use only
the extraction part of our code and insert the extracted SysML components anywhere else.
The insert_SysML_in_neo4j.py script loads the stored .json files and writes every component
via a separate query into the Neo4j database. This procedure is quite inefficient regarding
execution times but allows for faster debugging. As the execution time on a normal office PC is still
under 3 minutes for several thousand SysML elements, the advantage in debugging prevails.
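The two-stage structure described above can be sketched as follows. This is not the code of the repository: the functions `store_elements` and `build_statements`, the dictionary keys `label`, `id` and `name`, and the generated Cypher statement are assumptions chosen for illustration. The sketch stops at building one parameterized statement per element and leaves the actual database connection out.

```python
import json
import os
import tempfile

ALLOWED_LABELS = {"BLOCK", "PORT", "FLOWITEM"}  # labels appearing in the paper's queries


def store_elements(elements, path):
    """Stage 1: persist extracted elements (a list of dicts) as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(elements, handle, indent=2)


def build_statements(path):
    """Stage 2: load the JSON file and build one parameterized statement per element."""
    with open(path, encoding="utf-8") as handle:
        elements = json.load(handle)
    statements = []
    for element in elements:
        label = element["label"]
        if label not in ALLOWED_LABELS:  # labels cannot be passed as query parameters
            raise ValueError(f"unexpected label: {label}")
        query = f"MERGE (n:{label} {{id: $id}}) SET n.name = $name"
        statements.append((query, {"id": element["id"], "name": element["name"]}))
    return statements


def demo():
    """Run both stages on two hypothetical elements using a temporary file."""
    elements = [
        {"label": "BLOCK", "id": "b1", "name": "MOVE-II satellite"},
        {"label": "PORT", "id": "p1", "name": "Data Port"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "elements.json")
        store_elements(elements, path)
        return build_statements(path)
```

Because the intermediate JSON file is the only interface between the two stages, either stage can be tested or replaced on its own, which is the separation of concerns mentioned above.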
The code is open source under the MIT License. It can be found under the following link:
https://gitlab.lrz.de/lrt/sysml_graph_analysis_tool/
To help with initial studies of the subject, the reference models used in this paper, i.e. the MOVE-II
model and a smaller model containing the diagrams shown in Section 3, are published under
https://mediatum.ub.tum.de/1633734
Figure 23. Setup of the extract transfer load software to import MagicDraw SysML data to Neo4j.
6. Application of the Schema on the MOVE-II Spacecraft
The following section describes the application of the schema and guidelines developed in Section 3
and Section 4. After a short introduction, a selection of questions from Section 3.2.1 is taken and
the respective queries are explained and performed on the SysML model of the MOVE-II spacecraft. To
conduct these analyses, the SysML model, which was generated using MagicDraw v19.0 SP4, was
transformed into a Neo4j graph database, employing the schema built in Section 3 and the software
implementation explained in Section 5.
The model of the MOVE-II spacecraft consists of a total of 605 blocks, 345 ports, 2820 relations,
no activities and no states and thereby comprises a medium sized model with purely structural aspects.
No information on the behavior is included.
The model encompasses the satellite, ground station and operations systems and shows the data paths
of all telemetry the spacecraft provides as well as all power paths within the spacecraft. The spacecraft
itself consists of multiple subsystems:
• Attitude Determination and Control System, controlling the spacecraft’s attitude relative to the
Earth and Sun
• Communications (consisting of S-Band and UHF/VHF Transceiver), which provides contact to the
ground
• Command and Data Handling, which handles all telemetry, interprets communications and
controls the state of the satellite
• Electrical Power System, containing the spacecraft’s batteries and power converters and controlling
the maximum power point
• Solar Cell Payload, solar cells which shall be measured against degradation over time in the space
environment
• Structure and Mechanisms, providing structural integrity and deploying the antennas and solar array
6.1. Hierarchical Analyses
The same list of subsystems provided above can be generated from the graph database with the
following Cypher query, which searches for a node with the label :BLOCK and the name property MOVE-II
satellite and any further block that is directly a part of MOVE-II satellite, here bound to the
subsystem variable. Query 1 returns the name property of all nodes that match the position of the
subsystem variable:
MATCH (moveii:BLOCK{name:'MOVE-II satellite'}) <-[:IS_PART_OF]- (subsystem:BLOCK)
RETURN subsystem.name
subsystem.name
ADCS
CDH
EPS
UHF/VHF
PL
S-Band
Solar Array
STR
Query 1: Retrieving the parts of a Block. Note: By adding * after IS_PART_OF, the query retrieves
the parts to unlimited depth.
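The difference between following IS_PART_OF once and following it to unlimited depth can be illustrated outside the database with a small Python sketch. The edge dictionary is hypothetical (it assumes each part has exactly one whole and that the hierarchy contains no cycles), and `parts_of` is an illustrative helper rather than part of the tool.

```python
# Hypothetical IS_PART_OF edges: part -> whole.
IS_PART_OF = {
    "EPS": "MOVE-II satellite",
    "CDH": "MOVE-II satellite",
    "processor": "CDH",
    "SD Card": "CDH",
}


def parts_of(whole, max_depth=None, edges=IS_PART_OF):
    """Return parts of `whole`; max_depth=1 gives direct parts, None follows all levels."""
    found, frontier, depth = [], [whole], 0
    while frontier and (max_depth is None or depth < max_depth):
        next_frontier = sorted(p for p, w in edges.items() if w in frontier)
        found.extend(next_frontier)
        frontier = next_frontier
        depth += 1
    return found
```

Calling `parts_of("MOVE-II satellite", max_depth=1)` mirrors the single-hop pattern of Query 1, while omitting `max_depth` mirrors the variant with the asterisk.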
While the above example already shows the solution for Question 1 from Section 3.2.1 (What is
component X composed of?), Query 2 answers Question 2 (What types of ports are used over a
certain range of equipment?). It starts by anchoring the query to the node with a :BLOCK label and the
name Ground Station, asks for any :PORT that :IS_PART_OF the ground station node and queries
the port types via the :IS_OF_TYPE relations (see footnote 1). It returns the distinct names of the
porttype variables and their number of usages, ordering the results in descending order by the number
of usages of the port type within the ground station. The result is provided below the query. Note how
the declarative definition of relation types fits the declarative language style of Cypher, allowing
for queries that can be understood with little prior knowledge of the language.
MATCH (GroundStation:BLOCK{name:'Ground Station'}) <-[:IS_PART_OF*]- (PortInGs:PORT) -[:IS_OF_TYPE]-> (porttype)
RETURN DISTINCT porttype.name, count(porttype) ORDER BY count(porttype) DESC
porttype.name count(porttype)
'N Connector' 8
'Ethernet' 3
'Serial Port' 2
'USB' 2
'SMA Connector' 2
'data port' 1
Query 2: Retrieve all port types used within a certain range of equipment.
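The aggregation step of Query 2, counting usages per port type and sorting in descending order, corresponds to the following Python sketch. The port names in the list are hypothetical; only the port type names are taken from the result above.

```python
from collections import Counter

# Hypothetical (port, port type) pairs for ports found inside one block.
PORTS = [
    ("antenna feed 1", "N Connector"),
    ("antenna feed 2", "N Connector"),
    ("lan", "Ethernet"),
]


def port_type_usage(ports=PORTS):
    """Count usages per port type, most frequent first."""
    return Counter(port_type for _, port_type in ports).most_common()
```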
6.2. Tracing Data Paths and Analyzing Data Anomalies
6.2.1. Datapath Query
One of the first steps in analyzing anomalies is finding all components which potentially participate
in the anomaly. Taking a data anomaly as an example, any component, be it software or hardware, that
processes the anomalous telemetry has to be found. Instead of consulting a variety of SysML diagrams
to find all components in question, which is cumbersome and prone to errors, Query 3 can provide the
result, listing every source and target component and the data form in which the telemetry is
transmitted. Note how the use of a parameter allows the reuse of the query for any other telemetry. In
this case, the telemetry is Sidepanel X+ Temperature OW2, a temperature value on the outside of the
spacecraft. Query 3 first looks for the :FLOWITEM with the defined telemetry name and all :FLOWITEMs
which contain the node with the defined telemetry name. In the next step, it finds all :HYPERNODEs
in which any of the flowitem nodes :FLOWS_IN and calls these hpn. The final step is to look for the
pattern of :BLOCK:INSTANCE nodes containing :PORTs connected via :FLOWS to an hpn. Based on the
:FLOWS direction, the query distinguishes between source and target of the flow and consequently
returns a table with the results. Of course, the same query can be performed to return a graph instead
of a mere table, which might be useful to graphically assist the understanding of the dataflow.
Figure 24 shows the result of this query. Note how at a certain size the graph loses visual
interpretability.
6.2.2. Finding Probable Causes of Anomalies
The next step in analyzing a data anomaly is to identify other telemetry processed by the components
that Query 3 yielded and to check it for anomalies. A component often either processes all
data correctly or none, so while this is just an empirical step, it is an important and useful one.
Query 4 is an addition to Query 3 and yields any other telemetry processed by the same ports.
This query also builds the basis to answer Question 10 (Given an anomaly on a specific subset of
a system’s telemetry, which components are most likely to have caused it? Which components can be
ruled out?).
The question refers to a scenario, where more than one telemetry value shows an anomaly. To rule
out any other component, the query follows the logic "if another input is being processed by the exact
1 The asterisk after :IS_PART_OF specifies that an arbitrary number of relations of this type can be followed in this direction.
:param searchterm=>'Sidepanel X+ Temperature OW2'
//datapath table
MATCH (telemetry:FLOWITEM {name: $searchterm}) -[:IS_PART_OF*0..]-> (flowitem:FLOWITEM)
WITH flowitem
MATCH (flowitem) -[:FLOWS_IN]-> (hpn:HYPERNODE)
WITH flowitem, hpn
MATCH path = (source:BLOCK:INSTANCE) <-[:IS_PART_OF]- (:PORT) -[:FLOWS]-> (hpn) -[:FLOWS]-> (:PORT) -[:IS_PART_OF]-> (target:BLOCK:INSTANCE)
RETURN DISTINCT flowitem.name AS processedElement, source.name AS source, target.name AS target
ORDER BY processedElement
processedElement                  source                      target
'ADCS Beacondata'                 'beacon Poster'             'ADCS Backend'
'ADCS Beacondata'                 'ADCS Daemon'               'beacon Data Collector'
'ADCS Beacondata'                 'microcontroller'           'ADCS'
'ADCS Housekeeping Data'          'ADCS Backend'              'ADCS schema'
'ADCS Housekeeping Data'          'ADCS'                      'CDH'
'Sidepanel X+ Temperature OW2'    'temperature Sensors x+'    'microcontroller x+'
[...]
Query 3: Retrieve the datapath of a certain piece of information within the model. Note: The response
was cut short here; the original contains an additional 18 lines, omitted for readability.
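The three steps of Query 3 (collect the telemetry and its containing flowitems, find the hypernodes they flow in, resolve source and target blocks via the ports) can be mirrored in plain Python. The data below is hypothetical and simplified: each hypernode is assumed to connect exactly one source port to one target port, and all names as well as the function `datapath` are illustrative assumptions.

```python
# Hypothetical graph fragments following the relation names used in Query 3.
CONTAINED_IN = {"Temperature OW2": ["Sidepanel Data"]}  # flowitem IS_PART_OF flowitem
FLOWS_IN = {"Temperature OW2": ["h1"], "Sidepanel Data": ["h2"]}  # flowitem -> hypernodes
HYPERNODES = {  # hypernode -> (source port, target port)
    "h1": ("sensor_out", "mcu_in"),
    "h2": ("mcu_out", "adcs_in"),
}
PORT_OWNER = {  # port IS_PART_OF block instance
    "sensor_out": "temperature sensor",
    "mcu_in": "microcontroller",
    "mcu_out": "microcontroller",
    "adcs_in": "ADCS",
}


def datapath(telemetry):
    """Return sorted (flowitem, source block, target block) rows for a telemetry value."""
    flowitems, todo = [], [telemetry]
    while todo:  # the telemetry itself plus every flowitem that contains it
        item = todo.pop()
        if item not in flowitems:
            flowitems.append(item)
            todo.extend(CONTAINED_IN.get(item, []))
    rows = set()
    for item in flowitems:
        for hpn in FLOWS_IN.get(item, []):
            source_port, target_port = HYPERNODES[hpn]
            rows.add((item, PORT_OWNER[source_port], PORT_OWNER[target_port]))
    return sorted(rows)
```

As in the query, the value is followed both on its own and inside the larger data package that carries it, which is why the result lists two different processed elements.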
CALL {
MATCH (telemetry {name: $searchterm})-[:FLOWS_IN]->(hpn) RETURN hpn, telemetry
UNION
MATCH (telemetry {name: $searchterm})-[:FLOWS_IN]->()-[:FLOWS]-()-[:IS_PART_OF]->(component)<-[:IS_PART_OF]-()-[:FLOWS]-(hpn)
RETURN hpn, telemetry
}
WITH hpn, telemetry
MATCH (suggestion)-[:FLOWS_IN]->(hpn)
WHERE NOT suggestion = telemetry
RETURN suggestion.name ORDER BY SHORTESTPATH((telemetry)-[:FLOWS_IN|FLOWS*]-(suggestion))
suggestion.name
'Sidepanel X+ Temperature OW3'
'Sidepanel X+ Temperature OW1'
'PDM Current ADCS 3V3 1'
'ADCS Sidepanel Data Package x+'
'Sun Vector x+'
'Sidepanel X+ Temperature BMX'
'Gyroscope Data x+'
'Magnetic Field Vector x+'
Query 4: Retrieve suggestions for possibly compromised telemetry by checking telemetry which is
directly processed by the same components.
same component and port correctly, the error is most likely not in this component." Query 5 shows the
basic pattern to find components processing the three faulty telemetry packages $fltm1, $fltm2 and
$fltm3, while not processing the healthy telemetry package $good_tlm. The result shows a list of ports
and components, including their IDs. The query shows how it is possible to narrow down a list of over
1400 possible suggestions to a mere 6 by applying logic on the graph transformation of the SysML
model. Taking a closer look at the proposed components, we find that the ports p3 and p4 are proxy
ports and therefore no real components. Ruling those out, we end up with one component and three
ports as suggested causes of the anomaly. It has to be noted here that other components or ports could
be at fault as well, as not every possible fault path is traceable through the model. However, the
query provides a good starting point for the analysis.
Figure 25 shows the respective part of the SysML Model, including the item flows, ports and blocks.
Figure 24. Graph representation of the data path query. Red: :FLOWITEM nodes, green: :PORT
nodes, blue: :BLOCK:INSTANCE nodes, purple: :HYPERNODEs .
[Figure 25 shows the internal block diagram of the block CDH with the parts level shifter : Level shifter, SPI level shifter : SPI level shifter, processor : Processor and IMG0 : Nominal Image, the ports p1 : spi levelshifter inside, p2 : spi level shifter outside, p3, p4, I2C, I2C_inside and SPI, and item flows carrying ADCS Housekeeping Data, S-Band Beacon Data Hardware Only, UKW Beacon Data Hardware Only, PL Housekeeping Data and EPS Housekeeping Data.]
Figure 25. Internal Block Diagram to Query 5.
:param fltm1=>'ADCS Housekeeping Data'
:param fltm2=>'UKW Beacon Data Hardware Only'
:param fltm3=>'S-Band Beacon Data Hardware Only'
// $good_tlm (the healthy telemetry package) has to be set via :param as well
MATCH (faulty_tlm1{name:$fltm1}), (faulty_tlm2{name:$fltm2}), (faulty_tlm3{name:$fltm3}), (good_tlm {name: $good_tlm})
WITH *
MATCH (faulty_tlm1)-[:FLOWS_IN]->()-[:FLOWS]->()-[:IS_PART_OF]->(component)
WHERE EXISTS {(faulty_tlm2)-[:FLOWS_IN]->()-[:FLOWS]->()-[:IS_PART_OF]->(component)}
AND EXISTS {(faulty_tlm3)-[:FLOWS_IN]->()-[:FLOWS]->()-[:IS_PART_OF]->(component)}
AND NOT EXISTS {(good_tlm)-[:FLOWS_IN]->()-[:FLOWS]->()-[:IS_PART_OF]->(component)}
RETURN component.name as suggestion, component.id as id
UNION
MATCH (faulty_tlm1{name:$fltm1}), (faulty_tlm2{name:$fltm2}), (faulty_tlm3{name:$fltm3}), (good_tlm {name: $good_tlm})
WITH *
MATCH (faulty_tlm1)-[:FLOWS_IN]->()-[:FLOWS]->(port)
WHERE EXISTS {(faulty_tlm2)-[:FLOWS_IN]->()-[:FLOWS]->(port)}
AND EXISTS {(faulty_tlm3)-[:FLOWS_IN]->()-[:FLOWS]->(port)}
AND NOT EXISTS {(good_tlm)-[:FLOWS_IN]->()-[:FLOWS]->(port)}
RETURN port.name as suggestion, port.id as id
suggestion id
SPI level shifter _19_0_4_9b3028f_1626770558674_761066_49622
CDH_SPI_port_instance _19_0_4_64d021d_1606812551466_659190_43643
_19_0_4_9b3028f_1626784294211_953036_61690
processor_port_instance _19_0_4_9b3028f_1606755365738_580408_43811
_19_0_4_9b3028f_1626770585221_986561_49783
level shifter_p4_port_instance _19_0_4_9b3028f_1632823793184_527010_45978
_19_0_4_9b3028f_1626770580938_1308_49739
level shifter_p3_port_instance _19_0_4_9b3028f_1632823808903_57539_46012
_19_0_4_9b3028f_1626770580938_1308_49739
SPI level shifter_port_instance _19_0_4_64d021d_1609248717945_193014_44390
_19_0_4_9b3028f_1626770558674_761066_49622
Query 5: Determine any port or block processing a list of faulty telemetry while not processing healthy
telemetry.
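The exclusion logic of Query 5 is essentially set arithmetic: intersect the elements processing each faulty package, then subtract everything that also processes a healthy package. The sketch below illustrates this in Python. The mapping is hypothetical and not derived from the MOVE-II model; in particular, the choice of EPS Housekeeping Data as the healthy package is an assumption made only for this example, and `suspects` is an illustrative helper.

```python
# Hypothetical mapping: telemetry package -> elements (ports or blocks) processing it.
PROCESSED_BY = {
    "ADCS Housekeeping Data": {"SPI level shifter", "processor", "ADCS"},
    "UKW Beacon Data Hardware Only": {"SPI level shifter", "processor", "UHF/VHF"},
    "S-Band Beacon Data Hardware Only": {"SPI level shifter", "processor", "S-Band"},
    "EPS Housekeeping Data": {"processor", "EPS"},
}


def suspects(faulty, healthy, processed_by=PROCESSED_BY):
    """Elements processing every faulty package and none of the healthy ones."""
    if not faulty:
        return []
    common = set.intersection(*(processed_by[name] for name in faulty))
    for name in healthy:
        common -= processed_by[name]
    return sorted(common)
```

In this toy data the processor handles all three faulty packages but is ruled out because it also handles the healthy one, which is the same reasoning the query applies on the graph.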
6.3. Failure Propagation Analyses
The next type of analysis already becomes useful during the design of the spacecraft. In comparison
to ground-based systems, robustness against failure is a design goal for any spacecraft, as on-site
repairs are usually not possible. Hence, failure propagation is to be kept to an absolute minimum, i.e.
the spacecraft’s attitude control system, on board computer, power and basic communication systems
should not be influenced by a failure in any other part of the system.
The graph schema and modelling guidelines proposed here do not enable the analysis of complex
relations such as "the communication system can only work if the spacecraft is pointing to the ground
station". The analyses possible with the proposed graph schema and modelling guidelines center
around Question 12 (What happens if component X breaks?).
If component X is a physical component, a broken component may trigger an electrical
short. To prevent such an event from spreading, electrical fuses are employed throughout the system.
Therefore, one of the tasks here is to find out which fuse triggers and which other components
are shut down by the fuse as well. Query 7 does exactly this. However, to enable this query, the graph
needs to be extended by an additional :FLOWS relation between any :PORT with an incoming or outgoing
flow connection and the :BLOCK which the port is part of. This is accomplished by the code presented
in Query 6. Note further that Query 7 takes two parameters as input: the $searchterm parameter defines
which component was shorted, and the $flowlength parameter defines how many :FLOWS connections the
query shall follow before it stops searching. The query also requires following the modelling scheme
presented in Section 4.1.2, defining physical flowitems as a class distinct from telemetry in order to
enable accurate tracking of the power path. Limiting the number of :FLOWS connections followed in the
query is not technically necessary but increases the performance significantly. As the volatile
relations are all created with the property tbd=True, they can be deleted again after the query has
finished.
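// Volatile relation from each block to its ports with an outgoing flow
MATCH (b:INSTANCE:BLOCK)<-[:IS_PART_OF]-(p:PORT)-[:FLOWS]->(:HYPERNODE)
MERGE (p)<-[volatile_rel_out:FLOWS]-(b)
SET volatile_rel_out.tbd = True;
// Volatile relation from each port with an incoming flow to its block
MATCH (b2:INSTANCE:BLOCK)<-[:IS_PART_OF]-(p2:PORT)<-[:FLOWS]-(:HYPERNODE)
MERGE (p2)-[volatile_rel_in:FLOWS]->(b2)
SET volatile_rel_in.tbd = True;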
Query 6: Code to create volatile [:FLOWS] relations necessary to apply the shortest path algorithm in
Query 7.
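The life cycle of the volatile relations (create, analyze, delete) can be wrapped in a small helper so that the cleanup is never forgotten. The following Python sketch is an assumption and not part of the published code: the delete statement is not given in the paper and is only one plausible formulation, and the `run` callable stands in for whatever database session wrapper is used.

```python
CREATE_VOLATILE = [
    "MATCH (b:INSTANCE:BLOCK)<-[:IS_PART_OF]-(p:PORT)-[:FLOWS]->(:HYPERNODE) "
    "MERGE (p)<-[r:FLOWS]-(b) SET r.tbd = true",
    "MATCH (b:INSTANCE:BLOCK)<-[:IS_PART_OF]-(p:PORT)<-[:FLOWS]-(:HYPERNODE) "
    "MERGE (p)-[r:FLOWS]->(b) SET r.tbd = true",
]
# Assumed cleanup statement; the paper only states that the relations can be deleted.
DELETE_VOLATILE = "MATCH ()-[r:FLOWS {tbd: true}]->() DELETE r"


def with_volatile_relations(run, analysis_query, parameters=None):
    """Create the volatile relations, run one analysis query, then always clean up.

    `run` is any callable taking (query, parameters), for example a thin wrapper
    around a database session; it is injected so the sketch stays driver-agnostic.
    """
    for statement in CREATE_VOLATILE:
        run(statement, {})
    try:
        return run(analysis_query, parameters or {})
    finally:
        run(DELETE_VOLATILE, {})
```

Using `try`/`finally` means the flagged relations are removed even if the analysis query raises an error, so the graph is left in its original state.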
Query 7 can also be tuned to show any telemetry values that are no longer processed correctly
due to the short by adding a new MATCH clause based on the variable shortedcomponent.
6.4. Summary
As shown by the selected examples above, the graph makes it possible to retrieve information from a
complex context in a minimum of time. Especially noteworthy are the efficiency gained when looking for
similar information within a different context and the ability to query complex design constructs. The
parametrization of the queries allows reusing them with different objects of interest and even on
different systems. Some of the queries here would benefit from a more compact graph schema. Setting up
the complete graph in a simpler schema inevitably results in a loss of information, though. The graph
projection capability of Neo4j, however, allows creating simpler projected graphs from an existing
graph, which could be used to shorten queries without losing information (see Neo4j Inc. (2021)).
CALL {
MATCH p = (rtc{name:$searchterm}) <-[:IS_PART_OF]- (port) <-[:FLOWS]- () <-[:FLOWS_IN]- (shortedvoltage) -[:IS_OF_TYPE]-> (pfi{name:'physical flowitem'})
WITH shortedvoltage
MATCH (fusetype {name:'electrical fuse'}) <-[:IS_OF_TYPE]- (fuse) <-[:IS_INSTANCE_OF]- (fuseinstance) <-[:IS_PART_OF]- (fuseportinstance) -[:FLOWS]-> (hpn) <-[:FLOWS_IN]- (shortedvoltage)
RETURN fuseinstance, shortedvoltage ORDER BY length(SHORTESTPATH((fuseinstance) -[:FLOWS*]- (rtc))) LIMIT 1
}
WITH *
MATCH p=(fuseinstance) -[:FLOWS*1..$flowlength]-> (hpn:HYPERNODE) <-[:FLOWS_IN]- (shortedvoltage)
WITH hpn, fuseinstance
MATCH (hpn) -[:FLOWS]-> () -[:IS_PART_OF]-> (shortedcomponent)
WHERE NOT fuseinstance.id = shortedcomponent.id
RETURN shortedcomponent.name
shortedcomponent.name
'current Limiter SD Cards'
'SD Card'
'FRAM'
'flash'
'GPS'
'processor'
'real Time Clock'
'level shifter'
Query 7: Find components affected by an electrical short.
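The two stages of Query 7, finding the nearest upstream fuse and then collecting everything that fuse supplies, can be illustrated with a simplified Python model of the power paths. The topology, the component names and the functions `nearest_upstream_fuse` and `shut_down_by` are hypothetical; ports and hypernodes are collapsed into direct supplier-consumer edges for brevity.

```python
from collections import deque

# Hypothetical power paths: supplier -> consumers (names are illustrative).
POWER = {
    "battery": ["current limiter"],
    "current limiter": ["processor", "SD Card"],
    "processor": ["real time clock"],
}
FUSES = {"current limiter"}  # elements classified as electrical fuses


def nearest_upstream_fuse(component):
    """Breadth-first search against the power flow direction for the closest fuse."""
    queue, seen = deque([component]), {component}
    while queue:
        current = queue.popleft()
        if current in FUSES and current != component:
            return current
        for supplier, consumers in POWER.items():
            if current in consumers and supplier not in seen:
                seen.add(supplier)
                queue.append(supplier)
    return None


def shut_down_by(fuse):
    """All components downstream of the tripped fuse."""
    affected, queue = [], deque(POWER.get(fuse, []))
    while queue:
        current = queue.popleft()
        if current not in affected:
            affected.append(current)
            queue.extend(POWER.get(current, []))
    return affected
```

A short in the real time clock trips the current limiter in this toy model, which in turn takes the processor and the SD card offline as well; the classification of the limiter as a fuse is what makes the first step possible.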
7. Conclusion
Over the past four sections of this paper, a graph schema and modelling guidelines evolved from a set
of analysis questions which are typical for assembly, integration and testing of small spacecraft and
are based on personal experience. Section 3 developed the graph schema as a concept, weighing
different options and explaining the thought process behind the proposed schema. To the best of the
authors' knowledge, it is the only public schema for SysML interpretation in graph databases to date
that goes beyond the modelling of requirements and use cases presented by Petnga (2019). In the form
proposed here, it completely lacks any schema for the handling of use cases, requirements, sequence
diagrams and parametric diagrams as well as custom stereotypes. Seeing as this paper is already quite
lengthy, this was not an oversight but the result of a pragmatic decision. Section 4 interlaces the
defined graph schema with the questions defined in Section 3.2 into a concise set of modelling rules,
while explaining where modellers can cut corners on the rules depending on their analysis goals.
Section 5 lays out the code structure on a basic level and provides the reader with pointers to the
respective open source repository and the two published test models. We hope the code provided in the
repository enables any interested party to apply the schema to their own models. Section 6 shows the
application of the schema on the model of the MOVE-II spacecraft. Queries are deduced from selected
questions defined in Section 3.2 to show the capabilities and limitations of the schema and modelling
guidelines. The queries go far beyond the publications by Petnga (2019), Fisher et al. (2014), Bajaj
et al. (2016) and Bajaj et al. (2017). The clear focus on the information a graph shall provide and the
parallel development of modelling rules enable far-reaching and complex analyses on the system.
Overall, the paper shows how graph analyses based on SysML can be a useful tool for engineering
teams working with MBSE. It takes on the challenges to the adoption of MBSE summarized in
Section 2, especially regarding the insufficiency of tools, the low share of projects already
implementing MBSE and the importance of keeping with prevalent modelling strategies (compare Gerald
Pawlikowski (2019)), while keeping the functional complexity under control and structuring the
knowledge about the system in a traceable manner (cf. European Space Agency (2020b)).
A weak point of the provided implementation is its focus on files created with MagicDraw v19.0
SP4. As we do not have access to other versions of the modelling software, no cross tests with newer
software versions were possible. A second weak point is the lack of proposed queries with regard to
the behavioral aspects of SysML. This is due to the lack of a behavior-intensive medium-scale SysML
model on the one hand and, on the other hand, to a trade-off between explaining more challenging and
complex queries on the structural part of the model and the length of the text.
8. Outlook
While the schema and application presented above contain a solid proposal for graph transformations
of information in Block Definition Diagrams, Internal Block Diagrams, Activity Diagrams and State
Machines, these make up only four of the nine diagram types of SysML (cf. Object Modelling Group, 2019,
p. 211). Schemata for Package, Parametric, Sequence, Use Case and Requirement Diagrams are
still to be defined. The schema furthermore does not cover the treatment of custom profiles, which may
require additional node and relation types. All of this work may be done in future revisions of the
schema and may be the topic of further publications. Another interesting aspect would be enhancing
formal reviews of a SysML model via graph queries. Parametrized queries could be applied to any model
to be reviewed, increasing efficiency and transparency of the review process. The comparison between
different versions of a SysML model via graph analysis could also be interesting and would most likely
require some slight adaptations to the schema proposed here. Only a small selection of the questions
posed in Section 3.2 could be shown here. Formulating the queries to answer all questions provided in
Section 3.2 is a prospect for the future.
The graph database could be extended to include document-based information and relate it on a
keyword basis to the SysML model, thereby creating a holistic view of the system’s digital
environment. First tests show that while a promising number of relations can be created by such an
approach, the processing of synonyms, abbreviations and unrelated usage of the same words poses a
unique challenge. Compared to classical MBSE approaches, which require a high level of discipline from
everyone involved, such an approach could be conducted without requiring any change of behavior by the
engineering team, which, as stated by Gerald Pawlikowski (2019), Thomas McDermott, Nicole Hutchison
et al. (2020) and M. Chami, J. Bruel (2018), could result in an easier introduction of MBSE in general.
References
A. Hazle, J. Towers. (2020). Good practice in mbse model verification and validation.
Angles, R. and Gutierrez, C. (2018). An Introduction to Graph Data Management, pages 1–32. Springer International Publishing,
Cham.
B. Morris et al. (2016). Issues in conceptual design and mbse successes: Insights from the model-based conceptual design
surveys.
Bajaj, M., Backhaus, J., Walden, T., Waikar, M., Zwemer, D., Schreiber, C., Issa, G., Intercax, and Martin, L. (2017). Graph-
based digital blueprint for model based engineering of complex systems. In INCOSE International Symposium, volume 27,
pages 151–169. Wiley Online Library.
Bajaj, M., Zwemer, D., Peak, R., Phung, A., Scott, A. G., and Wilson, M. (2011). SLIM: collaborative model-based systems
engineering workspace for next-generation complex systems. In 2011 Aerospace Conference, pages 1–15. IEEE.
Bajaj, M., Zwemer, D., Yntema, R., Phung, A., Kumar, A., Dwivedi, A., and Waikar, M. (2016). MBSE++: foundations for
extended model-based systems engineering across system lifecycle. In INCOSE International Symposium, volume 26, pages
2429–2445. Wiley Online Library.
European Space Agency (2020a). Application of MBSE to reverse-engineer OPS-SAT and improve OPS-SAT2, statement of work.
Technical report.
European Space Agency (2020b). MB4SE user needs. Technical report.
Fernandes, D. and Bernardino, J. (2018). Graph databases comparison: AllegroGraph, ArangoDB, InfiniteGraph, Neo4j, and
OrientDB. In DATA, pages 373–380.
Fisher, A., Nolan, M., Friedenthal, S., Loeffler, M., Sampson, M., Bajaj, M., VanZandt, L., Hovey, K., Palmer, J., and Hart, L.
(2014). 3.1.1 Model lifecycle management for MBSE. In INCOSE International Symposium, volume 24, pages 207–229.
Wiley Online Library.
Franz Inc. (2021). AllegroGraph Website. https://allegrograph.com/products/allegrograph/. Last accessed 2021-06-23.
Friedenthal, S., Moore, A., and Steiner, R. (2011). A Practical Guide to SysML: The Systems Modeling Language. The MK/OMG
Press. Elsevier Science.
Gerald Pawlikowski, Jon B Holladay, J. R. L. K. (2019). Systems engineering and model-based systems engineering stakeholder
state of the discipline - independent assessment of perception from external/non-NASA systems engineering (SE) sources.
Hodler, A. and Needham, M. (2019). Graph Algorithms: Practical Examples in Apache Spark and Neo4j. O’Reilly Media,
Incorporated.
I. Robinson et al. (2015). Graph Databases: New Opportunities for Connected Data. O’Reilly Media, Inc., 2nd edition.
Intercax LLC (2021). Syndeia - Intercax | Software For Integrated MBSE. https://intercax.com/products/syndeia/. Last accessed
2021-07-17.
International Council on Systems Engineering (2007). Systems engineering vision 2020.
J. Bankauskaite, A. Morkevicius (2018). An approach: SysML-based automated completeness evaluation of the system
requirements specification.
J. Webber, R.v.Bruggen (2020). Graph Databases for dummies. John Wiley and Sons.
Langer, M., Appel, N., Dziura, M., Fuchs, C., Günzel, P., Gutsmiedl, J., Losekamm, M., Meßmann, D., Pöschl, T., and Trinitis,
C. (2015). MOVE-II-der zweite Kleinsatellit der Technischen Universität München. Deutsche Gesellschaft für Luft-und
Raumfahrt-Lilienthal-Oberth eV.
Langer, M., Schummer, F., Appel, N., Gruebler, T., Janzer, K., Kiesbye, J., Krempel, L., Lill, A., Messmann, D., Rueckerl, S.,
et al. (2017). MOVE-II - the Munich Orbital Verification Experiment II. In Proceedings of the 4th IAA Conference on University
Satellite Missions & CubeSat Workshop, Rome, Italy, pages 4–7.
M. Chami, J. Bruel (2018). A survey on MBSE adoption challenges.
M. Needham, A. Hodler (2018). A comprehensive guide to graph algorithms. Neo4j Inc.
Morkevicius, A. and Jankevicius, N. (2015). An approach: SysML-based automated requirements verification. In 2015 IEEE
International Symposium on Systems Engineering (ISSE), pages 92–97.
Neo4j Inc. (2019). The Neo4j Graph Algorithms User Guide v3.5. Neo4j Inc.
Neo4j Inc. (2021). Graph catalog - Neo4j Graph Data Science.
https://neo4j.com/docs/graph-data-science/current/management-ops/graph-catalog-ops/. Last accessed 2021-09-30.
Object Modelling Group (2019). OMG Systems Modeling Language (OMG SysML) Version 1.6. Object Modelling Group.
Objectivity Inc. (2021). InfiniteGraph Website. https://infinitegraph.com/pricing/. Last accessed 2021-06-23.
Petnga, L. (2019). Graph-based assessment and analysis of system architecture models. Proceedings of the 29th Annual INCOSE
International Symposium.
Puig-Suari, P. J. (2014). CubeSat Design Specification.
Rik Van Bruggen (2014). Learning Neo4j. Packt Publishing.
Roberts, J. and Hadaller, A. (2019). Behind the US’s largest rideshare launch: Spaceflight’s SSO-A.
Rückerl, S. et al. (2019). First flight results of the MOVE-II CubeSat.
Rutzinger, M., Krempel, L., Salzberger, M., Buchner, M., Höhn, A., Kellner, M., Janzer, K., Zimmermann, C. G., and Langer,
M. (2016). On-orbit verification of space solar cells on the CubeSat MOVE-II. In 2016 IEEE 43rd Photovoltaic Specialists
Conference (PVSC), pages 2605–2609.
Selvy, B. M., Roberts, A., Reuter, M., Claver, C. C., Comoretto, G., Jenness, T., O’Mullane, W., Serio, A., Bovill, R., Sebag, J.,
et al. (2018). V&V planning and execution in an integrated model-based engineering environment using MagicDraw, Syndeia,
and Jira. In Modeling, Systems Engineering, and Project Management for Astronomy VIII, volume 10705, page 107050U.
International Society for Optics and Photonics.
T. Neumann, G. Weikum (2011). RDF-Stores und RDF-Query-Engines. Datenbank-Spektrum, 11(1):63–66.
Thomas McDermott, Nicole Hutchison et al. (2020). Benchmarking the benefits and current maturity of model-based systems
engineering across the enterprise. Technical report.
Van Bruggen, R. (2014). Learning Neo4j. Packt Publishing Ltd.