Content uploaded by Mathias Pletz
Author content
All content in this area was uploaded by Mathias Pletz on Jul 23, 2018
Content may be subject to copyright.
© Schattauer 2018 Methods Inf Med Open 01/2018
e92
Smart Medical Information
Technology for Healthcare (SMITH)*
Data Integration based on Interoperability Standards
Alfred Winter1; Sebastian Stäubert1; Danny Ammon2; Stephan Aiche3; Oya Beyan4;
Verena Bischoff5; Philipp Daumke6; Stefan Decker4; Gert Funkat7; Jan E. Gewehr8;
Armin de Greiff9; Silke Haferkamp10; Udo Hahn11; Andreas Henkel2; Toralf Kirsten12;
Thomas Klöss13; Jörg Lippert14; Matthias Löbe1; Volker Lowitsch10; Oliver Maassen15;
Jens Maschmann16; Sven Meister17; Rafael Mikolajczyk18; Matthias Nüchter12; Mathias
W. Pletz19; Erhard Rahm20; Morris Riedel21; Kutaiba Saleh2; Andreas Schuppert22; Stefan
Smers7; André Stollenwerk23; Stefan Uhlig24; Thomas Wendt25; Sven Zenker26; Wolfgang
Fleig27,**; Gernot Marx15,**; André Scherag28, 29,**; Markus Löffler1,**
1Leipzig University, Institute of Medical Informatics, Statistics and Epidemiology, Leipzig, Germany;
2University Medical Center Jena, Central Service Provider For Information Technology, Jena, Germany;
3SAP SE, Potsdam, Germany;
4RWTH Aachen University, Chair of Computer Science 5, Aachen, Germany;
5University of Leipzig Medical Center, Division Staff and Justice, Leipzig, Germany;
6Averbis GmbH, Freiburg, Germany;
7University of Leipzig Medical Center, Division Information Management, Leipzig, Germany;
8University Medical Center Hamburg-Eppendorf, Business Division for Information Technology, Hamburg, Germany;
9Essen University Hospital, Central Information Technology, Essen, Germany;
10RWTH Aachen University Hospital, Division Information Technology, Aachen, Germany;
11Friedrich-Schiller-Universität Jena, Language & Information Engineering Lab (JULIE Lab), Jena, Germany;
12Leipzig University, LIFE Research Centre for Civilization Diseases, Leipzig, Germany;
13Martin-Luther-Universität Halle-Wittenberg Medical Center, Medical Director, Halle, Germany;
14Bayer AG, Wuppertal, Germany;
15RWTH Aachen University Hospital, Department of Intensive Care and Intermediate Care, Aachen, Germany;
16University Medical Center Jena, Medical Director, Jena, Germany;
17Fraunhofer Institute for Software and Systems Engineering, Dortmund, Germany;
18Martin-Luther-Universität Halle-Wittenberg, Institute of Medical Epidemiology, Biometry and Informatics, Halle,
Germany;
19University Medical Center Jena, Institute of Infectious Diseases and Infection Control, Jena, Germany;
20Leipzig University, Department of Computer Science – Database Group, Leipzig, Germany;
21Forschungszentrum Jülich, Jülich Supercomputing Centre, Jülich, Germany;
22RWTH Aachen University, Institute for Computational Biomedicine II, Aachen, Germany;
23RWTH Aachen University, Informatik 11 – Embedded Software, Aachen, Germany;
24RWTH Aachen University, Medical Faculty, Dean, Aachen, Germany;
25University of Leipzig Medical Center, Data Integration Center, Leipzig, Germany;
26University of Bonn Medical Center, Department of Anesthesiology and Intensive Care Medicine, Bonn, Germany;
27University of Leipzig Medical Center, Medical Director, Leipzig, Germany;
28University Medical Center Jena, Center for Sepsis Control and Care, Jena, Germany;
29University Medical Center Jena, Institute of Medical Statistics, Computer and Data Sciences (IMSID), Jena, Germany
Keywords
Health information interoperability, inte-
grated advanced information management
systems, systems integration, phenotyping,
intensive care, antimicrobial stewardship
Summary
Introduction: This article is part of the
Focus Theme of Methods of Information in
Medicine on the German Medical Inform -
atics Initiative. “Smart Medical Information
Technology for Healthcare (SMITH)” is one
Correspondence to:
Prof. Alfred Winter
Leipzig University
Institute of Medical Informatics, Statistics
and Epidemiology
Haertelstr. 16-18
04107 Leipzig
Germany
E-mail: alfred.winter@imise.uni-leipzig.de
Methods Inf Med 2018; 57(Open 1): e92– e105
https://doi.org/10.3414/ME18-02-0004
received: March 5, 2018
accepted: May 7, 2018
Funding
This work has been supported by German Federal
Ministry of Education and Research (Grant No’s.
01ZZ1609A, 01ZZ1609B, 01ZZ1609C, 01ZZ1803A,
01ZZ1803B, 01ZZ1803C, 01ZZ1803D, 01ZZ1803E,
01ZZ1803F, 01ZZ1803G, 01ZZ1803H, 01ZZ1803I,
01ZZ1803J, 01ZZ1803K, 01ZZ1803L, 01ZZ1803M,
01ZZ1803N).
* Supplementary material published on our
website https://doi.org/10.3414/ME18-02-0004
** Shared senior authorship
Focus Theme – Original Articles
German Med. Informatics Initiative
OPEN ACCESS
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e93
of four consortia funded by the German
Medical Informatics Initiative (MI-I) to create
an alliance of universities, university hospit-
als, research institutions and IT companies.
SMITH‘s goals are to establish Data Inte-
gration Centers (DICs) at each SMITH partner
hospital and to implement use cases which
demonstrate the usefulness of the approach.
Objectives: To give insight into architectural
design issues underlying SMITH data inte-
gration and to introduce the use cases to be
implemented.
Governance and Policies: SMITH imple-
ments a federated approach as well for its
governance structure as for its information
system architecture. SMITH has designed a
generic concept for its data integration
centers. They share identical services and
functionalities to take best advantage of the
interoperability architectures and of the data
use and access process planned. The DICs
provide access to the local hospitals’ Elec-
tronic Medical Records (EMR). This is based
on data trustee and privacy management ser-
vices. DIC staff will curate and amend EMR
data in the Health Data Storage.
Methodology and Architectural Frame-
work: To share medical and research data,
SMITH’s information system is based on com-
munication and storage standards. We use
the Reference Model of the Open Archival In-
formation System and will consistently imple-
ment profiles of Integrating the Health Care
Enterprise (IHE) and Health Level Seven (HL7)
standards. Standard terminologies will be ap-
plied. The SMITH Market Place will be used
for devising agreements on data access and
distribution. 3LGM² for enterprise architec-
ture modeling supports a consistent develop-
ment process.
The DIC reference architecture determines the
services, applications and the standards-
based communication links needed for effi-
ciently supporting the ingesting, data nour-
ishing, trustee, privacy management and data
transfer tasks of the SMITH DICs. The refer-
ence architecture is adopted at the local sites.
Data sharing services and the market place
enable interoperability.
Use Cases: The methodological use case
“Phenotype Pipeline“ (PheP) constructs algo-
rithms for annotations and analyses of pa-
tient-related phenotypes according to clas-
sification rules or statistical models based on
structured data. Unstructured textual data
will be subject to natural language process-
ing to permit integration into the phenotyp-
ing algorithms. The clinical use case “Algorith-
mic Surveillance of ICU Patients” (ASIC) fo-
cusses on patients in Intensive Care Units
(ICU) with the acute respiratory distress syn-
drome (ARDS). A model-based decision-sup-
port system will give advice for mechanical
ventilation. The clinical use case HELP devel-
ops a “hospital-wide electronic medical rec-
ord-based computerized decision support sys-
tem to improve outcomes of patients with
blood-stream infections” (HELP). ASIC and
HELP use the PheP. The clinical benefit of the
use cases ASIC and HELP will be demon-
strated in a change of care clinical trial based
on a step wedge design.
Discussion: SMITH’s strength is the modu-
lar, reusable IT architecture based on interop-
erability standards, the integration of the hos-
pitals’ information management departments
and the public-private partnership. The pro-
ject aims at sustainability beyond the first
4-year funding period.
1. Introduction and
Objectives
“Using the wealth of data – for improved
patient care” [1] is the motto motivating
the German Federal Ministry of Education
and Research for funding the projects in
the German Medical Informatics Initiative
(MI-I) and introduced in this special issue.
Better usage of the wealth of data can con-
tribute to validating the relevance of novel
diagnostic and therapeutic methods as well
as to directly improving care. Patient care
would benefit from the availability of
actionable, data-driven decision support
standardized across care delivering organ-
izations, e.g. to optimize diagnosis and
therapy in intensive care patients with lung
failure or antibiotic therapy of patients with
blood-stream infections.
Taking up the initiative of the German
Federal Ministry of Education and Re-
search, Leipzig University and University
Hospital Leipzig, Jena University Hospital
and Friedrich-Schiller-University Jena,
University Hospital RWTH Aachen and
RWTH Aachen University founded the
“Smart Medical Information Technology
for Healthcare (SMITH)” consortium in
order to better use the wealth of data for
the sake of patients. In addition, the follow-
ing research organizations and companies
joined the initial SMITH set-up: SAP SE,
Fraunhofer Institute for Software and Sys-
tems Engineering, Bayer AG, März Inter-
network Services AG, Averbis GmbH, ID
GmbH & Co. KGaA, Forschungszentrum
Jülich – Jülich Supercomputing Centre. In
a second consolidation phase, University
Hospitals Halle, Bonn, Hamburg and Essen
completed the SMITH consortium.
The SMITH consortium has particu-
larly set the following goals to be reached
until 2022:
•
to implement an overarching concept of
data sharing and data ownership;
•
to establish synchronized Data Inte-
gration Centers (DIC) at each SMITH
partner hospital. They will have two
major responsibilities: (1) implement
the interoperability processes and the
architecture of the SMITH transinstitu-
tional information system SMItHIS; (2)
provide data curation, data sharing and
trustee functions;
•
to implement three use cases in order to
demonstrate the usefulness of the DICs.
This paper’s objective is to give insight
into fundamental design issues underlying
SMITH. It will especially introduce the
governance of SMITH and its basic pol-
icies, explain the overall SMItHIS architec-
ture to be implemented and the methods
used, and will give more details of the use
cases to be implemented in order to show
the feasibility of the architecture.
2. Governance and Policies
The SMITH consortium’s governance
structure as well as the SMItHIS architec-
ture results from a strictly federated ap-
proach. Instead of one central DIC, local
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e94
DICs are coordinated by appropriate com-
mittees. These committees aim at integrat-
ing the interests and activities of patient
care, i.e. the member hospitals, of research,
i.e. the member medical faculties and non-
university research institutions, and of the
industrial partners. Therefore, leadership
as well as means for participation are care-
fully balanced.
In this governance system, the Super -
visory Board bears responsibility for the
SMITH consortium and the overall project.
This board nominates the delegates for the
National Steering Committee of the MI-I,
which coordinates the activities of all fund -
ed consortia in the MI-I. The Supervisory
Board appoints the Executive Board.
The Executive Board bears the scientific
responsibility for SMITH and steers the
project. It is supported by the External Ad-
visory Board whose members are selected
international experts in the field.
The Coordination Board represents
delegates from all sub-projects and from all
partners and coordinates their work.
Besides local management units at
each partner’s site, there is a consortium
management unit for SMITH located in
Leipzig. The consortium management unit
takes care of operations of SMITH as a
whole and is supported by the local man-
agement units. Operations include the
entire project management, controlling,
contracting, public relations and com-
munication.
Based on the governance structure,
functionally identical DICs will be set up in
Leipzig, Jena, Aachen and, subsequently, in
Halle, Bonn, Hamburg and Essen. DIC
staff will have the same job descriptions
and will use common standard operating
procedures. The DICs will adhere to the
following policies:
•
Local access to the HIS: The DICs have
to provide access to the data of the Elec-
tronic Medical Records (EMR) in the
local hospital information systems
(HIS). This access is subject to German
legislation on the use of patient data and
established data protection and privacy
regulations. Each DIC will analyze and
annotate patient data on an individual
basis within the hospital. Authorized
local DIC staff will have access to the
EMR in the locally operational HIS sys-
tem. This requires the DIC to be organ-
izationally linked to the hospital and
technically interoperable with its in-
formation system and with the clinical
documentation procedures (cf. Data
and Metadata Transfer Management
(DMT) in section 3.2.1).
•
Trustee and Consent for Research: Ger-
many has strict regulations concerning
data protection and privacy. Research
on patient data (with local amendments,
annotations etc.) is only possible if pa-
tient consent is provided. We anticipate
that multilevel consent will be provided.
This will range from consenting to data
being shared regarding anonymous or
pseudonymous healthcare data to ad -
ditional consent for specific research
projects. The DIC provides the relevant
services for the trustee center (cf. Data
Trustee and Privacy Management
(DTP) in section 3.2.1).
•
Health Data Storage (HDS) to provide
electronic health records (EHR): The
local HISs typically exhibit different IT
architectures and consist of numerous
specialized application components,
which are also currently being refined.
However, functionally identical Health
Data Storages will be established in the
participating institutions to maintain
standardized and interoperable EMRs
curated and amended by the DIC staff
(cf. SMItHIS architecture in section
3.2.2.1).
•
Organization: Overall management pol-
icies will be identical across the DICs.
However, we will have local sub-policies
on organizational embedding.
In order to further strengthen Medical In-
formatics, education in Health and Bio-
medical Informatics at different levels will
be driven by systematic, evidence-based
and outcome-oriented curriculum devel-
opment. This process is concerted by the
Joint Expertise Center for Teaching
(SMITH-JET) as part of the SMITH gov-
ernance structure.
3. Architectural Framework
and Methodology
3.1 Methodology
3.1.1 Open Archival Information
System (OAIS)
Our approach to data integration centers
and their organizational structure as well as
their tasks is inspired by the Reference
Model for an Open Archival Information
System (OAIS) [2]. This model provides a
framework, including terminology and
concepts, for describing and comparing
architectures and operations of archives
and thus for sharing their content. OAIS is
the most common standard for archival
organizations (ISO Standard 14721:2012).
OAIS conformant archives have to take
care of proper ingest of Submission In-
formation Packages (“raw data”) from
producers and their transformation to
Archival Information Packages (“metada-
ta-enriched data”), which are processed by
Data Management and Archival Storage.
Consumers order or query Dissemination
Information Packages (“exported data”)
which are processed by the archive. OAIS is
helpful in structuring tasks within the DIC
without prescribing implementation de-
tails. It is already used as guideline for data
and model sharing at the LIFE Research
Center for Civilization Diseases [3, 4, 5]
and the Leipzig Health Atlas project [6].
3.1.2 Three Layer Graph Based Meta
Model (3LGM²) for Enterprise
Architecture Modeling
In order to provide an integrated and con-
sistent view of the design of the entire sys-
tem of data processing and data exchange
within SMITH and beyond, we decided to
use an enterprise architecture modeling
approach. By using the three layer graph
based meta model 3LGM for modeling
health information systems [7], especially
transinstitutional information systems [8]
and, therefore, the entire information sys-
tem of SMITH with its local institutional
components can be described by concepts
on three layers [9]:
•
The domain layer describes an in-
formation system independent of its im-
plementation by the tasks it supports
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e95
(e.g. “Ingesting Data and Knowledge in
DIC Health Data Storage”). Tasks need
certain data and provide data for other
processes. In 3LGM models, these data
are represented as entity types.
•
The logical tool layer focusses on appli-
cation components supporting tasks (e.g.
“Health Data Storage (HDS)”, “Data In-
tegration Engine”). Application compo-
nents are responsible for the processing,
storage, and transfer of data. Computer-
based application components are in-
stalled software. Interfaces ensure the
communication among application
components and, therefore, enable their
integration.
•
The physical tool layer consists of physi-
cal data processing systems (e.g. person-
al computers, servers, switches, routers,
etc.), which are connected in a network.
This approach for structuring and describ-
ing information systems turned out to be
an appropriate basis for not only describing
health information systems at the interface
between patient care and research [10] but
also for assessing their quality [11].
3LGM model data for both the domain
and the logical tool layer have been col-
lected at several workshops and interviews
from all partners of the SMITH consor-
tium. In the first step, we collected tasks
and entity types (i.e. the data) needed or
produced by the tasks in workshops with
experts in the domain of data sharing in
and between health care and medical re-
search. In the second step, the workshops
were held with the CIOs and their experts
of health information systems architectures
and standards-based communication. The
application components found and their
interfaces and the communication links be-
tween them were connected to the tasks
and to the entity types determined by the
domain experts. Thus, we developed a
blueprint for the transinstitutional SMITH
information system, which integrates the
domain layer view of tasks and entity types
used with a tool layer view of applications,
services and communication. This was the
basis for the SMITH DIC reference archi-
tecture design explained in more depth in
3.2.2.1. The continuously updated 3LGM
model will be used during the entire pro-
ject as an evolving blueprint for the devel-
opment process.
3.1.3 Integrating the Healthcare
Enterprise (IHE)
Our goal is to share medical and research
data not only inside the SMITH consor-
tium but also with a stepwise growing
number of partners in the near future.
Therefore, we decided to design the logical
tool layer of SMITH’s information system
strictly based on communication and stor-
age standards as far as possible. Especially,
we decided to implement IHE integration
profiles, which describe the precise and
“coordinated implementation of communi-
cation standards, such as DICOM, HL7
W3C and security standards” [12]
(DICOM: Digital Imaging and Communi-
cations in Medicine; W3C: World Wide
Web Consortium). IHE profiles ensure pro-
cessual interoperability, since they define
how application systems in a certain role
(described as actor, e.g. “document con-
sumer”, “document provider”, “document
repository”) can communicate with other
application systems through certain trans-
actions.
In particular, the following IHE inte-
gration profiles have been used for design-
ing the SMITH DIC reference architecture
[12]:
•
Patient Identifier Cross-referencing (PIX)
profile for pseudonym management
across all SMITH sites.
•
Patient Demographics Query (PDQ)
profile for querying demographic data
from any patient management system of
a SMITH partner hospital.
•
Cross-Enterprise Document Sharing
(XDS) profile for sharing documents
within an affinity domain, which us -
ually covers (parts of) a SMITH partner
hospital.
•
Cross-Community Access (XCA) profile
for sharing documents across affinity
domains, e.g. between SMITH partner
hospitals.
•
Basic Patient Privacy Consents (BPPC)
and Advanced Patient Privacy Consents
(APPC) profiles for recording patients’
privacy consents in a way that they can
be used for controlling access to docu-
ments via XDS and XCA.
•
The Cross-Enterprise User Assertion
(XUA) profile for checking and con-
firming the identity of persons or sys-
tems trying to access data.
•
The Audit Trail and Node Authenti-
cation (ATNA) profile for providing
data integrity and confidentiality even
in case local interim copies of confiden-
tial data have to be allowed for safety
reasons.
•
The Cross-Enterprise Document Work-
flow (XDW) profile to define and use
workflows upon document sharing pro-
files like XDS.
•
The Cross-Community Patient Discovery
(XCPD) profile to find patient data
across the SMITH partner hospitals.
•
The Cross-Community Fetch (XCF) pro-
file: like XCA, this profile accounts for
sharing documents, but in a simplified
way. In SMITH, we also use this profile
in a modified way to transport FHIR
queries and resources.
•
The Personnel White Pages (PWP) inte-
gration profile for managing user con-
tact information.
•
The Healthcare Provider Directory
(HPD) profile for managing healthcare
provider information, including roles
and access rights.
These profiles have been adapted to the
planned architecture at the logical tool
layer of the SMITH information system.
Using 3LGM, we mapped IHE profile
actors to the planned application compo-
nents and used the transactions as tem-
plates for defining the applications’ com-
munication interfaces. This endeavor was
supported by predefined templates and
corresponding workflows which have been
described in more detail in [13].
3.1.4 Communication Standards HL7
CDA and HL7 FHIR
IHE profiles, especially XDS und XCA,
support the shared use of medical docu-
ments. For this purpose, the HL7 Clinical
Document Architecture (CDA) provides
specifications for various types of docu-
ments typically used in clinical care. In this
XML-based format, the CDA level de-
scribes a degree of structuring in the XML
body, ranging from level 1 (containing only
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e96
textual information) via level 2 (coded sec-
tions) to level 3 (structured entries).
Since we do not only want to share
documents but discrete individual data as
well, we took advantage of recent develop-
ments of IHE profiles using Fast Healthcare
Interoperability Resources (FHIR) [14].
FHIR defined “resources” can be used to
arrange patient data, e.g. about allergies
and intolerances, family member histories,
medication requests and provide them for
access by remote application systems.
Modern web-based application program-
ming interface (API) technology, especially
the RESTful protocol, can be used to access
this data using certain operations, which
are defined as “interactions” by FHIR.
Since entries contained in CDA docu-
ments can be linked to FHIR resources and
several FHIR resources together can be
combined in a document, both formats are
suitable to facilitate syntactic interoperabil-
ity for retrieval and exchange of clinical
care data.
3.1.5 Medical Terminologies
For a transinstitutional use of medical and
research data, these data must include or
refer to machine-processable descriptions
of their content. International medical ter-
minologies like SNOMED CT offer code
systems and define relations between sub-
ject entities for an unambiguous specifi-
cation of certain medical information. We
will set up terminology services to provide
detailed representations of conceptual en-
tities. This includes browsing terminol-
ogies, displaying concepts, facetted search-
ing in terminologies and exporting. We will
import standard terminologies for coded
medical data like the International Classifi-
cation of Diseases (ICD) and the Logical
Observation Identifiers Names and Codes
(LOINC); terminologies and ontologies
used for text mining and phenotyping, like
Medical Subject Headings (MeSH), ICD,
LOINC or Human Phenotype Ontology
(HPO); furthermore, local vocabularies
will be used in a DIC (e.g. for medication).
In addition, new vocabularies and concepts
can be created and mapped to standardized
terminologies to facilitate the use of core
data sets. These terminology services will
be based on the Common Terminology
Services 2 (CTS2) data model.
Both HL7 CDA and HL7 FHIR refer to
the use of medical terminologies for the en-
coding of clinical concepts, thus enabling
semantic interoperability of shared data.
A link is created from process-related,
technical and semantic interoperability
with IHE, HL7 (procedure description and
definition), with HL7 CDA, FHIR, Clinical
Quality Language (CQL), etc. (definition of
protocols and file formats) and LOINC,
SNOMED-CT, etc. (Information Models).
3.1.6 Industrial Data Space (IDS)
SMITH wants to go beyond mere data ag-
gregation and therefore will develop the
SMITH Market Place (SMP) for devising
agreements on data access and distribution
(see 4.2.2). The architecture of the SMP is
based on the Industrial Data Space (IDS)
project, led by Fraunhofer ISST. IDS is a
virtual data space using standards and
common governance models to facilitate
the secure exchange and easy linkage of
data in business ecosystems. It thereby pro-
vides the basis for creating and using smart
services and innovative business processes,
while at the same time ensuring digital
sovereignty of data owners, decentral data
management, data economy, value cre-
ation, easy linkage of data, trust and secure
data supply chains and, finally, data gov-
ernance [15].
To facilitate the above-mentioned char-
acteristics, three key architectural compo-
nents were defined within the IDS. The
connector allows the secure interconnec-
tion with data sources and execution of
apps. To mediate between different con-
nectors, the broker allows the semantical
description of data sources as well as apps.
The last component is the app store, which
manages certified apps. With respect to the
SMP, the already defined concepts should
be used to enable an all-encompassing data
market place for medical informatics, pay-
ing attention to specific requirements like
regulatory affairs (e.g. consenting, data
protection and privacy) and existing data
exchange and integration standards (see
3.1.3 and 3.1.4).
The concepts of the connector as well as
the broker, will be reflected in the concep-
tualization process of the SMP. The con-
nector will be a secure and trusted soft-
ware-based endpoint for external users to
enable secure handshaking and contracting
between external participants and the SMP.
Furthermore, the connector has to be used
to initiate the onboarding process. The
broker will serve as an interface to the
metadata repository for data provisioning
and services. Like a directory, the broker
will allow external as well as internal users
to query a metadata repository in order to
fulfill a customer-specific demand.
3.2 Architectural Framework of
SMItHIS
SMITH intends to build a medical network
of its partner hospitals, medical faculties
and research institutions. This network is
based on SMItHIS, the network’s trans -
institutional Health Information System
(tHIS) [8, 16]. SMItHIS connects the in-
formation systems of the partner insti -
tutions, including their Data Integration
Centers (DIC) and local clinical and re-
search application components. SMItHIS
integrates them by added components. In
this section, we introduce the architectural
framework for SMItHIS, which will guide
the entire development process.
In order to provide a holistic and con-
sistent view of the design of the entire sys-
tem of data processing and data exchange
within SMITH and beyond, we decided to
use the Enterprise Architecture Modeling
approach 3LGM. Using 3LGM, the en-
tire transinstitutional information system
SMItHIS with its local institutional com -
ponents can be described by concepts on
three layers. Since there are no special sol-
utions on the physical tool layer, we here
focus on the domain and logical tool layers.
SMITH has designed a generic concept
for its data integration centers. They share
identical services and functionalities to
take best advantage of the interoperability
architectures and of the data use and access
process planned. In the following section
3.2.1, we describe the main tasks of a DIC
in SMITH; they are common for the DICs
of all partners. In section 3.2.2 we describe
the application components and their in-
terfaces and communication links. We will
do this first by a generic reference model
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e97
Figure 1
High-level tasks (rec-
tangles) and entity
types (ovals) of
SMItHIS.
Arrows pointing from
a task T to an entity
type (ET): ET is up-
dated by T, other di-
rection: ET is used by
T. Dotted lines indi-
cate availability of
more detailed refine-
ments. Rectangles in-
side other rectangles
indicate part-of re-
lations [1].
and show later how we will assemble and
integrate the local instances of the refer-
ence model.
3.2.1 SMItHIS Domain Layer:
Tasks of DIC
The domain layer describes which tasks
have to be performed in SMITH and what
data are used for these tasks.
▶
Figure 1
gives a high-level overview of the tasks and
entity types (data) relevant for SMItHIS
from the DIC perspective (dotted lines
indicate availability of more detailed re-
finements that can be found in the figure of
the
▶
Online Appendix).
Each DIC operates a Health Data Storage
(HDS) containing a local EHR (electronic
health record) [17, 18] covering both data
in the local EMR (electronic medical rec-
ord) [18] ingested from a local partner
University Hospital (UH) and data in-
gested from other sources. In addition, the
HDS contains knowledge, i.e. rules and
methods on how to nourish the ingested
data. Health Data Storage Operations in
DIC cover ingesting data and knowledge,
as well as data nourishing.
•
Ingesting Data and Knowledge: Con-
siderable parts of local EMR data are
still unstructured data in reports, find-
ings or discharge letters. However,
structured, i.e. discrete individual data
is needed. Taking unstructured patient
data under DIC supervision calls for in-
corporating natural language processing
(NLP) tools. Text-mining algorithms
extract information from narrative texts
and render it as structured data [19].
The needed algorithms will be devel-
oped within the so-called Research &
Development Factory. By this term, we
summarize the research and develop-
ment projects using the shared data and
deriving rules and methods, i.e. knowl-
edge, out of the data. Within the factory,
the methodological use case “Phenotype
pipeline” (PheP) will e.g. develop phe-
notyping rules (see 4.1). A DIC will as
well be able to ingest data from insur-
ance companies (record linkage) and
from patients themselves (e.g. patient
reported outcomes) by providing speci -
fic services in a Patient Portal. Ingested
patient data from EMR in a certain uni-
versity hospital and from shared re-
sources are stored as a local EHR in the
HDS. Knowledge will be ingested to the
local HDS as well. Note that ingesting
data into the HDS does not necessarily
mean that data has to be copied. Rather,
links to the data in their sources may be
used.
•
Data Nourishing deals with adding value
to the ingested patient data. Ingesting
comprises metadata management, data
curation and phenotyping.
Metadata management provides and
maintains a local metadata repository,
which describes data elements and their
semantics. It will conform to ISO/IEC
norm 11179, which is widely used in the
health care domain. As part of data and
metadata transfer management, subsets
of this catalog may be shared. Addition-
ally, semantic information out of the
local catalog of metadata enrich patient
data in the local EHR.
Curation and applying the phenotyping
rules to data in the local EHR is sup-
ported by a Rules-Engine executing
rules, which are taken from the pre-
viously ingested knowledge. The DIC is
responsible for proper operation of the
Rules-Engine and thus of automatically
executing the rules in daily routine.
Based on the automatic execution of
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e98
rules for data curation, data of patients
are checked for plausibility, consistency,
redundancy, etc. Curated and semanti-
cally annotated EHR data is amended
with computed phenotype tags in the
course of the phenotyping pipeline (see
▶
Figure 4). The computed phenotype
tags are used, e.g. by computerized deci-
sion support systems during patient
care in the hospitals (cf. 4.2 and 4.3).
Nourished EHR-local data will be used
for patient care in the university hospi-
tal. Thus, the innovation cycle from
bedside to bench and back from bench
to bedside [20] is closed.
Tagging rules derived from phenotyping
are developed and provided especially by
the use cases ASIC and HELP in the Re-
search & Development Factory (cf. 4.2 and
4.3). They use shared EHR data as input,
which is provided by the Data and Meta -
data Transfer Management units (DMT) in
the DICs of the partner university hospit-
als. The DMT acts under control of the
Data Use and Access committee (DUA).
Besides distribution of assets like data and
knowledge, consulting for research groups
is an important task of DMT. Depending
on privacy regulations and patients’ con-
sents, the Data Trustee and Privacy Man-
agement unit (DTP) will ensure pseudo-
nymization prior to data transfer.
3.2.2 SMItHIS Logical Tool Layer:
Application Components and
Services Supporting the DIC Tasks
The SMItHIS logical tool layer consists of
application components and their inter-
faces and communication links.
At each partner site, application compo-
nents are needed to support the local DIC,
i.e. supporting the execution of tasks as
well as storing and communicating data
(entity types) as discussed in the previous
section. Data and knowledge sharing be-
tween sites and between patient care and
Research & Development Factory have to
be enabled by appropriate communication
links and dedicated components. Com-
munication for data and knowledge shar-
ing at the SMItHIS logical tool layer uses
the IDS and is standards-based, if possible,
using especially IHE profiles, CDA, FHIR
and medical terminologies to ensure pro-
cessual, syntactic and semantic interoper-
ability, respectively (cf. section 3).
The architecture of the local system of
application components and their com-
munication links at each site, i.e. each of
the local sub-information systems of
SMItHIS, follows the DIC reference archi-
tecture. Using IHE profiles and the IDS
based SMITH Market Place, the local sub-
information systems are integrated in order
to build SMItHIS as a whole. Still, SMItHIS
allows for local peculiarities by applying
the DIC reference architecture locally.
3.2.2.1 DIC Reference Architecture
As explained before, a DIC has to ingest
data from various Data Sources, i.e. differ-
ent application components of a local HIS.
We classify communications between ap-
plication components into three categories,
A, B, and C, according to their interface
type “if-type” (see legend in
▶
Figure 2).
Sources of type A are already designed ac-
cording to common coding schemes and
nomenclatures and they transfer data ac-
cording to processual standards, using e.g.
IHE profiles. Type B sources export data in
standard formats, such as HL7, DICOM
etc., but overarching standardization, pre-
venting variations in a technical standard
or semantically coded metadata for a uni-
fied data exchange is missing, and these
have to be added in a transformation step.
Finally, type C sources are proprietary, such
as data provided by comma-separated
(CSV) files. While data transformations
usually are not necessary for type A
sources, they need to be specified for type
B and C sources. Such data transform -
ations consist of data type conversions,
transformations/calculations into a stan-
dard unit of measurement (e.g. weight in
kg), and additions or replacements of the
codes and labels of categorical data in ac-
cordance with a standard terminology or
value set (coding scheme).
The Data Integration Engine will execute
all kinds of data transformation and load
processes from sources into the Health
Data Storage (HDS). The HDS contains
both a component for storing HL7 FHIR
resources (Health Data Repository), provid-
ing data by RESTful interfaces [21], and an
IHE XDS Document Repository, comprising
clinical data in HL7 CDA documents [22].
Thus, the HDS is the central and harmon-
ized base for all user-specified queries, data
exchanges, reports, etc. Using the interface-
type scheme (A, B, and C), we integrate
data beyond department borders within a
single hospital, i.e., laboratory, data and
pre-analytic data are stored in the same
way as treatment data from medical docu-
mentation systems. Note, at this generic
stage we do not specify whether data of
certain sources are virtually referred by or
materially copied into the HDS.
The data catalogue contained in the
HDS is a superset of the National Core
Data Set for Health Care [23], a commu-
nity-developed set of data from different
health care domains (“modules”). It con-
sists of basic modules for representing
demographic data, encounters, diagnoses,
procedures and medications in a standard-
ized way and extended modules, e.g. for
diagnostics, imaging, biobanking, Omics
and ICU data. It will be utilized in inter-
consortia use cases for evaluating the level
of interoperability between the consortia
described in this issue.
Metadata curation and harmonization
services will define a process to harmonize
common metadata elements from each site
by creating descriptive metadata at both
document and data element level, includ-
ing semantic, technical and provenance
metadata. Varying coding schemes and vo-
cabularies will be semantically annotated,
mapped and harmonized by alignment. A
quality management process will be set up
to ensure better metadata quality at the
metadata entry stage by applying terminol-
ogy management and semantic data valid-
ity checks. Metadata management will
strictly follow the FAIR principles [24].
Having captured data that are semantically
enriched by metadata (cf. “nourishing” in
previous section) is a prerequisite for later
analyses. Such metadata are typically pro-
vided by type A sources. Type B and C
sources necessitate the capture of meaning-
ful names and descriptions on the data el-
ement level, i.e., there is a name and a
description for each column of a data table
or class of data files, if they contain data of
lowest granularity, such as images. Such
metadata are centrally managed at each site
by a Metadata Repository as one part of the
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e99
Figure 2 SMITH-DIC Reference Architecture.
Rounded rectangles represent application systems and services. Small rounded rectangles represent interfaces. Lines between interfaces represent communi-
cation links. Arrows indicate initiation of communication.
Metadata Services. Conversely, the XDS
Document Registry comprises another type
of metadata. It describes each CDA docu-
ment of the XDS Document Repository ac-
cording to the Dublin Core standard [25],
i.e., when and by whom the document has
been imported (provenance metadata).
Furthermore, just like in the Metadata Re-
pository, subsets of metadata stored in the
XDS Document Registry need to be shared
or unified, e.g. metadata on how docu-
ments are categorized into classes and
types. To support an overarching semantic
interoperability, unified metadata at the
conceptual level will be linked to inter-
nationally shared terminologies. National
and international initiatives, such as Clini-
cal Information Modeling Initiative
(CIMI), the National Metadata Repository
and the German value sets for XDS are
carefully observed.
There are different Data Analytic Ser-
vices within each DIC. Various analysis
routines will be made available for data in
the HDS. These routines are used, for
example, for filtering patients (or cases)
according to specific disease entities, such
as pre-diabetes, taking data of different
sources into account (examination data,
laboratory data, and genetic data, if avail-
able). In this way, essential tasks for clinical
research (hypothesis generation, patient re-
cruitment for clinical trials) as well as for
health care (quality indicators, hospital
controlling) are seamlessly supported at the
architectural level. Similarly, data will be
prepared for sharing by the Data & Meta-
data Transfer Management Service. Be-
tween all consortia, an agreement towards
the development of a common data model
has been reached, where data transfer be-
tween DICs should be based on. The model
will be based on published information
models and best practice examples [26, 27,
28]. The national core data set will be
mapped to an exchange model based on
the reference model. In this way, metadata
and data can be analyzed as to whether rel-
evant data and sufficient cases/patients are
available to set up an analysis project for an
intended medical hypothesis. Analysis rou-
tines running on patient data will be
executed in a privacy-preserving computing
environment [29, 30] without interaction
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e100
Figure 3 SMItHIS high-level architecture. Lines and rectangles have the same meaning as in ▶ Figure
2. Each grey shaded application system “DIC …” is an adapted instance of the generic DIC reference
model in ▶ Figure 2.
beyond the DIC; the results will then be
shared with specific applicants of analysis
projects. Large scale analysis routines util-
izing clinical data from multiple sites will
be executed in a distributed manner by
bringing algorithms to the data, e.g. by pro-
viding docker containers [31].
In addition to the Data Analytic Ser-
vices, there are also other application sys-
tems that interact with the HDS, such as
the Patient Portal to support patient em-
powerment, which is shown in
▶
Figure 2.
In addition to care-related services, pa-
tients shall be able to get detailed in-
formation on conducted clinical research
and data sharing and to “donate” their own
data to selected projects. The first step is to
provide a module showing which types of
consent a patient has given. This Consent
Creator Service also provides patients (or by
proxy medical personnel) with the oppor-
tunity to consent electronically. A second
step will provide information on the spe-
cific usage of their data in data analysis
projects. A third step will be to provide
a link to the EMR documents available.
Further web portals based on DIC services
are planned to provide additional support
for Data Trustee and Privacy Management
and Data and Metadata Transfer Manage-
ment functions.
Among others, the reference architec-
ture contains a Patient Identification and
Pseudonym Management Service, which
relies on IHE PIX and PDQ profiles to
manage patient-related identifiers inde-
pendent from clinical care or personal data.
All components are connected by the
Data Integration Engine. In this way, analy-
sis services can transparently access data of
source systems via the Health Data Storage.
The same holds for authenticated users
who have been granted data access. Both –
named users and automatically running
analysis routines – extensively use available
metadata, which are managed by the Meta-
data Repository and the XDS Document
Registry.
All cross-community, i.e. transinstitu-
tional communication requires authenti-
cation, authorization and auditing. Access
control interoperability is crucial for a suc-
cessful and sustainable health inform -
ation exchange. The Access Control Services
(ACS) use several international standards
and frameworks (e.g. IHE profiles
A/BPPC, XUA and ATNA, c.f. section
3.1.3) to fulfill access control workflows
and to ensure proper auditing. The ACS
Facade (Data Sharing Services Interface)
handles all transinstitutional communi-
cation and encapsulates the associated se-
curity aspects based on the facade pattern
of software design. It provides all IHE-
based interfaces to access structured data
and unstructured documents and a FHIR-
based interface to access discrete data ob-
jects, and enforces the appropriate consents
and authorizations.
3.2.2.2 SMItHIS High-Level Architecture
Based on the DIC reference architecture,
the SMItHIS High-Level Architecture (see
▶
Figure 3) describes the services for a
connection of all DICs, other Medical In-
formatics Initiative consortia and external
partners. The SMITH consortium intro-
duces two additional architectural con-
cepts: Data Sharing Services and the
SMITH Market Place (SMP) (see
▶
Figure
2). The high-level architecture clearly dis-
tinguishes between access and contracting
services provided for researchers and DICs
through the SMP and Data Sharing Ser-
vices. While the SMP provides services for
contracting and granting access to data by
incorporating concepts from the Fraun-
hofer Industrial Data Space (IDS), The
Data Sharing Services provide functions
for started projects and connect all DIC
ACS Facades for access control and stan-
dardized data sharing based on IHE pro-
files.
The SMP will be a central contracting
endpoint for internal as well as external
data consumers to provide data and knowl-
edge to all interested researchers. As such,
it will enable researchers to identify rel-
evant data sets by providing a GUI to cre-
ate and execute feasibility queries based on
the HL7 CQL standard for queries to clini-
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e101
cal knowledge. Once a suitable dataset has
been identified, the SMP will support re-
searchers in creating data use and access
proposals. Furthermore, the SMP will sup-
port the Data Use and Access Committee
in the review and approval process. Finally,
when the proposal is approved, the SMP
will activate the required access policies,
enabling data transfer to the applicant. A
central aspect is to enforce a contract be-
tween the data user and the data provider.
Foreseeing a potential extension of the
SMP, the contracting mechanisms will be
implemented in a generic way, enabling the
reuse of the same principles when sharing
not only data, but also analytical services,
for example. Similarly, the architecture of
the SMP will be designed to enable the in-
tegration of third-party products. The SMP
does not store any data on its own, i.e. it
has no data storage functionality besides
the one required for processing the actual
contracts and access requests. The SMP
will define and provide interfaces and
facades to mediate between data use and
access requests and consortia-specific data
storage and access policies (interconnec-
tion to the Data Sharing Services).
The Data Sharing Services include an
overarching identity provider (IHE HPD)
for managing all participating identities
and their roles for clinical studies and
analysis projects. An Enterprise Master Pa-
tient Index (EMPI, IHE PIX/PDQ) man-
ages the linkage of patient pseudonyms
across the DICs whereby no personal pa-
tient data is stored. Access to the EMPI is
protected by the Access Control System
(SAML / XACML). It utilizes security
tokens and policies (IHE XUA) to secure
communication with the central compo-
nents. The consent registry (IHE XDS.b,
APPC) for providing information about
consent policy documents is also included.
This registry is connected to the Consent
Creator Services, i.e. local Patient Portals
for the creation of such consents based on
patients’ informed permissions to use their
data. It can be queried by a Consent Con-
sumer, e.g. the Data and Metadata Transfer
Management Service, via IHE XCA Query
and then further processed to check, for
example, whether certain data can be
shared and used for a specified purpose.
Terminology services manage and provide
codes and concepts on a consortium level.
These can also be used to connect to ex -
ternal terminology services, e.g. at the
national level, or to connect to other third-
party services, such as existing public key
infrastructures or external identity pro-
viders. Thus, the Data Sharing Services
comprise functions for a federation of stan-
dardized data sharing and access control
which can be cascaded from the DICs up to
a national level.
In summary, the SMITH Market Place
and the Data Sharing Services connect the
individual sites and provide the functional-
ities and services to initiate projects, to
share medical knowledge, data and algo-
rithms and to support existing and new use
cases for all possible partners (further uni-
versity hospitals during the development
and networking phase as well as additional
healthcare institutions and network part -
ners in an elaborated roll-out process) in
the Medical Informatics Initiative.
4. Use Cases
In SMITH, we will implement one metho-
dological use case (PheP) and two clinical
use cases (ASIC and HELP). The use cases
shall gather evidence for the usefulness
and benefit of the DICs and their services
implemented on top of the SMItHIS archi-
tecture. The clinical and patient-relevant
impact of both clinical use cases will be
evaluated by stepped wedge study designs.
4.1 Methodological Use Case
PheP: Phenotype Pipeline
The SMITH consortium plans to develop
and implement a set of tools and algo-
rithms to systematically annotate and ana-
lyze patient-related phenotypes according
to classification rules and statistical or dy-
namic models. Using DIC services (c.f. sec-
tion 3.2.1,
▶
Figure 1) EHR data in the
HDS is annotated with computed pheno-
type tags. The annotations and derivatives
will be made available for triggering alerts
and actions, e.g. by study nurses in order to
acquire additional findings from certain
patients for data completion. The tags are
subject to sharing and will be used for ana-
lyses of patient care and outcomes. This set
of tools and algorithms constitutes the
“Phenotype pipeline” (PheP). PheP will
utilize data from the participating hospital’s
information systems. Besides structured
data, unstructured textual data from clini-
cal reports and the EMR will be incorpor-
ated and will be subject to natural language
processing (NLP). The technology will be
implemented in the DICs and first used to
support the clinical use cases HELP and
ASIC and in other research driven specific
data use projects.
▶
Figure 4 illustrates the basic tasks of
and entity types used in PheP. This figure
refines the generic DIC tasks as illustrated
in
▶
Figure 1:
•
Within the Research & Development
Factory, research groups may be estab-
lished to develop knowledge to be used
in the DIC. Especially phenotyping
rules for phenotype classifications and
NLP algorithms for information extrac-
tion from unstructured documents are
in focus. Following a strict division
between research and patient care, the
research groups in the Research & De-
velopment Factory will use pseudonym-
ized patient data provided for sharing
by the DMT unit of one or more DICs.
•
We use the term “phenotype” in a very
general sense, referring to a set of
attributes that can be attached to an
individual. We distinguish between ob-
servable and computable phenotypes,
defined as “a clinical condition, charac-
teristic, or set of clinical features that
can be determined solely from the data
in EMRs and ancillary data sources and
does not require chart review or inter-
pretation by a clinician” [32]. Phenotype
classification rules are based on classifi-
cation trees, statistical models or simu-
lation models. This will result in anno-
tations (called tags) of a broad spectrum
of attributes linked to a patient and his/
her pattern of care and outcome.
•
The main prerequisite for performing
phenotyping is the availability of struc-
tured data. Phenotype information will
be automatically extracted from un-
structured EMR entries and clinical
documents using NLP. For this, we plan
two building blocks: a clinical document
corpus (ClinDoC; for a preliminary ver-
sion, cf. [33]) and a collection of NLP
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e102
Figure 4 Detailed model of tasks and entity types of a DIC focusing on the phenotyping pipeline
PheP. Here, entity types are depicted as rounded rectangles. Rectangles and arrows have the same
meaning as in ▶ Figure 1.
components for processing German-
language clinical documents, the
SMITH Clinical Text Analytics Proces-
sor (ClinTAP; according to software en-
gineering standards outlined in [34],
cf. [19] for a preliminary version).
ClinDoC will serve as the backbone for
system performance evaluation and as a
training and development environment
for single NLP modules, which will be
integrated, after quality control, in the
ClinTAP information extraction (IE)
engine.
•
The Phenotyping Rules Engine (see
▶
Figure 2) automatically applies the
tagging rules for phenotyping and com-
putes the phenotype tags. As tools ma-
ture, rules and tags will be handed over
for routine use to the DICs.
•
Any DIC interested in using this knowl-
edge, i.e. the rules for NLP and pheno-
typing, will ingest it into its Health Data
Storage. Whereas structured data may
easily be ingested in the HDS, unstruc-
tured documents from a hospital’s EMR
have to be processed by information
extraction and text mining tools using
the NLP algorithms.
•
The Metadata Repository (MDR) will
provide access to medical terminologies
and code systems and thus supports
semantic interoperability. In the PheP
use case, we will enhance the MDR with
metadata from a large variety of health-
care datasets, epidemiological cohorts,
clinical trials and use cases. The MDR
will also provide a directory of data
items available projects in the Research
& Development Factory (catalog of
items).
4.2 Clinical Use Case ASIC:
Algorithmic Surveillance of
ICU Patients
Due to the epidemiological challenges in
Germany, demand for intensive care medi-
cine will increase over the next 10-15 years.
At the same time, there will be a shortage
of staff resources and so there is an urgent
need for interoperable and intelligent sol-
utions for using data to improve outcomes
and processes. The need for outcome im-
provement is especially urgent in patients
suffering from the acute respiratory dis-
tress syndrome (ARDS). Incidence of
ARDS worldwide remains high, with 10.4%
of total ICU admissions and 23.4% of all
patients requiring mechanical ventilation
[35]. The use case Algorithmic Surveillance
of ICU Patients (ASIC) will therefore in-
itially focus on ICU patients with ARDS.
By means of continuous analyses of data
from the Patient Data Management Sys-
tem (PDMS), a model-based “algorithmic
surveillance” of the state of critically ill
patients will be established. To predict indi-
vidual disease progression, ASIC will util-
ize pattern recognition technologies as well
as established mechanistic systems medi-
cine models, complemented by machine
learning, both integrated in a hybrid virtual
patient model [36]. Ultimately, the virtual
patient model will enable individual prog-
noses to support therapy decisions, clinical
trials and training of future clinicians.
Training the virtual patient model requires
high-performance computing.
The resulting ASIC system will be an
on-line rule-based computerized decision-
support system (CDSS). As such, it will ex-
tensively use the DIC services and espe -
cially the Phenotype Rules-Engine, which
applies the phenotyping rules developed in
PheP. Its aim is to accelerate correct and
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e103
sound diagnosis and increase guideline
compliance regarding ventilation. The
rules may be realized by explicit decision
trees, complex models, mechanistic or ma-
chine learning-based, or combinations of
both. Its main component is the Diagnostic
Expert Advisor (DEA). The development
utilizes the PheP efforts in the following
way (c.f.
▶
Figure 4 and
▶
Online Appen-
dix):
•
Usually PDMS data is well structured
and will be ingested to the DIC Health
Data Storage without NLP. Specific
NLP-processed data mining from other
sources (e.g. radiology and microbiol-
ogy reports) will complement the data.
The Data and Metadata Transfer Man-
agement (DMT) will provide pseudo-
nymized patient data taken from the
different partner HDS for training the
DEA.
•
The ASIC team contributes to the Re-
search & Development factory. It will
provide phenotyping rules for comput-
ing phenotype alert tags. Established
machine learning approaches (with pre -
ference given to hierarchical clustering,
random forest, Support Vector Ma-
chines, neural networks and Hidden
Markov Models) will be applied to the
training data collections to identify new
patient classification schemes. We will
also use the large data sets from all con-
sortium’s DICs to evaluate and utilize
the predictivity of deep learning for
time series.
•
The identified data patterns and models
will be expressed in a system of rules in
the Phenotype Rules-Engine, which will
be provided for ingesting into the
knowledge bases of all partnering DICs.
In the initial phase, this will be used for
research purposes, in the later phase the
ASIC-apps will be certified for care ac-
cording to the medical products regu-
lations.
•
The rule system will be used for pheno-
typing of new patients using the EHR
data in the HDS. The computed tags
will then be used as input for the ASIC
decision support system.
The utility of the ASIC approach will be
demonstrated in the clinical care setting
using a step wedge design. In this setting,
the ASIC app will provide the interface
between the DIC, the surveillance algo-
rithms and the medical professionals at the
bedside. Physicians will be automatically
informed if a priori specified limits are
exceeded. This alert function will be en-
hanced by smart latest action-based algo-
rithms, preventing medical professionals
from moot alerts (and, thus, alert fatigue).
4.3 Clinical Use Case HELP:
A Hospital-wide Electronic Medical
Record-based Computerized
Decision Support System to
Improve Outcomes of Patients
with Blood-stream Infections
Antimicrobial stewardship programs (ABS
programs) use a variety of methods to im-
prove patient care and outcomes. One of
the core strategies of ABS programs is
prospective audit and feedback, which is
labor-intensive and costly. When an infec-
tious diseases specialist is involved in a
patient’s care and the physician in charge
follows his/her recommendations, patients
are more often correctly diagnosed, have
shorter lengths of stay, receive more appro-
priate therapies, have fewer complications,
and may use fewer antibiotics, overall.
However, physicians with infectious dis-
ease expertise are rare in Germany and
usually limited to a few university hospit-
als.
As an alternative, CDSS have been rec-
ommended by a recent German S3 Guide-
line on ABS programs [37]. CDSS may help
to establish the correct diagnosis, to choose
appropriate antimicrobial treatment, and
to balance optimal patient care with unde-
sirable aspects such as the development of
antibiotic resistance, adverse events, and
costs. However, the availability of CDSS for
ABS program implementation is currently
underdeveloped. HELP aims to develop
and implement a CDSS. We will first
focus on Staphylococcal bacteremia, i.e.
staphylococcal bloodstream infections,
since Staphylococci are the most frequent
pathogens detected in blood cultures
(BCs). Furthermore, a recent meta-analysis
has shown that the particularly high mor-
tality rate of staphylococcus aureus bac-
teremia can be reduced by 47% when an
infectious diseases specialist is involved.
This reduction is achieved by increased ad-
herence to an evidence-based bundle of
care which will be implemented into the
HELP CDSS [38].
Development and use of the HELP-
CDSS will use SMITH’s Phenotype Pipe-
line PheP similarly like ASIC (see
▶
Figure
4, and
▶
Online Appendix):
•
The HELP-CDSS requires the inte-
gration of structured clinical data and
unstructured clinical reports from the
partnering hospitals’ EHRs. Previous
work (e.g. [39]) has indicated a potential
benefit of such an integration in re-
search fields related to our HELP objec-
tive. Again, structured data will be in-
gested based on technical standards,
whereas NLP algorithms will be used to
ingest the reports into HDS, and the
structured information derived there-
from.
•
The DMT will provide and share pseu-
donymized patient data for developing
the HELP-CDSS.
•
As above, the HELP team is part of
the Research & Development Factory
and will develop phenotyping rules. In
HELP, these rules are to classify patients
according to their need of antibiotic
stewardship. The proposed CDSS algo-
rithm consists of several decision levels
for which i) no human support is re -
quired, ii) all required information are
or will become digitally available and iii)
an automated directive/management
proposal can be reported immediately
to the treating physician.
•
These rules will be used for phenotyp-
ing of new patients using the EHR data
in the HDS. The Rules-Engine, there-
fore, monitors patient data in real-time,
phenotypes the patients applying the
rules, which represent the generalized
and derived guidelines and models. The
HELP CDSS uses the computed pheno-
type tags and issues an alert if a poten-
tial candidate for action is identified.
Feedback to and support of the health care
professionals will be based on the HELP
App, which will be based on a generic
SMITH app – like the ASIC app. The goal
of the HELP app is to provide immediate
alerts/directives to the treating physician
along the outlined algorithms.
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018
© Schattauer 2018 Methods Inf Med Open 01/2018
e104
5. Discussion
This paper outlines an ambitious program.
It is ambitious not only with regard to the
standards-based SMItHIS architecture, but
with regard to the use cases as well. We are
confident of being successful in the end be-
cause
•
SMITH is not only an academic consor-
tium. Moreover, the CIOs of the part-
nering hospitals and their information
management departments are actively
involved. Hence, we will be able to con-
nect bench and bedside in both direc-
tions [20].
•
A unique public-private partnership of
complementary partners and the incor-
poration of experienced companies in
the field of interoperable transinstitu-
tional information systems and in the
field of analyzing medical data (struc-
tured as well as unstructured) efficiently
will effectively help implementing the
standards-based architecture.
However, we are aware of the challenges
and risks of such an endeavor, e.g.
•
The DICs in general and the intended
use cases require high quality clinical
documentation. Although we will im-
plement NLP methods to analyze un-
structured documents, more and more
structured documentation will be
needed. Therefore, the sustainability of
DICs depends on the ongoing support
by the hospital’s management and, most
importantly, by their health care pro -
fessionals. This support can only be ob-
tained when healthcare and patients, in
particular, will benefit from SMItHIS
and thus promote the use of clinical
data for its services.
•
The rights of citizens and patients for
informational self-determination and
the legal regulations for data protection
and privacy may come into conflict with
important medical research goals in
SMITH. The German Ethics Council
recognized this potential for conflict
and recommends legal regulations
based on the principle of data sover-
eignty [40]. In SMITH, we will act
closely to this recommendation es-
pecially by integrating a patient portal.
The patient portal shall give individuals
information on the usage of their data
and shall provide the opportunity for
data donation.
References
1. Federal Ministry of Education and Research, Ger-
many. Medical Informatics Funding Scheme: Net-
working data – improving health care. Berlin, Ger-
many; 2015. Available from: https://www.bmbf.de/
pub/Medical_Informatics_Funding_Scheme.pdf.
2. CCSDS Secretariat. Reference Model for an Open
Archival Information System (OAIS): Magenta
Book: NASA Headquarters; 2012. Available from:
http://public.ccsds.org/publications/archive/
650x0m2.pdf.
3. Kirsten T, Kiel A, Wagner J, Rühle M, Löffler M.
Selecting, Packaging, and Granting Access for
Sharing Study Data. In: Eibl M, Gaedke M, editors.
Informatik 2017 – Bände I-III: Tagung vom 25.-29.
September 2017 in Chemnitz. Bonn: Gesellschaft
für Informatik; 2017. p. 1381–1392 (GI-Edition
Proceedings; vol. 275).
4. Kirsten T, Kiel A, Rühle M, Wagner J. Metadata
Management for Data Integration in Medical
Sciences – Experiences from the LIFE Study. In:
Mitschang B, Ritter N, Schwarz H, Klettke M,
Thor A, Kopp O, et al., editors. BTW 2017: Daten-
banksysteme für Business, Technologie und Web
(Workshopband); Tagung vom 6. – 7.März 2017 in
Stuttgart. Bonn: Gesellschaft für Informatik; 2017.
p. 175–194 (GI-Edition – lecture notes in in-
formatics (LNI) Proceedings; volume P-266).
5. Loeffler M, Engel C, Ahnert P, Alfermann D, Are-
lin K, Baber R, et al. The LIFE-Adult-Study: Ob-
jectives and design of a population-based cohort
study with 10,000 deeply phenotyped adults in
Germany. BMC Public Health 2015; 15: 691.
6. Löffler M, Binder H, Kirsten T. Leipzig Health
Atlas; 2018 [cited 2018 Jan 30]. Available from:
https://www.health-atlas.de/en.
7. Winter A, Brigl B, Wendt T. Modeling Hospital In-
formation Systems (Part 1): The Revised Three-
Layer Graph-Based Meta Model 3LGM2. Methods
Inf Med 2003; 42(5): 544–551.
8. Winter A, Haux R, Ammenwerth E, Brigl B, Hell-
rung N, Jahn F. Health Information Systems –
Architectures and Strategies. London: Springer;
2011.
9. Staemmler M. Towards sustainable e-health net-
works: does modeling support efficient manage-
ment and operation? In: Kuhn KA, Warren JR,
Leong T-Y, editors. Proceedings of Medinfo 2007
(Part 1). Amsterdam: IOS Press; 2007. p. 53–57
(Stud Health Technol Inform; vol. 129).
10. Stäubert S, Winter A, Speer R, Loffler M. Design-
ing a Concept for an IT-Infrastructure for an Inte-
grated Research and Treatment Center. In: Safran
C, Marin H, Reti S, editors. MEDINFO 2010 Part-
nerships for Effective eHealth Solutions. Amster-
dam: IOS Press; 2010. p. 1319–1323 (Stud Health
Technol Inform; vol. 160).
11. Winter A, Takabayashi K, Jahn F, Kimura E, Engel-
brecht R, Haux R, et al. Quality Requirements for
Electronic Health Record Systems: A Japanese-
German Information Management Perspective.
Methods Inf Med 2017; 56(Open): e92–e104.
Available from: https://www.schattauer.de/index.
php?id=5236&mid=27796&L=1.
12. IHE International Inc. IHE Profiles; 2017 [cited
2018 Jan 4]. Available from: http://wiki.ihe.net/
index.php/Profiles#IHE_IT_Infrastructure_Pro-
files.
13. Stäubert S, Schaaf M, Jahn F, Brandner R, Winter
A. Modeling Interoperable Information Systems
with 3LGM and IHE. Methods Inf Med 2015;
54(5): 398–405.
14. HL7.org. Introducing HL7 FHIR; 2014 Jan 1 [cited
2018 Jan 4]. Available from: https://www.hl7.org/
fhir/summary.html.
15. Otto B, Jürjens J, Schon J, Auer S, Menz N, Wenzel
S et al. INDUSTRIAL DATA SPACE: Digitale Sou-
veränität über Daten; 2016. Available from: www.
industrialdataspace.org.
16. Juhr M, Haux R, Suzuki T, Takabayashi K. Over-
view of recent trans-institutional health network
projects in Japan and Germany. J Med Syst 2015;
39(5): 234.
17. International Organization for Standardization
(ISO). ISO 18308: 2011 Requirements for an elec-
tronic health record architecture; 2011 [cited 2014
Jan 23].
18. Garets D, Davis M. Electronic Medical Records vs.
Electronic Health Records: Yes, There Is a Differ-
ence. Chicago, IL: HIMSS Analytics; 2006. Avail-
able from: http://s3.amazonaws.com/rdcms-
himss/files/production/public/HIMSSorg/Con-
tent/files/WP_EMR_EHR.pdf.
19. Hellrich J, Matthies F, Faessler E, Hahn U. Sharing
models and tools for processing German clinical
texts. Stud Health Technol Inform 2015; 210:
734–738.
20. Marincola FM. Translational Medicine: A two-way
road. J Transl Med 2003; 1(1): 1.
21. FHIR Developer Introduction; 2015 [cited 2016
Apr 13]. Available from: http://hl7.org/fhir/over
view-dev.html#1.8.1.6.
22. Health Level Seven International. CDA® Release 2:
Health Level Seven International; 2016 [cited 2016
Feb 12]. Available from: http://www.hl7.org/im
plement/standards/product_brief.cfm?prod-
uct_id=7.
23. Redaktionsgruppe Kerndatensatz der AG Interop-
erabilität des Nationalen Steuerungsgremiums der
Medizininformatik-Initiative; 2017 Mar 10. Avail-
able from: http://www.medizininformatik-initi
ative.de/sites/default/files/inline-files/
MII_04_Kerndatensatz_1–0.pdf.
24. Wilkinson MD, Dumontier M, Aalbersberg IJJ,
Appleton G, Axton M, Baak A et al. The FAIR
Guiding Principles for scientific data management
and stewardship. Sci Data 2016; 3: 160018.
25. Darmoni SJ, Thirion B, Leroy JP, Douyère M. The
use of Dublin Core metadata in a structured health
resource guide on the internet. Bull Med Libr
Assoc 2001; 89(3): 297–301.
26. Meineke, Frank, Stäubert S, Löbe M, Winter A. A
Comprehensive Clinical Research Database based
on CDISC ODM and i2b2. Stud Health Technol
Inform 2014; 205: 1115–1119.
27. Murphy SN, Mendis M, Hackett K, Kuttan R, Pan
W, Phillips LC, et al. Architecture of the open-
source clinical research chart from Informatics for
Integrating Biology and the Bedside. AMIA Annu
Symp Proc 2007; 2007: 548–552.
A. Winter et al.: SMITH
German Med. Informatics Initiative
© Georg Thieme Verlag KG 2018 License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0)
Methods Inf Med Open 01/2018 © Schattauer 2018
e105
28. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser
V, Schuemie MJ, et al. Observational Health Data
Sciences and Informatics (OHDSI): Opportunities
for Observational Researchers. Stud Health Tech-
nol Inform 2015; 216: 574–578.
29. Kho AN, Cashy JP, Jackson KL, Pah AR, Goel S,
Boehnke J, et al. Design and implementation of a
privacy preserving electronic health record linkage
tool in Chicago. J Am Med Inform Assoc 2015;
22(5): 1072–1080.
30. Vatsalan D, Sehili Z, Christen P, Rahm E. Privacy-
Preserving Record Linkage for Big Data: Current
Approaches and Research Challenges. In: Zomaya
A, Sakr S, editors. Privacy-Preserving Record
Linkage for Big Data: Handbook of Big Data Tech-
nologies. Springer; 2017.
31. Löbe M, Ganslandt T, Lotzmann L, Mate S, Chris-
toph J, Baum B et al. Simplified Deployment of
Health Informatics Applications by Providing
Docker Images. Stud Health Technol Inform 2016;
228: 643–647.
32. Richesson RL, Smerek MM, Blake Cameron C. A
Framework to Support the Sharing and Reuse of
Computable Phenotype Definitions Across Health
Care Delivery and Clinical Research Applications.
EGEMS (Wash DC) 2016; 4(3): 1232.
33. Hahn U, Matthies F, Lohr C, Löffler M.
3000PA—Backbone for a national clinical refer-
ence corpus of German. MIE 2018. In: Ugon A,
Karlsson D, Klein GO, Moen A, editors. Building
Continents of Knowledge in Oceans of Data: The
future of co-created ehealth. Amsterdam: IOS
Press; 2018. p. 26–30.
34. Hahn U, Matthies F, Faessler E, Hellrich J. UIMA-
based JCoRe 2.0 goes GitHub and Maven Central:
State-of-the-art software resource engineering and
distribution of NLP pipelines. In: Calzolari N, edi-
tor. LREC 2016: [proceedings]. [S. l.: s. n.]; 2016. p.
2502–2509.
35. Henry KE, Hager DN, Pronovost PJ, Saria S. A tar-
geted real-time early warning score (TREWScore)
for septic shock. Sci Transl Med 2015; 7(299):
299ra122.
36. Wolkenhauer O, Auffray C, Brass O, Clairambault
J, Deutsch A, Drasdo D, et al. Enabling multiscale
modeling in systems medicine. Genome Med
2014; 6(3): 21.
37. With K de, Allerberger F, Amann S, Apfalter P,
Brodt H-R, Eckmanns T, et al. Strategies to en-
hance rational use of antibiotics in hospital: A
guideline by the German Society for Infectious
Diseases. Infection 2016; 44(3): 395–439.
38. Vogel M, Schmitz RPH, Hagel S, Pletz MW, Gagel-
mann N, Scherag A, et al. Infectious disease con-
sultation for Staphylococcus aureus bacteremia –
A systematic review and meta-analysis. J Infect
2016; 72(1): 19–28.
39. DeLisle S, South B, Anthony JA, Kalp E, Gundla-
pallli A, Curriero FC, et al. Combining free text
and structured electronic medical record entries to
detect acute respiratory infections. PLoS One
2010; 5(10): e13377.
40. Deutscher Ethikrat. Big Data und Gesundheit –
Datensouveränität als informationelle Freiheits-
gestaltung: Stellungnahme. Berlin; 2017 Nov 30.
Available from: http://www.ethikrat.org/dateien/
pdf/stellungnahme-big-data-und-gesundheit.pdf.
A. Winter et al.: SMITH
German Med. Informatics Initiative
License terms: CC-BY-NC-ND (https://creativecommons.org/licenses/by-nc-nd/4.0) © Georg Thieme Verlag KG 2018