Efficient genomic profiling of patients: the benefit of systems interoperability

Abstract

Genedata Profiler™ is a translational research software platform developed in collaboration with leading pharmaceutical companies to effectively process, manage, and analyze omic and phenotypic data to the highest standards of data quality and regulatory compliance. Genedata Profiler complements the knowledge management platform tranSMART by standardizing the processing and quality control of omics data, simplifying the publishing of data into the data warehouse, and adding sophisticated statistical analyses. In this case study we will demonstrate an end-to-end workflow to identify insulin response biomarkers utilizing the seamless, bidirectional interoperability of Genedata Profiler with tranSMART.
Improving the process of translational R&D
Achieving the vision of precision medicine is reliant
to a large extent on translational research activities.
Such activities require researchers to characterize
and profile patients using omic technologies in order
to understand their response to new therapies,
stratify patients for trials or search for new disease
. Genedata Profiler is an enterprise
software platform used by pharma and biopharma
companies to address these challenges and to
optimize the process of translational research.
In this case study, we will discuss how Genedata
Profiler complements existing in-house software
solutions such as tranSMART
to create a complete
infrastructure for performing translational research.
The open-source data warehouse tranSMART is used
by several pharmaceutical companies to manage multi-
omics data sets to be used for biomarker projects.
Often, tranSMART is used to merge ‘clinical data’ from
the highly regulated clinical study environment into
the less-regulated ‘discovery’ research environment,
where users have much more flexibility to add omics
data as well as other important research information.
By integrating Genedata Profiler with tranSMART,
we have enhanced and expanded the capabilities of
organizations to perform patient genomic profiling.
The result is a very powerful, user-friendly platform
for translational research.
Axel Schumacher (a), Mark A. Collins (b), Marc Flesch (a), Miles Fisher-Pollard (c) & Tamas Rujan (c)
(a) Genedata GmbH, Munich, Germany | (b) Genedata Inc, Lexington, USA | (c) Genedata AG, Basel, Switzerland
Keywords: tranSMART, quality control, data harmonization, translational research, precision medicine, patient privacy
Fig. 1: Left: Version-controlled workflows for repeatable execution. The workflow engines of Genedata Profiler allow organizations to build
and deploy standardized workflows throughout an organization, ensuring data quality and reproducibility. Example shows an RNA-Seq
pipeline that was ‘approved’ by the study manager. Right: The best-in-class, fully interactive genome browser incorporated into Genedata
Profiler facilitates visual inspection of raw data such as original short read alignments together with analysis to ensure result consistency.
Schumacher et al. Efficient genomic profiling of patients: the benefit of systems interoperability
Harmonizing raw omics data processing
& quality control
Data quality, and hence data curation, is critical
to drawing the right scientific conclusions.
Genedata Profiler complements tranSMART by
adding sophisticated data processing & curation
capabilities to harmonize and standardize data
processing workflows. Expert users can set up and
approve such workflows and make them available
to a larger user community (Fig. 1).
To illustrate the patient profiling capabilities of
Genedata Profiler and the value of the integration
with tranSMART, we sought to identify biomarkers
for insulin response in skeletal muscle. We applied
a best-practice workflow (Fig. 1, left) to process
NGS and microarray data from a public RNA-Seq
data set of human skeletal muscle myocytes
(n=12, 3 different treatment time points)
together with expression profiles from a different
set of muscle biopsies (n=36), which were analyzed
with Affymetrix microarrays.
Inspection of the comprehensive quality control
report (Fig. 2) automatically generated by the
workflow indicates that all omics data is of sufficient
quality to be utilized for statistical analysis, and is
ready to be published to tranSMART.
Simplifying data loading into tranSMART
tranSMART utilizes a highly complex Extract-
Transform-Load (ETL) infrastructure to allow
expert users to load data into tranSMART. Loading
omics and clinical sample annotations (metadata)
into tranSMART can therefore be time consuming
and expensive.
Genedata Profiler uses its own public APIs to load
data directly into tranSMART, allowing data to be
published with a click of a button in minutes rather
than hours (Fig. 3).
In addition, Genedata Profiler allows augmentation
of studies without the need to reload all data for
each study. Out-of-the-box integration with public
omics data sources such as GEO, ArrayExpress,
and various other resources allows researchers
to integrate data from various public repositories
easily into tranSMART. The easy access to a wide
variety of multi-omics data enables efficient data
discovery and data sharing.
Fig. 2: Automated quality control reporting. Configurable and dynamic quality reports in Genedata Profiler provide a rich set of
quality metrics on NGS reads.
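As an illustration of the kind of per-read quality metrics such a report might summarize, here is a minimal sketch in Python. It is not Genedata Profiler code: the function names, the chosen metrics, and the mean-quality threshold of 20 are assumptions made for this example. The only fixed convention used is the standard Phred+33 encoding of FASTQ quality strings.

```python
from statistics import mean


def phred_scores(quality_string: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string (Phred+33 by default) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def read_qc(quality_string: str, min_mean_quality: float = 20.0) -> dict:
    """Summarize one read. The metrics and the threshold are illustrative assumptions."""
    scores = phred_scores(quality_string)
    mean_quality = mean(scores)
    return {
        "length": len(scores),
        "mean_quality": mean_quality,
        # Fraction of bases with an error probability of at most 1 in 1000
        "fraction_q30": sum(score >= 30 for score in scores) / len(scores),
        "passes": mean_quality >= min_mean_quality,
    }


# Hypothetical three-base read: 'I' decodes to Q40, '!' to Q0, '5' to Q20
print(read_qc("I!5"))
```

A real report would aggregate such values over millions of reads and add further metrics (for example adapter content or duplication rate), which are outside the scope of this sketch.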
Fig. 3: Left: Saving high-dimensional data to tranSMART. The user selects a study from the data warehouse study tree together with other data
groups to be uploaded. The data is uploaded, processed and written to tranSMART, and the user is notified when the data is available. Right: Saving
clinical data to tranSMART. Additional clinical annotations can be selected from a list by dragging and dropping. Omic and clinical annotations
may be linked by choosing an appropriate field, e.g. subject ID.
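Linking omic and clinical annotations through a shared field amounts to a keyed join. The sketch below shows the idea in plain Python. It does not use the Genedata Profiler or tranSMART APIs: the record layout, the field name `subject_id`, and the function `link_by_subject` are hypothetical names chosen for this example.

```python
def link_by_subject(omics_rows, clinical_rows, key="subject_id"):
    """Join omics sample records to clinical records on a shared key.

    Returns the linked records plus the keys of omics records that have
    no clinical counterpart, so that gaps can be curated before loading.
    """
    clinical_by_subject = {row[key]: row for row in clinical_rows}
    linked, unmatched = [], []
    for row in omics_rows:
        clinical = clinical_by_subject.get(row[key])
        if clinical is None:
            unmatched.append(row[key])
        else:
            # Merge both records; omics fields win on a name clash
            linked.append({**clinical, **row})
    return linked, unmatched


# Hypothetical example records
omics = [
    {"subject_id": "S1", "sample": "muscle_biopsy_01"},
    {"subject_id": "S3", "sample": "muscle_biopsy_02"},
]
clinical = [{"subject_id": "S1", "treatment": "insulin"}]

linked, unmatched = link_by_subject(omics, clinical)
print(linked)
print(unmatched)
```

Reporting unmatched subjects explicitly, rather than dropping them silently, keeps the curation step auditable.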
Fig. 4: Left: Initiating data transfer from tranSMART to Genedata Profiler. After selecting high- and low-dimensional data nodes in
tranSMART, data is automatically transferred to the statistical module of Genedata Profiler. Middle: The Volcano Plot visualizer (top) displays
a scatter plot of markers in which P-values are plotted against n-fold change. The most significant markers (red) are highly expressed after
insulin treatment. A gene ontology Fisher’s Exact Test (bottom) indicates that these genes are involved in cellular responses to zinc ions, which
play a central role in glycemic control. Right: The Genedata Profiler platform was specifically tailored for the integration & interpretation of
experimental data in translational R&D, providing a wide range of data analyses through a rich statistical toolbox and intuitive visualizations.
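The volcano plot described in Fig. 4 places each marker according to its fold change and its P-value. The sketch below computes such coordinates under a common convention (log2 fold change on the x-axis, -log10 P-value on the y-axis). The axis scaling used in the figure and any cut-offs are not stated in this case study, so the convention, the thresholds, and the function names here are assumptions.

```python
import math


def volcano_point(mean_treated: float, mean_control: float, p_value: float):
    """Return (log2 fold change, -log10 P-value) for one marker."""
    log2_fold_change = math.log2(mean_treated / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fold_change, neg_log10_p


def is_highlighted(log2_fold_change: float, p_value: float,
                   min_abs_log2_fc: float = 1.0, max_p: float = 0.05) -> bool:
    """Flag markers that are both strongly changed and significant.

    The thresholds (at least 2-fold change, P <= 0.05) are illustrative only.
    """
    return abs(log2_fold_change) >= min_abs_log2_fc and p_value <= max_p


# Hypothetical marker: expression 8.0 after insulin vs 2.0 before, P = 0.001
x, y = volcano_point(8.0, 2.0, 0.001)
print(x, y, is_highlighted(x, 0.001))
```

In practice the P-values would usually be corrected for multiple testing before any threshold is applied.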
Identifying & reporting clinically relevant
Genedata Profiler makes tranSMART smarter
by adding sophisticated statistical methods not
available in tranSMART. It offers:
  • A rich statistical toolbox to perform a wide range
    of data analyses;
  • External algorithms as plugins;
  • Integration of data across technologies and
    studies from in-house and public data sources;
  • Sophisticated data visualization and reports.
Multi-omics approaches inherently increase the
already growing complexity of data in the life
sciences. Genedata Profiler provides scalability
beyond the capabilities of tranSMART so that huge
amounts of complex data can be analyzed within a
short time.
Using the statistical tools of Genedata Profiler
(Fig. 4), we were able to identify a novel biomarker
(a diabetes-associated gene), potentially
linking diabetic cardiomyopathy and baroreflex
dysfunction, which was not detected in the original
study by Coletta et al.
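The gene ontology analysis mentioned in Fig. 4 uses Fisher's Exact Test. For readers unfamiliar with it, the sketch below computes the one-sided enrichment P-value for a single ontology term directly from the hypergeometric distribution. It is a generic textbook formulation, not the platform's implementation, and the function name and the gene counts in the usage lines are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_selection: int, selection_size: int,
                       hits_in_background: int, background_size: int) -> float:
    """One-sided Fisher's exact test for over-representation of a term.

    Probability of drawing at least `hits_in_selection` annotated genes when
    `selection_size` genes are sampled without replacement from a background
    of `background_size` genes, `hits_in_background` of which carry the term.
    """
    max_hits = min(selection_size, hits_in_background)
    total = comb(background_size, selection_size)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, selection_size - k)
        for k in range(hits_in_selection, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 3 selected genes carry a term that
# annotates only 3 of the 10 genes in the background
print(enrichment_p_value(3, 3, 3, 10))
```

When many terms are tested, the resulting P-values need a multiple-testing correction, which this sketch leaves out.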
Making the data regulatory-compliant
As pharmaceutical companies increasingly leverage
confidential patient data such as medical records
and genetic information in translational research,
important regulations governing data security and
privacy must be respected. Recent high-profile
decisions, e.g. replacement of the EU-US Safe Harbor
agreement on transatlantic data exchange with a
new “Privacy Shield”, impose stronger compliance
requirements on organizations. Genedata Profiler
provides comprehensive capabilities to ensure
patient privacy and maintain the chain of custody of
data, goals that are core to regulatory compliance.
For example, in our biomarker case study, the
integration with tranSMART ensures that the same
access controls are used between Genedata Profiler
and tranSMART, safeguarding the insulin study
patient data and reducing exposure to regulatory
non-compliance risk.
The information infrastructure of Genedata Profiler
enables organizations to cope with the wide range
and volumes of omics data as well as the wide
variety of data consumers (e.g. bioinformaticians,
analysts, biologists, etc.).
Genedata Profiler™ is part of the Genedata portfolio of advanced software solutions that serve
the evolving needs of drug discovery, industrial biotechnology, and other life sciences.
Basel | Boston | Munich | San Francisco | Tokyo
© 2016 Genedata AG. All rights reserved. Genedata Profiler is a registered trademark of Genedata AG. All other product
and service names mentioned are the trademarks of their respective companies.
Strict user role management, audit trails, access
authorization, data federation, data lifecycle
and method management, and a comprehensive
reporting infrastructure are core components of
the software which enable:
  • Collaboration between different user roles
    throughout a global organization using
    comprehensive role-based access controls;
  • Integration, federation, and curation of the wide
    variety of omic, phenotypic and patient data from
    internal (e.g. tranSMART), external and public
    data sources;
  • Sharing of data, methods and results.
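To make the idea of study-level, role-based access control concrete, here is a minimal sketch. It does not reflect the actual permission model of Genedata Profiler or tranSMART: the role names, the permission names, and the `Study` class are all hypothetical and serve only to show how roles can gate actions per study.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the permissions each one grants
ROLE_PERMISSIONS = {
    "study_manager": {"read", "write", "approve_workflow", "manage_members"},
    "analyst": {"read", "write"},
    "viewer": {"read"},
}


@dataclass
class Study:
    name: str
    members: dict = field(default_factory=dict)  # user name -> role name

    def assign(self, user: str, role: str) -> None:
        """Add a user to the study with one of the known roles."""
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[user] = role

    def is_allowed(self, user: str, permission: str) -> bool:
        """Users who are not members of the study have no access at all."""
        role = self.members.get(user)
        return role is not None and permission in ROLE_PERMISSIONS[role]


study = Study("insulin_response")
study.assign("alice", "study_manager")
study.assign("bob", "viewer")
print(study.is_allowed("alice", "approve_workflow"))
print(study.is_allowed("bob", "write"))
print(study.is_allowed("carol", "read"))
```

Denying access by default to anyone who is not a study member is the design choice that matters most for patient data; an audit trail would additionally record each check.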
Genedata Profiler makes omic-based patient
profiling processes significantly more efficient. It
streamlines the whole data processing, analysis,
and management process, and reduces the time
it takes to perform biomarker studies. As we have
shown in this case study, Genedata Profiler and its
bidirectional integration with tranSMART can help
scientists gain new insights into omics data from
clinical studies, while reducing compliance risk in
their translational research.
1. Schumacher, A., Rujan, T. & Hoefkens, J. A collaborative approach
to develop a multi-omics data analytics platform for translational
research. Appl. Transl. Genomics 4–7 (2014). doi:10.1016/j.
2. Athey, B. D., Braxenthaler, M., Haas, M. & Guo, Y. tranSMART: An
Open Source and Community-Driven Informatics and Data Sharing
Platform for Clinical and Translational Research. AMIA Jt. Summits
Transl. Sci. Proc. AMIA Summit Transl. Sci. 2013, 6–8 (2013).
3. Väremo, L. et al. Proteome- and Transcriptome-Driven
Reconstruction of the Human Myocyte Metabolic Network and Its
Use for Identification of Markers for Diabetes. Cell Rep. 11, 921–933
4. Coletta, D. K. et al. Effect of acute physiological hyperinsulinemia
on gene expression in human skeletal muscle in vivo. Am. J. Physiol.
Endocrinol. Metab. 294, E910–E917 (2008).
Please cite this article as:
Schumacher, A., Collins, M., Fisher-Pollard, M. & Rujan,
T. (2016) Efficient genomic profiling of patients: the benefit of
systems interoperability, doi: 10.13140/RG.2.1.3093.4648
Fig. 5: Left: Study-centric, role-based web UI. The web interface facilitates collaboration, data, method and results sharing. The
underlying enterprise architecture provides integration/federation with HPC file systems, internal and external databases and public data
sources. Right: User role management in Genedata Profiler. The ‘Members’ dialog lists all the members of a selected study and the specific
roles that are associated with those members. Roles are used to define permissions and access control.