Stefan Decker's research while affiliated with RWTH Aachen University and other places

Publications (466)

Thesis
Full-text available
The Internet of Production (IoP) is one of the clusters of excellence at RWTH Aachen University. Its goal is to enable a new way of data understanding by integrating semantics in real-time data related to the production system, including processes and users’ data. For achieving this, it is necessary to have semantic models (ontologies), i.e., a formal des...
Preprint
Full-text available
Deep neural networks (DNNs) have been shown to outperform traditional machine learning algorithms in a broad variety of application domains due to their effectiveness in modeling intricate problems and handling high-dimensional datasets. Many real-life datasets, however, are of increasingly high dimensionality, where a large number of features may...
Article
Recent heavy rainfall-induced flood events, for example in Germany, Australia and USA, have highlighted the relevance of countermeasures in saving human lives and preventing property damage. Newly introduced ML-based flood forecasting methods rely on high-intensity synthetic rainfall events due to the sparsity of their real counterpart. Such synthe...
Article
Full-text available
Service-oriented architectures (SOA) are becoming more widespread in the context of Industry 4.0, and their interface descriptions enable modular and scalable communication systems. Since syntactic checks such as data types are solved nowadays, the purpose of this work is to add semantic validation based on the idea of Semantic Web Services. This p...
Article
Full-text available
The constant upward movement of data-driven medicine as a valuable option to enhance daily clinical practice has brought new challenges for data analysts to get access to valuable but sensitive data due to privacy considerations. One solution to most of these challenges is Distributed Analytics (DA) infrastructures, which are technologies fosteri...
Article
Full-text available
Background: In recent years, data-driven medicine has gained increasing importance in terms of diagnosis, treatment, and research due to the exponential growth of health care data. However, data protection regulations prohibit data centralisation for analysis purposes because of potential privacy risks like the accidental disclosure of data to third...
Article
Full-text available
Artificial intelligence (AI) systems are increasingly used in health and personalized care. However, the adoption of data-driven approaches in many clinical settings has been hampered due to their inability to perform in a reliable and safe manner to leverage accurate and trustworthy diagnoses. A critical and challenging usage scenario for AI is ai...
Conference Paper
In past years, the average amount of time a human spends on digital devices increased remarkably. This change towards consumption of mostly digital contents is driven by the increasing availability of data and the accompanying digitization of knowledge. Yet, data availability in composite production is still up to this day either scarce or non-exis...
Preprint
Full-text available
Unlike traditional central training, federated learning (FL) improves the performance of the global model by sharing and aggregating local models rather than local data to protect the users' privacy. Although this training approach appears secure, some research has demonstrated that an attacker can still recover private data based on the shared gra...
Chapter
Shared vocabularies and ontologies are essential for many applications. Although standards and recommendations already cover many areas, adaptations are usually necessary to represent concrete use-cases properly. Domain experts are unfamiliar with ontology engineering, which creates special requirements for needed tool support. Simple sketch applic...
Conference Paper
Full-text available
Shared vocabularies and ontologies are essential for many applications. Although standards and recommendations already cover many areas, adaptations are usually necessary to represent concrete use-cases properly. Domain experts are unfamiliar with ontology engineering, which creates special requirements for needed tool support. Simple sketch applic...
Chapter
As Semantic Web Technologies are increasingly employed for the management of highly dynamic data resources, e.g., the Industrial Internet of Things, resource versioning, state synchronization and distributed data management infrastructures are gaining practical relevance. The HTTP Memento protocol has recently been discussed as a promising building...
Chapter
As decision-making is increasingly data-driven, trustworthiness and reliability of the underlying data, e.g., maintained in knowledge graphs or on the Web, are essential requirements for their usability in the industry. However, neither traditional solutions, such as paper-based data curation processes, nor state-of-the-art approaches, such as dist...
Chapter
Full-text available
Skin cancer has become the most common cancer type. Research has applied image processing and analysis tools to support and improve the diagnostic process. Conventional procedures usually centralise data from various data sources to a single location and execute the analysis tasks on central servers. However, centralisation of medical data does not o...
Article
Full-text available
In recent years, implementations enabling Distributed Analytics (DA) have gained considerable attention due to their ability to perform complex analysis tasks on decentralised data by bringing the analysis to the data. These concepts propose privacy-enhancing alternatives to data centralisation approaches, which have restricted applicability in cas...
Conference Paper
Full-text available
In recent years, microservice-based architectures have become the de-facto standard for cloud-native applications and enable modular and scalable systems. The lack of communication standards however complicates reliable information exchange. While syntactic checks like datatypes or ranges are mostly solved nowadays, semantic mismatches (e.g., diffe...
Preprint
Finding out the differences and commonalities between the knowledge of two parties is an important task. Such a comparison becomes necessary when one party wants to determine how much it is worth to acquire the knowledge of the second party, or similarly when two parties try to determine whether a collaboration could be beneficial. When these two...
Article
Full-text available
Osteoarthritis (OA) is a degenerative joint disease, which significantly affects middle-aged and elderly people. Although primarily identified via hyaline cartilage change based on medical images, technical bottlenecks like noise, artifacts, and modality impose an enormous challenge on high-precision, objective, and efficient early quantification o...
Chapter
Due to privacy protection, the conventional machine learning approaches, which upload all data to a central location, have become less feasible. Federated learning, a privacy-preserving distributed machine learning paradigm, has been proposed as a solution to comply with privacy requirements. By enabling multiple clients to collaboratively learn a s...
Preprint
Full-text available
In recent years we have seen significant advances in the technology used to both publish and consume Linked Data. However, in order to support the next generation of e-business applications on top of interlinked machine-readable data, suitable forms of access control need to be put in place. Although a number of access control models and frameworks h...
Conference Paper
Full-text available
RDFS and OWL ontologies simultaneously define naming, hierarchy, syntactical data structure, and axioms. This strong coupling complicates the reusability of both ontological concepts and annotated data, due to logical pitfalls in RDFS and OWL semantics. The differences between OWL axioms and integrity constraints used for validation are often not c...
Article
Full-text available
An accurate diagnosis and prognosis for cancer are specific to patients with particular cancer types and molecular traits, which need to be addressed carefully. The discovery of important biomarkers is becoming an important step toward understanding the molecular mechanisms of carcinogenesis in which genomics data and clinical outcomes need to be analy...
Article
Full-text available
Background: Sharing sensitive data across organizational boundaries is often significantly limited by legal and ethical restrictions. Regulations such as the EU General Data Protection Regulation (GDPR) impose strict requirements concerning the protection of personal and privacy-sensitive data. Therefore, new approaches, such as the Personal Health Trai...
Chapter
Wikidata is a free and open knowledge base which can be read and edited by both humans and machines. It acts as a central storage for the structured data of several Wikimedia projects. To improve the process of manually inserting new facts, the Wikidata platform features an association rule-based tool to recommend additional suitable properties. In...
Article
Full-text available
The study of genetic variants (GVs) can help find correlating population groups to identify cohorts that are predisposed to common diseases and explain differences in disease susceptibility and how patients react to drugs. ML algorithms are increasingly being applied to identify interacting GVs to understand their complex phenotypic traits. In this...
Article
Full-text available
In order to facilitate the accesses of general users to knowledge graphs, an increasing effort is being exerted to construct graph-structured queries of given natural language questions. At the core of the construction is to deduce the structure of the target query and determine the vertices/edges which constitute the query. Existing query construc...
Preprint
Full-text available
Amid the coronavirus disease (COVID-19) pandemic, humanity is experiencing a rapid increase in infection numbers across the world. A challenge hospitals face in the fight against the virus is the effective screening of incoming patients. One methodology is the assessment of chest radiography (CXR) images, which usually requires expert radiologi...
Article
Full-text available
Clustering is central to much data-driven bioinformatics research and serves as a powerful computational method. In particular, clustering helps in analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts and images. Further, clustering is used to gain insights into biological processes at the genomics level, e.g....
Article
In the production industry, the volume, variety, and velocity of data as well as the number of deployed protocols increase exponentially due to the influences of the Internet-of-Things (IoT) advances. While hundreds of isolated solutions exist to utilize these data, e.g., optimizing processes or monitoring machine conditions, the lack of a unified...
Article
Full-text available
In recent years, as newer technologies have evolved around the healthcare ecosystem, more and more data have been generated. Advanced analytics could power the data collected from numerous sources, both from healthcare institutions, or generated by individuals themselves via apps and devices, and lead to innovations in treatment and diagnosis of di...
Article
Full-text available
Cancer is one of the deadliest diseases caused by abnormal behaviors of genes that control the cell division and growth. Genomics data and clinical outcomes from multiplatform and heterogeneous sources are used to make clinical decisions for the cancer patients, where both multimodality and heterogeneity impose significant challenges to bioinformat...
Preprint
Full-text available
The discovery of important biomarkers is a significant step towards understanding the molecular mechanisms of carcinogenesis, enabling accurate diagnosis for, and prognosis of, a certain cancer type. Before recommending any diagnosis, genomics data such as gene expressions (GE) and clinical outcomes need to be analyzed. However, complex nature, high...
Preprint
In order to facilitate the accesses of general users to knowledge graphs, an increasing effort is being exerted to construct graph-structured queries of given natural language questions. At the core of the construction is to deduce the structure of the target query and determine the vertices/edges which constitute the query. Existing query construc...
Conference Paper
Interference between pharmacological substances can cause serious medical injuries. Correctly predicting so-called drug-drug interactions (DDI) does not only reduce these cases but can also result in a reduction of drug development cost. Presently, most drug-related knowledge is the result of clinical evaluations and post-marketing surveillance; re...
Article
Smart cities around the world have begun monitoring parking areas in order to estimate available parking spots and help drivers looking for parking. The current results are promising, indeed. However, existing approaches are limited by the high cost of sensors that need to be installed throughout the city in order to achieve an accurate estimation....
Article
Secondary use of electronic health record (EHR) data requires a detailed description of metadata, especially when data collection and data re-use are organizationally and technically far apart. This paper describes the concept of the SMITH consortium that includes conventions, processes, and tools for describing and managing metadata using common s...
Preprint
Full-text available
Smart cities around the world have begun monitoring parking areas in order to estimate available parking spots and help drivers looking for parking. The current results are promising, indeed. However, existing approaches are limited by the high cost of sensors that need to be installed throughout the city in order to achieve an accurate estimation....
Preprint
Full-text available
Interference between pharmacological substances can cause serious medical injuries. Correctly predicting so-called drug-drug interactions (DDI) does not only reduce these cases but can also result in a reduction of drug development cost. Presently, most drug-related knowledge is the result of clinical evaluations and post-marketing surveillance; re...
Chapter
A promising pathway for natural language question answering over knowledge graphs (KG-QA) is to translate natural language questions into graph-structured queries. During the translation, a vital process is to map entity/relation phrases of natural language questions to the vertices/edges of underlying knowledge graphs which can be used to construc...
Article
Full-text available
Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. "Smart Medical Information Technology for Healthcare (SMITH)" is one of four consortia funded by the German Medical Informatics Initiative (MI-I) to create an alliance of universities, university hospitals, rese...
Conference Paper
Several smart cities around the world have begun monitoring parking areas in order to estimate free spots and help drivers that are looking for parking. The current results are indeed promising; however, this approach is limited by the high costs of sensors that need to be installed throughout the city in order to achieve an accurate estimation rat...
Chapter
Full-text available
Blockchain technology is widely known as the technological basis on which bitcoin is built. This technology has created high expectations, as transactions of every kind are executed in a decentralized way, without the need of a trusted third-party. Blockchain real business applications are currently limited mostly to financial services but many R&D...
Preprint
Full-text available
The understanding of variations in genome sequences assists us in identifying people who are predisposed to common diseases, solving rare diseases, and finding the corresponding population group of the individuals from a larger population group. Although classical machine learning techniques allow researchers to identify groups (i.e. clusters) of r...
Conference Paper
This paper presents an approach to represent medication guidelines in a machine readable form for its use within a production Home Care environment. Part of this work was developed under the scope of the POLYCARE project. The POLYCARE project aims at developing a patient-centred integrated care environment, supported by ICT systems to improve the q...
Conference Paper
Managing privacy and understanding the handling of personal data has turned into a fundamental right, at least for Europeans, since May 25th with the coming into force of the General Data Protection Regulation. Yet, whereas many different tools by different vendors promise companies to guarantee their compliance with GDPR in terms of consent management...
Conference Paper
Full-text available
As early as the mid-sixties, motivated by the ever-growing body of scientific knowledge, scholars identified the need for data to be organised in a manner that is more intuitive for humans to digest. Additionally, they envisioned a future where intelligent systems would be able to make sense of vast amounts of data and alleviate humans from perfor...
Article
Mining maximal frequent patterns (MFPs) in transactional databases (TDBs) and dynamic data streams (DDSs) is substantially important for business intelligence. MFPs, as the smallest set of patterns, help to reveal customers’ purchase rules and support market basket analysis (MBA). Although numerous studies have been carried out in this area, most of them...
Article
Full-text available
The dielectric properties of biological tissues characterise the interaction of human tissues with electromagnetic (EM) fields. Accurate knowledge of the dielectric properties of tissues is vital in EM-based therapeutic and diagnostic techniques, and for assessing the safety of wireless devices. Despite the importance of these properties, the fiel...
Article
Full-text available
Background: Biomedical data, e.g., from knowledge bases and ontologies, is increasingly published following linked data principles, preferably as triple data according to the RDF standards. This is a necessary step towards unified access to biological data sets, but this still requires solutions to query multiple endpoints for their heterogeneous data...
Conference Paper
Recently, RDF and OWL have become the most common knowledge representation languages in use on the Web, propelled by the recommendation of the W3C. In this paper we examine an alternative way to represent knowledge based on Prototypes. This Prototype-based representation has different properties, which we argue to be more suitable for data sharing...
Conference Paper
Full-text available
We investigate the benefit of data integration and curation services for the current refugee crisis and propose an architecture to support development of innovative solutions. We focus on developing a multi-/cross-lingual semantic data curation pipeline enriched with natural language processing capabilities in order to (a) improve decision making...
Article
Full-text available
Irish Record Linkage 1864–1913 is a multi-disciplinary project that started in 2014 aiming to create a platform for analyzing events captured in historical birth, marriage, and death records by applying semantic technologies for annotating, storing, and inferring information from the data contained in those records. This enables researchers to, amo...
Article
In recent years we have seen significant advances in the technology used to both publish and consume structured data using the existing web infrastructure, commonly referred to as the Linked Data Web. However, in order to support the next generation of e-business applications on top of Linked Data, suitable forms of access control need to be put in...
Article
In recent years RDF and OWL have become the most common knowledge representation languages in use on the Web, propelled by the recommendation of the W3C. In this paper we present a practical implementation of a different kind of knowledge representation based on Prototypes. In detail, we present a concrete syntax easily and effectively parsable by...