August 2024 · 3 Reads
July 2024 · 128 Reads · 1 Citation
The integration and management of heterogeneous data pose challenges across various applications, requiring scalable solutions to handle large volumes of data, maintain compatibility, and ensure security, privacy, and regulatory compliance. This position paper presents a federated data and service catalogue based on the Eclipse XFSC framework. It introduces enhancements such as individual application profiles, metadata service offerings, trusted management, interoperable data sharing, and ecosystem integration. These contributions address the critical requirements of federated dataspaces, providing a robust, secure, and scalable infrastructure for effective data integration and management across diverse applications.
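To make the notion of a catalogued service offering more concrete, the sketch below builds a simplified, JSON-LD-flavoured metadata record in Python and checks that mandatory fields are present before publication. The field names, the validation rule, and the endpoint are illustrative assumptions and do not reflect the actual Eclipse XFSC schema.

```python
import json

# Hypothetical, simplified metadata record for a service offering in a
# federated catalogue; field names are illustrative, not the XFSC schema.
service_offering = {
    "@context": {"dct": "http://purl.org/dc/terms/"},
    "@type": "ServiceOffering",
    "dct:title": "Example data-sharing service",
    "dct:publisher": "Example Federation Member",
    "dct:license": "https://creativecommons.org/licenses/by/4.0/",
    "endpoint": "https://example.org/api/v1",  # assumed endpoint
    "profile": "manufacturing-dataspace",      # assumed application profile
}

def validate_offering(record, required=("@type", "dct:title", "dct:publisher")):
    """Admit a record to the (hypothetical) catalogue only if the
    mandatory metadata fields are present."""
    return all(field in record for field in required)

if validate_offering(service_offering):
    print(json.dumps(service_offering, indent=2))
```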
May 2024 · 7 Reads
December 2023 · 176 Reads
Tools for creating and managing knowledge graphs are a crucial ingredient of data management in fields such as healthcare and manufacturing. Knowledge graphs comprise entities, attributes, and relationships, which help machines understand the meaning of data and facilitate data sharing. Although knowledge graphs are a key concept for many applications, their creation remains complex and challenging, especially for novice users. To address this limitation, we have developed KGraphX, an easy-to-use visual editor that makes knowledge graph creation more engaging for users with limited knowledge of Semantic Web technology. KGraphX uses visual elements, predictive typing, and augmented information from existing knowledge repositories to simplify knowledge graph creation. In a task-based evaluation, we demonstrated the usability of KGraphX and the quality of the knowledge graphs created with it. Compared to other tools, KGraphX enhanced knowledge graph creation by 81% while also improving the clarity of the created graphs by 91%.
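As a minimal illustration of the entities, attributes, and relationships such a tool manages, the sketch below builds a tiny knowledge graph programmatically with rdflib. The library choice and the example vocabulary are assumptions for demonstration; KGraphX itself is a visual editor and does not expose this API.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

# Illustrative only: a small knowledge graph of the kind a visual editor
# might produce, built here with rdflib and an assumed example vocabulary.
EX = Namespace("http://example.org/hospital#")

g = Graph()
g.bind("ex", EX)

# Entities with types, attributes, and a relationship between them.
g.add((EX.Patient42, RDF.type, EX.Patient))
g.add((EX.Patient42, RDFS.label, Literal("Patient 42")))
g.add((EX.Patient42, EX.hasDiagnosis, EX.Diabetes))
g.add((EX.Diabetes, RDF.type, EX.Condition))
g.add((EX.Diabetes, RDFS.label, Literal("Diabetes mellitus")))

# Serialize the graph so it can be shared or inspected.
print(g.serialize(format="turtle"))
```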
August 2023 · 10 Reads · 2 Citations
Communications in Computer and Information Science
This paper introduces the concept of a knowledge graph for time series data, which allows for structured management and propagation of characteristic time series information and supports query-driven data analyses. We gradually link and enrich knowledge obtained from domain experts or previously performed analyses by representing globally and locally occurring time series insights as individual graph nodes. Supported by techniques from automated knowledge discovery and machine learning, a recursive integration of analytical query results is exploited to generate a spectral representation of linked and successively condensed information. Besides a time-series-to-graph mapping, we provide an ontology describing a classification of the maintained knowledge and the affiliated analysis methods for knowledge generation. After a discussion of gradual knowledge enrichment, we finally illustrate the concept of knowledge propagation based on an application of state-of-the-art methods for time series analysis. Keywords: Knowledge Graph, Time Series, Knowledge Discovery, Exploratory Data Analysis, Machine Learning
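A rough sketch of the core idea, under assumed node and edge names: time series insights, both global (summary statistics) and local (a detected anomaly), are stored as individual nodes linked to the series they describe, so that later queries can retrieve previously derived knowledge. The networkx representation below is a simplification of the paper's ontology-backed graph.

```python
import networkx as nx
import numpy as np

# Synthetic series with one injected local anomaly.
rng = np.random.default_rng(0)
series = rng.normal(0.0, 1.0, 500)
series[250] = 8.0

kg = nx.DiGraph()
kg.add_node("sensor_1", kind="time_series", length=len(series))

# Global insight: a summary statistic over the whole series.
kg.add_node("sensor_1/mean", kind="insight", scope="global", value=float(series.mean()))
kg.add_edge("sensor_1", "sensor_1/mean", relation="hasInsight")

# Local insight: position of the largest deviation (a crude anomaly detector).
idx = int(np.argmax(np.abs(series - series.mean())))
kg.add_node("sensor_1/anomaly", kind="insight", scope="local", position=idx)
kg.add_edge("sensor_1", "sensor_1/anomaly", relation="hasInsight")

# Query-driven use: list all insights attached to the series.
for _, node, edge_data in kg.out_edges("sensor_1", data=True):
    print(node, edge_data["relation"], kg.nodes[node])
```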
July 2023 · 174 Reads · 67 Citations
Briefings in Bioinformatics
Artificial intelligence (AI) systems utilizing deep neural networks and machine learning (ML) algorithms are widely used for solving critical problems in bioinformatics, biomedical informatics and precision medicine. However, complex ML models that are often perceived as opaque and black-box methods make it difficult to understand the reasoning behind their decisions. This lack of transparency can be a challenge for both end-users and decision-makers, as well as AI developers. In sensitive areas such as healthcare, explainability and accountability are not only desirable properties but also legally required for AI systems that can have a significant impact on human lives. Fairness is another growing concern, as algorithmic decisions should not show bias or discrimination towards certain groups or individuals based on sensitive attributes. Explainable AI (XAI) aims to overcome the opaqueness of black-box models and to provide transparency in how AI systems make decisions. Interpretable ML models can explain how they make predictions and identify factors that influence their outcomes. However, the majority of the state-of-the-art interpretable ML methods are domain-agnostic and have evolved from fields such as computer vision, automated reasoning or statistics, making direct application to bioinformatics problems challenging without customization and domain adaptation. In this paper, we discuss the importance of explainability and algorithmic transparency in the context of bioinformatics. We provide an overview of model-specific and model-agnostic interpretable ML methods and tools and outline their potential limitations. We discuss how existing interpretable ML methods can be customized and fit to bioinformatics research problems. Further, through case studies in bioimaging, cancer genomics and text mining, we demonstrate how XAI methods can improve transparency and decision fairness. Our review aims at providing valuable insights and serving as a starting point for researchers wanting to enhance explainability and decision transparency while solving bioinformatics problems. GitHub: https://github.com/rezacsedu/XAI-for-bioinformatics.
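As a deliberately generic illustration of the model-agnostic methods surveyed here, the sketch below applies permutation importance to a random forest trained on synthetic data standing in for an expression matrix. The data, model, and method choice are assumptions for demonstration and not the paper's bioimaging, genomics, or text-mining case studies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a samples-by-features expression matrix.
X, y = make_classification(n_samples=300, n_features=50, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Model-agnostic explanation: permutation importance ranks features by how
# much shuffling each one degrades held-out accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = np.argsort(result.importances_mean)[::-1][:5]
for i in top:
    print(f"feature_{i}: {result.importances_mean[i]:.3f}")
```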
April 2023 · 175 Reads · 3 Citations
April 2023 · 30 Reads
October 2019 · 83 Reads · 24 Citations
September 2019 · 198 Reads
The discovery of important biomarkers is a significant step towards understanding the molecular mechanisms of carcinogenesis, enabling accurate diagnosis for, and prognosis of, a certain cancer type. Before recommending any diagnosis, genomics data such as gene expression (GE) and clinical outcomes need to be analyzed. However, the complex nature, high dimensionality, and heterogeneity of genomics data make the overall analysis challenging. Convolutional neural networks (CNN) have shown tremendous success in solving such problems, yet neural network models are mostly perceived as 'black box' methods because their internal functioning is not well understood. Interpretability is important to provide insights into why a given cancer case is assigned a certain type, and finding the most important biomarkers can help in recommending more accurate treatments and drug repositioning. In this paper, we propose a new approach called OncoNetExplainer to make explainable predictions of cancer types based on GE data. We used genomics data on 9,074 cancer patients covering 33 different cancer types from the Pan-Cancer Atlas, on which we trained CNN and VGG16 networks using guided-gradient class activation maps++ (GradCAM++). Further, we generated class-specific heat maps to identify significant biomarkers and computed feature importance in terms of mean absolute impact to rank the top genes across all cancer types. Quantitative and qualitative analyses show that both models exhibit high confidence in predicting the cancer types correctly, giving an average precision of 96.25%. To provide comparisons with the baselines, we identified top genes and cancer-specific driver genes using gradient boosted trees and SHapley Additive exPlanations (SHAP). Finally, our findings were validated with the annotations provided by TumorPortal.
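The baseline comparison mentioned in the abstract, gradient boosted trees explained with SHAP and genes ranked by mean absolute impact, can be sketched as follows. The data are synthetic and the gene names hypothetical; the authors' actual pipeline trains CNN/VGG16 models with GradCAM++ on Pan-Cancer Atlas data.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Toy stand-in for a gene-expression matrix; gene names are hypothetical.
X, y = make_classification(n_samples=200, n_features=30, n_informative=6, random_state=1)
gene_names = [f"gene_{i}" for i in range(X.shape[1])]

# Gradient boosted trees, as in the paper's baseline comparison.
model = GradientBoostingClassifier(random_state=1).fit(X, y)

# SHAP values for every sample; ranking genes by mean absolute impact
# mirrors the feature-importance scheme described in the abstract.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
mean_abs_impact = np.abs(shap_values).mean(axis=0)

for i in np.argsort(mean_abs_impact)[::-1][:10]:
    print(f"{gene_names[i]}: {mean_abs_impact[i]:.4f}")
```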
... Recent advances in explainable artificial intelligence have enabled the interpretation of integrated machine learning (ML) models [16,17], offering significant potential for exploring POD risk factors. Several studies have utilized ML algorithms for POD risk identification; however, most studies rely on routine clinical data and electronic health records, which often exclude sensitive biomarkers, leading to significant bias in predictive outcomes [18,19]. ...
July 2023 · Briefings in Bioinformatics
... This work emerged in the context of building a dataspace based on Gaia-X, which is directly connected to the topic of Semantic Web technologies and tools since their use is central to the implementation of federated dataspaces in Gaia-X. A general overview of the role of semantics in dataspaces is given by Theissen-Lipp et al. [13]. ...
April 2023
... To address these challenges, feature selection plays a critical role in gene expression classification. By identifying and retaining only the most informative genes, feature selection not only improves model generalization but also enhances interpretability, allowing researchers to derive biologically meaningful insights from machine learning predictions [9,20]. Furthermore, reducing the feature space significantly lowers computational costs, enabling faster model training and inference. ...
October 2019