An RDF/OWL Knowledge Base for Query Answering and Decision Support in Clinical Pharmacogenetics

Medical University of Vienna, Vienna, Austria.
Studies in health technology and informatics 01/2013; 192:539-42. DOI: 10.3233/978-1-61499-289-9-539
Source: PubMed

ABSTRACT Genetic testing for personalizing pharmacotherapy is bound to become an important part of clinical routine. To address associated issues with data management and quality, we are creating a semantic knowledge base for clinical pharmacogenetics. The knowledge base is made up of three components: an expressive ontology formalized in the Web Ontology Language (OWL 2 DL), a Resource Description Framework (RDF) model for capturing detailed results of manual annotation of pharmacogenomic information in drug product labels, and an RDF conversion of relevant biomedical datasets. Our work goes beyond the state of the art in that it makes both automated reasoning and query answering as simple as possible, and its reasoning capabilities exceed those of previously described ontologies.

  • Source
    ABSTRACT: Drug-drug interactions (DDIs) are a major contributing factor for unexpected adverse drug events (ADEs). However, few knowledge resources cover the severity information of ADEs, which is critical for prioritizing medical need. The objective of this study is to develop and evaluate a Semantic Web-based approach for mining severe DDI-induced ADEs. We utilized a normalized FDA Adverse Event Reporting System (AERS) dataset and performed a case study of three frequently prescribed cardiovascular drugs: Warfarin, Clopidogrel and Simvastatin. We extracted putative DDI-ADE pairs and their associated outcome codes. We developed a pipeline to filter the associations using ADE datasets from SIDER and PharmGKB. We also performed a signal enrichment using electronic medical records (EMR) data. We leveraged the Common Terminology Criteria for Adverse Events (CTCAE) grading system and classified the DDI-induced ADEs according to the CTCAE, represented in the Web Ontology Language (OWL). Using the filtering pipeline, we identified 601 DDI-ADE pairs for the three drugs, of which 61 pairs are in Grade 5, 56 pairs in Grade 4 and 484 pairs in Grade 3. Among the 601 pairs, the signals of 59 DDI-ADE pairs were identified from the EMR data. The approach could be generalized to detect signals of putative severe ADEs induced by DDIs in other drug domains and would be useful for supporting translational and pharmacovigilance studies of severe ADEs.
    BioData Mining 06/2015; 8(1):12. DOI:10.1186/s13040-015-0044-6 · 1.54 Impact Factor
  • Source
    ABSTRACT: We previously described a methodology for converting a large set of confidential data records into a set of summaries of similar patients. We claimed that the resulting patient types could "capture important trends and patterns in the data set without disclosing the information in any of the individual data records." In this paper we examine the predictive validity of an initial set of patient types developed in our earlier research. We ask the following question: To what extent can the summarized data derived from each cluster (patient type) be as informative as the original case-level data (individuals) from which the clusters were inferred? We address this question by assessing how well predictions made with summarized data match predictions made with original data. After reviewing relevant literature and explaining how data is summarized in each cluster of similar patients, we compare the results of predicting death in the ICU using both summarized data (regression analysis) and original case data (discriminant analysis and logistic regression analysis). When multiple clusters were used, prediction based on regression analysis of the summarized data was found to be better than prediction using either logistic regression or discriminant analysis on the raw data. We hypothesize that this result is due to segmentation of a heterogeneous multivariate space into more homogeneous subregions. We see the present results as an important step towards the development of generalized health data search engines that can utilize non-confidential summarized data passed through health data repository firewalls.
  • Source
    ABSTRACT: Chignell et al. [1] previously described a methodology for converting a large set of confidential data records into a set of summaries of similar patients. They claimed that the resulting patient types could "capture important trends and patterns in the data set without disclosing the information in any of the individual data records." In this paper we examine the predictive validity of an initial set of patient types developed by [1]. We ask the following question: To what extent can the summarized data derived from each cluster (patient type) be as informative as the original case-level data (individuals) from which the clusters were inferred? We address this question by assessing how well predictions made with summarized data match predictions made with original data. After reviewing relevant literature and explaining how data is summarized in each cluster of similar patients, we compare the results of predicting death in the ICU using both summarized data (regression analysis) and original case data (discriminant analysis and logistic regression analysis). When multiple clusters were used, prediction based on regression analysis of the summarized data was found to be better than prediction using either logistic regression or discriminant analysis on the raw data. We hypothesize that this result is due to segmentation of a heterogeneous multivariate space into more homogeneous subregions. We see the present results as an important step towards the development of generalized health data search engines that can utilize non-confidential summarized data passed through health data repository firewalls.