An integrated pharmacokinetics ontology and corpus for text mining

BMC Bioinformatics (Impact Factor: 2.58). 02/2013; 14(1):35. DOI: 10.1186/1471-2105-14-35
Source: PubMed


Drug pharmacokinetics parameters, drug interaction parameters, and pharmacogenetics data have been collected unevenly across different databases and published extensively in the literature. Without an appropriate pharmacokinetics ontology and a well-annotated pharmacokinetics corpus, it is difficult to develop text mining tools that collect pharmacokinetics data from the literature and integrate pharmacokinetics data from multiple databases.

A comprehensive pharmacokinetics ontology was constructed. It can annotate all aspects of in vitro pharmacokinetics experiments and in vivo pharmacokinetics studies, and it covers all drug metabolism and transportation enzymes. Using our pharmacokinetics ontology, a PK-corpus was constructed to present four classes of pharmacokinetics abstracts: in vivo pharmacokinetics studies, in vivo pharmacogenetic studies, in vivo drug interaction studies, and in vitro drug interaction studies. A novel hierarchical three-level annotation scheme was proposed and implemented to tag key terms, drug interaction sentences, and drug interaction pairs. The utility of the pharmacokinetics ontology was demonstrated by annotating three pharmacokinetics studies, and the utility of the PK-corpus was demonstrated by a drug interaction extraction text mining analysis.
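The three annotation levels named above (key terms, drug interaction sentences, drug interaction pairs) can be pictured as nested records. The following is a minimal sketch only: the class names, the category labels, and the example sentence are hypothetical illustrations, not the PK-corpus file format or content, which this abstract does not describe.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class KeyTerm:
    """Level 1: a tagged key term with character offsets into its sentence."""
    text: str
    category: str  # hypothetical label set, e.g. "drug", "pk_parameter"
    start: int
    end: int


@dataclass
class Sentence:
    """Level 2: a sentence, its key terms, and a drug-interaction flag."""
    text: str
    terms: list[KeyTerm] = field(default_factory=list)
    is_ddi_sentence: bool = False


@dataclass(frozen=True)
class DrugPair:
    """Level 3: two drugs from one sentence, labelled interacting or not."""
    drug_a: str
    drug_b: str
    interacts: bool


def tag_term(sentence: Sentence, surface: str, category: str) -> KeyTerm:
    """Tag the first occurrence of `surface` in the sentence as a key term."""
    start = sentence.text.index(surface)
    term = KeyTerm(surface, category, start, start + len(surface))
    sentence.terms.append(term)
    return term


def candidate_pairs(sentence: Sentence) -> list[tuple[str, str]]:
    """Enumerate every unordered pair of drug terms tagged in the sentence."""
    drugs = [t.text for t in sentence.terms if t.category == "drug"]
    return list(combinations(drugs, 2))


# Illustrative sentence written for this sketch (not taken from the corpus).
sentence = Sentence("Ketoconazole increased the AUC of midazolam.",
                    is_ddi_sentence=True)
tag_term(sentence, "Ketoconazole", "drug")
tag_term(sentence, "AUC", "pk_parameter")
tag_term(sentence, "midazolam", "drug")

# An annotator (or a classifier) would assign the label for each pair.
pairs = [DrugPair(a, b, interacts=True) for a, b in candidate_pairs(sentence)]
```

Keeping the levels as separate records mirrors the hierarchy: pairs are only generated from sentences, and sentences only carry terms that were tagged at the first level.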

The pharmacokinetics ontology annotates both in vitro pharmacokinetics experiments and in vivo pharmacokinetics studies. The PK-corpus is a highly valuable resource for the text mining of pharmacokinetics parameters and drug interactions.



Available from: Santosh Philips, Sep 16, 2015
    • "Finally, the Pharmacokinetics (PK) ontology (Wu et al., 2013), developed at Indiana University, focused on the representation of different types of PK DDI studies, which are experiments developed in vitro or in vivo to study the existence of drug interactions affecting some of the pharmacokinetic parameters of the interacting drugs. The OWL ontology included five main classes representing the different types of PK studies ('Pharmacokinetic Experiments') and the entities relevant in those studies ('Drug', 'Metabolizing enzymes', 'Transporters' and 'Subjects'). "
    07/2015; 5(3):19-38. DOI:10.4018/IJIRR.2015070102
  • ABSTRACT: Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modeling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future.
    Drug discovery today 09/2013; 19(2). DOI:10.1016/j.drudis.2013.09.012 · 6.69 Impact Factor
  • ABSTRACT: Collection of documents annotated with semantic entities and relationships are crucial resources to support development and evaluation of text mining solutions for the biomedical domain. Here I present an overview of 36 corpora and show an analysis on the semantic annotations they contain. Annotations for entity types were classified into six semantic groups and an overview on the semantic entities which can be found in each corpus is shown. Results show that while some semantic entities, such as genes, proteins and chemicals are consistently annotated in many collections, corpora available for diseases, variations and mutations are still few, in spite of their importance in the biological domain.
    F1000 Research 04/2014; 3:96. DOI:10.12688/f1000research.3216.1