-
ABSTRACT: In response to the unbridled growth of information in literature and biomedical databases, researchers require efficient means of handling and extracting information. As well as providing background information for research, scientific publications can be processed to transform textual information into database content or complex networks and can be integrated with existing knowledge resources to suggest novel hypotheses. Information extraction and text data analysis can be particularly relevant and helpful in genetics and biomedical research, in which up-to-date information about complex processes involving genes, proteins and phenotypes is crucial. Here we explore the latest advancements in automated literature analysis and its contribution to innovative research approaches.
Nature Reviews Genetics 11/2012; · 38.08 Impact Factor
-
ABSTRACT: Researchers use animal studies to better understand human diseases. In recent years, large-scale phenotype studies such as Phenoscape and EuroPhenome have been initiated to identify genetic causes of a species' phenome. Species-specific phenotype ontologies are required to capture and report all findings and to automatically infer results relevant to human diseases. The integration of the different phenotype ontologies into a coherent framework is necessary to achieve interoperability for cross-species research. Here, we investigate the quality and completeness of two different methods to align the Human Phenotype Ontology and the Mammalian Phenotype Ontology. The first method combines lexical matching with inference over the ontologies' taxonomic structures, while the second method uses a mapping algorithm based on the formal definitions of the ontologies. Neither method could map all concepts. Although the formal definitions method provides mappings for more concepts than the lexical matching method does, it does not outperform lexical matching in a biological use case. Our results suggest that combining both approaches will yield better mappings in terms of completeness, specificity and application purposes.
Journal of biomedical semantics. 09/2012; 3 Suppl 2:S1.
-
ABSTRACT: The investigation of phenotypes in model organisms has the potential to reveal the molecular mechanisms underlying disease. The large-scale comparative analysis of phenotypes across species can reveal novel associations between genotypes and diseases. We use the PhenomeNET network of phenotypic similarity to suggest genotype-disease associations, combine them with drug-gene associations available from the PharmGKB database, and infer novel associations between drugs and diseases. We evaluate and quantify our results based on our method's capability to reproduce known drug-disease associations. We find and discuss evidence that levonorgestrel, tretinoin and estradiol are associated with cystic fibrosis (p < 2.65 × 10^-6, p < 0.002 and p < 0.031, Wilcoxon signed-rank test, Bonferroni correction) and that ibuprofen may be active in chronic lymphocytic leukemia (p < 2.63 × 10^-23, Wilcoxon signed-rank test, Bonferroni correction). To enable access to our results, we implement a web server and make our raw data freely available. Our results are the first steps in implementing an integrated system for the analysis and prediction of drug-disease associations for rare and orphan diseases for which the molecular basis is not known.
Pacific Symposium on Biocomputing 01/2012
-
ABSTRACT: Despite considerable progress in understanding the molecular origins of hereditary human diseases, the molecular basis of several thousand genetic diseases still remains unknown. High-throughput phenotype studies are underway to systematically assess the phenotype outcome of targeted mutations in model organisms. Thus, comparing the similarity between experimentally identified phenotypes and the phenotypes associated with human diseases can be used to suggest causal genes underlying a disease. In this manuscript, we present a method for disease gene prioritization based on comparing phenotypes of mouse models with those of human diseases. For this purpose, either human disease phenotypes are "translated" into a mouse-based representation (using the Mammalian Phenotype Ontology), or mouse phenotypes are "translated" into a human-based representation (using the Human Phenotype Ontology). We apply a measure of semantic similarity and rank experimentally identified phenotypes in mice with respect to their phenotypic similarity to human diseases. Our method is evaluated on manually curated and experimentally verified gene-disease associations for human and for mouse. We evaluate our approach using a Receiver Operating Characteristic (ROC) analysis and obtain an area under the ROC curve of up to . Furthermore, we are able to confirm previous results that the Vax1 gene is involved in Septo-Optic Dysplasia and suggest Gdf6 and Marcks as further potential candidates. Our method significantly outperforms previous phenotype-based approaches of prioritizing gene-disease associations. To enable the adaption of our method to the analysis of other phenotype data, our software and prioritization results are freely available under a BSD licence at http://code.google.com/p/phenomeblast/wiki/CAMP. Furthermore, our method has been integrated in PhenomeNET and the results can be explored using the PhenomeBrowser at http://phenomebrowser.net.
PLoS ONE 01/2012; 7(6):e38937. · 4.09 Impact Factor
-
ABSTRACT: MOTIVATION: Ontologies are essential in biomedical research due to their ability to semantically integrate content from different scientific databases and resources. Their application improves capabilities for querying and mining biological knowledge. An increasing number of ontologies is being developed for this purpose, and considerable effort is invested into formally defining them in order to represent their semantics explicitly. However, current biomedical ontologies do not facilitate data integration and interoperability yet, since reasoning over these ontologies is very complex and cannot be performed efficiently or is even impossible. We propose the use of less expressive subsets of ontology representation languages to enable efficient reasoning and achieve the goal of genuine interoperability between ontologies. RESULTS: We present and evaluate EL Vira, a framework that transforms OWL ontologies into the OWL EL subset, thereby enabling the use of tractable reasoning. We illustrate which OWL constructs and inferences are kept and lost following the conversion and demonstrate the performance gain of reasoning indicated by the significant reduction of processing time. We applied EL Vira to the open biomedical ontologies and provide a repository of ontologies resulting from this conversion. EL Vira creates a common layer of ontological interoperability that, for the first time, enables the creation of software solutions that can employ biomedical ontologies to perform inferences and answer complex queries to support scientific analyses. Availability and implementation: The EL Vira software is available from http://el-vira.googlecode.com and converted OBO ontologies and their mappings are available from http://bioonto.gen.cam.ac.uk/el-ont.
Bioinformatics 02/2011; 27(7):1001-8. · 5.47 Impact Factor
-
ABSTRACT: Annotated reference corpora play an important role in biomedical information extraction. A semantic annotation of the natural language texts in these reference corpora using formal ontologies is challenging due to the inherent ambiguity of natural language. The provision of formal definitions and axioms for semantic annotations offers the means for ensuring consistency and enables the development of verifiable annotation guidelines. Consistent semantic annotations facilitate the automatic discovery of new information through deductive inferences.
We provide a formal characterization of the relations used in the recent GENIA corpus annotations. For this purpose, we both select existing axiom systems based on the desired properties of the relations within the domain and develop new axioms for several relations. To apply this ontology of relations to the semantic annotation of text corpora, we implement two ontology design patterns. In addition, we provide a software application to convert annotated GENIA abstracts into OWL ontologies by combining both the ontology of relations and the design patterns. As a result, the GENIA abstracts become available as OWL ontologies and are amenable to automated verification, deductive inferences and other knowledge-based applications.
Documentation, implementation and examples are available from http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/.
Journal of biomedical semantics. 01/2011; 2 Suppl 5:S1.
-
ABSTRACT: Researchers design ontologies as a means to accurately annotate and integrate experimental data across heterogeneous and disparate data- and knowledge bases. Formal ontologies make the semantics of terms and relations explicit such that automated reasoning can be used to verify the consistency of knowledge. However, many biomedical ontologies do not sufficiently formalize the semantics of their relations and are therefore limited with respect to automated reasoning for large scale data integration and knowledge discovery. We describe a method to improve automated reasoning over biomedical ontologies and identify several thousand contradictory class definitions. Our approach aligns terms in biomedical ontologies with foundational classes in a top-level ontology and formalizes composite relations as class expressions. We describe the semi-automated repair of contradictions and demonstrate expressive queries over interoperable ontologies. Our work forms an important cornerstone for data integration, automatic inference and knowledge discovery based on formal representations of knowledge. Our results and analysis software are available at http://bioonto.de/pmwiki.php/Main/ReasonableOntologies.
PLoS ONE 01/2011; 6(7):e22006. · 4.09 Impact Factor
-
ABSTRACT: Phenotypic information is important for the analysis of the molecular mechanisms underlying disease. A formal ontological representation of phenotypic information can help to identify, interpret and infer phenotypic traits based on experimental findings. The methods that are currently used to represent data and information about phenotypes fail to make the semantics of the phenotypic trait explicit and do not interoperate with ontologies of anatomy and other domains. Therefore, valuable resources for the analysis of phenotype studies remain unconnected and inaccessible to automated analysis and reasoning.
We provide a framework to formalize phenotypic descriptions and make their semantics explicit. Based on this formalization, we provide the means to integrate phenotypic descriptions with ontologies of other domains, in particular anatomy and physiology. We demonstrate how our framework leads to the capability to represent disease phenotypes, perform powerful queries that were not possible before and infer additional knowledge.
http://bioonto.de/pmwiki.php/Main/PheneOntology.
Bioinformatics 10/2010; 26(24):3112-8. · 5.47 Impact Factor
-
ABSTRACT: Primary immunodeficiency diseases (PIDs) are the consequence of genetic disorders and usually manifest themselves in very young patients. Because of their rarity, they are notoriously difficult to diagnose both for general practitioners and clinicians. In this paper, we present the foundations of an ontology of PIDs, which will be at the heart of an expert system designed to assist the clinician in the diagnosis of these diseases. To achieve this, the PIDOntology characterises Primary Immunodeficiencies in terms of Phenotypes. While there are a number of different ontologies already available that allow the description of phenotypes and phenotypic qualities, these have a number of associated ontological problems, which we will also address as part of this paper. We use the subtype of Hyper-IgE Syndrome caused by STAT3 defects as an example of a primary immunodeficiency and show how the clinical phenotype of the disease can be modeled in terms of other phenotypes by introducing the notion of the "phene". Furthermore, we develop patterns for different types of phenes and show that these patterns can be mapped onto more traditional entity-quality statements, which are the current state of the art in phenotypic modeling.
Proceedings of the 2nd Workshop for Ontologies in Biomedicine and Life sciences (OBML); 09/2010
-
ABSTRACT: An integration of the OBO Flatfile Format and the Web Ontology Language (OWL) would enable automated reasoning, inferences and consistency checking of biomedical ontologies and support the development and maintenance of ontologies developed in the OBO Flatfile Format. So far, the translation of relations in the OBO language to OWL is performed according to a single rigid pattern and in violation of the relation definitions of the OBO Relationship Ontology. We extend both the OBO Flatfile Format and the Manchester OWL Syntax to accommodate relation definitions. Based on these extensions, we implemented and evaluated two software applications. The first converts the OBO Flatfile Format to an OWL representation. The second uses automated inferences to convert OWL ontologies back to a representation in the OBO Flatfile Format. The OWLDEF method is generally applicable whenever ontologies are developed primarily using patterns and not a detailed knowledge representation language. The tools and libraries we developed for the OWLDEF method are available from http://bioonto.de/obo2owl.
Proceedings of the 13th Annual Bio-Ontologies Meeting; 07/2010
-
ABSTRACT: Most biomedical ontologies are represented in the OBO Flatfile Format, which is an easy-to-use graph-based ontology language. The semantics of the OBO Flatfile Format 1.2 enforces a strict predetermined interpretation of relationship statements between classes. It does not allow flexible specifications that provide better approximations of the intuitive understanding of the considered relations. If relations cannot be accurately expressed, then ontologies built upon them may contain false assertions and hence lead to false inferences. Ontologies in the OBO Foundry must formalize the semantics of relations according to the OBO Relationship Ontology (RO). Therefore, being able to accurately express the intended meaning of relations is of crucial importance. Since the Web Ontology Language (OWL) is an expressive language with a formal semantics, it is suitable to define the meaning of relations accurately.
We developed a method to provide definition patterns for relations between classes using OWL and describe a novel implementation of the RO based on this method. We implemented our extension in software that converts ontologies in the OBO Flatfile Format to OWL, and also provide a prototype to extract relational patterns from OWL ontologies using automated reasoning. The conversion software is freely available at http://bioonto.de/obo2owl, and can be accessed via a web interface.
Explicitly defining relations permits their use in reasoning software and leads to a more flexible and powerful way of representing biomedical ontologies. Using the extended language and semantics avoids several mistakes commonly made in formalizing biomedical ontologies, and can be used to automatically detect inconsistencies. The use of our method enables the use of graph-based ontologies in OWL, and makes complex OWL ontologies accessible in a graph-based form. Thereby, our method provides the means to gradually move the representation of biomedical ontologies into formal knowledge representation languages that incorporate an explicit semantics. Our method facilitates the use of OWL-based software in the back-end while ontology curators may continue to develop ontologies with an OBO-style front-end.
BMC Bioinformatics 01/2010; 11:441. · 2.75 Impact Factor
-
Proceedings of the Fourth International Symposium for Semantic Mining in Biomedicine, Cambridge, United Kingdom, October, 2010; 01/2010
-
ABSTRACT: Directed acyclic graphs are commonly used to represent ontologies in the biomedical domain. They provide an intuitive means to formalize relations that hold between ontological categories. However, their semantics is usually not explicit. We provide a semantics for a part of the OBO Flatfile Format by extending OWL with a method to express relational patterns. These patterns are OWL axioms with variables for classes. The variables can only be filled with named classes. Additionally, we provide a semantics for open patterns in OWL. Our method is applicable to the OBO Flatfile Format, and provides a means to design OWL ontologies using complex ontology design patterns. Therefore, it leads not only to an integration of the OBO Flatfile Format and OWL, but extends OWL with an intuitive interface for designing ontologies using complex definition patterns. A prototypic implementation and test results are available at http://bioonto.de/obo2owl.