Gene Ontology annotations: what they mean and where they come from

The Jackson Laboratory, Bar Harbor, ME, USA.
BMC Bioinformatics (Impact Factor: 2.67). 02/2008; 9(Suppl 5):S2. DOI: 10.1186/1471-2105-9-S5-S2

ABSTRACT: To address the challenges of information integration and retrieval, the computational genomics community has increasingly come to rely on annotating the scientific literature with terms from controlled structured vocabularies such as the Gene Ontology (GO). Here we address the question of what such annotations signify and how working biologists create them. Our goal is to promote a better understanding of how the results of experiments are captured in annotations, in the hope that this will lead both to better representations of biological reality through annotation and ontology development and to more informed use of GO resources by experimental scientists.

Available from: Judith A Blake, Jul 04, 2015
  • ABSTRACT: Bio-ontologies are essential tools for accessing and analyzing the rapidly growing pool of plant genomic and phenomic data. Ontologies provide structured vocabularies to support consistent aggregation of data and a semantic framework for automated analyses and reasoning. They are a key component of the semantic web. This paper provides background on what bio-ontologies are, why they are relevant to botany, and the principles of ontology development. It includes an overview of ontologies and related resources that are relevant to plant science, with a detailed description of the Plant Ontology (PO). We discuss the challenges of building an ontology that covers all green plants (Viridiplantae). Ontologies can advance plant science in four key areas: (1) comparative genetics, genomics, phenomics, and development; (2) taxonomy and systematics; (3) semantic applications; and (4) education. Bio-ontologies offer a flexible framework for comparative plant biology, based on common botanical understanding. As genomic and phenomic data become available for more species, we anticipate that the annotation of data with ontology terms will become less centralized while, at the same time, cross-species queries will become more common, causing more researchers in plant science to turn to ontologies.
    American Journal of Botany 07/2012; 99(8):1263-75. DOI:10.3732/ajb.1200222
  • ABSTRACT: Knowledge-making practices in biology are being strongly affected by the availability of data on an unprecedented scale, the insistence on systemic approaches and growing reliance on bioinformatics and digital infrastructures. What role does theory play within data-intensive science, and what does that tell us about scientific theories in general? To answer these questions, I focus on Open Biomedical Ontologies, digital classification tools that have become crucial to sharing results across research contexts in the biological and biomedical sciences, and argue that they constitute an example of classificatory theory. This form of theorizing emerges from classification practices in conjunction with experimental know-how and expresses the knowledge underpinning the analysis and interpretation of data disseminated online.
    International Studies in the Philosophy of Science 03/2012; 26(1). DOI:10.1080/02698595.2012.653119
  • ABSTRACT: Ontologies and taxonomies have proven highly beneficial for biocuration. The Open Biomedical Ontology (OBO) Foundry alone lists over 90 ontologies mainly built with OBO-Edit. Creating and maintaining such ontologies is a labour-intensive, difficult, manual process. Automating parts of it is of great importance for the further development of ontologies and for biocuration. We have developed the Dresden Ontology Generator for Directed Acyclic Graphs (DOG4DAG), a system which supports the creation and extension of OBO ontologies by semi-automatically generating terms, definitions and parent-child relations from text in PubMed, the web and PDF repositories. DOG4DAG is seamlessly integrated into OBO-Edit. It generates terms by identifying statistically significant noun phrases in text. For definitions and parent-child relations it employs pattern-based web searches. We systematically evaluate each generation step using manually validated benchmarks. The term generation leads to high-quality terms also found in manually created ontologies. Up to 78% of definitions are valid and up to 54% of child-ancestor relations can be retrieved. There is no other validated system that achieves comparable results. By combining the prediction of high-quality terms, definitions and parent-child relations with the ontology editor OBO-Edit we contribute a thoroughly validated tool for all OBO ontology engineers. DOG4DAG is available within OBO-Edit 2.1. Supplementary data are available at Bioinformatics online.
    Bioinformatics 06/2010; 26(12):i88-96. DOI:10.1093/bioinformatics/btq188