Meeting report: a workshop on Best Practices in Genome Annotation

Informatics, J. Craig Venter Institute, Rockville, MD 20850 USA, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK and The Arabidopsis Information Resource, Carnegie Institution of Washington, Stanford, CA 94305 USA.
Database: The Journal of Biological Databases and Curation (Impact Factor: 4.46). 01/2010; 2010:baq001. DOI: 10.1093/database/baq001
Source: PubMed

ABSTRACT Efforts to annotate the genomes of a wide variety of model organisms are currently carried out by sequencing centers, model organism databases and academic/institutional laboratories around the world. Different annotation methods and tools have been developed over time to meet the needs of biologists faced with the task of annotating biological data. While standardized methods are essential for consistent curation within each annotation group, methods and tools can differ between groups, especially when the groups are curating different organisms. Biocurators from several institutes met at the Third International Biocuration Conference in Berlin, Germany, in April 2009 and hosted the 'Best Practices in Genome Annotation: Inference from Evidence' workshop to share their strategies, pipelines, standards and tools. This article documents the material presented in the workshop.


Available from: Linda I Hannick, Jun 29, 2015
  • Source
    ABSTRACT: As the amount of heterogeneous genomic data and related annotations continues to grow, a flexible and easy-to-access data management solution is required to integrate such data and diverse annotation tasks. This preliminary report describes the benefits of using IBM DB2® Content Manager software by conducting task-oriented grape genome annotations, along with data quality-assurance checks throughout the annotation process. To demonstrate the usability of this application, we describe the implementation of two real-life content-based genome annotation case scenarios: 1) expressed sequence tags annotation; and 2) sequence annotation related to simple sequence repeat markers. The IBM DB2 Content Manager allows users to easily construct content-based genomic information applications as rapidly built and readily adapted customized content documents with attributes within an easy-to-use interface system. Users can simultaneously conduct the annotation quality checks while making annotations by utilizing a built-in standardized data quality-control assurance procedure referred to as annotation “routing.” The system provides search features or cross-links with different annotation contents or data formats. The data quality workflow and procedure within the system also resulted in accuracy and consistency in the data annotation and curation lifecycle.
    IBM Journal of Research and Development 11/2011; 55(6):13. DOI:10.1147/JRD.2011.2172837 · 0.50 Impact Factor
  • Source
    ABSTRACT: The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries.
    Standards in Genomic Sciences 10/2011; 5(1):168-93. DOI:10.4056/sigs.2084864 · 3.17 Impact Factor