Calling on a million minds for community annotation in WikiProteins

Erasmus Medical Centre, Department of Medical Informatics, Rotterdam, the Netherlands.
Genome Biology (Impact Factor: 10.47). 02/2008; 9(5):R89. DOI: 10.1186/gb-2008-9-5-r89
Source: PubMed

ABSTRACT WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a 'million minds' to annotate a 'million concepts' and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing at

  • Source
    ABSTRACT: The rapid accumulation of genome annotations, as well as their widespread reuse in clinical and scientific practice, poses new challenges to management of the quality of scientific data. This study contributes towards a better understanding of scientists' perceptions and priorities for data quality and the data quality assurance skills needed in genome annotation. Our study was guided by a previously developed general framework for assessment of data quality and by a taxonomy of data quality skills, and was intended to define context-sensitive models of criteria for data quality and skills for genome annotation. Analysis of the results revealed that genomics scientists recognize specific sets of criteria for quality in the genome-annotation context. Seventeen data quality dimensions were reduced to five factor constructs, and 17 relevant skills were grouped into four factor constructs. The constructs defined by this study advance the understanding of data quality relationships and are an important contribution to data and information quality research. In addition, the resulting models can serve as valuable resources to genome data curators and administrators for developing data-curation policies and designing DQ-assurance strategies, processes, procedures, and infrastructure. The study’s findings may also inform educators in developing data quality assurance curricula and training courses.
    Journal of the American Society for Information Science and Technology 01/2012; 63(1):195-207. DOI:10.1002/asi.21652 · 2.23 Impact Factor
  • Source
    Bioinformatics - Trends and Methodologies, 11/2011; ISBN: 978-953-307-282-1
  • Source
    ABSTRACT: Compose "dream tools" from continuously evolving bundles of software to make sense of complex scientific data sets.
    Communications of the ACM 03/2011; 54(3):60-69. DOI:10.1145/1897852.1897871 · 2.86 Impact Factor