Sustaining the Data and Bioresource Commons

Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, CB2 3EG, UK.
Science (Impact Factor: 31.48). 10/2010; 330(6004):592-3. DOI: 10.1126/science.1191506
Source: PubMed

ABSTRACT: Globalization of biomedical research requires sustained investment for databases and biorepositories.

  • Source
    • "Documenting biological diversity requires open exchange of data and tools across disciplines and national borders (Mittermeier et al. 1997). Therefore, the traditional paradigm of sharing scientific data and results only through publications in books and specialized journals is not sufficient (Schofield et al. 2010). Lack of taxonomic knowledge is frequently pointed to as an impediment for biodiversity studies and for defining conservation plans (Wheeler et al. 2004b)."
    ABSTRACT: Scientists from megadiverse countries, such as Brazil, face huge challenges in gathering and analyzing information about species richness and abundance. In Brazil, speciesLink is an e-infrastructure that offers free and open access to data from more than 300 biological and data collections. SpeciesLink’s thematic network, INCT-Virtual Herbarium of Plants and Fungi and the List of Species of the Brazilian Flora, are used as primary data sources to develop Lacunas, an information system with a public web interface that generates detailed reports of the status of plant species occurrence data. Lacunas also integrates information about endemism, conservation status, and collecting efforts over time. Here we describe the motivation and the functionality of this system, showing how it can be useful in detecting under-sampled plant species and geographic areas. We show examples of how knowledge can be extracted from biodiversity primary data using Lacunas. For instance, Lacunas report revealed that 111 angiosperm species (10.3 %), currently considered Data Deficient (DD) in the Official List of Threatened Brazilian Flora, have their distribution well characterized. In addition, the situation of Attalea funifera, a native palm classified as DD, was analyzed in detail, together with other use cases. Information presented in Lacunas reports can thus be used by scientists and policy-makers to help evaluate the status of species occurrence data and prioritize digitization and collecting efforts, as well as some features concerning its conservation status. As Lacunas offers a public online interface, it may also become a valuable tool for helping decision-making processes to become more dynamic and transparent.
    Biodiversity and Conservation 11/2013; 23(1). DOI:10.1007/s10531-013-0587-0 · 2.07 Impact Factor
  • Source
    • "It is estimated that 80% of scientific output comes from these small providers [7]. Generally called "small science," these data are rarely preserved [9] [10]. Scientific publication, a narrative explanation derived from primary data, is often the only lasting record of this work."
    ABSTRACT: Centuries of biological knowledge are contained in the massive body of scientific literature, written for human-readability but too big for any one person to consume. Large-scale mining of information from the literature is necessary if biology is to transform into a data-driven science. A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.
    Advances in Bioinformatics 05/2012; 2012:391574. DOI:10.1155/2012/391574
  • Source
    ABSTRACT: RIKEN BioResource Center (BRC) has collected, preserved, conducted quality control of, and distributed mouse resources since 2002 as the core facility of the National BioResource Project by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan. Our mouse resources include over 5,000 strains such as humanized disease models, fluorescent reporters, and knockout mice. We have developed novel mouse strains such as tissue-specific Cre-drivers and optogenetic strains that are in high demand by the research community. We have removed all our specified pathogens from the deposited mice and used our quality control tests to examine their genetic modifications and backgrounds. RIKEN BRC is a founding member of the Federation of International Mouse Resources and the Asian Mouse Mutagenesis and Resource Association, and provides mouse resources to the one-stop International Mouse Strain Resource database. RIKEN BRC also participates in the International Gene Trap Consortium, having registered 713 gene-trap clones and their sequences in a public library, and is an advisory member of the CREATE (Coordination of resources for conditional expression of mutated mouse alleles) consortium which represents major European and international mouse database holders for the integration and dissemination of Cre-driver strains. RIKEN BRC provides training courses in the use of advanced technologies for the quality control and cryopreservation of mouse strains to promote the effective use of mouse resources worldwide.
    Interdisciplinary Bio Central 12/2010; 2(4). DOI:10.4051/ibc.2010.2.4.0015