ABSTRACT: The Minimum Information for Biological and Biomedical Investigations (MIBBI) project provides a resource for those exploring the range of extant minimum information checklists and fosters coordinated development of such checklists. European Union Framework VI project META-PHOR (Food-ST-2006-03622)
ABSTRACT: This meeting report summarizes the proceedings of the "eGenomics: Cataloguing our Complete Genome Collection IV" workshop held June 6-8, 2007, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This fourth workshop of the Genomic Standards Consortium (GSC) was a mix of short presentations, strategy discussions, and technical sessions. Speakers provided progress reports on the development of the "Minimum Information about a Genome Sequence" (MIGS) specification and the closely integrated "Minimum Information about a Metagenome Sequence" (MIMS) specification. The key outcome of the workshop was consensus on the next version of the MIGS/MIMS specification (v1.2). This drove further definition and restructuring of the MIGS/MIMS XML schema (syntax). With respect to semantics, a term vetting group was established to ensure that terms are properly defined and submitted to the appropriate ontology projects. Perhaps the single most important outcome of the workshop was a proposal to move beyond the concept of "minimum" to create a far richer XML schema that would define a "Genomic Contextual Data Markup Language" (GCDML) suitable for wider semantic integration across databases. GCDML will contain not only curated information (e.g., compliant with MIGS/MIMS), but also be extended to include a variety of data processing and calculations. Further information about the Genomic Standards Consortium and its range of activities can be found at http://gensc.org.
Omics A Journal of Integrative Biology 07/2008; 12(2):101-8. DOI:10.1089/omi.2008.0014 · 2.36 Impact Factor
ABSTRACT: This meeting report summarizes the proceedings of the fifth Genomic Standards Consortium (GSC) workshop held December 12-14, 2007, at the European Bioinformatics Institute (EBI), Cambridge, UK. This fifth workshop served as a milestone event in the evolution of the GSC (launched in September 2005); the key outcome of the workshop was the finalization of a stable version of the MIGS specification (v2.0) for publication. This accomplishment enables, and also in some cases necessitates, downstream activities, which are described in the multiauthor, consensus-driven articles in this special issue of OMICS produced as a direct result of the workshop. This report briefly summarizes the workshop and overviews the special issue. In particular, it aims to explain how the various GSC-led projects are working together to help this community achieve its stated mission of further standardizing the descriptions of genomes and metagenomes and implementing improved mechanisms of data exchange and integration to enable more accurate comparative analyses. Further information about the GSC and its range of activities can be found at http://gensc.org.
Omics A Journal of Integrative Biology 07/2008; 12(2):109-13. DOI:10.1089/omi.2008.A3B3 · 2.36 Impact Factor
ABSTRACT: The Genomic Contextual Data Markup Language (GCDML) is a core project of the Genomic Standards Consortium (GSC) that implements the "Minimum Information about a Genome Sequence" (MIGS) specification and its extension, the "Minimum Information about a Metagenome Sequence" (MIMS). GCDML is an XML Schema for generating MIGS/MIMS compliant reports for data entry, exchange, and storage. When mature, this sample-centric, strongly-typed schema will provide a diverse set of descriptors for describing the exact origin and processing of a biological sample, from sampling to sequencing, and subsequent analysis. Here we describe the need for such a project, outline design principles required to support the project, and make an open call for participation in defining the future content of GCDML. GCDML is freely available, and can be downloaded, along with documentation, from the GSC Web site (http://gensc.org).
Omics A Journal of Integrative Biology 07/2008; 12(2):115-21. DOI:10.1089/omi.2008.0A10 · 2.36 Impact Factor
ABSTRACT: Given the growing wealth of downstream information, the integration of molecular and non-molecular data on a given organism has become a major challenge. For micro-organisms, this information now includes a growing collection of sequenced genes and complete genomes, and for communities of organisms it includes metagenomes. Integration of the data is facilitated by the existence of authoritative, community-recognized, consensus identifiers that may form the heart of so-called information knuckles. The Genomic Standards Consortium (GSC) is building a mapping of identifiers across a group of federated databases with the aim of improving navigation across these resources and enabling the integration of their information in the near future. In particular, this is possible because of the existence of INSDC Genome Project Identifiers (GPIDs) and accession numbers, and the ability of the community to define new consensus identifiers such as the culture identifiers used in the StrainInfo.net bioportal. Here we (1) outline the general design of the Genomic Rosetta Stone project, (2) introduce example linkages between key databases (that cover information about genomes, 16S rRNA gene sequences, and microbial biological resource centers), and (3) make an open call for participation in this project, providing a vision for its future use.
Omics A Journal of Integrative Biology 07/2008; 12(2):123-7. DOI:10.1089/omi.2008.0020 · 2.36 Impact Factor
ABSTRACT: With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
ABSTRACT: The Genomic Standards Consortium (GSC) invited a representative of the Long-Term Ecological Research (LTER) program to its fifth workshop to present the Ecological Metadata Language (EML) metadata standard and its relationship to the Minimum Information about a Genome/Metagenome Sequence (MIGS/MIMS) and its implementation, the Genomic Contextual Data Markup Language (GCDML). The LTER has been one of the top National Science Foundation (NSF) programs in biology since 1980, representing diverse ecosystems and creating long-term, interdisciplinary research, synthesis of information, and theory. The adoption of EML as the LTER network standard has been key to building network synthesis architectures based on high-quality standardized metadata. EML is the NSF-recognized metadata standard for LTER, and EML is a criterion used to review the LTER program's progress. At the workshop, a potential crosswalk between GCDML and EML was explored. Also, collaboration between the LTER and GSC developers was proposed to join efforts toward a common metadata cataloging designer's tool. The success of a metadata standard's adoption by its community depends, among other factors, on the tools and training developed to use the standard. LTER's experience in embracing EML may help GSC to achieve similar success. A possible collaboration between LTER and GSC to provide training opportunities for GCDML and the associated tools is being explored. Finally, LTER is investigating EML enhancements to better accommodate genomics data, possibly integrating the GCDML schema into EML. All these action items have been accepted by the LTER contingent, and further collaboration between the GSC and LTER is expected.
Omics A Journal of Integrative Biology 05/2008; 12(2):151-6. DOI:10.1089/omi.2008.0015 · 2.36 Impact Factor
ABSTRACT: This paper presents a holistic approach that illustrates how the semantic hurdle for integration of biological databases might be overcome when mapping sources that provide information on individual genes and complete genomes to sources that provide information on the biological resources from which these sequences were derived, and vice versa. In particular, we will explain how each of the completed and ongoing whole-genome sequencing projects in the Genomes OnLine Database and each of the ribosomal RNA sequences in the SILVA ribosomal RNA database have been persistently cross-referenced with the StrainInfo.net bioportal, serving both a genome-centric and an organism-centric view of life on our blue planet as one more stepping stone towards the establishment of fully integrated and flexible biological information networks.
ABSTRACT: In this commentary, we advocate building a richer set of descriptions about our invaluable and exponentially growing collection of genomes and metagenomic datasets through the construction of consensus-driven data capture and exchange mechanisms. Standardization activities must proceed within the auspices of open-access and international working bodies, and to tackle the issues surrounding the development of better descriptions of genomic investigations we have formed the Genomic Standards Consortium (GSC). Here, we introduce the 'Minimum Information about a Genome Sequence' specification in the hopes of gaining wider participation in its development and discuss the resources that will be required to support it (standardization of annotations through the use of ontologies, and mechanisms of metadata capture and exchange). As part of its wider goals, the GSC also strongly supports improving the 'transparency' of the information contained in existing genomic databases that contain calculated analyses and genomic annotations.
ABSTRACT: Researchers working on environmentally relevant organisms, populations, and communities are increasingly turning to the application of OMICS technologies to answer fundamental questions about the natural world, how it changes over time, and how it is influenced by anthropogenic factors. In doing so, the need to capture meta-data that accurately describes the biological "source" material used in such experiments is growing in importance. Here, we provide an overview of the formation of the "Env" community of environmental OMICS researchers and its efforts at considering the meta-data capture needs of those working in environmental OMICS. Specifically, we discuss the development to date of the Env specification, an informal specification including descriptors related to geographic location, environment, organism relationship, and phenotype. We then describe its application to the description of environmental transcriptomic experiments and how we have used it to extend the Minimum Information About a Microarray Experiment (MIAME) data standard to create a domain-specific extension that we have termed MIAME/Env. Finally, we make an open call to the community for participation in the Env Community and its future activities.
Omics A Journal of Integrative Biology 02/2006; 10(2):172-8. DOI:10.1089/omi.2006.10.172 · 2.36 Impact Factor
ABSTRACT: The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the "semantic glue" to provide a common understanding of data from across these disparate data sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans.
Omics A Journal of Integrative Biology 02/2006; 10(2):199-204. DOI:10.1089/omi.2006.10.199 · 2.36 Impact Factor