Lynn Schriml

University of Maryland · Institute for Genome Sciences

Publications

  • 7.48
    Impact points
    Disease Ontology: a backbone for disease semantic integration.

    Lynn Marie Schriml, Cesar Arze, Suvarna Nadendla, Yu-Wei Wayne Chang, Mark Mazaitis, Victor Felix, Gang Feng, Warren Alden Kibbe

    Nucleic acids research. 11/2011; 40(Database issue):D940-6.

    The Disease Ontology (DO) database (http://disease-ontology.org) represents a comprehensive knowledge base of 8043 inherited, developmental and acquired human diseases (DO version 3, revision 2510). The DO web browser has been designed for speed, efficiency and robustness through the use of a graph database. Full-text contextual searching functionality using Lucene allows the querying of name, synonym, definition, DOID and cross-reference (xrefs) with complex Boolean search strings. The DO semantically integrates disease and medical vocabularies through extensive cross mapping and integration of MeSH, ICD, NCI's thesaurus, SNOMED CT and OMIM disease-specific terms and identifiers. The DO is utilized for disease annotation by major biomedical databases (e.g. Array Express, NIF, IEDB), as a standard representation of human disease in biomedical ontologies (e.g. IDO, Cell line ontology, NIFSTD ontology, Experimental Factor Ontology, Influenza Ontology), and as an ontological cross mappings resource between DO, MeSH and OMIM (e.g. GeneWiki). The DO project (http://diseaseontology.sf.net) has been incorporated into open source tools (e.g. Gene Answers, FunDO) to connect gene and disease biomedical data through the lens of human disease. The next iteration of the DO web browser will integrate DO's extended relations and logical definition representation along with these biomedical resource cross-mappings.
  • 12.92
    Impact points
    The Genomic Standards Consortium.

    Dawn Field, Linda Amaral-Zettler, Guy Cochrane, James R Cole, Peter Dawyndt, George M Garrity, Jack Gilbert, Frank Oliver Glöckner, Lynette Hirschman, Ilene Karsch-Mizrachi, [......], Nikos Kyrpides, Folker Meyer, Inigo San Gil, Susanna-Assunta Sansone, Lynn M Schriml, Peter Sterk, Tatiana Tatusova, David W Ussery, Owen White, John Wooley

    PLoS biology. 06/2011; 9(6):e1001088.

    A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.
  • 29.50
    Impact points
    Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

    Pelin Yilmaz, Renzo Kottmann, Dawn Field, Rob Knight, James R Cole, Linda Amaral-Zettler, Jack A Gilbert, Ilene Karsch-Mizrachi, Anjanette Johnston, Guy Cochrane, [......], James M Tiedje, Doyle V Ward, George M Weinstock, Doug Wendel, Owen White, Andrew Whiteley, Andreas Wilke, Jennifer R Wortman, Tanya Yatsunenko, Frank Oliver Glöckner

    Nature biotechnology. 05/2011; 29(5):415-20.

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental packages' apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.
  • The Translational Medicine Ontology and Knowledge Base: driving personalized medicine by bridging the gap between bench and bedside.

    Joanne S Luciano, Bosse Andersson, Colin Batchelor, Olivier Bodenreider, Tim Clark, Christine K Denney, Christopher Domarew, Thomas Gambet, Lee Harland, Anja Jentzsch, [......], Elgar Pichler, Robert L Powers, Eric Prud'hommeaux, Matthias Samwald, Lynn Schriml, Peter J Tonellato, Patricia L Whetzel, Jun Zhao, Susie Stephens, Michel Dumontier

    Journal of biomedical semantics. 01/2011; 2 Suppl 2:S1.

    Translational medicine requires the integration of knowledge using heterogeneous data from health care to the life sciences. Here, we describe a collaborative effort to produce a prototype Translational Medicine Knowledge Base (TMKB) capable of answering questions relating to clinical practice and pharmaceutical drug discovery. We developed the Translational Medicine Ontology (TMO) as a unifying ontology to integrate chemical, genomic and proteomic data with disease, treatment, and electronic health records. We demonstrate the use of Semantic Web technologies in the integration of patient and biomedical data, and reveal how such a knowledge base can aid physicians in providing tailored patient care and facilitate the recruitment of patients into active clinical trials. Thus, patients, physicians and researchers may explore the knowledge base to better understand therapeutic options, efficacy, and mechanisms of action. This work takes an important step in using Semantic Web technologies to facilitate integration of relevant, distributed, external sources and progress towards a computational platform to support personalized medicine. TMO can be downloaded from http://code.google.com/p/translationalmedicineontology and TMKB can be accessed at http://tm.semanticscience.org/sparql.
  • Metagenomes and metatranscriptomes from the L4 long-term coastal monitoring station in the Western English Channel.

    Jack A Gilbert, Folker Meyer, Lynn Schriml, Ian R Joint, Martin Mühling, Dawn Field

    Standards in genomic sciences. 01/2010; 3(2):183-93.

    Both metagenomic data and metatranscriptomic data were collected from surface water (0-2m) of the L4 sampling station (50.2518 N, 4.2089 W), which is part of the Western Channel Observatory long-term coastal-marine monitoring station. We previously generated from this area a six-year time series of 16S rRNA V6 data, which demonstrated robust seasonal structure for the bacterial community, with diversity correlated with day length. Here we describe the features of these metagenomes and metatranscriptomes. We generated 8 metagenomes (4.5 million sequences, 1.9 Gbp, average read-length 350 bp) and 7 metatranscriptomes (392,632 putative mRNA-derived sequences, 159 Mbp, average read-length 272 bp) for eight time-points sampled in 2008. These time points represent three seasons (winter, spring, and summer) and include both day and night samples. These data demonstrate the major differences between genetic potential and actuality, whereby genomes follow general seasonal trends yet with surprisingly little change in the functional potential over time; transcripts tended to be far more structured by changes occurring between day and night.
  • Meeting Report: "Metagenomics, Metadata and Meta-analysis" (M3) Workshop at the Pacific Symposium on Biocomputing 2010.

    Lynette Hirschman, Peter Sterk, Dawn Field, John Wooley, Guy Cochrane, Jack Gilbert, Eugene Kolker, Nikos Kyrpides, Folker Meyer, Ilene Mizrachi, Yasukazu Nakamura, Susanna-Assunta Sansone, Lynn Schriml, Tatiana Tatusova, Owen White, Pelin Yilmaz

    Standards in genomic sciences. 01/2010; 2(3):357-60.

    This report summarizes the M3 Workshop held at the January 2010 Pacific Symposium on Biocomputing. The workshop, organized by Genomic Standards Consortium members, included five contributed talks, a series of short presentations from stakeholders in the genomics standards community, a poster session, and, in the evening, an open discussion session to review current projects and examine future directions for the GSC and its stakeholders.
  • Meeting Report from the Genomic Standards Consortium (GSC) Workshop 9.

    Tanja Davidsen, Ramana Madupu, Peter Sterk, Dawn Field, George Garrity, Jack Gilbert, Frank Oliver Glöckner, Lynette Hirschman, Eugene Kolker, Renzo Kottmann, Nikos Kyrpides, Folker Meyer, Norman Morrison, Lynn Schriml, Tatiana Tatusova, John Wooley

    Standards in genomic sciences. 01/2010; 3(3):216-24.

    This report summarizes the proceedings of the 9th workshop of the Genomic Standards Consortium (GSC), held at the J. Craig Venter Institute, Rockville, MD, USA. It was the first GSC workshop to have open registration and attracted over 90 participants. This workshop featured sessions that provided overviews of the full range of ongoing GSC projects. It included sessions on Standards in Genomic Sciences, the open access journal of the GSC, building standards for genome annotation, the M5 platform for next-generation collaborative computational infrastructures, building ties with the biodiversity research community and two discussion panels with government and industry participants. Progress was made on all fronts, and major outcomes included the completion of the MIENS specification for publication and the formation of the Biodiversity working group.
  • Meeting Report from the Genomic Standards Consortium (GSC) Workshop 10.

    Elizabeth Glass, Folker Meyer, Jack A Gilbert, Dawn Field, Sarah Hunter, Renzo Kottmann, Nikos Kyrpides, Susanna Sansone, Lynn Schriml, Peter Sterk, Owen White, John Wooley

    Standards in genomic sciences. 01/2010; 3(3):225-31.

    This report summarizes the proceedings of the 10th workshop of the Genomic Standards Consortium (GSC), held at Argonne National Laboratory, IL, USA. It was the second GSC workshop to have open registration and attracted over 60 participants who worked together to progress the full range of projects ongoing within the GSC. Overall, the primary focus of the workshop was on advancing the M5 platform for next-generation collaborative computational infrastructures. Other key outcomes included the formation of a GSC working group focused on MIGS/MIMS/MIENS compliance using the ISA software suite and the formal launch of the GSC Developer Working Group. Further information about the GSC and its range of activities can be found at http://gensc.org/.
  • 7.48
    Impact points
    GeMInA, Genomic Metadata for Infectious Agents, a geospatial surveillance pathogen database.

    Lynn M Schriml, Cesar Arze, Suvarna Nadendla, Anu Ganapathy, Victor Felix, Anup Mahurkar, Katherine Phillippy, Aaron Gussman, Sam Angiuoli, Elodie Ghedin, Owen White, Neil Hall

    Nucleic acids research. 10/2009;

    The Gemina system (http://gemina.igs.umaryland.edu) identifies, standardizes and integrates the outbreak metadata for the breadth of NIAID category A-C viral and bacterial pathogens, thereby providing an investigative and surveillance tool describing the Who [Host], What [Disease, Symptom], When [Date], Where [Location] and How [Pathogen, Environmental Source, Reservoir, Transmission Method] for each pathogen. The Gemina database will provide a greater understanding of the interactions of viral and bacterial pathogens with their hosts and infectious diseases through in-depth literature text-mining, integrated outbreak metadata, outbreak surveillance tools, extensive ontology development, metadata curation and representative genomic sequence identification and standards development. The Gemina web interface provides metadata selection and retrieval of a pathogen's Infection Systems (Pathogen, Host, Disease, Transmission Method and Anatomy) and Incidents (Location and Date) along with a host's Age and Gender. The Gemina system provides an integrated investigative and geospatial surveillance system connecting pathogens, pathogen products and disease anchored on the taxonomic ID of the pathogen and host to identify the breadth of hosts and diseases known for these pathogens, to identify the extent of outbreak locations, and to identify unique genomic regions with the DNA Signature Insignia Detection Tool.
  • 2.29
    Impact points
    Laying the Foundation for a Genomic Rosetta Stone: Creating Information Hubs through the Use of Consensus Identifiers.

    Bart Van Brabant, Tanya Gray, Bert Verslyppe, Nikos Kyrpides, Karin Dietrich, Frank Oliver Glöckner, James Cole, Ryan Farris, Lynn M Schriml, Paul De Vos, Bernard De Baets, Dawn Field, Peter Dawyndt

    Omics : a journal of integrative biology. 07/2008; 12(2):123-7.

    Given the growing wealth of downstream information, the integration of molecular and non-molecular data on a given organism has become a major challenge. For micro-organisms, this information now includes a growing collection of sequenced genes and complete genomes, and for communities of organisms it includes metagenomes. Integration of the data is facilitated by the existence of authoritative, community-recognized, consensus identifiers that may form the heart of so-called information knuckles. The Genomic Standards Consortium (GSC) is building a mapping of identifiers across a group of federated databases with the aim to improve navigation across these resources and to enable the integration of their information in the near future. In particular, this is possible because of the existence of INSDC Genome Project Identifiers (GPIDs) and accession numbers, and the ability of the community to define new consensus identifiers such as the culture identifiers used in the StrainInfo.net bioportal. Here we outline (1) the general design of the Genomic Rosetta Stone project, (2) introduce example linkages between key databases (that cover information about genomes, 16S rRNA gene sequences, and microbial biological resource centers), and (3) make an open call for participation in this project providing a vision for its future use.
  • 2.29
    Impact points
    Meeting Report: The Fifth Genomic Standards Consortium (GSC) Workshop.

    Dawn Field, George M Garrity, Susanna-Assunta Sansone, Peter Sterk, Tanya Gray, Nikos Kyrpides, Lynette Hirschman, Frank Oliver Glöckner, Renzo Kottmann, Sam Angiuoli, [......], Nick Thomson, Inigo San Gil, Norman Morrison, Tatiana Tatusova, Ilene Mizrachi, Robert Vaughan, Guy Cochrane, Leonid Kagan, Sean Murphy, Lynn Schriml

    Omics : a journal of integrative biology. 07/2008; 12(2):109-13.

    This meeting report summarizes the proceedings of the fifth Genomic Standards Consortium (GSC) workshop held December 12-14, 2007, at the European Bioinformatics Institute (EBI), Cambridge, UK. This fifth workshop served as a milestone event in the evolution of the GSC (launched in September 2005); the key outcome of the workshop was the finalization of a stable version of the MIGS specification (v2.0) for publication. This accomplishment enables, and also in some cases necessitates, downstream activities, which are described in the multiauthor, consensus-driven articles in this special issue of OMICS produced as a direct result of the workshop. This report briefly summarizes the workshop and overviews the special issue. In particular, it aims to explain how the various GSC-led projects are working together to help this community achieve its stated mission of further standardizing the descriptions of genomes and metagenomes and implementing improved mechanisms of data exchange and integration to enable more accurate comparative analyses. Further information about the GSC and its range of activities can be found at http://gensc.org.
  • Gemina: A Web-Based Epidemiology and Genomic Metadata System Designed to Identify Infectious Agents.

    Lynn M. Schriml, Aaron Gussman, Kathy Phillippy, Samuel V. Angiuoli, Kumar Hari, Alan Goates, Ravi Jain, Tanja Davidsen, Anurhada Ganapathy, Elodie Ghedin, Steven Salzberg, Owen White, Neil Hall

    Intelligence and Security Informatics: Biosurveillance, Second NSF Workshop, BioSurveillance 2007, New Brunswick, NJ, USA, May 22, 2007, Proceedings.; 01/2007

  • Resources for genetic and genomic studies of Xenopus.

    Steven L Klein, Daniela S Gerhard, Lukas Wagner, Paul Richardson, Lynn M Schriml, Amy K Sater, Wesley C Warren, John D McPherson

    Methods in molecular biology (Clifton, N.J.). 02/2006; 322:1-16.

    The National Institutes of Health Xenopus Initiative is a concerted effort to interact with the Xenopus research community to identify the community's needs; to devise strategies to meet those needs; and to support, oversee, and coordinate the resulting projects. This chapter provides a brief description of several genetic and genomic resources generated by this initiative and explains how to access them. The resources described in this chapter are (1) complementary deoxyribonucleic acid (cDNA) libraries and expressed sequence tag (EST) sequences; (2) UniGene clusters; (3) full-insert cDNA sequences; (4) a genetic map; (5) genomic libraries; (6) a physical map; (7) genome sequence; (8) microarrays; (9) mutagenesis and phenotyping; and (10) bioinformatics. The descriptions presented here were based on data that were available at the time of manuscript submission. Because these are ongoing projects, they are constantly generating new data and analyses. The Web sites cited in each subheading present current data and analyses.
  • 7.48
    Impact points
    Database resources of the National Center for Biotechnology Information.

    David L Wheeler, Tanya Barrett, Dennis A Benson, Stephen H Bryant, Kathi Canese, Vyacheslav Chetvernin, Deanna M Church, Michael DiCuccio, Ron Edgar, Scott Federhen, [......], Edwin Sequeira, Stephen T Sherry, Karl Sirotkin, Alexandre Souvorov, Grigory Starchenko, Tugba O Suzek, Roman Tatusov, Tatiana A Tatusova, Lukas Wagner, Eugene Yaschenko

    Nucleic acids research. 02/2006; 34(Database issue):D173-80.

    In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
  • 7.48
    Impact points
    Database resources of the National Center for Biotechnology Information.

    David L Wheeler, Tanya Barrett, Dennis A Benson, Stephen H Bryant, Kathi Canese, Deanna M Church, Michael DiCuccio, Ron Edgar, Scott Federhen, Wolfgang Helmberg, [......], Lynn M Schriml, Edwin Sequeira, Steven T Sherry, Karl Sirotkin, Grigory Starchenko, Tugba O Suzek, Roman Tatusov, Tatiana A Tatusova, Lukas Wagner, Eugene Yaschenko

    Nucleic acids research. 02/2005; 33(Database issue):D39-45.

    In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data retrieval systems and computational resources for the analysis of data in GenBank and other biological data made available through NCBI's website. NCBI resources include Entrez, Entrez Programming Utilities, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov.
  • 7.48
    Impact points
    Database resources of the National Center for Biotechnology Information: update.

    David L Wheeler, Deanna M Church, Ron Edgar, Scott Federhen, Wolfgang Helmberg, Thomas L Madden, Joan U Pontius, Gregory D Schuler, Lynn M Schriml, Edwin Sequeira, Tugba O Suzek, Tatiana A Tatusova, Lukas Wagner

    Nucleic acids research. 02/2004; 32(Database issue):D35-40.

    In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
  • 11.34
    Impact points
    Human disease genes and their cloned mouse orthologs: exploration of the FANTOM2 cDNA sequence data set.

    Lynn M Schriml, David P Hill, Judith A Blake, Hidemasa Bono, Anthony Wynshaw-Boris, William J. Pavan, Brian Z Ring, Kirk Beisel, Mitsutoshi Setou, Yasushi Okazaki

    Genome research. 07/2003; 13(6B):1496-500.

    The FANTOM2 cDNA sequence data set is an excellent model to demonstrate the power of large-scale cDNA sequencing, with the goal of providing a full-length transcript sequence for each mouse gene. This data set enhances the use of the mouse as a model for human disease. Here we identify mouse cDNA sequences in the FANTOM2 data set for a set of 67 human disease genes that as of May 2002 had no corresponding mouse cDNA annotated in the Mouse Genome Informatics (MGI) database. These 67 human disease genes include genes related to neurological and eye disorders and cancer. We also present a list of the human disease genes and their cloned mouse orthologs found in two public databases, LocusLink and MGI. Allelic variant and gene functional information available in MGI provides additional information relative to these mouse models, whereas computed sequence-based connections at NCBI support facile navigation through multiple genomes.
  • 11.34
    Impact points
    Connecting sequence and biology in the laboratory mouse.

    Richard M Baldarelli, David P Hill, Judith A Blake, Jun Adachi, Masaaki Furuno, Dirck Bradt, Lori E Corbani, Sharon Cousins, Kenneth S Frazer, Dong Qi, [......], Lois J Maltais, Louise M McKenzie, Lynn M Schriml, Donna Maglott, Deanna M Church, Kim Pruitt, Janan T Eppig, Joel E Richardson, Jim A Kadin, Carol J Bult

    Genome research. 07/2003; 13(6B):1505-19.

    The Mouse Genome Sequencing Consortium and the RIKEN Genome Exploration Research group have generated large sets of sequence data representing the mouse genome and transcriptome, respectively. These data provide a valuable foundation for genomic research. The challenges for the informatics community are how to integrate these data with the ever-expanding knowledge about the roles of genes and gene products in biological processes, and how to provide useful views to the scientific community. Public resources, such as the National Center for Biotechnology Information (NCBI; http://www.ncbi.nih.gov), and model organism databases, such as the Mouse Genome Informatics database (MGI; http://www.informatics.jax.org), maintain the primary data and provide connections between sequence and biology. In this paper, we describe how the partnership of MGI and NCBI LocusLink contributes to the integration of sequence and biology, especially in the context of the large-scale genome and transcriptome data now available for the laboratory mouse. In particular, we describe the methods and results of integration of 60,770 FANTOM2 mouse cDNAs with gene records in the databases of MGI and LocusLink.
