Ilene Karsch-Mizrachi

Publications

  • 7.48
    Impact points
    GenBank.

    Dennis A Benson, Ilene Karsch-Mizrachi, Karen Clark, David J Lipman, James Ostell, Eric W Sayers

    Nucleic acids research. 12/2011; 40(Database issue):D48-53.

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 250,000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.
  • 7.48
    Impact points
    Database resources of the National Center for Biotechnology Information.

    Eric W Sayers, Tanya Barrett, Dennis A Benson, Evan Bolton, Stephen H Bryant, Kathi Canese, Vyacheslav Chetvernin, Deanna M Church, Michael Dicuccio, Scott Federhen, [......], Karl Sirotkin, Douglas Slotta, Alexandre Souvorov, Grigory Starchenko, Tatiana A Tatusova, Lukas Wagner, Yanli Wang, W John Wilbur, Eugene Yaschenko, Jian Ye

    Nucleic acids research. 12/2011; 40(Database issue):D13-25.

    In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
  • 7.48
    Impact points
    BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata.

    Tanya Barrett, Karen Clark, Robert Gevorgyan, Vyacheslav Gorelenkov, Eugene Gribov, Ilene Karsch-Mizrachi, Michael Kimelman, Kim D Pruitt, Sergei Resenchuk, Tatiana Tatusova, Eugene Yaschenko, James Ostell

    Nucleic acids research. 12/2011; 40(Database issue):D57-63.

    As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, previously, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal by which to inform users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for input of rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.
  • 7.48
    Impact points
    The International Nucleotide Sequence Database Collaboration.

    Ilene Karsch-Mizrachi, Yasukazu Nakamura, Guy Cochrane

    Nucleic acids research. 11/2011; 40(Database issue):D33-7.

    The members of the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) set out to capture, preserve and present globally comprehensive public domain nucleotide sequence information. The work of the long-standing collaboration includes the provision of data formats, annotation conventions and routine global data exchange. Among the many developments to INSDC resources in 2011 are the newly launched BioProject database and improved handling of assembly information. In this article, we outline INSDC services and update the reader on developments in 2011.
  • 12.92
    Impact points
    The Genomic Standards Consortium.

    Dawn Field, Linda Amaral-Zettler, Guy Cochrane, James R Cole, Peter Dawyndt, George M Garrity, Jack Gilbert, Frank Oliver Glöckner, Lynette Hirschman, Ilene Karsch-Mizrachi, [......], Nikos Kyrpides, Folker Meyer, Inigo San Gil, Susanna-Assunta Sansone, Lynn M Schriml, Peter Sterk, Tatiana Tatusova, David W Ussery, Owen White, John Wooley

    PLoS biology. 06/2011; 9(6):e1001088.

    A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.
  • 29.50
    Impact points
    Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

    Pelin Yilmaz, Renzo Kottmann, Dawn Field, Rob Knight, James R Cole, Linda Amaral-Zettler, Jack A Gilbert, Ilene Karsch-Mizrachi, Anjanette Johnston, Guy Cochrane, [......], James M Tiedje, Doyle V Ward, George M Weinstock, Doug Wendel, Owen White, Andrew Whiteley, Andreas Wilke, Jennifer R Wortman, Tanya Yatsunenko, Frank Oliver Glöckner

    Nature biotechnology. 05/2011; 29(5):415-20.

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental packages' apply to any genome sequence of known origin and can be used in combination with MIMARKS and other GSC checklists. Finally, to establish a unified standard for describing sequence data and to provide a single point of entry for the scientific community to access and learn about GSC checklists, we present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere.
  • 7.48
    Impact points
    Towards BioDBcore: a community-defined information specification for biological databases.

    Pascale Gaudet, Amos Bairoch, Dawn Field, Susanna-Assunta Sansone, Chris Taylor, Teresa K Attwood, Alex Bateman, Judith A Blake, Carol J Bult, J Michael Cherry, [......], Lorna Richardson, Philippe Rocca-Serra, Paul N Schofield, Damian Smedley, Christopher Southan, Tin Wee Tan, Tatiana Tatusova, Patricia L Whetzel, Owen White, Chisato Yamasaki

    Nucleic acids research. 01/2011; 39(Database issue):D7-10.

    The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
  • 7.48
    Impact points
    The International Nucleotide Sequence Database Collaboration.

    Guy Cochrane, Ilene Karsch-Mizrachi, Yasukazu Nakamura

    Nucleic acids research. 01/2011; 39(Database issue):D15-8.

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.
  • 7.48
    Impact points
    GenBank.

    Dennis A Benson, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, Eric W Sayers

    Nucleic acids research. 11/2010; 39(Database issue):D32-7.

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
  • 7.48
    Impact points
    GenBank.

    Dennis A Benson, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, David L Wheeler

    Nucleic acids research. 02/2008; 36(Database issue):D25-30.

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 260 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
  • Managing sequence data.

    Ilene Karsch Mizrachi

    Methods in molecular biology (Clifton, N.J.). 02/2008; 452:3-27.

    Nucleotide and protein sequences are the foundation for all bioinformatics tools and resources. Researchers can analyze these sequences to discover genes or predict the function of their products. The INSD (International Nucleotide Sequence Database--DDBJ/EMBL/GenBank) is an international, centralized primary sequence resource that is freely available on the internet. This database contains all publicly available nucleotide and derived protein sequences. This chapter summarizes the nucleotide sequence database resources, provides information on how to submit sequences to the databases, and explains how to access the sequence data.
  • 2.29
    Impact points
    Evidence standards in experimental and inferential INSDC Third Party Annotation data.

    Guy Cochrane, Kirsty Bates, Rolf Apweiler, Yoshio Tateno, Jun Mashima, Takehide Kosuge, Ilene Karsch-Mizrachi, Susan Schafer, Michael Fetchko

    Omics : a journal of integrative biology. 02/2006; 10(2):105-13.

    The Third Party Annotation (TPA) project collects and presents high-quality annotation of nucleotide sequence. Annotation is submitted by researchers who have not themselves generated novel nucleotide sequence. In its first few years, the resource has proven to be popular with submitters from a range of biological research areas. Central to the project is the requirement for high-quality data, resulting from experimental and inferred analysis discussed in peer-reviewed publications. The data are divided into two tiers: those with experimental evidence and those with inferential evidence. Standards for TPA are detailed and illustrated with the aid of case studies.