-
Cristina Aurrecoechea,
Ana Barreto,
John Brestelli,
Brian P Brunk,
Shon Cade,
Ryan Doherty,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle, [......],
Jessica C Kissinger,
Eileen T Kraemer,
Wei Li,
Deborah F Pinney,
Brian Pitts,
David S Roos,
Ganesh Srinivasamoorthy,
Christian J Stoeckert,
Haiming Wang,
Susanne Warrenfeltz
ABSTRACT: EuPathDB (http://eupathdb.org) resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.
Nucleic Acids Research 11/2012; · 8.03 Impact Factor
-
Jason E Stajich,
Todd Harris,
Brian P Brunk,
John Brestelli,
Steve Fischer,
Omar S Harb,
Jessica C Kissinger,
Wei Li,
Vishal Nayak,
Deborah F Pinney,
Chris J Stoeckert,
David S Roos
ABSTRACT: FungiDB (http://FungiDB.org) is a functional genomic resource for pan-fungal genomes that was developed in partnership with the Eukaryotic Pathogen Bioinformatic resource center (http://EuPathDB.org). FungiDB uses the same infrastructure and user interface as EuPathDB, which allows for sophisticated and integrated searches to be performed using an intuitive graphical system. The current release of FungiDB contains genome sequence and annotation from 18 species spanning several fungal classes, including the Ascomycota classes, Eurotiomycetes, Sordariomycetes, Saccharomycetes and the Basidiomycota orders, Pucciniomycetes and Tremellomycetes, and the basal 'Zygomycete' lineage Mucormycotina. Additionally, FungiDB contains cell cycle microarray data, hyphal growth RNA-sequence data and yeast two hybrid interaction data. The underlying genomic sequence and annotation combined with functional data, additional data from the FungiDB standard analysis pipeline and the ability to leverage orthology provides a powerful resource for in silico experimentation.
Nucleic Acids Research 11/2011; 40(Database issue):D675-81. · 8.03 Impact Factor
-
ABSTRACT: OrthoMCL is an algorithm for grouping proteins into ortholog groups based on their sequence similarity. OrthoMCL-DB is a public database that allows users to browse and view ortholog groups that were pre-computed using the OrthoMCL algorithm. Version 4 of this database contained 116,536 ortholog groups clustered from 1,270,853 proteins obtained from 88 eukaryotic genomes, 16 archaean genomes, and 34 bacterial genomes. Future versions of OrthoMCL-DB will include more proteomes as more genomes are sequenced. Here, we describe how you can group your proteins of interest into ortholog clusters using two different means provided by the OrthoMCL system. The OrthoMCL-DB Web site has a tool for uploading and grouping a set of protein sequences, typically representing a proteome. This method maps the uploaded proteins to existing groups in OrthoMCL-DB. Alternatively, if you have proteins from a set of genomes that need to be grouped, you can download, install, and run the stand-alone OrthoMCL software.
Current protocols in bioinformatics / editorial board, Andreas D. Baxevanis ... [et al.] 09/2011; Chapter 6:Unit 6.12.1-19.
-
Cristina Aurrecoechea,
Ana Barreto,
John Brestelli,
Brian P. Brunk,
Elisabet V. Caler,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Gregory R. Grant, [......],
Wei Li,
Vishal Nayak,
Cary Pennington,
Deborah F. Pinney,
Brian Pitts,
David S. Roos,
Ganesh Srinivasamoorthy,
Christian J. Stoeckert Jr,
Charles Treatman,
Haiming Wang
Nucleic Acids Research. 01/2011; 39:612-619.
-
Steve Fischer,
Cristina Aurrecoechea,
Brian P Brunk,
Xin Gao,
Omar S Harb,
Eileen T Kraemer,
Cary Pennington,
Charles Treatman,
Jessica C Kissinger,
David S Roos,
Christian J Stoeckert
ABSTRACT: Web sites associated with the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) have recently introduced a graphical user interface, the Strategies WDK, intended to make advanced searching and set and interval operations easy and accessible to all users. With a design guided by usability studies, the system helps motivate researchers to perform dynamic computational experiments and explore relationships across data sets. For example, PlasmoDB users seeking novel therapeutic targets may wish to locate putative enzymes that distinguish pathogens from their hosts, and that are expressed during appropriate developmental stages. When a researcher runs one of the approximately 100 searches available on the site, the search is presented as a first step in a strategy. The strategy is extended by running additional searches, which are combined with set operators (union, intersect or minus), or genomic interval operators (overlap, contains). A graphical display uses Venn diagrams to make the strategy's flow obvious. The interface facilitates interactive adjustment of the component searches with changes propagating forward through the strategy. Users may save their strategies, creating protocols that can be shared with colleagues. The strategy system has now been deployed on all EuPathDB databases, and successfully deployed by other projects. The Strategies WDK uses a configurable MVC architecture that is compatible with most genomics and biological warehouse databases, and is available for download at code.google.com/p/strategies-wdk. Database URL: www.eupathdb.org.
Database The Journal of Biological Databases and Curation 01/2011; 2011:bar027. · 2.07 Impact Factor
-
Cristina Aurrecoechea,
Ana Barreto,
John Brestelli,
Brian P Brunk,
Elisabet V Caler,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Greg Grant, [......],
Wei Li,
Vishal Nayak,
Cary Pennington,
Deborah F Pinney,
Brian Pitts,
David S Roos,
Ganesh Srinivasamoorthy,
Christian J Stoeckert,
Charles Treatman,
Haiming Wang
ABSTRACT: AmoebaDB (http://AmoebaDB.org) and MicrosporidiaDB (http://MicrosporidiaDB.org) are new functional genomic databases serving the amoebozoa and microsporidia research communities, respectively. AmoebaDB contains the genomes of three Entamoeba species (E. dispar, E. invadens and E. histolytica) and microarray expression data for E. histolytica. MicrosporidiaDB contains the genomes of Encephalitozoon cuniculi, E. intestinalis and E. bieneusi. The databases belong to the National Institute of Allergy and Infectious Diseases (NIAID) funded EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center family of integrated databases and assume the same architectural and graphical design as other EuPathDB resources such as PlasmoDB and TriTrypDB. Importantly they utilize the graphical strategy builder that affords a database user the ability to ask complex multi-data-type questions with relative ease and versatility. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs, protein characteristics, phylogenetic relationships and functional data such as transcript (microarray and EST evidence) and protein expression data. Search strategies can be saved within a user's profile for future retrieval and may also be shared with other researchers using a unique strategy web address.
Nucleic Acids Research 10/2010; 39(Database issue):D612-9. · 8.03 Impact Factor
-
Cristina Aurrecoechea,
John Brestelli,
Brian P. Brunk,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Gregory R. Grant,
Omar S. Harb,
Mark Heiges, [......],
Vishal Nayak,
Cary Pennington,
Deborah F. Pinney,
David S. Roos,
Chris Ross,
Ganesh Srinivasamoorthy,
Christian J. Stoeckert Jr,
Ryan Thibodeau,
Charles Treatman,
Haiming Wang
Nucleic Acids Research. 01/2010; 38:415-419.
-
Martin Aslett,
Cristina Aurrecoechea,
Matthew Berriman,
John Brestelli,
Brian P. Brunk,
Mark Carrington,
Daniel P. Depledge,
Steve Fischer,
Bindu Gajria,
Xin Gao, [......],
Dhileep Sivam,
Deborah F. Smith,
Ganesh Srinivasamoorthy,
Christian J. Stoeckert Jr,
Sandhya Subramanian,
Ryan Thibodeau,
Adrian Tivey,
Charles Treatman,
Giles Velarde,
Haiming Wang
Nucleic Acids Research. 01/2010; 38:457-462.
-
Cristina Aurrecoechea,
John Brestelli,
Brian P Brunk,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Greg Grant,
Omar S Harb,
Mark Heiges, [......],
Vishal Nayak,
Cary Pennington,
Deborah F Pinney,
David S Roos,
Chris Ross,
Ganesh Srinivasamoorthy,
Christian J Stoeckert,
Ryan Thibodeau,
Charles Treatman,
Haiming Wang
ABSTRACT: EuPathDB (http://EuPathDB.org; formerly ApiDB) is an integrated database covering the eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera. The most recent release of EuPathDB includes updates and changes affecting data content, infrastructure and the user interface, improving data access and enhancing the user experience. EuPathDB currently supports more than 80 searches and the recently-implemented 'search strategy' system enables users to construct complex multi-step searches via a graphical interface. Search results are dynamically displayed as the strategy is constructed or modified, and can be downloaded, saved, revised, or shared with other database users.
Nucleic Acids Research 11/2009; 38(Database issue):D415-9. · 8.03 Impact Factor
-
Martin Aslett,
Cristina Aurrecoechea,
Matthew Berriman,
John Brestelli,
Brian P Brunk,
Mark Carrington,
Daniel P Depledge,
Steve Fischer,
Bindu Gajria,
Xin Gao, [......],
Dhileep Sivam,
Deborah F Smith,
Ganesh Srinivasamoorthy,
Christian J Stoeckert,
Sandhya Subramanian,
Ryan Thibodeau,
Adrian Tivey,
Charles Treatman,
Giles Velarde,
Haiming Wang
ABSTRACT: TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.
Nucleic Acids Research 10/2009; 38(Database issue):D457-62. · 8.03 Impact Factor
-
Cristina Aurrecoechea,
John Brestelli,
Brian P. Brunk,
Jennifer Dommer,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Gregory R. Grant,
Omar S. Harb, [......],
Wei Li,
John A. Miller,
Vishal Nayak,
Cary Pennington,
Deborah F. Pinney,
David S. Roos,
Chris Ross,
Christian J. Stoeckert Jr,
Charles Treatman,
Haiming Wang
Nucleic Acids Research. 01/2009; 37:539-543.
-
Cristina Aurrecoechea,
John Brestelli,
Brian P. Brunk,
Jane M. Carlton,
Jennifer Dommer,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Gregory R. Grant, [......],
Hilary G. Morrison,
Vishal Nayak,
Cary Pennington,
Deborah F. Pinney,
David S. Roos,
Chris Ross,
Christian J. Stoeckert Jr,
Steven Sullivan,
Charles Treatman,
Haiming Wang
Nucleic Acids Research. 01/2009; 37:526-530.
-
Cristina Aurrecoechea,
John Brestelli,
Brian P Brunk,
Jennifer Dommer,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Greg Grant,
Omar S Harb, [......],
Wei Li,
John A Miller,
Vishal Nayak,
Cary Pennington,
Deborah F Pinney,
David S Roos,
Chris Ross,
Christian J Stoeckert,
Charles Treatman,
Haiming Wang
ABSTRACT: PlasmoDB (http://PlasmoDB.org) is a functional genomic database for Plasmodium spp. that provides a resource for data analysis and visualization in a gene-by-gene or genome-wide scale. PlasmoDB belongs to a family of genomic resources that are housed under the EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center (BRC) umbrella. The latest release, PlasmoDB 5.5, contains numerous new data types from several broad categories--annotated genomes, evidence of transcription, proteomics evidence, protein function evidence, population biology and evolution. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop down menus. Various results can then be combined with each other on the query history page. Search results can be downloaded with associated functional data and registered users can store their query history for future retrieval or analysis.
Nucleic Acids Research 11/2008; 37(Database issue):D539-43. · 8.03 Impact Factor
-
Cristina Aurrecoechea,
John Brestelli,
Brian P Brunk,
Jane M Carlton,
Jennifer Dommer,
Steve Fischer,
Bindu Gajria,
Xin Gao,
Alan Gingle,
Greg Grant, [......],
Hilary G Morrison,
Vishal Nayak,
Cary Pennington,
Deborah F Pinney,
David S Roos,
Chris Ross,
Christian J Stoeckert,
Steven Sullivan,
Charles Treatman,
Haiming Wang
ABSTRACT: GiardiaDB (http://GiardiaDB.org) and TrichDB (http://TrichDB.org) house the genome databases for Giardia lamblia and Trichomonas vaginalis, respectively, and represent the latest additions to the EuPathDB (http://EuPathDB.org) family of functional genomic databases. GiardiaDB and TrichDB employ the same framework as other EuPathDB sites (CryptoDB, PlasmoDB and ToxoDB), supporting fully integrated and searchable databases. Genomic-scale data available via these resources may be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs and other protein characteristics. Functional queries may also be formulated, based on transcript and protein expression data from a variety of platforms. Phylogenetic relationships may also be interrogated. The ability to combine the results from independent queries, and to store queries and query results for future use facilitates complex, genome-wide mining of functional genomic data.
Nucleic Acids Research 10/2008; 37(Database issue):D526-30. · 8.03 Impact Factor
-
Bindu Gajria,
Amit Bahl,
John Brestelli,
Jennifer Dommer,
Steve Fischer,
Xin Gao,
Mark Heiges,
John Iodice,
Jessica C Kissinger,
Aaron J Mackey,
Deborah F Pinney,
David S Roos,
Christian J Stoeckert,
Haiming Wang,
Brian P Brunk
ABSTRACT: ToxoDB (http://ToxoDB.org) is a genome and functional genomic database for the protozoan parasite Toxoplasma gondii. It incorporates the sequence and annotation of the T. gondii ME49 strain, as well as genome sequences for the GT1, VEG and RH (Chr Ia, Chr Ib) strains. Sequence information is integrated with various other genomic-scale data, including community annotation, ESTs, gene expression and proteomics data. ToxoDB has matured significantly since its initial release. Here we outline the numerous updates with respect to the data and increased functionality available on the website.
Nucleic Acids Research 02/2008; 36(Database issue):D553-6. · 8.03 Impact Factor
-
Bindu Gajria,
Amit Bahl,
John Brestelli,
Jennifer Dommer,
Steve Fischer,
Xin Gao,
Mark Heiges,
John Iodice,
Jessica C. Kissinger,
Aaron J. Mackey,
Deborah F. Pinney,
David S. Roos,
Christian J. Stoeckert Jr,
Haiming Wang,
Brian P. Brunk
Nucleic Acids Research. 01/2008; 36:553-556.
-
Cristina Aurrecoechea,
Mark Heiges,
Haiming Wang,
Zhiming Wang,
Steve Fischer,
Philippa Rhodes,
John Miller,
Eileen Kraemer,
Christian J Stoeckert,
David S Roos,
Jessica C Kissinger
ABSTRACT: ApiDB (http://ApiDB.org) represents a unified entry point for the NIH-funded Apicomplexan Bioinformatics Resource Center (BRC) that integrates numerous database resources and multiple data types. The phylum Apicomplexa comprises numerous veterinary and medically important parasitic protozoa including human pathogenic species of the genera Cryptosporidium, Plasmodium and Toxoplasma. ApiDB serves not only as a database in its own right, but as a single web-based point of entry that unifies access to three major existing individual organism databases (PlasmoDB.org, ToxoDB.org and CryptoDB.org), and integrates these databases with data available from additional sources. Through the ApiDB site, users may pose queries and search all available apicomplexan data and tools, or they may visit individual component organism databases.
Nucleic Acids Research 02/2007; 35(Database issue):D427-30. · 8.03 Impact Factor
-
ABSTRACT: Version 5.1 of PlasmoDB, a resource for malaria parasite genomic and functional genomics datasets, was released in August 2006. This new release includes additional Plasmodium genomes and a newly designed website. The new site reflects the status of PlasmoDB as a member of a linked family of Apicomplexan databases.
Trends in Parasitology 01/2007; 22(12):543-6. · 5.14 Impact Factor
-
ABSTRACT: ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
Nucleic Acids Research 02/2004; 32(Database issue):D326-8. · 8.03 Impact Factor
-
Klaus H Kaestner,
Catherine S Lee,
L Marie Scearce,
John E Brestelli,
Athanasios Arsenlis,
Phillip Phuc Le,
Kristen A Lantz,
Jonathan Crabtree,
Angel Pizarro,
Joan Mazzarelli,
Deborah Pinney,
Steve Fischer,
Elisabetta Manduchi,
Christian J Stoeckert,
Gerard Gradwohl,
Sandra W Clifton,
Juliana R Brown,
Hiroshi Inoue,
Corentin Cras-Méneur,
M Alan Permutt
ABSTRACT: The Endocrine Pancreas Consortium was formed in late 1999 to derive and sequence cDNA libraries enriched for rare transcripts expressed in the mammalian endocrine pancreas. Over the past 3 years, the Consortium has generated 20 cDNA libraries from mouse and human pancreatic tissues and deposited >150,000 sequences into the public expressed sequence tag databases. A special effort was made to enrich for cDNAs from the endocrine pancreas by constructing libraries from isolated islets. In addition, we constructed a library in which fetal pancreas from Neurogenin 3 null mice, which consists of only exocrine and duct cells, was subtracted from fetal wild-type pancreas to enrich for the transcripts from the endocrine compartment. Sequence analysis showed that these clones cluster into 9,464 assembly groups (approximating unique transcripts) for the mouse and 13,910 for the human sequences. Of these, >4,300 were unique to Consortium libraries. We have assembled a core clone set containing one cDNA for each assembly group for the mouse and have constructed the corresponding microarray, termed "PancChip 4.0," which contains >9,000 nonredundant elements. We show that this PancChip is highly enriched for genes expressed in the endocrine pancreas. The mouse and human clone sets and corresponding arrays will be important resources for diabetes research.
Diabetes 08/2003; 52(7):1604-10. · 8.29 Impact Factor