-
ABSTRACT: One of the major challenges of the post-genomic world is the efficient application of genomic information to solve practical problems such as the improvement of breeding processes. The Solanaceae Genomics Network (SGN) (http://solgenomics.net/) is a clade-oriented genome and phenome database for species such as the recently sequenced tomato and potato, and will cover the species of the SOL-100 project, for which the high-quality tomato sequence will serve as a reference. The incorporation of genotypic and phenotypic information in the database has enabled tools that link phenotypes to genotypes, and therefore functionality that can inform the breeding process. Currently implemented in the platform are extensive web-based phenotype and trait data management tools, pedigree data tools, and the solQTL tool, which uses QTL analysis to link genome regions to traits of interest. Data export functions for formats compatible with major association genetics software are also available. Under development are features that accommodate new breeding paradigms such as Genomic Selection, and simple interfaces for the management of breeding programs. The software can be applied to any other species or clade and is available as open source.
Plant and Animal Genome XXI Conference; 01/2013
-
ABSTRACT: The current challenge in biology is to link the rapidly accumulating sequence information to phenotypic information. The Sol Genomics Network (SGN, http://solgenomics.net/) is a clade-oriented database for the Solanaceae, including tomato, potato, pepper and petunia, and other Asterid plants. Genome to phenome linkages have been the focus of SGN for much of the past 10 years. Sequence data is obtained from large sequencing projects (tomato, potato) and resources such as GenBank, while the phenotypic data is obtained from sources such as the SolCAP project (http://solcap.msu.edu/) and submissions by individual researchers. Currently, SGN has phenotypic information on more than 20,000 accessions, including free text descriptions, ontology-based annotations, literature references, and images. To precisely describe the phenotypes of the Solanaceae, the Solanaceae Phenotype Ontology (SP) has been developed as a pre-composed ontology linked to the Plant Ontology (PO) and PATO. The phenotypic database at SGN is based on the Chado Natural Diversity module, which was co-developed by a number of databases, including SGN, the Genome Database for Rosaceae (GDR, http://rosaceae.org/), and VectorBase. To link genotype to phenotype, the solQTL web tool was implemented in R and Perl; it allows QTL analyses to be run on the fly on the website for any trait that has the required data (genotype and phenotype) in the database. Other approaches, such as Association Genetics, and applications for breeders based on Genomic Selection, will be implemented in the future.
Plant and Animal Genome XX Conference (January 14-18, 2012); 01/2012
-
ABSTRACT: The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub (http://github.com). The database architecture combines SGN-specific schemas and the community-developed Chado schema (http://gmod.org/wiki/Chado) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/.
Nucleic Acids Research 10/2010; 39(Database issue):D1149-55. · 8.03 Impact Factor
-
ABSTRACT: A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort. Furthermore, as entire genomes are being sequenced and an increasing amount of genetic and expression data are being generated, a challenge remains: linking phenotypic variation to the underlying genomic variation. To identify candidate genes and understand the molecular basis underlying the phenotypic variation of traits, bioinformatic approaches are needed to exploit information such as genetic map, expression and whole genome sequence data of organisms in biological databases.
The Sol Genomics Network (SGN, http://solgenomics.net) is a primary repository for phenotypic, genetic, genomic, expression and metabolic data for the Solanaceae family and other related Asterid species, and houses a variety of bioinformatics tools. SGN has implemented a new approach to QTL data organization, storage, analysis, and cross-linking with other relevant data in internal and external databases. The new QTL module, solQTL (http://solgenomics.net/qtl/), employs a user-friendly web interface for uploading raw phenotype and genotype data to the database, R/QTL mapping software for on-the-fly QTL analysis, and algorithms for online visualization and cross-referencing of QTLs to relevant datasets and tools such as the SGN Comparative Map Viewer and Genome Browser. Here, we describe the development of the solQTL module and demonstrate its application.
solQTL allows Solanaceae researchers to upload raw genotype and phenotype data to SGN, perform QTL analysis and dynamically cross-link to relevant genetic, expression and genome annotations. Exploration and synthesis of the relevant data is expected to facilitate identification of candidate genes underlying phenotypic variation and of markers more closely linked to QTLs. solQTL is freely available on SGN and can be used in private or public mode.
BMC Bioinformatics 10/2010; 11:525. · 2.75 Impact Factor
-
ABSTRACT: Quantitative trait loci (QTL) analysis is used to dissect the genetic basis underlying polygenic traits. Several public databases have been storing and making QTL data available to research communities. To our knowledge, current QTL databases rely on manual curation, where curators read literature and extract relevant QTL information to store in databases. Evidently, this approach is expensive in terms of expert manpower and time, and limits the type of data that can be curated. At the Solanaceae Genomics Network (SGN) (http://sgn.cornell.edu), we have developed a database to store raw phenotype and genotype data from QTL studies, perform QTL analysis on the fly using the R/QTL statistical software (http://www.rqtl.org), and visualize QTLs on a genetic map. Users can identify peak and flanking markers for QTLs of traits of interest. The QTL database is integrated with other SGN databases (e.g. Marker, BACs, and Unigenes) and analysis tools such as the Comparative Map Viewer. Using the Comparative Map Viewer, users can compare chromosomes with QTL regions to genetic maps of interest from the same or different Solanaceae species. As the tomato genome sequencing advances, users can also identify corresponding BAC sequences or locations on the tomato physical map, which can be suggestive of candidate genes for a trait of interest.
Furthermore, images, quantitative phenotype and genotype data, publications, and genetic maps generated by QTL studies are displayed at SGN and available for download. Currently, data from three F2 and two backcross population QTL studies on fruit morphology traits (18–46 traits per population) is available at the SGN website for viewing at the population, accession, and trait levels. Traits are described using ontology terms. Phenotype data is presented in tabular and graphical formats, such as frequency distributions with basic descriptive statistics. Mapping data showing the location of parental alleles on individual accession genetic maps is also available.
SGN is a public database hosted at the Boyce Thompson Institute, Cornell University, and funded by USDA CSREES and NSF.
Nature Precedings.