Article
Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions.
Biological Systems Department, UMR DAP TA 40/03, CIRAD, Montpellier, France.
BMC Genomics (impact factor: 4.07). 11/2008; 9:512. DOI:10.1186/1471-2164-9-512
Source: PubMed
Citations (18)
Article: Reliable identification of large numbers of candidate SNPs from public EST data.
ABSTRACT: High-resolution genetic analysis of the human genome promises to provide insight into common disease susceptibility. To perform such analysis will require a collection of high-throughput, high-density analysis reagents. We have developed a polymorphism detection system that uses public-domain sequence data. This detection system is called the single nucleotide polymorphism pipeline (SNPpipeline). The analytic core of the SNPpipeline is composed of three components: PHRED, PHRAP and DEMIGLACE. PHRED and PHRAP are components of a sequence analysis suite developed to perform the semi-automated analysis required for large-scale genomes (provided courtesy of P. Green). Using these informatics tools, which examine redundant raw expressed sequence tag (EST) data, we have identified more than 3,000 candidate single-nucleotide polymorphisms (SNPs). Empiric validation studies of a set of 192 candidates indicate that 82% identify variation in a sample of ten Centre d'Etudes Polymorphism Humain (CEPH) individuals. Our results suggest that existing sequence resources may serve as a valuable source for identifying genetic variation.
Nature Genetics 04/1999; 21(3):323-5. · 35.53 Impact Factor
Article: QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species.
ABSTRACT: Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only. We have developed a new algorithm to detect reliable SNPs and insertions/deletions (indels) in EST data, both with and without quality files. Implemented in a pipeline called QualitySNP, it uses three filters for the identification of reliable SNPs. Filter 1 screens for all potential SNPs and identifies variation between or within genotypes. Filter 2 is the core filter that uses a haplotype-based strategy to detect reliable SNPs. Clusters with potential paralogs as well as false SNPs caused by sequencing errors are identified. Filter 3 screens SNPs by calculating a confidence score, based upon sequence redundancy and quality. Non-synonymous SNPs are subsequently identified by detecting open reading frames of consensus sequences (contigs) with SNPs. The pipeline includes a data storage and retrieval system for haplotypes, SNPs and alignments. QualitySNP's versatility is demonstrated by the identification of SNPs in EST datasets from potato, chicken and humans. QualitySNP is an efficient tool for SNP detection, storage and retrieval in diploid as well as polyploid species. It is available for running on Linux or UNIX systems. 
The program, test data, and user manual are available at http://www.bioinformatics.nl/tools/snpweb/ and as Additional files.
BMC Bioinformatics 02/2006; 7:438. · 2.75 Impact Factor
Article: Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data.
ABSTRACT: We have developed a computer-based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertion/deletions identified using this approach represent true genetic variation in maize.
Plant Physiology 06/2003; 132(1):84-91. · 6.53 Impact Factor
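The redundancy-based idea in the maize abstract can be illustrated with a minimal Python sketch. This is not the published method: it assumes the reads are already aligned as equal-length strings, the function name `candidate_snps` and the `min_redundancy` parameter are hypothetical, and the cosegregation score mentioned in the abstract is not modelled. A column is reported only when at least two different bases are each seen `min_redundancy` or more times, so a base seen once (a likely sequencing error) is ignored.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_redundancy=2):
    """Return (column, base_counts) for alignment columns in which at least
    two different bases are each supported by min_redundancy or more reads.

    aligned_reads: list of equal-length strings (hypothetical input format);
    gaps ('-') and ambiguous bases such as 'N' are ignored.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    results = []
    for col in range(length):
        counts = Counter(read[col].upper() for read in aligned_reads)
        # Keep only unambiguous bases that reach the redundancy threshold.
        supported = {
            base: n
            for base, n in counts.items()
            if base in "ACGT" and n >= min_redundancy
        }
        if len(supported) >= 2:
            results.append((col, supported))
    return results


# Toy alignment: column 2 has G (3 reads) and A (2 reads);
# column 5 has a single T, which is treated as a probable error.
reads = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAC", "ACGTAT"]
print(candidate_snps(reads))
```

Raising `min_redundancy` trades sensitivity for fewer false positives; with real EST data, read depth per contig limits how high the threshold can usefully go.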
Keywords
biochemical pathways
candidate genes
cDNA libraries
database
different environmental conditions
different metabolic pathways extensively
EST collection displays
EST collection
fungal diseases
main GO categories
major cash crops
quality improvement
significant homology
South America
T. cacao
T. cacao qualities
T. cacao quality improvement
terpene pathways
Theobroma cacao L.
two metabolic pathways