Automating sequence-based detection and genotyping of SNPs from diploid samples. Nat Genet

Department of Statistics, University of Washington, Seattle, Washington 98195, USA.
Nature Genetics (Impact Factor: 29.35). 04/2006; 38(3):375-81. DOI: 10.1038/ng1746
Source: PubMed


The detection of sequence variation, for which DNA sequencing has emerged as the most sensitive and automated approach, forms the basis of all genetic analysis. Here we describe and illustrate an algorithm that accurately detects and genotypes SNPs from fluorescence-based sequence data. Because the algorithm focuses particularly on detecting SNPs through the identification of heterozygous individuals, it is especially well suited to the detection of SNPs in diploid samples obtained after DNA amplification. It is substantially more accurate than existing approaches and, notably, provides a useful quantitative measure of its confidence in each potential SNP detected and in each genotype called. Calls assigned the highest confidence are sufficiently reliable to remove the need for manual review in several contexts. For example, for sequence data from 47-90 individuals sequenced on both the forward and reverse strands, the highest-confidence calls from our algorithm detected 93% of all SNPs and 100% of high-frequency SNPs, with no false positive SNPs identified and 99.9% genotyping accuracy. This algorithm is implemented in a software package, PolyPhred version 5.0, which is freely available for academic use.

  • Source
    • "Chromatograms were analyzed with Phred (version: 0.020425.c) (Ewing and Green, 1998; Ewing et al., 1998); Phrap and Cross match (version 0.990319); PolyPhred (version 6.18, April 29, 2009) (Nickerson et al., 1997; Stephens et al., 2006), and Consed (version 20.0) (Gordon et al., 1998). For comparison of the patient sequences with an NF1 reference sequence, a pseudochromatogram was generated with Sudophred (version 6.18; April 29, 2009), using positions 17: 31,094,927 to 31,382,116 of the human genome (Ensembl release 76, GRCh38) ( "
    ABSTRACT: Neurofibromatosis type I is an autosomal dominant disease with complete penetrance and variable age-dependent expressivity. It is caused by heterozygous mutations in neurofibromin 1 (NF1). These occur throughout the length of the gene, with no apparent hotspots. Even though some mutations have been found repeatedly, most have been observed only once. This, along with the variable expressivity, has made it difficult to establish genotype-phenotype correlations. Here, we report the clinical and molecular characteristics of four pediatric patients with neurofibromatosis type I. Patients were clinically examined and DNA was extracted from peripheral blood. The whole coding sequence of NF1, plus flanking intronic regions, was examined by Sanger sequencing, and four frameshift mutations were identified. The mutation c.3810_3820delCATGCAGACTC was observed in a familial case. This mutation occurred within a sequence comprising two 8-bp direct repeats (GCAGACTC) separated by a CAT trinucleotide, with the deletion leading to the loss of the trinucleotide and the 8-bp repeat following it. The deletion might have occurred due to misalignment of the direct repeats during cell division. In the mutation c.5194delG, the deleted G is nested between two separate mononucleotide tracts (AAAGTTT), which could have played a role in creating the deletion. The other two mutations reported here are c.4076_4077insG, and c.3193_3194insA. All four mutations create premature stop codons. In three mutations, the consequence is predicted to be loss of the GAP-related, Sec14 homology, and pleckstrin homology-like domains; while in the fourth, only the latter two domains would be lost.
    Genetics and molecular research: GMR 09/2015; 14(3):8326-8337. DOI:10.4238/2015.July.27.21 · 0.78 Impact Factor
  • Source
    • "New primers (Forward Genome Walker [FGW]; Reverse Genome Walker [RGW]) were designed from the newly added microsatellite flanking sequence (Table 2). These primers were then used to amplify DNA of 'Pound 7', 'P 30', 'PA 7', and nine other cacao genotypes, each representing the major genetic groups of cacao as described by Motamayor et al. (2008). The amplified products were sequenced with an ABI 3730 genetic analyzer (Applied Biosystems, Foster City, CA, USA) and aligned with Phred, Phrap, Polyphred, and Consed software for sequence comparison and SNP detection (Ewing and Green 1998; Ewing et al. 1998; Gordon et al. 1998; Stephens et al. 2006). Each SNP site detected was named after the mTcCIR microsatellite marker (locus) from which it was identified, followed by the distance in nucleotides of the SNP from the 5' end of the sequence. "
    ABSTRACT: The majority of the world's cacao for chocolate manufacture is produced in West Africa. Cocoa breeding programs in West Africa need genetic markers to reduce the time needed for improving cocoa by screening seedlings for the presence of the markers rather than mature plants for the phenotypic traits (i.e., marker-assisted selection [MAS]). For MAS to be successful, the breeder must have both access to markers linked to desired traits and a convenient marker-assay system that can be performed locally. In this study, microsatellite markers that flanked disease resistance quantitative trait loci (QTL) but could not be assayed conveniently in West Africa were converted using a genome walking method into single nucleotide polymorphism (SNP) markers that could be assayed locally. The SNP and microsatellite markers were equally effective in identifying off-types in two different mapping populations of cacao. Also, SNPs cast doubt on whether all microsatellite markers are identical by descent.
    Journal of Crop Improvement 03/2013; 27(2). DOI:10.1080/15427528.2012.752773
  • Source
    • "The PCR products were sent to the Nucleic Acid Sequencing Center of National Cheng Kung University for sequencing using the ABI 3100 DNA sequencer (Applied Biosystems). The sequence data were analyzed by using the PolyPhred software (v5.04) [19]. The genotypes were assigned to the participants independently by two individuals blinded to the participant information. "
    ABSTRACT: The aim of this case-control study was to investigate whether the vitamin D receptor (VDR) 1a promoter gene polymorphisms are associated with susceptibility to polycystic ovary syndrome (PCOS). Women with PCOS and a control group, all aged 18-45 years, were enrolled. Genotypes of two functional single nucleotide polymorphisms (SNPs), the 1521 bp (G/C) and 1012 bp (A/G), located on the 1a promoter of the VDR gene were determined by direct sequencing. Serum 25-hydroxyvitamin D levels were measured by ELISA. The two functional SNPs in the 1a promoter region of the VDR gene were in complete linkage disequilibrium. The genotype distributions of these two polymorphisms in the PCOS group were not significantly different from those of the control group. Further subgroup analyses according to body mass index also revealed no significant differences in the genotype distribution in the PCOS group. Significantly lower serum 25-hydroxyvitamin D levels were observed in carriers of the heterozygous 1521CG/1012GA haplotype in both groups. Metformin treatment increased serum 25-hydroxyvitamin D levels only in PCOS patients carrying the homozygous 1521G/1012A haplotype. These results suggest that the VDR 1a promoter polymorphisms may not be associated with the risk for PCOS, but are associated with serum 25-hydroxyvitamin D levels. Metformin treatment will be beneficial to PCOS patients without the VDR 1a promoter variant in the Taiwanese population.
    Taiwanese journal of obstetrics & gynecology 12/2012; 51(4):565-71. DOI:10.1016/j.tjog.2012.09.011 · 0.99 Impact Factor