Automating resequencing-based detection of insertion-deletion polymorphisms. Nat. Genet. 38, 1457-1462

Department of Bioengineering, University of Washington, Seattle, Washington 98195, USA.
Nature Genetics (Impact Factor: 29.35). 01/2007; 38(12):1457-62. DOI: 10.1038/ng1925
Source: PubMed


Structural and insertion-deletion (indel) variants have received considerable recent attention, partly because of their phenotypic consequences. Among these variants, the most common are small indels (approximately 1-30 bp). Identifying and genotyping indels using sequence traces obtained from diploid samples requires extensive manual review, which makes large-scale studies inconvenient. We report a new algorithm, implemented in available software (PolyPhred version 6.0), to help automate detection and genotyping of indels from sequence traces. The algorithm identifies heterozygous individuals, which permits the discovery of low-frequency indels. It finds 80% of all indel polymorphisms with almost no false positives and finds 97% with a false discovery rate of 10%. Additionally, genotyping accuracy exceeds 99%, and it correctly infers indel length in 96% of the cases. Using this approach, we identify indels in the HapMap ENCODE regions, providing the first report of these polymorphisms in this data set.
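
The detection figures quoted above pair a sensitivity (the fraction of real indels found) with a false discovery rate (the fraction of reported indels that are wrong). As a minimal sketch of how those two quantities relate, assuming the standard definitions and using hypothetical counts chosen only for illustration (they are not taken from the paper, and the function names are not part of PolyPhred):

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real indels that the caller reports."""
    return true_positives / (true_positives + false_negatives)


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of reported indels that are not real."""
    reported = true_positives + false_positives
    return false_positives / reported if reported else 0.0


# Hypothetical counts for illustration only: 100 real indels, of which
# 97 are reported, alongside 11 spurious calls.
tp, fp, fn = 97, 11, 3
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"FDR         = {false_discovery_rate(tp, fp):.2f}")
```

Note that the two rates use different denominators: sensitivity is measured against the real indels, the false discovery rate against the reported calls.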

  • Source
    • "Construction of a high-density linkage map using a large pedigree population and high-quality molecular markers is urgently required for MAS breeding in trees. Gene-derived markers, including simple sequence repeat (SSR) and small insertion and deletion (InDels of 1–30 bp) markers, in the coding or regulatory regions of genes, can alter gene function, transcription, or translation (Bhangale et al., 2006; Du et al., 2013b). Indeed, compared with nongenic markers, gene-derived markers are more reliable for construction of a high-resolution linkage map and can uncover a detailed picture of the QTL responsible for complex traits (Fukuoka et al., 2012). "
    ABSTRACT: Deciphering the genetic architecture underlying polygenic traits in perennial species can inform molecular marker-assisted breeding. Recent advances in high-throughput sequencing have enabled strategies that integrate linkage–linkage disequilibrium (LD) mapping in Populus. We used an integrated method of quantitative trait locus (QTL) dissection with a high-resolution linkage map and multi-gene association mapping to decipher the nature of genetic architecture (additive, dominant, and epistatic effects) of potential QTLs for growth traits in a Populus linkage population (1200 progeny) and a natural population (435 individuals). Seventeen QTLs for tree height, diameter at breast height, and stem volume mapped to 11 linkage groups (logarithm of odds (LOD) ≥ 2.5), and explained 2.7–18.5% of the phenotypic variance. After comparative mapping and transcriptome analysis, 187 expressed genes (10 046 common single nucleotide polymorphisms (SNPs)) were selected from the segmental homology regions (SHRs) of 13 QTLs. Using multi-gene association models, we observed 202 significant SNPs in 63 promising genes from 10 QTLs (P ≤ 0.0001; FDR ≤ 0.10) that exhibited reproducible associations with additive/dominant effects, and further determined 11 top-ranked genes tightly linked to the QTLs. Epistasis analysis uncovered a uniquely interconnected gene–gene network for each trait. This study opens up opportunities to uncover the causal networks of interacting genes in plants using an integrated linkage–LD mapping approach.
    New Phytologist 10/2015; DOI:10.1111/nph.13695 · 7.67 Impact Factor
  • Source
    • "Consed (Gordon et al. 1998) was used for visualizing the reads. PolyPhred (Bhangale et al. 2006) was used to detect single-nucleotide polymorphisms (SNPs). When a sample produced a heterozygous amplicon, the 2 haplotypes were phased experimentally (Chen et al. 2010) using cloning with Qiagen pDrive Cloning Vector. "
    ABSTRACT: Ipomoea purpurea (common morning glory) is an annual vine native to Mexico that is well known for its large, showy flowers. Humans have spread morning glories worldwide, owing to the horticultural appeal of morning glory flowers. Ipomoea purpurea is an opportunistic colonizer of disturbed habitats including roadside and agricultural settings, and it is now regarded as a noxious weed in the Southeastern US. Naturalized populations in the Southeastern United States are highly polymorphic for a number of flower color morphs, unlike native Mexican populations that are typically monomorphic for the purple color morph. Although I. purpurea was introduced into the United States from Mexico, little is known about the specific geographic origins of US populations relative to the Mexican source. We use resequencing data from 11 loci and 30 I. purpurea accessions collected from the native range of the species in Central and Southern Mexico and 8 accessions from the Southeastern United States to infer likely geographic origins in Mexico. Based on genetic assignment analysis, haplotype composition, and the degree of shared polymorphism, I. purpurea samples from the Southeastern United States are genetically most similar to samples from the Valley of Mexico and Veracruz State. This supports earlier speculation that I. purpurea in the Southeastern United States was likely to have been introduced by European colonists from sources in Central Mexico.
    The Journal of heredity 07/2013; 104(5). DOI:10.1093/jhered/est046 · 2.09 Impact Factor
  • Source
    • "Indels are frequently binned into the categories of "small" and "large" based on sequence length. Small indels span ~1–30 bp, whereas large indels can add or remove thousands of base pairs (Bhangale et al. 2006). The molecular mechanisms underlying large indels are fairly well understood; these include transposable element proliferation, transposable element-mediated ectopic recombination, slipped-strand mispairing, and nonhomologous end joining (Petrov et al. 2003; Bennetzen et al. 2005; Ju et al. 2011). "
    ABSTRACT: Evolutionary changes in genome size result from the combined effects of mutation, natural selection, and genetic drift. Insertion and deletion mutations (indels) directly impact genome size by adding or removing sequences. Most species lose more DNA through small indels (i.e. ∼1-30 bp) than they gain, which can result in genome reduction over time. Because this rate of DNA loss varies across species, small indel dynamics have been suggested to contribute to genome size evolution. Species with extremely large genomes provide interesting test cases for exploring the link between small indels and genome size; however, most large genomes remain relatively unexplored. Here, we examine rates of DNA loss in the tetrapods with the largest genomes - the salamanders. We used low-coverage genomic shotgun sequence data from four salamander species to examine patterns of insertion, deletion, and substitution in neutrally evolving non-LTR retrotransposon sequences. For comparison, we estimated genome-wide DNA loss rates in non-LTR retrotransposon sequences from five other vertebrate genomes: Anolis carolinensis, Danio rerio, Gallus gallus, Homo sapiens, and Xenopus tropicalis. Our results show that salamanders have significantly lower rates of DNA loss than do other vertebrates. More specifically, salamanders experience lower numbers of deletions relative to insertions, and both deletions and insertions are skewed towards smaller sizes. Based on these patterns, we conclude that slow DNA loss contributes to genomic gigantism in salamanders. We also identify candidate molecular mechanisms underlying these differences and suggest that natural variation in indel dynamics provides a unique opportunity to study the basis of genome stability.
    Genome Biology and Evolution 11/2012; 4(12). DOI:10.1093/gbe/evs103 · 4.23 Impact Factor