CHILD: A new tool for detecting low-abundance insertions and deletions in standard sequence traces

National Institute for Biotechnology in the Negev, Beer Sheva 84105, Israel.
Nucleic Acids Research (Impact Factor: 9.11). 04/2011; 39(7):e47. DOI: 10.1093/nar/gkq1354
Source: PubMed


Several methods have been proposed for detecting insertions/deletions (indels) from chromatograms generated by Sanger sequencing.
However, most such methods are unsuitable when the mutated and normal variants occur at unequal ratios, such as is expected
to be the case in cancer, with organellar DNA or with alternatively spliced RNAs. In addition, the current methods do not
provide robust estimates of the statistical confidence of their results, and the sensitivity of this approach has not been
rigorously evaluated. Here, we present CHILD, a tool specifically designed for indel detection in mixtures where one variant
is rare. CHILD makes use of standard sequence alignment statistics to evaluate the significance of the results. The sensitivity
of CHILD was tested by sequencing controlled mixtures of deleted and undeleted plasmids at various ratios. Our results indicate
that CHILD can identify deleted molecules present as just 5% of the mixture. Notably, the results were plasmid/primer-specific;
for some primers and/or plasmids, the deleted molecule was only detected when it comprised 10% or more of the mixture. The
false positive rate was estimated to be lower than 0.4%. CHILD was implemented as a user-oriented web site, providing a sensitive
and experimentally validated method for the detection of rare indel-carrying molecules in common Sanger sequence reads.



Available from: Eitan Rubin
  • Source
    • "Sanger sequencing can become expensive when many clones need to be screened. Moreover standard Sanger sequencing data can be difficult to interpret because indel mutations cause chromatogram phase shifts, so that it can be difficult to distinguish homozygous and heterozygous mutants [16]. We showed that dual gRNAs do not reduce (and may marginally enhance) targeting efficiency and simplify genotyping by causing more extensive deletions that are easily detected by PCR genotyping. "
    ABSTRACT: Designer nucleases such as TALENs and Cas9 have opened new opportunities to scarlessly edit the mammalian genome. Here we explored several parameters that influence Cas9-mediated scarless genome editing efficiency in murine embryonic stem cells. Optimization of transfection conditions and enriching for transfected cells are critical for efficiently recovering modified clones. Paired gRNAs and wild-type Cas9 efficiently create programmed deletions, which facilitate identification of targeted clones, while paired gRNAs and the Cas9D10A nickase generated smaller targeted indels with lower chance of off-target mutagenesis. Genome editing is also useful for programmed introduction of exogenous DNA sequences at a target locus. Increasing the length of the homology arms of the homology-directed repair template strongly enhanced targeting efficiency, while increasing the length of the DNA insert reduced it. Together our data provide guidance on optimal design of scarless gene knockout, modification, or knock-in experiments using Cas9 nuclease.
    Full-text · Article · Aug 2014 · PLoS ONE
  • Source
    • "In the Indelligent algorithm, dynamic programming is used to convert the IUPAC code into two nucleotides (i.e., M is converted into A/C, W into A/T, Y into C/T, K into G/T, and S into G/C) [9] [10]. For ambiguous bases that cannot be decomposed with Indelligent, major and minor sequences are assigned according to the intensities of the corresponding fluorescence signals (Figure 1 "
    ABSTRACT: The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.
    Full-text · Article · Jun 2012 · The Scientific World Journal
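
The IUPAC decomposition quoted in the Indelligent excerpt above (M into A/C, W into A/T, Y into C/T, K into G/T, S into G/C) can be illustrated with a small lookup. This is only a sketch of that lookup step, not the Indelligent or CHILD algorithm: it does not perform the dynamic programming that assigns bases to two sequences. The function name `expand_mixed_read` is hypothetical, and the `R` (A/G) entry is not in the excerpt; it is added from the standard IUPAC nomenclature.

```python
# Two-base IUPAC ambiguity codes. M, W, Y, K and S follow the quoted excerpt;
# R (A/G) is an addition from the standard IUPAC nomenclature.
IUPAC_PAIRS = {
    "M": ("A", "C"),
    "W": ("A", "T"),
    "Y": ("C", "T"),
    "K": ("G", "T"),
    "S": ("G", "C"),
    "R": ("A", "G"),
}


def expand_mixed_read(read):
    """Return, for each position, the tuple of bases the call could represent.

    Unambiguous calls (A, C, G, T) give a one-element tuple; two-base IUPAC
    codes give a two-element tuple. Any other symbol raises ValueError.
    """
    expanded = []
    for position, symbol in enumerate(read.upper()):
        if symbol in IUPAC_PAIRS:
            expanded.append(IUPAC_PAIRS[symbol])
        elif symbol in "ACGT":
            expanded.append((symbol,))
        else:
            raise ValueError(f"unsupported symbol {symbol!r} at position {position}")
    return expanded


if __name__ == "__main__":
    # A made-up mixed base call string, for illustration only.
    for position, bases in enumerate(expand_mixed_read("ACMTWG")):
        print(position, "/".join(bases))
```

Positions that expand to two bases are the ones a phase-shift analysis would then have to assign to the major or minor sequence; that assignment is outside the scope of this sketch.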