Resolving the breakpoints of the 17q21.31 microdeletion syndrome with next-generation sequencing.

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
The American Journal of Human Genetics (Impact Factor: 11.2). 04/2012; 90(4):599-613. DOI: 10.1016/j.ajhg.2012.02.013
Source: PubMed

ABSTRACT Recurrent deletions have been associated with numerous diseases and genomic disorders. Few, however, have been resolved at the molecular level because their breakpoints often occur in highly copy-number-polymorphic duplicated sequences. We present an approach that uses a combination of somatic cell hybrids, array comparative genomic hybridization, and the specificity of next-generation sequencing to determine breakpoints that occur within segmental duplications. Applying our technique to the 17q21.31 microdeletion syndrome, we used genome sequencing to determine copy-number-variant breakpoints in three deletion-bearing individuals with molecular resolution. For two cases, we observed breakpoints consistent with nonallelic homologous recombination involving only H2 chromosomal haplotypes, as expected. Molecular resolution revealed that the breakpoints occurred at different locations within a 145 kbp segment of >99% identity and disrupt KANSL1 (previously known as KIAA1267). In the remaining case, we found that unequal crossover occurred interchromosomally between the H1 and H2 haplotypes and that this event was mediated by a homologous sequence that was once again missing from the human reference. Interestingly, the breakpoints mapped preferentially to gaps in the current reference genome assembly, which we resolved in this study. Our method provides a strategy for the identification of breakpoints within complex regions of the genome harboring high-identity and copy-number-polymorphic segmental duplication. The approach should become particularly useful as high-quality alternate reference sequences become available and genome sequencing of individuals' DNA becomes more routine.
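
As a rough illustration of how sequence specificity can narrow a breakpoint inside a high-identity duplication, the sketch below assumes a simplified model that is not taken from the paper: the deletion chromosome carries one fused copy of the duplicated segment, and at each paralog-distinguishing site we have read counts supporting the proximal or the distal paralog. The function name `localize_breakpoint`, the input layout, and the example counts are all hypothetical.

```python
from typing import List, Tuple

# Each site: (position, reads matching proximal paralog, reads matching distal paralog).
Site = Tuple[int, int, int]

def localize_breakpoint(sites: List[Site]) -> Tuple[int, int]:
    """Return the pair of adjacent informative sites that bound the switch point.

    Assumed model (illustrative only): upstream of the breakpoint the fused
    copy looks like the proximal paralog, downstream it looks like the distal
    paralog. We pick the split that maximizes the number of reads agreeing
    with that pattern.
    """
    if len(sites) < 2:
        raise ValueError("need at least two informative sites")
    sites = sorted(sites)  # order by position

    total_distal = sum(d for _, _, d in sites)
    prox_left = 0    # proximal-supporting reads left of the split
    dist_left = 0    # distal-supporting reads left of the split
    best_k, best_score = 1, -1

    # A split at k places the switch between sites[k-1] and sites[k].
    for k in range(1, len(sites)):
        _, p, d = sites[k - 1]
        prox_left += p
        dist_left += d
        score = prox_left + (total_distal - dist_left)
        if score > best_score:
            best_k, best_score = k, score

    return sites[best_k - 1][0], sites[best_k][0]

# Made-up counts: the signal switches from proximal to distal after position 900.
example = [(100, 12, 0), (450, 9, 1), (900, 10, 0), (1600, 1, 11), (2300, 0, 8)]
print(localize_breakpoint(example))
```

The resolution of such an estimate is limited by the spacing of informative sites, which is one reason a segment of >99% identity is hard to resolve.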

  • ABSTRACT: The most common recurrent copy-number variants associated with autism, developmental delay and epilepsy are flanked by segmental duplications. Complete genetic characterization of these events is challenging because their breakpoints often occur within high-identity, copy-number polymorphic paralogous sequences that cannot be specifically assayed using hybridization-based methods. Here we provide a protocol for breakpoint resolution with sequence-level precision. Massively parallel sequencing is performed on libraries generated from haplotype-resolved chromosomes, genomic DNA or molecular inversion probe (MIP)-captured breakpoint-informative regions harboring paralog-distinguishing variants. Quantification of sequencing depth over informative sites enables breakpoint localization, typically within several kilobases to tens of kilobases. Depending on the approach used, the sequencing platform, and the accuracy and completeness of the reference genome sequence, this protocol takes from a few days to several months to complete. Once established for a specific genomic disorder, it is possible to process thousands of DNA samples within as little as 3-4 weeks.
    Nature Protocols 06/2014; 9(6):1496-513. DOI:10.1038/nprot.2014.096 · 7.78 Impact Factor
  • ABSTRACT: Over 900 genes have been annotated within duplicated regions of the human genome, yet their functions and potential roles in disease remain largely unknown. One major obstacle has been the inability to accurately and comprehensively assay genetic variation for these genes in a high-throughput manner. We developed a sequencing-based method for rapid and high-throughput genotyping of duplicated genes using molecular inversion probes designed to target unique paralogous sequence variants. We applied this method to genotype all members of two gene families, SRGAP2 and RH, among a diversity panel of 1,056 humans. The approach could accurately distinguish copy number in paralogs having up to ∼99.6% sequence identity, identify small gene-disruptive deletions, detect single-nucleotide variants, define breakpoints of unequal crossover and discover regions of interlocus gene conversion. The ability to rapidly and accurately genotype multiple gene families in thousands of individuals at low cost enables the development of genome-wide gene conversion maps and 'unlocks' many previously inaccessible duplicated genes for association with human traits.
    Nature Methods 07/2013; DOI:10.1038/nmeth.2572 · 23.57 Impact Factor
  • ABSTRACT: We delineated and analyzed directly oriented paralogous low-copy repeats (DP-LCRs) in the most recent version of the human haploid reference genome. The computationally defined DP-LCRs were cross-referenced with our Chromosomal Microarray Analysis (CMA) database of 25,144 patients subjected to genome-wide assays. This computationally guided approach to the empirically derived large dataset allowed us to investigate genomic rearrangement relative frequencies and identify new loci for recurrent nonallelic homologous recombination (NAHR)-mediated copy-number variants (CNVs). The most commonly observed recurrent CNVs were NPHP1 duplications (233), CHRNA7 duplications (175), and 22q11.21 deletions (DiGeorge/Velocardiofacial syndrome, 166). In the ~25% of CMA cases for which parental studies were available, we identified 190 de novo recurrent CNVs. In this group, the most frequently observed events were deletions of 22q11.21 (48), 16p11.2 (autism, 34), and 7q11.23 (Williams-Beuren syndrome, 11). Several features of DP-LCRs, including length, distance between NAHR substrate elements, DNA sequence identity (fraction matching), GC content, and concentration of the homologous recombination (HR) hot spot motif 5'-CCNCCNTNNCCNC-3' correlate with the frequencies of the recurrent CNV events. Four novel adjacent DP-LCR-flanked and NAHR-prone regions, involving 2q12.2q13 were elucidated in association with novel genomic disorders. Our study quantitates genome architectural features responsible for NAHR-mediated genomic instability and further elucidates the role of NAHR in human disease.
    Genome Research 05/2013; DOI:10.1101/gr.152454.112 · 13.85 Impact Factor
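
The hot spot motif 5'-CCNCCNTNNCCNC-3' quoted in the last abstract above can be located in a sequence with a short scan. The sketch below is an illustration only, not the cited study's pipeline: it assumes N stands for any of A, C, G or T, reports overlapping hits, and checks both strands. The helper names and the test sequence are hypothetical.

```python
import re
from typing import List, Tuple

MOTIF = "CCNCCNTNNCCNC"  # motif as quoted in the abstract; N treated as any base (assumption)
_IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_motif(seq: str, motif: str = MOTIF) -> Tuple[List[int], List[int]]:
    """Return 0-based start positions of motif hits on the forward and reverse strands.

    Reverse-strand positions are given in forward-strand coordinates
    (leftmost base of the hit). A lookahead is used so overlapping hits count.
    """
    seq = seq.upper()
    pattern = re.compile("(?=(" + "".join(_IUPAC[b] for b in motif) + "))")
    forward = [m.start() for m in pattern.finditer(seq)]
    rc = revcomp(seq)
    reverse = sorted(len(seq) - m.start() - len(motif) for m in pattern.finditer(rc))
    return forward, reverse

# Made-up 19-base sequence with one forward-strand hit starting at index 3.
demo = "AAACCTCCGTAACCACGGG"
print(find_motif(demo))
```

Dividing the hit count by the sequence length would give a simple motif concentration per base; how the cited study defined that concentration is not stated in the abstract.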
