Article

CREST maps somatic structural variation in cancer genomes with base-pair resolution

Department of Information Sciences, St. Jude Children's Research Hospital, Memphis, Tennessee, USA.
Nature Methods (Impact Factor: 25.95). 06/2011; 8(8):652-4. DOI: 10.1038/nmeth.1628
Source: PubMed

ABSTRACT We developed 'clipping reveals structure' (CREST), an algorithm that uses next-generation sequencing reads with partial alignments to a reference genome to directly map structural variations at nucleotide-level resolution. Application of CREST to whole-genome sequencing data from five pediatric T-lineage acute lymphoblastic leukemias (T-ALLs) and a human melanoma cell line, COLO-829, identified 160 somatic structural variations. The experimental validation rate exceeded 80%, demonstrating that CREST has high predictive accuracy.
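The core idea in the abstract, that reads aligning only partially to the reference mark candidate breakpoints, can be illustrated with a minimal sketch. This is not the CREST implementation: the abstract does not describe the pipeline, so the function names (`soft_clip_sites`, `candidate_breakpoints`), the `min_support` threshold, and the toy reads are all assumptions made for illustration. The sketch assumes SAM-style input, where a soft clip (`S` in the CIGAR string) records the unaligned end of a read, and it simply counts how many reads are clipped at the same reference coordinate.

```python
import re
from collections import Counter

# SAM CIGAR operations; M, D, N, = and X consume reference bases.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def soft_clip_sites(pos, cigar):
    """Return (reference_coordinate, side) for each soft-clipped end of a read.

    pos is the 1-based leftmost aligned position, as in the SAM POS field.
    A left clip is reported at the first aligned base, a right clip at the
    last aligned base.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    core = [o for o in ops if o[1] != "H"]  # hard clips carry no sequence
    sites = []
    if core and core[0][1] == "S":
        sites.append((pos, "left"))
    if len(core) > 1 and core[-1][1] == "S":
        ref_len = sum(n for n, op in core if op in REF_CONSUMING)
        sites.append((pos + ref_len - 1, "right"))
    return sites


def candidate_breakpoints(reads, min_support=2):
    """Count soft-clip sites over (pos, cigar) pairs and keep supported ones.

    min_support is a hypothetical threshold, not a value from the paper.
    """
    counts = Counter()
    for pos, cigar in reads:
        counts.update(soft_clip_sites(pos, cigar))
    return sorted((site, n) for site, n in counts.items() if n >= min_support)


if __name__ == "__main__":
    # Toy reads (invented): four are clipped at reference position 100.
    reads = [
        (100, "30S70M"),
        (100, "25S75M"),
        (31, "70M30S"),
        (41, "60M40S"),
        (50, "100M"),
    ]
    for (coord, side), n in candidate_breakpoints(reads):
        print(coord, side, n)
```

Because a clip is reported at the exact base where the alignment stops, agreement among several reads pins a candidate breakpoint to a single coordinate, which is what makes nucleotide-level mapping possible in principle. A real caller would need further steps (for example, confirming where the clipped sequence maps and comparing tumor against matched normal to call a variant somatic) that this sketch does not attempt.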

  • ABSTRACT: Improved molecular diagnostic methods for detecting drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore, local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 mutation (92%), followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant, but only 43% had pncA SNPs. Ethambutol-resistant strains predominantly had embB codon 306 mutations (62%), but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA codons 91-94 in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin-resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), the rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary by drug: rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycoside (78%) and ethambutol (62%), while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded targets for drug resistance detection in MTB isolates.
    PLoS ONE 02/2015; 10(2-2):e0117771. DOI:10.1371/journal.pone.0117771 · 3.53 Impact Factor
  • ABSTRACT: Human cancers are frequently polyploid, containing multiple aneuploid subpopulations that differ in total DNA content. In this study we exploit this property to reconstruct evolutionary histories, by assuming that mutational complexity increases with time. We developed an experimental method called Ploidy-Seq that uses flow-sorting to isolate and enrich subpopulations with different ploidy prior to next-generation genome sequencing. We applied Ploidy-Seq to a patient with a triple-negative (ER-/PR-/HER2-) ductal carcinoma and performed whole-genome sequencing to trace the evolution of point mutations, indels, copy number aberrations, and structural variants in three clonal subpopulations during tumor growth. Our data show that few mutations (8% to 22%) were shared between all three subpopulations, and that the most aggressive clones comprised a minority of the tumor mass. We expect that Ploidy-Seq will have broad applications for delineating clonal diversity and investigating genome evolution in many human cancers.
    Genome Medicine 01/2015; 7(1):6. DOI:10.1186/s13073-015-0127-5 · 4.94 Impact Factor
  • ABSTRACT: The identification of DNA copy numbers from short-read sequencing data remains a challenge for both technical and algorithmic reasons. The raw data for these analyses are measured in tens to hundreds of gigabytes per genome; transmitting, storing, and analyzing such large files is cumbersome, particularly for methods that analyze several samples simultaneously. We developed a very efficient representation of depth of coverage (150-1000× compression) that enables such analyses. Current methods for analyzing variants in whole-genome sequencing (WGS) data frequently miss copy number variants (CNVs), particularly hemizygous deletions in the 1-100 kb range. To fill this gap, we developed a method to identify CNVs in individual genomes, based on comparison to joint profiles pre-computed from a large set of genomes. We analyzed depth of coverage in over 6000 high quality (>40×) genomes. The depth of coverage has strong sequence-specific fluctuations only partially explained by global parameters like %GC. To account for these fluctuations, we constructed multi-genome profiles representing the observed or inferred diploid depth of coverage at each position along the genome. These Reference Coverage Profiles (RCPs) take into account the diverse technologies and pipeline versions used. Normalization of the scaled coverage to the RCP followed by hidden Markov model (HMM) segmentation enables efficient detection of CNVs and large deletions in individual genomes. Use of pre-computed multi-genome coverage profiles improves our ability to analyze each individual genome. We make available RCPs and tools for performing these analyses on personal genomes. We expect the increased sensitivity and specificity for individual genome analysis to be critical for achieving clinical-grade genome interpretation.
    Frontiers in Genetics 02/2015; 6(45). DOI:10.3389/fgene.2015.00045
