Long-range massively parallel mate pair sequencing detects distinct mutations and similar patterns of structural mutability in two breast cancer cell lines.

Graduate Program in Structural and Computational Biology and Molecular Biophysics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Cancer Genetics (Impact Factor: 2.42). 08/2011; 204(8):447-57. DOI: 10.1016/j.cancergen.2011.07.009
Source: PubMed

ABSTRACT Cancer genomes frequently undergo genomic instability resulting in accumulation of chromosomal rearrangement. To date, one of the main challenges has been to confidently and accurately identify these rearrangements by using short-read massively parallel sequencing. We were able to improve cancer rearrangement detection by combining two distinct massively parallel sequencing strategies: fosmid-sized (36 kb on average) and standard 5 kb mate pair libraries. We applied this combined strategy to map rearrangements in two breast cancer cell lines, MCF7 and HCC1954. We detected and validated a total of 91 somatic rearrangements in MCF7 and 25 in HCC1954, including genomic alterations corresponding to previously reported transcript aberrations in these two cell lines. Each of the genomes contains two types of breakpoints: clustered and dispersed. In both cell lines, the dispersed breakpoints show enrichment for low copy repeats, while the clustered breakpoints associate with high copy number amplifications. Comparing the two genomes, we observed highly similar structural mutational spectra affecting different sets of genes, pointing to similar histories of genomic instability against the background of very different gene network perturbations.

Available from: Cristian Coarfa, Jul 08, 2015
  • Source
    • "In this respect, Fosills are similar to Fosmid 'diTags' (Hampton et al. 2011). The principal advantage of our method is that it allows much longer sequencing reads (up to 2 × 101 bases in the current study, but even longer reads are possible), whereas the EcoP15I digest strictly limits the diTags to 2 × 26 bases."
    ABSTRACT: Eliminating the bacterial cloning step has been a major factor in the vastly improved efficiency of massively parallel sequencing approaches. However, this also has made it a technical challenge to produce the modern equivalent of the Fosmid- or BAC-end sequences that were crucial for assembling and analyzing complex genomes during the Sanger-based sequencing era. To close this technology gap, we developed Fosill, a method for converting Fosmids to Illumina-compatible jumping libraries. We constructed Fosmid libraries in vectors with Illumina primer sequences and specific nicking sites flanking the cloning site. Our family of pFosill vectors allows multiplex Fosmid cloning of end-tagged genomic fragments without physical size selection and is compatible with standard and multiplex paired-end Illumina sequencing. To excise the bulk of each cloned insert, we introduced two nicks in the vector, translated them into the inserts, and cleaved them. Recircularization of the vector via coligation of insert termini followed by inverse PCR generates a jumping library for paired-end sequencing with 101-base reads. The yield of unique Fosmid-sized jumps is sufficiently high, and the background of short, incorrectly spaced and chimeric artifacts sufficiently low, to enable applications such as mapping of structural variation and scaffolding of de novo assemblies. We demonstrate the power of Fosill to map genome rearrangements in a cancer cell line and identified three fusion genes that were corroborated by RNA-seq data. Our Fosill-powered assembly of the mouse genome has an N50 scaffold length of 17.0 Mb, rivaling the connectivity (16.9 Mb) of the Sanger-sequencing based draft assembly.
    Genome Research 07/2012; DOI:10.1101/gr.138925.112 · 13.85 Impact Factor
  • Source
    ABSTRACT: For many years, it was assumed that gene fusions were a type of mutation confined largely to leukemias and sarcomas. However, fusion genes are now known to be important in several epithelial cancers and a number have been described in breast cancers. In the December 2011 issue of Nature Medicine, Robinson and colleagues reported many more gene fusions, including the first recurrent fusion, SEC16A-NOTCH1, in breast cancers. Several genes, including members of the MAST (microtubule-associated serine threonine) kinase and Notch gene families, are fused more than once. This finding supports an emerging story that most breast cancers express a number of fusion genes.
    Breast cancer research: BCR 03/2012; 14(2):303. DOI:10.1186/bcr3122 · 5.88 Impact Factor
  • Source
    ABSTRACT: Background: It has recently emerged that common epithelial cancers such as breast cancers have fusion genes like those in leukaemias. In a representative breast cancer cell line, ZR-75-30, we searched for fusion genes, by analysing genome rearrangements. Results: We first analysed rearrangements of the ZR-75-30 genome, to around 10 kb resolution, by molecular cytogenetic approaches, combining array painting and array CGH. We then compared this map with genomic junctions determined by paired-end sequencing. Most of the breakpoints found by array painting and array CGH were identified in the paired-end sequencing: 55% of the unamplified breakpoints and 97% of the amplified breakpoints (as these are represented by more sequence reads). From this analysis we identified 9 expressed fusion genes: APPBP2-PHF20L1, BCAS3-HOXB9, COL14A1-SKAP1, TAOK1-PCGF2, TIAM1-NRIP1, TIMM23-ARHGAP32, TRPS1-LASP1, USP32-CCDC49 and ZMYM4-OPRD1. We also determined the genomic junctions of a further three expressed fusion genes that had been described by others, BCAS3-ERBB2, DDX5-DEPDC6/DEPTOR and PLEC1-ENPP2. Of this total of 12 expressed fusion genes, 9 were in the coamplification. Due to the sensitivity of the technologies used, we estimate these 12 fusion genes to be around two-thirds of the true total. Many of the fusions seem likely to be driver mutations. For example, PHF20L1, BCAS3, TAOK1, PCGF2, and TRPS1 are fused in other breast cancers. HOXB9 and PHF20L1 are members of gene families that are fused in other neoplasms. Several of the other genes are relevant to cancer: in addition to ERBB2, SKAP1 is an adaptor for Src, DEPTOR regulates the mTOR pathway and NRIP1 is an estrogen-receptor coregulator. Conclusions: This is the first structural analysis of a breast cancer genome that combines classical molecular cytogenetic approaches with sequencing. Paired-end sequencing was able to detect almost all breakpoints, where there was adequate read depth. It supports the view that gene breakage and gene fusion are important classes of mutation in breast cancer, with a typical breast cancer expressing many fusion genes.
    BMC Genomics 12/2012; 13(1):719. DOI:10.1186/1471-2164-13-719 · 4.04 Impact Factor