High-throughput semiquantitative analysis of insertional mutations in heterogeneous tumors.

Division of Molecular Biology and Cancer Systems Biology Center, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands.
Genome Research (Impact Factor: 13.85). 08/2011; 21(12):2181-9. DOI: 10.1101/gr.112763.110
Source: PubMed

ABSTRACT Retroviral and transposon-based insertional mutagenesis (IM) screens are widely used for cancer gene discovery in mice. Exploiting the full potential of IM screens requires methods for high-throughput sequencing and mapping of transposon and retroviral insertion sites. Current protocols are based on ligation-mediated PCR amplification of junction fragments from restriction endonuclease-digested genomic DNA, resulting in amplification biases due to uneven genomic distribution of restriction enzyme recognition sites. Consequently, sequence coverage cannot be used to assess the clonality of individual insertions. We have developed a novel method, called shear-splink, for the semiquantitative high-throughput analysis of insertional mutations. Shear-splink employs random fragmentation of genomic DNA, which reduces unwanted amplification biases. Additionally, shear-splink enables us to assess clonality of individual insertions by determining the number of unique ligation points (LPs) between the adapter and genomic DNA. This parameter serves as a semiquantitative measure of the relative clonality of individual insertions within heterogeneous tumors. Mixing experiments with clonal cell lines derived from mouse mammary tumor virus (MMTV)-induced tumors showed that shear-splink enables the semiquantitative assessment of the clonality of MMTV insertions. Further, shear-splink analysis of 16 MMTV- and 127 Sleeping Beauty (SB)-induced tumors showed enrichment for cancer-relevant insertions by exclusion of irrelevant background insertions marked by single LPs, thereby facilitating the discovery of candidate cancer genes. To fully exploit the use of the shear-splink method, we set up the Insertional Mutagenesis Database (iMDB), offering a publicly available web-based application to analyze both retroviral- and transposon-based insertional mutagenesis data.
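
The ligation-point counting described in the abstract can be illustrated with a minimal sketch. It assumes each mapped read has already been reduced to a hypothetical `(insertion_site, ligation_point)` pair, where `insertion_site` identifies the junction between the inserted element and the genome, and `ligation_point` is the genomic coordinate at which the adapter was ligated after random shearing. The function names, the `min_lps` threshold, and the sample data are illustrative assumptions and are not taken from the paper or from iMDB.

```python
from collections import defaultdict


def count_unique_lps(reads):
    """Return the number of distinct ligation points (LPs) per insertion site.

    `reads` is an iterable of (insertion_site, ligation_point) pairs.
    Reads sharing both values are treated as PCR duplicates of one fragment.
    """
    lps = defaultdict(set)
    for site, lp in reads:
        lps[site].add(lp)
    return {site: len(points) for site, points in lps.items()}


def filter_single_lp(lp_counts, min_lps=2):
    """Keep insertions supported by at least `min_lps` unique LPs.

    The default of 2 mirrors the abstract's exclusion of insertions
    marked by a single LP; the exact cutoff is an assumption here.
    """
    return {site: n for site, n in lp_counts.items() if n >= min_lps}


# Hypothetical reads: (insertion site, adapter ligation coordinate)
reads = [
    ("chr7:1000", 1210), ("chr7:1000", 1210),  # duplicates of one fragment
    ("chr7:1000", 1345), ("chr7:1000", 1502),
    ("chr2:5000", 5120), ("chr2:5000", 5120), ("chr2:5000", 5120),
]

lp_counts = count_unique_lps(reads)
candidates = filter_single_lp(lp_counts)
```

In this toy data the `chr2:5000` insertion has three reads but only one unique LP, so it is dropped, whereas read depth alone would have ranked it close to `chr7:1000`.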

  •
    ABSTRACT: In gene therapy trials targeting blood disorders, it is important to detect dominance of transduced hematopoietic stem cell (HSC) clones arising from vector insertion site (VIS) effects. Current methods for VIS analysis often do not have defined levels of quantitative accuracy and therefore can fail to detect early clonal dominance. We have developed a rapid and inexpensive method for measuring clone size based on random shearing of genomic DNA, minimal exponential PCR amplification, and shear site counts as a quantitative endpoint. This "quantitative shearing linear amplification PCR" (qsLAM PCR) assay utilizes an internal control sample containing 19 lentiviral insertion sites per cell that is mixed with polyclonal samples derived from transduced human CD34+ cells. Samples were analyzed from transplanted pigtail macaques and from a participant in our X-linked severe combined immunodeficiency (XSCID) lentiviral vector trial and yielded controlled and quantitative results in all cases. One case of early clonal dominance was detected in a monkey transplanted with limiting numbers of transduced HSCs while the clinical samples from the XSCID trial participant showed highly diverse clonal representation. These studies demonstrate that qsLAM PCR is a facile and quantitative assay for measuring clonal repertoires in subjects enrolled in human gene therapy trials using lentiviral transduced HSCs.
    Human Gene Therapy Methods 12/2014; DOI:10.1089/hgtb.2014.122 · 1.64 Impact Factor
  •
    ABSTRACT: BACKGROUND: Animal models of cancer are useful to generate complementary datasets for comparison to human tumor data. Insertional mutagenesis screens, such as those utilizing the Sleeping Beauty (SB) transposon system, provide a model that recapitulates the spontaneous development and progression of human disease. This approach has been widely used to model a variety of cancers in mice. Comprehensive mutation profiles are generated for individual tumors through amplification of transposon insertion sites followed by high-throughput sequencing. Subsequent statistical analyses identify common insertion sites (CISs), which are predicted to be functionally involved in tumorigenesis. Current methods utilized for SB insertion site analysis have some significant limitations. For one, they do not account for transposon footprints - a class of mutation generated following transposon remobilization. Existing methods also discard quantitative sequence data due to uncertainty regarding the extent to which it accurately reflects mutation abundance within a heterogeneous tumor. Additionally, computational analyses generally assume that all potential insertion sites have an equal probability of being detected under non-selective conditions, an assumption without sufficient relevant data. The goal of our study was to address these potential confounding factors in order to enhance functional interpretation of insertion site data from tumors. RESULTS: We describe here a novel method to detect footprints generated by transposon remobilization, which revealed minimal evidence of positive selection in tumors. We also present extensive characterization data demonstrating an ability to reproducibly assign semi-quantitative information to individual insertion sites within a tumor sample. Finally, we identify apparent biases for detection of inserted transposons in several genomic regions that may lead to the identification of false positive CISs. 
    CONCLUSION: The information we provide can be used to refine analyses of data from insertional mutagenesis screens, improving functional interpretation of results and facilitating the identification of genes important in cancer development and progression.
    BMC Genomics 12/2014; 15(1):1150. DOI:10.1186/1471-2164-15-1150 · 4.04 Impact Factor
  •
    ABSTRACT: DNA is packaged together with proteins, such as histones, in the nucleus of a cell to form a fiber called chromatin. The nature of this packaging, the "chromatin structure", is essential for proper cell functioning. This is illustrated by the fact that perturbing chromatin can be associated with many diseases. Hence, artificial perturbation of chromatin may give important new insights into its function. In this dissertation, we have perturbed chromatin by 1) inducing mutations by integrating retroviruses and transposons into DNA, and 2) evicting histones from chromatin and inducing DNA breaks, by the application of anti-cancer drugs. As a means of perturbing chromatin, DNA integrating elements such as retroviruses and transposons are used in gene regulation and cancer research, among other fields. In cancer research, DNA integrating elements are used for detecting cancer genes from tumor screens. We presented a novel algorithm that fully automates this detection, thus removing any potential for bias induced by manual analysis. In gene regulation, DNA integrating elements can be used for studying the chromatin position effect by the location-dependent activation of transgenes present within randomly integrated transposons. We presented a high-throughput method for studying the chromatin position effect using DNA integrating elements, and studied genome-wide transgene expression values generated using this method, especially in relation to enhancers and domains associated with the nuclear lamina. For both applications of DNA integrating elements, it is important to realize that integrations are randomly, but not uniformly randomly, distributed across the genome. For this purpose, we generated large datasets of integrations that were under minimal selective pressure, for two transposons and one retrovirus. We compared the integration profiles with a wide range of (epi)genomic features to generate bias maps across multiple genomic scales.
    This revealed a hierarchical organization in target site selection, and showed that a substantial fraction of cancer genes retrieved from tumor screens may be false positives. The application of anti-cancer drugs to directly perturb chromatin structure allowed us to take a very low-level approach in studying chromatin. We showed that different drugs target different types of chromatin in evicting histones from chromatin and/or inducing DNA breaks, which can have implications for their chemotherapeutic efficacy. Central themes throughout this dissertation were computational epigenomics and data integration. Due to the complexity of the biology and the data, many of the computational methods were highly customized. Some are more generally applicable. Examples include a method for the normalization of genome-wide sequencing data with a control, and a feature ranking method. However, in general high levels of customization are unavoidable. Therefore, as a conclusion, the careful consideration that must go into decisions regarding this customization was illustrated by demonstrating the substantial impact that these decisions can have on research outcomes.
    03/2015, Degree: PhD, Supervisor: Prof. Dr. L.F.A. Wessels