Article

Transposable elements and genome organisation: A comparative survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence


... While biologically relevant for S. cerevisiae, Simulation 3 potentially underestimates component performance since synthetic insertions in the yeast insertion model are often placed into fragments of TE sequences that occur upstream of tRNA genes in the reference genome [61,62]. To gain insight into other organismal contexts and understand how insertion into reference TE sequences may impact McClintock 2 component performance in yeast, we also investigated a model of random insertion into unique regions of the S. cerevisiae genome (Simulation 4). ...
... These four methods all indicate that Ty2 has the highest overall number of non-reference Ty insertions in this sample of S. cerevisiae strains. While Ty1 is often cited as the most abundant Ty family in S. cerevisiae because of its high copy number in the reference strain S288c [61,62], the finding that Ty2 has the highest number of non-reference insertions in this diverse worldwide sample of S. cerevisiae strains is supported by orthogonal copy-number estimates from the McClintock 2 coverage module (Fig. S14) and a similar depth-based approach used in a recent independent study [82]. In contrast, the fifth component method that shows consistent Ty abundance per strain and high performance in simulated data (RetroSeq) predicts more Ty1 and Ty2 insertions in empirical yeast data, which we infer to be misidentification because of the similarity in LTR sequences for these two Ty families [61,83]. ...
Article
Full-text available
Background Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors. Results We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide consistent estimates of ∼50 non-reference TE insertions per strain and that Ty2 has the highest number of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes.
Finally, we show that best-in-class predictors for yeast applied to resequencing data have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences revealed previously for experimentally-induced Ty1 insertions to spontaneous insertions for other copia-superfamily retrotransposons in yeast. Conclusion McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors in other species.
... In all cases, synthetic genomes were created with the new reproducible simulation framework available in McClintock 2 (see Implementation above), the UCSC sacCer2 version of the S. cerevisiae S288c reference genome (to allow cross-validation with results in Nelson et al. [15]), canonical sequences for S. cerevisiae Ty elements from [58] (https://github.com/bergmanlab/mcclintock/blob/master/test/sac_cer_TE_seqs.fasta), and 5-bp TSDs for all Ty families [15,59,60,61,62]. Likewise, all McClintock jobs that were run on simulated data used the UCSC sacCer2 version of the S. cerevisiae S288c reference genome with reference TE annotations, taxonomy files, and canonical sequences for S. cerevisiae Ty sequences from [58] (provided in https://github.com/bergmanlab/mcclintock/blob/master/test/) ...
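The 5-bp target site duplication (TSD) mentioned in this excerpt can be illustrated with a minimal sketch. This is not the McClintock 2 simulation framework; the function name `insert_with_tsd`, the toy genome, and the placeholder element sequence are all assumptions made for illustration. The sketch only shows the general idea that the bases at the target site end up flanking the inserted element on both sides.

```python
def insert_with_tsd(genome: str, element: str, pos: int, tsd_len: int = 5) -> str:
    """Insert `element` so that the `tsd_len` bases starting at `pos`
    flank it on both sides (a target site duplication).

    Hypothetical helper for illustration only; not part of McClintock.
    """
    if pos < 0 or pos + tsd_len > len(genome):
        raise ValueError("target site must lie fully inside the sequence")
    tsd = genome[pos:pos + tsd_len]
    # Left flank ending in the target site, then the element,
    # then a second copy of the target site, then the right flank.
    return genome[:pos + tsd_len] + element + tsd + genome[pos + tsd_len:]


# Toy data (invented): a 13-bp "genome" and a 10-bp placeholder "element".
toy_genome = "AAAACCGTAGGGG"
toy_element = "TGTTGGAATA"
result = insert_with_tsd(toy_genome, toy_element, pos=4)
print(result)
```

The synthetic sequence is longer than the original by the element length plus the TSD length, which is one reason insertion coordinates are usually reported relative to the unmodified reference.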
... While biologically relevant for S. cerevisiae, the tRNA promoter insertion model used in Simulation 3 potentially underestimates component performance since synthetic insertions are often placed into fragments of TE sequences that occur upstream of tRNA genes in the reference genome [58,62]. To gain insight into other organismal contexts and understand how insertion into reference TE sequences may impact McClintock 2 component performance in yeast, we also investigated a model of random insertion into unique regions of the S. cerevisiae genome (Simulation 4). ...
... These four methods all indicate that Ty2 has the highest overall number of non-reference Ty insertions in this sample of S. cerevisiae strains. While Ty1 is often cited as the most abundant Ty family in S. cerevisiae because of its high copy number in the reference strain S288c [62,58], the finding that Ty2 has the highest number of non-reference insertions in this diverse worldwide sample of S. cerevisiae strains is supported by orthogonal copy-number estimates from the McClintock 2 coverage module (Fig. S14) and a similar depth-based approach used in a recent independent study [79]. In contrast, the fifth component method that shows consistent Ty abundance per strain and high performance in simulated data (RetroSeq) predicts more Ty1 and Ty2 insertions in empirical yeast data, which we infer to be misidentification because of the similarity in LTR sequences for these two Ty families [80,62]. ...
Preprint
Full-text available
Background Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute or evaluate multiple TE insertion detectors. Results We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide a consistent and biologically meaningful view of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes, as evaluated by coverage-based abundance estimates and expected patterns of tRNA promoter targeting. Finally, we show that best-in-class predictors for yeast have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences first revealed experimentally for Ty1 to natural insertions and related copia-superfamily retrotransposons in yeast.
Conclusion McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors for other species.
... Saccharomyces cerevisiae contains 5 different classes of retrotransposons, called Ty1-Ty5, in its genome (1). Transposition of these retrotransposons takes place via an RNA intermediate (2). ...
... Ty1 and Ty2 have 334 bp Long Terminal Repeat (LTR) sequences in their 5' and 3' ends, named as delta elements. Ty1 has 30 copies per genome while Ty2 has about 10 copies per genome in yeast (1). Apart from full length Ty elements, the yeast genome contains a large number of solo LTR sequences. ...
... Apart from full length Ty elements, the yeast genome contains a large number of solo LTR sequences. Ty3 and Ty5 have a different genome organization (1). The genome organization of Ty3 is similar to the human immunodeficiency virus. ...
... Application of McClintock to simulated S. cerevisiae genomes with single synthetic TE insertions To test McClintock and its component methods, we used simulated WGS datasets based on the genome of the model eukaryote, S. cerevisiae. We chose S. cerevisiae for testing McClintock because its reference genome is relatively small and has been completely determined [40], it has large samples of publicly-available resequenced genomes [41-43], and the genome biology of its TEs is well-characterized [44,45] ... Figure 1A. In general, analysis of unmodified simulated reference genomes showed that McClintock component methods cannot detect all reference TEs, but also typically have low false positive rates for predicting non-reference TE insertions when they are truly absent (see Additional Table 1 and Additional Table 2). ...
... Simulations are useful for testing methods under controlled settings, but do not capture all aspects of how methods perform when applied to real data. Since much is known about the expected insertion preferences of TEs in S. cerevisiae [44,46-55] ... In general, split-read methods predict between 5-20 non-reference TE insertions per strain, whereas read-pair methods predict approximately 40-100 non-reference TE insertions per strain (Figure 3). Numbers of reference TEs predicted per strain in real data (Additional Figure 5) are generally lower than in simulated genomes (Table 4 and Additional Table 1). ...
... Active TE families in S. cerevisiae are known to target tRNA genes [44,47-50,53-55]. The highest density of Ty1 and Ty2 insertions is in the 200 bp upstream of the tRNA transcription start site [44,47,50,53,54]. ...
Preprint
Full-text available
Background Transposable element (TE) insertions are among the most challenging type of variants to detect in genomic data because of their repetitive nature and complex mechanisms of replication. Nevertheless, the recent availability of large resequencing datasets has spurred the development of many new methods to detect TE insertions in whole genome shotgun sequences. These methods generate output in diverse formats and have a large number of software and data dependencies, making their comparative evaluation challenging for potential users. Results Here we develop an integrated bioinformatics pipeline for the detection of TE insertions in whole genome shotgun data, called McClintock (https://github.com/bergmanlab/mcclintock), that automatically runs and generates standardized output for multiple TE detection methods. We demonstrate the utility of the McClintock system by performing comparative evaluation of six TE detection methods using simulated and real genome data from the model microbial eukaryote, Saccharomyces cerevisiae. We find substantial variation among McClintock component methods in their ability to detect non-reference insertions in the yeast genome, but show that non-reference TEs at nearly all biologically-realistic locations can be detected in simulated data by combining multiple methods that use split-read and read-pair evidence. In general, our results reveal that split-read methods detect fewer non-reference TE insertions than read-pair methods, but generally have much higher positional accuracy. Analysis of a large sample of real yeast genomes reveals that most, but not all, McClintock component methods can recover known aspects of TE biology in yeast such as the transpositional activity status of families, tRNA gene target preferences, and target site duplication structure, albeit with varying levels of positional accuracy.
Conclusions Our results suggest that no single TE detection method currently provides comprehensive detection of non-reference TEs, even in the context of a simplified model eukaryotic genome like S. cerevisiae. In spite of these limitations, the McClintock system provides a framework for testing, developing and integrating results from multiple TE detection methods to achieve this ultimate aim, as well as useful guidance for yeast researchers to select appropriate TE detection tools.
... The completeness of the QM6a and other 11 high-quality fungal genomes also allowed us to accurately survey genome-wide repetitive features and their correlation to RIP using the RepeatMasker search program (http://www.repeatmasker.org/). We were able to identify almost all Ty elements along the 16 chromosomes of Saccharomyces cerevisiae [65,66] (Additional File 1: Table A15). Our results also confirm that the genome of Neurospora crassa accumulates fragmented and Schizosaccharomyces pombe [67], Cryptococcus neoformans [68] and Coprinopsis cinerea [69]. ...
... The copy numbers of transposon sequences in QM6a we report here (Additional in all of that 16 yeast's chromosomes had been determined before [65,66]. ...
... replaced by Ns). To obtain high-confidence data, we first analyzed the genome sequences of Saccharomyces cerevisiae because the number and location of five different Ty elements in all its 16 yeast chromosomes had been reported previously [65,66]. When the preliminary RepeatMasker data were filtered with two parameters (length >= 140, Smith-Waterman local similarity scores >= 450), the final data (Additional File 1: Table A15) were quite consistent with the known results [65]. ...
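The two-parameter filter described in this excerpt (length >= 140, Smith-Waterman score >= 450) can be sketched as follows. This is a hedged illustration, not the authors' script: the `Hit` record, the `filter_hits` function, the assumption that hit length is computed from 1-based inclusive coordinates, and all example records are invented for the purpose of the sketch.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One already-parsed repeat annotation (hypothetical record layout)."""
    sw_score: int  # Smith-Waterman local similarity score
    chrom: str
    start: int     # assumed 1-based, inclusive
    end: int       # assumed 1-based, inclusive
    repeat: str


def filter_hits(hits: List[Hit], min_len: int = 140, min_score: int = 450) -> List[Hit]:
    """Keep hits that meet both the length and the score threshold."""
    return [
        h for h in hits
        if (h.end - h.start + 1) >= min_len and h.sw_score >= min_score
    ]


# Invented example records, chosen to exercise both thresholds.
hits = [
    Hit(2400, "chrI", 1000, 1333, "delta"),   # 334 bp, high score: kept
    Hit(500, "chrI", 5000, 5100, "sigma"),    # 101 bp: too short
    Hit(300, "chrII", 200, 600, "tau"),       # long enough, score too low
    Hit(450, "chrII", 1000, 1139, "delta"),   # exactly 140 bp and score 450: kept
]
kept = filter_hits(hits)
for h in kept:
    print(h.chrom, h.start, h.end, h.repeat, h.sw_score)
```

Both thresholds are inclusive here because the excerpt writes them with ">="; whether "length" in the original analysis refers to the hit span or to some other measure is not stated in the excerpt.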
Preprint
Trichoderma reesei (Ascomycota, Pezizomycotina) QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis and plant-fungal interactions. The genomes of QM6a and its high-enzyme producing mutants have been sequenced by second-generation-sequencing methods and are publicly available from the Joint Genome Institute (JGI). While these genome sequences have offered useful information for genomic and transcriptomic studies, their limitations and especially their short read lengths make them poorly suited for some particular biological problems, including assembly, genome-wide determination of chromosome architecture and genetic modification or engineering. We integrated Pacific Biosciences and Illumina sequencing platforms for the highest-quality genome assembly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10877 genes) with 1630 newly-predicted genes and >1.5 Mb of new sequences. Most new sequences are located on AT-rich blocks, including 7 centromeres, 14 subtelomeres and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These findings open up a new perspective for future centromere and chromosome architecture studies. Next, we demonstrate that sexual crossing readily induced cytosine-to-thymine point mutations on both tandem and unlinked duplicated sequences. We also show by bioinformatic analysis that Trichoderma reesei has evolved a robust repeat-induced point mutation (RIP) system to accumulate AT-rich sequences, with longer AT-rich blocks having more RIP mutations. The widespread distribution of AT-rich blocks correlates genome-wide partitions with gene clusters, explaining why clustering of genes has been reported to not influence gene expression in Trichoderma reesei.
Compartmentation of ancestral gene clusters by AT-rich blocks might promote flexibilities that are evolutionarily advantageous in this fungus’ soil habitats and other natural environments. Our analyses, together with the complete genome sequence, provide a better blueprint for biotechnological and industrial applications.
... The only TEs in the S. cerevisiae haploid reference genome (strain S288c) are LTR retrotransposons [20,22]. An initial analysis identified 3.1% of the genome as LTR retrotransposons and solo LTRs, reporting 331 total insertions consisting of 280 solo LTRs or LTR fragments and 51 retrotransposons [65]. More recent analyses reported an overall retrotransposon content of 3.4% or 3.5% [66,67]. ...
... More recent analyses reported an overall retrotransposon content of 3.4% or 3.5% [66,67]. The annotation of the current version of the reference genome includes 383 total LTRs and 50 retrotransposons [22], though 51 full-length retrotransposons have been previously reported [65,67] as a result of identifying 32 full-length Ty1 elements, as opposed to 31 in the annotated genome. These retrotransposons include five families, Ty1-Ty5, with Ty3 being the only gypsy-type (Metaviridae) and the other four copia-type (Pseudoviridae) retrotransposons [65]. ...
... The annotation of the current version of the reference genome includes 383 total LTRs and 50 retrotransposons [22], though 51 full-length retrotransposons have been previously reported [65,67] as a result of identifying 32 full-length Ty1 elements, as opposed to 31 in the annotated genome. These retrotransposons include five families, Ty1-Ty5, with Ty3 being the only gypsy-type (Metaviridae) and the other four copia-type (Pseudoviridae) retrotransposons [65]. The relative abundance of these elements in the reference genome is (including solo LTRs): Ty1 > Ty2 > Ty3 > Ty4 > Ty5 [22,65]. ...
Article
Full-text available
Abstract Genomics and other large-scale analyses have drawn increasing attention to the potential impacts of transposable elements (TEs) on their host genomes. However, it remains challenging to transition from identifying potential roles to clearly demonstrating the level of impact TEs have on genome evolution and possible functions that they contribute to their host organisms. I summarize TE content and distribution in four well-characterized yeast model systems in this review: the pathogens Candida albicans and Cryptococcus neoformans, and the nonpathogenic species Saccharomyces cerevisiae and Schizosaccharomyces pombe. I compare and contrast their TE landscapes to their lifecycles, genomic features, as well as the presence and nature of RNA interference pathways in each species to highlight the valuable diversity represented by these models for functional studies of TEs. I then review the regulation and impacts of the Ty1 and Ty3 retrotransposons from Saccharomyces cerevisiae and Tf1 and Tf2 retrotransposons from Schizosaccharomyces pombe to emphasize parallels and distinctions between these well-studied elements. I propose that further characterization of TEs in the pathogenic yeasts would enable this set of four yeast species to become an excellent set of models for comparative functional studies to address outstanding questions about TE-host relationships.
... Ty2 elements were introduced into a recent ancestor of S. cerevisiae from a related species, Saccharomyces mikatae, as a result of horizontal transfer (23,25). There are 313 copies of Ty1 dispersed throughout the S. cerevisiae S288C genome, including 279 solo LTRs, two other truncated elements, and 32 full-length elements (23,24). The Ty2 family consists of 46 elements, of which 13 are full-length elements, 31 are solo LTRs, and two are other truncations. ...
... The his3AI RIG is also contained in this element to detect retromobility of mini-Ty1HIS3 cDNA. distinguished by a high degree of sequence divergence in the GAG ORF (24,27). ...
... Consistent with the recent transposition of Ty1 and Ty2 elements, the coding regions of members of each family are very homogeneous in sequence, with 86% and 96% invariant amino acids in Ty1 and Ty2 ORFs, respectively (24,26). Furthermore, Ty1 and Ty2 elements in laboratory strains S288C and GRF167 are predominantly autonomous elements that encode functional gRNA and proteins capable of promoting retrotransposition in the absence of other elements (18,24). ...
... Transposases, including MuA, have sequence bias (42-48), and create domain insertion libraries with inconsistent insertion frequencies and regions without insertions (15,39,40). Additionally, transposases target random DNA sequences, causing five in six insertions to be in the incorrect reading frame or wrong direction, and the MuA transposition mechanism results in an unavoidable 5 bp replication at the insertion site (49,50). ...
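The "five in six" figure in this excerpt follows from simple counting, which the sketch below makes explicit. It assumes, as a simplification not stated in the excerpt, that the three reading-frame offsets and the two orientations are equally likely and independent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every insertion lands in one of 3 reading-frame offsets
# and in one of 2 orientations: 6 equally likely outcomes (assumed).
outcomes = list(product(range(3), ("forward", "reverse")))

# Only one outcome is productive: frame offset 0 in the forward orientation.
productive = [o for o in outcomes if o == (0, "forward")]

p_productive = Fraction(len(productive), len(outcomes))
p_unproductive = 1 - p_productive

print(f"productive: {p_productive}, unproductive: {p_unproductive}")
```

Under these assumptions the productive fraction is (1/3) × (1/2) = 1/6, so the remaining 5/6 of insertions are out of frame, in the wrong direction, or both.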
... The current state-of-the-art for generating domain insertion libraries relies on MuA transposase (24). However, MuA transposon-generated libraries have incomplete coverage (15,40) and strong sequence bias (42-48). ...
... The sequence bias and variable efficiency of transposases is well established (42-48). We and others showed, in different protein families, that this can result in domain insertion libraries that have bias, incomplete coverage, and variable coverage redundancy (15,24,41). ...
Article
Full-text available
Domain recombination is a key principle in protein evolution and protein engineering, but inserting a donor domain into every position of a target protein is not easily experimentally accessible. Most contemporary domain insertion profiling approaches rely on DNA transposons, which are constrained by sequence bias. Here, we establish Saturated Programmable Insertion Engineering (SPINE), an unbiased, comprehensive, and targeted domain insertion library generation technique using oligo library synthesis and multi-step Golden Gate cloning. Through benchmarking to MuA transposon-mediated library generation on four ion channel genes, we demonstrate that SPINE-generated libraries are enriched for in-frame insertions, have drastically reduced sequence bias as well as near-complete and highly-redundant coverage. Unlike transposon-mediated domain insertion that was severely biased and sparse for some genes, SPINE generated high-quality libraries for all genes tested. Using the Inward Rectifier K+ channel Kir2.1, we validate the practical utility of SPINE by constructing and comparing domain insertion permissibility maps. SPINE is the first technology to enable saturated domain insertion profiling. SPINE could help explore the relationship between domain insertions and protein function, and how this relationship is shaped by evolutionary forces and can be engineered for biomedical applications.
... Retrotransposition can thus lead to gene loss of function or gene expression modifications [8]. The first case described in 1988 was a patient with hemophilia A resulting from an L1 insertion in the F8 gene [9]. ...
Article
About 0.3% of all variants are due to de novo mobile element insertions (MEIs). The massive development of next-generation sequencing has made it possible to identify MEIs on a large scale. We analyzed exome sequencing (ES) data from 3232 individuals (2410 probands) with developmental and/or neurological abnormalities, with MELT, a tool designed to identify MEIs. The results were filtered by frequency, impacted region and gene function. Following phenotype comparison, two candidates were identified in two unrelated probands. The first mobile element (ME) was found in a patient referred for poikilodermia. A homozygous insertion was identified in the FERMT1 gene involved in Kindler syndrome. RNA study confirmed its pathological impact on splicing. The second ME was a de novo Alu insertion in the GRIN2B gene involved in intellectual disability, and detected in a patient with a developmental disorder. The frequency of de novo exonic MEIs in our study is concordant with previous studies on ES data. This project, which aimed to identify pathological MEIs in the coding sequence of genes, confirms that including detection of MEs in the ES pipeline can increase the diagnostic rate. This work provides additional evidence that ES could be used alone as a diagnostic exam.
... We also ignored antiparallel insertions because they do not produce the FRB-FKBP pair. As shown in Table 1, we found that the positional coverage for fosA3 and catI is about 70% (71% and 70%, respectively) but only 51% for ermB, indicating the different levels of MuA insertion bias for different genes [39-41]. Specific antibiotic challenges reduced the positional coverages of self- and assisted libraries by <10% for fosA3, 10-20% for ermB and 22-36% for catI. The large difference in coverage reductions for different genes indicates their different abilities to recover the antibiotic-resistant functions from reconstituted fragments. ...
... Another limitation of the current technique is different levels of MuA insertion bias for different genes [39-41]. Currently, the positional coverage for fosA3 and catI is about 70% (71% and 70%, respectively) but only 51% for ermB. In other words, potentially better variants may have been missed due to the imperfect coverage. ...
Article
Splitting a protein at a position may lead to self- or assisted-complementary fragments depending on whether two resulting fragments can reconstitute to maintain the native function spontaneously or require assistance from two interacting molecules. Assisted complementary fragments with high contrast are an important tool for probing biological interactions. However, only a small number of assisted-complementary split-variants have been identified due to manual, labour-intensive optimization of a candidate gene. Here, we introduce a technique for high-throughput split-protein profiling (HiTS) that allows fast identification of self- and assisted complementary positions by transposon mutagenesis, a rapamycin-regulated FRB-FKBP protein interaction pair, and deep sequencing. We test this technique by profiling three antibiotic-resistant genes (fosfomycin-resistant gene, fosA3, erythromycin-resistant gene, ermB, and chloramphenicol-resistant gene, catI). Self- and assisted complementary fragments discovered by the high-throughput technique were subsequently confirmed by low-throughput testing of individual split positions. Thus, the HiTS technique provides a quicker alternative for discovering the proteins with suitable self- and assisted-complementary split positions when combining with a readout such as fluorescence, bioluminescence, cell survival, gene transcription or genome editing.
... The Saccharomyces cerevisiae genome has five different types of mobile genetic elements known as Ty (Transposon Yeast) (1). Ty elements propagate via the RNA intermediate in the yeast genome. ...
... It has been discovered as an insertional element within the HIS4 gene of S. cerevisiae (5). Later it was found that it is present as five to ten copies in most of the S. cerevisiae laboratory strains (1). Its genome size is 5.9 Kbp and it contains 0.33 Kbp long terminal repeats (LTR) at its 5' and 3' ends. ...
Article
Objective: Ty2-917 is a low copy retrotransposon found in the Saccharomyces cerevisiae genome. It has structural similarities to metazoan retroviruses in terms of genome organization and propagation mechanisms in the host cells. The objective of this study is to analyze the effects of autophagy signaling on the transcriptional regulation of Ty2 in yeast cells. Materials and Methods: Ty2-LacZ gene fusions on the YEp vectors have been used as reporter genes to analyze the effects of amino acid starvation, nitrogen source, and autophagy signals on the transcription of Ty2. These reporter gene fusions have been transformed into the wild type and also isogenic mutant yeast strains that are defective for one of the regulatory factors involved in nutrient sensing and signaling. To activate autophagy signaling, yeast transformants were treated with caffeine or 3-amino-1,2,4-triazole. Transcription levels of Ty2-LacZ gene fusions in treated and untreated yeast cells were analyzed by β-galactosidase assays. Results: Results of this study show that transcription of Ty2 decreases up to eightfold in response to amino acid starvation. Caffeine treatment of the yeast cells also represses Ty2 transcription, independent of the TOR1 pathway. In addition, our results suggest that Ty2 transcription is also regulated in a nitrogen source-dependent manner through the GATA factors. Conclusions: Our results suggest that activation of the autophagy signal results in significant repression of Ty2 transcription. We have found that the GATA class of transcription factors is involved in the regulation of Ty2 transcription in response to autophagy signaling.
... Among the five distinct families, Ty1, Ty2, Ty3, Ty4 and Ty5, identified in the reference strain S288C, Ty1 is the most abundant and is still active (3–6). Following Ty1 transcription by RNA polymerase II (Pol II), the Ty1 mRNA is translated into Gag and the Gag-Pol polyprotein that includes sequentially the capsid protein (Gag), the protease (PR), the integrase (IN) and the reverse transcriptase (RT). ...
Article
Full-text available
Long-terminal repeat (LTR) retrotransposons are genetic elements that, like retroviruses, replicate by reverse transcription of an RNA intermediate into a complementary DNA (cDNA) that is then integrated into the host genome by their own integrase. The Ty1 LTR retrotransposon has proven to be a reliable working model to investigate retroelement integration site preference. However, the low yield of recombinant Ty1 integrase production reported so far has been a major obstacle for structural studies. Here we analyze the biophysical and biochemical properties of a stable and functional recombinant Ty1 integrase highly expressed in E. coli. The recombinant protein is monomeric and has an elongated shape harboring the three-domain structure common to all retroviral integrases at the N-terminal half, an extra folded region and a large intrinsically disordered region at the C-terminal half. Recombinant Ty1 integrase efficiently catalyzes concerted integration in vitro and the N-terminal domain displays similar activity. These studies will facilitate structural analyses and may help elucidate the molecular mechanisms governing Ty1 specific integration into safe places in the genome.
... Homology-directed recombination can also be used to maintain telomeres in a number of cancer cells and in models where telomerase is experimentally inactivated, as for example in Arabidopsis thaliana or Saccharomyces cerevisiae (7). Next to the telomere, the subtelomere is usually a gene-poor region comprising repeated elements, such as transposable elements (TEs), satellite sequences, ribosomal DNA (rDNA), or paralogous genes, which are often shared between different subtelomeres (8–15). The described gene families are involved in diverse life cycle and adaptive processes such as metabolism in S. cerevisiae (11,13), surface antigen repertoires in Plasmodium falciparum (16), resistance to pathogens in common bean (14), or olfactory receptors and cytoskeleton (proteins of the WASP family) in human (17,18). ...
... The repetitive nature of the region promotes homologous recombination (HR), unequal sister chromatid exchange (SCE), break-induced replication (BIR) and replication slippage (8,14,35,40,43–48). Transposition also contributes to subtelomere variations (10,14,39,46). These mechanisms, along with others such as non-homologous end-joining (NHEJ)-mediated translocations and fusions, have been described in a variety of species and can lead to segmental duplications and amplification of repeated elements (14,44,46,47). ...
Article
Full-text available
In most eukaryotes, subtelomeres are dynamic genomic regions populated by multi-copy sequences of different origins, which can promote segmental duplications and chromosomal rearrangements. However, their repetitive nature has complicated the efforts to sequence them, analyse their structure and infer how they evolved. Here, we use recent genome assemblies of Chlamydomonas reinhardtii based on long-read sequencing to comprehensively describe the subtelomere architecture of the 17 chromosomes of this model unicellular green alga. We identify three main repeated elements present at subtelomeres, which we call Sultan, Subtile and Suber, alongside three chromosome extremities with ribosomal DNA as the only identified component of their subtelomeres. The most common architecture, present in 27 out of 34 subtelomeres, is a heterochromatic array of Sultan elements adjacent to the telomere, followed by a transcribed Spacer sequence, a G-rich microsatellite and transposable elements. Sequence similarity analyses suggest that Sultan elements underwent segmental duplications within each subtelomere and rearranged between subtelomeres at a much lower frequency. Analysis of other green algae reveals species-specific repeated elements that are shared across subtelomeres, with an overall organization similar to C. reinhardtii. This work uncovers the complexity and evolution of subtelomere architecture in green algae.
... 1A). Few of these sequence variants have already been described (Jordan and McDonald 1998;Kim et al. 1998;Bleykasten-Grosshans et al. 2013;Czaja et al. 2020) and our results provide an exhaustive catalog as well as a detailed view of their mosaic structure. The variants can be organized in three different subfamilies according to the origin of the gag coding sequence. ...
... 1B). As previously shown in a small subset of strains (Kim et al. 1998;Gabriel et al. 2006;Carr et al. 2012;Bleykasten-Grosshans et al. 2013) Ty1 and Ty2 are the most represented families with a total of 17,127 (i.e. 35.1 %) and 24,517 elements (i.e. ...
Article
Full-text available
Transposable elements (TEs) are an important source of genetic variation, with dynamics and content that differ greatly across a wide range of species. The origin of the intraspecific content variation is not always clear and little is known about its precise nature. Here, we surveyed the species-wide content of the Ty LTR-retrotransposons in a broad collection of 1,011 Saccharomyces cerevisiae natural isolates to understand what underlies the variation of the repertoire, i.e. the type and number of Ty elements. We have compiled an exhaustive catalog of all the TE sequence variants present in the S. cerevisiae species by identifying a large set of new sequence variants. The characterization of the TE content in each isolate clearly highlighted that each subpopulation exhibits a unique and specific repertoire, retracing the evolutionary history of the species. Most interestingly, we have shown that ancient interspecific hybridization events had a major impact on the birth of new sequence variants and therefore on the shaping of the TE repertoires. We also investigated the transpositional activity of these elements in a large set of natural isolates, and we found a broad variability related to the level of ploidy as well as the genetic background. Overall, our results show that the evolution of the Ty content is deeply impacted by clade-specific events such as introgressions and therefore follows the population structure. In addition, our study lays the foundation for future investigations to better understand the transpositional regulation and more broadly the TE-host interactions.
... Saccharomyces cerevisiae has been used as a model to understand retrotransposition for decades. Saccharomyces cerevisiae TEs are made up of LTR retrotransposons which fall into six families, Ty1, Ty2, Ty3, Ty3_1p, Ty4, and Ty5 (Kim et al. 1998;Carr et al. 2012). Ty elements make up a small fraction of the genome (<5%), with a total of approximately 50 full-length Ty elements and over 400 solo LTRs in the S. cerevisiae reference genome (Kim et al. 1998;Carr et al. 2012). ...
... Saccharomyces cerevisiae TEs are made up of LTR retrotransposons which fall into six families, Ty1, Ty2, Ty3, Ty3_1p, Ty4, and Ty5 (Kim et al. 1998;Carr et al. 2012). Ty elements make up a small fraction of the genome (<5%), with a total of approximately 50 full-length Ty elements and over 400 solo LTRs in the S. cerevisiae reference genome (Kim et al. 1998;Carr et al. 2012). Ty1 is the most abundant and well-studied Ty element, representing almost 70% of the full length TEs in the reference genome, with its closely related family Ty2 making up a further 25%. ...
Article
Full-text available
Barbara McClintock first hypothesized that interspecific hybridization could provide a “genomic shock” that leads to the mobilization of transposable elements. This hypothesis is based on the idea that regulation of transposable element movement is potentially disrupted in hybrids. However, the handful of studies testing this hypothesis have yielded mixed results. Here, we set out to identify if hybridization can increase transposition rate and facilitate colonization of transposable elements in Saccharomyces cerevisiae x Saccharomyces uvarum interspecific yeast hybrids. S. cerevisiae have a small number of active long terminal repeat (LTR) retrotransposons (Ty elements), while their distant relative S. uvarum have lost the Ty elements active in S. cerevisiae. While the regulation system of Ty elements is known in S. cerevisiae, it is unclear how Ty elements are regulated in other Saccharomyces species, and what mechanisms contributed to the loss of most classes of Ty elements in S. uvarum. Therefore, we first assessed whether transposable elements could insert in the S. uvarum sub-genome of a S. cerevisiae x S. uvarum hybrid. We induced transposition to occur in these hybrids and developed a sequencing technique to show that Ty elements insert readily and non-randomly in the S. uvarum genome. We then used an in vivo reporter construct to directly measure transposition rate in hybrids, demonstrating that hybridization itself does not alter rate of mobilization. However, we surprisingly show that species-specific mitochondrial inheritance can change transposition rate by an order of magnitude. Overall, our results provide evidence that hybridization can potentially facilitate the introduction of transposable elements across species boundaries and alter transposition via mitochondrial transmission, but that this does not lead to unrestrained proliferation of transposable elements suggested by the genomic shock theory.
... Ty3 has the most specific pattern of integration within 20 base pairs upstream of the Pol III transcription start sites, while Ty1 integrates into a larger one-kilobase window upstream of these genes (Baller et al. 2012; Chalker and Sandmeyer 1992; Devine and Boeke 1996; Mularoni et al. 2012; Patterson et al. 2019). The integration preference of Ty2 and Ty4 for the same 1-kb window upstream of Pol III-transcribed genes was deduced from the distribution of their endogenous copies in the yeast genome (Carr et al. 2012; Kim et al. 1998). Because tDNAs are in multicopy and thus individually non-essential, Ty1-Ty4 integration patterns minimize genetic damage to their host and yet allow these elements to replicate. ...
... Switching between repression and activation of Ty expression undoubtedly may help minimize the adverse effects of their propagation while maintaining a relatively low copy number in the cells. Notably, Ty1 and Ty2 are the two most abundant Ty families in S288C (Carr et al. 2012; Kim et al. 1998), suggesting that their integration in a one-kilobase window upstream of Pol III-transcribed genes may have contributed to their successful propagation in the genome. Ty4, of which there are only defective copies located in the upstream region of the tDNAs, has not had the same success probably because of intrinsic transcription defects (Hug and Feldmann 1996). ...
Article
Full-text available
Transposable elements are ubiquitous in genomes. Their successful expansion depends in part on their sites of integration in their host genome. In Saccharomyces cerevisiae, evolution has selected various strategies to target the five Ty LTR-retrotransposon families into gene-poor regions in a genome, where coding sequences occupy 70% of the DNA. The integration of Ty1/Ty2/Ty4 and Ty3 occurs upstream and at the transcription start site of the genes transcribed by RNA polymerase III, respectively. Ty5 has completely different integration site preferences, targeting heterochromatin regions. Here, we review the history that led to the identification of the cellular tethering factors that play a major role in anchoring Ty retrotransposons to their preferred sites. We also question the involvement of additional factors in the fine-tuning of the integration site selection, with several studies converging towards an importance of the structure and organization of the chromatin.
... However, these reference genomes are not necessarily representative of the original parental genomes of S. pastorianus. Although S. pastorianus genomes are available, they were sequenced with short-read sequencing technology [10–13], preventing assembly of large repetitive stretches of several thousand base pairs, such as Ty elements or paralogous genes often found in Saccharomyces genomes [21]. The resulting S. pastorianus genome assemblies are thus incomplete and fragmented into several hundred or thousand contigs [10–13]. ...
... Sequence alignments of raw long reads revealed five reads (from 20.6 to 36.7 Kbp) linking the right arm of ScI to the left arm of ScXIV at position ~561 Kbp (Fig. 1c). This location corresponded to a Ty2 repetitive element, known to mediate recombination within Saccharomyces genomes [21]. In addition to the increased coverage of the right arm of ScI, the left arm of ScXIV showed decreased sequencing coverage up until the ~561 Kbp position. ...
Article
Full-text available
Background: The lager brewing yeast, S. pastorianus, is a hybrid between S. cerevisiae and S. eubayanus with extensive chromosome aneuploidy. S. pastorianus is subdivided into Group 1 and Group 2 strains, where Group 2 strains have higher copy number and a larger degree of heterozygosity for S. cerevisiae chromosomes. As a result, Group 2 strains were hypothesized to have emerged from a hybridization event distinct from Group 1 strains. Current genome assemblies of S. pastorianus strains are incomplete and highly fragmented, limiting our ability to investigate their evolutionary history. Results: To fill this gap, we generated a chromosome-level genome assembly of the S. pastorianus strain CBS 1483 from Oxford Nanopore MinION DNA sequencing data and analysed the newly assembled subtelomeric regions and chromosome heterozygosity. To analyse the evolutionary history of S. pastorianus strains, we developed Alpaca: a method to compute sequence similarity between genomes without assuming linear evolution. Alpaca revealed high similarities between the S. cerevisiae subgenomes of Group 1 and 2 strains, and marked differences from sequenced S. cerevisiae strains. Conclusions: Our findings suggest that Group 1 and Group 2 strains originated from a single hybridization involving a heterozygous S. cerevisiae strain, followed by different evolutionary trajectories. The clear differences between both groups may originate from a severe population bottleneck caused by the isolation of the first pure cultures. Alpaca provides a computationally inexpensive method to analyse evolutionary relationships while considering non-linear evolution such as horizontal gene transfer and sexual reproduction, providing a complementary viewpoint beyond traditional phylogenetic approaches.
... In the plant kingdom, TEs cover 82.2% and between 85-90% of the wheat and maize genomes respectively [10–12]. In the fungal kingdom, TE content is less than 30% of the genome [13], and TEs make up only 3% of yeast genomes [14]. TE content is highly variable within Insecta [1,15]; the genomic proportion of TEs ranges from 2% in Belgica antarctica (Diptera) [16] to 65% in Locusta migratoria (Orthoptera) [17] and reaches up to 75% of the genome of Vandiemenella viatica (Orthoptera). ...
Article
Full-text available
Transposable elements (TEs) are a major component of eukaryotic genomes and are present in almost all eukaryotic organisms. TEs are highly dynamic between and within species, which significantly affects the general applicability of the TE databases. Orthoptera is the only known group in the class Insecta with a significantly enlarged genome (0.93-21.48 Gb). When analyzing the large genome using the existing TE public database, the efficiency of TE annotation is not satisfactory. To address this limitation, it becomes imperative to continually update the available TE resource library and the need for an Orthoptera-specific library as more insect genomes are publicly available. Here, we used the complete genome data of 12 Orthoptera species to de novo annotate TEs, then manually re-annotate the unclassified TEs to construct a non-redundant Orthoptera-specific TE library: Orthoptera-TElib. Orthoptera-TElib contains 24,021 TE entries including the re-annotated results of 13,964 unknown TEs. The naming of TE entries in Orthoptera-TElib adopts the same naming as RepeatMasker and Dfam and is encoded as the three-level form of “level1/level2-level3”. Orthoptera-TElib can be directly used as an input reference database and is compatible with mainstream repetitive sequence analysis software such as RepeatMasker and dnaPipeTE. When analyzing TEs of Orthoptera species, Orthoptera-TElib performs better TE annotation as compared to Dfam and Repbase regardless of using low-coverage sequencing or genome assembly data. The most improved TE annotation result is Angaracris rhodopa, which has increased from 7.89% of the genome to 53.28%. Finally, Orthoptera-TElib is stored in Sqlite3 for the convenience of data updates and user access. Supplementary Information The online version contains supplementary material available at 10.1186/s13100-024-00316-x.
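The three-level naming form “level1/level2-level3” mentioned in this abstract can be illustrated with a minimal Python sketch. This is not code from Orthoptera-TElib; the function name `parse_te_name`, the `TEName` type, and the choice to treat missing levels as `None` are assumptions made for illustration only.

```python
from typing import NamedTuple, Optional


class TEName(NamedTuple):
    """Hypothetical container for a three-level TE classification label."""
    level1: str
    level2: Optional[str]
    level3: Optional[str]


def parse_te_name(name: str) -> TEName:
    """Split a 'level1/level2-level3' label into its parts.

    Assumption: the first '/' separates level1 from the rest, and the
    first '-' after it separates level2 from level3. Missing levels
    are returned as None.
    """
    level1, _, rest = name.strip().partition("/")
    if not rest:
        return TEName(level1, None, None)
    level2, _, level3 = rest.partition("-")
    return TEName(level1, level2, level3 or None)


if __name__ == "__main__":
    for label in ("level1/level2-level3", "DNA/TcMar-Tc1", "LTR/Gypsy"):
        print(label, "->", parse_te_name(label))
```

Labels with only one or two levels are kept rather than rejected, since class labels in RepeatMasker-style annotation do not always carry all three levels.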
... In S. cerevisiae, LTR is the most abundant class; 51 LTR copies are conserved at full length (Kim et al., 1998). In S. pombe, LTR is the only transposon class encoded in the genome. ...
Preprint
Full-text available
Transposons are mobile DNA elements that encode genes for their own mobility. During evolution, transposons accumulated copies in genomic DNA, whereas many of them lost their mobile activity due to deletions or point mutations in the DNA elements required for their mobility. Here, we focused on the transposon-encoded genes which are directly involved in replication, excision, and integration of transposon DNA, i.e. transposon-mobility genes in the C. elegans genome. Among the 62,773 copies of retro- and DNA transposons in the latest assembly of the C. elegans genome (VC2010), 290 transposon-mobility genes conserved the complete open reading frame (ORF) structure. Among them, only 145 genes conserved the critical amino acids at the catalytic core. In contrast to the huge number of transposon copies in the genome, a limited number of genes encoded potentially functional enzymes for transposon mobility. Our finding indicates that only a handful of transposon copies can autonomously transpose in the C. elegans genome.
... Saccharomyces yeast hybrids by performing a large-scale evolution experiment by mutation accumulation (MA) to investigate the near-neutral evolution of TEs in hybrid genomes (Hénault et al., 2020). Saccharomyces cerevisiae genomes harbor five main long terminal repeat (LTR) retrotransposon families named Ty1-Ty5 (Kim et al., 1998). Its undomesticated sister species Saccharomyces paradoxus comprises related families Ty1, Ty3 and Ty5 (Yue et al., 2017), in addition to Tsu4 which was horizontally transferred from Saccharomyces uvarum (Bergman, 2018). ...
Preprint
Full-text available
Transposable elements (TEs) are major contributors to structural genomic variation by creating interspersed duplications of themselves. In return, structural variants (SVs) can affect the genomic distribution of TE copies and shape their load. One long-standing hypothesis states that hybridization could trigger TE mobilization and thus increase TE load in hybrids. We previously tested this hypothesis by performing a large-scale evolution experiment by mutation accumulation (MA) on multiple hybrid genotypes within and between wild populations of the yeasts Saccharomyces paradoxus and Saccharomyces cerevisiae. Using aggregate measures of TE load with short-read sequencing, we found no evidence for TE load increase in hybrid MA lines. Here, we resolve the genomes of the hybrid MA lines with long-read phasing and assembly to precisely characterize the role of SVs in shaping the TE landscape. Highly contiguous phased assemblies of 127 MA lines revealed that SV types like polyploidy, aneuploidy and loss of heterozygosity have large impacts on the TE load. We characterized 18 de novo TE insertions, indicating that transposition only has a minor role in shaping the TE landscape in MA lines. Because the scarcity of TE mobilization in MA lines provided insufficient resolution to confidently dissect transposition rate variation in hybrids, we adapted an in vivo assay to measure transposition rates in various S. paradoxus hybrid backgrounds. We found that transposition rates are not increased by hybridization, but are modulated by many genotype-specific factors including initial TE load, TE sequence variants and mitochondrial DNA inheritance. Our results show the multiple scales at which TE load is shaped in hybrid genomes, being highly impacted by SV dynamics and finely modulated by genotype-specific variation in transposition rates.
... In human and mouse genomes, L1 is the most abundant among transposon classes, with 145 copies in human and 2,811 copies in Mus musculus being conserved at full length (Lander et al., 2001; Penzkofer et al., 2017; Waterston et al., 2002). In Drosophila melanogaster, Saccharomyces cerevisiae, and Arabidopsis thaliana (Zhang and Wessler, 2004), LTR is the most abundant class, with 325 copies in D. melanogaster (Bergman et al., 2006) and 51 copies in S. cerevisiae (Kim et al., 1998) being conserved at full length. LTR is the only transposon class encoded in S. pombe, with 13 copies being conserved at full length (Bowen et al., 2003). ...
Preprint
Full-text available
Transposons are mobile DNA elements that encode genes for their own mobility. Whereas transposon copies accumulate on the genome during evolution, many lose their mobile activity due to mutations. Here, we focus on transposon-encoded genes that are directly involved in the replication, excision, and integration of transposon DNA, which we refer to as "transposon-mobility genes", in the Caenorhabditis elegans genome. Among the 62,773 copies of retro- and DNA transposons in the latest assembly of the C. elegans genome (VC2010), we found that the complete open reading frame structure was conserved in 290 transposon-mobility genes. Critical amino acids at the catalytic core were conserved in only 145 of these 290 genes. Thus, in contrast to the huge number of transposon copies in the genome, only a limited number of transposons are autonomously mobile. We conclude that the comprehensive identification of potentially functional transposon-mobility genes in all transposon orders of a single species can provide a basis of molecular analysis for revealing the developmental, aging, and evolutionary roles of transposons.
... Retrotransposons are pervasive across diverse eukaryotes and influence genome evolution and affect host fitness. The budding yeast Saccharomyces cerevisiae contains Ty1-5 long terminal repeat (LTR)-retrotransposons, with Ty1 as the most abundant element in many laboratory strains (1,2). LTR-retrotransposons are the evolutionary progenitors of retroviruses; Ty1 elements share many structural hallmarks with retroviral genomic RNA and undergo an analogous replication cycle but lack an extracellular phase. ...
Article
Full-text available
Retrotransposons and retroviruses shape genome evolution and can negatively impact genome function. Saccharomyces cerevisiae and its close relatives harbor several families of LTR-retrotransposons, the most abundant being Ty1 in several laboratory strains. The cytosolic foci that nucleate Ty1 virus-like particle (VLP) assembly are not well understood. These foci, termed retrosomes or T-bodies, contain Ty1 Gag and likely Gag-Pol and the Ty1 mRNA destined for reverse transcription. Here, we report an intrinsically disordered N-terminal prion-like domain (PrLD) within Gag that is required for transposition. This domain contains amino acid composition similar to known yeast prions and is sufficient to nucleate prionogenesis in an established cell-based prion reporter system. Deleting the Ty1 PrLD results in dramatic VLP assembly and retrotransposition defects but does not affect Gag protein level. Ty1 Gag chimeras in which the PrLD is replaced with other sequences, including yeast and mammalian prionogenic domains, display a range of retrotransposition phenotypes from wild type to null. We examine these chimeras throughout the Ty1 replication cycle and find that some support retrosome formation, VLP assembly, and retrotransposition, including the yeast Sup35 prion and the mouse PrP prion. Our interchangeable Ty1 system provides a useful, genetically tractable in vivo platform for studying PrLDs, complete with a suite of robust and sensitive assays. Our work also invites study into the prevalence of PrLDs in additional mobile elements.
... Our reconstructions lack density attributable to either IN1 NLS, indicating that they are not conformationally constrained within the complex with Pol III and, therefore, may mediate simultaneous binding to importin-α in vivo. Moreover, conservation of TD1 in Ty2 and Ty4 retrotransposon integrases (Fig. 2b) suggests that an equivalent mechanism may operate for their interaction with AC40 17 and preferential insertion upstream of Pol III-transcribed genes 7 . ...
Article
Full-text available
The yeast Ty1 retrotransposon integrates upstream of genes transcribed by RNA polymerase III (Pol III). Specificity of integration is mediated by an interaction between the Ty1 integrase (IN1) and Pol III, currently uncharacterized at the atomic level. We report cryo-EM structures of Pol III in complex with IN1, revealing a 16-residue segment at the IN1 C-terminus that contacts Pol III subunits AC40 and AC19, an interaction that we validate by in vivo mutational analysis. Binding to IN1 associates with allosteric changes in Pol III that may affect its transcriptional activity. The C-terminal domain of subunit C11, involved in RNA cleavage, inserts into the Pol III funnel pore, providing evidence for a two-metal mechanism during RNA cleavage. Additionally, ordering next to C11 of an N-terminal portion from subunit C53 may explain the connection between these subunits during termination and reinitiation. Deletion of the C53 N-terminal region leads to reduced chromatin association of Pol III and IN1, and a major fall in Ty1 integration events. Our data support a model in which IN1 binding induces a Pol III configuration that may favor its retention on chromatin, thereby improving the likelihood of Ty1 integration.
... and Ty3 are active in Saccharomyces cerevisiae, while Copia and Gypsy are present in Drosophila melanogaster among others [22–25]. ...
Thesis
Almost half of the human genome derives from transposable elements (TEs). Among them, the Long INterspersed Element-1 (LINE-1 or L1) forms the only currently active and autonomous transposable element family in humans. Although hundreds of thousands of L1 copies are dispersed in the human genome, only 80-100 of them are still retrotransposition competent, i.e. able to replicate by a “copy-and-paste” mechanism via an RNA intermediate and a reverse transcription step. On the one hand, L1 activity can have deleterious consequences, such as insertional mutagenesis, and is tightly regulated at the transcriptional or post-transcriptional levels. On the other hand, specific host factors are necessary for completion of the L1 replication cycle. When occurring in the germline or in the early embryo, L1 insertions can be transmitted to the next generation. Somatic retrotransposition has also been described in epithelial tumors and in the brain, both in neural progenitor cells and differentiated neurons. Nevertheless, the extent of L1 expression and mobilization in other somatic tissues remains unclear. Here, we investigated the activity of L1 retrotransposons in human and mouse skeletal muscle cells. We show that the most abundant L1 protein, ORF1p, which is essential to retrotransposition, is undetectable under our experimental conditions in mouse or human muscle samples, while it is readily detected in cancer cells or in testis. Similarly, it was undetected in immortalized mouse or human myoblasts. However, we found that L1 is capable of retrotransposition in human and mouse myoblasts when expressed from a plasmid or from an integrated copy with a constitutive or inducible promoter, respectively.
In conclusion, while L1 expression is under the limit of detection in muscle, myoblasts are permissive to retrotransposition, indicating that these cells express all the cellular factors necessary to achieve this process, and do not express significant restriction factors that would prevent retrotransposition. Altogether, our findings suggest that somatic L1 activity may not be confined to the brain or cancer cells, but could also occur in muscles under environmental or pathological conditions that would unleash L1 expression.
... However, further experiments are needed to identify the exact effect the mutation has on transport of β-caryophyllene. The MST27/tR(UCU) G1 int mutation is an insertion mutation between MST27 and a Ty element [29]. MST27 is a member of the DUP240 multi-gene family and is known to impact vesicle formation [30]. ...
Article
Full-text available
Background: β-Caryophyllene is a plant terpenoid with therapeutic and biofuel properties. Production of terpenoids through microbial cells is a potentially sustainable alternative for production. Adaptive laboratory evolution is a complementary technique to metabolic engineering for strain improvement, if the product-of-interest is coupled with growth. Here we use a combination of pathway engineering and adaptive laboratory evolution to improve the production of β-caryophyllene, an extracellular product, by leveraging the antioxidant potential of the compound. Results: Using oxidative stress as selective pressure, we developed an adaptive laboratory evolution strategy that evolved an engineered β-caryophyllene-producing yeast strain for improved production within a few generations. This strategy resulted in a fourfold increase in production in isolated mutants. Further increasing the flux to β-caryophyllene in the best evolved mutant achieved a titer of 104.7 ± 6.2 mg/L product. Genomic analysis identified a gain-of-function mutation in the a-factor exporter STE6 that is involved in the significantly increased production, likely as a result of increased product export. Conclusion: An optimized selection strategy based on oxidative stress was developed to improve the production of the extracellular product β-caryophyllene in an engineered yeast strain. Application of the selection strategy in adaptive laboratory evolution resulted in mutants with significantly increased production and identification of novel responsible mutations.
... More importantly, controlled experiments have advantages over comparative studies in several aspects such as the lack of the mentioned ascertainment bias and the certainty in the environment, reproductive mode, evolutionary time, and ancestral genome. The yeast genome has five families of TEs, Ty1 through Ty5, all being retrotransposons flanked by LTRs (Kim et al. 1998; Carr et al. 2012). These TEs, comprising over 400 solo LTRs resulting from excisions and about 50 full-length TEs, collectively constitute approximately 3.1% of the yeast genome (Carr et al. 2012). ...
Article
Full-text available
Compared with asexual reproduction, sex facilitates the transmission of transposable elements (TEs) from one genome to another, but boosts the efficacy of selection against deleterious TEs. Thus, theoretically, it is unclear whether sex has a positive net effect on TE's proliferation. An empirical study concluded that sex is at the root of TE's evolutionary success because the yeast TE load was found to decrease rapidly in approximately 1000 generations of asexual but not sexual experimental evolution. However, this finding contradicts the maintenance of TEs in natural yeast populations where sexual reproduction occurs extremely infrequently. Here we show that the purported TE load reduction during asexual experimental evolution is likely an artifact of low genomic sequencing coverages. We observe stable TE loads in both sexual and asexual experimental evolution from multiple yeast datasets with sufficient coverages. To understand the evolutionary dynamics of yeast TEs, we turn to asexual mutation accumulation (MA) lines that have been under virtually no selection. We find that both TE transposition and excision rates per generation, but not their difference, tend to be higher in environments where yeast grows more slowly. However, the transposition rate is not significantly higher than the excision rate and the variance of the TE number among natural strains is close to its neutral expectation, suggesting that selection against TEs is at best weak in yeast. We conclude that the yeast TE load is maintained largely by a transposition-excision balance and that the influence of sex remains unclear.
... As this approach allowed us to obtain SHAPE reactivity data from a homogenous Ty1 gRNA population, a more reliable RNA secondary structure model was generated. In contrast, the widely used S. cerevisiae S288c laboratory strain background contains 32 full-length Ty1 elements comprised of three subfamilies: Ty1, Ty1/Ty2 hybrids, and the ancestral Ty1 element (15,47,48). Ty1 elements are expressed at different levels that vary up to 50-fold, with 75% of total Ty1 expression coming from eleven highly expressed elements, and eight of these are Ty1/Ty2 hybrids (49). ...
Article
Full-text available
Long terminal repeat (LTR)-retrotransposons constitute a significant part of eukaryotic genomes and influence their function and evolution. Like other RNA viruses, LTR-retrotransposons efficiently utilize their RNA genome to interact with host cell machinery during replication. Here, we provide the first genome-wide RNA secondary structure model for an LTR-retrotransposon in living cells. Using SHAPE probing, we explore the secondary structure of the yeast Ty1 retrotransposon RNA genome in its native in vivo state and under defined in vitro conditions. Comparative analyses reveal the strong impact of the cellular environment on folding of Ty1 RNA. In vivo, Ty1 genome RNA is significantly less structured and more dynamic but retains specific well-structured regions harboring functional cis-acting sequences. Ribosomes participate in the unfolding and remodeling of Ty1 RNA, and inhibition of translation initiation stabilizes Ty1 RNA structure. Together, our findings support the dual role of Ty1 genomic RNA as a template for protein synthesis and reverse transcription. This study also contributes to understanding how a complex multifunctional RNA genome folds in vivo, and strengthens the need for studying RNA structure in its natural cellular context.
... Transposable elements (TEs) are mobile components of almost all eukaryotic genomes, and their evolutionary history typically follows that of their hosts (Bowen et al. 2003;Hosid et al. 2012). S. cerevisiae contains multiple families of the TE subclass known as long-terminal repeat (LTR) retrotransposons, named Ty1-5 (Kim et al. 1998), that drive their own replication and integration into the genome via an RNA intermediate. LTR sequences frame gag and pol open reading frames (ORFs), which encode proteins necessary for their transposition (reviewed by Havecker et al. 2004). ...
Article
Full-text available
Saccharomyces cerevisiae is extensively utilized for commercial fermentation, and is also an important biological model; however, its ecology has only recently begun to be understood. Through the use of whole-genome sequencing, the species has been characterized into a number of distinct subpopulations, defined by geographical ranges and industrial uses. Here, the whole-genome sequences of 104 New Zealand (NZ) S. cerevisiae strains, including 52 novel genomes, are analyzed alongside 450 published sequences derived from various global locations. The impact of S. cerevisiae novel range expansion into NZ was investigated and these analyses reveal the positioning of NZ strains as a subgroup to the predominantly European/wine clade. A number of genomic differences with the European group correlate with range expansion into NZ, including 18 highly enriched single-nucleotide polymorphisms (SNPs) and novel Ty1/2 insertions. While it is not possible to categorically determine whether any genetic differences are due to stochastic processes or the operation of natural selection, we suggest that the observation of NZ-specific copy number increases of four sugar transporter genes in the HXT family may reasonably represent an adaptation in the NZ S. cerevisiae subpopulation, and this correlates with the observations of copy number changes during adaptation in small-scale experimental evolution studies.
... Mobile or transposable elements (TEs) are DNA fragments that can "jump" to new chromosomal locations and thus often duplicate themselves. They represent a large part of eukaryotic genomes, ranging from 3% in baker's yeast (Saccharomyces cerevisiae) [8], ~20% in fruit fly (Drosophila melanogaster) [9], 45% in humans (Homo sapiens) [10], 50% in grape (Vitis vinifera), 59% in Sorghum (Sorghum bicolor), 60% in Amborella (Amborella trichopoda), 62% in P. equestris, and 75% in maize (Zea mays) [11]. In general, there are two major groups of TEs distinguished by their transposition intermediate: class I retrotransposons with "copy-and-paste" retrotransposition, and class II DNA transposons with "cut-and-paste" transposition [12]. ...
Article
Full-text available
Background: Transposable elements (TEs) are fragments of DNA that can insert into new chromosomal locations. They represent a great proportion of eukaryotic genomes. The identification and characterization of TEs facilitates understanding the transpositional activity of TEs and their effects on the orchid genome structure. Results: We combined the draft whole-genome sequences of Phalaenopsis equestris with BAC end sequences, Roche 454, and Illumina/Solexa data, identified long terminal repeat (LTR) retrotransposons in these genome sequences by using LTRfinder, and classified them by using Gepard software. Among the 10 families of Gypsy-like retrotransposons, three families, Gypsy1, Gypsy2, and Gypsy3, contained the most copies among these predicted elements. In addition, six high-copy retrotransposons were identified according to their reads in the sequenced raw data. The 12-kb Orchid-rt1 contains 18,000 copies representing 220 Mbp of the P. equestris genome. Southern blot and slot blot assays showed that the four retrotransposons Gypsy1, Gypsy2, Gypsy3, and Orchid-rt1 were present in high copy numbers in the large-genome-size/large-chromosome species P. violacea and P. bellina. Both Orchid-rt1 and Gypsy1 displayed various ratios of copy number for the LTR sequences versus coding sequences among four Phalaenopsis species, including P. violacea and P. bellina and the small-genome-size/small-chromosome P. equestris and P. aphrodite subsp. formosana, which suggests that Orchid-rt1 and Gypsy1 have been through various mutations and homologous recombination events. FISH results showed amplification of Orchid-rt1 in the euchromatin regions among the four Phalaenopsis species. The expression level of Peq018599, encoding copper transporter 1, is highly upregulated with the insertion of Orchid-rt1, while expression is downregulated for Peq009948 and Peq014239, encoding a 26S proteasome non-ATP regulatory subunit 4 homolog and an auxin-responsive factor AUX/IAA-related protein. In addition, the insertions of Orchid-rt1 in these three genes are all in intron regions. Conclusion: Orchid-rt1 and Gypsy1–3 have amplified within Phalaenopsis orchids concomitant with the expanded genome sizes, and Orchid-rt1 and Gypsy1 may have gone through various mutations and homologous recombination events. Orchid-rt1 insertions lie in introns and affect gene expression levels.
... Depending on the organism, the proportion of TEs can be highly variable and at times very large. For example, TEs represent 3% of the genome in yeast (Kim et al. 1998), 15% in Drosophila (Dowsett and Young 1982), 45% in human and in the mouse (Lander et al. 2001; Waterston et al. 2002), and more than 80% in maize (Schnable et al. 2009). ...
... Analysis of TEs in these assemblies revealed a surprisingly high copy number for the Ty4 family in one strain of S. paradoxus from South America (UFRJ50816; n=23 copies) [13]. This was a noteworthy observation for two reasons: (i) Ty4 is typically found at low copy number in yeast strains [14,15,16], and (ii) S. American strains of S. paradoxus exhibit partial reproductive isolation with other strains of this species, which principally results from multiple reciprocal translocations thought to have arisen by unequal crossing-over between dispersed repetitive elements such as Ty elements [17]. ...
Preprint
Full-text available
Background: Recent evidence suggests that horizontal transfer plays a significant role in the evolution of transposable elements (TEs) in eukaryotes. Many cases of horizontal TE transfer (HTT) have been reported in animals and plants; however, surprisingly few examples of HTT have been reported in fungi. Findings: Here I report evidence for a novel HTT event in fungi involving Tsu4 in Saccharomyces paradoxus based on (i) high similarity between Tsu4 elements in S. paradoxus and S. uvarum, (ii) a patchy distribution of Tsu4 in S. paradoxus and general absence from its sister species S. cerevisiae, and (iii) discordance between the phylogenetic history of Tsu4 sequences and species in the Saccharomyces sensu stricto group. Available data suggest the HTT event likely occurred somewhere in the Nearctic, Neotropic or Indo-Australian part of the S. paradoxus species range, and that a lineage related to S. uvarum or S. eubayanus was the donor species. The HTT event has led to massive proliferation of Tsu4 in the South American lineage of S. paradoxus, which exhibits partial reproductive isolation with other strains of this species because of multiple reciprocal translocations. Full-length Tsu4 elements are associated with both breakpoints of one of these reciprocal translocations. Conclusions: This work shows that comprehensive analysis of TE sequences in essentially complete genome assemblies derived from long-read sequencing provides new opportunities to detect HTT events in fungi and other organisms. This work also provides support for the hypothesis that HTT and subsequent TE proliferation can induce genome rearrangements that contribute to post-zygotic isolation in yeast.
... Because telomeres and tRNA genes are associated with repetitive elements [15][16] in addition to having a high genomic copy number, we suspected that their consistently high enrichment across experiments could be an artifact of cross-hybridization [17][18]. To test for this, we inspected spot intensities and performed a more fine-grained classification of probes (Table 3; see Methods). ...
Preprint
Recent chromatin immunoprecipitation (ChIP) experiments in fly, mouse, and human have revealed the existence of high-occupancy target (HOT) regions or “hotspots” that show enrichment across many assayed DNA-binding proteins. Similar co-enrichment observed in yeast so far has been treated as artifactual, and has not been fully characterized. Here we reanalyze ChIP data from both array-based and sequencing-based experiments to show that in the yeast S. cerevisiae, the collective enrichment phenomenon is strongly associated with proximity to noncoding RNA genes and with nucleosome depletion. DNA sequence motifs that confer binding affinity for the proteins are largely absent from these hotspots, suggesting that protein-protein interactions play a prominent role. The hotspots are condition-specific, suggesting that they reflect a chromatin state or protein state, and are not a static feature of underlying sequence. Additionally, only a subset of all assayed factors is associated with these loci, suggesting that the co-enrichment cannot be simply explained by a chromatin state that is universally more prone to immunoprecipitation. Together our results suggest that the co-enrichment patterns observed in yeast represent transcription factor co-occupancy. More generally, they make clear that great caution must be used when interpreting ChIP enrichment profiles for individual factors in isolation, as they will include factor-specific as well as collective contributions.
... Another clue is the close genomic relationship of tDNAs with TEs, as observed now for decades (93)(94)(95)(96). TE mobilization/insertion in addition to recombination between repeats (e.g. ...
Article
Full-text available
Beyond their key role in translation, cytosolic transfer RNAs (tRNAs) are involved in a wide range of other biological processes. Nuclear tRNA genes (tDNAs) are transcribed by RNA polymerase III (RNAP III), and cis-elements, trans-factors as well as genomic features are known to influence their expression. In Arabidopsis, besides a predominant population of dispersed tDNAs spread along the 5 chromosomes, some clustered tDNAs have been identified. Here, we demonstrate that these tDNA clusters are transcriptionally silent and that pathways involved in the maintenance of DNA methylation play a predominant role in their repression. Moreover, we show that clustered tDNAs exhibit repressive chromatin features whilst their dispersed counterparts contain permissive euchromatic marks. This work demonstrates that both genomic and epigenomic contexts are key players in the regulation of tDNA transcription. The conservation of most of these regulatory processes suggests that this pioneering work in Arabidopsis can provide new insights into the regulation of RNA Pol III transcription in other organisms, including vertebrates.
... Saccharomyces cerevisiae LTR retrotransposons Ty1, Ty2, and Ty4 have the same integration preferences for regions upstream of Pol III-transcribed genes (Kim et al, 1998; Carr et al, 2012), and the C-termini of their integrases (IN1, IN2, and IN4, respectively) interact with the Pol III subunit AC40 (Bridier-Nahmias et al, 2015). To identify conserved amino acids potentially involved in the AC40 interaction, we aligned the C-terminal sequences of IN1, IN2, and IN4 (Fig 1A, bottom) and observed that IN1 and IN2 are highly similar in this region, whereas IN4 is more divergent. ...
Article
Full-text available
Integration of transposable elements into the genome is mutagenic. Mechanisms targeting integrations into relatively safe locations, hence minimizing deleterious consequences for cell fitness, have emerged during evolution. In budding yeast, integration of the Ty1 LTR retrotransposon upstream of RNA polymerase III (Pol III)-transcribed genes requires interaction between Ty1 integrase (IN1) and AC40, a subunit common to Pol I and Pol III. Here, we identify the Ty1 targeting domain of IN1 that ensures (i) IN1 binding to Pol I and Pol III through AC40, (ii) IN1 genome-wide recruitment to Pol I- and Pol III-transcribed genes, and (iii) Ty1 integration only at Pol III-transcribed genes, while IN1 recruitment by AC40 is insufficient to target Ty1 integration into Pol I-transcribed genes. Swapping the targeting domains between Ty5 and Ty1 integrases causes Ty5 integration at Pol III-transcribed genes, indicating that the targeting domain of IN1 alone confers Ty1 integration site specificity.
... The latter are repeated sequences that have, or once had, the ability to move within the genome. They make up about 45% of the genome (Fig. 3 (Kim et al., 1998). Methods exist to identify retrotransposons, but they are not routinely integrated into exome sequencing (ES) or genome data analysis pipelines. ...
Thesis
The advent of high-throughput exome sequencing (HT-ES) in diagnostics and research in recent years has led to the identification of the genetic bases of many Mendelian disorders, resolving many cases of diagnostic odyssey. Nevertheless, analysis of HT-ES data identifies pathogenic or likely pathogenic variants in only 30 to 45% of undiagnosed cases. Indeed, limitations exist at the clinical, molecular, and bioinformatic levels. The constant evolution of clinical knowledge, of the number of new genes implicated in human disease, and of clinical-biological correlations has a major impact on data analysis, leading to gradual improvement in diagnostic investigation. Technical limitations inherent to the technology exist, in particular uncovered regions, but these have also been significantly reduced in recent years. Finally, beyond the analysis of SNVs and CNVs, other genetic anomalies can cause rare diseases, requiring bioinformatic development to optimize results. Although high-throughput genome sequencing can resolve some cases, particularly those involving variants in non-coding regions or structural variants, much information remains to be extracted and exploited from HT-ES data. The objective of this thesis was therefore to contribute to improving bioinformatic approaches to HT-ES data analysis for identifying new genes or molecular mechanisms involved in rare genetic diseases, in order to shorten patients' diagnostic odyssey. Several strategies were implemented. The first strategy consisted of a research reanalysis of data from 80 patients who had undergone HT-ES at the CERBA laboratory (CIFRE thesis) and whose diagnostic reading was negative. It led to the identification of two new candidate genes in syndromic intellectual disability, including the OTUD7A gene (article 1). The second strategy consisted of developing a bioinformatic pipeline to extract mitochondrial genome data from HT-ES data. Mitochondrial DNA is not targeted by exome capture kits but can be extracted from indirectly captured data, making its analysis possible from pre-existing HT-ES data. From the GAD collection of exomes from undiagnosed patients, two causal variants were identified in two individuals with neurodevelopmental disorders out of 928 people studied, thus resolving the diagnostic odyssey in 0.2% of undiagnosed patients (article 2). The third strategy consisted of setting up a bioinformatic pipeline to identify mobile elements within exome data, given that about 0.3% of pathogenic variants in the human genome are expected to originate from the de novo insertion of a mobile element. From the GAD collection of exomes from 3322 undiagnosed patients, this step identified two cases linked to the insertion of an Alu element within an exon of the FERMT1 gene and of the GRIN2B gene (article 3, in preparation). This thesis has pushed back some of the limits of exome technology. Other avenues exist and are being explored by the team, in connection with the European Solve-RD project.
... uvarum clade to S. paradoxus [46]. These annotations revealed counts of full-length elements an order of magnitude lower than solo LTRs (Fig 2c, Fig S1), consistent with previously reported data on S. cerevisiae and S. paradoxus [45,47-49]. To overcome the difficulty of comparing multiple families of short and often degenerated LTR sequences within a phylogenetic framework, we built intra-genome LTR sequence similarity networks. ...
Preprint
Full-text available
Transposable elements (TEs) are mobile genetic elements that can profoundly impact the evolution of genomes and species. A long-standing hypothesis suggests that the merging of diverged genomes within hybrids could alter the regulation of TEs and increase transposition. Higher transposition rates could potentially fuel hybrid evolution with rare adaptive TE insertions, but also cause postzygotic reproductive isolation if maladaptive insertion loads render hybrids sterile or inviable. Mixed evidence for higher TE activity in hybrids was reported in many animal and plant species. Here, we tested for increased TE accumulation in hybrids between incipient species of the undomesticated yeast Saccharomyces paradoxus. Population genomic data revealed no increase in TE content in the natural hybrid lineages. As we could not rule out the elimination of past transposition increase signatures by natural selection, we performed a laboratory evolution experiment on a panel of artificial hybrids to measure TE accumulation in controlled conditions and in the near absence of selection. Changes in TE copy numbers were highly dependent on the individual hybrid genotypes and were not predicted by the evolutionary divergence between the parents of a hybrid genotype. Rather, our data suggested that initial TE copy numbers in hybrids negatively impacted transposition rate, suggesting that TE self-regulation could play a predominant role on TE accumulation in yeast hybrids.
... Tec1p, by contrast, is involved in the expression of Ty retrotransposons, of which a total of five (Ty1-5) have been described in yeast; these parasitic elements are integrated into the host genome and can be regulated at the transcriptional level by environmental influences (Kim et al., 1998; Lesage & Todeschini, 2005; Beauregard et al., 2008). (Park et al., 2000; Jang et al., 2004; Trotter & Grant, 2002; Rand & Grant, 2006 (Parrou et al., 1997; Parrou et al., 1999). ...
... Class I elements are divided into several major orders. The data on the proportions of the different categories are taken from the literature for yeast (Kim et al. 1998), nematode (Stein et al. 1998; The C. elegans Sequencing Consortium 1998), Arabidopsis (The Arabidopsis Genome Initiative 2000), Drosophila (Biémont and Vieira 2005), Xenopus (Hellsten et al. 2010), human (The Chimpanzee Sequencing and Analysis Consortium 2005; The 1000 Genomes Project Consortium 2012), zebrafish (Howe et al. 2013), as well as platyfish, medaka, tilapia, stickleback, pufferfishes, cod and spotted gar (Chalopin et al. 2015), and finally the two Antarctic teleosts "black rockcod" and "blackfin icefish" (Detrich et al. 2010; Detrich and Amemiya 2010). When mobilized, TEs move and multiply within genomes (McClintock 1950; McClintock 1984), and are therefore naturally found on different chromosomes of the same species. Long regarded as selfish elements and genome parasites ("junk" or "selfish DNA") (Biémont and Vieira 2006; Muotri et al. 2007; Sotero-Caio et al. 2017), they are now known to have a strong impact on genome structure, function, and plasticity (Volff 2005; Böhne et al. 2008; Raskina et al. 2008; Kraaijeveld 2010; Howe et al. 2013). ...
Thesis
Full-text available
The alternation of glacial and interglacial periods over the last 20 Myr has led to repeated environmental changes on the Antarctic continental shelf. It is in this context that teleosts of the family Nototheniidae adapted and diversified through several waves of radiation (including the Trematominae), coming to dominate the southern ichthyofauna. Among the Nototheniidae, the "Trematomus" group (genera Cryothenia, Pagothenia, Trematomus and Indonotothenia) shows the greatest chromosomal diversity, with diploid chromosome numbers ranging from 24 to 58, implying numerous rearrangements that accompanied speciation. We sought to characterize these chromosomal rearrangements. With an inferred ancestral karyotype of 2n = 48, conservation of chromosomal units between species, and constant genome sizes, the hypothesis of structural rearrangements without prior polyploidization is the most likely. To reconstruct the evolutionary history of these events, we searched for interspecific chromosomal homologies. This allowed us to reconstruct the rearrangements (mostly fusions), which we placed on the resolved phylogeny of the "Trematomus" group. In contrast to what has been published for the genus Notothenia, our results suggest multiple independent acquisitions. Transposable elements (TEs) can be involved in chromosomal rearrangements through ectopic recombination. They thereby contribute to the diversification of lineages during evolution. Because of their epigenetic regulation, their massive mobilization can be induced by major environmental variation. We focused on three TE superfamilies (DIRS, Gypsy and Copia) in these genomes. DIRS1 elements showed hotspot insertion patterns in centromeric and pericentromeric regions. Given their described mode of transposition and their propensity to insert into pre-existing copies, we propose a role for DIRS1 elements as facilitators of the fusions observed during the diversification of the "Trematomus" group.
... S. cerevisiae LTR-retrotransposons Ty1, Ty2, and Ty4 have the same integration preferences for regions upstream of Pol III-transcribed genes (Carr et al., 2012;Kim et al., 1998) Figure 1B). GBD-AC40 interacted with GAD-IN1 578-635 but not with GAD-IN1 1-578 , as shown previously (Bridier-Nahmias et al., 2015). ...
Preprint
Full-text available
Integration of transposable elements into the genome is mutagenic. Mechanisms that target integration into relatively safe locations and minimize deleterious consequences for cell fitness have emerged during evolution. In budding yeast, the integration of the Ty1 LTR retrotransposon upstream of RNA polymerase III (Pol III)-transcribed genes requires the interaction between the AC40 subunit of Pol III and Ty1 integrase (IN1). Here we show that the IN1-AC40 interaction involves a short linker sequence in the bipartite nuclear localization signal (bNLS) of IN1. Mutations in this sequence do not impact the frequency of Ty1 retromobility; instead, they decrease the recruitment of IN1 to Pol III-transcribed genes and the subsequent integration of Ty1 at these loci. The replacement of the Ty5 retrotransposon targeting sequence by the IN1 bNLS induces Ty5 integration into Pol III-transcribed genes. Therefore, the IN1 bNLS is both necessary and sufficient to confer integration site specificity on Ty1 and Ty5 retrotransposons.
Article
Full-text available
One-step and two-step pathways are proposed to synthesize cytokinin in plants. The one-step pathway is mediated by LONELY GUY (LOG) proteins. However, the enzyme for the two-step pathway remains to be identified. Here, we show that quantitative trait locus GY3 may boost grain yield by more than 20% through manipulating a two-step pathway. Locus GY3 encodes a LOG protein that acts as a 5′-ribonucleotide phosphohydrolase by excessively consuming the cytokinin precursors, which contrasts with the activity of canonical LOG members as phosphoribohydrolases in a one-step pathway. The residue S41 of GY3 is crucial for the dephosphorylation of iPRMP to produce iPR. A solo-LTR insertion within the promoter of GY3 suppressed its expression and resulted in a higher content of active cytokinins in young panicles. Introgression of GY3⁰²⁴²⁸ increased grain yield per plot by 7.4% to 16.3% in all investigated indica backgrounds, which demonstrates the great value of GY3⁰²⁴²⁸ in indica rice production.
Preprint
Full-text available
Transposable elements (TEs) are major contributors to structural genomic variation by creating interspersed duplications of themselves. In return, structural variants (SVs) can affect the genomic distribution of TE copies and shape their load. One long-standing hypothesis states that hybridization could trigger TE mobilization and thus increase TE load in hybrids. We previously tested this hypothesis by performing a large-scale evolution experiment by mutation accumulation (MA) on multiple hybrid genotypes within and between wild populations of the yeasts Saccharomyces paradoxus and Saccharomyces cerevisiae. Using aggregate measures of TE load with short-read sequencing, we found no evidence for TE load increase in hybrid MA lines. Here, we resolve the genomes of the hybrid MA lines with long-read phasing and assembly to precisely characterize the role of SVs in shaping the TE landscape. Highly contiguous phased assemblies of 127 MA lines revealed that SV types like polyploidy, aneuploidy and loss of heterozygosity have large impacts on the TE load. We characterized 18 de novo TE insertions, indicating that transposition only has a minor role in shaping the TE landscape in MA lines. Because the scarcity of TE mobilization in MA lines provided insufficient resolution to confidently dissect transposition rate variation in hybrids, we adapted an in vivo assay to measure transposition rates in various S. paradoxus hybrid backgrounds. We found that transposition rates are not increased by hybridization, but are modulated by many genotype-specific factors including initial TE load, TE sequence variants and mitochondrial DNA inheritance. Our results show the multiple scales at which TE load is shaped in hybrid genomes, being highly impacted by SV dynamics and finely modulated by genotype-specific variation in transposition rates.
Preprint
Full-text available
Telomeres and subtelomeres, the genomic regions located at chromosome extremities, are essential for genome stability in eukaryotes. In the absence of the canonical maintenance mechanism provided by telomerase, telomere shortening induces genome instability. The landscape of the ensuing genome rearrangements is not accessible by short-read sequencing. Here, we leverage Oxford Nanopore Technologies long-read sequencing to survey the extensive repertoire of genome rearrangements in telomerase mutants of the model green microalga Chlamydomonas reinhardtii. In telomerase mutant strains grown for ~700 generations, most chromosome extremities were capped by short telomere sequences that were either recruited de novo from other loci or maintained in a telomerase-independent manner. Other extremities did not end with telomeres but only with repeated subtelomeric sequences. The subtelomeric elements, including rDNA, were massively rearranged and involved in breakage-fusion-bridge cycles, translocations, recombinations and chromosome circularization. These events were established progressively over time and displayed heterogeneity at the subpopulation level. New telomere-capped extremities composed of sequences originating from more internal genomic regions were associated with high DNA methylation, suggesting that de novo heterochromatin formation contributes to restore chromosome end stability in C. reinhardtii. The diversity of alternative strategies to maintain chromosome integrity and the variety of rearrangements found in telomerase mutants are remarkable and illustrate genome plasticity at short timescales.
Article
The human commensal and opportunistic fungal pathogen Candida albicans displays extensive genetic and phenotypic variation across clinical isolates. Here, we performed RNA sequencing on 21 well-characterized isolates to examine how genetic variation contributes to gene expression differences and to link these differences to phenotypic traits. C. albicans adapts primarily through clonal evolution, and yet hierarchical clustering of gene expression profiles in this set of isolates did not reproduce their phylogenetic relationship. Strikingly, strain-specific gene expression was prevalent in some strain backgrounds. Association of gene expression with phenotypic data by differential analysis, linear correlation, and assembly of gene networks connected both previously characterized and novel genes with 23 C. albicans traits. Construction of de novo gene modules produced a gene atlas incorporating 67% of C. albicans genes and revealed correlations between expression modules and important phenotypes such as systemic virulence. Furthermore, targeted investigation of two modules that have novel roles in growth and filamentation supported our bioinformatic predictions. Together, these studies reveal widespread transcriptional variation across C. albicans isolates and identify genetic and epigenetic links to phenotypic variation based on coexpression network analysis.
Preprint
In most eukaryotes, subtelomeres are dynamic genomic regions populated by multi-copy sequences of different origins, which can promote segmental duplications and chromosomal rearrangements. However, their repetitive nature has complicated the efforts to sequence them, analyze their structure and infer how they evolved. Here, we use recent and forthcoming genome assemblies of Chlamydomonas reinhardtii based on long-read sequencing to comprehensively describe the subtelomere architecture of the 17 chromosomes of this model unicellular green alga. We identify three main repeated elements present at subtelomeres, which we call Sultan, Subtile and Suber, alongside three chromosome extremities with ribosomal DNA as the only identified component of their subtelomeres. The most common architecture, present in 27 out of 34 subtelomeres, is an array of 1 to 46 tandem copies of Sultan elements adjacent to the telomere and followed by a transcribed centromere-proximal Spacer sequence, a G-rich microsatellite and a region rich in transposable elements. Sequence similarity analyses suggest that Sultan elements underwent segmental duplications within each subtelomere and rearranged between subtelomeres at a much lower frequency. Comparison of genomic sequences of three laboratory strains and a wild isolate of C. reinhardtii shows that the overall subtelomeric architecture was already present in their last common ancestor, although subtelomeric rearrangements are on-going at the species level. Analysis of other green algae reveals the presence of species-specific repeated elements, highly conserved across subtelomeres and unrelated to the Sultan element, but with a subtelomere structure similar to C. reinhardtii. Overall, our work uncovers the complexity and evolution of subtelomere architecture in green algae.
Article
Retrotransposons can represent half of eukaryotic genomes. Retrotransposon dysregulation destabilizes genomes and has been linked to various human diseases. Emerging regulators of retromobility include RNA–DNA hybrid-containing structures known as R-loops. Accumulation of these structures at the transposons of yeast 1 (Ty1) elements has been shown to increase Ty1 retromobility through an unknown mechanism. Here, via a targeted genetic screen, we identified the rnh1Δ rad27Δ yeast mutant, which lacked both the Ty1 inhibitor Rad27 and the RNA–DNA hybrid suppressor Rnh1. The mutant exhibited elevated levels of Ty1 cDNA-associated RNA–DNA hybrids that promoted Ty1 mobility. Moreover, in this rnh1Δ rad27Δ mutant, but not in the double RNase H mutant rnh1Δ rnh201Δ, RNA–DNA hybrids preferentially existed as duplex nucleic acid structures and increased Ty1 mobility in a Rad52-dependent manner. The data indicate that in cells lacking RNA–DNA hybrid and Ty1 repressors, elevated levels of RNA-cDNA hybrids, which are associated with duplex nucleic acid structures, boost Ty1 mobility via a Rad52-dependent mechanism. In contrast, in cells lacking RNA–DNA hybrid repressors alone, elevated levels of RNA-cDNA hybrids, which are associated with triplex nucleic acid structures, boost Ty1 mobility via a Rad52-independent process. We propose that duplex and triplex RNA–DNA hybrids promote transposon mobility via Rad52-dependent or -independent mechanisms.
Thesis
Plant genomes are populated by different types of repeated elements, notably transposable elements (TEs) and satellite sequences (simple sequence repeats, SSRs), which can have a major impact on genome size and dynamics, as well as on the regulation of gene transcription. At least two-thirds of the tomato genome is composed of repeats. Although their overall impact on genome organisation has been largely revealed by whole-genome assembly, their influence on tomato biology and phenotype remains largely unexplored. More specifically, the effects and roles of DNA repeats in the ripening of fleshy fruits, a complex process of key agro-economic interest, have yet to be studied in depth, and tomato is undoubtedly an excellent model for such a study. We carried out a complete annotation of the tomato repeatome to explore its potential impact on tomato genome composition and gene transcription. Our results show that the tomato genome can be divided into three compartments with different gene and repeat densities, each compartment having a contrasting repeat and gene composition, different gene-repeat associations and different gene transcription levels. In the context of fruit ripening, we found that repeats are present in the majority of differentially methylated regions (DMRs) and that thousands of repeat-associated DMRs lie close to genes, including hundreds that are differentially regulated during this process. In addition, we found that repeats are also present near the binding sites of the key ripening protein RIN.
We also observed that certain repeat families are present at an unexpectedly high frequency near genes that are differentially expressed during tomato ripening. Given the link between these different entities, we asked whether some transposable elements of the tomato genome might have been selected during evolution for their impact on the genome. We therefore developed a series of analyses to try to detect such elements in silico. This approach selected 36 element families, some of which are associated with particular gene functions. Finer-grained sequence analyses could then potentially reveal motifs of interest, notably for the transcriptional regulation of genes.
Preprint
Mobile elements (MEs) can be divided into two major classes based on their transposition mechanisms: retrotransposons and DNA transposons. DNA transposons move within genomes directly as DNA in a cut-and-paste style, while retrotransposons use an RNA intermediate to transpose in a copy-and-paste fashion. In addition to target site duplications (TSDs), a hallmark of transposition shared by both classes, DNA transposons also carry terminal inverted repeats (TIRs). DNA transposons constitute ~3% of primate genomes and are thought to have been inactive in primate genomes since ~37 Mya, despite their success during early primate evolution. Retrotransposons can be further divided into Long Terminal Repeat (LTR) retrotransposons, which are characterized by the presence of LTRs at both ends, and non-LTR retrotransposons, which lack LTRs. In primate genomes, LTR retrotransposons constitute ~9% of the genome and have a low level of ongoing activity, while non-LTR retrotransposons are the major type of ME, contributing ~37% of the genome, with some members being very young and currently active in retrotransposition. The four known types of non-LTR retrotransposons are LINEs, SINEs, SVAs, and processed pseudogenes, all characterized by the presence of a polyA tail and TSDs, which mostly range from 8 to 15 bp in length. All non-LTR retrotransposons are known to use the L1-based target-primed reverse transcription (TPRT) machinery for retrotransposition. In this study, we report a new type of non-LTR retrotransposon, which we name retro-DNAs: DNA transposons by sequence but non-LTR retrotransposons by transposition mechanism in recent primate genomes. Using a bioinformatic comparative genomics approach, we identified a total of 1,750 retro-DNAs, representing 748 unique insertion events, in the human genome and nine non-human primate genomes from the ape and monkey groups.
These retro-DNAs, mostly fragments of full-length DNA transposons, carry no TIRs but have longer TSDs, with ~23.5% also carrying a polyA tail, and their insertion site motifs and TSD length pattern are characteristic of non-LTR retrotransposons. These features suggest that retro-DNAs are DNA transposon sequences likely mobilized by the TPRT mechanism. Further, at least 40% of these retro-DNAs are located in genic regions, giving them significant potential to affect gene function. More interestingly, some retro-DNAs, as well as their parent sites, show some level of current transcriptional expression, suggesting that they have the potential to create more retro-DNAs in current primate genomes. The identification of retro-DNAs, although they are few in number, reveals a new mechanism for propagating DNA transposon sequences in primate genomes in the absence of canonical DNA transposon activity. It also suggests that the L1 TPRT machinery may be able to retrotranspose a wider variety of DNA sequences than is currently known.
Preprint
Beyond their key role in translation, cytosolic transfer RNAs (tRNAs) are involved in a wide range of other biological processes. Nuclear tRNA genes (tDNAs) are transcribed by RNA polymerase III (RNAP III), and cis-elements, trans-factors and genomic features are known to influence their expression. In Arabidopsis, besides a predominant population of dispersed tDNAs spread along the five chromosomes, some clustered tDNAs have been identified. Here, we demonstrate that these tDNA clusters are transcriptionally silent and that pathways involved in the maintenance of DNA methylation play a predominant role in their repression. Moreover, we show that clustered tDNAs exhibit repressive chromatin features whilst their dispersed counterparts contain permissive euchromatic marks. Our data highlight that the combination of genomic environment and epigenomic landscape contributes to fine-tuning the differential expression of dispersed versus clustered tDNAs in Arabidopsis.