Comparative Genomics and Molecular Dynamics of DNA Repeats in Eukaryotes

Institut Pasteur, Unité de Génétique Moléculaire des Levures, CNRS, URA2171, Université Pierre et Marie Curie, UFR927, 25 rue du Dr. Roux, F-75015, Paris, France.
Microbiology and Molecular Biology Reviews: MMBR (Impact Factor: 14.61). 01/2009; 72(4):686-727. DOI: 10.1128/MMBR.00011-08
Source: PubMed


Repeated elements can be highly abundant in eukaryotic genomes; they make up more than 50% of the human genome, for example. Repeated sequences can be classified into two large families, "tandem repeats" and "dispersed repeats," each of which can itself be divided into subfamilies. Dispersed repeats comprise transposons, tRNA genes, and gene paralogues, whereas tandem repeats comprise gene tandems, ribosomal DNA repeat arrays, and satellite DNA, itself subdivided into satellites, minisatellites, and microsatellites. Remarkably, the molecular mechanisms that create and propagate dispersed and tandem repeats are specific to each class and usually do not overlap. In the first section of this review, we describe the nature and distribution of dispersed and tandem repeats in eukaryotic genomes in the light of complete (or nearly complete) available genome sequences. In the second part, we focus on the molecular mechanisms responsible for the fast evolution of two specific classes of tandem repeats: minisatellites and microsatellites. Because a growing number of human neurological disorders involve the expansion of a particular class of microsatellites, called trinucleotide repeats, a large part of the recent experimental work on microsatellites has focused on these repeats, and we therefore also review the current knowledge in this area. Finally, we propose a unified definition for mini- and microsatellites that takes into account their biological properties, and we point out new directions that should be explored in the near future on the road to understanding the genetics of repeated sequences.



Available from: Guy-Franck Richard, Jan 23, 2015
    • "variation in TR unit lengths and numbers is characteristic to genomic TRs, such as ribosomal DNA arrays crucial for the translation machinery, and satellite DNA comprising the main component of functional centromeres (Richard et al., 2008). In protein-coding genes, mutations in TRs are likely to alter the structure/function of the protein product. "
    ABSTRACT: Tandem repeats (TRs) are frequently observed in genomes across all domains of life. Evidence suggests that some TRs are crucial for proteins with fundamental biological functions and can be associated with virulence, resistance, and infectious/neurodegenerative diseases. Genome-scale systematic studies of TRs have the potential to unveil core mechanisms governing TR evolution and TR roles in shaping genomes. However, TR-related studies are often non-trivial due to heterogeneous and sometimes fast evolving TR regions. In this review, we discuss these intricacies and their consequences. We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods explicitly rely on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The discussed work has a focus on protein TRs, yet is generally applicable to nucleic acid TRs, sharing similar features.
    Frontiers in Bioengineering and Biotechnology 03/2015; 3:31. DOI:10.3389/fbioe.2015.00031
    • "Moreover, mini and microsatellites were early suspected to form secondary structures that may play an important role in the mutational process due to their repetitive nature and highly biased nucleotide composition. Expanded repeats in noncoding regions interfere with the metabolism of several cellular pathways, such as methylation, transcription, splicing, RNA processing, nuclear export and translation and the resulting expanded mRNAs often acquire an altered function (Richard et al. 2008). "
    ABSTRACT: A pseudogene, designated as "ps(5.8S+ITS-2)", paralogous to the 5.8S gene and internal transcribed spacer (ITS)-2 of the nuclear ribosomal DNA (rDNA), has been recently found in many triatomine species distributed throughout North America, Central America and northern South America. Among characteristics used as criteria for pseudogene verification, secondary structures and free energy are highlighted, showing a lower fit between minimum free energy, partition function and centroid structures, although in given cases the fit only appeared to be slightly lower. The unique characteristics of "ps(5.8S+ITS-2)" as a processed or retrotransposed pseudogenic unit of the ghost type are reviewed, with emphasis on its potential functionality compared to the functionality of genes and spacers of the normal rDNA operon. Besides the technical problem of the risk for erroneous sequence results, the usefulness of "ps(5.8S+ITS-2)" for specimen classification, phylogenetic analyses and systematic/taxonomic studies should be highlighted, based on consistency and retention index values, which in pseudogenic sequence trees were higher than in functional sequence trees. Additionally, intraindividual, interpopulational and interspecific differences in pseudogene amount and the fact that it is a pseudogene in the nuclear rDNA suggest a potential relationship with fitness, behaviour and adaptability of triatomine vectors and consequently its potential utility in Chagas disease epidemiology and control.
    Memórias do Instituto Oswaldo Cruz 03/2015; DOI:10.1590/0074-02760140398 · 1.59 Impact Factor
    • "This repetitive fraction is predominantly composed of dispersed mobile genetic elements (DNA transposons, retroelements) and tandemly repeated satellite DNAs (Hemleben et al., 2007; Weiss-Schneeweiss and Schneeweiss, 2013). Satellite DNA is typically species or genus specific, consisting of long arrays of late-replicating, tandemly arranged, head-to-tail repeats (Charlesworth et al., 1994; Richard et al., 2008). Satellite DNA is a non-coding fraction of the genome of limited transcriptional capacity, subject to methylation, histone modification and chromatin remodelling (Volkov et al., 2006; Hemleben et al., 2007). "
    ABSTRACT: Background and Aims: Chromosomal evolution, including numerical and structural changes, is a major force in plant diversification and speciation. This study addresses genomic changes associated with the extensive chromosomal variation of the Mediterranean Prospero autumnale complex (Hyacinthaceae), which includes four diploid cytotypes each with a unique combination of chromosome number (x = 5, 6, 7), rDNA loci and genome size. Methods: A new satellite repeat PaB6 has previously been identified, and monomers were reconstructed from next-generation sequencing (NGS) data of P. autumnale cytotype B6B6 (2n = 12). Monomers of all other Prospero cytotypes and species were sequenced to check for lineage-specific mutations. Copy number, restriction patterns and methylation levels of PaB6 were analysed using Southern blotting. PaB6 was localized on chromosomes using fluorescence in situ hybridization (FISH). Key Results: The monomer of PaB6 is 249 bp long, contains several intact and truncated vertebrate-type telomeric repeats and is highly methylated. PaB6 is exceptional because of its high copy number and unprecedented variation among diploid cytotypes, ranging from 10^4 to 10^6 copies per 1C. PaB6 is always located in pericentromeric regions of several to all chromosomes. Additionally, two lineages of cytotype B7B7 (x = 7), possessing either a single or duplicated 5S rDNA locus, differ in PaB6 copy number; the ancestral condition of a single locus is associated with higher PaB6 copy numbers. Conclusions: Although present in all Prospero species, PaB6 has undergone differential amplification only in chromosomally variable P. autumnale, particularly in cytotypes B6B6 and B5B5. These arose via independent chromosomal fusions from x = 7 to x = 6 and 5, respectively, accompanied by genome size increases. The copy numbers of satellite DNA PaB6 are among the highest in angiosperms, and changes of PaB6 are exceptionally dynamic in this group of closely related cytotypes of a single species. The evolution of the PaB6 copy numbers is discussed, and it is suggested that PaB6 represents a recent and highly dynamic system originating from a small pool of ancestral repeats.
    Annals of Botany 12/2014; in press. DOI:10.1093/aob/mcu178 · 3.65 Impact Factor