Anna-Sophie Fiston-Lavier

Université de Montpellier | UM1 · Institut des Sciences de l’Évolution Montpellier (ISEM)

PhD
President of the French Society of Bioinformatics (SFBI) · Junior member of IUF · Co-head of the Bioinformatics Learning Lab

About

82
Publications
8,277
Reads
1,384
Citations
Introduction
My research focuses on the impact of transposable elements, one type of DNA repeat, on genome structure, evolution and adaptation. It brings together computational and experimental approaches, with a particular interest in new sequencing technologies.
Additional affiliations
September 2013 - present
ISEM
Position
  • Professor (Assistant)
September 2013 - present
French National Centre for Scientific Research
Position
  • Professor (Associate)
September 2013 - present
Université de Montpellier
Position
  • Professor (Assistant)

Publications

Preprint
Efficiently detecting genomic structural variants (SVs) is a key step to grasp the “missing heritability” underlying complex traits involved in major evolutionary processes such as speciation, phenotypic plasticity, and adaptive responses. Yet, the SV-based genotype/trait association studies are still largely overlooked mainly due to the lack of re...
Chapter
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of the available software that can help biologists to scan automatically for these repeats in sequence data or check hypothetical models intended to characterize their structures. Since transposable elements (TEs) are a...
Article
In this paper, we investigate through a preliminary study the influence of repeat elements during the assembly process. We analyze the link between the presence and the nature of one type of repeat element, called transposable elements (TEs), and misassembly events in genome assemblies. We propose to improve assemblies by taking into account the pres...
Article
Full-text available
In 2020, the world faced the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) pandemic that drastically altered people’s lives. Since then, many countries have been forced to suspend public gatherings, leading to many conference cancellations, postponements, or reorganizations. Switching from a face-to-face to a remote conference became...
Article
Full-text available
Background: Meiotic recombination is a vital biological process playing an essential role in genome's structural and functional dynamics. Genomes exhibit highly various recombination profiles along chromosomes associated with several chromatin states. However, eu-heterochromatin boundaries are not available nor easily provided for non-model organism...
Article
Full-text available
Background: Plasmids are mobile genetic elements that often carry accessory genes, and are vectors for horizontal transfer between bacterial genomes. Plasmid detection in large genomic datasets is crucial to analyze their spread and quantify their role in bacterial adaptation and particularly in antibiotic resistance propagation. Bioinformatics metho...
Article
Full-text available
Retrotransposons can cause somatic genome variation in the human nervous system, which is hypothesized to have relevance to brain development and neuropsychiatric disease. However, the detection of individual somatic mobile element insertions presents a difficult signal-to-noise problem. Using a machine-learning method (RetroSom) and deep whole-gen...
Preprint
Plasmids are mobile genetic elements that often carry accessory genes, and are vectors for horizontal transfer between bacterial genomes. The detection of plasmids in large sets of genomes is crucial to analyze their spread and quantify their role in bacterial adaptation and particularly in the propagation of antibiotic resistance genes. Several bioinforma...
Poster
Full-text available
Efficiently detecting genomic structural variants (SVs) is a key step to grasp the "missing heritability" underlying complex traits involved in major evolutionary processes such as speciation, phenotypic plasticity, and adaptive responses. We present a random forest ensemble method for accurate deletion identification. We called this approach RF4SV...
Preprint
Full-text available
Motivation: Meiotic recombination is a vital biological process playing an essential role in genome's structural and functional dynamics. Genomes exhibit highly various recombination profiles along chromosomes associated with several chromatin states. However, eu-heterochromatin boundaries are not available nor easily provided for non-model organisms...
Article
Full-text available
Motivation: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among indivi...
Article
Full-text available
Viruses are able to evolve in vitro by mutations after serial passages in cell cultures, which can lead to either a loss, or an increase, of virulence. Cyprinid herpesvirus 3 (CyHV-3), a 295-kb double-stranded DNA virus, is the etiological agent of the koi herpesvirus disease (KHVD). To assess the influence of serial passages, an isolate of CyHV-3...
Preprint
Full-text available
Active retrotransposons in the human genome (L1, Alu and SVA elements) can create genomic mobile element insertions (MEIs) in both germline and somatic tissue¹. Specific somatic MEIs have been detected at high levels in human cancers², and at lower to medium levels in human brains³. Dysregulation of somatic retrotransposition in the human bra...
Article
Full-text available
Most of the current knowledge on the genetic basis of adaptive evolution is based on the analysis of single nucleotide polymorphisms (SNPs). Despite increasing evidence for their causal role, the contribution of structural variants to adaptive evolution remains largely unexplored. In this work, we analyzed the population frequencies of 1,615 Transp...
Data
Histogram showing the number of TEs (y axis) by the number of samples for which we were able to estimate their frequency. (PDF)
Data
19 TEs showing significant correlation with the expression of nearby genes. Results are divided into correlations obtained with male and female expression data (Huang et al. 2015). beta: Effect size estimate, t-stat: Test statistic (t-statistic of the t-test), p-value: p-value for the linear regression. FDR: False discovery rate estimated with Benjamini–...
Data
Genomic location of different TE categories. Percentages and right-tail p-values are shown when the Chi-square test is significant. (A) Localization of TEs regarding the nearest gene across categories. (B) Localization of intragenic TEs across TE categories. (XLSX)
Data
Enrichment test for TE families. For each family, the table shows the number of TEs in each category. HighFreq TEs correspond to the sum of AF, AF-NA, AF-OOA and OOA. p-value (Bonf.) indicates Bonferroni-corrected p-values for the Chi-square test when comparing HighFreq, AF-OOA and OOA TEs against All TEs. P-values < 0.05 are in red. (XLSX)
Data
Boxplots showing the distribution of TE ratio percentages (percentage of the length of the TE insertion regarding the length of the canonical family sequence) for each TE category and colored by Age (A) and TE class (B). (PDF)
Data
Correlation between frequencies estimated with data obtained using different sequencing strategies in the Stockholm (Sweden) population. Frequencies calculated using individual strain sequencing (x) (Mateo et al. 2018) and pool sequencing (y). Pearson correlation coefficient r = 0.98, p-value < 2.2e-16. (PDF)
Data
Number of TEs showing significant values in the selection tests for each HighFreq category. For each sweep test (iHS, H12 and nSL), “Continent” column indicates population used for the analysis: NA: North America or EU: Europe. For each HighFreq category, table shows the number of significant TEs / number of TEs for which the test was calculated. “...
Data
List of 36 TEs showing at least one significant (highlighted in red) selective sweep test (iHS, H12 or nSL). (XLSX)
Data
List of the 254 HighFreq TEs with at least one pairwise FST calculation performed. Category indicates the classification of the TE according to Fig 2. For each continent, two pairwise comparisons were performed. Values for each comparison are the FST (in red the significant ones). Concordant FST indicates whether TEs with significant FST were at hi...
Data
Summary results for Fixed TEs showing significant Tajima's D values on neighbouring windows. Significance was determined by the 5% quantile of Tajima's D values from all high recombination regions in the genome (-1.65 for autosomes and -1.82 for chromosome X). (XLSX)
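The quantile-based threshold described in this caption can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the nearest-rank quantile definition is an assumption (the caption does not say which estimator was used), and treating values at or below the threshold as significant is also an assumption.

```python
import math
from typing import List, Sequence


def lower_quantile(values: Sequence[float], q: float = 0.05) -> float:
    """Empirical q-quantile using the nearest-rank method (assumed estimator)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # 1-based nearest rank, clamped so that very small q still returns the minimum
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def significant_windows(window_d: Sequence[float], threshold: float) -> List[bool]:
    """Flag windows whose Tajima's D is at or below the threshold (assumed rule)."""
    return [d <= threshold for d in window_d]
```

With the autosomal threshold quoted in the caption, `significant_windows([-1.7, -1.0], -1.65)` would flag only the first window.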
Data
Enrichment of genes previously described as associated with different stress-related and behaviour-related traits in the different datasets analyzed. A) Genes nearby the 65 TEs with evidence of selection. B) Genes nearby the 300 HighFreq TEs. C) Genes nearby the 174 OOA TEs. D) Genes nearby the 111 AF-OOA TEs. (XLSX)
Data
Genomic coordinates of the cosmopolitan inversions (Kapun et al. 2016) analyzed in order to determine their influence on the transposable element frequency calculation. (XLSX)
Data
Summary statistics for the pairwise FST calculations. TEs with FST: Number of TEs for which it was possible to calculate FST. Signif. (Africa H/L): Total number of significant TEs. Between brackets: H: Number of significant TEs identified using the distribution of neutral SNPs that are at high frequency in Africa. L: Number of significant TEs ident...
Data
Venn diagrams for the 36 HighFreq TEs with significant evidence of selective sweeps. A) Overlap between TEs showing significant results for the different selective sweep statistics (iHS, H12 and nSL). B) Overlap between TEs showing at least one significant test in the North American (NA) and/or the European (EU) population. The percentage...
Data
Functional enrichment analysis of genes nearby OOA and AF-OOA TEs. A) Significant Gene Ontology clusters according to the DAVID functional annotation tool. Only the top six significant clusters are shown (enrichment score > 1.3). The horizontal axis represents the DAVID enrichment score (see S9C and S9D Table for details). B) Significantly overrepresented...
Data
Distribution of the number of TEs (y axis) by the number of strains for which T-lex2 estimated frequencies in the 8 individually-sequenced populations. (PDF)
Data
Distribution of mapped reads for the presence module (red), absence module (green) and total number of reads (blue) for each one of the 48 DrosEU samples (Kapun et al. 2018). (PDF)
Data
Information for the 91 samples used in this study. (XLSX)
Data
TE classes across different TE categories. P-values and percentages are shown in bold when enrichment is significant (Chi-square test p-value < 0.05) in comparison with All TEs. (XLSX)
Data
Distribution of number of TEs that are present at >0.10 and < 0.95 frequency by number of populations in which they are present at that frequency. We considered TEs to be present at high frequency (HighFreq) when they fulfil the frequency condition in at least three samples (represented by blue bars in the figure). (PDF)
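The HighFreq rule stated in this caption can be written as a small predicate. This is a minimal sketch, not the authors' code: the function and constant names are hypothetical, and representing a sample without a frequency estimate as `None` is an assumption.

```python
from typing import Optional, Sequence

LOW, HIGH = 0.10, 0.95  # frequency bounds quoted in the caption (both exclusive)
MIN_SAMPLES = 3         # "at least three samples"


def is_high_freq(freqs: Sequence[Optional[float]]) -> bool:
    """Return True if the TE is at frequency > LOW and < HIGH in at least MIN_SAMPLES samples.

    None marks a sample where the frequency could not be estimated (assumed encoding).
    """
    n_ok = sum(1 for f in freqs if f is not None and LOW < f < HIGH)
    return n_ok >= MIN_SAMPLES
```

For example, a TE with per-sample frequencies `[0.2, 0.5, 0.9, None]` meets the condition in three samples and would be counted as HighFreq under this reading.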
Data
Comparison of age estimations obtained by Bergman and Bensasson (2007) and the estimations obtained in this work. Only the 417 TEs that are common between the two studies are plotted. A) TE age distribution of the 417 TEs based on Bergman and Bensasson (2007) and in this work. Note that there are 10 insertions that showed extreme age values in our...
Data
Venn diagrams showing the overlap between TEs showing significant FST values in at least one pair of populations. A) TEs present at high frequency in populations located at low latitude locations. B) TEs present at high frequency in populations located at high latitude locations. (PDF)
Data
TE frequencies estimated using all strains (x axis) vs frequencies estimated after removing strains that contain inversions (y axis) for different individually-sequenced populations. A) Zambia (Lack et al., 2015), B) France (Pool et al., 2012), C) DGRP (Raleigh) (Huang et al. 2014; Mackay et al. 2012), D) Italy (Bari) and E) Sweden (Stockholm) (Mat...
Data
Distribution of iHS values obtained for TEs (red) and neutral SNPs (cyan) in the North American population (DGRP, Raleigh, North Carolina). A) Distribution of iHS values for all TEs and neutral SNPs. B) Distribution of iHS values for TEs and neutral SNPs at high frequency (> 0.10) in the OOA population (Raleigh) and in the African population (Zambi...
Data
Frequency estimations using T-lex2 for the 1,615 TEs in each of the 91 samples. NA indicates that the frequency could not be estimated for that TE in the given sample. Recombination estimates according to Comeron et al. (2012) and Fiston-Lavier et al. (2010) are shown for each TE. Class column indicates the category into which each TE was classified....
Data
TE length ratio statistics. At the top, mean and median TE Length Ratio (%) for each TE category. At the bottom, results for the Wilcoxon rank sum test and Kruskal-Wallis test among different TE categories. The results are shown for All TEs (A), for only young TEs (B), for only old TEs (C), for only TEs of the class DNA (D), for only TEs of the cla...
Data
A) Results of gene ontology (GO) enrichment test for the 83 genes nearby the 65 TEs showing evidence of selection (ES). B) Results of gene ontology (GO) enrichment test for the 363 genes nearby the 300 HighFreq TEs. C) Results of gene ontology (GO) enrichment test for the 215 genes nearby the 174 OOA TEs. D) Results of gene ontology...
Data
Gene association studies analyzing different fitness-related phenotypes. (XLSX)
Article
Full-text available
Transposable elements (TEs) are parasitic DNA sequences that threaten genome integrity by replicative transposition in host gonads. The Piwi-interacting RNAs (piRNAs) pathway is assumed to maintain Drosophila genome homeostasis by downregulating transcriptional and post-transcriptional TE expression in the ovary. However, the bursts of transpositio...
Preprint
Full-text available
Mapping genotype to phenotype is challenging because of the difficulties in identifying both the traits under selection and the specific genetic variants underlying these traits. Most of the current knowledge of the genetic basis of adaptive evolution is based on the analysis of single nucleotide polymorphisms (SNPs). Despite increasing evidence fo...
Article
Full-text available
Author Summary: Mutations, whether they affect single nucleotides or large genomic regions, are responsible for the genetic variability that allows species to evolve in response to environmental changes. Duplication represents a class of mutation that results in polymorphism in the copy number of genes. Investigating the phenotypic and fitness conse...
Data
Relative AChE1 activities of the various genotypes. Relative AChE1R activities (scaled by the mean AChE1R activity of the R3R3 genotype, top panels) and relative AChE1S activities (scaled by the mean AChE1S activity of the SS genotype, bottom panels) are shown for various genotypes, as a function of their number of R or S ace-1 copies. The linear re...
Data
Dynamics of AChE1 activity index (AI) over generations in the experimental evolution assay. For each replicate (C1, C2 and C3), boxplots represent the distribution of activity index (AI) for each generation. Blue and red lines correspond to the expected AI of R5R5 and R3R3 homozygotes, respectively. For each replicate, the green line corresponds to...
Data
List of the primers used in this study. (PDF)
Data
Resolution of the ace-1 duplication structure. (A) Distribution of the paired-end (PE) insert size in the vicinity of the breakpoints (± 1 kb). For each strain, we recorded the insert size of each read and its paired read; for each 200 bp insert size class, we calculated the number of reads, which was then normalized relative to the 2R chromosome m...
Data
Relative AChE1R activity in R3R3 and R5R5 individuals. Boxplots representing the relative AChE1R activity distribution measured on 20 males from the AcerkisR3 (R3R3, red) and AgRR5 (R5R5, blue) strains. Differences in activity were assessed with the following GLM: Activity = Geno + ε, where Geno is a two-level factor corresponding to the genotype an...
Data
Female fertility and fecundity in susceptible (SS) and resistant (R3R3 and R5R5) homozygotes. For each genotype, SS (green), R3R3 (red) and R5R5 (blue), we present the following: (A) the mean oviposition rate (i.e. the number of females laying eggs over the number of females studied) and its standard error (SEM), (B) the mean number of eggs laid per...
Data
Resistance to bendiocarb (CX) and chlorpyrifos-methyl (OP) insecticides. Mortality (probit scale) is presented as a function of insecticide dose (log10) for the three strains: KisumuP (SS; green squares), AcerkisR3 (R3R3, red triangles) and AgRR5 (R5R5, blue dots). Linear regressions between the two factors (solid lines) are indicated, together with...
Data
List of the 12 genes present within the duplicated region and their function (from VectorBase AgamP4 Anopheles gambiae genome). (PDF)
Data
Nature and number of ace-1 copies in different mosquito genotypes. (PDF)
Article
Full-text available
DNA derived from transposable elements (TEs) constitutes large parts of the genomes of complex eukaryotes, with major impacts not only on genomic research but also on how organisms evolve and function. Although a variety of methods and tools have been developed to detect and annotate TEs, there are as yet no standard benchmarks—that is, no standard...
Poster
Full-text available
Mosquitoes show widespread resistance to insecticides, and various resistance genes have been rapidly selected in populations over the course of ~40 years. In Anopheles gambiae s.s., the main vector of malaria, resistance to organophosphate and carbamate insecticides is mainly due to a single amino-acid substitution in acetylcholinesterase (AChE1)...
Article
Full-text available
Studies of the population dynamics of transposable elements (TEs) in Drosophila melanogaster indicate that consistent forces are affecting TEs independently of their modes of transposition and regulation. New sequencing technologies enable biologists to sample genomes at an unprecedented scale in order to quantify genome-wide polymorphism for annot...
Article
Full-text available
Transposable elements (TEs) constitute the most active, diverse and ancient component in a broad range of genomes. Complete understanding of genome function and evolution cannot be achieved without a thorough understanding of TE impact and biology. However, in-depth analysis of TEs still represents a challenge due to the repetitive nature of these...
Preprint
Transposable elements (TEs) constitute the most active, diverse and ancient component in a broad range of genomes. Complete understanding of genome function and evolution cannot be achieved without a thorough understanding of TE impact and biology. However, in-depth analysis of TEs still represents a challenge due to the repetitive nature of these...
Article
Full-text available
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, mostly due to the presence of repeats, which cannot be reconstructed unambiguously with short read data alone. One class of repeats, called transposable ele...
Article
Full-text available
The midge, Belgica antarctica, is the only insect endemic to Antarctica, and thus it offers a powerful model for probing responses to extreme temperatures, freeze tolerance, dehydration, osmotic stress, ultraviolet radiation and other forms of environmental stress. Here we present the first genome assembly of an extremophile, the first dipteran in...
Preprint
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly pro...