Sebastian E Ramos-Onsins

CRAG Centre for Research in Agricultural Genomics | CRAG · Plant and Animal Genomics

PhD


72 Publications
13,783 Reads
8,595 Citations
Citations since 2017
21 Research Items
6018 Citations
Additional affiliations
March 2008 - November 2015
CRAG Centre for Research in Agricultural Genomics
Position
  • Senior Researcher

Publications (72)
Article
Full-text available
Genetic variation in the pig genome partially modulates the composition of porcine gut microbial communities. Previous studies have been focused on the association between single nucleotide polymorphisms (SNPs) and the gut microbiota, but little is known about the relationship between structural variants and fecal microbial traits. The main goal of...
Preprint
Full-text available
Animal domestication typically affected numerous polygenic quantitative traits, such as behavior, development and reproduction. However, uncovering the genetic basis of quantitative trait variation is challenging, since they are caused by small allele-frequency changes. To date, only a few causative mutations related to domestication processes have...
Poster
In this study, we used whole-genome sequence data to estimate the genetic diversity, structure and putative ancestral origin of the ‘Charolais de Cuba’ (CHCU) breed as well as to identify regions with selective sweeps that may have had an important role in the adaptation to tropical conditions. A total of 12 CHCU samples and 49 samples from 5 breed...
Article
Full-text available
The Site Frequency Spectrum (SFS) and the heterozygosity of allelic variants are among the most important summary statistics for population genetic analysis of diploid organisms. We discuss the generalization of these statistics to populations of autopolyploid organisms in terms of the joint Site Frequency/Dosage Spectrum and its expected value for...
Article
Full-text available
In this study, we used BovineSNP50 Genotyping BeadChip data to estimate the structure, putative ancestral origin as well as to identify regions with selective sweeps that may have had an important role in the adaptation to tropical conditions of the 'Charolais de Cuba' (CHCU) breed. According to a principal component analysis, CHCU samples cluster...
Article
Full-text available
Transposable elements (TEs) are a major driver of plant genome evolution. Apart from being a rich source of new genes and regulatory sequences, TEs can also affect plant genome evolution by modifying genome size and shaping chromosome structure. TEs tend to concentrate in heterochromatic pericentromeric regions and their proliferation may expand t...
Article
We introduce the conditional Site Frequency Spectrum (SFS) for a genomic region linked to a focal mutation of known frequency. An exact expression for its expected value is provided for the neutral model without recombination. Its relation with the expected SFS for two sites, 2-SFS, is discussed. These spectra derive from the coalescent approach of...
Article
Pigs (Sus scrofa) originated in Southeast Asia and expanded to Europe and North Africa approximately 1 MYA. Analyses of porcine Y-chromosome variation have shown the existence of two main haplogroups that are highly divergent, a result that is consistent with previous mitochondrial and autosomal data showing that the Asian and non-Asian pig populat...
Article
Full-text available
We present version 6 of the DnaSP (DNA Sequence Polymorphism) software, a new version of the popular tool for performing exhaustive population genetic analyses on multiple sequence alignments. This major upgrade incorporates novel functionalities to analyse large datasets, such as those generated by high-throughput sequencing (HTS) technologies. Am...
Article
Full-text available
The accurate estimation of nucleotide variability using next-generation sequencing data is challenged by the high number of sequencing errors produced by new sequencing technologies, especially for nonmodel species, where reference sequences may not be available and the read depth may be low due to limited budgets. The most popular single-nucleotid...
Conference Paper
Next-generation sequencing (NGS) technologies initiated a revolution in genomics, producing massive amounts of biological data and the consequent need for adapting current computing infrastructures. Multiple alignment of genomes, analysis of variants or phylogenetic tree construction, with quadratic polynomial complexity in the best case, are tools...
Chapter
A Genomic Perspective on the Evolutionary History of Sus Speciation Pig-like species (Suidae) are found in many different parts of the world. This superfamily consists of at least 15 different extant species found in Africa, South America, Europe, and Asia (Table 34.1). Suid species in Eurasia exhibit a striking dichotomy in their distribution. Whi...
Article
Full-text available
The msParSm application is an evolution of msPar, the parallel version of the coalescent simulation program ms, which removes the limitation for simulating long stretches of DNA sequences with large recombination rates, without compromising the accuracy of the standard coalescence. This work introduces msParSm, describes its significant performance...
Data
Supplementary Table 1. Parameters used in each of the simulations performed.
Article
Full-text available
Background: Taste receptors (TASRs) are essential for the body’s recognition of chemical compounds. In the tongue, TASRs sense the sweet and umami and the toxin-related bitter taste thus promoting a particular eating behaviour. Moreover, their relevance in other organs is now becoming evident. In the intestine, they regulate nutrient absorption and...
Article
We present an exact, closed expression for the expected neutral Site Frequency Spectrum for two neutral sites, 2-SFS, without recombination. This spectrum is the immediate extension of the well known single site $\theta/f$ neutral SFS. Similar formulae are also provided for the case of the expected SFS of sites that are linked to a focal neutral mu...
Article
Full-text available
The availability of extensive databases of crop genome sequences should allow analysis of crop variability at an unprecedented scale, which should have an important impact in plant breeding. However, up to now the analysis of genetic variability at the whole-genome scale has been mainly restricted to single nucleotide polymorphisms (SNPs). This is...
Article
Full-text available
A comprehensive catalog of variability in a given species is useful for many important purposes, e.g., designing high density arrays or pinpointing potential mutations of economic or physiological interest. Here we provide a genomewide, worldwide catalog of single nucleotide variants by simultaneously analyzing the shotgun sequence of 128 pigs and...
Article
Full-text available
Background: The genome of the melon (Cucumis melo L.) double-haploid line DHL92 was recently sequenced, with 87.5 and 80.8% of the scaffold assembly anchored and oriented to the 12 linkage groups, respectively. However, insufficient marker coverage and a lack of recombination left several large, gene rich scaffolds unanchored, and some anchored scaf...
Article
Full-text available
Pig domestication began around 9000 YBP in the Fertile Crescent and Far East, involving marked morphological and genetic changes that occurred in a relatively short window of time. Identifying the alleles that drove the behavioural and physiological transformation of wild boars into pigs through artificial selection constitutes a formidable challen...
Article
Full-text available
While many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large datasets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing...
Article
Decreasing costs of next-generation sequencing (NGS) experiments have made a wide range of genomic questions open for study with nonmodel organisms. However, experimental designs and analysis of NGS data from less well-known species are challenging because of the lack of genomic resources. In this work, we investigate the performance of alternative...
Conference Paper
We implemented a parallel version (hereafter referred to as “msPar”) of the coalescent simulation program ms, providing the same functionality and output, parallelized using a Master-Worker scheme with on-demand scheduling and MPI to run on an HPC cluster. To our knowledge this is the first time such parallelization has been applied to ms, and shown t...
Article
Full-text available
Recombination allows faithful chromosomal segregation during meiosis and contributes to the production of new heritable allelic variants that are essential for the maintenance of genetic diversity. Therefore, an appreciation of how this variation is created and maintained is of critical importance to our understanding of biodiversity and evolutiona...
Article
Several variations of the Watterson estimator of variability for Next Generation Sequencing (NGS) data have been proposed in the literature. We present a unified framework for generalized Watterson estimators based on Maximum Composite Likelihood, which encompasses most of the existing estimators. We propose this class of unbiased estimators as gene...
Article
Next generation sequencing of pooled samples is an effective approach for studies of variability and differentiation in populations. In this paper we provide a comprehensive set of estimators of the most common statistics in population genetics based on the frequency spectrum, namely the Watterson estimator θW, nucleotide pairwise diversity Π, Ta...
Article
Full-text available
Background: In contrast to international pig breeds, the Iberian breed has not been admixed with Asian germplasm. This makes it an important model to study both domestication and relevance of Asian genes in the pig. Besides, Iberian pigs exhibit high meat quality as well as appetite and propensity to obesity. Here we provide a genome wide analysis o...
Data
Simulated power against depth. Power was computed as the number of SNPs called by the SNAPE software divided by the total number of real SNPs in the pool. Depth corresponds to the average depth in the pooled data. Bottom: Power against MAF (minor allele frequency in the pool).
Data
Genes within multicopy regions and extreme selection tests’ windows. MCR genes: genes within multicopy regions; Lowest theta shared autosomes: genes within extreme low θ in autosomes and X pseudoautosomal region (PAR) common in the individual and the pool; Lowest theta shared non-pseudoautosomal region (NPAR): genes within extreme low θ in X NPAR r...
Data
Full-text available
Variability (Watterson's estimate, per bp) inside multicopy regions vs. variability of windows containing multicopy regions but outside the multicopy region units?
Data
Full-text available
Correlation across 200 kb windows between Tajima’s D and Fay and Wu’s H statistics in pooled data. Regression line is shown in red.
Article
Missing data are common in DNA sequences obtained through high-throughput sequencing. Furthermore, samples of low quality or problems in the experimental protocol often cause a loss of data even with traditional sequencing technologies. Here we propose modified estimators of variability and neutrality tests that can be naturally applied to sequence...
Article
The phylogeography of the porcine X chromosome has not been studied despite the unique characteristics of this chromosome. Here, we genotyped 59 single nucleotide polymorphisms (SNPs) in 312 pigs from around the world, representing 39 domestic breeds and wild boars in 30 countries. Overall, widespread commercial breeds showed the highest heterozygo...
Data
Full-text available
Descriptions of breeds sampled. (0.14 MB PDF)
Data
Full-text available
Summary of genes related to neuron function and that overlap with genomic regions with significant low θW. Summary of genes related to growth, muscle development, metabolism and disease that overlap with genomic regions with significant low θW. (0.08 MB PDF)
Data
Full-text available
Detailed description of measures of polymorphism and genetic differentiation. (0.11 MB PDF)
Data
Full-text available
Summary statistics for all the SNPs identified in Large White. (0.03 MB PDF)
Data
Full-text available
Fst values (A) and p-values frequency (B) presented by breed pair. (0.26 MB PDF)
Data
Full-text available
Summary of genes related to growth, muscle development, metabolism and disease that overlap with genomic regions with significant low θW. (0.16 MB PDF)
Article
Full-text available
Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can help uncover the genetic basis of phenotypic diversity. Genome wide footprints of pig domestication and selection were identified using massive parallel sequencing of pooled reduced representat...
Article
Full-text available
Domestication, modern breeding and artificial selection have dramatically shaped the genomic variability of domestic animals. In livestock, the so-called FAT1 quantitative trait locus (QTL) in porcine chromosome 4 was the first QTL uncovered although, to date, its precise molecular nature has remained elusive. Here, we characterize the nucleotide v...
Article
Full-text available
One of the main necessities for population geneticists is the availability of statistical tools that make it possible to accept or reject the neutral Wright-Fisher model with high power. A number of statistical tests have been developed to detect specific deviations from the null frequency spectrum in different directions (i.e., Tajima's D, Fu and Li's F and...
Article
Full-text available
The ascertainment of the demographic and selective history of populations has been a major research goal in genetics for decades. To that end, numerous statistical tests have been developed to detect deviations between expected and observed frequency spectra, e.g., Tajima's D, Fu and Li's F and D tests, and Fay and Wu's H. Recently, Achaz developed...
Article
A. halleri is a pseudometallophyte with a patchy distribution in Europe and is often spread by human activity. To determine the population history and whether this history is consistent with potential human effects, we surveyed nucleotide variation using 24 loci from 12 individuals in a large A. halleri population. The means of total and silent nuc...
Article
Full-text available
The orientation of flanking genes may influence the evolution of intergenic regions in which cis-regulatory elements are likely to be located: divergently transcribed genes share their 5' regions, resulting either in smaller "private" spaces or in overlapping regulatory elements. Thus, upstream sequences of divergently transcribed genes (bi-directi...
Article
Full-text available
Information about polymorphism, population structure, and linkage disequilibrium (LD) is crucial for association studies of complex trait variation. However, most genomewide studies have focused on model systems, with very few analyses of undisturbed natural populations. Here, we sequenced 86 mapped nuclear loci for a sample of 46 genotypes of Boec...
Article
Several tests have been proposed to detect departures of nucleotide variability patterns from neutral expectations. However, very different kinds of evolutionary processes, such as selective events or demographic changes, can produce similar deviations from these tests, thus making interpretation difficult when a significant departure of neutrality...
Article
Detecting the signature of adaptation on nucleotide variation is often difficult in species that, like Arabidopsis thaliana, might have a complex demographic history. Recent re-sequencing surveys in this species provided genome-wide information that would mainly reflect its demographic history. We have used a large empirical data set (LED) as well as...
Article
Full-text available
Coalescent theory is commonly used to perform population genetic inference at the nucleotide level. Here, we examine the procedure that fixes the number of segregating sites (henceforth the FS procedure). In this approach a fixed number of segregating sites (S) are placed on a coalescent tree (independently of the total and internode lengths of the...
Article
Full-text available
Coalescent theory is a powerful tool for population geneticists as well as molecular biologists interested in understanding the patterns and levels of DNA variation. Using coalescent Monte Carlo simulations it is possible to obtain the empirical distributions for a number of statistics across a wide range of evolutionary models; these distributions...
Article
Full-text available
Coalescent simulations were used to investigate the possible role of population subdivision and history in shaping nucleotide variation in a recombining 88-kb genomic fragment of Drosophila simulans displaying an unusual large-scale haplotype structure. The multilocus analysis, based on summary statistics using specific demographic null models unde...
Article
Nucleotide variation at the FAH1 and DFR gene regions was surveyed in four populations of Arabidopsis lyrata (two European A. l. petraea and two North American A. l. lyrata populations). In contrast to previous results, levels of variation were not consistently lower in A. l. lyrata than in A. l. petraea, and similar degrees of genetic differentiat...
Article
Full-text available
The simultaneous analysis of multiple genomic loci is a powerful approach to studying the effects of population history and natural selection on patterns of genetic variation of a species. By surveying nucleotide sequence polymorphism at 334 randomly distributed genomic regions in 12 accessions of Arabidopsis thaliana, we examined whether a standar...
Article
Cecropins are insect antibacterial peptides that are part of the insect humoral immune response and could, therefore, be potential targets of natural selection. In Drosophila, the Cec genes constitute a multigene family whose members are arranged in tandem. The complete Cec family was isolated in two obscura group species: D. subobscura and D. pseu...
Article
Full-text available
Nucleotide variation in eight effectively unlinked genes was surveyed in species-wide samples of the closely related outbreeding species Arabidopsis halleri and A. lyrata ssp. petraea and in three of these genes in A. lyrata ssp. lyrata and A. thaliana. Significant genetic differentiation was observed more frequently in A. l. petraea than in A. hal...
Article
Full-text available
A number of statistical tests for detecting population growth are described. We compared the statistical power of these tests with that of others available in the literature. The tests evaluated fall into three categories: those tests based on the distribution of the mutation frequencies, on the haplotype distribution, and on the mismatch distribut...
Article
There is an increasing interest in direct screening of polymorphisms at candidate loci to associate them with adaptations in natural situations. We report primers that amplify regions at 22 putatively orthologous functional loci in the family Brassicaceae: Arabidopsis thaliana, its two wild outcrossing relatives, and Brassica oleracea. Four groups...
Article
Full-text available
Human DNA sequence variation data are useful for studying the origin, evolution, and demographic history of modern humans and the mechanisms of maintenance of genetic variability in human populations, and for detecting linkage association of disease. Here, we report worldwide variation data from an approximately 10-kilobase noncoding autosomal regio...
Article
Full-text available
Approximately 4 kb of the Cecropin cluster region have been sequenced in nine lines of Drosophila melanogaster and one line of the sibling species D. simulans, D. mauritiana, and D. sechellia. This region includes three functional genes (CecA1, CecA2, and CecB), which are involved in the insect immune response, and two pseudogenes (CecPsi1 and CecP...
Article
A region of approximately 1.6 kb encompassing the ribosomal protein 49 gene (rp49) has been sequenced and compared in nine species of the obscura group of Drosophila: four species belonging to the obscura subgroup, three to the pseudoobscura subgroup, and two to the affinis subgroup. Our data provide strong support that the nearctic species (pseudo...
