Recombination rates in admixed individuals identified by ancestry-based inference

Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA.
Nature Genetics (Impact Factor: 29.35). 07/2011; 43(9):847-53. DOI: 10.1038/ng.894
Source: PubMed


Studies of recombination and how it varies depend crucially on accurate recombination maps. We propose a new approach for constructing high-resolution maps of relative recombination rates based on the observation of ancestry switch points among admixed individuals. We show the utility of this approach using simulations and by applying it to SNP genotype data from a sample of 2,565 African Americans and 299 African Caribbeans and detecting several hundred thousand recombination events. Comparison of the inferred map with high-resolution maps from non-admixed populations provides evidence of fine-scale differentiation in recombination rates between populations. Overall, the admixed map is well predicted by the average proportion of admixture and the recombination rate estimates from the source populations. The exceptions to this are in areas surrounding known large chromosomal structural variants, specifically inversions. These results suggest that outside of structurally variable regions, admixture does not substantially disrupt the factors controlling recombination rates in humans.
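The core idea in the abstract, that ancestry switch points mark past recombination events, can be illustrated with a deliberately naive sketch. It is not the authors' inference procedure, which must handle uncertainty in local ancestry calls. It only bins observed switch points to get relative rates, then forms an admixture-weighted average of two source-population maps. The function names, the fixed-width binning, the midpoint placement of switches and the linear weighting are all assumptions made for illustration.

```python
from collections import Counter


def switch_points(positions, ancestry):
    """Midpoints between adjacent markers whose ancestry labels differ.

    Assumption: the switch is placed halfway between the two flanking markers.
    """
    return [
        (positions[i] + positions[i + 1]) / 2
        for i in range(len(ancestry) - 1)
        if ancestry[i] != ancestry[i + 1]
    ]


def relative_rate_map(haplotypes, positions, bin_size):
    """Fraction of all observed switch points that fall in each fixed-width bin.

    haplotypes: list of per-marker ancestry label sequences, one per haplotype.
    positions:  sorted marker coordinates shared by all haplotypes.
    """
    counts = Counter()
    for ancestry in haplotypes:
        for point in switch_points(positions, ancestry):
            counts[int(point // bin_size)] += 1
    n_bins = int(positions[-1] // bin_size) + 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * n_bins
    return [counts[b] / total for b in range(n_bins)]


def predicted_admixed_rate(rate_a, rate_b, prop_a):
    """Admixture-weighted average of two source-population rate maps.

    Assumption: a simple linear mix with genome-wide ancestry proportion prop_a.
    """
    return [prop_a * a + (1 - prop_a) * b for a, b in zip(rate_a, rate_b)]
```

Comparing the output of `relative_rate_map` against `predicted_admixed_rate` bin by bin mirrors, in a toy form, the comparison the abstract describes between the admixed map and the source-population maps.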



Available from: Nicholas M Rafaels, Apr 09, 2014
  • Source
    • ". The column labeled $\|\rho - \hat{\rho}\|_2^2 / L$ corresponds to the per-base $L_2$ error. We also show the correlation of $\rho$ and $\hat{\rho}$ at different scales, as in Wegmann et al. (2011), so that $\mathrm{Cor}_B(\rho, \hat{\rho})$ is the correlation of the true and estimated recombination rates over a physical distance of $B$ bases, evaluated at the positions $0, B, 2B, \ldots$ "
    ABSTRACT: Two-locus sampling probabilities have played a central role in devising an efficient composite likelihood method for estimating fine-scale recombination rates. Due to mathematical and computational challenges, these sampling probabilities are typically computed under the unrealistic assumption of a constant population size, and simulation studies have shown that resulting recombination rate estimates can be severely biased in certain cases of historical population size changes. To alleviate this problem, we develop here two distinct methods to compute the sampling probability for variable population size functions that are piecewise constant. The first is a novel formula that can be evaluated by numerically exponentiating a large but sparse matrix. The second method is importance sampling on genealogies, based on a characterization of the optimal proposal distribution that extends previous results to the variable-size setting. The resulting proposal distribution is highly efficient, with an average effective sample size (ESS) of nearly 98% per sample. Through a simulation study, we show that accounting for population size changes improves inference of recombination rates.
  • Source
    • "Other crucial applications have included pharmacogenomics; for example, in a recent study, Native American ancestry was significantly associated with the risk of relapse in children suffering from acute lymphoblastic leukemia (Yang et al. 2011). In addition to these traditional applications, in the more recent years, local ancestry inference methods have also found applications in other settings such as localizing sequences of unknown location from the human reference genome (Genovese et al. 2013), studying recombination rate variation (Hinch et al. 2011; Wegmann et al. 2011), inferring natural selection (Tang et al. 2007; Jin et al. 2012), making demographic inferences (Bryc et al. 2010; Johnson et al. 2011; Kidd et al. 2012) and in joint association and admixture mapping to boost the power to detect disease linked genes and variants (Pasaniuc et al. 2011; Shriner et al. 2011)."
    ABSTRACT: Ancestry inference is a frequently encountered problem and has many applications such as forensic analyses, genetic association studies, and personal genomics. The main goal of ancestry inference is to identify an individual's population of origin based on our knowledge of natural populations. Because both self-reported ancestry in humans and the sampling location of an organism can be inaccurate for this purpose, the use of genetic markers can facilitate accurate and reliable inference of an individual's ancestral origins. At a higher level, there are two different paradigms in ancestry inference: global ancestry inference, which tries to compute the genome-wide average of the population contributions, and local ancestry inference, which tries to identify the regional ancestry of a genomic segment. In this mini review, I describe the numerous approaches that are currently available for both kinds of ancestry inference from population genomic datasets. I first describe the general ideas underlying such inference methods and their relationship to one another. Then, I describe practical applications in which inference of ancestry has proven useful. Lastly, I discuss challenges and directions for future research work in this area.
    Frontiers in Genetics 06/2014; 5. DOI:10.3389/fgene.2014.00204
  • Source
    • "In many scenarios of biological interest, substantial evolutionary change has taken place in a small number of generations due to recombination and/or selection on standing variation, rather than mutational input. For example, one may be interested in the genome-wide haplotype patterns that emerge from admixture between historically isolated populations (Wegmann et al., 2011) or from artificial selection on a quantitative trait. Studying these haplotype patterns can be difficult with existing forward-in-time simulators because detailed information about the mosaic haplotype structure of individuals is not readily available, and must be inferred from the output sequences of the simulation and/or stored recombination event data. "
    ABSTRACT: Summary: forqs is a forward-in-time simulation of recombination, quantitative traits and selection. It was designed to investigate haplotype patterns resulting from scenarios where substantial evolutionary change has taken place in a small number of generations due to recombination and/or selection on polygenic quantitative traits. Availability and implementation: forqs is implemented as a command-line C++ program. Source code and binary executables for Linux, OSX and Windows are freely available under a permissive BSD license: jnovembre@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Bioinformatics 12/2013; 30(4). DOI:10.1093/bioinformatics/btt712 · 4.98 Impact Factor