ABSTRACT: Most species have at least some level of genetic structure. Recent simulation studies have shown that it is important to consider population structure when sampling individuals to infer past population history. The relevance of the results of these computer simulations for empirical studies, however, remains unclear. In the present study, we use DNA sequence datasets collected from two closely related species with very different histories, the selfing species Capsella rubella and its outcrossing relative C. grandiflora, to assess the impact of different sampling strategies on summary statistics and the inference of historical demography. Sampling strategy did not strongly influence the mean values of Tajima's D in either species, but it had some impact on the variance. The general conclusions about demographic history were comparable across sampling schemes even when resampled data were analyzed with approximate Bayesian computation (ABC). We used simulations to explore the effects of sampling scheme under different demographic models. We conclude that when sequences from modest numbers of loci (<60) are analyzed, the sampling strategy is generally of limited importance. The same is true under intermediate or high levels of gene flow (4Nm > 2-10) in models in which global expansion is combined with either local expansion or hierarchical population structure. Although we observe a less severe effect of sampling than predicted under some earlier simulation models, our results should not be seen as an encouragement to neglect this issue. In general, a good coverage of the natural range, both within and between populations, will be needed to obtain a reliable reconstruction of a species's demographic history, and in fact, the effect of sampling scheme on polymorphism patterns may itself provide important information about demographic history.
ABSTRACT: Polyploidization plays an important role in plant speciation. The most recent estimates report that up to 15% of angiosperm speciation events and 31% in ferns are accompanied by changes in ploidy level. Polyploids can arise either through autopolyploidy, when the sets of chromosomes originate from a single species, or through allopolyploidy, when they originate from different species. In this study, we used two different coalescent-based methods to determine the date and mode of the polyploidization event that led to the tetraploid cosmopolitan weed, Capsella bursa-pastoris. We sampled 78 C. bursa-pastoris accessions, and 53 and 43 accessions from the only two other members of this genus, C. grandiflora and C. rubella, respectively, and sequenced these accessions at 14 unlinked nuclear loci with locus-specific primers in order to be able to distinguish the two homeologues in the tetraploid. A large fraction of fixed differences between homeologous genes in C. bursa-pastoris are segregating as polymorphisms in C. grandiflora, consistent with an autopolyploid origin followed by disomic inheritance. To test this, we first estimated the demographic parameters of an isolation-with-migration model in a pairwise fashion between C. grandiflora and both genomes of C. bursa-pastoris and used these parameters in coalescent simulations to test the mode of origin of C. bursa-pastoris. Second, we used Approximate Bayesian Computation to compare an allopolyploid and an autopolyploid model. Both analyses led to the conclusion that C. bursa-pastoris originated less than 1 Ma by doubling of the C. grandiflora genome.
Molecular Biology and Evolution 01/2012; 29(7):1721-33. · 10.35 Impact Factor
ABSTRACT: Recent results from Drosophila suggest that positive selection has a substantial impact on genomic patterns of polymorphism and divergence. However, species with smaller population sizes and/or stronger population structure may not be expected to exhibit Drosophila-like patterns of sequence variation. We test this prediction and identify determinants of levels of polymorphism and rates of protein evolution using genomic data from Arabidopsis thaliana and the recently sequenced Arabidopsis lyrata genome. We find that, in contrast to Drosophila, there is no negative relationship between nonsynonymous divergence and silent polymorphism at any spatial scale examined. Instead, synonymous divergence is a major predictor of silent polymorphism, which suggests variation in mutation rate as the main determinant of silent variation. Variation in rates of protein divergence is mainly correlated with gene expression level and breadth, consistent with results for a broad range of taxa, and map-based estimates of recombination rate are only weakly correlated with nonsynonymous divergence. Variation in mutation rates and the strength of purifying selection seem to be major drivers of patterns of polymorphism and divergence in Arabidopsis. Nevertheless, a model allowing for varying negative and positive selection by functional gene category explains the data better than a homogeneous model, implying the action of positive selection on a subset of genes. Genes involved in disease resistance and abiotic stress display high proportions of adaptive substitution. Our results are important for a general understanding of the determinants of rates of protein evolution and the impact of selection on patterns of polymorphism and divergence.
Genome Biology and Evolution 09/2011; 3:1210-9. · 4.76 Impact Factor
ABSTRACT: Both mating system and population history can have large impacts on genetic diversity and population structure. Here, we use multilocus sequence data to investigate how these factors impact two closely related Brassicaceae species: the selfing Capsella rubella and the outcrossing C. grandiflora. To do this, we have sequenced 16 loci in approximately 70 individuals from 7 populations of each species. Patterns of population structure differ strongly between the two species. In C. grandiflora, we observe an isolation-by-distance pattern and identify three clearly delineated genetic groups. In C. rubella, where we estimate the selfing rate to be 0.90-0.94, the pattern is less clear with some sampling populations forming separate genetic clusters while others are highly mixed. The two species also have divergent histories. Our analysis gives support for a bottleneck approximately 73 kya (20-139 kya) in C. rubella, which most likely represents speciation from C. grandiflora. In C. grandiflora, there is moderate support for the standard neutral model in 2 of 3 genetic clusters, while the third cluster and the total data set show evidence of expansion. It is clear that mating system has an impact on these two species, for example affecting the level of genetic variation and the genetic structure. However, our results also clearly show that a combination of past and present processes, some of which are not affected by mating system, is needed to explain the differences between C. rubella and C. grandiflora.
ABSTRACT: The long-term fates of duplicate genes are well studied both empirically and theoretically, but how the short-term evolution of duplicate genes contributes to phenotypic variation is less well known. Here, we have studied the genetic basis of flowering time variation in the disomic tetraploid Capsella bursa-pastoris. We sequenced four duplicate candidate genes for flowering time and 10 background loci in samples from western Eurasia and China. Using a mixed-model approach that accounts for population structure, we found that polymorphisms at one homeolog of two candidate genes, FLOWERING LOCUS C (FLC) and CRYPTOCHROME1 (CRY1), were associated with natural flowering time variation. No potentially causative polymorphisms were found in the coding region of CRY1; however, at FLC two splice site polymorphisms were associated with early flowering. Accessions harboring nonconsensus splice sites expressed an alternatively spliced transcript or did not express this FLC homeolog. Our results are consistent with the function of FLC as a major repressor of flowering in Arabidopsis thaliana and imply that nonfunctionalization of duplicate genes could provide an important source of phenotypic variation.
ABSTRACT: A correct timing of growth cessation and dormancy induction represents a critical ecological and evolutionary trade-off between survival and growth in most forest trees (Rehfeldt et al. 1999; Horvath et al. 2003; Howe et al. 2003). We have studied the deciduous tree European aspen (Populus tremula) across a latitudinal gradient and compared genetic differentiation in phenology traits with molecular markers. Trees from 12 different areas covering 10 latitudinal degrees were cloned and planted in two common gardens. Several phenology traits showed strong genetic differentiation and clinal variation across the latitudinal gradient, with Q(ST) values generally exceeding 0.5. This is in stark contrast to genetic differentiation at several classes of genetic markers (18 neutral SSRs, 7 SSRs located close to phenology candidate genes and 50 SNPs from five phenology candidate genes) that all showed F(ST) values around 0.015. We thus find strong evidence for adaptive divergence in phenology traits across the latitudinal gradient. However, the strong population structure seen at the quantitative traits is not reflected in underlying candidate genes. This result fits theoretical expectations that suggest that genetic differentiation at candidate loci is better described by F(ST) at neutral loci rather than by Q(ST) at the quantitative traits themselves.