ABSTRACT: The Paramecium aurelia complex is a group of 15 species that share at least three past whole-genome duplications (WGDs). The macronuclear genome sequences of P. biaurelia and P. sexaurelia are presented and compared to the published sequence of P. tetraurelia. Levels of duplicate-gene retention from the recent WGD differ by >10% across species, with P. sexaurelia losing significantly more genes than P. biaurelia or P. tetraurelia. In addition, historically high rates of gene conversion have homogenized WGD paralogs, probably extending paralog lifetimes. The probability of duplicate retention is positively correlated with GC content and expression level; ribosomal proteins, transcription factors, and intracellular signaling proteins are overrepresented among maintained duplicates. Finally, multiple sources of evidence indicate that P. sexaurelia diverged from the two other lineages immediately following, or perhaps concurrent with, the recent WGD, with approximately half of gene losses between P. tetraurelia and P. sexaurelia representing divergent gene resolutions (i.e., silencing of alternative paralogs), as expected for random duplicate loss between these species. Additionally, though P. biaurelia and P. tetraurelia diverged from each other much later, there are still over 100 cases of divergent resolution between these two species. Taken together, these results indicate that divergent resolution of duplicate genes between lineages acts to reinforce reproductive isolation between species in the Paramecium aurelia complex.
ABSTRACT: Although the analysis of linkage disequilibrium (LD) plays a central role in many areas of population genetics, the sampling variance of LD is known to be very large with high sensitivity to numbers of nucleotide sites and individuals sampled. Here we show that a genome-wide analysis of the distribution of heterozygous sites within a single diploid genome can yield highly informative patterns of LD as a function of physical distance. The proposed statistic, the correlation of zygosity, is closely related to the conventional population-level measure of LD, but is agnostic with respect to allele frequencies and hence likely less prone to outlier artifacts. Application of the method to several vertebrate species leads to the conclusion that > 80% of recombination events are typically resolved by gene-conversion-like processes unaccompanied by crossovers, with the average length of conversion patches being on the order of one to several kilobases. Thus, contrary to common assumptions, the recombination rate between sites does not scale linearly with distance, often even up to distances of 100 kilobases. In addition, the amount of LD between sites separated by < 200 bp is uniformly much greater than can be explained by the conventional neutral model, possibly because of the nonindependent origin of mutations within this spatial scale. These results raise questions about the application of conventional population-genetic interpretations to LD on short spatial scales, and also about the use of spatial patterns of LD to infer demographic histories.
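As an illustration only, the idea of correlating zygosity between sites at a fixed physical distance can be sketched as a plain Pearson correlation of heterozygosity indicators. This is an assumed simplification for intuition, not the estimator used in the paper (which the abstract does not specify); the function name `zygosity_correlation` and the 0/1 input encoding are hypothetical.

```python
from math import sqrt
from typing import Optional, Sequence


def zygosity_correlation(het: Sequence[int], distance: int) -> Optional[float]:
    """Pearson correlation between heterozygosity indicators of site pairs.

    het      -- 0/1 flags along one diploid genome (1 = heterozygous site);
                this encoding is an assumption made for the sketch.
    distance -- separation, in sites, between the two members of each pair.

    Returns None when the correlation is undefined (no pairs, or no variation).
    """
    if distance <= 0 or distance >= len(het):
        return None
    left = het[:-distance]    # first member of every pair
    right = het[distance:]    # second member, `distance` sites downstream
    n = len(left)
    mean_left = sum(left) / n
    mean_right = sum(right) / n
    mean_both = sum(a * b for a, b in zip(left, right)) / n
    # For 0/1 indicators the variance is mean * (1 - mean).
    var_left = mean_left * (1 - mean_left)
    var_right = mean_right * (1 - mean_right)
    if var_left == 0 or var_right == 0:
        return None
    return (mean_both - mean_left * mean_right) / sqrt(var_left * var_right)


if __name__ == "__main__":
    # Toy genome in which heterozygous sites come in runs of two.
    toy = [1, 1, 0, 0, 1, 1, 0, 0]
    for d in (1, 2, 4):
        print(d, zygosity_correlation(toy, d))
```

Evaluating the function over a range of distances gives a decay profile analogous to the LD-versus-distance patterns the abstract describes. Real data would additionally need handling of sequencing error and uncalled sites, which this sketch ignores.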
ABSTRACT: Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of Paramecium biaurelia and Paramecium sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of Paramecium caudatum, a species closely related to the Paramecium aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs, and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the GC content and the expression level of pre-duplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species suggest that this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event.
ABSTRACT: Although pooled-population sequencing has become a widely used approach for estimating allele frequencies, most work has proceeded in the absence of a proper statistical framework. We introduce a self-sufficient, closed-form, maximum-likelihood estimator for allele frequencies that accounts for errors associated with sequencing, and a likelihood-ratio test statistic that provides a simple means for evaluating the null hypothesis of monomorphism. Unbiased estimates of allele frequencies < 5/N (where N is the number of individuals sampled) appear to be unachievable, and near-certain identification of a polymorphism requires a minor-allele frequency > 10/N. A framework is provided for testing for significant differences in allele frequencies between populations, taking into account sampling at the levels of individuals within populations and sequences within pooled samples. Analyses that fail to account for the two tiers of sampling suffer from very large false-positive rates, and can become increasingly misleading with increasing depths of sequence coverage. The power to detect significant allele-frequency differences between two populations is very limited unless both the number of sampled individuals and depth of sequencing coverage exceed 100.
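To make the 5/N and 10/N rules of thumb concrete, the short sketch below turns them into numeric frequency thresholds for a given pool size. It only restates the arithmetic in the abstract and does not implement the maximum-likelihood estimator or the likelihood-ratio test; the function names and the category labels are hypothetical.

```python
from typing import Tuple


def pooled_thresholds(n_individuals: int) -> Tuple[float, float]:
    """Return (5/N, 10/N) for a pool of N individuals.

    Per the abstract: frequencies below 5/N appear not to be estimable
    without bias, and minor-allele frequencies above 10/N are needed for
    near-certain identification of a polymorphism.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return 5 / n_individuals, 10 / n_individuals


def describe_frequency(minor_allele_freq: float, n_individuals: int) -> str:
    """Label a minor-allele frequency relative to the two thresholds.

    The three labels are illustrative wording, not terms from the paper.
    """
    unbiased_floor, detection_floor = pooled_thresholds(n_individuals)
    if minor_allele_freq < unbiased_floor:
        return "below 5/N: unbiased estimation unlikely"
    if minor_allele_freq <= detection_floor:
        return "between 5/N and 10/N: detection not near-certain"
    return "above 10/N: near-certain detection"


if __name__ == "__main__":
    print(pooled_thresholds(100))         # thresholds for a pool of 100
    print(describe_frequency(0.03, 100))
    print(describe_frequency(0.15, 100))
```

With a pool of 100 individuals, for example, the thresholds fall at 5% and 10%, which shows why small pools offer little resolution for rare alleles.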
Genome Biology and Evolution 04/2014; · 4.76 Impact Factor
ABSTRACT: Accurate transmission and expression of genetic information are crucial for the survival of all living organisms. Recently, the coupling of mutation accumulation experiments and next-generation sequencing has greatly expanded our knowledge of the genomic mutation rate in both prokaryotes and eukaryotes. However, because of their transient nature, transcription errors have proven extremely difficult to quantify, and current estimates of transcription fidelity are derived from artificial constructs applied to just a few organisms. Here we report a unique cDNA library preparation technique that allows error detection in natural transcripts at the transcriptome-wide level. Application of this method to the model organism Caenorhabditis elegans revealed a base misincorporation rate in mRNAs of ∼4 × 10⁻⁶ per site, with a very biased molecular spectrum. Because the proposed method is readily applicable to other organisms, this innovation provides unique opportunities for studying the incidence of transcription errors across the tree of life.
Proceedings of the National Academy of Sciences 10/2013; · 9.81 Impact Factor
ABSTRACT: We present the complete genomic sequence of the essential symbiont Polynucleobacter necessarius (Betaproteobacteria), which is a valuable case study for several reasons. First, it is hosted by a ciliated protist, Euplotes; bacterial symbionts of ciliates are still poorly known because of a lack of extensive molecular data. Second, the single species P. necessarius contains both symbiotic and free-living strains, allowing for a comparison between closely related organisms with different ecologies. Third, free-living P. necessarius strains are exceptional by themselves because of their small genome size, reduced metabolic flexibility, and high worldwide abundance in freshwater systems. We provide a comparative analysis of P. necessarius metabolism and explore the peculiar features of a genome reduction that occurred on an already streamlined genome. We compare this unusual system with current hypotheses for genome erosion in symbionts and free-living bacteria, propose modifications to the presently accepted model, and discuss the potential consequences of translesion DNA polymerase loss.
Proceedings of the National Academy of Sciences 10/2013; · 9.81 Impact Factor
ABSTRACT: Despite much theoretical work, the molecular-genetic causes and evolutionary consequences of asexuality remain largely undetermined. Asexual animal species are rare, evolutionarily short-lived, and thought to suffer mutational meltdown as a result of lack of recombination. Whole-genome analysis of 11 sexual and 11 asexual genotypes of Daphnia pulex indicates that current asexual lineages are in fact very young, exhibit no signs of purifying selection against accumulating mutations, and have extremely high rates of gene conversion and deletion. The reconstruction of chromosomal haplotypes in regions containing SNP markers associated with asexuality (chromosomes VIII and IX) indicates that introgression from a sister species, Daphnia pulicaria, underlies the origin of the asexual phenotype. Silent-site divergence of the shared chromosomal haplotypes of asexuals indicates that the spread of asexuality is as recent as 1,250 y, although the origin of the meiosis-suppressing element or elements could be substantially older. In addition, using previous estimates of the gene conversion rate from Daphnia mutation accumulation lines, we are able to age each asexual lineage. Although asexual lineages originate from wide crosses that introduce elevated individual heterozygosities on clone foundation, they also appear to be constrained by the inbreeding-like effect of loss of heterozygosity that accrues as gene conversion and hemizygous deletion expose preexisting recessive deleterious alleles of asexuals, limiting their evolutionary longevity. Our study implies that the buildup of newly introduced deleterious mutations (i.e., Muller's ratchet) may not be the dominant force imperiling nonrecombining populations of D. pulex, as previously proposed.
Proceedings of the National Academy of Sciences 08/2013; · 9.81 Impact Factor
ABSTRACT: The molecular mechanisms leading to asexuality remain little understood despite their substantial bearing on why sexual reproduction is dominant in nature. Here, we examine the role of hybridization in the origin and spread of obligate asexuality in Daphnia pulex, arguably the best-documented case of contagious asexuality. Obligately parthenogenetic (OP) clones of D. pulex have traditionally been separated into 'hybrid' (Ldh SF) and 'nonhybrid' (Ldh SS) forms because the lactate dehydrogenase (Ldh) locus distinguishes the cyclically parthenogenetic (CP) lake-dwelling Daphnia pulicaria (Ldh FF) from its ephemeral-pond-dwelling sister species D. pulex (Ldh SS). The results of our population genetic analyses based on microsatellite loci suggest that both Ldh SS and SF OP individuals can originate from the crossing of CP female F1 (D. pulex × D. pulicaria) and backcross with males from OP lineages carrying genes that suppress meiosis specifically in female offspring. In previous studies, a suite of diagnostic markers was found to be associated with OP in Ldh SS D. pulex lineages. Our association mapping supports a similar genetic mechanism for the spread of obligate parthenogenesis in Ldh SF OP individuals. Interestingly, our study shows that CP D. pulicaria carry many of the diagnostic microsatellite alleles associated with obligate parthenogenesis. We argue that the assemblage of mutations that suppress meiosis and underlie obligate parthenogenesis in D. pulex originated due to a unique historical hybridization and introgression event between D. pulex and D. pulicaria.
ABSTRACT: Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., Internal Eliminated Sequences or IESs) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. While our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic (often coding) DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation.
Genome Biology and Evolution 06/2013; · 4.76 Impact Factor
ABSTRACT: Because spontaneous mutation is the source of all genetic diversity, measuring mutation rates can reveal how natural selection drives patterns of variation within and between species. We sequenced eight genomes produced by a mutation-accumulation experiment in Drosophila melanogaster. Our analysis reveals that point mutation and small indel rates vary significantly between the two different genetic backgrounds examined. We also find evidence that roughly 2% of mutational events affect multiple closely spaced nucleotides. Unlike previous similar experiments, we are able to estimate genome-wide rates of large deletions and tandem duplications. These results suggest that, at least in inbred lines like those examined here, mutational pressures may result in net growth rather than contraction of the Drosophila genome. By comparing our mutation rate estimates to polymorphism data we are able to estimate the fraction of new mutations that are eliminated by purifying selection. These results suggest that ~99% of duplications and deletions are deleterious, making them 10 times more likely to be removed by selection than nonsynonymous mutations. Our results illuminate not only the rates of new small- and large-scale mutations, but also the selective forces they encounter once they arise.
ABSTRACT: Enormous phylogenetic variation exists in the number and sizes of introns in protein-coding genes. Although some consideration has been given to the underlying role of the population-genetic environment in defining such patterns, the influence of the intracellular environment remains virtually unexplored. Drawing from observations on interactions between co-transcriptional processes involved in splicing and mRNA 3'-end formation, a mechanistic model is proposed for splice-site recognition that challenges the commonly accepted intron- and exon-definition models. Under the suggested model, splicing factors that outcompete 3'-end processing factors for access to intronic binding sites concurrently favor the recruitment of 3'-end processing factors at the pre-mRNA tail. This hypothesis sheds new light on observations such as the intron-mediated enhancement of gene expression and the negative correlation between intron length and levels of gene expression.
ABSTRACT: Bacteria and eukaryotes are involved in many types of interaction in nature, with important ecological consequences. However, the diversity, occurrence, and mechanisms of these interactions often are not fully known. The obligate bacterial endosymbionts of Paramecium provide their hosts with the ability to kill sensitive Paramecium strains through the production of R-bodies, highly insoluble coiled protein ribbons. R-bodies have been observed in a number of free-living bacteria, where their function is unknown. We have performed an exhaustive survey of genes coding for homologs of Reb proteins (R-body components) in complete bacterial genomes. We found that these genes are much more widespread than previously thought, being present in representatives of major Proteobacterial subdivisions, including many free-living taxa, as well as taxa known to be involved in various kinds of interactions with eukaryotes, from mutualistic associations to pathogenicity. Reb proteins display very good conservation at the sequence level, suggesting that they may produce functional R-bodies. Phylogenomic analysis indicates that these genes underwent a complex evolutionary history and allowed the identification of candidates potentially involved in R-body assembly, functioning, regulation, or toxicity. Our results strongly suggest that the ability to produce R-bodies is widespread in Proteobacteria. The potential involvement of R-bodies in as yet unexplored interactions with eukaryotes and the consequent ecological implications are discussed.
ABSTRACT: Understanding the impact of spontaneous mutations on fitness has many theoretical and practical applications in biology. Although mutational effects on individual morphological or life-history characters have been measured in several classic genetic model systems, there are few estimates of the rate of decline due to mutation for complex fitness traits. Here, we estimate the effects of mutation on competitive ability, an important complex fitness trait, in a model system for ecological and evolutionary genomics, Daphnia. Competition assays were performed to compare fitness between mutation-accumulation (MA) lines and control lines from eight different genotypes from two populations of Daphnia pulicaria after 30 and 65 generations of mutation accumulation. Our results show a fitness decline among MA lines relative to controls as expected, but highlight the influence of genomic background on this effect. In addition, in some assays, MA lines outperform controls providing insight into the frequency of beneficial mutations.
Journal of Evolutionary Biology 12/2012; · 3.48 Impact Factor
ABSTRACT: Understanding how genetic variation is generated and how selection shapes mutation rates over evolutionary time requires knowledge of the factors influencing mutation and its effects on quantitative traits. We explore the impact of two factors, genomic background and generation time, on deleterious mutation in Daphnia pulicaria, a cyclically parthenogenetic aquatic microcrustacean, using parallel mutation-accumulation experiments. The deleterious mutational properties of life-history characters for individuals from two different populations, and individuals maintained at two different generation times, were quantified and compared. Mutational properties varied between populations, especially for clutch size, suggesting that genomic background influences mutational properties for some characters. Generation time was found to have a greater effect on mutational properties, with higher per-generation deleterious mutation rates in lines with longer generation times. These results suggest that differences in genetic architecture among populations and species may be explained in part by demographic features that significantly influence generation time and therefore the rate of mutation.