Molecular signature of epistatic selection: interrogating genetic interactions in the sex-ratio meiotic drive of Drosophila simulans.
ABSTRACT Fine scale analyses of signatures of selection allow assessing quantitative aspects of a species' evolutionary genetic history, such as the strength of selection on genes. When several selected loci lie in the same genomic region, their epistatic interactions may also be investigated. Here, we study how the neutral polymorphism pattern was shaped by two close recombining loci that cause 'sex-ratio' meiotic drive in Drosophila simulans, as an example of strong selection with potentially strong epistasis. We compare the polymorphism data observed in a natural population with the results of forward stochastic simulations under several contexts of epistasis between the candidate loci for the drive. We compute the likelihood of different possible scenarios, in order to determine which configuration is most consistent with the data. Our results highlight that fine scale analyses of well-chosen candidate genomic regions provide information-rich data that can be used to investigate the genotype-phenotype-fitness map, which can hardly be studied in genome-wide analyses. We also emphasize that initial conditions and time of observation (here, time after the interruption of a partial selective sweep) are crucial parameters in the interpretation of real data, while these are often overlooked in theoretical studies.
LUIS-MIGUEL CHEVIN1,2*#, HÉLOÏSE BASTIDE3,
CATHERINE MONTCHAMP-MOREAU3 AND FRÉDÉRIC HOSPITAL4
1UMR de Génétique Végétale, Ferme du Moulon, 91190 Gif Sur Yvette, France
2Ecologie, Systématique et Evolution, UMR 8079, Université Paris-Sud, 91405 Orsay Cedex, France
3Laboratoire Evolution Génome et Spéciation, UPR9034, CNRS, 91198 Gif-sur-Yvette Cedex, France
4INRA, UMR 1313 “Génétique Animale et Biologie Intégrative”, 78352 Jouy-en-Josas, France
(Received 25 July 2008 and in revised form 19 March 2009)
1. Introduction
Understanding how selection operates at the gene
level is one of the main goals of evolutionary genetics.
Most of the current effort to identify positively selected
genes involves searching for molecular signatures
of selection on neutral polymorphism (Nielsen,
2005). Indeed, the growth experienced by a beneficial
mutation partly affects patterns of polymorphism at
linked neutral variants through genetic hitchhiking
(Maynard-Smith & Haigh, 1974), so neutral loci
linked to a locus under selection may be distinguished
from loci that evolve under pure neutrality. A popular
approach in this context consists of analysing the
polymorphism pattern around a candidate region that
was previously identified through quantitative trait
locus (QTL) analysis or association mapping. In
contrast to large-scale genome scans (Nielsen et al.,
2005; Williamson et al., 2007), which provide a global
picture of natural selection, fine-scale studies of this
kind allow detailed questions to be asked about how
selection affected particular regions, up to the order of
a megabase. For instance, Kim & Stephan (2002) designed
a method to jointly estimate the precise location of
the target of selection and the selection coefficient
involved, thus yielding more quantitative information
than the simple presence of positive selection in a
genomic region.
Another appealing possibility would be to use the
pattern of polymorphism in a candidate region to
investigate the relationship between the genotype, the
phenotype and fitness. This step is crucial in order to
get an integrated view of evolution and adaptation,
since selection only operates at the level of phenotypes,
not directly on genes. However, in most cases,
the genotype–phenotype–fitness map is extremely
complex, and cannot be modelled explicitly without
huge simplifying assumptions (Gavrilets, 2004, chapter 2).
And even then, its underlying parameters are
often difficult to estimate empirically.
* Corresponding author: e-mail: l.chevin@imperial.ac.uk
# Present address: Division of Biology, Imperial College London,
Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK.
Genet. Res., Camb. (2009), 91, pp. 171–182.
doi:10.1017/S0016672309000147
© Cambridge University Press 2009
Printed in the United Kingdom
The best examples of integrated investigations
of selection, from the phenotype in natura to the
molecular level, mainly focused on QTL with very
strong additive effects (Rogers & Bernatchez, 2005;
Hoekstra et al., 2006). In such studies, the complexity
of the traits compels the use of a reductive approach that
neglects the interactions that may occur (i) between
the focal QTL and other genes that contribute to the
trait and (ii) between the focal trait and other traits
that contribute to fitness. If there were phenotypic
traits whose relationship to fitness was clearly characterized,
and the interactions between loci were simple
and biologically explicit, then we could address
specific questions about the functional interactions of
genes under selection using molecular signatures of
selection.
Selfish genetic elements are very appealing candidates
in that respect. They take advantage of the genomic
machinery in order to increase their own reproductive
success, largely independently (and often at the expense)
of the fitness of the host organism (Hurst &
Werren, 2001). Hence, their prevailing phenotype
is directly their own fitness (besides possible pleiotropic
effects on fertility or viability). They thus allow
simple, biologically explicit and empirically
testable hypotheses to be formulated about the
genotype–phenotype–fitness map. These hypotheses can in turn be tested by
various methods, including molecular signatures of
selection.
Segregation distorters (Lyttle, 1991) are among the
best-studied examples of selfish genetic elements.
They hijack the process of meiosis (meiotic drive) or
gametogenesis so as to be found in more than half
of the gametes produced by heterozygous individuals
that carry them, thus violating Mendel’s law of random
segregation. This confers on them a strong selective
advantage, and hence they can affect neutral polymorphism
through the hitchhiking effect (Chevin &
Hospital, 2006). The molecular mechanisms underlying
the drive are usually unknown but are likely diverse.
In males, the known driving elements kill or disable
the alternative gamete (Lyttle, 1991). In females, they
take advantage of the asymmetry of female meiosis
to end up in the egg nucleus. When they act on sex
chromosomes in the heterogametic sex, they also
modify the sex ratio of the population, in which case
they are sometimes called sex-ratio distorters
(Jaenike, 2001).
Here, we study the sex-ratio drive in Drosophila
simulans, which has been well characterized genetically.
This meiotic drive favours distorter X chromosomes
(XSR) against susceptible Y chromosomes in
males. At least three independent sex-ratio systems
have been found in this species (Tao et al., 2007;
Jaenike, 2008). In the most thoroughly analysed case
(denoted the ‘Paris’ sex-ratio in Tao et al. (2007)),
the XSR chromosomes have reached high prevalence
in southeast Africa and Madagascar (frequency up to
60%), but their effect is now completely suppressed
by autosomal and Y-linked suppressors (Atlan et al.,
1997). Montchamp-Moreau et al. (2006) investigated
the genetic determinism of the ‘Paris’ drive. Using
a reference XSR chromosome in a suppressor-free
genetic background, they showed that two close
genomic regions were both necessary for the drive to
occur in the lab, which points towards obligate interaction
between the alleles involved in the drive.
However, we do not know whether their interaction in
the genetic background of the natural populations at
the time of their spread was the same as in the
drive-sensitive genetic context used in the lab. This question
can be investigated using molecular signatures of
selection.
The polymorphism pattern in the driving region of
XSR chromosomes of D. simulans was investigated in
two recent studies. Derome et al. (2004) first showed
that the Nrg gene located close to the meiotic drive
elements of D. simulans exhibits the signature of a
selective sweep in the islands of Madagascar and
La Réunion. A further study of the sample from
Madagascar, using several intragenic markers in the
same genomic region, uncovered a spatial
pattern consistent with incomplete selective sweeps
at two loci (Derome et al., 2008). This pattern is
reproduced in Fig. 1, including three new markers that
were not in Derome et al. (2008). It has three notable
features. First, along each of the two causative regions
previously identified in the genetic study of
Montchamp-Moreau et al. (2006), the diversity of
XSR chromosomes is dramatically reduced relative
to that of standard (non-distorter) chromosomes
(XST). This is consistent with a strong association of
those two regions with the phenotype under study
(meiotic drive). Together with the high frequency of
the haplotypes associated with the drive, this is also
indicative of positive selection (Sabeti et al., 2002;
Voight et al., 2006). Second, the linkage disequilibrium
(LD) within each of the candidate regions,
and also most notably between them, is very strong, as
can be seen from both the D′ values and the significance
of the Fisher exact test. This feature may have
emerged as a result of positive epistasis between the
drive loci included in each region. Third, in spite of
this (putatively strong epistatic) selection involving
two loci only 1 cM apart, the diversity at markers
located in between the two regions is high. To understand
how selection has shaped this pattern, we need a
model with two close loci under selection and varying
levels of epistasis.
The effect of positive selection at two closely linked
loci on neutral polymorphism has been investigated
in several recent studies. Kim & Stephan (2003)
showed that selective sweeps at two closely linked loci
have on average less than additive effects on the reduction
of heterozygosity at a neutral locus, because
the selective interference between the selected loci
slows down their dynamics. Chevin et al. (2008) further
showed that the fine-scale polymorphism pattern
around two partially linked loci with independent
(multiplicative) effects on fitness can exhibit a spike in
diversity in the interval delimited by the selected loci.
This occurs because each beneficial mutation can
hitchhike a different neutral allele. However, the pattern
may be different if there is epistasis between the
loci under selection. In the sex-ratio meiotic drive in
D. simulans, epistasis is suggested by both the genetic
analysis and the strong LD. This genetic system is
thus a good example to study the signature of selection
at two close loci in the presence of genetic interactions.
Because meiotic drive systems are expected
to evolve by recruiting interacting elements at linked
loci (Crow, 1991; Palopoli & Wu, 1996), the case is
particularly appealing.
In this paper, we investigate the effect of selective
interactions on the pattern of neutral polymorphism,
in the context of the sex-ratio meiotic drive of D. simulans.
We first build a simple model of the interaction
between the loci that cause the meiotic drive.
This model is used to understand how epistasis between
the driving loci influences their LD. Then, we
run stochastic simulations using this quantitative
framework, to compare the likelihoods of various
levels of epistatic interactions between the SR loci,
and various scenarios of introduction of the driving
mutations on the X chromosome. We focus on the
three features of the polymorphism pattern that were
emphasized above, because they bear interpretable
information. We show that, in a region where several
genes have been identified as candidates for selection
on a phenotype, the polymorphism pattern can provide
more information than the simple presence/absence
of selection, and can potentially shed light
on the fundamental genotype–phenotype–fitness
relationship.
2. Methods
Wherever possible we used deterministic numerical
computations. However, the evolution of the
polymorphism pattern at several neutral loci under
the infinite site model of mutation and with complex
selection is mathematically intractable, so our analysis
mainly relies on stochastic simulations.
[Figure 1: (a) plot of πSR/πST (y-axis, 0 to 3·0) against distance from A in kb, according to the genome of D. melanogaster (x-axis, 0 to 250), for markers A, B, C, D, E, F, F', G, H, I', I, J, K, L and M, with the candidate regions for the distorter elements (genetic mapping) indicated; (b) matrix of pairwise LD between the markers.]
Fig. 1. Polymorphism pattern in the SR region of D. simulans in Madagascar. (a) Ratio of nucleotide diversities between
distorter (SR) and standard (ST) X chromosomes. (b) LD between the markers (in letters), quantified by values of
D′ (in bold when equal to 1.0) and P values of Fisher’s exact test (white: non-significant, light grey: P<0.05, dark grey:
P<0.01, black: P<0.001). Note the high linkage within and between the two candidate regions (letters in bold).
(i) A general 2-locus model of sex-ratio meiotic drive
We model sex chromosome meiotic drive favouring
X chromosomes against Y chromosomes in males,
and determined by two loci (SR1 and SR2) on the X
chromosome. Alleles are noted in italics; SRi denotes
the mutant, potentially driving allele at locus SRi,
whereas sri denotes the wild-type allele. Recombination
occurs at rate r between SR1 and SR2 in females,
consistent with what was shown for D. simulans, but
contrary to other meiotic drive systems that have
evolved chromosomal inversions (Jaenike, 2001). The
segregation coefficient k_i (1/2 ≤ k_i ≤ 1) is the proportion
of X-bearing sperm produced by males that
carry only one driving allele SRi. Note that k_i is also
the proportion of females in their progeny. If we denote
by a_i (0 ≤ a_i ≤ 1) the proportion of Y-bearing sperm
eliminated by the meiotic drive due to SRi (starting
from equal amounts of X and Y chromosomes), then
k_i = 1/(2 − a_i). The phenotypic effect of SRi is thus
the elimination of a proportion a_i = (2k_i − 1)/k_i of
Y-bearing sperm, since we assume no pleiotropic
effect of meiotic drive genes on viability or fertility
(see Discussion section). We denote by k_12 the segregation
coefficient of individuals carrying both driving alleles
SR1 and SR2 (and a_12 accordingly). In the absence
of interaction between the SR loci, each allele SRi
eliminates a proportion a_i of the sperm left intact by the
other allele SRj, so that the proportions of surviving
Y-bearing sperm are multiplicative between the SR
loci. When there is also functional interaction between
the SR loci, we consider that the proportion of
Y-bearing sperm is further reduced by a factor 1 − e
(0 ≤ e ≤ 1) such that in the general case, the fraction of
the initial Y-bearing sperm that survive is
$$1 - a_{12} = (1 - a_1)(1 - a_2)(1 - e) = \frac{(1 - k_1)(1 - k_2)(1 - e)}{k_1 k_2} \tag{1}$$
and
$$k_{12} = \frac{1}{2 - a_{12}} = \frac{k_1 k_2}{k_1 k_2 + (1 - e)(1 - k_1)(1 - k_2)}. \tag{2}$$
The term e quantifies the strength of the interaction
between SR loci in an explicit way, related to their
phenotypic effect (the destruction of Y-bearing
sperm). The combined segregation coefficient k_12 then
varies from k_1 (for e = 0 and k_2 = 1/2) to 1 (for e = 1).
This general framework allows us to consider different
cases of interest regarding the genetic determinism of
the meiotic drive, from independent effects of both
loci (k_1 > 1/2, k_2 > 1/2, e = 0) to obligate synergistic
effects (k_1 = 1/2, k_2 = 1/2, e > 0). A relevant intermediate
case is the interaction of a driving locus with an
otherwise neutral enhancer locus (k_1 > 1/2, k_2 = 1/2,
e > 0).
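As a numerical check of equations (1) and (2), the following minimal Python sketch computes the combined segregation coefficient from k_1, k_2 and e. The function names are ours and are introduced for illustration only; they do not come from the simulation program used in this study.

```python
def eliminated_fraction(k):
    """Proportion a of Y-bearing sperm eliminated, obtained from k = 1/(2 - a)."""
    if not 0.5 <= k <= 1.0:
        raise ValueError("k must lie in [1/2, 1]")
    return (2.0 * k - 1.0) / k


def combined_segregation(k1, k2, e):
    """Segregation coefficient k12 of males carrying both driving alleles (eq. 2)."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must lie in [0, 1]")
    a1 = eliminated_fraction(k1)
    a2 = eliminated_fraction(k2)
    # Eq. (1): surviving Y-bearing sperm are multiplicative between loci,
    # and further reduced by the interaction factor 1 - e.
    surviving = (1.0 - a1) * (1.0 - a2) * (1.0 - e)
    a12 = 1.0 - surviving
    return 1.0 / (2.0 - a12)


if __name__ == "__main__":
    # Enhancer case without epistasis, then obligate interaction with two values of e.
    for k1, k2, e in [(0.75, 0.5, 0.0), (0.5, 0.5, 0.33), (0.5, 0.5, 1.0)]:
        print(k1, k2, e, round(combined_segregation(k1, k2, e), 3))
```

Under obligate interaction (k_1 = k_2 = 1/2), equation (2) reduces to k_12 = 1/(2 − e), so that e alone sets the strength of the drive.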
The dynamics at the SR loci can be calculated
deterministically. We use labels a, b, c and d, for the
two-locus haplotypes SR1–SR2, SR1–sr2, sr1–SR2 and
sr1–sr2, respectively, and denote by x_hm and x_hf the
frequencies of any haplotype h among X chromosomes
in male and female gametes, respectively. In eggs, the
new frequency of haplotype h after one generation is
$$x'_{hf} = \bar{x}_h + 2\delta r\left(C - \tfrac{1}{2}\bar{C}\right), \qquad \delta = 1 \text{ if } h \in \{b, c\}, \quad \delta = -1 \text{ if } h \in \{a, d\}, \tag{3}$$
where $\bar{x}_h = (x_{hm} + x_{hf})/2$, $C = \bar{x}_a \bar{x}_d - \bar{x}_c \bar{x}_b$ and $\bar{C} = (C_m + C_f)/2$.
$C_m = x_{am} x_{dm} - x_{cm} x_{bm}$ is the LD in sperm
and similarly $C_f$ the LD in eggs. Note that the overbar
symbol is used here for the sake of clarity of notation,
but does not denote an average value in the population,
since the sex ratio is not necessarily 1/2.
In males, the X chromosome is maternally inherited,
so the genotypic frequencies in males are
equal to those among females in the previous generation.
Those frequencies are then affected by the meiotic
drive, and the new frequencies of haplotypes a, b, c
and d among X chromosomes in sperm after one
generation are
$$\begin{aligned}
x'_{am} &= \frac{2k_{12}x_{af}}{2k_{12}x_{af} + 2k_1 x_{bf} + 2k_2 x_{cf} + x_{df}}, \\
x'_{bm} &= \frac{2k_1 x_{bf}}{2k_{12}x_{af} + 2k_1 x_{bf} + 2k_2 x_{cf} + x_{df}}, \\
x'_{cm} &= \frac{2k_2 x_{cf}}{2k_{12}x_{af} + 2k_1 x_{bf} + 2k_2 x_{cf} + x_{df}}, \\
x'_{dm} &= \frac{x_{df}}{2k_{12}x_{af} + 2k_1 x_{bf} + 2k_2 x_{cf} + x_{df}}.
\end{aligned} \tag{4}$$
We iterated equations (3) and (4) to calculate the
deterministic dynamics of the frequency of XSR
chromosomes, and that of the LD between the SR loci.
The LD was calculated as D′, which is the classical D
(covariance between allelic states at two loci) divided
by its maximum expected value based on allelic
frequencies (Lewontin, 1995). Specifically, we calculated
the expected D′ in males at the generation when the
frequency of XSR chromosomes reached 0.6, which is
the frequency observed in natural populations of
Madagascar. This allowed us to evaluate the influence
of the epistasis parameter e on the association
between the SR loci.
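The iteration of equations (3) and (4) can be illustrated with the short Python sketch below. It is not the code used for the results reported here, and it rests on several assumptions of ours that the text does not specify: both driving alleles start on a single haplotype a at frequency x0 in both sexes, XSR chromosomes are counted as all haplotypes carrying at least one driving allele, and D′ "in males" is read from the egg frequencies that males inherit. All function and parameter names are hypothetical.

```python
HAPLOTYPES = "abcd"  # a: SR1-SR2, b: SR1-sr2, c: sr1-SR2, d: sr1-sr2


def next_generation(xf, xm, k1, k2, k12, r):
    """Apply equations (3) and (4) once; xf and xm map haplotype -> frequency."""
    xbar = {h: 0.5 * (xf[h] + xm[h]) for h in HAPLOTYPES}
    c = xbar["a"] * xbar["d"] - xbar["c"] * xbar["b"]
    c_m = xm["a"] * xm["d"] - xm["c"] * xm["b"]
    c_f = xf["a"] * xf["d"] - xf["c"] * xf["b"]
    c_bar = 0.5 * (c_m + c_f)
    delta = {"a": -1.0, "b": 1.0, "c": 1.0, "d": -1.0}
    # Eq. (3): eggs, after recombination at rate r in females.
    new_xf = {h: xbar[h] + 2.0 * delta[h] * r * (c - 0.5 * c_bar) for h in HAPLOTYPES}
    # Eq. (4): X-bearing sperm of males, whose X comes from the previous eggs.
    weight = {"a": 2.0 * k12, "b": 2.0 * k1, "c": 2.0 * k2, "d": 1.0}
    total = sum(weight[h] * xf[h] for h in HAPLOTYPES)
    new_xm = {h: weight[h] * xf[h] / total for h in HAPLOTYPES}
    return new_xf, new_xm


def d_prime(x):
    """Signed D' between the two SR loci: D divided by its maximum given allele frequencies."""
    p1 = x["a"] + x["b"]  # frequency of the driving allele at SR1
    p2 = x["a"] + x["c"]  # frequency of the driving allele at SR2
    d = x["a"] * x["d"] - x["b"] * x["c"]
    if d >= 0:
        d_max = min(p1 * (1 - p2), (1 - p1) * p2)
    else:
        d_max = min(p1 * p2, (1 - p1) * (1 - p2))
    return d / d_max if d_max > 0 else 0.0


def d_prime_at_threshold(k1, k2, e, r, x0=1e-3, threshold=0.6, max_gen=100_000):
    """D' at the first generation where the assumed XSR frequency reaches the threshold."""
    k12 = k1 * k2 / (k1 * k2 + (1 - e) * (1 - k1) * (1 - k2))  # eq. (2)
    start = {"a": x0, "b": 0.0, "c": 0.0, "d": 1.0 - x0}  # assumed initial state
    xf, xm = dict(start), dict(start)
    for _ in range(max_gen):
        if 1.0 - xf["d"] >= threshold:
            return d_prime(xf)
        xf, xm = next_generation(xf, xm, k1, k2, k12, r)
    return None  # threshold not reached within max_gen generations


if __name__ == "__main__":
    for e in (0.33, 0.89):
        print(e, d_prime_at_threshold(0.5, 0.5, e, r=0.01))
```

The recombination rate r = 0.01 in the usage example corresponds to the 1 cM separating the two candidate regions; other starting conditions (for instance, delayed introduction of one allele) would require a different initial state.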
(ii) Simulation method
To study the influence of selection and epistasis on the
polymorphism pattern along the recombining region
of the X chromosome that includes the SR1 and
SR2 distorter loci, we used forward individual-based
stochastic simulations. We used a modified version of
the program used in Chevin et al. (2008), which can
simulate several DNA sequence fragments (‘markers’)
with mutation within fragments (under the infinite
site model) and recombination within and between
fragments. Two markers were placed so as to include
each of the SR loci (the causative loci were
considered to be restricted to a single nucleotide).
Another marker was placed in the middle of the
SR1–SR2 interval. For each marker, we generated the
initial neutral polymorphisms for all the X chromosomes
in the population by coalescence simulations
using the program ‘ms’ (Hudson, 2002), since
coalescence theory remains a good approximation when
the sample size is close to the effective population size
(Wakeley & Takahashi, 2003). We assumed that the
sex ratio was unbiased before the introduction of the
meiotic drive allele(s), so that the size of the population
of X chromosomes was N_X = 3N_e, where N_e
is the effective population size. For each fragment,
we simulated 3N_e sequences using ‘ms’, with the
mutation parameter θ and the recombination parameter
ρ defined at the scale of the entire fragment
(rather than per nucleotide), as is common practice
when using the infinite site model (see for instance
Hudson (2002) and Przeworski (2002)). Empirical
estimates of ρ and θ suggest that ρ is roughly twice as
large as θ in normally recombining genomic regions
of D. simulans (Kliman et al., 2000), so we used θ = 3
and ρ = 6, which roughly corresponds to 300 bp long
sequences. Then, the selective sweeps at the meiotic
drive loci SR1 and SR2 were simulated forward in
time. Recombination occurred in females only, at rate
r = ρ/(2N_e). Segregation distortion occurred in males,
as described in the Model section. We assumed that
the driving alleles had no deleterious effects on fertility
or viability (such an effect would be mostly
equivalent to decreasing the strength of the drive).
Mutation occurred in both sexes, at a rate μ = θ/(3N_e).
We used an effective population size of N_e = 10000.
This value is lower than the actual effective size
usually reported for fruit flies, and was chosen because
it was tractable in individual-based forward
simulations. Nevertheless, it is unlikely to affect our results
strongly, since we used relevant values of the population
parameters for recombination and mutation
inside each fragment. Hence the main consequence
of using a small population size in our context is to
increase the amount of drift, thus limiting the strength
and duration of signatures of selection. This could
affect our results quantitatively to some extent, but
not qualitatively.
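The conversion from the scaled parameters to per-generation rates can be made explicit with a few lines of Python. This is only an arithmetic sketch: the function name is ours, and the ‘ms’ argument list at the end is a plausible invocation using the standard `ms nsam nreps -t theta -r rho nsites` syntax, with nsites = 300 assumed from the approximate fragment length. The exact command used for the simulations is not given in the text.

```python
def per_generation_rates(theta, rho, n_e):
    """Convert fragment-scale theta and rho into per-generation rates.

    mu = theta / (3 N_e): mutation rate per fragment, both sexes.
    r  = rho / (2 N_e):   recombination rate per fragment, females only.
    """
    mu = theta / (3.0 * n_e)
    r = rho / (2.0 * n_e)
    return mu, r


N_E = 10_000            # effective population size used in the simulations
THETA, RHO = 3.0, 6.0   # fragment-scale mutation and recombination parameters
N_X = 3 * N_E           # number of X chromosomes simulated per fragment, as stated in the text

mu, r = per_generation_rates(THETA, RHO, N_E)

# Hypothetical 'ms' call generating the initial polymorphism of one fragment
# (one replicate; nsites = 300 is an assumption, not a value stated as such).
ms_command = ["ms", str(N_X), "1", "-t", str(THETA), "-r", str(RHO), "300"]

if __name__ == "__main__":
    print("mu =", mu, "r =", r)
    print(" ".join(ms_command))
```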
The various simulations differed in the parameters
of the meiotic drive (k_1, k_2 and e). The simulations
also differed in the scenarios regarding driver alleles
at SR1 and SR2, which could be introduced either
(i) together on the same haplotype (which represents
a migration event from another population), or (ii)
separately in time (delayed). In all cases, simulations
in which either of the alleles was lost were discarded,
as in Chevin et al. (2008). When the driving alleles
appeared by mutation, they were introduced in five
copies in order to decrease computation time. This
does not affect the generality of the results (see Chevin
et al. (2008)), and relies on the fact that a beneficial
mutation that is destined for fixation (i.e. conditional on
ultimate fixation) rises quickly in frequency (Barton,
1998). In the case of delayed appearance, SR2 was
present but behaved neutrally before the introduction
of the driving allele at SR1. Hence SR2 was taken to be
one of the neutral polymorphic sites from the ‘ms’
simulation, at which the derived allele was chosen to
be the driving allele.
We cancelled the meiotic drive effect when the
pooled frequency of distorters (regardless of their
quantitative effects) among X chromosomes reached
0.6, the value observed in the natural population
of Madagascar (Atlan et al., 1997). This was meant
to represent the effect of rapidly invading drive suppressors
on the Y chromosomes and/or on autosomes
(Atlan et al., 2003), or a frequency-dependent disadvantage
of XSR in fertility (Taylor & Jaenike
(2002), see Discussion section). The population was
then left to evolve for an additional 200 generations,
and 25 samples were drawn from each simulation
every 50 generations, from which statistical measures
were made. Samples consisted of 10 XSR and 5 XST
chromosomes as in Derome et al. (2008).
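The sampling scheme can be sketched as follows in Python. This is an illustration under our own assumptions, not the code of the simulation program: chromosomes are represented as (is_distorter, sequence) tuples, sampling is taken to occur at generations 50, 100, 150 and 200 after the drive is cancelled (the text does not say whether generation 0 is also sampled), and all names are hypothetical.

```python
import random


def sampling_generations(total=200, interval=50):
    """Generations after the end of the drive at which samples are drawn (assumed schedule)."""
    return list(range(interval, total + 1, interval))


def draw_sample(population, n_sr=10, n_st=5, rng=random):
    """Draw n_sr distorter (XSR) and n_st standard (XST) chromosomes without replacement.

    population: list of (is_distorter, sequence) tuples.
    Returns None when either class has too few chromosomes to be sampled.
    """
    sr = [chrom for chrom in population if chrom[0]]
    st = [chrom for chrom in population if not chrom[0]]
    if len(sr) < n_sr or len(st) < n_st:
        return None
    return rng.sample(sr, n_sr) + rng.sample(st, n_st)


def draw_samples(population, n_samples=25, rng=random):
    """Draw the 25 replicate samples taken at one time point."""
    return [draw_sample(population, rng=rng) for _ in range(n_samples)]


if __name__ == "__main__":
    # Toy population with 60% distorter chromosomes and placeholder sequences.
    toy = [(i < 600, "seq%d" % i) for i in range(1000)]
    for generation in sampling_generations():
        samples = draw_samples(toy, rng=random.Random(generation))
        print(generation, len(samples), len(samples[0]))
```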
(iii) Likelihood of scenarios
Our aim was to find which genetic scenario was the
most consistent with the observed polymorphism
pattern in the region. We chose to study two realistic
cases of interest regarding the interaction between
SR loci. The first case is obligate interaction of the
meiotic drive elements, whereby neither has an effect of
its own (k_1 = 0.5 and k_2 = 0.5), as observed in the lab
against standard Y chromosomes. In the second case,
SR1 is a meiotic drive locus, whose effect is possibly
enhanced by SR2 (otherwise neutral). In this scenario,
we chose k_1 = 0.75 (and still k_2 = 0.5). This second
scenario is consistent with the fact that many meiotic
drive systems are thought to evolve by recruiting
interacting elements at linked loci during their spread
in a population (Crow, 1991). Within each of these
qualitatively distinct genetic interaction schemes,
several values of the epistasis parameter e were simulated
(from e = 0.33 to e = 0.89, corresponding under
obligate interaction to k_12 = 0.6 and k_12 = 0.9, respectively).
Hence, we assessed both the qualitative and
quantitative influences of epistasis on the likelihood