Recurrent selection for the Winters sex-ratio genes in Drosophila simulans
Submitted to Genetics
Sarah B. Kingan*1, Daniel Garrigan†, and Daniel L. Hartl*
*Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
†Department of Biology, University of Rochester, Rochester, NY 14627
1. Present Address: Department of Biology, University of Rochester, Rochester, NY 14627
Corresponding Author:
Sarah B. Kingan
University of Rochester
Department of Biology
River Campus, Box 270211
Rochester, NY 14627-0211
Phone: (585) 275-4509
Fax: (585) 275-2070
e-mail: skingan@mail.rochester.edu
ABSTRACT
Drosophila simulans is unusual in having at least three distinct systems of X chromosome meiotic
drive. These selfish genetic elements have biased transmission during meiosis, resulting in an excess of
female progeny. Here, we characterize naturally occurring genetic variation at the Winters sex-ratio
driver (Distorter on the X or Dox), its progenitor gene (Mother of Dox or MDox), and its suppressor
gene (Not Much Yang or Nmy), which have been previously mapped and characterized. We survey
three North American populations as well as 13 globally distributed strains, and present molecular
polymorphism data at the three loci. We find that all three genes show signatures of selection in North
America, judging from levels of polymorphism and skews in the site-frequency spectrum. This
signature of selection likely results from the biased transmission of the driver and selection on the
suppressor for the maintenance of equal sex ratios. The timing of selection is more recent than the age
of the alleles, suggesting that the driver and suppressor are coevolving under an evolutionary “arms
race”. None of the Winters sex-ratio genes are fixed in D. simulans, and at all loci we find ancestral
alleles, which lack the gene insertions and exhibit high levels of nucleotide polymorphism compared to
the derived alleles. In addition, we find several “null” alleles that have mutations on the derived Dox
background, which result in loss of drive function. We discuss the possible causes of the maintenance
of presence-absence polymorphism in the Winters sex-ratio genes.
INTRODUCTION
Meiotic drive can leave signatures in the genome similar to positive natural selection without
increasing the fitness of an organism (LYTTLE 1993). Drive elements are preferentially transmitted
during meiosis by disrupting the development or function of sperm carrying the homologous
chromosome (ZIMMERING et al. 1970, meiotic drive sensu lato), or by true chromosome segregation
defects during meiosis (SANDLER and NOVITSKI 1957, meiotic drive sensu stricto; TAO et al. 2007a);
we use the term in the former sense throughout this article. While drive elements may arise on any
chromosome, sex-chromosome meiotic drive is more easily detected due to the impact it has on the sex
ratio, and a sex-linked driver is also more likely to invade a population than an autosomal driver (Hurst
and Pomiankowski 1991). A chromosomal driver must maintain tight linkage with an insensitive target
locus lest it drive against itself, a condition ensured by the lack of recombination between sex
chromosomes (Charlesworth and Hartl 1978). Because of the impact drive elements have on sex ratios,
sex-linked drivers are often referred to as “sex-ratio distorters” and the phenotype of skewed progeny
sex ratios is termed “sex-ratio”. The mere transmission advantage of a driver, unless balanced by some
detrimental fitness effect or masked by a suppressor, can cause it to sweep through a population in a
manner similar to a positively selected mutation (Edwards 1961; Vaz and Carvalho 2004).
Obviously, a complete sweep of a sex-linked driver dooms a male-less (or female-less)
population to extinction (Hamilton 1967), and natural selection strongly favors genetic factors that
suppress drive and restore Mendelian segregation. Fisher (1930) presented a qualitative argument for
the maintenance of an equal sex ratio, which predicts selection on any heritable variant that increases
the production of the rarer sex. Fisher’s principle has been formalized mathematically and
demonstrated empirically (e.g. Bodmer and Edwards 1960; Carvalho et al. 1998). Suppressors have
been identified in a wide variety of meiotic drive systems and are predicted to be strongly favored by
natural selection for the maintenance of equal sex ratios (reviewed by Jaenike 2001). Furthermore, the
evolution of enhancer genes on a suppressed driving chromosome may enable drivers to evade
suppression, setting off another bout of selection for Fisherian sex ratios through suppression (Hartl
1975).
Meiotic drive is widespread, with systems identified in mammals, insects, and plants (JAENIKE
2001). Drosophila is the most extensively studied insect taxon, and sex-chromosome meiotic drive
systems have been identified in more than a dozen species (Jaenike 2001). The rapid evolution of
suppressors by Fisherian selection results in a cryptic sex-ratio distorter, which may be identified only
when the association between the driver and suppressor is lost, such as in hybrids between species or
populations that do not share meiotic drive systems (MERCOT et al. 1995). The coevolutionary arms
race driving strong selection on drivers and suppressors likely contributes to Haldane’s rule (the
preferential sterility or inviability of heterogametic hybrids) and is a leading explanation for the
importance of X-linked loci in causing hybrid male sterility (FRANK 1991; HURST and POMIANKOWSKI
1991; PRESGRAVES 2008; TAO et al. 2007b). Two recently characterized hybrid male sterility factors,
which are also sex-ratio distorters, are evidence of a possible link between meiotic drive and speciation
(ORR and IRVING 2005; PHADNIS and ORR 2008; TAO et al. 2001).
Drosophila simulans is unusual in having at least three distinct X-linked drive systems, termed
Paris, Durham, and Winters (Tao et al. 2007a). Here, we focus on the Winters sex-ratio (SR), whose
driver and suppressor have been mapped to the gene level and whose molecular and cellular features
have been elucidated (Tao et al. 2007a; Tao et al. 2007b). Two genes, Distorter on the X (Dox) and
Mother of Dox (MDox) are required for sex-ratio distortion (TAO et al. 2007a, Y. Tao personal comm.).
Dox is a duplicate copy of MDox, which is located 70 kilobases (kb) proximal from its progenitor locus
on the X chromosome (Figure 1). A dominant suppressor, called Not Much Yang (Nmy) evolved on
chromosome 3R as a retrotransposed copy of Dox (TAO et al. 2007b). Nmy likely suppresses Dox
through an RNA interference mechanism by forming a double stranded RNA with homology to the
distorter RNAs (TAO et al. 2007b). The genes of the Winters sex-ratio are not found in D.
melanogaster, which diverged from D. simulans ~2.3 million years ago (LI et al. 1999). Initial surveys
of the genes in the simulans clade indicate that a functional Nmy gene is present in D. mauritiana (Tao
et al. 2007b). Thus, the Winters genes are more than 250,000 years old, the speciation time of D.
simulans, D. mauritiana, and D. sechellia (McDermott and Kliman 2008).
Signatures of positive selection have been previously detected at genomic regions linked to
Drosophila sex-ratio distorters, but we present the first evidence of selection acting directly on a
sex-ratio distorter gene and its suppressor gene. In Drosophila recens, driving X chromosomes show
reduced nucleotide and haplotype variability relative to standard (non-driving) X chromosomes, and
linkage disequilibrium extends over 130 cM of the driving chromosome (DYER et al. 2007). The D.
recens driver is located in a large chromosomal inversion and appears to be in the early stages of
mutational degradation. In the Paris sex-ratio of D. simulans, Derome et al. (2004) found reduced
haplotype diversity at the Nrg locus, which is closely linked to the Paris sex-ratio gene. In a later
study, the group further localized the Paris driver to a pair of duplicated loci 150 kb apart, and
demonstrated reduced haplotype diversity and linkage disequilibrium between variants associated with
drive (Derome et al. 2008). In this study, we characterize patterns of genetic variation in natural
populations of North American D. simulans and find signatures of strong positive selection at all three
genes of the Winters sex-ratio.
MATERIALS AND METHODS
Population Samples: Samples from three North American populations of D. simulans were
examined in this study (Supplementary Table 1). Two sets of isofemale lines were established from
Massachusetts in September 2006: Tremont, collected in a backyard grape arbor on Tremont Street in
Cambridge (n = 34), and Nicewicz, collected at the Nicewicz Family Farm in Bolton (n = 12), ~30 mi
west of Cambridge. F1 males were frozen and used for DNA extraction. In addition, a set of isofemale
lines collected in Winters, CA, in the summer of 1995 (Begun and Whitley 2000) was kindly donated
by Sergey Nuzhdin. We also obtained 13 lines of diverse geographic origins from the Tucson Species
Stock Center: 5 African (Madagascar 14021.0251.196, 14021.0251.197, Kenya 14021.0251.199,
Congo 14021.0251.184, and South Africa 14021.0251.169), 2 North American (California
14021.0251.194, unknown 14021.0251.195), 2 European (Scotland 14021.0251.216, Greece
14021.0251.181), and 4 Oceanian (New Guinea 14021.0251.009, New Zealand 14021.0251.007,
Australia 14021.0251.176, New Caledonia 14021.0251.198). All strains were sampled randomly with
respect to sex-ratio phenotype and genotype.
Data Collection: Genomic DNA was extracted from single males using a modified protocol of
the Wizard Genomic DNA Purification Kit from Promega. From the Massachusetts populations F1
males from wild-caught females were used, and so both autosomal alleles were included in our sample.
All other stocks are inbred lab lines and the autosomal loci were found to be homozygous. Polymerase
chain reaction was performed using Takara LA Taq polymerase according to manufacturer’s
instructions. Previously published PCR primers for Dox, MDox, and Nmy were used that amplified
complete genes as well as flanking sequence (Tao et al. 2007a; Tao et al. 2007b; Figure 1). Internal
sequencing primers were used to obtain 2X coverage (forward and reverse reads) for PCR amplicons.
Primers were designed using Primer3Plus (Untergasser et al. 2007) and Amplify v. 3.14 (Engels 2005).
Sequencing was performed on an ABI3730 capillary sequencer according to manufacturer’s protocols.
Sequences were edited using Sequencher v. 4.8 (Gene Codes Corp.) and aligned by eye with the aid of
bl2seq program of the BLAST package (Tatusova and Madden 1999). Additional editing was
performed using BioEdit (Hall, 2007). At the Nmy locus, singleton variants that were observed as
heterozygous sites in chromatograms were confirmed with repeated PCR and sequencing. Two
samples from the Tremont population, T44 and T62, are double heterozygotes at the Nmy locus; both
heterozygous sites for these samples feature a singleton variant. Haplotype phase was resolved for
these two samples by assuming that each singleton variant arose on the wild-type (i.e., most common)
background, rather than the less likely case of both singletons having arisen on the same genetic
background. For each sample we collected 6.2 kb from Dox, 4.5 kb from MDox, and 7.5 kb from Nmy
(Figure 1). A total of 1.6 Megabases (Mb) of resequence data were obtained.
Data Analysis: We calculated population genetic summary statistics using DnaSP (Rozas et al.
2003). The population mutation rate was estimated as the average pairwise diversity, π (Tajima 1983),
and Watterson’s estimator, θW (Watterson 1975), which is based on the number of segregating sites
correcting for sample size. The site frequency spectrum was summarized by both Tajima’s D (Tajima
1989) and Fay and Wu’s H (Fay and Wu 2000). To summarize linkage disequilibrium (LD) across
each gene, we estimated the statistic ZNS (Kelly 1997), which is the average pairwise r² value among
all variable sites (Hill and Robertson 1968) and h, the number of haplotypes. In order to calculate the
age of the origin of the genes, we estimated divergence as the average number of nucleotide
substitutions, DXY (Nei 1987 equation 10.20). The fit of various summary statistics to the standard
neutral model was assessed through coalescent simulations using the observed number of segregating
sites, the conservative assumption of no recombination, and 1000 simulations, as implemented in
DnaSP. HKA tests were performed using the HKA software (HEY 2004; HUDSON et al. 1987).
Significance of HKA tests was determined from 10,000 coalescent simulations.
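To illustrate how these summaries relate to one another, the following minimal Python sketch computes the number of segregating sites (S), π, θW, and Tajima's D from a set of aligned haplotypes. It is not the DnaSP implementation; the function name and the toy sequences are hypothetical, values are per locus rather than per site, and gaps or missing data are not handled.

```python
from itertools import combinations
from math import sqrt

def summary_stats(haplotypes):
    """Return (S, pi, theta_w, tajimas_d) for equal-length aligned sequences.

    Assumes at least two sequences with no gaps or missing data.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # Segregating sites: alignment columns showing more than one state.
    seg = [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]
    S = len(seg)
    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a[i] != b[i] for i in seg) for a, b in pairs) / len(pairs)
    # Watterson's estimator: S divided by a1 = sum_{i=1}^{n-1} 1/i.
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = S / a1
    if S == 0:
        return S, pi, theta_w, None  # D is undefined without variation
    # Variance constants from Tajima (1989).
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (pi - theta_w) / sqrt(e1 * S + e2 * S * (S - 1))
    return S, pi, theta_w, d

# Hypothetical toy alignment of four haplotypes.
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
S, pi, theta_w, d = summary_stats(toy)
```

A negative D arises when π falls below θW (an excess of rare variants), and a positive D when π exceeds θW.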
Modeling selection: A Bayesian approach was taken to estimate the time since selection on the
Winters sex-ratio genes in each of the three North American populations, using coalescent simulations
of neutral variants linked to a site under selection. The simulation has two phases (going forward in
time), a complete selective sweep of a new beneficial variant followed by a neutral (recovery) phase.
We used a modified version of a computer program by Przeworski (2003), which models the selected
phase as the structured coalescent in which recombination between neutral variants and the site under
selection is treated analogously to migration between demes (KAPLAN et al. 1989). The neutral locus
evolves according to the infinite sites model, with population mutation rate, θ = 4NμL (where N is the
effective population size, μ is the per-site, per-generation mutation rate, and L is the length of the
sequence) and population recombination rate, ρ = 4NrL (where r is the per-site, per-generation
recombination rate). Recombination between the neutral and selected sites occurs with rate C = 4NrK
(where K is the distance between the closest neutral site and the selected site). The Bayesian method
estimates the posterior probability distribution for the intensity (4Ns) and the time (T) since the
completion of the selective sweep, using a summary likelihood method, in which the data are
summarized by the number of segregating sites (S), number of haplotypes (h), and Tajima’s D.
The selection model has the following parameters: N, effective population size; s, the selection
coefficient; μ, mutation rate; r, recombination rate; and T, time since fixation of the beneficial variant.
The posterior probability distributions for model parameters were generated using a rejection algorithm
(Tavare et al. 1997). Briefly, parameter values are sampled from a prior distribution, a genealogy is
simulated with the sampled parameters, and S segregating sites are placed randomly onto the
genealogy. The data summaries described above are calculated from the simulated genealogy and
compared to the summaries from the observed data. Parameter values that generate the observed
number of haplotypes and a Tajima’s D value within some user-specified interval (ε) are accepted and
output to the posterior. To capture uncertainty in model parameters, the prior distributions of μ, r, and N
are gamma-distributed whereas s is sampled from a uniform prior.
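The accept/reject logic of such a rejection algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the modified Przeworski (2003) program: the simulator below is a toy stand-in rather than a structured coalescent, the gamma parameters for N and the uniform prior on T are hypothetical, and all function names are invented for the example.

```python
import random

def rejection_sample(draw_prior, simulate, obs_h, obs_d, eps, n_draws, rng):
    """Keep prior draws whose simulated summaries match the observed data.

    A draw is accepted when the simulated haplotype count equals obs_h and
    the simulated Tajima's D lies within eps of obs_d.
    """
    accepted = []
    for _ in range(n_draws):
        params = draw_prior(rng)
        sim_h, sim_d = simulate(params, rng)
        if sim_h == obs_h and abs(sim_d - obs_d) <= eps:
            accepted.append(params)
    return accepted

def draw_prior(rng):
    """Gamma priors for mu, r, and N; uniform prior for s (as in the text)."""
    return {
        "mu": rng.gammavariate(12, 1e-10),  # shape, scale: X-linked values
        "r": rng.gammavariate(48, 1e-10),   # shape, scale: X-linked values
        "N": rng.gammavariate(10, 1e5),     # hypothetical shape and scale
        "s": rng.uniform(5e-4, 0.5),
        "T": rng.uniform(0.0, 1.0),         # hypothetical prior on T
    }

def toy_simulate(params, rng):
    """Toy stand-in for the coalescent simulator (NOT a real genealogy).

    Older sweeps (larger T) yield more haplotypes and a D closer to zero.
    """
    t = params["T"]
    h = 1 + int(rng.random() < t) + int(rng.random() < t)
    d = -2.5 * (1.0 - t) + rng.gauss(0.0, 0.3)
    return h, d

rng = random.Random(1)
posterior = rejection_sample(draw_prior, toy_simulate,
                             obs_h=2, obs_d=-1.5, eps=0.2,
                             n_draws=5000, rng=rng)
```

The accepted parameter sets approximate the posterior; a smaller ε gives a closer approximation at the cost of a lower acceptance rate.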
Choice of prior distributions of parameters. In an effort to ensure that the prior distributions of
model parameters accurately reflect neutral variation in North American populations of D. simulans,
we calculated the mean θW and ρ (HUDSON 1987) for 29 loci on the X and chromosome 3R sequenced
in the same Winters, CA population (BEGUN and WHITLEY 2000, see Supplementary Table 2). We
used gamma-distributed priors for N, r, and μ that yielded priors of the model parameters, θ and ρ, with
these empirically observed means. We estimated θW and ρ separately for loci on the X versus 3R and
included all variable sites. The empirically estimated mean per-site θW and ρ for the X loci are 0.00488
and 0.01947 and for the 3R loci are 0.01029 and 0.08431 (Supplementary Table 2). The inheritance
scalar for the effective size of the X chromosome to that of the autosomes is accounted for in the joint
prior probability distribution of θ and ρ. The analysis outputs time scaled in units of 4N generations
and that scaling can be considered arbitrary. To avoid confusion, we have reported scaled times in units
of N generations.
Mutation rate was calculated from whole-genome divergence between D. simulans and D.
melanogaster. Begun et al. (2007) estimated lineage-specific divergence for D. simulans in 10 kb
windows across the entire genome. We calculated μ for each window as the lineage-specific
divergence divided by 2.3 million years, the divergence time for D. simulans and D. melanogaster (LI et al.
1999). Assuming 10 generations per year, this calculation gives a median per-site, per-generation
mutation rate for chromosomes X and 3R of 1.2 × 10⁻⁹ and 1.0 × 10⁻⁹, respectively. These estimates are
within the range of other estimated mutation rates for Drosophila (ANDOLFATTO and PRZEWORSKI
2000), but slightly lower than a commonly used mutation rate based on synonymous sites only (SHARP
and LI 1989). If we assume there is a single effective population size for a population, the per-site,
per-generation r can be calculated as (ρ × μ)/θW. For the Winters, CA population data (BEGUN and WHITLEY
2000), we calculated r = 4.8 × 10⁻⁹ for the X and r = 8.2 × 10⁻⁹ for 3R. The prior distributions of μ for
the X and 3R were gamma with shape parameter 12 and 10, respectively, and scale parameter 10⁻¹⁰;
thus, the means of these distributions are 1.2 × 10⁻⁹ and 1.0 × 10⁻⁹, respectively (Table 2). The prior
distributions of r for the X and 3R were gamma distributed with shape parameter 48 and 82,
respectively, and scale parameter 10⁻¹⁰; thus, the means of these distributions are 4.8 × 10⁻⁹ and 8.2 ×
10⁻⁹, respectively. The prior for the selection coefficient, s, was uniform between 5 × 10⁻⁴ and 0.5.
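As a worked check of this arithmetic: because ρ = 4Nr and θ = 4Nμ share the same N, the per-site recombination rate follows as r = ρμ/θW, and the mean of a gamma distribution is its shape multiplied by its scale. The short Python sketch below reproduces the reported values from the per-site estimates given in the text; the function names are hypothetical.

```python
def per_site_recombination_rate(rho, mu, theta_w):
    """r = rho * mu / theta_W, from rho = 4Nr and theta = 4N*mu."""
    return rho * mu / theta_w

def gamma_mean(shape, scale):
    """Mean of a gamma distribution parameterized by shape and scale."""
    return shape * scale

# Mean per-site estimates for the Winters, CA population (from the text).
r_x = per_site_recombination_rate(rho=0.01947, mu=1.2e-9, theta_w=0.00488)
r_3r = per_site_recombination_rate(rho=0.08431, mu=1.0e-9, theta_w=0.01029)
print(f"{r_x:.1e} {r_3r:.1e}")  # 4.8e-09 8.2e-09

# Prior means for mu (X: shape 12, 3R: shape 10) with scale 1e-10.
mu_prior_means = (gamma_mean(12, 1e-10), gamma_mean(10, 1e-10))
# Prior means for r (X: shape 48, 3R: shape 82) with scale 1e-10.
r_prior_means = (gamma_mean(48, 1e-10), gamma_mean(82, 1e-10))
```

The gamma shape parameters for r (48 and 82) are thus the rounded values of r expressed in units of 10⁻¹⁰.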
We chose to estimate r using a population genetic estimate (ρ) rather than genetic map data for
several reasons. First, recombination rates estimated from genetic maps are systematically higher than
those estimated from population genetic parameters (ANDOLFATTO and PRZEWORSKI 2000; O'REILLY et
al. 2008). While this pattern may be shaped by selection, demographic factors such as population
bottlenecks or population expansions may also increase levels of LD in natural populations (STUMPF
and MCVEAN 2003). Secondly, recombination rate in Drosophila is sensitive to maternal age,
temperature, and genetic background and recombination estimates in laboratory stocks do not take into
account these biological factors (ASHBURNER et al. 2005). Third, our use of the lower,
population-based estimates of recombination is conservative with regard to the estimated strength of selection
and timing of selection (i.e. time since selection may be overestimated and strength of selection may
be underestimated).
RESULTS
Ancestral alleles observed at all loci: For each of the three sequenced loci we observe
multiple chromosomes that lack the gene insertion, which represent the ancestral state of each locus
(Supplementary Table 1). For convenience we refer to the presence of the gene insertion as the
“derived” allele. At the Dox locus, four strains (two from Madagascar, one from New Caledonia, and
one from New Zealand) lack the 3833 bp Dox gene insertion; at MDox, four samples lack the 3549 bp
gene insertion (two from Madagascar, one from Congo, and one from New Zealand); and at Nmy, two
North American samples lack the 2041 bp gene insertion (one individual each from Winters, CA, and
the Tremont population from Massachusetts).
Null mutations at Dox: Three different alleles at Dox were observed that have the derived
gene insertion but have lost their ability to drive (see Figure 2). The wild-type allele is the functional
distorter Dox and is present in 75% of the sampled lines (n = 53). A previously characterized null allele
dox[del105] is present in 3 copies (4%) (Tao et al. 2007a). This allele has a 105 bp deletion
overlapping intron 2 and exon 3, which removes a region that is critical for distortion. Ten samples
(14%) have the dox[del150] null allele, which has a total of 150 bp deleted in exon 4, including one
large 135 bp deletion and two smaller deletions of 12 bp and 3 bp. We found a single copy of
dox[del585], which shares the exon 4 deletions with dox[del150] but has an additional 435 bp deletion
spanning exon 1 and intron 1. We tested the ability of dox[del150] and dox[del105] to distort sex
ratios in a non-suppressing nmy background, where nmy is a loss-of-function mutant of the Nmy gene
(Tao et al. 2007b). These crosses yielded progeny with equal sex ratios (see Supplementary Table 3
and Supplementary Figure 1). We assume that dox[del585] is a loss-of-function mutant because it
shares the dox[del150] deletions in addition to the large deletion in exon 1.
Insertion-deletion polymorphism: Insertion-deletion (indel) polymorphisms at the Dox locus
were already discussed in the context of loss-of-function mutations. At MDox, we observe one copy of
MDox[del105] which has a 105 bp deletion that spans exons 2 and 3, one copy of MDox[ins135],
which has a total of 135 bp inserted into exon 3, and one copy of MDox[ins32] which has 32 bp
inserted in exon 1. The functional implications of these mutations are not known. In some cases, the
same indel polymorphisms were observed at Dox and MDox, and evidently derive from gene
conversion between the two paralogs (see next section). In addition to indel polymorphism in the
MDox gene sequence, we observe variable numbers and lengths of the 360 bp repetitive elements that
flank the MDox gene (Tao et al. 2007a). (Copies of this element also flank the Dox gene and may
facilitate gene conversion between the paralogs). The New Zealand and Kenyan samples have an
additional full-length repeat element 5' of the MDox gene, and one of the Madagascar samples
(14021.0251.197) is missing the two 3' repeat elements. At Nmy, three samples (two from Madagascar
and one from Congo) have a 6 bp insertion in one of the inverted repeats necessary for suppression by
Nmy; we refer to this allele as Nmy[ins6]. Two of these three samples (the Congolese sample and one
Madagascar sample, 14021.0251.196) also have a 201 bp insertion adjacent to a deletion of 77 bp
between the inverted repeats, which is in the putative loop region of the RNA secondary structure (Tao
et al. 2007a). The functional implications of these mutations at Nmy are not known.
Nucleotide polymorphism and divergence: Estimates of nucleotide polymorphism for the full
dataset at all three genes are relatively low, but not unusually so compared to other datasets for D.
simulans (ANDOLFATTO 2001; BEGUN and WHITLEY 2000). Importantly, the derived alleles have
significantly reduced levels of nucleotide polymorphism compared to ancestral alleles (Table 1, Figure
3). Derived alleles have 2.22% of the ancestral allele diversity at Dox when measured as π (4.38%
when measured as θW), and the corresponding parameters estimated for MDox are 0.99% (4.62%) and
for Nmy 2.29% (14.65%). To test the statistical significance of the reduction, we performed pairwise
HKA tests (HUDSON et al. 1987) for each locus, in which levels of polymorphism and divergence are
compared for derived and ancestral alleles (Figure 3). Divergence was measured from D. melanogaster
in the region flanking the genes. Deviation from neutral expectations is significant for all three loci
(Dox: χ² = 57.84, P < 0.0001; MDox: χ² = 35.05, P < 0.0001; Nmy: χ² = 13.716, P < 0.0036).
To determine whether the Winters SR genes show signatures of positive selection, we
conducted multilocus HKA tests in which we compared polymorphism and divergence at each of the
three Winters SR genes in the three North American populations to that of 13 unrelated loci sampled in
the same Winters, CA population (Table 3). For our “neutral” set of loci, we chose a subset of the 29
loci sampled by Begun and Whitley (2000) that had the largest number of sampled chromosomes (n =
8). The Winters SR genes are predicted to be non-protein coding RNA genes (TAO et al. 2007a; TAO et
al. 2007b) so we included all variable sites in our analysis because we cannot restrict our analysis to
synonymous sites only, whose evolution is least likely to be influenced by non-neutral processes
(ANDOLFATTO 2005; HALLIGAN and KEIGHTLEY 2006). The original Begun and Whitley (2000) study
analyzed only synonymous sites, so we reanalyzed all sites in their data in order to directly compare
the datasets. A multi-locus HKA test on these 13 loci does not show any departure from neutral
expectations (χ² = 17.99, P < 0.0764, Table 3). However, when we include the Winters SR genes we
observe significant deviation from the neutral expectations in all but one test (Table 3). We first
conducted nine tests where we added data for a single Winters SR locus from a North American
population to the 13 Begun and Whitley (2000) loci. All nine tests are significant except when we
added Nmy from the Winters population (χ² = 20.28, P = 0.0903). For the Nmy data, we conducted
two additional tests for the Tremont and Winters populations where we excluded the single ancestral
allele present in each population. Both of these tests are significant (Winters: χ² = 59.92, P < 0.0001;
Tremont: χ² = 94.52, P < 0.0001). (Here we report the uncorrected P-value but all tests remain
significant at P < 0.0011 after a Bonferroni correction for multiple tests.) If positive selection has
acted on the Winters SR genes, we expect to see deviation in the test in the direction of elevated
divergence and reduced polymorphism at the Winters SR genes. In five of the 11 tests conducted, the
Winters SR gene showed the largest deviation from neutral expectations in both polymorphism and
divergence. In the remaining five significant tests, the Winters SR gene had the largest deviation from
neutral expectations for divergence but not polymorphism. Moreover, these deviations were in the
direction of reduced polymorphism and elevated divergence.
Site-frequency spectrum: Non-neutral processes such as natural selection or non-equilibrium
demography shape the site-frequency spectrum, which is commonly summarized by the statistics
Tajima’s D (TAJIMA 1989) and Fay and Wu’s H (FAY and WU 2000). Tajima’s D (TD) is a summary
of the folded frequency spectrum and compares two estimates of nucleotide polymorphism, π and θW,
yielding a negative value if a locus has an excess of low frequency variants and a positive value if a
locus has an excess of intermediate frequency variants. Fay and Wu’s H (FWH) is a summary of the
unfolded frequency spectrum and is sensitive to the frequency of derived mutations such that it is
negative when there is an excess of high frequency derived variants. Both of these statistics are
commonly used as tests of selection where a negative value for each is compatible with a locus having
experienced a selective sweep.
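The sensitivity of Fay and Wu's H to high-frequency derived variants can be seen in a small Python sketch. It assumes the unnormalized form H = π − θH computed from the unfolded spectrum, with each segregating site represented by the number of sampled chromosomes carrying the derived allele; the function name and the toy counts are hypothetical, and this is not the DnaSP implementation.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H = pi - theta_H.

    derived_counts: for each segregating site, the number of the n sampled
    chromosomes that carry the derived allele (between 1 and n - 1).
    """
    denom = n * (n - 1)
    # pi weights each site by i * (n - i), which peaks at intermediate i.
    pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    # theta_H weights each site by i squared, which peaks at high i.
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return pi - theta_h

# Hypothetical samples of n = 10 chromosomes with three segregating sites.
h_high = fay_wu_h([9, 9, 8], n=10)  # derived alleles near fixation
h_low = fay_wu_h([1, 1, 2], n=10)   # derived alleles rare
```

In this toy case h_high is negative and h_low is positive, matching the interpretation that an excess of high-frequency derived variants drives H below zero.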
We calculated TD and FWH at the Winters SR genes for a sample of all chromosomes, for each
of the three North American populations and five African samples, and for the derived and ancestral
alleles (Table 1). In the full dataset for all loci, we observe significantly negative Tajima’s D values at
each gene (Dox: -2.19, P < 0.00001; MDox: -2.58, P < 0.00001; Nmy: -2.76, P < 0.00001). For the
North American populations, all but two population samples for which we could conduct tests have
significantly negative Tajima’s D values (Dox: Nicewicz, -2.17, P = 0.003; Tremont, -0.29, n.s.;
Winters, -0.34, n.s.; MDox: Nicewicz, -2.09, P < 0.00001; Tremont, -1.98, P = 0.008; Winters, -1.14, P
< 0.05; Nmy: Nicewicz, n.a.; Tremont, -2.88, P < 0.00001; Winters, -2.32, P = 0.008). The African
sample has a significantly positive TD at Dox (1.82, P = 0.001) and TD values close to zero for the
other loci (MDox: -0.36, n.s.; Nmy: 0.32, n.s.). At all loci, the derived alleles have significantly
negative TD’s (Dox: -1.48, P = 0.041; MDox: -2.40, P < 0.00001; Nmy: -2.69, P < 0.00001) whereas
the ancestral alleles have TD’s close to zero (Dox: 0.35, n.s.; MDox: -0.27, n.s.; Nmy: n.a.). Samples
for which TD values could not be calculated due to lack of segregating sites or too few samples are
indicated with “n.a.” In summary, TD estimates are compatible with positive selection acting at all
three Winters SR genes. At each gene, samples including all chromosomes as well as only the derived
alleles show significantly negative TD values. Because the site frequency spectrum is sensitive to
population pooling (HAMMER et al. 2003; PTAK and PRZEWORSKI 2002), the estimates for individual
North American populations minimize this problem (but may not eliminate it, as the geographic scale
of population structure in North American D. simulans is not well understood). For the individual
populations, we observe significantly negative TD values for all tests except for Tremont and Winters
at Dox. This pattern is not likely to result from demographic forces such as population growth because
significantly negative TD values are not observed at any of the reanalyzed Begun and Whitley (2000)
loci (Supplementary Table 2), which were sampled in the same Winters, CA population.
For the complete dataset, we observe significant FWH at Nmy, and marginal significance at the
driver loci (Dox: -25.95, P = 0.067; MDox: -24.57, P = 0.052; Nmy: -91.63, P < 0.00001). Similarly,
North American populations and samples of derived alleles also have significant FWH at Nmy only
(Dox: Nicewicz, n.a.; Tremont, 0.06, n.s.; Winters, n.a.; derived, 0.03, n.s.; MDox: Nicewicz, 0.15, n.s.;
Tremont, 0.18, n.s.; Winters, 0.15, n.s.; derived, 0.15, n.s.; Nmy: Nicewicz, n.a.; Tremont, -36.19, P =
0.005; Winters, -72.88, P < 0.00001, derived, -29.72 P = 0.003). None of the tests are significant for
the African sample or the ancestral alleles.
Gene conversion between Dox and MDox: Alignment of the paralogous region of the Dox
and MDox loci reveals three gene-conversion tracts. The dox[del150] allele has a sequence motif of
three deletions and a cluster of 5 single nucleotide polymorphisms (SNPs) that is shared with the
wild-type MDox haplotype. In addition, we find one MDox haplotype that resembles the wild-type Dox
haplotype in that it lacks these same three deletions and the SNP motif. Finally, the 105 bp deletion
that characterizes the dox[del105] allele is also found in one MDox haplotype. These gene-conversion
tracts were identified by eye and confirmed with the method of Betran et al. (1997) using the DnaSP
software.
Linkage disequilibrium: Positive selection on a beneficial mutation can cause linked neutral
variants to increase in frequency along with the selected site, which results in elevated levels of linkage
disequilibrium across the genomic region. To test for elevated levels of LD at the Winters SR genes,
we summarized LD as the average pairwise r² value across each gene, ZNS (KELLY 1997). We also
tested for a reduction in the number of haplotypes (h), which results from hitchhiking by positive
selection (NIELSEN 2005). The results of this test are largely parallel with the estimates of ZNS (Table
1). In the complete dataset, we observe significantly elevated LD at all three loci (Dox: 0.52 P = 0.003;
MDox: 0.32, P = 0.046; Nmy: 0.40, P = 0.01). Six of the ten populations for which we could calculate
ZNS show elevated LD (Dox: Nicewicz, 0.80, P = 0.007; Tremont, 0.35, n.s.; Winters, 0.76, P = 0.038;
Africa, 0.97, P = 0.003; MDox: Nicewicz, 0.83, P = 0.015; Tremont, 0.25, n.s.; Winters, n.a.; Africa,
0.37, n.s.; Nmy: Nicewicz, n.a.; Tremont, 0.84, P < 0.00001; Winters, 1.00, P < 0.00001, Africa, 0.47,
n.s.).
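For readers unfamiliar with ZNS, the statistic is simply the mean of r² over all pairs of segregating sites. The Python sketch below illustrates the calculation for phased haplotypes with biallelic sites coded 0/1; it is a minimal illustration rather than the DnaSP implementation, and the function names and toy data are hypothetical.

```python
from itertools import combinations

def r_squared(site_a, site_b):
    """r^2 between two segregating biallelic sites (0/1 per haplotype)."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

def zns(sites):
    """Mean r^2 over all pairs of sites; needs at least two segregating sites."""
    pairs = list(combinations(sites, 2))
    return sum(r_squared(a, b) for a, b in pairs) / len(pairs)

# Hypothetical data: each inner list is one site across four haplotypes.
toy_sites = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],  # perfectly associated with the first site
    [0, 1, 0, 1],  # independent of the first two sites
]
toy_zns = zns(toy_sites)  # one pair with r^2 = 1, two pairs with r^2 = 0
```

A ZNS of 1.00, as reported for Nmy in the Winters population, means that every pair of segregating sites is in complete association.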
Several factors besides selection may increase levels of LD in a sample. These include pooling
derived and ancestral alleles (particularly when alleles differ by large genomic insertions that may inhibit
recombination), paralogous gene conversion, and pooling samples from different biological
16
populations. To explore these affects, we first calculated ZNS separately for derived and ancestral
alleles. The sample including all derived alleles at Nmy showed elevated LD (n = 113, ZNS = 0.41, P =
0.005) but we see no significant ZNS values at other loci (Table 1). When we exclude the ancestral
alleles in the Tremont and Winters populations at Nmy (no ancestral alleles were observed at Dox or
MDox in North America), the signature of LD is no longer evident (Tremont, ZNS = 0.0002, n.s.;
Winters ZNS = n.a., no segregating sites), meaning that the elevated LD was caused by the presence of
the single divergent ancestral allele. Next, gene conversion between Dox and MDox may have
introduced several non-independent mutations, which will initially be in linkage disequilibrium with
each other until the association is eroded by recombination or mutation. We performed a second
analysis of LD after encoding all mutations within gene conversion tracts as a single mutation. This
reanalysis only differed from our initial analysis in the LD estimates at Dox and MDox, and resulted in
no significant LD in the North American populations or the total sample of derived alleles (data not
shown). Finally, pooling among subpopulations can result in spuriously high levels of LD (HARTL and
CLARK 2007). The African sample includes several lines from populations that are genetically
differentiated from each other (BAUDRY et al. 2006), which may be the cause of the elevated LD in the
complete dataset at each locus as well as the African sample at Dox. In summary, we do not observe
elevated LD in samples of derived alleles in our North American populations after correcting for gene
conversion or excluding ancestral gene copies.
Age of derived alleles: To estimate the age of the genes, the nucleotide divergence between
the flanking sequence in the ancestral and derived alleles was calculated at each locus. From the
sequence divergence, the age can be estimated as t = d/(2μg) (where d is the divergence per site, μ is
the per-site per-generation mutation rate, and g is generation time in years). We used the mutation
rates calculated above for the modeling of selection. The per-site divergences between the ancestral and
derived alleles for Dox, MDox, and Nmy are 0.0467, 0.0198, and 0.0165, yielding age estimates of
1.96 MY, 832,000 years and 817,000 years, respectively. Based on this result, the Dox gene appears to
be much older than the other genes. It is possible that the duplication and transposition event that
created the Dox gene may also be associated with extensive sequence changes, particularly in the
repetitive sequences that flank the gene. A more accurate method of dating the Dox gene insertion is to
determine the divergence between Dox and MDox at the gene insertion sequence, which is 0.0206,
giving an age of 864,000 years, an estimate that is closer to the estimated ages of the other two genes.
At MDox and Dox, we observed no shared polymorphisms and 22 and 77 fixed differences,
respectively, between the ancestral and derived alleles. At Nmy, there are 12 shared polymorphisms and
45 fixed differences. These shared polymorphisms result from a recombination event in the middle of
the sequenced region such that sample T37a has the ancestral haplotype at the Nmy gene and a derived
haplotype in the region distal to the gene.
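The arithmetic behind these age estimates can be checked with a short sketch. The mutation rates themselves are given earlier in the paper and are not repeated in this section, so the code below works backwards: it treats the denominator of t = d/(2μg) as a single per-site, per-year rate and recovers the value implied by each reported divergence and age. The function names are illustrative only.

```python
def implied_rate_per_year(d, t_years):
    """Per-site, per-year rate implied by t = d / (2 * rate)."""
    return d / (2.0 * t_years)


def age_years(d, rate_per_year):
    """Age of a split from per-site divergence d and a per-site, per-year rate."""
    return d / (2.0 * rate_per_year)


# Divergence and age pairs as reported in the text above.
reported = {
    "Dox flanking": (0.0467, 1.96e6),
    "MDox flanking": (0.0198, 832_000),
    "Nmy flanking": (0.0165, 817_000),
    "Dox vs MDox insertion": (0.0206, 864_000),
}

implied = {name: implied_rate_per_year(d, t) for name, (d, t) in reported.items()}
for name, rate in implied.items():
    print(f"{name}: {rate:.3e} per site per year")
```

Under this reading, the three X-linked comparisons imply nearly the same rate (about 1.2 × 10^-8 per site per year) and the autosomal Nmy comparison a slightly lower one (about 1.0 × 10^-8), so the much older Dox flanking estimate reflects its larger divergence rather than a different rate.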
Timing of selection: To estimate the time since selection on the three genes of the Winters sex-ratio, we implement a model of a selective sweep followed by a neutral (recovery) phase in each of the
three North American populations (Przeworski 2003). We assume the selective sweep was complete
and therefore restrict our analysis to the derived alleles at each gene, which leads us to exclude one
ancestral Nmy allele from each of the Tremont and Winters populations. We were unable to perform the
analysis for the Nicewicz population at the Nmy locus, because only one segregating site is present and
Tajima’s D could not be calculated. By assuming fixation, we may be upwardly biasing our estimates
of the time since selection at Nmy (in North America, Dox and MDox are fixed in our sample so this is
less likely to be a problem at these loci). If ancestral alleles are segregating in the population,
recombination between derived and ancestral alleles may introduce mutations onto the derived
background, which would make derived alleles seem more diverse, and it would appear that selection
occurred longer ago than it actually did. In view of the results actually obtained, therefore, excluding
the ancestral Nmy sequences is conservative. In addition, the presence of gene conversion between Dox
and MDox results in conservative estimates of time since selection. Gene conversion increases the
number of segregating sites by introducing multiple non-independent mutations, thus increasing the
length of the recovery phase after selection is complete.
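Because Tajima's D is the data summary on which the fit of the sweep model is judged, it may help to see how it follows from the quantities in Table 1. The sketch below implements the standard formula of Tajima (1989) from the sample size n, the number of segregating sites S, and the mean number of pairwise differences k (per sequence, that is, π multiplied by L). It is an illustration and not the software used for the analyses.

```python
from math import sqrt


def tajimas_d(n, S, k):
    """Tajima's (1989) D.

    n: number of sampled chromosomes
    S: number of segregating sites
    k: mean number of pairwise differences per sequence (pi * L)
    Returns None when the statistic is undefined (no variation, or n < 4,
    where the variance term is zero).
    """
    if n < 4 or S == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Dox, Nicewicz population (Table 1): n = 12, S = 19, pi = 0.00057, L = 5521.
d_nicewicz = tajimas_d(12, 19, 0.00057 * 5521)
print(round(d_nicewicz, 2))
```

With the rounded values from Table 1, the Dox sample from Nicewicz gives a D close to the tabulated -2.17; the small gap reflects rounding of π. A sample with no segregating sites, such as Nmy in Nicewicz, returns no value.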
We generate 1000 sets of model parameters that are compatible with our data summaries at
each locus. For five of the datasets, we accepted simulated Tajima’s D values within ε = 0.1 of the
observed data. However, three datasets (Dox-Tremont, Dox-Winters, and Nmy-Winters) exhibited low
acceptance rates, which led us to increase ε to 0.5. The fit of the selection model to the data summaries
can be evaluated based on the shape of the posterior distribution for T, the time since the sweep in
coalescent time units of N generations (see Supplementary Figure 2). If the posterior is flat, it suggests
that selection is either too old to be detected (i.e., more than 4N generations ago), or else did not occur
(PRZEWORSKI 2003). Based on an effective population size on the order of 1 × 10^6 and 10
generations per year, we should be able to detect selection that occurred up to 4 million generations, or
400,000 years ago. All eight datasets are compatible with the model of selection (Supplementary
Figure 2). The median time since selection for Dox and MDox ranges from 0.0304 × N generations to
0.0348 × N generations (Table 4). At Nmy, selection is more recent, with a median time of 0.0068 × N
generations for the Tremont population and 0.0164 × N generations for the Winters population. The
time since selection in years can be calculated as t = TNg where T is the time in coalescent time units,
N is the effective population size, and g is the generation time in years, in this case 0.1, or 10
generations per year. At Dox and MDox, selection occurred around 3,000 years ago, with median times
ranging from 2,800 years for the Tremont population at MDox to 3,500 years ago for the Nicewicz
population at Dox (Table 4). Selection in the Tremont population at Nmy is most recent (median time =
1,600 years), while in the Winters population at Nmy the median time since selection is 3,800 years.
Importantly, the 95% credible interval for all eight datasets excludes the origin of the genes more than
250,000 years ago (TAO et al. 2007a). Selection most likely occurred less than 14,000 years ago, well
after the genes of the Winters SR had evolved in the ancestor of the D. simulans clade.
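The conversion from coalescent units to years, t = TNg, can be illustrated as follows. This is a sketch under a stated assumption: it uses the order-of-magnitude value N = 10^6 quoted above, whereas the medians in Table 4 presumably rest on locus-specific estimates of N, so the Nmy values in particular are not reproduced by this round number.

```python
def years_since_selection(T, N, g):
    """Convert time T (in units of N generations) to years: t = T * N * g."""
    return T * N * g


N_ASSUMED = 1_000_000   # order-of-magnitude effective population size from the text
G_YEARS = 0.1           # generation time in years (10 generations per year)

# Range of median T reported for Dox and MDox.
low = years_since_selection(0.0304, N_ASSUMED, G_YEARS)
high = years_since_selection(0.0348, N_ASSUMED, G_YEARS)

# Oldest sweep the method can detect: 4N generations.
detection_limit = years_since_selection(4.0, N_ASSUMED, G_YEARS)

print(round(low), round(high), round(detection_limit))
```

With these inputs the Dox and MDox medians fall at roughly 3,000 to 3,500 years, and the detection limit is 400,000 years, in agreement with the figures given in the text.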
DISCUSSION
In this study, we characterize patterns of genetic variation in North American populations of D.
simulans at the genes of the Winters sex-ratio, one of three X-linked meiotic drive systems in this
species (TAO et al. 2007a). We find that the presence of each gene (the distorter locus, Dox; its
progenitor gene, MDox; and the suppressor, Nmy) is polymorphic in this species. The frequencies of
the ancestral form of the driver loci (the allele which lacks the gene insertion) are highest in African
and Oceanian samples, while ancestral Nmy is rare in the North American samples and absent in
samples from other geographic localities. We also find evidence of gene conversion between Dox and
MDox, the paralogous gene pair responsible for sex-ratio distortion in this system. Finally, we find
several loss-of-function mutations on the derived Dox background, consistent with virtually complete
suppression of the sex-ratio system in North American populations.
All three genes of the Winters sex-ratio show signatures consistent with recent positive
selection. In this context, we use the term “selection” to also include the transmission-ratio advantage
of the meiotic drive locus. The evidence for selection is two-fold. First, nucleotide variability on the
derived allele background is greatly reduced compared to the ancestral allele background (Table 1,
Figure 3). Second, all genes show skews in the site-frequency spectrum with an excess of low-frequency variants observed in all genes, and an excess of high-frequency derived variants observed at
Nmy. These site-frequency skews are reflected in significantly negative Tajima’s D and Fay and Wu’s H
statistics (Table 1). Both of these patterns are consistent with a hitchhiking model where a new
mutation has rapidly increased in frequency in a population due to natural selection or biased
transmission during meiosis. In addition, we find our data to be compatible with a coalescent model of
a recent selective sweep at all loci that occurred well after the origins of the genes (Table 4 and Figure
4). In fact, the 95% credible interval for the time since selection at all loci is more recent than the split
between D. simulans and D. mauritiana, ~250,000 years ago (MCDERMOTT and KLIMAN 2008). This
result is consistent with the theoretical prediction that meiotic drive systems experience repeated bouts of
drive and suppression, and thus multiple rounds of selection (FRANK 1991).
Selection on the Winters sex-ratio is older than on the Paris sex-ratio, the other system in D.
simulans that has been extensively studied. Derome et al. (2008) estimated that selection acted on the
Paris driver only 88 years ago, based on an analysis of linkage disequilibrium across a region linked to
the driver. Our results indicate that selection acted less than 15,000 years ago, with an average age across
loci of 3,000 years. Consistent with this estimate, we do not observe elevated linkage disequilibrium in
derived gene copies at any of the Winters sex-ratio genes, after correcting for gene conversion between
Dox and MDox. Significant linkage disequilibrium would be a signature of very recent selection. This
signal is absent, whereas the signals of reduced polymorphism and skewed site frequencies are evident.
At the time that selection was most likely acting on the genes of the Winters sex-ratio, the geographic
range of D. simulans was restricted to Africa, the Indian Ocean islands, and Eurasia (Lachaise et al.
1988). North America was likely settled ~500 years ago during the European colonization of the New
World, facilitated by commensalism with humans (Lachaise and Silvain 2004). Interestingly, the most
recent round of selection on the Winters SR occurred around the time of the expansion into Eurasia,
6,500-5,000 years ago (LACHAISE and SILVAIN 2004). Female-biased populations have higher growth
rates than populations with even sex ratios (HAMILTON 1967), suggesting that the unleashing of the
Winters driver and the resulting excess of females may have facilitated the colonization of new
habitats. However, due to the large credible intervals of the estimated time since selection, we cannot
exclude the possibility that selection occurred when the species range was restricted to Africa.
Could other evolutionary forces besides selection have caused these departures from neutral
patterns? Demographic forces such as population-size changes or population subdivision can have
profound effects on genetic variation. However, these factors shape variation across all loci whereas
selection targets particular genes or functional regions. Patterns of variation at Dox, MDox, and Nmy
are unusual when compared to other loci sampled in North American populations (BEGUN and
WHITLEY 2000). In all three populations, each gene has either reduced polymorphism or elevated
divergence, or both, as evidenced by significant multi-locus HKA tests (Table 3). Population growth
can result in skews in the site-frequency spectrum similar to what we observed (i.e., an excess of rare
variants and negative Tajima’s D). However, previous work indicates that populations of D. simulans
were subject to a population bottleneck during their colonization of the New World (WALL et al.
2002). Recent population bottlenecks are expected to result in an excess of intermediate frequency
variants (WAKELEY 2009), whereas we observe a dearth in our data. Indeed, the Tajima’s D estimates
for the Begun and Whitley data (2000) are slightly positive, consistent with a population bottleneck
(Supplementary Table 2). Combined with our detailed knowledge of the function of these genes (TAO
et al. 2007a; TAO et al. 2007b), we are confident that the observed departures from neutral equilibrium
expectations at the genes of the Winters sex-ratio are due to selection.
If all three genes show signatures of positive selection, why are they not fixed in the species?
Even under a simple model of selective neutrality and drift, neutral mutations are not expected to
persist beyond 4N generations, or roughly 400,000 years in the case of D. simulans if we assume 10
generations per year and an effective population size on the order of one million (HARTL and CLARK
2007). Four copies of the ancestral distorter alleles were found in African and Oceanian populations
and two ancestral suppressors were found in North America. Polymorphism at the suppressor can be
explained from a simple model of selection to maintain Fisherian sex ratios. Assume, after Fisher
(1930), that the total reproductive value of males and females is equal,
$\sum_{i=1}^{m} W_i = \sum_{j=1}^{f} W_j$,
where $W_i$ is the fitness of the ith male, $W_j$ is the fitness of the jth female, and there are m males and f females in the
population. If we apportion fitness evenly among individuals of each sex, the fitness of each male is
then simply equal to the sex ratio, $W_i = f/m$. In a female-biased population, members of the “rarer sex”
(males) have higher fitness. Under a model where a sex-ratio distorter invades a population and fixes
due to its transmission advantage, selection on a new suppressor is frequency dependent. At low
frequency, a population is female-biased and selection for the maintenance of equal sex ratios is
strong; but at high frequency, most copies of the distorter are masked, the population sex ratio is close
to 50/50, and selection is much weaker. This result explains why selection for Fisherian sex ratios may
be inefficient at removing the last few copies of a non-suppressor allele, even though under a
deterministic model, the suppressor will eventually fix (VAZ and CARVALHO 2004). In addition,
selection is expected to be even less efficient at purging null suppressors if the functional suppressor is
dominant, as vanishingly few individuals will express sex-ratio. This verbal model makes many
simplifying assumptions such as panmixia, infinite population size, no pleiotropic fitness effects of
drivers or distorters, and dominant suppression, but it could nevertheless explain the presence of
ancestral Nmy alleles in North American populations that are fixed for the derived allele at both Dox
and MDox.
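The frequency dependence in this verbal model can be made concrete with a small numerical sketch. Everything beyond the relation W = f/m is an assumption introduced here for illustration: the driver is taken to be fixed, the suppressor is a dominant autosomal allele at Hardy-Weinberg proportions, only males homozygous for the non-suppressing allele express drive, and the ~81% drive intensity quoted for the Winters system in this Discussion is read as the proportion of daughters sired by such males.

```python
def female_fraction(p_suppressor, drive_strength=0.81):
    """Expected fraction of daughters among offspring of the population.

    Assumptions (illustrative, not from the paper): driver fixed; dominant
    autosomal suppressor at frequency p_suppressor in Hardy-Weinberg
    proportions; unsuppressed males sire `drive_strength` daughters,
    suppressed males sire 50% daughters.
    """
    unsuppressed = (1.0 - p_suppressor) ** 2
    return unsuppressed * drive_strength + (1.0 - unsuppressed) * 0.5


def male_relative_fitness(p_suppressor, drive_strength=0.81):
    """Fisherian fitness of a male relative to a female, W = f/m."""
    f = female_fraction(p_suppressor, drive_strength)
    return f / (1.0 - f)


for p in (0.0, 0.1, 0.5, 0.9, 0.99, 1.0):
    print(f"suppressor frequency {p:.2f}: male fitness f/m = {male_relative_fitness(p):.3f}")
```

Under these assumptions the advantage of producing sons is more than fourfold when the suppressor is absent but shrinks to about one percent once the suppressor reaches a frequency of 0.9, which is the sense in which selection becomes too weak to remove the last non-suppressing alleles.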
Understanding the presence of null alleles of Dox and MDox is more complex. Under simple,
single-population models of sex-chromosome drive, polymorphism between driving (SR) and standard
(ST) X chromosomes can result from three conditions (Vaz and Carvalho 2004). First, the transmission
advantage of an SR chromosome may be balanced by deleterious effects of either the driving locus
itself or linked variants. Experimental work in a variety of Drosophila species indicates that when
mated multiply, SR males suffer reduced fertility as well as reduced sperm competitive ability; these
are examples of pleiotropic effects of the drive locus due to reduced sperm production (Jaenike 2001).
Linked deleterious mutations may affect either male or female fitness and are common when driver
elements occur in chromosomal inversions. In D. recens, females homozygous for SR chromosomes
have reduced fertility, presumably due to a mutation at an unrelated locus trapped in the large
inversion which contains the drive locus (DYER et al. 2007). The last two conditions for SR/ST
polymorphism require the evolution of suppressors by selection for Fisherian sex ratios or genomic
conflict, which mask the expression of drive. If suppression is complete (i.e., suppressors are fixed) the
meiotic drive system is essentially “dead” and both loci evolve neutrally. If the suppression is partial
(i.e., suppressors are polymorphic) polymorphism in the driver may be maintained.
For the Winters sex-ratio, we may argue against an offsetting deleterious effect based on
several lines of evidence. First, the distorter is not located within a chromosomal inversion and is
unlikely to be associated with deleterious variants. Second, theoretical work indicates that SR
chromosomes balanced by deleterious effects cannot reach a frequency high enough to skew sex ratios
and induce selection for suppressors (Vaz and Carvalho 2004). So the mere presence of Nmy indicates
that Dox/MDox is not maintained as a balanced polymorphism. However, rejection of this hypothesis
requires careful measurement of the fitness of each genotype. Interestingly, experiments suggest that
the fertility of males expressing drive may be reduced relative to that of males with suppressed drivers
(TAO et al. 2007b). Although rates of female remating in D. simulans are low (MARKOW 1996), in a
female-biased population, sperm limitation may be an issue for males. A difficulty in testing this
hypothesis stems from the fact that small fitness effects may have important consequences in natural
populations yet be undetectable in the laboratory.
The partial suppression hypothesis seems unlikely because, although Nmy is not fixed, the
frequency of males homozygous for non-suppressing Nmy is very low. Based on the observed allele
frequencies in our sample, non-suppressing males are expected to occur at 0.6% in the Winters
population and at 0.02% in the Tremont population. Thus, the “neutral” explanation seems most likely
as it is supported by the presence of loss-of-function mutations on the derived Dox background and the
near complete suppression of driving chromosomes based on observed allele frequencies in our
sample.
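The text does not spell out how these expected frequencies were obtained. One reading, sketched below, is a Hardy-Weinberg calculation using the allele counts in Table 1 (one ancestral Nmy allele among 12 chromosomes in Winters and one among 66 in Tremont); the function name is illustrative.

```python
def expected_homozygote_frequency(allele_count, sample_size):
    """Hardy-Weinberg expectation q^2 for homozygotes of an allele seen
    `allele_count` times among `sample_size` sampled chromosomes."""
    q = allele_count / sample_size
    return q * q


winters = expected_homozygote_frequency(1, 12)   # Nmy, Winters: 12 (1) in Table 1
tremont = expected_homozygote_frequency(1, 66)   # Nmy, Tremont: 66 (1) in Table 1

print(f"Winters: {100 * winters:.2f}%  Tremont: {100 * tremont:.3f}%")
```

This gives about 0.69% for Winters and 0.023% for Tremont, in line with the reported 0.6% and 0.02%.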
Our ability to distinguish among these three hypotheses for the polymorphism in the Winters
driver is complicated by the fact that D. simulans violates many assumptions of the simple population-genetic models implicit in the discussion above. The species exhibits high levels of population
structure, particularly in Africa (Hamblin and Veuille 1999), and it is possible that the ancestral alleles
were sampled in populations that do not exchange migrants with populations that currently harbor the
Winters sex-ratio genes. More extensive population sampling of the Madagascar, Congolese, New
Caledonia, and New Zealand populations may shed light on this possibility. The possibility of
competitive exclusion of the Winters driver by the Paris driver also exists. Notably, the frequency of
the Paris driver is highest in central Africa and the Indian Ocean islands (JUTIER et al. 2004), where,
based on our coarse global sampling, ancestral copies of the Winters driver are found. Consistent with
the competitive exclusion hypothesis, the intensity of drive is higher in the Paris system than in the
Winters system, ~96% versus ~81% (Montchamp-Moreau et al. 2006; Tao et al. 2007b). Neither driver
appears to be a balanced polymorphism that would limit the spread of the drivers through the
population, so differential intensity of drive would in large part determine the frequency of the drivers
in the population (Thomson and Feldman 1975). Testing the competitive exclusion hypothesis will
require more extensive population sampling of the Winters driver, particularly in Africa and the Indian
Ocean islands, as well as competition experiments between the two drivers in population cages in the
laboratory.
Our analysis indicates that selection is much more recent than the actual origin of the Winters
sex-ratio genes about 850,000 years ago. The date is based on sequence analysis and is consistent with
the species distribution of the genes. All are absent in D. melanogaster but preliminary data indicate
that the genes are present in D. mauritiana (Tao et al. 2007b, Kingan, unpublished data). Moreover, the
D. sechellia Y chromosome is sensitive to drive by Dox (Tao et al. 2007a). An old origin but recent
selection is suggestive of a genetic “arms-race” model for the evolution of drivers and suppressors,
whereby multiple rounds of suppression and distortion occur due to ongoing genetic conflict between
the loci (Frank 1991). In fact, the structure of the driving locus for Winters supports this “arms-race”
model. Dox may have evolved as an enhancer or modifier of an original distorter, most likely MDox,
which had been suppressed by an unknown locus (or an earlier form of Nmy). The most recent
suppressor, Nmy, may then have evolved to suppress the new, compound distorter. This model is
testable by substituting chromosomes with a derived MDox and ancestral Dox into a variety of
autosomal backgrounds. If drive is observed for some genotypes, it would confirm that MDox was
once able to drive alone. In addition, if there is polymorphism in the drive phenotype, one may be able
to map the original suppressor of MDox.
The Winters sex-ratio is not the only trans-specific meiotic drive system: in mice, stalk-eyed
flies, and Drosophila, shared drive systems are found in multiple closely related species (Jaenike
2001). The genomic conflict that results from a single meiotic drive system can have profound effects
on patterns of genomic diversity in multiple species over a period of millions of years. On the
molecular level, these patterns are indistinguishable from those caused by adaptation based on novel
variation. It is only with a detailed understanding of the functional importance of genomic regions that
one can attribute genomic signatures of selection to processes that increase the fitness of individual
organisms.
ACKNOWLEDGMENTS
We thank Yun Tao for generously sharing research materials, fly stocks, and unpublished results as
well as his expertise and insight into this system; also Luciana Araripe, Horacio Montenegro, Kalsang
Namgyal, Erik Dopman, and Nguyen Nguyen for technical assistance, and Noemi Velazguez for
administrative assistance. The Nicewicz family farm kindly gave us access to their farm for fly
collections. We are grateful to John Wakeley and Molly Przeworski for help with the coalescent
modeling. Daven Presgraves and Yun Tao provided thoughtful comments, which greatly improved the
manuscript. This work was supported by NIH grant GM065169 to D.L.H. and an NSF Graduate
Research Fellowship to S.B.K.
REFERENCES
ANDOLFATTO, P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in
Drosophila melanogaster and Drosophila simulans. Mol Biol Evol 18: 279-290.
ANDOLFATTO, P., 2005 Adaptive evolution of non-coding DNA in Drosophila. Nature 437: 1149-1152.
ANDOLFATTO, P., and M. PRZEWORSKI, 2000 A genome-wide departure from the standard neutral
model in natural populations of Drosophila. Genetics 156: 257-268.
ASHBURNER, M., K. G. GOLIC and R. S. HAWLEY, 2005 Drosophila. A Laboratory Handbook. Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
BAUDRY, E., N. DEROME, M. HUET and M. VEUILLE, 2006 Contrasted polymorphism patterns in a large
sample of populations from the evolutionary genetics model Drosophila simulans. Genetics
173: 759-767.
BEGUN, D. J., A. K. HOLLOWAY, K. STEVENS, L. W. HILLIER, Y. P. POH et al., 2007 Population
genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans.
PLoS Biol 5: e310.
BEGUN, D. J., and P. WHITLEY, 2000 Reduced X-linked nucleotide polymorphism in Drosophila
simulans. Proc Natl Acad Sci U S A 97: 5960-5965.
BETRAN, E., J. ROZAS, A. NAVARRO and A. BARBADILLA, 1997 The estimation of the number and the
length distribution of gene conversion tracts from population DNA sequence data. Genetics
146: 89-99.
BODMER, W. F., and A. W. EDWARDS, 1960 Natural selection and the sex ratio. Ann Hum Genet 24:
239-244.
CARVALHO, A. B., M. C. SAMPAIO, F. R. VARANDAS and L. B. KLACZKO, 1998 An experimental
demonstration of Fisher's principle: evolution of sexual proportion by natural selection.
Genetics 148: 719-731.
CHARLESWORTH, B., and D. L. HARTL, 1978 Population Dynamics of the Segregation Distorter
Polymorphism of Drosophila melanogaster. Genetics 89: 171-192.
DEROME, N., E. BAUDRY, D. OGEREAU, M. VEUILLE and C. MONTCHAMP-MOREAU, 2008 Selective
sweeps in a 2-locus model for sex-ratio meiotic drive in Drosophila simulans. Mol Biol Evol
25: 409-416.
DEROME, N., K. METAYER, C. MONTCHAMP-MOREAU and M. VEUILLE, 2004 Signature of selective
sweep associated with the evolution of sex-ratio drive in Drosophila simulans. Genetics 166:
1357-1366.
DYER, K. A., B. CHARLESWORTH and J. JAENIKE, 2007 Chromosome-wide linkage disequilibrium as a
consequence of meiotic drive. Proc Natl Acad Sci U S A 104: 1587-1592.
EDWARDS, A. W. F., 1961 The population genetics of "sex-ratio" in Drosophila pseudoobscura.
Heredity 16: 291-304.
FAY, J. C., and C. I. WU, 2000 Hitchhiking under positive Darwinian selection. Genetics 155: 1405-1413.
FISHER, R. A., 1930 The Genetical Theory of Natural Selection.
FRANK, S. A., 1991 Divergence of meiotic drive-suppression systems as an explanation for sex-biased
hybrid sterility and inviability. Evolution 45: 262-267.
HALLIGAN, D. L., and P. D. KEIGHTLEY, 2006 Ubiquitous selective constraints in the Drosophila
genome revealed by a genome-wide interspecies comparison. Genome Res 16: 875-884.
HAMBLIN, M. T., and M. VEUILLE, 1999 Population structure among African and derived populations
of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:
305-317.
HAMILTON, W. D., 1967 Extraordinary sex ratios. A sex-ratio theory for sex linkage and inbreeding has
new implications in cytogenetics and entomology. Science 156: 477-488.
HAMMER, M. F., F. BLACKMER, D. GARRIGAN, M. W. NACHMAN and J. A. WILDER, 2003 Human
population structure and its effects on sampling Y chromosome sequence variation. Genetics
164: 1495-1509.
HARTL, D. L., 1975 Modifier theory and meiotic drive. Theor Popul Biol 7: 168-174.
HARTL, D. L., and A. G. CLARK, 2007 Principles of Population Genetics. Sinauer Associates, Inc.,
Sunderland, MA.
HEY, J., 2004 HKA, pp.
HILL, W. G., and A. ROBERTSON, 1968 Linkage disequilibrium in finite populations. Theoretical and
Applied Genetics 38: 226-231.
HUDSON, R. R., 1987 Estimating the recombination parameter of a finite population model without
selection. Genet Res 50: 245-250.
HUDSON, R. R., M. KREITMAN and M. AGUADE, 1987 A test of neutral molecular evolution based on
nucleotide data. Genetics 116: 153-159.
HURST, L. D., and A. POMIANKOWSKI, 1991 Causes of sex ratio bias may account for unisexual sterility
in hybrids: a new explanation of Haldane's rule and related phenomena. Genetics 128: 841-858.
JAENIKE, J., 2001 Sex Chromosome Meiotic Drive. Annual Review of Ecology and Systematics 32.
JUTIER, D., N. DEROME and C. MONTCHAMP-MOREAU, 2004 The sex-ratio trait and its evolution in
Drosophila simulans: a comparative approach. Genetica 120: 87-99.
KAPLAN, N. L., R. R. HUDSON and C. H. LANGLEY, 1989 The "hitchhiking effect" revisited. Genetics
123: 887-899.
KELLY, J. K., 1997 A test of neutrality based on interlocus associations. Genetics 146: 1197-1206.
LACHAISE, D., M.-L. CARIOU, J. R. DAVID, F. LEMEUNIER, L. TSACAS et al., 1988 Historical
biogeography of the Drosophila melanogaster species subgroup. Evolutionary Biology 22: 159-225.
LACHAISE, D., and J. F. SILVAIN, 2004 How two Afrotropical endemics made two cosmopolitan human
commensals: the Drosophila melanogaster-D. simulans palaeogeographic riddle. Genetica 120:
17-39.
LI, Y. J., Y. SATTA and N. TAKAHATA, 1999 Paleo-demography of the Drosophila melanogaster
subgroup: application of the maximum likelihood method. Genes Genet Syst 74: 117-127.
LYTTLE, T. W., 1993 Cheaters sometimes prosper: distortion of mendelian segregation by meiotic
drive. Trends Genet 9: 205-210.
MARKOW, T. A., 1996 Evolution of Drosophila mating systems, pp. 73-106 in Evolutionary Biology,
Vol 29.
MCDERMOTT, S. R., and R. M. KLIMAN, 2008 Estimation of isolation times of the island species in the
Drosophila simulans complex from multilocus DNA sequence data. PLoS ONE 3: e2442.
MERCOT, H., A. ATLAN, M. JACQUES and C. MONTCHAMP-MOREAU, 1995 Sex-ratio distortion in
Drosophila simulans: co-occurrence of a meiotic drive and a suppressor of drive. Journal of
Evolutionary Biology 8: 283-300.
MONTCHAMP-MOREAU, C., D. OGEREAU, N. CHAMINADE, A. COLARD and S. AULARD, 2006
Organization of the sex-ratio meiotic drive region in Drosophila simulans. Genetics 174: 1365-1371.
NEI, M., 1987 Molecular Evolutionary Genetics. Columbia University Press, New York.
NIELSEN, R., 2005 Molecular signatures of natural selection. Annu Rev Genet 39: 197-218.
O'REILLY, P. F., E. BIRNEY and D. J. BALDING, 2008 Confounding between recombination and
selection, and the Ped/Pop method for detecting selection. Genome Res 18: 1304-1313.
ORR, H. A., and S. IRVING, 2005 Segregation distortion in hybrids between the Bogota and USA
subspecies of Drosophila pseudoobscura. Genetics 169: 671-682.
PHADNIS, N., and H. A. ORR, 2008 A Single Gene Causes Both Male Sterility and Segregation
Distortion in Drosophila Hybrids. Science.
PRESGRAVES, D. C., 2008 Sex chromosomes and speciation in Drosophila. Trends Genet 24: 336-343.
PRZEWORSKI, M., 2003 Estimating the time since the fixation of a beneficial allele. Genetics 164:
1667-1676.
PTAK, S. E., and M. PRZEWORSKI, 2002 Evidence for population growth in humans is confounded by
fine-scale population structure. Trends Genet 18: 559-563.
ROZAS, J., J. C. SANCHEZ-DELBARRIO, X. MESSEGUER and R. ROZAS, 2003 DnaSP, DNA
polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496-2497.
SANDLER, L., and E. NOVITSKI, 1957 Meiotic drive as an evolutionary force. American Naturalist 91:
105-110.
SHARP, P. M., and W. H. LI, 1989 On the rate of DNA sequence evolution in Drosophila. J Mol Evol
28: 398-402.
STUMPF, M. P., and G. A. MCVEAN, 2003 Estimating recombination rates from population-genetic
data. Nat Rev Genet 4: 959-968.
TAJIMA, F., 1983 Evolutionary relationship of DNA sequences in finite populations. Genetics 105:
437-460.
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.
Genetics 123: 585-595.
TAO, Y., L. ARARIPE, S. B. KINGAN, Y. KE, H. XIAO et al., 2007a A sex-ratio meiotic drive system in
Drosophila simulans. II: an X-linked distorter. PLoS Biol 5: e293.
TAO, Y., D. L. HARTL and C. C. LAURIE, 2001 Sex-ratio segregation distortion associated with
reproductive isolation in Drosophila. Proc Natl Acad Sci U S A 98: 13183-13188.
TAO, Y., J. P. MASLY, L. ARARIPE, Y. KE and D. L. HARTL, 2007b A sex-ratio meiotic drive system in
Drosophila simulans. I: an autosomal suppressor. PLoS Biol 5: e292.
TAVARE, S., D. J. BALDING, R. C. GRIFFITHS and P. DONNELLY, 1997 Inferring coalescence times from
DNA sequence data. Genetics 145: 505-518.
THOMSON, G. J., and M. W. FELDMAN, 1975 Population genetics of modifiers of meiotic drive: IV. On
the evolution of sex-ratio distortion. Theor Popul Biol 8: 202-211.
VAZ, S. C., and A. B. CARVALHO, 2004 Evolution of autosomal suppression of the sex-ratio trait in
Drosophila. Genetics 166: 265-277.
WAKELEY, J., 2009 Coalescent Theory: An Introduction. Roberts and Company Publishers,
Greenwood Village, CO.
WALL, J. D., P. ANDOLFATTO and M. PRZEWORSKI, 2002 Testing models of selection and demography
in Drosophila simulans. Genetics 162: 203-216.
WATTERSON, G. A., 1975 On the number of segregating sites in genetical models without
recombination. Theor Popul Biol 7: 256-276.
ZIMMERING, S., L. SANDLER and B. NICOLETTI, 1970 Mechanisms of meiotic drive. Annu Rev Genet 4:
409-436.
TABLE 1
Population genetic summary statistics

Gene  Group       Sample     n (nanc)  L     S    π        θW       h      ZnS      TD        FWH
Dox   All Data    All Data   71        2342  155  0.00509  0.01396  6***   0.52**   -2.19***  -25.95
Dox   Population  Nicewicz   12 (0)    5521  19   0.00057  0.00114  4*     0.80**   -2.17**   --
Dox   Population  Tremont    34 (0)    5956  12   0.00045  0.00049  7      0.35     -0.29     0.06
Dox   Population  Winters    12 (0)    6061  8    0.0004   0.00044  3      0.76*    -0.34     --
Dox   Population  Africa     5 (2)     2343  116  0.02945  0.02376  3*     0.97**   1.82*     -1.20
Dox   Allele      Derived    67        5511  22   0.00044  0.00084  8*     0.31     -1.48*    0.03
Dox   Allele      Ancestral  4         2388  84   0.01982  0.01919  4      0.5      0.35      2.33
MDox  All Data    All Data   69        2788  118  0.0023   0.00918  10***  0.32*    -2.58***  -24.57
MDox  Population  Nicewicz   12 (0)    4401  12   0.00045  0.0009   3*     0.83*    -2.09***  0.15
MDox  Population  Tremont    33 (0)    4507  9    0.00017  0.00049  5      0.25     -1.98**   0.18
MDox  Population  Winters    12 (0)    4508  1    0.00004  0.00007  2      --       -1.14*    0.15
MDox  Population  Africa     5 (3)     2788  103  0.01772  0.01859  4      0.37     -0.36     6.40
MDox  Allele      Derived    65        4400  18   0.00018  0.00086  8      0.26     -2.40***  0.15
MDox  Allele      Ancestral  4         2815  92   0.01812  0.0186   4      0.41     -0.27     4.33
Nmy   All Data    All Data   115       5335  155  0.0009   0.00553  11***  0.40*    -2.76***  -91.63***
Nmy   Population  Nicewicz   24 (0)    7461  0    0        0        1      --       --        --
Nmy   Population  Tremont    66 (1)    5403  60   0.00034  0.00233  7***   0.84***  -2.88***  -36.19**
Nmy   Population  Winters    12 (1)    5385  121  0.00374  0.00744  2***   1.00***  -2.32**   -72.88***
Nmy   Population  Africa     5 (0)     7311  60   0.00372  0.00359  4***   0.47     0.315     -3.50
Nmy   Allele      Derived    113       7310  67   0.00028  0.00179  11***  0.41**   -2.69***  -29.72**
Nmy   Allele      Ancestral  2         5402  66   0.01222  0.01222  2      1.00     --

n is the number of chromosomes sampled. nanc is the number of ancestral alleles present in each population sample. L is the total
number of sites analyzed, excluding alignment gaps. S is the number of segregating sites. h is the number of haplotypes. π is the
average number of pairwise differences (Nei 1987). θW is Watterson's estimator of population diversity (Watterson 1975). ZnS is the
average pairwise R2 (Kelly 1997). TD is Tajima's D (Tajima 1989). FWH is Fay and Wu's H (Fay and Wu 2000). * indicates P < 0.05, **
indicates P < 0.01, *** indicates P < 0.001.
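
As a rough check on how the θW column relates to S, n, and L, the following Python sketch computes Watterson's per-site estimator, θW = S / (a_n L), where a_n is the sum of 1/i for i from 1 to n - 1. The function name is illustrative and is not taken from the authors' analysis pipeline; the inputs are the Dox values for the Nicewicz population in Table 1 (S = 19, n = 12, L = 5521).

```python
def watterson_theta_per_site(num_segregating, num_chromosomes, num_sites):
    """Watterson's (1975) estimator per site: S / (a_n * L)."""
    if num_chromosomes < 2 or num_sites <= 0:
        raise ValueError("need at least 2 chromosomes and a positive site count")
    # a_n is the (n - 1)th harmonic number.
    a_n = sum(1.0 / i for i in range(1, num_chromosomes))
    return num_segregating / (a_n * num_sites)


# Dox, Nicewicz population (values read from Table 1).
theta_w = watterson_theta_per_site(num_segregating=19, num_chromosomes=12, num_sites=5521)
print(f"{theta_w:.5f}")
```

The result rounds to the tabulated value of 0.00114 for this sample.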
TABLE 2
Prior distributions of parameters for selection model

Parameter  Gene  Prior               Mean          95% density
μ          Dox   Gam(10^-10, 12)     1.2 x 10^-9   6.3 x 10^-10 to 2.0 x 10^-9
μ          MDox  Gam(10^-10, 12)     1.2 x 10^-9   6.3 x 10^-10 to 2.0 x 10^-9
μ          Nmy   Gam(10^-10, 10)     1.0 x 10^-9   4.8 x 10^-10 to 1.7 x 10^-9
r          Dox   Gam(10^-10, 48)     4.8 x 10^-9   3.5 x 10^-9 to 6.3 x 10^-9
r          MDox  Gam(10^-10, 48)     4.8 x 10^-9   3.5 x 10^-9 to 6.3 x 10^-9
r          Nmy   Gam(10^-10, 82)     8.2 x 10^-9   6.5 x 10^-9 to 10 x 10^-9
N          Dox   Gam(4 x 10^4, 25)   1.0 x 10^6    6.5 x 10^5 to 1.4 x 10^6
N          MDox  Gam(4 x 10^4, 25)   1.0 x 10^6    6.5 x 10^5 to 1.4 x 10^6
N          Nmy   Gam(1 x 10^5, 25)   2.5 x 10^6    1.6 x 10^6 to 3.7 x 10^6
s          Dox   U(5 x 10^-4, 0.5)   --            --
s          MDox  U(5 x 10^-4, 0.5)   --            --
s          Nmy   U(5 x 10^-4, 0.5)   --            --
θ          Dox   --                  30            4.9 to 110
θ          MDox  --                  22            3.6 to 80
θ          Nmy   --                  75            9.7 to 320
ρ          Dox   --                  120           35 to 330
ρ          MDox  --                  87            25 to 240
ρ          Nmy   --                  610           160 to 1540

μ is the per site mutation rate. r is the per site recombination rate. N is the effective population size. s is the selection coefficient. θ is the per locus population
mutation parameter, ρ is the per locus population recombination parameter.
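
The tabulated prior means are consistent with reading Gam(a, b) as a gamma distribution with scale a and shape b, whose mean is a x b (for example, 10^-10 x 12 = 1.2 x 10^-9 for the Dox mutation rate). That reading is inferred from the numbers and is not stated in the table. The Python sketch below, under that assumption, returns the analytic mean and a Monte Carlo estimate of an equal-tailed 95% interval; treating the "95% density" column as an equal-tailed interval is also an assumption, and the function name is illustrative.

```python
import random


def gamma_prior_summary(scale, shape, draws=100_000, seed=1):
    """Return (mean, lower, upper) for a gamma prior.

    The mean is analytic (scale * shape). The bounds are the 2.5% and 97.5%
    sample quantiles of Monte Carlo draws, so they are approximate.
    """
    rng = random.Random(seed)
    # random.gammavariate takes the shape first, then the scale.
    sample = sorted(rng.gammavariate(shape, scale) for _ in range(draws))
    lower = sample[int(0.025 * draws)]
    upper = sample[int(0.975 * draws)]
    return scale * shape, lower, upper


# Prior on the per site mutation rate for Dox: scale 1e-10, shape 12.
mean, lower, upper = gamma_prior_summary(scale=1e-10, shape=12)
print(f"mean = {mean:.2g}; approximate 95% interval = {lower:.2g} to {upper:.2g}")
```

Swapping in scale 4e4 and shape 25 gives the analytic mean of 1.0 x 10^6 listed for the effective population size prior at Dox and MDox.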
TABLE 3
HKA Tests

Gene      Population  Chromosome  L     n   S    Div     χ2       P-value

Winters SR Data
Dox       Nicewicz    X           5521  12  19   0.0567  52.51    <0.0001
Dox       Tremont     X           5956  34  12   0.0567  93.27    <0.0001
Dox       Winters     X           6061  12  8    0.0567  72.17    <0.0001
MDox      Nicewicz    X           4401  12  12   0.0611  41.22^b  <0.0001
MDox      Tremont     X           4507  33  9    0.0611  59.31    <0.0001
MDox      Winters     X           4508  12  1    0.0611  49.72    <0.0001
Nmy       Nicewicz    3R          7461  24  0    0.0516  77.34    <0.0001
Nmy       Tremont     3R          7460  65  0    0.0516  94.52    <0.0001
Nmy       Winters     3R          7461  11  1    0.0516  59.92    <0.0001
NmyAll^a  Tremont     3R          5403  66  60   0.0513  42.85    <0.0001
NmyAll    Winters     3R          5385  12  121  0.0511  20.28    0.0903

Begun and Whitley (2000) Data
bnb       Winters     X           1015  8   11   0.0197  17.99    0.0764
mei-218   Winters     X           1187  8   14   0.0687
ovo       Winters     X           1356  8   9    0.0270
sn        Winters     X           1437  8   28   0.0370
sog       Winters     X           1233  8   8    0.0251
X         Winters     X           1425  8   24   0.0281
yp3       Winters     X           1227  8   8    0.0473
AP-50     Winters     3R          1398  8   58   0.0293
fzo       Winters     3R          1360  8   22   0.0708
hyd       Winters     3R          1786  8   26   0.0208
Osbp      Winters     3R          1166  8   31   0.0266
ry        Winters     3R          1362  8   54   0.0419
T-cp1     Winters     3R          1201  8   9    0.0325

L is the number of bases sequenced in D. simulans. n is the number of D. simulans chromosomes sampled. S is the number of
segregating sites. Div is the per-base divergence from D. melanogaster. χ2 and P values correspond to multi-locus HKA tests on 13
loci previously sequenced in D. simulans (bottom) and when data from single Winters SR genes for each North American population
were added to the 13 loci (top). See text for details. ^a Ancestral alleles were not excluded from the analysis (Nicewicz has no
ancestral alleles in sample).
TABLE 4
Posterior distributions of parameters for selection model

Population  Gene  Statistic  T (N gen)      T (years)   θ       ρ        s
Nicewicz    Dox   median     0.0348         3,500       28      120      0.063
Nicewicz    Dox   95% CI     0.0064-0.112   610-10,000  15-49   70-180   0.0019-0.46
Nicewicz    MDox  median     0.0308         2,900       20      86       0.063
Nicewicz    MDox  95% CI     0.0037-0.112   330-11,000  11-35   52-130   0.0015-0.46
Tremont     Dox   median     0.0312         2,900       21      110      0.1
Tremont     Dox   95% CI     0.0072-0.156   760-12,000  9.6-43  63-170   0.0023-0.48
Tremont     MDox  median     0.0304         2,800       17      83       0.055
Tremont     MDox  95% CI     0.0040-0.104   360-9,700   8.9-32  50-130   0.0017-0.47
Tremont     Nmy   median     0.0068         1,600       61      560      0.28
Tremont     Nmy   95% CI     0.0020-0.0212  550-4,500   29-130  340-850  0.014-0.49
Winters     Dox   median     0.034          3,200       22      110      0.059
Winters     Dox   95% CI     0.0035-0.136   300-12,000  11-42   68-180   0.0021-0.48
Winters     MDox  median     0.0328         3,100       18      81       0.26
Winters     MDox  95% CI     0.0040-0.148   400-14,000  8.5-36  49-133   0.021-0.48
Winters     Nmy   median     0.0164         3,800       59      570      0.27
Winters     Nmy   95% CI     0.0032-0.064   790-14,000  26-120  340-890  0.023-0.49

T (N gen) is the time since selection in coalescent time units. T (years) is the time since selection in years. θ is the per-locus
population mutation rate. ρ is the per-locus population recombination rate. s is the selection coefficient. 95% CI is the 95% credible
interval.
SUPPLEMENTARY TABLE 1
Geographic Distribution of Ancestral and Derived Alleles

North American Populations
Number of ancestral alleles (total sampled)

Geographic Origin  Population Name  Dox     MDox    Nmy
Bolton, MA         Nicewicz         0 (12)  0 (12)  0 (24)
Cambridge, MA      Tremont          0 (34)  0 (33)  1 (66)
Winters, CA        Winters          0 (12)  0 (12)  1 (12)

Global Panel

Geographic Origin       Stock ID        Dox Allele  MDox Allele  Nmy Allele
Madagascar              14021.0251.196  ancestral   ancestral    derived
Madagascar              14021.0251.197  ancestral   ancestral    derived
Kenya                   14021.0251.199  derived     ancestral    derived
Congo                   14021.0251.184  derived     derived      derived
South Africa            14021.0251.169  derived     derived      derived
California              14021.0251.194  derived     derived      derived
North America, unknown  14021.0251.195  derived     derived      derived
Scotland                14021.0251.216  derived     derived      derived
Greece                  14021.0251.181  derived     derived      derived
New Guinea              14021.0251.009  derived     derived      derived
New Zealand             14021.0251.007  ancestral   ancestral    derived
Australia               14021.0251.176  derived     derived      derived
New Caledonia           14021.0251.198  ancestral   derived      derived
SUPPLEMENTARY TABLE 2
Begun and Whitley 2000 loci Complete sequence. Winters, CA populations

Locus         Chromosome  n  L     S   h  π       θW      TD       ρ
Aats-gluprop  3R          6  1348  32  6  0.0101  0.0110  -0.5508  0.0367
AP-50         3R          8  1398  58  7  0.0161  0.0166  -0.1578  0.0360
Cen190        3R          7  1287  23  4  0.0091  0.0073  1.4001   0.0093
fzo           3R          8  1362  22  5  0.0068  0.0062  0.4938   0.0041
Hsc70         3R          7  1292  10  6  0.0031  0.0032  -0.1073  0.0271
hyd           3R          8  1793  26  7  0.0057  0.0056  0.0794   0.0000
miranda       3R          5  1200  29  5  0.0113  0.0120  -0.4673  0.8265
nos           3R          7  1073  20  7  0.0074  0.0078  -0.2433  0.0096
Osbp          3R          8  1167  31  7  0.0101  0.0103  -0.0914  0.1235
oso           3R          7  971   24  5  0.0107  0.0101  0.3380   0.0092
pit           3R          7  1267  52  5  0.0196  0.0179  0.5476   0.0333
ry            3R          8  1362  54  7  0.0163  0.0164  -0.0374  0.0088
T-cp1         3R          8  1201  9   6  0.0028  0.0029  -0.1608  0.0257
tld           3R          7  1013  40  7  0.0189  0.0169  0.6559   0.0305
MEAN                                      0.0106  0.0103  0.1213   0.0843
bnb           X           8  1030  11  6  0.0035  0.0042  -0.8319  0.0000
ct            X           6  1090  2   3  0.0008  0.0008  -0.0500  0.0298
dec-1         X           7  1493  23  5  0.0068  0.0063  0.3685   0.0163
garnet        X           7  1265  17  3  0.0039  0.0055  -1.6704  0.0000
mei-218       X           8  1230  14  4  0.0058  0.0046  1.4083   0.0111
mei-9         X           7  890   6   4  0.0029  0.0028  0.2540   0.0000
otu           X           6  1162  29  5  0.0125  0.0109  0.9132   0.0220
ovo           X           8  1359  9   6  0.0029  0.0026  0.5952   0.0174
pgd           X           7  912   17  3  0.0091  0.0076  1.0809   0.0024
r             X           6  1198  9   4  0.0032  0.0033  -0.1132  0.0451
sn            X           8  1450  28  5  0.0088  0.0078  0.6903   0.0109
sog           X           8  1233  8   5  0.0021  0.0025  -0.8615  0.1266
sqh           X           7  777   10  3  0.0055  0.0053  0.1431   0.0022
X             X           8  1425  24  5  0.0082  0.0065  1.3544   0.0071
yp3           X           8  1241  8   4  0.0027  0.0025  0.2580   0.0011
MEAN                                      0.0052  0.0049  0.2359   0.0195

n is the number of chromosomes sampled. L is the length of sequence in base pairs. S is the number of segregating sites. h is the number of
haplotypes. π is the per site average number of pairwise differences (Nei 1987). θW is the per site Watterson's estimator of population diversity
(Watterson 1975). TD is Tajima's D (Tajima 1989). ρ is the per site population recombination rate (Hudson 1987).
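
To show how the TD column follows from n, S, and π, here is a minimal Python sketch of Tajima's D using the standard variance constants from Tajima (1989). The function name is illustrative, and the per-locus number of pairwise differences is recovered as π x L from the rounded per site value, so agreement with the table is only approximate. The inputs are the values listed for Cen190 (n = 7, L = 1287, S = 23, π = 0.0091).

```python
import math


def tajimas_d(num_chromosomes, num_segregating, mean_pairwise_diffs):
    """Tajima's D from n, S, and the per-locus mean pairwise difference."""
    n, s = num_chromosomes, num_segregating
    if n < 4 or s == 0:
        raise ValueError("D is not defined for n < 4 or S = 0 in this sketch")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: difference between the two estimators of theta (per locus).
    numerator = mean_pairwise_diffs - s / a1
    return numerator / math.sqrt(e1 * s + e2 * s * (s - 1))


# Cen190: convert per site pi to a per-locus value before calling.
d_cen190 = tajimas_d(num_chromosomes=7, num_segregating=23,
                     mean_pairwise_diffs=0.0091 * 1287)
print(f"{d_cen190:.2f}")
```

The result is close to the tabulated 1.4001; the small difference reflects rounding of π to four decimal places.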
SUPPLEMENTARY TABLE 3

   genotype          number of males tested  mean k (s.d.)
1  Dox; nmy          18                      0.96 (0.038)
2  dox[del105]; nmy  18                      0.53 (0.050)
3  dox[del150]; nmy  19                      0.56 (0.061)
4  dox[del150]; nmy  19                      0.52 (0.032)
5  dox[del150]; nmy  17                      0.53 (0.057)

k is the proportion of female progeny.
FIGURE 1.
Sequenced regions of the Winters sex-ratio genes.
Chromosomal location of the distorter locus and suppressor locus are shown at the top. The two genes
of the distorter are Distorter on the X (Dox) and Mother of Dox (MDox). The suppressor gene is called
Not Much Yang (Nmy). Dox and MDox are separated by ~70 kb of DNA sequence on the X
chromosome. Triangles indicate the location of the PCR primers used. Arrows indicate direction of
transcription of the genes (Tao et al. 2007a; Tao et al. 2007b).
FIGURE 2.
Predicted exon structure of the loss-of-function mutants at Dox.
The allele name is followed by the frequency of the mutant in the total pooled sample. Exons are
illustrated as grey boxes, deletions are shown as dashed lines.
FIGURE 3.
Pairwise HKA tests indicate selection acting on the derived form of the Winters SR genes.
Per-site level of polymorphism is shown above the x-axis (θW) and per-site average divergence in
flanking sequence (DXY) between D. simulans and D. melanogaster is shown below the x-axis.
Derived alleles are shown in dark grey, ancestral alleles are shown in light grey.
FIGURE 4.
Posterior distributions of the time since selection (in years) for a hitchhiking model.
SUPPLEMENTARY FIGURE 1.
Crossing scheme used to test for sex-ratio distortion of different Dox alleles.
In generation 1, X chromosomes of known Dox genotype were extracted from the Tremont lines.
These were substituted into an isogenic background homozygous for a non-functional suppressor
allele, “nmy[sim1247].” Males were mated to tester females (w; e) and the sex ratios of their progeny
were determined.
SUPPLEMENTARY FIGURE 2.
Posterior distributions of the time since selection (in units of N generations) for a hitchhiking
model.