Supplemental Material

SUPPLEMENTARY MATERIAL for the Note "On the distribution of temporal variations in allelic frequency: consequences for the estimation of effective population size and the detection of loci undergoing selection" by Isabelle Goldringer and Thomas Bataillon

Exploring the effect of uncertainty in the estimation of the effective size, Ne, of the genome on the expected distribution of Fc at individual loci

In our Note we outline a procedure to test whether individual loci drift faster (or slower) than the remaining loci of the genome. We do so by simulating the expected distribution of Fc values at each locus, using the remaining loci to estimate a mean effective size, Ne, throughout the genome. Simulating these distributions requires knowledge of Ne and of the initial frequencies of the alleles at a given "focal" locus. We acknowledge that there is some sampling variance around the estimated initial allelic frequencies of each locus, but given the sample sizes typically used (~100) it is unlikely to be very large. Here, we focus instead on the sampling variance around the estimate of Ne obtained from the remaining loci of the study. The uncertainty around Ne could potentially affect the null distribution and, in turn, the p-values of our test.

As an example, we use the pattern of temporal variation in allelic frequency detected at locus ba242-C in a wheat population undergoing natural selection. This locus exhibited a relatively high value of Fc, which could suggest the presence of selection. More information on this dataset can be found in: Goldringer, I., J. Enjalbert, A.-L. Raquin, P. Brabant 2001 Strong selection in wheat populations during ten generations of dynamic management. Genet. Sel. Evol. 33 (Suppl 1): 441-463.

Below we generate several Fc distributions expected for locus ba242-C under the null hypothesis of homogeneous drift throughout the genome. Each distribution corresponds to a different level of uncertainty about Ne.
We explore the robustness of the p-values to such uncertainty. Null distributions for Fc were obtained by simulating a Wright-Fisher population (see our Note for details), and each distribution was based on 50,000 independent replicates.

We used the following settings for our simulations:

- Sample sizes at generations 1 and 10: S1 = 84 and S10 = 107 individuals, respectively
- Initial allelic frequencies: 0.19 and 0.81
- Number of generations between samples: T = 10
- Mean effective size of the remaining loci of the genome: Ne = 144

We first generated the Fc distribution for locus ba242-C assuming that Ne was known without error (as we do in our Note; see scenario 1 in Suppl. Table 1 below). We then explored the effect of incorporating error around Ne by randomly drawing Ne from a Gaussian distribution with mean 144 and different variances (see scenarios 2, 3, and 4 in Suppl. Table 1 below), thus generating null distributions for Fc at locus ba242-C that account for increasing uncertainty around the value of Ne estimated from the remaining loci.

Suppl. Table 1

Scenario   Figure (1)   Std dev Ne (2)   Mean Fc (3)   Median Fc (3)   p-value (4)
1          S1            0               0.101         0.020           0.028
2          S2           10               0.121         0.020           0.029
3          S3           20               0.282         0.020           0.030
4          S4           30               0.302         0.020           0.031

(1) Corresponding figure.
(2) Standard deviation of the Gaussian distribution describing the uncertainty around Ne. Each simulation of an Fc value started by drawing a Gaussian deviate from that distribution.
(3) Mean and median of the distribution of Fc values (for each scenario the distribution was obtained using 50,000 independent replicates).
(4) p-value associated with locus ba242-C. A p-value was computed for each scenario from the corresponding distribution as p-value = #(simulations with Fc > 0.19)/50000.
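The simulation procedure described above can be sketched in Python as follows. This is a minimal illustration, not the code used for the Note, and it rests on several assumptions that the text does not state: Fc is taken to be the Nei and Tajima estimator, (x - y)^2 / ((x + y)/2 - xy) averaged over alleles; individuals are treated as diploid, so that Ne individuals correspond to 2Ne gene copies; the observed initial frequency is used as the true population frequency at the first time point; and drawn Ne values are rounded and floored at 1. The function name `simulate_fc` and its parameters are hypothetical.

```python
import numpy as np


def simulate_fc(p0, ne_mean, ne_sd, t, s_first, s_last, reps, rng):
    """Simulate `reps` Fc values for one biallelic locus under pure drift."""
    # One Ne per replicate; ne_sd = 0 is the "Ne known without error" case.
    ne = rng.normal(ne_mean, ne_sd, size=reps)
    # Assumption: diploid population, so 2 * Ne gene copies (at least 2).
    n_genes = 2 * np.maximum(np.rint(ne), 1).astype(np.int64)

    # Sample of s_first individuals taken at the first time point.
    x = rng.binomial(2 * s_first, p0, size=reps) / (2 * s_first)

    # Wright-Fisher drift: binomial resampling of gene copies for t generations.
    p = np.full(reps, p0, dtype=float)
    for _ in range(t):
        p = rng.binomial(n_genes, p) / n_genes

    # Sample of s_last individuals taken at the last time point.
    y = rng.binomial(2 * s_last, p) / (2 * s_last)

    # With two alleles both per-allele terms of Fc are identical,
    # so their average equals the single term computed here.
    num = (x - y) ** 2
    denom = (x + y) / 2 - x * y
    # denom is zero only when the allele is absent (or fixed) in both
    # samples; Fc is set to 0 in that case (assumption).
    return np.divide(num, denom, out=np.zeros(reps), where=denom > 0)


def main():
    rng = np.random.default_rng(1)
    for ne_sd in (0, 10, 20, 30):
        fc = simulate_fc(0.19, 144, ne_sd, 10, 84, 107, 50_000, rng)
        p_value = np.mean(fc > 0.19)  # threshold taken from footnote 4
        print(f"sd(Ne)={ne_sd:2d}  mean Fc={fc.mean():.3f}  "
              f"median Fc={np.median(fc):.3f}  p-value={p_value:.3f}")


if __name__ == "__main__":
    main()
```

Because of the assumptions listed above, this sketch is not expected to reproduce the values in Suppl. Table 1 exactly; it only shows how drawing Ne anew for each replicate folds the uncertainty about Ne into the null distribution of Fc.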
