pairs mutation

Supplementary Information for: “Two waves of diversification in mammals and reptiles of Baja California revealed by hierarchical Bayesian analysis”

Contents:
S1: Hierarchical approximate Bayesian computation
S2: Hierarchical population divergence model
S3: Summary statistic vector
S4: References
Figure S1: Multiple population-pair divergence model
Table S1: Parameters and their prior distributions

S1: Hierarchical Approximate Bayesian Computation

The hierarchical model employed in our ABC test for simultaneous divergence across Y taxon-pairs consists of sub-parameters (Φ; within population-pair parameters) that are conditional on “hyper-parameters” (φ) that describe the variability of Φ among the Y population-pairs. For example, divergence times (τ) can vary across a set of population pairs conditional on the set of hyper-parameters (φ) that varies according to their hyper-prior distribution. Instead of explicitly calculating the likelihood expression P(Data | φ, Φ) to get a posterior distribution, we sample from the posterior distribution P((φ, Φ) | Data) by simulating the data K times under the coalescent model using candidate parameters drawn from the prior distribution P(φ, Φ). A summary statistic vector D for each simulated dataset is then compared to the observed summary statistic vector in order to generate random observations from the joint posterior distribution f(φ_i, Φ_i | D_i) by way of a rejection/acceptance algorithm (Weiss and von Haeseler 1998) followed by an optional weighted local regression step (Beaumont et al. 2002). Loosely speaking, hyper-parameter values are accepted and used to construct the posterior distribution with probabilities proportional to the similarity between the summary statistic vector from the observed data and the summary statistic vector calculated from simulated data.
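The rejection/acceptance step described in S1 can be illustrated with a minimal sketch. This is not the coalescent simulator or the software used in the study: the toy model (a normal sample with unknown mean), the function names, the Euclidean distance, and the fixed tolerance are all assumptions chosen for illustration, and the optional weighted local regression step is omitted.

```python
import math
import random


def abc_rejection(observed, draw_prior, simulate, summarize, k, tolerance, rng):
    """Keep the prior draws whose simulated summary vector lies within
    `tolerance` (Euclidean distance) of the observed summary vector."""
    accepted = []
    for _ in range(k):
        params = draw_prior(rng)                    # candidate from the prior
        stats = summarize(simulate(params, rng))    # summary vector D of simulated data
        distance = math.sqrt(sum((s - o) ** 2 for s, o in zip(stats, observed)))
        if distance <= tolerance:
            accepted.append(params)
    return accepted


# Hypothetical toy model standing in for the coalescent simulation:
# the parameter is a mean, the data are 50 normal draws around it.
def draw_prior(rng):
    return rng.uniform(0.0, 10.0)


def simulate(mean, rng):
    return [rng.gauss(mean, 1.0) for _ in range(50)]


def summarize(data):
    return (sum(data) / len(data),)                 # one-element summary vector


rng = random.Random(1)
observed = (4.0,)                                   # assumed observed summary vector
accepted = abc_rejection(observed, draw_prior, simulate, summarize,
                         k=5000, tolerance=0.2, rng=rng)
```

The accepted values approximate a sample from the posterior of the toy parameter; a smaller tolerance gives a closer approximation at the cost of fewer accepted draws.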
S2: Hierarchical Population Divergence Model

The hierarchical model consists of ancestral populations that split at divergence times T_Y = {τ_1, …, τ_Y} in the past (Supplementary Figure 1). The hyper-parameter set φ quantifies the degree of variability in these Y divergence times across the Y ancestral populations and their Y descendent population pairs: (1) Ψ, the number of possible divergence times (1 ≤ Ψ ≤ Y); (2) E(τ), the mean divergence time; and (3) Ω, the ratio of the variance to the mean in these Y divergence times, Var(τ)/E(τ). The sub-parameters for the i-th population-pair (Φ_i) are allowed to vary independently across the Y population pairs. The sub-parameters Φ consist of each of the Y taxon-pairs' divergence times and demographic parameters drawn from sub-priors (Supplementary Table 1).

Each pair of daughter populations a and b is descended from an ancestral population at a divergence time τ. Population mutation parameters (θ = 2Nμ; N is the female effective population size and μ is the per gene per generation mutation rate) for daughter populations a and b are θ_a and θ_b, whereas θ′_a and θ′_b are the population mutation parameters for the daughter populations a and b from the time of divergence until the end of the bottleneck. For each of the Y taxon-pairs, θ_a + θ_b = θ. The daughter populations θ′_a and θ′_b then grow exponentially to sizes θ_a and θ_b. The population mutation parameter for each ancestral population is depicted as θ_A.

Each divergence time parameter τ is scaled by θ_AVE, where θ_AVE is a constant determined by the mean of the sub-prior for θ (Supplementary Table 1). The uniform prior for θ spans all of the empirical estimates of θ from a comparative phylogeographic dataset using either Tajima's (1983) or Watterson's (1975) estimator of θ. In mammals, the maximum bound of the sub-prior for θ was θ_max = 50.0 whereas in squamate reptiles it was θ_max = 200.0.
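As a rough illustration of how one draw of the hyper-parameters could be generated, the sketch below follows the priors listed in Supplementary Table 1: Ψ from a discrete uniform on (1, Y), Ψ candidate times from a uniform on (0, τ_max), and each τ_i drawn with replacement from those candidates. The function name, the choice of Y = 4, and the use of the population variance for Var(τ) are assumptions made for this example, not details taken from the study.

```python
import random
import statistics


def draw_divergence_hyperparameters(y, tau_max, rng):
    """Draw one set of divergence times and the derived hyper-parameters.

    Returns (psi, taus, mean_tau, omega) where omega = Var(tau) / E(tau).
    """
    psi = rng.randint(1, y)                                   # Psi ~ discrete uniform(1, Y)
    candidates = [rng.uniform(0.0, tau_max) for _ in range(psi)]  # T = {t_1, ..., t_Psi}
    # Each taxon-pair picks one candidate time, with replacement, so the
    # number of distinct times actually realized can be smaller than psi.
    taus = [rng.choice(candidates) for _ in range(y)]
    mean_tau = statistics.mean(taus)                          # E(tau)
    var_tau = statistics.pvariance(taus)                      # assumption: population variance
    omega = var_tau / mean_tau                                # Omega = Var(tau) / E(tau)
    return psi, taus, mean_tau, omega


rng = random.Random(7)
# Assumed example: Y = 4 taxon-pairs, tau_max = 10.0 as in Supplementary Table 1.
psi, taus, mean_tau, omega = draw_divergence_hyperparameters(4, 10.0, rng)
```

When every taxon-pair shares a single divergence time, Var(τ) is zero and so is Ω, which is why Ω near zero is read as support for simultaneous divergence.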
S3: Summary Statistic Vector

The summary statistic vector D we employ consists of up to six summary statistics collected from each of the Y population pairs (π, θ_W, Var(π − θ_W), π_net, π_b, and π_w). This includes π, the average number of pairwise differences among all sequences within each population pair; θ_W, the number of segregating sites within each population pair normalized for sample size (Watterson 1975); Var(π − θ_W) in each population pair; and π_net, Nei and Li's net nucleotide divergence between each pair of populations (Nei and Li 1979). This last summary statistic is the difference (π_b − π_w), where π_b is the average pairwise differences between each population pair and π_w is the average pairwise differences within a sister pair of descendent populations.

The vector D is made up of a two-dimensional array where the number of columns corresponds to the classes of summary statistics and the number of rows corresponds to the number of taxon-pairs (Y) per comparative phylogeographic dataset. We use up to four classes of summary statistics including π, π_net, θ_W, and Var(π − θ_W). Given these four classes of summary statistics collected per taxon pair and Y taxon pairs, the summary statistic vector

\[
D = \begin{pmatrix}
(\pi_{net})_1 & \pi_1 & (\theta_W)_1 & \mathrm{Var}(\pi - \theta_W)_1 \\
\vdots & \vdots & \vdots & \vdots \\
(\pi_{net})_Y & \pi_Y & (\theta_W)_Y & \mathrm{Var}(\pi - \theta_W)_Y
\end{pmatrix}
\]

would include 4Y summary statistics. For each data set of Y taxon-pairs, rows 1 through Y within each column of D are ordered by ascending values of net divergence (π_net).

S4: References

Beaumont, M. A., Zhang, W. & Balding, D. J. 2002 Approximate Bayesian computation in population genetics. Genetics 162, 2025-2035.
Hickerson, M. J., Dolman, G. & Moritz, C. 2006 Comparative phylogeographic summary statistics for testing simultaneous vicariance across taxon-pairs. Mol Ecol 15, 209-224.
Nei, M. & Li, W. 1979 Mathematical model for studying variation in terms of restriction endonucleases. Proc Natl Acad Sci USA 76, 5269-5273.
Weiss, G. & von Haeseler, A. 1998 Inference of population history using a likelihood approach. Genetics 149, 1539-1546.

Supplementary Figure 1. Depiction of the multiple population-pair divergence model used for the ABC estimates of Ψ, E(τ), and Ω. (A): The white lines depict a gene tree with TMRCA being the time to the gene sample's most recent common ancestor, and the black tree containing the gene tree is the population/species tree. (B): Parameters in the multiple population-pair divergence model. The population mutation parameter, θ, is 2Nμ where 2N is the summed haploid effective female population size of each pair of daughter populations (μ is the per gene per generation mutation rate). The time since isolation of each population pair is denoted by τ (in units of 2N_AVE generations, where N_AVE is the parametric expectation of N across Y population pairs given the prior distribution). Population mutation parameters for daughter populations a and b are θ_a and θ_b, whereas θ′_a and θ′_b are the population mutation parameters for the sizes of daughter populations a and b from the time of divergence until the end of the bottleneck. The daughter populations θ′_a and θ′_b then grow exponentially to sizes θ_a and θ_b. The population mutation parameter for each ancestral population is depicted as θ_A. The migration rate between each pair of daughter populations is depicted as M (number of effective migrants per generation). (C): Example of four population-pairs where parameters in (B) are drawn from uniform priors.

Supplementary Table 1. Parameters and their prior distributions. Hyper-Parameters (φ) are randomly drawn once per Y taxon-pairs. Sub-taxon Parameters (Φ) are randomly drawn once per ith taxon-pair. The per generation per gene DNA mutation rate (μ) is uniform across all taxa.
Hyper-Parameters ()   Description Prior Distribution Per gene per generation mutation rate Assumed to be uniform across taxon-pairs Number of possible divergence times across Y Discrete uniform (1, Y) taxon-pairs Matrix of  possible divergence times (t) among Y T = {t1, …, t} Each t within T drawn from uniform (0,max) taxon-pairs. TY = {1, …, Y} E() Matrix of Y divergence times among Y Each  within TY randomly drawn with taxon-pairs. replacement from T matrix The mean  across Y taxon-pairs calculated from 1, …, Y taxon-pairs. Determined by max , , Y Var()/E(), the variance of , divided by the  mean of  across Y taxon-pairs calculated from Determined by max , , Y 1, …, Y.  Sub-Parameters () Description Prior Distribution Each (ith) taxon-pair’s divergence time drawn i, i =1,…,Y randomly (with replacement) from  divergence Uniform (0, max); max = 10.0 times within matrix T ={t1, …, t}. Uniform (0.01 max); i, i =1,…,Y Total population mutation parameter of each taxonpair, where i = 2Ni. max = 60.0 in mammals max = 200.0 in squamates (a)i, i = 1,…,Y Population mutation parameters for daughter (a)i = Uniform (0.0,  i ) (b)i, i = 1,…,Y populations a and b i = 1,…,Y i, = (a + b)i (b)i = Uniform (0.0,  i  (b)i =  i - (a)i) (A) i, i =1,…,Y Population mutation parameter for the ancestral Uniform (0.01, (Amax) population size of the ith taxon-pair ( a) i, i =1,…,Y  Coefficient of population bottleneck magnitude in Uniform (0.01, i) ( b) i, i =1,…,Y daughter populations a and b at beginning of population bottleneck ( a and b before the    present) ( a) i, i =1,…,Y  between  beginning of bottleneck in Length of time ( b) i, i =1,…,Y daughter populations a and b and the present time. Uniform (0.0,  i)
