2013-SystBiol-Adams-SuppMaterial

Supplementary Material from D.C. Adams, “Quantifying and comparing phylogenetic evolutionary rates for shape and other high-dimensional phenotypic data”. Systematic Biology.

Type I error and Statistical Power

Here I use computer simulations to evaluate the statistical properties of the proposed method for comparing evolutionary rates for high-dimensional multivariate traits. Three sets of simulations were performed. The first set was conducted on four perfectly balanced phylogenetic trees that differed in their number of taxa: N = 16, 32, 64, 128. The second set was conducted on randomly generated trees with the same numbers of taxa (N = 16, 32, 64, 128). The final set was conducted on randomly generated trees (N = 16, 32, 64, 128) with three groups of taxa on each.

For each simulation, a phylogeny was specified, taxa were divided into two (or three) groups, and the number of trait dimensions was specified (p = 2, 3, 4, 5, 7, 10). Next, input error covariance matrices of dimension p × p were constructed for each group, which were used to generate the phenotypic data. For simulations assuming isotropic error, the evolutionary rate for the first group was set to $\sigma_1^2 = 1.0$ for each trait dimension. For the second group, the evolutionary rate for all trait dimensions was set to a fixed proportional difference relative to that in the first group. These values varied across simulations such that the initial rate difference between groups was known ($\sigma_2^2 = 1.0, 1.5, 2.0, 3.0, 4.0$). For simulations assuming non-isotropic error, the evolutionary rate for the first group was drawn from a normal distribution ($\mu = 1.0$; $\sigma = 0.1$) for each trait dimension. The evolutionary rates for all traits for the second group were then obtained by multiplying these values by a constant ($k = 1.0, 1.5, 2.0, 3.0, 4.0$) to obtain a known initial rate difference between groups. Simulations with three groups had the evolutionary rate of the third group set as $\sigma_1^2 = \sigma_3^2 = 1.0$.
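The two rate schemes described above can be illustrated with a short Python sketch. This is not the code used for the simulations, which were run in R. The function names `diagonal_matrix` and `input_rate_matrices` are hypothetical, and the sketch assumes zero covariances among trait dimensions (diagonal input matrices), which the text does not state explicitly.

```python
import random

def diagonal_matrix(rates):
    """Return a p x p matrix with `rates` on the diagonal and zeros elsewhere."""
    p = len(rates)
    return [[rates[i] if i == j else 0.0 for j in range(p)] for i in range(p)]

def input_rate_matrices(p, k, isotropic=True, rng=None):
    """Build input rate matrices for two groups of taxa.

    p         -- number of trait dimensions
    k         -- known rate ratio of group 2 relative to group 1
    isotropic -- if True, every dimension of group 1 has rate 1.0;
                 otherwise each rate is drawn from Normal(mean=1.0, sd=0.1)
    rng       -- optional random.Random instance, for reproducibility
    ASSUMPTION: off-diagonal covariances are zero.
    """
    rng = rng or random.Random()
    if isotropic:
        rates_1 = [1.0] * p
    else:
        rates_1 = [rng.gauss(1.0, 0.1) for _ in range(p)]
    # Group 2 rates are a constant multiple of the group 1 rates.
    rates_2 = [k * r for r in rates_1]
    return diagonal_matrix(rates_1), diagonal_matrix(rates_2)
```

Setting `k = 1.0` gives identical matrices for both groups, which corresponds to the no-difference condition in the simulations.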
From the initial covariance matrices, 1000 phenotypic datasets were obtained by evolving multidimensional traits along the phylogeny according to a Brownian motion model of evolution. Data were simulated using the function ‘transformPhylo.sim’ in the R-package motmot (Thomas and Freckleton, 2011), which is capable of simulating data under a Brownian motion model with differing rates for groups of taxa. For the case of randomly generated trees, a new phylogeny was simulated for each data set.

For each simulation, evolutionary rates ($\sigma^2_{mult}$) were then estimated for each group of taxa on the phylogeny, and the proposed ratio-based test was used to determine whether or not $\sigma^2_{mult}$ differed among groups. Specifically, the observed ratio ($\sigma^2_{mult.A} / \sigma^2_{mult.B}$) was compared to ratios obtained from data generated by phylogenetic simulation under the null hypothesis of no rate difference among groups. Here, the global evolutionary rate across the entire phylogeny ($\sigma^2_{mult.Tot}$) was used to generate simulated data sets using ‘sim.char’ in the R-package geiger (Harmon et al., 2008), where $\sigma^2_{mult.Tot}$ was used as the input rate for each trait dimension (using $\sigma^2_{mult.Tot} / N$ provided equivalent results, as N is a constant). The proportion of significant results (of 1000) was then treated as the significance level of the observed rate ratio for that simulated dataset.

Across all simulation conditions, simulations where $\sigma_1^2 = \sigma_2^2 = 1.0$ represented Type I error rate assessments (i.e., no difference in evolutionary rates between groups), while simulations where $\sigma_2^2 / \sigma_1^2 > 1.0$ assessed statistical power. For Type I error simulations, rate comparisons were also performed with likelihood methods based on the evolutionary rate matrix R, using the function ‘evol.vcv’ in the R-package phytools (Revell, 2012).
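The logic of the ratio-based test, and of turning per-dataset significance levels into a Type I error or power estimate, can be sketched in Python as follows. This is an illustrative sketch, not the original R code. All function names are hypothetical. Two conventions are assumptions not taken from the text: the ratio is formed as larger rate over smaller rate, and significance is the proportion of null ratios at least as large as the observed ratio.

```python
def rate_ratio(rate_a, rate_b):
    """Ratio of the larger to the smaller rate (ASSUMPTION), so it is always >= 1."""
    return max(rate_a, rate_b) / min(rate_a, rate_b)

def monte_carlo_p_value(observed_ratio, null_ratios):
    """Proportion of null-simulated ratios at least as large as the observed ratio."""
    exceed = sum(1 for r in null_ratios if r >= observed_ratio)
    return exceed / len(null_ratios)

def rejection_rate(p_values, alpha=0.05):
    """Proportion of datasets whose test is significant at level alpha.

    With equal input rates this estimates the Type I error rate;
    with unequal input rates it estimates statistical power.
    """
    return sum(1 for p in p_values if p <= alpha) / len(p_values)

# Toy usage with made-up numbers (not results from the study).
null_ratios = [1.1, 1.3, 1.0, 2.4, 1.2]
p_value = monte_carlo_p_value(rate_ratio(2.0, 1.0), null_ratios)
rate = rejection_rate([0.01, 0.20, 0.04, 0.50])
```

In the simulations described above, one such significance level would be obtained for each of the 1000 datasets in a condition, and the rejection rate across those datasets gives a single point on a power curve.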
Results: For all simulations, hypothesis tests based on $\sigma^2_{mult}$ displayed appropriate Type I error rates, which remained consistently near $\alpha = 0.05$ regardless of trait dimensionality (figs. A1-A3). By contrast, tests based on the evolutionary rate matrix R had unacceptably high Type I error rates, which increased with trait dimensionality (Fig. 2a: main text). With the likelihood approach, Type I error rates were over 8% for tests on bivariate traits, and exceeded 50% when p = 10. Thus, this method has unacceptably high Type I error when used on high-dimensional data.

When evolutionary rates differed between traits ($\sigma_2^2 / \sigma_1^2 > 1.0$), the distance-based method demonstrated acceptable statistical power, and was capable of identifying even small differences in $\sigma^2_{mult}$ between traits. The power of the test also rose rapidly as the difference between evolutionary rates increased. This pattern became more pronounced as the number of species in the phylogeny increased. Importantly, these statistical findings were obtained regardless of whether phylogenies were balanced or random, or whether evolutionary rates for two or more groups of taxa on the phylogeny were compared (figs. A1-A3).

Overall, these simulations reveal that the distance-based method for comparing evolutionary rates ($\sigma^2_{mult}$) has appropriate Type I error and statistical power, and is therefore appropriate for use on high-dimensional datasets.

Fig. A1. Statistical power curves for tests comparing evolutionary rates for two groups of taxa on balanced phylogenies. Each point on each power curve is the result of 1000 simulations at the conditions specified.

Fig. A2. Statistical power curves for tests comparing evolutionary rates for two groups of taxa on random phylogenies. Each point on each power curve is the result of 1000 simulations at the conditions specified.

Fig. A3. Statistical power curves for tests comparing evolutionary rates for three groups of taxa on random phylogenies. Each point on each power curve is the result of 1000 simulations at the conditions specified.

Literature Cited

Harmon, L. J., J. Weir, C. Brock, R. E. Glor, and W. Challenger. 2008. GEIGER: Investigating evolutionary radiations. Bioinformatics 24:129-131.

Revell, L. J. 2012. Phytools: An R package for phylogenetic comparative biology (and other things). Methods in Ecology and Evolution 3:217-223.

Thomas, G. H., and R. P. Freckleton. 2011. MOTMOT: models of trait macroevolution on trees. Methods in Ecology and Evolution 3:145-151.
