APPENDIX 4

SIMULATION EXPERIMENT TO INVESTIGATE WHETHER R-PACKAGES OUWIE AND MOTMOT PROVIDE COMPARABLE PARAMETER ESTIMATES AND LIKELIHOOD SCORES

To fit the different evolutionary models of interest, we had to use two different R-packages, namely OUwie (Beaulieu et al. 2012) and motmot (Thomas and Freckleton 2012). Although the models fit by the two packages overlap to a large extent, OU models with varying σ and β parameters are only implemented in OUwie, while the general model with group phylogenetic means (BMSG), i.e. the one allowing for non-monophyletic groups, is only implemented in motmot. This poses a practical problem, because the way of calculating model parameters and, above all, the optimization algorithms used by each implementation to obtain ML estimators may not coincide. In such a case, the likelihood scores (and corresponding parameter estimates) would not be comparable, and one would not be able to evaluate the relative fit of models obtained using different software packages.

To address this potential obstacle, we performed a series of simulations to compare the parameter estimates and likelihood scores provided by the two packages for a model implemented by both, namely a BM with different rates for each group, calculated considering a single phylogenetic mean for the entire phylogeny (BMS). Simulations were conducted on a random pure-birth phylogeny with 64 species, divided into two groups originating at a random node relatively deep in the tree. We then used the transformPhylo.sim function of R-package motmot (Thomas and Freckleton 2012) to simulate a continuous phenotypic trait evolving under a BM with a single phylogenetic mean, but with varying rate differentiation between groups. The rate of the first group was always set to σ²₁ = 1, whereas the rate of the second group (σ²₂ = θσ²₁) was set to be 1.5, 2, 3, 4, 5, or 6 times larger than σ²₁. For each of these rate-difference conditions, 1000 phenotypic datasets were simulated.
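For reference, the single-mean, two-rate BM model (BMS) described above can be written in the standard multivariate-normal form used for multi-rate Brownian motion. This is only a sketch under stated assumptions: the symbols x (vector of tip values), μ (the single phylogenetic mean), C₁ and C₂ (matrices of shared branch lengths spent in each group) and n (number of species, 64 here) are notation introduced for illustration and do not appear in the appendix, and the formula is not meant to describe the internal parameterization of either package.

```latex
\begin{align}
\mathbf{x} &\sim \mathcal{N}\!\left(\mu\,\mathbf{1},\ \mathbf{V}\right),
\qquad
\mathbf{V} = \sigma^2_1\,\mathbf{C}_1 + \sigma^2_2\,\mathbf{C}_2,
\qquad
\sigma^2_2 = \theta\,\sigma^2_1, \\
\log L(\mu, \sigma^2_1, \sigma^2_2) &=
-\tfrac{1}{2}\left[\, n\log(2\pi) + \log\lvert\mathbf{V}\rvert
+ (\mathbf{x}-\mu\mathbf{1})^{\top}\,\mathbf{V}^{-1}\,(\mathbf{x}-\mu\mathbf{1}) \right].
\end{align}
```

Here entry (i, j) of C_g is the length of the path shared by species i and j that falls within group g. If both packages target this likelihood, any disagreement between them should stem from parameterization details or from the numerical optimization, which is what the simulations are designed to detect.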
We then fit the two aforementioned models (BM1 and BMS) in both packages, and compared parameter estimates and likelihood scores using the same approach as above. Across 1000 simulated datasets, and across varying conditions of differentiation in evolutionary rates between groups, the correlation and mean ratio of both relative rate estimates and log-likelihoods obtained by the two packages approached unity (Table A4, Fig. A4). These results show that, except for a very few cases where convergence problems occur during the optimization process (e.g. outliers in Fig. A4), OUwie and motmot provide practically the same likelihood scores and parameter estimates. As such, models fitted in the two packages can be directly compared using likelihood criteria. This is highly relevant for evolutionary inference in both empirical and simulation studies, as it broadens the range of models of phenotypic evolution on phylogenies available for comparison.

Table A4: Correlations and mean ratios of relative rate estimates and likelihoods obtained from fitting a two-rate model using OUwie (Beaulieu et al. 2021) or motmot (Thomas and Freckleton 2012) R-packages across different simulation conditions for 1000 datasets simulated under a single-mean, two-rate BM process.

Simulated relative rate (θ)    1.5      2        3        4        5        6
θ estimate
  Correlation                  0.983    0.999    0.996    0.999    0.992    0.996
  Ratio                        0.994    0.991    0.986    0.983    0.980    0.981
Log-likelihood
  Correlation                  0.999    0.999    0.999    0.997    0.909    0.969
  Ratio                        1.000    0.999    0.999    0.998    0.998    0.998

Figure A4: Comparison of relative rate estimates and log-likelihoods obtained from OUwie (Beaulieu et al. 2021) and motmot (Thomas and Freckleton 2012) for a two-rate BM evolutionary model. Results are shown for the case of θ=2, but were similar for all the simulation conditions examined (Table A4).
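The two summary statistics reported in Table A4 (correlation and mean ratio between the outputs of the two packages) could be computed along the lines of the following minimal sketch. It is written in Python purely for illustration, since the analyses themselves were run in R. The function names, the use of Pearson correlation, the reading of "mean ratio" as the mean of per-dataset ratios, the direction of that ratio, and the numbers in the usage example are all assumptions, not details taken from the appendix.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_ratio(xs, ys):
    """Mean of the per-dataset ratios xs[i] / ys[i] (assumed definition)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(x / y for x, y in zip(xs, ys)) / len(xs)


# Hypothetical relative-rate estimates for five simulated datasets;
# these values are made up and are NOT results from the appendix.
theta_package_a = [1.9, 2.1, 2.4, 1.7, 2.0]
theta_package_b = [1.9, 2.2, 2.4, 1.7, 2.1]

r = pearson_correlation(theta_package_a, theta_package_b)
ratio = mean_ratio(theta_package_a, theta_package_b)
print(f"correlation = {r:.3f}, mean ratio = {ratio:.3f}")
```

Values of both statistics close to 1 would indicate that the two sets of estimates agree closely; the same functions apply unchanged to the log-likelihoods.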