Supplementary materials

Biomechanical profiling

Biting performance is quantified using a specifically defined biomechanical performance measure, the mechanical advantage (MA). MA is the ratio between the moment arm of the muscle moment (effort) and the moment arm of the biting moment (load), and is otherwise known as the force advantage (Westneat, 1994). MA quantifies the degree to which effort can be translated into load and is an indicator of mechanical efficiency. A high MA relates to high efficiency in force translation from effort to load. Conversely, a low MA means low efficiency in force translation, but high rotational speed at the point of load (high speed advantage or displacement advantage; Westneat, 1994).

Based on anatomical observations of modern avian and crocodilian jaw adductor musculature (pers. obs. but also Holliday and Witmer, 2007; Holliday, 2009), corresponding muscle attachment sites were digitally landmarked on lateral view images of 50 skulls (from published sources, photographs, or novel composite reconstructions), including 41 theropod taxa and the outgroup Plateosaurus. Landmarks were plotted on published figures (lateral-view photographs or interpretive diagrams of well-preserved skull specimens and skull reconstructions). Lateral-view photographs, taken by the author, of theropod skull replicas from various museum collections were also used. For some taxa, composite skull reconstructions created for this study from images of disarticulated cranial material were used. Jaw muscles were simplified into three major groups, the MAME group (M. adductor mandibulae externus including pars profundus + M. pseudotemporalis superficialis), the MAMP group (M. adductor mandibulae posterior/M. pseudotemporalis profundus), and the MPT group (M. pterygoideus dorsalis + M. pterygoideus ventralis) following Rayfield et al. (2001) (Fig. S1).
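The definition of MA above can be illustrated with a minimal Python sketch. It is not the code used in this study (the analyses were run in ImageJ and R); it assumes that the effort arm is the perpendicular distance from the jaw joint to the muscle's line of action and that the load arm is the straight-line distance from the jaw joint to the bite point. The function names and all landmark coordinates are hypothetical.

```python
import math

def perpendicular_distance(point, p1, p2):
    """Perpendicular distance from `point` to the infinite line through p1 and p2."""
    (x0, y0), (x1, y1), (x2, y2) = point, p1, p2
    # Twice the triangle area divided by the base length gives the height.
    cross = (x2 - x1) * (y0 - y1) - (y2 - y1) * (x0 - x1)
    return abs(cross) / math.hypot(x2 - x1, y2 - y1)

def mechanical_advantage(joint, origin, insertion, bite_point):
    """MA = effort arm (muscle moment arm) / load arm (joint-to-bite distance)."""
    effort_arm = perpendicular_distance(joint, origin, insertion)
    load_arm = math.dist(joint, bite_point)
    return effort_arm / load_arm

# Hypothetical landmark coordinates in arbitrary image units (not from any specimen).
joint = (0.0, 0.0)
origin = (3.0, 4.0)       # muscle origin on the cranium
insertion = (3.0, -1.0)   # muscle insertion on the mandible
posterior_bite = (6.0, 0.0)
anterior_bite = (12.0, 0.0)

print(mechanical_advantage(joint, origin, insertion, posterior_bite))  # 0.5
print(mechanical_advantage(joint, origin, insertion, anterior_bite))   # 0.25
```

With the muscle fixed, the more anterior bite point has the longer load arm and therefore the lower MA, which is the trade-off between force and speed described above.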
Attachment sites for each muscle group were marked for the origins and insertions on the cranium and mandible (Fig. S2). Each attachment site was represented by at least two landmarks, at the anterior-most and posterior-most extents of the attachment site (Fig. S2). Biting positions taken at the tip of each tooth along the tooth row and the fulcrum (jaw joint) were also marked (Fig. S2). X,Y coordinates of muscle attachments, biting points and jaw joint were recorded using ImageJ (Abramoff et al., 2004).

The distance between the jaw joint and the line of action for each muscle was taken as the moment arm of the muscle ($m_M$) and calculated using the following formula:

$$m_M = \frac{|aX_J + bY_J + c|}{\sqrt{a^2 + b^2}} \tag{S1}$$

where $X_J$ and $Y_J$ are the X,Y coordinates of the jaw joint, and a, b and c are the parameters for the straight line connecting the muscle landmarks (in the form ax + by + c = 0; Fig. S3). This is a direct application of a fundamental relationship between a point ($X_0$, $Y_0$) and a straight line, ax + by + c = 0, where the distance l between the point and the straight line is:

$$l = \frac{|aX_0 + bY_0 + c|}{\sqrt{a^2 + b^2}} \tag{S2}$$

If the straight line is the major axis through the body of the muscle and the point is the jaw joint ($X_J$, $Y_J$), then the moment arm of that muscle ($m_M$) is the perpendicular distance from the jaw joint to the straight line, i.e. the distance l of Equation S2, hence Equation S1.

The coefficients a, b and c can be derived from the X,Y coordinates of the landmarks. First, the straight line ax + by + c = 0 can be converted to the form:

$$y = \left(-\frac{a}{b}\right)x + \left(-\frac{c}{b}\right) \tag{S3}$$

which is the familiar form of a straight line, y = slope × x + intercept. Thus the slope is -a/b and the intercept is -c/b. From the equation of a straight line through two points,

$$y = \frac{Y_2 - Y_1}{X_2 - X_1}\,(x - X_1) + Y_1 \tag{S4}$$

the slope and intercept can be calculated from the coordinates of the insertion ($X_I$, $Y_I$) and origin ($X_O$, $Y_O$) points as:

$$\text{slope} = \frac{Y_I - Y_O}{X_I - X_O}, \qquad \text{intercept} = -X_O\,\frac{Y_I - Y_O}{X_I - X_O} + Y_O.$$

A common factor b in the slope (-a/b) and intercept (-c/b) will determine the coefficients a, b and c; i.e. a = slope × factor, b = -factor, c = intercept × factor. Any non-zero factor describes the same line and yields the same distance in Equation S1.

Moment arms were calculated for each origin/insertion pair for the anterior and posterior extents of the muscle (Fig. S2), the mean value of which was taken as the moment arm for each muscle group. Moment arms for each muscle group were then divided by the load arm (direct distances between the jaw joint and bite points) to compute mechanical advantages (MA). MA were computed at each biting point along the tooth row.

Potentially valuable three-dimensional information is lost by using simple two-dimensional models, but the benefits of using simple models outweigh the drawbacks. Firstly, simple models are explicit in the information represented. This enables explicit hypothesis testing (the ability to test a single hypothesis relating to a single biomechanical performance). Sophisticated models on the other hand have inherent uncertainties (many variables with uncertain properties) so the information represented is not explicit. This makes hypothesis testing difficult: because of the complexity of the model, there is no certainty that the analysis conducted is actually testing the hypothesis of interest (in other words, there are too many variables). Secondly, simple models enable direct comparisons across a wide range of taxa (simple biomechanical metrics are directly comparable), whereas such comparisons are difficult with sophisticated models. Thirdly, simple models readily allow one to amass a large enough sample size for rigorous statistical treatment.
This is especially important when dealing with fossil taxa as three-dimensionally preserved specimens are rare and such data are limited. For the purposes of this study (biomechanical disparity and phylogenetic comparative analyses) it is imperative that a large standard dataset be compiled. Simple biomechanical metrics from simple biomechanical models are most appropriate for this.

To enable direct comparisons amongst taxa with differing tooth counts, biting positions were adjusted as percentages along the tooth row with 0% being the posterior-most position and 100% being the anterior-most position of the tooth row (Fig. S2). Note that MA computation precedes this standardisation of tooth row position; MA is computed from the absolute distances, not standardised positions. For each taxon, MA was plotted against tooth row positions and polynomial functions of either the second order ($y = \beta_0 + \beta_1 x + \beta_2 x^2$), third order ($y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$) or fourth order ($y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4$) were fitted using R (R Core Development Team, 2009) (Fig. S4). The best order of polynomial for each profile was determined using a weighted AIC-based test in the paleoTS library (Hunt, 2006) in R. Although MA profiles look simple, and intuitively a second order polynomial may best represent them, the vast majority of theropods had MA profiles that could not be described sufficiently by second order polynomials (the poor fit of second order polynomial curves to the profiles was visually obvious, and this was substantiated by AIC-based comparisons). Thus in most theropod profiles, either third or fourth order polynomials were fitted. This introduces a discrepancy in the number of coefficients amongst taxa but the three common coefficients (i.e.
β0, β1 and β2) were sufficient in explaining the multivariate variability in the profiles (multivariate ordinations using three and five coefficients respectively were nearly identical), and they were taken as independent variables and subjected to a correlation-based principal components analysis (PCA) to visualise “function space” (Anderson, 2009) occupation of different theropod and non-theropod clades. A correlation matrix was employed instead of a variance-covariance matrix because the absolute magnitudes of the variables differed by orders of magnitude (Jolliffe, pers. comm.). In this way, much of the variance could be explained by PC1 and PC2 (89.9% and 9.9% respectively). Loadings for PC1 show that each polynomial coefficient is represented in nearly equal proportion.

An alternative to using the polynomial coefficients to describe the biomechanical profiles would be to fit MA values along the polynomial curves (or to interpolate between the observed MA values along the different positions of the tooth row). For instance, MA values along the curves from 0% to 100% of the tooth row can be predicted by values fitted onto the polynomial functions; values fitted at an increment of 1 would yield 101 fitted MA values per taxon. Thus each taxon would be represented by 101 “variables”. A correlation-based PCA on these “variables” would result in PC1 and PC2 explaining 93.1% and 6.9% respectively of the total variance. A plot of PC1 against PC2 shows a different rotation of axes compared to that on polynomial coefficients but the relative distribution in “function space” is nearly identical. One problem with this approach is that the “variables” are not independent of each other but form a continuum. Moreover, interpretation of the PC axes with respect to the original variables is difficult with the 101 fitted MA values but simple with polynomial coefficients (i.e. intercept, slope, parabolic curvature).
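The alternative of 101 fitted MA values can be sketched as follows. This is an illustrative Python example only, not the R code used here; the coefficient values are hypothetical and were chosen merely to match the orders of magnitude reported for β1 and β2.

```python
def evaluate_polynomial(coefficients, x):
    """Evaluate b0 + b1*x + b2*x**2 + ... using Horner's scheme."""
    result = 0.0
    for coefficient in reversed(coefficients):
        result = result * x + coefficient
    return result

def fitted_profile(coefficients, step=1):
    """Fitted MA at every `step` percent of the tooth row, from 0% to 100%."""
    return [evaluate_polynomial(coefficients, p) for p in range(0, 101, step)]

# Hypothetical second order profile: [b0, b1, b2] (assumed values, not study data).
example_coefficients = [0.40, -3e-3, 1e-5]
profile = fitted_profile(example_coefficients)
print(len(profile))  # 101 fitted values, one per percent of the tooth row
```

Each taxon is then described by 101 strongly interdependent values rather than by three coefficients.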
Therefore, polynomial coefficients were preferred over the 101 fitted MA values for use in the PCA and in subsequent phylogenetic comparative analyses.

Fig. S1. Identification of muscle attachments and reconstruction of jaw adductor muscles in theropods. (a) Based on anatomical observations of jaw myology in modern archosaurs, corresponding muscle attachments were identified in fossil materials. MAME, red; MAMP, green; MPT, purple. (b)-(d) Muscles were reconstructed by major groups: (b) MAME group; (c) MPT group; and (d) MAMP group.

Fig. S2. Computation of moment arms using digital landmarks. (a) Bite points at the tip of each tooth along the entirety of the tooth row and muscle attachment points were landmarked. Each muscle was recognised by at least four landmarks, a pair of origin and insertion points for the anterior and posterior extents of the muscles. (b) Landmarks and moment arms for the MAME. Moment arms for the line of action of the muscle at the anterior and posterior extents of the MAME were computed as the perpendicular distance from the jaw joint to the line of action of the muscle. Total moment arm for that muscle ($m_E$) was taken as the mean of the anterior and posterior moment arms ($m_{ant}$ and $m_{post}$ respectively). (c) and (d) Moment arms for the MPT and MAMP respectively.

Fig. S3. Computation of moment arms from X-Y coordinates. Moment arms are computed as the distance between a point (jaw joint, J) and the straight line (ax + by + c = 0) connecting the origin and insertion points ($M_o$ and $M_i$ respectively). This is done for both anterior and posterior extents of the muscles.

Fig. S4. Biomechanical profile plot. The mean MA at each biting position is standardised along the tooth row as a percentage scale, with the posterior extent as 0% and anterior extent as 100%. MA are plotted against respective percent tooth row positions and a polynomial function is fitted using least squares regression.

Phylogeny

The phylogeny of the Lloyd et al.
(2008) dinosaur supertree formed the basis of the phylogenetic comparative analyses. The position of Monolophosaurus was shifted to a more basal position to reflect more recent understanding (Smith et al., 2007; Brusatte et al., in press). For branch length estimation, a tree with 168 taxa (primarily theropods but with some sauropodomorphs and basal ornithischians) was used (Fig. S5). In estimating branch lengths, a larger tree is preferable to a smaller tree (i.e. n = 42) because additional stratigraphic information from taxa interspersed amongst those of the smaller tree helps to estimate the ages of internal and more basal nodes more accurately. The tree was dated using the first occurrence dates of each terminal taxon (compiled from various sources but primarily the Dinosauria II and converted to millions of years ago using the International Stratigraphic Chart, 2008). After this initial dating some zero duration branches existed (an inevitable consequence of the standard dating method used by palaeontologists) and at this point the method of Brusatte et al. (2008) was applied such that zero duration branches were lengthened by sharing time equally with a directly ancestral branch of positive duration. This is a modification of the approach of Ruta et al. (2006), where sharing is proportional to character changes, which was not possible or desirable in the present context (R code for implementing both methods is available from http://www.graemetlloyd.com/). Terminal branches were extended to fit the last occurrence dates for each terminal taxon. Taxa not represented in the biomechanical analysis were pruned after branch lengths were estimated, and the reduced tree (42 taxa) was used for subsequent phylogenetic comparative analyses (Fig. S6).
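The equal-sharing step for zero duration branches can be illustrated on a single root-to-tip lineage. The Python sketch below is one simplified reading of that step, not the R implementation referred to above: it ignores the complications of a branching tree (where an ancestral branch may be shared by several descendant lineages), and the function name and example durations are hypothetical.

```python
def share_zero_durations(durations):
    """Equalise each run of zero duration branches with the positive branch before it.

    `durations` lists branch durations (in Myr) along one root-to-tip path,
    ordered from the oldest branch to the youngest.
    """
    adjusted = list(durations)
    i = 0
    while i < len(adjusted):
        if adjusted[i] > 0:
            # Find the run of zero duration branches directly descended from branch i.
            j = i + 1
            while j < len(adjusted) and adjusted[j] == 0:
                j += 1
            if j > i + 1:
                # Split the ancestral duration equally among itself and the zero branches.
                share = adjusted[i] / (j - i)
                for k in range(i, j):
                    adjusted[k] = share
            i = j
        else:
            i += 1  # leading zero branches have no positive ancestor to borrow from
    return adjusted

# Hypothetical lineage: a 9 Myr branch followed by two zero duration branches.
print(share_zero_durations([9.0, 0.0, 0.0, 4.0]))  # [3.0, 3.0, 3.0, 4.0]
```

The total duration of the lineage is unchanged; time is only redistributed between the positive branch and its zero duration descendants.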
The recent phylogenetic hypothesis with Proceratosaurus as a basal tyrannosauroid (Rauhut et al., 2010) is not tested here as this alternate position is difficult to reflect in the 168-taxon supertree topology (there is limited overlap in taxa between this tree and that of Rauhut et al., 2010). Thus conforming to an objective consensus phylogenetic position, i.e. the Lloyd et al. (2008) supertree, was preferred.

Fig. S5. The phylogeny of Theropoda taken from the supertree of Lloyd et al. (2008) with additional basal sauropodomorph and ornithischian taxa (n = 168). The inclusion of Late Triassic to Early Jurassic basal sauropodomorphs and ornithischians supplies additional information on the possible ages of the basal divergence dates, i.e. for Saurischia.

Fig. S6. Phylogenetic relations of the 42 extinct saurischian taxa. The tree of Fig. S5 was pruned after nodes were dated and branch lengths estimated to include the taxa under study for biomechanical profiling.

Testing for phylogenetic signals

One method of detecting phylogenetic signal in the biomechanical variable is to employ phylogenetic eigenvector regression (PVR; Diniz-Filho et al., 1998). PVR is a test based on multiple linear regression in which a phenotypic variable is the response variable and phylogeny is the predictor variable, represented as principal coordinates (PCo) axes extracted from a phylogenetic distance matrix (pairwise Euclidean distances computed in R). From the total of 41 PCo axes, the first 34 were retained for multiple regression analysis because they explain 95% of the total variance in phylogeny. Since the response in this case is a multivariate dataset (i.e. the three polynomial coefficients, β0, β1 and β2), PVR was modified slightly by using multivariate multiple regression (MMR) instead of a standard multiple linear regression. MMR is an extension of multiple regression, where the response variable is a matrix of multiple variables instead of a vector of a single variable.
Another method for detecting phylogenetic signal in a phenotypic dataset is to employ the method of Blomberg et al. (2003). This method uses phylogenetically independent contrasts (Felsenstein, 1985) and compares the variances of the contrasts computed from a given variable on a particular tree topology with those computed from permutations of that variable across the same tree (i.e. randomly reshuffling the values amongst the OTUs while keeping the tree topology constant). If the variances in the contrasts for the data in the real phylogenetic positions are lower than those from the permutations, then there is a significant phylogenetic signal in that data (Blomberg et al., 2003). This test was conducted in R using the picante library (Kembel et al., 2009) which also computes Blomberg et al.’s (2003) K statistic. A K less than one would indicate that closely related OTUs have values that are less similar than expected under Brownian motion evolution (or departure from Brownian motion, such as adaptive evolution), while a K greater than one would indicate that closely related OTUs have values more similar than expected (Blomberg et al., 2003).

Because the independent contrasts method (Felsenstein, 1985) assumes that character evolution can be modelled as a random walk (i.e. a Brownian motion model of evolution) and that characters change at a uniform rate per unit branch length in all branches (i.e. variance accumulation is assumed to be equal per unit time), the data and/or selected branch lengths must conform to these assumptions. Diagnostic checks of branch lengths available in the PDAP module (Midford et al., 2005) of Mesquite (Maddison & Maddison, 2009) revealed that the combination of raw data and initial branch lengths did not meet the assumptions. Therefore data were logarithmically transformed as log10 β0, log10(-1/β1) and log10 β2.
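The contrast-based permutation test described above can be sketched in Python. This is a simplified illustration and not the picante implementation: it assumes a fully bifurcating tree stored as nested tuples, uses the mean squared standardised contrast as the variance measure, and reports the proportion of permutations whose variance is no greater than the observed one. The tree, trait values and function names are hypothetical.

```python
import math
import random

# A tip is (name, branch_length); an internal node is ((child, child), branch_length).

def contrasts(node, values, out):
    """Return (estimated value, adjusted branch length) for `node`, appending one
    standardised contrast per internal node to `out` (Felsenstein, 1985)."""
    payload, length = node
    if isinstance(payload, str):
        return values[payload], length
    (x1, v1), (x2, v2) = (contrasts(child, values, out) for child in payload)
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)  # weighted nodal estimate
    return value, length + v1 * v2 / (v1 + v2)       # lengthen the branch below

def contrast_variance(tree, values):
    out = []
    contrasts(tree, values, out)
    return sum(c * c for c in out) / len(out)

def permutation_p_value(tree, values, n_permutations=999, seed=0):
    """Proportion of tip-shuffled data sets with contrast variance <= observed."""
    rng = random.Random(seed)
    names = list(values)
    observed = contrast_variance(tree, values)
    hits = 0
    for _ in range(n_permutations):
        shuffled = [values[name] for name in names]
        rng.shuffle(shuffled)
        if contrast_variance(tree, dict(zip(names, shuffled))) <= observed:
            hits += 1
    return observed, hits / n_permutations

# Hypothetical four-taxon tree (branch lengths in Myr) and trait values.
clade_ab = ((("A", 1.0), ("B", 1.0)), 1.0)
clade_cd = ((("C", 1.5), ("D", 1.5)), 0.5)
tree = ((clade_ab, clade_cd), 0.0)
values = {"A": 0.30, "B": 0.32, "C": 0.50, "D": 0.55}

observed, p_value = permutation_p_value(tree, values)
print(observed, p_value)
```

A small proportion indicates that the observed contrast variance is lower than in most reshuffled data sets, i.e. that closely related taxa have similar values. With only four tips the number of distinct permutations is very limited, so the example shows the mechanics only.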
The coefficient β1 required logarithmic transformation of the negative reciprocal instead of logarithmic transformation of the raw values because the raw values were all negative (negative values cannot be log transformed), and simply reversing the sign would also reverse the order of magnitudes, hence the negative reciprocal. Where data transformation was not enough, branch lengths were adjusted by assigning minimum internal branch lengths (Laurin, 2004; Laurin et al., 2009) while keeping the ages of the terminals constant, using the Stratigraphic Tools (Josse et al., 2006) module of Mesquite, until contrasts were adequately standardised (Garland et al., 1992). Minimal branch lengths of 3 million years and 4.5 million years were necessary for adequate standardisation of log10 β0 and log10(-1/β1) contrasts respectively. Contrasts for log10 β2 did not require any branch length adjustment for adequate standardisation so the initial dated tree was used. Even after branch length adjustments, there was still a strong correlation between the estimated nodal values and their respective ages in all three variables. However, this may not necessarily be a statistical artefact when using fossil phylogenies (these tests were devised with ultrametric trees in mind) and instead may indicate the presence of a trend (Laurin, pers. comm.). The RMesquite library (Lapp & Maddison, 2010) was used to enable the graphical tree manipulation devices of Mesquite within R prior to the Blomberg et al. (2003) test.

Tracing the evolution of function space occupation

To trace the evolution of function space occupation, nodal values (ancestors) were estimated for the three coefficients (i.e. β0, β1 and β2) using the maximum likelihood (ML) method of ancestor character estimation (Schluter et al., 1997) available in the ape library (Paradis et al., 2004) in R (this is the equivalent of the weighted squared-change parsimony method of Maddison (1991)).
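The three transformations, and the inverse needed to return estimates to the arithmetic scale, can be written out explicitly. The Python sketch below is illustrative only; the function names and example values are hypothetical, and it assumes β0 > 0, β1 < 0 and β2 > 0 as stated for these data.

```python
import math

def transform(b0, b1, b2):
    """Return log10(b0), log10(-1/b1), log10(b2); requires b0 > 0, b1 < 0, b2 > 0."""
    return math.log10(b0), math.log10(-1.0 / b1), math.log10(b2)

def back_transform(t0, t1, t2):
    """Invert `transform`, returning values on the original arithmetic scale."""
    return 10.0 ** t0, -1.0 / (10.0 ** t1), 10.0 ** t2

# Hypothetical coefficients with the orders of magnitude reported for b1 and b2.
steep = transform(0.40, -4e-3, 1e-5)
shallow = transform(0.40, -2e-3, 1e-5)

# The negative reciprocal keeps the original ordering: -4e-3 < -2e-3 and
# log10(250) < log10(500), whereas log10 of the sign-reversed values would flip it.
print(steep[1] < shallow[1])  # True
```

Because -1/β1 is positive and increases as β1 increases, taxa keep their rank order after transformation.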
Because the maximum likelihood method also assumes a Brownian motion model of evolution, the transformed data and adjusted branch lengths were used (see above). Data transformation is not only necessary for conformity with the assumptions of Brownian motion evolution, but also for accurate estimation of ancestor values for small-value variables (β1 is on the order of $10^{-3}$ while β2 is on the order of $10^{-5}$), which without transformation tend to produce spurious results. Ancestor estimates were back-transformed and, along with the original coefficients, formed the basis for a multivariate ordination to visualise function space occupation using a correlation-based PCA. Ancestor values were estimated for the polynomial coefficients and included in the post hoc PCA (along with the OTU values) instead of being estimated for PC scores of the OTUs from an a priori PCA (using only the OTU polynomial coefficients). The latter computes ancestral function space coordinates (PC scores) from an OTU function space that could vary depending on the ordination method (PCoA instead of PCA, or a variance-covariance matrix basis as opposed to a correlation matrix), whereas the former is more or less concrete (there is no variability in the OTU values). Thus, while the former results in one set of ancestor values, the latter results in variable ancestor values (depending on the ordination method employed on the OTUs).

Fig. S7. ML ancestor estimates of β0. Branch lengths were adjusted so that minimum internal branch length is 3 Ma and ML ancestor estimates were computed for log10 β0. Ancestor estimates were back-transformed to the arithmetic scale.

Fig. S8. ML ancestor estimates of β1. Branch lengths were adjusted so that minimum internal branch length is 4.5 Ma and ML ancestor estimates were computed for log10(-1/β1). Ancestor estimates were back-transformed to the arithmetic scale and shown at $\times 10^{3}$ because values are small, on the order of $10^{-3}$.

Fig. S9.
ML ancestor estimates of β2. ML ancestor estimates were computed for log10 β2. Ancestor estimates were back-transformed to the arithmetic scale and shown at $\times 10^{5}$ because values are small, on the order of $10^{-5}$.

Literature cited in supplementary information

Abramoff, M.D., Magalhães, P.J. & Ram, S.J. 2004. Image processing with ImageJ. Biophotonics Internat. 11: 36-42.
Anderson, P.S.L. 2009. Biomechanics, functional patterns, and disparity in Late Devonian arthrodires. Paleobiol. 35: 321-342.
Blomberg, S.P., Garland, T. & Ives, A.R. 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more labile. Evolution 57: 717-745.
Brusatte, S.L., Benton, M.J., Ruta, M. & Lloyd, G.T. 2008. Superiority, competition, and opportunism in the evolutionary radiation of dinosaurs. Science 321: 1485-1488.
Brusatte, S.L., Benson, R.B.J., Chure, D.J., Xu, X., Sullivan, C. & Hone, D.W.E. 2009. The first definitive carcharodontosaurid (Dinosauria: Theropoda) from Asia and the delayed ascent of tyrannosaurids. Naturwissenschaften 96: 1051-1058.
Brusatte, S.L., Benson, R.B.J., Currie, P.J. & Zhao, X. in press. The skull of Monolophosaurus jiangi (Dinosauria: Theropoda) and its implications for early theropod phylogeny and evolution. Zool. J. Linn. Soc. (doi: 10.1111/j.1096-3642.2009.00563.x)
Diniz-Filho, J.A.F., De Sant'ana, C.E.R. & Bini, L.M. 1998. An eigenvector method for estimating phylogenetic inertia. Evolution 52: 1247-1262.
Felsenstein, J. 1985. Phylogenies and the comparative method. Am. Nat. 125: 1-15.
Hammer, O., Harper, D.A.T. & Ryan, P.D. 2001. PAST: paleontological statistics software package for education and data analysis. Palaeontologia Electronica 4: 99 pp.
Holliday, C.M. 2009. New insights into dinosaur jaw muscle anatomy. Anat. Rec. 292: 1246-1265.
Holliday, C.M. & Witmer, L.M. 2007. Archosaur adductor chamber evolution: Integration of musculoskeletal and topological criteria in jaw muscle homology. J. Morphol. 268: 457-484.
Hunt, G. 2008. paleoTS: Modeling evolution in paleontological time-series. R package version 0.3-1.
Josse, S., Moreau, T. & Laurin, M. 2006. Stratigraphic tools for Mesquite. Available at: http://mesquiteproject.org/packages/stratigraphicTools/.
Kembel, S., Ackerly, D., Blomberg, S., Cowan, P., Helmus, M., Morlon, H. & Webb, C. 2009. picante: R tools for integrating phylogenies and ecology. R package version 0.7-2. http://CRAN.R-project.org/package=picante
Lapp, H. & Maddison, W. 2010. RMesquite: Wrapper for Mesquite methods in R. R package version 0.5-0/r26. http://R-Forge.R-project.org/projects/rmesquite/
Laurin, M. 2004. The evolution of body size, Cope's rule and the origin of amniotes. Syst. Biol. 53: 594-622.
Laurin, M., Canoville, A. & Quilhac, A. 2009. Use of paleontological and molecular data in supertrees for comparative studies: the example of lissamphibian femoral microanatomy. J. Anat. 215: 110-123.
Lloyd, G.T., Davis, K.E., Pisani, D., Tarver, J.E., Ruta, M., Sakamoto, M., Hone, D.W.E., Jennings, R. & Benton, M.J. 2008. Dinosaurs and the Cretaceous Terrestrial Revolution. Proc. R. Soc. Lond B 275: 2483-2490.
Maddison, W.P. 1991. Squared-change parsimony reconstructions of ancestral states for continuous-valued characters on a phylogenetic tree. Syst. Zool. 40: 304-314.
Maddison, W.P. & Maddison, D.R. 2009. Mesquite: a modular system for evolutionary analysis. Version 2.6. http://mesquiteproject.org
Midford, P.E., Garland, T. & Maddison, W.P. 2005. PDAP package of Mesquite. Version 1.07.
Paradis, E., Claude, J. & Strimmer, K. 2004. APE: Analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.
R Core Development Team 2008. R: a Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria.
Ruta, M., Wagner, P.J. & Coates, M.I. 2006. Evolutionary patterns in early tetrapods. I. Rapid initial diversification followed by decrease in rates of character change. Proc. R. Soc. Lond B 273: 2107-2111.
Schluter, D., Price, T., Mooers, A.O. & Ludwig, D. 1997. Likelihood of ancestor states in adaptive radiation. Evolution 51: 1699-1711.
Smith, N.D., Makovicky, P.J., Hammer, W.R. & Currie, P.J. 2007. Osteology of Cryolophosaurus ellioti (Dinosauria: Theropoda) from the Early Jurassic of Antarctica and implications for early theropod evolution. Zool. J. Linn. Soc. 151: 377-421.
Westneat, M.W. 1994. Transmission of force and velocity in the feeding mechanisms of labrid fishes (Teleostei, Perciformes). Zoomorph. 114: 103-118.