Descriptions of the Computer Programs Used in the Analyses

PAUP* (v. 4.0b10; Swofford, 2002) implements a range of deterministic strategies for exploring tree space. Four different strategies were evaluated. First, a series of “successive approximations” was used to estimate both the ML tree and the parameter values of the substitution model (Sullivan et al., 2005). The specific implementation was inspired by Foster (2003) and Sullivan (2005). The neighbor-joining tree based on absolute differences was used as the starting tree, followed by one round of NNI branch swapping, two rounds of SPR swapping and one round of TBR swapping. Between each branch-swapping round the substitution-model parameter values were optimized, and were then fixed at these values for the subsequent tree optimizations. The ApproxLim option was set to the default value (5%) for the NAD4 data, to 2% for the Isospora data and 1% for the HIV data. Second, the default PAUP* strategy was used, based on the stepwise-addition starting tree followed by branch swapping. Three analyses were performed for each data set, using different branch-swapping options: NNI based on ten random addition sequences, SPR based on three random addition sequences, and a single TBR (this latter is the PAUP* default option). In all cases the parameters for the nucleotide-substitution model were fixed at the values determined for the optimal tree found using the successive-approximations strategy above, and using the same ApproxLim percentages. Third, the quartet-puzzling strategy was used, with the default options. Fourth, the star decomposition strategy was used for the NAD4 and Isospora data sets only, with the default options. This latter analysis is impractical for the HIV data set (e.g. the analysis of the Isospora data set took 8.5 weeks on the fastest computer used).
In both cases the parameters for the nucleotide-substitution model were fixed at the values determined for the optimal tree found using the successive-approximations strategy.

Tree-Puzzle (v. 5.2; Schmidt et al., 2002) uses quartet puzzling to explore tree space deterministically, while IQPNNI (v. 3.0; Vinh and von Haeseler, 2004) extends this idea by adding branch swapping and a stopping criterion. PhyNav (v. 1.0; Vinh et al., 2005) creates subsets larger than quartets, which are then stitched together. These programs all have roughly the same user options. For Tree-Puzzle the GTR+G model was used with five categories for the discrete gamma distribution, and GTR+G+I was used for the other two programs. For Tree-Puzzle, the nucleotide-substitution rates were fixed at the values determined for the optimal tree found using the successive-approximations strategy, while all other parameter values were estimated by the program.

PhyML (v. 2.4.4; Guindon and Gascuel, 2003) uses a series of heuristics to explore a part of tree space deterministically. Three versions of each analysis were run, using respectively the NNI branch-swapping strategy described by Guindon and Gascuel (2003), the SPR branch-swapping strategy described by Hordijk and Gascuel (2005) and the hybrid strategy described by Hordijk and Gascuel (2005). All parameter values were estimated by the program.

RAxML-VI (v. 1.0; Stamatakis et al., 2005) has a range of strategies for stochastically exploring parts of tree space. As recommended in the instructions, two different strategies were used for each analysis. First, ten hill-climbing runs were performed based on random-addition-order parsimony starting trees. Second, five simulated-annealing runs were performed based on random-addition-order parsimony starting trees, with the analysis time-period being set to four times the average time of the hill-climbing runs.
All other default options were used, with the parameter values of the GTR+G model being estimated by the program. Note that RAxML-VI-HPC (Stamatakis, 2006), designed specifically for much larger data sets (e.g. >1,000 sequences), employs a different set of search strategies to those evaluated here, which are actually much closer to those implemented in RAxML-V (Stamatakis, 2005).

GARLI (v. 0.93, 0.942, 0.95; Zwickl, 2006) is basically a development of the GAML version 1 (Lewis, 1998) and 2 (Brauer et al., 2002) programs, and uses a genetic algorithm to explore tree space stochastically. The default options were used for all analyses, with all parameter values being estimated by the program. Version 0.93 was used for analysing the main three data sets, version 0.942 for the ancillary analyses, and 0.95 for the HKY analyses (see below). Ten analyses were run for each data set, each starting from a random tree. In addition, for each of the main data sets a single analysis was run starting from the neighbor-joining tree.

TreeFinder (v. May 2006; Jobb et al., 2004) currently uses an unspecified algorithm to explore tree space deterministically (the algorithm described by Jobb et al., 2004 was used by earlier versions). The default options were used for all analyses, except that ten starting trees were created for each analysis using the “random walk” option with the neighbor-joining tree as the centre tree. All parameter values were estimated by the program.

MultiPhyl (v. 1.0.6; Keane, 2006) uses a similar series of heuristics to PhyML in order to explore a part of tree space deterministically. The default options were used for all analyses, with SPR branch swapping. All parameter values were estimated prior to the analysis and then fixed during the tree search. This program was run as a distributed analysis (i.e.
using multiple processors in a heterogeneous computer system) via the online service provided by the Heterogeneous Distributed Computing group at the National University of Ireland, Maynooth (http://www.cs.nuim.ie/distributed/multiphyl.php).

DPRml (Keane et al., 2005) is basically an extension of the fastDNAml program (Olsen et al., 1994) that accepts a wider range of nucleotide-substitution models, using a series of heuristics to explore tree space deterministically. It thus represents the older style of search strategy, which the more recent programs are intended to supplant. This program was also run as a distributed analysis via the facilities of the Heterogeneous Distributed Computing group, using the default options.

REFERENCES

Brauer, M. J., M. T. Holder, L. A. Dries, D. J. Zwickl, P. O. Lewis, and D. M. Hillis. 2002. Genetic algorithms and parallel processing in maximum-likelihood phylogeny inference. Mol. Biol. Evol. 19:1717–1726.
Foster, P. G. 2003. Likelihood in molecular phylogenetics. Unpublished notes used for Molecular Systematics course. Natural History Museum, London, U.K. July 2001; September 2003. (http://www.ch.embnet.org/CoursEMBnet/PHYL03/Slides/unix_like_pfoster.pdf; http://bioinf.ncl.ac.uk/molsys/data/like.pdf)
Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704.
Hordijk, W., and O. Gascuel. 2005. Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood. Bioinformatics 21:4338–4347.
Jobb, G., A. von Haeseler, and K. Strimmer. 2004. Treefinder: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol. Biol. 4:18.
Keane, T. M. 2006. Computational methods for statistical phylogenetic inference. Ph.D. thesis, The National University of Ireland Maynooth, Ireland.
Keane, T. M., T. J. Naughton, S. A. A. Travers, J. O. McInerney, and G. P. McCormack. 2005.
DPRml: distributed phylogeny reconstruction by maximum likelihood. Bioinformatics 21:969–974.
Lewis, P. O. 1998. A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data. Mol. Biol. Evol. 15:277–283.
Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. FastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comp. Appl. Biosci. 10:41–48.
Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. Tree-Puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504.
Stamatakis, A. 2005. An efficient program for phylogenetic inference using simulated annealing. Page 198b in Proceedings of the 19th international parallel and distributed processing symposium (IPDPS’05), and the 4th international workshop on high performance computational biology (HiComB’05). IEEE Press, Piscataway NJ.
Stamatakis, A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.
Stamatakis, A., T. Ludwig, and H. Meier. 2005. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21:456–463.
Sullivan, J. 2005. Maximum likelihood methods for phylogeny estimation. Meth. Enzymol. 395:757–779.
Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland MA.
Vinh, L. S., H. A. Schmidt, and A. von Haeseler. 2005. PhyNav: a novel approach to reconstruct large phylogenies. Pages 386–393 in Classification, the ubiquitous challenge: proceedings of the 28th annual conference of the Gesellschaft für Klassifikation e.V. (C. Weihs and W. Gaul, eds). Springer-Verlag, Heidelberg.
Vinh, L. S., and A. von Haeseler. 2004. IQPNNI: moving fast through tree space and stopping in time. Mol. Biol. Evol. 21:1565–1571.
Zwickl, D. J. 2006.
Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. thesis, The University of Texas at Austin, U.S.A.

Maximum-Likelihood Computer Programs Available

Computer programs currently available that implement the maximum-likelihood criterion for evaluating phylogenetic trees based on nucleotide data, and which were considered for inclusion in the evaluation.
_______________________________________________________________________________________________________
Program      Version    Reference                       Internet access
_______________________________________________________________________________________________________
Programs used
DPRml        1.0        Keane et al. (2005)             http://www.cs.nuim.ie/distributed/
GARLI        0.951 a    Zwickl (2006)                   http://www.bio.utexas.edu/faculty/antisense/garli/Garli.html
IQPNNI       3.0.1 b    Vinh and von Haeseler (2004)    http://www.cibiv.at/software/iqpnni/
MultiPhyl    1.0.6      Keane (2006)                    http://www.cs.nuim.ie/distributed/multiphyl.php
PAUP*        4.0b10     Swofford (2002)                 http://paup.csit.fsu.edu/
PhyML        2.4.4      Guindon and Gascuel (2003)      http://atgc.lirmm.fr/phyml/
PhyNav       1.0        Vinh et al. (2005)              http://www.cibiv.at/software/phynav/
RAxML-VI     1.0 c      Stamatakis et al. (2005)        http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm
Tree-Puzzle  5.2        Schmidt et al. (2002)           http://www.tree-puzzle.de/
TreeFinder   May 2006   Jobb et al. (2004)              http://www.treefinder.de/

Programs not implementing the reversible GTR+G substitution model
FastDNAml    1.1.1a     Olsen et al. (1994)             http://geta.life.uiuc.edu/~gary/programs/fastDNAml.html
MetaPIGA     1.02       Lemmon and Milinkovitch (2002)  http://www.ulb.ac.be/sciences/ueg/html_files/MetaPIGA.html
MOLPHY       2.3b3      Adachi and Hasegawa (1996)      http://www.ism.ac.jp/ismlib/softother.e.html
NHML         3          Galtier and Gouy (1998)         www.genetix.univ-montp2.fr/nhml.htm
Phylip       3.66       Felsenstein (1989)              http://evolution.genetics.washington.edu/phylip.html
POY          3.0.11     Wheeler (2006)                  http://research.amnh.org/scicomp/projects/poy.php
TrExML       1.0        Wolf et al. (2000)              http://whitetail.bemidjistate.edu/trexml/trexml.man.html
SSA          1.0        Salter and Pearl (2001)         http://www.stat.unm.edu/~salter/software/ssa/ssa.html
SEMPHY       2.0        Friedman et al. (2002)          http://compbio.cs.huji.ac.il/semphy/

Programs not designed for extensive tree searches
APE d        1.8-4      Paradis et al. (2004)           http://cran.r-project.org/src/contrib/Descriptions/ape.html
DAMBE        4.5.20     Xia and Xie (2001)              http://dambe.bio.uottawa.ca/dambe.asp
HyPhy        0.99beta   Kosakovsky Pond et al. (2005)   http://www.hyphy.org/
PAML         3.15       Yang (1997)                     http://abacus.gene.ucl.ac.uk/software/paml.html
PHASE        2.0        Jow et al. (2002)               http://www.bioinf.manchester.ac.uk/resources/phase/
P4           0.83       Foster (2006)                   http://www.nhm.ac.uk/research-curation/projects/software/p4.html
_______________________________________________________________________________________________________
a Current release, but version 0.93 was used for the principal analyses and 0.942 for the other analyses.
b Current release, but version 3.0 was used for most of the analyses.
c The current release is RAxML-VI-HPC v2.2.0 (Stamatakis, 2006), which is actually quite a different program.
d PhyML can be used in conjunction with APE.

References

Adachi, J., and M. Hasegawa. 1996. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput. Sci. Monogr. 28:1–150.
Foster, P. G. 2006. P4, a python package for phylogenetics. Distributed by the author. Department of Zoology, Natural History Museum, London, U.K. July 2006.
Friedman, N., M. Ninio, I. Pe'er, and T. Pupko. 2002. A structural EM algorithm for phylogenetic inference. J. Computat. Biol. 9:331–353.
Galtier, N., and M. Gouy. 1998.
Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol. Biol. Evol. 15:871–879.
Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704.
Jobb, G., A. von Haeseler, and K. Strimmer. 2004. Treefinder: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol. Biol. 4:18.
Jow, H., C. Hudelot, M. Rattray, and P. Higgs. 2002. Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution. Mol. Biol. Evol. 19:1591–1601.
Keane, T. M. 2006. Computational methods for statistical phylogenetic inference. Ph.D. thesis, The National University of Ireland Maynooth, Ireland.
Keane, T. M., T. J. Naughton, S. A. A. Travers, J. O. McInerney, and G. P. McCormack. 2005. DPRml: distributed phylogeny reconstruction by maximum likelihood. Bioinformatics 21:969–974.
Kosakovsky Pond, S. L., S. D. W. Frost, and S. V. Muse. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.
Lemmon, A. R., and M. C. Milinkovitch. 2002. The metapopulation genetic algorithm: an efficient solution for the problem of large phylogeny estimation. Proc. Nat. Acad. Sci. U.S.A. 99:10516–10521.
Olsen, G. J., H. Matsuda, R. Hagstrom, and R. Overbeek. 1994. FastDNAml: a tool for construction of phylogenetic trees of DNA sequences using maximum likelihood. Comp. Appl. Biosci. 10:41–48.
Paradis, E., J. Claude, and K. Strimmer. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. Tree-Puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504.
Stamatakis, A., T. Ludwig, and H. Meier. 2005. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees.
Bioinformatics 21:456–463.
Swofford, D. L. 2002. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland MA.
Vinh, L. S., H. A. Schmidt, and A. von Haeseler. 2005. PhyNav: a novel approach to reconstruct large phylogenies. Pages 386–393 in Classification, the ubiquitous challenge: proceedings of the 28th annual conference of the Gesellschaft für Klassifikation e.V. (C. Weihs and W. Gaul, eds). Springer-Verlag, Heidelberg.
Vinh, L. S., and A. von Haeseler. 2004. IQPNNI: moving fast through tree space and stopping in time. Mol. Biol. Evol. 21:1565–1571.
Wheeler, W. C. 2006. Dynamic homology and the likelihood criterion. Cladistics 22:157–170.
Wolf, M. J., S. Easteal, M. Kahn, B. D. McKay, and L. S. Jermiin. 2000. TrExML: a maximum likelihood approach for extensive tree-space exploration. Bioinformatics 16:383–394.
Xia, X., and Z. Xie. 2001. DAMBE: data analysis in molecular biology and evolution. J. Hered. 92:371–373.
Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. thesis, The University of Texas at Austin, U.S.A.

Input File for the Ratchet(Nixon) Analyses

#NEXUS

[ This is a fully commented setup file that can be used to implement the likelihood ratchet using the PAUPRat program of Derek Sikes and Paul Lewis:
     http://www.ucalgary.ca/~dsikes/software2.htm
     Sikes, D.S. & Lewis, P.O. 2001. Beta software, version 1. PAUPRat: PAUP* implementation of the parsimony ratchet. Distributed by the authors. Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, USA. June 2001.
You need to obtain a copy of the PAUPRat program, and to read its Instruction manual. Basically, you first run this setup file through PAUPRat, and then you run the PAUPRat output file through PAUP*.
The input files for PAUP* are your data file and the control file created by PAUPRat from this setup template.
The output files from PAUP* are:
     lratchet.log - a text file with the results
     lratchet.tre - a treefile with the optimal trees from each iteration
     model.out - a text file with the model parameter values used
     lratchet.tmp - a temporary file that you can discard.
]

[ The original idea for the parsimony ratchet was by Kevin Nixon:
     Nixon, K.C. 1999. The parsimony ratchet: a new method for rapid parsimony analysis. Cladistics 15: 407-414.
]

[ The original version of the likelihood ratchet was by Rutger Vos:
     Vos, R.A. 2003. Accelerated likelihood surface exploration: the likelihood ratchet. Systematic Biology 52: 368-373.
The original setup file (June 2002) was downloaded from:
     http://www.sfu.ca/~rvosa/likelihoodratchet
]

[ Modifications (November 2006) were by David Morrison, to implement the 'ratchet' part of the procedure, as this was missing from the Vos version (which generates a new starting tree for each iteration). Also, the strategy now provides a series of initial "successive approximations" to estimate both the starting tree and the substitution-model parameter values. Finally, the tree-search strategy has been optimized for maximum-likelihood analyses of up to 150 sequences.
]

[ The successive approximations were based on the ideas of:
     Sullivan, J., Abdo, Z., Joyce, P. & Swofford, D.L. 2005. Evaluating the performance of a successive-approximations approach to parameter optimization in maximum-likelihood phylogeny estimation. Molecular Biology & Evolution 22: 1386-1392.
The specific implementation was inspired by:
     Sullivan, J. 2005. Maximum likelihood methods for phylogeny estimation. Methods in Enzymology 395: 757-779.
and by Peter Foster:
     http://bioinf.ncl.ac.uk/molsys/data/like.pdf
     http://www.ch.embnet.org/CoursEMBnet/PHYL03/Slides/unix_like_pfoster.pdf
     Foster, P.G. 2001. Likelihood in Molecular Phylogenetics. Unpublished notes used for Molecular Systematics course. Natural History Museum, London, UK. July, 2001; September 2003.
]

[ There are seven settings in this setup file that you might need to change:

(1) You must modify the 'nchar' command to match your data set.

(2) The default number of re-weighting iterations is 10, and the percentage of characters to re-weight is 25. This produces 11 trees (the initial tree plus 10 attempts to change island). You can change these values using the 'nreps' and 'pct' commands (e.g. nreps=20 pct=15).

(3) The default re-weighting scheme treats all of the characters as equal. You can change this using the 'wtmode' command.

(4) The default substitution model is GTR+G+I (general time reversible, with gamma-distributed site-to-site variation and a proportion of invariable sites). If you want to use a different model, then you need to change all of the 'LScores' and 'LSet' commands. Note that the complexity of the model does not affect the speed of the tree searches (since the model is fixed for all searches), but does affect the speed of model estimation during the initial successive approximations.

(5) The default tree-search strategy is SPR (subtree-prune-regraft) (the PAUP* default is TBR, intended for parsimony searches). If you want to use a different strategy, then you need to change all of the 'Swap=spr' commands. If you want separate strategies for the re-weighted and unweighted searches, then you need to change the commands labelled 'rewtdcmd' and 'normcmd', respectively.

(6) During the tree search the log-likelihood scores are not fully optimized unless they are within 2% of the current optimum value (the PAUP* default is 5%, intended for <50 sequences). If you want to use a different strategy, then you can change this using the 'ApproxLim' command (e.g. ApproxLim=1 for data sets with larger negative log-likelihoods). Note that this value can make a big difference to how long the ratchet takes to run; even a change in value of 0.01% can be important for large data sets (multiple genes for >100 sequences). For a discussion, see:
     Rogers J.S.
& Swofford, D.L. 1998. A fast method for approximating maximum likelihoods of phylogenetic trees from nucleotide sequences. Systematic Biology 47: 77-89.

(7) Only one tree is saved during the re-weighted tree search, on the principle that the optimal tree does not necessarily have to be found for this search (only for the unweighted search). If you do want to find the optimal tree, then you need to change the 'MulTrees=no' command. Also, you might like to consider using the 'RearrLimit' or 'TimeLimit' commands if you wish to prevent unduly long re-weighted searches.
]

[Start of instructions. Don't change.]
begin pauprat;

[Enter the number of characters after nchar= on the following line.]
dimensions nchar=10922;

[Enter the number of iterations after 'nreps=' and the fraction of characters drawn after 'pct=' on the following line. The default values seem to work, but you can always use more replicates and a greater percentage (probably up to 35%, as for the parsimony ratchet) if you expect a very complex landscape, or if you have a small data set and/or a very fast computer. 'Seed=0' sets a randomly chosen random-number seed, but you can pre-specify a particular seed if you want exact repetition of the characters chosen for re-weighting.]
set seed=0 nreps=10 pct=25;

[Choose the weighting mode. The choices are: additive, multiplicative, uniform. Typically, the default works fine unless you are using a weighting scheme (i.e. a 'WtSet' command) based on codon positions, in which case you might want to try 'mult'.]
set wtmode=uniform;

[Don't change this unless you want a lot of output.]
set terse;

[Opening message.]
startcmd "[!* * * * * * * * * * * * * * * * * * *]";
startcmd "[!* ----- Likelihood Ratchet v2 ----  *]";
startcmd "[!* David A. Morrison                 *]";
startcmd "[!* Sveriges Lantbruksuniversitet     *]";
startcmd "[!* November, 2006                    *]";
startcmd "[!* * * * * * * * * * * * * * * * * * *]";

[Record the current time.]
startcmd "Time";

[The *.log file stores PAUP*'s display buffer.]
startcmd "Log File=lratchet.log";

[Automatically increase the 'maxtrees' setting. Don't change.]
startcmd "Set Increase=auto";

[Get the starting tree. No need to change unless you want to specify a user starting tree, in which case use the 'GetTrees' command.]
startcmd "DSet Dist=logdet Objective=ME Rates=equal PInv=0 Subst=all NegBrLen=setzero";
startcmd "NJ BioNJ=yes ShowTree=no BrLens=no BreakTies=systematic";

[Set the optimality criterion to ML. Don't change.]
startcmd "Set Criterion=likelihood";

[Optimize the substitution-model parameters.]
startcmd "LScores 1 / NST=6 BaseFreq=estimate RMatrix=estimate Rates=gamma Shape=estimate PInvar=estimate";

[The *.tmp file contains the current working tree. It can be used to re-start a ratchet run that has been interrupted. Don't change.]
startcmd "SaveTrees File=lratchet.tmp Replace";
startcmd "Time";

[Do an NNI search based on these parameter estimates, and then optimize the substitution-model parameters again.]
startcmd "LSet BaseFreq=previous NST=6 RMatrix=previous Rates=gamma Shape=previous PInvar=previous ApproxLim=2 AdjustAppLim=no";
startcmd "HSearch Status=no Start=current Swap=nni MulTrees=yes";
startcmd "SaveTrees File=lratchet.tmp Replace";
startcmd "LScores 1 / NST=6 BaseFreq=estimate RMatrix=estimate Rates=gamma Shape=estimate PInvar=estimate";
startcmd "Time";

[Do an SPR search based on these parameter estimates, and then optimize the substitution-model parameters again. Save the model parameter values to the model.out file. The 'LongFmt' option is used only to deal with a long-standing bug in PAUP* version 4b10.]
startcmd "LSet BaseFreq=previous NST=6 RMatrix=previous Rates=gamma Shape=previous PInvar=previous ApproxLim=2 AdjustAppLim=no";
startcmd "HSearch Status=no Start=current Swap=spr MulTrees=yes";
startcmd "SaveTrees File=lratchet.tmp Replace";
startcmd "Default LScores LongFmt=yes";
startcmd "LScores 1 / NST=6 BaseFreq=estimate RMatrix=estimate Rates=gamma Shape=estimate PInvar=estimate ScoreFile=model.out Replace";
startcmd "Default LScores LongFmt=no";
startcmd "Time";

[The *.tre file contains the set of solutions for the initial tree plus all subsequent iterations. There will thus be at least nreps+1 trees in this file at the end. Don't change.]
startcmd "SaveTrees File=lratchet.tre Replace";

[Set the substitution-model parameters for the likelihood model used in all subsequent iterations.]
startcmd "LSet BaseFreq=previous NST=6 RMatrix=previous Rates=gamma Shape=previous PInvar=previous ApproxLim=2 AdjustAppLim=no";

[Commands for the branch-swapping cycles under the re-weighted scheme. This is the tree search that tries to get to another island of trees.]
rewtdcmd "HSearch Status=no Start=1 Swap=spr MulTrees=no";

[Updates the *.tmp file to contain the current tree. Don't change.]
rewtdcmd "SaveTrees File=lratchet.tmp Replace";
rewtdcmd "Time";

[Commands for the branch-swapping cycles under the original weighting scheme. This is the tree search that tries to find the peak of the island.]
normcmd "HSearch Status=no Start=1 Swap=spr MulTrees=yes";

[Update the *.tmp file to contain the current starting tree. Don't change.]
normcmd "SaveTrees File=lratchet.tmp Replace";

[Update the set of optimal trees over all iterations. Note that both the 'GetTrees' and 'SaveTrees' commands are used in order to get all of the trees into a single Trees block in the treefile (the default in PAUP* is to create a separate block for each ratchet iteration). Don't change.]
normcmd "GetTrees Rooted=no Unrooted=yes File=lratchet.tre Mode=7";
normcmd "SaveTrees File=lratchet.tre Replace";
normcmd "GetTrees Rooted=no Unrooted=yes File=lratchet.tmp Mode=3 Warntree=no";
normcmd "Time";

[Retrieve the final set of optimal trees at the end of the ratchet search. Don't change.]
stopcmd "GetTrees File=lratchet.tre Mode=3";

[Print the negative log-likelihoods and the between-tree distances for the set of optimal solutions. Note that the trees are numbered in reverse order (i.e. the final-iteration tree is #1). There will be more than nreps+1 trees if some of the iterations found several equally optimal trees. Don't change.]
stopcmd "LScores All / SortTrees=yes";
stopcmd "TreeDist Metric=symdiff";
stopcmd "Time";

[Stop the logging of the display buffer.]
stopcmd "Log Stop";

[Final message.]
stopcmd "[!* * * * * * * * * * * * * * * * *]";
stopcmd "[!* -- THIS SEARCH IS COMPLETE -  *]";
stopcmd "[!* A LOG FILE HAS BEEN WRITTEN   *]";
stopcmd "[!* AND ALL TREES HAVE BEEN SAVED *]";
stopcmd "[!* IT IS OKAY TO QUIT PAUP       *]";
stopcmd "[!* * * * * * * * * * * * * * * * *]";
stopcmd "Quit";

[Define the name of the ratchet script file.]
write file=lratchet.nex;

end;
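
The setup file's 'nreps', 'pct', 'wtmode' and 'ApproxLim' settings can be illustrated outside PAUP* with a short Python sketch. This is not the PAUPRat or PAUP* code, only an illustration of the arithmetic implied by the comments in the setup file. The function names (ratchet_weights, within_approx_lim) are hypothetical. Two assumptions are made about the re-weighting: that exactly pct percent of the characters (rounded down) are drawn without replacement, and that in the 'uniform' mode the drawn characters receive weight 2 while all others keep weight 1. The ApproxLim check simply restates the rule given in the comments (a score is fully optimized only if the approximate negative log-likelihood is within the stated percentage of the current optimum), and may differ in detail from what PAUP* does internally.

```python
import random


def ratchet_weights(nchar, pct, rng):
    """Character weights for one re-weighted search (illustrative only)."""
    # Assumption: floor(nchar * pct / 100) characters are drawn without
    # replacement; PAUPRat's own sampling rule may differ.
    n_selected = nchar * pct // 100
    selected = set(rng.sample(range(nchar), n_selected))
    # Assumption: drawn characters get weight 2, all others weight 1.
    return [2 if i in selected else 1 for i in range(nchar)]


def within_approx_lim(approx_neg_lnl, best_neg_lnl, approx_lim_pct):
    """True if an approximate score is close enough to be fully optimized."""
    allowed = best_neg_lnl * approx_lim_pct / 100.0
    return approx_neg_lnl - best_neg_lnl <= allowed


if __name__ == "__main__":
    rng = random.Random(1)  # a fixed seed gives a repeatable draw (cf. 'seed=')
    weights = ratchet_weights(nchar=10922, pct=25, rng=rng)
    print(len(weights), weights.count(2))        # 10922 2730

    nreps = 10
    print("minimum number of trees:", nreps + 1)  # initial tree + 10 iterations

    # Hypothetical scores: 50900 is 1.8% worse than 50000, 51100 is 2.2% worse.
    print(within_approx_lim(50900.0, 50000.0, 2))  # True
    print(within_approx_lim(51100.0, 50000.0, 2))  # False
```

With the values used in the setup file (nchar=10922, pct=25), about 2730 characters would be up-weighted in each of the 10 re-weighted searches, and under ApproxLim=2 a rearrangement scoring more than 2% worse than the current optimum would be discarded without full optimization.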