Rainbow: A Toolbox for Phylogenetic Supertree Construction and Analysis

D. Chen, O. Eulenstein, and D. Fernández-Baca
Department of Computer Science, Iowa State University, Ames, IA 50011, USA

ABSTRACT

Summary: Rainbow is a program that provides a graphical user interface (GUI) for constructing supertrees using different methods. It also provides tools for analyzing the quality of the supertrees produced. Rainbow is available for Mac OS X, Windows, and Linux.

Availability: Rainbow is free open-source software. Its binary files, source code, and manual can be downloaded from the Rainbow webpage: http://genome.cs.iastate.edu/Rainbow/.

Contact: duhong@iastate.edu

Phylogenetic supertrees are typically rooted evolutionary trees assembled from smaller rooted trees that share some, but not necessarily all, taxa (leaf nodes). Supertrees can make novel statements about relationships among taxa that do not co-occur on any single source tree, while retaining hierarchical information from the source trees. As a method of combining existing phylogenetic information to produce more inclusive phylogenies, the supertree approach can potentially resolve problems associated with other methods, such as absence of homologous characters, incompatible data types, or non-overlapping sets of taxa (Sanderson et al., 1998; Bininda-Emonds et al., 2002).

Supertree construction methods can be classified as either matrix-based or not. Examples of the latter include most consensus supertree methods, the MinCut (MC) algorithm (Semple and Steel, 2000), and the Modified MinCut (MMC) algorithm (Page, 2002). Matrix-based supertree methods encode each source tree as a matrix; the matrices are then combined and analyzed under an optimization criterion. Two examples of this approach are Matrix Representation with Parsimony (MRP) (Baum, 1992; Ragan, 1992) and Matrix Representation with Flipping (MRF) (Chen et al., 2003; Eulenstein et al., 2004).
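To make the matrix encoding concrete, the following is a minimal Python sketch of the standard Baum/Ragan coding as it is commonly described: each internal, non-root node of a source tree yields one column, with 1 for taxa in that node's cluster, 0 for other taxa of the same tree, and ? for taxa absent from the tree. The nested-tuple tree representation and the names `clusters` and `mrp_matrix` are assumptions made for illustration; this is not taken from Rainbow's source code, and it omits the all-zero outgroup row that is often added for rooting.

```python
def clusters(tree):
    """Return (leaf set, clusters of internal non-root nodes) of a rooted tree.

    A tree is a taxon name (str) or a tuple of subtrees (hypothetical format).
    """
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
        if len(child_leaves) > 1:
            # The child is an internal node, so its leaf set is a cluster.
            found.append(child_leaves)
    return leaves, found


def mrp_matrix(source_trees):
    """Encode source trees as one combined matrix (rows: taxa, columns: clusters)."""
    per_tree = [clusters(tree) for tree in source_trees]
    taxa = sorted(set().union(*(leaves for leaves, _ in per_tree)))
    rows = {taxon: [] for taxon in taxa}
    for leaves, tree_clusters in per_tree:
        for cluster in tree_clusters:
            for taxon in taxa:
                if taxon in cluster:
                    rows[taxon].append("1")   # taxon is inside the cluster
                elif taxon in leaves:
                    rows[taxon].append("0")   # in this tree, outside the cluster
                else:
                    rows[taxon].append("?")   # taxon missing from this tree
    return taxa, {taxon: "".join(cells) for taxon, cells in rows.items()}


if __name__ == "__main__":
    trees = [(("A", "B"), "C"), (("B", "C"), "D")]
    taxa, rows = mrp_matrix(trees)
    for taxon in taxa:
        print(taxon, rows[taxon])
```

A matrix of this kind would then be handed to an optimization step, such as a parsimony search, to obtain the supertree.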
Supertree programs available on the Internet include Thorley et al.’s RadCon (Thorley et al., 2000), Salamin’s SuperTree 0.85b (Salamin et al., 2002), and Page’s supertree-0.3 (Page, 2003a). The first two compute MRP matrices from phylogenetic trees, but do not integrate parsimony analyses of the output matrices. Page’s program is a console application that constructs MC and MMC supertrees.

Rainbow is a supertree analysis tool for Mac OS X, Windows, and Linux that integrates graphical tree display, supertree construction, and result analysis. Rainbow uses the NCL C++ library (Lewis, 2003) to interpret tree files in Nexus format (Maddison et al., 1997), and the TreeLib C++ class library (Page, 2003b) to display trees. It can construct MRF, MRP, and MMC supertrees. The MRP and MMC supertree implementations require two external programs: PAUP*4 (Swofford, 2002), which is used for parsimony analyses, and Page’s supertree-0.3 program, which is needed to obtain MMC supertrees.

Rainbow assesses the accuracy of constructed supertrees by how well they fit their source trees. The fit is measured by the size of the normalized maximal agreement subtree (Eulenstein et al., 2004), the symmetric difference, and the triplet fit (Page, 2002).

Figure 1. A screenshot of Rainbow under Mac OS X. Right frame: The tree navigation bar, showing the currently open files. Bottom right frame: A source tree whose weight is being set. Top center: The supertree wizard. Left frame: One of the resulting supertrees. Partially hidden from view are the log and report windows (bottom left and center, respectively).

Figure 1 is a screenshot of a typical Rainbow session. A supertree wizard guides the user in setting parameters for the various supertree methods, such as the number of random-addition-sequence replications to be performed, the maximum number of tied trees to be kept, and, for the MRF and MRP heuristics, the branch swapping methods to be applied.
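As an illustration of one of the fit measures named above, the symmetric difference, here is a minimal Python sketch. It assumes rooted trees written as nested tuples, restricts the supertree to the taxa of a source tree, and counts the clusters found in exactly one of the two trees. The function names are hypothetical, the count is not normalized, and nothing here is claimed to match how Rainbow computes or scales the measure.

```python
def leaves(tree):
    """Return the set of taxa in a tree (a str leaf or a tuple of subtrees)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """Prune a tree to the taxa in `keep`, suppressing nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [sub for sub in (restrict(child, keep) for child in tree)
            if sub is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return tuple(kept)


def proper_clusters(tree):
    """Return the clusters (leaf sets) of all internal, non-root nodes."""
    result = set()

    def visit(node, is_root):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child, False) for child in node))
        if not is_root:
            result.add(below)
        return below

    visit(tree, True)
    return result


def symmetric_difference(supertree, source_tree):
    """Count clusters present in exactly one of the two trees.

    The supertree is first restricted to the source tree's taxa. The value
    is a raw count; any normalization is left out of this sketch.
    """
    induced = restrict(supertree, leaves(source_tree))
    induced_clusters = proper_clusters(induced) if induced is not None else set()
    return len(induced_clusters ^ proper_clusters(source_tree))


if __name__ == "__main__":
    supertree = ((("A", "B"), "C"), "D")
    print(symmetric_difference(supertree, (("B", "C"), "D")))
    print(symmetric_difference(supertree, (("A", "C"), "B")))
```

A distance of zero means the supertree displays every cluster of the source tree; larger values indicate more conflicting or unresolved groupings.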
Rainbow also provides a dialog box for setting source-tree weights, so the user can construct weighted MRF, MRP, and MMC supertrees. A log window reports the progress of a supertree construction, and a report window displays the accuracy analysis of the constructed supertrees.

A Rainbow manual with hands-on examples is available through the Rainbow webpage. Future plans include the integration of Quartet Suite 1.0 (Piaggio-Talice, 2004), an implementation of the quartet-based supertree method of Piaggio-Talice et al. (2004).

ACKNOWLEDGMENTS

Rod Page generously provided his programs for calculating MMC supertrees and the TreeLib library for handling phylogenetic trees. J.G. Burleigh and Mike Sanderson tested the software and offered many helpful suggestions. This work was supported in part by NSF grants 1053164 (Eulenstein), CCR9988348 (Fernández-Baca), and EF-0334832 (Eulenstein, Fernández-Baca).

REFERENCES

Baum, B. (1992) Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon, 41, 3–10.
Bininda-Emonds, O.R.P., Gittleman, J.L., and Steel, M.A. (2002) The (super)tree of life: Procedures, problems, and prospects. Ann. Rev. Ecol. Syst., 33, 265–289.
Chen, D., Diao, L., Eulenstein, O., Fernández-Baca, D., and Sanderson, M.J. (2003) Flipping: A supertree construction method. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 61, 135–160.
Eulenstein, O., Chen, D., Burleigh, J.G., Fernández-Baca, D., and Sanderson, M.J. (2004) Performance of flip-supertree construction with a heuristic algorithm. Syst. Biol., 53(2), 1–10.
Lewis, P.O. (2003) NCL: a C++ class library for interpreting data files in NEXUS format. Bioinformatics, 19(17), 2330–2331.
Maddison, D.R., Swofford, D.L., and Maddison, W.P. (1997) NEXUS: An extensible file format for systematic information. Syst. Biol., 46(4), 590–621.
Page, R.D.M. (2002) Modified mincut supertrees. In R. Guigó and D.
Gusfield (eds), WABI 2002, LNCS, 2452, 537–551.
Page, R.D.M. (2003a) http://darwin.zoology.gla.ac.uk/~rpage/supertree/index.html.
Page, R.D.M. (2003b) http://taxonomy.zoology.gla.ac.uk/rod/treeview.html.
Piaggio-Talice, R. (2003) http://genome.cs.iastate.edu/supertree/downloads/download.html.
Piaggio-Talice, R., Burleigh, J.G., and Eulenstein, O. (To appear) Quartet Supertrees. In Supertrees, the Book, O. P. Bininda-Emonds (ed.), Kluwer Academic Press.
Ragan, M.A. (1992) Phylogenetic inference based on matrix representation of trees. Molecular Phylogenetics and Evolution, 1, 53–58.
Salamin, N., Hodkinson, T.R., and Savolainen, V. (2002) Building supertrees: an empirical assessment using the grass family (Poaceae). Syst. Biol., 51(1), 134–150.
Sanderson, M.J., Purvis, A., and Henze, C. (1998) Phylogenetic supertrees: Assembling the trees of life. Trends Ecol. Evol., 13, 105–109.
Semple, C. and Steel, M. (2000) A supertree method for rooted trees. Discrete Applied Mathematics, 105, 147–158.
Swofford, D.L. (2002) PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sinauer Associates, Sunderland, Massachusetts.
Thorley, J.L. (2000) RadCon: phylogenetic tree comparison and consensus. Bioinformatics, 16(5), 486–487.