Introduction to PAUP and MacClade for SEP245

About this guide

This guide describes PAUP and MacClade, explains how they differ, and outlines their navigation structures. It is meant as an introduction for students, so many of the more esoteric options of both applications are not discussed here. This is not a lecture on systematic theory, so you won’t find definitions of jargon like ‘parsimony’, ‘rooted’ etc. Where applicable, there will be suggestions for further reading. A basic knowledge of MacOS principles and practices is expected. Commands like ‘OK’ or ‘Apply’ are therefore not mentioned, as they are self-evident in the given situations. Notes and hints are in italics.

PAUP

Introduction

The influence of high-speed computer analysis of molecular, morphological and/or behavioural data to infer phylogenetic relationships has expanded well beyond its central role in evolutionary biology, and now encompasses applications in areas as diverse as conservation biology, ecology, and forensic studies. The success of previous versions of PAUP: Phylogenetic Analysis Using Parsimony has made it the most widely used software package for the inference of evolutionary trees. With the inclusion of maximum likelihood and distance methods in PAUP* 4.0, the new version represents a great improvement over its predecessors. In addition, the speed of the branch-and-bound algorithm has been enhanced and a number of new features have been added, from agreement subtrees to tests for combinability of data and permutation tests for nonrandomness of data structure.

There are several different versions of PAUP available, some of which use a command-line interface (the MS-DOS and UNIX versions) while others use a graphical user interface (MacOS and Windows 95/NT). This guide focuses on PAUP* 4.0 for Macintosh.

Concepts and navigation of PAUP* 4.0 for Macintosh

PAUP has two modes: ‘edit mode’ and ‘execute mode’.
The edit mode shows the current data matrix and additional settings you’ve made; the execute mode logs searches you’ve performed and shows their results (including trees). Each mode has its own window.

The main menu of PAUP consists of:

File      Nothing unusual here except the ‘execute file’ command (which opens the execute mode).
Edit      The regular ‘edit’ commands; controls for the scrolling ‘execute’ window.
Options   General program settings and preferences.
Data      Manipulation of the data matrix: weight, ordered/unordered. Analyses other than phylogenetic analysis (e.g. base frequencies).
Analysis  Search settings and commands: parsimony, likelihood, distance. Various search methods.
Trees     Manipulation and analysis of trees. Also opening, saving and printing of trees.
Window    Display and window controls.
Help      Toggles ‘balloon help’ on and off; gives access to an obsolete help file.

All settings and commands for searching and analysing are in the ‘Data’, ‘Analysis’ and ‘Trees’ menus. Apart from these menus you’ll probably only use the ‘File’ menu.

Common PAUP procedures

Opening and executing a file

When opening a file, you’ll be offered the choice between ‘edit’ and ‘execute’ mode. If you choose ‘edit’ mode, a window opens with the data matrix and some additional comments. Before anything else can be done, the file needs to be ‘executed’ (File > Execute File). This opens the ‘execute’ window. After this the program is ready for further tweaking of parameters.

When opening in ‘execute’ mode, the current file is immediately processed and the window with the current matrix (the ‘edit’ window) doesn’t open. This mode doesn’t provide the user with as much feedback and information as the ‘edit’ mode and is therefore not recommended for novice users.

Editing data

After executing, weights can be assigned (Data > Set Character Weights) (this includes weighting for different codon positions.
Note that this type of weighting only works after codon positions have been calculated in MacClade and saved with the file) and characters can be ordered (this includes Ti/Tv ordering. Note that this feature is only available after a Ti/Tv matrix has been entered using MacClade. If this has been done, you’ll see a small matrix with the Ti/Tv ratios at the bottom of the ‘edit’ window). Outgroups are also defined in this menu (Data > Define outgroups).

Running a search

The program is now ready to start searching for the best tree among the many possible ones. Which search algorithm you use depends on the size of your data set. There’s a good explanation of the various algorithms on pp. 376-378 of Freeman & Herron [1]. A rule of thumb for the application of the various search methods would be:

Small data sets: exhaustive search
Larger data sets: branch-and-bound
Large data sets: heuristic search

Start a search method (Analysis > [chosen search method] > Search). Unless stated otherwise, go with the default settings of the algorithms. They’re fine. Searching may take some time. Click ‘Close’ to close the status window and return to the ‘execute’ window. The trees that were found are now stored, and the search settings and results are logged in the ‘execute’ window.

Visualizing trees

The first thing you’ll probably want to do once the search is finished is look at the trees (Trees > Show trees). The resulting trees can now be evaluated, and consensus trees can be constructed. For a detailed discussion of the various approaches for analysing trees, read pp. 378-379 of Freeman & Herron. Make sure you save your trees (Trees > Save Trees to file) so that they’re available for further analysis in MacClade.

Using the command-line interface

PDF Manual

Although all functionality of PAUP can be reached through the menus, there’s also the possibility of using a command-line interface.
This feature is useful if you want all modifications you’ve made to the data and to the search algorithms to show up in the execute log, and to be saved with the data. Explaining the commands for this approach is beyond the scope of this guide; there’s a manual in Adobe Acrobat *.pdf file format that goes into the details (it’s in the same folder as PAUP itself).

PAUP FAQ

You’ll find a ‘frequently asked questions’ section here: http://www.lms.si.edu/PAUP/paupfaq/index.html
This FAQ focuses on the command-line interface.

Using PAUP and MacClade concurrently

PAUP* 4.0 and MacClade 3 use a common data file format (NEXUS), allowing easy interchange of data between the two programs.

MacClade

What’s MacClade?

MacClade is a computer program for phylogenetic analysis. It is not intended as a standalone tool to infer phylogeny (although it complements programs like PAUP nicely in this regard). Its analytical strength is in studies of character evolution. It also provides many tools for entering and editing data and phylogenies, and for producing tree diagrams and charts.

MacClade enables you to use the mouse-window interface to specify and rearrange phylogenies by hand, and to watch the number of character steps and the distribution of states of a given character on the tree change as you do so. MacClade does not have a sophisticated search algorithm to find best trees: it largely relies on you to do it by hand (which is surprisingly effective), with only a local rearrangement algorithm available to improve on that tree.

Concepts and navigation

MacClade has two main windows: the ‘data window’ and the ‘tree window’. You can toggle between the two windows using the ‘Go to…’ command in the Display menu.

The data window is a matrix window similar to spreadsheet programs like MS Excel. Taxa, characters and character states are entered in this window. There are various utilities for facilitating this process in the Utilities menu.
You can specify what type of characters you’re entering (e.g. DNA bases, RNA bases, morphological characters) using the Format menu. The Assume menu lets you specify which characters are ordered, add weights, and has a handy utility for calculating codon positions. Assumptions can only be specified for selected characters; in other words, if you don’t select characters, you’ll get an error message. (The ‘select all’ command is in the Edit menu.)

Menu options of the data window

File       Open, close, save, print.
Edit       Cut, copy, paste, select all.
Utilities  Tools for filling in / completing / filtering matrices.
Format     Converts between various types of data (e.g. matrix contains DNA / RNA or matrix contains morphological data).
Assume     Settings for various assumptions (e.g. ordered / unordered; weighting; codon positions).
Display    Toggles between ‘data window’, ‘tree window’ and ‘character window’. Visualisation options.
Help       Balloon help on / off.

When you open the tree window for the first time using the current data matrix, MacClade asks you if you want to open an existing tree file (for instance one you’ve created using PAUP) or if you want to create a random tree. There are some more menu options when the tree window is open, and there’s also a small statistics window that shows the treelength, the number of taxa and the number of characters.

Additional menu options of the Tree window

Tools  Tools for graphical manipulation of trees. Click and drag this menu to keep it open. Select a tool and click on the branch you want to modify.
Tree   Open, close, save, import, export of trees.
∑      Selecting any of these options adds an analysis to the statistics window.
Trace  Selecting ‘Trace character’ paints character changes on the tree. A single character or all characters can be traced. A small window explaining the color codes and showing the number of state changes in the current character opens after selecting ‘Trace character’.
Chart  Visualizes additional data.
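
The treelength in the statistics window and the number of state changes shown when tracing a character are both step counts on the current tree. This guide doesn’t describe how MacClade computes them, so the following is only a minimal sketch of one standard way to count steps for unordered, equally weighted characters (Fitch’s algorithm). The function names, the nested-tuple tree notation and the toy matrix are assumptions made for this example; they are not MacClade or PAUP features.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one unordered character.

    tree   : nested 2-tuples of taxon names, e.g. (("A", "B"), ("C", "D"))
    states : dict mapping taxon name -> observed state, e.g. {"A": "G"}
    """
    steps = 0

    def state_set(node):
        nonlocal steps
        if isinstance(node, str):        # terminal taxon: its observed state
            return {states[node]}
        left, right = (state_set(child) for child in node)
        shared = left & right
        if shared:                       # descendants agree: no extra step
            return shared
        steps += 1                       # descendants disagree: one change
        return left | right

    state_set(tree)
    return steps


def tree_length(tree, matrix):
    """Sum of the step counts over all characters (all weights equal to 1)."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: seq[i] for taxon, seq in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical toy matrix: four taxa, two DNA characters.
matrix = {"A": "GA", "B": "GA", "C": "TA", "D": "TC"}

print(tree_length((("A", "B"), ("C", "D")), matrix))  # 2
print(tree_length((("A", "C"), ("B", "D")), matrix))  # 3
```

Moving a branch by hand in the tree window corresponds to changing the nesting of the tuples here: the second arrangement needs one more step than the first, which is the kind of change in treelength you’ll see while rearranging a tree.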

In MacClade’s tree window, hypothesized phylogenetic trees or cladograms can be manipulated and character evolution visualized upon them. To manipulate the tree, tools are provided to move branches, reroot clades, create polytomies, and automatically search for more parsimonious trees. Character evolution is reconstructed on the tree and indicated by "painting" the branches. Alternative reconstructions of character evolution can be explored. Summaries of changes in all characters can be depicted on the tree. As trees are manipulated, MacClade updates statistics such as treelength, and the results are depicted in graphics and charts.

MacClade provides charts summarizing various aspects of character evolution on one or more trees, as well as charts comparing two or more trees. For example, the charts can show the number of trees of each length, the number of characters on the tree with different consistency indices, and so on. It has a number of charts specifically designed for DNA/RNA sequence data, including one showing the number of changes on the tree at first, second, and third codon positions, and a chart of the relative frequencies of various transitions and transversions.

Common MacClade procedures

Editing and analysing molecular data

MacClade has some additional features for analysing DNA and RNA data. To unlock these features, you’ll have to tell the program what kind of data you’re using (Format > DNA or RNA).

MacClade can calculate codon positions (Assume > Change codons… > Calculate positions), a feature that is essential if you want to analyse the number of state changes per codon position (Tree window > Chart > Current tree only > Codon position > Chart).

MacClade can also calculate the transition/transversion ratio (Tree window > Chart > State changes & stasis… > Chart; for numbers instead of a graph, click the X button in the upper right corner of the chart window). If you apply the calculated ratio to the data file (Assume > Type edit… > New > Step matrices > Enter calculated values in matrix > Apply > OK), it will show up as a ‘Userdefined’ preset in the (Data >) ‘Set character types…’ window of PAUP.

Manipulating trees

To move a branch, click and drag it to the branch you want to connect it to and release it.

To open the toolbox (for functions like rerooting and rotating branches), select the Tools menu and drag the mouse downwards into the tree window. You’ll see a square the size of the Tools menu around your mouse. If you click again, the menu becomes available as a separate window in the tree window. To manipulate a tree, select a tool (for instance ‘rotate branch’) and click on the branch you want to edit (rotate, in this example).

[1] Freeman, S. & J.C. Herron (1998) Evolutionary analysis (Upper Saddle River: Prentice Hall)