Introduction to PAUP and MacClade for SEP245

About this guide
This guide describes and compares PAUP and MacClade and outlines how to navigate them.
It is meant as an introduction for students, so many of the more esoteric
options of both applications are not discussed here. This is not a lecture on systematic
theory, so you won’t find definitions of jargon like ‘parsimony’, ‘rooted’ etc. Where
applicable, there will be suggestions for further reading. A basic knowledge of MacOS
principles and practices is expected. Commands like ‘OK’ or ‘Apply’ are therefore not
mentioned as they are self-evident in the given situations. Notes and hints are in italics.
PAUP
Introduction
The influence of high-speed computer analysis of molecular, morphological and/or
behavioural data to infer phylogenetic relationships has expanded well beyond its central
role in evolutionary biology, now encompassing applications in areas as diverse as
conservation biology, ecology, and forensic studies. The success of previous versions of
PAUP: Phylogenetic Analysis Using Parsimony has made it the most widely used
software package for the inference of evolutionary trees. With the inclusion of maximum
likelihood and distance methods in PAUP* 4.0, the new version represents a great
improvement over its predecessors. In addition, the speed of the branch-and-bound
algorithm has been enhanced and a number of new features have been added, from
agreement subtrees to tests for combinability of data and permutation tests for
nonrandomness of data structure. There are several different versions of PAUP available,
some of which use a command-line interface (the MS-DOS and UNIX versions) while
others use a graphical user interface (MacOS and Windows 95/NT). This guide focuses on
PAUP* 4.0 for Macintosh.
Concepts and navigation of PAUP* 4.0 for Macintosh
PAUP has two modes: ‘edit mode’ and ‘execute mode’. The edit mode shows the current
data matrix and additional settings you’ve made; the execute mode logs searches you’ve
performed and shows their results (including trees). Each mode has its own window.
The main menu of PAUP consists of:
File: Nothing unusual here except the ‘execute file’ command (which opens the execute mode).
Edit: The regular ‘edit’ commands; controls for the scrolling ‘execute’ window.
Options: General program settings and preferences.
Data: Manipulation of the data matrix: weight, ordered/unordered. Analyses other than phylogenetic analysis (e.g. base frequencies).
Analysis: Search settings and commands: parsimony, likelihood, distance. Various search methods.
Trees: Manipulation and analysis of trees. Also opening, saving and printing of trees.
Window: Display and window controls.
Help: Toggles ‘balloon help’ on and off; gives access to an obsolete help file.
All settings and commands for searching and analysing are in the ‘Data’, the ‘Analysis’
and the ‘Trees’ menus. Apart from these menus you’ll probably only use the ‘File’ menu.
Common PAUP procedures
Opening and executing a file
When opening a file, you’ll be offered the choice between ‘edit’ and ‘execute’ mode. If
you choose ‘edit’ mode, a window opens with the data matrix and some additional
comments. Before anything else can be done, the file needs to be ‘executed’ (File >
Execute File). This opens the ‘execute’ window. After this the program is ready for
further tweaking of parameters. When opening in ‘execute’ mode the current file is
immediately processed and the window with the current matrix (the ‘edit’ window)
doesn’t open. This mode doesn’t provide the user with as much feedback and information
as the ‘edit’ mode and is therefore not recommended for novice users.
Editing data
After executing the file, weights can be assigned (Data > Set Character Weights). This
includes weighting for different codon positions. Note that this type of weighting only
works after codon positions have been calculated in MacClade and saved with the file.
Characters can also be ordered, which includes Ti/Tv ordering. Note that this feature is
only available after a Ti/Tv matrix has been entered using MacClade. If this has been done
you’ll see a small matrix with the Ti/Tv ratios at the bottom of the ‘edit’ window.
Outgroups are also defined in this menu (Data > Define outgroups).
Running a search
The program is now ready to start searching for the best tree among the many possible
ones. Which search algorithm you use depends on the size of your data set. There’s
a good explanation of the various algorithms on pp. 376-378 of Freeman & Herron¹. A
rule of thumb for the applications of the various search methods would be:
Small data sets: exhaustive search
Medium-sized data sets: branch-and-bound
Large data sets: heuristic search
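To see why this rule of thumb depends so strongly on the size of the data set, it helps to count how many trees an exhaustive search would have to evaluate. The following is a minimal sketch, not part of PAUP; it assumes unrooted, fully bifurcating trees, and the function name is purely illustrative.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted, fully bifurcating trees for n_taxa labelled taxa."""
    if n_taxa < 3:
        return 1
    count = 1
    # The k-th taxon (k = 4..n) can be attached to any of the
    # 2k - 5 branches of the tree built from the first k - 1 taxa.
    for k in range(4, n_taxa + 1):
        count *= 2 * k - 5
    return count


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, "taxa:", count_unrooted_trees(n), "possible trees")
```

The count grows faster than exponentially with the number of taxa, which is why evaluating every tree is only practical for small data sets.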
Start a search method (Analysis > [chosen search method] > Search). Unless stated
otherwise, go with the default settings of the algorithms; they’re fine. Searching may
take some time. Click ‘Close’ to close the status window and return to the ‘execute’
window. The trees found are now stored, and the search settings and results are logged in
the ‘execute’ window.
Visualizing trees
The first thing you’ll probably want to do now that the searching is finished is see the
trees (Trees > Show trees). The resulting trees can now be evaluated, and consensus trees
can be constructed. For a detailed discussion of the various approaches for analysing
trees, read pp. 378-379 of Freeman & Herron. Make sure you save your trees (Trees >
Save Trees to file) so that they’re available for further analysis in MacClade.
Using the command-line interface
PDF Manual
Although all functionality of PAUP is available through the menus, there’s also the
possibility of using a command-line interface. This feature is useful if you want all
modifications you’ve made to the data and to the search algorithms to show up in the
execute log, and to be saved with the data. Explaining the commands for this approach is
beyond the scope of this guide; there’s a manual in Adobe Acrobat PDF format that
goes into the details (it’s in the same folder as PAUP itself).
PAUP FAQ
You’ll find a ‘frequently asked questions’ section here:
http://www.lms.si.edu/PAUP/paupfaq/index.html
This FAQ focuses on the command-line interface.
Using PAUP and MacClade concurrently
PAUP* 4.0 and MacClade 3 use a common data file format (NEXUS), allowing easy
interchange of data between the two programs.
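As an illustration of what such a shared file looks like, here is a minimal sketch that writes a small aligned data set as a NEXUS DATA block. It is not taken from either program; the function name and the taxon names are hypothetical, and real files written by PAUP or MacClade usually contain further blocks (assumptions, trees and so on).

```python
def to_nexus(sequences, datatype="DNA"):
    """Return a minimal NEXUS file (as text) for a dict of taxon name -> aligned sequence."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_char = lengths.pop()
    width = max(len(name) for name in sequences)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(sequences)} NCHAR={n_char};",
        f"  FORMAT DATATYPE={datatype} MISSING=? GAP=-;",
        "  MATRIX",
    ]
    # One row per taxon: padded name, then the character states.
    for name, seq in sequences.items():
        lines.append(f"    {name.ljust(width)}  {seq}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example data.
    print(to_nexus({"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ACCTTC"}))
```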
MacClade
What’s MacClade?
MacClade is a computer program for phylogenetic analysis. It is not intended as a standalone tool to infer phylogeny (although it complements programs like PAUP nicely in
this regard). Its analytical strength is in studies of character evolution. It also provides
many tools for entering and editing data and phylogenies, and for producing tree
diagrams and charts. MacClade enables you to use the mouse-window interface to
specify and rearrange phylogenies by hand, and watch the number of character steps and
the distribution of states of a given character on the tree change as you do so. MacClade
does not have a sophisticated search algorithm to find best trees: it largely relies on you
to do it by hand (which is surprisingly effective), with only a local rearrangement
algorithm available to improve on that tree.
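The number of character steps that changes as you rearrange a tree can be illustrated with a short sketch. This is not MacClade's code: it assumes unordered, equally weighted characters and counts steps with the standard Fitch algorithm on a tree written as nested pairs. All names and the example matrix are hypothetical.

```python
def fitch_steps(tree, states):
    """Return (state_set, steps) for one unordered character.

    tree   -- a taxon name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each taxon name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    shared = left_set & right_set
    if shared:
        # The two subtrees can agree on a state: no extra step needed.
        return shared, left_steps + right_steps
    # No state in common: one change is required at this node.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum the Fitch step counts over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


if __name__ == "__main__":
    matrix = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CGT"}
    print(tree_length((("A", "B"), ("C", "D")), matrix))
    print(tree_length((("A", "C"), ("B", "D")), matrix))
```

Moving a branch by hand in the tree window corresponds to changing the nesting of the pairs; the shorter of two arrangements is the more parsimonious one.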
Concepts and navigation
MacClade has two main windows: the ‘data window’ and the ‘tree window’. You can
toggle between the two windows using the ‘Go to….’ command in the Display menu.
The data window is a matrix window similar to spreadsheet programs like MS Excel.
Taxa, characters and character states are entered in this window. There are various
utilities for facilitating this process in the Utilities menu. You can specify what type of
characters you’re entering (e.g., DNA bases, RNA bases, morphological characters) using
the Format menu. The Assume menu lets you specify which characters are ordered and
add weights, and has a handy utility for calculating codon positions. Assumptions can only
be specified for selected characters; in other words, if you don’t select characters, you’ll
get an error message. (The ‘Select all’ command is in the Edit menu.)
Menu options of the data window
File: Open, close, save, print.
Edit: Cut, copy, paste, select all.
Utilities: Tools for filling in / completing / filtering matrices.
Format: Converts between various types of data (e.g. matrix contains DNA / RNA or matrix contains morphological data).
Assume: Settings for various assumptions (e.g. ordered / unordered; weighting; codon positions).
Display: Toggles between ‘data window’, ‘tree window’ and ‘character window’. Visualisation options.
Help: Balloon help on / off.
When you open the tree window for the first time using the current data matrix,
MacClade asks you if you want to open an existing tree file (for instance, one you’ve
created using PAUP) or if you want to create a random tree. There are some more menu
options when the tree window is open, and there’s also a small statistics window that
shows the treelength, the number of taxa and the number of characters.
Additional menu options of the Tree window
Tools: Tools for graphical manipulation of trees. Click and drag this menu to keep it open. Select a tool and click on the branch you want to modify.
Tree: Open, close, save, import, export of trees.
∑: Selecting any of these options adds an analysis to the statistics window.
Trace: Selecting ‘Trace character’ paints character changes on the tree. A single or all characters can be traced. A small window explaining the color codes and showing the number of state changes in the current character opens after selecting ‘Trace character’.
Chart: Visualizes additional data.
In MacClade's tree window, hypothesized phylogenetic trees or cladograms can be
manipulated and character evolution visualized upon them. To manipulate the tree, tools
are provided to move branches, reroot clades, create polytomies, and automatically search
for more parsimonious trees. Character evolution is reconstructed on the tree and
indicated by "painting" the branches. Alternative reconstructions of character evolution
can be explored. Summaries of changes in all characters can be depicted on the tree. As
trees are manipulated, MacClade updates statistics such as treelength and the results are
depicted in graphics and charts.
MacClade provides charts summarizing various aspects of character evolution on one or
more trees, as well as charts comparing two or more trees. For example, the charts can
show the number of trees of each length, the number of characters on the tree with
different consistency indices, and so on. It has a number of charts specifically designed
for DNA/RNA sequence data, including one showing the number of changes on the tree
at first, second, and third codon positions, and a chart of the relative frequencies of
various transitions and transversions.
Common MacClade procedures
Editing and analysing molecular data
MacClade has some additional features for analysing DNA and RNA data. To unlock
these features, you’ll have to tell the program what kind of data you’re using (Format >
DNA or RNA).
MacClade can calculate codon positions (Assume > Change codons…. > Calculate
positions), a feature that is essential if you want to analyse the number of state changes
per codon position (Tree window > Chart > Current tree only > Codon position > Chart).
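What 'changes per codon position' means can be sketched as follows. This is an illustration only, not MacClade's calculation: it assumes a single uninterrupted coding region whose reading frame starts at a known site, and the function names and the per-site step counts are hypothetical.

```python
def codon_positions(n_sites, first_codon_start=0):
    """Codon position (1, 2 or 3) of each site; sites before the frame start get None."""
    return [
        None if site < first_codon_start else (site - first_codon_start) % 3 + 1
        for site in range(n_sites)
    ]


def steps_by_codon_position(steps_per_site, positions):
    """Add up the number of state changes separately for positions 1, 2 and 3."""
    totals = {1: 0, 2: 0, 3: 0}
    for steps, position in zip(steps_per_site, positions):
        if position is not None:
            totals[position] += steps
    return totals


if __name__ == "__main__":
    positions = codon_positions(6)
    # Hypothetical number of changes reconstructed at each of six sites.
    print(steps_by_codon_position([0, 0, 2, 1, 0, 3], positions))
```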
MacClade can also calculate the transition/transversion ratio (Tree window > Chart > State
changes & stasis…. > Chart). For numbers instead of a graph, click the X button in the
upper right corner of the chart window.
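The distinction behind this ratio can be shown with a small sketch. Note the assumption: MacClade counts changes reconstructed on the tree, whereas this simplified example only compares two aligned DNA sequences directly. The function name and the sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites and anything that is not a plain base (gaps, missing data).
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1    # purine <-> pyrimidine
    return transitions, transversions


if __name__ == "__main__":
    ti, tv = count_ti_tv("AACCGT", "GACTTT")
    print("transitions:", ti, "transversions:", tv)
```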
If you apply the calculated ratio to the data file (Assume > Type edit…. > New > Step
matrices > Enter calculated values in matrix > Apply > OK), it will show up as a ‘User-defined’ preset in the ‘Set character types….’ window (Data menu) of PAUP.
Manipulating trees
To move a branch, click and drag it to the branch you want to connect it to and release it.
To open the toolbox (for functions like rerooting, and rotating of branches) select the
Tools menu and drag the mouse downwards into the tree window. You’ll see a square the
size of the tools menu around your mouse. If you click again, the menu becomes
available as a separate window in the tree window. To manipulate a tree, select a tool (for
instance the ‘rotate branch’) and click on the branch you want to edit (rotate, in this
example).
¹ Freeman, S. & J.C. Herron (1998) Evolutionary Analysis. Upper Saddle River: Prentice Hall.