Species Tree Workshop
January 14, 2012
Practice with BEST
Please download MrBayes 3.2 for either
Windows, Macintosh, or UNIX from
http://mrbayes.sourceforge.net/
Agenda
- The MrBayes with BEST (v 3.2) implementation (work in progress)
- Run the finch example (download finch.nex)
- Run a multiple-allele data set (yeast with 4 genes, 22 taxa, 6 species)
- ...or try your own data
Previous Implementation:
MrBayes with BEST
Step 1: Use MrBayes to propose vectors of joint gene trees (unlinked
and rooted with outgroup).
Step 2: Given those gene trees, propose a compatible species tree.
Step 3: Implement the chain fully within MrBayes using the usual
properties of the MCMC as proposed by the user.
Program found at www.stat.osu.edu/~dkp/BEST
New Implementation:
MrBayes 3.2 integrated with BEST
Assumes molecular clock for gene trees as part of a full model
including Coalescent for gene trees|species tree
Program found at http://mrbayes.sourceforge.net/
Implementation: MrBayes 3.2
As always:
- Wide variety of nucleotide, amino acid, and codon models
- Variety of proposal distribution options
- Parallel “hot” and “cold” chains to balance efficiency while covering large tree spaces
- Checkpointing to allow stops and restarts
New speed improvements:
- BEST can use MPI on Mac and UNIX
- GPU (NVIDIA graphics card) support
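A minimal sketch of how “hot” and “cold” chains interact in Metropolis-coupled MCMC. It assumes the incremental heating scheme heat_i = 1 / (1 + temp * i), with chain 0 as the cold chain. The function names and the temp value are illustrative and are not taken from the MrBayes source.

```python
import math


def chain_heats(n_chains, temp=0.1):
    """Heat (power applied to the posterior) for each chain; chain 0 is cold."""
    # Assumed scheme: heat_i = 1 / (1 + temp * i). temp=0.1 is illustrative.
    return [1.0 / (1.0 + temp * i) for i in range(n_chains)]


def swap_probability(heat_i, heat_j, logpost_i, logpost_j):
    """Metropolis probability of swapping the states of chains i and j.

    Chain i samples from posterior**heat_i, so exchanging states x_i and x_j
    gives the log ratio (heat_i - heat_j) * (logpost_j - logpost_i).
    """
    log_ratio = (heat_i - heat_j) * (logpost_j - logpost_i)
    return math.exp(min(0.0, log_ratio))


heats = chain_heats(4)
# A hot chain that has found a better tree is always swapped into the cold chain.
p_up = swap_probability(heats[0], heats[3], logpost_i=-100.0, logpost_j=-90.0)
# The reverse swap is accepted only occasionally.
p_down = swap_probability(heats[0], heats[3], logpost_i=-90.0, logpost_j=-100.0)
```

Only the cold chain (heat 1.0) is sampled from; the heated chains see a flattened posterior and cross valleys between tree islands more easily.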

Steps for any Bayesian run
1. Read the data
2. Set the model (data|gene tree)
3. Set the prior (including gene|species)
4. Set the MCMC rules
5. Run the MCMC
6. Check convergence
7. Summarize results
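For the "check convergence" step, one common diagnostic compares split (partition) frequencies between independent runs. The sketch below assumes each run has already been summarized as a dict mapping a split label to its sampled frequency. The function name, the 0.1 cutoff, and the use of the sample standard deviation are assumptions, so the result may not match the MrBayes diagnostic exactly.

```python
from statistics import stdev


def average_sd_split_frequencies(runs, min_freq=0.1):
    """Average, over splits, of the SD of each split's frequency across runs.

    runs: list of at least two dicts {split_label: frequency}.
    Splits below min_freq in every run are ignored (assumed cutoff).
    """
    all_splits = set().union(*runs)  # union of the dict keys
    sds = []
    for split in all_splits:
        freqs = [run.get(split, 0.0) for run in runs]  # unseen split -> 0
        if max(freqs) >= min_freq:
            sds.append(stdev(freqs))
    return sum(sds) / len(sds) if sds else 0.0


# Hypothetical frequencies from two runs.
run1 = {"AB": 1.0, "ABC": 0.9, "AC": 0.02}
run2 = {"AB": 1.0, "ABC": 0.7}
asdsf = average_sd_split_frequencies([run1, run2])
```

Values near zero indicate that the runs are sampling similar tree distributions; the rare split "AC" is excluded by the cutoff.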
Files created
- ckp (checkpoint file for restarting)
- tree5.run2.t (trees saved for locus 5 in run 2)
- tree5.parts (partitions seen for tree 5)
- tree5.trprobs (tree probabilities)
- tree5.con.tre (consensus tree)
- tree5.tstat (partition statistics)
- tree5.vstat (branch and node statistics)
Remember
- Use a separate folder for each analysis
- Remember the “taxset” and “speciespartition” statements in MrBayes, with one or more taxa per species
- Remember to allow variable population sizes
- With n loci, the species tree shows up as files labeled n+1
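The n+1 rule can be sketched as a small helper for sorting output files. It assumes the "tree<index>." naming shown on the "Files created" slide; the helper name is hypothetical and real file names may carry a longer prefix.

```python
import re

# Assumed naming pattern, e.g. "tree5.run2.t" or "tree5.con.tre".
TREE_INDEX = re.compile(r"tree(\d+)\.")


def describe_tree_file(filename, n_loci):
    """Say whether a tree file belongs to a locus or to the species tree."""
    match = TREE_INDEX.search(filename)
    if match is None:
        return None  # not a tree file in the assumed pattern
    index = int(match.group(1))
    if 1 <= index <= n_loci:
        return f"gene tree for locus {index}"
    if index == n_loci + 1:
        return "species tree"
    return None  # index outside the expected range


# With 4 loci (the yeast data set), tree 5 is the species tree.
print(describe_tree_file("tree5.run2.t", n_loci=4))
# With 30 loci (the finch data set), tree 5 is an ordinary gene tree.
print(describe_tree_file("tree5.con.tre", n_loci=30))
```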
Remember to unlink
- Gene tree topologies and branch lengths for sure:
  unlink topology=(all) brlens=(all);
- Parameters of the model as appropriate:
  unlink statefreq=(all) revmat=(all);
Issues
- Requiring gene trees to follow a molecular clock is too restrictive
- Some outputs still need to be modified for species tree use
Species Tree Notation
Topology, branch lengths, & population sizes:
(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25
θAB = 0.3, θABC = 0.2, θABCD = 0.25
[Figure: species tree for A, B, C, D with branch lengths A = 0.02, B = 0.02, AB = 0.01, C = 0.03, ABC = 0.005, D = 0.035]
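A hedged sketch of reading this notation: a small recursive parser for "name:length#popsize" strings. It assumes sibling clades are separated by commas as in standard Newick; the function names and the dict layout are illustrative and are not part of MrBayes or BEST.

```python
def parse_species_tree(text):
    """Parse 'name:length#popsize' Newick-like text into nested dicts."""
    s = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def read_number():
        nonlocal pos
        start = pos
        while pos < len(s) and (s[pos].isdigit() or s[pos] in ".-+eE"):
            pos += 1
        return float(s[start:pos])

    def read_node():
        nonlocal pos
        node = {"name": "", "children": [], "length": None, "popsize": None}
        if s[pos] == "(":
            pos += 1  # consume "("
            node["children"].append(read_node())
            while pos < len(s) and s[pos] == ",":
                pos += 1
                node["children"].append(read_node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1  # consume ")"
        else:
            start = pos
            while pos < len(s) and s[pos] not in ":#,()":
                pos += 1
            node["name"] = s[start:pos]
        if pos < len(s) and s[pos] == ":":  # branch length
            pos += 1
            node["length"] = read_number()
        if pos < len(s) and s[pos] == "#":  # population size of this clade
            pos += 1
            node["popsize"] = read_number()
        return node

    return read_node()


def leaves(node):
    """Sorted tip names below a node."""
    if not node["children"]:
        return [node["name"]]
    return sorted(n for child in node["children"] for n in leaves(child))


def clade_popsizes(node, out=None):
    """Map each annotated clade (joined tip names) to its population size."""
    out = {} if out is None else out
    if node["popsize"] is not None:
        out["".join(leaves(node))] = node["popsize"]
    for child in node["children"]:
        clade_popsizes(child, out)
    return out


tree = parse_species_tree(
    "(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25"
)
popsizes = clade_popsizes(tree)
```

Here `popsizes` maps the clades AB, ABC, and ABCD to the three values listed on the slide.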
Three lineages of grassfinches (Poephila)
- Long-tailed (acuticauda)
- Long-tailed (hecki)
- Black-throated (cincta)
30 gene trees from Australian finches
P. acuticauda P. hecki P. cincta
Jennings & Edwards (2005) Evolution 59, 2033-2047.
Estimated species tree distribution using BEST
[Figure: values shown on the slide: 1.0, 0.94, 1.0, 0.03]