Practical Session: Bayesian evolutionary analysis by sampling trees (BEAST)
Rebecca R. Gray, Ph.D.
Department of Pathology
University of Florida

• BEAST:
  – is a cross-platform program for Bayesian MCMC analysis of molecular sequences
  – entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models
  – can be used as a method of reconstructing phylogenies, but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology
  – uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability

Citations
• The recommended citation for this program is:
  – Drummond AJ, Rambaut A (2007) "BEAST: Bayesian evolutionary analysis by sampling trees." BMC Evolutionary Biology 7:214
• To cite the relaxed clock model in BEAST:
  – Drummond AJ, Ho SYW, Phillips MJ & Rambaut A (2006) PLoS Biology 4, e88
• To cite the Bayesian Skyline model in BEAST:
  – Drummond AJ, Rambaut A, Shapiro B & Pybus OG (2005) Mol Biol Evol 22, 1185-1192
• The original MCMC paper was:
  – Drummond AJ, Nicholls GK, Rodrigo AG & Solomon W (2002) Genetics 161, 1307-1320

Basic Pipeline
• 1) Setting up the XML file (BEAUti)
• 2) Running the XML file (BEAST)
• 3) Evaluating the performance of the run (Tracer)
• 4) Comparing models, obtaining estimates of parameters (Tracer)
• 5) Summarizing the tree distribution (TreeAnnotator)
• 6) Viewing the MCC tree (FigTree)

Downloading programs
• http://beast.bio.ed.ac.uk/Main_Page
  – Download contains BEAUti, BEAST, TreeAnnotator
• http://beast.bio.ed.ac.uk/Tracer
• http://beast.bio.ed.ac.uk/FigTree

PRACTICAL: RIFT VALLEY FEVER VIRUS

Epidemiology of RVF
• The virus was first identified in 1931 in the Rift Valley of Kenya
• Mosquito vector, primarily infects livestock
• 1997–1998, a major outbreak occurred in Kenya, Somalia and the United Republic of Tanzania
• September 2000 cases were confirmed in Saudi Arabia and Yemen (first reported occurrence of the
disease outside the African continent)

Setting up XML file in BEAUti
• Requires a nexus file
  – Helpful to have dates with the sample name
  – Use the finest resolution available
• GUI allows basic selection of parameters
• XML file can be manually edited to test specific hypotheses/tweak run

BEAUti practical
• Import alignment (g_63.nex)
• Tip dates: use tipdates, guess dates (years since some time in the past)
• Site models: use GTR + G, empirical base frequencies
• Test hypothesis of strict vs. relaxed molecular clock
• Trees: coalescent tree prior, constant size
• 5 x 10^7 generations

BEAST
• Open XML file with text editor
• Run in BEAST
• Check mixing of the MCMC chain
• Open S log files in Tracer
• Open L and G2 log files
• What can we do about the trace?

Proper mixing
• First step: run chain longer
  – Open L200 files
• Other steps to try:
  – Overparameterization: reduce complexity
  – Temporal/phylogenetic signal
  – Priors are inappropriate

Model testing
• Bayes factors:
  – Compare estimates of the marginal likelihoods of the models of interest
  – 2*(ln marginal likelihood model 1 – ln marginal likelihood model 2)
  – >10, strong support for alternative (more complex model)
• Strict clock vs.
relaxed clock
  – Also consider the coefficient of variation

Summarizing tree
• TreeAnnotator
  – Burnin 10% (501 samples)
  – Keep median heights
  – MCC tree
• Visualizing tree: FigTree
  – Posterior probabilities for branches
  – Median heights for clades of interest

Advanced analyses
• Different coalescent priors
  – Parametric models (exponential, logistic)
  – Bayesian skyline plots
• Phylogeography
  – Lemey et al, 2009, PLoS Computational Biology
• Site-specific rates of variation

[Figure: Change in effective population size over time; axis label: Log10 Ne]

[Figure: Bayesian Genealogy Of G Gene; labeled date: 1916 (1868-1942)]

Additional resources
• Tutorials on the BEAST website, Google group
• 16th International BioInformatics Workshop on Virus Evolution and Molecular Epidemiology
  – Johns Hopkins University, Baltimore
  – 29 August - 03 September 2010, Bethesda, USA
  – http://www.rega.kuleuven.be/cev/workshop/
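
Worked example for the Model testing slide
• The Bayes factor rule given under Model testing, 2*(ln marginal likelihood model 1 – ln marginal likelihood model 2) with >10 as strong support, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the two ln marginal likelihood values are made-up placeholders (not results from the g_63.nex runs), and it assumes model 1 is the more complex model (relaxed clock) and model 2 the simpler one (strict clock). In practice the ln marginal likelihood estimates would be read off from Tracer.

```python
def two_ln_bayes_factor(ln_ml_model1, ln_ml_model2):
    """Return 2 * (ln marginal likelihood model 1 - ln marginal likelihood model 2)."""
    return 2.0 * (ln_ml_model1 - ln_ml_model2)


def strong_support_for_model1(ln_ml_model1, ln_ml_model2, threshold=10.0):
    """True when the statistic exceeds the >10 cut-off quoted on the slide."""
    return two_ln_bayes_factor(ln_ml_model1, ln_ml_model2) > threshold


# Placeholder values for illustration only (assumed, not from a real run).
ln_ml_relaxed = -5230.4  # model 1: relaxed clock (more complex model)
ln_ml_strict = -5241.9   # model 2: strict clock

statistic = two_ln_bayes_factor(ln_ml_relaxed, ln_ml_strict)
print(f"2 ln BF (relaxed vs. strict) = {statistic:.1f}")
print("Strong support for relaxed clock:",
      strong_support_for_model1(ln_ml_relaxed, ln_ml_strict))
```

• Because the inputs are natural-log marginal likelihoods, the comparison is a simple difference; the order of the arguments matters, so a negative value would favour model 2 instead.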