HOW DOES A REAL PHYLOGENETIC TREE OF AVIDA ED INDIVIDUALS COMPARE TO A SIMULATED TREE?
Sara Hettenbach

Background
• AvidaEd is a computer program developed at Michigan State University that uses digital organisms to demonstrate evolution.
• When placed in an environment that rewards more complex functions, these individuals evolve new functions, such as number recognition and addition.
• These individuals replicate asexually using a 50-letter genome that mutates at a predetermined rate.

Background
• Phylogenetics: the study of evolutionary relationships among groups of individuals.
• Phylogenetic analysis programs use DNA sequence data to identify when and where mutations happened.
• Using these sequences, phylogeny programs determine how closely related individuals are.
• There are different ways to form an evolutionary tree:
  • Maximum likelihood: searches for the most likely tree given the DNA sequence data.
  • Parsimony: forms the evolutionary tree that requires the fewest evolutionary steps.
  • Distance phylogeny: uses pairwise distance matrices to form the phylogenetic tree.
  • Bayesian inference: uses algorithms and posterior distributions over a set of parameters to infer a phylogenetic tree.

Interest
• When approaching this inquiry, I knew of the AvidaEd program but was unaware of all of its possibilities and uses.
• After more reading, I became interested in classroom applications of the program, as well as other possible experiments.
• Brad introduced me to a previous study, similar to mine, that used a known phylogenetic tree of E. coli to test phylogenetic analyses.
• After reading this, I wanted to see whether the AvidaEd organisms could be used to do something similar.

Methods Overview
• I formed my own evolutionary tree of AvidaEd individuals by isolating the two most fit individuals after 500 updates (or generations).
• These isolated individuals were used to start the next generation.
• Each generation had the same environment with the same rewards, mutation rate settings, and run time.

Methods Overview Continued
• After my known tree was formed, I exported the genomic data of all of the individuals.
• The AvidaEd organisms do not have a DNA genome.
• Each individual has a 50-letter genome that uses all 26 letters of the alphabet.
• I translated each of these letters into a three-letter DNA codon.

Letter   Codon
a        GCU
b        UAU
c        UGU
d        GAU
e        GAG
f        UUU
g        GGU
h        CAU
i        AUU
j        UGC
k        AAA
l        CUU
m        AUG
n        AAU
o        UCC
p        CCU
q        CAA
r        AGA
s        UCU
t        ACU
u        UGG
v        GUU
w        CAC
x        AUC
y        GUC
z        GGC

Methods Overview Continued
• After converting the final genome DNA sequences into the correct format for alignment programs, I was able to input them into the phylogenetics programs.
• Once the DNA sequence data are input and the parameters are set, the program aligns the data and then “spits out” a tree!
• Statistical tests are run in the program.
• Bootstrapping: I set all of the programs to generate 100 replicate phylogenetic trees. The bootstrap value displayed in the key or on a tree branch shows the percentage of the 100 replicates in which that branch appeared. Bootstrap values above 70 are considered well supported.

Results
Maximum likelihood tree

Results
Parsimony tree

Results
Distance phylogeny

Results
Bayesian inference

Results Details
• The two trees that do not show bootstrap values had bootstrap values greater than 70 on all branches and are therefore considered well supported.
• The parsimony tree got the evolution of my Avida organisms perfect!
• The other two methods of analysis did categorize the right organisms as being most closely related, but some of the branching points were off in time or placement.

Conclusion/Implications
• I was very excited by my results!
• Shows how to think critically about evolution; in cases other than this one, we don’t know the lineage of evolution.
• Also demonstrates that the AvidaEd individuals are an actual instance of evolution, not a simulation of one.
• If I were to create another tree, the individuals would most likely not have the same genomes!
• This whole project was a learning process for me, but by the end of it I felt much more comfortable with the phylogenetic analysis programs.
• I also felt that I had a better understanding of phylogenetics; this was great practice.
• All of the things that I learned, students could learn too!

Acknowledgements
• A huge thank you to Brad, who persuaded me to do this project, helped me along the way, and received a lot of late-night emails.
• I’d also like to thank Helen, Bob, and the rest of my peers in this class, who provided feedback and support.
• Dr. Paulyn Cartwright, for a brief introduction to phylogenetics.
• The French phylogeny website (Phylogeny.fr), which provided the software and analysis programs.
• Dereeper A., Audic S., Claverie J.M., Blanc G. BLAST-EXPLORER helps you building datasets for phylogenetic analysis. BMC Evol Biol. 2010 Jan 12;10:8.
• Dereeper A.*, Guignon V.*, Blanc G., Audic S., Buffet S., Chevenet F., Dufayard J.F., Guindon S., Lefort V., Lescot M., Claverie J.M., Gascuel O. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W465-9. Epub 2008 Apr 19. (*: joint first authors)
• Thompson J.D., Higgins D.G., Gibson T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994 Nov 11;22(22):4673-80.
• Goloboff P., Farris S., Nixon K. TNT (Tree analysis using New Technology) ver. 1.1. 2000. Published by the authors, Tucumán, Argentina.
• Chevenet F., Brun C., Banuls A.L., Jacq B., Christen R. TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics. 2006 Oct 10;7:439.
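
Supplementary sketch: letter-to-codon translation
The translation step described in the Methods Overview slides (each genome letter becomes a three-letter codon, and the result is put into a format that alignment programs accept) could be scripted roughly as below. This is only a minimal sketch, not the procedure actually used for the project. The codon table is copied from the Methods slide; everything else is an assumption made for illustration: the names `CODON_TABLE`, `genome_to_codons`, and `to_fasta`, the choice of FASTA as the output format, the optional `as_dna` switch that writes T in place of U, and the short made-up genomes in the usage example.

```python
# Letter-to-codon table copied from the Methods slide.
CODON_TABLE = {
    "a": "GCU", "b": "UAU", "c": "UGU", "d": "GAU", "e": "GAG",
    "f": "UUU", "g": "GGU", "h": "CAU", "i": "AUU", "j": "UGC",
    "k": "AAA", "l": "CUU", "m": "AUG", "n": "AAU", "o": "UCC",
    "p": "CCU", "q": "CAA", "r": "AGA", "s": "UCU", "t": "ACU",
    "u": "UGG", "v": "GUU", "w": "CAC", "x": "AUC", "y": "GUC",
    "z": "GGC",
}


def genome_to_codons(genome, as_dna=False):
    """Translate a letter genome into a codon string (3 bases per letter).

    Raises KeyError if the genome contains a character outside a-z.
    """
    codons = "".join(CODON_TABLE[letter] for letter in genome.lower())
    # Assumption: some DNA alignment tools expect T rather than U,
    # so as_dna=True swaps every U for T. The slide table itself uses U.
    return codons.replace("U", "T") if as_dna else codons


def to_fasta(genomes, as_dna=False):
    """Build FASTA-formatted text from a {name: letter_genome} mapping.

    Assumption: FASTA is the target format; the slides do not name one.
    """
    records = [
        f">{name}\n{genome_to_codons(genome, as_dna)}"
        for name, genome in genomes.items()
    ]
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    # Hypothetical, shortened genomes (real AvidaEd genomes have 50 letters).
    example = {"organism_1": "wzcagcab", "organism_2": "wzcagcqb"}
    print(to_fasta(example, as_dna=True))
```

Because every letter maps to a distinct codon, a single-letter mutation in an AvidaEd genome always shows up as a difference within exactly one codon of the translated sequence, and a 50-letter genome always becomes a 150-base sequence.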