Comparative Biology with focus on 8 examples

• Comparative Biology
• The Domain of Comparative Biology
• Co-modeling in Comparative Biology
• The purpose of Comparative Biology
• Examples of Stochastic Comparative Modeling
• Gene Frequencies in Populations
• Genome Structure Evolution
• Stemmatology: Manuscript Evolution
• RNA Secondary Structure Evolution
• Protein Structure Evolution
• Movement Evolution
• Shape Evolution
• Pattern Evolution

Comparative Biology

[Figure: phylogeny with a most recent common ancestor, a time direction, an unknown ("?") ancestral state, and three observable sequences, each shown as ATTGCGTATATAT….CAG]

Key Questions:
• Which phylogeny?
• Which ancestral states?
• Which process?

Key Generalisations:
• Homologous objects
• Co-modelling
• Genealogical Structures?

Comparative Biology: Evolutionary Models

Table columns: Object, Type, Reference

Object:
• Nucleotides/Amino Acids/codons
• Continuous Quantities
• Sequences
• Gene Structure
• Genome Structure
• Population Structure
• RNA
• Protein
• Networks
• Metabolic Pathways
• Protein Interaction
• Regulatory Pathways
• Signal Transduction
• Macromolecular Assemblies
• Motors
• Shape
• Patterns
• Tissue/Organs/Skeleton/….
• Dynamics
• MD movements of proteins
• Locomotion
• Culture
• Manuscripts (stemmatology)
• Language
• Vocabulary
• Grammar
• Phonetics
• Semantics
• Phenotype
• Dynamical Systems

Type:
• CTFS continuous time finite states
• CTNS continuous time continuous states
• CTUS continuous time countable states
• Matching
• CTCS MM
• Brownian Motion/Diffusion
• SCFG-model like
• non-evolutionary: extreme variety
• CTCS
• CTFS
• CTCS
• CTCS
• CTCS
• ?
• ?
• - (non-evolutionary models)
• - (non-evolutionary models)
• - (non-evolutionary models)

Reference:
• Jukes-Cantor 69 + 500 others
• Felsenstein 68 + 50 others
• Thorne, Kishino, Felsenstein, 91 + 40 others
• DeGroot, 07
• Miklos
• Fisher, Wright, Haldane, Kimura, ….
• Holmes, I. 06 + few others
• Lesk, A; Taylor, W.
• Snijders, T (sociological networks)
• Mithani, 2009a,b
• Stumpf, Wiuf, Ideker
• Quayle and Bullock, 06, Teichmann
• Soyer et al., 06
• Dryden and Mardia, 1998, Bookstein, Jones & Moriarty
• Turing, 52; Grenander
• analogues to genetic models
• analogous to sequence models
• Biggins 05, Munz 10
• Cavalli-Sforza & Feldman, 83
• Chris J Howe, http://www.cs.helsinki.fi/u/ttonteri/casc/
• “Infinite Allele Model” (CTCS)
• Swadesh, 52, Sankoff, 72, Gray & Atkinson, 2003
• Dunn 05
• Bouchard-Côté 2007
• Sankoff, 70
• Brownian Motion/Diffusion
• -

The Purpose of Comparative Biology

To describe evolution:
• Make realistic model (pass goodness-of-fit (GOF) test)
• Estimate Parameters
• Make statements about the path of evolution: ancestral analysis

Analyse homologous pairs or sets:
• What is the equilibrium distribution?
• Integrate over histories

Biological Questions:
• Rate of Evolution
• Heterogeneity
• Selection
• Co-Evolution of different components within a level
• Dependence among different levels (co-modelling)

[Figure: axes Time and State Space]

Most of these questions have not been addressed beyond the sequence level:
• Primarily due to lack of data
• Secondarily due to lack of models

Population Gene Frequencies

Xt is a diffusion with μ(x) = 0 and σ(x) = x(1 - x)

Famous Models:
• Continuous Time Continuous States Markov Process, specifically Diffusion.
• For instance Ornstein-Uhlenbeck, which has a Gaussian equilibrium distribution

E. Thompson (1975) Human Evolutionary Trees, CUP

Genome Structure Evolution

• Evolutionary events: Duplication, Inversion, Transposition, Deletion
[Figure: numbered gene orders illustrating each event]
• Inference Principles
  • Shortest Path (Parsimony)
  • Sum over paths with probabilities (ML)
• Extensions:
  • Directions of Genes Unknown
  • A set of chromosomes related by a phylogeny

Genome Structure Evolution

• Full graph for 5 genes
• Genomic reconstruction for human, mouse and rat.
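The "Shortest Path (Parsimony)" principle on the genome structure slides can be illustrated on a small graph of gene orders. The sketch below is an assumption-laden toy, not the method used in the cited work: it considers only inversions, ignores gene direction (the "Directions of Genes Unknown" setting), and finds the minimum number of inversions by breadth-first search over all gene orders, which is feasible only for a handful of genes such as the 5-gene graph. The function names `inversions` and `inversion_distance` are hypothetical.

```python
from collections import deque
from itertools import combinations


def inversions(order):
    """Yield every gene order reachable by reversing one contiguous segment.

    Gene direction is ignored (unsigned genes), so only segments of
    length >= 2 change the order.
    """
    n = len(order)
    for i, j in combinations(range(n + 1), 2):
        if j - i >= 2:
            yield order[:i] + order[i:j][::-1] + order[j:]


def inversion_distance(start, target):
    """Minimum number of inversions turning `start` into `target`.

    Breadth-first search over the graph whose nodes are gene orders and
    whose edges are single inversions; exhaustive, so small inputs only.
    """
    start, target = tuple(start), tuple(target)
    if sorted(start) != sorted(target):
        raise ValueError("both genomes must contain the same genes")
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            return dist[current]
        for nxt in inversions(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    raise ValueError("target not reachable")  # cannot happen for permutations


if __name__ == "__main__":
    # Two separate segment reversals are needed here.
    print(inversion_distance((1, 2, 3, 4, 5), (2, 1, 4, 3, 5)))
```

The likelihood alternative on the slide ("Sum over paths with probabilities") would replace the shortest path with a weighted sum over all paths in the same graph; that is not attempted here.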
Stemmatology: Evolution of Manuscripts

Manuscript variants of one line:
• Ashmole 59: Buryed at Caane thus seythe the Croniculer
• Digby 186: Beryed att Cane & thus says the cronyclere
• BL Ad 31042: Beryed at caene so seyth the cronyclere
• Lansd. 762: Buried at cane this saith the croneclere
• de Worde: And is buried at Cane as the Cronycle sayes
• R. Wyer: And buryed at cane as the Cronycle sayes

Phylogeny of “Canterbury Tales”: Howe et al., 2001
Phylogenetics of Medieval Manuscripts, by Christopher Howe

RNA Structure Evolution

• Tree Representations of RNA Structure
• Basic Edit Operations
• A Tree Distance
• Pairwise Edit Algorithm

How Do RNA Folding Algorithms Work? S.R. Eddy. Nature Biotechnology, 22:1457-1458, 2004.
Average complexity of the Jiang-Wang-Zhang pairwise tree alignment algorithm and of a RNA secondary structure alignment algorithm. Claire Herrbach, Alain Denise and Serge Dulucq

Protein Structure Evolution

[Figure: known α-globin and known myoglobin connected through unknown ("?") structures; 300 amino acid changes, 800 nucleotide changes, 1 structural change, 1.4 Gyr]

1. Given Structure what are the possible events that could happen?
2. What are their probabilities? Old fashioned substitution + indel process with bias. Bias: Folding (Sequence → Structure) & Fitness of Structure
3. Summation over all paths.

Trajectories between two Secondary Structures

• Observation: two structures with sequence and secondary structure information
• Space of Protein Structures is large and complicated (both continuous and discrete)
• Approximated by a series of stepping stones and a continuous time Markov chain

[Figure: stepping-stone structures S1, S2, S3, …, Sk, …, Sn; levels labelled 3D Structure (1 structure), 2D Structure and 1D Structure (set of sequences), with example amino acid sequences]

The Evolution/Comparison of Molecular Movements

Molecular Movements of Homologous Proteins are themselves homologous

The full problem: 2 times 1000 atoms observed at 10^6 time points.

Reductions:
i. Only α-carbons: 100 space points
ii. Only correlated pairwise movements: 1 dimensional summary for each aa pair

Dynamic Fingerprint Matrix (DFM)

Shapes and Shape Evolution

David F. Wiley, Nina Amenta, Dan A. Alcantara, Deboshmita Ghosh, Yong J Kil, Eric Delson, Will Harcourt-Smith, F. James Rohlf, Katherine St. John, Bernd Hamann, Ryosuke Motani, Steven Frost, Alfred L. Rosenberger, Lissa Tallman, Todd Disotell, and Rob O'Neill

• Landmarks
• Abstract Semilandmarks

We propose to develop software tools for the analysis, interpretation and visualization of three-dimensional shape data from living and extinct organisms, using the statistical framework of geometric morphometrics. While this software will be widely useful in biology and paleontology, we plan to focus our work by concentrating on one significant problem: incorporating fossils into evolutionary trees. Evolutionary trees for groups of living species are usually estimated using DNA sequence data. Since this is usually not available for extinct species, we need to use morphology (the shapes of fossil and modern specimens) to decide how the extinct species should be included in a tree whose framework is based on molecular studies. Specifically, we plan to estimate a well-supported evolutionary tree for the mainly African papionin monkeys, an inherently interesting group that includes about as many extinct as living clusters of species. Our analysis will be based on a large existing database of three-dimensional data (mostly skull surfaces) at the American Museum of Natural History. This end-to-end analysis project should produce research results at all levels. The interactive graphics, visualization and statistical analysis tools we propose are ever more widely needed as the amount of three-dimensional morphology data increases. We expect that the close interaction of geometric morphometrics and computer graphics will lead to new ideas about the representation of shape.
We have new approaches to the problem of integrating morphology with molecular data in the study of evolution, which we hope will be successful and applicable in many parts of the tree of life. And with massive amounts of new data, new processing and analytic software, and new approaches to integrating morphology, we hope to be able to answer specific questions about the evolution of African monkeys, which have remained elusive up until now. Our proposed work will have a variety of broader impacts beyond our own research agendas. A large part of

Gunz (2009) Early modern human diversity suggests subdivided population structure and a complex out-of-Africa scenario
Comparison of cranial ontogenetic trajectories among great apes and humans. Philipp Mitteroecker et al.

Evolutionary Morphing

David F. Wiley
http://graphics.idav.ucdavis.edu/research/projects/EvoMorph

The Evolution/Comparison of Molecular Movements

http://www.stats.ox.ac.uk/__data/assets/file/0015/3327/brooks.pdf

The Phylogenetic Turing Patterns I

The Phylogenetic Turing Patterns II

Reaction-Diffusion Equations:
• Stripes: p small
• Spots: p large

Analysis Tasks:
1. Choose Class of Mechanisms
2. Observe Empirical Patterns
3. Choose Closest set of Turing Patterns T1, T2, .., Tk
4. Choose parameters p1, p2, .., pk (sets?) behind T1, ..

Evolutionary Modelling Tasks:
1. p(t1) - p(t2) ~ N(0, (t1 - t2)Σ)
2. Non-overlapping intervals have independent increments
I.e. Brownian Motion

Scientific Motivation:
1. Is there evolutionary information on pattern mechanisms?
2. How do patterns evolve?

Summary

• Comparative Biology
• The Domain of Comparative Biology
• Co-modeling in Comparative Biology
• The purpose of Comparative Biology
• Examples of Stochastic Comparative Modeling
• Gene Frequencies in Populations
• Genome Structure Evolution
• Stemmatology: Manuscript Evolution
• RNA Secondary Structure Evolution
• Protein Structure Evolution
• Movement Evolution
• Shape Evolution
• Pattern Evolution
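The Evolutionary Modelling Tasks for Turing pattern parameters describe Brownian motion: increments over a time interval are Gaussian with mean zero and variance proportional to the interval length, and increments over non-overlapping intervals are independent. A minimal one-dimensional sketch follows. It assumes a single pattern parameter p and a scalar variance rate `sigma2` standing in for the covariance matrix on the slide; the function name `simulate_parameter_path` and all numeric values are hypothetical.

```python
import math
import random


def simulate_parameter_path(p0, times, sigma2, rng):
    """Simulate one pattern parameter evolving by Brownian motion.

    p0     : parameter value at times[0]
    times  : strictly increasing time points
    sigma2 : variance accumulated per unit time (scalar stand-in for the
             covariance matrix; an assumption of this sketch)
    rng    : a random.Random instance, so runs are reproducible
    """
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")
    path = [p0]
    for t_prev, t_next in zip(times, times[1:]):
        # Each increment is drawn independently: N(0, (t_next - t_prev) * sigma2).
        step_sd = math.sqrt(sigma2 * (t_next - t_prev))
        path.append(path[-1] + rng.gauss(0.0, step_sd))
    return path


if __name__ == "__main__":
    rng = random.Random(2024)
    # Parameter values at four time points along one lineage.
    print(simulate_parameter_path(0.3, [0.0, 1.0, 2.0, 4.0], 0.5, rng))
```

On a phylogeny the same draw would be applied branch by branch, starting each child lineage from its parent's value; a small p would then correspond to stripes and a large p to spots, as on the slide.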