Molecular data assisted morphological analyses

Use molecular data to define the limits of species.
  “Barcoding” needs some baseline information to set the molecular limits of a species.

Polysiphonia study
  rbcL sequences were used as the molecular data.
  The expected level of intraspecific sequence divergence was established by McIvor et al. (2001, Mol. Ecol. 10:911-919), who compared rbcL sequence data with karyological and interbreeding data.
  Sequences were generated from multiple North Carolina specimens.
  Sequence similarity was used for species identification.
  Phylogenetic analyses were used to determine evolutionary relationships (an advantage of rbcL as a “barcoding” gene).

Phylogenetic tree
  Species specimen tree, based on the phylogenetic tree, used for character state mapping.
  Analyze morphological characters in each molecularly-defined specimen.
  Determine the consistency of morphological characters within species.

[Figures: two character-state maps on the specimen tree. Terminals carry specimen labels in the range Poly NC-1 to Poly NC-33; clades are labeled A to H. Legend of one map: 4 pericentral cells (0); 5-7 pericentral cells (1); > 8 pericentral cells (2). Legend of the other map: Linear (0); Lanceolate (1); Linear to lanceolate (0/1); Lanceolate to fractiflexus (1/2); Linear to fractiflexus (0/2); No information; Equivocal.]

Trees, Phylogenetic Analyses and other ventures down the Dark Side.
Phylogenetic Analyses, Trees, etc.
Tree terminology
  Node: point at which 2 or more branches diverge.
  Internal node: hypothetical last common ancestor.
  Terminal node: the molecular or morphological data from which the tree is derived (OTUs = Operational Taxonomic Units).
  Clade: a node and everything arising from it.
  [Figure: example tree with the labels "internal node", "terminal node or OTU" and "clade".]
  Monophyletic group: a group in which all members are derived from a unique common ancestor.
  Polyphyletic group: a group whose members are not all derived from a unique common ancestor. The common ancestor of the group has many descendants that are not in the group.
  Paraphyletic group: a group that excludes some of the descendants of its common ancestor.

A couple of points about trees…
  [Figure: example tree with terminals A, B, C, D and E.]
  All branches can rotate freely around a node, i.e. B is no more closely related to C than A is, and C is no more closely related to D than it is to E.
  Branch lengths may be proportional to the hypothesized distance between nodes: PAUP’s “phylogram”.
  Branch lengths may be drawn as equal between nodes: “cladograms” (these are used when one is interested only in the branching pattern).
  DNA or protein sequence trees are hypotheses of how a particular DNA locus or protein has evolved.
  We assume that the way the DNA or protein has evolved reflects the way the species has evolved, i.e. gene tree = species tree.
  This may or may not reflect reality, i.e. molecules do not necessarily trump morphology, etc.

DNA sequence analyses (protein as well)
  [Figure; source: www.sinauer.com]
  Molecular trees are only as good as the data they are based upon, i.e.
GARBAGE IN = GARBAGE OUT

Sequence alignment is the most important step in phylogenetic analysis
  The same sites in different sequences need to be homologous.
  [Figure: sequence alignment annotated with "inferred insertion/deletion mutations (gaps)" and "area to possibly remove because of uncertain homology between sites".]

Analysis methods
  Distance methods: based on similarity between OTUs.
  Optimization methods
    Parsimony: searching for the tree that requires the fewest mutational steps.
    Maximum Likelihood: searching for the tree under which the observed sequences (OTUs) are most probable, given a model of evolution.
    Bayesian: searching for a set of trees whose likelihoods are so similar that changes between them are essentially random.
  The choice of analysis method may be deeply philosophical, or it may be based on practicality: what method can I use and get a result in a reasonable amount of time?

Testing Trees
  Decay Analysis or Bremer Support Values: a test used in parsimony analyses in which one determines how many steps longer than the most parsimonious tree a tree must be before a particular branch is no longer resolved in the consensus of all possible trees of that length.
  Bootstrapping: a way to test the level of support in your data for a particular relationship in your tree.
    By default, most programs show bootstrap values only when they are greater than 50, but does a bootstrap value of 50 mean anything?
    Hillis & Bull (1993) Systematic Biology 42:182-192 tested bootstrap values against a known phylogeny.
  Wilson’s Rule:
    60-80: be cautious; is there other evidence to support the relationship?
    80-90: usually pretty solid.
    90-100: solid and unlikely to be misleading.
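
To make the bootstrapping idea and Wilson’s Rule concrete, here is a minimal Python sketch. It is a toy, not what PAUP or any other program does. It assumes a tiny made-up alignment with hypothetical OTU names A, B and C, and it uses "these two OTUs are the single closest pair by uncorrected p-distance" as a stand-in for "this branch appears in the tree"; a real analysis would rebuild a tree for every replicate. The function names are invented for illustration. How the shared boundaries in Wilson’s Rule (80 and 90) are assigned, and the wording for values below 60, are also assumptions, since the rule as quoted does not say.

```python
import itertools
import random


def p_distance(a, b):
    """Proportion of differing sites, skipping columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def resample_columns(alignment, rng):
    """Build one bootstrap pseudo-replicate by drawing alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, pair, replicates=200, seed=1):
    """Percentage of replicates in which `pair` is the single closest pair of OTUs.

    Toy stand-in for branch support: no tree is built here (assumption).
    """
    rng = random.Random(seed)
    target = frozenset(pair)
    hits = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        dist = {
            frozenset((a, b)): p_distance(rep[a], rep[b])
            for a, b in itertools.combinations(rep, 2)
        }
        best = min(dist.values())
        closest = [k for k, d in dist.items() if d == best]
        if closest == [target]:  # ties do not count as support
            hits += 1
    return 100.0 * hits / replicates


def wilsons_rule(value):
    """Map a bootstrap value to the wording of Wilson's Rule.

    Assumption: a boundary value (80, 90) goes to the higher band.
    """
    if value >= 90:
        return "solid and unlikely to be misleading"
    if value >= 80:
        return "usually pretty solid"
    if value >= 60:
        return "be cautious; look for other evidence"
    return "below the range the rule covers"


# Hypothetical toy alignment: A and B are identical, C differs at every site.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAC",
    "C": "TGCATGCATG",
}
support = bootstrap_support(alignment, ("A", "B"))
print(support, "->", wilsons_rule(support))
```

With real data the interesting cases are the intermediate ones, where only some resampled alignments recover the grouping; the resampling of columns with replacement is the part that carries over to real bootstrap analyses.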