Wilson_some_Molecular_info.ppt

Molecular data assisted morphological analyses
Use molecular data to define the limits of species.
For “barcoding”, some baseline information is needed to set the molecular limits of a species.
Polysiphonia study
rbcL sequences as molecular data
The expected level of intraspecific sequence divergence was established by McIvor et al. (2001, Mol. Ecol. 10:911-919), who compared rbcL sequence data with karyological and interbreeding data.
Sequences generated from multiple North Carolina specimens
sequence similarity used for species identification
phylogenetic analyses used to determine evolutionary relationships
(an advantage of rbcL as a “barcoding” gene)
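As a minimal sketch of how sequence similarity could be used for species identification, the Python below computes the uncorrected proportion of differing sites (p-distance) between two aligned sequences and compares it with a threshold. The function names and the `max_intraspecific` parameter are assumptions made for illustration; the threshold itself has to come from baseline work such as McIvor et al. (2001) and is deliberately not filled in here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases; only compare unambiguous nucleotides.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def within_species_limit(seq_a: str, seq_b: str, max_intraspecific: float) -> bool:
    """True if the divergence does not exceed the supplied (hypothetical) limit."""
    return p_distance(seq_a, seq_b) <= max_intraspecific
```

Passing the threshold in as an argument keeps the baseline study and the comparison separate: the same function can be reused with whatever limit is justified for the gene and the group being studied.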
[Figure: phylogenetic tree of the specimens, used for character state mapping]
Analyze morphological characters in each molecularly-defined specimen
Determine the consistency of morphological characters within species
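The consistency check described above can be sketched roughly as follows. This is only an illustration: the specimen names, species labels and character states in the usage comment are hypothetical and are not taken from the Polysiphonia data set.

```python
from collections import defaultdict


def character_consistency(species_of, state_of):
    """Report, for each species, whether all scored specimens share one state.

    species_of: specimen name -> molecularly defined species
    state_of:   specimen name -> coded character state (unscored specimens absent)
    """
    seen = defaultdict(set)
    for specimen, species in species_of.items():
        if specimen in state_of:  # ignore specimens with no information
            seen[species].add(state_of[specimen])
    return {species: len(states) == 1 for species, states in seen.items()}


# Hypothetical usage:
# species_of = {"s1": "sp_A", "s2": "sp_A", "s3": "sp_B", "s4": "sp_B"}
# state_of = {"s1": 0, "s2": 0, "s3": 1, "s4": 2}
# character_consistency(species_of, state_of)
```

A character that is consistent within every molecularly defined species is a candidate for identifying that species from morphology alone.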
[Figure: two character-state mappings on the phylogenetic tree of the North Carolina specimens (labels range from Poly NC-1 to Poly NC-33), with clades labeled A to H.
States of the first character: 4 pericentral cells (0); 5-7 pericentral cells (1); > 8 pericentral cells (2).
States of the second character: linear (0); lanceolate (1); linear to lanceolate (0/1); lanceolate to fractiflexus (1/2); linear to fractiflexus (0/2).
Other legend entries: no information; equivocal.]
Trees, Phylogenetic Analyses and other ventures down the Dark Side.
Phylogenetic Analyses, Trees, etc.
Tree terminology
Node: point at which 2 or more branches diverge
Internal node: hypothetical last common ancestor
Terminal node: molecular or morphological data from which the tree is derived (OTUs = Operational Taxonomic Units)
Clade: a node and everything arising from it
Monophyletic group: a group in which all members are derived from a unique common ancestor
Polyphyletic group: a group whose members are not all derived from a unique common ancestor. The common ancestor of the group has many descendants that are not in the group
Paraphyletic group: a group that excludes some of the descendants of the common ancestor
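To make the clade and monophyly definitions concrete, here is a small sketch that represents a rooted tree as nested Python tuples with OTU names as strings. The representation and the function names are assumptions chosen for illustration. The check only separates monophyletic groups from non-monophyletic ones; it does not distinguish paraphyletic from polyphyletic groups.

```python
def leaves(tree):
    """Set of OTU names at or below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Leaf sets of every clade: each node together with everything arising from it."""
    found = {leaves(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= clades(child)
    return found


def is_monophyletic(tree, group):
    """A group is monophyletic if it matches the leaf set of some clade exactly."""
    return frozenset(group) in clades(tree)


# Hypothetical five-OTU tree, not one of the trees shown in the slides.
example_tree = ((("A", "B"), "C"), ("D", "E"))
```

With this example tree, the group A + B + C corresponds to a clade, whereas B + C alone does not, because the node uniting B and C also gives rise to A.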
A couple of points about trees…
All branches can rotate freely around a node
i.e. B is not more closely related to C than A, and C is not more closely related to D than E
[Figure: example tree with terminal nodes A to E]
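One way to see that rotating branches around a node changes nothing is to reduce each tree to a canonical form in which the children of every node are sorted; two drawings of the same rooted tree then compare equal. The sketch below assumes trees written as nested tuples of OTU names, and the example trees are hypothetical rather than the one drawn on the slide.

```python
def canonical(tree):
    """Return the tree with the children of every node put in a fixed order."""
    if isinstance(tree, str):
        return tree
    # Sorting on repr() allows strings and nested tuples to be ordered together.
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def same_topology(tree_1, tree_2):
    """True if the two rooted trees differ only by rotation around nodes."""
    return canonical(tree_1) == canonical(tree_2)


drawn_one_way = (("A", "B"), ("C", ("D", "E")))
drawn_rotated = ((("E", "D"), "C"), ("B", "A"))
different_tree = (("A", "C"), ("B", ("D", "E")))
```

Here `drawn_one_way` and `drawn_rotated` list the tips in a different order from top to bottom but describe the same relationships, while `different_tree` groups A with C and is a genuinely different hypothesis.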
Branch lengths may be proportional to the hypothesized distance between nodes (PAUP’s “phylogram”)
Branch lengths may be drawn as equal between nodes (“cladograms”); these are used when one is interested only in the branching pattern
DNA or protein sequence trees are hypotheses of how a particular DNA locus or protein has evolved.
We assume that the way the DNA or protein has evolved reflects the way the species has evolved, i.e. gene tree = species tree.
This may or may not reflect reality, i.e. molecules do not necessarily trump morphology, etc.
DNA sequence analyses (protein as well)
www.sinauer.com
Molecular trees are only as good as the data they are based upon, i.e. GARBAGE IN = GARBAGE OUT
Sequence alignment is the most important step in phylogenetic analysis
The same sites in different sequences need to be homologous
[Figure: sequence alignment, annotated with an area to possibly remove because of uncertain homology between sites, and with inferred insertion/deletion mutations (gaps)]
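As a rough illustration of trimming an alignment before analysis, the sketch below drops every column in which any sequence has a gap. This is a crude stand-in for the judgment described above: deciding which regions have uncertain homology normally requires inspecting the alignment, and the function name and gap symbols are assumptions.

```python
def drop_gapped_columns(alignment, gap_chars="-?"):
    """Remove alignment columns in which any sequence has a gap character.

    alignment: dict mapping sequence name -> aligned sequence string
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all sequences must be present and of equal aligned length")
    keep = [
        i for i in range(len(seqs[0]))
        if not any(seq[i] in gap_chars for seq in seqs)
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Hypothetical usage:
# drop_gapped_columns({"x": "AC-GT", "y": "ACTGA", "z": "A-TGT"})
```

Every retained column still lines up the same site across all sequences, which is the property the analysis depends on.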
Analysis methods
Distance methods: based on similarity between OTUs
Optimization methods
Parsimony: searching for the tree that requires the fewest mutational steps
Maximum Likelihood: searching for the tree with the highest likelihood, i.e. the tree under which the observed sequences (OTUs) are most probable given a model of evolution
Bayesian: searching for a set of trees in which the likelihoods are so similar that changes between them are essentially random
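To show what counting mutational steps means for parsimony, here is a minimal sketch of Fitch's algorithm, which gives the minimum number of state changes a rooted binary tree requires at each site. It scores trees that are supplied to it and does not search tree space; the tree encoding (nested two-element tuples) and the tiny alignment are hypothetical.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum steps) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree  # assumes a strictly binary tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # No shared state: one additional change is needed at this node.
    return left_set | right_set, left_steps + right_steps + 1


def parsimony_length(tree, alignment):
    """Total number of steps over all sites of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


alignment = {"A": "AAG", "B": "AAG", "C": "ACG", "D": "ACT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
# tree_1 needs 2 steps and tree_2 needs 3, so parsimony prefers tree_1.
```

A parsimony search repeats this scoring over many candidate trees and keeps the shortest, which is why the running time grows quickly with the number of OTUs.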
The choice of analysis method may be deeply philosophical or it may be based on practicality: what method can I use and get a result in a reasonable amount of time?
Testing Trees
Decay Analysis or Bremer Support Values: a test used in parsimony analyses in which one determines how many steps longer than the most parsimonious tree a tree must be before a particular branch in your tree is no longer resolved in the consensus of all possible trees of that length.
Bootstrapping: a way to test the level of support in your data for a particular relationship in your tree.
By default most programs will show bootstrap values when they are greater than 50%.
But does a bootstrap value of 50% mean anything?
Hillis & Bull (1993) Systematic Biology 42:182-192 (tested bootstrap values based on a known phylogeny)
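The resampling step behind a bootstrap value can be sketched as follows: columns of the alignment are drawn at random with replacement to build pseudo-replicate data sets of the original length, and support is the percentage of replicates in which the relationship of interest is recovered. The tree-building step is left to a caller-supplied function (`has_group`, a hypothetical name), since in practice it would be a full parsimony, distance or likelihood analysis.

```python
import random


def resample_columns(alignment, rng):
    """Build one pseudo-replicate by sampling sites with replacement."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def bootstrap_support(alignment, has_group, replicates=100, seed=0):
    """Percentage of replicates in which has_group(replicate) is True.

    has_group is supplied by the caller: it should analyze the replicate and
    report whether the relationship being tested appears in the result.
    """
    rng = random.Random(seed)
    hits = sum(
        bool(has_group(resample_columns(alignment, rng)))
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates
```

Because whole columns are resampled, each pseudo-replicate keeps sites aligned across sequences; only the weight given to each site changes from one replicate to the next.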
Wilson’s Rule:
60-80, be cautious: is there other evidence to support the relationship?
80-90, usually pretty solid
90-100, solid and unlikely to be misleading