Interpretation of phylogenetic trees

Phylogenetic Interpretation
Dr Laura Emery
Laura.Emery@ebi.ac.uk
www.ebi.ac.uk
Objectives
• After this tutorial you should be able to…
• Discuss the impact of a range of biological phenomena upon
phylogenetic inference
• Appreciate some challenges and limitations of the phylogenetic
approach
• Interpret published phylogenies (and your own)
Phylogenetic interpretation is essential throughout data analysis
[Flowchart of the analysis cycle]
• Formulate hypotheses
• Decide upon and implement method
• Phylogenetic result(s)
• Data assessment
  - known biology
  - additional data (e.g. geography)
• Decision points (Yes/No): Can you validate this? Answered your question?
• Investigate unexpected and unresolved aspects further
  - consider including more data
• Final phylogeny and analysis
Phylogenetic interpretation skill set
1. Tree-thinking skills
• Revise: relatedness, trait evolution, confidence, homology
2. Knowledge of phylogenetic methods and their limitations
3. Knowledge of biological processes affecting sequence
evolution
• gene duplication, recombination, horizontal gene transfer,
population genetic processes, and many more!
4. Knowledge of the data you wish to interpret
Recap of tree-thinking skills
1. Relatedness
2. Trait evolution
3. Confidence
4. Homology
1. Relatedness: taxa that share a more recent common ancestor are more
closely related
[Figure: family tree marking the most recent common ancestor shared with a
first cousin and the most recent common ancestor shared with a second cousin]
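A minimal sketch of the idea, assuming a toy family tree stored as child-to-parent links (every name below is hypothetical, chosen to mirror the cousin example). The more recent the shared ancestor, the fewer steps back it takes to reach it.

```python
# Toy rooted tree stored as child -> parent links (all names are hypothetical).
PARENT = {
    "you": "parent",
    "parent": "grandparent",
    "first_cousin": "aunt",
    "aunt": "grandparent",
    "grandparent": "great_grandparent",
    "second_cousin": "parents_cousin",
    "parents_cousin": "great_aunt",
    "great_aunt": "great_grandparent",
}

def ancestors(node):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def mrca(a, b):
    """Most recent common ancestor: first node on a's path that is also on b's."""
    on_b_path = set(ancestors(b))
    for node in ancestors(a):
        if node in on_b_path:
            return node
    raise ValueError("taxa are not in the same tree")

def steps_to_mrca(a, b):
    """Number of parent links from a up to the MRCA of a and b."""
    return ancestors(a).index(mrca(a, b))

for relative in ("first_cousin", "second_cousin"):
    print(relative, mrca("you", relative), steps_to_mrca("you", relative))
```

Here the first cousin shares a more recent ancestor (fewer steps back) than the second cousin, so the first cousin is the closer relative.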
2. Trait evolution
• It can be useful to map traits onto phylogenies as a first
step in inferring their evolutionary histories
• Interpreting trait evolution in its phylogenetic context is
rarely straightforward!
• Assumptions must be made regarding the loss and gain
of traits
• It is often useful to construct alternative scenarios
• Then we have to decide upon the most plausible scenario
(character state methods, e.g. maximum parsimony (MP) and maximum
likelihood (ML), can be applied)
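As an illustration of the parsimony (MP) option, here is a minimal sketch of Fitch's algorithm, which counts the fewest trait changes needed on a fixed tree. It assumes a strictly bifurcating tree written as nested pairs and treats gains and losses as equally costly. The tree and trait states are invented for the example.

```python
def fitch(tree, states):
    """Return (candidate ancestral states, minimum number of changes).

    tree   -- a tip name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each tip name to its observed trait state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children can agree on a state: no extra change needed at this node.
        return shared, left_cost + right_cost
    # Children disagree: at least one change on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1

# Hypothetical five-taxon tree and trait distribution.
tree = ((("A", "B"), "C"), ("D", "E"))
states = {"A": "present", "B": "present", "C": "absent",
          "D": "absent", "E": "present"}

root_states, changes = fitch(tree, states)
print(root_states, changes)
```

Alternative scenarios can be compared by their change counts. Weighting gains and losses differently, which is often biologically sensible, would need a different algorithm (e.g. Sankoff parsimony).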
Example: The Evolution of Mitochondria
[Figure: eukaryote phylogeny with the origin of eukaryotes marked and trait
changes mapped (G = gain, L = loss). Ginger et al. 2010]
Scenario one: Mitochondria evolved from mitosomes
Scenario two: Mitochondria occurred at the origin of eukaryotes
3. Tree Confidence Question
Does this tree support the grouping of pelecaniforms and
ciconiiforms as a monophyletic group?
4. Homology is similarity due to shared
ancestry
Example: limbs and wings
• Limbs are homologous:
they are inherited from a
common ancestor
• Wings are not homologous:
they are analogous, as
their similarity has
evolved independently
Homology Question: Trap-jaws in ants
Based on this phylogeny, which scenario do you think is more likely?
• trap-jaws are homologous
• trap-jaws are analogous and have evolved independently four times
[Figure: ant phylogeny with trap-jaw gains (G) and losses (L) mapped.
Moreau et al. 2006]
Scenario one: Trap-jaws are homologous (one gain, multiple losses)
Scenario two: Trap-jaws are analogous (four gains; more parsimonious)
Processes that affect sequence evolution
1. Gene/genome duplication and divergence
2. Recombination
3. Horizontal gene transfer
4. Coevolution
5. Migration
6. Rate and time of divergence
7. Other
1. Gene duplication
Gene duplication and
subsequent divergence
can result in novel gene
functions (it can also
result in pseudogenes)
• Genes that are
homologous due to
gene duplication are
paralogous
• Genes that are
homologous due to
speciation are
orthologous
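The distinction can be sketched in code, assuming a gene tree whose internal nodes have already been labelled as speciation or duplication events. The gene names and labels below are hypothetical; in practice the labels come from reconciling the gene tree with a species tree.

```python
def tips(node):
    """Return the set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _event, left, right = node
    return tips(left) | tips(right)

def relationship(node, gene_a, gene_b):
    """Classify two genes by the event at their most recent common ancestor."""
    if (isinstance(node, str) or gene_a == gene_b
            or not {gene_a, gene_b} <= tips(node)):
        raise ValueError("need two different genes that are both in the tree")
    event, left, right = node
    for child in (left, right):
        if {gene_a, gene_b} <= tips(child):
            # Both genes sit in the same subtree: their ancestor is further down.
            return relationship(child, gene_a, gene_b)
    # The genes split at this node, so this node is their common ancestor.
    return "orthologous" if event == "speciation" else "paralogous"

# Hypothetical gene family: one duplication, then speciation in each copy.
gene_tree = ("duplication",
             ("speciation", "human_copy1", "mouse_copy1"),
             ("speciation", "human_copy2", "mouse_copy2"))

print(relationship(gene_tree, "human_copy1", "mouse_copy1"))
print(relationship(gene_tree, "human_copy1", "human_copy2"))
print(relationship(gene_tree, "human_copy1", "mouse_copy2"))
```

Note that genes from different species can still be paralogous if their common ancestor is a duplication node.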
Gene duplication question
This is a tree of a gene family that has undergone one gene
duplication event in its evolutionary past.
Where on the tree did this occur?
Is the event well-supported?
Cells Tissues & Organs 2007
2. Recombination
• Single or small numbers of events:
• Within genes
• Between genes
• Where there is extensive recombination - a phylogenetic
approach is inappropriate (not tree-like)
Recombination example: Dengue-2 virus
data from E. Holmes, figure from A. Rambaut
Recombination Question
Can you spot the recombinant strain?
Mauro et al 2003
3. Horizontal Gene Transfer (HGT/LGT)
Horizontal gene transfer
violates the assumption that
sequences have evolved in
a tree-like manner
• Where sparse, can be
detected by comparing with
species phylogeny
• Where extensive,
phylogenetic approach is
inappropriate
Gogarten & Townsend 2005
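One simple way to compare a gene tree with the species phylogeny is to list the groupings (clades) that appear in one but not the other. This is a rough sketch with invented trees written as nested tuples. Conflicting clades are only a hint of possible transfer, since tree error or other processes can produce the same pattern.

```python
def clades(tree):
    """Return the set of clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def collect(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(collect(child) for child in node))
        found.add(members)
        return members

    collect(tree)
    return found

# Hypothetical trees: in the gene tree, the gene from D groups with A.
species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = (((("A", "D"), "B"), "C"), "E")

conflicting = clades(gene_tree) - clades(species_tree)
for clade in sorted(conflicting, key=len):
    print(sorted(clade))
```

The smallest conflicting clade (here A with D) points to the lineages worth investigating further.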
Phylogenetics is not appropriate for highly
recombinant taxa
• Phylogenetics assumes that patterns of relatedness among
taxa follow a tree-like structure
• Recombination and horizontal gene transfer produce
networks
• Avoid phylogenetics for:
• Intraspecific data from sexual species (recombination at each meiosis)
• Asexual species with extensive HGT (e.g. some Bacteria)
Horizontal gene transfer question
Can you spot the horizontally transferred gene?
4. Coevolution
Where parasites or symbionts co-evolve with their hosts,
both topologies are expected to be very similar.
Weiss 2009 from Reed et al 2007
Coevolution Question
Do these phylogenies provide evidence that the lice are
inherited vertically?
Hafner & Nadler 1988
5. Migration
Patterns of migration influence phylogenetic topology, especially in
structured populations
Phylogeography example: Chimpanzees
P. t. troglodytes and P. t. schweinfurthii are more dissimilar than you
would expect given their proximity
> Chimpanzees can't cross rivers!
Gao et al 1999
Migration Question
What can you infer about patterns of migration of the Taiwanese stag
beetle based upon this phylogeny?
Black = Taiwan
6. Rate and time of divergence
• Phylogenies can be used to date divergence times when
some temporal information is known
• e.g. carbon dating from fossil evidence
• e.g. dates of sample isolation
• Genetic change (substitutions/site) =
Evolutionary rate (substitutions/site/year) × Divergence time (years)
• If all lineages evolve at the same
rate (i.e. there is a molecular clock)
then branch lengths should reflect
divergence times
[Figure: example tree with tips A to E]
Is there a molecular clock?
• Zuckerkandl and Pauling (1962)
• Number of substitutions in haemoglobin is roughly proportional to
time, based upon fossil dating
Dating divergence with a molecular clock
d = genetic distance (branch length)
[Figure: tree of a, b and c with divergence times X and T marked]
We know time T since a and c diverged
We want to find out time X since a and b diverged
1. Use T to estimate the evolutionary rate r
r = d(a-c) / 2T
2. Use r to estimate time X
X = 1/2 (d(a-b) / r)
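The two steps can be worked through numerically. The distances and calibration time below are made-up values chosen only to make the arithmetic easy to follow, and the calculation assumes a strict molecular clock.

```python
def clock_rate(distance_ac, time_t):
    """Step 1: r = d(a-c) / 2T, in substitutions/site/year.

    The factor of 2 is there because change accumulates along both
    lineages since they diverged.
    """
    return distance_ac / (2 * time_t)

def divergence_time(distance_ab, rate):
    """Step 2: X = 1/2 (d(a-b) / r), in years."""
    return 0.5 * (distance_ab / rate)

# Hypothetical values (not from any real data set).
d_ac = 0.10        # substitutions/site between a and c
T = 5_000_000      # years since a and c diverged (known calibration)
d_ab = 0.04        # substitutions/site between a and b

r = clock_rate(d_ac, T)
X = divergence_time(d_ab, r)
print(f"rate r = {r:.2e} substitutions/site/year")
print(f"time X = {X:,.0f} years")
```

With these numbers the rate comes out at about 1e-8 substitutions/site/year and X at about 2 million years.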
Dating Drosophila Divergence around Hawaii
• The volcanic activity around Hawaii has produced a
chain of islands; the oldest is furthest away from the
mainland
• Several species including Drosophila have diverged
with island formation
Figure: Andrew Rambaut, from Fleischer, McIntosh & Tarr 1998
Dating Drosophila Divergence in Hawaii
• Island formation dates reflecting species’ divergence
were plotted against genetic distance (branch length)
• Genetic distance scaled linearly with divergence date,
indicating the presence of a molecular clock
[Figure: genetic distance plotted against time; gradient = evolutionary
rate. Fleischer, McIntosh & Tarr 1998]
NB: Not all species exhibit a molecular clock!
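The gradient can be estimated with an ordinary least-squares line through the plotted points. The sketch below uses invented island ages and distances, not the values from Fleischer, McIntosh & Tarr 1998. A near-zero intercept and a good linear fit are what a molecular clock would predict.

```python
def fit_line(times, distances):
    """Least-squares fit of distance = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    return slope, intercept

# Hypothetical calibration points: island age (million years) against
# genetic distance (substitutions/site) between the diverged species.
island_age_my = [0.5, 1.5, 3.0, 5.0]
genetic_distance = [0.01, 0.03, 0.06, 0.10]

slope, intercept = fit_line(island_age_my, genetic_distance)
print(f"gradient = {slope:.3f} substitutions/site per million years")
print(f"intercept = {intercept:.3f}")
```

Real data will scatter around the line; large or systematic departures suggest that the lineages do not share a single rate.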
7. Other biological processes can complicate molecular analyses
• Population genetic processes
• Epidemiological processes
• Gene conversion
• Codon bias
• Hypermutable sites
• Concerted evolution
• Reassortment
• Many more…
Summary: Phylogenetic interpretation skill set
1. Tree-thinking skills
• Revise: relatedness, trait evolution, confidence, homology
2. Knowledge of phylogenetic methods and their limitations
3. Knowledge of biological processes affecting sequence
evolution
• gene duplication, recombination, horizontal gene transfer,
population genetic processes, and many more!
4. Knowledge of the data you wish to interpret
Further Reading
• Molecular Evolution: A Phylogenetic Approach (1998), Roderic D. M. Page
and Edward C. Holmes, Blackwell Science, Oxford.
• The Phylogenetic Handbook (2003), Marco Salemi and Anne-Mieke
Vandamme (Eds), Cambridge University Press, Cambridge.
• Inferring Phylogenies (2003), Joseph Felsenstein, Sinauer.
• Molecular Evolution (1997), Wen-Hsiung Li, Sinauer.
Acknowledgements
People
• Andrew Rambaut (University of Edinburgh)
• Paul Sharp (University of Edinburgh)
• Nick Goldman (EMBL-EBI)
• Benjamin Redelings (Duke University)
• Brian Moore (University of California, Davis)
• Olivier Gascuel (University of Montpellier)
• Aiden Budd (EMBL-EBI)
• …and the EBI training team
Funding
EMBL member states and…
Thank you!
Now it's your turn…
• Open your tutorial manual and begin Tree-thinking quiz 2
(appendix 2)
• The manual is available to download from:
http://www.ebi.ac.uk/training/course/scuola-di-bioinformatica2013
• When you are finished you can mark your own.
• Remember to ask for help at any stage!