Balanced Minimum Evolution

DISTANCE-BASED PHYLOGENETIC RECONSTRUCTION
1. Compute distance matrix D.
2. Find binary tree using just D.
Balanced Minimum Evolution (BME) is a distance-based method that infers a
phylogenetic tree from a matrix of pairwise distances.
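A minimal sketch of step 1, assuming the sequences are already aligned and using the simple p-distance (proportion of differing sites) instead of a model-corrected distance. The function names and the toy sequences are illustrative and do not come from the slides.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs):
    """Build a symmetric dict-of-dicts D[u][v] from {name: aligned sequence}."""
    names = sorted(seqs)
    D = {u: {v: 0.0 for v in names} for u in names}
    for u, v in combinations(names, 2):
        D[u][v] = D[v][u] = p_distance(seqs[u], seqs[v])
    return D

# Toy alignment (made up for illustration only).
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA", "D": "TCGAACTA"}
D = distance_matrix(toy)
```

Step 2 then works from D alone; the sequences are no longer needed.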
MINIMUM EVOLUTION PHYLOGENETIC RECONSTRUCTION
[Figure: for a fixed distance matrix and the tree topology being considered, assign branch lengths using ME, then sum up the branch lengths (ex. 36).]
Goal: Find the tree topology T with the smallest sum of branch lengths (as assigned by ME).
That is, minimize the sum of branch lengths over all (2n-5)!! unrooted binary tree topologies on n taxa!
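To get a feel for how fast (2n-5)!! grows, here is a small sketch that evaluates the double factorial 1 * 3 * 5 * ... * (2n-5) for n labelled taxa. The function name is illustrative.

```python
def num_unrooted_binary_topologies(n):
    """Return (2n-5)!! = 1 * 3 * 5 * ... * (2n-5) for n >= 3 labelled leaves."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count

for n in (4, 5, 10, 20):
    print(n, num_unrooted_binary_topologies(n))
```

Already for 10 taxa there are about two million topologies, which is why exhaustive search is only practical for very small n.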
• Given the matrix of pairwise evolutionary distances, the
ME approach estimates the length of any given tree
topology and then selects the tree topology with shortest
length.
• Minimum evolution is conceptually close to character-based parsimony.
• Complies with Occam’s principle of scientific inference,
which essentially maintains that simpler explanations are
preferable to more complicated ones and that ad hoc
explanations should be avoided.
• Numerous variants of the ME principle exist, depending on
how the branch lengths are estimated and how the tree
length is calculated from these branch lengths.
How do we assign branch lengths to a tree topology?
LEAST SQUARES ESTIMATE
(HOW TO ASSIGN BRANCH LENGTHS TO A TREE TOPOLOGY)
Least Squares
Observe the red data points.
Find the blue quadratic that minimizes the sum of the squared distances
from the red points to the blue quadratic.
ME analogy for least squares on trees:
Red dots -> Estimated distances (D)
Blue quadratic -> Binary tree
Residual/Error -> Sum of branch lengths
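The least-squares idea can be sketched on the smallest interesting case, a four-taxon tree AB|CD with pendant branches a, b, c, d and internal branch e. Each of the six pairwise distances is modelled as the sum of the branches on the path between the two taxa, and ordinary least squares picks the branch lengths that minimize the squared residuals. This is an illustrative sketch that solves the normal equations with a small hand-written solver; the function names and the input layout are assumptions, not part of the slides.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_quartet(d):
    """OLS branch lengths [a, b, c, d, e] for the quartet topology AB|CD.

    `d` maps the pair labels "AB", "AC", "AD", "BC", "BD", "CD" to distances.
    """
    pairs = ["AB", "AC", "AD", "BC", "BD", "CD"]
    # One row per pair; columns a, b, c, d, e mark the branches on its path.
    X = [[1, 1, 0, 0, 0],
         [1, 0, 1, 0, 1],
         [1, 0, 0, 1, 1],
         [0, 1, 1, 0, 1],
         [0, 1, 0, 1, 1],
         [0, 0, 1, 1, 0]]
    y = [d[p] for p in pairs]
    # Normal equations: (X^T X) beta = X^T y
    XtX = [[sum(X[k][i] * X[k][j] for k in range(6)) for j in range(5)]
           for i in range(5)]
    Xty = [sum(X[k][i] * y[k] for k in range(6)) for i in range(5)]
    return solve(XtX, Xty)

# Distances generated from branch lengths a=1, b=2, c=3, d=4, e=5.
lengths = ols_quartet({"AB": 3, "AC": 9, "AD": 10, "BC": 10, "BD": 11, "CD": 7})
tree_length = sum(lengths)
```

Because the example distances fit the tree exactly, the residuals are zero and the fitted lengths recover the generating values. With noisy distances the fit is only approximate, and the sum of the fitted lengths is the score that ME compares across topologies.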
[Figure: for a fixed distance matrix and the tree topology being considered, assign branch lengths using least squares, then sum up the branch lengths (ex. 36).]
LEAST SQUARES ASSIGNMENT OF BRANCH LENGTHS
• If distance estimates are independent with the same variance,
use ordinary least squares (OLS).
• If distance estimates are independent with different variances,
use weighted least squares (WLS). BME corresponds to a particular
choice of WLS weights (Desper and Gascuel, 2004).
• It is well known that distance estimates obtained from sequences
do not have the same variance: the largest distances are much more
variable than the shortest ones (Fitch and Margoliash, 1967).
They are also mutually dependent when they share a common history
(or path) in the true phylogeny (Nei and Jin, 1989).
• Thus ordinary least squares is a poor fit for the features of
evolutionary distance data.
BALANCED MINIMUM EVOLUTION
• In BME, sibling subtrees have equal weight, as opposed to the
standard unweighted OLS, where all taxa have the same
weight and thus the weight of a subtree is equal to the number
of its taxa.
• BME is statistically consistent!
• Finding the BME-optimal tree is NP-hard [W. Day (87)].
• BME outperforms Neighbor Joining, BIONJ, WEIGHBOR and
FITCH [Desper, Gascuel 2002].
• FastME (software, with a web version) is a heuristic that searches
for the BME tree using NNI and SPR moves. As a heuristic, it is not
guaranteed to find the optimal tree.
WHY IS IT CALLED “BALANCED”?
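$\delta_{ij}$ = distance estimate between taxa $i$ and $j$.
$\delta^T_{AB}$ (or $\delta^T_{A|B}$) is the balanced distance between the taxa in subtree A and the taxa in subtree B of tree T.
If A = {a} and B = {b} are single taxa: $\delta^T_{AB} = \delta_{ab}$.
If B is composed of two subtrees B1 and B2:
$$\delta^T_{AB} = \tfrac{1}{2}\left(\delta^T_{AB_1} + \delta^T_{AB_2}\right)$$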
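The "balanced" weighting can be sketched with a short recursion: whenever a subtree splits into two sibling subtrees, each sibling contributes one half, regardless of how many taxa it contains. The sketch below contrasts this with the unweighted average, where every taxon pair counts equally. Subtrees are written as nested tuples of taxon names; this representation, the function names, and the toy distances are assumptions made for illustration.

```python
def symmetric(pairs):
    """Expand {(i, j): d} into a symmetric dict-of-dicts delta[i][j]."""
    delta = {}
    for (i, j), d in pairs.items():
        delta.setdefault(i, {})[j] = d
        delta.setdefault(j, {})[i] = d
    return delta

def balanced_distance(A, B, delta):
    """Balanced average distance: sibling subtrees get equal weight 1/2."""
    if isinstance(A, tuple):
        A1, A2 = A
        return 0.5 * (balanced_distance(A1, B, delta) + balanced_distance(A2, B, delta))
    if isinstance(B, tuple):
        B1, B2 = B
        return 0.5 * (balanced_distance(A, B1, delta) + balanced_distance(A, B2, delta))
    return delta[A][B]  # both are single taxa

def leaves(subtree):
    """List the taxa of a nested-tuple subtree."""
    if isinstance(subtree, tuple):
        return leaves(subtree[0]) + leaves(subtree[1])
    return [subtree]

def unweighted_distance(A, B, delta):
    """Plain average over all taxon pairs: every taxon has the same weight."""
    pairs = [(a, b) for a in leaves(A) for b in leaves(B)]
    return sum(delta[a][b] for a, b in pairs) / len(pairs)

delta = symmetric({("a", "b"): 2.0, ("a", "c"): 4.0, ("a", "d"): 8.0})
B = ("b", ("c", "d"))                         # b is the sibling of the clade (c, d)
balanced = balanced_distance("a", B, delta)   # 2/2 + 4/4 + 8/4
unweighted = unweighted_distance("a", B, delta)  # (2 + 4 + 8) / 3
```

In the balanced version, taxon b alone carries as much weight as the whole clade (c, d); in the unweighted version, the clade carries twice the weight of b because it has twice as many taxa.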
PAUPLIN’S FORMULA
(SHORTCUT FOR BME!)
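$$\ell(T) = \sum_{i<j} 2^{\,1-p_{ij}}\,\delta_{ij}$$
D = ($\delta_{ij}$) is the distance matrix.
T is the tree topology considered.
$p_{ij}$ is the number of edges on the path between taxa $i$ and $j$ in T.
$\ell(T)$ is the sum of branch lengths assigned by BME.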
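Pauplin's formula weights each pairwise distance by 2^(1 - p), where p is the number of edges on the path between the two taxa in T, and sums over all pairs. A minimal sketch follows, assuming the tree is given as an adjacency dictionary and the distances as a symmetric dict-of-dicts. Both representations and all names are illustrative.

```python
from collections import deque
from itertools import combinations

def symmetric(pairs):
    """Expand {(i, j): d} into a symmetric dict-of-dicts delta[i][j]."""
    delta = {}
    for (i, j), d in pairs.items():
        delta.setdefault(i, {})[j] = d
        delta.setdefault(j, {})[i] = d
    return delta

def path_edge_counts(adj, source):
    """Breadth-first search: number of edges from `source` to every node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

def pauplin_length(adj, delta):
    """Tree length as the sum over taxon pairs of 2^(1 - p_ij) * delta_ij."""
    taxa = sorted(v for v in adj if len(adj[v]) == 1)  # leaves have degree 1
    total = 0.0
    for i, j in combinations(taxa, 2):
        p_ij = path_edge_counts(adj, i)[j]
        total += 2.0 ** (1 - p_ij) * delta[i][j]
    return total

# Quartet AB|CD with internal nodes u and v.
adj = {"A": ["u"], "B": ["u"], "C": ["v"], "D": ["v"],
       "u": ["A", "B", "v"], "v": ["C", "D", "u"]}
# Distances generated from branch lengths a=1, b=2, c=3, d=4, internal e=5.
delta = symmetric({("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
                   ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7})
length = pauplin_length(adj, delta)
```

For this example the distances fit the tree exactly, and the formula returns 15, the sum of the generating branch lengths, without estimating any individual branch.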
BME VERSION 2.0
(PAUPLIN’S FORMULA)
Instead of assigning branch lengths to tree topology T using
weighted least squares then summing edge lengths, cut to the
chase and use Pauplin’s formula!
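Given distance matrix D, find the binary tree T with the smallest total
branch length:
$$T^{*} = \arg\min_{T} \sum_{i<j} 2^{\,1-p_{ij}(T)}\,\delta_{ij}$$
where $\delta_{ij}$ is the distance estimate between taxa $i$ and $j$, and $p_{ij}(T)$ is the number of edges on the path between them in T.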
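For a handful of taxa the optimization can be done by brute force: enumerate every unrooted binary topology and score each one with Pauplin's formula, the sum over pairs of 2^(1 - p) times the distance, where p is the number of edges between the two taxa. The sketch below does exactly that. It is an illustration only, since the number of topologies explodes with n, and it is not how FastME searches. All names and the edge-list representation are assumptions.

```python
from collections import deque

def symmetric(pairs):
    """Expand {(i, j): d} into a symmetric dict-of-dicts delta[i][j]."""
    delta = {}
    for (i, j), d in pairs.items():
        delta.setdefault(i, {})[j] = d
        delta.setdefault(j, {})[i] = d
    return delta

def all_topologies(taxa):
    """Yield every unrooted binary topology on `taxa` (a list) as an edge list.

    Taxa are strings; internal nodes are integers 0, 1, 2, ...
    """
    def grow(edges, remaining, next_id):
        if not remaining:
            yield edges
            return
        leaf = remaining[0]
        for k, (u, v) in enumerate(edges):
            # Subdivide edge (u, v) with a new internal node and attach the leaf.
            split = [(u, next_id), (next_id, v), (next_id, leaf)]
            yield from grow(edges[:k] + edges[k + 1:] + split,
                            remaining[1:], next_id + 1)

    star = [(0, leaf) for leaf in taxa[:3]]  # the only topology on 3 taxa
    yield from grow(star, taxa[3:], 1)

def pauplin_length(edges, delta):
    """Sum over taxon pairs of 2^(1 - edges on path) * distance."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    taxa = [x for x in adj if len(adj[x]) == 1]
    total = 0.0
    for idx, i in enumerate(taxa):
        hops = {i: 0}  # breadth-first edge counts from taxon i
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        for j in taxa[idx + 1:]:
            total += 2.0 ** (1 - hops[j]) * delta[i][j]
    return total

def bme_exhaustive(taxa, delta):
    """Return (length, edges) of the topology with the smallest Pauplin length."""
    scored = ((pauplin_length(e, delta), e) for e in all_topologies(taxa))
    return min(scored, key=lambda t: t[0])

# Distances generated from the quartet AB|CD (a=1, b=2, c=3, d=4, internal e=5).
delta = symmetric({("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
                   ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7})
best_len, best_edges = bme_exhaustive(["A", "B", "C", "D"], delta)
```

With four taxa there are only three candidates. Here AB|CD scores 15 while the two alternatives score 17.5, so the generating topology is selected.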
EXERCISE
Which tree is the BME optimal?
Why?
FASTME ON THE WEB
http://www.atgc-montpellier.fr/fastme/
• Submit distance matrix in Phylip format.
• Initial tree: OLS_GME, balanced_GME, NJ or BIONJ.
• Improves the initial tree using moves: OLS_NNI or balanced_NNI.
• Enter email and wait for results!
• Self-contained executable available.
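A square distance matrix in Phylip format has the number of taxa on the first line, followed by one row per taxon containing its name and its distances to all taxa. The sketch below writes such a file body from a dict-of-dicts. The 10-character name field follows the strict Phylip convention and is an assumption here; check the FastME documentation for the exact name handling it expects. The function name and the toy values are illustrative.

```python
def to_phylip(names, D):
    """Format a symmetric dict-of-dicts D as a square Phylip distance matrix."""
    lines = [str(len(names))]
    for u in names:
        row = " ".join(f"{D[u][v]:.6f}" for v in names)
        # Name truncated and padded to 10 characters (strict Phylip convention).
        lines.append(f"{u[:10]:<10} {row}")
    return "\n".join(lines) + "\n"

# Toy distances (made up for illustration only).
D = {"A": {"A": 0.0, "B": 0.125, "C": 0.375},
     "B": {"A": 0.125, "B": 0.0, "C": 0.25},
     "C": {"A": 0.375, "B": 0.25, "C": 0.0}}
text = to_phylip(["A", "B", "C"], D)
print(text)
```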
COMPUTATIONAL EXAMPLE
Download the sequences at:
http://dl.dropbox.com/u/623333/BME%20Example/GeneSeq8taxa.nex
Calculate distance matrix (use HKY):
http://bioweb2.pasteur.fr/phylogeny/intro-en.html
Compute BME tree:
http://www.atgc-montpellier.fr/fastme/
REFERENCES
"Fast and accurate phylogeny reconstruction algorithms
based on the minimum-evolution principle." Desper R.,
Gascuel O., Journal of Computational Biology. 2002 9(5):687-705.
"Theoretical foundation of the balanced minimum evolution
method of phylogenetic inference and its relationship to
weighted least-squares tree fitting." Desper R., Gascuel O.,
Molecular Biology and Evolution. 2004 21(3):587-598.
"Getting a Tree Fast: Neighbor Joining, FastME, and
Distance-Based Methods." Desper R., Gascuel O., Current
Protocols in Bioinformatics. 2006 6.3.1-6.3.28. John Wiley & Sons.
"Neighbor-Joining Revealed." Gascuel O., Steel M., Molecular
Biology and Evolution. 2006 23(11):1997-2000.