20071220153015501

advertisement
Drawing Phylogenetic
Networks
Daniel H. Huson
joint work with Tobias Kloepper and Regula Rupp
1Future Directions in Phylogenetics, Cambridge, December 2007
Split Networks and Cluster Networks
Split network
Data: binary sequences (Kumar, 1998)
Cluster network
2
How to Draw Cluster Networks?
Data: 61 genes (Leebens-Mack et al, MBE, 2005)
3
Cladograms For Trees
4
Phylograms and Radial Diagrams
5
Drawing a Cladogram for a Tree
Assign x-coords
in a postorder
traversal of N:
if v is a leaf:
x(v) = 0
4
else:
x(v) = max x of
children + 1
1
0
0
3
0
2
1
0
0
0
7
Drawing a Cladogram for a Tree
1
Assign y-coords
in a postorder
traversal:
if v is a leaf:
y(v) = number
of leaves visited
else:
y(v) = mean
y of children
1.5
2
2.625
3
3.75
4.125
4.5
4
5
6
8
Naïve Algorithm for Drawing Networks
Network N:
p



q
p
Choose a guide tree T
Compute coordinates for T
Draw network using tree coordinates
9
Naïve Algorithm for Drawing Networks
Q
R
P
Problems:

x-coordinates: P and Q have different x-coordinates

y-coordinates: R isn‘t placed between P and Q

Unnecessary edge crossings
10
Better x-Coordinates
Assign x-coords
in a postorder
traversal of N:
if v is a leaf:
x(v) = 0
else:
x(v) = max x of
children + 1
2
1
0
0
1
4
0
0
3
2
0
11
Better y-Coordinates?

Need to introduce:
–
–
LSA guide tree
topological embedding 
12
Lowest Single Ancestor

The LSA of a node v is the last node ( v)
on all paths from  to v:
v

lsa(v)
13
LSA Guide Tree

Connect each reticulate node to its LSA
and remove all reticulate edges:
LSA tree T
cladogram
14
Topological Embedding


A topological embedding  is given by an
ordering of the children of each node v:
v e
q v
d
r
q
r

r
w
p b
p
w a
Any  gives rise to a planar drawing of T
15
Better y-Coordinates


Choose  so that reticulate nodes are
placed between their sources:
Order subtrees in preorder traversal of T
16
Resulting Cladogram

Use diagonal or curved lines for
reticulate edges
17
Additional Twist for y-Coordinates


LSA guide tree has true and false leaves:
Network N
LSA tree T
A leaf is false if it is only a leaf in T
18
Additional Twist for y-Coordinates

False leaves produce uneven spacing of
leaves in N:
19
Additional Twist for y-Coordinates

Assign integer coordinates to true
leaves, fractional to false ones:
1
2
3
3½
4
5
6
20
Circular Cladograms

Compute polar coordinates in similar way:
21
Phylograms and Radial Diagrams


y-coordinates: as for cladogram
x-coordinates: preorder traversal
22
Example 1




Multiple gene trees
Leebens-Mack et al,
MBE, 2005
61 chloroplast
genes for 26 plants
Filtered cluster
networks
–
–
–
–
50%
30%
20%
10%
23
Example 2: Splits vs Clusters
Split network of consensus splits
from 106 maximum-parsimony
trees (Rokas et al, 2003)
(Holland et al, 2004)
Weighted cluster network
24
Weights for Reticulate Edges?

Use LSA to determine weight of
reticulate edge:
1,2,3,4,5

1,3,4
v
lsa(v)
2,5

Use average weight on paths to lsa(v)
25
Example 3


Arndt von Haeseler
over 12,000 trees

76.5 %

11.4 %

11.4 %
Weighted cluster network
26
Outlook



All algorithms discussed have been
implemented and will be made available in
Dendroscope2
Dendroscope2 will be released in first
half of 2008
Dendroscope 1 (for trees only) is
available from:
www-ab.informatik.uni-tuebingen.de/software
27
Download