Reconstructing the Temporal Ordering of Biological Samples

Vladimir Duran
Karen Lai
Vien Phan
Lin Zhang Lin
Math 870 Project Topic:
“Reconstructing the temporal ordering of biological samples”
Background:
Given a sequence of sample data (e.g., cell developmental stages), we want to be able to
order the samples by some biological notion of closeness or by their temporal progression.
We are going to implement the algorithm proposed in the paper “Reconstructing the
temporal ordering of biological samples using microarray data” by Paul M. Magwene,
Paul Lizardi, and Junhyong Kim.
Algorithm:
Input: a sequence of points in $\mathbb{R}^d$, $V = (x_1, x_2, \ldots, x_n)$
Step 1:
Using the points as vertices and a user-defined weight function, construct a complete
graph G(V, E) in which the weight of each edge is calculated using the given weight
function.
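Step 1 can be sketched as follows. This is only a minimal illustration: the weight function `euclidean`, the helper `complete_graph`, and the sample points are assumptions made for the example, since the weight function is left to the user.

```python
import math
from itertools import combinations

def euclidean(p, q):
    """Assumed weight function for this sketch: Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def complete_graph(points, weight=euclidean):
    """Return every edge of the complete graph as (weight, i, j) with i < j."""
    return [(weight(points[i], points[j]), i, j)
            for i, j in combinations(range(len(points)), 2)]

# Made-up sample points in R^2, for illustration only.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
edges = complete_graph(points)
```

A complete graph on n points has n(n-1)/2 edges, so this list grows quadratically with the number of samples.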
Step 2:
Calculate an MST of G(V, E) using a known algorithm (e.g., Kruskal’s algorithm or Prim’s
algorithm).
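One possible way to carry out Step 2 is Kruskal’s algorithm with a union-find structure. The sketch below assumes edges are given as (weight, u, v) tuples on vertices 0..n-1; the function name `kruskal_mst` and the example edges are illustrative only.

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm on vertices 0..n-1.

    edges: iterable of (weight, u, v). Returns the MST edges in the
    order they were accepted.
    """
    parent = list(range(n))

    def find(x):
        # Find the component representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):          # lightest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge joins two components
            parent[ru] = rv
            mst.append((w, u, v))
            if len(mst) == n - 1:          # spanning tree is complete
                break
    return mst

# Made-up weighted graph on 4 vertices, for illustration only.
edges = [(1.0, 0, 1), (4.0, 0, 2), (2.0, 1, 2), (7.0, 1, 3), (3.0, 2, 3)]
mst = kruskal_mst(4, edges)
```

Prim’s algorithm would serve equally well here; on a complete graph both need to look at all n(n-1)/2 edges.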
Step 3:
If the MST is a path, then take the MST as an ordering of V;
otherwise, find a diameter path of the MST using the algorithm stated below.
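The path test in Step 3 can be sketched with a degree count: a tree is a path exactly when no vertex has more than two neighbours. The helper `is_path` is a hypothetical name, and the sketch assumes its input really is a tree on vertices 0..n-1.

```python
from collections import Counter

def is_path(n, tree_edges):
    """True if the tree on vertices 0..n-1 is a simple path.

    Assumes tree_edges (pairs (u, v)) form a tree; a tree is a path
    exactly when every vertex has degree at most 2.
    """
    degree = Counter()
    for u, v in tree_edges:
        degree[u] += 1
        degree[v] += 1
    return all(degree[v] <= 2 for v in range(n))

# Made-up trees, for illustration only.
chain = [(0, 1), (1, 2), (2, 3)]   # 0-1-2-3
star = [(0, 1), (1, 2), (1, 3)]    # vertex 1 has three neighbours
chain_is_path = is_path(4, chain)
star_is_path = is_path(4, star)
```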
Algorithm for finding a diameter path of a tree:
(Note: whenever we do a DFS, we also keep track of the depth of each node, with the
root of the search at depth 0.)
Diameter (T – a tree)
Pick any v ∈ T as the root and do a DFS from v.
Suppose w is a node in T with the maximum depth.
Then do a DFS from w.
Suppose u is a node in T with the maximum depth;
then the path from w to u is a diameter path for T.
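The two-pass search above can be sketched as follows. As in the note, depth here counts edges from the root of each search; measuring depth by summed edge weights instead would be a variation that this sketch does not cover. The adjacency-dictionary representation and the names `farthest` and `diameter_path` are assumptions made for the example.

```python
def farthest(adj, start):
    """Iterative DFS from start; depth counts edges, depth(start) = 0.

    Returns (a node of maximum depth, parent map of the search).
    """
    parent = {start: None}
    depth = {start: 0}
    stack = [start]
    best = start
    while stack:
        node = stack.pop()
        if depth[node] > depth[best]:
            best = node
        for nxt in adj[node]:
            if nxt not in parent:          # not visited yet
                parent[nxt] = node
                depth[nxt] = depth[node] + 1
                stack.append(nxt)
    return best, parent

def diameter_path(adj):
    """Return a diameter path of the tree adj as a list of nodes."""
    v = next(iter(adj))                    # any node serves as the first root
    w, _ = farthest(adj, v)                # first pass: deepest node from v
    u, parent = farthest(adj, w)           # second pass: deepest node from w
    path = []
    while u is not None:                   # walk back from u to w
        path.append(u)
        u = parent[u]
    return path[::-1]                      # listed from w to u

# Made-up tree: chain 0-1-2-3-4 with a side branch 2-5.
adj = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
path = diameter_path(adj)
```

In this example node 5 is the only node left off the diameter path, so it is the one that would count as a branch node in Step 4.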
Step 4:
Calculate some statistics from the diameter path we found.
Noise ratio = (# nodes on branches) / (# total nodes)
Sampling intensity = (Avg length of the diameter path) / (Total length of the diameter path)
If these numbers are small (sampling intensity < 0.03), we take the diameter path as the
estimated ordering;
otherwise we construct a PQ-tree.
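The Step 4 statistics might be computed as in the sketch below. Two readings are assumed here and are not confirmed by the text: "nodes on branches" is taken to mean MST nodes that are not on the diameter path, and "avg length of the diameter path" is taken to mean the mean edge length along that path. The function name `path_statistics` and the weight dictionary keyed by `frozenset` pairs are likewise illustrative.

```python
def path_statistics(n_total, path, weight):
    """Return (noise ratio, sampling intensity) for a diameter path.

    n_total: number of nodes in the MST.
    path:    diameter path as a list of nodes.
    weight:  dict mapping frozenset({u, v}) to the edge weight.
    ASSUMPTION: branch nodes are the nodes not on the path.
    ASSUMPTION: "avg length" is the mean edge length on the path.
    """
    noise_ratio = (n_total - len(path)) / n_total
    lengths = [weight[frozenset((a, b))] for a, b in zip(path, path[1:])]
    total_length = sum(lengths)
    mean_length = total_length / len(lengths)
    return noise_ratio, mean_length / total_length

# Made-up MST: chain 0-1-2-3 plus a branch node 4 attached to node 2.
weight = {
    frozenset((0, 1)): 1.0,
    frozenset((1, 2)): 2.0,
    frozenset((2, 3)): 1.0,
    frozenset((2, 4)): 0.5,
}
noise, intensity = path_statistics(5, [0, 1, 2, 3], weight)
```

Under the second assumption the ratio simplifies to 1 / (number of edges on the path), so the definition should be checked against the paper before relying on the 0.03 threshold.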
Step 5:
Construct a PQ-tree based on the diameter path we found, following the algorithm stated
in the paper. The leaves of the resulting PQ-tree form an ordering of the given set.
Step 6:
We will test the algorithm on generated data sets and on some of the biological data sets
mentioned in the paper.
Step 7:
We will try to improve the algorithm using the ideas suggested on the class website:
“using ideas from proximity-graphs and the notion of principal-curves”.
Step 8:
Test our new algorithm.