Vladimir Duran Karen Lai Vien Phan Lin Zhang Lin
Math 870 Project

Topic: “Reconstructing the temporal ordering of biological samples”

Background: Given a sequence of sample data (e.g. cell developmental stages), we want to order the samples by some biological notion of closeness or by their temporal progression. We are going to implement the algorithm suggested in the paper “Reconstructing the temporal ordering of biological samples using microarray data” by Paul M. Magwene, Paul Lizardi and Junhyong Kim.

Algorithm:

Input: a sequence of points in ℝ^d, V = (x_1, x_2, ..., x_n).

Step 1: Using the points as vertices and some user-defined weight function, construct a complete graph G(V, E) in which the weight of each edge is calculated using the given weight function.

Step 2: Calculate an MST of G(V, E) using a known algorithm (e.g. Kruskal’s algorithm or Prim’s algorithm).

Step 3: If the MST is a path, take the MST as an ordering of V; otherwise find a diameter path of the MST using the algorithm stated below.

Algorithm for finding a diameter path of a given tree:
(Note: when we do DFS, we also keep track of the depth of each node, beginning with depth 0 for the root.)

Diameter(T), where T is a tree:
Pick any v ∈ T as the root and do a DFS from v.
Suppose w is a node in T with the maximum depth. Then do a DFS from w.
Suppose u is a node in T with the maximum depth in this second DFS. The path from w to u is a diameter path for T.

Step 4: Calculate some statistics from the diameter path we found.

Noise ratio = (# nodes on branches) / (# total nodes)

Sampling intensity = (Avg. length of the diameter path) / (Total length of the diameter path)

If these numbers are small (sampling intensity < 0.03), we take the diameter path as the estimated ordering; otherwise we construct a PQ-tree.

Step 5: Construct a PQ-tree, based on the diameter path we found, following the algorithm stated in the paper. The leaves of the resulting PQ-tree form an ordering of the given set.
Step 6: We would test the algorithm using generated data sets and some biological data sets mentioned in the paper.

Step 7: We would try to improve the algorithm using ideas suggested in the class website by “using ideas from proximity-graphs and the notion of principal-curves”.

Step 8: Test our new algorithm.
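
Sketch of Steps 1 to 4: the following Python code is a minimal illustration of the MST and diameter-path steps, not the implementation from the paper. It makes several assumptions: the weight function is taken to be Euclidean distance, Prim’s algorithm is used for the MST, depth in the DFS is counted in number of edges (as in the note to Step 3), and all function names (prim_mst, farthest_from, diameter_path, noise_ratio) are hypothetical names chosen for this example. The PQ-tree of Step 5 is not covered.

```python
import math
from collections import defaultdict


def euclidean(p, q):
    # Assumed weight function; the proposal leaves this user-defined.
    return math.dist(p, q)


def prim_mst(points, weight=euclidean):
    """Return MST edges (i, j, w) of the complete graph on `points` (Prim, O(n^2))."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge weight connecting each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                w = weight(points[u], points[v])
                if w < best[v]:
                    best[v] = w
                    parent[v] = u
    return edges


def farthest_from(adj, root):
    """Iterative DFS from `root`; return (a deepest node, parent map)."""
    depth = {root: 0}
    parent = {root: None}
    deepest = root
    stack = [root]
    while stack:
        u = stack.pop()
        if depth[u] > depth[deepest]:
            deepest = u
        for v, _w in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1  # depth counted in edges
                parent[v] = u
                stack.append(v)
    return deepest, parent


def diameter_path(edges):
    """Two DFS passes: from vertex 0 to find w, then from w to find u."""
    adj = defaultdict(list)
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    w_node, _ = farthest_from(adj, 0)
    u_node, parent = farthest_from(adj, w_node)
    path = []
    node = u_node
    while node is not None:   # walk back from u to w along parent links
        path.append(node)
        node = parent[node]
    path.reverse()            # now ordered from w to u
    return path


def noise_ratio(path, n_total):
    # Nodes not on the diameter path are the nodes on branches.
    return (n_total - len(path)) / n_total


# Five collinear points plus one off-path point attached near the middle.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (2, 0.5)]
edges = prim_mst(points)
path = diameter_path(edges)
ratio = noise_ratio(path, len(points))
print(path, ratio)
```

In this toy example the five collinear points form the diameter path and the sixth point is the only branch node, so the noise ratio is 1/6. To measure the diameter by total edge weight rather than by number of edges, the DFS would accumulate w instead of adding 1 to the depth.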