Dynamic Programming

• Dynamic programming is a general algorithm design technique.
• It breaks a problem into a series of overlapping subproblems (subproblems whose results can be reused several times).
• Main idea:
  - set up a recurrence relating a solution of a larger instance to solutions of some smaller instances
  - solve each smaller instance once
  - record the solutions in a table
  - extract the solution to the initial instance from that table

Discussed topics
• Assembly-line scheduling
• Partition of Data Sequence
• Route Segmentation and Classification for GPS data
• Longest Common Subsequence (LCS)

Assembly-line scheduling
Recursive equation:
$$f_1[j] = \min\bigl(f_1[j-1] + a_{1,j},\; f_2[j-1] + t_{2,j-1} + a_{1,j}\bigr)$$
where
  - $f_1[j]$ and $f_2[j]$ are the accumulated time costs to reach stations $S_{1,j}$ and $S_{2,j}$
  - $a_{1,j}$ is the time cost of station $S_{1,j}$
  - $t_{2,j-1}$ is the time cost of changing from station $S_{2,j-1}$ to $S_{1,j}$

Partition of Data Sequence
Partition the sequence X into K non-overlapping groups with a given cost function $f(x_i, x_j)$ so that the total cost is minimal:
$$f(x_1, x_{i_2-1}) + f(x_{i_2}, x_{i_3-1}) + \dots + f(x_{i_K}, x_N) \to \min$$

Partition of Data Sequence (cont.)
$G(k, n)$ is the cost of the optimal partition of n points into k non-overlapping groups:
$$G(k, n) = \min_{\{i_j\}} \sum_{j=1}^{k} f(i_j,\, i_{j+1} - 1)$$
with $i_1 = 1$ and $i_{k+1} = n + 1$.
Recursive equation:
$$G(k, n) = \min_{k \le j \le n} \bigl\{ G(k-1,\, j-1) + f(j, n) \bigr\}, \qquad 0 < n \le N$$

Examples: Image Quantization
[Figure: input image and quantized image]
Here, the cost function is the square error:
$$f(i, j) = \sum_{i \le k \le j} n(k)\,(k - q)^2$$
where the quantized value q is
$$q = \frac{\sum_{i \le k \le j} n(k)\,k}{\sum_{i \le k \le j} n(k)}$$

Examples: Polygonal Approximation
5004 points are approximated by 78 points.
Given the number of approximation points M, the error is minimized.
Given an error tolerance ε, the number of approximation points is minimized.
Here, the cost function is:
$$f(x_i, x_j) = \max_{i \le k \le j} \{d_k\} \quad \text{or} \quad f(x_i, x_j) = \sum_{i \le k \le j} d_k^2$$

Examples: Route segmentation
[Figure: estimated segment results (speed vs. time) for three routes: Ski, Running and Jogging, Non-moving]
Divide the routes into several segments by speed consistency.
Here, the cost function is:
$$f(i, j) = \operatorname{Var}(v_k)_{i \le k \le j} \cdot (t_j - t_i)$$
where $\operatorname{Var}(v_k)$ is the speed variance and $(t_j - t_i)$ is the time duration of the segment.

Examples: Route classification
[Figure: probability vs. speed (km/h) for the moving types Stop, Walk, Run, Bike, Car]
Determining the moving type by speed alone causes misclassification.
[Figure: moving type dependency of a 1st order HMM]

Examples: Route classification (cont.)
1st order HMM, maximize:
$$f = \prod_{i=1}^{M} P(m_i \mid X_i, m_{i-1})$$
where
  - $m_i$: moving type of segment i
  - $X_i$: feature vector (e.g. speed)
Solved by dynamic programming, similarly to the assembly-line scheduling problem.

Longest Common Subsequence (LCS)
Find a maximum-length common subsequence of two sequences. For instance,
Sequence 1: president
Sequence 2: providence
Their LCS is priden.

How to compute LCS?
Let $A = a_1 a_2 \dots a_m$ and $B = b_1 b_2 \dots b_n$.
len(i, j): the length of an LCS of $a_1 a_2 \dots a_i$ and $b_1 b_2 \dots b_j$.
With proper initialization, len(i, j) can be computed as follows:
$$\mathrm{len}(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ \mathrm{len}(i-1, j-1) + 1 & \text{if } i, j > 0 \text{ and } a_i = b_j, \\ \max(\mathrm{len}(i, j-1),\, \mathrm{len}(i-1, j)) & \text{if } i, j > 0 \text{ and } a_i \ne b_j. \end{cases}$$

Table of len(i, j) for A = president (index i, columns) and B = providence (index j, rows):

          i   0   1   2   3   4   5   6   7   8   9
   j              p   r   e   s   i   d   e   n   t
   0          0   0   0   0   0   0   0   0   0   0
   1   p      0   1   1   1   1   1   1   1   1   1
   2   r      0   1   2   2   2   2   2   2   2   2
   3   o      0   1   2   2   2   2   2   2   2   2
   4   v      0   1   2   2   2   2   2   2   2   2
   5   i      0   1   2   2   2   3   3   3   3   3
   6   d      0   1   2   2   2   3   4   4   4   4
   7   e      0   1   2   3   3   3   4   5   5   5
   8   n      0   1   2   3   3   3   4   5   6   6
   9   c      0   1   2   3   3   3   4   5   6   6
  10   e      0   1   2   3   3   3   4   5   6   6

Running time and memory: O(mn) and O(mn).
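The recurrence for len(i, j) can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function names `lcs_table` and `lcs` are hypothetical, and `table[i][j]` plays the role of len(i, j). The traceback step that recovers the subsequence itself is included for completeness.

```python
def lcs_table(a, b):
    """Fill the (m+1) x (n+1) table; table[i][j] corresponds to len(i, j)."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay 0: the case i = 0 or j = 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # a_i = b_j: extend the LCS of the two shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # a_i != b_j: drop the last symbol of one of the prefixes.
                table[i][j] = max(table[i][j - 1], table[i - 1][j])
    return table


def lcs(a, b):
    """Recover one LCS by tracing back from table[m][n]."""
    table = lcs_table(a, b)
    i, j = len(a), len(b)
    out = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


table = lcs_table("president", "providence")
print(table[9][10])                     # 6
print(lcs("president", "providence"))   # priden
```

Both the time and the memory used by `lcs_table` grow as O(mn), matching the figures stated above.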
Tracing back through the same len(i, j) table, starting from len(9, 10), yields the LCS itself.
Output: priden

Time Series Matching using Longest Common Subsequence
Minimum Bounding Envelope (MBE) for LCSS
  - delta (δ) = time matching region (left & right)
  - epsilon (ε) = spatial matching region (up & down)
[Figure: point correspondence]
Similarity [δ = 10, ε = 0.3] = 0.875
Recursive equation:
$$\mathrm{len}(i, j) = \begin{cases} 0 & i = 0 \text{ or } j = 0, \\ \mathrm{len}(i-1, j-1) + 1 & i > 0,\ j > 0,\ |t_i - t_j| \le \delta,\ d(p_i, p_j) \le \varepsilon, \\ \max(\mathrm{len}(i-1, j),\, \mathrm{len}(i, j-1)) & \text{else.} \end{cases}$$

Spatial Data Matching using Longest Common Subsequence
epsilon (ε) = spatial matching distance
Recursive equation:
$$\mathrm{len}(i, j) = \begin{cases} 0 & i = 0 \text{ or } j = 0, \\ \mathrm{len}(i-1, j-1) + 1 & i > 0,\ j > 0,\ d(p_i, p_j) \le \varepsilon, \\ \max(\mathrm{len}(i-1, j),\, \mathrm{len}(i, j-1)) & \text{else.} \end{cases}$$

Conclusion
Main idea:
  - Divide into sub-problems
  - Design the cost function and the recursive function
  - Optimization (fill in the table)
  - Backtracking
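As a sketch of these four steps on one of the problems above, the following Python code solves the sequence partition problem with the recurrence G(k, n) = min over k ≤ j ≤ n of G(k-1, j-1) + f(j, n). Several things here are assumptions made for illustration only: the function name `partition`, the `start` table used for backtracking, and the choice of cost function, which is the plain (unweighted) square error of a group around its mean.

```python
def partition(x, K):
    """Split x into K contiguous non-empty groups with minimal total cost."""
    N = len(x)
    if not 1 <= K <= N:
        raise ValueError("need 1 <= K <= len(x)")

    # Prefix sums let the cost of any group be evaluated in O(1).
    s1 = [0.0] * (N + 1)
    s2 = [0.0] * (N + 1)
    for i, v in enumerate(x, start=1):
        s1[i] = s1[i - 1] + v
        s2[i] = s2[i - 1] + v * v

    def f(j, n):
        """Assumed cost: square error of points j..n (1-based) around their mean."""
        count = n - j + 1
        total = s1[n] - s1[j - 1]
        return (s2[n] - s2[j - 1]) - total * total / count

    INF = float("inf")
    # G[k][n]: optimal cost of splitting the first n points into k groups.
    G = [[INF] * (N + 1) for _ in range(K + 1)]
    # start[k][n]: index j at which the last group begins in that optimum.
    start = [[0] * (N + 1) for _ in range(K + 1)]
    G[0][0] = 0.0

    # Optimization: fill in the table.
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            for j in range(k, n + 1):
                cost = G[k - 1][j - 1] + f(j, n)
                if cost < G[k][n]:
                    G[k][n] = cost
                    start[k][n] = j

    # Backtracking: read the group boundaries from the last group to the first.
    groups = []
    n = N
    for k in range(K, 0, -1):
        j = start[k][n]
        groups.append(x[j - 1:n])
        n = j - 1
    groups.reverse()
    return G[K][N], groups


cost, groups = partition([1, 1, 1, 5, 5, 5, 9, 9], 3)
print(cost, groups)
```

With this triple loop the table is filled in O(K N²) time. A different cost function, such as the maximum deviation used for polygonal approximation, could be swapped in for `f` without changing the rest of the sketch.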