Dynamic Programming

• Optimization Problems
• Dynamic Programming Paradigm
• Example: Matrix Multiplication
• Principle of Optimality
• Example: String Matching Problems

Optimization Problems
• In an optimization problem, there are typically many feasible solutions for any input instance I
• For each solution S, we have a cost or value function f(S)
• Typically, we wish to find a feasible solution S such that f(S) is either minimized or maximized
• Thus, when designing an algorithm to solve an optimization problem, we must prove that the algorithm produces a best possible solution

Example Problem
You have six hours to complete as many tasks as possible, all of which are equally important.
  Task A - 2 hours      Task D - 3.5 hours
  Task B - 4 hours      Task E - 2 hours
  Task C - 1/2 hour     Task F - 1 hour
How many can you get done?
• Is this a minimization or a maximization problem?
• Give one example of a feasible but not optimal solution along with its associated value.
• Give an optimal solution and its associated value.

Dynamic Programming
• The key idea behind dynamic programming is that it is a divide-and-conquer technique at heart
• That is, we solve larger problems by patching together solutions to smaller problems
• However, dynamic programming is typically faster because we compute these solutions in a bottom-up fashion, so each subproblem is solved only once

Fibonacci numbers
• F(n) = F(n-1) + F(n-2)
  – F(0) = 0
  – F(1) = 1
• Top-down recursive computation is very inefficient
  – Many F(i) values are computed multiple times
• Bottom-up computation is much more efficient
  – Compute F(2), then F(3), then F(4), etc., using stored values for smaller F(i) values to compute the next value
  – Each F(i) value is computed just once

Recursive Computation
F(n) = F(n-1) + F(n-2); F(0) = 0, F(1) = 1
Recursive Solution: F(6) = 8
[Figure: recursion tree for F(6); the subtrees for F(4), F(3), and F(2) each appear more than once.]

Bottom-up computation
We can calculate F(n) in linear time by storing small values.
  F[0] = 0
  F[1] = 1
  for i = 2 to n
      F[i] = F[i-1] + F[i-2]
  return F[n]
Moral: We can sometimes trade space for time.

Key implementation steps
• Identify subsolutions that may be useful in computing the whole solution
  – Often need to introduce parameters
• Develop a recurrence relation (recursive solution)
  – Set up the table of values/costs to be computed
    • The dimensionality is typically determined by the number of parameters
    • The number of values should be polynomial
• Determine the order of computation of values
• Backtrack through the table to obtain the complete solution (not just the solution value)

Example: Matrix Multiplication
• Input
  – List of n matrices to be multiplied together using traditional matrix multiplication
  – The dimensions of the matrices are sufficient
• Task
  – Compute the optimal ordering of multiplications to minimize the total number of scalar multiplications performed
• Observations:
  – Multiplying an X × Y matrix by a Y × Z matrix takes X × Y × Z multiplications
  – Matrix multiplication is associative but not commutative

Example Input
• Input:
  – M1, M2, M3, M4
    • M1: 13 x 5
    • M2: 5 x 89
    • M3: 89 x 3
    • M4: 3 x 34
• Feasible solutions and their values
  – ((M1 M2) M3) M4: 10,582 scalar multiplications
  – (M1 M2) (M3 M4): 54,201 scalar multiplications
  – (M1 (M2 M3)) M4: 2,856 scalar multiplications
  – M1 ((M2 M3) M4): 4,055 scalar multiplications
  – M1 (M2 (M3 M4)): 26,418 scalar multiplications

Key implementation steps
• Identify subsolutions that may be useful in computing the whole solution
  – Often need to introduce parameters
  – Define dimensions to be (d0, d1, …, dn) where matrix Mi has dimensions d(i-1) x d(i)
  – Let M(i,j) be the matrix formed by multiplying matrices Mi through Mj
  – Define C(i,j) to be the minimum cost for computing M(i,j)

Key implementation steps
• Develop a recurrence relation (recursive solution)
  – Recurrence relation for C(i,j)
    • C(i,j) = min over k = i to j-1 of ( C(i,k) + C(k+1,j) + d(i-1) × d(k) × d(j) )
      – The last multiplication is between matrices M(i,k) and M(k+1,j)
    • C(i,i) = 0
  – Set up the table of values/costs to be computed
    • The dimensionality is typically determined by the number of parameters
    • The number of values should be polynomial

  C      1      2      3      4
  1      0
  2             0
  3                    0
  4                           0

Key implementation steps
• Determine the order of computation of values
  (each entry below is j - i; a value depends only on entries closer to the main diagonal)

  C      1      2      3      4
  1      0      1      2      3
  2             0      1      2
  3                    0      1
  4                           0

Computing actual ordering

  C      1      2      3      4        P      1      2      3      4
  1      0   5785   1530   2856        1      0      1      1      3
  2             0   1335   1845        2             0      2      3
  3                    0   9078        3                    0      3
  4                           0        4                           0

P(i,j) records the intermediate multiplication k used to compute M(i,j). That is, P(i,j) = k if the last multiplication was M(i,k) M(k+1,j).

Pseudocode
  int MatrixOrder()
      forall i, j: C[i, j] = 0;
      for j = 2 to n
          for i = j-1 downto 1
              C[i, j] = min over i <= k <= j-1 of (C[i, k] + C[k+1, j] + d[i-1] * d[k] * d[j]);
              P[i, j] = the value of k that achieves this minimum;
      return C[1, n];

Backtracking
  Procedure ShowOrder(i, j)
      if (i = j) write("M", i);
      else
          k = P[i, j];
          write("(");
          ShowOrder(i, k);
          write(" × ");
          ShowOrder(k+1, j);
          write(")");

Principle of Optimality
• In book, this is termed “Optimal substructure”
• An optimal solution contains within it optimal solutions to subproblems.
• More detailed explanation
  – Suppose solution S is optimal for problem P.
  – Suppose we decompose P into P1 through Pk and that S can be decomposed into pieces S1 through Sk corresponding to the subproblems.
  – Then solution Si is an optimal solution for subproblem Pi

Example 1
• Matrix Multiplication
  – In our solution for computing matrix M(1,n), we have a final step of multiplying matrices M(1,k) and M(k+1,n).
  – Our subproblems then would be to compute M(1,k) and M(k+1,n)
  – Our solution uses optimal solutions for computing M(1,k) and M(k+1,n) as part of the overall solution.

Example 2
• Shortest Path Problem
  – Suppose a shortest path from s to t visits u
  – We can decompose the path into s-u and u-t.
  – The s-u path must be a shortest path from s to u, and the u-t path must be a shortest path from u to t
• Conclusion: dynamic programming can be used for computing shortest paths

Example 3
• Longest Path Problem
  – Suppose a longest path from s to t visits u
  – We can decompose the path into s-u and u-t.
  – Is it true that the s-u path must be a longest path from s to u?
• Conclusion?

Example 4: The Traveling Salesman Problem
What recurrence relation will return the optimal solution to the Traveling Salesman Problem?
If T(i) is the optimal tour on the first i points, will this help us in solving larger instances of the problem?
Can we set T(i+1) to be T(i) with the additional point inserted in the position that will result in the shortest path?
No!
[Figure: tours T(4) and T(5) compared with the shortest tour.]

Summary of bad examples
• There almost always is a way to have the optimal substructure if you expand your subproblems enough
• For longest path and TSP, the number of subproblems grows to exponential size
• This is not useful as we do not want to compute an exponential number of solutions

When is dynamic programming effective?
• Dynamic programming works best on objects that are linearly ordered and cannot be rearranged
  – characters in a string
  – files in a filing cabinet
  – points around the boundary of a polygon
  – the left-to-right order of leaves in a search tree.
• Whenever your objects are ordered in a left-to-right way, dynamic programming must be considered.

Efficient Top-Down Implementation
• We can implement any dynamic programming solution top-down by storing computed values in the table
  – If all values need to be computed anyway, bottom-up is more efficient
  – If some do not need to be computed, top-down may be faster

Inexact Matching of Strings
• General Problem
  – Input
    • Strings S and T
  – Questions
    • How distant is S from T?
    • How similar is S to T?
• Solution Technique
  – Dynamic programming with cost/similarity/scoring matrix

Measuring Distance of S and T
• Consider S and T
• We can transform S into T using the following four operations
  – insertion of a character into S
  – deletion of a character from S
  – substitution (replacement) of a character in S by another character (typically in T)
  – matching (no operation)

Example
• S = vintner
• T = writers
• vintner
• wintner (Replace v with w)
• wrintner (Insert r)
• writner (Delete first n)
• writer (Delete second n)
• writers (Insert s)

Example
• Edit Transcript (or just transcript):
  – a string that describes the transformation of one string into the other
• Example
    R I M D M D M M I
    v   i n t n e r
    w r i   t   e r s

Edit Distance
• Edit distance of strings S and T
  – The minimum number of edit operations (insertion, deletion, replacement) needed to transform string S into string T
  – Also called Levenshtein distance; Levenshtein appears to have been the first to define this concept
• Optimal transcript
  – An edit transcript of S and T that has the minimum number of edit operations
  – There may be several; these are called cooptimal transcripts

Alignment
• A global alignment of strings S and T is obtained
  – by inserting spaces (dashes) into S and T
    • they should have the same number of characters (including dashes) at the end
  – then placing the two strings over each other, matching one character (or dash) in S with a unique character (or dash) in T
  – Note ALL positions in both S and T are involved

Alignments and Edit transcripts
• Example Alignment
    v-intner-
    wri-t-ers
• Alignments and edit transcripts are interrelated
  – edit transcript: emphasizes process
    • the specific mutational events
  – alignment: emphasizes product
    • the relationship between the two strings
  – Alignments are often easier to work with and visualize
    • also generalize better to more than 2 strings

Edit Distance Problem
• Input
  – 2 strings S and T
• Task
  – Output edit distance of S and T
  – Output optimal edit transcript
  – Output optimal alignment
• Solution method
  – Dynamic Programming

Identifying Subproblems
• Let D(i,j) be the edit distance of S[1..i] and T[1..j]
  – The edit distance of the first i characters of S with the first j characters of T
  – Let |S| = n, |T| = m
• D(n,m) = edit distance of S and T
• We will compute D(i,j) for all i and j such that 0 <= i <= n, 0 <= j <= m

Recurrence Relation
• Base Case:
  – For 0 <= i <= n, D(i,0) = i
  – For 0 <= j <= m, D(0,j) = j
• Recursive Case:
  – 0 < i <= n, 0 < j <= m
  – D(i,j) = min
    • D(i-1,j) + 1 (what does this mean?)
    • D(i,j-1) + 1 (what does this mean?)
    • D(i-1,j-1) + d(i,j) (what does this mean?)
  – d(i,j) = 0 if S(i) = T(j) and is 1 otherwise

What the various cases mean
• D(i,j) = min
  – D(i-1,j) + 1:
    • Align S[1..i-1] with T[1..j] optimally
    • Match S(i) with a dash in T
  – D(i,j-1) + 1
    • Align S[1..i] with T[1..j-1] optimally
    • Match a dash in S with T(j)
  – D(i-1,j-1) + d(i,j)
    • Align S[1..i-1] with T[1..j-1] optimally
    • Match S(i) with T(j)

Computing D(i,j) values
(In the worked tables below, S = vintner runs across the top, indexed by i, and T = writers runs down the side, indexed by j, so "row i=1" appears as the column under v.)

  D(i,j)       v  i  n  t  n  e  r
            0  1  2  3  4  5  6  7
       0
    w  1
    r  2
    i  3
    t  4
    e  5
    r  6
    s  7

Initialization: Base Case

  D(i,j)       v  i  n  t  n  e  r
            0  1  2  3  4  5  6  7
       0    0  1  2  3  4  5  6  7
    w  1    1
    r  2    2
    i  3    3
    t  4    4
    e  5    5
    r  6    6
    s  7    7

Row i=1

  D(i,j)       v  i  n  t  n  e  r
            0  1  2  3  4  5  6  7
       0    0  1  2  3  4  5  6  7
    w  1    1  1
    r  2    2  2
    i  3    3  3
    t  4    4  4
    e  5    5  5
    r  6    6  6
    s  7    7  7

Entry i=2, j=2

  D(i,j)       v  i  n  t  n  e  r
            0  1  2  3  4  5  6  7
       0    0  1  2  3  4  5  6  7
    w  1    1  1  2
    r  2    2  2  ?
    i  3    3  3
    t  4    4  4
    e  5    5  5
    r  6    6  6
    s  7    7  7

Entry i=2, j=3

  D(i,j)       v  i  n  t  n  e  r
            0  1  2  3  4  5  6  7
       0    0  1  2  3  4  5  6  7
    w  1    1  1  2
    r  2    2  2  2
    i  3    3  3  ?
    t  4    4  4
    e  5    5  5
    r  6    6  6
    s  7    7  7

Calculation methodologies
• Location of edit distance
  – D(n,m)
• Example was to calculate row by row
• Can also calculate column by column
• Can also use antidiagonals
• Key is to build from upper left corner

Traceback
• Using table to construct optimal transcript
• Pointers in cell D(i,j)
  – Set a pointer from cell (i,j) to
    • cell (i, j-1) if D(i,j) = D(i, j-1) + 1
    • cell (i-1,j) if D(i,j) = D(i-1,j) + 1
    • cell (i-1,j-1) if D(i,j) = D(i-1,j-1) + d(i,j)
  – Follow path of pointers from (n,m) back to (0,0)

What the pointers mean
(Directions assume a table in which i indexes rows and j indexes columns.)
• horizontal pointer: cell (i,j) to cell (i, j-1)
  – Align T(j) with a space in S
  – Insert T(j) into S
• vertical pointer: cell (i,j) to cell (i-1, j)
  – Align S(i) with a space in T
  – Delete S(i) from S
• diagonal pointer: cell (i,j) to cell (i-1, j-1)
  – Align S(i) with T(j)
  – Replace S(i) with T(j), or match if S(i) = T(j)

Table and transcripts
• The pointers represent all optimal transcripts
• Theorem:
  – Any path from (n,m) to (0,0) following the pointers specifies an optimal transcript.
  – Conversely, any optimal transcript is specified by such a path.
  – The correspondence between paths and transcripts is one to one.
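
The table fill and traceback described above can be sketched in Python. This is a minimal sketch assuming unit costs (insertion = deletion = replacement = 1, match = 0); the function names `edit_distance_table` and `traceback` are illustrative and do not come from the slides. Instead of storing pointers, the traceback re-checks which neighbouring cell explains D(i,j), and it prefers the diagonal when several pointers exist, so it reports only one of the possibly many cooptimal transcripts.

```python
def edit_distance_table(s, t):
    """Fill D, where D[i][j] is the edit distance of s[:i] and t[:j]."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i          # base case: delete the first i characters of s
    for j in range(m + 1):
        D[0][j] = j          # base case: insert the first j characters of t
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0 if s[i - 1] == t[j - 1] else 1
            D[i][j] = min(D[i - 1][j] + 1,      # delete s[i-1]
                          D[i][j - 1] + 1,      # insert t[j-1]
                          D[i - 1][j - 1] + d)  # match or replace
    return D


def traceback(s, t, D):
    """Walk from (n, m) back to (0, 0) and return one optimal transcript."""
    i, j = len(s), len(t)
    ops = []
    while i > 0 or j > 0:
        d = 0 if (i > 0 and j > 0 and s[i - 1] == t[j - 1]) else 1
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + d:
            ops.append("M" if d == 0 else "R")  # diagonal pointer
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i - 1][j] + 1:
            ops.append("D")                     # pointer to (i-1, j)
            i -= 1
        else:
            ops.append("I")                     # pointer to (i, j-1)
            j -= 1
    return "".join(reversed(ops))


D = edit_distance_table("vintner", "writers")
print(D[7][7])                                  # edit distance: 5
print(traceback("vintner", "writers", D))
```

Filling the table takes O(nm) time and the traceback takes O(n+m), matching the running times stated for this method.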
Running Time
• Initialization of table
  – O(n+m)
• Calculating table and pointers
  – O(nm)
• Traceback for one optimal transcript or optimal alignment
  – O(n+m)

Operation-Weight Edit Distance
• Consider S and T
• We can assign weights to the various operations
  – insertion/deletion of a character: cost d
  – substitution (replacement) of a character: cost r
  – matching: cost e
  – Previous case: d = r = 1, e = 0

Modified Recurrence Relation
• Base Case:
  – For 0 <= i <= n, D(i,0) = i × d
  – For 0 <= j <= m, D(0,j) = j × d
• Recursive Case:
  – 0 < i <= n, 0 < j <= m
  – D(i,j) = min
    • D(i-1,j) + d
    • D(i,j-1) + d
    • D(i-1,j-1) + d(i,j)
  – d(i,j) = e if S(i) = T(j) and is r otherwise (the two-argument d(i,j) is the match/substitution term, not the insertion/deletion cost d)

Alphabet-Weight Edit Distance
• Define weight of each possible substitution
  – r(a,b) where a is being replaced by b for all a,b in the alphabet
  – For example, with DNA, maybe r(A,T) > r(A,G)
  – Likewise, the insertion/deletion cost I(a) may vary by character
• Operation-weight edit distance is a special case of this variation
• Weighted edit distance refers to this alphabet-weight setting

Modified Recurrence Relation
• Base Case:
  – For 0 <= i <= n, D(i,0) = Σ_{1 <= k <= i} I(S(k))
  – For 0 <= j <= m, D(0,j) = Σ_{1 <= k <= j} I(T(k))
• Recursive Case:
  – 0 < i <= n, 0 < j <= m
  – D(i,j) = min
    • D(i-1,j) + I(S(i))
    • D(i,j-1) + I(T(j))
    • D(i-1,j-1) + d(i,j)
  – d(i,j) = r(S(i), T(j))

Measuring Similarity of S and T
• Definitions
  – Let Σ be the alphabet for strings S and T
  – Let Σ’ be the alphabet Σ with the character - added
  – For any two characters x,y in Σ’, s(x,y) denotes the value (or score) obtained by aligning x with y
  – For a given alignment A of S and T, let S’ and T’ denote the strings after the chosen insertion of spaces, and l their new length
  – The value of alignment A is Σ_{1 <= i <= l} s(S’(i), T’(i))

Example
(Only the upper triangle of s is shown; the computation below treats s as symmetric.)

  s      a      b      -
  a      1     -2      0
  b             2     -1
  -                    0

• S’ = a b a a - b a b
• T’ = a a a a a b - b
• Value: 1 - 2 + 1 + 1 + 0 + 2 + 0 + 2 = 5

String Similarity Problem
• Input
  – 2 strings S and T
  – Scoring matrix s for alphabet Σ’
• Task
  – Output optimal alignment value of S and T
    • The alignment of S and T with maximal, not minimal, value
  – Output this alignment

Modified Recurrence Relation
• Base Case:
  – For 0 <= i <= n, V(i,0) = Σ_{1 <= k <= i} s(S(k),-)
  – For 0 <= j <= m, V(0,j) = Σ_{1 <= k <= j} s(-,T(k))
• Recursive Case:
  – 0 < i <= n, 0 < j <= m
  – V(i,j) = max
    • V(i-1,j) + s(S(i),-)
    • V(i,j-1) + s(-,T(j))
    • V(i-1,j-1) + s(S(i), T(j))

Longest Common Subsequence Problem
• Given 2 strings S and T, a common subsequence is a subsequence that appears in both S and T.
• The longest common subsequence problem is to find a longest common subsequence (lcs) of S and T
  – subsequence: characters need not be contiguous
  – different from substring
• Can you use dynamic programming to solve the longest common subsequence problem?

Computing alignments using linear space
• Hirschberg [1977]
• Suppose we only need the maximum similarity/distance value of S and T without an alignment or transcript
• How can we conserve space?
  – Only save row i-1 when computing row i in the table

Illustration
[Figure: table with rows 0..n and columns 0..m; only row i-1 is kept while row i is computed.]

Linear space and an alignment
• Assume S has length 2n (Sr and Tr denote the reversals of S and T)
• Divide and conquer approach
  – Compute value of optimal alignment of S[1..n] with all prefixes of T
    • Store row n only at end along with pointer values of row n
  – Compute value of optimal alignment of Sr[1..n] with all prefixes of Tr
    • Store only values in row n
• Find k such that
  – V(S[1..n],T[1..k]) + V(Sr[1..n],Tr[1..m-k])
  – is maximized over 0 <= k <= m

Illustration
[Figure sequence: for n = 6 and m = 18, each split point k = 0, 1, 2, ..., 9, ..., 18 pairs V(S[1..6], T[1..k]) with V(Sr[1..6], Tr[1..m-k]).]

Recursive Step
• Let k* be the k that maximizes
  – V(S[1..n],T[1..k]) + V(Sr[1..n],Tr[1..m-k])
• Record all steps on row n including the one from n-1 and the one to n+1
• Recurse on the two subproblems
  – S[1..n-1] with T[1..j] where j <= k*
  – Sr[1..n] with Tr[1..q] where q <= m-k*

Illustration
[Figure: the recursive step shown on the same table.]

Time Required
• cmn time to get this answer so far
• Two subproblems have at most half the total size of this problem
  – At most the same cmn time to get the rest of the solution
• cmn/2 + cmn/4 + cmn/8 + cmn/16 + … <= cmn
• Final result
  – Linear space with only twice as much time
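
The divide-and-conquer idea above can be sketched in Python. This is a simplified variant in the style of Hirschberg's method, not the slides' exact bookkeeping: it assumes one gap score `gap` for every space, a caller-supplied `score(x, y)` function, and inputs that do not contain the character '-'. It splits S at its midpoint and recurses on the two halves instead of recording the pointers of row n. The names `last_row`, `hirschberg`, and `alignment_value` are illustrative.

```python
def last_row(s, t, score, gap):
    """Values V(s, t[:j]) for all j, keeping only one previous row."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [prev[0] + gap]
        for j in range(1, len(t) + 1):
            cur.append(max(prev[j] + gap,        # s[i-1] against a space
                           cur[j - 1] + gap,     # t[j-1] against a space
                           prev[j - 1] + score(s[i - 1], t[j - 1])))
        prev = cur
    return prev


def hirschberg(s, t, score, gap):
    """Return one optimal global alignment (s', t') using linear space."""
    if not s:
        return "-" * len(t), t
    if not t:
        return s, "-" * len(s)
    if len(s) == 1:
        # Pair s[0] with its best partner in t, unless two spaces score better.
        k = max(range(len(t)), key=lambda j: score(s[0], t[j]))
        if score(s[0], t[k]) >= 2 * gap:
            return "-" * k + s + "-" * (len(t) - k - 1), t
        return s + "-" * len(t), "-" + t
    mid = len(s) // 2
    left = last_row(s[:mid], t, score, gap)               # prefixes of t
    right = last_row(s[mid:][::-1], t[::-1], score, gap)  # reversed suffixes
    m = len(t)
    k = max(range(m + 1), key=lambda j: left[j] + right[m - j])
    a1, b1 = hirschberg(s[:mid], t[:k], score, gap)
    a2, b2 = hirschberg(s[mid:], t[k:], score, gap)
    return a1 + a2, b1 + b2


def alignment_value(a, b, score, gap):
    """Sum the column scores of an alignment given as two padded strings."""
    return sum(gap if "-" in (x, y) else score(x, y) for x, y in zip(a, b))


# Hypothetical scoring chosen for illustration: +1 match, -1 mismatch, -1 space.
simple = lambda x, y: 1 if x == y else -1
a, b = hirschberg("vintner", "writers", simple, -1)
print(a)
print(b)
```

Each call holds only two rows of length m+1 at a time, and the work on the subproblems roughly halves at each level, which is the geometric sum behind the "twice as much time" bound.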
