CS 473ug: Algorithms Chandra Chekuri chekuri@cs.uiuc.edu 3228 Siebel Center University of Illinois, Urbana-Champaign Fall 2007 Chekuri CS473ug Longest Increasing Subsequence Part I Longest Increasing Subsequence Chekuri CS473ug Longest Increasing Subsequence Longest Increasing Subsequence Sequence: an ordered list a1 , a2 , . . . , an . Length of sequence is n ai1 , ai2 , . . . , aik is a subsequence of a1 , a2 , . . . , an if 1 ≤ i1 < i2 < . . . < ik ≤ n. A sequence is increasing if a1 < a2 < . . . < an . It is non-decreasing if a1 ≤ a2 ≤ . . . ≤ an . Similarly decreasing and non-increasing. Chekuri CS473ug Longest Increasing Subsequence Longest Increasing Subsequence Sequence: an ordered list a1 , a2 , . . . , an . Length of sequence is n ai1 , ai2 , . . . , aik is a subsequence of a1 , a2 , . . . , an if 1 ≤ i1 < i2 < . . . < ik ≤ n. A sequence is increasing if a1 < a2 < . . . < an . It is non-decreasing if a1 ≤ a2 ≤ . . . ≤ an . Similarly decreasing and non-increasing. Longest Increasing Subsequence Input A sequence of numbers a1 , a2 , . . . , an Goal Find an increasing subsequence ai1 , ai2 , . . . , aik of maximum length Chekuri CS473ug Longest Increasing Subsequence Example Chekuri CS473ug Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Chekuri CS473ug Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? Chekuri CS473ug Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? What is the element in the subsequence before ai ? If it is aj then it better be the case that aj < ai since we are looking for an increasing sequence. Chekuri CS473ug Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? What is the element in the subsequence before ai ? If it is aj then it better be the case that aj < ai since we are looking for an increasing sequence. Do we know which j? No! Chekuri CS473ug Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? What is the element in the subsequence before ai ? If it is aj then it better be the case that aj < ai since we are looking for an increasing sequence. Do we know which j? No! So we try all possibilities L(i) = 1 + Chekuri max j<i and aj <ai CS473ug L(j) Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? What is the element in the subsequence before ai ? If it is aj then it better be the case that aj < ai since we are looking for an increasing sequence. Do we know which j? No! So we try all possibilities L(i) = 1 + max j<i and aj <ai Is the above correct? Chekuri CS473ug L(j) Longest Increasing Subsequence A First Recursive Approach L(i): length of longest common subsequence in first i elements a1 , a2 , . . . , ai . Can we write L(i) in terms of L(1), L(2), . . . , L(i − 1)? Case 1 : L(i) does not contain ai , then L(i) = L(i − 1) Case 2 : L(i) contains ai , then L(i) =? What is the element in the subsequence before ai ? If it is aj then it better be the case that aj < ai since we are looking for an increasing sequence. Do we know which j? No! So we try all possibilities L(i) = 1 + max j<i and aj <ai L(j) Is the above correct? No, because we do not know that L(j) is a subsequence that actually ends at aj ! Chekuri CS473ug Longest Increasing Subsequence A Correct Recursion L(i): longest common subsequence in first i elements a1 , a2 , . . . , ai that ends in ai Chekuri CS473ug Longest Increasing Subsequence A Correct Recursion L(i): longest common subsequence in first i elements a1 , a2 , . . . , ai that ends in ai L(i) = 1 + max j<i and aj <ai What is the final solution? Chekuri CS473ug L(j) Longest Increasing Subsequence A Correct Recursion L(i): longest common subsequence in first i elements a1 , a2 , . . . , ai that ends in ai L(i) = 1 + max j<i and aj <ai What is the final solution? maxni=1 L(i) How many subproblems? Chekuri CS473ug L(j) Longest Increasing Subsequence A Correct Recursion L(i): longest common subsequence in first i elements a1 , a2 , . . . , ai that ends in ai L(i) = 1 + max j<i and aj <ai What is the final solution? maxni=1 L(i) How many subproblems? O(n) Chekuri CS473ug L(j) Longest Increasing Subsequence Table based Iterative Algorithm Recurrence: L(i) = 1 + max j<i and aj <ai Iterative algorithm: for i = 1 to n do L[i] = 1 for j = 1 to i − 1 do if aj < ai and 1 + L[j] > L[i] L[i] = 1 + L[j] Output maxni=1 L[i] Running Time: Chekuri CS473ug L(j) Longest Increasing Subsequence Table based Iterative Algorithm Recurrence: L(i) = 1 + max j<i and aj <ai Iterative algorithm: for i = 1 to n do L[i] = 1 for j = 1 to i − 1 do if aj < ai and 1 + L[j] > L[i] L[i] = 1 + L[j] Output maxni=1 L[i] Running Time: O(n2 ) Space: Chekuri CS473ug L(j) Longest Increasing Subsequence Table based Iterative Algorithm Recurrence: L(i) = 1 + max j<i and aj <ai Iterative algorithm: for i = 1 to n do L[i] = 1 for j = 1 to i − 1 do if aj < ai and 1 + L[j] > L[i] L[i] = 1 + L[j] Output maxni=1 L[i] Running Time: O(n2 ) Space: O(n) Chekuri CS473ug L(j) Longest Increasing Subsequence A DAG Based Approach Given sequence a1 , a2 , . . . , an create DAG as follows: for each i there is a node vi if i < j and ai < aj add an edge (vi , vj ) find longest path Chekuri CS473ug Longest Increasing Subsequence Dynamic Programming Methods There is always a DAG of computation underlying every dynamic program but it is not always clear. Chekuri CS473ug Longest Increasing Subsequence Dynamic Programming Methods There is always a DAG of computation underlying every dynamic program but it is not always clear. Three ways to come up with dynamic programming solutions recursion followed by memoization with a polynomial number of subproblems bottom up via tables identify a DAG followed by a shortest/longest path in DAG But there is always a recursion+memoization that is implicit in the other methods! Chekuri CS473ug Longest Increasing Subsequence Dilworth’s Theorem Given sequence a1 , a2 , . . . , an the longest increasing subsequence can be of size 1. Chekuri CS473ug Longest Increasing Subsequence Dilworth’s Theorem Given sequence a1 , a2 , . . . , an the longest increasing subsequence can be of size 1. Example: a1 > a2 > a3 > . . . > an Chekuri CS473ug Longest Increasing Subsequence Dilworth’s Theorem Given sequence a1 , a2 , . . . , an the longest increasing subsequence can be of size 1. Example: a1 > a2 > a3 > . . . > an In the above example the longest decreasing sequence is of size n. Chekuri CS473ug Longest Increasing Subsequence Dilworth’s Theorem Given sequence a1 , a2 , . . . , an the longest increasing subsequence can be of size 1. Example: a1 > a2 > a3 > . . . > an In the above example the longest decreasing sequence is of size n. Question: Given a sequence of length n, can the longest increasing sequence and the longest decreasing sequence be both small? Chekuri CS473ug Longest Increasing Subsequence Dilworth’s Theorem Given sequence a1 , a2 , . . . , an the longest increasing subsequence can be of size 1. Example: a1 > a2 > a3 > . . . > an In the above example the longest decreasing sequence is of size n. Question: Given a sequence of length n, can the longest increasing sequence and the longest decreasing sequence be both small? Dilworth’s Theorem Consequence: In any sequence of length n, either the longest increasing sequence or the longest decresing √ sequence is of length b nc + 1. Exercise: give a sequence of length n in which both the longest increasing subsequence and longest decreasing susbequence are no √ more than b nc + 1. Chekuri CS473ug Edit Distance Part II Edit Distance Chekuri CS473ug Edit Distance Spell Checking Problem Given a string “exponen” that is not in the dictionary, how should a spell checker suggest a nearby string? Chekuri CS473ug Edit Distance Spell Checking Problem Given a string “exponen” that is not in the dictionary, how should a spell checker suggest a nearby string? What does nearness mean? Question: Given two strings x1 x2 . . . xn and y1 y2 . . . ym what is a distance between them? Chekuri CS473ug Edit Distance Spell Checking Problem Given a string “exponen” that is not in the dictionary, how should a spell checker suggest a nearby string? What does nearness mean? Question: Given two strings x1 x2 . . . xn and y1 y2 . . . ym what is a distance between them? Edit Distance: minimum number of “edits” to transform x into y . Chekuri CS473ug Edit Distance Edit Distance Edit Distance: minimum number of “edits” to transform x into y . Edit operations: delete a letter add a letter substitute a letter with another letter Chekuri CS473ug Edit Distance Edit Distance Edit Distance: minimum number of “edits” to transform x into y . Edit operations: delete a letter add a letter substitute a letter with another letter Why is substitute not “delete plus add”? In general different edits can have different costs and using substitution as a edit allows a single operation as far as distance is concerned Chekuri CS473ug Edit Distance Example Chekuri CS473ug Edit Distance Edit Distance as Alignment Chekuri CS473ug Edit Distance Edit Distance Problem Input Two strings x = x1 x2 . . . xn and y = y1 y2 . . . ym over some fixed alphabet Σ Goal Find edit distance between x and y : minimum number of edits to transform x into y Chekuri CS473ug Edit Distance Edit Distance Problem Input Two strings x = x1 x2 . . . xn and y = y1 y2 . . . ym over some fixed alphabet Σ Goal Find edit distance between x and y : minimum number of edits to transform x into y Note: EditDist(x,y) = EditDist(y,x) Chekuri CS473ug Edit Distance Towards a Recursive Solution Think of aligning the strings. Can we express the full problem as a function of subproblems? Chekuri CS473ug Edit Distance Towards a Recursive Solution Think of aligning the strings. Can we express the full problem as a function of subproblems? Case 1 xn is aligned with ym . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . ym−1 Chekuri CS473ug Edit Distance Towards a Recursive Solution Think of aligning the strings. Can we express the full problem as a function of subproblems? Case 1 xn is aligned with ym . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . ym−1 Case 2a xn is aligned with and ym is to left of . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . yn . Chekuri CS473ug Edit Distance Towards a Recursive Solution Think of aligning the strings. Can we express the full problem as a function of subproblems? Case 1 xn is aligned with ym . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . ym−1 Case 2a xn is aligned with and ym is to left of . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . yn . Case 2b ym is aligned with and xn is to left of . Then x1 x2 . . . xn is aligned with y1 y2 . . . ym−1 . Chekuri CS473ug Edit Distance Towards a Recursive Solution Think of aligning the strings. Can we express the full problem as a function of subproblems? Case 1 xn is aligned with ym . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . ym−1 Case 2a xn is aligned with and ym is to left of . Then x1 x2 . . . xn−1 is aligned with y1 y2 . . . yn . Case 2b ym is aligned with and xn is to left of . Then x1 x2 . . . xn is aligned with y1 y2 . . . ym−1 . Subproblems involve aligning prefixes of the two strings. Find edit distance between prefix x[1..i] of x and prefix y [1..j] of y EditDist(x,y) is the distance between x[1..n] and y [1..m] Chekuri CS473ug Edit Distance A Recursive Solution E [i, j]: edit distance between x[1..i] and y [1..j] Chekuri CS473ug Edit Distance A Recursive Solution E [i, j]: edit distance between x[1..i] and y [1..j] Case 1 xi aligned with xj . E [i, j] = diff (xi , yj ) + E [i − 1, j − 1] where diff (xi , yj ) = 0 if xi = yj , otherwise diff (xi , xj ) = 1 Chekuri CS473ug Edit Distance A Recursive Solution E [i, j]: edit distance between x[1..i] and y [1..j] Case 1 xi aligned with xj . E [i, j] = diff (xi , yj ) + E [i − 1, j − 1] where diff (xi , yj ) = 0 if xi = yj , otherwise diff (xi , xj ) = 1 Case 2a xi is mapped to and xi is to right of xj E [i, j] = 1 + E [i − 1, j] Chekuri CS473ug Edit Distance A Recursive Solution E [i, j]: edit distance between x[1..i] and y [1..j] Case 1 xi aligned with xj . E [i, j] = diff (xi , yj ) + E [i − 1, j − 1] where diff (xi , yj ) = 0 if xi = yj , otherwise diff (xi , xj ) = 1 Case 2a xi is mapped to and xi is to right of xj E [i, j] = 1 + E [i − 1, j] Case 2b yj is mapped to and xi is to left of yj E [i, j] = 1 + E [i, j − 1] Chekuri CS473ug Edit Distance A Recursive Solution E [i, j] = min{diff (xi , yj )+E [i −1, j −1], 1+E [i −1, j], 1+E [i, j −1] Base cases: Chekuri CS473ug Edit Distance A Recursive Solution E [i, j] = min{diff (xi , yj )+E [i −1, j −1], 1+E [i −1, j], 1+E [i, j −1] Base cases: E [i, 0] = i for i ≥ 0 E [0, j] = j for i ≥ 0 How many subproblems? Chekuri CS473ug Edit Distance A Recursive Solution E [i, j] = min{diff (xi , yj )+E [i −1, j −1], 1+E [i −1, j], 1+E [i, j −1] Base cases: E [i, 0] = i for i ≥ 0 E [0, j] = j for i ≥ 0 How many subproblems? O(mn) Chekuri CS473ug Edit Distance Iterative Solution What is table? Chekuri CS473ug Edit Distance Iterative Solution What is table? E is a two-dimensional array of size (n + 1)(m + 1) How do we order the computation? Chekuri CS473ug Edit Distance Iterative Solution What is table? E is a two-dimensional array of size (n + 1)(m + 1) How do we order the computation? To compute E [i, j] need to have computed E [i − 1, j − 1], E [i − 1, j], E [i, j − 1]. Chekuri CS473ug Edit Distance Iterative Solution Chekuri CS473ug Edit Distance Iterative Solution for i = 0 to n E [i, 0] = i for j = 0 to m E [0, j] = j for i = 1 to n do for j = 1 to m do E [i, j] = min{diff (xi , yj ) + E [i, j], 1 + E [i − 1, j], 1 + E [i, j − 1]} Running Time: Chekuri CS473ug Edit Distance Iterative Solution for i = 0 to n E [i, 0] = i for j = 0 to m E [0, j] = j for i = 1 to n do for j = 1 to m do E [i, j] = min{diff (xi , yj ) + E [i, j], 1 + E [i − 1, j], 1 + E [i, j − 1]} Running Time: O(nm) Space: Chekuri CS473ug Edit Distance Iterative Solution for i = 0 to n E [i, 0] = i for j = 0 to m E [0, j] = j for i = 1 to n do for j = 1 to m do E [i, j] = min{diff (xi , yj ) + E [i, j], 1 + E [i − 1, j], 1 + E [i, j − 1]} Running Time: O(nm) Space: O(nm) Chekuri CS473ug Edit Distance Iterative Solution for i = 0 to n E [i, 0] = i for j = 0 to m E [0, j] = j for i = 1 to n do for j = 1 to m do E [i, j] = min{diff (xi , yj ) + E [i, j], 1 + E [i − 1, j], 1 + E [i, j − 1]} Running Time: O(nm) Space: O(nm) Can reduce space to O(n + m) if only distance is needed but not obvious how to actually compute the edits. Can do it with a clever scheme (see Jeff’s notes). Chekuri CS473ug Edit Distance Example Chekuri CS473ug Edit Distance Where is the DAG? one node for each (i, j), 1 = 0 ≤ i ≤ n, 0 ≤ j ≤ m. Edges for node (i, j): from (i − 1, j − 1) of cost diff (xi , yj ), from (i − 1, j) of cost 1, from (i, j − 1) of cost 1 find shortest path from (0, 0) to (n, m) Chekuri CS473ug