Programming for Engineers in Python
Lecture 12: Dynamic Programming
Autumn 2011-12

Lecture 11: Highlights
• GUI (based on slides from the course Software 1, CS, TAU)
• GUI in Python (based on Chapter 19 of the book "Think Python"):
  • Swampy
  • Widgets
  • Callbacks
  • Event-driven programming
  • Displaying an image in a GUI
• Sorting:
  • Merge sort
  • Bucket sort

Plan
• Fibonacci (overlapping subproblems)
• Evaluating the performance of stock market traders (optimal substructure)
• Dynamic programming basics
• Maximizing the profits of a shipping company (the Knapsack problem)
• A little on the exams and the course grade (if time allows)

Remember the Fibonacci Series?
• Fibonacci series: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
• Definition:
  • fib(0) = 0
  • fib(1) = 1
  • fib(n) = fib(n-1) + fib(n-2)
• en.wikipedia.org/wiki/Fibonacci_number
• http://www.dorbanot.com/3757 ("Fibonacci salad")

Recursive Fibonacci Series
Every call with n > 1 invokes two further function calls, and so on… (a sketch of the code appears at the end of this section)

Redundant Calls
The call tree for fib(5) shows the same subproblems being solved again and again:

    fib(5)
    ├─ fib(4)
    │   ├─ fib(3)
    │   │   ├─ fib(2)
    │   │   │   ├─ fib(1)
    │   │   │   └─ fib(0)
    │   │   └─ fib(1)
    │   └─ fib(2)
    │       ├─ fib(1)
    │       └─ fib(0)
    └─ fib(3)
        ├─ fib(2)
        │   ├─ fib(1)
        │   └─ fib(0)
        └─ fib(1)

fib(3) is computed twice, fib(2) three times, and fib(1) five times.

Redundant Calls (cont.)
(Chart comparing the number of calls made by the iterative and recursive Fibonacci implementations; not reproduced.)

Number of Calls to Fibonacci

    n     value    number of calls
    1         1                  1
    2         1                  3
    3         2                  5
    …
    23    28657              92735
    24    46368             150049

Demonstration: Iterative Versus Recursive Fibonacci
(The demonstration code and its shell output are not reproduced; sketches of both versions follow at the end of this section.)

Memoization
• Enhancing the efficiency of the recursive Fibonacci
• The problem: solving the same subproblems many times
• The idea: avoid repeating the calculation of results for previously processed inputs, by solving each subproblem once and reusing its solution the next time it is encountered
• How? Store subproblem solutions in a list
• This technique is called memoization: http://en.wikipedia.org/wiki/Memoization

Fibonacci with Memoization / Timeit
(The slide's code and the shell output of the timeit comparison are not reproduced; a sketch follows below.)
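The lecture's code screenshots did not survive this transcript. As a stand-in, here is a minimal sketch of the recursive version, directly following the definition above (the function name is my choice, not necessarily the slide's):

    def fib(n):
        # fib(0) = 0, fib(1) = 1; every call with n > 1 spawns two more calls
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)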
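For the iterative-versus-recursive demonstration, a plausible iterative version (again a sketch, with a name of my choosing):

    def fib_iter(n):
        # compute each Fibonacci number exactly once: O(n) time, O(1) space
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a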
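A possible memoized version following the slide's "store subproblem solutions in a list" idea, plus a timeit harness in the spirit of the missing shell output. It assumes the fib and fib_iter sketches above live in the same script; the argument 25 and the repetition count are arbitrary choices of mine:

    def fib_memo(n, results=None):
        # results[k] holds fib(k) once computed; -1 marks "not yet solved"
        if results is None:
            results = [-1] * (n + 1)
        if n < 2:
            return n
        if results[n] == -1:
            results[n] = fib_memo(n - 1, results) + fib_memo(n - 2, results)
        return results[n]

    if __name__ == '__main__':
        import timeit
        for call in ('fib(25)', 'fib_iter(25)', 'fib_memo(25)'):
            t = timeit.timeit(call, setup='from __main__ import fib, fib_iter, fib_memo',
                              number=100)
            print(call, round(t, 4), 'seconds')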
Fibonacci: Memoization vs. Iterative
• Same time complexity: O(n)
• The iterative version is about 5 times faster than the memoized one, because of the overhead of the recursive calls
• So why do we need memoization? We shall discuss that later

Overlapping Subproblems
• A problem has overlapping subproblems if it can be broken down into subproblems which are reused multiple times
• If divide-and-conquer is applicable, then each subproblem solved is brand new
• A naive recursive function for such a problem is called an exponential number of times, because it solves the same subproblems repeatedly

Evaluating Traders' Performance
• How do we evaluate a trader's performance on a given stock (e.g., Teva)?
• The trader earns $X on that stock; is that good? Mediocre? Bad?
• Define a measure of success:
  • Maximal possible profit: $M
  • Trader's performance: X/M (%)
• Defining M:
  • The maximal profit attainable in the given time range
  • How can it be calculated?

Evaluating Traders' Performance (cont.)
• Consider the changes the stock undergoes in the given time range
• M is the profit over a contiguous sub-range of time in which the profit is maximal
• Examples (all numbers are percentages):
  • [1, 2, -5, 4, 7, -2] → [4, 7], so M = 11%
    • If X = 6%, the trader's performance is ~54%
  • [1, 5, -3, 4, -2, 1] → [1, 5, -3, 4], so M = 7%
    • If X = 5%, the trader's performance is ~71%
• Let's make it a little more formal…

Maximum Subarray Sum
• http://en.wikipedia.org/wiki/Maximum_subarray_problem
• Input: an array of numbers
• Output: which (contiguous) subarray has the largest sum?
• Naïve solution ("brute force"):
  • How many subarrays does an array of size n have? n + (n-1) + (n-2) + … + 1 = O(n²)
  • The plan: check each one and report the maximal
  • Time complexity: O(n²)
• We will return both the sum and the corresponding subarray

Naïve Solution ("Brute Force") / Naïve Solution (shorter code)
(The slide's code, in both the long and the shorter variant, is not reproduced; a sketch follows at the end of this section.)

Efficient Solution
• The solution for a[i:i] is 0 for all i (Python slice notation: the empty subarray)
• Assume we know the subarray of a[0:i] with the largest sum
• Can we use this information to find the subarray of a[0:i+1] with the largest sum?
• A problem is said to have optimal substructure if the globally optimal solution can be constructed from locally optimal solutions to subproblems

Optimal Substructure
Setting (positions j ≤ k < i in the array): s is the sum of a[j:k+1], the optimal subarray of a[0:i], and t is the sum of a[j:i], with t >= 0 (why?).
• What is the optimal solution's structure for a[0:i+1]?
  • Can it start before j? No!
  • Can it start in the range j+1 … k? No!
  • Can it start in the range k+1 … i-1? No! Otherwise t would have turned negative at an earlier stage

Optimal Substructure (cont.)
In the same setting, when extending from a[0:i] to a[0:i+1]:
• Set the new t = t + a[i]
• If t > s, then s = t and the solution becomes (j, i+1)
• Otherwise the solution does not change
• If t < 0, then j is updated to i+1 and t is reset to 0 (for the next iteration)
• Otherwise (0 <= t <= s), change nothing

Example
(The slide's worked trace of the algorithm is not reproduced.)

The Code
(The slide's code is not reproduced; a sketch follows at the end of this section.)

Efficiency – O(n)
• Every step of the loop does constant-time work, so the traversal is O(n)
• The "globally optimal" solution corresponds to a subarray with a globally maximal sum, but at each step we only make a decision relative to what we have already seen
• At each step we know the best solution so far, but we might change our decision later based on our previous information and the current information
• In this sense the problem has optimal substructure
• Because we can make decisions locally, we only need to traverse the list once

O(n) Versus O(n²)
(Shell output of the timing comparison is not reproduced; a sketch of such a comparison follows below.)
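A sketch of the brute-force version matching the plan above; it returns both the maximal sum and the slice bounds, as promised. The names are mine, not the slide's:

    def msum_naive(a):
        # try every start i and every end j; the running total avoids re-summing
        best, best_bounds = 0, (0, 0)      # the empty subarray has sum 0
        for i in range(len(a)):
            total = 0
            for j in range(i, len(a)):
                total += a[j]              # total == sum(a[i:j+1])
                if total > best:
                    best, best_bounds = total, (i, j + 1)
        return best, best_bounds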
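A sketch of the O(n) algorithm just described, written to mirror the slide's s, t, and j notation:

    def msum(a):
        # s: best sum found so far, achieved by the slice best_bounds
        # t: running sum of the candidate subarray starting at j
        s, best_bounds = 0, (0, 0)
        t, j = 0, 0
        for i in range(len(a)):
            t += a[i]
            if t > s:                      # new global best ending at i
                s, best_bounds = t, (j, i + 1)
            if t < 0:                      # a negative prefix can never help:
                t, j = 0, i + 1            # restart just after position i
        return s, best_bounds

On the first traders' example, msum([1, 2, -5, 4, 7, -2]) returns (11, (3, 5)), i.e. the slice [4, 7] with M = 11%.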
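A small timing harness in the spirit of the missing shell output, assuming the two sketches above; the array size and repetition count are arbitrary choices of mine:

    import random
    import timeit

    a = [random.randint(-50, 50) for _ in range(3000)]
    for f in (msum_naive, msum):
        t = timeit.timeit(lambda: f(a), number=3)
        print(f.__name__, round(t, 3), 'seconds')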
Dynamic Programming (DP)
• Dynamic programming is an algorithm design technique for optimization problems
• Like divide and conquer, DP solves problems by combining solutions to subproblems
• Unlike divide and conquer, the subproblems are not independent:
  • Subproblems may share subsubproblems (overlapping subproblems)
  • Yet the solution to one subproblem may not affect the solutions to other subproblems of the same problem (optimal substructure)

Dynamic Programming (cont.)
• DP reduces computation by:
  • Solving subproblems in a bottom-up fashion
  • Storing the solution to a subproblem the first time it is solved
  • Looking up the solution when the subproblem is encountered again
• Key: determine the structure of optimal solutions

Dynamic Programming Characteristics
• Overlapping subproblems:
  • The problem can be broken down into subproblems which are reused multiple times
  • Examples: factorial does not exhibit overlapping subproblems; Fibonacci does
• Optimal substructure:
  • A globally optimal solution can be constructed from locally optimal solutions to subproblems
  • Examples: Fibonacci, msum, Knapsack (coming next)

Optimizing Shipping Cargo (Knapsack)
• A shipping company is trying to sell the residual capacity of 1000 metric tons in a cargo ship to different shippers by auction
• The company received 100 different offers from potential shippers, each characterized by a tonnage and an offered reward
• The company wishes to select a subset of the offers that fits into its residual capacity so as to maximize the total reward

Formalizing
• Shipping capacity: W = 1000
• Number of offers from potential shippers: n = 100
• Each offer i has a weight w_i and an offered reward v_i
• Maximize the reward, given the W-ton limit
• A(n, W): the maximum value that can be attained from considering the first n items weighing at most W tons

First Try: Greedy
• Sort the offers i by the ratio v_i/w_i
• Select offers until the ship is full
• Counterexample: W = 10, {(v_i, w_i)} = {(7,7), (4,5), (4,5)}; greedy picks the (7,7) offer for a reward of 7, after which nothing else fits, while the two (4,5) offers fit together for a reward of 8

Solution
• A(i, j): the maximum value that can be attained from considering the first i items with a weight limit of j:
  • A(0, j) = A(i, 0) = 0 for all i <= n and j <= W
  • If w_i > j, then A(i, j) = A(i-1, j)
  • If w_i <= j, we have two options:
    • Do not include item i, so the value is A(i-1, j)
    • Include it, so the value is v_i + A(i-1, j-w_i)
  • Which choice should we make? Whichever is larger: the maximum of the two
• Formally:

    A(i, j) = 0                                        if i = 0 or j = 0
    A(i, j) = A(i-1, j)                                if w_i > j
    A(i, j) = max(A(i-1, j), v_i + A(i-1, j - w_i))    otherwise

Optimal Substructure and Overlapping Subproblems
• Overlapping subproblems: at any stage (i, j) we might need to calculate A(k, l) for several k < i and l < j
• Optimal substructure: at any point we only need information about the choices we have already made

Solution (Recursive)
(The slide's code is not reproduced; a sketch follows at the end of this section.)

Solution (Memoization): The Idea
(Diagram: an (n+1) × (W+1) table M indexed by item count N and weight limit W; the answer is the entry M(N, W).)

Solution (Memoization): Code
(Not reproduced; a sketch follows at the end of this section.)

Solution (Iterative): The Idea (In Class)
• "Bottom-up": start by solving small problems and gradually grow
(Diagram: the same table M with axes N and W, now filled in its entirety up to the entry M(N, W).)

DP vs. Memoization
• Same big-O computational complexity
• If all subproblems must be solved at least once, DP is better by a constant factor, since there is no recursive-call overhead
• If some subproblems need not be solved at all, the memoized algorithm may be more efficient, since it solves only those subproblems that are definitely required
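The slides' code for the knapsack solutions is not reproduced; the following are sketches under the formulation above, with the offers stored 0-indexed (so offer i of the formulation corresponds to w[i-1] and v[i-1]; all function names are mine). First, the plain recursive version:

    def knapsack_rec(i, j, w, v):
        # A(i, j): best reward from the first i offers under weight limit j
        if i == 0 or j == 0:
            return 0
        if w[i - 1] > j:                   # offer i cannot fit
            return knapsack_rec(i - 1, j, w, v)
        return max(knapsack_rec(i - 1, j, w, v),                        # skip offer i
                   v[i - 1] + knapsack_rec(i - 1, j - w[i - 1], w, v))  # take offer i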
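A memoized version of the same recursion, caching A(i, j) in an (n+1) × (W+1) table as in the "Memoization: The Idea" slide:

    def knapsack_memo(n, W, w, v):
        # M[i][j] caches A(i, j); None marks entries not yet solved
        M = [[None] * (W + 1) for _ in range(n + 1)]
        def A(i, j):
            if i == 0 or j == 0:
                return 0
            if M[i][j] is None:            # solve each subproblem only once
                if w[i - 1] > j:
                    M[i][j] = A(i - 1, j)
                else:
                    M[i][j] = max(A(i - 1, j),
                                  v[i - 1] + A(i - 1, j - w[i - 1]))
            return M[i][j]
        return A(n, W)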
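And a bottom-up version that fills the whole table, from small subproblems up to A(n, W):

    def knapsack_dp(n, W, w, v):
        # row 0 and column 0 stay 0: no items, or no capacity
        A = [[0] * (W + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, W + 1):
                if w[i - 1] > j:
                    A[i][j] = A[i - 1][j]
                else:
                    A[i][j] = max(A[i - 1][j],
                                  v[i - 1] + A[i - 1][j - w[i - 1]])
        return A[n][W]

On the greedy counterexample above, knapsack_dp(3, 10, [7, 5, 5], [7, 4, 4]) returns 8 (taking the two (4, 5) offers), beating the greedy reward of 7.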
Steps in Dynamic Programming
1. Characterize the structure of an optimal solution
2. Define the value of an optimal solution recursively
3. Compute the optimal solution's values, either top-down (memoization) or bottom-up (in a table)
4. Construct an optimal solution from the computed values

Why "Knapsack"?
• The problem's Hebrew nickname: "בעיית הגנב" ("the thief's problem")
• http://en.wikipedia.org/wiki/Knapsack_problem

Extensions
• NP-completeness: http://en.wikipedia.org/wiki/NP-complete
• Pseudo-polynomial time: http://en.wikipedia.org/wiki/Pseudo-polynomial_time

References
• Intro to DP: http://20bits.com/articles/introduction-to-dynamic-programming/
• Practice problems: http://people.csail.mit.edu/bdean/6.046/dp/