Online Algorithms
Amrinder Arora
Permalink: http://standardwisdom.com/softwarejournal/presentations/

Ski Rental Problem
• Classic toy problem of rent/buy nature
• Rental cost = $1 per day, buying cost = $k
• Algorithm
  – Rent for k-1 days
  – Buy skis on day #k
  – If you stop skiing during the first (k-1) days, your cost is the same as the best possible
  – If you ski for k days or more, your cost is $(2k-1), which is (2 - 1/k) times the best possible ($k)
• This is the best deterministic algorithm. (But we will look for a better one, using randomization, shortly.)
http://en.wikipedia.org/wiki/Ski_rental_problem

Agenda
• What are online algorithms? (And why?)
• How to analyze them?
• Some example problems and algorithms
• Categories of online algorithms

What does “Online” mean?
• “Online” means that input arrives a little at a time and each piece needs an instant response
  – E.g., stock market, paging
• Question: what is a “good” algorithm?

Applications
• Variety of applications
  – Scheduling
  – Memory management
  – Routing
• Sample problems
  – Ski rental
  – Job scheduling
  – Paging
  – Ambulances
  – Graph problems

Analyzing Online Algorithms
• Competitive analysis
• Probabilistic analysis
• Other/newer methods

Competitive Analysis
• For any input, the cost of our online algorithm is never worse than c times the cost of the optimal offline algorithm.
• Pro: very robust
• Con: pessimistic

Probabilistic Analysis
• Assume a distribution generating the input
• Find an algorithm that minimizes the expected cost under that distribution
• Pro: can incorporate information predicting the future
• Con: the distribution can be difficult to determine accurately

Competitive Analysis
• Input sequence: σ
• Full knowledge optimum (also called offline optimum): C_MIN(σ)
• Algorithm A is k-competitive if for all input sequences σ: C_A(σ) ≤ k · C_MIN(σ)
• Sometimes, to ignore boundary conditions, an additive constant b is allowed: C_A(σ) ≤ k · C_MIN(σ) + b

1966 – Online Era Begins!
• We are given a set of m identical machines.
• A sequence of jobs arrives online.
• Each job must be scheduled immediately on one of the m machines without knowledge of any future jobs.
• The goal is to minimize the completion time of the last job (the makespan).
• This most basic scheduling problem was introduced in 1966.

List Scheduling
• Schedule each job on the least loaded machine.
• Graham showed that the List Scheduling algorithm is (2 - 1/m)-competitive.
• This analysis is tight.
• This is the optimal algorithm for m = 2 and m = 3.

Fancier Results
• In 1992 Bartal et al. gave a 1.986-competitive algorithm: "New Algorithms for an Ancient Scheduling Problem"
• Karger et al. in 1996 generalized the algorithm and proved an upper bound of 1.945: "A Better Algorithm for an Ancient Scheduling Problem"
• The best algorithm known so far achieves a competitive ratio of 1.923
• Lower bounds
  – Susanne Albers proved a bound of 1.852
  – Current best lower bound of 1.88

Extending Research
• Many variants of the basic problem are studied:
  – Jobs may be preempted
  – Jobs may be rejected at a penalty
  – Online algorithms may use randomization
  – In addition, there are results for different machine types

Paging Problem
• Maintain a two-level memory system, consisting of a small fast memory and a large slow memory
• The goal is to serve a sequence of requests to memory pages so as to minimize the number of page faults.
• Hit: page found in fast memory

Minimizing Paging Faults
• On a fault, evict a page from the cache
• Paging algorithm ≡ eviction policy
• Goal: minimize the number of page faults

Paging Algorithms
• LRU, FIFO, LIFO, Least Frequently Used (LFU)
• LIFO and LFU are not competitive
• LRU and FIFO are k-competitive, where k is the cache size
• Theorem: No deterministic online algorithm can be better than k-competitive

Worst case
• In the worst case, the number of page faults on n requests is n.
• E.g., cache of size 4, request sequence p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12

Compare to optimal
• p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 … is hard for everyone (i.e., 13 faults)
• p1 p2 p3 p4 p5 p1 p2 p3 p4 p5 p1 p2 p3 p4 … 8 faults
• Optimal algorithm knows the future

k-Server problem
• Ambulance location problem
• Metrical Task Systems
• To some extent, the ice cream vending problem

Online Graph Theory
• Online Minimum Spanning Tree
• Online Graph Coloring
• Online Shortest Paths

Randomization
• If ski rental costs $1 and buying costs $10, then with probability 0.5, buy after 8 days, and with probability 0.5, buy after 10 days.

Randomization: 3 kinds of adversaries
• Oblivious (weak): Does not know the algorithm's random results
• Adaptive Online (medium): Knows the decisions/random output of the algorithm, but needs to make its own decision first.
• Adaptive Offline (strong): Knows everything, so randomization does not help.

Summary

Basic Concepts
• Online Problem
  – The problem is revealed one step at a time.
• Online Algorithm
  – Needs to respond to the request sequence that has been shown so far, not knowing the next items in the sequence
• Competitive Ratio
  – The worst-case ratio of the cost of the online algorithm to the cost of the optimal offline algorithm, considering all possible request sequences.

Importance and Research Topics
• Online algorithms show up in many practical problems.
• Even if you are considering an offline problem, consider what the online version of that problem would be.
• Research areas include improving algorithms, improving the analysis of existing algorithms, proving tightness of analysis, considering problem variations, etc.

Extra Material
• Adversary approach to prove that 3n/2 - 2 comparisons are always needed to find both the minimum and the maximum of an unsorted array of n numbers
• Notes at: http://is.gd/csp1ab
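
The 3n/2 - 2 lower bound in the extra material is matched by a simple pairwise algorithm. The sketch below is not the adversary proof from the linked notes; it only illustrates the matching upper bound. The function name `min_max_with_count` and the explicit comparison counter are assumptions introduced here for illustration. For odd n the count works out to ⌈3n/2⌉ - 2.

```python
def min_max_with_count(values):
    """Return (minimum, maximum, comparisons) using the pairwise method.

    Hypothetical helper for illustration: elements are processed two at a
    time, so each pair costs 3 comparisons instead of the naive 4.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")

    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element seeds both min and max for free.
        lo = hi = values[0]
        start = 1
    else:
        # Even length: one comparison orders the first pair.
        comparisons += 1
        if values[0] < values[1]:
            lo, hi = values[0], values[1]
        else:
            lo, hi = values[1], values[0]
        start = 2

    for i in range(start, n, 2):
        a, b = values[i], values[i + 1]
        # 1st comparison: order the pair.
        comparisons += 1
        if a < b:
            small, large = a, b
        else:
            small, large = b, a
        # 2nd comparison: only the smaller one can be a new minimum.
        comparisons += 1
        if small < lo:
            lo = small
        # 3rd comparison: only the larger one can be a new maximum.
        comparisons += 1
        if large > hi:
            hi = large

    return lo, hi, comparisons


if __name__ == "__main__":
    print(min_max_with_count([5, 3, 9, 1, 7, 2]))
```

For the six-element list above, the first pair costs one comparison and each of the two remaining pairs costs three, for a total of 3·6/2 - 2 = 7 comparisons.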