Sometimes it Pays to be Greedy: Greedy Algorithms in Economic Epidemiology
Fred Roberts, DIMACS

Optimization Problems in Economic Epidemiology
Many problems in Economic Epi can be formulated as optimization problems: find a solution that maximizes or minimizes some value.
•Find the optimal location for a hospital.
•Find the optimal assignment of health care workers to jobs.
•Optimize investment in health care supplies.
•Minimize the total cost of a series of medical tests or public health interventions.
•Control an outbreak with as small an investment in vaccines as possible.

Greedy Algorithms
Often, the simplest approach to an optimization problem is a greedy algorithm: choose the best (cheapest, highest-rated, …) available alternative at each step.
In general, greedy algorithms find locally optimal solutions, which are not necessarily globally optimal.
[Figure: global optimum vs. local optimum]

Greedy Algorithms
•We give examples from Economic Epi:
– Some where a greedy solution achieves a global optimum
– Others where it doesn’t, but we can either make modifications or get a bound on how far from optimal we are.

Outline
1. Four Applications of Classical Operations Research Methods
2. Vaccination Strategies for Control of a Highly Infectious Disease Spreading through a Social Network
3. Algorithms for Sequential Public Health or Medical Decision Making

Classic Example I: Assigning Health Care Workers to Jobs
•n workers W1, W2, …, Wn
•m jobs J1, J2, …, Jm
•We know which workers are qualified to do which jobs and the cost of using each worker.
•Goal: assign workers to jobs they are qualified for, each to at most one job, filling as many jobs as possible; among all ways of filling as many jobs as possible, find one with minimum total cost.
•This is known as the Minimum Cost Assignment Problem.

Assigning Health Care Workers to Jobs
Greedy Algorithm:
•At each stage, add the least expensive remaining worker to those getting job assignments if there is an acceptable (feasible) assignment using that worker and all those who have previously been assigned jobs, switching job assignments if necessary.
•The greedy algorithm always gives an optimal job assignment.

Classic Example II: Investing in Health Care Options
•Suppose we are faced with a selection of health care options in which to invest.
•Option i has an estimated cost $c_i$ and an estimated value $v_i$.
– Alternative health care facilities
– Alternative supplies for a clinic
– Alternative research programs
•Problem: Determine which ones to invest in so that the total cost is within budget and the total value is as large as possible.
[Example shown on the slide: AIDS Prevention Options. Option 1: Condoms; Option 2: Educational Posters; Option 3: Clean Needles to Distribute; Option 4: Testing; Option 5: Funded Researchers]

Investing in Health Care Options
Knapsack Problem
Maximize $\sum_i v_i x_i$
subject to $\sum_i c_i x_i \le B$
where $x_i$ = number of copies of item i chosen and B is the budget.
Variants:
– $x_i$ = 0 or 1: 0-1 Knapsack Problem
– $x_i \in \{0, 1, \dots, b_i\}$: Bounded Knapsack Problem
– $x_i$ is any nonnegative integer: Unbounded Knapsack Problem

Investing in Health Care Options
Greedy Algorithm
•Due to George Dantzig (1957)
•Sort items in decreasing order of value per unit cost: $v_i / c_i$
•Pick as many copies of the first item as possible, stopping when no more are available or when one more would violate $\sum_i c_i x_i \le B$.
•Continue in the same way with the second item, then the third, etc.
•For the unbounded knapsack problem, this algorithm always achieves at least half of the value obtained by the optimal solution.
•Is this acceptable?
•It depends on the application: do you need a fast decision?

Classic Example III: Locating Health Care Facilities
•We have a number of users of a planned set of health care facilities. Where do we put the facilities and how do we assign a user to a facility?
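Before the details of facility location, the greedy rule from Classic Example II can be made concrete. The following is a minimal sketch for the unbounded knapsack variant only, assuming positive integer costs; the function name and the two-item instance are illustrative and not from the slides.

```python
def greedy_unbounded_knapsack(costs, values, budget):
    """Greedy by value per unit cost; returns (copies per item, total value)."""
    # Visit items in decreasing order of value per unit cost.
    order = sorted(range(len(costs)),
                   key=lambda i: values[i] / costs[i],
                   reverse=True)
    counts = [0] * len(costs)
    remaining = budget
    for i in order:
        # Take as many copies of item i as the remaining budget allows.
        copies = remaining // costs[i]
        counts[i] = copies
        remaining -= copies * costs[i]
    total_value = sum(n * v for n, v in zip(counts, values))
    return counts, total_value


# Hypothetical instance: item 0 costs 6 and is worth 7, item 1 costs 5 and is worth 5.
counts, total = greedy_unbounded_knapsack([6, 5], [7, 5], budget=10)
print(counts, total)
```

On this instance the greedy rule buys one copy of item 0 for a value of 7, while two copies of item 1 would give 10. The greedy value is below the optimum but above half of it, in line with the bound stated for the unbounded problem.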
Locating Health Care Facilities
•There are two costs:
– $f_i$ = cost of opening a facility at i
– $c_{ij}$ = cost of sending the user at j to a facility at i
•Let F = sum of $f_i$ over all opened facilities.
•Let C = sum of costs $c_{ij}$ over all users j, where i is the facility to which user j is assigned.
•We want to minimize F + C.
•Assume that there is no limit to the number of facilities we might open.
•However, there is a tradeoff between the increased cost of more facilities and the decreased cost of getting to a nearby facility.
•This is the Uncapacitated Facility Location Problem.
•Uncapacitated since there is no limit on the number of users a facility can serve (and, as assumed above, no limit on the number of facilities).

Locating Health Care Facilities
[Figure: network with nodes a through f, each labeled with the cost of opening a facility there (the costs shown are 0.5, 5, 4, 3, 6, and 1); the edge costs shown are all 1. Numbers on edges are costs of moving along the edge.]
Given users at red circled locations, where do we locate facilities to minimize F + C?

Locating Health Care Facilities
Greedy Algorithm
•Due to Charikar and Guha (2004)
•First find a preliminary solution S.
•Order the nodes of the network in order of increasing cost of locating a facility at the node.
•Choose p so that if S is the set of the first p facilities, then the cost F + C associated with S is as small as possible.
•Modify the preliminary solution in a series of steps by randomly selecting nodes to add to S and subsets of nodes to remove from S.

Locating Health Care Facilities
•Charikar and Guha show that, given $\varepsilon > 0$, the algorithm is guaranteed to achieve a cost F + C that is at most $2F^* + 3C^* + \varepsilon(F^* + C^*)$ in at most $O(n \log(n/\varepsilon))$ steps, where $F^*$ and $C^*$ are the costs associated with an arbitrary optimal solution.

Classic Example IV: Rerouting Emergency Vehicles in Case of Floods
•New initiative in Climate and Health at DIMACS.

Extreme Events due to Global Warming
•We anticipate an increase in the number and severity of extreme events due to global warming.
•More heat waves.
•More floods, hurricanes.
Extreme Events due to Global Warming
Areas of Emphasis in DIMACS Climate & Health Initiative
•Evacuations during extreme heat events
•Rolling power blackouts during extreme heat events
•Pesticide applications after floods
•Emergency vehicle rerouting after floods

Minimum Spanning Tree Problem
[Figure: weighted graph with edge weights 2, 20, 10, 26, 14, 15, 22, 8, 28, 16. Red edges define a minimum spanning tree.]
A spanning tree is a tree using the edges of the graph and containing all of the nodes. It is minimum if the sum of the numbers on the edges used is as small as possible.

Minimum Spanning Tree Problem
• Minimum spanning trees arise in many applications.
• One example: Given a road network, find usable roads that allow you to go from any node to any other node, minimizing the total length of the roads used.
• This problem arises in the DIMACS Climate and Health project: Find a usable road network for emergency vehicles in case extreme events leave flooded roads.

Minimum Spanning Tree Problem
• Kruskal’s algorithm (greedy algorithm):
– List the edges in order of increasing weight.
– For each edge, greedily include it if it does not form a cycle with edges already chosen.
– Stop when no more edges can be included.
• Kruskal’s algorithm gives an optimal solution.

Vaccination Strategies for Control of a Highly Infectious Disease Spreading through a Social Network
Work with Paul Dreyer and Stephen Hartke

The Model: Moving From State to State
Social Network = Graph
Nodes = People
Edges = Contact
SI model: each node is either susceptible (S) or infected (I), shown as two node colors in the original slides. Once in the infected state, stay there.
Times are discrete: t = 0, 1, 2, …

Disease Process
Highly Infectious Disease: You change your state from susceptible to infected at time t+1 if at least one of your neighbors is infected at time t. You never leave the infected state.

Vaccination Strategies
Let’s say you have a limited amount of vaccine available each time period, say v doses. Whom should you vaccinate?
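To make the question concrete, the model can be simulated directly. The sketch below rests on assumptions the slides do not state at this point: vaccination happens before the disease spreads in each period, and a vaccinated person never becomes infected. The strategy `vaccinate_threatened` (protect the highest-degree susceptible neighbors of infected nodes) is only a hypothetical illustration, not a recommended or optimal policy.

```python
def vaccinate_threatened(adj, infected, susceptible, v):
    """Hypothetical strategy: pick up to v threatened nodes, highest degree first."""
    threatened = [u for u in susceptible
                  if any(w in infected for w in adj[u])]
    threatened.sort(key=lambda u: (-len(adj[u]), u))
    return threatened[:v]


def simulate(adj, source, v, choose):
    """Run the SI process with v doses per period until no new infections occur."""
    infected = {source}
    vaccinated = set()
    while True:
        susceptible = set(adj) - infected - vaccinated
        # Assumed order: vaccinate first, then let the disease spread one step.
        picks = choose(adj, infected, susceptible, v)
        vaccinated.update(p for p in picks[:v] if p in susceptible)
        newly = {u for u in adj
                 if u not in infected and u not in vaccinated
                 and any(w in infected for w in adj[u])}
        if not newly:
            return infected, vaccinated
        infected |= newly


# Path 0 - 1 - 2 - 3 - 4 with the outbreak at node 2 and one dose per period.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
infected, vaccinated = simulate(path, source=2, v=1, choose=vaccinate_threatened)
print(sorted(infected), sorted(vaccinated))
```

Passing the strategy in as a function keeps the disease process fixed while different vaccination rules are compared on the same network.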
Vaccination Strategies
More precisely: What vaccination strategy minimizes the number of people ultimately infected if a disease breaks out with one infection?
Sometimes called the firefighter problem: alternate fire spread and firefighter placement.

Some Results on the Firefighter Problem
Thanks to Kah Loon Ng, DIMACS, for some of the following slides, slightly modified by me.

Three doses of vaccine per time period (v = 3)
[Figures: successive time steps of an example outbreak with v = 3]

Some questions that can be asked (but not necessarily answered!)
• Can the fire be contained?
• How many time steps are required before the fire is contained?
• How many firefighters per time step are necessary?
• What fraction of all nodes will be saved (burnt)?
• Does it matter where the fire breaks out?
• What if the fire starts at more than one node?
• Consider different graphs. Construction of (connected) graphs to minimize damage.
• Complexity/Algorithmic issues

Containing Fires in Infinite Grids $L_d$
Fire starts at only one node:
d = 1: Trivial.
d = 2: Impossible to contain the fire with 1 firefighter per time step.

Containing Fires in Infinite Grids $L_d$
d = 2: Two firefighters per time step needed to contain the fire.
[Figure: containment with two firefighters per time step, reached in 8 time steps with 18 burnt nodes]

Containing Fires in Infinite Grids $L_d$
Wang and Moeller (2002): If $d \ge 3$, 2d – 1 firefighters per time step are sufficient to contain any outbreak starting at a single node.
Hartke (2004): If $d \ge 3$, 2d – 2 firefighters per time step are not enough to contain an outbreak in $L_d$.
Thus, 2d – 1 firefighters per time step is the minimum number required to contain an outbreak in $L_d$, and containment can be attained in 2 time steps.

Firefighting on Trees
Epidemic starts at the root.
Number of doses of vaccine: v = 1

Firefighting on Trees
Greedy algorithm:
For each node x, define weight(x) = number of descendants of x + 1.
Algorithm: At each time step, place the firefighter at a node that is neither burnt nor already saved such that weight(x) is maximized.
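The greedy rule above can be sketched in a few lines. This is a minimal illustration, assuming the tree is given as a dictionary from each node to its list of children and that the firefighter is placed before the fire spreads in each time step. It restricts the choice to children of burning nodes, which loses nothing because every unburnt, unsaved node has an ancestor on that frontier with strictly larger weight. The function names and the seven-node tree are hypothetical.

```python
def subtree_weights(children, root):
    """weight(x) = number of descendants of x + 1, for every node x."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))
    weight = {}
    # Every node appears after its parent in `order`, so go backwards.
    for node in reversed(order):
        weight[node] = 1 + sum(weight[c] for c in children.get(node, []))
    return weight


def greedy_firefighter(children, root):
    """One firefighter per time step; returns (protected nodes, number saved)."""
    weight = subtree_weights(children, root)
    burning = [root]   # current fire front
    burnt = 1
    protected = []
    while True:
        threatened = [c for b in burning for c in children.get(b, [])]
        if not threatened:
            break
        best = max(threatened, key=lambda x: weight[x])
        protected.append(best)      # saves best and all of its descendants
        threatened.remove(best)
        burning = threatened        # fire spreads to the unprotected children
        burnt += len(burning)
    return protected, len(weight) - burnt


# Hypothetical tree: 0 is the root; 1 and 2 are its children, and so on.
tree = {0: [1, 2], 1: [3, 4], 2: [5], 3: [6]}
print(greedy_firefighter(tree, root=0))
```

On this small tree the first firefighter goes to node 1 (weight 4) and the second to node 5, so only the root and node 2 burn.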
Firefighting on Trees
[Figure: example tree with each node labeled by its weight]

Firefighting on Trees
[Figure: the example tree, comparing Greedy = 7 with Optimal = 9]

Firefighting on Trees
Theorem (Hartnell and Li, 2000): For any tree with one fire starting at the root and one firefighter to be deployed per time step, the greedy algorithm always saves more than ½ of the nodes that any algorithm saves.

Algorithms for Sequential Public Health or Medical Decision Making
•A patient presents with certain symptoms.
•Which test do we do first?
•On the basis of the outcome of the first test, which test do we do next?
•Tests are expensive.
•So are false positive and false negative results.
•“Cost” is a combination of cost of testing and cost of false results.
•In what order should we do tests in order to minimize total “cost”?

Algorithms for Sequential Public Health or Medical Decision Making
•We have several potential interventions for a public health crisis.
•Assume funds limit us to one intervention at a time.
•Which intervention do we invest in first?
•On the basis of the outcome of the first intervention, which do we launch next?
•Interventions are expensive.
•So are false positive and false negative assessments of the outcome of our interventions.
•“Cost” is a combination of cost of the intervention and cost of false results.
•In what order should we launch the interventions in order to minimize total “cost”?

Sequential Diagnosis Problem
•Such sequential diagnosis problems arise in many areas:
–Communication networks (testing connectivity, paging cellular customers, sequencing tasks, …)
–Manufacturing (testing machines, fault diagnosis, routing customer service calls, …)
–Inspecting containers at ports

Sequential Decision Making Problem
•A physician is looking to determine if a patient has disease x. The doctor has a variety of tests to choose from. In the end, the patient is to be classified into one of several categories.
•Simple case: 0 = “doesn’t have the disease”, 1 = “does have the disease”
•Testing scheme: specifies which tests are to be made based on previous observations
[Images: blood test, endoscopy, MRI, stress test]

Sequential Decision Making Problem
•We are looking to determine if an epidemic can be controlled. We have a variety of interventions to choose from. In the end, the epidemic is to be classified into one of several categories.
•Simple case: 0 = controllable, 1 = not controllable
•Intervention scheme: specifies which interventions are to be made based on assessments of previous interventions.
•H1N1 Virus.
Intervention 1: Close schools if 15% absenteeism
Intervention 2: Close airports
Intervention 3: Tamiflu to health care workers
Intervention 4: Invest in vaccine

Sequential Decision Making Problem
•0’s and 1’s suggest binary digits (bits)
•Bit String: A sequence of bits: 0001, 1101, …
•Boolean Function: A function that assigns to each bit string a 0 or a 1.
Bit string x:  00  01  10  11
B(x):           1   0   0   1
B(00) = 1, B(10) = 0

Sequential Decision Making Problem
•The following is in the language of medical testing.
•Patients have attributes related to the disease being tested for, each in a number of states.
•Sample attributes:
–White blood cell count
–PSA
–Creatinine clearance
–Fever > 40 degrees Centigrade
–Severe cough
–Severe fatigue

Sequential Decision Making Problem
•Simplest Case: Attributes are in state 0 or 1 (absent or present, higher than threshold or not)
•Then: Patient corresponds to a bit string like 011001
•So: Classification is a decision function F that assigns each bit string to a category.
011001 → F(011001)
If attributes 2, 3, and 6 are present, assign patient to category F(011001).

Sequential Decision Making Problem
•If there are two categories, 0 and 1 (“doesn’t have disease” or “has disease”), the decision function F is a Boolean function.
Example: F(000) = F(111) = 1, F(abc) = 0 otherwise.
This classifies a patient as positive (sick with the disease) iff he has none of the attributes or all of them.

Binary Decision Tree Approach
•Tests measure presence/absence of attributes: so 0 or 1
•Use two categories: 0, 1 (doesn’t have the disease or does)
•Binary Decision Tree:
–Nodes are tests or categories
–Two arcs exit from each test node, labeled left and right.
–Take the right arc when the test says the attribute is present, left arc otherwise

Binary Decision Tree Approach
[Figure 1: binary decision tree with test nodes a0, a1, a2]
•Reach category 1 from the root by: a0 (L) → a1 (R) → a2 (R) → 1, or a0 (R) → a2 (R) → 1
•Patient classified in category 1 iff he has a1 and a2 and not a0, or a0 and a2 and possibly a1.
•Corresponding Boolean function:
• F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

Binary Decision Tree Approach
[Figure 2: a smaller binary decision tree for the same function]
•This binary decision tree corresponds to the same Boolean function F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise. However, it has one fewer test node ai. So, it is more efficient if all tests are equally costly and equally likely.

Binary Decision Tree Approach
•The problem of finding the “least cost” binary decision tree for a given Boolean function is very hard (NP-complete).
•For small n = number of attributes, can try to solve it by trying all possible binary decision trees corresponding to the Boolean function F.
•Even for n = 4, not practical.

Binary Decision Tree Approach
Promising Approach: Special Assumptions about Boolean Function F
•Stroud and Saeger (Los Alamos National Lab) enumerate all “complete, monotone” Boolean functions and calculate the least expensive corresponding binary decision trees.
•Their method is practical for n up to 4, not n = 5.

Binary Decision Tree Approach
Monotone Boolean Functions:
•Given two bit strings $x_1 x_2 \dots x_n$, $y_1 y_2 \dots y_n$
•Suppose that $x_i \ge y_i$ for all i implies that $F(x_1 x_2 \dots x_n) \ge F(y_1 y_2 \dots y_n)$.
•Then we say that F is monotone.
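For small n, monotonicity can be checked by brute force. The sketch below is an illustration only, with hypothetical function names. It relies on the fact that it is enough to compare each bit string with the strings obtained by turning a single 0 into a 1, since any larger string is reached by a chain of such single-bit steps. The two sample functions are the ones from the earlier examples.

```python
from itertools import product


def is_monotone(F, n):
    """True if raising any single bit from 0 to 1 never lowers F."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if F(y) < F(x):
                    return False
    return True


def all_or_none(x):
    # F(000) = F(111) = 1, F(abc) = 0 otherwise
    return 1 if sum(x) in (0, len(x)) else 0


def tree_example(x):
    # F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise
    return 1 if x in {(1, 1, 1), (1, 0, 1), (0, 1, 1)} else 0


print(is_monotone(all_or_none, 3))   # False: F(000) = 1 but F(100) = 0
print(is_monotone(tree_example, 3))  # True
```

The check visits all 2^n bit strings, so it is only meant for the small values of n discussed here.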
Incomplete Boolean Functions:
•Boolean function F is incomplete if F(x) can be computed from the values of at most n–1 of the n attributes of the input string x. Otherwise F is complete.

Complete, Monotone Boolean Functions
Combinatorial Explosion!
No. of attributes | No. CM Bool. Funs. | No. BDTs from CM Bool. Funs. | No. BDTs
2                 | 2                  | 4                            | -
3                 | 9                  | 60                           | -
4                 | 114                | 11,808                       | 1,079,779,602
5                 | 6,894              | 63,515,920                   | 5 x 10^18

Cost Functions
•Stroud-Saeger method applies to more sophisticated cost models, not just cost = number of tests in the BDT.
•Cost Complication: How many nodes of the decision tree are actually visited during an average diagnostic procedure? Depends on the “distribution” of the disease.
•Answer can also depend on probability of test errors and probability a patient has the disease.

Cost Functions: Unit Costs, Tree Utilization
•Assume we are given the probability of test errors for different tests and the a priori probability a patient has the disease.
•This allows us to calculate the “expected” cost of utilization of the tree, $C_{util}$.
•It also allows us to calculate the probability of a false positive and the probability of a false negative.

Cost Functions
OTHER COSTS:
•Cost of false positive: Cost of additional tests.
–If it means beginning a series of treatments, it could be expensive, not to mention psychological cost to patient.
•Cost of false negative:
–Complex issue.
–What is the cost of a patient going untreated?

Cost Function used for Evaluating the Decision Trees
$C_{Tot} = C_{FalsePositive} \cdot P_{FalsePositive} + C_{FalseNegative} \cdot P_{FalseNegative} + C_{util}$
$C_{FalsePositive}$ is the cost of a false positive (Type I error).
$C_{FalseNegative}$ is the cost of a false negative (Type II error).
$P_{FalsePositive}$ is the probability of a false positive occurring.
$P_{FalseNegative}$ is the probability of a false negative occurring.
$C_{util}$ is the expected cost of utilization of the tree.
$P_{FalsePositive}$ and $P_{FalseNegative}$ are calculated from the tree.
$C_{util}$ is calculated from the tree, the probability of disease, and the probability of test errors.
$C_{FalsePositive}$ and $C_{FalseNegative}$ are inputs (given information).

Stroud-Saeger Results
• Using this cost function $C_{Tot}$, Stroud and Saeger found an algorithm for enumerating all BDTs coming from complete, monotone Boolean functions and then ranking all trees in terms of their total costs.
• The method is feasible for n = 3 or 4 types of tests, not for n > 4.

D. Madigan, S. Mittal, F. Roberts: A New Approach: Searching through a Generalized Tree Space
• Idea: Sometimes adding more possibilities results in being able to do more efficient searches.
• We expand the space of trees from those corresponding to Stroud and Saeger’s “Complete and Monotonic” Boolean Functions to “Complete and Monotonic” BDTs.

CM Trees
• Monotonic Decision Trees
– A binary decision tree will be called monotonic if all the left leaves are class “0” and all the right leaves are class “1”.
• Complete Decision Trees
– A binary decision tree will be called complete if every type of test occurs at least once in the tree and, at any non-leaf node in the tree, its left and right sub-trees are not identical.
• CM Tree = complete, monotonic BDT

The CM Tree Space
No. of attributes | Distinct BDTs | Trees from CM Boolean Functions | Complete, Monotonic BDTs
2                 | 74            | 4                               | 4
3                 | 16,430        | 60                              | 114
4                 | 1,079,779,602 | 11,808                          | 66,600

Tree Neighborhood and Tree Space
• Define tree neighborhood by giving four operations for moving from one tree in CM Tree Space to another.
• We have developed an algorithm for finding low-cost BDTs by searching through CM Tree Space from a tree to one of its neighbors.

Tree Space Traversal
• Naïve Idea: Greedy Search
1. Randomly start at any tree in the CM tree space
2. Find its neighboring trees using the four operations
3. Move to the neighbor with the lowest cost
4. Iterate until we find a minimum
– Problem: The CM Tree space is highly multimodal (more than one local minimum)!
– Therefore, we implement a stochastic search algorithm with simulated annealing, a variant of the greedy algorithm, to find the best tree.

Results: Searching CM Tree Space
• We successfully performed experiments for 3, 4, and 5 tests; the search was significantly faster than existing methods of searching through BDTs obtained from complete, monotonic Boolean functions.
• Results show improvement compared to existing extensive search methods.
– The searches found the optimal tree almost half the time.
– They often found a less costly tree than the best tree arising from a complete, monotone Boolean function.

Conclusion: Sometimes it Pays to be Greedy