Technion - Computer Science Department - M.Sc. Thesis MSC-2011-09 - 2011

Online Load-Distance Balancing

Research Thesis

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science

Israel Shalom

Submitted to the Senate of the Technion - Israel Institute of Technology
Tamuz 5771, Haifa, July 2011

The research thesis was done under the supervision of Prof. Seffi Naor in the Computer Science Department.

The generous financial support of the Technion is gratefully acknowledged.

Contents

Abstract
Abbreviations and Notation
1 Introduction
  1.1 Online Problems
  1.2 Load-Distance Balancing
  1.3 Related Problems
    1.3.1 Load Balancing
    1.3.2 Facility Location
    1.3.3 Bipartite Matching
  1.4 Applications
  1.5 Previous Work
    1.5.1 Load-Distance Balancing
    1.5.2 Facility Location
    1.5.3 Bipartite Matching
2 Our Results
  2.1 Continuous Latencies and Greedy Algorithm
  2.2 Bipartite Matching
3 Load-Distance Balancing
  3.1 Problem Setting
  3.2 Generic Lower Bound
  3.3 Deterministic and Randomized Algorithms
4 Load-Distance Balancing: Greedy Implementation
  4.1 Height Preservation
  4.2 WaterFill
5 Load-Distance Balancing: Greedy Performance
  5.1 Linear Latencies
  5.2 Polynomial Latencies
    5.2.1 Upper Bound
    5.2.2 Lower Bound
  5.3 Concave Latencies
  5.4 Bounded-Slope Latencies
    5.4.1 Upper Bound
    5.4.2 Lower Bound
6 Bipartite Matching
  6.1 Negative Results
    6.1.1 Non-metric Distances
    6.1.2 Non-Splittable Demands
  6.2 Metric Distances and Splittable Demands
    6.2.1 Lower Bound
    6.2.2 Upper Bound Outline
    6.2.3 Hierarchically Well-Separated Trees (HSTs)
    6.2.4 Restricted Rearrangement Model
    6.2.5 Stable Matchings
    6.2.6 Local-Match Algorithm
7 Summary
  7.1 Conclusions
  7.2 Future Work
List of Figures

3.1 Generic Lower Bound Example
5.1 Illustration for Lemma 5.3.3
6.1 Non-Metric Lower Bound Example
6.2 Non-Splittable Demands Lower Bound Example
6.3 Example for Lemma 6.2.3

Abstract

This work concerns online algorithms, which differ from offline algorithms in that they do not receive their input at once, but rather sequentially. For each new portion of the received input, the algorithm provides some part of the output. Decisions made by the online algorithm regarding previous requests are irrevocable. The competitive ratio of an online algorithm is defined to be the worst-case ratio between the cost of its output and the cost of the optimum output for the same input, taken over all possible input sequences.

In particular, we consider online assignment problems. Having been rigorously studied in the past half-century, these problems concern the assignment of resources (servers) to users (requests), and are relevant to many practical problems, for instance sharing of computing resources or load balancing for network switching. In the online case, requests arrive one by one, and the online algorithm returns the handling server.

As with any algorithm, a cost is associated with every output. Our cost model in Load-Distance Balancing (LDB) is made of two components: one is the “moving” or “distance” cost, and the other is the “congestion” or “latency” cost.
The first depends on the distance that is defined between any request/server pair, and the second depends on the latency function of the server as well as the load (number of served requests) on the server. While such components might be due to entirely different causes, it makes sense to add them, since both translate into the latency which the end-user experiences.

At the very beginning, through a challenging instance of the LDB problem, we show that with no further assumptions about latency functions, no online algorithm can guarantee any competitive ratio. We thus proceed to examine particular subclasses of latency functions: linear, concave, bounded-slope (the derivative is bounded between two constants), polynomial and capacity functions. The last class, functions that are zero up to some capacity and then grow to infinity, is particularly useful, since it represents the problem of bipartite matching with capacities. In the minimum bipartite matching problem there are no latencies: the goal is to minimize the total moving cost without exceeding the capacity of any server.

For the first four classes (linear, concave, bounded-slope and polynomial) we suggest the greedy algorithm. Greedy algorithms make the locally optimal choice at every step, and their usage is a fundamental technique in Computer Science. Despite being intuitive, the implementation of Greedy proves difficult, since one can split requests over several servers, and therefore Greedy must choose between an infinite number of choices. We solve this problem by creating WaterFill, which acts greedily and works efficiently. We find that Greedy (or its variation) is optimal up to a constant in all these subcases.

For capacity latencies the analysis is entirely different.
We start by showing that in the general case no algorithm can achieve a bounded competitive ratio, and we see that the only sub-case in which online algorithms can perform well is that of metric distances and splittable demands (i.e. a request can be served by multiple servers). For this case, we suggest LocalMatch, which is entirely different from Greedy and is based on bipartite matching algorithms. We find that LocalMatch is $O(\log^2 k)$-competitive, and we show a lower bound of $\Omega(\log k)$ for the competitive ratio of any online algorithm, where $k$ is the number of servers.

Abbreviations and Notation

All abbreviations, notations and terms are defined prior to their first usage within the work body. Following are the key notations that are used extensively throughout the work:

• LDB: Load-Distance Balancing, see Chapter 3.
• UniFL: Universal Facility Location, see Section 1.3.2.
• KKT: Karush-Kuhn-Tucker conditions, see Section 2.1.
• Cost notation: for an algorithm $A$, $C(A)$ denotes the entire cost, $\Gamma_A$ the latency cost, $\Delta_A$ the distance cost.
• OPT: the optimal algorithm, offline unless noted otherwise.
• Greedy or GRE: the online greedy algorithm.
• $\ell$ denotes load, and $f(\ell)$ denotes the associated latency.

Chapter 1
Introduction

1.1 Online Problems

This work concerns competitive analysis of online algorithms. Online algorithms do not receive their input at once, but rather sequentially. For each portion of the received input, called a request, the algorithm must provide some part of the output. The competitive ratio of an online algorithm is defined to be the ratio between the cost of its output and the cost of the optimum output for the same input. Clearly, this ratio depends on the input, and as we do in other fields in computer science, we consider the worst-case scenario.
We call an online algorithm $\alpha$-competitive if for all inputs its cost is at most $\alpha$ times the optimum cost. If an online algorithm is randomized (i.e. it uses coin tosses in order to determine its output), then the competitive factor is defined with respect to the expected cost. Formally, let $\mathcal{P}$ be the set of all instances of a problem and let $C_P(A)$ denote the cost of the output of algorithm $A$ on instance $P \in \mathcal{P}$. An algorithm $A$ is $\alpha$-competitive if:

\[ \alpha = \sup_{P \in \mathcal{P}} \frac{C_P(A)}{C_P(\mathrm{OPT})} \]

A classical example of an online problem is the following: you go to a ski resort, and you have to decide whether you want to purchase your own skis for 20 dollars, or rent skis for one dollar per day. In the “offline” case, you know precisely how many days you will spend in the resort, and thus can easily compute the optimum choice: if the number of days is above twenty, then buy, otherwise rent. In the online case, consider that you have an unpleasant boss, who can terminate your vacation on any given morning. Now, what must you do in order to ensure that you don’t pay “too much”? In this case the answer is: rent for the first twenty days, and buy skis on the 21st day if the boss hasn’t called you back. This ensures that you pay at most twice the cost of the optimum solution. The proof is rather easy: let $x$ be the number of days spent in the resort; if $x > 20$ then $C(A) = 40 = 2 \cdot 20 = 2 \cdot C(\mathrm{OPT})$; otherwise $C(A) = C(\mathrm{OPT}) = x$.

In the past two decades, online algorithms have been studied thoroughly in the computer science community. Some of the classic problems include the k-server problem and the caching problem [11]. Results in this field are usually of one of two types: either finding an online algorithm with an improved competitive factor, or proving that the competitive ratio for some problem cannot be below a certain threshold.

1.2 Load-Distance Balancing

The Load-Distance Balancing problem (LDB) was studied by Bortnikov et al. [12].
In this work we focus on one of the variants they introduced, Min-Avg LDB. The problem instance consists of a graph $G = (V, E)$, a set of requests $R \subseteq V$ and a set of servers $S \subseteq V$, a (non-metric) distance function $w_{rs}$ between each request/server pair, and a latency function $f_s$ for every server $s$. When a unit of request $r$ is matched to a server $s$, it incurs both the distance cost $w_{rs}$ and the congestion cost $f_s(\ell_s)$, where $\ell_s$ is the load on server $s$. For instance, let $r$ be a unit request that is matched to $s$, with distance $w_{rs} = 5$. Let $s$ have a linear latency cost, $f_s(\ell_s) = 2\ell_s$, and an existing load (prior to the arrival of $r$) of 5. Then, after $r$ is matched we have $\ell_s = 6$, and therefore the total cost for $r$ is $5 + 2 \cdot 6 = 17$. We define the problem setting more formally in Section 3.1.

As with previous frameworks, we do not restrict ourselves to unit demands. This means that for every request, any amount of demand can appear at once. Furthermore, we extend the framework by considering both the splittable model, in which a demand can be split between several servers, and the non-splittable model, in which the entire demand must be assigned to a single server.

1.3 Related Problems

1.3.1 Load Balancing

Load balancing is a well-studied problem, and is essentially a special case of load-distance balancing in which edge costs are 0. Thus, all of our results are applicable to this problem too. Notice that while most of the study of load-balancing problems considers the makespan (the load on the most loaded server), there is still much merit in studying the average load, since it better reflects the delays experienced by users.

1.3.2 Facility Location

Our problem is also closely related to the facility location problem. In the facility location problem, we have demand points (requests) and facilities (servers). Each demand/facility pair has a distance, and each facility has an opening cost.
The goal is to minimize the overall cost incurred by opening costs and connection costs. The offline facility location problem and its approximation were studied rigorously by Shmoys et al. [28], later by Charikar et al. [13], and then by Sviridenko [29]. The online version of the problem was analyzed by Meyerson [26], yielding an $O(\log n)$-competitive algorithm for metric distances.

In the capacitated facility location problem, each facility also has a capacity on the number of clients it can serve. The offline version of this problem was studied in [22, 14, 24].

The Universal Facility Location (UniFL) problem was introduced by Mahdian and Pal [25]. In this model each server $s$ has a cost function $F_s$ associated with it. As in LDB, in UniFL we have a moving cost and a congestion cost. The moving cost is identical to that of LDB, whereas the congestion cost is $F_s(\ell_s)$, i.e. rather than representing the congestion cost per unit of demand, $F_s$ represents the overall cost. In other words, we can reduce LDB to UniFL by simply setting $F_s(\ell_s) = \ell_s f_s(\ell_s)$.

Notice that UniFL generalizes all previously mentioned Facility Location problems, using the following functions:

• Opening Costs:
\[ F_s(x) = \begin{cases} \alpha_s & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases} \]

• Hard Capacities:
\[ F_s(x) = \begin{cases} \infty & \text{if } x > c_s \\ \alpha_s & \text{if } c_s \ge x > 0 \\ 0 & \text{otherwise} \end{cases} \]

• Soft Capacities:
\[ F_s(x) = \alpha_s \left\lceil \frac{x}{c_s} \right\rceil \]

1.3.3 Bipartite Matching

Matching is one of the most fundamental problems in algorithms. Within this framework, bipartite matching is especially common: it has many natural applications, mostly different variations of resource assignment problems, such as assigning jobs to machines, or cities to facilities.

A significant amount of study has been devoted to the online version of the bipartite matching problem. In this version, the input is a graph in which some vertices are designated as servers.
Then, requests arrive one by one over time, and upon arrival the algorithm must assign the request to a server. This version of the problem was introduced two decades ago [21, 20]. It is easy to see that for general edge-distances, no online algorithm can have any finite competitive ratio. Therefore, the works in this area focused on a natural restriction to the metric case. Recent breakthroughs [27, 8] drastically reduced the best competitive ratio for this problem: until these were published, the best known competitive ratio was linear in $k$, where $k$ is the number of servers, and these results showed algorithms with a competitive ratio that is polylogarithmic in $k$.

We further generalize this problem by introducing demands. Instead of representing one unit, each request $r$ now has a demand which has to be matched to a server (or multiple servers, if demands are splittable), and each server $s$ has a capacity $c_s$, which is the maximum amount of demand it can accommodate. Our task is to minimize the connection costs without exceeding the capacity of any server. Recall that we are exploring the online version of the problem, in which requests come one by one and must be matched immediately. We consider two variations: splittable demand, where a request can be served by two or more servers, and non-splittable demand, where for every request the entire demand must be served by a single server.

Our Load-Distance Balancing framework supports the generalized version of the problem, with the following latency function:

\[ f_s(x) = \begin{cases} \infty & \text{if } x > c_s \\ 0 & \text{otherwise} \end{cases} \]

1.4 Applications

With the emergence of cloud computing [4], an increasing number of services are network-based and use shared resources. Users accessing a shared resource experience a delay.
Typically, two major factors play a role in determining the overall delay: the distance delay, which depends on the distance between users and resources, and the congestion delay, which depends on the load of the users sharing the resource. While in some cases the distance delay might be negligible, in others it will be a hefty portion of the cost, since the overall data that needs to be transferred to the resource could be large.

Software as a Service (SaaS) and Platform as a Service (PaaS) are two important paradigms in which cloud computing is used. In SaaS, the supplier of the service delivers an application over the servers on demand. Many examples of such services exist, from salesforce.com to Google Docs. In PaaS, the supplier delivers a computing platform on which users can develop applications and then serve customers. A premier example is the Amazon Elastic Compute Cloud (EC2), which allows users to rent virtual computers and deploy computer applications. In both cases, when an end-user runs an application on a server, the distance to the server, as well as the congestion at the server, are major sources of delay.

Consider some further examples to which this type of setting is relevant:

• Rich Internet Applications (RIA) are web applications which are on a par (in terms of richness) with standard desktop applications through various browser plug-ins [17]. For example, an RIA provider might have many servers and a customer can be assigned to any of them. The customer would then experience two types of delays: one caused by the distance to the server, and the other by the congestion at the server.

• Wireless mesh networking (WMN) technologies strive to be a next-generation platform for high-speed Internet access in urban and rural areas [1].
The infrastructure for WMN consists of a limited set of landline gateways which are connected to the Internet, as well as a set of wireless routers forwarding the user traffic to one of the gateways and back. When a user initiates a session, it must be decided through which gateway the user accesses the Internet. Both the distance to a gateway and the number of users that are already assigned to it are considerations that come into play.

• When routing on a network between two endpoints, several routes can be selected, and ideally a route should be both short (small number of hops) and not congested (little traffic on intermediary nodes). Therefore, a cost model which takes into account both factors is called for.

We model the above scenarios by a cost model which is the sum of distance costs and congestion costs. Distance costs are determined by the distance between matched server/request pairs. The congestion cost on a server $s$ is determined by the load on it, $\ell_s$, as well as the latency function $f_s$: any unit that is matched to $s$ experiences a delay of $f_s(\ell_s)$. While these two costs arise for different reasons, summing them makes sense since they are both ultimately experienced as delay, in time units.

1.5 Previous Work

1.5.1 Load-Distance Balancing

Bortnikov et al. [12] studied the Load-Distance Balancing problem, showed that it is NP-hard for concave delay functions, and suggested a polynomial-time optimal algorithm for almost convex functions. They also studied a game-theoretic model of the problem; however, they did not study the online case.

As mentioned earlier, a special case of this framework is the load balancing problem (zero edge-distances), which was studied in its online form by many researchers [18, 19, 7]. Some work was even done in models that consider both the load and the distance as we do [16, 6].
Most of the work done is concerned with minimizing the makespan, i.e. the largest load experienced by any server. Our analysis differs in two significant respects: it considers a general latency function over the load, and not the load itself; and it considers the average latency rather than the maximum latency.

Awerbuch et al. [5] considered an $L_p$-norm cost model: that is, given the vector of loads on the servers, rather than taking its average (equivalent to the $L_1$-norm) or maximum (equivalent to the $L_\infty$-norm), they consider any $L_p$ norm. Recall that the $L_p$ norm is:

\[ |X|_p = \Big( \sum_{1 \le i \le n} |x_i|^p \Big)^{1/p} \]

And $L_\infty$ is:

\[ |X|_\infty = \lim_{p \to \infty} |X|_p = \max_{1 \le i \le n} \{ |x_i| \} \]

They propose a greedy algorithm, and show that it is $O(p)$-competitive and that any deterministic online algorithm is $\Omega(p)$-competitive. Some of their techniques have proven particularly useful for our analysis, especially in the cases of linear and polynomial latencies.

1.5.2 Facility Location

Several variants of the offline facility location problem have been thoroughly explored [28, 13, 29, 22, 14, 24]. In the simplest setting, each server has an opening cost and the goal is to minimize the sum of the opening costs and the distance costs. For this case, it is easy to see that the problem is NP-hard for general distance functions, by a reduction from the Set-Cover problem. It is common to study the problem under the assumption that the distance function is a metric. This version is still NP-hard, but many constant-factor approximation algorithms were found for it. In fact, many techniques for approximating the uncapacitated facility location problem were developed. A recent book by Williamson and Shmoys [30] presents several ways to do so, and ultimately achieves a $(1 + \frac{2}{e})$-approximation for this problem. For the more general UniFL problem, a $(7.88 + \varepsilon)$-approximation was shown by Mahdian and Pal [25].
The online facility location problem was studied by Meyerson for metric distances [26]. He gave an $O(\log n)$-competitive algorithm under the assumption of random appearances, where $n$ is the number of locations of requests, together with a constant lower bound for any online algorithm. Alon et al. [3] explored it for non-metric distances in the context of a general framework for online network optimization problems, showing an $O(\log^2 k)$-competitive algorithm, where $k$ is the number of servers. As set cover is a special case, there is an implied lower bound of $\Omega((\log^2 k)/(\log \log k))$ [2].

1.5.3 Bipartite Matching

The bipartite matching problem (often called maximum cardinality bipartite matching) is a fairly well understood problem in its offline version, and can be solved optimally (e.g. using the Hungarian method [23]). Recall, though, that the problem we analyze is the capacitated version of the problem, where multiple requests can be assigned to a single server, and which can easily be shown to be NP-hard by a reduction from the knapsack problem.

The online version of the (standard) bipartite matching problem was introduced by Khuller et al. [21], and independently by Kalyanasundaram and Pruhs [20]. They showed a tight bound of $2k - 1$, by giving a $(2k - 1)$-competitive deterministic algorithm, where $k$ is the number of servers, and showing that no deterministic algorithm can get a competitive ratio below $2k - 1$. The lower bound holds even in the case of a unit metric, i.e. $w(e_{ij}) \in \{0, 1\}$ for all $(i, j) \in E$. Last, they showed that the greedy algorithm, which matches every request to the closest available server, has a competitive ratio of $2^k - 1$, i.e. it can be exponentially worse than the optimum.

In a recent breakthrough, Meyerson et al.
[27] gave the first randomized algorithm with an $O(\log^3 k)$ competitive ratio for general metrics; this is the first algorithm with a performance sublinear in $k$ for general metric spaces. They obtain this bound by utilizing embedding techniques of general metrics into hierarchically well-separated trees. Later on, Bansal et al. [8] improved the latter bound and obtained an $O(\log^2 k)$-competitive randomized algorithm. Bansal et al. [8] also gave an $\Omega(\log k)$ lower bound on the competitive ratio of any randomized algorithm, even on a unit metric. While our framework generalizes this problem, a significant portion of our techniques are similar to the ones in [8].

Chapter 2
Our Results

In this work we consider the online version of the LDB/UniFL problem. This means that requests appear over time, and must be assigned to some server(s) upon arrival. Once some demand is assigned, it cannot be moved and reassigned.

We start with some preliminaries in Section 3, and we make a significant observation: a generic lower bound, which in particular implies that no bounds can be offered if we consider all possible latency functions. This has a profound implication for our analysis, since it means that in order to provide any finite competitive ratio, we must turn our attention to specific subclasses of latency functions.

We proceed to discuss particular latency classes. While we discuss many different cases, there is a distinct dichotomy between two major subclasses: one is continuous latency classes, for which we suggest the greedy algorithm, and the other is the latency functions which are in the capacity form. The first we analyze thoroughly in Sections 4 and 5, and the second, which effectively is Bipartite Matching, is analyzed in Section 6. In the remainder of this section, we split the discussion between the two large classes.

2.1 Continuous Latencies and Greedy Algorithm

The algorithm we analyze is the greedy algorithm, denoted by Greedy. It is of utmost importance to analyze this algorithm since it constitutes the most natural approach to the problem. Upon arrival of a request, Greedy matches it in a way that minimizes the additional overall cost. It turns out that the implementation of the greedy algorithm is non-trivial and challenging for splittable demands, since there are infinitely many choices for feasible matchings of demands to servers. To this end, we engineer an efficient algorithm, WaterFill, and show that it implements Greedy. The proof relies on presenting a convex optimization problem which characterizes greedy actions, and showing that the output of WaterFill is an optimum point for it, using the Karush-Kuhn-Tucker (KKT) conditions.

We start with the most natural class, linear functions, in which the congestion increases linearly with respect to the load of the server. This is applicable to many settings in which demands share resources, such as CPU sharing. The second class is concave functions, in which the congestion cost increases with the number of requests, yet marginal costs shrink as the load grows. This captures many sublinear network congestion settings. We then proceed to examine polynomial functions, and lastly we consider arbitrary functions with bounded derivative. Overall, our work covers most natural latency function classes.

We show that Greedy (or its variation) provides surprisingly good performance: it is optimal up to a constant factor for all the cases we study. While Greedy is relatively simple in concept and implementation, we note that our analysis of it in the various cases is rather intricate.
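As an illustration of the greedy rule only, the following is a minimal sketch for the simpler non-splittable case, where a request must go to a single server and Greedy therefore only has to compare $k$ candidates. It assumes the cost model $C = \sum_s \ell_s f_s(\ell_s) + \sum_{r,s} x_{rs} w_{rs}$ used in this work; the function names `marginal_cost` and `greedy_assign` are hypothetical and are not taken from the thesis, and the sketch is not the WaterFill algorithm, which handles splittable demands.

```python
from typing import Callable, List, Sequence


def marginal_cost(load: float, demand: float, dist: float,
                  latency: Callable[[float], float]) -> float:
    """Increase in the overall cost when `demand` units join a server.

    The overall cost of a server with load l is l * f(l), so the increase
    also accounts for the extra latency felt by demand already on the server.
    """
    new_load = load + demand
    return demand * dist + new_load * latency(new_load) - load * latency(load)


def greedy_assign(loads: List[float], demand: float, dists: Sequence[float],
                  latencies: Sequence[Callable[[float], float]]) -> int:
    """Assign a non-splittable request to the server with the least marginal cost.

    Updates `loads` in place and returns the index of the chosen server.
    Ties are broken in favor of the lowest index (an arbitrary choice).
    """
    costs = [marginal_cost(loads[s], demand, dists[s], latencies[s])
             for s in range(len(loads))]
    best = min(range(len(loads)), key=costs.__getitem__)
    loads[best] += demand
    return best


if __name__ == "__main__":
    # Two servers with linear latency f(l) = l; server 0 is closer.
    loads = [0.0, 0.0]
    latencies = [lambda l: l, lambda l: l]
    print(greedy_assign(loads, 1.0, [0.0, 1.0], latencies))  # picks server 0
    print(greedy_assign(loads, 1.0, [0.0, 1.0], latencies))  # picks server 1
```

In the second call the closer server is already loaded, so its marginal cost (3) exceeds that of the farther, empty server (2), which shows how the rule trades distance against congestion.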
It is easy to see that Greedy is 1-competitive for constant latencies: if server $s$ has constant latency $\alpha_s$, for each request we can simply rewrite $w'_{rs} = w_{rs} + \alpha_s$, and the optimal solution assigns request $r$ to the server $s$ minimizing $w'_{rs}$, which is exactly what Greedy does.

Our results are as follows. We show that Greedy is 5.83-competitive for linear latencies, 9-competitive for concave latencies, and $O(p)^{O(p)}$-competitive for polynomial latencies of degree $p$. The case of concave latencies requires a particularly challenging analysis, since the derivative of $f_s$ is unbounded. Recall that our framework generalizes average load balancing, where the cost is $\sum_s \ell_s f_s(\ell_s)$. Our result for concave latencies thus implies that the greedy assignment is $\Theta(1)$-competitive for the problem of minimizing the average latency in load balancing (with no distance cost). To our knowledge, there is no previous study that shows this bound.

For bounded-slope latencies (latencies which have a bounded derivative), we slightly adjust our algorithm and introduce HighGreedy. We show that HighGreedy is $\Theta(\gamma)$-competitive, where $\gamma$ is the maximum, over all latency functions, of the ratio between the maximum and the minimum derivative of a single latency function. We complement our results for polynomial and bounded-slope latencies with matching lower bounds. Table 2.1 summarizes the performance of our algorithms.

Class         | Algorithm  | Competitive Ratio
Linear        | Greedy     | 5.83
Concave       | Greedy     | 9
Polynomial    | Greedy     | $\Theta(p)^{\Theta(p)}$
Bounded-Slope | HighGreedy | $\Theta(\gamma)$

Table 2.1: Greedy Performance

All of our results are applicable to both splittable and non-splittable demands.

2.2 Bipartite Matching

We first show that for arbitrary edge-distances, no online algorithm can have any finite competitive ratio, both for splittable and non-splittable demands. Then we show that for non-splittable demands, no online algorithm can have any finite competitive ratio, even with metric edge-distances.
In view of these negative results, we proceed to examine metric edge-distances with splittable demands. For this case, we present an $O(\log^2 k)$-competitive online algorithm, where $k$ is the number of servers, and we complement this result by showing a lower bound of $\Omega(\log k)$. Notice that the bounds are independent of the size of the capacities and demands.

Our results parallel the ones in [8], and we use similar techniques: namely, we use an embedding into a special type of trees called Hierarchically Well-Separated Trees [15], which we formally define later. We present a simpler proof of the bound shown in the original paper, and we then adapt it to our case. Table 2.2 shows a summary of online performance guarantees for the different cases.

          | Splittable                      | Non-Splittable
Nonmetric | $\infty$                        | $\infty$
Metric    | $O(\log^2 k)$, $\Omega(\log k)$ | $\infty$

Table 2.2: Bipartite Matching Performance

Chapter 3
Load-Distance Balancing

3.1 Problem Setting

Consider a graph $G = (V, E)$, where $S \subseteq V$ denotes the set of servers, and $R \subseteq V$ the set of requests. Denote $n = |V|$, $k = |S|$. Define a non-negative distance function $w$ between requests and servers, and denote the distance between $r \in R$ and $s \in S$ by $w_{rs}$. The function $w_{rs}$ is not necessarily subject to the triangle inequality. Also, for convenience, let $G$ be a complete graph, and set $w_{rs} = \infty$ whenever $(r, s) \notin E$.

Each request $r$ has a demand $d_r$, which must be matched to a set of servers. A matching is a function over the edges $x : R \times S \to \mathbb{R}$. We say that a matching satisfies the demand of a request $r$ if the total amount matched to the servers is equal to the demand, i.e. $\sum_{s \in S} x_{rs} = d_r$.

Let $A$ be an algorithm computing a matching $x$. Let $\ell_s(x)$ denote the load on server $s$ induced by $x$, i.e. the total amount of demand matched to it: $\ell_s(x) = \sum_{r \in R} x_{rs}$. Where the meaning is clear from the context, we use the shorthand $\ell_s$.
Each server $s$ is associated with a latency function $f_s$. When a unit of demand is matched between $r$ and $s$, it incurs a latency cost of $f_s(\ell_s)$, as well as a distance cost of $w_{rs}$. Thus, the overall cost $C(A)$ is the sum of the costs experienced by all requests; namely,

$$C(A) = \sum_{r \in R} \sum_{s \in S} x_{rs} \left( w_{rs} + f_s(\ell_s) \right).$$

The cost can also be written as the sum $\Gamma_A$ of the latency costs over all the servers, plus the sum $\Delta_A$ of the distance costs over the edges. Namely,

$$C(A) = \Gamma_A + \Delta_A = \sum_{s \in S} \ell_s f_s(\ell_s) + \sum_{r \in R, s \in S} x_{rs} w_{rs}.$$

Written this way, it is easy to see that this is an instance of the Universal Facility Location (UniFL) problem, where cities are requests, facilities are servers, and the cost function is $F_s(\ell_s) = \ell_s f_s(\ell_s)$.

We explore both splittable and non-splittable assignments of demand. For the latter version we require $x_{rs} \in \{0, d_r\}$.

3.2 Generic Lower Bound

Theorem 3.2.1. Let $F$ be a class of functions. For the LDB problem restricted to the latencies in this class, no online algorithm can provide a competitive ratio better than

$$\sup_{f \in F,\, x} \frac{3 f(1.5x)}{8 f(x)}.$$

Proof. Consider Figure 3.1 for the following discussion. Let $f$ and $x$ be the function and value that attain the supremum, or come arbitrarily close to it. The three vertices at the bottom are requests, each with demand $x$. The two on top are servers with the latency function $f$. Shown edges have zero cost, and missing edges have infinite cost.

[Figure 3.1: Generic Lower Bound Example. Servers $s_1$, $s_2$ on top; requests $r_1$, $r_2$, $r_3$ at the bottom.]

We use Yao's minimax principle [31, 10]. Consider a problem over the inputs $\chi$, and let $\mathcal{A}$ be the set of all deterministic algorithms that solve the problem. Let $C(a, x) \ge 0$ denote the cost of algorithm $a \in \mathcal{A}$ on input $x \in \chi$.
A randomized algorithm $A$ is defined by a probability distribution $p$ over $\mathcal{A}$, and a random input $X$ is defined by a probability distribution $q$ over $\chi$. Then, Yao's principle states that

$$\max_{x \in \chi} E[C(A, x)] \ge \min_{a \in \mathcal{A}} E[C(a, X)],$$

i.e., the worst-case expected cost of the randomized algorithm is at least the expected cost of the best deterministic algorithm against the input distribution $q$. Using this principle, we consider only deterministic algorithms, and give a distribution over inputs.

The first request arrives at $r_2$. The second request arrives uniformly at random at $r_1$ or $r_3$. However the first request was matched, one of the two servers carries a load of at least $0.5x$. Hence, with probability $p \ge \frac{1}{2}$, the second request appears at a server with a load of at least $0.5x$, and at the end the more loaded server has $\ell_s \ge 1.5x$, for a total cost of at least $1.5x \cdot f(1.5x)$. Therefore, any online algorithm has an expected cost of at least $0.75x \cdot f(1.5x)$. In contrast, the offline algorithm can ensure $\ell_{s_1} = \ell_{s_2} = x$, for a total cost of $2x \cdot f(x)$. The ratio between the two costs is $3f(1.5x) / (8f(x))$.

This has vast implications for our analysis. Some examples of functions $f$ for which $f(1.5x)/f(x)$ is unbounded:

• Any function that has both a zero value and a non-zero value: since $f$ is non-decreasing, if $f(x_1) = 0$ and $f(x_2) > 0$, then there exists $\xi$ such that $f(\xi) = 0$ and $f(\xi + \epsilon) > 0$. Then $f(1.5\xi)/f(\xi)$ is unbounded.

• Any function that has both a finite and an "infinite" value. In reality such functions do not exist, but in Computer Science we can often treat disproportionately large numbers as infinite. Similarly to the above, we can find the point at which the function explodes. In particular, this includes any form of hard capacities and bipartite matching.

This means that we cannot analyze the problem without restricting ourselves to a class of functions. Even within particular classes of functions, the above example is useful for showing lower bounds. Another helpful direction of analysis is to consider the special case of metric distances.
The poor performance of online algorithms on the above instance stems from the fact that while some request/server pairs are close, some are infinitely distant from each other. This can be remedied by restricting ourselves to metric cases.

3.3 Deterministic and Randomized Algorithms

Theorem 3.3.1. For the LDB problem with splittable demands and almost convex latency functions, for every randomized algorithm $A_R$ there exists a deterministic algorithm $A_D$ such that $C(A_D) \le C(A_R)$ for any input sequence.

Proof. The intuition is that if we can split demand, we have no reason to resort to randomization: rather than giving a few possible matchings as output, we can simply give their weighted average.

Formally, let $A_R$ be a randomized algorithm. We define a deterministic algorithm $A_D$ as follows. When a request $r$ arrives, $A_R$ assigns $x_{rs}$ for every $s \in S$. Let $\tilde{x}_{rs}$ be the random variable describing this assignment, drawn from the distribution that $A_R$ uses to select the matching. $A_D$ outputs $x^*_{rs} = E[\tilde{x}_{rs}]$. By linearity of expectation, this indeed satisfies the demand of $r$.

Now, let us compare the costs of $A_R$ and $A_D$. Recall that the cost comprises two parts, the distance cost ($\Delta$) and the congestion cost ($\Gamma$). For $A_R$, these denote the expectations of the respective costs. By linearity of expectation, we have:

$$\Delta_{A_D} = \sum_{r \in R, s \in S} E[\tilde{x}_{rs}] \cdot w_{rs} = E\left[ \sum_{r \in R, s \in S} \tilde{x}_{rs} w_{rs} \right] = \Delta_{A_R}.$$

Also by linearity of expectation, $E[\ell_s(\tilde{x})] = E[\sum_r \tilde{x}_{rs}] = \sum_r E[\tilde{x}_{rs}] = \sum_r x^*_{rs} = \ell_s(x^*)$. Since $f_s$ is almost convex, we can use it together with Jensen's inequality to obtain for every $s$:

$$\ell_s(x^*) \cdot f_s(\ell_s(x^*)) = \ell_s(E[\tilde{x}]) \cdot f_s(\ell_s(E[\tilde{x}])) \le E[\ell_s(\tilde{x}) \cdot f_s(\ell_s(\tilde{x}))].$$

Summing over all servers, we find that $\Gamma_{A_D} \le \Gamma_{A_R}$. We also have $\Delta_{A_D} = \Delta_{A_R}$, and therefore $C(A_D) \le C(A_R)$.

Chapter 4
Load-Distance Balancing: Greedy Implementation

In this section we discuss the implementation of Greedy.
For non-splittable demands, Greedy is fairly straightforward: we simply select the best among the $k$ servers for each request upon arrival. For splittable demands, we cannot iterate over all possible solutions. Thus, we characterize the Greedy output, and introduce WaterFill, which satisfies the characterization.

4.1 Height Preservation

We start with two definitions:

• Given a request $r$, we define the height of server $s$ (with respect to $r$), $\Phi_s$, as the marginal cost of matching an additional infinitesimal demand from $r$ to $s$:

$$\Phi_s = \lim_{x_{rs} \to 0} \frac{C(x + x_{rs}) - C(x)}{x_{rs}} = \frac{\partial C(x + x_{rs})}{\partial x_{rs}} = \ell_s f_s'(\ell_s) + f_s(\ell_s) + w_{rs}.$$

• Let $A$ be an online algorithm. Given a request $r$, denote by $S_A$ the set of servers to which $A$ has matched $r$, i.e. those with $x_{rs} > 0$. We say that $A$ is height-preserving if $\Phi_{\bar{s}} \le \Phi_s$ for all $\bar{s} \in S_A$, $s \in S$.

Intuitively, a height-preserving algorithm adds demand from $r$ to $s$ only if $s$ has minimal height with respect to $r$.

Theorem 4.1.1. Any height-preserving algorithm $A$ is greedy, i.e. it produces the output which minimizes the cost given the previous assignment.

Proof. Let $x^*_r$ be $A$'s output. We write the problem of implementing Greedy as an optimization problem. Let the servers be $S = \{s_1, \ldots, s_k\}$. Our variables are $x_{rs_1}, \ldots, x_{rs_k}$.

minimize $f(x_r) = C(x + x_r)$
subject to $g_i(x_r) = -x_{rs_i} \le 0$, $i = 1, \ldots, k$
$h_1(x_r) = \sum_{i=1}^{k} x_{rs_i} - d_r = 0$

We now prove that $x^*_r$ is an optimum of the above optimization problem, by showing that it satisfies the Karush-Kuhn-Tucker (KKT) conditions. These are sufficient for optimality in our case, since $g_i$ and $h_1$ are affine, and the only stationary point of $f$, $x = 0$, is a global minimum.

Recall that the KKT conditions require selecting $\mu_i$ for every $g_i$, and $\lambda_j$ for every $h_j$. Denote $\Phi^* = \min_s \{\Phi_s\}$, and let us select $\lambda = -\Phi^*$ (we have only one equality constraint) and $\mu_i = \Phi_{s_i} - \Phi^*$. Now, we verify that the KKT conditions hold:

• Stationarity.
For every variable $x_{rs_i}$:

$$\frac{\partial f}{\partial x_{rs_i}}(x^*) + \sum_{j=1}^{k} \mu_j \frac{\partial g_j}{\partial x_{rs_i}}(x^*) + \lambda \frac{\partial h_1}{\partial x_{rs_i}}(x^*) = 0.$$

In our case, $\frac{\partial f}{\partial x_{rs_i}} = \Phi_{s_i}$, $\frac{\partial g_j}{\partial x_{rs_i}} = -\delta_{ij}$ (Kronecker delta), and $\frac{\partial h_1}{\partial x_{rs_i}} = 1$. Therefore our equation becomes $\Phi_{s_i} - \mu_i + \lambda = 0$, which holds by the definitions of $\lambda$ and $\mu$.

• Primal feasibility. Follows from the legality of the output.

• Dual feasibility. $\mu_i = \Phi_{s_i} - \Phi^* \ge 0$.

• Complementary slackness. $\mu_i \cdot g_i(x) = 0$. If $s_i \in S_A$, then $\Phi_{s_i} = \Phi^*$, and so $\mu_i = 0$. Otherwise, $s_i \notin S_A$, and so $g_i(x) = -x_{rs_i} = 0$.

Since all KKT conditions hold, $x^*_r$ is a solution to the convex program.

4.2 WaterFill

We now introduce WaterFill and prove that it is height-preserving. Let us define $\delta_s(\epsilon)$ as the amount of demand that needs to be added to server $s$ in order to increase its height by $\epsilon$.

Intuitively, what WaterFill does is simple: in every iteration it computes the "lowest" server (or set of servers), and adds demand to each one of them up to the point where either the demand is satisfied or the servers reach the height of the servers that are next up. The demand added to each server is computed so that at the end of the iteration, the heights of all the "lowest" servers remain equal to one another.

Algorithm 1 WaterFill(r)
Require: A request $r$
Ensure: $r$ is matched.
while $d_r > 0$ do
    Compute $\Phi_s$ for all servers.
    Let $h \leftarrow \min_{s \in S} \{\Phi_s\}$, and let $S^*$ be the set of servers that achieve the minimum.
    Find the height of the servers that are next up, $h' \leftarrow \min_{s \in S \setminus S^*} \{\Phi_s\}$.
    Select the largest $\epsilon$ such that $\sum_{s \in S^*} \delta_s(\epsilon) \le d_r$ and $\epsilon \le h' - h$.
    for $s \in S^*$ do
        $x_{rs} \leftarrow x_{rs} + \delta_s(\epsilon)$
        $d_r \leftarrow d_r - \delta_s(\epsilon)$
    end for
end while

Theorem 4.2.1. WaterFill is height-preserving.

Proof. Let $\bar{s} \in S_{WF}$, and let $s \in S$ be any server. If we added any demand to $\bar{s}$, then at some point it was in $S^*$, so at that point we had $\Phi_{\bar{s}} \le \Phi_s$. From here on, observe any step in which $\Phi_{\bar{s}}$ increases. At this step $\bar{s} \in S^*$. Consider two cases:

1.
$s \in S^*$: then at the beginning of the step $\Phi_{\bar{s}} = \Phi_s$. Since we increased both $\Phi_{\bar{s}}$ and $\Phi_s$ by $\epsilon$, equality is preserved at the end of the step.

2. $s \notin S^*$: then $\Phi_s \ge h'$. Since $\Phi_{\bar{s}}$ started at $h$ and was increased by $\epsilon \le h' - h$, the inequality is preserved.

Thus, at the end of the step $\Phi_{\bar{s}} \le \Phi_s$.

The computation of $\epsilon$ and $\delta_s$ is left implicit in the algorithm. For linear latencies, the height becomes $\Phi_s = 2\alpha_s \ell_s(x) + w_{rs}$, and therefore $\delta_s(\epsilon) = \epsilon / 2\alpha_s$. With this, we have $d_r \ge \sum_{s \in S^*} \delta_s(\epsilon) = \epsilon \sum_{s \in S^*} \frac{1}{2\alpha_s}$, and thus we can compute $\epsilon$:

$$\epsilon \leftarrow \min \left\{ d_r \left( \sum_{s \in S^*} \frac{1}{2\alpha_s} \right)^{-1},\; h' - h \right\}.$$

For arbitrary latencies, however, computing $\epsilon$ and the $\delta_s$ values might be more difficult. Still, simple numerical methods should suffice to compute them efficiently to any desired accuracy.

Chapter 5
Load-Distance Balancing: Greedy Performance

In this section we analyze the performance of the greedy algorithm with various classes of latencies: linear, polynomial, concave and bounded-slope. We start with some definitions and lemmas that help us throughout the work:

• $X_r$ denotes the set of all the matchings satisfying request $r$.

• $x + x'$ denotes the union of matchings $x$ and $x'$.

• An online algorithm $A$ is greedy if, given a current matching $x$ and a new input request $r$, it matches $r$ while minimizing the impact on the cost: $x^*_r = \arg\min_{x_r \in X_r} \{C(x + x_r) - C(x)\}$. We write Greedy or GRE for short.

We observe that we can restrict ourselves to latency functions with $f_s(0) = 0$; otherwise we can define $\hat{f}_s = f_s - f_s(0)$, and adjust distances accordingly.

The following algebraic manipulation is used throughout the work.

Lemma 5.0.2. For $\alpha, \beta \ge 0$, if $C(ALG) \le \alpha \cdot C(OPT) + \beta \cdot \sqrt{C(ALG) \cdot C(OPT)}$, then the competitive ratio of ALG is at most $\alpha + \frac{1}{2}\left(\beta^2 + \beta\sqrt{\beta^2 + 4\alpha}\right)$.

Proof.
Defining $\rho = \sqrt{C(ALG)/C(OPT)}$ and dividing both sides by $C(OPT)$:

$$\rho^2 \le \alpha + \beta\rho$$
$$\rho \le \frac{\beta + \sqrt{\beta^2 + 4\alpha}}{2}$$
$$\rho^2 \le \frac{2\beta^2 + 4\alpha + 2\beta\sqrt{\beta^2 + 4\alpha}}{4} = \alpha + \frac{\beta^2 + \beta\sqrt{\beta^2 + 4\alpha}}{2}$$

Lemma 5.0.3. Let $\ell_{rs}$ be the load on server $s$ at the time of arrival of request $r$, and let $x_{rs}$ be a legal matching (in what follows, the matching of OPT). There exist values $\lambda_{rs} \in [0, 1]$ such that:

$$C(GRE) \le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ \ell_{rs} f_s'(\ell_{rs} + \lambda_{rs} x_{rs}) + f_s(\ell_{rs} + x_{rs}) \right]$$

Proof. Let $\bar{x}$ be the matching of Greedy until the arrival of request $r$. Let $\bar{x}_r$ be the additional matching. By the definition of greediness:

$$C(\bar{x} + \bar{x}_r) - C(\bar{x}) \le C(\bar{x} + x_r) - C(\bar{x})$$
$$= \sum_{s \in S} \left[ (\ell_{rs} + x_{rs}) f_s(\ell_{rs} + x_{rs}) - \ell_{rs} f_s(\ell_{rs}) + w_{rs} x_{rs} \right]$$
$$= \sum_{s \in S} \left[ \ell_{rs} [f_s(\ell_{rs} + x_{rs}) - f_s(\ell_{rs})] + x_{rs} f_s(\ell_{rs} + x_{rs}) + w_{rs} x_{rs} \right]$$
$$= \sum_{s \in S} \left[ x_{rs} [\ell_{rs} f_s'(\ell_{rs} + \lambda_{rs} x_{rs}) + f_s(\ell_{rs} + x_{rs})] + w_{rs} x_{rs} \right]$$

where the last step uses the mean-value theorem. Summing over all requests, the left-hand side telescopes to $C(GRE)$. Summing first by servers, then by requests on the right-hand side, we get the desired result.

5.1 Linear Latencies

In this section, we analyze latency functions of the form $f_s(\ell_s) = \alpha_s \ell_s$, where $\alpha_s$ is a constant for every server.

Theorem 5.1.1. For linear latencies, Greedy is 5.83-competitive.

Proof. Denote by $\ell_s$ the load of $s$ at the end of the algorithm, and by $\ell^*_s$ the load of $s$ in the optimal matching. Using Lemma 5.0.3:

$$C(GRE) \le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ \ell_{rs} \alpha_s + \alpha_s (\ell_{rs} + x_{rs}) \right]$$
$$\le \Delta_{OPT} + \sum_{s \in S} \alpha_s \sum_{r \in R} (x_{rs})^2 + \sum_{s \in S} 2\alpha_s \ell_s \sum_{r \in R} x_{rs}$$
$$\le \Delta_{OPT} + \Gamma_{OPT} + 2 \sum_{s \in S} \alpha_s \ell_s \ell^*_s \le C(OPT) + 2\sqrt{\Gamma_{OPT} \Gamma_{GRE}}.$$

The first step follows since $\ell_{rs} \le \ell_s$ for all $r$, and $f_s' = \alpha_s$. The second step uses $\sum_r x_{rs} = \ell^*_s$ and the convexity of $x^2$, which gives $\sum x_i^2 \le (\sum x_i)^2$ for non-negative $x_i$. In the last step, we use the Cauchy-Schwarz inequality with the vectors $\vec{a}_s = \sqrt{\alpha_s} \cdot \ell_s$ and $\vec{b}_s = \sqrt{\alpha_s} \cdot \ell^*_s$. The term in the summation becomes $\langle \vec{a}, \vec{b} \rangle$, and it is bounded by $\|\vec{a}\| \cdot \|\vec{b}\|$.
Using Lemma 5.0.2 with $\alpha = 1$ and $\beta = 2$, it follows that the competitive ratio is at most $3 + 2\sqrt{2} \approx 5.83$.

5.2 Polynomial Latencies

In this section, we analyze latency functions of the form $f_s(\ell_s) = \ell_s^p$ for some integer $p$. The results of this section can easily be generalized to all polynomials.

5.2.1 Upper Bound

We start by showing two algebraic properties. The first is derived in a straightforward manner from Hölder's inequality. The second easily follows from Lemma 4.1 in [5].

Lemma 5.2.1. For all $p \ge 1$,

$$\sum_s \ell_s^p \ell^*_s \le \left( \Gamma_{GRE}^p \, \Gamma_{OPT} \right)^{\frac{1}{p+1}}.$$

Proof. Let $\alpha, \beta$ be any two non-negative vectors and $t$ any positive integer. We use Hölder's inequality, which states that $\langle \vec{a}, \vec{b} \rangle \le \|\vec{a}\|_x \|\vec{b}\|_y$ for any $1 \le x, y < \infty$ satisfying $\frac{1}{x} + \frac{1}{y} = 1$. We substitute $x = t + 1$ and $y = (t + 1)/t$, with $a_i = \beta_i$ and $b_i = \alpha_i^t$:

$$\sum_i \alpha_i^t \beta_i \le \left( \sum_i (\alpha_i^t)^{\frac{t+1}{t}} \right)^{\frac{t}{t+1}} \left( \sum_i \beta_i^{t+1} \right)^{\frac{1}{t+1}} = \left[ \left( \sum_i \alpha_i^{t+1} \right)^t \sum_i \beta_i^{t+1} \right]^{1/(t+1)}$$

Substituting $\alpha = \ell$, $\beta = \ell^*$ and $t = p$, and recalling that for $f_s(\ell_s) = \ell_s^p$ we have $\Gamma_{GRE} = \sum_s \ell_s^{p+1}$ and $\Gamma_{OPT} = \sum_s (\ell^*_s)^{p+1}$, we achieve the desired result.

Lemma 5.2.2. For any three integers $a, b, t > 0$: $(a + b)^t \le e a^t + (b(t + 1))^t$.

Proof. Lemma 4.1 in [5] states that:

$$(a + b)^{p-1} \le c \, a^{p-1} + \left( b \left( \frac{p-1}{\ln c} + 1 \right) \right)^{p-1}$$

By setting $c = e$ and $t = p - 1$, we achieve the desired result.

Theorem 5.2.3. For degree-$p$ polynomial latencies, Greedy is $O(p)^{O(p)}$-competitive.

Proof. We use Lemma 5.0.3, and the fact that both $f_s$ and $f_s'$ are non-decreasing:

$$C(GRE) \le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ \ell_s f_s'(\ell_s + x_{rs}) + f_s(\ell_s + x_{rs}) \right]$$
$$= \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ \ell_s \, p (\ell_s + x_{rs})^{p-1} + (\ell_s + x_{rs})^p \right]$$
$$\le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} (p + 1)(\ell_s + x_{rs})^p$$

We apply first Lemma 5.2.2 and then Lemma 5.2.1 to find:

$$C(GRE) \le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} (p + 1) e \, \ell_s^p + \sum_{s \in S} \sum_{r \in R} [(p + 1) x_{rs}]^{p+1}$$
$$\le \Delta_{OPT} + e(p + 1) \sum_{s \in S} \ell_s^p \ell^*_s + (p + 1)^{p+1} \Gamma_{OPT}$$
$$\le (p + 1)^{p+1} C(OPT) + e(p + 1) \left( \Gamma_{GRE}^p \, \Gamma_{OPT} \right)^{\frac{1}{p+1}} \qquad (*)$$

If the left term is dominant in the right-hand side, we are done.
Otherwise:

$$\Gamma_{GRE} \le 2e(p + 1) \left( \Gamma_{GRE}^p \, \Gamma_{OPT} \right)^{\frac{1}{p+1}}$$
$$\Gamma_{GRE}^{p+1} \le [2e(p + 1)]^{p+1} \, \Gamma_{GRE}^p \, \Gamma_{OPT}$$
$$\Gamma_{GRE} \le [2e(p + 1)]^{p+1} \, \Gamma_{OPT}$$

Plugging this back into (*):

$$C(GRE) \le (p + 1)^{p+1} C(OPT) + e(p + 1) \left( [2e(p + 1)]^{(p+1)p} \, \Gamma_{OPT}^{p+1} \right)^{\frac{1}{p+1}}$$
$$\le (p + 1)^{p+1} C(OPT) + [2e(p + 1)]^{p+1} C(OPT) = O(p)^{O(p)} C(OPT)$$

5.2.2 Lower Bound

Section 3 of [5] gives an example which yields a lower bound of $\Omega(p)$ for $L_p$-norm load balancing. In $L_{p+1}$-norm load balancing, the cost is:

$$C'(ALG) = \left( \sum_{s \in S} \ell_s^{p+1} \right)^{\frac{1}{p+1}}.$$

For the same example, if we set the polynomials $f_s(\ell_s) = \ell_s^p$, we have $C(ALG) = C'(ALG)^{p+1}$. Therefore:

$$C(ALG) = C'(ALG)^{p+1} = \left[ \Omega(p) \, C'(OPT) \right]^{p+1} = \Omega(p)^{\Omega(p)} C(OPT).$$

5.3 Concave Latencies

In this section we analyze concave latency functions. We start with a property of such functions:

Lemma 5.3.1. For a concave $f$, $x f'(x) \le f(x)$.

Proof. Recall that $f(0) = 0$. The mean-value theorem states that there exists $\xi' \in [0, x]$ such that $f'(\xi') = f(x)/x$. Since $f'$ is decreasing, $f'(x) \le f'(\xi') = f(x)/x$. Multiplying by $x$ yields the desired result.

Lemma 5.3.2. $C(GRE) \le C(OPT) + 2 \sum_{s \in S} f_s(\ell_s) \ell^*_s$

Proof. We use Lemma 5.0.3, and the property above:

$$C(GRE) \le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ \ell_s f_s'(\ell_s) + f_s(\ell_s + x_{rs}) \right]$$
$$\le \Delta_{OPT} + \sum_{s \in S} \sum_{r \in R} x_{rs} \left[ f_s(\ell_s) + f_s(\ell_s + \ell^*_s) \right]$$
$$\le \Delta_{OPT} + \sum_{s \in S} \left[ 2 f_s(\ell_s) + f_s(\ell^*_s) \right] \sum_{r \in R} x_{rs}$$
$$\le C(OPT) + 2 \sum_{s \in S} f_s(\ell_s) \ell^*_s$$

Notice that we arrived at the same point in the analysis of linear latencies, where the term in the summation was $\alpha_s \ell_s \ell^*_s$. In that case, since the impact of the value inside $f$ and outside $f$ was identical, we were able to apply the Cauchy-Schwarz inequality. Unfortunately, in this case we cannot apply the same technique directly, and we must somehow produce $f_s(\ell^*_s)$ on the right-hand side. At this point, we take notice of an interesting property of concave functions.

Lemma 5.3.3. For a concave and non-negative $f$ with $f(0) = 0$, and $0 \le a \le b$: $b f(a) \ge a f(b)$.

Proof.
Plug $y = 0$, for which $f(y) = 0$, into the standard definition of concavity. For $t \in [0, 1]$:

$$f(tb + (1 - t)y) \ge t f(b) + (1 - t) f(y) \;\Rightarrow\; f(tb) \ge t f(b).$$

We know that $a \le b$, and so $t = a/b \in [0, 1]$. Using this:

$$\frac{f(b)}{f(a)} = \frac{f(b)}{f\left(\frac{a}{b} b\right)} \le \frac{f(b)}{\frac{a}{b} f(b)} = \frac{b}{a}.$$

Multiplying both sides by $a$ and $f(a)$, we achieve the desired result.

[Figure 5.1: Illustration for Lemma 5.3.3. The curve $f$ with the points $a$, $b$ on the horizontal axis and $f(a)$, $f(b)$ on the vertical axis, delimiting regions 1, 2 and 3.]

For a visual illustration, consider Figure 5.1, where $b f(a) > a f(b)$ since sections 1+3 are larger than 1+2.

Theorem 5.3.4. Greedy is 9-competitive over concave latency functions.

Proof. Let $S_1$ be the set of servers for which $\ell_s < \ell^*_s$, and $S_2$ the remaining servers. Using Lemma 5.3.3, for $s \in S_2$, $\ell^*_s f(\ell_s) \le \ell_s f(\ell^*_s)$. We have:

$$\sum_{s \in S} f_s(\ell_s) \ell^*_s = \sum_{s \in S_1} f_s(\ell_s) \ell^*_s + \sum_{s \in S_2} f_s(\ell_s) \ell^*_s \le \Gamma_{OPT} + \sum_{s \in S_2} f_s(\ell_s) \ell^*_s$$
$$= \Gamma_{OPT} + \sum_{s \in S_2} \sqrt{(\ell^*_s f(\ell_s))^2} \le \Gamma_{OPT} + \sum_{s \in S_2} \sqrt{\ell^*_s f(\ell^*_s)} \sqrt{\ell_s f(\ell_s)}$$
$$\le \Gamma_{OPT} + \sum_{s \in S} \sqrt{\ell^*_s f(\ell^*_s)} \sqrt{\ell_s f(\ell_s)} \le \Gamma_{OPT} + \sqrt{\Gamma_{OPT} \Gamma_{GRE}}$$

The last step is due to the Cauchy-Schwarz inequality. Plugging into Lemma 5.3.2:

$$C(GRE) \le 3 C(OPT) + 2\sqrt{\Gamma_{OPT} \Gamma_{GRE}}$$

Using Lemma 5.0.2 with $\alpha = 3$ and $\beta = 2$, we conclude that Greedy is 9-competitive.

5.4 Bounded-Slope Latencies

In this section we examine latency functions with bounded slope; that is, for every $s$ there exist $0 < \alpha_s, \beta_s < \infty$ such that $\beta_s \le f_s'(x) \le \alpha_s$. Furthermore, every $f_s$ is continuously differentiable and almost convex. Notice that most reasonable non-decreasing functions have these properties. Denote by $\gamma$ the greatest ratio between $\alpha_s$ and $\beta_s$ over all servers, i.e., $\gamma = \sup_s \{\alpha_s / \beta_s\}$.

5.4.1 Upper Bound

For this class of latencies, we use a slightly different algorithm. We define the high-cost $C'$ of a matching $x$ as the cost when we adjust the latencies to $f_s(\ell_s) = \alpha_s \ell_s$. We define HighGreedy as the algorithm that minimizes the increase in the high-cost for every request.
We will show that HighGreedy is $O(\gamma)$-competitive, and that no better online algorithm exists.

Theorem 5.4.1. HighGreedy is $O(\gamma)$-competitive for bounded-slope latencies.

Proof. We start with the following observation:

$$\beta_s x \le f_s(x) \le \alpha_s x \le \gamma \beta_s x \le \gamma f_s(x)$$

Therefore, for any algorithm $A$: $C(A) \le C'(A) \le \gamma C(A)$.

In the high-cost model, latencies are linear. Therefore, by Theorem 5.1.1, HighGreedy is 5.83-competitive with respect to the high-cost. Again using our observation:

$$C(HiGre) \le C'(HiGre) \le 5.83 \cdot C'(OPT) \le 5.83 \cdot \gamma \cdot C(OPT).$$

5.4.2 Lower Bound

We obtain the lower bound for bounded-slope latencies by applying the result of Section 3.2 with the following latency function:

$$f_s(x) = \begin{cases} \gamma(x - 1) + 1, & \text{if } x > 1 \\ x, & \text{otherwise} \end{cases}$$

The lower bound is thus $\Omega(f_s(1.5)/f_s(1)) = \Omega(\gamma)$.

It is also possible to see that Greedy is $O(\gamma^2)$-competitive for bounded-slope latencies, by repeating the proof of Theorem 5.1.1 and using the linear bounds of Theorem 5.4.1. We do not have an instance which shows that the competitive ratio of Greedy is $\Omega(\gamma^2)$, so it is still possible that a careful analysis may show that it performs as well as HighGreedy.

Chapter 6
Bipartite Matching

In this section we examine the Load-Distance Balancing problem with capacity functions. This means that instead of having a latency function, each server has a capacity which it cannot exceed, represented by an infinite value in the latency function. As described in Section 1.5.3, this effectively reduces the problem to bipartite matching with demands.

6.1 Negative Results

We now show that if we have either nonmetric distances or non-splittable demands, then no online algorithm can have a finite competitive ratio.

6.1.1 Non-metric Distances

Theorem 6.1.1. For the LDB problem with capacity functions and general (i.e.
not necessarily metric) distance functions, there is no (deterministic or randomized) online algorithm that can guarantee any bounded competitive ratio, neither for the splittable nor for the non-splittable case.

Proof. Unfortunately, we cannot use Theorem 3.3.1, since we would like to prove the theorem for both splittable and non-splittable demands. We use Yao's minimax principle (see Section 3.2): rather than examining all randomized algorithms, we present a probability distribution over inputs for which no deterministic algorithm can guarantee any bounded ratio.

[Figure 6.1: Non-Metric Lower Bound Example. Requests $r_1$, $r_2$, $r_3$ and servers $s_1, \ldots, s_4$, with edge costs 0 and 1.]

Consider Figure 6.1: all requests have unit demand, and all servers have unit capacity. The first request appears at $r_2$, and the second one appears uniformly at random at one of $r_1$, $r_3$.

When the online algorithm $A$ gets the first request at $r_2$, it matches some demand $x_{22}$ to $s_2$. The rest, $x_{23} = 1 - x_{22}$, is matched to $s_3$, at no cost. The second request exhausts all the remaining capacity in one of the free servers ($s_2$, $s_3$), and the remaining demand is matched to a paid server ($s_1$, $s_4$), at a cost of 1 per unit. The expected cost is thus:

$$C(A) = \frac{1}{2}(1 - x_{22}) + \frac{1}{2} x_{22} = \frac{1}{2}.$$

Both in the splittable and in the non-splittable model, the offline algorithm can clearly match all the demand through the free edges, so in both cases the optimal cost is 0, yielding an unbounded competitive ratio.

6.1.2 Non-Splittable Demands

Theorem 6.1.2. For the LDB problem with capacity functions and non-splittable demands, there is no (deterministic or randomized) online algorithm that can guarantee any bounded competitive ratio. This is also true for metric distances, and even for a line metric, where all vertices are laid out on $\mathbb{R}$.

Proof. The instance is described in Figure 6.2. We have two requests at the same vertex and three servers at distinct vertices.
The servers $s_1$, $s_2$ and $s_3$ have capacities of 2, 1 and 2, respectively, and are at distances of $1$, $m$ and $m^2$ from the request vertex, respectively.

[Figure 6.2: Non-Splittable Demands Lower Bound Example. The request vertex at position 0 and three servers with capacities 2, 1, 2 at distances $1$, $m$, $m^2$.]

Let $A$ be an online algorithm. The first input of the instance is $r_1$, with unit demand. Let $p$ be the probability that $r_1$ is assigned to $s_1$ by $A$.

If $p < 1/2$, then our input stops here. Since with probability greater than one half the demand of $r_1$ was matched to one of the more expensive servers (at distance $m$ or $m^2$), we have $C(A) = \Omega(m)$. On the other hand, $C(OPT) = 1$, since OPT matches $r_1$ to $s_1$.

If $p \ge 1/2$, the second request $r_2$ appears with demand 2. Since $s_2$ has only unit capacity, the only servers eligible for this request are $s_1$ and $s_3$. With probability $p$, $s_1$ is occupied, and $A$ has to match $r_2$ to $s_3$ at a cost of at least $m^2$, yielding $C(A) = \Omega(m^2)$. OPT, on the other hand, matches $r_2$ to $s_1$ and $r_1$ to $s_2$, for a total cost of $m + 2 = O(m)$.

In both cases, the competitive ratio is $\Omega(m)$, which is unbounded for arbitrary $m$.

6.2 Metric Distances and Splittable Demands

In this section we explore the case where the distances constitute a metric and the demands are splittable. This is the only case for which there are online algorithms with a bounded competitive ratio.

6.2.1 Lower Bound

Theorem 6.2.1. The competitive ratio of any online algorithm for the LDB problem with capacities is $\Omega(\log k)$, where $k$ is the number of servers. This is true even for unit distances, capacities, and demands, namely: $w_{rs}, d_r, c_s \in \{0, 1\}$.

Proof. We define an instance and show that any online algorithm incurs a cost of $\Omega(\log k) \cdot C(OPT)$ on it. By Theorem 3.3.1, without loss of generality, we can restrict ourselves to deterministic algorithms. The instance is the same as in the lower bound proof of [8], and the proof is along the same lines.
We note, however, that the bound cannot be deduced from [8] directly, since in our case demands are splittable.

Let $G = (V, E)$ be a graph with $k + 1$ vertices and $k$ servers. Denote the vertices by $\{v_0, \ldots, v_k\}$, where the servers are located at $\{v_1, \ldots, v_k\}$. All edges have unit distance, and all servers have unit capacity. There are $k$ requests in total, each with unit demand.

The first request appears at vertex $v_0$, namely $r_0 = v_0$. Each subsequent request appears at the server with the most used capacity at that point in time, never repeating the same vertex.

Consider the arrival of request $r_i$. Up to this point there were $i$ requests, located at $V_i = \{r_0, \ldots, r_{i-1}\}$. The overall capacity in $V_i$ is $i - 1$, since there is no server at $r_0$. Therefore, at least one unit of demand is matched to $\bar{V}_i = V \setminus V_i$. The adversary selects the most exhausted server from this set, whose used capacity is at least $1/|\bar{V}_i| = 1/(k + 1 - i)$, since $V$ has $k + 1$ vertices and $V_i$ has $i$ of them. At least this amount of the demand of $r_i$ has to be matched outside the server, yielding a cost of $C(r_i) \ge 1/(k + 1 - i)$.

Summing up over all requests:

$$C(A) = C(r_0) + \sum_{i=1}^{k-1} C(r_i) \ge 1 + \sum_{i=1}^{k-1} \frac{1}{k + 1 - i} = H(k) = \Omega(\log k),$$

where $H(k)$ denotes the $k$-th harmonic number.

The offline algorithm, in contrast, matches $r_1$ through $r_{k-1}$ to servers at distance 0. It matches $r_0$ to the remaining server, yielding a cost of $C(OPT) = 1$, thus achieving the desired lower bound.

6.2.2 Upper Bound Outline

In this section, we present an $O(\log^2 k)$-competitive online algorithm, which is a modification of the algorithm presented in [8], which we term the BBGN algorithm.

We begin by presenting a certain class of tree metrics, called HSTs, as well as a new cost model which allows us to perform rearrangements. We then give a characterization of online algorithms that are successful on HSTs. We present Local-Match and, using this characterization, show that the final matching that it yields is optimal in HSTs. Finally, we show
that the rearrangements and the embedding in HSTs each incur a loss of a factor of $O(\log k)$.

6.2.3 Hierarchically Well-Separated Trees (HSTs)

In order to achieve our upper bound, we use a special type of tree metrics called Hierarchically Well-Separated Trees (HSTs) [9]. Given a parameter $\alpha \ge 1$, we define an $\alpha$-HST as a leveled tree $T = (V, E)$ with a distance function $w$ such that for any $v \in V$, any child $c$ of $v$, and the parent $p$ of $v$, $w_{pv} = \alpha w_{vc}$. In other words, all the children of $v$ are at the same distance from it, and the distance to its parent is $\alpha$ times the distance to its children. For the root $r$, we can assume that there is a "virtual" parent at an arbitrary distance.

Any metric can be probabilistically embedded in an $\alpha$-HST, with a stretch of at most $O(\alpha \log n)$ [15]. Given an instance on a general metric, we first embed it into a 2-HST, and then apply our algorithm, which is $O(\log k)$-competitive over 2-HSTs.

Despite the fact that the number of nodes $n$ in the metric space can be significantly larger than $k$, we can still get a stretch of $O(\log k)$ (rather than $O(\log n)$), via a technique used by [27] and [8]. We construct the HST for the sub-metric induced on the $k$ server nodes, and when a request $r$ arrives, we pretend that it has arrived at the server $s(r)$ closest to it. Using the triangle inequality, and the fact that $\sum_{r \in R} w_{r,s(r)} \le C(OPT)$, we see that the competitive ratio increases by only a constant factor. Thus, the total competitive factor achieved is $O(\log^2 k)$.

Proposition 6.2.2. For any $\beta$-competitive online algorithm over 2-HSTs, there exists an $O(\beta \log k)$-competitive online algorithm on general metrics.

Given an HST and two leaves $u, v$, we define their $\delta$-distance, denoted by $\delta(u, v)$, as the number of edges in the path from $u$ (or $v$) to their least common ancestor. For instance, $\delta(u, u) = 0$; if $u$ and $v$ have a common parent then $\delta(u, v) = 1$; and so forth.
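The $\delta$-distance defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is given as a hypothetical `parent` dictionary, both arguments are leaves at the same depth (as in a leveled tree), and `hst_leaf_distance` additionally assumes that edges adjacent to the leaves have unit length, which is not fixed by the definition itself.

```python
def delta_distance(parent, u, v):
    """Number of edges from leaf u (or v) up to their least common ancestor.

    `parent` maps every non-root node to its parent. Both leaves are assumed
    to lie at the same depth, so they can climb the tree in lockstep.
    """
    steps = 0
    while u != v:
        u = parent[u]
        v = parent[v]
        steps += 1
    return steps


def hst_leaf_distance(delta, alpha=2, leaf_edge=1):
    """Tree distance between two leaves at delta-distance `delta` in an alpha-HST.

    Edge lengths grow by a factor of alpha per level, starting from
    `leaf_edge` at the leaves (an assumption made for this sketch).
    The path climbs `delta` edges and descends `delta` edges.
    """
    return 2 * sum(leaf_edge * alpha ** level for level in range(delta))


# Hypothetical 2-level tree: leaves a, b under p; leaves c, d under q.
parent = {"a": "p", "b": "p", "c": "q", "d": "q", "p": "root", "q": "root"}
same = delta_distance(parent, "a", "a")      # identical leaves
siblings = delta_distance(parent, "a", "b")  # common parent
cousins = delta_distance(parent, "a", "c")   # meet at the root
```

With `alpha=2` and unit leaf edges, `hst_leaf_distance(delta)` evaluates to $2(2^{\delta} - 1)$, which agrees with the closed form for 2-HSTs used in the proof of Lemma 6.2.5.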
6.2.4 Restricted Rearrangement Model

We make use of the restricted rearrangement model described in [8]. In this model, an online algorithm is allowed to evacuate $r$ from its previous match $s$, and then match $r'$ to $s$, but only if $\delta(s, r') \le \delta(s, r)$. After this, $r$ must be matched again. In this model, the algorithm must pay for both the original and the new matching.

It is shown that for every algorithm in this cost and rearrangement model, there exists another online algorithm that achieves the same cost with no rearrangements. We therefore allow rearrangements with the above restrictions, and evaluate the cost under this model. Throughout the work, once we unmatch a request, it goes into a queue, and the contents of the queue re-enter the algorithm prior to the handling of the next request.

Though [8] presents this model only for non-splittable demands, the proof can be adapted to the splittable case.

6.2.5 Stable Matchings

We start with the definition of stable matchings. A request/server pair $(r, s)$ is unstable in a matching if $r$ is matched to $s'$, $s$ is matched to $r'$, and $\delta(r, s) < \min\{\delta(r', s), \delta(r, s')\}$. A matching is called stable if it does not contain any unstable pairs.

Lemma 6.2.3. A stable matching over a tree metric is necessarily optimal.

Proof. We consider the cost as follows. Let $T_i$ be the subtree rooted at a node $i$. Let $\psi(i)$ be the amount of demand in $T_i$ that is matched outside of $T_i$. This demand travels from $i$ to its parent, and eventually reaches another node at the same level as $i$, incurring a cost of $2 w(i, parent(i)) \psi(i)$. Therefore, the total cost of the matching is $2 \sum_{i \in T} w(i, parent(i)) \cdot \psi(i)$. This means that the values of $\psi(i)$ determine the cost of a matching.

Now, let $R(i)$ and $S(i)$ denote the total demand and the total capacity in $T_i$, respectively. Let $\chi(i) = \max\{R(i) - S(i), 0\}$ denote the excess demand of $T_i$.
The excess demand of $T_i$ must be routed outside $T_i$; therefore $\psi(i) \ge \chi(i)$ for any matching, and in particular $\psi^{OPT}(i) \ge \chi(i)$.

In a stable matching, for any subtree $T_i$, the matching restricted to $T_i$ must exhaust either all the demand or all the capacity. Otherwise, let $s, r \in T_i$ have remaining capacity and demand, respectively. Then both $s$ and $r$ are matched (at least partially) outside $T_i$, and constitute an unstable pair (see Figure 6.3 for an example, where $r$ and $s$ are both matched outside and are thus unstable).

[Figure 6.3: Example for Lemma 6.2.3. A subtree $T_i$ containing $r$ and $s$, which are matched to $s'$ and $r'$ outside $T_i$.]

If the demand is exhausted, then $\psi(i) = 0$. Otherwise, the capacity is exhausted, and thus $\psi^{STABLE}(i) = \chi(i)$. It follows that $\psi^{STABLE} \le \psi^{OPT}$, and thus any stable matching is optimal.

6.2.6 Local-Match Algorithm

In this section we introduce Local-Match, an online algorithm for splittable demands. We show that Local-Match is $O(\log k)$-competitive in HSTs, and is thus $O(\log^2 k)$-competitive for general metrics.

Let $\tilde{c}_s$ ($\tilde{d}_r$) denote the residual capacity (demand) of $s$ ($r$). Let $L(\cdot)$ denote the $\delta$-distance to the farthest match of a server or a request. Namely:

$$L(s) = \begin{cases} \max_{r : x_{rs} > 0} \{\delta(s, r)\}, & \text{if } \tilde{c}_s = 0 \\ \infty, & \text{otherwise} \end{cases} \qquad L(r) = \begin{cases} \max_{s : x_{rs} > 0} \{\delta(s, r)\}, & \text{if } \tilde{d}_r = 0 \\ 0, & \text{otherwise} \end{cases}$$

If $L(s)$ is finite, denote by $\hat{r}(s)$ the farthest request, that is, the request $r$ which attains the maximum. Also, given a request $r$, we say that a server $s$ is $\ell$-eligible if $\delta(r, s) = \ell$ and $L(s) > \ell$.

Algorithm 2 Local-Match(r)
Require: A request $r$
Ensure: $r$ is matched.
while $\tilde{d}_r > 0$ do
    Find the lowest $\ell$ for which there exists an $\ell$-eligible server.
    Choose uniformly at random an $\ell$-eligible server $s$.
    if $\tilde{c}_s = 0$ then
        Unmatch($\hat{r}(s)$, $s$, $\tilde{d}_r$) {$\hat{r}(s)$ goes into the queue}
    end if
    Match($r$, $s$)
end while

Algorithm 3 Match(r, s)
Ensure: Matches $s$ and $r$ as much as possible without exceeding demand or capacity.
$\epsilon \leftarrow \min\{\tilde{c}_s, \tilde{d}_r\}$
$x_{rs} \leftarrow x_{rs} + \epsilon$
$\tilde{d}_r \leftarrow \tilde{d}_r - \epsilon$
$\tilde{c}_s \leftarrow \tilde{c}_s - \epsilon$

Algorithm 4 Unmatch(r, s, d)
Ensure: Unmatches $s$ and $r$ to evacuate $d$ demand.
$\epsilon \leftarrow \min\{d, x_{rs}\}$
$x_{rs} \leftarrow x_{rs} - \epsilon$
$\tilde{d}_r \leftarrow \tilde{d}_r + \epsilon$
$\tilde{c}_s \leftarrow \tilde{c}_s + \epsilon$

Lemma 6.2.4. The output of Local-Match is a stable matching, and thus is optimal for tree metrics.

Proof. First observe that $L(s)$ only decreases throughout the run of the algorithm. Assume by contradiction that in the final matching there exists an unstable pair $r, s$. Let $i = \delta(r, s)$. We have $L(r) \ge \delta(r, s') > i$, and similarly $L(s) > i$.

Initially $L(r) = 0$, and when the algorithm finishes, $L(r) > i$. Observe the first pass in which $L(r)$ surpasses $i$. In this pass $r$ is processed. At the beginning of the pass $L(r) \le i$, and at the end $L(r) > i$. During this pass we selected $\ell > i$, thus there were no $i$-eligible servers, and in particular $s$ was not $i$-eligible. However, $\delta(s, r) = i$, so it must be that $L(s) \le i$. Since $L(s)$ only decreases throughout the algorithm, when it finishes we still have $L(s) \le i$, a contradiction.

We now make an observation that will be helpful in the last proof. Observe the request $r$ throughout the run of the algorithm. When it is matched to $s$, an amount $\varepsilon$ of demand is matched, where $0 < \varepsilon \le d_r$. Suppose we were to change the input so that, instead of a single request with demand $d_r$, we had two requests with demands $\varepsilon$ and $d_r - \varepsilon$; the output would not have changed. This is also true when the demand is split through unmatching. By repeating this process, it follows that we can write $d_r = \varepsilon_r^{(1)} + \ldots + \varepsilon_r^{(m)}$, such that the arrival of a single request with demand $d_r$ is equivalent to the arrival of $m$ requests with demands $\varepsilon_r^{(i)}$. Yet, in the latter case, the requests never split. Let us call each such $\varepsilon$ a mini-demand.

Lemma 6.2.5.
Let $\varepsilon_r$ be a mini-demand of $r$, and let $\bar{s}$ be its ultimate destination. Then the expected cost of reassignments for it is $O(\log k)\,\varepsilon_r\, w_{r\bar{s}}$ for 2-HST metrics.

Proof. We first show that for each $0 \le \delta \le \delta(r, \bar{s})$, the expected number of times $\varepsilon_r$ is reassigned in $N(r, \delta)$ is $O(\log k)$. At some time during the online algorithm, $r$ is matched to some $s \in N(r, \delta)$; if it is not, then we have zero reassignments and we are done. Let $W_\delta$ be the set of $\delta$-eligible servers. Among the requests that are handled after the arrival of $r$, either via rematchings or as new requests, let $r_s$ be the first one that selects $s$ and has the potential of evacuating $\varepsilon_r$, i.e., $\delta(r_s, s) < \delta(r, s)$. Let us call this request the owner of $s$; each $s \in W_\delta$ has at most one owner.

The process of reassignments of $r$ is therefore as follows: $r$ is matched to a server $s \in W_\delta$. The owner of $s$ arrives and evacuates it, and $r$ is matched to another server. This continues until either $\varepsilon_r$ is matched in a level $\delta' > \delta$, or $\varepsilon_r$ no longer moves. Notice that $\varepsilon_r$ never matches to a server $s$ after being unmatched from it, since after that point $L(s) \le \delta$ and $s$ will never be $\delta$-eligible again.

Consider the arrival of the owners $o_1, \ldots, o_{|W_\delta|}$, and denote their respective servers by $s_1, \ldots, s_{|W_\delta|}$. The probability that $\varepsilon_r$ is matched to $s_1$ is $1/|W_\delta|$, and whether or not that has happened, $\varepsilon_r$ is then matched to a random server among $s_2, \ldots, s_{|W_\delta|}$, since it cannot return to $s_1$. Therefore, $\varepsilon_r$ will be evacuated by $o_2$ with probability $1/(|W_\delta| - 1)$. Repeating this process, we see that the expected number of reassignments in level $\delta$ is upper bounded by the harmonic number of $|W_\delta|$, which is $O(\log k)$.

Since the metric is a 2-HST, $w_{rs} = 2(2^{\delta(r,s)} - 1)$; therefore the cost of each reassignment of request $r$ in level $\delta$ is $2(2^{\delta} - 1)$ per unit of demand. We can bound the total expected cost of reassigning $\varepsilon_r$:

\[
C(\varepsilon_r) \le O(\log k)\,\varepsilon_r \sum_{\delta=0}^{\delta(r,\bar{s})} 2(2^{\delta} - 1) = O(\log k)\,\varepsilon_r \cdot 2(2^{\delta(r,\bar{s})} - 1) = O(\log k)\,\varepsilon_r\, w_{r\bar{s}}.
\]

Theorem 6.2.6. Local-Match is $O(\log k)$-competitive on HSTs, and yields an $O(\log^2 k)$-competitive algorithm in general metrics.

Proof. By Lemma 6.2.4, the final matching produced by the algorithm is optimal over HSTs, at a cost of $C_{\mathrm{final}}$. By linearity of expectation and Lemma 6.2.5, the total expected cost of the algorithm, including reassignments, is $O(\log k)\,C_{\mathrm{final}}$. Due to the properties of the rearrangement model (see Section 6.2.4), we can translate this algorithm to a model with no reassignments at the same cost. Thus we conclude that the algorithm is $O(\log k)$-competitive over 2-HSTs. Using Proposition 6.2.2, we find that by embedding the input metric into 2-HSTs, we obtain an $O(\log^2 k)$-competitive online algorithm over any metric.

Chapter 7
Summary

7.1 Conclusions

We define Load-Distance Balancing, a framework that generalizes many problems and is closely related to Universal Facility Location. We consider the problem in two settings, splittable and non-splittable demands. We see that in its most general form the problem has no good online solution, and proceed to examine particular classes of latency functions.

We show that Greedy performs well in most cases: it is 5.83-competitive for linear latency functions, 9-competitive for concave latency functions, and $O(p)^{O(p)}$-competitive for polynomial functions, where $p$ is the highest degree of the polynomial. For latency functions with bounded derivative, we showed that a variation, HighGreedy, is $O(\gamma)$-competitive, where $\gamma$ is the maximum change of slope for any single function. We showed that all of our results are tight up to a constant factor.

Our results with Greedy apply to both splittable and non-splittable demands: if the algorithm is greedy, then it obtains the above bounds in both cases. However, while it is easy to compute the greedy matching for non-splittable demands, the same does not apply for splittable demands.
To this end we introduced WaterFill, an algorithm that efficiently computes the greedy matching for linear latencies.

We proceed to examine another particular case of latency functions, which generalizes the online bipartite matching problem by adding demands and capacities. Here, we see that there are no good online algorithms unless we assume both splittable demands and metric distances. We then present an algorithm that is $O(\log^2 k)$-competitive over general metrics and $O(\log k)$-competitive for 2-HSTs. We complement this result by showing that the competitive ratio of any online algorithm is $\Omega(\log k)$. As with the original bipartite matching problem, the question of whether the gap between the lower bound of $\log k$ and the upper bound of $\log^2 k$ can be closed remains open.

As we see, while the framework is essentially the same, the analysis differs greatly between the cases for which Greedy performs well and bipartite matching with demands.

7.2 Future Work

Though an extensive collection of latency function classes was covered in this work, a natural extension would be the introduction of further classes of latency functions, or the consideration of instances that mix latency functions from different classes. Such an analysis must take note of the cases with no good online algorithm (see Section 3.2).

An interesting possible direction for future work is choosing a different cost model. We have considered the sum of latencies, but two different cost functions could also be interesting:

• Max request cost: the maximum cost incurred by any request, $C(x) = \max_{r \in R} \sum_{s \in S} x_{rs}\,(w_{rs} + f_s(\ell_s))$. This simply changes the summation in our definition to a maximum.

• Max unit cost: the maximum cost incurred by any unit of demand, $C(x) = \max_{r \in R} \max_{s \in S} \,(w_{rs} + f_s(\ell_s))$. This would be a generalization of the makespan cost.
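To make the difference between the sum cost and the two alternative cost functions above concrete, the following is a minimal Python sketch that evaluates all three on a toy fractional assignment. It rests on assumptions that are not taken from the thesis: the load $\ell_s$ is taken to be the total demand assigned to $s$, the max unit cost is evaluated only over pairs with $x_{rs} > 0$, and all function and variable names are hypothetical.

```python
from typing import Callable, Dict, Tuple

Pair = Tuple[str, str]  # (request, server)
Latency = Callable[[float], float]


def server_loads(x: Dict[Pair, float]) -> Dict[str, float]:
    """Load of each server; assumed here to be the total demand assigned to it."""
    loads: Dict[str, float] = {}
    for (_, s), amount in x.items():
        loads[s] = loads.get(s, 0.0) + amount
    return loads


def unit_costs(x: Dict[Pair, float], w: Dict[Pair, float],
               f: Dict[str, Latency]) -> Dict[Pair, float]:
    """Per-unit cost w_rs + f_s(l_s) of every pair that carries demand (assumption)."""
    loads = server_loads(x)
    return {(r, s): w[(r, s)] + f[s](loads[s])
            for (r, s), amount in x.items() if amount > 0}


def sum_cost(x, w, f) -> float:
    """Sum over requests and servers of x_rs * (w_rs + f_s(l_s))."""
    return sum(x[pair] * cost for pair, cost in unit_costs(x, w, f).items())


def max_request_cost(x, w, f) -> float:
    """Same inner sum, but the outer sum over requests becomes a maximum."""
    per_request: Dict[str, float] = {}
    for (r, s), cost in unit_costs(x, w, f).items():
        per_request[r] = per_request.get(r, 0.0) + x[(r, s)] * cost
    return max(per_request.values(), default=0.0)


def max_unit_cost(x, w, f) -> float:
    """Largest per-unit cost over all pairs that carry demand."""
    return max(unit_costs(x, w, f).values(), default=0.0)


# Toy instance (hypothetical): two servers with linear latency f(l) = l.
x = {("r1", "a"): 1.0, ("r2", "a"): 0.5, ("r2", "b"): 0.5}
w = {("r1", "a"): 1.0, ("r2", "a"): 2.0, ("r2", "b"): 3.0}
f = {"a": lambda load: load, "b": lambda load: load}

print(sum_cost(x, w, f))          # 6.0
print(max_request_cost(x, w, f))  # 3.5
print(max_unit_cost(x, w, f))     # 3.5
```

In this instance the loads are 1.5 on server a and 0.5 on server b, so the per-unit costs are 2.5, 3.5 and 3.5. The sum cost aggregates over all requests, while the two maxima are both determined by request r2.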
Another direction that would be interesting to explore is a game-theoretic version of this problem. Here too, we can treat each request as a single agent, or consider the case where each agent is responsible for demand; we can then take one of the three possible costs as the social cost and explore the price of anarchy and the price of stability.

Yet another direction is bipartite matching with demands. Here, the most obvious effort would be to bridge the gap between the $O(\log^2 k)$ upper bound and the $\Omega(\log k)$ lower bound. Notice that such an effort may also yield interesting results for the original bipartite matching problem. Our problem is not a generalization of the bipartite matching problem, since we allow splitting demands. Therefore, a better bound for this problem does not immediately imply the same for the original problem. However, as we have seen, these two problems are closely related, and similar techniques can be used to analyze them. We are thus hopeful that a breakthrough here would facilitate a breakthrough in the original problem.

Our negative results here also imply that for the capacitated facility location problem, no online algorithm can provide any competitive ratio if we have either arbitrary edge distances or non-splittable demands. We also know that for metric edge distances and splittable demands, there is a lower bound of $\Omega(\log k)$. Our upper bound of $O(\log^2 k)$ does not apply to the capacitated facility location problem, since the problem we explored is a special case of hard capacities in which the opening cost is zero. Therefore, the general case of the capacitated facility location problem would be another interesting direction to study.

Bibliography

[1] Ian F. Akyildiz, Xudong Wang, and Weilin Wang. Wireless mesh networks: a survey. Computer Networks, 47(4):445–487, 2005.
[2] Noga Alon, Baruch Awerbuch, Yossi Azar, Niv Buchbinder, and Joseph Naor. The online set cover problem. In STOC, pages 100–105, 2003.
[3] Noga Alon, Baruch Awerbuch, Yossi Azar, Niv Buchbinder, and Joseph Naor. A general approach to online network optimization problems. ACM Transactions on Algorithms, 2(4):640–660, 2006.
[4] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. A view of cloud computing. Commun. ACM, 53:50–58, April 2010.
[5] Baruch Awerbuch, Yossi Azar, Edward F. Grove, Ming-Yang Kao, P. Krishnan, and Jeffrey Scott Vitter. Load balancing in the $L_p$ norm. In FOCS, pages 383–391, 1995.
[6] Baruch Awerbuch, Shay Kutten, and David Peleg. Competitive distributed job scheduling (extended abstract). In STOC, pages 571–580, 1992.
[7] Yossi Azar, Andrei Z. Broder, and Anna R. Karlin. On-line load balancing. In FOCS, pages 218–225, 1994.
[8] Nikhil Bansal, Niv Buchbinder, Anupam Gupta, and Joseph Naor. An $O(\log^2 k)$-competitive algorithm for metric bipartite matching. In ESA, pages 522–533, 2007.
[9] Yair Bartal. Probabilistic approximations of metric spaces and its algorithmic applications. In FOCS, pages 184–193, 1996.
[10] A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 1998.
[11] Allan Borodin and Ran El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 1998.
[12] Edward Bortnikov, Samir Khuller, Yishay Mansour, and Joseph Naor. The load-distance balancing problem. In INOC, 2009.
[13] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem. J. Comput. Syst. Sci., 65(1):129–149, 2002.
[14] Fabián A. Chudak and David P. Williamson. Improved approximation algorithms for capacitated facility location problems. In IPCO, pages 99–113, 1999.
[15] Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. J. Comput. Syst. Sci., 69(3):485–497, 2004.
[16] Anja Feldmann, Jiri Sgall, and Shang-Hua Teng. Dynamic scheduling on parallel machines. In FOCS, pages 111–120, 1991.
[17] Piero Fraternali, Gustavo Rossi, and Fernando Sanchez-Figueroa. Rich internet applications. IEEE Internet Computing, 14:9–12, 2010.
[18] Ronald L. Graham. Bounds for certain multiprocessing anomalies. The Bell System Technical Journal, pages 1563–1581, 1966.
[19] Ronald L. Graham, Eugene Lawler, Jan K. Lenstra, and A. H. G. Rinnooy Kan. Optimization and approximation in deterministic sequencing and scheduling: a survey. Ann. Discrete Math., pages 287–326, 1979.
[20] Bala Kalyanasundaram and Kirk Pruhs. Online weighted matching. J. Algorithms, 14(3):478–488, 1993.
[21] Samir Khuller, Stephen G. Mitchell, and Vijay V. Vazirani. On-line algorithms for weighted bipartite matching and stable marriages. Theor. Comput. Sci., 127(2):255–267, 1994.
[22] Madhukar R. Korupolu, C. Greg Plaxton, and Rajmohan Rajaraman. Analysis of a local search heuristic for facility location problems. In SODA, pages 1–10, 1998.
[23] Harold W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2:83–97, 1955.
[24] Retsef Levi, David B. Shmoys, and Chaitanya Swamy. LP-based approximation algorithms for capacitated facility location. In IPCO, pages 206–218, 2004.
[25] Mohammad Mahdian and Martin Pál. Universal facility location. In ESA, pages 409–421, 2003.
[26] Adam Meyerson. Online facility location. In FOCS, pages 426–431, 2001.
[27] Adam Meyerson, Akash Nanavati, and Laura J. Poplawski. Randomized online algorithms for minimum metric bipartite matching.
In SODA, pages 954–959, 2006.
[28] David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location problems (extended abstract). In STOC, pages 265–274, 1997.
[29] Maxim Sviridenko. An improved approximation algorithm for the metric uncapacitated facility location problem. In IPCO, pages 240–257, 2002.
[30] David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, 2011.
[31] Andrew Chi-Chih Yao. Probabilistic computations: Toward a unified measure of complexity (extended abstract). In FOCS, pages 222–227, 1977.

Online Resource Assignment Based on Distance and Load

Israel Shalom

Research thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. Submitted to the Senate of the Technion - Israel Institute of Technology. Tamuz 5771, Haifa, July 2011.

The research was done under the supervision of Prof. Joseph (Seffi) Naor in the Faculty of Computer Science.

Abstract

Introduction

Online Problems

Online problems and the algorithms that solve them are a topic that greatly occupies the computer science community. The characteristic of these problems is that the input to the algorithm arrives step by step, and the algorithm must produce output for each part of the input. Online algorithms must therefore cope with the lack of information, and they are measured by the "competitive ratio": the worst ratio that can occur between an optimal solution to the problem and the solution of the online algorithm, over all possible inputs.

A classic example of such a problem is the ski problem: a skilled skier goes on vacation and has to decide whether to rent skis for one dollar a day or to buy them for a total of seven dollars. The challenging part is that the skier does not know in advance how many days he will ski; on any given day his vacation may suddenly end. A good online algorithm for this problem is to rent for the first seven days and to buy on the eighth day. This algorithm guarantees a competitive ratio of 2: if the vacation ends in fewer than seven days then it is optimal, and if it lasts longer then the skier has paid fourteen dollars, twice the optimal solution of buying the skis on the first day.

Assignment Problems (Matching Problems, Facility Location Problems)

Our research deals with online assignment problems. These problems, which have been studied in detail over several decades in computer science, deal with assigning resources (servers) to users (requests), and are relevant to a large number of practical problems, from the allocation of computing resources to load distribution for network routing. In the online case, the requests arrive one by one, and the online algorithm must output the server (or servers) that will serve the request; once a request has been assigned to a server, it cannot be moved or removed later on.

Load-Distance Based Cost Model (Load-Distance Balancing)

The cost model we study is a cost composed of two components: a "distance" cost and a "load" cost. That is, when we assign a request to a server, the payment for this assignment is the sum of two payments: the first is based on the distance between them, and the second is based on the load currently on the server. Although the costs are influenced by entirely different characteristics, there is logic in adding them, since both translate into the latency that the user experiences. When we look at the load cost, we first compute the load on the server (the number of requests it serves), and assume that each server has a latency function that takes a given load and returns the latency. This cost model allows us to generalize a large number of previously studied problems by means of different classes of functions.

Research Method

Already at the outset, we find that no online algorithm can guarantee any competitive ratio without some assumptions on the latency functions. In doing so, we discover characteristics of "impossible" latency functions, ones for which no online algorithm can guarantee a competitive ratio. In light of this, the rest of the research is divided into different classes of latency functions: linear functions, in which the latency is directly proportional to the load; concave functions, in which the marginal addition to the latency keeps decreasing; bounded functions, in which the derivative is bounded between two constants; polynomial functions; and capacity functions, which remain at zero up to the capacity point of the server and then grow to infinity. The last class is useful for representing the matching problem with capacities, in which the cost is based on distance only, but there is a limit on the number of requests that can be placed on each server.

For the first four classes (linear, concave, bounded-derivative and polynomial) we propose the greedy algorithm for solving the problem. Greedy algorithms are an important tool in computer science, and their principle is to take, at each step, the best move from a local and momentary point of view, without thinking about subsequent steps. In our case, the greedy algorithm works simply: each arriving request receives the assignment for which the addition to the total cost is minimal. However, we find that implementing the greedy algorithm is not trivial when we assume that a request can be split over several servers, since there are infinitely many possible assignments. Therefore, we propose the algorithm WATERFILL and prove that it makes the greedy choice. To this end we characterize the greedy solution as an optimization problem, show that algorithms satisfying a certain property ("height preserving") in fact make the greedy choice, and finally show that WATERFILL satisfies this property.

For the last class, capacity functions, the research is entirely different. First we show that without additional assumptions, no online algorithm can achieve good performance (a bounded competitive ratio), and results can be found only for the case where the requests are splittable and the distances between the servers and the requests form a metric, that is, satisfy the triangle inequality. For this case we propose a different algorithm, LOCALMATCH, which is based on a previously proposed algorithm for metric matching.

Research Results

We find that the greedy algorithm (or a slight variation of it) achieves impressive performance for all the function classes for which it was studied. This means that the result it achieves is at most a constant factor away from the best result that any online algorithm can achieve. Specifically, we show that:

• For linear functions, the competitive ratio is 5.83.

• For concave functions, the competitive ratio is 9.

• For polynomial functions, the competitive ratio is $O(p)^{O(p)}$, where $p$ is the degree of the polynomial. We also show an example of a case in which any online algorithm obtains a result that is at least a factor [expression missing in the source] away from the optimal solution.

• For bounded-derivative functions, we propose a slight variation of the greedy algorithm and show that it has a competitive ratio of $O(\gamma)$, where $\gamma$ is the highest ratio between the maximum derivative and the minimum derivative of any function. We complement this result by showing an example of a case in which any online algorithm obtains a result that is $\Omega(\gamma)$ away from the optimal solution.

In summary:

Class               Competitive ratio          Algorithm
Linear              5.83                       Greedy
Concave             9                          Greedy
Polynomial          $\Theta(p)^{\Theta(p)}$    Greedy
Bounded derivative  $\Theta(\gamma)$           HighGreedy

For the last class, as stated, we first show that it is "unsolvable" if the distance is non-metric or the requests cannot be split. That is, we show that for these cases no online algorithm can guarantee a constant competitive ratio. Afterwards, we handle the only case that can be handled, the metric and splittable case. In this case, we show a lower bound of $\Omega(\log k)$, where $k$ is the number of servers. We then present the algorithm LOCALMATCH and show that it has a competitive ratio of $O(\log^2 k)$. We do this using advanced tree-embedding techniques, in which the metric is mapped to a Hierarchically Well-Separated Tree (HST), which has properties that greatly ease the analysis. We show that LOCALMATCH has a competitive ratio of $O(\log k)$ when the metric is an HST. Using tree-embedding techniques, we know that any metric can be approximated by HSTs with a loss of accuracy of $O(\log k)$. These two factors together yield the competitive ratio of $O(\log^2 k)$. In summary:

                Metric                              Non-metric
Non-splittable  $\infty$                            $\infty$
Splittable      $O(\log^2 k)$, $\Omega(\log k)$     $\infty$

Conclusions and Future Directions

We see the distance-and-load based model as a useful and valuable model, which generalizes a large number of problems by means of the latency function model. However, we find that in its general form, the Load-Distance Balancing problem cannot be studied in the online case, and we therefore divide the discussion into sub-cases. We see explicitly that while some of the sub-cases are very similar to one another, others are fundamentally different; specifically, the discussion of capacity functions requires techniques and algorithms entirely different from those that served us for linear and concave functions. For most of the sub-cases studied, we cover the problem well: the studied algorithm gives a competitive ratio that is, up to a constant, the best that any online algorithm can guarantee. In the case of capacity functions with a metric and splitting, there is a gap of $\log k$ between the lower bound and the upper bound that we give.

There are several directions for further research. The first direction is finding additional "convenient" sub-cases. Although we studied all the intuitive sub-cases, one cannot rule out the possibility that there are additional sub-cases that can be studied and that yield a good competitive ratio. It may even be possible to extend the discussion to functions that are "impossible" by restricting the discussion to metric distances only, as we did in the case of capacity functions.

A second direction is using a different model for combining load and distance. In our research, we look at the sum of the costs. Instead, one can look at the maximum (combined load-distance) cost of any request, or of any server. This model is common in load-balancing problems, where it is called makespan.

A third and final direction is the game-theoretic model. We studied the problem under the assumption that an arriving request does not try to minimize its own cost, but rather tries to achieve a minimal total cost for everyone; that is, it also takes into account the additional cost that it causes to other requests. In practice, this is often not the case, and the request will try to minimize its own cost without considering other requests. Hence, the problem can be studied in a game-theoretic model, in which each request is in effect a selfish agent, and one can examine the Price of Anarchy and the Price of Stability. The latter are measures in game theory that represent the quality of equilibrium points relative to the optimal solution.