>> Mohit Singh: Hello everyone, welcome. It is a great pleasure to introduce Samir Khuller from the University of Maryland. Samir has done very nice work on basic algorithms and approximation [indiscernible], but today he is going to tell us how it is related to scheduling to minimize energy. >> Samir Khuller: Thanks Mohit. I have a lot of slides today and I know it's Friday afternoon and late, so please feel free to interject and ask questions, and it's also fine if I don't get through everything. I will try to keep it light on proofs. This is work that I have been involved in for several years, joint work with several people; I will show you pictures and give you names at the end. I am going to try to cover a bunch of results, and this work started about five years ago. The motivation for our work was trying to understand some scheduling problems dealing with datacenters. As we know, datacenters are everywhere these days and consume a large amount of energy. A couple of points I want to note: the workload actually fluctuates over time, so you might begin to ask fundamental questions like whether we can save energy by shutting down a large fraction of the machines when they are not in use. These days servers themselves are getting more efficient about energy usage; your cell phone processor, for example, can go into a sleep or power-save mode when running on a battery. So energy savings are very, very important. The energy consumption is obviously huge: there are thousands of processors in any datacenter, each one consuming a significant amount of energy, and we are looking at scheduling policies at the CPU level. More broadly you could say, "Well, I have a large set of disks and maybe I could shut disks down as well to save power," but then there are costly expenses in bringing them back up. For CPUs it turns out the overhead of turning things back on is much lower: you can put them into different kinds of sleep states, or into sleep states you can rapidly recover from, whereas you are still consuming a lot of energy if you only slow down the processor. Obviously I am an algorithms researcher, so we are going to look at scheduling problems where you have a large collection of jobs that you want to run, and we want to examine what changes now that we have the ability to turn machines on and off. If you think about scheduling, the literature goes back 50-plus years. In most standard scheduling problems you are given jobs and some machines or resources, and you have to figure out ways to run these jobs on those machines, and most of the work historically is done from the jobs' perspective. The machines are always on, we don't really care about how much energy they are using; I just want to run these jobs with high job satisfaction, so that the clients issuing these jobs are happy with the service. Happiness might be a fast response time: I want these jobs to complete quickly. Nobody really worries about the machines' perspective, but there is a very interesting tradeoff that we are trying to understand: maybe at a slight cost to the happiness of jobs we can cut the energy usage significantly. So there is clearly a tradeoff. If you have unbounded energy then you can maximize job happiness, but in general we want to take the energy cost into account.
So our goal is to look at both online and offline problems in this space. Today's talk is mostly going to focus on offline algorithms. These are algorithms where you know the entire input in advance and you are trying to compute a solution in advance. Clearly, in terms of real-world scheduling, that's not the whole picture: you have jobs that you know about and you find a schedule, and then there will be jobs that you don't know about that arrive at the last minute and need to be fitted into the schedule. So I have some more recent work, which I won't have time to talk about, on online algorithms as well. There are some special cases for which we can derive optimal polynomial time solutions, and there are other cases where the problems themselves are NP hard; there our goal is to develop approximation algorithms. There are a lot of experts in the room, in the first row, who work on this topic, where the goal is to develop algorithms that run in polynomial time but might not find optimal solutions, and we try to prove that the solutions they find are close to optimal. One other piece of work from 2010 that motivated us was an Intel internal technical report on power-aware job scheduling, which highlighted the fact that job scheduling hasn't really been studied with this power or energy constraint in mind. There is also a bunch of research on a topic called "batch scheduling". What is batch scheduling? Batch scheduling is the idea that a group of jobs in one batch can be scheduled at the same cost as one job, and you might have an upper bound on the batch size. It's like saying that when I want to drive to downtown Seattle my car can take 7 passengers; whether I take 1 passenger or 7 passengers the cost is the same, one trip to downtown Seattle. That's what a batch consists of. A lot of the literature on batch scheduling has mostly focused on finding feasible schedules while completely ignoring the energy cost. So one of the things that we looked at was: how do we optimize for energy? I will get into this in more detail in a few minutes. Let me start off by talking about one classical problem and one twist on it that we looked at a few years ago, which was the starting point of some of this research. There is a very famous problem, studied for about 25 years, called unrelated parallel machine scheduling, started by [indiscernible], [indiscernible] and [indiscernible]. The framework is the following: you are given a collection of jobs, J1 to Jn, shown by the red nodes on the left hand side, and you are given a collection of machines, M1 to Mm, and these machines are not identical. What this means is that some job might run very quickly on one machine, might be very slow on another machine, or might even be infeasible to run on a machine. The goal is still to assign jobs so that we minimize the maximum load on any machine. So we want a load-balanced solution, and they gave some very nice algorithms even though the problem is NP hard. The generalization of this problem that we looked at was the following: let's assume that these machines are available, but you have to purchase them. So each machine has a buying cost; machine Mi has a purchase cost of Ci.
So it's sort of like saying, "I will go to the marketplace; I have a lot of jobs I want to run, lots of machines are available, but I have some fixed amount of money that I want to spend, and I need to decide which machines to buy." So I am given some budget C, and I want to choose a subset of machines to buy that fits within my budget. Once I have chosen the set of machines that I buy, I have a job scheduling problem, because I know what the set of machines is and I want to schedule all the jobs on those machines. So the goal is to determine a subset of machines with cost at most C and then assign the jobs to those machines so that we minimize the maximum load. That's why we call it the machine activation problem. >>: [inaudible]. >> Samir Khuller: Right, the cost is just to buy the machines in advance. You will see this analogy a little bit later, but later on, if I have one machine, I could think of this vertical axis as time. So I might say that it's the same machine, but when I want to turn it on I am paying some cost. So there is a relationship between time and that, even though this model is just saying, "I have a bunch of workload that I want to run, I can decide what machines to purchase, I have a budget, which machines should I buy?" And of course this problem, stated this way, turns out to generalize even the famous set cover problem. I can think of the jobs as elements and the machines as sets; I choose certain sets to buy, and if an element belongs to a chosen set that job can be scheduled on the corresponding machine with small load, while if the element is not in the set its load is very high. So the framework I just described captures both the unrelated parallel machine model, when all the Ci's are 0, and also the famous set cover problem, for which no approximation algorithm [indiscernible]. The result that we proved is joint work with [indiscernible], Li and Saha, in a paper from 2010. It sort of gives you the best of both worlds. We showed that if you fix a budget of C, and suppose there is an optimal solution that meets the budget of C and has a max load T, then we can find a solution with cost C times log n and a max load of 2T. Both the log n and the 2 are essentially unavoidable because of the special cases. I won't really go into the algorithm; I just wanted to say that this was the motivation for starting this kind of cost problem. If you think of these as jobs and you think of this as time, an edge just says that a certain machine can be on at a certain time if the job can be scheduled in that slot, and the edge is missing if it cannot be scheduled. So now let me move to a much simpler model where we are talking about batch scheduling. This is like a shipping problem. We have a container that has to leave the port; this is the available time or the release time of the job. Then there is a deadline by which the container has to arrive at a certain location, and we have multiple such jobs. So I have another container that becomes ready at a later time and is expected by some deadline. We think of these simply as jobs having release times and deadlines, and the ship is like a batch machine: when I schedule the ship to depart, up to a certain number of packages can be sent on the ship.
So if I schedule the ship here then I only do 1 job and the ship goes once. If I schedule the ship here I can do both of these jobs, but of course I cannot do an unbounded number of jobs; the ship has a capacity. That's what I mean by the notion of a batch. So the question really is: what is the smallest number of trips necessary to schedule all these jobs so that you don't miss any deadlines? That's a very simple, basic model that you can think of in this batch [indiscernible]. [indiscernible] looks at this problem but it doesn't take into account the number of trips; the whole goal there is just to find a feasible schedule with an unbounded number of trips. So one of our goals was to find an optimal solution for this problem. I also wanted to say that this problem is a little bit more nuanced. The shipping model, also known as the trucking model [indiscernible], is one where the batch has to be synchronized: the jobs in a batch are scheduled in exactly the same group. You can also think of a pizza oven model, which is slightly different: I put a pizza in the oven, I put a second in the oven, and when the first one is finished I can take it out and put another pizza in; maybe 2 pizzas can bake simultaneously. So this is a batch, but it's not a synchronized batch. It just has the property that at most 2 things are running at any point of time, and here we want to minimize the running cost or the energy cost, which is from when the machine comes on to when the machine goes off. So there is a distinction between these 2 models. In some sense the model on the right is a bit more constrained, because all the jobs in a batch have to be scheduled simultaneously. Okay. So I am now going to state this problem a bit more formally. The problem that I just described is the following: I have n jobs, and each job has a release time and a deadline. In the most general version we can think of a job as having some length, but we are going to focus for the next 10 minutes on unit length jobs. We have a batch machine, and to make life even simpler [indiscernible] time is slotted, so in each time slot I can turn the machine on or off. If I turn the machine on then we say that the slot is active and I can schedule up to some number of jobs in it. The goal is to minimize the number of active slots. So it's a very easy problem to think about. You can think of this simple model as basically talking about a rack of processors that I turn on, where a bunch of things can be done at the same time, or a multi-core processor where the processor is on and a bunch of threads can be executed simultaneously. So I was going to talk about unit length jobs. This example shows you a simple schedule. If you look carefully, notice that the batch capacity is 3. So I have a schedule where each job could have been scheduled at any of several points in time, and we have to align these jobs and find a schedule in which I am scheduling no more than B jobs at any point of time. That's my batch capacity, and I am simply trying to minimize the number of time slots in which the machine is on. In some sense you can think of that as the projection onto the x-axis of wherever you put these rectangles. So the question is: how do we find an optimal solution for this problem? Now there is a general, famous problem known as Capacitated Set Cover. So what is Capacitated Set Cover?
So it generalizes set cover in the following way: I have a collection of elements and I have a collection of subsets. The subsets have some cost that you pay to buy the set, but now each set also comes with a capacity. So when I purchase a set I cannot cover all of the elements in that set; there is a restriction on how many elements I can cover, and I have some flexibility in which ones. So if this set has size 3, costs $2.00, and has capacity 2, then when I buy this set I can cover any 2 of the 3 elements, but I can't cover all 3. Now Capacitated Set Cover has been widely studied and actually has many, many applications itself. It is an NP hard problem and there is some very interesting work by Wolsey that I will mention in a second, but what is the relationship with our scheduling problem? I can think of every time slot as a potential set. Turning the machine on in that slot is like buying that set, and all of the jobs whose windows overlap or intersect the time slot are elements of that set; but of course the point is that if I turn the machine on here I cannot cover all of them, I have a capacity constraint. So you can think of our problem as a special case of Capacitated Set Cover, but that by itself is not very useful, because Capacitated Set Cover is a very hard problem. There is a famous algorithm by Wolsey which gives a conceptually very simple greedy approximation, but the bound is not so good; it's an O(log n) approximation. So this maps our problem to a hard problem, while the scheduling problem itself is not NP hard. In 2008 there was a paper that showed that you can solve the problem optimally using dynamic programming, but the running time is high; it is polynomial, but high. So the next question you can ask is: is there a faster exact algorithm that doesn't use dynamic programming? That's the algorithm that I am going to present next. The algorithm is called Lazy Activation and this is joint work with Jessica Chang and Hal Gabow. So here is how the algorithm is going to work. Are there any questions about the problem itself? >>: [indiscernible]. >> Samir Khuller: Right, when you turn the machine on and off there is no cost; that's a good point. It is quite likely that the dynamic programming solution is easier to extend to a general cost model, with non-uniform costs. So maybe it's more expensive to run a machine during peak energy times and cheaper to run it at other times; we don't model that. Certainly, yes, dynamic programming can be extended to handle costs. So let me describe the algorithm. The algorithm is very, very simple and it is called Lazy Activation. The idea behind lazy activation is that you never have to be in a rush to do jobs. If you do jobs as soon as they are released then you are not really overlapping them in the maximum possible way. Lazy activation just says that you should do things lazily, as late as possible. So you could think of a lazy activation algorithm as follows: look at the job that has the earliest deadline; there's no reason to do this job before its deadline, so we will schedule it there. But once I turn the machine on to do this job, I now have an option of what other jobs to schedule; there is a whole bunch of other jobs that I could schedule at this point in time, and so far I have only scheduled one job.
I have a capacity of up to 3, so I could schedule 2 more jobs, and what the lazy activation algorithm does is take the jobs with the earliest deadlines in the future and schedule those first. We can prove that this is the right thing to do. The problem with the algorithm as I just described it is that it doesn't quite work, and the reason is that you might be running along and suddenly get stuck: you come to a point in time where maybe 100 times B jobs have that deadline. Now you just don't have enough time to schedule all those jobs, because you waited too long and not all of them can be scheduled here. So we are going to have a preprocessing step. In the preprocessing step we do the following (I [indiscernible] on this preprocessing step before we start lazy activation): I look at all of the jobs and scan time right to left. I look at the last deadline, for example, and ask, "How many jobs have this common deadline?" If at most B jobs have this common deadline then we are not worried; in fact, that's the property we want to enforce: for every possible deadline value, there are at most B jobs with that deadline. So the question is, what do we do if we have more than B jobs with some deadline? In this example B is 3, but we have 5 jobs with that deadline. Obviously the optimal solution cannot schedule all 5 jobs in that last slot; there is an upper bound of 3. So what we are going to do is take the excess jobs beyond B, the ones released as early as possible, and adjust their deadlines by subtracting 1. Now I certainly have the property that at most B jobs have this deadline, and we move to the next value and enforce the same property again. It might have been true that at this point in time there were fewer than B jobs with that deadline, but now there are more; then we apply the same rule, and we apply this rule recursively. So that is basically the entire algorithm. In step 1 we do this preprocessing step where we scan the deadlines from right to left, and then in step 2 we run the algorithm left to right doing lazy activation. Every time I have some option of what to schedule, I pick the jobs that are not immediately due but have the earliest deadlines in the future. A simple example: suppose step 1 has been run; then I pick the job with the first deadline and schedule it, then of all the overlapping jobs I pick the remaining 2 with the earliest deadlines and schedule those, and then I get rid of them and keep going. That's the entire algorithm. It is very simple and very easy to implement, but it has some interesting properties that are not completely obvious. I won't really talk about the proof; the proof is not very complicated, you can work it out.
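To make the two steps concrete, here is a small sketch in Python of how I read the preprocessing pass and the lazy activation pass for unit-length jobs. The representation of a job as a [release, deadline] pair over integer slots, the tie-breaking, and all names are my own choices for illustration, not the paper's.

    def preprocess(jobs, B):
        # Scan deadlines right to left; whenever more than B jobs share a
        # deadline, decrement the deadline of the excess jobs, choosing the
        # ones with the earliest release times. Jobs whose window collapses
        # (deadline < release) are infeasible and get dropped.
        if not jobs:
            return []
        hi = max(d for _, d in jobs)
        lo = min(r for r, _ in jobs)
        for d in range(hi, lo - 1, -1):
            same = [j for j in jobs if j[1] == d]
            extra = len(same) - B
            if extra > 0:
                same.sort(key=lambda j: j[0])      # earliest release first
                for j in same[:extra]:
                    j[1] -= 1
        return [j for j in jobs if j[0] <= j[1]]

    def lazy_activation(jobs, B):
        # jobs: list of (release, deadline); a unit job may run in any one
        # slot t with release <= t <= deadline. Returns the active slots
        # and the schedule (slot -> jobs run in that slot).
        pending = preprocess([list(j) for j in jobs], B)
        pending.sort(key=lambda j: j[1])           # earliest deadline first
        active, schedule = [], {}
        while pending:
            t = pending[0][1]                      # open a slot as late as possible
            active.append(t)
            ready = [j for j in pending if j[0] <= t]   # jobs released by time t
            for j in ready[:B]:                    # fill with earliest future deadlines
                schedule.setdefault(t, []).append(tuple(j))
                pending.remove(j)
        return active, schedule

    # Example: batch capacity 3, five unit jobs; two active slots suffice.
    print(lazy_activation([(1, 3), (1, 3), (2, 4), (2, 4), (4, 4)], 3))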
But the non-obvious property is the following. Here is an interesting case: again B is 3, but we have 7 jobs, all of which are available only over 2 slots. There is no way we can do 7 jobs in these 2 time units if we can only schedule 3 jobs per slot, so the input instance itself is not feasible. Now what will happen in step 1 is, notice, I have 4 jobs with this deadline, so we will subtract 1 from these 2 jobs. Then I will have 4 jobs with this deadline, I will subtract 1 again, and this window will collapse to the empty set. So this job's window collapses, and obviously this job has to be dropped. What is interesting is that we can prove that the algorithm actually schedules the maximum number of jobs that could be scheduled by any optimal solution. Moreover, the number of slots on which the algorithm turns the machine on is optimal. That is the part that takes a little bit of work to prove: we are scheduling the largest number of jobs we can feasibly schedule, and also at optimal cost. Questions? So in the next part of the talk I want to talk about arbitrary-length jobs, and there are some interesting open questions here. Arbitrary-length jobs are simply jobs that are not of unit length. In this example I have three jobs with release times, deadlines and some processing time, and the left side is the non-preemptive case: once I start a job, I have to run it all the way to completion. In the example on the left, the machine is turned on for the first 3 slots, then it's turned off for 2 slots, and then I have to do the last job. But if I am allowed preemption then we can do interesting things: for the middle job, I do 2 units of it, stop it, and finish the last unit later. So I can save a little bit on the active time by [indiscernible] for preemption. It's sort of like saying that some things in the oven need to bake for longer, but can be interrupted; I don't know if I want to eat a pizza that got pulled out of the oven half baked and then got put back in, but you can imagine that the processing of a job can be interrupted at no cost. This problem itself is NP hard in the non-preemptive version; that is very easy to prove. It turns out that in the preemptive case we actually don't know whether it's NP hard or not. Is the question clear? I have jobs with release times, deadlines and arbitrary lengths, preemption is free, I can schedule up to 3 jobs simultaneously, and I want to find an optimal schedule. We don't know whether that's NP hard or not. Our focus for a long time was on trying to find a polynomial time solution to this problem, because we could not prove that it was NP hard, but we were unable to find an optimal algorithm. So we developed an approximation algorithm, and I will give you the high level ideas; but again, it's an approximation algorithm without a proof that the problem is NP hard, so there might actually be an optimal polynomial time algorithm, or it might be NP hard, we don't know. So let me change gears a little bit and talk about a relationship with maximum flow, which some of you might not have seen before. The maximum flow problem is a flow problem on a directed graph where I have a source and a sink and I want to push the largest amount of flow from the source to the sink. Here we have a source and a sink; for every job we create a node in this graph, and the capacity of the edge going from the source to the job is simply the processing requirement of that job. So if this job has length 3, then 3 units of flow [indiscernible] will have to be sent through to the sink so that all 3 units get scheduled. For every time slot there is also a node, and the capacity of the edge going from a time slot node to the sink node is simply the batch capacity B.
If I turn the machine on, at most B units can be scheduled there. Now, in a standard flow problem the whole network is known and I just want to push maximum flow from S to T. Here the problem is slightly different: I want to select some of these time slots to activate, and once I activate a certain time slot I get a capacity of B going from that node to the sink node. If the time slot is turned off then this capacity is [indiscernible] and I cannot schedule anything there. So the goal now is to select a subset of these nodes and turn them on so that the max flow has value equal to the sum of the processing times of all of the jobs, so that everything gets processed. That's exactly the problem we wish to solve here. Once I decide which subset of slots to turn on, it becomes a flow problem, so I can certainly check the feasibility of a choice of slots by solving a max flow. If I decide, "Oh, I want slots 1, 3, 5 and 7 on," then the flow will tell me what the schedule is, in a preemptive way. And we are going to use this oracle in a very simple algorithm now. That leads to the concept of what I call "minimal feasible solutions". What we are going to do is turn on all of the slots initially; assuming a feasible schedule exists at all, we then obviously have a feasible max flow. Then, in arbitrary order, you pick the order, we start turning these slots off one at a time. If I turn a slot off, that's a permanent decision; all I check is that turning it off still leaves a feasible max flow. So that's the algorithm, just finding a minimal feasible solution: we shut down active slots one at a time, and we don't shut an active slot down if that would lead to infeasibility. Start from all slots being active, and as long as a feasible solution remains possible, keep shutting slots down. It turns out that this simple algorithm, which is really a dumb algorithm in some ways, not being intelligent about the order in which you shut things down (if you could find the right order you would converge to an optimal solution), even being completely blind, finds a solution whose cost we can prove is no more than 3 times opt. So this simple algorithm will find you a schedule which might not be optimal in terms of the number of active slots, but its cost is no more than 3 times opt.
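Here is a small sketch of that feasibility oracle and the minimal-feasible-solution loop, written against the flow network just described (source -> job -> slot -> sink). I am using networkx for the max flow computation; the data layout, the names, and the particular slot ordering are my own assumptions for illustration, not the authors' implementation.

    import networkx as nx

    def feasible(jobs, open_slots, B):
        # jobs: list of (release, deadline, length); a job may run one unit
        # in each slot t with release <= t <= deadline. Returns True if all
        # jobs can be completed preemptively using only the open slots.
        G = nx.DiGraph()
        total = 0
        for i, (r, d, p) in enumerate(jobs):
            G.add_edge('s', ('job', i), capacity=p)     # p units must be routed
            total += p
            for t in range(r, d + 1):
                if t in open_slots:
                    G.add_edge(('job', i), ('slot', t), capacity=1)
        for t in open_slots:
            G.add_edge(('slot', t), 't', capacity=B)    # slot capacity B
        value, _ = nx.maximum_flow(G, 's', 't')
        return value == total

    def minimal_feasible(jobs, B):
        # Start with every slot active, then shut slots down one at a time,
        # keeping a slot only if turning it off would break feasibility.
        slots = set(range(min(r for r, _, _ in jobs),
                          max(d for _, d, _ in jobs) + 1))
        assert feasible(jobs, slots, B), "instance is infeasible"
        for t in sorted(slots):            # any order gives the 3x guarantee
            if feasible(jobs, slots - {t}, B):
                slots.remove(t)            # permanent decision
        return slots

    # Tiny example with B = 2; three active slots are needed and found.
    print(sorted(minimal_feasible([(0, 2, 2), (0, 2, 2), (1, 3, 1)], 2)))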
Now that bound of 3 is in fact tight: there are examples where, if you are not careful about the order in which you shut things down, you can end up with a solution which is 3 times opt. I will show you one simple example illustrating a bound of 2, and you can generalize it to a bound of 3. Here is a collection of jobs with release times and deadlines, and the number on the right shows you the length of the job, its processing needs. Here is an optimal solution, with B equal to 5, and this solution has some interesting properties. Notice these jobs are of length 4 and are rigid; they have no flexibility. So the machine really has some spare capacity here to do one extra thing at a time, and what we decided to do was make progress on the long job, which is the right thing to do. But if you try turning the last slot off, it turns out that's feasible: it pushes these unit jobs, spreading them across like this, so all these jobs get done and the long job gets pushed out, and that's a feasible thing to do. But shutting that last slot down is a big mistake, and in this example it forces the minimal solution to be a factor 2 away from the optimal. So let me give you some high level ideas about where this bound of 3 comes from. Let me skip the left-shifting part; I am not going to have time to discuss it and it is not really that crucial. Once we find a schedule at the end, how are we going to analyze it? We found a schedule by this naive algorithm; there are some time slots in which we are doing B jobs, and there are some time slots which we call "non-full slots" where we are doing fewer than B jobs. Now the full kind of active slot is great: I turned the machine on, the machine had a capacity of B, and it is being 100 percent utilized; we cannot have more than opt many such slots. The problem comes when we have many, many slots of the other type, where the machine had a capacity of B but we were doing very little work; we might be paying a heavy penalty there. So after the algorithm ends, the main thing is to look at the schedule and identify which slots are full and which are non-full, and then figure out how to account for the non-full slots. If there are a large number of non-full slots, we are going to prove that they were unavoidable, and that the optimal solution also has to have a very large number of slots where not much is happening. So that's the dichotomy of active slots into full and non-full. Here is an example that gives you some intuition for what is going on; obviously the proof is a bit more involved, but it's a good way to think about the intuition. Suppose this was the entire input: this job had 3 units here, this job had 1 unit here, this job had 3 units there, and all the slots that you end up with are non-full. And you say, "Well, this is unavoidable. What else could the optimum do? There is no way to piggyback and do multiple things, even though we had a batch capacity of B." So our goal should be to identify some subset of jobs, J star, to charge to, which is a disjoint collection of jobs. If I find a disjoint collection of jobs, all of which are pretty long, then the sum of the lengths of those jobs is a lower bound on the number of active slots for any schedule: if these jobs are all disjoint in time, then any schedule has to do them, and the optimum solution has to pay that cost. Sadly, we could not find such a disjoint subset. What we were able to find was a subset J star of jobs such that at most 2 of them overlap at any slot. So it's not purely disjoint; it's a subset of jobs with the property that if you look at any point in time, at most 2 of these jobs overlap at that point. Now, if you think about the optimum schedule, the best thing the optimum can do at such a point is to be doing those 2 jobs. And what we can prove is that the number of our non-full slots does not exceed the sum of the lengths of the jobs in the subset J star. That's where the factor 2 comes from, and because we did not worry about accounting for the full slots, we could be spending as many as close to opt slots there. So that's where the bound of 3 comes from. Question?
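Schematically, my reading of the accounting just described is the following (writing p_j for the length of job j and J* for the chosen subset; this is a condensed sketch, not the formal proof):

    \#\{\text{full slots}\} \;\le\; \mathrm{OPT},
    \qquad
    \#\{\text{non-full slots}\} \;\le\; \sum_{j \in J^*} p_j \;\le\; 2\,\mathrm{OPT},

where the last inequality holds because at most 2 jobs of J* are alive at any point in time, so any schedule can do at most 2 units of J*'s work per active slot; adding the two bounds gives at most 3 OPT active slots.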
>>: [inaudible]. >> Samir Khuller: So the algorithm itself would have to be modified, because we have examples where this minimal solution, as B gets large, gets close to 3 times the optimum. So the algorithm is essentially tight. >>: [inaudible]. >> Samir Khuller: Quite likely, yes, but we didn't think about duality; if you think about this disjoint-jobs argument, it is a fairly simple combinatorial argument. >>: [indiscernible]. >> Samir Khuller: We apply [indiscernible] in the order. I do still believe that there should be a combinatorial 2-approximation, where intelligent orderings, as was indicated, would be the right way to go, but we were unable to prove it. Now, the problem does have a 2-approximation. I am not going to really have time to discuss it, but in a paper that appeared in [indiscernible] 2014 there is a 2-approximation based on LP rounding. We basically write an integer program, very similar to facility location type problems, where you define a variable y_t which models whether slot t is active or not. If a slot is active and y_t is 1, then it gives you a processing capacity of B in that slot, and you require that all the jobs get processed. Then you solve the relaxation and get a fractional solution. I have a picture of a fractional solution; I won't talk about the rounding, but that's what a fractional solution might look like, and then we have to round it to an integer solution. So you can get a 2-approximation this way, but it's pretty involved. If you can get a cleaner [indiscernible] algorithm I would be much happier with that. >>: [inaudible]. >> Samir Khuller: But the algorithm is quite slow. Jessica actually implemented the flow-based algorithm. In practice there are a lot of rules you can use. For example, in the instance I showed you, I said, "Oh, if I shut this slot down the resulting flow is still feasible, so it's okay to shut it down," but if you notice, when that long job got ejected out into the open, the cost of that schedule was very high in terms of activation. So when we are solving the flow problem we can actually try to compute some notion of the cost of the current active schedule, and you can use such a heuristic to guide your search and choose this slot or that slot, which might be the direction you were alluding to. In practice that does very well, but it is slow, because at every step we need to solve a flow problem; it's not really incremental. At every step we are deciding which slot to shut down: you might realize that shutting one slot down leads to infeasibility, so you don't touch it, you go to another one and try it. So the algorithm is slow because we have to solve these flow problems repeatedly. The same issue comes up with Wolsey's algorithm: if you try implementing it, it's again not very fast. Okay. So coming back to this idea of covering and this flow viewpoint, I think this is an interesting problem. You have a flow network; this models the jobs like I described before, then we have nodes here and there is a capacity going to the sink. In active time scheduling, rather than just computing a flow, what we are asking is: I want to select a subset of the nodes on the right, and we pay for how many nodes we select. Every time we select a node we get a certain capacity of B, and then we are trying to find a flow that supports a certain value. So you can think of that as another way to model active time scheduling.
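To make the LP-rounding remark above a bit more concrete, a natural relaxation in the spirit of what was just described looks something like this, with y_t indicating whether slot t is active, x_{j,t} the amount of job j done in slot t, and r_j, d_j, p_j the release time, deadline and length of job j (this is my notation; the exact program in the paper may differ):

    \min \sum_t y_t
    \quad \text{s.t.} \quad \sum_{t \in [r_j, d_j]} x_{j,t} = p_j \;\; \forall j,
    \qquad \sum_j x_{j,t} \le B\, y_t \;\; \forall t,
    \qquad 0 \le x_{j,t} \le y_t \le 1.

The integer program requires y_t in {0,1}; the rounding step turns a fractional y into an integral set of active slots while losing at most a factor of 2, which is where the 2-approximation comes from.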
Now this flow viewpoint is a very general framework, and in fact lots of problems can be thought of in it. If you go back and look at the work on vertex cover: vertex cover is a classical problem where you have a graph and you want to select some vertices to cover all of the edges; it's a covering problem. In fact [indiscernible] here had a very nice paper on capacitated vertex cover, where the idea is that nodes now have capacities. When you select a node, that node might have 100 edges incident on it, but if its capacity is 15 you can only cover 15 of those edges. This is exactly a capacitated covering problem. So you can think of all of these capacitated covering problems that have been studied in this light, in this flow network. The nodes here model the edges of a graph, and if an edge is incident to 2 vertices it has the possibility of being covered by this node or by that node, and the vertices of the graph are over here. Picking a vertex cover is just like selecting nodes on the right with a certain capacity; in the vertex cover case this capacity is not even uniform, every node has its own capacity, and then we want to route a certain amount of flow, or assign all of the edges. All of these papers develop constant factor approximation algorithms for this problem, but there is some special structure: in the graph case these element nodes have degree exactly 2. We generalized this to hypergraphs, where the nodes have constant degree, and got constant approximations. Last year there was a nice paper by Cheung, Goemans and Wong where they improved the bounds significantly. But all I was trying to point out is that all of these problems can be thought about in this general flow setting, and this problem [indiscernible] and [indiscernible] actually has a name: it's called min edge cost flow. Min edge cost flow charges you for any edge carrying non-zero flow: if the flow on an edge is 0 I don't pay for it, but if the flow is non-zero I pay for the edge, subject to its capacity. So there is clearly a close relationship between all of these problems. Okay, time for questions before I change gears. >>: [indiscernible]. >> Samir Khuller: That's right, yes. For the preemptive case I don't know of a proof that it is NP hard. We have a 2-approximation, and we have the lazy activation algorithm which works for unit length jobs. Our initial attempts were all about trying to extend the lazy activation algorithm to deal with non-unit length jobs, but we couldn't get a proof of optimality. Do you have any intuition about which way it might go? I have thought about it in both directions, so I don't know anymore. >>: [inaudible]. >> Samir Khuller: So let me talk a little bit about the non-preemptive case. I want to relate this to a problem that has been studied extensively in the literature, and I will try to keep this part somewhat non-technical. This is the busy time problem, where we again have jobs with release times, deadlines and processing times, but now we want to find a non-preemptive schedule; that's the main difference in this part. The number of batch machines, which was assumed to be 1 in what I discussed so far, is now basically assumed to be unbounded, even though every machine has a batch capacity, and I will explain that in a second. Let me jump to an example which will make it clear. So here is an example of busy time.
So here I have jobs with release times and deadlines, and these jobs all have some length, but this time I want to find a non-preemptive schedule. This is what the input looks like: I have jobs with lengths and I want to find a grouping of the jobs. Here is one possible grouping: I move the jobs around in time, that's the only flexibility we really have, and then we group the jobs. We group them in such a way that each group, or batch, of jobs has the property that at most B of them are running at any point of time. Here, say, B is 3, so I have at most 3 jobs running at any point of time; I group them into 2 batches, and the cost of the first batch is the duration for which it is on, which is from when the first job starts to when the last job ends. Then there is a cost for the second batch, and the goal is to minimize the total cost. What I meant by the assumption of unbounded machines is that these rectangles can actually overlap. Of course this particular schedule is fine, even one machine could run it: come on, do these jobs, turn off, come on again, do these jobs and turn off. But in the problem definition this is not a constraint; the rectangles could overlap. The assumption is that a different virtual machine is used to run every batch. That's not a desirable part, and the work that I have been doing lately tries to address it; we have some partial results in that direction. Is the problem clear? This is the non-preemptive version, with the caveat that the number of machines is unbounded. What's more interesting is that a very special case of this problem, for interval jobs, is already hard. So what is an interval job? An interval job is like the job on the top right, where there is no flexibility: the length of the job is exactly the gap between the release time and the deadline. That's what interval jobs look like; you have no choice about when the job starts or ends. The job starts at the release time and ends at the deadline. So what is hard about it? It sounds like a trivial problem; it is just the grouping that's hard, and this problem is NP hard even for interval jobs. There is a paper by Winkler and Zang in 2003 which proves the NP hardness of this problem. So all you have to do is come up with a grouping, nothing else, and it is already NP hard. Okay. So where are we? The problem was proven to be NP hard by Winkler and Zang. A few months later there was a paper where Alicherry and Bhatia gave a 2-approximation, and they have an example that shows their algorithm cannot do better than 2. Then there was another paper published 2 years later which gave a slightly different algorithm that is also a 2-approximation for the same problem. Then a few years later a 4-approximation was published. So you say, "Wait a second, why are the bounds getting worse? They should improve." These authors were simply unaware of the previous work, that's all. Now, the algorithms developed in all 3 papers are actually different, and this last one I would say is the simplest of the lot; it's a very elementary greedy algorithm, very easy to implement, and the lower bound for their algorithm is 3. Because this greedy algorithm looks so simple, I thought it would be very easy to analyze, so Koyel, Jessica and I spent a long time trying to prove that it is actually a 3-approximation, but we failed in doing that. So I don't know where this greedy algorithm really lies.
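(An aside to pin down the busy time objective defined above: each batch is a list of scheduled jobs given as (start, length) pairs, its cost is the span from the earliest start to the latest finish, and batches may overlap in time since each runs on its own virtual machine. This little snippet, with made-up numbers and my own names, does not check the capacity constraint that at most B jobs of a batch run concurrently.)

    def busy_time_cost(batches):
        # batches: list of batches; each batch is a list of (start, length)
        cost = 0
        for batch in batches:
            begin = min(s for s, _ in batch)
            end = max(s + p for s, p in batch)
            cost += end - begin          # duration this batch's machine is on
        return cost

    # Two batches, e.g. with B = 3; the spans are 3 and 4, so the cost is 7.
    print(busy_time_cost([[(0, 3), (1, 2), (0, 1)], [(5, 4), (6, 2)]]))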
So after we failed in that attempt, we actually discovered these 2 earlier papers and realized that maybe even improving the bound to 3 wasn't that interesting, because there were already algorithms with bounds of 2. Yes? >>: Is there any way to say anything about how these algorithms perform in practice? >> Samir Khuller: Yes, I actually have a high school student who has been working on it [indiscernible], and the algorithms that do well in practice are, I would say, small modifications of these algorithms. You can add some intelligent choices to these greedy algorithms, and those actually end up doing very well in practice. She never implemented this one; she implemented this one and that one, and most of the time this one really does quite well, but with some changes to the algorithm. >>: So the original –. >> Samir Khuller: Yes, the original algorithm doesn't do very well, but if you change it to do something more intelligent, in practice it does really well. >>: [inaudible]. >> Samir Khuller: Worst case bound, that's correct. So the story is going to get a little bit more interesting in a second. This paper is actually interesting for a bunch of reasons and I will try to talk about that. All of these results, by the way, are only for interval jobs, which was that very simple-sounding problem where you say, "Well, I have no choice as to when the job starts, so why is this even hard?" It's very frustrating that this problem is NP complete, because it's just the grouping that makes it hard. It is NP complete, and all these papers give basically a 2-approximation as the best bound. Now what do you do in the more general case, the non-interval jobs? There is a paper by Khandekar et al which gave a 4-approximation for this problem, and what the algorithm does is very interesting. It says: just assume for a minute that your batch capacity is unbounded. How difficult is that problem? It turns out that problem is actually solvable in polynomial time. The solution is fairly complicated; it uses dynamic programming, complicated in the sense that the complexity is something like N to the power of 6. So it's not very efficient, but you can solve this problem optimally. What are you trying to do there? You can move these jobs around in time; that's the only choice you have. Batch capacity is unbounded, as many things as you want can run concurrently, and you try to minimize the duration for which your machine is on. So you are simply trying to move things around so that you minimize the projection on the x-axis; that is all. What they show is that you solve this problem for unbounded B, you get this schedule, and now you treat these jobs as rigid interval jobs: you basically adjust their release times and deadlines to snap around wherever the jobs got scheduled in that solution. Now you have an interval job instance, and then they run the greedy 4-approximation algorithm for interval jobs from the Khandekar et al paper, and they prove that the final bound is still 4. So this restriction doesn't really cost you anything, which is a bit strange. Okay. Now we looked at this and said, "Why are they using a 4-approximation? We now know that 2-approximations exist, so why don't we plug in a 2-approximation?" That's a better algorithm, at least in terms of the worst case bound. It turns out that doesn't quite work.
What happens is that when you do this adjusting of the jobs, the optimum value can actually jump by quite a bit, but the greedy algorithm is oblivious to the optimum schedule: it bounds things based on the sum of the processing times of the jobs, which never changes. So there is a benefit to their analysis of the greedy algorithm. What we were able to show is that even if you plug in the better 2-approximations, once you have done this first step the optimum solution might jump by a factor of 2, and then when you apply a 2-approximation you can end up with a solution which is 4 times the original optimum. So no matter which route you go, you end up with a 4-approximation. You can solve the problem the way they did, doing the dynamic programming and then running the greedy algorithm, and you get a bound of 4; or you can plug in a 2-approximation, and you are not getting anything better because the optimum solution jumped. And we actually have examples where the optimum solution jumps, you apply that algorithm, and you get a bound of 4. Okay. So what do we do? Our final result is a bound of 3 for the general problem, and our bound is 3 for interval jobs as well; for the non-interval case, everything is 3. So it is a better bound. How does that algorithm work? The first step is still the same: we solve the dynamic program for unbounded B and get an interval instance, but now our greedy algorithm is a little bit more sophisticated. Their greedy algorithm does the following: it simply sorts the jobs in decreasing order of length, so it worries about the long jobs first, and then basically it's like a trivial bin packing algorithm: it starts stuffing jobs into batches, and when putting a job into a batch would exceed the batch capacity it creates a new batch. It's a very simple algorithm. We are going to do something slightly cleverer. We have a large collection of interval jobs and we want to decide what the first batch is going to be. Remember, their algorithm orders the jobs by length and handles one job at a time; we are going to do something slightly more sophisticated. Our goal is to find a subset of disjoint jobs. In this case the red set of jobs is a disjoint collection of jobs. And which subset of disjoint jobs do we want to find? We want to find a subset of maximum total length. If you look at the [indiscernible] textbook, in the dynamic programming and greedy algorithms chapters, this is known as the weighted interval scheduling problem, and there is a very simple dynamic programming solution for it. So the goal is simply to find a collection of disjoint jobs of maximum total length. We call this algorithm greedy tracking, and this disjoint collection of jobs gives you the first track of the batch. Then, on the remaining collection of jobs, we apply the same algorithm again: we find another disjoint collection of jobs and make that the second track. Once I have filled up B tracks, that's my batch, and I am done with the first batch. Then whatever jobs are left, I apply the same algorithm again, finding a maximum disjoint collection of jobs at every step.
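Here is a short sketch of that greedy tracking step: weighted interval scheduling (with weight equal to length) to pull out a maximum-total-length disjoint set of intervals, repeated until no jobs remain, with every B consecutive tracks bundled obliviously into one batch. The representation of a job as a half-open interval [start, end) and all names are my own choices for illustration; the paper's implementation may differ.

    import bisect

    def max_disjoint_by_length(intervals):
        # Weighted interval scheduling with weight = length: returns a
        # maximum-total-length subset of pairwise disjoint intervals.
        ivs = sorted(intervals, key=lambda iv: iv[1])      # by finish time
        finishes = [iv[1] for iv in ivs]
        n = len(ivs)
        best = [0] * (n + 1)       # best[i]: optimum over the first i intervals
        take = [False] * n
        for i, (s, e) in enumerate(ivs):
            j = bisect.bisect_right(finishes, s, 0, i)     # intervals ending <= s
            if best[j] + (e - s) > best[i]:
                best[i + 1] = best[j] + (e - s)
                take[i] = True
            else:
                best[i + 1] = best[i]
        chosen, i = [], n                                  # trace back the choices
        while i > 0:
            if take[i - 1]:
                s, e = ivs[i - 1]
                chosen.append((s, e))
                i = bisect.bisect_right(finishes, s, 0, i - 1)
            else:
                i -= 1
        return chosen

    def greedy_tracking(intervals, B):
        remaining = list(intervals)
        tracks = []
        while remaining:
            track = max_disjoint_by_length(remaining)
            tracks.append(track)
            for iv in track:
                remaining.remove(iv)
        # bundle B consecutive tracks into one batch, obliviously
        return [tracks[k:k + B] for k in range(0, len(tracks), B)]

    # Example with B = 2: four intervals, two tracks, one batch.
    print(greedy_tracking([(0, 2), (1, 3), (4, 6), (5, 7)], 2))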
Now this algorithm –. >>: So this is like coloring with [inaudible], right? >> Samir Khuller: Yes. >>: [indiscernible]. >> Samir Khuller: But here the cost functions are different. In the end I am going to look at the –. So look at this example: I pick the first track, which is the top 2 jobs; here all the jobs are the same length and the example is set up so that there are no 3 disjoint jobs. So the first track is the first 2 jobs. Then the second track could be these 2 jobs, and that's my first batch, the first 2 tracks. Then track 3 creates a new batch with these 2 jobs, and track 4 goes with track 3. So notice that my cost is actually pretty high, because I am basically turning this batch on here and it goes all the way to the end, and then I'm turning this other batch on here and it also goes all the way to the end. Now anybody looking at this picture is going to say, "Wait a second, track 1 and track 3 should have been merged together, and then track 2 and track 4 should have been merged together." Our algorithm is not taking advantage of any alignment; we just union the tracks, and that's where you pay a price. It's not exactly answering your question, but I'm just trying to illustrate that even for our algorithm we have an upper bound of 3, and this is the example we have that shows a lower bound of 2. So the right answer is somewhere between 2 and 3. >>: [indiscernible]. >> Samir Khuller: Maybe that would give a bound of 5, but I don't know; we didn't think about that. But how do you take the overlap cost into account? >>: How do you take it? >> Samir Khuller: We don't, and that's why we think it's a penalty of 3. Even in this example we are not overlapping anything: track 1 and track 2 get unioned into 1 bundle and track 3 and track 4 get unioned into 1 bundle, and that's not optimal. If I put tracks 1 and 3 together, that's a much better alignment. >>: But you can merge 2 tracks only if they are completely identical? >> Samir Khuller: No, in our scheme it's oblivious: the first B tracks go into the first batch, the next B tracks go into the next batch, and so on, and even that works. So it improves the bound from 4 to 3, but I think the right answer is 2, and it's quite likely there is some clever way of doing track merging. Going back to how we are finding these tracks: Seth is right, we are finding one color class of maximum weight, and then with whatever jobs remain we create the second track. But you are saying that we should have actually found B of them simultaneously. >>: [inaudible]. >> Samir Khuller: But I don't know whether that would help in this example, because why would it prevent track 1 and track 2 from being paired together? It won't, because between track 2 and track 3 the only difference is the overlap with track 1, and your objective function somehow isn't modeling that. So I believe that's where the improvement needs to come from. I just don't know whether it would be of much value in solving the problem. So in terms of the actual proof, here is the key lemma which underlies all of the analysis. I won't go through the analysis, but I want to mention the lemma: we are able to prove that, at every step, the maximum disjoint collection of jobs that we find, what we are calling a track, has the property that its span, the total cost of the track, is at least 50 percent of that of all of the remaining jobs. This lemma is the key to the whole analysis comparing against the optimal solution. I am not going to spend time going over the proof; at a high level it follows a greedy charging scheme.
Of course the proof is actually different, because the algorithm is different, but it is a fairly standard charging proof; I wouldn't say it's anything very sophisticated. Let me also say that there is a student who has been implementing some of these algorithms and comparing them. We don't have any real data sets, so all of this comparison is on synthetic data sets. Like I mentioned earlier, the dynamic programming solution has a lot of inefficiency in terms of running time when mapping to the general interval case; now we have some algorithms that at least do this efficiently, but at a cost to the quality of the solution. I didn't really get a chance to talk about this result, but that's fine. >>: What's the online model? >> Samir Khuller: Oh, the simplest online model is that jobs become known only when they are released, and we actually have an algorithm for the case where B is infinity. So you have this very powerful machine that you can turn on whenever you want, and whatever is in the system gets run, but it's a very expensive machine to run, so you want to minimize the time for which you run it. In the online setting we have a 5-competitive algorithm. The algorithm is very simple: you delay things as much as you can. You never turn the machine on until you are about to be in a situation where, if you don't start a job, you are going to miss its deadline. So you start the job at that point, and, let's say this job has length 13, then you commit to turning the machine on for twice the length of the job. Everything else that is available while the machine is running and fits in that window of time will get executed, and then you turn the machine off, unless some other job is also about to miss its deadline. The main thing is that when we turn the machine on for a job, we double the commitment to how long we are going to run it for. For that algorithm the upper bound and lower bound of 5 are tight; we have examples where the algorithm pays a factor of 5. But again, I don't believe that's the right algorithm, and I think improvements to it should be possible. >>: So is that like price efficiency for like HDInsight or like batch? I am not sure if you are familiar with Azure. >> Samir Khuller: No, I am not; maybe we can discuss it another time. So let me quickly go over these slides. The paper I spent most of my time talking about, lazy activation, I actually only covered one algorithm from; the paper has several results and appeared in 2012. Then we had an experimental analysis of some of those algorithms, for capacitated covering and also for the general length problem, in [indiscernible] 2013. Most of the stuff on busy time and also the preemptive case was published in a more recent paper in Spar 2014. I didn't really get a chance to talk about the work with Frederic. The stuff that I mentioned very early on was a paper by Barna, Jian and me, which was published in [indiscernible] 2010, on the machine activation problem. Then there are some generalizations, and these are some of the papers that I cited in the talk. I also want to show you pictures of my collaborators. Barna Saha graduated with her PhD and is now faculty at UMass. Jessica Chang finished her PhD a couple of years ago and is now at the Department of Defense. Jian is now faculty at Tsinghua University. Koyel is at Xerox labs.
You were asking about Gabow earlier; he is a professor at Colorado, but he does still read e-mail and he has been pretty hard at work in his retirement. One of the things he told me is that he had a bunch of conference papers that he never published in journals, so he has been spending a lot of time writing 40-page-long journal versions of work he published in conferences a long time ago. He sent me some of them to read, but I just haven't had time, though it is very interesting work. I also didn't get a chance to talk about some of the things he did in the [indiscernible] paper, which are very interesting; he developed some [indiscernible] algorithms for parts where we were initially solving linear programs. And Frederic Koehler is actually an undergraduate at Princeton; I started working with him when he was in high school in Maryland, and we have continued that collaboration. So let me take a few more minutes and talk a little bit about some stuff that has been the focus of attention for both myself and, at the back of the room, the inventor of [indiscernible], Bill [indiscernible], who is a professor at UMD. Starting about a year ago we have been hard at work on a building project. I wanted to talk a little bit about the department. The department has over 50 faculty; it's much larger when you consider the affiliate faculty in the various other schools. Undergraduate enrollment in computer science is booming all over the country, but especially in Maryland: we went from 1,000 majors to 2,200 undergraduates just in the last 36 months. There are about 250 minors and about 400 computer engineering students. So let's talk a little bit about looking ahead. I know HoloLens is a big exciting thing happening here at Microsoft. One of our [indiscernible], Brandon [indiscernible], along with 2 of his friends from Maryland, Michael Antonov and Andrew Reisse, are cofounders of a company called Oculus VR, which was big in the news last year when it got bought by Facebook for 2 billion dollars. Brandon and Michael have made an amazing gift to kick off this project that Bill and I have been heavily involved in. That's a picture of the model of the building that we are in the middle of planning. It's a 6 floor building for computer science, with a lot of facilities: lecture halls, collaborative classrooms, a big open cafeteria, research labs, lots of space for PhD students, and so on. I guess our undergraduates, who have been stuffed like sardines in classrooms lately, would love those facilities. Especially for PhD students, the current building that we are in is kind of depressing; none of the offices have windows and light. So this new building would be an amazing facility for students to come together, collaborate and work together. I just wanted to share with you the ground floor building model; we have been working with the architects for the last 6 months or so. That's the extension that was in the picture before: this part is going to have both a 100 seat collaborative classroom and a 300 seat collaborative auditorium. The main building itself, that's its footprint: it's a boomerang shaped building, this is the open cafeteria space, and then there are going to be lots of research facilities and labs.
The first two floors are primarily for undergraduates, with classrooms, robotics labs and so on, as well as hacker/maker spaces, which actually occupy about a third of the second floor. Floors 3, 4, 5 and 6 are mostly for research and PhD students. So that has been a pretty exciting project that we are in the middle of fundraising for. You asked me earlier what brings me to Seattle. We have about 90 percent of the money in place: the state of Maryland is funding about 100 million dollars, Brandon and Michael together gave 35 million, and the cost of the project is expected to be over 148 million, so we are still about 12 to 13 million short. That's what we are fundraising for, but the building project is on a very fast track. The plan is to do a groundbreaking next year, so I guess the architects have to finish their plans this year, and it opens 2.5 years from now. So come visit. >>: Where is it going? >> Samir Khuller: It is going next to the CSIC building. If you look at this model carefully, that's the CSIC building and this is a big parking lot right now. So as soon as you enter campus drive from that –. >>: [indiscernible]. >> Samir Khuller: From campus drive this is the first big thing you will see on the right hand side, actually very prominent from Route 1. >>: [indiscernible]. >> Samir Khuller: Yes, and in fact the whole parking lot will go away and this area will all be landscaped. >>: [indiscernible]. >> Samir Khuller: [inaudible]. >>: That's actually the reason why you see that part of it doesn't go all the way down to the ground. There is a 100-year floodplain, so we can't actually build in that area on the first floor. >>: Got it, and that's an overhang? >> Samir Khuller: Yes, that's an overhang. >>: Well, we were hoping to do a pure [indiscernible], but it turned out that was going to be a little bit too expensive, so there will be some pillars there to help support it. But that will be a plaza covered by the rest of the building. The whole area will be a new quad; there will be people playing Frisbee and hanging out on the quad. >> Samir Khuller: Right, that's another space. Also, if you want to organize conferences here this would be amazing, at least for a conference of around 300 people. >>: Will there be a hotel across the street? >> Samir Khuller: There will be a new hotel coming up next summer across the street. That hotel project started a while ago actually; there is already stuff in the ground. So they will open well before us I think, certainly by January 2017. That's just about 100 meters away. But yeah, it will be a great place to host events, because this part is separated from the noise of the rest of the building and so on; you can actually have an event here and it doesn't disturb the other occupants of the building. What else? In terms of the department, these are the areas that we are planning on growing in. The plan is to hire about a dozen new faculty in the next 3 years, so there are going to be a lot of faculty positions. Along with Brandon's gift, his mom gifted 2 chairs for computer science, so we are going to be recruiting for a chair in the specialty of virtual and augmented reality. There is already a lot of activity in Cybersecurity with the [indiscernible] Cybersecurity center that opened about 3 years ago, but our plan is to grow that. There is a new quantum computing institute; Andrew Childs from the University of Waterloo was recruited about 8 months ago.
So he joined Maryland and we are looking for a second faculty member in that area. These 3 areas are the big growth areas, I would say, in the next 3 years. >>: [inaudible]. >> Samir Khuller: Well, in quantum computing almost everybody in that space is a theory person, at least the ones we are trying to go for. Andrew is a theory person, for example. >>: Who? >> Samir Khuller: Andrew Childs. He got his PhD in physics, but several of his papers are in theory conferences, for example. He was a professor at Waterloo for a number of years before we recruited him. Another exciting thing that is going on is CS Education for Tomorrow. This was actually funded by a gift from Bill Pew and it is twofold. We are trying to improve the quality of the education for our students, and part of this involves creating flipped classrooms. A lot of the faculty have taken this model on, where lectures are videotaped. There is a video lab set up now, so if you are teaching any class at Maryland you can just come to the video lab and record the lectures in advance; students watch the lectures beforehand and then the classroom can actually be a discussion as opposed to just lecturing. We also just recruited a special honors advisor whose goal is to do CS enrichment for undergraduates. Different from the regular advisors, this is a person, Rich [indiscernible], who used to be a professor at Maryland and left to go to Italy 3 years ago; we just recruited him back. We are also creating a data science graduate certificate program which will launch next year. This is a full 12-credit program of 4 courses, and we are in the process of creating an undergraduate data science specialization as well. This will be similar to the Cybersecurity specialization, which is the only specialization we have right now. That's everything, I am out of time, so let me stop here. >> Mohit Singh: Let's thank Samir. [Applause]