

>> Leonardo de Moura: It is my pleasure to introduce Robert Nieuwenhuis. Robert works on automated reasoning, from paramodulation to theorem provers; he is one of the authors of the very influential [indiscernible] paper. He implemented the Barcelogic SAT/SMT solver, which was a breakthrough at the time. Since then Robert and his group have started a company, Barcelogic, and they are now working on optimization. Today he is going to talk about how he made the transition from formal verification to high-performance constraint solving.

>> Robert Nieuwenhuis: Thank you very much. The things Leonardo just mentioned involve a group, right, it's not me, it's many people, and among others [indiscernible] was here. So this is about how we moved from tools that are mainly used for formal verification to optimization.

So first I will speak a little bit about what Barcelogic is, then there is going to be a technical part, and depending on the interest of the audience we can skip a part or go through it, and there is an application part. Many of you know the technical part: clause-learning SAT solvers. Why do they work so well? What is SMT? Why does it work so well? Maybe there is a new view in it for some of the people in the audience.

ILP: so integer linear programming as an SMT problem, and also hybrids. So SMT plus what I call encoding bottlenecks, and then going beyond: instead of learning only SAT clauses as you do with SMT, you can also learn new constraints. Then we go into IntSat and some evaluation of IntSat. Then some words about the focus of Barcelogic and examples of our customers' problems and the tools used.

So this is Barcelogic, it’s a spin-off of our university, this is ownership, this is the core team.

Where did we come from originally? Out of these areas here: automated deduction, which has many applications, work on the implementation of logics, rewriting, termination proving, and constraint programming; inside CP we have weighted CP and probabilistic graphical models, and there are a lot of collaborations with other places going on there.

Then the focus nowadays inside Barcelogic is mostly on SAT and SMT and their many applications. For instance, we defined this DPLL(T) standard, which is a kind of framework that many SMT solvers use nowadays. As Leonardo said before, we built some tools, and this also has many applications, as many of you know here. We also work on many other combinatorial optimization problems using many other techniques and solvers.

So in this talk, in the technical part, I am going to explain about this: SAT, SMT, IntSat and other techniques and applications. So this is SAT, as most of you know. Here we have clauses and what you do is you decide, you propagate, etc.; each time the blue clause is the one that acts. Then you have this: the red numbers denote variables that have been decided and the black ones denote propagations. Then you have this false clause, it's a conflict, so you have to backtrack, and a typical backtrack would be this. Then you have a solution, but you can do much better. You can also backjump instead of backtracking.

So backjump, let’s briefly remember, it notices for instance in the example we saw before that this decision level is irrelevant for the conflict. So we could directly backjump to here because the conflict comes from decision 1, its consequence 2, and decision 5 and its consequence not 6.

So having not 5 here is needed to solve the conflict. So conflict analysis is this: you have to find a backjump clause. In this case it's this one, which is a logical consequence of all the clauses you had and which enables a unit propagation at an earlier decision level, where the part C is false.

Then backjumping is after that you return to the decision level D and do the propagation.

So here is another example: you have this current assignment, the last decision is 9, and then with these clauses here you successively propagate not 8 with this clause, then not 5 with this clause, etc., until you reach this conflict here. So then you are going to analyze where the conflict comes from and you say, "Well, what was the last literal involved in this conflict? Where did it come from? Oh, it came from this clause here." So then you do a resolution and you get a new false clause that is false in a prefix of the stack. You can repeat this until you reach a clause that has only one literal of the last decision level; this is called the 1UIP. And you can always use this clause to backjump. In this case you would backjump to after not 7. So independently of how many decisions you have in between you can backjump directly to this point. This is simple and most people know it.
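As a rough illustration of this resolution-based conflict analysis, here is a minimal Python sketch of deriving a 1UIP backjump clause. The data structures (trail, reason map, level map) follow the usual CDCL conventions, but the names are invented for this example; a real solver works on an implication graph with watched literals.

```python
# Minimal sketch of 1UIP conflict analysis by resolution (illustrative only).
# trail: literals in assignment order; reason[lit]: clause that propagated lit
# (decisions have no entry); level[var]: decision level of that variable.

def analyze_conflict(conflict_clause, reason, level, trail):
    clause = set(conflict_clause)                      # all literals are false here
    current = max(level[abs(l)] for l in clause)       # the conflicting decision level
    for lit in reversed(trail):                        # walk the stack backwards
        at_current = [l for l in clause if level[abs(l)] == current]
        if len(at_current) <= 1:                       # only the 1UIP literal remains
            break
        if -lit in clause and lit in reason:           # lit was propagated, so resolve
            clause.discard(-lit)
            clause |= {l for l in reason[lit] if l != lit}
    backjump = max((level[abs(l)] for l in clause if level[abs(l)] != current),
                   default=0)                          # second-highest level in clause
    return sorted(clause), backjump                    # learned clause + target level
```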

Now a view on why it works so well, in my opinion. The first thing is that you learn the backjump clause as a lemma. This makes unit propagation more powerful and it prevents an exponential amount of repeated work on future, similar conflicts. These similar conflicts tend to come up when you have structure in your problems; on random problems this typically doesn't do anything. Also, the decision heuristic: you decide on variables that have many occurrences in recent conflicts. This is called a dynamic activity-based heuristic, and the idea, from my point of view, is that you work off clusters of tightly related variables.

So if you apply this algorithm to two SAT problems put together that share no variables, two completely independent problems, you will see that the typical SAT solver will work only on one of them thanks to this heuristic, which is of course a good thing. Then after finishing with this one it will work on the other one. The third point is to forget the low-activity lemmas from time to time. This is crucial to keep unit propagation fast and memory affordable. The idea is this: after you have worked off such a cluster, the ones I was mentioning here, these lemmas are no longer needed. You have stronger lemmas talking about that part of the search space.
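A minimal sketch of these two ingredients, an activity-based decision heuristic and periodic forgetting of low-activity lemmas, might look like this in Python; the class and field names are invented for illustration (this is the general VSIDS-style idea, not the Barcelogic code):

```python
# Minimal sketch of an activity-based (VSIDS-style) decision heuristic plus
# lemma forgetting.  Names are invented for illustration; not the Barcelogic code.

class ActivityManager:
    def __init__(self, num_vars, decay=0.95):
        self.activity = [0.0] * (num_vars + 1)   # one score per variable
        self.bump = 1.0
        self.decay = decay

    def on_conflict(self, learned_clause):
        # Bump the variables involved in the conflict; letting the bump amount
        # grow is equivalent to decaying all other activities.
        for lit in learned_clause:
            self.activity[abs(lit)] += self.bump
        self.bump /= self.decay

    def pick_branch_var(self, unassigned):
        # Decide on the most active unassigned variable, which tends to keep
        # the search inside one cluster of tightly related variables.
        return max(unassigned, key=lambda v: self.activity[v])

def forget_low_activity(lemmas, keep_ratio=0.5):
    # lemmas: list of (clause, activity) pairs.  Periodically drop the
    # low-activity half so unit propagation stays fast and memory affordable.
    lemmas.sort(key=lambda pair: pair[1], reverse=True)
    return lemmas[: int(len(lemmas) * keep_ratio)]
```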

>>: So backjumping is not really important.

>> Robert Nieuwenhuis: I guess if you would learn this then the backjump itself would not be that important.

>>: So you can replace backjump by fast replay.

>> Robert Nieuwenhuis: By fast?

>>: Replay.

>> Robert Nieuwenhuis: Restart.

>>: Yea, fast restart.

>> Robert Nieuwenhuis: Yea, exactly.

>>: Which should play into 3.

>> Robert Nieuwenhuis: Yea, good point. So why is this working so well? What is good, what is bad? Well, I think we all know. This is a slide I used to have in my talks already some years ago. I was saying, "What is bad?" Well, the language is low level; it's difficult to come up with good encodings, especially for arithmetic. So we did a lot of work on encodings, and many people have worked on this: good encodings for 0-1 cardinality constraints, Pseudo-Boolean constraints, encodings for the integers, and also this part here. So usually you get an answer, unsat, or you get a model, but optimization was not as well studied in this context.

So how can we solve these bad things? That's what the rest of the presentation is about. SAT Modulo Theories, again most of you know. So this comes from software/hardware verification applications, so reasoning about theories. Here are a couple of examples. This is EUF, essentially the theory of [indiscernible], and here you have several combined theories. So there is read and write on arrays, there is arithmetic, and there are some of these uninterpreted functions. So this is typical in verification applications.

So this is the first approach; I think it's the appearance of SMT in this sense, the lazy approach. It is also known as lemmas on demand, by [indiscernible] and [indiscernible]. The idea is simply to forget about the meaning of the literals, just consider them as propositional literals and send the clause set to the SAT solver. Then the SAT solver returns a model, and then you have another piece of software, the theory solver, which can reason about conjunctions of these theory literals. It says, "No, this is T-inconsistent, you cannot have together this, and this, and this; that's contradictory with the theory." So then you can add a clause forbidding this and send it again to the SAT solver. It returns another model, it's again theory inconsistent, then you send it again to the SAT solver with a clause blocking it and then it becomes unsatisfiable.

So eventually either you get unsatisfiable or you get a theory-consistent model in this way. But then you can do some improvements, which are quite easy to come up with. Since your solver is DPLL or CDCL based you can do better. For instance, instead of checking the T-consistency of full propositional models you can check the partial ones while they are being built. And instead of adding not M as a clause when you get this T-inconsistent model M, you try to come up with a small T-inconsistent subset. This is called an explanation, and usually explanations are very small. And instead of adding the clause and restarting you could do conflict analysis of the explanation and backjump.
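The basic lazy loop, before those refinements, can be sketched in a few lines of Python; sat_solve and theory_check here are placeholder interfaces, not any particular solver's API:

```python
# Minimal sketch of the lazy SMT loop ("lemmas on demand") described above.

def lazy_smt(clauses, sat_solve, theory_check):
    """clauses: propositional abstraction of the SMT problem.
    sat_solve(clauses) -> model (list of literals) or None if unsat.
    theory_check(literals) -> None if T-consistent, else a small
    T-inconsistent subset (an 'explanation')."""
    while True:
        model = sat_solve(clauses)
        if model is None:
            return "unsat"
        explanation = theory_check(model)
        if explanation is None:
            return model                       # T-consistent model found
        # Block this T-inconsistent subset and try again.  A refined solver
        # would instead check partial models and do conflict analysis + backjump.
        clauses.append([-lit for lit in explanation])
```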

Then we came up with this DPLL(T) approach, where you also have what is called theory propagation, which is what people in constraint programming call propagation. So you can propagate literals that are theory consequences. So you not only validate the search, but you also guide it a little bit more.

>>: [inaudible].

>> Robert Nieuwenhuis: Yea, but most people don’t think about Google as being a company.

Okay, this is an example and I think it's quite interesting for the rest of the talk. This may be newer for you. So consider a theory where you do integer linear programming and the theory is just the conjunction of linear constraints. You decide and you do unit propagation on bounds. So you have these bounds, and theory propagation is then just bound propagation: for instance, from these two bounds and this constraint you can infer this new bound, and an explanation clause for this propagation would be this clause here. So either this bound is false, or this bound is false, or this one is true. But of course from one constraint you can have many different explanations, depending on which propagation you have done with the constraint.
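As a concrete sketch of this kind of bound propagation with an explanation generated on demand, here is a small Python example; the constraint and variable names are made up, and a real solver would represent the explanation as a clause over bound literals:

```python
# Minimal sketch of bound propagation on a linear constraint sum(c_i * x_i) <= rhs,
# together with the bounds used (the "explanation").  Illustrative only.

def propagate_upper_bound(constraint, rhs, lower, upper, target):
    """constraint: dict var -> coefficient (coefficient of `target` must be > 0).
    lower/upper: current lower/upper bound maps.  Returns (new_bound, explanation)."""
    c_t = constraint[target]
    rest_min, explanation = 0, []
    for v, c in constraint.items():
        if v == target:
            continue
        if c > 0:                              # positive coefficient: use the lower bound
            rest_min += c * lower[v]
            explanation.append("%s >= %d" % (v, lower[v]))
        else:                                  # negative coefficient: use the upper bound
            rest_min += c * upper[v]
            explanation.append("%s <= %d" % (v, upper[v]))
    new_bound = (rhs - rest_min) // c_t        # floor: c_t * target <= rhs - rest_min
    # Explanation clause: "one of the used bounds is false, or target <= new_bound".
    return new_bound, explanation

# Example: 2x + 3y + z <= 10 with y >= 2 and z >= 1 propagates x <= 1.
print(propagate_upper_bound({"x": 2, "y": 3, "z": 1}, 10,
                            lower={"y": 2, "z": 1}, upper={}, target="x"))
# -> (1, ['y >= 2', 'z >= 1'])
```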

And if there is a conflict you generate the explanation clauses on demand during the conflict analysis and you do everything as in SAT. Termination and completeness follow from the standard SMT completeness result. The interesting thing here is that doing pure SMT in this context is a little cowardly: you only learn new clauses, S doesn't change, everything is simple. This was later also developed by Peter Stuckey and others; they called it Lazy Clause Generation and they use it in the CP world, the constraint programming world, and it works very well on many CP problems, really well.

Now why does SMT work so well? Because most constraints are not bottlenecks, so they only generate a few explanation clauses in practice, and SMT will generate exactly these few clauses on demand. But sometimes you do have bottleneck constraints, and typically they can generate an exponential number of explanation clauses. For instance, if you have some part of the input that implies that at least K of a set of literals has to be true, and you have such a constraint saying that at most K minus 1 can be true, then it will generate all subsets as explanation clauses. So this exponential number, this happens.

So what you can do is detect and encode such bottleneck constraints on the fly. So while running, when you detect that a certain constraint is generating a lot of explanations, you stop and you encode that one. And instead of this naive SAT encoding, which would be this exponential number of explanations, you have a compact encoding with auxiliary variables, only for that constraint. And it's even helpful because you can also split on these auxiliary variables, and this appears to be helpful in some cases. So this is some work that we have done together with Stuckey.

>>: [inaudible].

>> Robert Nieuwenhuis: This was for cardinality and for Pseudo-Boolean as well. So Abio is a PhD student of ours who went to work with Stuckey. In the first paper they tried to not encode a full constraint, but only partially, only those parts that appear to be active. This made it really difficult to implement, and then we did a lot of experiments and we discovered that you can, in most cases, just encode the full constraint and everything works well. Then of course you do not need specialized algorithms to extract part of a constraint, which is what they were doing for the cardinality constraints for example and the Pseudo-Boolean ones. This works very well; again this is a significant additional improvement over just SMT. So you start with SMT and when you see that something is a bottleneck you encode only those parts on the fly.
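A minimal sketch of this on-the-fly detection, under the assumption that the solver counts explanation clauses per constraint; the threshold and the "encode" signal are illustrative, not the actual Barcelogic mechanism:

```python
# Minimal sketch of detecting a bottleneck constraint at runtime.  When one
# constraint keeps generating explanation clauses, switch from lazy explanation
# generation to a compact eager encoding with auxiliary variables.

from collections import Counter

class BottleneckDetector:
    def __init__(self, threshold=500):        # illustrative threshold
        self.explanations = Counter()
        self.encoded = set()
        self.threshold = threshold

    def on_explanation(self, constraint_id):
        self.explanations[constraint_id] += 1
        if (self.explanations[constraint_id] > self.threshold
                and constraint_id not in self.encoded):
            self.encoded.add(constraint_id)
            return "encode"    # caller now adds e.g. a sorting-network encoding
        return "lazy"          # keep generating explanations on demand
```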

But then there is this challenge: so, in the slides before you were only learning new clauses, but it can be much more powerful to learn also new constraints. So if you look at it from the SMT point of view you would be strengthening your theory. Well, not the theory, but the presentation of the theory. So you add new cuts that are consequences, new constraints. So you have this table here: clauses, constraints –.

>>: [inaudible].

>> Robert Nieuwenhuis: When?

>>: [inaudible].

>> Robert Nieuwenhuis: Did it? Ah, when was that? Great minds think alike, maybe. This is from my CP talk last fall, I think. So if you want to eliminate a variable from two constraints you can always find multipliers in order to do it, right. So this would be a cut inference. This is an example just to motivate that indeed learned cuts are stronger than SMT clauses. So for instance this is a 0-1 example, so here I could have written, oh sorry, this is just X is true, Y is true, Z is true, U is true, and the stack grows like this. So you take these two decisions and you propagate with Z1 that Z is true.

Now you propagate with Z2 that U is true and now you have a conflict. So if you do the normal conflict analysis, SMT style, then you get essentially that X, Y and Z cannot all be true at the same time. So one of the three has to be false, which you can express with this linear constraint. But if you do the cuts you can directly infer that Z has to be false. This is just an example, but in practice it happens a lot that by doing cuts you get much stronger results.
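To illustrate the cut inference itself, eliminating a variable from two constraints by choosing suitable multipliers, here is a small Python sketch with a made-up example; these are not the constraints on the slide:

```python
# Minimal sketch of a cut inference: combine two linear constraints (both of
# the form  coeffs . x <= rhs) with positive multipliers chosen so that one
# variable cancels out.  Purely illustrative.

from math import gcd

def cut(c1, rhs1, c2, rhs2, var):
    """c1, c2: dicts var -> coefficient.  Requires opposite signs on `var`."""
    a, b = c1[var], c2[var]
    assert a * b < 0, "variable must appear with opposite signs"
    g = gcd(abs(a), abs(b))
    m1, m2 = abs(b) // g, abs(a) // g          # multipliers that cancel `var`
    new = {}
    for v in set(c1) | set(c2):
        coef = m1 * c1.get(v, 0) + m2 * c2.get(v, 0)
        if coef != 0:
            new[v] = coef
    return new, m1 * rhs1 + m2 * rhs2

# Example:  x + 2z <= 3  and  y - z <= 0  combine (multipliers 1 and 2)
# into  x + 2y <= 3, eliminating z.
print(cut({"x": 1, "z": 2}, 3, {"y": 1, "z": -1}, 0, "z"))
# -> ({'x': 1, 'y': 2}, 3)
```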

Now this is a well-known problem, even in the 0-1 case: you know what you would like to do, but if you really do everything you would do in CDCL and you start doing it with integer linear programming, then you run into trouble. This is an example of the trouble. Again this is a 0-1 example: X is true, Y is true, then with Z1, Z has to be true, and then Z2 is a conflict. But this propagation, you have done it by rounding. So if X is true and Y is true then 2Z has to be at least 1. This means that Z has to be at least 1, because of rounding. This rounding is what kills you. So if you do the corresponding conflict analysis cut you get something that is a useless tautology in this 0-1 case.

And indeed, according to the SAT intuition, conflict analysis would be finished here, because for this constraint only one bound in the stack of the current decision level is relevant, but this thing here is too weak to force a backjump, so you are stuck. This is a typical example, well known. It was already well known in the 0-1 case, etc. If you try to solve this problem in the 0-1 case, there is a lot of work on this in Pseudo-Boolean solvers.
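To make the rounding issue concrete, here is a small worked 0-1 instance of the same flavor; the exact constraints on the slide are not in the transcript, so these are made up for illustration:

```latex
% Illustrative 0-1 example of the rounding problem (not the slide's constraints).
%
% C1:  x + y - 2z <= 1    with x = 1, y = 1 this gives 2z >= 1, and since z is
%                         an integer, rounding propagates z >= 1, i.e. z = 1.
% C2:  2z - x - y <= -1   with x = 1, y = 1, z = 1 this reads 0 <= -1: conflict.
%
% The conflict-analysis cut eliminates z by adding C1 and C2 with multipliers 1, 1:
\[
  (x + y - 2z) + (2z - x - y) \;\le\; 1 + (-1)
  \qquad\Longrightarrow\qquad 0 \le 0,
\]
% a useless tautology.  Over the rationals C1 and C2 are consistent (take z = 1/2);
% the conflict only exists because of the rounding step, which the cut cannot
% capture, so the learned constraint is too weak to force a backjump.
```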

So one solution is to go the pure SMT way. You can just solve the problem as in the SMT case, and indeed, if you look at it, some Pseudo-Boolean solvers only learn clauses; these are just SMT solvers in a sense. But you can be smarter. You can fall back on SMT only in the case of the rounding problem, and then, since any clause on 0-1 bounds is expressible as a constraint (you can imagine this is because of convexity; in the 0-1 case a disjunction of bounds is expressible as a single constraint), you can simply express your clause as a constraint and do the cut with these constraints.

So this is a particular case of doing the cut only when the variable that you eliminate has coefficient 1 or minus 1. So this is what happens here. Then there is no rounding, so if you do all your cuts like this you can always backjump. In the Pseudo-Boolean world there are even better solutions. You can also use cardinality explanations instead of clauses, so it is a form of SMT where the basic language is cardinality constraints and not clauses, and there are a lot of people who have worked on this.

Now, solving the rounding problem in the Z case: this is very nice work and the authors managed to solve the rounding problem during conflict analysis. For each propagated variable they always compute what they call a tight reason, which again means that the coefficient of the propagated variable is 1 or minus 1. The process of computing this tight reason is done on demand during conflict analysis, and it uses a number of non-variable-eliminating cuts. It works because it is required that decisions are always made by setting a variable equal to its current upper or lower bound. And as before, if you only do conflict analysis with tight reasons then there is no rounding problem; you can always backjump.

This learning scheme: because they really need to compute these tight reasons during conflict analysis, they must use this particular learning scheme. At least the way they solve it; maybe it's not necessary, but the way they do it they need a learning scheme which is similar to the all-decisions one in SAT. So this means that, with the tight reasons you compute, in the end what you get is kind of as if you had done conflict analysis in SAT until you have only decisions.

And unfortunately this doesn’t work well in SAT and apparently it also doesn’t work well in integer linear programming.

>>: So 1UIP, that's the first unique implication point.

>> Robert Nieuwenhuis: Yes.

>>: And that’s bad because of backjumping or because of the clause?

>> Robert Nieuwenhuis: No, it’s bad because what you learn is less useful. So this agrees with what we were saying before. So the backjump itself I don’t think is very important where you backjump to.

>>: Yea, I just wanted to understand.

>> Robert Nieuwenhuis: So what is important is the quality of what you learn, because in SAT you can do it. You can do conflict analysis until you reach something that's built from decisions only, and if you are a little bit more clever you would continue until you had one literal of each decision level, the so-called all-UIP, which you can also do, and you get really small lemmas. Many people have tried this because it looks very good, you get small lemmas, and apparently smaller lemmas should behave better, and it's not true. It doesn't work at all. And it's not the overhead of computing these lemmas; it's because later on the search is much worse.

So even if it were cheap it's not worth it, just like other learning schemes based on the min cut of the conflict graph. It's the same thing: you get really small lemmas, but even if they were cheap to compute they don't search well. So what I tried to do to overcome this is quite pragmatic. It's much less elegant, but it works much better. The idea is to fall back on SMT only in those cases where you need it, and this is similar to what is done in the Pseudo-Boolean case by others, but here you are in trouble because you cannot convert just any clause on bounds into a constraint.

So you have to be careful how you do it.

So here you can have arbitrary new bounds as decisions and you always get a backjump: you learn a new constraint and you do a backjump, but not always is the backjump based on the constraint you learned. The search is guided as in the 1UIP learning scheme in CDCL. The idea is you always do the cuts, and if there is a rounding problem you rely on SMT for completeness, but you learn the clauses on bounds only if they can be turned into constraints, because you cannot have clauses and constraints in your language at the same time; you cannot do any inferences combining them, at least not in an easy way.

Well, essentially, if we look at an example, you can imagine this is a symbolic way of expressing it: in the stack you keep a bound, and for each propagated bound you have the reason constraint, but you also have the reason set, which is the subset of bounds below it in the stack that caused the propagation. So for instance this thing has this reason, this bound and this bound. Sorry, this bound has this reason: this bound and another one below which is not drawn. So this is the reason set.

Then if you have a conflict, you have an initial conflicting set, and you do the SMT-like conflict analysis, but you also do the cuts. It frequently happens that at some point, before reaching the first UIP, you already have a cut that causes a backjump. Then of course you do the backjump according to the cut. Otherwise you may reach the first UIP based on the SMT conflict analysis and you do the backjump according to the SMT clause. That's the main idea. I am not going to go through the whole example, but it's in the slides.
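A very rough Python sketch of this combined conflict analysis, keeping only the control flow; the helpers (resolve_reason, compute_cut, forces_backjump, is_1uip) are assumed names standing for the operations described above, not a real API:

```python
# Rough control-flow sketch: run SMT-style conflict analysis over the reason
# sets, compute cuts alongside it, and backjump early if a cut already forces
# it; otherwise fall back on the SMT backjump clause at the first UIP.

def analyze(conflicting_set, initial_cut, stack):
    smt_clause = set(conflicting_set)        # clause over (negated) bounds
    cut = initial_cut                        # running cut constraint
    for bound in reversed(stack):            # walk the bound stack backwards
        if cut is not None and forces_backjump(cut, stack):
            return ("cut", cut)              # a learned cut already gives a backjump
        if is_1uip(smt_clause, stack):
            return ("smt", smt_clause)       # SMT-style backjump clause
        if bound in smt_clause:
            smt_clause = resolve_reason(smt_clause, bound)  # replace by its reason set
            # Eliminate the bound's variable only when no rounding is involved;
            # otherwise drop the cut and rely on the SMT clause for completeness.
            cut = compute_cut(cut, bound)
    return ("smt", smt_clause)
```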

So this is the end of the example, and of course here optimization is trivial. You run it a first time and then after that you add a constraint saying, "Now I only want better solutions," and you rerun. So call it branch and bound, or cut and bound, or whatever.
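A minimal sketch of that optimization loop, assuming a solve() function that returns a feasible assignment or None and an integer objective to minimize; illustrative only:

```python
# Minimal sketch of optimization by iterated bounding: solve, record the best
# solution, add a constraint demanding a strictly better objective, and rerun.

def minimize(constraints, objective, solve):
    """objective: dict var -> integer coefficient; solve(constraints) -> model or None."""
    best = None
    while True:
        model = solve(constraints)
        if model is None:
            return best                      # the last solution found is optimal
        best = model
        value = sum(c * model[v] for v, c in objective.items())
        # "Now I only want better solutions": add the bounding constraint and rerun.
        constraints.append((objective, "<=", value - 1))
```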

>>: [inaudible].

>> Robert Nieuwenhuis: Then there are these results. Do you mean binary search or something?

>>: [inaudible].

>> Robert Nieuwenhuis: Well, the interesting thing is that this can be implemented in a very lightweight way. Since you do not rely on the cuts for completeness, you can, for instance, prevent overflow by just throwing away those cuts you don't like. So what I do is simply discard cuts that give coefficients that are too big. Then, if intermediate computations are done with big integers, it's guaranteed not to overflow. There are some tricks about how to implement it, like counter-based bound propagation. And with counter-based bound propagation, depending on how your constraints look, you could have different ways of doing efficient propagation. For the moment I've only implemented this. It also has its advantages because you can more easily parallelize, for instance.
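A minimal sketch of that pragmatic overflow guard: compute the cut with big integers and simply discard it if any coefficient exceeds a limit (the limit value here is an arbitrary assumption):

```python
# Minimal sketch of the overflow guard: since completeness never relies on the
# learned cuts, a cut whose coefficients grow too large can simply be dropped.

MAX_COEF = 1 << 30          # illustrative limit; real code would match the integer width

def keep_cut(cut_coeffs, rhs):
    """cut_coeffs: dict var -> coefficient, computed with big integers."""
    if any(abs(c) > MAX_COEF for c in cut_coeffs.values()) or abs(rhs) > MAX_COEF:
        return None          # discard the cut; the SMT clause still guarantees progress
    return cut_coeffs, rhs
```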

So this is what we are comparing with, with the commercial solvers. So, they are expensive, they are based on combining lots of techniques and indeed they have seen immense improvements in the last years. We also compare with their 4-core versions instead of 1 core which is this thing. So this is the first completely different technique that shows some competitiveness on integer linear programming with a commercial solver as far as I know, even on these MIPLIB problems which are standard test sets and even with this small “toy” implementation and there is a lot of room for improvement.

So I tried, you can see it in this paper, I tried some random problems and also the MIPLIB ones.

So, all MIPLIB problems where the variables are bounded, because currently I cannot handle unbounded ones, including some hard and open ones, and some of them are really big. So let me try to give you a demo on some of them. Demo IntSat: here there are a couple of problems. This is one of the MIPLIB problems, lecture scheduling. This is not a very big problem; it has something like 10,000 variables and I think 15,000 constraints or something. Well, this is CPLEX running and for the moment it has no solution. The problem is open, so the optimum is unknown, and if you let it run for an hour or so it gets solutions. So here the lower bound is 38.

I don’t know how long it takes to find the first solution.

>>: [inaudible].

>> Robert Nieuwenhuis: No, no, no, this is something else. These are the lower bounds. No, this depends on the node that it's exploring. You see, it goes up and down.

>>: So when the lower bound is 38 it found a solution?

>> Robert Nieuwenhuis: No, no, it has no solution yet.

>>: But there is no solution that is below 38.

>> Robert Nieuwenhuis: Yea, exactly this is the lower bound that comes from the LP relaxation of the problem.

>>: Oh, okay.

>> Robert Nieuwenhuis: So now in this column here we are supposed to be getting integer solutions, and as soon as the gap between the two is small enough it stops. So you can set this gap to some value. If you say, "I don't care about the last 2 percent," then it will stop within the last 2 percent. Well, we don't get any solutions, so let's interrupt it, okay. Now I run IntSat on the same problem. So it finds solutions very quickly, and then it takes some work to go down.

>>: Wow.

>> Robert Nieuwenhuis: So after a minute here you get a solution of, I think, 84 or something, which takes hours to get with the other one, and in fact it's close to the best known value, the best solution found so far. I think we are getting there. Maybe we can let it run for a while; you know what, we are almost there I think, 99, yea, so this is already close to the best known solution of this problem.

Here I have another one, on which I am not even going to run CPLEX because it dies; it takes a very long time to get a first solution with CPLEX. So this is a big one, you see. It has 135,000 variables and 150,000 constraints; this is a huge problem. It finished parsing, so it starts working.

There we have some solution.

>>: Do you have preprocessing techniques?

>> Robert Nieuwenhuis: No, not here; here I do nothing. In the 0-1 solver I do a little bit of probing, but it's not very useful. So I am sure a lot more can be done in all this. Here again, after a minute or so it gives you a solution that is close to the best known. Anyway, it's boring to watch it, I think. At some point it's going to jump to a much better solution.

[demo]

>> Robert Nieuwenhuis: So you get quite some pruning when you add the new constraints from the upper bounding.

>>: So is this tool available online somewhere?

>> Robert Nieuwenhuis: Sorry?

>>: Is this tool available online somewhere?

>> Robert Nieuwenhuis: I had a version from the CP paper online, but I recently removed it because since then there have been a lot of improvements. I haven’t made public the new one.

So if you send me an e-mail, when the new release comes out I can let you know.

>>: Does the new release include the source?

>> Robert Nieuwenhuis: Yea. So see, this is already close to the best. We went down from 1 million something to 300,000. Okay, so let’s stop it. I can also show you, so this is a demo of the 0-1 solver with an application to –. So this is work force scheduling for a car rental company. So I am using Prolog as a modeling language and this is a little bit of a toy example.

So you can have lots of constraints. At the car rental company, essentially what the workers do is attend to the people picking up cars. So this says that each worker can handle 5.5 cars per hour, and there are lots of constraints about the contracts those guys have, how many people you have on each contract, what temporary workers cost, how many hours they can work, etc.

So you can complicate it as much as you want. So here you have all these numbers; this is, for the time slot from 7am to 9am on each day of the year, how many cars they have to handle. We have this data for the whole year, this is a forecast, and you can also do things like this worker is absent at this moment, etc., etc. Well, this is a simplified version of it. This is for a customer of Barcelogic. So now what are we doing here? This is a car rental company at an airport in Spain, and in fact they have many airports. In January they don't have much work. So now let's just schedule –. What's going on here?

>>: You didn’t press S.

>> Robert Nieuwenhuis: I can’t even read it.

>>: You pressed the S key. Oh, that’s good.

>> Robert Nieuwenhuis: Okay, so on my screen it’s much, much smaller, even with glasses I cannot see it. So this is just Prolog. So now we are going to schedule the month of January.

Where is this here? So this Prolog program is going to invoke CPLEX. It generates the 0-1 linear program and it invokes CPLEX. We didn't even see it. I'm fighting with it, anyway. It's really big here and this is not what I expected as a result. January, let's see what happens here.

>>: [inaudible].

>> Robert Nieuwenhuis: Yea that looks really suspicious.

>>: You were using it in February usually.

>> Robert Nieuwenhuis: Ah, that's why, January has cost 0, you are right, thank you Albert. Yes, February was easy, January was really easy, ah, that's why it's scheduling February, yes.

[demo]

>> Robert Nieuwenhuis: Yea okay 700s. So now it’s too late. We can see the solution here and that explains also why the cost is 700s, because it had to hire two temporary workers and there are different costs involved in the contract. So now I lost the cursor, where is it? Ah, okay that’s why I lost the cursor because I am not watching. I am not seeing on one screen what’s on the other. This is here; the cursor is gone, sorry. Ah, here it is incredible. Yes, so now I can do it here.

So you can watch this linear program. There is a huge function to be optimized and you have lots of these constraints. These are Pseudo-Boolean optimization problems, okay, it's huge. So on this same problem IntSat proves optimality.

>>: So you said that you had a separate code for the Pseudo-Boolean case?

>> Robert Nieuwenhuis: Yea in Pseudo-Boolean you can do many more dirty tricks.

>>: Meaning strengthening the inequalities?

>> Robert Nieuwenhuis: Yea, so it's like in the CutSat procedure, where you have to do a lot of work to get a tight reason. In the Pseudo-Boolean case there are many tricks to get these tight reasons just locally, without doing any inferences. So from the constraints you can sometimes extract a cardinality constraint instead of a clause, and you can also sometimes extract a Pseudo-Boolean constraint instead of a clause. You can do stronger things locally.

>>: Like Chai and Dixon.

>> Robert Nieuwenhuis: Yea, like Chai and Dixon. So this is May and in May things are much more difficult in this company. So again we see first CPLEX running. It has a timeout of 60 seconds I think.

>>: But you said that your integer solver solves bounded integers, each variable is bounded. So do you use any information from the bounds to –?

>> Robert Nieuwenhuis: To do better you mean? In what sense?

>>: Like the 0-1 variable is a bounded [inaudible].

>> Robert Nieuwenhuis: Yea? What do you mean?

>>: Just like if you [inaudible].

>> Robert Nieuwenhuis: So still sometimes Leonardo can surely build such an example. So what you would need to express the conflict is something that’s not convex. So you cannot express it with a single constraint, which is what you would need.

>>: Sure, but has a heuristic.

>> Robert Nieuwenhuis: Huh, so let's do this. So it timed out after 60 seconds with a cost of, I don't remember. So here we see the timetable again, of 13,300. We run IntSat on the same problem and you see. I don't know whether it proves optimality quickly, but I think this is the optimum here. So this is at least a real-world problem, which we ran into, where you can do much better than the commercial solvers with these kinds of techniques.

Let me return to the talk. Where are we? Here, okay, so here are some statistics about IntSat. Now we return to Barcelogic, so we get to the application part of the talk. Barcelogic: what we have been doing first is what I call focus 1. So we hired this marketing guy, an expert in business, and he did quite a good job; he got all these paying customers. We did employee scheduling for a large call center and also this car rental company. We did crossdocking; later you will see some more. We did a lot of sports scheduling, planning the Paralympic Games as well, and route planning for transportation companies. All of this involves doing consulting with these customers; you have to learn their sector, their industry. So it does not really scale, but it provides the financial resources for the real mission, what we want to do.

>>: Which is what?

>> Robert Nieuwenhuis: Which is this: what we want to do is create value, good solvers and tools. So what we have been doing is working with third-party companies. This software company provides ERP and human resources software, and we make these modules for them: employee scheduling and production scheduling using SMT and extensions. We are now ready for them to sell this to customers. So we have prototypes and APIs and modeling languages for this. We have a patent pending on IntSat. We are doing many more things with the 0-1 ILP solver, and we also have this, which is a slightly different application area: a Max-SMT-based compositional safety verification tool. And our plan is to do it like this: grow in Barcelona, where we have this nice atmosphere.

So now let's speak a little bit about a few of these problems to finish the talk. For instance, in this call center we had forecasts: they get different types of calls which have to be answered by people with the right skill, and some of these people are multi-skilled, and then there is this forecast. So for each skill, for each 30-minute slot of the year, they know how many calls they expect, and they want to have this quality of service, like this percentage of calls answered within this many seconds by a person with the right skill. And of course there are many regulations, like rest periods and so on and so forth, by law, by agreement with unions and with the workers, and so on. In the end we saved 31 percent of the summer temporary employees the first year, and since we were charging them per saved full-time equivalent it was quite okay. So this is the demo that we have seen already, although we applied it to car rental companies.

This is a completely different problem. This is crossdocking, this is a logistics problem. So we do it for Mango, you know the clothing company Mango; it’s one of the largest. So after H&M and Zara I think it’s close to the biggest and it is located in Barcelona. They have this forecast; each shop is going to sell so and so many items of each size. Then they tell the Chinese to produce it and the Chinese of course make boxes per size. Then you go to a logistics center and you have to pick items to make the boxes for the shops. What we do is we design standard crossdocking boxes.

So each crossdocking box has an assortment of sizes, but the Chinese are not willing to make the boxes for the shops because that’s not their job, they just produce. But, they are willing to make these, let’s say 3 different types of crossdocking boxes. So each box has an assortment of sizes.

Then of course crossdocking boxes do not need to be opened at your logistics center, so most of the shops are served just by using these standard boxes. When we came they had 30 percent picking and now they have 10 percent picking. So this is a very large saving, and we also solved other problems for them and some extensions. We provided them software, maintenance and consulting. This we did just with an ad-hoc programmed local search; so we are not married to SAT and SMT.

This is professional sports leagues. This I have been doing for many years and it’s a real hard problem. Let me see whether I can get a demo on this. So again I am using Prolog as a modeling language and this is the Dutch first division professional soccer. It’s full of constraints, let me show you this. So for each team there is this matrix which says, “For each other team and each playing day of the whole season under which circumstances they are willing to receive this other team.” So the A, B, 1, 2, X, blah, blah, blah, they are codes saying, “Well not on Sunday, not at this time, or depending on the police,” blah, blah, blah. So some of these teams they have really full matrices.

So each entry which is not a dash is a constraint, and this is already a lot of constraints. These are constraints that involve individual matches. So you say, "Well, match [indiscernible] against [indiscernible] on day 3 has no constraint, match blah, blah, blah." So each entry is a match, but there are also lots of constraints involving combinations of matches, like all these. So these guys cannot play at home in the same round.

So when this guy plays at home you cannot have this match on the same play day. There are same rounds and same days. Many of these constraints come from TV optimization/maximization, TV income; others are public order, order [indiscernible], so it depends a lot.

There are also many constraints about certain matches for sport reasons: they have to be placed, they have to be distributed along the season, again for TV reasons. And there are other constraints that are even more complicated to express. This we solve with a kind of tailored SAT solver that also optimizes. So it is a SAT solver that has some special encodings for this problem and also a special heuristic for this problem. This is one of the few examples I know of where a tailored heuristic works better than the standard one, and it's necessary.

>>: So this one is not solved using IntSat?

>> Robert Nieuwenhuis: No this is pure SAT, but tailored towards the problem. So now what you are going to see is this Barcelogic SAT solver tailored towards the scheduling application.

>>: And how was it tailored?

>> Robert Nieuwenhuis: For instance, it's optimizing; it's not just SAT, it's also optimizing a cost function. So it's a kind of Max-SAT in that sense: certain constraints have a weight associated with them. But it's also tailored in the decision heuristic, so it decides on certain types of variables first, and it's tailored in order to minimize the cost function, so the typical heuristic where you want to lower the cost. The polarity is chosen in order to lower the cost.

>>: So then it sounds like you could produce a benchmark in the format of weighted SAT. Have you done that?

>> Robert Nieuwenhuis: No, but we could. So the Australians, where [indiscernible] is with Peter Stuckey, have used these sports scheduling instances for many applications. They have them.

>>: Yes, but there is this evaluation of Max-SAT solvers that use these [inaudible].

>> Robert Nieuwenhuis: So now from Barcelogic we could be generating lots of real world examples in many things.

>>: [inaudible].

>> Robert Nieuwenhuis: So it's very easy to modify some of the constraints and run it again and you get another instance. It's taking some time to find the first solution, but then you will see it is again optimizing and it will go down. I don't know if it's necessary to wait. So this is pure SAT; the arrows are restarts. Maybe we can go back to the talk and then see what it produced.

So this we also do for Portugal, but there they like to have this random draw on TV with girls extracting balls. So we generate 10,000 different schedules for them and they select one. They can publish the database of 10,000 calendars in advance, so everybody can check that they are completely different, that they all satisfy exactly the constraints that they publicly say they have to satisfy, and they can see that the girls extract the number. So that's the best of both worlds.

>>: 10,000 girls.

[laughter]

>> Robert Nieuwenhuis: Okay and we also do Cricket, that’s also nice. So it is completely different, we have to learn a lot about Cricket and then you can do it. There we have 6 leagues simultaneously that are being scheduled.

This is another application where we used an ad-hoc greedy algorithm. So in the Paralympic Games everything is scheduled until the last minute; everything has to fit in perfectly. There are lots of constraints for TV.

>>: But what’s the language for these constraints? Is it SAT, Boolean, integer?

>> Robert Nieuwenhuis: Well I don’t know. I don’t know how Javier handled it internally. I think he got a spreadsheet or something.

>>: But the algorithm works on top of what?

>> Robert Nieuwenhuis: So we did this for them once and now we are negotiating about developing software, and then we will have to agree. So we did it once, they paid us once for the moment, and we are negotiating about doing it more times. It is quite nice to see the kind of constraints they have.

This is an interesting problem. There is this third-party company that has geographic information, like Google Maps, and they provide SaaS for transportation companies. So those guys, the transportation companies, upload information about their fleet of vehicles, capacity, weight, size, drivers' constraints, etc., and also, every day, which packages they have to deliver where, of which size and which weight and in which time window, etc. Then what we do is generate the planning for each vehicle. This is a kind of strongly constrained combination of N traveling salesman problems, where N is one per vehicle, in a sense.

And here again we use something completely different. We use Gecode, I don't know if you know it; it is an open source constraint programming package, and we use it for implementing large neighborhood search. So it's a kind of local search where the notion of neighborhood is that you take a random subset of variables which you are allowed to flip or to change, depending on whether it's 0-1 or not. The optimal reassignment of this subset of variables you do with a complete method, in this case Gecode. So you do this local search and you can put any local search control on top of it, of course. You cannot detect whether you are in a local optimum, that's too expensive, but you can iterate so and so many times, each time starting with a greedy solution, for instance.
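A minimal sketch of large neighborhood search in this style; solve_subproblem stands in for the call to a complete solver such as Gecode and is an assumed interface, not its real API:

```python
# Minimal sketch of large neighborhood search: repeatedly free a random subset
# of variables and re-optimize that subset with a complete solver, keeping the
# rest of the assignment fixed.

import random

def lns(initial_solution, objective, solve_subproblem,
        subset_size=20, iterations=1000):
    best = dict(initial_solution)
    best_cost = objective(best)
    variables = list(best)
    for _ in range(iterations):
        # Pick a random "neighborhood": the variables allowed to change this round.
        free = random.sample(variables, min(subset_size, len(variables)))
        fixed = {v: best[v] for v in best if v not in free}
        # Optimal reassignment of the freed variables, with the rest fixed;
        # returns a full assignment dict, or None if no improvement was found.
        candidate = solve_subproblem(free, fixed)
        if candidate is not None and objective(candidate) < best_cost:
            best, best_cost = candidate, objective(candidate)
    return best, best_cost
```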

>>: What is in your vision, to have a single tool that should be able to handle all these problems?

>> Robert Nieuwenhuis: No, no, no, this is focus 1, where we want to learn about customers and their problems and use the income to survive and do focus 2.

>>: But focus 2 had –.

>> Robert Nieuwenhuis: It had 2 modules, plus having good tools. The modules were employee scheduling and production scheduling, which I will show you later, using SMT and some extensions, plus the 0-1 integer programming solver.

So this is the other module. You can make this as complicated as you want; for the moment we stuck to an already quite complicated setting. You have resources: machines, vehicles, human resources with different skills, etc., and tasks are subject to constraints: how many resources of each kind they use, the time window, the duration, precedence between tasks, transportation and intermediate storage capacity, like in chemical industries for instance.

Some things you can’t even store and then you have what we call an immediate precedence. So immediately after the producing task finishes you have to do the followup task because you cannot store the product. Availability of raw materials also makes it nice. Transition times and setup times between tasks, so when you stop using one machine for something and you have to use it for something else, maybe you have to clean or you have to change some pieces which cost time, this is called transition time/setup time. Also the duration and energy consumption of each task may vary depending on the time slot or the type of resource you use. So you may have fast machines, slow machines, etc.

>>: So is this to optimize the numeric variables or the Max-SMT?

>> Robert Nieuwenhuis: So we could put an optimization on top of it. For the moment what we are doing is minimizing the makespan, but we first try to at least fit all tasks, and after that we optimize something on top of it.

>>: [indiscernible].

>>: So you have multiple objectives and some of the objectives –.

>> Robert Nieuwenhuis: Yes, in fact we were discussing this with a software company: what exactly do you want for the moment? So we can do both, as you know. So here we have this language which you can easily –. So the API is a kind of text language.

>>: Right, but are the problems Max-SAT with SMT or is it optimization with integers?

>> Robert Nieuwenhuis: You can express it in both I think.

>>: [indiscernible].

>> Robert Nieuwenhuis: So we solve it with SMT and extensions to handle all this stuff.

>>: So asking a different way, the objective functions are they over 0-1 variables or over unbounded variables?

>> Robert Nieuwenhuis: No, bounded integers.

>>: Bounded integers, but that’s 0-1.

>>: [indiscernible].

>>: Okay, in the current problem formulation that reduces to –.

>> Robert Nieuwenhuis: So you are talking with non-experts in all this, right. So it's difficult sometimes; in fact nowadays we think we should have made it much simpler at the start and then made it more complicated on the fly.

Then there are a couple of last applications.

>>: Are you allowed to make benchmarks available?

>> Robert Nieuwenhuis: It's a pain sometimes. We made some benchmarks for SMT at the time and people started asking, "Can you produce more, or can you try to do it with different characteristics?" So sometimes it's quite a lot of work, but of course we could.

>>: I think the issue is if the customers would allow it or not.

>> Robert Nieuwenhuis: The customers would allow, but for the moment I am worried about our work.

>>: [indiscernible].

>> Robert Nieuwenhuis: I think for the customers at this point it would be no problem.

Okay, here are a couple of other examples: course timetabling, in fact we also did course timetabling with SAT and with Max-SAT and we might have some paper about this, but the latest version which we did for a real customer was with Gecode. This is another thing: plastic extrusion. So there are a lot of car industries close to Barcelona and we did this again with an ad-hoc algorithm.

Okay, this is all. So if there is some conclusion then it’s this one. You need a large toolbox of techniques and even just integer linear programming needs combinations of tools and inside this combination of tools IntSat may become one of them. That’s it, thank you.
