>> Leonardo de Moura: Hi. It's my pleasure to introduce Marijn Heule. Marijn is from the
University of Texas at Austin. He is very influential in the SAT community. He is one of the
editors of the Handbook of Satisfiability and the author of a very efficient SAT solver called March.
He was also involved in the last SAT competition, and today he is going to tell us a little bit about
his latest work. Thank you.
>> Marijn Heule: Thanks for the introduction. If anything in my talk is not clear, ask questions.
My talk today is about unsatisfiability proofs. Before I started working on this, there was a
choice: either it was easy to emit proofs but they were very inefficient to check, or they were
efficient to check but hard to emit. I present work here that bridges the gap between both
favorable properties. This is joint work with Warren Hunt and Nathan Wetzler at UT Austin. A
short outline: I'll start with some motivation and the contributions, then talk about the
different kinds of proof styles there are, resolution proofs and clausal proofs, in detail. The big
disadvantage of clausal proofs is that the current methods to check them are very inefficient,
so I'll explain techniques for checking them efficiently. I will also talk about expressive proofs,
that is, how to express all the techniques used in state-of-the-art solvers. At the end, just
before the conclusions, I will mention the future direction: we actually want to have a
mechanically verified checker that does all of the things that I will talk about. SAT solvers, as
you know, are used here at Microsoft in many tools [indiscernible]. They are used to find
counterexamples: if you have a SAT benchmark, you get a solution for it. They are used for
many equivalence checking problems via miters. But there are also bigger answers: when
solvers are used for diagnosis, you ask for a small explanation of why a formula is unsatisfiable.
What we are interested in in Austin is a small proof for benchmarks which are unsatisfiable,
such that we can check it with a mechanically verified checker, so we can trust the output of
the SAT solver. Although there is lots of positive news and solvers are used in so many
applications, there is also some negative news. One item is that there are bugs in SAT solvers,
and not only in SAT solvers but in SMT solvers and especially QBF solvers. Even winners of the
competition have bugs. Last week I was at FMCAD, at the hardware model checking
competition, and there were seven rounds of bug fixing allowed so that solvers kind of were
[indiscernible] on…
>>: How many solvers did not have bugs? [laughter]
>> Marijn Heule: The problem is that you cannot really trust the solver itself. The whole idea is
that we want to trust the output of the solver: we cannot be sure that the solver always gives
correct results, but we want it to emit something that lets you say, okay, this result I can trust
because of this and this and this. You don't have to trust the solver; you only have to trust the
results for the specific things you are interested in. Does that make sense?
>>: Well, it seems like if you can always check the output, then in some sense there are fewer
bugs that you care about.
>> Marijn Heule: Yeah, but you check the output for each instance. You have a problem -- I will
show it on a slide soon. You solve it with the solver, which gives you some kind of proof; then
you give the same problem and the proof to a checker, and you check whether the proof is
correct. So you don't have to trust the solver. Only if the proof is incorrect, you don't know:
the solver might have a bug. But if the proof is correct, then you have a proof for the problem,
so you know the answer for the problem. Does that make sense?
>>: So you are going to talk about how the checker [indiscernible]
>> Marijn Heule: Yeah. That's the final step; it's on slide 35. There you have a formally verified
checker, and for that you need a formal proof format for the proofs emitted, a certified proof;
but there are several steps in between. The main contribution of this work is a tool that can
efficiently validate results. I didn't mention this yet: all the existing tools that provide these
two things, unsatisfiable cores or small proofs, use a lot of memory. For benchmarks used in
the competition, if you turn on this emission, memory consumption increases up to a factor of
a hundred. Due to this memory increase, solvers slow down and eventually go out of memory
in a relatively short time. The tool can efficiently check results of a given proof, but it also can
produce these results without requiring much memory. That's what I will talk about. I guess
everybody here knows about satisfiability problems, so I don't have to go into much detail, but:
given a Boolean formula, is there an assignment that satisfies it? Throughout this presentation,
clauses will be denoted by blue boxes, and I will use three Boolean variables a, b and c.
Negative literals are denoted by negation, so the first clause is not b or c. It is easy to check
whether an assignment satisfies this formula. For the assignment not a, not b, not c, we just
scan over the formula to see whether each clause contains at least one literal satisfied by the
assignment. I guess this is clear for everybody? Unsatisfiability is much harder, and it is done by
checking clauses that are added to the formula. Most of the techniques can be expressed
using resolution steps. Resolution works as follows. You have a pair of clauses that contain a
complementary literal, for instance b and not b, and the resolution rule takes the union of the
two clauses and removes the complementary pair. Resolution of not a or b with not b or c
gives us not a or c. We can do the same with another pair. The purple circles denote lemmas,
which are not original clauses but can be added by resolution steps. On the left there is a
resolution derivation, but I will use this other notation throughout the presentation, which
means that we can construct lemma c by resolution from these clauses, but we don't specify how.
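The resolution rule is easy to state in code. Here is a minimal illustrative sketch (my own, not from the talk), encoding literals DIMACS-style as signed integers with a = 1, b = 2, c = 3:

```python
def resolve(c1, c2, pivot):
    """Resolve two clauses on a pivot literal: c1 must contain the
    pivot and c2 its complement. The resolvent is the union of both
    clauses with the complementary pair removed."""
    assert pivot in c1 and -pivot in c2
    return (c1 - {pivot}) | (c2 - {-pivot})

# The example from the talk: resolving (not a or b) with
# (not b or c) on b yields the lemma (not a or c).
lemma = resolve(frozenset({-1, 2}), frozenset({-2, 3}), 2)
print(sorted(lemma))  # prints [-1, 3], i.e. not a or c
```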
>>: I guess we do because [indiscernible]
>> Marijn Heule: Yeah. You can reconstruct it, but this notation does not specify exactly how
you do it. The notation on the left specifies exactly what the resolution steps are; here there is
no order in which the resolutions happen. You could give an order, but just looking at this you
don't know exactly how to do it. Of course you can reconstruct one, in linear time even, but
it's just not in the notation.
>>: But an arbitrary solution [indiscernible] reconstruct one more time.
>> Marijn Heule: No, not arbitrary, no.
>>: So there's some, if I recall correctly you have a restriction which is that the steps have to be
reconstructed by unit propagation.
>> Marijn Heule: Excellent. That's also the restriction I will use. I will come back to that later.
>>: Maybe I'm anticipating too much.
>> Marijn Heule: In this case I just want to say that we don't specify it: all the drawings will be
like this and not like that. This is what we talked about earlier, the tool chain, although it gets
a bit more complicated. You have the Boolean formula; you give it to a SAT solver. If the
formula is unsatisfiable, you also require the solver to emit some proof of unsatisfiability, and
there can be several kinds of proofs. The proof and the original formula are given to the
checker. You don't have to trust the solver: as soon as the proof checks out and the checker is
verified, you are done. Is that clear? I will also talk about redundant
clauses a lot. What is a redundant clause? It is a clause you can add to or remove from the
formula while preserving satisfiability: if the formula has solutions, then after adding the
clause there are still solutions, and if it is unsatisfiable, then after removing the clause it
remains unsatisfiable. This is a more general notion than logical equivalence, and that is
necessary for some of the properties I'll discuss later. There might be fewer solutions after
adding the clause, and removing it from a satisfiable formula might introduce solutions. What
is a proof trace? You have
a formula and a proof. The clauses of the formula can be in arbitrary order, but the proof is
ordered: it is a sequence of lemmas. How does it work? We lift a clause from the proof, check
that adding it keeps the formula satisfiability-equivalent to the original one, and then add
clauses one by one until we add the last clause in the proof. We call it a proof of refutation if
the last clause in the proof is the empty clause, which is denoted by this symbol. The empty
clause cannot be satisfied, so if the empty clause is redundant, then by satisfiability
equivalence we know the original formula is unsatisfiable. There are two kinds of proofs, and also I
won't go into detail here but on the next slide: one is resolution proofs and the other is clausal
proofs, and there are four properties you would really like a proof format to have. What are
these properties? You want a proof to be easy to emit, so you don't need lots of changes to a
solver to obtain a proof; for resolution there are considerable obstacles to implementing it in
state-of-the-art SAT solvers, while for clausal proofs it's very easy, as I will show later. You
want proofs to be compact: for competition benchmarks clausal proofs are a few gigabytes,
but resolution proofs can quickly grow to hundreds of gigabytes. You want proofs to be
checked efficiently: resolution proofs are big but can be checked efficiently, while for clausal
proofs the existing methods are expensive. Our contributions mostly focus on this: by adding
deletion information, which exploits a feature of the solvers, you can make checking much
faster, and we present more techniques to really speed up the checking, so we combine easy
emission and compactness with efficient checking. You also want the format to be expressive,
meaning it covers all existing techniques in SAT solvers. Resolution and clausal proofs have the
same kind of expressivity and cover lots of techniques; however, several techniques are not
covered. At CADE this year I presented another format which is very expressive, and this is
the state of the art, so in the future I want to combine everything so that I have it all: the
expressivity as well, though it's tricky to get efficiency and expressivity together.
>>: [indiscernible] proofs are very [indiscernible]
>> Marijn Heule: Yeah. They cover extended resolution.
>>: And so what about QBF and proofs?
>> Marijn Heule: Here I'm only talking about SAT solvers. I think I can…
>>: So there are certain extensions of resolution and QBF is [indiscernible]
>> Marijn Heule: Yeah.
>>: Does that fit in these frameworks or [indiscernible] extension?
>> Marijn Heule: I think it all fits in. These RAT proofs, I think, can also be used for QBF, but
the procedure for checking them becomes more complicated, because you also have to take
into account universal reduction and the order of the quantifiers, so the check becomes much
more complicated. I'm still working out that everything works and that you can do it. But I
think for QBF it would be a very nice way of having certificates, because, as you may recall,
validating QBF solutions is very expensive, and having a clausal proof would be much more
compact. So these RAT proofs are resolution and unit propagation combined, and then you
also have to take into account the quantifiers and universal reduction, because that's also a
well-known technique used in QBF solvers. Did that answer your question?
>>: I think so. I haven't [indiscernible] believe it was about [indiscernible] QBF resolution and
then QBF would be an extension, but you say it somehow fits in.
>> Marijn Heule: It somehow fits in, yeah. I just submitted a paper about these procedures for
clause elimination and clause addition, which describes how things should be changed in order
to make everything work for QBF.
>>: Okay so [indiscernible]
>> Marijn Heule: But this was more about how it's used, for instance, in Bloqqer, the main
preprocessing tool for QBF, which also uses these kinds of procedures. I explain how they
work and that they are sound, but I still have to work out how it actually works for proofs, and
how to make a proof checker for QBF, because it's much more complicated.
>>: Do you consider [indiscernible] proofs [indiscernible]
>> Marijn Heule: What exactly do you mean? What kind of format do you…
>>: [indiscernible] proofs for [indiscernible]
>> Marijn Heule: Ideally, I would like to have as universal a proof construction as possible, but
I think the way to go is from SAT to QBF and then to go up, instead of going down: first solve
the QBF version and then handle the more complicated cases.
>>: [indiscernible].
>> Marijn Heule: I am not aware of all of the techniques used in all of these kinds of solvers.
[indiscernible]. Now the discussion of resolution versus clausal proofs. First, the resolution
graph, shown at the bottom. It has all the clauses, in a sequence based on when the solver
learned each clause, and each lemma has incoming arcs from the clauses and lemmas that are
required to derive it by resolution. It is a refutation if the top clause is the empty clause. A
resolution proof is just a textual description of this graph: each line has an identifier for a
clause or lemma, and additionally describes all of its incoming arcs. The size of a resolution
proof is the number of vertices plus the number of arcs, which is big: this is a small proof, but
typically there are millions of clauses and, as I will show on the next slide, on average 400
incoming arcs per lemma, so the size is pretty large. I will also use the word core: core clauses
and core lemmas are the clauses and lemmas that can reach the empty clause in the
resolution graph. For this resolution graph, all the clauses are in the core, but the lemma not a
is not, because you cannot reach the empty clause starting from not a. One of the main
problems is that these proofs are huge; the other is that they are hard to emit. To
give you some details: here are the number of clauses and the number of literals. The average
number of literals in learned clauses is about 40, and here is the average number of incoming
arcs per lemma in the proofs. What you can see is that there are clearly many more incoming
arcs than literals in the learned clauses. To get a much more compact proof, you want to store
only the red data instead of the green: that's the idea of clausal proofs. You store only the
literals in the clause and not the incoming arcs. Based on this data, resolution proofs are on
average a factor of ten larger, but if you look at memory consumption it's a factor of a
hundred, because clausal proofs are immediately dumped to disk while, for efficiency,
resolution proofs are kept in memory, so you have these big objects in memory. The idea is:
store only the red dots and not the green ones. To come to your point: if you only have the
literals in the clause but not the incoming arcs, how do we reconstruct the arcs? You do unit
propagation. In unit propagation, a clause is unit if all its literals are falsified except for one,
and then we can assign the remaining literal to true; we do that until fixpoint. For example,
we have five clauses here and the simple assignment not c; then there are two unit clauses,
not b and a, and we take one of them to extend the assignment. Say we assign not b to true,
which creates another unit clause, not a. We now take another unit clause, for instance a, and
assign it to true. We extend the assignment, and now we're done: there are no unit clauses
anymore, but notice that there is one falsified clause, not a or b. This procedure can be used
to reconstruct the arcs, and it works as follows. The procedure is called reverse unit
propagation: the main idea is that we assign all the literals in the lemma to false and apply
unit propagation. If unit propagation causes one of the existing clauses to become false as
well, then we can reconstruct the arcs from the unit propagation steps. If we assign not c to
false, then we get these two units, which falsify this clause. And the reason why we are now
able to add lemmas is that if this clause is falsified, then one of the antecedents is falsified, so
the only way to satisfy the formula is: if all of these clauses are satisfied, the lemma must be
satisfied as well. That is why this clause is redundant. Is that clear? Now we have a clausal proof.
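The reverse unit propagation check can be sketched in a few lines of Python. This is an illustrative reimplementation, not the actual checker; clauses are tuples of signed integers (a = 1, b = 2, c = 3), and the three-clause formula below is in the spirit of the talk's example: falsifying the lemma c yields the units not b and a, which falsify not a or b.

```python
def unit_propagate(clauses, assignment):
    """Extend a set of true literals by unit propagation to fixpoint.
    Returns "conflict" if some clause becomes fully falsified,
    otherwise the extended assignment."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue  # clause already satisfied
            pending = [lit for lit in clause if -lit not in assignment]
            if not pending:
                return "conflict"  # every literal falsified
            if len(pending) == 1:
                assignment.add(pending[0])  # unit: force the last literal
                changed = True
    return assignment

def has_rup(clauses, lemma):
    """Reverse unit propagation: falsify all literals of the lemma and
    propagate; the lemma is redundant if this leads to a conflict."""
    return unit_propagate(clauses, {-lit for lit in lemma}) == "conflict"

# (not b or c), (a or c), (not a or b); checking the lemma (c):
formula = [(-2, 3), (1, 3), (-1, 2)]
print(has_rup(formula, (3,)))   # prints True
print(has_rup(formula, (-3,)))  # prints False: no conflict follows
```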
>>: [indiscernible]
>> Marijn Heule: This procedure, no. The one that I use for [indiscernible] will come later in
the presentation and will change the set of [indiscernible]. But this one keeps logical
equivalence [indiscernible]. This is a clausal proof: we have here the clauses in the order they
were learned, and we check them by the procedure I just explained. We assign all the literals
in the lemma to false, do unit propagation, and check whether there is a falsified clause; in
that case we can reconstruct the arcs. We do the same for not a: again, unit propagation
falsifies one of the other clauses. We do the same for not c, assigning c to false, which creates
this, and in the end we apply unit propagation to all the clauses in the proof, which should
result in a falsified clause. This way, by applying unit propagation, we were able to
reconstruct the arcs without actually storing them. As soon as you have checked a lemma you
just throw the arcs away, or you can store them [indiscernible], but you don't have to keep
anything in memory. The biggest disadvantage of this technique is that the existing methods
to validate proofs, to reconstruct the arcs, are expensive.
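The basic forward checking procedure just described can be sketched as follows. This is an illustrative sketch with an invented toy formula, not the actual tool; every lemma is RUP-checked against all clauses seen so far, which is exactly what makes this baseline expensive.

```python
def unit_propagate(clauses, assignment):
    """Unit propagation to fixpoint; "conflict" if a clause is falsified."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue
            pending = [lit for lit in clause if -lit not in assignment]
            if not pending:
                return "conflict"
            if len(pending) == 1:
                assignment.add(pending[0])
                changed = True
    return assignment

def check_clausal_proof(formula, proof):
    """Validate each lemma in order by reverse unit propagation, then
    add it to the clause set. The proof is a refutation if it ends
    with the empty clause."""
    clauses = list(formula)
    for lemma in proof:
        if unit_propagate(clauses, {-lit for lit in lemma}) != "conflict":
            return False  # lemma does not follow by unit propagation
        clauses.append(lemma)
    return proof[-1] == ()

# A toy unsatisfiable formula over a=1, b=2 and a two-lemma proof.
formula = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
print(check_clausal_proof(formula, [(1,), ()]))  # prints True
```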
Now three improvements for doing it more efficiently. The first one was already proposed in
the original paper: the procedure I just showed you was described in a paper by Eugene
Goldberg and Yakov Novikov in 2003, so it's a decade old. They also proposed to do it the
other way around: start by validating the empty clause and go backwards, and if you use
conflict analysis after reconstructing the arcs, you can mark the clauses that got an arc, and
you only have to check the lemmas that have been marked during the process. The advantage
is that you have to validate much fewer lemmas. The disadvantage, and why I was initially
against it, is that it is much more complex: you need this conflict analysis procedure, and the
checker becomes twice as large. In the end, the gain is too big to ignore. Although the idea is
a decade old, there was no fast checker for this procedure available, so our first contribution is
a fast open-source implementation of it. But even with backward checking, it is 10 to 20 times
slower than actually solving the benchmark; the solving time is much less than the checking
time. The main reason checking is so much more expensive than solving is that besides
learning lots of clauses, solvers also aggressively delete clauses. By aggressively deleting
clauses, unit propagation goes much faster, and that's what you want to have in the proof as
well: you want to exploit not only the learned information but also the deletion information.
This is done by introducing gray boxes; gray means the clause is deleted. After you learn not b
as a lemma, and there is not b or c in the formula, you know that you can delete not b or c
because it is subsumed: each assignment that satisfies not b also satisfies not b or c. In the
FMCAD paper we show that you can combine this with ignoring certain clauses while going
backwards. Another nice result is that you can optimize the clause deletion information
during checking: the tool we provide not only heavily shrinks the formula and the proof,
frequently a 90 percent reduction, but also optimizes the deletion information, so that
checking with a verified checker is much faster. And the third
optimization is so-called core-first unit propagation. While checking proofs backwards, only
10 to 20 percent of the lemmas and clauses will actually end up in the final core. Doing all the
unit propagation over the other 80 to 90 percent is completely useless, because you will never
use them, so you want to avoid propagating on them; it's very expensive. The idea is to do
unit propagation first only on clauses that are already in the core, to fixpoint. If there are no
core clauses left to propagate, you stop, or you try to find a single clause which is not marked,
propagate that one, and see whether there are now again core clauses that can be used. Yeah?
>>: So when you start with the [indiscernible], you don't know what the core is. So in the first
round you might have to [indiscernible], and then you proceed, say, using whatever was
scanned before, and then you scan more.
>> Marijn Heule: If you start with the empty clause, which is always marked from the start,
then you do unit propagation and only step three will be done, because there are no other
clauses in the core yet. For the first step it's not a very good procedure; it's more expensive,
because you constantly check first whether any core clauses have become unit, but there are
no core clauses.
>>: Right, but I was wondering how you make the step-two additions in the bootstrapping
case. It seems that in step two you would have to examine the entire trail. Do you do that by
traversing the trail in reverse order, taking the given clauses as [indiscernible]?
>> Marijn Heule: What I do is keep two pointers. As soon as a variable is propagated, I mark it
as to-examine, and there are lists of the clauses that are in the core and not in the core. I first
check the core clauses, and as soon as that pointer reaches the end and all the variables have
been checked against the core clauses, it starts doing the same for all the variables against the
non-core clauses.
>>: I guess the question is in the first case two has no core clauses so [indiscernible]
>>: So [indiscernible] watch clauses that all [indiscernible]
>> Marijn Heule: There are two pointers into the trail, which contains all the variables that
have been assigned a value. For each variable, in the order they have been assigned, I check
whether any core clauses become unit at this point: for instance, first for variable a; if there
are no core clauses, I go to variable b and then variable c. And you cannot get…
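The two-phase loop being discussed might be sketched like this. It is a simplified illustration of core-first propagation over plain clause lists; the actual implementation works on the trail with the two pointers just described, and with watched literals.

```python
def core_first_propagate(core, others, assignment):
    """Core-first unit propagation sketch: propagate over marked (core)
    clauses to fixpoint first; only when nothing there propagates,
    examine non-core clauses one step at a time, then return to the
    core loop. This postpones pulling non-core clauses into the core."""
    assignment = set(assignment)

    def step(clauses):
        # Find one unit or falsified clause among `clauses`.
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue
            pending = [lit for lit in clause if -lit not in assignment]
            if not pending:
                return "conflict"
            if len(pending) == 1:
                assignment.add(pending[0])
                return "unit"
        return "fixpoint"

    while True:
        status = step(core)            # phase 1: core clauses only
        if status == "conflict":
            return "conflict", assignment
        if status == "unit":
            continue                   # stay in the core loop
        status = step(others)          # phase 2: one non-core step
        if status == "conflict":
            return "conflict", assignment
        if status == "fixpoint":
            return "fixpoint", assignment
        # a non-core clause propagated: go back to the core loop

# Falsifying c (a=1, b=2, c=3): the core clauses alone already give a
# conflict, so the non-core clauses are never examined in this run.
core = [(-2, 3), (1, 3), (-1, 2)]
others = [(-1, -3), (2, 3)]
print(core_first_propagate(core, others, {-3})[0])  # prints conflict
```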
>>: [indiscernible]
>> Marijn Heule: In some cases you can even get a factor-of-five speedup, because it really
ignores all the clauses that you can afford not to see at all: this first loop, checking only clauses
in the core, already gives you a conflict, so you never actually reach point two. You propagate,
propagate, propagate, hit the conflict in step one, and then you stop. Another nice feature of
doing it like this is that, because you propagate on core clauses first, you can really postpone
pulling clauses and lemmas into the core, so the actual cores, both the core lemmas and the
core clauses, will be smaller. You want to avoid pulling things in, because as soon as you
randomly do unit propagation, you easily pull in clauses which are not in the proof yet. Now I
will explain the same proof again, but now with the three
optimizations. On the left we have the same proof, but with these two clause deletions
inserted, and we will check the proof backwards with core-first unit propagation. The first
step: we mark the empty clause as in the core, and all the clauses that at some point get
deleted in the proof are removed from consideration; that's why they are now gone. Going
backwards, we check unit propagation on the empty clause, which, for example, uses these
four clauses to find a conflict. All the clauses involved are marked. The next step is to
continue with c, and we reuse as much as possible, which is the not a, and we end up pulling a
or c into the proof. When you use clause deletion information and go backwards, if you come
across a deleted clause, you bring it back. Now we check not a and bring those two clauses
into the core. Now the deleted clause not b or c is brought back, and at the end we have not
b, but not b is not marked, so we don't have to do any check; we can ignore it. Then we are
done. Notice it is the same proof as in the other example, but there, not b, c, and the empty
clause were in the core, and now we have not a as well, so the proofs really change; also
notice that in the other example we had all clauses in the core, but now not b or c is not in the
core. So you do not only check and reconstruct the arcs; by reconstructing the arcs
afterwards, you actually get a smaller proof and a smaller core. You might think that if you do
everything during solving and redo it afterwards you lose information; actually, it's better to
do it afterwards, because you can really exploit it with this alternative unit propagation and
other things.
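Backward checking with deletion information, as walked through above, can be sketched roughly as follows. This is a heavily simplified illustration with an invented toy proof: the real checker uses conflict analysis to mark only the clauses on the conflict path, plus watched literals and core-first propagation, whereas this sketch over-approximates the marked set by recording every clause that propagated.

```python
def propagate_used(clauses, assignment):
    """Unit propagation that also records which clauses were used.
    Returns (conflict?, used); `used` over-approximates what a real
    checker would mark via conflict analysis."""
    assignment = set(assignment)
    used = []
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue
            pending = [lit for lit in clause if -lit not in assignment]
            if not pending:
                used.append(clause)
                return True, used  # falsified clause found
            if len(pending) == 1:
                assignment.add(pending[0])
                used.append(clause)
                changed = True
    return False, used

def check_backward(formula, proof):
    """Backward checking of a DRUP-style proof: a list of ("a", clause)
    additions and ("d", clause) deletions ending in the empty clause.
    Going backwards, an addition is removed from the active set and
    RUP-checked only if marked; a deletion is undone by restoring the
    clause."""
    active = list(formula)
    for tag, clause in proof:          # replay forward to the end state
        if tag == "a":
            active.append(clause)
        else:
            active.remove(clause)
    marked = {(): True}                # the empty clause is always marked
    for tag, clause in reversed(proof):
        if tag == "d":
            active.append(clause)      # bring the deleted clause back
            continue
        active.remove(clause)          # a lemma may not justify itself
        if not marked.get(clause):
            continue                   # unmarked lemma: skip the check
        conflict, used = propagate_used(active, {-lit for lit in clause})
        if not conflict:
            return False
        for c in used:
            marked[c] = True           # mark clauses for earlier lemmas
    return True

# Toy proof over a=1, b=2: learn (a), delete (a or b), add empty clause.
formula = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
proof = [("a", (1,)), ("d", (1, 2)), ("a", ())]
print(check_backward(formula, proof))  # prints True
```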
>>: [indiscernible] C and [indiscernible] check that C is correct. What's [indiscernible] uses B
[indiscernible]
>> Marijn Heule: Yeah, actually [indiscernible]. As soon as you start with not a, everything
above it is deleted; maybe it's just kept in the picture, but you are not allowed to use any
clauses higher in the proof for unit propagation. As soon as you go down, once you check a
clause or lemma, it's out: you verify it and it's out. You can only use lemmas that are lower in
the proof. So I
implemented emitting this deletion and clausal proof information in the Glucose solver, which
won the 2012 SAT Challenge. The nice thing is that you can output all this deletion
information to disk with, say, 2 or 3 percent overhead in running cost, by adding only 40 lines
of code and keeping all the techniques used in Glucose, so you still have all the preprocessing,
everything. And the lines are actually very simple: there is a clause database management
part in the solver, and the only thing you do is, if a clause is added to the database, you print
it, and if a clause is deleted from the database, you print that you deleted it, and that's it. You
don't need any more tricks. The only reason it's 40 lines instead of 20 is that the procedures
are slightly different for the preprocessing and the solving; otherwise the same 20 lines would
have done it, which is okay. I also compared it with Picosat. Picosat is the state-of-the-art
solver for resolution proofs: the fastest solver that has all the techniques for resolution proofs
enabled. If you run them both with proof logging, Picosat with resolution proof logging and
Glucose with DRUP logging, then on the 2009 benchmarks from the SAT competition you see
this. It is a cactus plot, which means that for each line the benchmarks are sorted by the
y-axis, which is the time on a log scale. What you see is that Picosat is much slower, and one
of the major effects here is that memory consumption increases a lot because of enabling the
resolution proofs, while Glucose's memory usage is exactly the same, because the only thing
you do is dump a clause to disk when you learn it or when you remove it; there is no
additional memory consumption. The third line is the tool that implements checking the
DRUP proofs, which also computes the smaller cores and the smaller proofs. This is the time
to check the proof; it does not include the solving time. What you see is that the cost of
checking the proofs is similar to the cost of solving and generating the proofs.
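The instrumentation described a moment ago (print a clause when it enters the database, print a "d" line when it leaves) is about as simple as proof logging gets. The actual patch is inside Glucose's C++ clause database code; this Python sketch with a hypothetical solver run just illustrates the two hooks and the DRUP file format, with each line listing literals terminated by 0 and deletions prefixed with "d".

```python
import io

def log_add(out, clause):
    """Hook for clause addition: literals followed by a terminating 0."""
    out.write(" ".join([*map(str, clause), "0"]) + "\n")

def log_delete(out, clause):
    """Hook for clause deletion: the same line prefixed with 'd'."""
    out.write(" ".join(["d", *map(str, clause), "0"]) + "\n")

# A hypothetical solver run with a=1, b=2, c=3: learn (not b), delete
# the now-subsumed (not b or c), then derive the empty clause.
proof = io.StringIO()
log_add(proof, (-2,))
log_delete(proof, (-2, 3))
log_add(proof, ())
print(proof.getvalue())
# -2 0
# d -2 3 0
# 0
```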
>>: This is 100 times faster the first time [indiscernible] benchmarks? And then you say it
solves twice as many?
>> Marijn Heule: For a given time, yes. If you take a line at a given time, Glucose solves almost
twice as many. But as you can see, the running time for Picosat gets very high, and there is a
point, around 900 seconds, where lots of these benchmarks, I think a third of the instances,
can no longer be solved because of the memory consumption.
>>: And you will [indiscernible] the checking which is ten times slower?
>> Marijn Heule: Yeah, yeah. There are a few exceptions, but for most of the benchmarks, up
to here, it's close.
>>: Can you explain why it has this jump? What's the bottleneck? Why does it suddenly
become ten times slower?
>> Marijn Heule: The bottom line, I think, having looked into it, is this: for most benchmarks
this core-first propagation is really helpful, but for other benchmarks the two loops constantly
interact with each other, which makes things much more expensive. I think most of these
benchmarks here are enormous and are solved almost only by preprocessing, and in
preprocessing everything added is already resolution. There might be a million variables, and
the checker's unit propagation assigns 100,000 of them just to check that a few need to be
assigned to really get the conflict. So the most important exceptions are benchmarks with
millions of clauses that can be solved with almost only preprocessing, with variable
elimination, but whose proofs are big: with variable elimination there is just a single
elimination step, but you add all the resolvents and then remove all the original clauses.
Frequently you add a thousand clauses and remove a thousand and one, all for just one single
step, and then the loop is slow. But I guess it can be fixed: with the current three ideas that I
showed, it's already a factor of 20 faster than the original procedure proposed before, and I
guess this line can go further down.
>>: So would you then [indiscernible] in a given proof set to avoid the [indiscernible]
>> Marijn Heule: No. I think there should just be a more sophisticated unit propagation routine. You should somehow be able to easily detect that a clause is falsified. With variable elimination you assign everything to false, and if you assign just one more variable, one of the variables on which the resolution was done, then another clause is falsified. So a single unit propagation step suffices to falsify some clause, and maybe by changing the data structures or sorting things you could find it immediately; but right now, instead of assigning a single variable, you sometimes assign 100,000 of them before actually finding it. So I think there is a lot to gain there. Exactly how to do it I am not sure, but I guess making the unit propagation better would fix this. Already, you see that with core-first you can gain a lot, and I think there are other things. The interesting thing is that if you were to implement this core-first propagation as I have it now in a SAT solver, the solver would be roughly twice as slow. Doing the checking efficiently in the checker requires a completely different way of thinking than tuning the solver, because in a checker you know that as soon as there is no conflict the proof fails. You really know that at some point there must be a falsified clause, while in SAT solving you expect that unit propagation will in most cases reach a fixed point without a conflict.
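The core-first idea can be sketched as a two-phase propagation loop. This is a naive, unoptimized reconstruction under my own representation (clauses as lists of DIMACS-style integer literals, the assignment as a dict from variable to bool), not the talk's actual implementation.

```python
# Core-first unit propagation: exhaust the clauses already marked as
# being in the core before touching the remaining clauses, so that
# conflicts tend to be found inside the (small) core and fewer
# non-core clauses get pulled into the core.

def value(lit, assignment):
    v = assignment.get(abs(lit))
    return None if v is None else (v if lit > 0 else not v)

def propagate(clauses, assignment):
    # Plain unit propagation to a fixed point; returns "conflict" if a
    # clause is falsified, else None. Quadratic scan, for clarity only.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(value(l, assignment) is True for l in clause):
                continue  # clause satisfied
            unassigned = [l for l in clause if value(l, assignment) is None]
            if not unassigned:
                return "conflict"  # falsified clause
            if len(unassigned) == 1:
                assignment[abs(unassigned[0])] = unassigned[0] > 0
                changed = True
    return None

def propagate_core_first(core, rest, assignment):
    # Alternate: run the core to a fixed point first, and fall back to
    # the rest only when the core yields no conflict and nothing new.
    while True:
        if propagate(core, assignment) == "conflict":
            return "conflict"
        before = len(assignment)
        if propagate(rest, assignment) == "conflict":
            return "conflict"
        if len(assignment) == before:
            return None  # global fixed point without conflict

# Example: the core alone already forces a conflict, so the clauses in
# `rest` are never pulled in.
assert propagate_core_first([[1], [-1, 2], [-2]], [[3, 4]], {}) == "conflict"
```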
>>: [indiscernible] what the graph doesn't show is the different proofs that [indiscernible] find
[indiscernible] will find [indiscernible] learned clauses. In your example, proving doesn't see
[indiscernible]
>> Marijn Heule: This is not the only slide with results, so let me [indiscernible] this is also one of the slides. This is for the original formula, the number of clauses, on all the benchmarks that both Glucose and Picosat could solve, to have the comparison, although as you can see the running times were again very different. So you can see the size reduction after a single pass: you solve the benchmark, check it, and there is a big difference between the original size of the formula and the size you get after one step. Tools that compute minimal unsatisfiable cores can benefit from this procedure. This is the Picosat proof, which is built in memory and uses the data that is still available in the solver at the point of constructing it. If you do Glucose with the backward checking but without core-first, then the size of the proof is the red line, and the green line is what you get when you apply core-first. This shows that the proofs get smaller when you use the alternative, core-first unit propagation. Is this part clear? So tools that want minimal unsatisfiable cores really want to take this step. The first step is very easy, but getting the proofs much smaller after finding a single proof is much harder. This can be a big step in actually computing minimal unsatisfiable cores, because at some point you need to delete clauses one by one, each with a single SAT check. So I was happy that I was able to push this, with some resistance.
In the SAT competition 2013, the organization required that solvers participating in the unsatisfiability tracks emit some proof. There were two options: either TraceCheck, which is the most widely used resolution format, or DRUP, Delete Reverse Unit Propagation, so clauses with deletion information. The competition allowed 5000 seconds for each of the benchmarks to solve and allowed the checker 20,000 seconds. There were three categories; in the application category there were nine solvers emitting DRUP and two emitting RUP, so ignoring the deletion information, and there were nine solvers in hard combinatorial. The nice thing was that the top-tier solvers like Glucose and Lingeling all submitted proof-emitting versions to the competition. Some statistics: the highest-ranked solvers implemented the deletion efficiently, and 98 percent of the DRUP proofs could be checked within the 20,000-second timeout. One or two instances per solver could not be checked because they reached the timeout, but practically all of the proofs were checked, or refuted, by the checker within the time limit. Some solvers only used RUP, so without the deletion information, and for them only 40 percent of the proofs could be checked. So even with core-first and backward checking and everything else, not having deletion information brings you down from checking 98 percent of the proofs to checking only 40 percent within the time limit.
>>: Did any of the solvers report any incorrect results for the proofs they checked?
>> Marijn Heule: There are two issues here. There were a few proofs, and even for the winners, I think for each solver there was at least one benchmark; maybe MiniSat was the only one whose proofs all went through the checker, and the other ones each had maybe one that the checker couldn't reconstruct. But that doesn't necessarily mean a bug: there might be an ordering difference, especially when you delete a clause earlier than you are allowed to delete it, and then the proof might fail; that is an implementation issue. So it doesn't mean there's a bug in the solver if the proof fails, but the winner and the number two in the competition both had at least one benchmark where the checker said, I'm not able to check it. Of course, it could also be the checker.
>>: Was there a penalty?
>> Marijn Heule: No. There was no penalty. One of the rules was that if the checker says, I am not able to verify, then the benchmark just counts the same as not solving it. The reason is that this was the first year; otherwise, if there was only one solver for which all the checks went through, that solver would automatically be the winner. I think it's a tough penalty, because if you say there is a solution and you give the wrong solution you are disqualified, but for UNSAT it's still too early to have such tough rules. The other thing is, what?
>>: Want to give the [indiscernible] [laughter]
>> Marijn Heule: Something like that. As I also discussed with Leo before, there were 30 instances in the competition [indiscernible] for which there was no certification. There were 30 instances where the fastest solver in the uncertified track was at least ten times faster than the fastest solver in the certified track, which makes you think, okay, what's going on here? And for 21 instances I was able to detect that the winning solver actually had a bug on those instances. The easiest way to check it, which frequently helps, is you remove the top 10 percent of the lines, except for the first line, and then you give it to the solvers again, and frequently the fastest one still says unsatisfiable while another solver says it's satisfiable.
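The trick is simple enough to sketch. This is my own reconstruction: the file layout assumed is the standard DIMACS CNF format, with "c" comment lines and a "p cnf" header line that is kept while the first 10 percent of the clause lines are dropped.

```python
# Debugging trick from the talk: weaken a CNF by dropping the first
# 10 percent of its clause lines (keeping the header), then re-run the
# solvers. Removing clauses can only make the formula easier to
# satisfy, so if one solver still claims UNSAT while another finds a
# model, the UNSAT claim is suspect. Note the clause count in the
# "p cnf" header becomes stale, which most solvers tolerate.

def truncate_cnf(lines):
    header  = [l for l in lines if l.startswith(("c", "p"))]
    clauses = [l for l in lines if not l.startswith(("c", "p"))]
    return header + clauses[len(clauses) // 10:]

cnf = ["p cnf 2 10"] + ["1 2 0", "-1 2 0"] * 5
weakened = truncate_cnf(cnf)
assert weakened[0] == "p cnf 2 10"   # header kept
assert len(weakened) == 10           # 1 header + 9 of 10 clauses
```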
>>: [indiscernible] example where the [indiscernible] 0 missing or there is a line with a 0.
>> Marijn Heule: Yeah. One of the [indiscernible] was a solver for which an empty line in the input file was treated as an empty clause, making the formula immediately unsatisfiable, while an empty line should just be ignored. Therefore, if you remove some clauses and the empty line is still in the file, that solver keeps saying unsatisfiable, while other solvers at some point say they found a solution, because of the missing clauses. There are other things too, but okay, I don't want to go into too much detail here. It shows that when solver authors say they had to turn off some features because it was not easy to emit the DRUP information for them, maybe it was because those features are buggy, and then it's hard to implement, because the checker would always say it's incorrect. So yeah, I'm really in favor of checking things, because you can see that solvers with bugs actually come up much higher in the rankings because of those bugs. On these first few instances they were not found buggy in this track, so everything looked correct; only after the competition, using this trick of removing clauses, do you see that they are buggy, while if you only look at the competition results, everything is fine.
>>: [indiscernible] competition have [indiscernible] benchmarks or are they all [indiscernible]
>> Marijn Heule: There are unknown benchmarks in all categories, and especially in the random category we generate everything at the phase transition, so half of them are expected to be satisfiable and half are expected to be unsatisfiable, but there's no way you can ever prove unsatisfiability on these benchmarks, so for roughly 50 percent you might assume the answer, but you never know for sure. So in the random category there are lots of [indiscernible].
>>: If you train your solver on known benchmarks [indiscernible] most of the benchmarks are
known and then there's some chance…
>> Marijn Heule: What do you mean by known? Do you mean known solved or known unsolved?
>>: [indiscernible] benchmarks
>> Marijn Heule: No. No. No.
>>: Suppose you train your solver on [indiscernible] benchmarks and it has this [indiscernible]
for something, you are not going to detect it by rerunning [indiscernible] in the competition.
>> Marijn Heule: Sorry. I had the wrong impression. I thought your question [indiscernible] unknown meant unsolved by solvers.
>>: [indiscernible] mistakes
>> Marijn Heule: For the competition we did it 50-50. Fifty percent of the benchmarks were already published somewhere on the internet, and 50 percent were new contributions to the competition. That's the best we can do, because there's a limited number of people submitting benchmarks. I think we had one participant who submitted over 1000 instances, but if you selected many of them you would really favor this one person, because he also submitted a solver; you could then win by submitting your own benchmarks together with a solver optimized for them, which nobody else knows about. So you are really limited in how many new benchmarks you can use, and the 50-50 split is, I think, a good compromise. Now a little graph about the competition. I already talked about the two different tracks. The same color means the same solver, either in the category where there is no checking and logging or in the category where there is checking and logging. What you see for Glucose and for Riss is that their performance is pretty much the same in both tracks, all features enabled with no logging or checking versus with them. But if you look at Lingeling, which did not emit deletion information, you really see that without deletion information hardly anything can be checked anymore and it all breaks down.
>>: [indiscernible]
>> Marijn Heule: For Lingeling it is much more than 40 lines, because it uses so many techniques and so many data structures, and he has not implemented it yet; I tried to convince him to implement it for the next competition. It's another animal, a huge solver with lots of things. But you can really see that if everything had been implemented, he would definitely also have won the certified track. As it is, Glucose won that track; it was the fastest on the certified application category.
>>: [indiscernible] small addition to [indiscernible]
>> Marijn Heule: [indiscernible] what do you want to say with that?
>>: You get lots of bang for the buck. So the difference is that you can solve ten more
benchmarks?
>> Marijn Heule: No. That's not much when you consider how many lines of code have been
added.
>>: Yeah. That's one reason [indiscernible] how many lines of code do you [indiscernible]
>> Marijn Heule: I think Lingeling is probably about 15,000 lines of code and Glucose maybe a thousand. But yeah, for the SAT competition this is considered a big difference.
>>: [indiscernible] much faster [indiscernible] instance [indiscernible] right?
>> Marijn Heule: Yeah, but that's because lots of preprocessing is done, and as I mentioned before, lots of submitted benchmarks really have a [indiscernible] encoding, and Lingeling in the early stages tries to fix that. For easy benchmarks it doesn't pay off, but for the harder ones it does, so at some point Lingeling takes over, because then all of that work becomes useful. But as you can see, for our [indiscernible] problems, which are typically very small and where not much re-encoding optimization is possible, Glucose is faster. And for Riss it's the same story; it is also just a few more lines on [indiscernible], similar in size and architecture. So it's ten to twelve. I also have some slides about expressive proofs; shall I continue with this?
>>: [indiscernible] to me, I think you should continue.
>> Marijn Heule: Okay. I don't know if there is someone else coming into the room at 12:00.
>>: [indiscernible] not until 1:30 but you might be done by then.
>> Marijn Heule: Okay. We might. [laughter]. So far I've been talking about RUP and DRUP, and it covers resolution, the most important learning paradigm, CDCL clause learning, Boolean constraint propagation, subsumption; everything there is covered by this DRUP format. But there are some techniques used in solvers, for instance in Lingeling, all these techniques over here, that are not covered. There's another polynomial-time checking procedure, called RAT, which I will explain a little later, that does cover them: if you do the RAT check instead of the DRUP check, you cover everything that is used, including the techniques based on blocked clauses.
>>: [indiscernible]
>> Marijn Heule: What do you mean, do you add -- in Lingeling you can add blocked clauses, yeah. By the way, that is one of the things that was buggy and actually motivated the whole inprocessing rules [indiscernible]. On a few million random instances there was no bug, but a small example of only seven clauses was buggy in Lingeling, and it's really tricky to see how to implement this in such a way that it's sound. Here, roughly, you run Lingeling with blocked clause addition.
>>: I just thought [indiscernible]
>> Marijn Heule: Yeah. The nice thing you get from blocked clause addition is this: blocked clause elimination can frequently shrink the instance and thereby empower variable elimination, because there are fewer clauses, but sometimes it's useful to add blocked clauses back for the remaining variables, to add propagation power for them.
>>: Oh. Okay. I see what you're doing.
>> Marijn Heule: So you first bring it down, and then you bring new clauses back to life for the variables still remaining in the formula, to enable new propagation steps.
>>: [indiscernible]
>> Marijn Heule: Yeah. I will talk about RAT later, how to [indiscernible].
>>: Oh RAT is the [indiscernible]?
>> Marijn Heule: Yeah. RAT is the [indiscernible]. One of the problem families on which typical SAT solvers like Glucose are extremely slow is the pigeonhole problems. It's a very easy problem at a high level: given n-1 holes and n pigeons, can we put the pigeons in the holes such that no two pigeons share a hole? Even a little kid can say, okay, that's not possible, but for SAT solvers, if you encode it, resolution proofs will be exponential unless you use special techniques like the ones in the bigger circle. One way to do it is with extended resolution, which gives proofs of cubic size in the number of pigeons: you translate the problem from n pigeons and n-1 holes to n-1 pigeons and n-2 holes, you do this n-1 times, and then the problem becomes trivial. But if you express those proofs, DRUP says, I cannot explain this. Another technique which is actually pretty useful for some benchmarks is called bounded variable addition. Given a CNF, we try to find a set of clauses which we can replace by a smaller set of clauses by introducing a new Boolean variable x, where x does not occur in the formula. The smallest example for which this works is: we have these clauses, and we want to replace them by these lemmas. If we add a single one of them to the formula, assign it to false, and x does not occur elsewhere, then you can see there is no conflict. How do you deal with such a technique? Even with extended resolution, which you might think is enough because of the small extended resolution proofs for the pigeonholes, it's not easy to make an extended resolution proof which can express this.
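The kind of rewrite bounded variable addition performs can be illustrated on the standard smallest case. The specific clause set below is my reconstruction of that textbook example, not the one on the slide: six binary clauses over a..e are replaced by five clauses with a fresh variable x, and a brute-force check confirms the set of solutions over the common variables is unchanged.

```python
from itertools import product

# Bounded variable addition, smallest standard case. Variables are
# numbered a=1, b=2, c=3, d=4, e=5, and the fresh variable x=6.
ORIG = [( 1, 3), ( 1, 4), ( 1, 5),   # (a|c)(a|d)(a|e)
        ( 2, 3), ( 2, 4), ( 2, 5)]   # (b|c)(b|d)(b|e)
NEW  = [( 6, 3), ( 6, 4), ( 6, 5),   # (x|c)(x|d)(x|e)
        (-6, 1), (-6, 2)]            # (-x|a)(-x|b)

def sat(clauses, assign):
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def models(clauses, nvars, project):
    # All satisfying assignments, projected onto `project` variables.
    out = set()
    for bits in product([False, True], repeat=nvars):
        assign = dict(enumerate(bits, start=1))
        if sat(clauses, assign):
            out.add(tuple(assign[v] for v in project))
    return out

# Same solutions over the common variables a..e, with one clause fewer:
common = [1, 2, 3, 4, 5]
assert models(ORIG, 5, common) == models(NEW, 6, common)
```

Here both clause sets express (a AND b) OR (c AND d AND e), which is why the projections coincide: in NEW, x simply names the (a AND b) branch.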
>>: [indiscernible]
>> Marijn Heule: What?
>>: [indiscernible] this is the case where [indiscernible] equivalent but [indiscernible]
>> Marijn Heule: Yeah, although it's not obvious. In this case it's somewhere in between, because with blocked clause addition you really can throw out solutions, while if you do this trick the number of solutions stays the same: the same set of solutions over the common variables. It's in between satisfiability-equivalent and logically equivalent.
>>: [indiscernible]
>>: So I was under the impression that you would always preserve [indiscernible] under
[indiscernible], but what you are saying now is [indiscernible] clause addition [indiscernible]
>> Marijn Heule: Take this example: we have A or B, and we have not A or not B. If this is the formula, we can add this clause, which is blocked with respect to this formula; so blocked clause addition allows you to add a clause which clearly removes one of the solutions. There are different techniques, and extended resolution in general has this in-between property: the same set of solutions over the common variables.
>>: [indiscernible]
>> Marijn Heule: But to express this with [indiscernible] you cannot really do it; you can, but it's tricky. You actually want a technique, the RAT that I will explain on the next slide, where all these clauses have RAT, so you don't have to make lots of resolution steps to explain why you can add a clause. You can do exactly what the SAT solver is doing: if you add a clause to the database you just say you added it, and if you delete it you say you deleted it, without needing a whole set of additions and a whole set of deletions to convince the checker that it's actually sound. First this slide, which is the motivation for why you want this bounded variable addition: these are three pigeonhole problems and three bioinformatics problems. If you run Glucose without BVA preprocessing, these are the running times; if you first do BVA preprocessing, these are the running times, and you see there is a large speedup. For certain problems you really want this kind of preprocessing step, because it can save you a lot of runtime. You want to ask something? This we've seen. This is the RAT procedure
and a larger example is on the next slide. A lemma has RAT, resolution asymmetric tautology, if it has RUP, or if there is a literal in the lemma such that all resolvents on that literal, namely with all clauses containing the complement of the literal, are either tautologies or have RUP. So you have a clause and a certain literal, and you take all possible resolvents on that literal: if there are, say, ten clauses containing not L, then you have ten resolvents, and all these resolvents must either be tautologies or have RUP. That's the property. Should I go through the example? We have a formula; this is roughly the smallest formula for which a RAT proof is smaller than a RUP proof, although in general the difference can be exponential, but you need some clauses in there to show it off. So this is eight clauses, unsatisfiable, and this is a RAT proof, where the triangle means that the clause has RAT and no RUP. It's easy to see that this clause has no RUP: all clauses have length three, and we assign not A to false, which means we assign A to true; then some literals get falsified, but since every clause has length three there will be no unit propagation. So it's easy to see that RUP is not going to work. Now we check it with forward checking, and we detect, okay, there is no RUP, so now we check whether it has RAT. What do we do? We do all resolutions with clauses containing the complement of the pivot, which here is A, so we take all clauses containing A. This clause contains A; we do resolution, and this yields the clause B or C. We do the same here and get the clause C or D, and we do the resolution and get the clause not B or not E. So resolution gives us these three clauses, and the lemma has RAT if and only if all of these are tautologies or have RUP. So that's the definition.
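The definition translates almost directly into code. Below is my own naive sketch, not the actual checker from the talk; clauses are lists of DIMACS-style integer literals, and the pivot is fixed to the lemma's first literal, matching the proof-format convention mentioned later in the talk.

```python
from itertools import product

# RAT check: a lemma has RAT on a pivot literal if it has RUP, or if
# every resolvent on the pivot with a clause containing the pivot's
# complement is a tautology or has RUP.

def propagate_conflict(clauses, assignment):
    # Unit propagation; True iff some clause becomes falsified.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause satisfied
            un = [l for l in clause if abs(l) not in assignment]
            if not un:
                return True  # falsified clause: conflict
            if len(un) == 1:
                assignment[abs(un[0])] = un[0] > 0
                changed = True
    return False

def has_rup(formula, clause):
    # Reverse unit propagation: assume the clause's negation, propagate.
    return propagate_conflict(formula, {abs(l): l < 0 for l in clause})

def has_rat(formula, lemma):
    if has_rup(formula, lemma):
        return True
    pivot = lemma[0]  # pivot fixed to the first literal
    for clause in formula:
        if -pivot in clause:
            resolvent = [l for l in lemma if l != pivot] + \
                        [l for l in clause if l != -pivot]
            if any(-l in resolvent for l in resolvent):
                continue  # tautology
            if not has_rup(formula, resolvent):
                return False
    return True

# All 2^3 clauses of length 3 over variables 1..3: the unit clause (1)
# has no RUP with respect to this formula, but it does have RAT.
F = [[s1 * 1, s2 * 2, s3 * 3] for s1, s2, s3 in product([1, -1], repeat=3)]
assert not has_rup(F, [1]) and has_rat(F, [1])
```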
>>: [indiscernible]
>> Marijn Heule: Because none of them are tautologies here. If there is a tautology we can discard it, and then we have to check for the remaining ones whether they have RUP. So first we create those three resolvents: we take this clause and compute all of the resolvents. That's the first step. And now we check, and we see, okay, we can reconstruct this one using RUP, like we did before, and we can do the same with the other two resolvents. Now we have checked them all, we know the clause has RAT, and we can add it to the formula and discard all of the resolvents. And now we can use this clause as an antecedent, and then we can derive the empty clause and we are done. So what you see here is that we have only three clauses in the proof.
>>: So the [indiscernible] so you have an example with the, where x goes into [indiscernible]
and the intuition then is that you select x when you solve and then, when you solve
[indiscernible]
>> Marijn Heule: Yeah. So what you have is, for instance, we first add all these three clauses with x, and then we check for all resolvents on x whether they are tautologies or have RUP; but since not x does not occur in the formula yet, there are no resolvents, so everything trivially has RAT. So first we can add these for free, because x is new. Then, for instance, we add this clause; we do all the resolutions and end up with these three clauses, which have RUP, and we do the same thing there: we add all resolvents, which brings us these clauses. So you can see why it works for this technique, but the nice thing is that it works for all [indiscernible] techniques. I think it is a very elegant way of checking things. And the same thing holds for
extended resolution proofs. For extended resolution, everything is a tautology. For instance, we add a new variable x which is the AND of A and B; we can do the first trick here and add the x clause, because nothing is there yet, and now if we do resolution on these two we get a tautology, and if we do resolution on these two we get a tautology. So for extended resolution RAT works because of tautologies, for this technique it works because of RUP, and there are techniques where you need a mixture. Yeah?
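That extended resolution step can be checked concretely. The encoding here is my own (a=1, b=2, and the new variable x=3): defining x as the AND of a and b adds the clauses (x | -a | -b), (-x | a), (-x | b), and every resolvent on x of a "direction" clause with the definition clause is a tautology, which is exactly why RAT accepts them.

```python
# Clauses as lists of DIMACS-style integer literals.
DEFINITION = [3, -1, -2]          # x | -a | -b
DIRECTIONS = [[-3, 1], [-3, 2]]   # -x | a   and   -x | b

def resolve(c1, c2, pivot):
    # Resolvent of c1 (containing pivot) and c2 (containing -pivot).
    return [l for l in c1 if l != pivot] + [l for l in c2 if l != -pivot]

def is_tautology(clause):
    return any(-l in clause for l in clause)

for direction in DIRECTIONS:
    r = resolve(direction, DEFINITION, -3)  # resolve on x
    assert is_tautology(r)  # e.g. [1, -1, -2] contains both a and -a
```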
>>: But your definition is [indiscernible] is in the previous slides…
>> Marijn Heule: Yeah. The definition here is kept short; I have to [indiscernible] so…
>>: Yeah. But, but [indiscernible] is the lemma [indiscernible] because there is no recursion to
that.
>> Marijn Heule: No, there is no recursion. We did think about the [indiscernible] technique, which requires recursion, because everything becomes [indiscernible]. And if RUP fails on the clause, there must be such a literal, and in the proof format we explicitly demand that the clause has RAT on its first literal; so if you add a clause in the proof, it has to have RAT on the first literal, to really reduce the checking costs, because otherwise the checker must find out which literal it is, which is of course easy for this kind of example. The definition in the paper is much more formal, especially [indiscernible]; this is my informal version here.
>>: Is it sound to [indiscernible] RAT or is it [indiscernible]
>> Marijn Heule: I think it's sound, but I have to check; it's not obvious, I think. For instance, if you have a formula with n variables and all 2 to the power n clauses of length n, then any resolution proof will be exponential, but the RAT proof just has all the unit clauses: say x1, x2, up to xn, and that's the proof, and then you derive the empty clause. So for those kinds of formulas, just to give you an example, you have only the units and you are done.
>>: That was the example with A, B, C.
>> Marijn Heule: No, that one has four literals, not three, but it is similar: you can generalize it so that if all clauses have length equal to the number of variables, then all the unit clauses have RAT. So there are all kinds of nice things that you get: you have small RAT proofs. I think it's really nice that you first compute all of these resolvents and check them, and then you can discard everything and keep just this compact information. Yeah. So this was kind of the
tool chain that we had: we have this [indiscernible] with added and deleted clauses, and we put it through the checker. But now the question is, do we trust the checker? Things have become much more complicated, especially if you want all of the optimizations in the checker; it's now about 400 lines of code. So are we done? No. I think it would be really cool to have a verified checker, and one of the focuses of future work will be to use the tool that we have now, which also for RAT gives us reduced cores, reduced proofs, and optimized clause deletion information. We can take the big proof and give it to the proof trimmer, which is actually the same tool as the checker from before; this DRAT-trim tool we have now significantly shrinks the proof and optimizes the clause deletion information, and then the proof is small enough that we can check it with a verified checker. So lots of optimizations can go in, and we can then check the result with something which we trust even more. For instance, the verified checker can be implemented in, say, 100 lines of code. This is one of the things I think would be very nice to get done, and Nathan, who as I showed is a coauthor of this work, is doing his PhD on this.
>>: [indiscernible] seems like you had [indiscernible]
>> Marijn Heule: Yeah. [indiscernible] is proposal.
>>: Oh, proposal, okay. [indiscernible]
>> Marijn Heule: Yeah. So [indiscernible]
>>: It said something about verified [indiscernible]
>> Marijn Heule: Yeah. He's working on the verified, mechanically verified checker that can handle adding and deleting [indiscernible] clauses, and he writes it in ACL2. The nice thing about ACL2 is that you can get to, say, 60 or 70 percent of the speed of C if you…
>>: [indiscernible]
>> Marijn Heule: There is lots [indiscernible]. We have this [indiscernible], I don't know if you are familiar with it, but there are all kinds of low-level techniques which are now supported, though that of course makes the proof much harder. He has a proof at the high level, and now we have a low-level implementation, and bringing them together is probably a year's work, because these proofs can take a long time. Just to give you a little impression of the proof reduction we have now: although I think the more important thing is almost that you can optimize the deletion information, which can really help to guide the unit propagation, because in the raw proof it is not optimal. This is the size of the input proof of Glucose, and this is the size the trimmer gives back, so you can see it's slightly more than a factor of two that you can reduce the proof by, together with optimized deletion information. And for the original clauses, as in the plot I showed earlier, you can easily have a 90 percent reduction. So you have a much smaller number of original clauses, a smaller proof, and optimized deletion information, which makes it possible that even a mechanically verified checker should be able to deal with these in a reasonable amount of time. So here is the contribution slide again, and what is actually [indiscernible].
>>: So the RAT proof is for [indiscernible] it's not for [indiscernible]
>> Marijn Heule: You can do both. But the checker, the verified one, will go forward, because if
you have optimal clause deletion information there is really no use in going backward. It
doesn't make sense, because everything will be in the core, so you don't gain much by going
backward, and the cost of going backward is conflict analysis and all these things you really
don't want to verify; you really want to keep things as clean as possible. So the verifier will go
forward with the optimal clause deletion information, a smaller proof, and smaller clauses. But
the C version, the proof trimmer, will go backward, also for RAT. You can do the same trick if
you go backward; there is not much difference between going forward and backward when
checking RAT. When you check a RAT clause, you take all the resolvents with everything above
it in the proof. You just don't do resolution with anything that comes after it; you only do
resolution with everything that comes before.
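[Editor's note: the forward checking and RAT check just described can be sketched roughly as follows. This is a toy Python sketch, not the actual tool; the clause representation and all function names are my own, and real checkers such as DRAT-trim use far more efficient propagation data structures.]

```python
# Toy sketch of forward clausal-proof checking with RUP and RAT checks.
# Clauses are frozensets of nonzero ints (a negative int is a negated literal).

def unit_propagate(clauses, assignment):
    """Exhaustive unit propagation; returns None on conflict,
    otherwise the extended assignment."""
    assignment = set(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            free = [lit for lit in clause if -lit not in assignment]
            if not free:
                return None                    # clause falsified: conflict
            if len(free) == 1 and free[0] not in assignment:
                assignment.add(free[0])        # unit clause: propagate it
                changed = True
    return assignment

def has_rup(clauses, clause):
    """A clause has RUP iff propagating its negation yields a conflict."""
    return unit_propagate(clauses, {-lit for lit in clause}) is None

def has_rat(clauses, clause, pivot):
    """RAT on `pivot`: every resolvent with a clause containing -pivot must
    have RUP with respect to the clauses occurring *before* this point in
    the proof (resolution only with everything above, never below)."""
    if has_rup(clauses, clause):               # RUP is a special case of RAT
        return True
    for other in clauses:
        if -pivot in other:
            resolvent = (clause - {pivot}) | (other - {-pivot})
            if not has_rup(clauses, resolvent):
                return False
    return True

def check_forward(formula, proof):
    """Forward check: verify each added lemma, apply deletions eagerly.
    Deletion information keeps the clause set, and hence propagation, small."""
    clauses = list(formula)
    for tag, clause in proof:
        if tag == "d":
            clauses.remove(clause)
        elif not has_rup(clauses, clause):
            return False
        else:
            clauses.append(clause)
    return frozenset() in clauses              # was the empty clause derived?
```

For example, for the unsatisfiable formula (1 ∨ 2)(1 ∨ ¬2)(¬1 ∨ 2)(¬1 ∨ ¬2), the proof "add (1), delete (1 ∨ 2), add the empty clause" checks out forward, with each addition justified by unit propagation alone.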
>>: So the RAT check works both forward and backward? But the one presented here was the
forward version?
>> Marijn Heule: Yeah, to keep it clearer, yeah.
>>: And your trimmer goes backwards?
>> Marijn Heule: Yeah, the trimmer goes backward. The one I have in mind, which I hope to
finish in maybe a month or so, also combines this and this, which requires going backward and
using the deletion information.
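[Editor's note: the backward trimming just mentioned can be sketched as follows. Again a toy Python sketch with invented names, not the actual trimmer: lemmas are checked from the last (the empty clause) back to the first, and a lemma is only checked, and kept, if some later check actually used it. Real trimmers track the exact antecedents of each conflict; marking every clause touched during propagation, as done here, is a simple overapproximation.]

```python
# Toy sketch of backward proof trimming for clausal (RUP) proofs.
# Clauses are frozensets of nonzero ints (a negative int is a negated literal).

def propagate_marking(clauses, assignment):
    """Unit propagation that records the indices of the clauses it used.
    Returns (conflict_found, used_indices)."""
    assignment = set(assignment)
    used = set()
    changed = True
    while changed:
        changed = False
        for i, clause in enumerate(clauses):
            free = [lit for lit in clause if -lit not in assignment]
            if not free:
                used.add(i)
                return True, used              # conflict reached
            if len(free) == 1 and free[0] not in assignment:
                assignment.add(free[0])
                used.add(i)
                changed = True
    return False, used

def trim_backward(formula, lemmas):
    """Return the sublist of lemmas still needed for the refutation."""
    clauses = list(formula) + list(lemmas)
    needed = {len(clauses) - 1}                # the final (empty) lemma is needed
    kept = []
    for idx in range(len(clauses) - 1, len(formula) - 1, -1):
        if idx not in needed:
            continue                           # never used by a later check: drop
        lemma = clauses[idx]
        conflict, used = propagate_marking(clauses[:idx],
                                           {-lit for lit in lemma})
        assert conflict, "lemma is not a RUP consequence of what precedes it"
        needed |= used                         # its reason clauses become needed
        kept.append(lemma)
    kept.reverse()
    return kept
```

On the same four-clause formula as above, a proof containing a redundant lemma (1 ∨ 3) alongside (1) and the empty clause is trimmed down to just (1) and the empty clause, since the redundant lemma is never used by a later check.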
>>: I think I was asked to shut up so you can finish.
>> Marijn Heule: Yeah, so was I. [laughter] As you see, I am perfectly finished. So the work over
here does forward RAT as well, and by combining the deletion information with the procedures I
have here, I think you can have everything. And then, combining that with feeding the reduced
output to a verified checker, I think you have everything you could want for checking. Yeah, so
this kind of sums it up. I think this is maybe the more important point for discussion later:
should this kind of proof logging be mandatory for competitions, and maybe also the SMT
competition? When I raised this line at FMCAD, the only people who discussed these things with
me during the coffee break were saying please don't, please don't do it. [laughter] This is too
hard, and you should really back off. In the SAT community people are more positive about this,
but in the [indiscernible] competition they really fear that it might be extremely hard, combining
[indiscernible] and all kinds of stuff into this. But I think it would be good to have more faith in
it, because the tools here definitely do a lot of work as well. That's it. There are four
publications this year on this proof checking, and I have some more plans, so we will follow up
with more. Yeah. Thanks for your attention. [applause]