>> Nikhil Swamy: All right. So let's get started. I'm really happy to introduce Pierre-Yves
Strub. Pierre-Yves is a postdoc at the MSR-INRIA Joint Centre in Orsay. And we've been
working together on the F* project. Pierre-Yves has been our Coq expert in this project. And
he's going to tell us today about some work that's not so related to F* but more his work in his
Ph.D. related to extending Coq with decision procedures in a tool called Coq Modulo Theory,
which is going to be part of the next release of Coq. So this is going to be the mainline of Coq 9.
So Pierre-Yves.
>> Pierre-Yves Strub: Okay. Thank you. So let me start to introduce this work with formal
methods. So formal methods are tools for proving correctness of programs. And as of today we
start to have some big [inaudible]. So, for example, you have some OS kernel verification in
Coq. You have the CompCert project, which is a certified C compiler for a big subset of C. And
you have also the ASTRÉE static checker, which is used for checking the absence of [inaudible]
error in some controls of the Airbus A380, for example.
And the idea here is that formal tools are complex programs too. So why should I trust the
machine this time, when I'm using the formal tools, since I don't trust the machine for my usual
complex programs?
And so the idea is that you want your formal tools either to produce a certificate that you can
check with a simple verifier, or to be small enough to be read and trusted, like no more than
5,000 lines of code.
And the Coq Proof Assistant is such a reliable formal tool. So in short the Coq Proof Assistant
was designed to develop mathematical proofs. It allows you to write formal specifications, to
write programs, and to prove that the programs comply with a specification.
And the Coq Proof Assistant is reliable because it is split into two parts. First, on the top
here, you have the proof checking part. This part is split into two subparts: the formal
language, which is not a program but the logic you are using in the Proof Assistant, and the
proof checker, which is given a proof [inaudible] and which checks that this is indeed a proof
of the property we want to prove with [inaudible] to the formal language.
On top of that, you have the proof construction stack. So this one is bigger. Here you have a
proof engine. So the proof engine is a very low-level program which allows you, from some
commands, to build some proofs for the proof checker. Then you have the proof development
language. So this one is the high-level language for proof construction. And with these two parts
you are able to build some libraries of theorems that you can then use for more standard
developments.
And so the idea here is that, to trust your proof, you trust that the proof checker
is [inaudible] the job. You only have to trust that part, that is the final part which tells you
whether your final proof is correct or not. And this part in Coq is small. It's less than
10,000 lines of ML code. The proof construction stack is bigger. It's between 100,000 and
200,000 lines of code. And if you have a bug there, then the proof checker will detect it and
reject your buggy proof.
So here is [inaudible] map of the ecosystem of formal tools. So on this slide you have the power
of the formal tool going from memory safety to functional correctness and to formal math.
On that line I will -- I have the --
>>: Can I ask what is the difference between formal math and functional correctness?
>> Pierre-Yves Strub: For functional correctness I won't be able to speak about mathematics, for
example. For formal math I have set theory, which is more expressive than what I can
express in a functional correctness tool, where I say, okay, my function is taking some
input, I'm returning some output [inaudible]. With that I cannot encode the [inaudible], for
example.
>>: Okay. Okay. So you're --
>> Pierre-Yves Strub: On [inaudible] I will have automation. So starting from fully automated
tools to fully interactive ones. In the middle I have a gray area here where I'm between
interactive and automated proofs.
So, for example, ML checking is fully automated. ML checking [inaudible] but what you obtain
is some memory safety. If you move to the gray area here, so this is a kind of tradeoff, you can
obtain functional correctness in a quite automated way. For example, you have VCC, Boogie, and,
for example, F*, which we work on with Nikhil here. It's a tradeoff between what you can do
automatically and what you can prove in the system.
And then, sure, you have Coq. So in Coq you're totally interactive. You have little
automation in Coq. But you have a very large logic. You can express quite a lot of things. But
now I want to zoom in on Coq. Because Coq is not only one logic. Coq is now 25 years
old. And there has been some work on the logic. So the very first version of Coq was the
calculus of constructions there.
So in the calculus of constructions, for example, it was not possible to have a built-in
definition of natural numbers. So you had to do a lot of encodings. And then, for example,
addition was not a function. You have to prove. So each time you want to prove something like
1 plus 1 equals 2, you have to apply some rules about your encoding.
And then you have this new version of Coq, which is the calculus of inductive constructions, where
you have [inaudible] types. And then here you are able to define quite a lot of functions
[inaudible].
And in that case you get more automation because you have some computation: you have things
that you will be able to do by computation, and you are more expressive. And then you have CACs,
okay, [inaudible] construction, where you can add more computation and you can express more
[inaudible] to encode them.
And so my work here is to keep pushing on that line. You see, the more we progress, the
more we are going to the gray area [inaudible], the more we are expressive. And here the work is
about going to the gray area.
So first I want to say Coq is not a panacea for studying real programs. Okay. So for doing real
programs, you want to go in here. But it is invaluable for, for example, such things as the
logical foundations of your formal tools. And [inaudible], for example, I did for formalizing the
metatheory of F*, for formalizing the metatheory of a previous language, which is F7, for
cryptography.
It can also be used to certify the reference implementation of these tools. So it's what, for
example, I did for the certification of the type checker of F*. And also it is invaluable for
formalizing complex mathematics. So you have the group of [inaudible] proving the [inaudible]
Coq, and, for example, [inaudible] more things from library about proving properties of the
elliptic curve. So after proving some cryptography [inaudible] about elliptic curves. So there is
an [inaudible] in being able to express [inaudible] in Coq.
So my talk will be in --
>>: Can I ask a question [inaudible]?
>> Pierre-Yves Strub: Yeah.
>>: So, for example, when you certified the F* type checker [inaudible] Coq is the
programming language also [inaudible] the type checker inside Coq also?
>> Pierre-Yves Strub: That could be, but here, no, we -- we define a new methodology where
you first do the metatheory, and then your [inaudible] of F* in F*. Since F* is powerful
enough to express the correctness of the type checker -- it's not powerful enough to prove the
correctness of the type checker, but to express it, it is powerful enough -- you can then import
the result of the type checker in Coq and prove that the type checker is [inaudible]. This is
okay [inaudible] application.
But you could also write this program in Coq and extract it. But as I've said, Coq is not
a panacea for doing that, because you do not have effects, you have to write some
[inaudible] and so on.
So [inaudible] introduce the Coq Proof Assistant. Next I will describe my new calculus, which I
call the Calculus of Presburger Constructions, I will define it, and then I will go to how you can
decide proof checking in this calculus. Then I will speak about kernel security [inaudible] of
Coq and I will conclude.
So what is Coq. So Coq is four things. Coq is a programming language dedicated to processing
mathematics. Then Coq is a set of deduction rules characterizing the logic chosen for expressing
the mathematical statements and also their proofs. Coq also comes with proof tactics
allowing you to construct proofs interactively. And then on top of this you have libraries of
proved theorems.
So let me introduce the first two points, which are the programming language and the
deduction rules. So for that we are moving one [inaudible] to Gentzen and to natural deduction.
So in Gentzen's natural deduction the idea is that a proof is a tree. You give up the
axiomatization of deductive reasoning and you replace it by inference rules. And for each logical
connector, you will have rules for introducing it and rules for eliminating it.
So what is a proof in natural deduction. So I said it's a tree, and then you have a goal, what you
are proving, which is Q, and you are proving the goal under hypotheses, these P_i here. And each
hypothesis will then be justified in turn by some proof.
>>: You said [inaudible], right?
>> Pierre-Yves Strub: Yeah. So I will focus on the [inaudible] connector. So the rule for
eliminating the [inaudible] tells you that if you have a proof of A implies B and a proof of A,
then you will have a proof of B.
And what this rule is hiding is a total function transforming a proof of A, a proof of
type A, because I move to types next, into a proof of B. And this is what the Curry-Howard
isomorphism is. You see propositions as types. That is, the proposition will be the type A
[inaudible] B, and then you see proofs as programs or programs as proofs.
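As an editorial sketch of the rule described here, implication elimination can be written with proof terms attached, so that the "hidden function" becomes visible. The names f, a and the context Γ are notational assumptions, not taken from the slides.

```latex
\[
\frac{\Gamma \vdash f : A \to B \qquad \Gamma \vdash a : A}
     {\Gamma \vdash f\,a : B}
\quad (\to\text{-elimination})
\]
```

Read as logic, this is modus ponens; read as typing, it is function application, which is the propositions-as-types reading.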
So how do we do this in Coq. So in Coq you introduce a type Prop for containing other types.
What will be the inhabitants of Prop? They will be proof types, or propositions. So, for instance,
if A is a Prop and B is a Prop, then the arrow type from A to B is a proof type, a proposition,
for the implication A implies B. And the proof type is valid, the proposition is valid, if there
exists a program of that type.
So, for example, the function, so this is Coq syntax here for the function, taking an X of type A
[inaudible] X: this function is taking an A and returning an A. So this is a proof of A implies
A.
Now, if I want to move to the full logic for all the [inaudible] I have to -- I need some more
powerful types. So this is now the elimination for the [inaudible] connector. So telling me that
if I have a proof of for all X, P, then I can have a proof of P in which I substitute X by T, by
the term T. And so what you want to express is that the proof of for all X, P is a function
taking a T and returning a proof of P of T.
And what we have is that the result type depends on the input. And we need what we call
dependent types, that is, types depending on values.
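A sketch of the elimination rule for the universal quantifier, written with proof terms, may help here. The symbols p, t, the context Γ and the substitution notation P[x := t] are notational assumptions.

```latex
\[
\frac{\Gamma \vdash p : \forall x : T.\; P \qquad \Gamma \vdash t : T}
     {\Gamma \vdash p\,t : P[x := t]}
\quad (\forall\text{-elimination})
\]
```

The type of the result mentions the argument t itself, which is why an ordinary arrow type is not enough and a dependent type is needed.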
So Coq is coming with these dependent types. So what we do is that now we have a type for types,
which is Type with a capital T, and then you can define type families. That is for type -- for set
A, you will define a family B of I for I in A. And you will see them as functions from A to B.
So you can with that have polymorphism. For example, you can have list of A, which is
the type of lists of elements of type A. And this is a function from Type to Type. Or you
have vector of size N, which is a list of size N. And then show your [inaudible] going from
natural number to Type. And then you can continue, for example, ordinal N is the type of numbers
lower than N.
And so the central predicate. For example, prime N is a type, a proposition, which says that N is
a prime number. And this one is going from nat to Prop, the type of propositions. And so there
is a new type construction, which is the dependent product, which you can read as: for all X of
type T, I have the proposition B, which is [inaudible] the encoding of the for-all
quantification. This is a generalization of the arrow type from A to B. So, for example, if I
want to say that for all natural numbers N, 4 times N is not prime, I will write for all N of
type nat, not, which is a function from proposition to proposition, prime 4 times N.
Now so I'm coming back to my [inaudible] to Coq. Now I've said that Coq is a logic but
also comes with proof tactics allowing you to construct proofs interactively. That is, I will have
proof tactics, which I will need to build, from a type or proposition, a program of that type.
That is a proof of that proposition.
So Coq [inaudible] very low-level tactics like for implication [inaudible] and the construction
proof of [inaudible] over N. But Coq also offers the user a mechanism for proof search. For
example, you can ask Coq for -- do simple search by arithmetic or real arithmetic, first order
logic, polynomial system [inaudible] and also you can -- you have a general mechanism for
[inaudible] oracles.
The common base of all these tactics that [inaudible] which take a proposition as input and
which construct proof [inaudible]. That is a program of the given type, of the given proposition.
And then these proofs are checked by the Coq system.
So up to now I spoke about deduction. But in proofs there is a very important notion, which
is computation. And this notion is coming from [inaudible]. So at that time deduction was
unknown. The only way of doing proofs was computation. And then we move to
bridge where we move to the other extreme. That is, only using deduction rules, no computation
at all.
[inaudible] go during these [inaudible] of mathematics, computation came back. So it started
with [inaudible] problem, which was: is it possible to state mathematical statements as
computation problems and then to solve them automatically. And today we know that it is not
possible to solve them all automatically by computation [inaudible] assessments. But still we
can solve some of them.
So this is the Poincaré principle. Is 2 plus 2 equals 4 a proof or just a computation?
If you go to Presburger arithmetic, so here you have [inaudible] this logic, then 2 plus 2 equals 4
is a proof, is a full proof. Okay. This is -- shows this big tree, where you have to apply two
times this [inaudible] and Y times this one.
But if you start to [inaudible] then 2 plus 2 will compute to 4. And then the proof of 2 plus 2
equals 4 is no more a proof, it's just the application of [inaudible], all right, because -- yes?
>>: Now, doesn't it [inaudible] depending on what you define as proof, you're doing a
rewriting --
>> Pierre-Yves Strub: No, it's not a rewriting.
>>: [inaudible] rewriting rules?
>> Pierre-Yves Strub: No, it's not a rewriting here. And not -- here I am applying the rewrite
rules because I'm [inaudible] equals by equals. So [inaudible] rewrite rules. But here the
computation is embedded in the logic.
So, yes, you compute using rewrite rules, but this is something which is decidable.
>>: So you add -- so your point is you add rewrites in as a proof rule.
>> Pierre-Yves Strub: As a?
>>: As a proof rule.
>> Pierre-Yves Strub: You can see it like this, yeah. But it's very telling that in a proof you
want to remove all these computational steps and you want the computer to find them. You want
the computation to hide what the reader is doing implicitly when reading a proof.
>>: There's another part of this example doesn't -- the simplicity in this example, which is
representation, so in this example you were [inaudible] numbers as [inaudible].
>> Pierre-Yves Strub: Yeah.
>>: [inaudible] numbers and some computation on kernel numbers might be more efficient than
proof of kernel numbers, but computation on [inaudible] numbers.
>> Pierre-Yves Strub: You can't -- you cannot -- you can do the same with binary numbers. You
can define the computation [inaudible] using binary numbers. And, yes, it would be more
efficient.
>>: But then there's a question of whether the representation is opaque or transparent to the
proof.
>> Pierre-Yves Strub: Well, yes, when you are doing formal proofs, the question of [inaudible]
mathematical objects into some sort of proof, yes. You have to take the representation which
exhibits the right structure for doing the right computation in the list during your proofs. Yes.
So it's not like in math. So we are moving from math where you are considering every object up to
an isomorphism.
Here no. Here the computation is intensional. So you choose a computation and you stick to this
one. And if you want to change your representation, you have to do it explicitly. Yes, I agree.
>>: I'm sorry, I didn't understand what you meant by computation.
>> Pierre-Yves Strub: It means that this is something -- so this is -- there is a function taking
your expression [inaudible] function taking your expression and returning a result. And what
you do, for example, here when you apply the [inaudible] so on the left-hand side you have 2
plus 2 and on the right-hand side you have 4 and you try [inaudible] these two terms are
different. So you cannot apply the [inaudible] lemma because you need X and X on both sides.
Here you will [inaudible] terms equal up to computation. So you will apply the function 2
plus 2 [inaudible] evaluation to 1 to 4, and on both sides we'll obtain 4 and 4. And then you can
apply [inaudible].
>>: But you're [inaudible] computation is independent of all this proof business, but how can
that be because, I mean, that computation somehow knows the semantics of what 2 means and
what 4 means and what plus means. How do they know all the semantics, this is --
>> Pierre-Yves Strub: I took the -- so I -- you take the [inaudible] and you orient. So you do a
compression. So here you orient from left to right. And so the computation is correct with
respect to the initial semantics.
>>: So there's no proof needed; it's correct by [inaudible] you prove that this function
terminates, you're done?
>> Pierre-Yves Strub: Yeah.
>>: That's all you need to prove.
>> Pierre-Yves Strub: Yes.
>>: Prove this [inaudible] oriented rewrites will terminate.
>> Pierre-Yves Strub: So you do some computation and you obtain -- so with the computation
complete you obtain a [inaudible] system and, yes, you apply the [inaudible].
And so it seems -- so on this simple 2 plus 2 equals 4, it seems quite trivial. But it means that
in a proof you can replace any proof construction by a decision procedure deciding your problem.
And so Coq, which is based on the calculus of constructions, integrates computation in -- sorry --
in the [inaudible]. So you have [inaudible] that says that if, in a proof environment, for a
proposition big T, you have a proof small t, and big T is convertible to T prime using some
computation, then small t [inaudible] proof of big T prime.
And so this [inaudible] this tilde here, which converts [inaudible] T prime, expresses the
computational power of the system. And so the more computation you have in that rule, in this
tilde, the more expressive your logic is.
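A sketch of the conversion rule as it is described here, with the tilde written as a relation between types. This is a simplified rendering under assumed notation; side conditions that a full presentation of the rule would carry (such as T' itself being well formed) are omitted.

```latex
\[
\frac{\Gamma \vdash t : T \qquad T \sim T'}
     {\Gamma \vdash t : T'}
\quad (\text{conversion})
\]
```

Everything that follows in the talk is about which equalities are allowed inside the relation written as the tilde.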
In the calculus of constructions, the only thing you have in the conversion rule is [inaudible],
meaning evaluation of functions. So here, for example, plus is [inaudible] and then plus is not
evaluating. So when you have 2 plus 2, 2 plus 2 is already fully evaluated.
So next -- so this is a big step. Coq moves to the calculus of inductive constructions where you
can define [inaudible] some function by induction. So, for example, you can define the natural
numbers as being the smallest inductive type adding 0 [inaudible] function. And then you can
[inaudible] you can define addition by induction on the second argument, for example, and so we
obtain the [inaudible] with the equations that we wanted, from left to right. And we will have
that [inaudible] X plus 2 is convertible to X plus 3.
But what you won't have, for example, is that 0 plus X is convertible to X, because plus has been
defined by induction on the second argument, and you have no rules when, for example, here this X
is 0.
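The asymmetry described here can be sketched in a few lines of Python. This is only an illustration, not Coq's actual reduction machinery: the tuple encoding of terms and the helper names (`num`, `normalize`, `convertible`) are assumptions made for the example. Addition is oriented from left to right and only computes on its second argument.

```python
# Peano terms as nested tuples: ("Z",), ("S", t), ("+", a, b), ("var", name).
Z = ("Z",)

def S(t):
    return ("S", t)

def plus(a, b):
    return ("+", a, b)

def var(name):
    return ("var", name)

def num(n):
    """Build the Peano numeral S(S(...Z)) for a Python int n >= 0."""
    t = Z
    for _ in range(n):
        t = S(t)
    return t

def normalize(t):
    """Apply the two oriented equations, left to right, until none applies:
         x + 0    -> x
         x + S(y) -> S(x + y)
       Addition only computes on its SECOND argument."""
    tag = t[0]
    if tag == "S":
        return S(normalize(t[1]))
    if tag == "+":
        a, b = normalize(t[1]), normalize(t[2])
        if b == Z:
            return a
        if b[0] == "S":
            return S(normalize(plus(a, b[1])))
        return plus(a, b)  # stuck: the second argument is a variable
    return t  # Z or a variable

def convertible(t1, t2):
    """Two terms are convertible when their normal forms are identical."""
    return normalize(t1) == normalize(t2)

x = var("x")
print(convertible(plus(num(2), num(2)), num(4)))        # 2 + 2 against 4
print(convertible(S(plus(x, num(2))), plus(x, num(3)))) # S(x + 2) against x + 3
print(convertible(plus(Z, x), x))                       # 0 + x against x
```

The first two checks print True because both sides compute to the same normal form. The last prints False: no rule fires when the second argument is a variable, so 0 + x stays stuck even though it equals x for every value of x.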
And so my work here, so you have some little steps, you see little steps next [inaudible] this
computational power, my work here will be to give a big -- to have a big step having much more
computational power in [inaudible] do a small demo.
So this is a Coq [inaudible]. So I did some [inaudible] some modules, and I will start with doing
some propositional logic. So, for example, I want to prove that for any propositions A, B, C, if
A implies B implies C, then (A implies B) implies A implies C.
So I got my proof, okay, and I have a proof [inaudible] with the proposition and what I have to
prove. So first I'll start to introduce my hypothesis, and then this is what I have to prove. So I'm
starting from the conclusion going to the hypothesis.
So, for example, if I apply this, this is a function giving me a C from a proof of B and a proof of
A. So I do some application and I have two subgoals, one for A and one for B.
And in order to prove it, this is a [inaudible] function so it's exact. Now I'm moving to B. B, I
know how to prove B from A, so I'm applying this lemma, and now I have to prove A and I'm
done.
And so these tactics here, they build a proof term for that proposition. So when I do the
[inaudible] what's going on is that Coq is reading the proof term and sending the proof term to
the Coq checker. And I can put it here. This is a proof term. So this is a [inaudible] proposition,
the proof, and [inaudible] some application.
So this is the same proof [inaudible] but now I'm using some tactic [inaudible] and this is exact.
And I can [inaudible] the proof term here and I have a proof term which is equivalent to the
previous one.
And so, like I said, for example, there is some automation for arithmetic. So I want to prove that
for all [inaudible] then a distant [inaudible] so I could do it by hand, but I can use [inaudible] the
same. This is done automatically.
And okay. So this is [inaudible] computation. Okay. Here you see there is quite a big proof
term. And now, okay, let's go to this 2 plus 2 equals 4.
So I can do the proof like in Presburger arithmetic, meaning doing rewriting [inaudible] by
equal. So from 2 plus 2, I move to successor of 2 plus 1, then I move to successor of
successor of 2 plus 0, then I [inaudible] with 2, and then now I have 4 equals 4 [inaudible] done.
And so here we have -- okay. We have really the proof in Presburger arithmetic with all
the rewrites.
>>: So all the proofs you just mentioned and 2 plus 2 does some proofs, some lemmas in 2 plus
2 equals 4, they are all justified by induction.
>> Pierre-Yves Strub: No, no, no, not by induction here, because induction is closed, so I can
apply -- I can apply the lemma finitely to prove it. If I want to prove that for all XY, successor
of X plus Y is equal to successor of X plus Y, yes, then I need to do induction. But for 2 plus 2
equals 4, the proof is not using induction.
>>: So it's using this [inaudible] computation?
>> Pierre-Yves Strub: Yes. There is a proof of 2 plus 2 equals 4 is this one. And moving from
2 plus 2 to successor of 2 plus 1 to successor of 2 plus 0 and then to 2 plus.
>>: So that thing for all X plus SY is equal to S of X plus Y?
>> Pierre-Yves Strub: Yes --
>>: Where did you get that [inaudible] from?
>> Pierre-Yves Strub: So this is proven by induction, yes. This one is proven by proof by
induction, yes.
>>: But that is sort of like [inaudible].
>> Pierre-Yves Strub: Yeah, yeah, yes. But the proof is very simple. So the proof is not -- so
it's not built into the system. That is, Coq is [inaudible] these libraries, and the libraries
have some natural numbers and you have all the basics about natural numbers, so [inaudible] and
the like. Yes.
And so [inaudible] calculus inductive construction, for example, this one is not done by
induction, this one is simply one step of computation. So one done by induction is the one where
the successor is here.
So as for all the theorems I can use some automation, for example, the [inaudible] tactic, and
then I continue to have some big proof here, okay, for 2 plus 2 equals 4. But I can use some
computation. And here I'm using [inaudible]. This is just X equal to X.
And if I look at the proof term now, the proof term is very small. There is nothing that says that
this is a proof of 4 equals 4. And then Coq is computing 2 plus 2 to 4 and [inaudible] that proof.
So, for example, if I want to prove 2 plus 3 equals 4 by [inaudible], then Coq is complaining
that it cannot unify 4 with 2 plus 3 up to computation.
And now entering the real step of that work. So [inaudible] which is [inaudible] I'm using is
coming from -- with some libraries about trees. And so I'm adding some more natural numbers.
And here I can define two matrices.
So the first one is a matrix over the ring R with N rows and M columns. And the second matrix
[inaudible] R with M rows and P columns. And so you see, matrix is a dependent type. The matrix
type is depending on the [inaudible] and on the number of rows and the number of columns.
And here the star M is matrix multiplication. And what it will say is that if I do M1 times M2,
what I obtain is a matrix with N rows and P columns. If, for example, I want to multiply M1
with M1, Coq will complain because M1 is invalid for multiplication with
M1 on the left.
So here dependent types are restricting me -- are forbidding me from multiplying matrices of the
wrong size. And so I can see this in the type of multiplication, which [inaudible] I take a
matrix of size N times M and then [inaudible] and what I return is a matrix of size N times P.
And let's take now this very simple function, so the row matrix. So the row matrix is taking
two matrices of size M times N1 and M times N2 and returning the matrix where the two matrices
are glued together, so of size M with N1 plus N2 columns. And now I define three new matrices,
B1, B2, B3, with [inaudible] columns. And I can check, for example, the type of [inaudible] of
B1 B2, and this is a matrix N [inaudible] two columns.
And what is interesting is that if I compute this matrix, the number of columns is (r1 plus r2)
plus r3. But if I do some [inaudible], meaning gluing B1 with B2 and B3, then now the size is
r1 plus (r2 plus r3).
And so the number of columns is equal. I can prove that this number is equal to that number. But
now what I want to say is that B1 and B2 glued together [inaudible] with B3 is equal to the
matrix B1 glued with B2 B3. So I want to prove this. And now Coq is complaining. I'm not able to
state that lemma. Because it's telling me on the left you have a matrix of that size and on the
right you have a matrix of that size and the types are not equal, because here you must have the
same type on the left-hand side and the right-hand side, up to computation.
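The mismatch can be illustrated with a small Python sketch. It only models the column counts as symbolic sums; `row_mx`, `linear_form` and the size names r1, r2, r3 are hypothetical names chosen for the example, not the demo's actual identifiers.

```python
from collections import Counter

def row_mx(cols_left, cols_right):
    """Column count of two matrices glued side by side: a symbolic sum."""
    return ("+", cols_left, cols_right)

def linear_form(size):
    """Flatten a symbolic size into a multiset of its variables,
    forgetting how the additions were parenthesized or ordered."""
    if isinstance(size, str):
        return Counter([size])
    _, left, right = size
    return linear_form(left) + linear_form(right)

left_assoc = row_mx(row_mx("r1", "r2"), "r3")   # (r1 + r2) + r3
right_assoc = row_mx("r1", row_mx("r2", "r3"))  # r1 + (r2 + r3)

# Comparison of the size expressions as written.
print(left_assoc == right_assoc)
# Comparison modulo the arithmetic of addition.
print(linear_form(left_assoc) == linear_form(right_assoc))
```

The first comparison prints False and the second prints True. With only variables in the sizes there is nothing left to compute, so a checker that compares types up to computation sees two different types, although the sizes are equal as arithmetic expressions.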
And here the two types are open. They are fully computed. So they are not convertible. And
Coq is rejecting this. So if I want to express a simple lemma, because this is a simple lemma,
if I want to express a simple lemma in Coq, then -- oops, sorry -- then I must start to use some
casts. So here the cast is taking two proofs of equalities: it's taking a proof of N equals N, and
the proof of [inaudible] of addition for r1, r2, and r3. And then you say that the
first matrix is equal to the second one, but I'm doing some cast. And if I look at the type of
cast, it's really a function taking a pair of proofs of [inaudible] between M and M prime and N
and N prime and casting the matrix of size M, N to a matrix of size M prime, N prime.
And I won't go into the details of how you can prove that lemma [inaudible] is quite hard
because you start to manipulate proofs and you have to start speaking about equality between
proofs.
And Coq Modulo Theory is here for removing the need of having casts for expressing such
statements.
So this work is -- this problem is well known, and some solutions have been proposed. So one
solution is extensional [inaudible] construction. So here [inaudible] so you have the function
evaluation with convertibility and you say that any terms which are provably equal are
convertible. So this solves the problem. You can prove that [inaudible] of addition so any term
[inaudible] will be convertible.
The problem is that it's too powerful. For example, 3 will be convertible to the smallest
integers, smallest natural numbers [inaudible]. You will not be able to encode basically the
[inaudible] in Coq, so in that theory. So it's too powerful. You are losing proof checking. Coq is
here for checking your proofs. So if you are not able to check your proofs anymore, then you are
losing the point of Coq.
>>: What is the complexity of proof checking [inaudible] in Coq?
>> Pierre-Yves Strub: In what term?
>>: Meaning like polynomial exponential.
>> Pierre-Yves Strub: Yes, like of the size of --
>>: Size of the proof.
>> Pierre-Yves Strub: Exponential [inaudible].
>>: It's exponential?
>> Pierre-Yves Strub: Yes. Because ML type checking is exponential and Coq type
checking is [inaudible].
>>: And what's the upper bound?
>> Pierre-Yves Strub: That's -- I don't know.
>>: [inaudible].
>> Pierre-Yves Strub: [inaudible].
>>: [inaudible] because you have -- you're inferring types, right, with [inaudible]. But in the
kernel you're not doing inference, right? You're fully [inaudible].
>> Pierre-Yves Strub: Yeah, but it's similar in the type. It's ensuring the term. But in practice,
even for full terms where you are operating the constants, the power is very high. So sometimes
you have to change [inaudible] the way you prove things to have proof checking work in
practice.
So Coq Modulo Theory, which is the work I'm introducing now, is an extension
of the Coq formal language. It allows the introduction of decision procedures in the
conversion rule. It is implemented and will be in the mainline of the main version of Coq. And,
as I will describe, you can still trust this Coq version, although you are introducing decision
procedures [inaudible].
So for the talk I will present a restriction of Coq Modulo Theory, which I call the calculus of
Presburger constructions, where instead of introducing any decision procedure we'll only be able
to introduce a decision procedure for Presburger arithmetic.
So the calculus of Presburger constructions will be a consistent extension of the core logic of
Coq. So we'll have function evaluation, the convertibility. You will have the recursor for
natural numbers, so being able to define inductive types and functions over natural numbers.
You will have a decision procedure for Presburger arithmetic in the conversion. The decision
procedure will be able to use the equations which are available from the
proof environment. And the type checking, the proof checking, is decidable.
As I said, here I want to add some power to the conversion rule. So the only thing I will
modify in the formal language is the conversion. So that is that [inaudible]. And that [inaudible]
so we include the type convertibility, function evaluation, recursor for natural numbers, so for
fixpoint and match. So this is not changing from standard Coq. And then I will introduce the
validity entailment for Presburger arithmetic.
And what we want at least is that the conversion must consider equal any pair of
Presburger-equal algebraic terms. What I call an algebraic term is any Coq term built
from 0, successor, plus, and variables, with the right arities.
And, for instance, if here I take the dependent type of list of size depending on the size on the
natural number, so this solves our motivation examples, so telling that this type now is
convertible to that type because that subterm is algebraic, it is built from plus and variables. This
one is algebraic. And N plus M and M plus N are equal in the Presburger arithmetic.
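A minimal sketch of the check described above, assuming a toy representation that is not from the talk: algebraic terms are nested Python tuples tagged `"zero"`, `"succ"`, `"plus"` and `"var"`, and the hypothetical helper `normalize` flattens a term into a constant plus a multiset of variables. Two such terms are equal for all natural numbers exactly when their normal forms coincide, which is enough to see N plus M and M plus N as equal. It only illustrates the equational fragment, not full Presburger arithmetic.

```python
from collections import Counter

def normalize(term):
    """Flatten an algebraic term into (constant, multiset of variables)."""
    tag = term[0]
    if tag == "zero":
        return 0, Counter()
    if tag == "var":
        return 0, Counter({term[1]: 1})
    if tag == "succ":
        const, variables = normalize(term[1])
        return const + 1, variables
    if tag == "plus":
        c1, v1 = normalize(term[1])
        c2, v2 = normalize(term[2])
        return c1 + c2, v1 + v2
    raise ValueError("not an algebraic term: %r" % (term,))

def algebraic_equal(s, t):
    """True when s and t denote the same number for every valuation."""
    return normalize(s) == normalize(t)

# The motivating example: the indices of `list (n + m)` and `list (m + n)`.
n, m = ("var", "n"), ("var", "m")
n_plus_m = ("plus", n, m)
m_plus_n = ("plus", m, n)
```

Here `algebraic_equal(n_plus_m, m_plus_n)` holds, while a purely syntactic comparison of the two tuples does not.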
And this could be the definition of Coq with Presburger arithmetic. But this does not
define a sound logic. Why? Because let's take that function. So I take a function, taking N and
M, two natural numbers, and returning some list. And so the type is for all N, M of type nat, I
return a list of size N plus M, because the body is built like this.
Using my new conversion rule, I know that this function is also of type for all N, M of type nat,
returning a list of size M plus N.
But now I'll apply that function to these two natural numbers [inaudible] apply to N1 and N2.
And this one is not algebraic. And so the type of this function now is list of size identity
[inaudible] N1 plus N2. And I want it to be convertible to the list of size N2 plus identity of N1.
But since ID of N1 is not algebraic, then nothing ensures that I can [inaudible] because now the
Presburger arithmetic is not able to speak about that subterm.
So what I do is what is done, for example, when you are mixing decision procedures for the
[inaudible]: you will use variables to abstract aliens, that is, subterms
which are not algebraic, to communicate with the decision procedure.
So if I take -- so the term list applied to plus, which is applied to two modulo terms, I will call
them aliens. They are the terms which are not known by the Presburger
decision procedure.
And what I will do is that I will abstract aliens by variables. For example, F apply to nat from
N2 will be abstracted by the variable C1. And this again will be abstracted by the variable C2.
And this arrow with an A here is the algebraisation process [inaudible] variable. So, for
example, here what I obtain is list applied to C1 plus C2. Here I do the same, but now the aliens
are swapped. So the algebraisation process will give me the type list applied to C2 plus C1. And
now these are convertible.
So I'm solving the previous problem, but still that's not enough. Because I will want this
conversion relation to be transitive for using the metatheory study and for proving that this is
[inaudible].
So the question is which variables we want to choose when abstracting aliens. So that comes
from the list. Now a list is applied to plus. And the first [inaudible] is F applied to N1 plus 0. And
the second [inaudible] is G applied to N2. So since I know that N1 plus 0 is convertible to N1,
this is convertible to that term. That is list applied to FN1 plus GN2.
And now doing some abstraction, okay, I know that C1 plus C2 is convertible to C2 plus C1, so
[inaudible] this one with C2, this one with C1. I obtain this new type which is convertible to that
one, which is G of N2 plus F of N1.
And then here I use the conversion two times. If, doing the algebraisation process, I choose my
variables such that you assign the same variable to two aliens which are convertible,
then what you will do is that to C1, C1 [inaudible] F applied to N1 and the alien F applied to N1
plus 0. That is reversing this one and that one. To C2 you assign G of N2.
And then again what you have is that the algebraisation of the first one is list of C1
plus C2. The algebraisation of this last one is list of C2 plus C1, and then they are convertible in
one step.
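The abstraction of aliens by variables can be sketched as follows. This is an illustration under stated assumptions, not the CoqMT implementation: terms are nested tuples, anything not tagged `"zero"`, `"succ"`, `"plus"` or `"var"` is treated as an alien, and the hypothetical function `simplify` (rewriting `x + 0` to `x`) stands in for Coq conversion when deciding whether two aliens should share a variable. The names `c1`, `c2` are assumed not to clash with existing variables.

```python
from collections import Counter

ALGEBRAIC = {"zero", "succ", "plus", "var"}

def simplify(term):
    """Toy stand-in for conversion: rewrite x + 0 to x everywhere."""
    if term[0] in ("zero", "var"):
        return term
    parts = tuple(simplify(p) if isinstance(p, tuple) else p for p in term[1:])
    term = (term[0],) + parts
    if term[0] == "plus" and term[2] == ("zero",):
        return term[1]
    return term

def algebraise(term, classes):
    """Replace each maximal alien by a variable; convertible aliens share one."""
    if term[0] in ("zero", "var"):
        return term
    if term[0] in ALGEBRAIC:  # succ or plus: keep the algebraic cap
        return (term[0],) + tuple(algebraise(p, classes) for p in term[1:])
    key = simplify(term)  # representative of the alien's class
    if key not in classes:
        classes[key] = "c%d" % (len(classes) + 1)
    return ("var", classes[key])

def normalize(term):
    """Flatten an algebraic term into (constant, multiset of variables)."""
    tag = term[0]
    if tag == "zero":
        return 0, Counter()
    if tag == "var":
        return 0, Counter({term[1]: 1})
    if tag == "succ":
        const, variables = normalize(term[1])
        return const + 1, variables
    c1, v1 = normalize(term[1])
    c2, v2 = normalize(term[2])
    return c1 + c2, v1 + v2

n1, n2 = ("var", "n1"), ("var", "n2")
lhs = ("plus", ("app", "f", ("plus", n1, ("zero",))), ("app", "g", n2))  # f (n1 + 0) + g n2
rhs = ("plus", ("app", "g", n2), ("app", "f", n1))                       # g n2 + f n1

classes = {}  # shared, so both sides use the same variable per class
lhs_alg = algebraise(lhs, classes)
rhs_alg = algebraise(rhs, classes)
same_index = normalize(lhs_alg) == normalize(rhs_alg)
```

Because `f (n1 + 0)` and `f n1` fall in the same class, both sides use `c1` for them and the comparison succeeds in one step, which is the point made above about keeping the relation transitive.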
Now what I want also is to be able to use equations from the proof environment, meaning that if
I know that X is equal to 2 and Z plus N is equal to 5, I want that to be convertible to this.
So what you do is that before doing the conversion you are looking at all the equations, you
extract them and you send them along with the goal you want to solve.
Of course, for the exact same reasons, you want the equations to be subject to alien
abstraction. Meaning that if I replace Z but I [inaudible] apply to Z, and if I replace N plus Z by
N plus [inaudible] Z plus 0 plus N, then I want to abstract this alien and that alien with the same
variable.
So for a proof context gamma, what I will write EQ of gamma will be the set of all the equations I
can find in gamma [inaudible] equations which are not fully algebraic. And now my conversion
relation, which was written with a simple tilde, will be written tilde gamma, meaning the
conversion under the equations I can find in gamma.
One thing to note is that equations will be introduced by the for-all quantifier. So meaning that,
for example, if I start with the proposition, the function which takes a proof P of the proposition
X is equal to Y and returning a list of size X, then this must be convertible to that type which is
the type -- yes?
>>: I'm sorry. Go ahead.
>> Pierre-Yves Strub: Okay. To the type taking the same proof of the property X equal Y but
returning a list of size Y. And after the for-all quantification, you
want to be able to use the equation which is bound by the for-all
quantification. That is when you convert this list of X to this list of Y, you can use that equation,
and then you have conversion.
>>: So [inaudible] architecture communicating theories, you require that each satellite theory be
able to derive all the implied equalities in order to communicate and you also have this variable
introduction.
>> Pierre-Yves Strub: Yeah.
>>: So here you're not placing such a -- you just have this theory and you -- which -- so I'm
trying to understand the relationship. It seems similar in some senses, but you're also -- you're
trying to figure out the equalities implied automatically here or --
>> Pierre-Yves Strub: So the [inaudible] is to be able to implement it here and here and
speaking for a moment about [inaudible] which is complete by a sense. So it's why you can
say -- it's why I can -- I can say that I can abstract -- I can decide if I should [inaudible] by same
variables. [inaudible] giving me this [inaudible]. When I want to implement this, then I need the
same restrictions, yes.
>>: All right.
>> Pierre-Yves Strub: So this slide will highlight your point, meaning this is not an algorithm
for the moment. So here we are also going to give an abstract definition of this relation to
gamma. So first understanding that I want the beta for a function evaluation and the [inaudible]
for the fixpoint evaluation.
Then here this is one integrating the [inaudible]. And so the point here is that [inaudible] when
I'm doing the representation of equations, I do not need to know S because this is -- the
normalization is a global process that is taking all the terms of the calculus of construction,
splitting this set into equivalence classes for that relation and assigning a different variable for
each equivalence class [inaudible].
So it's telling me that if I take all the equation of gamma and I do this general algebraisation and
using Presburger arithmetic I can [inaudible] the algebraisation of S is equal to the algebraisation
of T, okay, then I can deduce that [inaudible] is equivalent to T.
And then the last rule is speaking about these contextual rules, meaning that if I know that this is
equivalent to that and this is equivalent to that and there's an extra equation T1, then the full terms
here are equivalent, and I have a set of contextual rules like this for the remaining [inaudible].
So what you have is a calculus that this is a consistent calculus. You have strong normalization
of beta, strong normalization of [inaudible]. You have all the properties you expect about that
calculus. And so now you can prove that you can [inaudible] algorithm.
So what I will show will be a very naive algorithm just to show that you can [inaudible]
gamma. So the first point is that extracting equations is very easy. You just search for any term of
that form in the proof environment and you extract them. So now the next point is how to
express [inaudible] gamma.
So I will show this with an example. So I have my typing environment. And I have two
equations. I have D equals E for natural numbers or variables and I have that T, which is a context
in which C appears, is equal to the same context in which E appears. And I want to prove that
that term is equivalent to that term shown returning a list applied to N.
So since now I know that [inaudible] is normalizing, I am first doing head normalization of my
types. So this one is already head normalized. And this one is a lambda applied to an argument,
so this reduces to list applied to TD plus N.
And then I have a head here which is nonalgebraic, and I do some syntactic equality between the
two heads. If these are not syntactically equal, then I reject the conversion check.
And so since they are syntactically equal, they have subterms which are paired, which are
algebraic, which have an algebraic cap at least. So on the left-hand side I have N plus T of C; on
the right-hand side I have T of D plus N.
And so now I will do -- I will start a conversion check between the paired algebraic subterms.
So this is here I have a new one, third check, which is this one. And my equation are this one.
And so I will now abstract my [inaudible] by doing a pairwise comparison of all [inaudible]
between them. Okay. I will [inaudible] check this one with this one, this one with this one, this
one with this one.
And what I obtain here is that T of D is convertible to T of E because I know that D equals E as
an equation. So we abstract TD and TE with the same variable.
TC is not convertible to TD or T or any of the [inaudible]. So for TC, I will use a different
variable, B. So now what I have to check is that N plus A is equivalent to B plus N, knowing
that D equals E [inaudible] algebraisation that A equals B. And of course this is [inaudible] by
the Presburger arithmetic. So I deduce that my initial terms are convertible.
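The worked example can be replayed with a small sketch. It is deliberately weaker than the naive algorithm of the talk and rests on assumptions: terms are tuples tagged `"var"`, `"plus"` and `"app"` (an alien such as T applied to an argument), only equations whose two sides each abstract to a single variable are used, and equations are processed once in the given order with no fixpoint. The class `UnionFind` and the functions `canon`, `linear` and `convertible` are hypothetical names chosen for the illustration.

```python
from collections import Counter

class UnionFind:
    """Tiny union-find over variable names (strings)."""
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

def canon(term, uf):
    """Rename variables to class representatives and abstract aliens."""
    tag = term[0]
    if tag == "var":
        return ("var", uf.find(term[1]))
    if tag == "plus":
        return ("plus", canon(term[1], uf), canon(term[2], uf))
    if tag == "app":
        # The alien's name is built from its canonical argument, so T(D)
        # and T(E) get the same variable once D and E have been merged.
        key = "%s(%s)" % (term[1], canon(term[2], uf))
        return ("var", uf.find(key))
    raise ValueError("unsupported term: %r" % (term,))

def linear(term):
    """Multiset of variables of a sum of variables."""
    if term[0] == "var":
        return Counter({term[1]: 1})
    return linear(term[1]) + linear(term[2])

def convertible(equations, lhs, rhs):
    uf = UnionFind()
    for a, b in equations:
        ca, cb = canon(a, uf), canon(b, uf)
        if ca[0] == "var" and cb[0] == "var":
            uf.union(ca[1], cb[1])
    return linear(canon(lhs, uf)) == linear(canon(rhs, uf))

C, D, E, N = (("var", x) for x in "CDEN")
T = lambda arg: ("app", "T", arg)
equations = [(D, E), (T(C), T(E))]
goal_lhs = ("plus", N, T(C))   # index of the left-hand type
goal_rhs = ("plus", T(D), N)   # index of the right-hand type
```

With both equations, `convertible(equations, goal_lhs, goal_rhs)` succeeds: D equals E puts T(D) and T(E) in one class, and the second equation merges that class with T(C). With no equations, or with only D equals E, the check fails.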
So now why should you trust my implementation of this kernel? First, the real implementation is
not using this naive algorithm doing pairwise comparison [inaudible]. We're doing some
construct-based algorithm. And so I said that you trust Coq because the kernel is 5,000 lines of
code, and here I allow you to bring some decision procedures into this kernel which can be big,
which can have aggressive optimizations. And so you break the trust. Unless the decision
procedure is generating some certificates.
And so what you can do is to define two-level kernels. So [inaudible] CC of X is a type checker,
is a kernel for the calculus of construction with some decision procedure X in each conversion.
So this is a kernel, these two boxes. The kernel has two levels. You have CC of [inaudible], which
is a calculus of construction [inaudible] meaning the old kernel, the one you trust. And CC of T
is a Coq Modulo Theory with the [inaudible] theory T. This is the one you do not trust.
Okay. You want to check that P is a proof of that, of this list of size of N [inaudible] prime and
this is a size N prime plus N. And so you send that to the kernel.
At some point the kernel will have to do some conversion check, and this conversion check will
do a call to the decision procedure asking if N plus N prime is equal to N prime plus N.
And this decision procedure [inaudible] is this third step, yes, it is. But I give you a certificate,
and the certificate is a Coq proof T of N plus N prime equals N prime plus N. But this one, this
statement is expressible in the old kernel. This one is not. Okay. This is the same problem for
matrices, but this one is -- equates here on natural numbers and is expressible in the old kernel.
So I can send this proof T certificate to the old kernel one [inaudible]. And if my decision
procedure has no bugs, then the Coq kernel will assure me, yes, this is a [inaudible]
certificate. And then what I did here is that in this two-level kernel [inaudible] of that part to the
[inaudible] of this one, the old one. So I still trust my kernel as long as the decision procedure is
answering with certificates.
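The trust argument can be sketched with a toy model, under loud assumptions: sums are plain Python lists of atom names, the certificate is a permutation rather than a Coq proof term, and the functions `untrusted_solver`, `check_certificate` and `kernel_convertible` are hypothetical names. The point it illustrates is only the shape of the design: the solver may be arbitrarily large or buggy, and the kernel accepts its answer only when a small checker validates the certificate.

```python
def untrusted_solver(lhs, rhs):
    """Possibly big and optimised; returns a certificate or None.

    The certificate is a list perm with rhs[i] == lhs[perm[i]].
    """
    if sorted(lhs) != sorted(rhs):
        return None
    used, perm = set(), []
    for atom in rhs:
        i = next(k for k, a in enumerate(lhs) if a == atom and k not in used)
        used.add(i)
        perm.append(i)
    return perm

def check_certificate(lhs, rhs, perm):
    """Small trusted checker: perm must be a permutation mapping lhs onto rhs."""
    return (
        len(perm) == len(lhs) == len(rhs)
        and sorted(perm) == list(range(len(lhs)))
        and all(rhs[i] == lhs[p] for i, p in enumerate(perm))
    )

def kernel_convertible(lhs, rhs, solver=untrusted_solver):
    """Two-level check: old syntactic test first, then certificate-checked solver."""
    if lhs == rhs:
        return True
    certificate = solver(lhs, rhs)
    return certificate is not None and check_certificate(lhs, rhs, certificate)

def buggy_solver(lhs, rhs):
    """Always claims success with the identity permutation."""
    return list(range(len(lhs)))
```

For instance, `kernel_convertible(["n", "n'"], ["n'", "n"])` is accepted, while `kernel_convertible(["n", "m"], ["m", "m"], solver=buggy_solver)` is rejected because the checker refuses the bogus certificate.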
>>: So do you really introduce everything to CC of 0 because the checking of the statement in
one still involves the rules of CC of T, right? It's only the proof of the equality that gets reduced
in CC of 0.
>> Pierre-Yves Strub: So in CC [inaudible] CCM0 and CCMT, the only thing changing is a
convertibility check. All the remaining is a [inaudible] meaning that what you have is that in CC
of 0 you have -- you have -- if I have a theory in the CC of X, I mean, you have -- if the only
[inaudible] that if the theory is empty, then I'm doing beta convertibility, otherwise I'm using this
extra decision procedure.
And so you can [inaudible] this. You can trust this. And then the part I am using, the external
decision procedures, this part you do not trust it because you reduce this one to the old code.
>>: Okay. [inaudible] I don't ever trust T [inaudible] but I have to trust CC of T plus CC of 0.
>> Pierre-Yves Strub: No, you have to trust CC of empty plus the function doing the branching
between CC of empty and CC of T.
>>: The function is part of CC of T.
>> Pierre-Yves Strub: The function is part of CC of X. Yes.
>>: Okay. CC of X. All right. Yes.
>> Pierre-Yves Strub: And still -- so I don't have time to [inaudible] you can remove -- you can
remove CC of T if you modify CC of empty with an extra constructor with implicit casts. Which
is a calculus between CC and CoqMT. And this CC of empty with implicit casts is very small. So
the idea here is that you want to remove this one.
So let me conclude. So today I presented Coq Modulo Theory, which solves a longstanding
theoretical problem, that is using an extensional equality decided by a decision procedure in type
theory. In this extension the kernel becomes incremental. New theories are added, and they
come with certificates and the certificate checker can be added safely by the user, then reducing
the [inaudible] old one.
This improves a lot over prior reflection-based solutions because you have more typeable terms.
That is, it's not only that CoqMT is adding more automation, it is that you can express more
statements without relying on coding, like the casts.
It allows an easy use of dependent types. So I show the list of -- about matrices now for the
[inaudible] problem you can state the statement as you want it to be stated.
And also CoqMT is implemented and freely available at that address. CoqMT will be integrated
into the next major release of Coq.
And CoqMT has been used already in several developments. For example, I have one example
of formalization of combinators for logical gates where gates is depending on the number of
inputs and then both outputs. It has been used for encoding of mathematical construction, like
taking the [inaudible] union on [inaudible] family of sets, for example.
And CoqMT metatheory has been studied carefully. Currently we have a syntactical model of
the core logic of CoqMT which has been mechanized, and we are also mechanizing a set
theoretical model in Coq [inaudible].
And you can find more information on CoqMT in those conferences which describe the system
implementation and the metatheory.
And for concluding I would like to give a big picture of this work [inaudible] or work I'm
involved in. So here I have my work about Coq Modulo Theory. And as I said, you saw in the
kernel we want to check that certificate for ensuring the security. And before that we need first
evaluation of function in Coq.
So this is a second part that you may not trust in the kernel which is a part which is compiling
Coq terms to a binary function that we evaluate. And we are currently working on certifying the
kernel compiler of Coq.
And so this work is related to the F* language. As I said, I formalize and mechanize the
metatheory of F* in Coq and also I certify this Coq type checker. But I started using, for
example, the Z3 [inaudible] solver. And so I work on some program on validated [inaudible]
certificate generated by SMT solver.
Also I start introduce for doing some protocol formalization, and we are working as a use case
the TLS formalization from high-level APIs to cryptographic primitives. And for that there are
two related works, which are a formalization of the math behind cryptographic primitives, for
example, elliptic curves, and also some extension of F* to some relational type theory that then
will be integrated into Coq.
And why doing all of that, that is why now formalizing all these tools and going to big use cases,
is that I think that we are moving from formal tools which were not usable even by experts to
formal tools which now are usable by experts for formalizing big programs. And I think now we
need big use cases because now we need to find how we can move to formal tools usable by
engineers, for example.
Thank you.
[applause]
>> Pierre-Yves Strub: Yeah.
>>: So [inaudible] your decision procedure says that, yes, I have proved this, but the whole
enterprise fails in the proof checking part.
>> Pierre-Yves Strub: Yeah. So if there is a bug, the certificates must be wrong.
>>: So but, I mean, do you think that can happen only when you have bugs in your software, or
is there a theoretical reason for it, that you can't guarantee that whatever work the decision
procedure has done can always be expressed in the -- you know, in whatever this [inaudible] --
>> Pierre-Yves Strub: [inaudible] the logical expressiveness of Coq is more powerful than what
you can express in decision procedures. Basically in Coq you are more powerful than set theory.
So it will be very unlikely that the decision procedure [inaudible] which is more powerful than
Coq.
So there is no reason that you cannot encode your certificate. It doesn't mean that it will be
simple. And perhaps this procedure of rebuilding a certificate for the result of the decision
procedure will be a lengthy procedure. Yes. But you should be able to do it.
>>: Perhaps you can prove it too, right? [inaudible] be able to do it.
>> Pierre-Yves Strub: Yes.
>>: [inaudible] decision procedure.
>> Pierre-Yves Strub: Yes, you can. The problem is that you don't want to lose some time on
that. Meaning each time you add a new optimization in your decision procedure, you will
have to prove again that your decision procedure is correct.
And, for example, for [inaudible] you have to add optimizations every year, because, if not, you
are losing the next SAT competition.
So you do not want to have to, because proving that all optimizations are correct will be -- I think
will be complicated, and it's why all the community is moving from certified decision procedures
to decision procedures generating certificates.
But, for example, for SAT in Coq you have [inaudible] which is a decision procedure for SAT within
Coq, proved in Coq, and then you can use it instead of Z3, but it will be slower in the end.
Yeah.
>>: So the logic for equivalence is classical logic where Coq is [inaudible].
>> Pierre-Yves Strub: Yes.
>>: So there's some --
>> Pierre-Yves Strub: Okay. If you want to maintain writes on decision procedure which is
classical, you need [inaudible] in Coq, yes. It's a choice.
>>: But you do that, right?
>> Pierre-Yves Strub: No. For example, for [inaudible] I am not using [inaudible].
>>: Oh, so you only have constructive.
>> Pierre-Yves Strub: Yeah.
>>: So I think on the slide you also mentioned that you have inductive definitions as part of
what you do for equivalence checking.
>> Pierre-Yves Strub: Yes.
>>: So how's the integration of the decision procedure [inaudible].
>> Pierre-Yves Strub: Badly. So I haven't described this today, but you have to add some new
restrictions to what I described when you start to have inductive definitions, for two reasons. One,
for example, if you start using equations, for example, you know that X is equal to 0 and you have
a match on X, then you must be able to replace X by 0 [inaudible]. Meaning that the
[inaudible] must be done modulo the theory in that case.
And next you have also in Coq the ability to construct proofs [inaudible] type by induction
[inaudible] function which takes a natural number and returns nat when the natural number is
equal to 0 or nat if not. If I allow extraction of an inconsistent set of equations, for example, I
have the hypothesis 0 is equal to 1, then I will have that nat is convertible to this function in place
of 0 which is convertible to this function equals to 1 which computes to nat or is not.
And then now I am able to encode the [inaudible] in Coq with that. So meaning that if you have
[inaudible] you must restrict the extraction of equations to only consistent sets of equations.
Yes.
>>: I was actually wondering if there were some decision problems in the space of [inaudible].
>> Pierre-Yves Strub: No, no. So theoretically it's decidable. In practice, for example, I added
some restrictions to be much more efficient. For example, this [inaudible] the theory is very
inefficient. But as you -- we want to use the decision procedures to do some conversion for
[inaudible] types, and list is a symbol which is not defined. It's a type constructor, so [inaudible].
So I know that a list applied to a natural number won't put the natural number in the place of a
match. So I know that, for example, in that place I can [inaudible] the conversion. But I will
[inaudible] the conversion in other places which can put the arguments in the context of a
fixpoint of a match.
So, yes, there are much more restrictions. The full definition of the calculus, it's more
complicated [inaudible]. Yes.
>>: So with the extensional equality, you can show that if you prove two things to be equal, that
you can substitute [inaudible] what about if you [inaudible] two things are equal [inaudible]
simulation, can you then substitute them in the same way?
>> Pierre-Yves Strub: Let me think. So [inaudible] in Coq is not well defined, but you should
be able to -- but I have to think about this more precisely. But at first glance I will say yes.
>> Nikhil Swamy: Any other questions? Well, let's thank the speaker again.
[applause]