>> Sumit Gulwani: Let's get started. It's my immense pleasure to welcome
Professor Sanjit Seshia from U.C. Berkeley. So Sanjit works on a variety of
interesting things, including program synthesis, program verification. And today
he's going to talk about an interesting application of verification in the space of
design of voting machines.
In fact, when we scheduled this talk today, 10:30, half an hour earlier than this,
then Josh Benaloh, who is organizing a workshop on voting technology in the
adjacent room got back to me immediately and said that we are going to be
talking about Sanjit's work in our workshop, so why not let him first speak inside
the workshop. So now Sanjit is going to give a more detailed version of the talk
that he just gave in there. Over to you, Sanjit.
>> Sanjit Seshia: Good to be back at Microsoft Research.
So this work, the title is about verifying a voting machine design. But I would like
to in this talk explore the work we did as verifying interactive systems like voting
machines,
where you not only have to verify the logic inside the machine, but also the
human computer interface. So let me talk a little bit more about what we did for
voting machines, and then I would love to have a later discussion about any
ideas that you have or questions you have or comments on extending this kind of
stuff to interactive systems in general.
Okay. So this work is joint with David Wagner and our students who did all the
work. And there is a paper that describes this work that appeared at the ACM
security conference, CCS last year. And it's also available on our websites.
So let me start with some background. Many of us, if not all, have used voting
machines of one kind or the other, but we all are aware of the problems,
perceived problems with voting machines that have appeared in the news. So let
me just go through a few examples.
The first one is an article that appeared in the New York Times about a problem
called the sliding finger bug. This was in a voting machine where some people
found that their votes were not being correctly recorded.
It was on a commonly used operating system that I won't name here. And the
problem here was that when people try to touch the display, the touch screen
display to select their candidate, some of them had their finger dragged across
the screen a little bit.
And that slight drag was interpreted as a drag-and-drop command. And the
machine would go into some weird state when it thought that. So
that was the sliding finger bug.
The second has to do with the recording of votes. So voters complained that the
machines were not registering their votes correctly, and the third had to do with a
bug where when the voter tried to select candidate A, the checkmark appeared
for candidate B.
And so obviously something weird was going on inside the voting machine.
Okay. So given all these kinds of problems, why do we want to use voting
machines, electronic voting machines? And the reason is that there are also a
number of advantages of using electronic voting machines.
First of all, they can be configurable for different elections. So more easily
configurable than physical voting machines where you can have different kinds
of contests. For instance, you can have a contest where you rank candidates not
just select candidates. Potentially if you have an electronic machine you just
have to update the software. You can provide accessibility and usability
features.
So people could require multiple languages and you can have all of that in one
machine. You can have different kinds of I/O devices especially for people with
disabilities. And finally, the election officials expect that the use of electronic
voting machines would make counting easier. And so there are many
advantages of voting machines, many reasons for why you might want to use
one.
But as we all know, many of us are verification folks in this room, that if you use
anything that runs even moderately complicated software, then you end up with
many hundreds of thousands of lines of code. You can have bugs. Bugs can
certainly introduce errors.
But they can also allow for attacks by malicious parties. And so you could end
up with changing the outcome of a contest. Especially I'm aware of the case in
this state in Washington State I think maybe four or five years ago the governor's
race was very close.
It was something like a few hundred votes. 100 votes that separated the two
candidates. So even a small change could affect the outcome of the election.
>>: I have a question. Why is this software called malicious [indiscernible] what
are the simplest logics.
>> Sanjit Seshia: This is exactly the question we asked ourselves, right? So let
me talk about it. So what we thought was suppose you want to formally verify a
voting machine. We had two options. We could have taken something that was
industrial scale voting machine if we were able to get at the code base of one of
these.
Or we could just throw it all away and start and build our own machine from
scratch. And one of the things we wanted to explore was how can you design a
voting machine in a way that would make it easier to verify and test?
So that's what we did. We implemented a very simple bare bones voting
machine in hardware, because we felt -- okay, even writing software, you'd have
to deal with a richer class of constructs. So let's just create a voting machine
that's a finite state machine. And so we just created a design in Verilog. We
synthesized it on to an FPGA with a touch screen display.
What I'll talk about today is that effort, verifying that voting machine. So I think
the main contribution of our work is the first bullet, which is we show that you can
combine formal verification and manual testing by humans so as to prove
correctness of a voting machine.
And I'll talk more about why this combination is needed in a few slides. But for
the formal verification component, we fell back to techniques that we all know
and love. So we used model checking and we used SMT solvers.
And one of the interesting aspects of this work is that we used formal verification
to reduce the number of manual tests that you need to test the interface. So, first
of all, we ensure that you have a finite number of tests. And furthermore, as I will
discuss later, it's a number of tests that's polynomial in the length of a voter
session.
Okay. So let me say a little bit about the kind of voting machine we're talking
about. It's called a direct-recording electronic voting machine, or DRE. So in U.S.
elections, a single voter session is a sequence of contests. So one contest could
be for president. The next could be for a senator. The next could be for the local
Congressman. And then you might have local council and in California we have
all the additional initiatives that get added on. Any session is a sequence of these
contests.
In a contest, the job is typically you have K choices and you have to pick L out of
K choices. So for president it's one choice. But for city council you might have to
pick three out of 10. That's the form of the contest. You have to pick L out of K.
Now, if you just think about how you would do this on paper, right? On paper you
would have something like a big newspaper which is cut into boxes. Each box is
a contest. And you can think a little bit about who you want to pick. If you're not sure
you go to another one and make your selections in some order, even going back
and forth before you're satisfied.
When you're satisfied, you fold up the paper and put it in the box. So that's what
the voter session is. In an electronic voting machine you can go back and forth
between the contests by pressing next and previous buttons. At the end, when
you're satisfied with all your selections, there's a cast button. You press the cast
button and then you're done. The vote is recorded.
>>: So does the back and forth add any typical challenges, and why is this important?
>> Sanjit Seshia: Okay. So the question was about whether navigation, the next
step adds any challenges? One challenge that it adds is that if you just think
about testing the voting machine, an obvious question is how long should you
test it for. Right? So one thing you might want to verify is whether there is some
hidden logic such that if I navigate 10 times forward and 10 times backward,
then it gets activated. This hidden logic, a trojan, gets activated, and
then it starts doing something bad.
That's why the navigation is also part of the specification. The other thing, of
course, is that you have ordering of contests. So how do you know if I
cast my votes in a particular order something weird doesn't happen? So that's
the complexity that navigation adds to it.
>>: Why do they allow for this particular one?
>> Sanjit Seshia: Well, at least -- so I think it's human nature in many cases to be
indecisive. You go to a contest and you're not sure what you're going to do. It's like
solving an exam as well. People like to have the ability to go back and forth and so I think it's the
same thing. It's a feature that I think is useful.
All right. So that's the kind of problem that we're handling. So what we'll be
talking about in the rest of this presentation is proving the correctness of a single
voter session.
>>: [indiscernible] the correctness [indiscernible].
>> Sanjit Seshia: Yeah, that's the idea. I'll talk about what it means to do that.
>>: I see.
>> Sanjit Seshia: Okay. So here's the outline for the talk. I'll talk about what's
the rationale for combining formal verification and testing. I'll give some toy
examples of this approach, then talk about the results and
conclude. So let's start with the first bullet. So let's say we just wanted to use
testing. Pure testing.
Then we have the classic quote from Dijkstra, which I think nicely illustrates this
point that you can only find bugs with testing if you use it alone. But you can't
prove correctness unless testing is made exhaustive in some way. Right? But in
particular for this problem, it's not possible to make testing exhaustive if you use
it by itself. And as an example, I talked about this case where you can go next
and previous, indefinitely. And how do you know that there isn't some logic
inside the voting machine that might activate after a thousand such iterations.
Now, of course, then we can say, okay, we won't use testing. We'll use formal
methods. We can prove correctness with formal methods. So that's fine. One
challenge is that the state space is large. So if you take the choices in a single
contest well it doesn't seem very large. But then if you multiply it across many
contests then it blows up.
>>: Could you verify --
>> Sanjit Seshia: What is that?
>>: What is the property that you verify?
>> Sanjit Seshia: I'll get to it. But right now I'm just making a general statement.
I think even for a fairly simple voting machine like the one we designed, the state
space was quite large, because you get -- it explodes exponentially in the
number of contests.
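To make the blow-up concrete, here is a minimal counting sketch. It assumes (this is not stated in the talk) that a contest lets the voter hold any selection of at most L of its K candidates, and that the session state is the current contest number together with the selections in every contest. The ballot sizes are hypothetical.

```python
from math import comb, prod
from typing import Sequence, Tuple


def selection_states(k: int, l: int) -> int:
    """Number of ways to have selected at most l of k candidates."""
    return sum(comb(k, i) for i in range(l + 1))


def session_states(contests: Sequence[Tuple[int, int]]) -> int:
    """Current contest number times the joint selection state of all contests."""
    return len(contests) * prod(selection_states(k, l) for k, l in contests)


# Hypothetical ballot: president (1 of 4), senator (1 of 3), council (3 of 10).
ballot = [(4, 1), (3, 1), (10, 3)]
print(selection_states(10, 3))  # 176
print(session_states(ballot))   # 3 * (5 * 4 * 176) = 10560
```

Each extra contest multiplies the count by its own number of selection states, which is the exponential growth in the number of contests described above.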
So one of the things we did was, instead of trying to verify software, we said okay
we'll just go to hardware, and we'll use standard verification tools like SMV to
verify our design.
But I think what's -- the state space is not so much of an issue. I think what is a
bigger issue is that you need formal specifications. In order to use formal verification
tools you need formal specifications. And this is okay for some properties.
Some properties you can define precisely. So, for instance, proving something
like setting the reset signal, making the reset signal one will clear the selection
state for all contests.
That you can prove on the code. Or you could even prove something like this,
that is, the transition function of the electronic voting machine that I get from the
code is a deterministic function of a set of inputs. You can prove that.
But how do you deal with the user interface? And the user interface has to do
with voter expectations. Here's an example of that. Let's say the voter wants to
vote for Alice. So she looks at the screen and there's a button next to Alice. She
presses the button and then she sees that according to her the region around
Alice is highlighted. Therefore she says, okay, the machine must have cast my
vote for Alice. Right?
That's just one selection. How do you formalize this? It has to do with human
perception. It's sort of a computer vision problem. But even there it's not
properly formally defined. So personally I don't know how to formalize this. And
so that's the challenge with using formal methods to verify the human computer
interface.
So to summarize, formal methods are great. If you have a formal specification
you can prove the correctness. But the voter expectations can't really be
formalized. On the other hand, if I did testing by humans and I selected a panel
of representative voters, so I showed them every single screen that the voting
machine will generate and asked the voters what do you think is the button area.
Right? If your panel of voters is representative, then you can argue that if all of
them pass the screen, all of them interpret the screen correctly, then the output is
correct.
So you could use testing by humans to do that. The problem is that you don't
know how many tests you need. So how much testing is sufficient to prove
correctness?
So that's where our combination comes in. We combine the two to get rid of the
problems of each one. So for precise properties we use formal verification. To
deal with voter expectations we use testing by humans.
And I'll show you in the next few slides how we do this exactly. So there are two
properties of this combination. The first is that we actually have modeled some
assumptions. We show the combination of formal verification and human testing
is sound: if all the tests pass and all the verification goals are proven, then the
voting machine is correct, according to the formalization I'll present soon.
And not only that, we can actually use formal verification to reduce the number of
tests. Okay. So let me illustrate this approach and exactly what is this
combination of testing and verification.
All right. So here's just a naive approach to testing plus formal verification. So
suppose I formally verify that my DRE is deterministic. That given a sequence of
button presses, it will update its state in a unique way. Right? Then if I had an
exhaustive set of tests, I can simply apply all those tests. I don't have to do any
more formal verification. I just apply all those tests and I know that on election
day if the machine code is the same it's still deterministic and my tests were
exhaustive so the machine should behave correctly.
The reason this is naive is because you can't do exhaustive testing. So the basic
problem is you need infinitely many tests. And each test itself you don't have a
bound on how long it might go.
So I've already talked about this. But let me just go through an example, just to
describe some notation here. So we'll view a trace, a trace of inputs in this way.
So inputs or selections. So here every element of the trace is separated by
commas. So here you have the empty set, which means that no candidates are
selected.
Then we have Alice, which means that the voter selected Alice. And then we
have the set with both Alice and Bob, which means that in this contest it's
possible to select more than one candidate and the voter has highlighted two
names. So this is going to be the form of the trace.
And so an infinite trace can be something like this, where you toggle between
selecting Alice and Bob. So you start with no selections and then you select
Alice, then you deselect Alice and you select Bob. You deselect Bob and you
select Alice. And it can keep going like this.
How do you know after a thousand such toggles that some trojan doesn't get
activated? How do you know that the logic of the voting machine isn't
erroneous? So that's the problem with testing. You don't know how long to test
for. Okay. So now -- so let me then tell you how we use formal verification to
solve this problem.
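The toggling trace and the worry about hidden logic can be illustrated with a small sketch. Everything here is hypothetical: the `TrojanedContest` class, its trigger threshold of 1000 presses, and the single-choice toggle rule are assumptions made only to show why a test of any fixed length can miss such logic.

```python
def spec_press(selected: frozenset, candidate: str) -> frozenset:
    """Intended behaviour (assumed): pressing a candidate toggles it in a single-choice contest."""
    return frozenset() if candidate in selected else frozenset({candidate})


class TrojanedContest:
    """Hypothetical machine that behaves correctly until a hidden press counter fires."""

    TRIGGER = 1000  # hidden threshold, chosen arbitrarily for this sketch

    def __init__(self) -> None:
        self.selected = frozenset()
        self._presses = 0

    def press(self, candidate: str) -> None:
        self._presses += 1
        if self._presses >= self.TRIGGER:
            self.selected = frozenset({"Mallory"})  # hidden misbehaviour
        else:
            self.selected = spec_press(self.selected, candidate)


def first_divergence(max_presses: int):
    """Replay the Alice/Bob toggling trace; return the first step that deviates, if any."""
    machine, expected = TrojanedContest(), frozenset()
    pattern = ["Alice", "Alice", "Bob", "Bob"]  # select/deselect Alice, then Bob
    for step in range(1, max_presses + 1):
        candidate = pattern[(step - 1) % len(pattern)]
        machine.press(candidate)
        expected = spec_press(expected, candidate)
        if machine.selected != expected:
            return step
    return None


print(first_divergence(999))   # None: every test shorter than the trigger passes
print(first_divergence(1000))  # 1000
```

Whatever length a tester picks, a threshold one step beyond it goes unnoticed, so testing alone gives no bound on how long to test.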
And before that, let me tell you what is the correctness notion. So it's a notion
that we're all familiar with in the formal verification community which is trace
equivalence. So trace equivalence is typically between two designs or an
implementation and a specification. So it's a similar notion here.
You have an implementation of the voting machine. So this is our Verilog design.
And then you have the specification which is something in the voter's mind. So
I'll talk a little bit more about this. Just think about this as let's consider this to be
another finite state machine for just this slide. That formalizes what's in the
voter's mind.
So now what is a trace of the implementation? It's going to be a sequence of the
relevant part of the state of the implementation. And for the voting machine
context, this is going to be a pair. The first element of the pair is the number or
the contest number that you are on.
So we just assume that all contests are labeled uniquely. So maybe president is
the first contest. Senator is the second contest, et cetera. So that's the contest
number. And the second is going to be a set which represents the selections in
that contest.
Okay. So that's what this pair means. Okay. So let's for simplicity assume that
there's only a single contest. So we'll make it simple just for this slide. All right.
So that's going to be one element of the internal, the trace of this implementation
machine, sequence of internal states.
So given this internal state, the machine outputs a screen, a bit map and the bit
map shows let's say there are two candidates, Alice and Bob, and it shows that
neither is selected. So they're both not highlighted.
Now, what the human does is the human voter looks at the screen and interprets
it. And the interpretation of the voter is, okay, I'm on contest one. Let's label it
somewhere and nothing is selected. So the machine is initialized. No selections
are made. Then the tester presses a button. Let's say the tester presses the
button for Alice. What we want is that the machine state should now become contest
one and Alice. The machine should output a screen which illustrates that. So
Alice is selected. Then the human voter looks at that and says, okay, now I know
that the machine is still in contest one and it has recorded my vote for Alice.
So these are the traces. This is going to be a trace of the implementation, and
that's going to be a trace of the specification voting machine. Okay. And in each
case each element of the trace, each element of that sequence is going to be a
pair of the contest number and the selection state.
So let me talk about this part. So we assume that we have this model of a
human voter. So how do we actually -- how do we construct this model. So the
way we do it in our work is we separate out the perception part from the internal
working.
So we assume that each voter has a state machine in their head, which models
the voting process. That's something we call the specification voting machine.
And then furthermore, each voter has a function that allows them to interpret the
touch screen display. So let's first look at the specification voting machine. So
the specification voting machine as we formalize is going to be a finite state
machine, and the state of that machine is simply the current contest number and
the state of selections made across all contests.
And furthermore we're going to assume that the voting machine is structured in a
particular way. So typically when we cast a paper ballot, we don't expect that the
way we vote for president is going to influence how our vote for Congressman is
recorded. We assume that these are independent. And so we've represented
that in our, the specification voting machine formalization. So we assume that
there are two main modes, that there are two modes of the machine. The main
mode and the cast mode.
The main mode is where you start. There's a state machine that controls the
navigation. So it has to do with which contest you are on.
And then there's a state machine for every contest. So M-1 for contest one and
M-2 for contest two, et cetera, and the job of these machines is simply to deal
with selections within that contest. And the assumption is that how things work in
this contest will not affect the remaining contests.
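The structure just described can be sketched as a small state machine. This is an illustrative reading, not the formalization from the paper: the class name, the button names, the clamping of next/previous at the ends of the ballot, and the rule that a press beyond the allowed number of selections is ignored are all assumptions.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class SpecVotingMachine:
    limits: List[int]        # limits[i]: how many candidates contest i allows (its L)
    contest: int = 0         # navigation state: which contest is on screen
    mode: str = "main"       # "main" or "cast"
    selections: List[FrozenSet[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # One independent selection state per contest, initially empty.
        self.selections = [frozenset() for _ in self.limits]

    def press(self, button: str) -> None:
        if self.mode == "cast":
            return  # nothing changes once the ballot is cast
        if button == "cast":
            self.mode = "cast"
        elif button == "next":
            self.contest = min(self.contest + 1, len(self.limits) - 1)
        elif button == "previous":
            self.contest = max(self.contest - 1, 0)
        else:
            # A candidate button only touches the current contest's selections.
            current = self.selections[self.contest]
            if button in current:
                current = current - {button}
            elif len(current) < self.limits[self.contest]:
                current = current | {button}
            self.selections[self.contest] = current


machine = SpecVotingMachine(limits=[1, 2])
for b in ["Alice", "next", "Carol", "Dave", "Eve", "previous", "cast", "Bob"]:
    machine.press(b)
print(machine.contest, machine.mode, machine.selections)
```

In this run the press on "Eve" is ignored because contest 2 already holds two selections, and the press on "Bob" is ignored because the ballot has been cast. Selections in one contest never read or write another contest's entry, which mirrors the independence assumption.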
>>: So let's finite state machine is the -- the instruction that may -- because what
happens -- [indiscernible] what if someone goes and sits there for the entire day?
So are people [indiscernible] voting [indiscernible] people are supposed to come
out of the room in a certain amount of time?
>> Sanjit Seshia: So there could be -- like I said in the beginning, first of all, this
formalization is for the voting machine we constructed. So if you construct a
more complicated voting machine, then you'll have to make -- if the
voting process is more complicated, you'll have to correspondingly change the
specification of the voting machine to reflect that.
So if there's a notion of a timeout, then an informed voter should know that. And
that will be incorporated in their state machine model. So this is a model of the
voting process that we implemented.
So you can basically go back and forth between contests and at some point you
cast and control is transferred to a state machine that deals with recording the
cast vote in persistent storage. So that's the model of the specification
machine, informally described. The details are in the paper, but let me talk about
the I/O interpretation function.
So we assume that for every human voter they have a function which is basically
a pair of functions. I sub O and I sub I which have to do with how they interpret
input and output.
So the output interpretation function, I sub O, this is a mapping from bit maps,
from screens, to a pair of the current contest number and the selection state. So
the idea is that the voter looks at the screen and by looking at the screen they try
to interpret what is the contest I'm on and what has been selected in that contest.
That's what this is. Then the voter has to also figure out what region of the
screen corresponds to what button. That's what this function formalizes. So this
I sub I is a mapping from pixels. So locations on the screen to buttons. So we
assume that every human has a consistent way, the assumption about a function
is for every human there's a consistent way of saying that pixel 4,000 comma
2,000 corresponds to a location inside the button for Alice. Right?
So all we do in our work is we assume that for every human voter these exist.
But we don't know these functions. We don't assume any knowledge of these
functions. We just assume such a function exists.
>>: What do you mean by [indiscernible]?
>> Sanjit Seshia: No, this is knowledge about the definition of the function. The
form of the function. But I don't assume that I know I. What I'm saying is I --
>>: This is a type you don't know exactly.
>> Sanjit Seshia: Exactly. This is a type. But I don't know exactly which pixel is
mapped to which button. That's inside the human tester.
>>: Should there be a unique choice for this?
>> Sanjit Seshia: It turns out that different people can interpret the screen in
different ways, slightly different ways. So some people who may be a little bit
vision impaired may not be able to make out the boundaries very clearly. So
they might only look at the interior of a button and say, okay, I can press within
that. So there can be some variation across humans in interpreting this. This is
my understanding from just reading stuff about this.
>>: So let's go back to the one word [indiscernible] -- can you get
misinterpretation [indiscernible].
>> Sanjit Seshia: There's a very classic example in the 2000 presidential
election in Florida, the so-called butterfly ballot. If I remember right,
the selections were to be made in the middle and the candidates' names were on the
side. Then there were arrows pointing towards where you had to do a check or
punch-through or whatever. Right? And it seemed like -- so people wanted to
vote for candidate A, but the arrow was pointing in a way they actually ended up
voting for B. It can happen even in paper ballots.
>>: [indiscernible] the current [indiscernible] uses using some other means?
[indiscernible].
>> Sanjit Seshia: Exactly. So what we will assume is that -- so what we will
claim is if election officials use a representative panel of human voters who
represent the whole voting population, right, and they do our generated
tests with that panel of voters, then you'll have some assurance about the
correct operation of the interface for all voters.
Okay. So here's a toy example of one step in the testing process. So what
happens during human testing? So you have your voter there. And that's the
screen that the voter sees. So initially the voter looks at the screen and says,
okay, so, first of all, before I go there, here I'm going to indicate the internal state
of the machine. So right now no candidates have been selected. There's only
one contest. So I leave out the contest number.
So now the voter looks at the screen and comes up with their interpretation. So
they say, okay, system initialized. That means there's no selections made. And
there's only one contest. Then the voter says I'm going to press the first button,
right? And the voter thinks that they're voting for Alice. A correct voting machine
should record that internally. Should record the vote for Alice and should give
some feedback, should highlight the button for Alice and present that display
back to the voter.
Then the voter looks at that, at this display, and updates their interpretation.
They say, okay, now I think that the machine has recorded the vote for Alice.
And now what we want to check is that what the voter has in their mind equals
what's inside the voting machine. So we want to do this check. So every step of
the testing process we need to do something like this.
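One testing step of this kind can be sketched as follows. This is a toy under loud assumptions: the "screen" is a tuple rather than a bitmap, the voter's two interpretation functions (`voter_i_o`, `voter_i_i`) are written out explicitly even though the approach only assumes they exist, the pixel threshold is invented, and the check reads the machine state directly instead of obtaining it through a cast as a real tester would.

```python
from typing import Callable, FrozenSet, Tuple

State = Tuple[int, FrozenSet[str]]                 # (contest number, selections)
Screen = Tuple[int, Tuple[Tuple[str, bool], ...]]  # toy screen: contest, (name, highlighted)
Pixel = Tuple[int, int]
CANDIDATES = ("Alice", "Bob")


def toggle(state: State, button: str) -> State:
    """Shared toy transition: pressing a candidate toggles it."""
    contest, selected = state
    if button in selected:
        return (contest, selected - {button})
    return (contest, selected | {button})


def render(state: State) -> Screen:
    contest, selected = state
    return (contest, tuple((name, name in selected) for name in CANDIDATES))


def swapped_render(state: State) -> Screen:
    """Buggy display: highlights the other candidate's row."""
    contest, selected = state
    return (contest, (("Alice", "Bob" in selected), ("Bob", "Alice" in selected)))


def voter_i_o(screen: Screen) -> State:
    """Hypothetical I_O: read contest number and highlighted names off the screen."""
    contest, rows = screen
    return (contest, frozenset(name for name, lit in rows if lit))


def voter_i_i(pixel: Pixel) -> str:
    """Hypothetical I_I: the top half of the screen is Alice's button, the rest Bob's."""
    _, y = pixel
    return CANDIDATES[0] if y < 100 else CANDIDATES[1]


def check_step(state: State, pixel: Pixel,
               i_o: Callable[[Screen], State], i_i: Callable[[Pixel], str],
               display: Callable[[State], Screen] = render) -> bool:
    believed = i_o(display(state))            # what the voter thinks before pressing
    button = i_i(pixel)                       # which button the voter thinks was pressed
    machine_after = toggle(state, button)     # what the machine actually holds
    believed_after = toggle(believed, button) # what the voter expects to hold
    perceived_after = i_o(display(machine_after))
    return believed_after == perceived_after == machine_after


start = (1, frozenset())
print(check_step(start, (40, 20), voter_i_o, voter_i_i))                  # True
print(check_step(start, (40, 20), voter_i_o, voter_i_i, swapped_render))  # False
```

The second call fails in the same way as the reported bug where selecting candidate A put the checkmark on candidate B: the machine state is right, but the voter's interpretation of the display disagrees with it.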
So the way that they're going to do this is, I'm skipping ahead a little bit, the way
we'll do this is we'll assume that for every distinct output screen that the machine
can generate you can then press cast on that screen or you have some way of
pressing a button that will produce a readable output that the human can then
look at and say, okay, this is inside the machine and this is what I think and the
two match up.
And so we have a coverage criterion for testing that deals with this. All right. So
so far I talked a little bit about what correctness means, which is the notion of
trace equivalence. I talked about the human voter's view of a voting machine's
operation, and I talked a little bit about the testing process. Let me return to this
question about how do we use formal verification to make testing easier.
So here are three properties that we prove on the code in order to make testing
easier. The first is that we prove that the DRE is deterministic. The machine is
deterministic. So this is a property that is proved on the transition function of the
machine.
And I'll talk a little bit about this in the next slide. The second property that we
prove is that for every output screen of the voting machine, there is a unique
internal state. Every distinct output screen has a unique internal state. What this
means is that the output function is an injective, or 1-to-1, function of the contest
number and selection state.
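The injectivity property can be illustrated by brute-force enumeration on a toy model. This is not the proof technique from the talk, which works on the Verilog code with formal tools; the state encoding and both output functions below are hypothetical.

```python
from itertools import combinations


def all_states(num_contests, candidates, limit):
    """Every (contest number, selection) pair with at most `limit` selections."""
    for contest in range(num_contests):
        for size in range(limit + 1):
            for chosen in combinations(candidates, size):
                yield (contest, frozenset(chosen))


def find_collision(output, states):
    """Return two distinct states that produce the same screen, or None if injective."""
    seen = {}
    for state in states:
        screen = output(state)
        if screen in seen and seen[screen] != state:
            return seen[screen], state
        seen[screen] = state
    return None


def good_output(state):
    contest, selected = state
    return (contest, tuple(sorted(selected)))  # screen shows contest and selections


def bad_output(state):
    _, selected = state
    return tuple(sorted(selected))             # screen forgets the contest number


states = list(all_states(2, ("Alice", "Bob"), 1))
print(find_collision(good_output, states))  # None
print(find_collision(bad_output, states))
```

With `bad_output`, the empty screen of contest 0 and the empty screen of contest 1 look identical, so a voter could not tell which state the machine is in.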
And you can prove this again formally on the code of the machine. And then
finally we have to prove that when the voter casts his or her vote, that the
selection state in RAM is correctly recorded to persistent storage. Right?
So you have to check that the logic that does this copy is correct. So I won't
have time to talk about all the formal verification things, but let me talk about
proving determinism. So the problem here in general in proving determinism,
you want to verify that some state variable V is a deterministic function of K
variables, W1 to WK, and nothing else. It's only those K variables that determine
the value of V.
This could be a single cycle dependence, combinational dependence or one
cycle dependence.
So that's the verification task. So as an example for the selection state of a
contest, we want to verify something like this. That the next value of the
selection state is only a function of the current value. The contest number that
you are on. And whether a button was pressed. And if a button was pressed,
which one.
So each of these have corresponding variables in the Verilog code. So this is
already the previous state value. There's a contest number variable and then
there is a variable that records whether there was a press and if so what button
was pressed.
>>: Could you define what it means to press a button. At the hardware level
aren't there a lot of sensor outputs when you touch the screen?
>> Sanjit Seshia: Yeah. So let me decompose that into two tasks. So the first is
at the hardware level when you press a screen, press the screen somewhere,
there is a sensor that is activated and then it probably, in a software controlled
system, generates an interrupt, right, of some kind. Or in this hardware system,
it raises the level of a signal on a pin.
So that logic currently we're assuming is trusted. What we are verifying is
suppose there was an input signal to the FPGA. The internal logic of the voting
machine that's implemented say as a circuit will then record that correctly.
>>: So you're assuming that by the time you get -- there's one discrete?
>> Sanjit Seshia: That's right. It's a discrete input.
>>: That's fine.
>> Sanjit Seshia: Okay. So how do we prove determinism. So many of you in
this room know about SMT solvers. So the way we do this is we formulate it as a
satisfiability problem. And so, first of all, I'll define a function phi which is going to
be a function of the current state, the next state, the input variables and the
output variables.
And so it's basically what you expect. The next state is defined as a function
delta of current state and input. And the output is simply a function rho of the
current state. We're assuming a Moore machine.
Okay. So now you define -- that's the formula that represents your transition and
output functions. And so the kind of check that we do is similar to the kind of
check that you would use to prove noninterference, for instance. The idea here
is that we assume that there are two runs. Two different runs of the transition
function. So in one case you went from S1 to S1 prime on input I1 and
generating output O1. And in the other case you went from S2 to S2 prime on
input I2 generating O2, and these runs are completely arbitrary except that for all
variables in capital W, the values are the same in run 1 and run 2.
So --
>>: There's something stronger, because for any state, sometimes you may
want to prove the definition. All fairly reachable.
>> Sanjit Seshia: That's right. We're proving something stronger. You might
only want to prove this for the reachable states. And we don't do that. But it
turns out for our design this is good enough. Okay. All right. So what we're
saying is there are these two runs, and the values of the W variables are equal.
So if that's true, then the value of V in run 1 equals the value of V in run 2. And you
want to check the validity of that formula.
And so in our work we do this with a bit vector SMT solver. Okay. So that's how
we prove determinism. Many of the properties can be similarly cast as SMT
problems. Some properties are probably best cast as temporal logic properties.
And we do so. The paper has a list of all the LTL properties that we verified and
those verified using SMT.
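The two-run check can be mimicked on a toy transition function by exhaustive enumeration. To be clear about the substitution: the talk uses a bit-vector SMT solver on the Verilog design, whereas this sketch simply enumerates all variable assignments in plain Python, and the variable names and both `delta` functions are made up. It looks for two runs that agree on every variable in W yet give different next values of V.

```python
from itertools import product


def depends_only_on(delta, domains, w):
    """True if delta's result is determined by the variables named in w alone.

    Equivalent to the two-run check: no pair of assignments agrees on w
    but yields different next values.
    """
    names = list(domains)
    seen = {}
    for values in product(*(domains[n] for n in names)):
        run = dict(zip(names, values))
        key = tuple(run[n] for n in w)
        result = delta(run)
        if key in seen and seen[key] != result:
            return False  # two runs agree on w but disagree on V
        seen.setdefault(key, result)
    return True


def good_delta(run):
    """Next value of one selection bit: toggles when its button is pressed in contest 0."""
    if run["pressed"] and run["contest"] == 0 and run["button"] == 0:
        return 1 - run["sel"]
    return run["sel"]


def bad_delta(run):
    """Same, plus hidden logic that clears the bit when an extra counter hits 3."""
    if run["hidden_counter"] == 3:
        return 0
    return good_delta(run)


domains = {"sel": (0, 1), "contest": (0, 1), "pressed": (0, 1),
           "button": (0, 1), "hidden_counter": range(4)}
W = ("sel", "contest", "pressed", "button")

print(depends_only_on(good_delta, domains, W))  # True
print(depends_only_on(bad_delta, domains, W))   # False
```

Like the check described above, this ranges over all assignments rather than only reachable states, so it proves the stronger property.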
That's the formal verification stuff. Now let me talk about the coverage metrics
for testing, and there are three coverage criteria. And the idea is that if you
generate a set of tests that satisfy these three criteria, then that, along with the
results of formal verification, is sufficient to prove trace equivalence, given the
existence of an input/output interpretation function.
All right. So let's go through this. So the first coverage criterion is initial state
coverage which has to do with the fact that the machine, the voting session
should start in a valid state. So at the very beginning where no selections have
been made, if you just pressed cast, then the output, the recorded output should
reflect the fact that no candidates were selected.
So that's the first thing. You should have a test which simply presses cast in the
very first state. Then the next has to do with transition coverage. And the key is
this is transition coverage within the state machine for a single contest.
So the idea here is if you look at the state machine for one contest, you must
press each navigation button from that contest. You should press next and you
should press previous.
And then for each contest, if you have like ten buttons and you have to pick one
candidate out of ten, then you should do every possible selection and
deselection. So you should press candidate one and then try selecting and
deselecting everybody. Press candidate two and try deselecting everybody, et
cetera. That's transition coverage for selections.
Okay. But the key thing, as I described to you, this is transition coverage within
the state, for the state machine within a contest. So it's actually a fairly small
state machine. And then finally we have a third criterion which has to do with
output screen coverage. So the goal of this coverage criterion is to check the
injectivity property. So the idea is for every output screen that the machine can
generate, you first navigate to the output screen and then from that output screen
you press cast. And then you'll get some internal state that the voting machine
recorded. And you look at that and you say, okay, for contest
I, which this output screen represented, the state was correctly recorded as I
thought it would be recorded.
That's what this is doing. So if I had just a machine with one contest, then doing
this I would have proved the machine correct if I used a representative voter to
do the tests. So if I verified the machine is deterministic, and I then test
every transition, every navigation transition and every selection transition, and a
representative human voter passed it, then I can conclude that this machine is
fine.
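
The three coverage criteria for a single contest can be sketched as a small test-script generator. This is an illustration under assumptions: the button names (`select X`, `next`, `previous`, `cast`) and the function `single_contest_tests` are hypothetical, and the scripts only list which buttons to press. Judging whether each screen and each cast record looks right is still left to the human tester.

```python
def single_contest_tests(candidates):
    """Button scripts meeting the three coverage criteria for one contest."""
    select = {c: f"select {c}" for c in candidates}
    # Shortest button sequence reaching each selection state from reset.
    reach = {None: []}
    reach.update({c: [select[c]] for c in candidates})
    buttons = list(select.values()) + ["next", "previous"]

    tests = [["cast"]]                               # initial-state coverage
    for prefix in reach.values():
        for button in buttons:
            tests.append(prefix + [button, "cast"])  # transition coverage
    for prefix in reach.values():
        tests.append(prefix + ["cast"])              # output-screen coverage

    unique = []
    for test in tests:                               # drop duplicate scripts
        if test not in unique:
            unique.append(test)
    return unique


for script in single_contest_tests(["Alice", "Bob"]):
    print(" -> ".join(script))
```

In this toy setting the output-screen scripts coincide with scripts already produced by the other two criteria, so they disappear in the deduplication step. That overlap is a property of the sketch and is not a claim about the real machine.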
But the problem is that once you start having many contests, the space
of possible tests blows up. So in this case you have two contests. In the first
contest you pick between Alice and Bob, A and B. The second contest you have
to pick between Eve and Mallory. E and M. And so this graph is basically
showing the space of all possible choices.
So you could either select Alice and then go to the next contest, where Alice is
selected, and then either select Eve or Mallory or you could select Alice and
deselect Alice and keep going back and forth. So this graph represents all the
possible transitions that you could take.
So what we want is we don't want to have to cover all the transitions of the state
machine. Because the number of transitions will grow exponentially in the number
of contests.
So instead we want to do something that's more feasible. Okay. So let's first do
a simple thought experiment. So suppose that I had separate physical machines
for every contest. Right? Then I could simply test the transitions for each
machine independently, right? So why is it that I'm able to do this if I had separate
DREs for every contest? It's because the logic for different contests doesn't
interact. The choices I make for one contest do not affect how my choices for
another contest are recorded. If I had physically separated machines that are not
connected in any way then you could do this.
And so the idea would be, okay, if you had these three contests, all you have to
do is transition coverage within each machine. But the fact is we only have one
DRE which is going to support all these contests. So the idea is let's prove that
the logic for different contests is independent of each other.
So what does independence mean? It's sort of the opposite of determinism. So
you want to prove that the selections, the recorded selections in contest one are
not affected by any state variables that have to do with contest two or three or
four. So they're independent. You want to prove independence of the update
function for contest -- of contest one with respect to the variables in other
contests. And you want to prove independence with respect to the code that has
to deal with navigation and the code that has to deal with cast, et cetera. So
proving independence essentially turns out to be an SMT formula similar to the one
we had for determinism.
And so I'll just -- I'll skip the details but I can tell you exactly what the formula is.
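
A rough sketch of what the independence checks amount to, again by brute-force enumeration in Python instead of SMT. The toy `step` function, the deliberately faulty `buggy_step`, and the three check functions are hypothetical names introduced for this example. The actual formula used in the work is in the paper and is not reproduced here.

```python
from itertools import product

NUM_CONTESTS = 2
CHOICES = (None, "A", "B")
BUTTONS = ("A", "B", "next", "previous")
ALL_SELECTIONS = list(product(CHOICES, repeat=NUM_CONTESTS))


def step(contest, sels, button):
    """Toy transition: returns (next contest, next selections)."""
    sels = list(sels)
    if button == "next":
        contest = min(contest + 1, NUM_CONTESTS - 1)
    elif button == "previous":
        contest = max(contest - 1, 0)
    else:
        sels[contest] = None if sels[contest] == button else button
    return contest, tuple(sels)


def buggy_step(contest, sels, button):
    """Faulty variant: choosing A in contest 0 blocks selection in contest 1."""
    if contest == 1 and sels[0] == "A" and button in ("A", "B"):
        return contest, tuple(sels)
    return step(contest, sels, button)


def contest_is_independent(i, step_fn=step):
    """Contest i's next selection must ignore other contests' selections."""
    for contest, button in product(range(NUM_CONTESTS), BUTTONS):
        for a, b in product(ALL_SELECTIONS, repeat=2):
            if a[i] != b[i]:
                continue  # only compare runs that agree on contest i
            if step_fn(contest, a, button)[1][i] != step_fn(contest, b, button)[1][i]:
                return False
    return True


def navigation_preserves_selections(step_fn=step):
    """Pressing next or previous must not change any recorded selection."""
    return all(step_fn(c, s, b)[1] == s
               for c in range(NUM_CONTESTS)
               for s in ALL_SELECTIONS
               for b in ("next", "previous"))


def selection_preserves_screen(step_fn=step):
    """Pressing a candidate button must not jump to another contest."""
    return all(step_fn(c, s, b)[0] == c
               for c in range(NUM_CONTESTS)
               for s in ALL_SELECTIONS
               for b in ("A", "B"))


print(contest_is_independent(1), contest_is_independent(1, buggy_step))
```

The faulty variant shows the kind of cross-contest interaction the independence property is meant to rule out: two runs that agree on contest 1's selection and differ only in contest 0's selection end up recording different results for contest 1.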
>>: What would happen if independence was longer? So, for example, the
scenario might be that maybe a person gets [indiscernible] and they can
distribute that to a number of characters. Not just choosing one but you can
generate defenses [indiscernible].
>> Sanjit Seshia: So that's a different problem. So I think you had two -- your
question had two parts to it. So one was suppose there was a way in which you
could select a subset of contests in which you will make choices. Right? And so
then there was some inherent dependencies. So maybe if you vote in the
contest for president, then you must also vote in this other contest for senator.
But maybe some voters are ineligible to vote in these contests but maybe they
can vote on ballot initiatives. Maybe there's some dependencies like that.
So if you had a system like that, then I think -- I don't know how to decompose
them right now. Maybe there is some smart design way in which we can try to
decompose these.
The second part of the question had to do with the kind of election that you run.
So instead of having in a contest -- I think some European countries have this,
where you can rank candidates. So you can say that voting for president, but I'll
say A is first and B is my second choice and C is my third choice, and if
somebody comes up third, then you allocate the second choice preferences to
the other candidates. Right?
So that is a different kind -- so that doesn't have to do with multiple contests. It
has to do with even a single contest. And there the state space of possible
options becomes, it's the number of permutations of the candidates. Number of
possible ranking orders. There I think we have a way of again using verification
to do this where if we can prove that the order in which the buttons appear or
something like that, right, the order in which I display candidates is independent
of the names of the candidates, something like that.
Right? Then basically it means that you can pick one order for testing, and that
should work. So you can do things like that to deal with other contests. So it's a
fascinating area in this world of elections, because there's just so much diversity.
Okay. So now if the structure of the machine is as follows. So it's similar to the
specification voting machine that I showed you. So you have for each contest
you have a state machine. When you press reset, you start at the first contest.
From any of these contests you can press cast and end up in the cast mode. Or
you can press next and previous and navigate between contests.
Okay. And as you see, it has a composition structure similar to the one
we had in the specification voting machine. So the additional goals for formal
verification tools are these, which are the independence goals. So we want to
prove that the state of one contest cannot affect the state of any other contest.
Or updating the state of one contest cannot have an effect on any other contest.
Pressing a navigation button cannot affect the selection state of any contest.
And pressing a selection cannot cause you to jump to a screen of some other
contest.
Right? So those are again the properties that we verify and again we can verify
them using SMT solvers. Okay. So given all that, that's the idea of combining
verification and testing. So we had all these goals on independence and
determinism that we prove with SMT and some other goals we prove with model
checking. We use that to reduce the number of tests to end up testing each
contest independently.
So let me talk about the results. So we have a theorem, proved in the paper,
which says that if the DRE satisfies the six properties that I talked about, so the
six properties just to recap are proving determinism, proving injectivity, proving
the correctness of the cast record, and then proving independence.
So if you do all that, then if we further generate a test suite that satisfies our
coverage criteria, if all those tests pass, then we can be assured that the DRE is
trace equivalent to the specification voting machine for some, the interpretation
function that is represented by the voter panel and vice versa. And if the DRE is
correct in that respect, then all the tests will pass.
Okay. So that's the theorem. And the nice thing again now is because we
proved independence, we only need a polynomial number of tests. So previously
if you had N contests and each contest involved picking one out of K, you need K
to the N possible tests, because you have to consider all possible combinations
of candidates from different contests.
But now, because we proved independence, we only need K times N many tests.
And I think this is much more tractable to actually generate a script from which
you can hand off to human voters and have them do exhaustive pre-election
testing.
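
To make the arithmetic concrete, the short Python snippet below compares the two counts from the talk, K to the N against K times N. The function names are made up for this illustration, and the counts ignore constant factors and the extra navigation and deselection tests.

```python
def exhaustive_tests(k, n):
    """Every combination of one candidate per contest: k to the n."""
    return k ** n


def per_contest_tests(k, n):
    """Each contest tested on its own once independence is proved: k times n."""
    return k * n


for n in (1, 2, 5, 10):
    print(n, exhaustive_tests(10, n), per_contest_tests(10, n))
```

With ten candidates per contest, five contests already need 100,000 combinations in the exhaustive case against 50 in the per-contest case.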
Okay. So the implementation for those who are interested, it's actually quite
small. It's just a thousand lines of Verilog.
>>: Did you say the independence is the number of [indiscernible].
>> Sanjit Seshia: There are --
>>: [indiscernible].
>> Sanjit Seshia: No, because of the navigation. The number of
contests shows up there.
>>: [indiscernible] find any bugs?
>> Sanjit Seshia: So we -- so the answer is no, we didn't find any bugs. But I
would put a little asterisk to that saying that our implementation is actually very
rudimentary. And it's actually even hard to test it out very well because the
FPGA kit screen is really small and we have these line drawings.
It's not ready for deployment in any sense of the word. Not even for a school
election, I think. But anyway I think it's -- it was a first step. It was an interesting
first step. So and also I think it's to my knowledge it's one of the few, if not the
first, such ways of combining formal verification and testing. At least I have not
seen any paper on this kind of approach before. If any of you know of work, do
let me know.
Okay. So the Verilog was small. That's one high level bit on the slide. And I can
tell you more for those who are interested.
>>: FPGA, the main points for verification and testing to verify finite state
machine for you when you have very fine testing. But why should you use
FPGA, what is the software for finite state machine?
>> Sanjit Seshia: So that's a really good question. So let me say two things
about it. One, when you have software, you have to deal with the OS. So the
runtime environment. And with an FPGA you have to trust the CAD tools.
And so that's one dichotomy. The second is when you do something in software,
you need model checking tools, et cetera, that will actually be able to prove those
kinds of properties. So when we started out, the student tried experimenting with
some tools, like Blast, and it didn't work very well. But then she did Verilog and
SMV and it was great. So that's just the reason why we went that way. I'm not
saying that there aren't software tools that may have worked better
than Blast. Maybe there were. But it just didn't work for us that way.
The other thing I'll say about that question is we
actually did try doing things in software. So I'll say a bit more about this in the
future work slide.
Okay. So I want to also summarize this talk by talking about limitations. So I
think the work we did was just a first step. It was a very small baby step really in
this area. And our voting machine is not representative of a real voting machine.
So you make a lot of assumptions. So we assume that you can come up with a
test panel that's representative.
We assume that the physical components are correct, so the FPGA kit,
things like the touch sensor work correctly, the CAD tools are trusted
and things like that. So we make all these assumptions. And also the
functionality we implement is fairly small. For instance, I was just in the next
room in this voting machine workshop. I was preceded, I think, by a speaker from
the Washington State election bureau, and he was talking about the importance
of support for people with disabilities.
And that opens a completely new dimension because you have different kinds of
I/O devices. A lot more needs to be done, but it was an
interesting exploration and a nice first step in my opinion.
So to conclude, what we did was we verified that a working voting machine, you
know, something that captures enough functionality that you can actually run an
election on, will work correctly. It's a combination of formal verification and testing,
and I think the scientific thing that we found was that you can actually do more by
combining testing and verification than either one can do alone.
So for future work, one direction we want to go is just in voting machines, looking
at more complicated designs. So the first thing we did as a student project last
fall, I had some -- I had a student implement the display function in software, just
the display function. So that even in FPGA kits, you can get a kit where you can
implement on the FPGA a soft core and then an ASIC, a dedicated ASIC. The
soft core is like a processor that can run software, and then the ASIC is
something that can run any circuit. Any logic that you want. So in this case what
we're doing is we're partitioning the display function, the touch screen display
function to run on the soft core, and then we were trying to implement the voting
machine logic in the ASIC. And so that way the idea was that if we can get a
better display, a nicer looking display with colors and so forth, and that will be in
software. And the remaining, the remaining proof codes would already have
been done on the voting machine logic. It turned out that proving injectivity,
the fact that the output should be a 1-to-1 function of state and contest
number, turned out to be a really hard problem. So we tried using CBMC and we
tried using other solvers, and it just didn't, we were just not able to prove
injectivity. It turns out to be a quantified formula. And I can tell you more about it
later, where you quantify over all possible pixel locations.
But anyway, that's one thing that we did. So I think verifying this in
software is already quite challenging, which is probably good because we
have interesting software verification problems there.
The second is dealing with more complicated ranking voting schemes which I
already talked about in the context of Sumit's question. And there are many
more interesting variants on voting machines. But the other thought that I had
was, if you take out voting machines, this is just a problem in verifying an
interactive system. So something where the human interface is important.
And especially in the area of embedded systems, there are lots of systems, a lot
of safety critical systems where failures happen often because of
misinterpretation of the input/output interface. And there are, in my exploration of
this topic, there is this area in avionics, which is called mode confusion, which
has to do with the fact that pilots think that the plane is in some mode when it's
actually in some other mode, similar to this problem. And in medical devices
now there are more and more software-controlled devices, and there are
papers in the medical journals that talk about things where the
doctor didn't really understand what the display was saying and therefore they
administered the wrong dosage or something incorrect. So I think there are
such problems that arise in other fields. And so the question is whether any of
these independence and determinism principles could also work there.
Of course, those systems are also much more complex. So but anyway any
thoughts on the work I presented or on these future work directions are most
welcome. And I'm meeting some of you in the afternoon and I look forward to
lots of interesting discussion. Thank you.
[applause]