>> Sumit Gulwani: Let's get started. It's my immense pleasure to welcome Professor Sanjit Seshia from U.C. Berkeley. Sanjit works on a variety of interesting things, including program synthesis and program verification. And today he's going to talk about an interesting application of verification in the space of design of voting machines. In fact, when we scheduled this talk today, 10:30, half an hour earlier than this, then Josh Benaloh, who is organizing a workshop on voting technology in the adjacent room, got back to me immediately and said that we are going to be talking about Sanjit's work in our workshop, so why not let him first speak inside the workshop. So now Sanjit is going to give a more detailed version of the talk that he just gave in there. Over to you, Sanjit. >> Sanjit Seshia: Good to be back at Microsoft Research. So this work, the title is about verifying a voting machine design. But in this talk I would like to explore the work we did as verifying interactive systems like voting machines, where you not only have to verify the logic inside the machine, but also the human-computer interface. So let me talk a little bit more about what we did for voting machines, and then I would love to have a later discussion about any ideas, questions, or comments you have on extending this kind of stuff to interactive systems in general. Okay. So this work is joint with David Wagner and our students, who did all the work. And there is a paper that describes this work that appeared at the ACM security conference, CCS, last year. And it's also available on our websites. So let me start with some background. Many of us, if not all, have used voting machines of one kind or the other, but we are all aware of the problems, perceived problems, with voting machines that have appeared in the news. So let me just go through a few examples. The first one is an article that appeared in the New York Times about a problem called the sliding finger bug.
This was in a voting machine where some people found that their votes were not being correctly recorded. It was on a commonly used operating system that I won't name here. And the problem here was that when people tried to touch the display, the touch screen display, to select their candidate, some of them had their finger dragged across the screen a little bit. And that slight drag was interpreted as a drag-and-drop command. And the machine would go into some weird state when it thought that. So that was the sliding finger bug. The second has to do with the recording of votes. So voters complained that the machines were not registering their votes correctly, and the third had to do with a bug where when the voter tried to select candidate A, the checkmark appeared for candidate B. And so obviously something weird was going on inside the voting machine. Okay. So given all these kinds of problems, why do we want to use voting machines, electronic voting machines? And the reason is that there are also a number of advantages of using electronic voting machines. First of all, they can be configurable for different elections. So more easily configurable than physical voting machines, where you can have different kinds of contests. For instance, you can have a contest where you rank candidates, not just select candidates. Potentially if you have an electronic machine you just have to update the software. You can provide accessibility and usability features. So people could require multiple languages and you can have all of that in one machine. You can have different kinds of I/O devices, especially for people with disabilities. And finally, the election officials expect that the use of electronic voting machines would make counting easier. And so there are many advantages of voting machines, many reasons for why you might want to use one.
But as we all know, many of us are verification folks in this room, that if you use anything that runs even moderately complicated software, then you end up with many hundreds of thousands of lines of code. You can have bugs. Bugs can certainly introduce errors. But they can also allow for attacks by malicious parties. And so you could end up with changing the outcome of a contest. Especially I'm aware of the case in this state in Washington State I think maybe four or five years ago the governor's race was very close. It was something like a few hundred votes. 100 votes that separated the two candidates. So even a small change could affect the outcome of the election. >>: I have a question. Why is this software called malicious [indiscernible] what are the simplest logics. >> Sanjit Seshia: This is exactly the question we asked ourselves, right? So let me talk about it. So what we thought was suppose you want to formally verify a voting machine. We had two options. We could have taken something that was industrial scale voting machine if we were able to get at the code base of one of these. Or we could just throw it all away and start and build our own machine from scratch. And one of the things we wanted to explore was how can you design a voting machine in a way that would make it easier to verify and test? So that's what we did. We implemented a very simple bare bones voting machine in hardware, because we felt -- okay, even writing software, you'd have to deal with more richer class of constructs. So let's just create a voting machine that's a finite state machine. And so we just created design in Verilog. We synthesized it on to an FPGA with a touch screen display. What I'll talk about today is that effort, verifying that voting machine. So I think the main contribution of our work is the first bullet, which is we show that you can combine formal verification and manual testing by humans so as to prove correctness of a voting machine. 
And I'll talk more about why this combination is needed in a few slides. But for the formal verification component, we fell back to techniques that we all know and love. So we used model checking and we used SMT solvers. And one of the interesting aspects of this work is that we used formal verification to reduce the number of manual tests that you need to test the interface. So, first of all, we ensure that you have a finite number of tests. And furthermore, as I will discuss later, it's a number of tests that's polynomial in the length of a voter session. Okay. So let me say a little bit about the kind of voting machine we're talking about. It's called a direct-recording electronic voting machine, or DRE. So in U.S. elections, a single voter session is a sequence of contests. So one contest could be for president. The next could be for a senator. The next could be for the local Congressman. And then you might have local council, and in California we have all the additional initiatives that get added on. Any session is a sequence of these contests. In a contest, the job is typically you have K choices and you have to pick L out of K choices. So for president it's one choice. But for city council you might have to pick three out of 10. That's the form of the contest. You have to pick L out of K. Now, if you just think about how you would do this on paper, right? On paper you would have something like a big newspaper which is cut into boxes. Each box is a contest. And you can think a little bit about who you want to pick. If you're not sure you go to another one and make your selections in some order, even go back and forth before you're satisfied. When you're satisfied, you fold up the paper and put it in the box. So that's what the voter session is. In an electronic voting machine you can go back and forth between the contests by pressing next and previous buttons. At the end, when you're satisfied with all your selections, there's a cast button.
You press the cast button and then you're done. The vote is recorded. >>: So does the back and forth add any particular challenges, and why is this important? >> Sanjit Seshia: Okay. So the question was about whether navigation, the next step, adds any challenges? One challenge that it adds is that if you just think about testing the voting machine, an obvious question is how long should you test it for. Right? So one thing you might want to verify is whether there is some hidden logic such that if I navigate 10 times forward, 10 times backward, then it gets activated. This hidden logic, this trojan, gets activated, then it starts doing something bad. That's why the navigation is also part of the specification. The other thing, of course, is that you have ordering of contests. So how do you know that if I cast my votes in a particular order something weird doesn't happen? So that's the complexity that navigation adds to it. >>: Why do they allow for this particular one? >>: Well, at least -- so I think it's human nature in many cases to be indecisive. You go to a contest, you're not sure what you're going to do, and it's like solving an exam as well. People like to have the ability to go back and forth and so I think it's the same thing. It's a feature that I think is useful. All right. So that's the kind of problem that we're handling. So what we'll be talking about in the rest of this presentation is proving the correctness of a single voter session. >>: [indiscernible] the correctness [indiscernible]. >> Sanjit Seshia: Yeah, that's the idea. I'll talk about what it means to do that. >>: I see. >> Sanjit Seshia: Okay. So here's the outline for the talk. I'll talk about what's the rationale for combining formal verification and testing. I'll give some toy examples of this approach, then I'm going to talk about the results and conclude. So let's start with the first bullet. So let's say we just wanted to use testing. Pure testing.
Then we have the classic quote from Dijkstra, which I think nicely illustrates this point: you can only find bugs with testing if you use it alone. But you can't prove correctness unless testing is made exhaustive in some way. Right? But in particular for this problem, it's not possible to make testing exhaustive if you use it by itself. And as an example, I talked about this case where you can go next and previous, indefinitely. And how do you know that there isn't some logic inside the voting machine that might activate after a thousand such iterations? Now, of course, then we can say, okay, we won't use testing. We'll use formal methods. We can prove correctness with formal methods. So that's fine. One challenge is that the state space is large. So if you take the choices in a single contest, well, it doesn't seem very large. But then if you multiply it across many contests then it blows up. >>: Could you verify -- >> Sanjit Seshia: What is that? >>: What is the property that you verify? >> Sanjit Seshia: I'll get to it. But right now I'm just making a general statement. I think even for a fairly simple voting machine like the one we designed, the state space was quite large, because it explodes exponentially in the number of contests. So one of the things we did was, instead of trying to verify software, we said okay, we'll just go to hardware, and we'll use standard verification tools like SMV to verify our design. But the state space is not so much of an issue. I think a bigger issue is that you need formal specifications. In order to use formal verification tools you need formal specifications. And this is okay for some properties. Some properties you can define precisely. So, for instance, proving something like setting the reset signal, making the reset signal one, will clear the selection state for all contests. That you can prove on the code.
Or you could even prove something like this, that is, the transition function of the electronic voting machine that I get from the code is a deterministic function of a set of inputs. You can prove that. But how do you deal with the user interface? And the user interface has to do with voter expectations. Here's an example of that. Let's say the voter wants to vote for Alice. So she looks at the screen and there's a button next to Alice. She presses the button and then she sees that according to her the region around Alice is highlighted. Therefore she says, okay, the machine must have cast my vote for Alice. Right? That's just one selection. How do you formalize this? It has to do with human perception. It's sort of a computer vision problem. But even there it's not properly formally defined. So personally I don't know how to formalize this. And so that's the challenge with using formal methods to verify the human computer interface. So to summarize, formal methods are great. If you have a formal specification you can prove the correctness. But the voter expectations can't really be formalized. On the other hand, if I did testing by humans and I selected a panel of representative voters, so I showed them every single screen that the voting machine will generate and asked the voters what do you think is the button area. Right? If your panel of voters is representative, then you can argue that if all of them pass the screen, all of them interpret the screen correctly, then the output is correct. So you could use testing by humans to do that. The problem is that you don't know how many tests you need. So how much testing is sufficient to prove correctness? So that's where our combination comes in. We combine the two to get rid of the problems of each one. So for precise properties we use formal verification. To deal with voter expectations we use testing by humans. And I'll show you in the next few slides how we do this exactly. 
So there are two properties of this combination. The first is that, modulo some assumptions, we show the combination of formal verification and human testing is sound: if all the tests pass and all the verification goals are proven, then the voting machine is correct, according to the formalization I'll present soon. And not only that, we can actually use formal verification to reduce the number of tests. Okay. So let me illustrate this approach and exactly what is this combination of testing and verification. All right. So here's just a naive approach to testing plus formal verification. So suppose I formally verify that my DRE is deterministic. That given a sequence of button presses, it will update its state in a unique way. Right? Then if I had an exhaustive set of tests, I can simply apply all those tests. I don't have to do any more formal verification. I just apply all those tests and I know that on election day, if the machine code is the same, it's still deterministic and my tests were exhaustive, so the machine should behave correctly. The reason this is naive is because you can't do exhaustive testing. So the basic problem is you need infinitely many tests. And for each test itself you don't have a bound on how long it might go. So I've already talked about this. But let me just go through an example, just to describe some notation here. So we'll view a trace, a trace of inputs, in this way. So inputs are selections. So here every element of the trace is separated by commas. So here you have the empty set, which means that no candidates are selected. Then we have Alice, which means that the voter selected Alice. And then we have the set with both Alice and Bob, which means that in this contest it's possible to select more than one candidate and the voter has highlighted two names. So this is going to be the form of the trace. And so an infinite trace can be something like this, where you toggle between selecting Alice and Bob.
So you start with no selections and then you select Alice, then you deselect Alice and you select Bob. You deselect Bob and you select Alice. And it can keep going like this. How do you know after a thousand such toggles that some trojan doesn't get activated? How do you know that the logic of the voting machine isn't erroneous? So that's the problem with testing. You don't know how long to test for. Okay. So let me then tell you how we use formal verification to solve this problem. And before that, let me tell you what is the correctness notion. So it's a notion that we're all familiar with in the formal verification community, which is trace equivalence. So trace equivalence is typically between two designs or an implementation and a specification. So it's a similar notion here. You have an implementation of the voting machine. So this is our Verilog design. And then you have the specification, which is something in the voter's mind. So I'll talk a little bit more about this. Just for this slide, let's consider this to be another finite state machine that formalizes what's in the voter's mind. So now what is a trace of the implementation? It's going to be a sequence of the relevant part of the state of the implementation. And for the voting machine context, this is going to be a pair. The first element of the pair is the number, the contest number that you are on. So we just assume that all contests are labeled uniquely. So maybe president is the first contest. Senator is the second contest, et cetera. So that's the contest number. And the second is going to be a set which represents the selections in that contest. Okay. So that's what this pair means. Okay. So let's for simplicity assume that there's only a single contest. So we'll make it simple just for this slide. All right. So that's going to be one element of the trace of this implementation machine, the sequence of internal states.
So given this internal state, the machine outputs a screen, a bit map and the bit map shows let's say there are two candidates, Alice and Bob, and it shows that neither is selected. So they're both not highlighted. Now, what the human does is the human voter looks at the screen and interprets it. And the interpretation of the voter is, okay, I'm on contest one. Let's label it somewhere and nothing is selected. So the machine is initialized. No selections are made. Then the tester presses a button. Let's say the tester presses the button for Alice. What we wanted the machine state should become now contest one and Alice. The machine should output a screen which illustrates that. So Alice is selected. Then the human voter looks at that and says, okay, now I know that the machine is still in contest one and it has recorded my vote for Alice. So these are the traces. This is going to be a trace of the implementation, and that's going to be a trace of the specification voting machine. Okay. And in each case each element of the trace, each element of that sequence is going to be a pair of the contest number and the selection state. So let me talk about this part. So we assume that we have this model of a human voter. So how do we actually -- how do we construct this model. So the way we do it in our work is we separate out the perception part from the internal working. So we assume that each voter has a state machine in their head, which models the voting process. That's something we call the specification voting machine. And then furthermore, each voter has a function that allows them to interpret the touch screen display. So let's first look at the specification voting machine. So the specification voting machine as we formalize is going to be a finite state machine, and the state of that machine is simply the current contest number and the state of selections made across all contests. 
And furthermore we're going to assume that the voting machine is structured in a particular way. So typically when we cast a paper ballot, we don't expect that the way we vote for president is going to influence how our vote for Congressman is recorded. We assume that these are independent. And so we've represented that in our, the specification voting machine formalization. So we assume that there are two main modes, that there are two modes of the machine. The main mode and the cast mode. The main mode is where you start. There's a state machine that controls the navigation. So it has to do with which contest you are on. And then there's a state machine for every contest. So M-1 for contest one and M-2 for contest two, et cetera, and the job of these machines is simply to deal with selections within that contest. And the assumption is that how things work in this contest will not affect the remaining contests. >>: So let's finite state machine is the -- the instruction that may -- because what happens -- [indiscernible] what if someone goes and sits there for the entire day? So are people [indiscernible] voting [indiscernible] people are supposed to come out of the room in a certain amount of time? >> Sanjit Seshia: So there could be -- like I said in the beginning, first of all, this formalization is for the voting machine we constructed. So if you construct a more complicated voting machine, then you'll have to make -- there's been the voting process is more complicated, you'll have to correspondingly change the specification of the voting machine to reflect that. So if there's a notion of a timeout, then an informed voter should know that. And that will be incorporated in their state machine model. So this is a model of the voting process that we implemented. So you can basically go back and forth between contests and at some point you cast and control is transferred to a state machine that deals with recording the cast vote in persistent storage. 
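The structure just described (a navigation state, one small selection machine per contest, and a main mode that hands off to a cast mode) can be sketched in a few lines of Python. This is only an illustrative model, not the Verilog design from the talk: the class names, the button strings `"next"`, `"prev"` and `"cast"`, the 0-based contest numbering, and the choice to ignore a press once L candidates are already selected are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Contest:
    candidates: Tuple[str, ...]  # the K choices
    max_selections: int          # the L in "pick L out of K"


@dataclass
class SpecVotingMachine:
    contests: List[Contest]
    current: int = 0             # current contest number (0-based in this sketch)
    mode: str = "main"           # "main" or "cast"
    selections: List[FrozenSet[str]] = field(default_factory=list)
    stored_ballot: Tuple[FrozenSet[str], ...] = ()  # stands in for persistent storage

    def __post_init__(self) -> None:
        if not self.selections:
            self.selections = [frozenset() for _ in self.contests]

    def press(self, button: str) -> None:
        if self.mode == "cast":
            return  # assumption: input is ignored once the ballot is cast
        if button == "next":
            self.current = min(self.current + 1, len(self.contests) - 1)
        elif button == "prev":
            self.current = max(self.current - 1, 0)
        elif button == "cast":
            self.stored_ballot = tuple(self.selections)
            self.mode = "cast"
        else:
            self._toggle(button)

    def _toggle(self, candidate: str) -> None:
        # Only the selection state of the current contest is read or written,
        # which is the independence assumption between contests.
        contest = self.contests[self.current]
        chosen = self.selections[self.current]
        if candidate not in contest.candidates:
            return
        if candidate in chosen:
            self.selections[self.current] = chosen - {candidate}
        elif len(chosen) < contest.max_selections:
            self.selections[self.current] = chosen | {candidate}
```

For example, with two pick-one contests (Alice/Bob and Eve/Mallory), pressing `"Alice"`, `"next"`, `"Mallory"`, `"cast"` leaves `stored_ballot` holding one set per contest, and the state at every step is exactly the pair of contest number and selections that the talk uses as a trace element.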
So that's the model of the specification state machine, informally described. The details are in the paper, but let me talk about the I/O interpretation function. So we assume that every human voter has a function which is basically a pair of functions, I sub O and I sub I, which have to do with how they interpret input and output. So the output interpretation function, I sub O, this is a mapping from bitmaps, from screens, to a pair of the current contest number and the selection state. So the idea is that the voter looks at the screen and by looking at the screen they try to interpret what is the contest I'm on and what has been selected in that contest. That's what this is. Then the voter has to also figure out what region of the screen corresponds to what button. That's what this function formalizes. So this I sub I is a mapping from pixels, so locations on the screen, to buttons. So the assumption about this function is that for every human there's a consistent way of saying that pixel 4,000 comma 2,000 corresponds to a location inside the button for Alice. Right? So all we do in our work is we assume that for every human voter these exist. But we don't know these functions. We don't assume any knowledge of these functions. We just assume such a function exists. >>: What do you mean by [indiscernible]? >> Sanjit Seshia: No, this is knowledge about the definition of the function. The form of the function. But I don't assume that I know I. What I'm saying is I -- >>: This is a type you don't know exactly. >> Sanjit Seshia: Exactly. This is a type. But I don't know exactly which pixel is mapped to which button. That's inside the human tester. >>: Should there be a unique choice for this? >> Sanjit Seshia: It turns out that different people can interpret the screen in different ways, slightly different ways. So some people who may be a little bit vision impaired may not be able to make out the boundaries very clearly.
So they might only look at the interior of a button and say, okay, I can press within that. So there can be some variation across humans in interpreting this. This is my understanding from just reading stuff about this. >>: So let's go back to the one word [indiscernible] -- can you get misinterpretation [indiscernible]. >> Sanjit Seshia: There's a very classic example in the 2000 presidential election in Florida. A so-called butterfly ballot. If I remember right, the selections were to be made in the middle and the candidates' names were on the side. Then there were arrows pointing towards where you had to do a check or punch-through or whatever. Right? And it seemed like -- so people wanted to vote for candidate A, but the arrow was pointing in a way that they actually ended up voting for B. It can happen even in paper ballots. >>: [indiscernible] the current [indiscernible] uses using some other means? [indiscernible]. >> Sanjit Seshia: Exactly. So what we will claim is if election officials use a representative panel of human voters who represent the whole voting population, and they run our generated tests with that panel of voters, then you'll have some assurance about the correct operation of the interface for all voters. Okay. So here's a toy example of one step in the testing process. So what happens during human testing? So you have your voter there. And that's the screen that the voter sees. So initially the voter looks at the screen and says, okay, so, first of all, before I go there, here I'm going to indicate the internal state of the machine. So right now no candidates have been selected. There's only one contest. So I leave out the contest number. So now the voter looks at the screen and comes up with their interpretation. So they say, okay, system initialized. That means there's no selections made. And there's only one contest. Then the voter says I'm going to press the first button, right?
And the voter thinks that they're voting for Alice. A correct voting machine should record that internally. It should record the vote for Alice and should give some feedback, should highlight the button for Alice and present that display back to the voter. Then the voter looks at this display and updates their interpretation. They say, okay, now I think that the machine has recorded the vote for Alice. And now what we want to check is that what the voter has in their mind equals what's inside the voting machine. So we want to do this check. So at every step of the testing process we need to do something like this. So the way we'll do this, I'm skipping ahead a little bit, is we'll assume that for every distinct output screen that the machine can generate you can then press cast on that screen, or you have some way of pressing a button that will produce a readable output that the human can then look at and say, okay, this is inside the machine and this is what I think and the two match up. And so we have a coverage criterion for testing that deals with this. All right. So so far I talked a little bit about what correctness means, which is the notion of trace equivalence. I talked about the human voter's view of a voting machine's operation, and I talked a little bit about the testing process. Let me return to this question about how do we use formal verification to make testing easier. So here are three properties that we prove on the code in order to make testing easier. The first is that we prove that the DRE is deterministic. The machine is deterministic. So this is a property that is proved on the transition function of the machine. And I'll talk a little bit about this in the next slide. The second property that we prove is that for every output screen of the voting machine, there is a unique internal state. Every distinct output screen has a unique internal state.
What this means is that the output function is an injective, or 1-to-1, function of the contest number and selection state. And you can prove this again formally on the code of the machine. And then finally we have to prove that when the voter casts his or her vote, the selection state in RAM is correctly recorded to persistent storage. Right? So you have to check that the logic that does this copy is correct. So I won't have time to talk about all the formal verification things, but let me talk about proving determinism. So the problem here in general in proving determinism is you want to verify that some state variable V is a deterministic function of K variables, W1 to WK, and nothing else. It's only those K variables that determine the value of V. This could be a single-cycle dependence, a combinational dependence or a one-cycle dependence. So that's the verification task. So as an example, for the selection state of a contest, we want to verify something like this. That the next value of the selection state is only a function of the current value, the contest number that you are on, and whether a button was pressed. And if a button was pressed, which one. So each of these has corresponding variables in the Verilog code. So there is the previous state value. There's a contest number variable and then there is a variable that records whether there was a press and if so what button was pressed. >>: Could you define what it means to press a button. At the hardware level aren't there a lot of sensor outputs when you touch the screen? >> Sanjit Seshia: Yeah. So let me decompose that into two tasks. So the first is at the hardware level when you press the screen somewhere, there is a sensor that is activated and then it probably, in a software controlled system, generates an interrupt, right, of some kind. Or in this hardware system, it raises the level of a signal on a pin. So that logic currently we're assuming is trusted.
What we are verifying is, suppose there was an input signal to the FPGA. The internal logic of the voting machine that's implemented, say, as a circuit will then record that correctly. >>: So you're assuming that by the time you get -- there's one discrete? >> Sanjit Seshia: That's right. It's a discrete input. >>: That's fine. >> Sanjit Seshia: Okay. So how do we prove determinism. So many of you in this room know about SMT solvers. So the way we do this is we formulate it as a satisfiability problem. And so, first of all, I'll define a formula phi which is going to be a function of the current state, the next state, the input variables and the output variables. And so it's basically what you expect. The next state is defined as a function delta of current state and input. And the output is simply a function rho of the current state. We're assuming a Moore machine. Okay. So that's the formula that represents your transition and output functions. And so the kind of check that we do is similar to the kind of check that you would use to prove noninterference, for instance. The idea here is that we assume that there are two runs. Two different runs of the transition function. So in one case you went from S1 to S1 prime on input I1, generating output O1. And in the other case you went from S2 to S2 prime on input I2, generating O2, and these runs are completely arbitrary except that for all variables in capital W, the values are the same in run 1 and run 2. So -- >>: There's something stronger, because for any state, sometimes you may want to prove the definition. All fairly reachable. >> Sanjit Seshia: That's right. We're proving something stronger. You might only want to prove this for the reachable states. And we don't do that. But it turns out for our design this is good enough. Okay. All right. So what we're saying is there are these two runs, and the values of the W variables are equal.
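The two-run check being set up here can be illustrated without an SMT solver by brute force over a tiny state space: enumerate every assignment to the variables, and report a violation if two assignments agree on the W variables but produce different next values of V. The sketch below is only that illustration. The talk's actual check was a bit-vector SMT query on the Verilog design; the toy transition functions, the variable names (`sel_alice`, `contest`, `button`, `nav_count`) and the helper `depends_only_on` are hypothetical.

```python
from itertools import product


def depends_only_on(step, domains, target, allowed):
    """Return True if the next value of `target` is a function of only `allowed`.

    step:    maps a dict of current variable values to a dict of next values
    domains: maps each variable name to the values it can take
    Like the check in the talk, this ranges over all states, not just
    the reachable ones.
    """
    names = list(domains)
    seen = {}
    for values in product(*(domains[n] for n in names)):
        env = dict(zip(names, values))
        key = tuple(env[n] for n in allowed)   # the W variables
        out = step(env)[target]                # the next value of V
        if seen.setdefault(key, out) != out:
            return False  # two runs agree on W but disagree on V
    return True


def toy_step(env):
    # Alice's selection bit toggles when we are on contest 0 and her button is pressed.
    sel = env["sel_alice"]
    if env["contest"] == 0 and env["button"] == "alice":
        sel = not sel
    return {"sel_alice": sel}


def trojan_step(env):
    # Same as toy_step, except a hidden navigation counter can clear the selection.
    out = toy_step(env)
    if env["nav_count"] == 3:
        out["sel_alice"] = False
    return out


DOMAINS = {
    "sel_alice": [False, True],
    "contest": [0, 1],
    "button": ["none", "alice", "bob"],
    "nav_count": range(4),
}
W = ["sel_alice", "contest", "button"]
```

Here `depends_only_on(toy_step, DOMAINS, "sel_alice", W)` holds, while the same call with `trojan_step` fails, because the hidden counter is not among the W variables. An SMT solver answers the same question symbolically instead of by enumeration.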
So if that's true, then the value of V in run 1 equals the value of V in run 2. So you want to check the validity of that formula. And so in our work we do this with a bit-vector SMT solver. Okay. So that's how we prove determinism. Many of the properties can be similarly cast as SMT problems. Some properties are probably best cast as temporal logic properties. And we do so. The paper has a list of all the LTL properties that we verified and those we verified using SMT. That's the formal verification stuff. Now let me talk about the coverage metrics for testing, and there are three coverage criteria. And the idea is that if you generate a set of tests that satisfy these three criteria, then that, along with the results of formal verification, is sufficient to prove trace equivalence, for any choice of input-output interpretation function. All right. So let's go through this. So the first coverage criterion is initial state coverage, which has to do with the fact that the machine, the voting session, should start in a valid state. So at the very beginning where no selections have been made, if you just pressed cast, then the output, the recorded output, should reflect the fact that no candidates were selected. So that's the first thing. You should have a test which simply presses cast in the very first state. Then the next has to do with transition coverage. And the key is this is transition coverage within the state machine for a single contest. So the idea here is if you look at the state machine for one contest, you must press each navigation button from that contest. You should press next and you should press previous. And then for each contest, if you have like ten buttons and you have to pick one candidate out of ten, then you should do every possible selection and deselection. So you should press candidate one and then try selecting and deselecting everybody. Press candidate two and try deselecting everybody, et cetera. That's transition coverage for selections.
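A small generator makes the first two criteria concrete for a single contest. The following is a sketch under stated assumptions, not the test generator from the paper: the function names, the button strings, and the convention of ending every test with `"cast"` so the recorded state can be read back are all choices made for illustration.

```python
from itertools import combinations


def contest_states(candidates, max_selections):
    """All selection states of one contest: subsets with at most L candidates."""
    return [
        frozenset(combo)
        for size in range(max_selections + 1)
        for combo in combinations(candidates, size)
    ]


def single_contest_tests(candidates, max_selections, nav_buttons=("next", "prev")):
    """Button sequences covering the initial state and every transition of one contest."""
    tests = [["cast"]]  # initial state coverage: cast with nothing selected
    for state in contest_states(candidates, max_selections):
        prefix = sorted(state)  # presses that reach `state` from the empty selection
        for button in list(candidates) + list(nav_buttons):
            # One test per (state, button) pair, i.e. per transition.
            tests.append(prefix + [button, "cast"])
    return tests
```

For a pick-one contest between Alice and Bob there are three selection states and four buttons, so `single_contest_tests(("Alice", "Bob"), 1)` yields 13 sequences, including `["Alice", "Alice", "cast"]` for the deselection transition.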
Okay. But the key thing, as I described to you, is that this is transition coverage for the state machine within a contest. So it's actually a fairly small state machine. And then finally we have a third criterion which has to do with output screen coverage. So the goal of this coverage criterion is to check the injectivity property. So the idea is for every output screen that the machine can generate, you first navigate to the output screen and then from that output screen you press cast. And then you'll get some internal state of the voting machine, what the voting machine recorded. And you look at that and you say, okay, for contest I, which this output screen represented, the state was correctly recorded as I thought it would be recorded. That's what this is doing. So if I had just a machine with one contest, then doing this I would have proved the machine correct if I used a representative voter to do the tests. So if I verified the machine is deterministic, I then test every transition, every navigation transition and every selection transition, and a representative human voter passed it, then I can conclude that this machine is fine. But the problem is that once you start having many contests, then the space of possible tests blows up. So in this case you have two contests. In the first contest you pick between Alice and Bob, A and B. In the second contest you have to pick between Eve and Mallory, E and M. And so this graph is basically showing the space of all possible choices. So you could either select Alice and then go to the next contest, where Alice is selected, and then either select Eve or Mallory, or you could select Alice and deselect Alice and keep going back and forth. So this graph represents all the possible transitions that you could take. So what we want is we don't want to have to cover all the transitions of the state machine. Because the number of transitions will grow exponentially in the number of contests.
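The blow-up can be made concrete with a small count. The sketch below is only an illustration under simplifying assumptions: each contest allows at most one selection, and only selection states are counted (the current screen and the transitions between states are ignored). The function names are hypothetical.

```python
from itertools import product

def joint_selection_states(contests):
    """All combinations of per-contest selections (None means no selection)."""
    options = [[None] + list(names) for names in contests.values()]
    return list(product(*options))

def joint_state_count(num_contests, num_candidates):
    """Selection states of the combined machine: (K + 1) ** N."""
    return (num_candidates + 1) ** num_contests

def per_contest_state_count(num_contests, num_candidates):
    """Selection states when each contest is treated separately: N * (K + 1)."""
    return num_contests * (num_candidates + 1)

# The two-contest example from the talk.
ballot = {"contest 1": ["Alice", "Bob"], "contest 2": ["Eve", "Mallory"]}
```

With the Alice/Bob and Eve/Mallory ballot there are 9 joint selection states against 6 when the contests are treated separately. With ten contests of ten candidates each, the joint count is 11 to the power 10, while the separate count is 110.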
So instead we want to do something that's more feasible. Okay. So let's first do a simple thought experiment. So suppose that I had separate physical machines for every contest. Right? Then I could simply test the transitions for each machine independently, right? So why is it that I'm able to do this if I had separate DREs for every contest? It's because the logic for different contests doesn't interact. The choices I make for one contest do not affect how my choices for another contest are recorded. If I had physically separate machines that are not connected in any way, then you could do this. And so the idea would be, okay, if you had these three contests, all you have to do is transition coverage within each machine. But the fact is we only have one DRE, which is going to support all these contests. So the idea is let's prove that the logic for different contests is independent of each other. So what does independence mean? It's sort of the opposite of determinism. So you want to prove that the selections, the recorded selections in contest one, are not affected by any state variables that have to do with contest two or three or four. So they're independent. You want to prove independence of the update function of contest one with respect to the variables in other contests. And you want to prove independence with respect to the code that has to deal with navigation and the code that has to deal with cast, et cetera. So proving independence essentially turns out to be a similar SMT formula to the one that we had for determinism. And so I'll skip the details, but I can tell you exactly what the formula is. >>: What would happen if independence was longer? So, for example, the scenario might be that maybe a person gets [indiscernible] and they can distribute that to a number of characters. Not just choosing one but you can generate defenses [indiscernible]. >> Sanjit Seshia: So that's a different problem.
So I think your question had two parts to it. So one was, suppose there was a way in which you could select a subset of contests in which you will make choices. Right? And so then there were some inherent dependencies. So maybe if you vote in the contest for president, then you must also vote in this other contest for senator. But maybe some voters are ineligible to vote in these contests but maybe they can vote on ballot initiatives. Maybe there are some dependencies like that. So if you had a system like that, then I think -- I don't know how to decompose them right now. Maybe there is some smart design way in which we can try to decompose these. The second part of the question had to do with the kind of election that you run. So I think some European countries have this, where you can rank candidates. So you can say, I'm voting for president, but I'll say A is my first choice and B is my second choice and C is my third choice, and if somebody comes up third, then you allocate the second choice preferences to the other candidates. Right? So that is a different kind -- so that doesn't have to do with multiple contests. It has to do with even a single contest. And there the state space of possible options becomes the number of permutations of the candidates, the number of possible ranking orders. There I think we have a way of again using verification to do this, where if we can prove that the order in which the buttons appear or something like that, right, the order in which I display candidates, is independent of the names of the candidates, something like that. Right? Then basically it means that you can pick one order for testing, and that should work. So you can do things like that to deal with other kinds of contests. So it's a fascinating area, this world of elections, because there's just so much diversity. Okay. So now the structure of the machine is as follows.
So it's similar to the specification voting machine that I showed you. So for each contest you have a state machine. When you press reset, you start at the first contest. From any of these contests you can press cast and end up in the cast mode. Or you can press next and previous and navigate between contests. Okay. And as you see, it has a similar compositional structure to the one we had in the specification voting machine. So the additional goals for formal verification tools are these, which are the independence goals. So we want to prove that the state of one contest cannot affect the state of any other contest. Or updating the state of one contest cannot have an effect on any other contest. Pressing a navigation button cannot affect the selection state of any contest. And pressing a selection cannot cause you to jump to a screen of some other contest. Right? So those are again the properties that we verify, and again we can verify them using SMT solvers. Okay. So given all that, that's the idea of combining verification and testing. So we had all these goals on independence and determinism that we prove with SMT, and some other goals we prove with model checking. We use that to reduce the number of tests, to end up testing each contest independently. So let me talk about the results. So we have a theorem, proved in the paper, which says that if the DRE satisfies the six properties that I talked about, so the six properties, just to recap, are proving determinism, proving injectivity, proving the correctness of the cast record, and then proving independence. So if you do all that, then if we further generate a test suite that satisfies our coverage criteria, and if all those tests pass, then we can be assured that the DRE is trace equivalent to the specification voting machine for the interpretation function that is represented by the voter panel. And vice versa: if the DRE is correct in that respect, then all the tests will pass. Okay.
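The talk casts determinism and independence as the same kind of two-run check: if two runs agree on the variables a value is allowed to depend on, they must agree on its next value. The sketch below illustrates that idea by brute-force enumeration over a toy two-contest machine, swapped in for the bit-vector SMT solver used in the actual work. The toy machine, its variable names (`sel0`, `sel1`, `screen`, `button`), and the injected bug are all hypothetical.

```python
from itertools import product

# Toy machine: two contests with candidates 0 and 1, two screens.
DOMAINS = {
    "sel0": [None, 0, 1],
    "sel1": [None, 0, 1],
    "screen": [0, 1],
    "button": [0, 1, "next", "prev"],
}

def good_step(s):
    """Next state of a toy machine in which the contests do not interact."""
    sel = [s["sel0"], s["sel1"]]
    screen, button = s["screen"], s["button"]
    if button == "next":
        screen = min(screen + 1, 1)
    elif button == "prev":
        screen = max(screen - 1, 0)
    elif sel[screen] is None:
        sel[screen] = button          # select on the current screen only
    elif sel[screen] == button:
        sel[screen] = None            # deselect on the current screen only
    return {"sel0": sel[0], "sel1": sel[1], "screen": screen}

def buggy_step(s):
    """Same machine with an injected coupling between the two contests."""
    out = good_step(s)
    if s["screen"] == 0 and s["button"] == 0 and s["sel1"] == 1:
        out["sel0"] = None            # contest 1's state leaks into contest 0
    return out

def depends_only_on(step, domains, allowed, target):
    """True if the next value of `target` is a function of `allowed` alone."""
    names = list(domains)
    seen = {}
    for values in product(*(domains[n] for n in names)):
        state = dict(zip(names, values))
        key = tuple(state[n] for n in allowed)
        nxt = step(state)[target]
        # Two runs that agree on `allowed` must agree on the next value.
        if seen.setdefault(key, nxt) != nxt:
            return False
    return True

def navigation_preserves_selections(step, domains):
    """True if pressing next or prev never changes any selection."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        s = dict(zip(names, values))
        if s["button"] in ("next", "prev"):
            out = step(s)
            if (out["sel0"], out["sel1"]) != (s["sel0"], s["sel1"]):
                return False
    return True
```

On this toy model, `depends_only_on(good_step, DOMAINS, ["sel0", "screen", "button"], "sel0")` holds, while the same check on `buggy_step` fails because the next value of `sel0` changes with `sel1`. Enumeration only works here because the toy state space is tiny; an SMT solver checks the same implication symbolically.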
So that's the theorem. And the nice thing again now is because we proved independence, we only need a polynomial number of tests. So previously if you had N contests and each contest involved picking one out of K, you need K to the N possible tests, because you have to consider all possible combinations of candidates from different contests. But now, because we proved independence, we only need K times N many tests. And I think this is much more tractable, to actually generate a script which you can hand off to human voters and have them do exhaustive pre-election testing. Okay. So the implementation, for those who are interested, it's actually quite small. It's just a thousand lines of Verilog. >>: Did you say the independence is the number of [indiscernible]. >> Sanjit Seshia: There are -- >>: [indiscernible]. >> Sanjit Seshia: No, because of the navigation. The number of contests shows up there. >>: [indiscernible] find any bugs? >> Sanjit Seshia: So we -- so the answer is no, we didn't find any bugs. But I would put a little asterisk to that, saying that our implementation is actually very rudimentary. And it's actually even hard to test it out very well because the FPGA kit screen is really small and we have these line drawings. It's not ready for deployment in any sense of the word. Not even for a school election, I think. But anyway, I think it was a first step. It was an interesting first step. And also I think, to my knowledge, it's one of the few, if not the first, such ways of combining formal verification and testing. At least I have not seen any paper on this kind of approach before. If any of you know of work, do let me know. Okay. So the Verilog was small. That's one high level bit on the slide. And I can tell you more for those who are interested. >>: FPGA, the main points for verification and testing to verify finite state machine for you when you have very fine testing.
But why should you use FPGA, what is the software for finite state machine? >> Sanjit Seshia: So that's a really good question. So let me say two things about it. One, when you have software, you have to deal with the OS. So the runtime environment. And with an FPGA you have to trust the CAD tools. And so that's one dichotomy. The second is when you do something in software, you need model checking tools, et cetera, that will actually be able to prove those kinds of properties. So when we started out, the student tried experimenting with some tools, like BLAST, and it didn't work very well. But then she did Verilog and SMV and it was great. So that's just the reason why we went that way. I'm not saying that there aren't software tools that may have worked better than BLAST. Maybe there were. But it just didn't work for us that way. The other thing I'll say about that question is we actually did try doing things in software. So I'll say a bit more about this in the future work slide. Okay. So I want to also summarize this talk by talking about limitations. So I think the work we did was just a first step. It was a very small baby step really in this area. And our voting machine is not representative of a real voting machine. So we make a lot of assumptions. So we assume that you can come up with a test panel that's representative. We assume that the physical components are correct, so the FPGA kit, things like the touch sensor work correctly, the CAD tools are trusted, and things like that. So we make all these assumptions. And also the functionality we implement is fairly small. For instance, I was just in the next room in this voting machine workshop. I was preceded, I think, by a speaker from the Washington State election bureau, and he was talking about the importance of support for people with disabilities.
And that opens a completely new dimension because you have different kinds of I/O devices. There is a lot more that needs to be done, but it was an interesting exploration and a nice first step in my opinion. So to conclude, what we did was we verified that a working voting machine, you know, something that captures enough functionality that you can actually run an election on it, will work correctly. It's a combination of formal verification and testing, and I think the scientific thing that we found was that you can actually do more by combining testing and verification than either one can do alone. So for future work, one direction we want to go is just in voting machines, looking at more complicated designs. So the first thing we did, as a student project last fall, I had a student implement the display function in software, just the display function. So even in FPGA kits, you can get a kit where you can implement on the FPGA a soft core and then an ASIC, a dedicated ASIC. The soft core is like a processor that can run software, and then the ASIC is something that can run any circuit, any logic that you want. So in this case what we're doing is we're partitioning the display function, the touch screen display function, to run on the soft core, and then we were trying to implement the voting machine logic in the ASIC. And so that way the idea was that we can get a better display, a nicer looking display with colors and so forth, and that will be in software. And the remaining proofs would already have been done on the voting machine logic. It turned out that proving -- so injectivity, the fact that the output should be a 1-to-1 function of state and contest number, turned out to be a really hard problem. So we tried using CBMC and we tried using other solvers, and we were just not able to prove injectivity. It turns out to be a quantified formula.
And I can tell you more about it later, where you quantify over all possible pixel locations. But anyway, that's one thing that we did. So I think verifying this in software is already quite challenging, which is probably good because we have interesting software verification problems there. The second is dealing with more complicated ranking voting schemes, which I already talked about in the context of Sumit's question. And there are many more interesting variants on voting machines. But the other thought that I had was, if you take out voting machines, this is just a problem in verifying an interactive system. So something where the human interface is important. And especially in the area of embedded systems, there are lots of systems, a lot of safety critical systems, where failures happen often because of misinterpretation of the input/output interface. And in my exploration of this topic, there is this area in avionics, which is called mode confusion, which has to do with the fact that pilots think that the plane is in some mode when it's actually in some other mode, and it's similar to this problem. And in medical devices, now there are more and more software-controlled devices, and there are papers in the medical journals that talk about things where the doctor didn't really understand what the display was saying and therefore they administered the wrong dosage or something incorrect. So I think there are such problems that arise in other fields. And so the question is whether any of these independence and determinism principles could also work there. Of course, those systems are also much more complex. But anyway, any thoughts on the work I presented or on these future work directions are most welcome. And I'm meeting some of you in the afternoon and I look forward to lots of interesting discussion. Thank you. [applause]