>> Francesco Logozzo: Okay. Good morning. Thank you for coming here. It's a pleasure for me to introduce our speaker, and let's start with [indiscernible]. This summer, me and Manuel [indiscernible] ran SCS and we had the same person as invited speaker, and you know, at that session the person who introduced him, Nicola Boxer, was kind of angry with me and with Manuel, because he said, okay, introducing the invited speaker is a big honor, but he made us do it with such a long name that it's hard to introduce. And now I understand why Nicola was a little bit nervous, because I have to pronounce, to say, his name, so: Sriram Sankaranarayanan.
>>: Yes. Perfect. [laughter]
>> Francesco Logozzo: I have to say I read it, but to be honest Nicola learned [indiscernible]'s name, so he spent all day, and probably the night, practicing it. Okay. No. So enough with the jokes. I think everyone knows Sriram has made great contributions to the field of verification and static analysis, and I think everyone among us knows about the templates that were introduced by Sriram and all the work he did on numerical domains during his PhD with [indiscernible] and Henny Sipma, and then he moved into industry, at NEC, where he worked on a real software static analyzer used [indiscernible] from my understanding, and then continued his career at the University of Colorado Boulder, where now he's trying to prove that the human body is correct, so you are trying to verify the human body or something like this.
>> Sriram Sankaranarayanan: I hope not. But we will see.
>> Francesco Logozzo: That's good. Okay.
>> Sriram Sankaranarayanan: Thank you. Thank you Francesco for the kind invitation and
thanks for having me here. I had a wonderful time yesterday and hopefully today promises to
be a great day. Today my talk is about this title, optimal falsification for cyber physical systems.
And some of the slides of the talk were from when this was a talk given to a broader audience,
so some of these slides are going to be childish. I ask you to forgive me for that. What are
cyber physical systems? We all have an idea of what cyber physical systems are. These are
systems which interact with [indiscernible]. These are software systems that have a physical component to them, so if you think of an artificial insulin delivery system, it controls the human body, which you can think of as a physical or biological system, through software that controls how the infusion pump infuses insulin. Software is becoming a huge deal in automotive systems, where a lot of functions that were done by mechanical parts are now software controlled. It's called drive by wire. Software is coming in a big way to the power grid, like active
micro-grid power systems and these are very important, so it's very important for us to get the
software correct. But it's very difficult as well to get the software correct, so I thought I would
go to the National Highway Traffic Safety Administration, which has a bunch of recalls, and just pull out a couple of them, a small sampling of software related recalls; you can find those if you do a Bing search on automotive software related recalls. One of them says due to a
software problem the door may unexpectedly open while the person is driving the car, which
seems like a very strange thing. What does software have to do with the door of a car? But it's
a security system in a car. The software controls whether a door opens or closes and now the
door can open while the person is driving.
>>: That's not a problem if you are wearing your seatbelt.
>> Sriram Sankaranarayanan: Right.
>>: [indiscernible] take fire? [indiscernible] took fire yesterday…
>> Sriram Sankaranarayanan: Sure. Okay. It's fine. No problem if you are wearing a seatbelt,
but now look at this one.
>>: Let's just say another reason to wear your seatbelt. [laughter]
>> Sriram Sankaranarayanan: Fine. Another reason to wear your seatbelt, but look at this one.
So this one says it is possible "for the electric motor to rotate in the direction opposite to that selected by the transmission" due to a software problem. The reason is not that they don't know how to build software. It's just that when you build it there are so many bugs, and you may ask me, okay, we've been doing software verification for the last 20 years, what's different about these bugs? Two things are different. These are functional correctness properties. These are not buffer overflows. These are not null pointer errors, which we know how to take care of and hopefully, you know, there will be no more talks on these subjects, of null pointer errors, buffer
overflows thanks to all the good work that you guys do. But there are still going to be these
bugs left, so our job is not done. And these are properties of the closed loop system. These are
not properties of just the software alone. These are properties of the software and the physical
artifacts that it controls. Without talking about the physical artifacts you cannot just verify this
from inside the software. That's the main point I wanted to make with this.
>>: A car is no longer a car. It's a computer with wheels, I think, is a good way to think about it. Everything you knew about a car from, sort of, racing go-karts is actually completely falsified; it can be completely false about a modern car now.
>> Sriram Sankaranarayanan: Exactly. A modern automobile is becoming more and more
electronic.
>>: They used to resemble a go-kart. So if you were in a go-kart you would go, oh, I'm like playing in a car, and then you got in a car and it was actually very similar, because it was physical.
>> Sriram Sankaranarayanan: Yes. And most cars you press the brake and you are just sending
a signal to a computer that's going to then control the braking system. There's no direct
connection between your brake pedal, unless you have a really very old model of a car, there's
no direct connection between the brake pedal and the drums. There's no mechanical
connection anymore.
>>: That's because, as I understand it, it won't roll if it has no power. So Teslas get braked: if the battery is drained, it's impossible to roll the car.
>> Sriram Sankaranarayanan: Right. Probably. I mean there are systems like the latest Boeing
aircraft can actually override the pilot. The autopilot can override the manual pilot. It used to
be the other way around, so now they place more trust in their computer systems than in the
human, so things are very interesting. What is verification in all this? What is stopping us from
just solving this problem with all the tools that we have? That's going to be most of what my
talk is about. I'm going to come up with a somewhat different view of what we can do as a first
step towards getting verification on track. You know, someone has to start and something has
to be done, so as a first step we are going to try and approach the full force of this verification
problem through model-based design, and I'll show you what we have done so far. It's going to
be very simple. It's mostly testing, pretending to be falsification, so I'll talk about that and I'll
give you a couple of case studies on which we have had some initial successes. We haven't
solved anything, but we have hopefully made a step towards, you know, doing, thinking about
these problems and solving them. The first challenge when you think of these systems is
modeling. What do you need to model? You need to model the computation and you need to
model how this computation interacts with the physical subsystem. But this is nothing new.
We know about hybrid automata. A lot of people pioneered this in the ‘90s and we spent 15 to
20 years looking at hybrid automata, maybe more than that, 25 years looking at this hybrid
automata model which has these states. Inside each mode there's an ordinary differential
equation that says how things evolve inside the mode and there are these guards that go from
one mode to the other. We know hybrid automata. We know what's decidable and what's
undecidable, how to model check, how to do static analysis, and so on. But all of this is no
good when you come up against this beast, which is called a Simulink/Stateflow diagram. If you go into industry, this is what people build [indiscernible]. It is equivalent [indiscernible] to a hybrid automaton, and you can prove that theorem, but it's going to be a hybrid automaton that you cannot write down. It's going to have more modes than the number of atoms in a gas at standard temperature and pressure, because these are not easy to translate to hybrid automata and a lot of work needs to be done. So what stops us then? Well, they
could have continuous input signals. They have discrete switching of course, otherwise, they
don't become hybrid. They have continuous dynamics, so each of these blocks in the picture just expands to a huge block when you double-click on it, right? They are nonlinear; linear systems are much easier to deal with. And to take the cake, the semantics are unclear, but that's
not the huge problem. You could say that the semantics are unclear in some corner cases. It's
quite clear in some of the cases but there is inadequate documentation. It keeps changing from
version to version and there is very little front-end support. This is not the world of C or C#
where we have nice front-end tools to parse these things, to load them up into intermediate representations, to simplify them. We have to tackle these diagrams in their full force. They are really hard to bring inside a static analysis infrastructure, and so there is none. There is no static analysis infrastructure.
>>: In the top right corner, the rest is within the expressiveness of hybrid automata, correct?
>> Sriram Sankaranarayanan: It's within the expressiveness of hybrid automata, but it's outside
the expressiveness of most verification tools. So when they start to combine arbitrary functions
that are defined by tables and so on, most verification tools are not able to do them and they
compose these functions to make huge functions, so they can very easily defeat our verification
tools.
>>: So there's nothing [indiscernible] of a practical…
>> Sriram Sankaranarayanan: Decidability-wise, the value is very low; two variables in a hybrid automaton is already undecidable. We know from the '90s what you need to get to decidability, and it's really very little, right? These are well into the undecidable frontier, so nothing definite can be said about verification or falsification, but we still try. It doesn't stop
us from trying. The goal for this talk is I'm going to try and verify these complex systems, but
I'm going to use simulations as the one new way of obtaining information about the system. So
given up front, let's restrict ourselves. This is a harsh restriction, maybe an unnecessary one, but let's still restrict ourselves to this. Let's say simulation is the only thing that we can do. Then can we still do something that's better than randomly testing the system? That's going to be my premise for this talk. It's going to be hopefully nothing deep, but let's see how much we can get and where we break when we start from simulation. And this is something like blackbox
falsification. You are treating the system as a black box. That may be bad but it's convenient to
do so because we can start to get tools that work on real models right up front, so it's a very
practical thing to do. We will presume that any system they give us is flawed and our hope is to
search for the violation systematically. If we find one it's going to be golden. We can go to the
developer and show them the violation. But if we don't find one, then at least we have made our effort. We have run for many days, and that's what most model checkers do anyway. After running for many, many days, you still may not have exhausted your state space, so you press Control-C and stop the engine. In many practical model checking examples that's how you end. I mean, if you look at explicit state model checking with SPIN, you never finish on big examples. You either find a violation or it just keeps running, and you just gain confidence because it has run for five days and found nothing. The big question then is where should we search and how should
we search? Ideally I would like to use an SMT solver and do BMC. We all saw the talk yesterday where they did Coral and they did some nice unwinding and gave it to an SMT solver
and SMT solvers have inspired a lot of our thinking process, but we are nowhere close to doing
this because we would immediately get into non-linear theories and immediately get into
trouble. So there are two ideas that I will present. One is we are going to reinterpret temporal
logic, and I'm going to call this the robustness interpretation of temporal logic. So what are temporal logics? The logics we study take a trace and give us true or false: is this a property violation or is it not? But what we are going to do is reinterpret the [indiscernible] with real values, and we are going to do it in a systematic way where properties
can be proved. The Boolean interpretation can be compared with the real value interpretation
and so on, so we are going to systematically do that. Once you change a Boolean value to a real
value, what we will now see is that it is very natural to use optimization as a search procedure, and
some very interesting properties of cyber physical systems can be exploited once we do that.
So I'll show you how those two things can be done and I'll show you some better ideas on how
to do the search that are also inspired by thinking about how SMT solvers do this search. I'll
give you some flavor of how we are proceeding with that. Okay. So let me start with the
robustness interpretation. This was work that was originally started by Georgios and I joined
with Georgios on doing this robustness interpretation, so I think it's a very cool piece of work. I
really like the way the issues have been brought out here. The way the standard picture works
is okay. I already told you. We have a simulator. It produces a trace that goes into a property
monitor. Now think of your property as an LTL property or a metric temporal logic property or a real-time temporal logic property of your choice; the property monitor checks it and you get a true or a false, and we are after false. That's why we are doing falsification. Okay? Now let me take
an example to motivate this. So in this example, this is a very simple example, but it's also a
complex example in another way. This is a small example but it can produce very complex
behaviors, and it's called a Van der Pol oscillator. I just brought it up. The green or blue box are the initial states, and our goal is to find a trace that reaches the red box. I have shown you around 10,000 simulations, or some such number, and you can
see that the property is clearly violated. So this is not meant to be a tough problem. It is meant
to illustrate what we are seeing. Hopefully if this is clear the details of the system are not too
important to us and this could be any system. Now let me do this plot to motivate what we are
after. So in this plot, the x and y axes at the bottom are our initial conditions, between -.5 and .5 in each coordinate. For each initial condition point I plot +20 if the property is true, shown up there in pink, and -20 if it's false, shown down there; the +20 and -20 are just to make Matlab show it on a big z-axis, and you can see that thing down there. This is our search space; I visualized it for you. The search space looks quite flat, so if you are an ant at the top and you randomly choose a point, with very high probability you are going to land in the pink zone at the top. But if you are there, you don't know whether to go left or go right, which is the direction you should search towards falsification. Such a question may not make much sense in program verification, but it makes a lot of sense in cyber physical systems, because you always think about the neighborhood of your input space: suppose I tried input .5 and .4, should I then try .6 and .5? That's how you think. In the programming world, nearby inputs could produce something wildly different, but in a cyber physical system you expect them to produce something similar, and that's called continuous sensitivity to initial conditions.
>>: [indiscernible] put some ping-pong balls on it and shake it and do some simulated
annealing, right?
>> Sriram Sankaranarayanan: That's right.
>>: And pop over [indiscernible]
>> Sriram Sankaranarayanan: That's right. But if you do simulated annealing on this, it's just
going to do a random search in the pink zone, because the way the search surface looks, it's a cliff and you don't know where the drop-off points are. This is the first thing that we're going to do. We're going to change how we view this: instead of the Boolean interpretation, does the trace violate the property, we ask how close does the trace get to violating the property. What is the distance between the trace and the property, and can we define this distance systematically? We have to make it systematic, without bringing in an arbitrary definition of closeness, or else we would get something that looks quite random and won't give us the answer that we want. To motivate that idea, let me take up two traces. In the bottom left corner, the circle is the initial
condition and there are two traces, trace one and trace two. That red box is what we need to
hit to violate the property of our interest. You can look at it and you can say trace one is so far
away from being a violation, but trace two is so close to being a violation, but the Boolean
interpretation is hiding this, because it just gives you true for both traces. It doesn't give you any more information, such as how close each one is to being false. But can we make a definition that actually gives us how close a trace is to being false? We can do this for the kinds of systems we are interested in, because the state space is real valued. And what we define here is the
notion of the robustness of a trace, which we say is the radius of the cylinder around the trace,
so that any other trace in the same cylinder has the same valuation as the trace that you are interested in. In this case the blue trace has a valuation of true for the property that I do not hit the orange region. Therefore, I put the cylinder around the blue trace
and you can see how the cylinder is quite natural. It's formed by making sure that its
boundaries just touch the unsafe region. And then any other trace like the orange trace that
I've shown inside, that lies inside the cylinder, has the same valuation for the property. The
radius of the cylinder is the number that we will call robustness. If the property is true we
would make it a positive number. If the property is false we'll make it a negative number. So
this is basically the formal definition of robustness, and the nice thing that we can now show is
the following. We can define robustness as a function that takes traces to real numbers and
the idea is we connect it up with the Boolean value semantics like this. If the trace satisfies the
formula, we ensure that the robustness is strictly positive. If it's
strictly negative then the trace doesn't satisfy. If it's 0, then it's actually a corner case where
the trace either could or could not satisfy it, and that's one of the problems with this definition, but it is so rare to get exactly 0 that we don't worry about it. So instead of Boolean semantics, we
now go to real value semantics. How is this real value semantics defined? The cool thing is you
just take the way you define Boolean semantics and you change a few parts of the definition
and out comes the real value semantics. So how does that work? Instead of saying whether a
point satisfies an atomic proposition p, you look at the signed distance between the point, that is the state, and the proposition. The distance between a state and a set is well defined, and the signed distance means that if the state is outside the set it is a positive number, and if the state is inside the set it is a negative number. So instead of saying that a state x satisfies an atomic proposition p, you ask: what is the signed distance from x to p?
>>: The point and the set and the distance between the point and the frontier?
>> Sriram Sankaranarayanan: Of the frontier of the set.
>>: Okay [indiscernible]
>> Sriram Sankaranarayanan: It's called a Hausdorff distance.
>>: Yeah, but you can have two [indiscernible] the closest to the farthest or the average?
>> Sriram Sankaranarayanan: Yeah, so this would be the closest. Yeah. So you use the closest distance from the point to the frontier of the set as the point-to-set distance. And we make it signed so that negative robustness means violation and positive means satisfaction; we make sure that the sign convention respects that. Now or becomes max, and becomes min. Negation becomes, you know, taking the minus of the robustness. And temporal operators like box, if you remember from temporal logic, can be expanded out in terms of and, or, and what holds in the next state. The same expansion carries over: or becomes max, and becomes min, and so you immediately get the recursive definition that you are really interested in, and everything works out nicely. You just change or to max, and to min, you change the temporal operators accordingly, negation becomes a minus, and then you get the robustness, as long as at the atomic proposition level you look at signed distances. It's a very simple modification of how you think of temporal logic, and suddenly you get this number that makes a lot of sense.
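To make the recursive definition concrete, here is a minimal sketch in Python, not the actual Taliro implementation (which is Matlab-based), of the two ingredients just described for the simple safety property "always avoid an unsafe box": the signed distance to a hyper-rectangular atomic proposition, and the rule that "always" becomes a minimum over time. It also previews the optimization-as-search loop discussed later in the talk. The Van der Pol dynamics match the running example, but the unsafe box, the mu value, the time horizon, and the choice of dual annealing are illustrative assumptions, not the settings used in S-Taliro.

```python
# Minimal sketch of robustness for "always avoid the unsafe box" and of
# falsification as minimization of robustness over the initial-condition box.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import dual_annealing

def van_der_pol(t, state, mu=1.0):
    x, y = state
    return [y, mu * (1.0 - x ** 2) * y - x]

def signed_dist_to_box(point, lo, hi):
    """Positive if the point is outside the box (distance to the box),
    negative if inside (minus the distance to the box boundary)."""
    point, lo, hi = map(np.asarray, (point, lo, hi))
    outside = np.maximum(np.maximum(lo - point, point - hi), 0.0)
    d_out = np.linalg.norm(outside)
    if d_out > 0.0:
        return float(d_out)
    return -float(np.min(np.minimum(point - lo, hi - point)))

def robustness_always_avoid(trace, unsafe_lo, unsafe_hi):
    """Robustness of G(not unsafe): 'always' becomes a min over time, and
    the atomic proposition becomes a signed distance to the unsafe set."""
    return min(signed_dist_to_box(x, unsafe_lo, unsafe_hi) for x in trace)

UNSAFE_LO, UNSAFE_HI = np.array([1.6, 2.2]), np.array([2.0, 2.6])  # made-up box

def rob_of_initial_condition(x0):
    sol = solve_ivp(van_der_pol, (0.0, 10.0), x0, max_step=0.05)
    return robustness_always_avoid(sol.y.T, UNSAFE_LO, UNSAFE_HI)

# Falsification as optimization: minimize robustness over the initial box;
# a negative value means the corresponding simulation violates the property.
result = dual_annealing(rob_of_initial_condition,
                        bounds=[(-0.5, 0.5), (-0.5, 0.5)], maxiter=100)
print("best robustness:", result.fun, "at initial condition:", result.x)
```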
>>: I have a question about [indiscernible] which one is [indiscernible] for instance,
[indiscernible] atomic proposition for the signed distance?
>> Sriram Sankaranarayanan: Yes. It has to be a metric.
>>: It has to be a metric?
>> Sriram Sankaranarayanan: Yes.
>>: [indiscernible] this is x y, so [indiscernible]
>> Sriram Sankaranarayanan: Correct. Because we use the signed distance -- if it's inside the set we give it a negative sign -- but our distance has to be a metric. We have a paper in which we show that even if it's a quasi-metric, where we drop one of the requirements of being a metric, it still works. But yes, it has to have those topological properties or else it just doesn't
work. You get some nonsense otherwise. All right. So what we then do is the following.
What's the computational complexity of computing the robustness? It's almost the same computational complexity as evaluating the formula on the trace, except for a small change, which is that instead of checking whether a point lies inside a set, you have to do the distance computation. In the general case that's a convex optimization problem, depending on how the distances are defined: L2 distances or L1 distances or L-infinity distances. But there are special cases where the atomic propositions are just boxes or hyper-rectangles -- things remaining within a certain range is an example of a hyper-rectangle -- and then we can make this distance computation rather fast. This would have been a bottleneck, but we can make it fast and make sure that we pick our logic formulas to keep this distance computation fast. We have a tool, I should say Georgios has a tool, called Taliro, which computes temporal logic robustness for Matlab traces, and we extended that to hybrid system traces as well. So that was the first part: getting a distance. Let's see what it has achieved for us. Go back to
the Van der Pol example. We had this 0/1 surface, which wasn't helping us a lot, especially when we were trying to sample it randomly. Instead, let's plot robustness, and immediately you get the surface on the right, which now has some interesting features. It's very hard to see on this projector, but what you can see is that on the right-hand side, in the right-hand top corner, things are elevated.
>>: Your right-hand side?
>> Sriram Sankaranarayanan: Oh sorry, your left-hand side. Your left-hand side top corner you
can see things are elevated. You can see that there is a peak in the center because the system
has an equilibrium in the center. It remains where it is, so robustness is pretty high there
because it goes nowhere near the unsafe set. But you can see how the two red arrows that I've shown correspond with the violations here, and there is a gradient that informs how you get there, so you can do simulated annealing on this surface and you will get a lot better performance empirically than on the 0/1 surface, where it will just behave like a random walk. So that was the main idea in using robustness for falsification. Again, it's nothing deep, but it's building on what is the minimum that we can do and what's better than the minimum. So again you start with your property monitor. Out it goes and in
comes robustness computation. So you have this robustness computation tool, Taliro, and what you do is hook it up to an optimizer of your choice. There's no silver bullet here; we tried many different optimization solvers and each one of them works well on some example. It's the same problem that Coral had yesterday, or SLAM has: which tool, which optimizer do you run? You just run all of them if you can, and that's what we do. There's a bunch of these tools we can run, starting with simulated annealing. We have the cross-entropy method, which is a stochastic optimization method. You can even run genetic algorithms, ant colony optimization. We just threw at it whatever solvers we had. There are no guarantees on performance, which, if you know my research, is not how I like to operate, but, you know, this is the world of -- we started from ground zero and we said we are going to solve hard problems, so let's do it from ground zero. Let's get the low hanging fruit out of the way, which is, I guess, put in an optimization solver and let it find the falsification. Again, we actually got very good results from this, very promising. We managed to falsify examples on which you cannot even get started with any other method, because we are very simple here. We are just testing, and we just have this optimization loop, and robustness is the main thing, the secret thing that gives us the power to test, because it gives us a gradient; it gives us a direction, go this way and test this way, and that gives you better answers. Just to illustrate, here is one case study where we
got some good results, so this is an automatic transmission model that was proposed by two
researchers then at Ford Motor Company. Now Ken Butts is at Toyota. He proposed this as a
tech report of a challenge problem that people in the hybrid system community should try and
solve, and he gave -- so this system models a powertrain which roughly speaking is, you know, it
takes the current throttle position like where your accelerator pedal is pressed; you can think of
it that way. It also takes the initial speed and the road grade, like whether you are going up or down, and it adjusts the torque the engine provides to the wheels and the shift schedule: which gear should I shift to? Should I be in the first gear or the second gear? Which gear should I go to to make sure
that I am not stalling or falling back? It has six continuous state variables, 24 discrete modes.
It's not a very complicated system. It's by no means the most complicated system we have
looked at. Its dynamics are affine but it's still complex enough to defeat most of the tools out
there. We were given some properties like, you know: starting from zero speed, if the vehicle shifts from the second gear to the first gear and then back to the second gear, is that even possible? Or they asked us: if the vehicle shifts into the first gear, it will not shift back into the second gear for at least 2.5 seconds; it should remain in the gear that it just shifted into and shouldn't just keep shifting back and forth. And they wanted to know if these properties were false.
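As a point of reference, requirements like the second one are the kind of thing one writes in metric temporal logic. One possible formalization, my own phrasing rather than the exact formula from the challenge report, is □((¬g1 ∧ ◯g1) → □_[0,2.5] g1), where g1 is the atomic proposition "the transmission is in first gear", ◯ is the next operator, and □_[0,2.5] means "at all times within the next 2.5 seconds". Under the robustness interpretation described earlier, such a formula evaluates to a real number rather than just true or false.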
For property 2, for example, it wasn't very well known whether it was true or false. And what we found out was that it was actually false. We found a falsification. We actually falsified all three properties using S-Taliro. And here is an example of what the robustness
looks like. We managed to plot it for this example. It's just fixing the values of some of the
variables and plotting over the rest. That's the overall figure, to my right and to your left. There you see that yellow region at the bottom; in that little yellow region, if you blow it up, you'll see another little region where the robustness goes negative and you will find a violation. Elsewhere there is no violation. Drawing this figure takes a lot of time, because each of these little dots is a simulation, but S-Taliro just samples a small part of the space and gets there; it gets there in around 700 simulations, but this figure itself took around 10,000 or more simulations. This figure is a little misleading -- you could say, just look at this figure, there's a violation right there -- but S-Taliro has to find it without drawing this plot. This is just for
your illustration. The nice thing is because this is a testing method, we can provide them the
test inputs that show, okay, do this and this gives you the falsification. So it was very easy for
us to convince the people who had formulated this model that there was a violation because
we just had to send them over our signals, so that was nice. The tool is actually available. It's
called S-Taliro. I will give you a link and it supports Simulink/Stateflow models. That's very nice
because it lets a lot of people adopt this tool because we actually support something that they
use. We have tried it on many challenging examples, some automotive examples, closed loop
models of medical devices; I'll tell you a little bit about them. One of the projects I used it for was searching for failure modes of insulin infusion pumps, where we looked at how to find failure modes of these models and how to do parameter synthesis of patient parameters and control parameters using this kind of search. So we are also trying to get into some synthesis
using S-Taliro. But what I will do instead of talking about these examples is I'll look at
something that was actually inspired by the way SMT solvers work, and that inspired how we think about falsification a little bit more. We now call this the multiple shooting method; we initially called it trajectory splicing. This was inspired a little bit by thinking about how an SMT solver would approach the problem of falsification. Multiple shooting is what control theorists said was the standard name for it, rather than trajectory splicing, so that's why we call it multiple shooting. So what is multiple shooting and what is single shooting? Single shooting
is a term that says we fix all the initial conditions and all the signals and then simulate the
system to find a violation. That's how S-Taliro does its search. It needs everything, all the
inputs and then it runs the simulator, then it obtains a trace and then it finds a violation and
you might be wondering is there any other way to do it. The other way to do it is like this and
this is partly inspired, like I said, by SMT solvers: you find small segments of traces, little segments of traces that can be disconnected, and you try to join them up. We call these segmented trajectories and we call the process of joining them trajectory splicing. And the magic is that when you do things this way you can get much faster on some of the benchmarks, and let me explain what this is doing. You can think of choosing the initial conditions and all the signals as saying, let's solve a satisfiability problem by up-front fixing the entire model and then seeing if the model satisfies the formula. But SMT solvers don't do it that way. They come up with partial assignments that fix parts of the model and see if they can learn conflict clauses or backtrack, so they don't search the entire space up front. They search over partial assignments and see whether they have to backtrack. So this was, you know, a very naïve way of saying let's also search for partial assignments. Let's look for parts of a trajectory and see if we can join them up to make a full trajectory. Okay.
How does this even work? The way this works is like this. Suppose you have these split trajectories where there are gaps between the segments. What we will let the solver do is search in the space of these split trajectories, but use these gaps as a cost function; I'm just showing you the version for safety. You can also extend this to other temporal logic properties, but I will not complicate the presentation here. If you can minimize these gaps and eventually get them down to 0 as part of your optimization, then in the end you will have a complete trajectory where the state at the end of one segment matches the state at the beginning of the next segment, and you will be able to join everything up. Okay. So there are some nice advantages to setting up the problem this way, in
multiple small segments that, you know, you can join together and get a violation. So what our
tool does is start with some such segments that it finds through random simulation. It will randomly simulate the system for part of the time and then just take a random jump, randomly simulate the system, take a random jump, until it finds a violation, and then it will join these together into a trajectory with gaps in it. But you can't give that to the user, because it is nonsense right now. If you can narrow the gaps by using optimization, then you can give it to the user. So that was the whole idea. Our hope was that we don't even have to narrow the gaps down to 0. If we can get them below double precision, then what does the user care? If the user can't see it in their simulation, if it's below the precision of a floating point number, then we are done. So we use that as a heuristic: let's narrow it down to 10 to the -10 and then give it to the user, and the user's simulator will usually be able to reproduce it. So that
was the idea. How does this work? It's very standard, actually. We set up the optimization problem like this. The cost to minimize is the distance between the end of one segment and the beginning of the next: x_i is the starting point of a segment, t_i is how much time I spend simulating it, and F is the system simulation function, so the gap is the distance between the endpoint of one segment, F(x_i, t_i), and the starting point of the next segment, x_(i+1), and the sum of all of these distances becomes my objective. Sorry?
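For the record, the objective just described is: over segment start states x_1, ..., x_N and durations t_1, ..., t_N, minimize the sum over i of ||F(x_i, t_i) - x_(i+1)||. Here is a minimal sketch of that setup; it is not the Matlab/fmincon implementation from the paper. It uses a toy two-dimensional system, fixes the segment durations, omits the constraints tying the first segment to the initial set and the last one to the unsafe set, and lets the optimizer fall back on numerical gradients instead of the analytic ODE sensitivities discussed next.

```python
# Sketch of the trajectory-splicing objective: segment start states are the
# decision variables; the cost is the sum of gaps ||F(x_i, t_i) - x_{i+1}||.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def F(x, t):
    """Simulate the system from state x for time t and return the end state."""
    dyn = lambda _, s: np.array([s[1], -s[0] - 0.2 * s[1]])  # toy dynamics
    return solve_ivp(dyn, (0.0, t), x).y[:, -1]

N, DIM, SEG_T = 4, 2, 1.0  # number of segments, state dimension, fixed duration

def gap_cost(z):
    """z packs the segment start states x_1..x_N; the cost is the sum of
    squared gaps between the end of segment i and the start of segment i+1."""
    xs = z.reshape(N, DIM)
    return sum(np.linalg.norm(F(xs[i], SEG_T) - xs[i + 1]) ** 2
               for i in range(N - 1))

# Initial guess: start states of randomly chosen segments (gaps allowed).
z0 = np.random.uniform(-1.0, 1.0, size=N * DIM)
res = minimize(gap_cost, z0, method="SLSQP", options={"maxiter": 200})
print("residual gap cost:", res.fun)  # near zero means the segments splice
```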
>>: [indiscernible]
>> Sriram Sankaranarayanan: We choose the distance. So we do an L2 or L1 distance; in this case we do L2, the usual distance, not signed anymore. We are just looking at safety; I am not giving you the full picture here. And what we now do is compute the gradient, so we can actually compute the derivative of these things with respect to x_i and with respect to x_(i+1). It sounds very surprising, but it's actually possible to do so. In our paper we derive it. It's based on some very standard ideas in ordinary differential equations. We do the derivatives; we can also calculate the second derivatives, and what you get is something called ordinary differential equation sensitivity analysis, which can give us the derivatives and the second derivatives, and now we can use techniques that are well known in optimization, like Newton's method or gradient descent, to try and narrow these gaps. We use random simulations to set up an initial state for the solver and just let the solver go and narrow the gaps down to 0. It's pretty good at doing that, and there are standard functions; we don't have to implement all of this. These are all so standard in optimization that we just take a function like fmincon in Matlab and hand our objective to fmincon, but we also have to give it the derivatives, which we calculate. We also have to give it the second derivatives if we want to use Newton's method, which we also calculate. Once we finish those calculations we can give it to fmincon, and it uses our calculations and finds a violation, or at least tries to find a violation. If it fails to do so, we just sample randomly again, try another initial state, and see if it finds a violation, and so on. Let me just illustrate why one would want to do this
kind of a technique on a different case study and this case study is a small example of an
artificial pancreas controller. It's not the latest one. It's an early example and I'll tell you what
stops us from doing the latest one just in a few minutes. So here the idea is we are looking at a
control system that is controlling an insulin infusion pump that's pumping insulin into a patient who is a type 1 diabetic and so requires external insulin. The patient has a wearable glucose sensor which senses the level of glucose in their blood, and the controller looks at the level of glucose and decides how much insulin the patient needs to get. So it's a closed loop controller
and it's called an artificial pancreas. In this what we have is we have mathematical models of
insulin glucose dynamics in the human body, so the human body has a very nice way of
regulating glucose through insulin and this dynamics is well understood. People have spent
around 30 years modeling the dynamics using differential equations, so that part is well
understood, so we just take one of those models. And people also talk a lot about control
algorithms for controlling the flow of insulin, so we also use one of those control algorithms and
the input to such a model would be the caloric intake of the patient. What is the patient
eating? And the output would be the blood glucose level G of t that the patient has currently.
For the specifics of the insulin glucose dynamics, we just use a very simple model for our example, called the Bergman minimal model. It has some deficiencies, but it's pretty good for experimenting with these systems and seeing how well our software does on them. We looked at a couple of control algorithms, published in 1985 by Furler and in 1991 by Fisher, so we went to their papers, fished out their control algorithms, and implemented them in our closed loop. What we did in this case was take 21 different instances
of parameters. If you go back to this slide the insulin glucose dynamics is governed by some
parameters, some numbers, magic numbers that come from various patients. So what people
have done is looked at patient studies where they have looked at the insulin glucose levels in a
patient over time and they have adjusted these parameters so that the model fits what the
patient actually shows as real data. There are 21 different instances available with the Bergman minimal model; you can get these instances for 21 different patients, or virtual patients. So what we did was try to see whether the controller works on each of these patients.
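As a rough picture of the kind of closed loop being tested, here is a toy sketch of a Bergman-style minimal model under a naive proportional controller. This is not the Furler or Fisher controller and not one of the 21 parameter instances from the study; the three-state form, the parameter values, and the meal input are illustrative placeholders, and the last line just reports the safety margin with respect to the 3 and 22 mmol/L limits discussed next, which is exactly a robustness value.

```python
# Toy closed loop: Bergman-style minimal model + naive proportional controller.
# States: G = blood glucose (mmol/L), X = remote insulin action, I = plasma insulin.
# All parameters and the meal disturbance are illustrative placeholders.
import numpy as np
from scipy.integrate import solve_ivp

P1, P2, P3, N_I = 0.03, 0.02, 1.3e-5, 0.1   # placeholder model parameters
G_BASAL, I_BASAL = 6.0, 10.0                # basal glucose and insulin levels

def meal(t):
    """Caloric intake as a glucose appearance rate: one meal around t = 60."""
    return 0.5 * np.exp(-((t - 60.0) / 20.0) ** 2)

def controller(G):
    """Naive proportional law: infuse more insulin when glucose is above basal."""
    return max(0.0, 0.05 * (G - G_BASAL))

def closed_loop(t, state):
    G, X, I = state
    u = controller(G)                                   # insulin infusion rate
    dG = -P1 * (G - G_BASAL) - X * G + meal(t)          # glucose dynamics
    dX = -P2 * X + P3 * (I - I_BASAL)                   # remote insulin action
    dI = -N_I * (I - I_BASAL) + u                       # plasma insulin
    return [dG, dX, dI]

sol = solve_ivp(closed_loop, (0.0, 600.0), [G_BASAL, 0.0, I_BASAL], max_step=1.0)
G_trace = sol.y[0]
# Safety margin for "no hypoglycemia (G < 3) and no hyperglycemia (G > 22)":
# negative would mean one of the two safety properties is violated.
print("robustness:", min(np.min(G_trace - 3.0), np.min(22.0 - G_trace)))
```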
What are we testing? What are our properties here? There are two safety properties. One safety property says hypoglycemia shouldn't happen, which means that the blood glucose level should not dip below 3 mmol per liter, which is like 50 mg per [indiscernible]; that's the limit below which it becomes very dangerous, so you should not let the blood glucose level go below that. And hyperglycemia, which is that the blood glucose level should not go above 22 mmol per liter, or roughly around 360 mg per [indiscernible]. We had some very
interesting results here and let me take the time to explain these results a little bit. On the left
we tried 21 different patients and we found six examples of violations. Five of them were hyperglycemia violations -- if you go back, that's the level of glucose greater than or equal to 22 -- and one of them was a hypoglycemia violation. So the controller is pretty robust against parameter variations. It's not too bad; I would have expected to see violations in all 21 cases. Let's look at these violations in a little bit more detail. You can see the times we took in seconds, which is roughly like one second; in one case we took 8 seconds, so that's not too bad.
I have another metric here which we call the degree of difficulty which is we tried 10,000
random simulations and we saw of the 10,000 how many gave us a violation. That gives us an
example, a metric of how difficult the property is to violate by our straw man. If random simulation is the straw man we have to beat, then how does random simulation perform on this? You can see how in three of these examples 10,000 simulations yielded nothing. In two of these examples it yielded around 293 out of 10,000 and 84 out of 10,000, so probability-wise
just by chance finding these violations is pretty hard. The interesting thing is we actually do
find violations even when random simulation completely fails to find one or it's pretty rare to
find one through random simulation. Some other interesting things: random simulation in these examples takes around two hours to finish, so 10,000 simulations take around two hours to run, because simulations aren't cheap. They are all expensive, and these two hours are in parallel on a four-core machine, so it's all parallelized and even then it takes a long time to finish. We have a lot more benchmark results like this in our CDC paper. We have one more example, called a NAV example, where we show very similar results across many instances of the benchmark: random simulation takes a lot of time, but we can find a violation fairly fast, okay, which is very interesting. We also get around five to 100x speedups over doing the same thing using single shooting; if we don't do multiple shooting and just do single shooting it is much slower, and we also tried it against simulated [indiscernible], and we get significant speedups over S-Taliro when we do this multiple shooting method. So
this is a very promising idea, but we still haven't managed to make it work on Simulink/Stateflow, primarily because Simulink/Stateflow makes it very hard for us to do the segmented simulations. You can't run a simulation for a short amount of time, reset your state, and then do another short simulation; that becomes pretty hard for us to automate in Simulink/Stateflow. It insists on recompiling the models, so it's a technical issue that we just have: every time we do it, the model compiles again, and that makes it very hard for us to make this worthwhile. Otherwise the recompilation and simulation overhead just swamps all of the cost of the work we do, so that becomes harder, and computing these gradients and Hessians in Simulink/Stateflow still requires some support which they aren't providing yet, but there is some hope that they will provide it soon enough. So what's the future work going to be? The future work is we really want to expand on artificial pancreas verification, and we really want
to make sure that this verification can help people who are developing the artificial pancreas
which is an ongoing effort in many academic labs. Some companies have also started looking into the development of the artificial pancreas. It's a huge industry; a lot of money has been invested. Billions of dollars have been invested in type 1 diabetes research, so this would be a huge deal if we can get these things to market faster using verification. And the state of the art model here is something like Hovorka's model or Dalla Man's model, not Bergman's model, because Bergman's model has some significant limitations in the physiology it can capture. With those newer models we can capture the insulin glucose dynamics with much better validity, so incorporating those models is one thing, and incorporating the control algorithm is another
thing. The control algorithms are now much more complicated. They are what are called
model predictive control algorithms, which involve an optimization solver running inside the controller, so at every time step the controller itself solves a linear programming or quadratic programming instance. Capturing such controllers is going to be even harder. What we have done so far, well, we have taken the top part. That part is easy to do: we have modeled it; we have taken the models from the paper and incorporated them. The bottom part is a really hard problem. Verifying model predictive controllers is really hard. It's going to be the next big challenge if we want to help the industry, because this is what the industry really wants to use, MPC controllers, because processors are fast and they can solve optimization problems in real time, so people want to use this and we don't know how to get there yet. The
hope is that using some of these techniques we may get there. To end this whole long story,
CPS are quite challenging and complicated, and everyone says that. Models of cars, airplanes, human physiology can be arbitrarily complex, and these models aren't human artifacts really; they are a combination of human artifacts and lots of physics and lots of biology, so it's very difficult to say that they can be as intuitive as, you know, in program verification, where we say programs have simple invariants, at least for the kinds of properties we are interested in. Those kinds of things one cannot reasonably expect here, but one can expect other things. One can expect to work in continuous state spaces, to have continuous sensitivity to initial conditions, to have properties like local linearizability, which means if I make small changes to the input, the output changes will be linear in the changes that I make to the input, which is a very nice property. Programs generally don't have it, because we have statements that destroy local linearizability. Control algorithms are often designed to stabilize the system, so most of the runs will be stable and a few of the runs will be unstable, and that actually helps us in verification as well, because as we simulate longer the system behavior will settle into the stable behavior, and so it's very easy to do verification for systems that are stable. With systems that are unstable you have to face a lot of other problems, but stable systems are nice, easy, well behaved, at least through simulations. So the hope is that we can exploit these properties, and more properties which I haven't mentioned here, to make these verification problems easier, even though they are not decidable in general; the tractable cases might require us to exploit these kinds of conditions. So that's all I have to say. We have S-Taliro online, so it's available for you to use, and I wanted to thank you all for listening to me.
[applause]
>> Francesco Logozzo: We have time for a couple of questions.
>>: [indiscernible] structure? This abstraction there [indiscernible] problem.
>>: Who knows; this cone, in the first part of your talk, was already an abstraction. You are talking logic.
>> Sriram Sankaranarayanan: That's an abstraction of the trace, yes. That's right. It's a very
different kind of abstraction. It's not the kind of abstraction we are used to doing. I mean,
normally abstraction takes a system and gives a simpler system. All we are doing here is using an abstraction of traces to give us a number that says how close we are to violating the property. So that is still an abstraction, but I haven't done abstraction on the system yet; that would be a nice idea, right, using solvers on the abstraction to tell us how to start falsifying. Right now I don't have any front ends to take Simulink/Stateflow and do some abstractions, so that would be a good place to start for us. There are already many ideas for how, if you had abstractions, you could use them in falsification, but we don't have any yet. But yes, if you want to think about abstractions in this
work, yes, the use of the cylinder around the trace is an example of an abstraction, but it's a
different purpose of doing abstraction than why we do it in normal verification.
>>: When you say all this work [indiscernible] I was thinking about the interaction with the environment, right? For example, this [indiscernible] the more you accelerate, the more likely you are to get into some failure state. I was wondering if you can put into the measure function, the distance function, maybe the probability of taking some [indiscernible] over others. For example, in the case of the motor, it might be the case that most people, the faster they are going, the less likely they are to accelerate more, so if you can adjust the distance function [indiscernible]
>> Sriram Sankaranarayanan: Yes. We have some ideas about adjusting the distance function.
So the way I talked about the distance function -- I won't bring that slide up; it's way back in the beginning of my talk -- it seems like you use the Euclidean metric, but that's not usually done, so
we really want distance functions to relate to what the distance actually means in the scale of
the system. So I take your question to mean, suppose you have some inputs where you may be
able to make a big change and make very little change to the dynamics of your system, so you
might have…
>>: [indiscernible] really close to an error, but from what I know of the environment I know that it is much more likely that the trace will actually go far from the other one rather than actually getting closer.
>> Sriram Sankaranarayanan: Yes, you are right. That brings the dynamics into the measure. So one of the reasons this was so easy to make work -- and I guess I should have put a
slide on this. Let me try and fish out the slide. Okay. Right here. So one of the reasons this
was so easy was the robustness computation knew nothing of the system that produced the
trace. The problem is, yes, that also means that things like what you are saying, you know, incorporating some system knowledge into how you search, are not possible. But if you can do that, then you can of course do much better. People already know a lot about how to control systems that are nonholonomic, and people know that even though two states look very close in the distance sense, it doesn't mean that you can get from
state A to B. But those kinds of things we can't incorporate into this framework because that
requires breaking the box and looking into the simulator and looking into the model. We
haven't done that yet, but what you are saying is a very good idea. That's what we are hoping
to try.
>> Francesco Logozzo: Any more questions?
>> Sriram Sankaranarayanan: All right. Thanks. [applause]