>> Neil Pittman: Hello. I'm here to introduce Mehrdad Majzoobi. He's an MSR intern working
with us at the embedded systems group working on FPGA verification and debugging. Let's see
what he has to say.
>> Mehrdad Majzoobi: Hello, everyone. Thanks for coming. I'm going to give like a 25-minute
presentation, and then afterwards I'll give you a quick demo of the tool that I have
developed during my internship. It's not complete yet, but I can show the pieces to you. And
we're hoping to get this tool in the hands of designers pretty soon so Neil and everybody in the
group can start using it in real designs.
So as the title said, we have developed an automated FPGA design verification and debugging
platform. So what it means is we are using FPGA for debugging hardware design at an early
stage, whether it's an ASIC design or so on. So it's actually hardware emulation and
verification on that platform.
So here's a very quick outline of the talk. I'm going to give you the motivation for the work and
tell you what is out there right now, what other designers are using, and the industry is using the
state of the art. And I give a very quick background on assertion-based verification, and then
afterwards I start talking about what we have, the automated debugging and the pieces of the
tool. And after that I demo each of these stages and future work. And at the end I'll conclude the
talk.
Okay. Well, everybody -- anyone who has done hardware design can really relate to these
images here. You know, they may not have the money to destroy the computers, but they really
wish they could. You know, so whenever you're doing hardware design, at the end of the day,
nothing's going to work for you. So you have to look for bugs, and you have no idea where
they're coming from. And to do verification and debugging, you have to verify the correctness of
your design.
So verification and debugging is everywhere: in software, hardware, whatever you want to do.
But what makes hardware verification particularly different is the lack of consistency in structure
of design. So if you're designing an encryption core, another person is designing a
micro-controller, another person is -- so these are completely different designs. So if I can give
you a recipe for that design, that's not going to work for you for another designer doing another
kind of design.
So the lack of consistency in structure/architecture in hardware design makes it completely
different from software design where you're dealing with a fixed CPU architecture and you have
a compiler and everything is almost the same for all software designers. So the lack of a
one-fits-all solution is a big deal.
And the second problem is the lack of observability and controllability in hardware design. So
it's very difficult to know a value of a certain net in your design at a given point in time, unless
you go inside the chip, you know, you hire some, I don't know, moving electron, go there and get
this for me, come back, something like that.
So you have to do probing, and you cannot see inside the chip unless you have -- or you can
simulate it on software, see what's happening in there.
So also there's a long recompile time usually for hardware design. So if you compared the
design of the same size, it usually takes a longer time to recompile results. So every time you
find a bug, you fix it, you have to like, I don't know -- and you lose -- any of those
micro-controller design has to like wait a day, go home, come back the next day. It's not that
bad. You take a day off.
So extreme hardware bugs can require total hardware replacement, especially if you're doing
ASICs. So, in practice, this is going to be a nightmare for industry.
So Intel, it's a very famous example, you probably have seen this a hundred times before, but it's
worth re-mentioning that Intel has lost $500 million in its error in the Intel Pentium processor in
'95 just for a bug in the floating point division instruction in their processor.
And also, interestingly, NASA lost $125 million over the Mars Orbiter for an error in the
software they had running on their unit. But here the interesting observation is that the Mars
Orbiter cost less than what Intel lost in their Pentium processors. So NASA, not a big deal.
Look at Pentium, look at Intel.
>>: [inaudible].
[laughter].
>> Mehrdad Majzoobi: Yeah.
>>: But Intel did it again. NASA didn't.
>>: [inaudible] explosion.
>> Mehrdad Majzoobi: So anytime there is a bug in a processor, it delays the shipping of the
product, you have to like spend so much time on finding that bug, and because of that you may
have to delay your product shipping.
So, once again, the lack of a standard and coherent approach and procedure in a hardware
debugging, and also a lack of tools and automation that can help you point the bugs and fix them.
At the same time, the complexity and large scale of the designs that designers are dealing with today
are much different from the designs of like ten years ago, and it's growing faster than before,
which has made it more challenging to find a little tiny bug in a very huge design.
And insufficient computational power is also a problem, because if you could have simulated
everything, then everything would have been fine.
So there are actually some solutions out there, but a typical hardware designer does not really
have access to those tools. They're very expensive hardware verification tools that Synopsys and
Mentor Graphics have for industry purposes, and nobody really knows how efficient those tools
are. They cannot -- and there are proprietary languages across these tools. There's no consistent
contrast and judgment in between those tools.
Okay. So in a nutshell, you can divide the methodology that is being used in the state-of-the-art
in three parts. And the first approach is using simulation to find out bugs in your design. So that
requires writing test benches. Since test benches can be also self-checking, transaction-based
test benches, you know, giving lots of input vectors to your design and looking at the output.
So the second approach is to use logic analyzer integrated -- logic analyzer inside your chip and
to monitor signals and see what is going on.
The last category is formal methods. Formal methods use formal mathematics to check your
formal description of your system versus what you have designed. And we'll do that for you. So
the problem of formal mathematics is still -- it's a very complex problem, and it might be
computationally infeasible in certain cases.
So also there are hybrid solutions in here. So you can mix simulation with formal methods and
hardware. And any kind of hybrid solution here is possible, simulation based and hardware
based. So ideally you can divide your verification problem into subsets and use simulation,
hardware verification, or formal methods for each part of those things.
So the tool that we are dealing with here is using hardware based and simulation based. And in
the future we might be able to add certain formal methods into the tool to make it a more
efficient tool.
So now a few words on assertion-based verification. So on the previous slides I talked about
three methodologies to find bugs there. But a systematic way to find and verify your design is to
write models and describe conditions in your circuit and then check and verify the correctness of
what you have described using those tools on the previous slide.
So I can write a few conditions, assertions, properties for my design and then verify the
correctness of those using simulation, using hardware, or using formal methods. So it's more
like encapsulating properties and then verifying those things.
So naturally I think in your mind everybody is going through that process, but it's more like an
informal thing. I have this thing in my mind, let's go and check it. But here in assertion base you
have to write things down in a formal way using linear time logic, using conditional statements
and any kind of logic that you can write down on a piece of paper and then check it instead of
running things in your mind. So it has the potential of finding the bugs early in the [inaudible]
level design and then detect them earlier than before.
So I'm going to throw out there the goal of the project, what we are looking at the end, what we
are trying to achieve throughout this project, and then this is more like the user experience kind
of thing.
So I as a designer would like to give the tool my Verilog files, my hardware description language
codes, HDL codes, that I've written to the tool, write down the assertions on a piece of paper. Or
maybe if I'm lazy, I just don't write those assertions. It's optional. I just give the computer my
designs and I tell, okay, tell me what's wrong with my design, right, that's very high level. Give
the computer a design, and computer just tells me, okay, I think your design is not working.
There's something wrong.
This is the ideal case for a designer. So how can we get close to that? Probably it's too sci-fi to
really have that, but we can get slightly close to that.
So given a design, the computer can learn its behavior. So assuming that bugs happen once in a
blue moon, but they happen, but not so often that you can see them, you can detect those outlier
behavior.
So it's like a machine learning approach. You want to find inconsistent behavior in simulation
traces, in hardware traces.
So now the assumption is that if you're -- if bugs happen once in a blue moon, not so often, you
have a good chance to learn the good behavior and isolate the buggy and bad behavior.
So the process that extracts these -- and we use the term mine, which is an established term in the
community, to mine assertions from your design. So it's assertion mining. It's specification
mining.
So after you've mined these assertions, you can pass it through a designer, say, okay, I have
mined these things, some of them are very trivial assertions, some of them are more complex, tell
me if these are good. And say, okay, good, good, good, good, good. Move on.
So after we have gotten these assertions, you have to check them versus your design. You can
do that in simulation, but it's very slow. So we are looking into a faster approach. So we can
take these assertions, synthesize them into state machines. So it's -- I will call it PSL to Verilog.
PSL is a property specification language which is becoming the standard.
So you synthesize your assertion. And you merge those assertions with the ones that the
designer has or might have provided.
So and then you implement those assertions onto the hardware and you let it run, run for a day,
run for five minutes, until something fails. And then you have an interface that tells you at this
clock cycle this assertion failed, and then it -- through software you have the ability to set certain
parameters to your assertions.
So you would like to avoid recompile time of your hardware, so you can always show how this
can be achieved, but you can set certain parameters in your assertions through software without
having to recompile things.
So think of parametric assertions. So you have parametric assertions that you set those
parameters through software. And then you would like to maintain least impact on the original
design. So as you're adding all these debugging peripherals to your main circuit, you don't want
to change the functionality of the main circuit as you're adding things around it or inside it.
So I show how -- what that really means.
Yeah.
>>: When you get an assertion, what do you do?
>> Mehrdad Majzoobi: We check for its correctness.
>>: Okay. Check for the correctness of the assertion?
>> Mehrdad Majzoobi: Of the assertion. So the assertions can come in any kind of logic
statements. But the kind of statements that are common now are linear time logic. So an example
would be if this event happens, it has to stay valid until the other event happened.
Or the whole thing -- you know, the whole statement, eventually something else should happen.
So eventually, until, next, alternating patterns and so on form certain things that
you might observe in your design. So you express those assertions in a given standard logic, and
then you pass it on to the tool.
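[Editor's sketch] The "stays valid until the other event happens" example above can be checked over a recorded trace. The following is a minimal Python illustration, not the tool's implementation; the function name `check_until` and the list-of-booleans trace format are assumptions made for this example.

```python
from typing import Optional, Sequence


def check_until(p: Sequence[bool], q: Sequence[bool], start: int) -> Optional[int]:
    """Check 'p holds until q' starting at cycle `start`.

    Returns None if the property holds. Otherwise returns the cycle at
    which p dropped before q arrived, or len(p) if the trace ended
    without q ever occurring (a strong 'until').
    """
    for t in range(start, len(p)):
        if q[t]:
            return None   # q arrived, the obligation is discharged
        if not p[t]:
            return t      # p dropped before q arrived
    return len(p)         # trace ended and q never happened


# Hypothetical request/acknowledge trace: req stays high until ack.
req = [False, True, True, True, False]
ack = [False, False, False, True, False]
print(check_until(req, ack, start=1))   # None: the property holds
```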
So back to the spec mining. As I said before, it's originally the designer's task to write down the
assertions. But since it's a -- first of all, it's a tedious process, if you work in a -- I haven't
worked in verification department in Intel, but I have friends that have told me they give you a
code, there are no assertions to check, the designer left, you know, quits his job, he didn't write
the assertion and so on, so you have to like go through the whole cycle, and the pay was not
enough, he didn't write the assertions, stuff like that, you know.
So its original design [inaudible] but because of all these issues, there's a lack of assertions that
give you complete coverage of the correctness of your circuit.
So one alternative approach is to hire a laptop to write assertions for you. And here I have to
emphasize that the assumption is that the design is acting right, is acting correctly most of the
time.
So what happens if the traces -- if you get a false positive, if the tool tells you there is a bug there
and you go and dig there and there was no bug, it's okay. You just wasted, you know, a couple
of hours for something that was not an issue.
But if the tool tells you there was no bug, just keep going, and then you tape out your -- you
know, CPU comes back and, you know, you lost a couple of millions of dollars. So false
negatives are not okay. So that should be like set to almost zero, but you can tolerate false
positive based on how patient you are.
Okay. So this is actually work that Wenchao, a previous intern at MSR, has done.
So the tool is that you run a benchmark on your circuit and you collect the output traces
through a simulation tool, ModelSim. You get a VCD dump, which is your simulation traces.
You pass these simulation traces to the spec miner and then the spec mining engine looks for
patterns, alternating patterns, next patterns, until patterns, and eventual patterns. So you also
provide the pattern library to the tool. So mine, look for these patterns in the traces and tell me
when they happen.
And after it has mined this pattern, it performs a reduction on the patterns, because sometimes
the patterns are redundant. You want to remove the redundant ones. Also there's chaining of the
patterns or merging of the patterns to give more efficient assertions. And at the end it ranks the
assertions based on how often they happened or they were observed in the traces.
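[Editor's sketch] The mining and ranking steps described above can be illustrated for a single pattern type. This Python sketch mines only "next" patterns (whenever a is high, b is high one cycle later) and ranks them by how often they were observed. The trace format (one set of high signals per cycle) and the function name are assumptions; the real spec miner works on VCD dumps and a larger pattern library.

```python
from itertools import permutations


def mine_next_patterns(trace):
    """Mine 'a -> next b' patterns from a trace.

    `trace` is a list with one set per clock cycle, holding the names of
    the signals that are high in that cycle. A pattern is kept only if it
    is never violated, and results are ranked by observation count.
    """
    signals = sorted(set().union(*trace))
    results = []
    for a, b in permutations(signals, 2):
        support = 0
        violated = False
        for now, nxt in zip(trace, trace[1:]):
            if a in now:
                if b in nxt:
                    support += 1
                else:
                    violated = True
                    break
        if not violated and support > 0:
            results.append((support, a, b))
    # Most frequently observed patterns first.
    results.sort(key=lambda r: (-r[0], r[1], r[2]))
    return results


trace = [{"req"}, {"ack"}, {"req"}, {"ack"}, {"req"}, {"ack"}, set()]
print(mine_next_patterns(trace))   # [(3, 'req', 'ack')]
```

In this toy trace "ack -> next req" is rejected because the last acknowledge is not followed by a request, which is the kind of candidate a designer would then confirm or discard.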
So but this is unsupervised learning. Remember when I was explaining the formal methods, how
we can use formal methods. This is where we can use formal methods. After we get these
assertions, if you have a tool that tells you this is an important assertion, because it gives you
very good coverage on a lot of different action, then we pick that.
So to kind of -- the ranking and reduction that we have are kind of -- you're trying to reduce the
number. But if there is an engine that tells you this is a really important assertion because it
gives you a lot of coverage, that's formal methods. It can mathematically verify for you that this
is an important assertion because of that coverage. We keep it.
So the missing part is to apply formal methods to the output of this tool to get the more important
assertions based on the coverage, not based on how often they appear in the output traces.
Okay. Now, after you get these assertions, you have to map them, synthesize them to a state
machine. So this is still a challenging problem because you might end up in a state explosion
state. So you want to efficiently map this logic statement into a state machine with minimal
number of states, because you don't want to use a lot of hardware.
So in a paper in 2005, this problem was tackled in a very efficient way. And so the trick is to
rewrite properties. So you have a logic and say this logic changes if I have this input.
So, for example, say your logic is that if I move to the city, my life would be wonderful. But if
somebody gives you an input, your logic might change. Something like that.
So it's -- say logic P can be rewritten to P5 if the input is this one. And so it keeps writing the
logics until they reappear in the chain. So P1 reappears again here, again here, so the assumption
is that you are going inside a loop in the state machine. So and then it wraps up this chain into a
state machine by observing how you're traversing in the state machine.
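[Editor's sketch] The rewriting idea above (rewrite the property for each input, stop when a property reappears) can be shown on a small temporal-logic fragment. This Python sketch is an assumption-laden illustration, not the algorithm from the 2005 paper: the tuple encoding, the operator names, and the light simplification rules are all invented here, and a real synthesizer would normalize formulas more aggressively to guarantee a small, finite state set.

```python
from itertools import combinations

TRUE, FALSE = ("true",), ("false",)


def AND(f, g):
    if f == FALSE or g == FALSE:
        return FALSE
    if f == TRUE:
        return g
    if g == TRUE or f == g:
        return f
    return ("and", f, g)


def OR(f, g):
    if f == TRUE or g == TRUE:
        return TRUE
    if f == FALSE:
        return g
    if g == FALSE or f == g:
        return f
    return ("or", f, g)


def rewrite(f, inputs):
    """Rewrite property f after one cycle in which `inputs` are high."""
    op = f[0]
    if op in ("true", "false"):
        return f
    if op == "ap":    # signal must be high now
        return TRUE if f[1] in inputs else FALSE
    if op == "nap":   # signal must be low now
        return FALSE if f[1] in inputs else TRUE
    if op == "and":
        return AND(rewrite(f[1], inputs), rewrite(f[2], inputs))
    if op == "or":
        return OR(rewrite(f[1], inputs), rewrite(f[2], inputs))
    if op == "X":     # next: the operand becomes the obligation
        return f[1]
    if op == "G":     # always: hold now and keep the property
        return AND(rewrite(f[1], inputs), f)
    if op == "U":     # until: right side now, or left now and keep waiting
        return OR(rewrite(f[2], inputs), AND(rewrite(f[1], inputs), f))
    raise ValueError(f"unknown operator {op!r}")


def build_fsm(prop, signals):
    """Explore rewrites until every property has reappeared (states close)."""
    letters = [frozenset(c) for r in range(len(signals) + 1)
               for c in combinations(signals, r)]
    states, todo, edges = {prop}, [prop], {}
    while todo:
        state = todo.pop()
        for letter in letters:
            nxt = rewrite(state, letter)
            edges[(state, letter)] = nxt
            if nxt not in states:
                states.add(nxt)
                todo.append(nxt)
    return states, edges


# "Always: if req is high, ack is high in the next cycle."
prop = ("G", ("or", ("nap", "req"), ("X", ("ap", "ack"))))
states, edges = build_fsm(prop, ["req", "ack"])
print(len(states))   # 3: idle, waiting for ack, failed
```

The `FALSE` state is the failure state; in hardware, reaching it is what would raise the assertion-failed flag.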
Okay. So now we have two parts, asserting -- mining the specs, synthesizing them. Now, the
next step is to insert them into the design. What is missing there is how to automatically
attach the probes to the signals that you want to monitor.
So this part of the tool we call a probe insertion. So given the names of the nets, first the probe
insertion tool parses the Verilog design, the input that you provide to the tool, and then it detects
the module hierarchy of your design, and then modifies the module declarations and the
module instantiations, and adds extra ports to them.
So you can, for example, add a wire from this leaf node and route it to the top node, so you can
have an interface, debug interface in your system.
So instead of you going through your code, writing those wires and bringing them down, the tool
will take care of that for you.
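[Editor's sketch] The routing of a probe from a leaf module up to the top can be planned from the hierarchical net name alone. The following Python sketch only lists the edits that would be needed; it does not parse or rewrite Verilog. The dotted path format, the function name `plan_probe_ports`, and the wording of each edit are assumptions for illustration.

```python
def plan_probe_ports(hier_path: str, probe_name: str):
    """List the edits needed to expose a leaf net at the top level.

    `hier_path` looks like 'top.cpu.alu.carry': instance names from the
    top module down, followed by the net to probe.
    """
    *instances, net = hier_path.split(".")
    edits = []
    # Walk from the leaf module up to, but not including, the top module.
    for depth in range(len(instances) - 1, 0, -1):
        child = ".".join(instances[: depth + 1])
        parent = ".".join(instances[:depth])
        # The leaf drives the port from the net itself; every level above
        # it forwards the wire that was just created.
        source = net if depth == len(instances) - 1 else probe_name
        edits.append(f"{child}: add output port '{probe_name}' driven by '{source}'")
        edits.append(f"{parent}: add wire '{probe_name}' and connect it at instance '{instances[depth]}'")
    edits.append(f"{instances[0]}: add output port '{probe_name}' to the debug interface")
    return edits


for line in plan_probe_ports("top.cpu.alu.carry", "dbg_carry"):
    print(line)
```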
Okay. Now that we inserted the core into the design, we would like to see what kind of things --
the interface -- what kind of information we would like to see from the interface. So it's more
like a user experience kind of thing.
So right now the tool will do these things. You provide -- in the software we have the software
interface. You say I want to enable these sets of assertions. Say you have a thousand assertions,
but you don't want to check them all the time. Say, okay, I just want to check assertion 1, 2, X,
and so on.
So you enable those assertions that you want to check, and then you can set the parameters for
your assertions. So think of the notion of parametric assertion. And then you set the scope time,
scoping the time.
So certain assertions have to be frequently checked. Think of an eventually logic. I will
eventually finish this work. This logic has no meaning if there is no time scope there. Right?
Eventually logic will have no meaning if there's no time definition, right? I will finish the
PowerPoint slides by tomorrow, right, but if there's no tomorrow, it's meaningless, right?
So you set the scope time for certain assertions that make sense. For example, if a next assertion,
if this happened next, this should happen, then the scope time doesn't really matter, because next
is we have a clock and after that something should happen.
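[Editor's sketch] A parametric "eventually" assertion with a software-settable scope time and enable bit might behave like the Python model below. This is a behavioral model only, with invented names (`EventuallyWithin`, `scope`, `enabled`, `run`); in the tool the checker is a synthesized state machine whose parameters are written from software.

```python
class EventuallyWithin:
    """'After trigger, response must occur within `scope` cycles.'"""

    def __init__(self, scope, enabled=True):
        self.scope = scope        # parameter set from software
        self.enabled = enabled    # enable bit set from software
        self._deadline = None     # cycles left, or None when idle

    def step(self, trigger, response):
        """Advance one clock cycle; return True if the assertion fails now."""
        if not self.enabled:
            self._deadline = None
            return False
        if self._deadline is not None:
            if response:
                self._deadline = None          # obligation met in time
            else:
                self._deadline -= 1
                if self._deadline == 0:
                    self._deadline = None
                    return True                # scope time expired
        if trigger and self._deadline is None:
            self._deadline = self.scope        # start a new time window
        return False


def run(assertion, trace):
    """Return the first failing cycle, or None if the assertion never fails."""
    for cycle, (trigger, response) in enumerate(trace):
        if assertion.step(trigger, response):
            return cycle
    return None


trace = [(True, False), (False, False), (False, False), (False, True)]
print(run(EventuallyWithin(scope=2), trace))   # 2: response came too late
print(run(EventuallyWithin(scope=3), trace))   # None: response within scope
```

Changing `scope` or `enabled` here plays the role of changing a parameter without recompiling the hardware.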
And we have a method to set the clock frequency in hardware through software.
>>: Why would you turn off any of these [inaudible]?
>> Mehrdad Majzoobi: Because sometimes you write the wrong assertions. And then you
realize I wrote the wrong assertion and it was my fault, so I don't want to go into the tool, delete
it, recompile everything. Just ignore that assertion. Or I understand this is a trivial bug, but I
don't care about -- I don't know.
It gives you the flexibility to not recompile everything while still not caring about certain assertions.
>>: So you can collect multiple bugs and say, okay, I know about that bug already --
>> Mehrdad Majzoobi: Right.
>>: -- stop [inaudible].
>> Mehrdad Majzoobi: Exactly. And also the tool stops everything as soon as it observes the
first failure in the assertion. So aborts, goes to the software, you know, send all these diagnostic
information, halts everything. And also, as you said, you can say, okay, keep going, don't check
this assertion.
>>: I was just going to ask what's the benefit of doing all this in hardware? Just seems like
you're making your life --
>> Mehrdad Majzoobi: It's very much faster. It's a very good question. So if you want to do the
same kind of thing in software, it's going to take at least three orders of magnitude more time, so
it's like a thousand X slower in software. So if you have a larger design.
>>: It just seems like if you really, really, really want it to work correctly, you would build a
formal model for it, you would use a language that had kind of something [inaudible] something
that was really sound for building -- like it's verified and it's very stringent to when it checks
[inaudible]. Seems like if you were to have this much care about every single smaller bug, you
kind of [inaudible].
>> Mehrdad Majzoobi: Well, ideally running formal methods is the way to go. But some of the
problems are computationally intractable with the kind of hardware that we have. So if you run
those proofs on a computer, it might take much longer than this.
>>: [inaudible] assumes that we have -- that you put everything in simulation. So if your --
>>: I guess the hardware existed too.
>>: So if you're [inaudible].
>> Mehrdad Majzoobi: Exactly.
>>: [inaudible] lazy, don't do it.
>>: Right, right, right. I guess that's true.
>> Mehrdad Majzoobi: Yeah. And it's actually a very good point. So you can -- instead of
writing test benches, we have the input buffer and output buffer in the interface. So you can send
the data to whatever computation you want to do, get back the data, do the -- and then compare --
do the same kind of computation in software -- it's more like a co-hardware/software
simulation, and then you contrast the result you get from software.
So you're at the same time sending the data, the circuit is running at speed, and you're doing it --
you're checking its functionality and correctness at the same time.
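[Editor's sketch] The compare-against-software flow described above amounts to running the same inputs through the device and through a software reference, and reporting the first disagreement. In this Python sketch both sides are plain functions; `buggy_multiplier` stands in for the device under test and its bug is invented for the example. In the real setup the device side would go through the input and output buffers over Ethernet.

```python
def first_mismatch(device, golden, inputs):
    """Return (index, inputs, got, expected) for the first disagreement."""
    for index, args in enumerate(inputs):
        got, expected = device(*args), golden(*args)
        if got != expected:
            return index, args, got, expected
    return None


def golden_multiplier(a, b):
    """Software reference model."""
    return a * b


def buggy_multiplier(a, b):
    """Stand-in for the hardware: wrong whenever a is 200 (invented bug)."""
    return a * b + 1 if a == 200 else a * b


vectors = [(3, 4), (17, 5), (200, 2), (9, 9)]
print(first_mismatch(buggy_multiplier, golden_multiplier, vectors))
# (2, (200, 2), 401, 400)
```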
And as Ken mentioned, sometimes it's difficult to simulate Ethernet controller in software, you
have to write it down and so on, but --
>>: I think I was just confused because it seemed like you were just basically trying to do all
debugging in hardware, what it seemed like, and that just seems silly to me. Takes forever. I
mean, you should do debugging first, you know, not hardware, as much as you can, right? I
mean, you don't want to just [inaudible].
>> Mehrdad Majzoobi: Right, right.
>>: So that's kind of what I thought you were saying, but I see what you mean now.
>> Mehrdad Majzoobi: Well, yeah. I mean, you -- when you get to this point, you have to make
sure that your design is sane, but sometimes it goes crazy, but you're trying to find out what's
going on sometimes. Not all the time. So it's not like I write a totally crappy code, I don't know
what it does, I want the tool to redesign it for me, something like that.
So the assumption is that you have a good code, you can't find the bug, there's something
going on, you're trying to achieve that.
>>: I actually have a follow-up question. So this tool -- so you'd actually use this to find bugs
that you saw in simulation but you couldn't -- they were basically hard to -- it just took a lot to
find them?
>> Mehrdad Majzoobi: Okay. So --
>>: Or would you actually use this tool [inaudible]?
>> Mehrdad Majzoobi: -- a real-world scenario would be I have an [inaudible] core and I give
these input blocks to the core, I get the encryption or decryption of the input, and then I compare
it with what values I expect to see. It runs, and at some cycle the values do not match. There is
something wrong.
Or you're running some instruction on a CPU. It's running at certain -- for certain instructions or
input data something fails.
So you're not doing that in simulation, you're running on the hardware, but something fails there.
So you can still capture without having to do simulation, looking at that.
So -- okay. So if you want to go and look at what's happening locally, we have that feature,
which is, okay, this assertion failed, roll back in time, repeat the same operation, but this time
capture everything from X number of clock cycles before, stored in a ChipScope logic analyzer,
integrated logic analyzer. Now look at it this time with ChipScope.
So the tool has that ability to show you the [inaudible] also so you don't have to deal with the
simulation.
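[Editor's sketch] Capturing the last X cycles before a failure is essentially a ring buffer that stops filling when the assertion fires. The Python model below uses `collections.deque` with a fixed length; the function name and the sample format are assumptions, and on the FPGA this role is played by the integrated logic analyzer.

```python
from collections import deque


def capture_before_failure(samples, check, depth):
    """Keep the last `depth` samples and stop at the first failing one.

    Returns (failing_cycle, window). The window ends with the sample
    that failed. If nothing fails, failing_cycle is None.
    """
    window = deque(maxlen=depth)   # old samples fall out automatically
    for cycle, sample in enumerate(samples):
        window.append(sample)
        if not check(sample):
            return cycle, list(window)
    return None, list(window)


# Toy run: the 'assertion' fails when the sampled value reaches 6.
print(capture_before_failure(range(10), lambda s: s != 6, depth=3))
# (6, [4, 5, 6])
```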
>>: A good [inaudible] simulation of [inaudible].
>>: Or for that matter, it's a matter of [inaudible].
>>: Yeah, I mean, I guess that's why I was asking. You want to simulate it first, right? That's to
some extent.
>>: [inaudible] simulate it first.
>>: Right, it seems pretty good, but let's [inaudible] kind of a runtime thing.
>>: Right. There's also [inaudible] and there's also the obvious you can't account for it in a
simulation like you [inaudible].
>> Mehrdad Majzoobi: Okay. So --
>>: [inaudible].
>> Mehrdad Majzoobi: You're not clear about?
>>: You have the output buffer and you have the ChipScope, so what's the difference --
>> Mehrdad Majzoobi: So this is like an Ethernet controller that Ken has developed, and you
can communicate with the on-chip device through Ethernet controllers. So everything in that
blue box, there's API control for that.
>>: [inaudible] if you have a [inaudible].
>> Mehrdad Majzoobi: Well, the input buffer and output buffer are for communicating with the
device under test. So you give it -- so they're not for diagnostic information. So [inaudible]
give you that [inaudible] information on certain signals that you want to probe. But for input
buffer and output buffer, you're just sending data and getting back -- you're just running the
circuit. So I get back to this figure in a minute too. Okay.
So there's also timing issues that you may miss in simulation. There are -- your circuit might
work functionally correct, but considering the timing, it may fail. Right? So simulating for
timing, the timing-dependent design is also an issue.
So you might have race conditions. You might have forgotten to write timing constraints in
certain paths, especially when you're dealing with multiple clock domains.
And a very scary thing is that you might have something that happens post silicon after you
implement things, you might have -- you might be having to deal with delay variation defects on
delays and such. It's not so common, but you're going to see more of these.
So to check for timing issues, we have added this feature to the tool that gives you the ability to
tune the hardware clock on the fly in hardware. So the clock is running at 100 megahertz, and
you say, okay, increase the frequency to 101 megahertz. It's -- okay, increase it one step, 102
megahertz. So without having to resynthesize everything in the code, we tune the hardware
clock using the PLLs, phase-locked loops, in the FPGA.
So as a designer you provide a target frequency, and then we have a tool that searches for
optimal phase-locked loop attribute settings. So each phase-locked loop here has a D attribute, which
is a division attribute, M, which is multiplication, and O, which is also a division. So these three here.
But there's certain constraints. So it's a search problem. You have to look for the best setting for
these attributes. For example, the inputs of the phase detector, phase frequency detector, cannot
be higher than a certain frequency. The output of the VCO cannot be lower than a certain
frequency or higher than a certain frequency.
So considering all these constraints, we have this algorithm that looks for an optimal PLL setting
given an input frequency, and after finding these coefficients, the PLL is to be reconfigured with
the new attributes.
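[Editor's sketch] The search described above can be written as a brute-force scan over D, M and O, using the relation f_out = f_in * M / (D * O), with the phase-detector input at f_in / D and the VCO at f_in * M / D. Everything numeric in this Python sketch is a placeholder assumption (attribute ranges and the PFD and VCO limits, in MHz), not taken from the talk or checked against a datasheet, and the talk does not say which search algorithm the tool actually uses.

```python
def best_pll_setting(f_in, f_target,
                     d_range=range(1, 53), m_range=range(1, 65),
                     o_range=range(1, 129),
                     pfd=(19.0, 450.0), vco=(400.0, 1000.0)):
    """Return (error, D, M, O) minimizing |f_in * M / (D * O) - f_target|.

    All frequencies are in MHz. The default ranges and limits are
    illustrative placeholders. Returns None if no setting is feasible.
    """
    best = None
    for d in d_range:
        f_pfd = f_in / d                     # phase-detector input
        if not pfd[0] <= f_pfd <= pfd[1]:
            continue
        for m in m_range:
            f_vco = f_pfd * m                # VCO output
            if not vco[0] <= f_vco <= vco[1]:
                continue
            for o in o_range:
                err = abs(f_vco / o - f_target)
                if best is None or err < best[0]:
                    best = (err, d, m, o)
    return best


print(best_pll_setting(100.0, 100.0))   # (0.0, 1, 4, 4)
print(best_pll_setting(100.0, 101.0))   # error of about 1 MHz
```

Under these placeholder limits a 100 MHz input can reproduce 100 MHz exactly, but a request for 101 MHz comes back about 1 MHz off. That is one concrete instance of the sparsity at higher frequencies, and of why a second PLL can help.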
So the figure on top shows if you have one PLL and if you -- all possible frequencies you can
synthesize with a single PLL on a Virtex-5 device, also sorted out, so they are sorted from zero --
like about 3 megahertz to 600 megahertz, 700 megahertz, you see there are more points here,
there are more number of frequencies you can synthesize in lower frequencies than higher
frequency.
So it's kind of sparse. Think of it as a histogram. So higher frequencies, it's more difficult to
synthesize higher frequencies with higher precision. So they're kind of sparse.
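[Editor's sketch] The sorted curve of achievable frequencies can be reproduced in miniature by enumerating every feasible (D, M, O) setting and collapsing settings that give the same output. This Python sketch uses exact fractions so that equivalent settings compare equal. The limits (D up to 5 as a stand-in for the phase-detector constraint, a 400 to 1000 MHz VCO range, a 100 MHz input) are placeholder assumptions, so the counts it prints will not match the figure from the talk.

```python
from fractions import Fraction


def distinct_frequencies(f_in=100, d_max=5, m_max=64, o_max=128,
                         vco=(400, 1000)):
    """Return (number of feasible settings, sorted distinct outputs in MHz)."""
    feasible = 0
    freqs = set()
    for d in range(1, d_max + 1):
        for m in range(1, m_max + 1):
            f_vco = Fraction(f_in * m, d)
            if not vco[0] <= f_vco <= vco[1]:
                continue
            for o in range(1, o_max + 1):
                feasible += 1
                freqs.add(f_vco / o)   # duplicates collapse in the set
    return feasible, sorted(freqs)


feasible, freqs = distinct_frequencies()
low = sum(1 for f in freqs if f <= 100)
high = len(freqs) - low
print(feasible, len(freqs))   # more settings than distinct frequencies
print(low, high)              # more points at or below 100 MHz than above
```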
>>: So basically you have to -- I want [inaudible] how do I get --
>> Mehrdad Majzoobi: Right, right, right. So the thing is that if you need a low frequency,
probably one PLL would be enough. But if you want an accurate high frequency, there's a
chance that you don't hit that frequency in the upper curve, so you need a second PLL for that.
So I will show you --
>>: What does frequency count?
>> Mehrdad Majzoobi: So you can synthesize about 11 distinct frequencies with this one PLL.
>>: Okay. [inaudible] frequency counter.
>> Mehrdad Majzoobi: Yeah. I mean, this is the number of distinct frequencies we can
synthesize with one PLL. It's just a -- it's a vector.
>>: Oh, I see.
>> Mehrdad Majzoobi: You store the vector of all possible frequencies that you can synthesize.
>>: [inaudible] DMO?
>> Mehrdad Majzoobi: What's that?
>>: The DMO.
>> Mehrdad Majzoobi: Right.
>>: [inaudible] you get this number.
>> Mehrdad Majzoobi: Right, right. Each point on this curve corresponds to a different DMO
setting. But they're feasible settings. There are certain settings that are not feasible. There are
certain settings that produce duplicate frequencies.
For example, if you multiply both D and M by 2, you get the same frequency, because one of
them is on the denominator, the other is on the numerator, so there are lots of redundancies. So
these are like the best frequencies --
>>: What do you mean? Like if you cannot get high frequency [inaudible]?
>> Mehrdad Majzoobi: Right.
>>: What does that mean?
>> Mehrdad Majzoobi: So with two PLLs you can synthesize a lot more frequencies than with one
PLL. Because you have three -- three [inaudible] of freedom here, three parameters to play with.
With two PLLs you have six parameters. So the chances that you hit a closer frequency are higher.
>>: So I guess it would be nice [inaudible] two PLLs and three PLLs.
>> Mehrdad Majzoobi: Right, right.
>>: There's a variable in here that's not taking [inaudible] input clock.
>> Mehrdad Majzoobi: Okay. You're right. Input clock. I assume the board --
>>: [inaudible].
>> Mehrdad Majzoobi: What is that?
>>: There's a crystal socket on most boards, and I can just take that out and put in a different
crystal.
>> Mehrdad Majzoobi: Right. I did the simulation on the on-board clock, a hundred megahertz
on-board clock. So you're right. This diagram changes if you have a different input frequency.
That is the case for the second PLL, because the second PLL takes a different input frequency
from the first PLL. So if you don't want to get a second -- if you don't want to get a crystal
oscillator, you can do two PLL and [inaudible] you can hit the frequency that you desire.
Okay. So we're kind of running out of time. So guided placement. What is the problem here?
So assume you have a modular design, you have a micro-controller encryption core, some other
peripherals. And you want to debug only one module in that design. So you want to take out a
module from a bigger design and debug that.
So the problem is that if you take it out and put it next to a bunch of debugging interfaces, things
might change. So you want to have least impact when you're moving that thing out of the design
and put it into a new design with a different setting. So it retains the same behavior.
So the trick is to have a consistent behavior. So to achieve that, we run the original design with
no debugging peripherals and no debugging interfaces, anything, just original design. And then
we extract the placement of the certain module that you want to debug. And then we pass it on
to the debugging tool. The next time you're doing debugging, it enforces the placement of that
certain module so you get a consistent behavior.
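[Editor's sketch] Extracting the placement of one module and turning it into constraints for the next run could look like the Python sketch below. The input dictionary (instance path to site name), the function name, and the `INST ... LOC = ...;` output format (modeled on UCF-style location constraints) are all assumptions for illustration; the talk does not describe the file formats the tool uses.

```python
def placement_constraints(placement, module_prefix):
    """Emit one location constraint per placed instance inside a module.

    `placement` maps hierarchical instance names (slash-separated) to
    site names taken from a run of the original design.
    """
    lines = []
    for inst, site in sorted(placement.items()):
        # Keep only instances that belong to the module being debugged.
        if inst == module_prefix or inst.startswith(module_prefix + "/"):
            lines.append(f'INST "{inst}" LOC = {site};')
    return lines


# Hypothetical placement extracted from the original design.
placement = {
    "aes/sbox0": "SLICE_X10Y20",
    "aes/key": "SLICE_X11Y20",
    "eth/mac": "SLICE_X0Y0",
}
for line in placement_constraints(placement, "aes"):
    print(line)
```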
Xilinx has a SmartGuide technology, and it's supposed to achieve the same goals. But it only
works if you're doing very small incremental changes on your design.
So I have a comparison of the performance here. Okay. So one example. I have -- I'm
[inaudible] debugging a multiplier, which is attached to the Ethernet communication interface.
So I add one verification unit or three verification units.
In the three top figures I'm enforcing the placements to follow the behavior when there's no
verification unit. And the bottom figure, I'm not enforcing constraint. So you see visually the
placement is retaining its shape, but in the bottom when there are no placement constraints
extracted from this design, the top left design, then it does a completely different placement.
So then at the bottom I'm comparing the costs you're paying for performance as you're enforcing
this placement. Because the tool wants to do the best placement, it gives you the least power
consumption with the least timing, the best timing. But once you enforce the placement, you are
kind of constraining that optimization problem, so you naturally get a lower performance.
But that's very small here, not negligible. I mean, that's negligible. And the problem is not
really losing performance, because in the original design, you're getting the same placement and
nothing's changing.
You're losing performance compared to the case that there are no other peripherals and you load
the tools, there's a placement for you, and there is no other modules around your module.
Okay. This is a better visualization. Okay. So we have a standalone AES core, just an AES
core placed like that. And then we add the communication interface and the debugging interface
through that.
And here I explain to you why this happens, and I have a correction for that. Here we are
enforcing the placement constraint. So we would like the red region to be exactly like here while
everything else is placed around that.
So if you look at this area, we get a -- there is a shadow in that region. Oops. Sorry. There is a
shadow right there. So the problem is that whenever you reimplement and you resynthesize your
design, the name of the components might change. The tool has a random number generator that
for certain memory components it assigns a new name every time you do recompile.
So the names of certain components here change, and I tell it to, okay, lock this component at
this location, and the tool tells me there is not such a component. So if you see these
components are being locked because their names haven't changed, but the ones whose names
have changed, they're not locked down and they're placed -- scattered all around the place.
So I tell you a solution for that. So using a Smartguide, not much similarity in the placement.
No constraint. It's moved to another place.
So okay. These two figures are the same as before to the left. Now, I did a trick. And I say,
okay, the names of these components are changing, but the rectangle around these are not
changing. I extract the rectangle and I enforce the rectangular area around -- inside of each
component. And you can do both. You can do -- so this component's name's changing, but the
rectangular area is not changing. So combining these two, you can get a pretty close placement
of the same module.
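A minimal sketch of the rectangle idea, assuming slice sites are named like SLICE_X16Y20 and that an area is written as a lower-left:upper-right range string. The function name bounding_range is hypothetical and this is not the tool's actual code.

```python
import re

# Assumed site naming: SLICE_X<col>Y<row>
SITE_RE = re.compile(r"SLICE_X(\d+)Y(\d+)$")

def bounding_range(sites):
    """Return the smallest SLICE rectangle that covers every given site name."""
    coords = []
    for site in sites:
        match = SITE_RE.match(site)
        if match:
            coords.append((int(match.group(1)), int(match.group(2))))
    if not coords:
        raise ValueError("no SLICE sites given")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Lower-left corner first, upper-right corner second.
    return f"SLICE_X{min(xs)}Y{min(ys)}:SLICE_X{max(xs)}Y{max(ys)}"

# Sites recorded for one module in the reference (no-debug) run.
print(bounding_range(["SLICE_X4Y10", "SLICE_X7Y8", "SLICE_X5Y12"]))
```

The rectangle depends only on where the components were placed, not on what they are called, which is why it survives a recompile that renames them.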
>>: So the whole point of doing all this is so that you don't have your debug stuff interfere with
your original.
>> Mehrdad Majzoobi: Right.
>>: But you don't want the synthesis to almost replace and introduce new bugs.
>> Mehrdad Majzoobi: Right, right. Exactly. Okay. So the communication interface that we're
using is SIRC Ethernet designed by Ken. It's a very handy interface. You have a bunch of C
files that does the software interface and Verilog files.
And the probe insertion, most of the tools are Python based, most of the parts. The Verilog
parsing is being done with a Pyparsing Package. PSL to Verilog uses a Ply Python Lex/Yacc
parser. And then we have an xdl2pcf. PCF is the physical constraint file, which is equivalence
of UCF. But it's for after implementation, after -- sorry, after mapping, before doing place and
route, but UCF is in the synthesis in mapping.
Okay. You're back into the flow diagram. It's the tool for how things work. Just to wrap things
up. So we have the original design simulated, get the output traces, mining the specs from the
original design, and then also you provide the user specs. Then these are being synthesized to
state machines.
And at the same time, after -- we extract the probing information, insert the probing into the
original design, and we implement the hardware assertion checker core, insert a probe, and
forcing the constraint we have from the previous round, we've collected from before, and then
we do the debugging through the SIRC interface.
Okay. I'm going to give a quick demo on the interface for you. I have -- I'm debugging, just for
an example, an AES core. We've added a bunch of PSL assertions.
And we -- like I demonstrate a debugging scenario and also I run the spec mining tool for you in
the placement extraction. I show you how PLL attributes are being generated for a given
frequency. So let's move on to those things.
Before I finish, I want to thank my mentor, Neil Pittman, for his support, and I had a really good
time here, and Sandro for his wise showing the way through the project. And also we have used
a lot of work from previous interns. This is actually a joint work with Kenneth McMillan and
Wenchao Li.
So remoting to -- okay. Remoting to my desktop, I have an FPGA board, and I have to visualize
this for you. I couldn't bring everything. It would have made a better impression if I had like
hardware running here.
But so I have the hardware running on my desk connected to my computer. And this is the
project that I have here. Okay. So we are debugging an AES core. And I have these sets of
assertions here. Everything in PSL language.
So this is an example of how you can write assertions in PSL. So it starts with defining the
atomic expressions. The atomic expressions evaluate to 0 or 1 at any time instance. So this is
saying that the atomic expression, W0, equals to this.
Whenever I send a command request, eventually it has to be acknowledged. So it's more like a
transaction-level assertion. Every request has to be eventually acknowledged within a time
window that I define in the software. I show you that time window too.
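The request/acknowledge property can be pictured as a small software monitor. This is only a sketch under assumptions: signals are sampled once per clock cycle as 0/1 lists, and the name check_eventually and its exact deadline convention are hypothetical, not taken from the tool.

```python
def check_eventually(trigger, response, window):
    """Bounded 'eventually': after trigger is 1, response must be 1 within `window` cycles.

    Returns the cycle at which the property fails, or None if no failure is seen.
    A trigger still pending when the trace ends is not reported as a failure.
    """
    deadline = None  # last cycle at which a response is still acceptable
    for cycle, (t, r) in enumerate(zip(trigger, response)):
        if deadline is not None:
            if r:
                deadline = None      # obligation discharged
            elif cycle >= deadline:
                return cycle         # window expired without a response
        if t and not r and deadline is None:
            deadline = cycle + window  # start a new obligation
    return None

request = [1, 0, 0, 0, 0, 0]
ack     = [0, 0, 0, 0, 1, 0]
print(check_eventually(request, ack, window=3))  # window too short for this trace
print(check_eventually(request, ack, window=4))  # window long enough
```

Widening the window changes a failing check into a passing one without touching the property itself, which is the same knob the software interface exposes.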
The second one says if -- okay. In the AES core, I read 48 bytes and then decrypt or encrypt
those 48 bytes and then I spit out 48 bytes. So this says wait after you have read 48 bytes. So
this says after AES count equals 48 wait command, this is the next thing that should happen.
You have to wait. The wait command has to be issued. So there's another assertion.
The next one says AES in data valid has to stay high until you have read 48 bytes. So it
shouldn't go down if you haven't finished reading those things. So this is working until assertion.
W1 has to stay high until W2 goes high. So this also has a notion of time. Because until -- in an
until event, there's a notion of time. The assertion can fail before the time has expired, but it may
wait until the end of it too. So in the next assertion there's no notion of time. Again, we have
another next assertion which says if the [inaudible] is full, do not -- the next thing should happen
is that it stop requesting things. These are more like transaction-level assertions.
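A sketch of the until check in the same style, assuming monitoring starts at cycle 0 and that the optional window counts the cycles allowed before W2 must have gone high. The name check_until and these conventions are assumptions for illustration.

```python
def check_until(hold, release, window=None):
    """'hold until release': hold must be 1 on every cycle before release first is 1.

    Returns the failing cycle, or None if the property is satisfied or still pending.
    With a window, release must arrive within the first `window` cycles.
    """
    for cycle, (h, r) in enumerate(zip(hold, release)):
        if r:
            return None          # release arrived, property satisfied
        if not h:
            return cycle         # hold dropped early: fails before the window ends
        if window is not None and cycle + 1 >= window:
            return cycle         # window used up without a release
    return None

w1 = [1, 1, 0, 1]
w2 = [0, 0, 0, 1]
print(check_until(w1, w2))  # W1 dropped before W2 went high
```

The two return paths match the two ways the talk describes the assertion ending: it can fail early when W1 drops, or it can run to the end of the window.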
The next one is if I start reading the data, the core has to finally spit out something. You can't
just keep reading things and not give an output. So it's like if the data is -- data-in valid is one,
eventually the data out should be valid.
So, again, another eventually. So you can write more complex PSL properties. This is just for
the sake of demonstration. It can be a chain of different kind of logic, so we can have eventually
this event chained by some other thing. So this is very simple assertions that I wrote for
demonstrating.
So giving the assertions to the tool that I show, that I will show, the tool spits out the state
machine for you. So this is a state machine that I get from -- after synthesis. And then the state
machine is connected to the signals that I'm probing for monitoring their behavior into the tool.
Okay. So now in the -- this was software side. So I've already programmed the FPGA running
the code. So when I run this thing -- bad news. Okay. I think I'm running an outdated version
here. Okay. Okay. I see what's wrong. Okay. I'm going to disable all the verification units in
the code. So being that I enable 0, everything is disabled. Sorry. Okay. So this says assertion 0
was violated at clock cycle 0. Makes sense. Nothing is being checked. And now I'm going to
check assertion 16. Oh, well, assertion -- this is binary code. The last assertion, assertion 5.
Okay. Assertion 16 was violated at clock cycle 23. So what is -- okay. 16 is like 10000. It's
like a binary code. The last assertion is being violated at clock 23. What was the last assertion?
The assertion was that if the data is valid, eventually the tool has to spit out something.
Eventually has a time scope, right, a time notion. How long I'm waiting, what is the scope set
here. It's set to 20. So I wait 20 clock cycle, I repeat checking that assertion between every 20
clock cycle. Maybe that time is not enough for the AES core to process that data.
Let me increase that to, say, 22. 25. It is still failing. Was violated at clock cycle 28. But it's
still -- I added five more clock cycle. It was 23; now it's 28. So let me do a hundred. I'm going
to wait 100 clock cycle. Okay. It didn't fail this time. The assertion 0 was -- so it's like I'm
changing the time and scope and then checking if that assertion failed or not.
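The reported number reads as a bit mask, so 16 (binary 10000) points at the fifth assertion. A small decoding sketch, assuming one bit per assertion with the first assertion in the least significant bit; decode_violations is a hypothetical helper name.

```python
def decode_violations(code):
    """Return the 1-based assertion numbers whose bit is set in the status code."""
    return [bit + 1 for bit in range(code.bit_length()) if (code >> bit) & 1]

print(decode_violations(16))       # fifth assertion only
print(decode_violations(0b10001))  # first and fifth assertions
print(decode_violations(0))        # nothing enabled, nothing flagged
```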
So I can do the same thing with other assertions here. Also, I was talking about parametric
assertions. Let me see. Here if -- there are certain ways you can parameterize your assertion.
For example, I can pass on value 48 from software to the assertion. So I don't have to
resynthesize these assertions every time I want to change the parameters.
So in certain cases you can parameterize the assertion, so you can pass on these values to the
assertions, to the software. So you can add more flexibility to the tool.
Okay. Let me show you the PLL thing real quick. Okay. Somebody tells me a frequency they
would like to synthesize, something between 3 to 600 with any precision, 600 megahertz. I'm
going to put it in there.
>>: [inaudible].
>> Mehrdad Majzoobi: Okay. You can be more accurate.
>>: [inaudible]
>> Mehrdad Majzoobi: Sorry. [inaudible] frequency. My software interface is still -- I'm
looking for it. Okay. What was that frequency?
>>: 173 [inaudible].
>> Mehrdad Majzoobi: 173.1234. Okay.
>>: Is that megahertz, or is that --
>> Mehrdad Majzoobi: Megahertz. Okay. It takes a little while because, okay, we have too
many options here. So let me -- okay. Okay. So the first setting, the second PLL outputs a
frequency of 1783125. 123 -- you've got 123 here. So with this one optimal frequency was 123,
the settings are for the second PLL M97D103, and for the second PLL -- for the first PLL
N34D50O127.
So I'm printing out things with low precision. Probably you got that 7247, but it's not because
I'm using [inaudible] in the printing without formatting the string. I'm not -- you're not seeing
that thing.
But so it gives you a list of frequencies that are within a given tolerance window. I've specified
that, and then you can choose the closest and so on. Okay.
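The coefficient search can be sketched as a brute-force scan over two cascaded PLLs. This is an illustration under stated assumptions, not the tool's code: each PLL is modeled as f_out = f_in * M / (D * O), the coefficient ranges are placeholders rather than device limits, and VCO range limits are ignored. The names pll_settings and search_cascade are hypothetical.

```python
from fractions import Fraction
from itertools import product

def pll_settings(f_in, mults, divs, outs):
    """Yield (f_out, (M, D, O)) for one PLL modeled as f_in * M / (D * O)."""
    for m, d, o in product(mults, divs, outs):
        yield f_in * Fraction(m, d * o), (m, d, o)

def search_cascade(f_in_mhz, target_mhz, tol_mhz, mults, divs, outs):
    """List (error, f_out, first_pll, second_pll) within tolerance, closest first.

    mults, divs and outs must be re-iterable (ranges or lists), since they are
    scanned once for the first PLL and again for every first-PLL setting.
    """
    target = Fraction(str(target_mhz))
    tol = Fraction(str(tol_mhz))
    hits = []
    for f1, first in pll_settings(Fraction(str(f_in_mhz)), mults, divs, outs):
        # The second PLL takes the first PLL's output as its input clock.
        for f2, second in pll_settings(f1, mults, divs, outs):
            err = abs(f2 - target)
            if err <= tol:
                hits.append((err, f2, first, second))
    hits.sort()
    return hits

# Placeholder ranges, kept small so the scan finishes quickly.
for err, f_out, first, second in search_cascade(100, 173.1234, 0.5,
                                                range(1, 9), range(1, 5), range(1, 5))[:3]:
    print(f"{float(f_out):.4f} MHz  first PLL M,D,O={first}  second PLL M,D,O={second}")
```

Exact fractions avoid rounding surprises during the search, and formatting only at print time keeps the full precision visible in the listing.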
>>: [inaudible].
>> Mehrdad Majzoobi: Well, no, so the tool is still not able to do dynamic reconfiguration. This
is the next thing we want to do to -- after we get these numbers, you send it to FPGA, and then
you reconfigure those two PLLs. And then you can get those kind of timing information from
the [inaudible].
>>: So at the moment it's just --
>> Mehrdad Majzoobi: I'm just generating the coefficients. Yeah. The problem with the
Virtex-5 is that the information to do dynamic port configuration of the PLL is not well
documented. But that information exists for Virtex-6. So, yeah, a part is still on debate.
So what else do we have? Okay. Spec mine, let's go to spec mine, mining a specification
from -- I think I had it open. Okay. So this is how you run the spec mine. So you give it a --
you give it a VCD dump, and then it outputs the specs for you. So it's going to run this here.
The VCD is the output you get from simulation tools, usually ModelSim, and it stands for vector
charge dump? No.
>>: [inaudible].
>> Mehrdad Majzoobi: Yeah. Okay. So these are the two files. I gave it AES that I got from
running ModelSim, and then the specs. Let's look at the specs. Okay. So this says the number
of patterns before merging. The number of events, 31, the lengths of traces, 3,000 something.
The number of patterns after merging after training. So this is one pattern.
This A stands for alternating pattern. So this pattern alternates. So we have every time the
address is 5, next we have 6, so we have this 5-6 in alternation.
So whenever you have 5, you have 7, 5 you have 8. So the tool is basically saying this address is
incrementing in a verbose way.
So that's how formal methods can come in handy. They can look at these things, say,
okay, they're all saying the same thing mathematically, right, so I can do another reduction on
this.
The downside is that if you want to directly synthesize these assertions, you're going to have a
lot of redundancy in the hardware. So if you had unlimited space, then that wouldn't be any
problem.
So what is F is eventually. So this says address equal 4 and address equal 4, if this happened
eventually address equal 3, address equal 3 should happen. So these are like eventually patterns.
And so on. So they're written in linear -- the same kind of thing, linear time logic.
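To make the alternating pattern concrete, here is a toy miner that checks one template over an event trace. It is a sketch only: the real tool reads a VCD dump and merges patterns, while this version assumes the trace is already a list of event names, and the function names are hypothetical.

```python
from itertools import permutations

def alternates(trace, a, b):
    """True if the trace, restricted to events a and b, reads a b a b ... a b."""
    expect = a
    seen = False
    for event in trace:
        if event not in (a, b):
            continue             # other events are ignored by the template
        if event != expect:
            return False
        expect = b if event == a else a
        seen = True
    # Require at least one occurrence and no dangling 'a' at the end.
    return seen and expect == a

def mine_alternating(trace):
    """Return every ordered pair (a, b) that alternates throughout the trace."""
    events = sorted(set(trace))
    return [(a, b) for a, b in permutations(events, 2) if alternates(trace, a, b)]

trace = ["addr=5", "addr=6", "idle", "addr=5", "addr=6"]
print(mine_alternating(trace))
```

A miner like this reports whatever held on the trace it was given, so a short or buggy trace yields patterns that look valid but are not real specifications.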
>>: And this is constrained just by what you simulate it to?
>> Mehrdad Majzoobi: Right, right. So the longer you have traces --
>>: [inaudible] never been inserted.
>> Mehrdad Majzoobi: What is that?
>>: Like if it says -- I mean, you cannot get an assertion. So, for instance, you basically can't
capture it. You get some stuff, but you can't -- you don't get a lot of things.
>> Mehrdad Majzoobi: Yeah, the longer the simulation traces, the more, say, confidence you
would have in your patterns, and you would probably get more meaningful patterns than running
the short traces.
>>: But that would never give you any false negatives, right? So it will never -- it will never
give you a case where the assertion was actually not right.
>>: You have lots of false negatives [inaudible].
>>: It's not going to say that it's an assertion but it's wrong.
>> Mehrdad Majzoobi: Right. It just gives you a bunch of assertions. That might not be a
correct assertion. But that is a very dangerous thing, but it might happen.
>>: Because isn't it just looking at the physical --
>> Mehrdad Majzoobi: No, it's a form closer.
>>: Well, it's looking at the physical signals, but it's saying this is what [inaudible] and this is
what it turned out to be kind of thing, right? It's generating assertions from actually what
happened, so you couldn't get anything that was [inaudible].
>>: But, no, you can get false positives [inaudible].
>> Mehrdad Majzoobi: Right, right, right. So if you have bugs in your traces, the learning is
also done on the buggy traces.
So you --
>>: I see what you mean. Right, right, right. Yeah.
>> Mehrdad Majzoobi: But we are hoping --
>>: [inaudible].
>> Mehrdad Majzoobi: It's like the assumption is that those are not too often that affect the
patterns. So you have something that happens frequently, then it affect your learning, you're
learning the wrong behavior. You know, it's like a kid.
>>: Or could be like [inaudible] does the right thing 99 percent of the time, but 1 percent of the
time does the wrong thing, so I can pull that out.
>>: Right, right. So if you're -- I mean, if your code was really bad, then you would be full of
assertions that were wrong.
>>: Right.
>>: Okay.
>>: That's why we say that you won't have a somewhat confident [inaudible] 99 percent right.
>>: So it's not -- I mean, it's really not just like code, push button, give me my assertions and
then I put this into hardware and you say push button and you're going to help me figure it all --
>>: You're assuming that you're mostly there.
>>: Yeah, okay. That makes sense.
>> Mehrdad Majzoobi: How much time do we have? Show a few quick things and we can wrap
up.
So the other tool is the xdl2pcf. So after you do your design, at the end of all of the
processes Xilinx spits out a file with the format XDL, which stands for Xilinx Design
Language. So it's not a -- it's a closed language, but it's easy to understand. It has a very simple
syntax. And it tells you -- it's the net list of the FPGA, how things are being connected in the
FPGA, how things are being placed.
So, for example, I'm going to open up one XDL file here, show it to you. Okay. Here it is. So
this is, for example, for the AES core, the tool spits out this file. Which this is like -- for
example, this says this is an I/O buffer placed at -- this is an I/O buffer placed at this location,
and it's configured this way. This parameter is set to off, this parameter off, and so on. And it's
also -- the I/O standard is set to this very low-level net list of the FPGA.
So what we are processing this for is to extract the placement of the components with given
names. So the script takes this file as the input and then outputs a PCF file. So this script is
finished running. So this is the PCF constraint file, which was generated from that. So it tells
the tool to place this component at this site and so on.
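The extraction step can be sketched with a regular expression. The line shapes here are assumptions for illustration: placed instances are taken to look like inst "name" "TYPE",placed TILE SITE, and each output line is written as COMP "name" LOCATE = SITE "site" LEVEL 1. The function xdl_to_pcf and the sample names are hypothetical, not the actual xdl2pcf script.

```python
import re

# Assumed XDL shape: inst "<name>" "<type>",placed <tile> <site> , cfg "..." ;
INST_RE = re.compile(r'inst\s+"([^"]+)"\s+"([^"]+)"\s*,\s*placed\s+(\S+)\s+([^\s,]+)')

def xdl_to_pcf(xdl_text, prefix=""):
    """Return one placement constraint line per placed instance whose name starts with prefix."""
    lines = []
    for name, _type, _tile, site in INST_RE.findall(xdl_text):
        if name.startswith(prefix):
            lines.append(f'COMP "{name}" LOCATE = SITE "{site}" LEVEL 1 ;')
    return lines

sample = '''
inst "aes/sbox_0" "SLICEL",placed CLBLL_X10Y20 SLICE_X16Y20 ,
  cfg " ... " ;
inst "clk_buf" "BUFG",placed CLK_BUFGMUX_X47Y44 BUFGCTRL_X0Y0 ,
  cfg " ... " ;
'''
for line in xdl_to_pcf(sample, prefix="aes/"):
    print(line)
```

Filtering by a name prefix keeps only the module under debug, so the rest of the design stays free for the placer to arrange.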
So you see these names? These are things that change. They're being generated with a random
number generator. So the rest, which look normal, this is, for example, a name that is not
changing. But this one changes and so on.
Anyways, for the P2V, let me show the synthesis tool as well. Okay. So for P2V, we write the
assertions here. So this is, for example, first assertion that we had there, AES byte counts 48, the
same assertions that we had in the Xilinx tool that I showed.
So okay. Okay. So we run this. Okay. So we say my P2V, run this, and then it synthesizes into
AES underline views. Go look at that. So these are the state machines generated from these
assertions.
So currently we have pieces of these tools running. We still need some work to put them all
together. So we isolate the designer from having to deal with what's going on inside. As I said,
the goal is you provide your design. And if you're really willing, you can write your own
assertions. If not, the tool can mine it for you.
And the rest has been taken care of. And then you sit in front of your computer, and then you
run on your interface and check the assertions that are -- they're in the tool.
The vision is that the tool also generates the C codes, parts of the C code that you can insert them
as header file that, you know, has the parameter set for those assertions and so on, so you don't
have to write those parts in the C code. That would be the flow of the things that we have, and
hopefully we get things together soon into something that people start using.
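One way the generated header could look, as a sketch only: the parameter names, the include guard and the make_header function are all hypothetical, since the talk describes this part as a goal rather than a finished feature.

```python
def make_header(params, guard="ASSERTION_PARAMS_H"):
    """Render assertion parameters as a C header made of #define lines."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for name, value in params.items():
        lines.append(f"#define {name.upper()} {int(value)}")
    lines += ["", f"#endif /* {guard} */"]
    return "\n".join(lines) + "\n"

# Hypothetical parameters: the byte count and time window mentioned in the demo.
print(make_header({"aes_byte_count": 48, "eventually_window": 100}))
```

Keeping such values in a header means a parameter change needs only a rebuild of the host software, not a resynthesis of the checker hardware.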
Any question? All right. Thanks a lot. I appreciate your time.
[applause]