>> Neil Pittman: Hello. I'm here to introduce Mehrdad Majzoobi. He's an MSR intern working with us at the embedded systems group working on FPGA verification and debugging. Let's see what he has to say. >> Mehrdad Majzoobi: Hello, everyone. Thanks for coming. I'm going to give like a 25 minutes' presentation, and then afterwards I'll give you a quick demo of the tool that I have developed during my internship. It's not complete yet, but I can show the pieces to you. And we're hoping to get this tool in hands of designers pretty soon so Neil can and everybody in the group can start using it in the real designs. So as the title said, we have developed an automated FPGA design verification and debugging platform. So what it means is we are using FPGA for debugging hardware design in an area stage, in whatever it is t, it's an ASIC design or so on. So it's actually hardware emulation and verification on that platform. So here's a very quick outline of the talk. I'm going to give you the motivation for the work and tell you what is out there right now, what other designers are using, and the industry is using the state of the art. And I give a very quick background on assertion-based verification, and then afterwards I start talking about what we have, the automated debugging and the pieces of the tool. And after that I demo each of these stages and future work. And at the end I'll conclude the talk. Okay. Well, everybody -- anyone who has done hardware design can really correlate to these images here. You know, they may not have the money to destroy the computers, but they really wish they could. You know, so whenever you're doing hardware design, at the end of the day, nothing's going to work for you. So you have to look for bugs, and you have no idea where they're coming from. And to do verification and debugging, you have to verify the quirkiness of your design. So verification and debugging is everywhere: in software, hardware, whatever you want to do. But what makes hardware verification particularly different is the lack of consistency in structure of design. So if you're designing an encryption core, another person is designing a micro-controller, another person is -- so these are completely different designs. So if I can give you a recipe for that design, that's not going to work for you for another designer doing another kind of design. So the lack of consistent in structure/architecture in hardware design makes it completely different from software design where you're dealing with a fixed CPU architecture and you have a compiler and everything is almost the same for all software designers. So the lack of a one-fits-all solution is a big deal. And the second problem is the lack of observability and controllability in hardware design. So it's very difficult to know a value of a certain net in your design at a given point in time, unless you go inside the chip, you know, you hire some, I don't know, moving electron, go there and get this for me, come back, something like that. So you have to do probing, and you cannot see inside the chip unless you have -- or you can simulate it on software, see what's happening in there. So also there's a long recompile time usually for hardware design. So if you compared the design of the same size, it usually takes a longer time to recompile results. So every time you find a bug, you fix it, you have to like, I don't know -- and you lose -- any of those micro-controller design has to like wait a day, go home, come back the next day. It's not that bad. You take a day off. So extreme hardware bugs can require total hardware replacement, especially if you're doing ASICs. So, in practice, this is going to be a nightmare for industry. So Intel, it's a very famous example, you probably have seen this a hundred times before, but it's worth re-mentioning that Intel has lost $500 million in its error in the Intel Pentium processor in '95 just for a bug in the floating point division instruction in their processor. And also, interestingly, NASA lost $125 million over the Mars Orbiter for an error in the software they had running on their unit. But here the interesting observation is that the Mars Orbiter cost lest than what Intel lost in their Pentium processors. So NASA, not a big deal. Look at Pentium, look at Intel. >>: [inaudible]. [laughter]. >> Mehrdad Majzoobi: Yeah. >>: But Intel did it again. NASA didn't. >>: [inaudible] explosion. >> Mehrdad Majzoobi: So anytime there is a bug in a processor, it delays the shipping of the product, you have to like spend so much time on finding that bug, and because of that you may have to delay your product shipping. So, once again, the lack of a standard and coherent approach and procedure in a hardware debugging, and also a lack of tools and automation that can help you point the bugs and fix them. At the same time complexity and large scale of the designs that designers are dealing with today are much different than the designs than like ten years ago, and it's growing faster than before, has made it more challenging to find a little tiny bug in a very huge design. And insufficient computational power is also a problem, because if you could have simulated everything, then everything would have been fine. So there are actually some solutions out there, but a typical hardware designer does not really have access to those tools. They're very expensive hardware verification tools that synopsis mentor graphics have for industry purposes, and nobody really knows how efficient those tools are. They cannot -- and there are proprietary languages across these tools. There's no consistent contrast and judgment in between those tools. Okay. So in a nutshell, you can divide the methodology that is being used in the state-of-the-art in three parts. And the first approach is using simulation to find out bugs in your design. So that requires writing test benches. Since test benches can be also self-checking, transaction-based test benches, you know, giving lots of input vectors to your design and looking at the output. So the second approach is to use logic analyzer integrated -- logic analyzer inside your chip and to monitor signals and see what is going on. The last category is formal methods. Formal methods use formal mathematics to check your formal description of your system versus what you have designed. And we'll do that for you. So the problem of formal mathematics is still -- it's a very complex problem, and it might be computationally infeasible in certain cases. So also there are hybrid solutions in here. So you can mix simulation with formal methods and hardware. And any kind of hybrid solution here is possible, simulation based and hardware based. So ideally you can divide your verification problem to subsets and use simulation hardware verification formal methods for each part of those things. So the tool that we are dealing with here is using hardware based and simulation based. And in the future we might be able to add certain formal methods into the tool to make it a more efficient tool. So now a few words on assertion-based verification. So on the previous slides I talked about three methodology to find bugs there. But a systematic way to find and verify your define is to write models and describe conditions in your circuit and then check and verify the correctness of what you have described using those tools on the previous slide. So I can write a few conditions assertions properties from my design and then verify the correctness of those using simulation, using hardware, or using formal methods. So it's more like encapsulating properties and then verifying those things. So naturally I think in your mind everybody is going through that process, but it's more's like a formal thing. I have this thing in my mind, let's go and check it. But here in assertion base you have to write things down in a formal way using linear time logic, using conditional statements and any kind of logic that you can write down on a piece of paper and then check it instead of running things in your mind. So it has the potential of finding the bugs early in the [inaudible] level design and then detect them earlier than before. So I'm going to throw out there the goal of the project, what we are looking at the end, what we are trying to achieve throughout this project, and then this is more like the user experience kind of thing. So I as a designer would like to give the tool my Verilog files, my hardware description language codes, HDL codes, that I've written to the tool, write down the assertions on a piece of paper. Or maybe if I'm lazy, I just don't write those assertions. It's optional. I just give computer to my designs and I tell, okay, tell me what's wrong with my design, right, that's very high level. Give the computer a design, and computer just tells me, okay, I think your design is not working. There's something wrong. This is the ideal case for a designer. So how can we get close to that? Probably it's to sci-fi to really have that, but we can get slightly close to that. So given a design, the computer can learn its behavior. So assuming that bugs happen once in a blue moon, but they happen, but not so often that you can see them, you can detect those outlier behavior. So it's like a machine learning approach. You want to find inconsistent behavior in simulation traces, in hardware traces. So now the assumption is that if you're -- if bugs happen once in a blue moon, not so often, you have a good chance to learn the good behavior and isolate the buggy and bad behavior. So the process that extract and we use the term mine, which is an established term in the community, to mine assertions from your design. So it's assertion mining. It's specification mining. So after you've mined these assertions, you can pass it through a designer, say, okay, I have mined these things, some of them are very trivial assertions, some of them are more complex, tell me if these are good. And say, okay, good, good, good, good, good. Move on. So after we have gotten these assertions, you have to check them versus your design. You can do that in simulation, but it's very slow. So we are looking into a faster approach. So we can take these assertions, synthesize them into state machines. So it's -- I will call it PSL to Verilog. PSL is a property specification language which is becoming the standard. So you synthesize your assertion. And you merge those assertions with the ones that the designer has or might have provided. So and then you implement those assertions onto the hardware and you let it run, run for a day, run for five minutes, until something fails. And then you have an interface that tells you at this clock cycle this assertion failed, and then it -- through software you have the ability to set certain parameters to your assertions. So you would like to avoid recompile time of your hardware, so you can always show how this can be achieved, but you can set certain parameters in your assertions through software without having to recompile things. So think of parametric assertions. So you have parametric assertions that you set those parameters through software. And then you would like to maintain least impact on the original design. So as you're adding all these debugging peripherals to your main circuit, you don't want to change the functionality of the main circuit as you're adding things around it or inside it. So I show how -- what that really means. Yeah. >>: When you get an assertion, what do you do? >> Mehrdad Majzoobi: We check for its correctness. >>: Okay. Check for the correctness of the assertion? >> Mehrdad Majzoobi: Of the assertion. So the assertions can come in any kind of logic statements. But the kind of statement that are common now are linear time logic. So an example would be if this event happens, it has to stay valid until the other event happened. Or the whole thing -- you know, the whole statement, eventually something else should happen. So instead of eventually until next alternating patterns and so on, you'll form certain things that you might observe in your design. So you express those assertion in a given standard logic, and then you pass it on to the tool. So back to the spec mining. As I said before, it's originally the designer's task to write down the assertions. But since it's a -- first of all, it's a tedious process, if you work in a -- I haven't worked in verification department in Intel, but I have friends that have told me they give you a code, there are no assertions to check, the designer left, you know, quits his job, he didn't write the assertion and so on, so you have to like go through the whole cycle, and the pay was not enough, he didn't write the assertions, stuff like that, you know. So its original design [inaudible] but because of all these issues, there's a lack of assertions that give you complete coverage of the correctness of your circuit. So one alternative approach is to hire a laptop to write assertions for you. And here I have to emphasize that the assumption is that the design is acting right, is acting correctly most of the time. So what happens if the traces -- if you get a false positive, if the tool tells you there is a bug there and you go and dig there and there was no bug, it's okay. You just wasted, you know, a couple of hours for something that was not an issue. But if the tool tells you there was no bug, just keep going, and then you tape out your -- you know, CPU comes back and, you know, you lost a couple of millions of dollars. So false negatives are not okay. So that should be like set to almost zero, but you can tolerate false positive based on how patient you are. Okay. So this is actually a work of Wenchao, a previous intern at MSR, at MSR has done, is to -- so the tool is that you run a benchmark on your circuit and you collect the output traces through a simulation tool, ModelSim. You get a VCD dump, which is your simulation traces. You pass these simulation traces to the spec mine and then the spec mining engine looks for patterns, alternating patterns, next patterns, until patterns, and eventual patterns. So you also provide the pattern library to the tool. So mine, look for these patterns and the traces and tell me when they happen. And after it has mined this pattern, it performs a reduction on the patterns, because sometimes the patterns are redundant. You want to remove the redundant ones, also they're chaining of the patterns or merging of the pattern to give more efficient assertion. And at the end it ranks the assertions based on how often they happened or they were observed in the traces. So but this is unsupervised learning. Remember when I was explaining the formal methods, how we can use formal methods. This is where we can use formal methods. After we get these assertions, if you have a tool that tells you this is an important assertion, because it gives you very good coverage on a lot of different action, then we pick that. So to kind of -- the ranking and reduction that we have are kind of -- you're trying to reduce the number. But if there is an engine that tells you this is a really important assertion because it gives you a lot of coverage, that's formal methods. It can mathematically verify for you that this is an important assertion because of that coverage. We keep it. So the missing part is to apply formal methods to the output of this tool to get the more important assertions based on the coverage, not based on how often they appear in the output traces. Okay. Now, after you get these assertions, you have to map them, synthesize them to a state machine. So this is still a challenging problem because you might end up in a state explosion state. So you want to efficiently map this logic statement into a state machine with minimal number of states, because you don't want to use a lot of hardware. So in a paper in 2005, this problem was tackled in a very efficient way. And so the trick is to rewrite properties. So you have a logic and say this logic changes if I have this input. So, for example, say your logic is that if I move to the city, my life would be wonderful. But if somebody gives you an input, your logic might change. Something like that. So it's -- say logic P can be rewritten to P5 if the input is this one. And so it keeps writing the logics until they reappear in the chain. So P1 reappears again here, again here, so the assumption is that you are going inside a loop in the state machine. So and then it wraps up this chain into a state machine by observing how you're traversing in the state machine. Okay. So now we have two parts, asserting -- mining the specs, synthesizing them. Now, the next step is to insert them into the design. What is missing there is how to -- automatically it's hashed the probes into the signals that you want to monitor. So this part of the tool we call a probe insertion. So given the names of the nets, first the probe insertion tool parses the Verilog design, the input that you provide to the tool, and then it detects a system, the modular hierarchy of your design, and then modifies the modular declaration, the modular substantiation, adds extra ports to them. So you can, for example, add a wire from this leaf node and route it to the top node, so you can have an interface, debug interface in your system. So instead of you going through your code, writing those wires and bringing them down, the tool will take care of that for you. Okay. Now that we inserted the core into the design, we would like to see what kind of things -the interface -- what kind of information we would like to see from the interface. So it's more like a user experience kind of thing. So right now the tool will do these things. You provide -- in the software we have the software interface. You say I want to enable these sets of assertions. Say you have a thousand assertions, but you don't want to check them all the time. Say, okay, I just want to check assertion 1, 2, X, and so on. So you enable those assertions that you want to check, and then you can set the parameters for you assertion. So think of the notion of parametric assertion. So and then you set the scope time scoping the time. So certain assertions have to be frequently checked. Think of an eventually logic. I will eventually finish this work. This logic has no meaning if there is no time scope there. Right? Eventually logic will have no meaning if there's no time definition, right? I will finish the PowerPoint slides by tomorrow, right, but if there's no tomorrow, it's meaningless, right? So you set the scope time for certain assertions that make sense. For example, if a next assertion, if this happened next, this should happen, then the scope time doesn't really matter, because next is we have a clock and after that something should happen. And we have a method to set the clock frequency in hardware through software. >>: Why would you turn off any of these [inaudible]? >> Mehrdad Majzoobi: Because sometimes you write the wrong assertions. And then you realize I wrote the wrong assertion and it was my fault, so I don't want to go into the tool, delete it, recompile everything. Just ignore that assertion. Or I understand this is a trivial bug, but I don't care about -- I don't know. It gives you the flexibility to not to recompile everything still not caring about certain assertions. >>: So you can collect multiple bugs and say, okay, I know about that bug already ->> Mehrdad Majzoobi: Right. >>: -- stop [inaudible]. >> Mehrdad Majzoobi: Exactly. And also the tool stops everything as soon as it observes the first failure in the assertion. So aborts, goes to the software, you know, send all these diagnostic information, halts everything. And also, as you said, you can say, okay, keep going, don't check this assertion. >>: I was just going to ask what's the benefit of doing all this in hardware? Just seems like you're making your life ->> Mehrdad Majzoobi: It's very much faster. It's a very good question. So if you want to do the same kind of thing in software, it's going to take at least three orders of magnitude more time, so it's like a thousand X slower in software. So if you have a larger design. >>: It just seems like if you really, really, really want it to work correctly, you would build a formal model for it, you would use a language that had kind of something [inaudible] something that was really sound for building -- like it's verified and it's very stringent to when it checks [inaudible]. Seems like if you were to have this much care about every single smaller bug, you kind of [inaudible]. >> Mehrdad Majzoobi: Well, ideally ranking formal methods is the way to go. But some of the problems are computationally intractable with the kind of hardware that we have. So if you run those proofs on a computer, it might take much longer than this. >>: [inaudible] assumes that we have -- that you put everything in simulation. So if your ->>: I guess the hardware existed too. >>: So if you're [inaudible]. >> Mehrdad Majzoobi: Exactly. >>: [inaudible] lazy, don't do it. >>: Right, right, right. I guess that's true. >> Mehrdad Majzoobi: Yeah. And it's actually a very good point. So you can -- instead of writing test benches, we have the input buffer and output buffer in the interface. So you can send the data to whatever computation you want to do, get back the data, do the -- and then compare -do the same kind of computation and software -- it's more like a co-hardware/software simulation, and then you contrast the result you get from software. So you're at the same time sending the data, the circuit is running X speed, and you're doing it -you're checking its functionality and correctness at the same time. And as Ken mentioned, sometimes it's difficult to simulate Ethernet controller in software, you have to write it down and so on, but ->>: I think I was just confused because it seemed like you were just basically trying to do all debugging in hardware, what it seemed like, and that just seems silly to me. Takes forever. I mean, you should do debugging first, you know, not hardware, as much as you can, right? I mean, you don't want to just [inaudible]. >> Mehrdad Majzoobi: Right, right. >>: So that's kind of what I thought you were saying, but I see what you mean now. >> Mehrdad Majzoobi: Well, yeah. I mean, you -- when you get to this point, you have to make sure that your design is sane, but sometimes it goes crazy, but you're trying to find out what's going on sometimes. Not all the time. So it's not like I write a totally crappy code, I don't know what it does, I want the tool to redesign it for me, something like that. So the assumption is that you have a good code, you can't find the bug, there if something's going on, you're trying to achieve that. >>: I actually have a follow-up question. So this tool -- so you'd actually use this to find bugs that you saw in simulation but you couldn't -- they were basically hard to -- it just took a lot to find them? >> Mehrdad Majzoobi: Okay. So -- >>: Or would you actually use this tool [inaudible]? >> Mehrdad Majzoobi: -- a real-world scenario would be I have an [inaudible] score and I give these input blocks to the core, I get the encryption or decryption of the input, and then I compare it with what values I expect to see. It runs, and at some cycle the values do not match. There is something wrong. Or you're running some instruction on a CPU. It's running at certain -- for certain instructions or input data something fails. So you're not doing that in the simulations, running of the hardware, but something fails there. So you can still capture without having to do simulation, looking at that. So -- okay. So if you want to go and look at what's happening locally, we have that feature, which is, okay, this assertion failed, roll back in time, repeat the same operation, but this time capture everything from X number of clock cycle before is stored in a chip scope logic analyzer, integrated logic analyzer. Now look at it this time with a chip scope. So the tool has that ability to show you the [inaudible] also so you don't have to deal with the simulation. >>: A good [inaudible] simulation of [inaudible]. >>: Or for that matter, it's a matter of [inaudible]. >>: Yeah, I mean, I guess that's why I was asking. You want to simulate it first, right? That's to some extent. >>: [inaudible] simulate it first. >>: Right, it seems pretty good, but let's [inaudible] kind of a runtime thing. >>: Right. There's also [inaudible] and there's also the obvious you can't account for it in a simulation like you [inaudible]. >> Mehrdad Majzoobi: Okay. So ->>: [inaudible]. >> Mehrdad Majzoobi: You're not clear about? >>: You have the output buffer and you have the chip scope, so what's the difference ->> Mehrdad Majzoobi: So this is like an Ethernet controller that Ken has developed, and you can communicate with the on-chip device through Ethernet controllers. So everything in that blue box, there's API control for that. >>: [inaudible] if you have a [inaudible]. >> Mehrdad Majzoobi: Well, the input buffer and output buffer are for communicating with the device under tests. So you give it -- so they're now for diagnostic information. So [inaudible] give you that [inaudible] information on certain signals that you want to probe. But for input buffer and output buffer, you're just sending data and getting back -- you're just running the circuit. So I get back to this figure in a minute too. Okay. So there's also timing issues that you may miss in simulation. There are -- your circuit might work functionally correct, but considering the timing, it may fail. Right? So simulating for timing, the timing dependant design is also an issue. So you might have race conditions. You might have forgotten to write timing constraints in certain paths, especially when you're dealing with multi -- a clock domain. And a very scary thing is that you might have something that happens post silicon after you implement things, you might have -- you might be having to deal with delay variation defects on delays and such. It's not so common, but you're going to see more of these. So to check for timing issues, we have added this feature to the tool that gives you the ability to tune the hardware clock on the fly in hardware. So the clock is running at 100 megahertz, and you say, okay, increase the frequency to 101 megahertz. It's -- okay, increase it one step, 102 megahertz. So without having to resynthesize everything in the code, we tune the hardware to clock using the PLL, phase lock loops, in the FPGA. So as a designer you're provided target frequency, and then we have a tool that searches for optimal phase look loop attribute settings. So each phase lock loop here has a D attribute, which is a division attribute, and multiplication, and O is also a division. So these three here. But there's certain constraints. So it's a search problem. You have to look for the best setting for these attributes. For example, the inputs of the phase detector, phase frequency detector, cannot be higher than a certain frequency. The output of the see VCO cannot be lower than a certain frequency or higher than a certain frequency. So considering all these constraints, we have this algorithm that looks for an optimal PLL setting given an input frequency, and after finding these coefficients, the PLL is to be reconfigured with the new attributes. So the figure on top shows if you have one PLL and if you -- all possible frequencies you can synthesize with a single PLL on a Virtex-5 device, also sorted out, so they are sorted from zero -like about 3 megahertz to 600 megahertz, 700 megahertz, you see there are more points here, there are more number of frequencies you can synthesize in lower frequencies than higher frequency. So it's kind of sparse. Think of it as a histogram. So higher frequencies, it's more difficult to synthesize higher frequencies with higher precision. So they're kind of sparse. >>: So basically you have to -- I want [inaudible] how do I get ->> Mehrdad Majzoobi: Right, right, right. So the thing is that if you need a low frequency, probably one PLL would be enough. But if you want an accurate high frequency, there's a chance that you don't hit that frequency in the upper curve, so you need a second PLL for that. So I will show you ->>: What does frequency count? >> Mehrdad Majzoobi: So you can synthesize about 11 distinct frequencies with this one PLL. >>: Okay. [inaudible] frequency counter. >> Mehrdad Majzoobi: Yeah. I mean, this is the number of distinct frequencies we can synthesize with one PLL. It's just a -- it's a vector. >>: Oh, I see. >> Mehrdad Majzoobi: You score the vector of all possible frequencies that you can synthesize. >>: [inaudible] DMO? >> Mehrdad Majzoobi: What's that? >>: The DMO. >> Mehrdad Majzoobi: Right. >>: [inaudible] you get this number. >> Mehrdad Majzoobi: Right, right. Each point on this curve corresponds do a different DMO setting. But they're feasible settings. There are certain settings that are not feasible. There are certain settings that produce duplicate frequencies. For example, if you multiply both D and M by 2, you get the same frequency, because one of them is on the denominator, the other is on the numerator, so there are lots of redundancies. So these are like the best frequencies ->>: What do you mean? Like if you cannot get high frequency [inaudible]? >> Mehrdad Majzoobi: Right. >>: What does that mean? >> Mehrdad Majzoobi: So with two PLL you can synthesize a lot more frequencies with one PLL. Because you have three -- three [inaudible] of freedom here, three parameters to play with. With two PLL you have six sets of parameters. So the chances you hit a closer frequency higher. >>: So I guess it would be nice [inaudible] two PLLs and three PLLs. >> Mehrdad Majzoobi: Right, right. >>: There's a variable in here that's not taking [inaudible] input clock. >> Mehrdad Majzoobi: Okay. You're right. Input clock. I assume the board ->>: [inaudible]. >> Mehrdad Majzoobi: What is that? >>: There's a crystal socket on most boards, and I can just take that out and put in a different crystal. >> Mehrdad Majzoobi: Right. I did the simulation on the on-board clock, a hundred megahertz on-board clock. So you're right. This diagram changes if you have a different input frequency. That is the case for the second PLL, because the second PLL takes a different input frequency from the first PLL. So if you don't want to get a second -- if you don't want to get a crystal oscillator, you can do two PLL and [inaudible] you can hit the frequency that you desire. Okay. So we're kind of running out of time. So guided placement. What is the problem here? So assume you have a modular design, you have a micro-controller encryption core, some other peripherals. And you want to debug only one module in that design. So you want to take out a module from a bigger design and debug that. So the problem is that if you take it out and put it next to a bunch of debugging interfaces, things might change. So you want to have least impact when you're moving that thing out of the design and put it into a new design with a different setting. So it retains the same behavior. So the trick is to have a consistent behavior. So to achieve that, we run the original design with no debugging peripherals and no debugging interfaces, anything, just original design. And then we extract the placement of the certain module that you want to debug. And then we pass it on to the debugging tool. The next time you're doing debugging, it enforces the placement of that certain module so you get a consistent behavior. Xilinx has a Smartguide technology, and it's supposed to achieve the same goals. But it only works on if you're doing very small incremental changes on your design. So I have a comparison of the performance here. Okay. So one example. I have -- I'm [inaudible] debugging a multiplier, which is attached to the Ethernet communication interface. So I add one verification unit or three verification unit. In the three top figures I'm enforcing the placements to follow the behavior when there's no verification unit. And the bottom figure, I'm not enforcing constraint. So you see visually the placement is retaining its shape, but in the bottom when there are no placement constraints extracted from this design, the top left design, then it does a completely different placement. So then at the bottom I'm comparing the costs you're paying for performance as you're enforcing this placement. Because the tool wants to do the best placement, it gives you the least power consumption with the least timing, the best timing. But once you enforce the placement, you are kind of constraining that optimization problem, so you naturally get a lower performance. But that's very small here, not negligible. I mean, that's negligible. And the problem is not really losing performance, because in the original design, you're getting the same placement and nothing's changing. You're losing performance compared to the case that there are no other peripherals and you load the tools, there's a placement for you, and there is no other modules around your module. Okay. This is a better visualization. Okay. So we have a standalone AES core, just an AES core placed like that. And then we add the communication interface and the debugging interface through that. And here I explain to you why this happens, and I have a correction for that. Here we are enforcing the placement constraint. So we would like the red region to be exactly like here while everything else is placed around that. So if you look at this area, we get a -- there is a shadow in that region. Oops. Sorry. There is a shadow right there. So the problem is that whenever you reimplement and you resynthesize your design, the name of the components might change. The tool has a random number generator that for certain memory components it assigns a new name every time you do recompile. So the names of certain components here change, and I tell it to, okay, lock this component at this location, and the tool tells me there is not such a component. So if you see these components are being locked because their names haven't changed, but the ones whose names have changed, they're not locked down and they're placed -- scattered all around the place. So I tell you a solution for that. So using a Smartguide, not much similarity in the placement. No constraint. It's moved to another place. So okay. These two figures are the same as before to the left. Now, I did a trick. And I say, okay, the names of these components are changing, but the rectangle around these are not changing. I extract the rectangle and I enforce the rectangular area around -- inside of each component. And you can do both. You can do -- so this component's name's changing, but the rectangular area is not changing. So combining these two, you can get a pretty close placement of the same module. >>: So the whole point of doing all this is so that you don't have your debug stuff interfere with your original. >> Mehrdad Majzoobi: Right. >>: But you don't want the synthesis to almost replace and introduce new bugs. >> Mehrdad Majzoobi: Right, right. Exactly. Okay. So the communication interface that we're using is SIRC Ethernet designed by Ken. It's a very handy interface. You have a bunch of C files that does the software interface and Verilog files. And the probe insertion, most of the tools are Python based, most of the parts. The Verilog parsing is being done with a Pyparsing Package. PSL to Verilog uses a Ply Python Lex/Yacc parser. And then we have an xdl2pcf. PCF is the physical constraint file, which is equivalence of UCF. But it's for after implementation, after -- sorry, after mapping, before doing place and route, but UCF is in the synthesis in mapping. Okay. You're back into the flow diagram. It's the tool for how things work. Just to wrap things up. So we have the original design simulated, get the output traces, mining the specs from the original design, and then also you provide the user specs. Then these are being synthesized to state machines. And at the same time, after -- we extract the probing information, insert the probing into the original design, and we implement the hardware assertion checker core, insert a probe, and forcing the constraint we have from the previous round, we've collected from before, and then we do the debugging through the SIRC interface. Okay. I'm going to give a quick demo on the interface for you. I have -- I'm debugging, just for an example, an AES core. We've added a bunch of PSL assertions. And we -- like I demonstrate a debugging scenario and also I run the spec mining tool for you in the placement extraction. I show you how PLL attributes are being generated for a given frequency. So let's move on to those things. Before I finish, I want to thank my mentor, Neil Pittman, for his support, and I had a really good time here, and Sandro for his wise showing the way through the project. And also we have used a lot of work from previous interns. This is actually a joint work with Kenneth McMillan and Wenchao Li. So remoting to -- okay. Remoting to my desktop, I have an FPGA board, and I have to visualize this for you. I couldn't bring everything. It would have made a better impression if I had like hardware running here. But so I have the hardware running on my desk connected to my computer. And this is the project that I have here. Okay. So we are debugging an AES core. And I have these sets of assertions here. Everything in PSL language. So this is an example of how you can write assertions in PSL. So it starts with defining the atomic expressions. The atomic expressions evaluate to 0 or 1 at any time instance. So this is saying that the atomic expression, W0, equals to this. Whenever I send a command request, eventually it has to be acknowledged. So it's more like a transaction-level assertion. Every request has to be eventually acknowledged within a time window that I define in the software. I show you that time window too. The second one says if -- okay. In the AES core, I read 48 bytes and then decrypt or encrypt those 48 bytes and then I spit out 48 bytes. So this says wait after you have read 48 bytes. So this says after AES count equals 48 wait command, this is the next thing that should happen. You have to wait. The wait command has to be issued. So there's another assertion. The next one says AES in data valid has to stay high until you have read 48 bytes. So it shouldn't go down if you haven't finished reading those things. So this is working until assertion. W1 has to stay high until W2 goes high. So this also has a notion of time. Because until -- in an until event, there's a notion of time. The assertion can fail before the time has expired, but it may wait until the end of it too. So in the next assertion there's no notion of time. Again, we have another next assertion which says if the [inaudible] is full, do not -- the next thing should happen is that it stop requesting things. These are more like transaction-level assertions. The next one is if I start reading the data, the core has to finally spit out something. You can't just keep reading things and not give an output. So it's like if the data is -- data invalid is one, eventually the data out should be valid. So, again, another eventually. So you can write more complex PSL properties. This is just for the sake of demonstration. It can be a chain of different kind of logic, so we can have eventually this event chained by some other thing. So this is very simple assertions that I wrote for demonstrating. So giving the assertions to the tool that I show, that I will show, the tool spits out the state machine for you. So this is a state machine that I get from -- after synthesis. And then the state machine is connected to the signals that I'm probing for monitoring their behavior into the tool. Okay. So now in the -- this was software side. So I've already programmed the FPGA running the code. So when I run this thing -- bad news. Okay. I think I'm running an outdated version here. Okay. Okay. I see what's wrong. Okay. I'm going to disable all the verification units in the code. So being that I enable 0, everything is disabled. Sorry. Okay. So this says assertion 0 was violated at clock cycle 0. Makes sense. Nothing is being checked. And now I'm going to check assertion 16. Oh, well, assertion -- this is binary code. The last assertion, assertion 5. Okay. Assertion 16 was violated at clock cycle 23. So what is -- okay. 16 is like 10000. It's like a binary code. The last assertion is being violated at clock 23. What was the last assertion? The assertion was that if the data is valid, eventually the tool has to spit out something. Eventually has a time scope, right, a time notion. How long I'm waiting, what is the scope set here. It's set to 20. So I wait 20 clock cycle, I repeat checking that assertion between every 20 clock cycle. Maybe that time is not enough for the AES core to process that data. Let me increase that to, say, 22. 25. It is still failing. Was violated at clock cycle 28. But it's still -- I added five more clock cycle. It was 23; now it's 28. So let me do a hundred. I'm going to wait 100 clock cycle. Okay. It didn't fail this time. The assertion 0 was -- so it's like I'm changing the time and scope and then checking if that assertion failed or not. So I can do the same thing with other assertions here. Also, I was talking about parametric assertions. Let me see. Here if -- there are certain ways you can parameterize your assertion. For example, I can pass on value 48 from software to the assertion. So I don't have to resynthesize these assertions every time I want to change the parameters. So in certain cases you can parameterize the assertion, so you can pass on these values to the assertions, to the software. So you can add more flexibility to the tool. Okay. Let me show you the PLO thing real quick. Okay. Somebody tells me a frequency they would like to synthesize, something between 3 to 600 with any precision, 600 megahertz. I'm going to put it in there. >>: [inaudible]. >> Mehrdad Majzoobi: Okay. You can be more accurate. >>: [inaudible] >> Mehrdad Majzoobi: Sorry. [inaudible] frequency. My software interface is still -- I'm looking for it. Okay. What was that frequency? >>: 173 [inaudible]. >> Mehrdad Majzoobi: 173.1234. Okay. >>: Is that megahertz, or is that ->> Mehrdad Majzoobi: Megahertz. Okay. It takes a little while because, okay, we have too many options here. So let me -- okay. Okay. So the first setting, the second PLL outputs a frequency of 1783125. 123 -- you've got 123 here. So with this one optimal frequency was 123, the settings are for the second PLL M97D103, and for the second PLL -- for the first PLL N34D50O127. So I'm training out things with low precision. Probably you got that 7247, but it's not because I'm using [inaudible] in the printing without formatting the string. I'm not -- you're not seeing that thing. But so it gives you a list of frequencies that are within a given tolerance window. I've specified that, and then you can choose the closest and so on. Okay. >>: [inaudible]. >> Mehrdad Majzoobi: Well, no, so the tool is still not able to do dynamic reconfiguration. This is the next thing we want to do to -- after we get these numbers, you send it to FPGA, and then you reconfigure those two PLLs. And then you can get those kind of timing information from the [inaudible]. >>: So at the moment it's just ->> Mehrdad Majzoobi: I'm just generating the coefficients. Yeah. The problem with the Virtex-5 is that the information to do dynamic port configuration of the PLL is not well documented. But that information exists for Virtex-6. So, yeah, a part is still on debate. So what else do we have? Okay. Spec mine, let's go to spec mine, mining a specification from -- I think I had it open. Okay. So this is how you run the spec mine. So you give it a -you give it a VCD dump, and then it outputs the specs for you. So it's going to run this here. The VCD is the output you get from simulation tools, usually models, and it stands for vector charge dump? No. >> [inaudible]. >> Mehrdad Majzoobi: Yeah. Okay. So these are the two files. I gave it AES that I got from running ModelSim, and then the specs. Let's look at the specs. Okay. So this says the number of patterns before merging. The number of events, 31, the lengths of traces, 3,000 something. The number of patterns after merging after training. So this is one pattern. This A stands for alternating pattern. So this pattern alternates. So we have every time the address is 5, next we have 6, so we have this 5-6 in alternation. So whenever you have 5, you have 7, 5 you have 8. So the tool is basically saying this address is incrementing in a verbose way. So that's how the model formal methods can come handy. They can look at these things, say, okay, they're all saying the same thing mathematically, right, so I can do another reduction on this. The downside is that if you want to directly synthesize these assertions, you're going to have a lot of redundancy in the hardware. So if you had unlimited space, then that wouldn't be any problem. So what is F is eventually. So this says address equal 4 and address equal 4, if this happened eventually address equal 3, address equal 3 should happen. So these are like eventually patterns. And so on. So they're written in linear -- the same kind of thing, linear time logic. >>: And this is constrained just by what you simulate it to? >> Mehrdad Majzoobi: Right, right. So the longer you have traces ->>: [inaudible] never been inserted. >> Mehrdad Majzoobi: What is that? >>: Like if it says -- I mean, you cannot get an assertion. So, for instance, you basically can't capture it. You get some stuff, but you can't -- you don't get a lot of things. >> Mehrdad Majzoobi: Yeah, the longer the simulation traces, the more, say, confidence you would have in your patterns, and you would probably get more meaningful patterns than running the short traces. >>: But that would never give you any false negatives, right? So it will never -- it will never give you a case where the assertion was actually not right. >>: You have lots of false negatives [inaudible]. >>: It's not going to say that it's an assertion but it's wrong. >> Mehrdad Majzoobi: Right. It just gives you a bunch of assertions. That might not be a correct assertion. But that is a very dangerous thing, but it might happen. >>: Because isn't it just looking at the physical ->> Mehrdad Majzoobi: No, it's a form closer. >>: Well, it's looking at the physical signals, but it's saying this is what [inaudible] and this is what it turned out to be kind of thing, right? It's generating assertions from actually what happened, so you couldn't get anything that was [inaudible]. >>: But, no, you can get false positives [inaudible]. >> Mehrdad Majzoobi: Right, right, right. So if you have bugs in your traces, the learning is also done on the buggy traces. So you ->>: I see what you mean. Right, right, right. Yeah. >> Mehrdad Majzoobi: But we are hoping ->>: [inaudible]. >> Mehrdad Majzoobi: It's like the assumption is that those are not too often that affect the patterns. So you have something that happens frequently, then it affect your learning, you're learning the wrong behavior. You know, it's like a kid. >>: Or could be like [inaudible] does the right thing 99 percent of the time, but 1 percent of the time does the wrong thing, so I can pull that out. >>: Right, right. So if you're -- I mean, if your code was really bad, then you would be full of assertions that were wrong. >>: Right. >>: Okay. >>: That's why we say that you won't have a somewhat confident [inaudible] 99 percent right. >>: So it's not -- I mean, it's really not just like code, push button, give me my assertions and then I put this into hardware and you say push button and you're going to help me figure it all ->>: You're assuming that you're mostly there. >>: Yeah, okay. That makes sense. >> Mehrdad Majzoobi: How much time do we have? Show a few quick things and we can wrap up. So the other tool is the xdl2pcf. So after you do your design, Xilinx at the end of all of the processes that Xilinx spits out a file with the format XDL, which stands for Xilinx Description Language. So it's not a -- it's a closed language, but it's easy to understand. It has a very simple syntax. And it tells you -- it's the net list of the FPGA, how things are being connected in the FPGA, how things are being placed. So, for example, I'm going to open up one XDL file here, show it to you. Okay. Here it is. So this is, for example, for the AES core, the tool spits out this file. Which this is like -- for example, this says this is an I/O buffer placed at -- this is an I/O buffer placed at this location, and it's configured this way. This parameter is set to off, this parameter off, and so on. And it's also -- the I/O standard is set to this very low-level net list of the FPGA. So what we are processing this for is to extract the placement of the components with given names. So the script takes this file as the input and then outputs a PCF file. So this script is finished running. So this is the PCF constraint file, which was generated from that. So it tells the tool to place this component at this site and so on. So you see these names? These are things that change. They're being generate with a random number generator. So the rest, which look normal, this is, for example, a name that is not changing. But this one changes and so on. Anyways, for the P2V, let me show the synthesis tool as well. Okay. So for P2V, we write the assertions here. So this is, for example, first assertion that we had there, AES byte counts 48, the same assertions that we had in the Xilinx tool that I showed. So okay. Okay. So we run this. Okay. So we say my P2V, run this, and then it synthesizes into AES underline views. Go look at that. So these are the state machines generated from these assertions. So currently we have pieces of these tools running. We still need some work to put them all together. So we isolate the designer from having to deal with what's going on inside. As I said, the goal is you provide your design. And if you're really willing, you can write your own assertions. If not, the tool can mine it for you. And the rest has been taken care of. And then you sit in front of your computer, and then you run on your interface and check the assertions that are -- they're in the tool. The vision is that the tool also generates the C codes, parts of the C code that you can insert them as header file that, you know, has the parameter set for those assertions and so on, so you don't have to write those parts in the C code. That would be the flow of the things that we have, and hopefully we get things together soon into something that people start using. Any question? All right. Thanks a lot. I appreciate your time. [applause]