>> Ray Bittner: Good morning. My name is Ray Bittner. I have the pleasure to introduce Don Matson today. He's a local Xilinx field applications engineer. He's offered to do a series of three talks for us on the newest Xilinx tools, and today we'll be talking about Virtex-6 and Spartan-6. Don? >> Don Matson: Thank you, Ray. Good morning. And as Ray said, I'm going to be talking about Virtex-6 and Spartan-6. These are our newest families. I'll start by talking about Spartan-6 and then I'll cover Virtex-6, and at the end I'll have a little bit of a summary. Go ahead if you do have questions. We'll try and make this a little bit interactive in case I miss something. So with that, let's get started with Spartan-6. And normally when I talk about Spartan-6 people say, What happened to Spartan-4 and what happened to Spartan-5? And you can see I give a lot of responses, and some people think that our marketing department couldn't count, some think it's just because it's better than Spartan-3, others because Xilinx wanted to highlight the commonality, and then the final choice was it is the sixth generation of Spartan devices. So if you count generations and you go back, this is really the sixth generation of Spartan devices, and Spartan devices have always been optimized to deliver a balance of cost, power, and performance. And probably originally, the original ones were more cost and performance, and as we've gotten into these later architectures, power has become a bigger issue. So that is Spartan-6. And as I go through here, you will see, for those who have done FPGA work, that Spartan-6 and Virtex-6 are really derivatives of Virtex-5. So that's part of the reason for calling it Spartan-6 as well. So Spartan-6 is on a 45-nanometer process. It is a low-power process. For those who have used our FPGAs, we've always used what would be considered either the general purpose or the high-performance process before. 
So this is the first FPGA that we've done using the low-power process. You'll see that we're going to offer a couple of different platforms, an LX series device and an LXT series device. And the T stands for transceivers, which means our gigabit transceivers are in there. And then you can see we have a rollout in mid 2009 of those devices. And I'll talk, when I get to the family chart, a little bit more about what our rollout schedule is for hardware and software and documentation. So that said, I did say it is a 45-nanometer process. And one of the things you will see is, as we are rolling to 45 here, you will see that there are some significant advantages just from the power standpoint in here, as well -- let's see. We'll just move -- so Spartan devices as opposed to Virtex devices. Spartan devices are really aimed at the high-volume market. That's what we're targeting. And you can see that in this slide I'm showing, we've had some pretty good growth or very good growth in our Spartan families over this last decade. And we're also going to talk a little bit about what we're doing in packaging. So just to make sure everybody's aware, Spartan is what we're aiming at the high-volume market and the Virtex is for the guys who are really after some significant performance. And so that's sort of how we're positioning the two families. That said, here's sort of what we see as the market needs for that high volume. So we need to minimize application cost. We do that by doing things to minimize the power supplies you need. So in the Spartan family you need a VCCINT supply to power the core, and in Spartan-6 that's 1.2 volts, and a VCCO supply for your I/Os, and that can be anything from 3.3 volts all the way down to 1.2 volts. In addition to that, there's a VCCAUX supply so that certain I/O standards and certain things have a dedicated supply. That supply can be either 2.5 volts or 3.3 volts. 
So if I need just a 3.3-volt system, I only need to supply two power supplies, one for VCCO and VCCAUX at 3.3 and then a supply for VCCINT. So that helps with the cost. The other thing we'll talk about is how do we address getting additional bandwidth. In particular with the Spartan device, it isn't so much how much I can run inside the device or the speed that I can run inside the device, it's more about how can I get things onto the chip. And that's been the main emphasis in improving performance on Spartan-6. And then when we get to the slides on power you'll see where we've made significant improvements there, and then when we look at the device family, you'll see that it's more than 2x the size of our previous Spartan devices. So let's just talk a little bit about some of the things we're doing to meet low cost. First, when we look at the packages, what you'll see is we sort of break these into a few groups. So there's the old TQ package. That's for the guys who are doing prototypes who really -- you know, if I'm doing a flat panel display, I've got a lot of board area, and I want to minimize board layers, so having a TQ package makes sense. Then this is probably where most of our customers are at, the 1-millimeter ball pitch packages. And then we have some chip scale packages here, and those are the 0.8-millimeter ball pitch. And there is some additional packaging coming on Spartan, particularly to put more logic into a small space. But these are the initial packages that we'll roll out. In addition to packaging, some things that we can -- yes? >>: In the chip scale, is that like a bare die or is there a [inaudible]? >> Don Matson: The question was, in the chip scale package, is it a bare die? 
Chip scale package -- all of these packages are really wire bond packages, which -- I mean, this kind of shows it looking more like a flip chip, but really the die is upside down on the package and then there's wire bonds out and then it's got a lid on top of the package. So that's all of these devices. >>: [inaudible] the die is bonded to that metal cover of that [inaudible]. >> Don Matson: Yes. So the question was, is the thermal resistance -- is there a cover on top of the die. Yes, there is a cover on top of the die, and you can put the heat sink on top of that, and that does help with the thermal impedance. But if you compare these devices to the Virtex devices, all the Virtex devices are flip chip packages, and the majority of the heat is actually conducted out through the leads, and the Virtex devices offer a much higher thermal performance. So hopefully that answers that question. Also, to help with reducing system costs, we've started implementing a significant number of hard blocks inside the devices. And what this slide is showing is some of the blocks that are in the Spartan FPGA. So the first one that it shows is this SDRAM. And I'll talk about it a little bit more in a few slides. But the important thing there is we've actually made a hard controller and put it on the Spartan device, so that saves you logic. It also saves you development time because it's much easier to use. We introduced a PCI Express Endpoint in Virtex-5. In Virtex-5 it can be a by 1, by 2, by 4 or by 8 Gen 1 PCI Express endpoint, and in Spartan-6 it is a by 1 Gen 1 endpoint, so 2.5 gig. In addition, we have put the DSP blocks in here. If people are doing some processing, the original Virtex-II Pros put in multipliers, hard multipliers, 18-by-18s. In Virtex-4 we introduced a DSP, what we call the DSP48. It's basically a multiplier followed by an accumulator. In Spartan-6 and in Virtex-6 that multiplier accumulator structure will have a preadder in front of it as well. 
So if you think about classical FIR filters, FFTs or even -- yeah, or those two applications, it's often quite nice to have that preadder. So it's a hard silicon block and it cascades. And I'll talk a little bit more about it when we get to Virtex-6. But that is in Spartan-6 as well. There are some differences between them, but both devices have the multiplier and the accumulator in there. And they have multiples in there. So another thing we've done in Spartan land to make things easier, and we actually do this in Virtex as well, is we've made it easier to configure with commodity SPI PROM or flash. We started making that switch a few years ago. We now support SPI PROMs from multiple vendors. Although you might not want to buy them from Spansion right now. They're not doing so well. But they have added capabilities now. They've gone not only by one, but by two, by four. We've put some intelligence into the Spartan FPGA, so if you tell it to configure with an SPI PROM, you no longer need to give it those variant select or vendor select pins. You don't have to set those because we're going to interrogate the SPI PROM and determine who's out there and then do the appropriate things to pull the bitstream off. The other thing we're showing here is that you can put multiple images in that flash and have the Spartan device boot from one image and then have it do some checking and then boot from a second image if you wanted as well. So those things have been worked on for a while now, and we've made some improvements there. The next area I wanted to talk about is the I/Os, what have we done in Spartan-6 I/Os to allow us to get higher speeds. And you'll see we talk about 1.0 gig per second. 
In the general purpose LVDS I/Os, those who have used high-speed LVDS know that the challenge usually lies not in capturing the signal at the first register, but how do I get it from that first register, whatever speed it's in, to a parallel form to get it into the rest of my fabric to run it. And I'll talk a little bit about that. And then, of course, we've talked about the gigabit transceiver, so I'll talk about that. Just so you're aware, here are all the memory interfaces that are supported by Spartan, and just -- these last ones here in Spartan-6 are supported in the devices that are LXT, where the T stands for transceivers. So those are ones that are supported with the transceivers. The general purpose I/Os look like this, and I want to make sure everybody is aware that our general purpose I/Os are fully 3-volt tolerant. So if you've used Spartan before, you know that Spartan-3A has an ability to do hot socketing or be in a hot swap application and withstand full 3-volt I/Os, and we're doing the same thing in Spartan-6. And I need to contrast that a little bit with Virtex-6 because in Virtex-6 the maximum I/O voltage it will withstand is 2.5 volts. And that was done to improve performance there. Those who have used either Virtex-4 or Virtex-5 will kind of recognize this block. What's off over here is my output buffer and over here is the input buffer. But over here I have this ILOGIC block, which is my deserializer. This is on every I/O pin. And then pins are sort of paired together, with the idea that if I want to bring in LVDS or some other differential signal, I need to have them come into a pair. And so these two are paired together, and this would be like the P and this would be the N input. And I can make this be the master deserializer and this would be the slave, and with that I can deserialize either one to two, one to three, one to four, five -- no, I can't do five -- yeah, I can do five, six, seven, eight. 
So any of those numbers. And that's how the deserializer works in combination. If I just bring it in single-ended, one to four is the maximum I can do. So somewhere we talk about DDR3. That's a single-ended standard, and that will come in at 400 megahertz. It's DDR, so four to one on that means that coming out of here, this deserializer will give me four bits at 200 megahertz going to the fabric, going to the hard endpoint, just so everybody's aware of that. And, likewise, that was the input logic. There's also a serializer over here to do the same thing going out, and it's trying to show that I have a data path, and then the tri-state enable also needs some sort of ability to serialize as well. So those are those two blocks. And then the last block here is the I/O delay. So in the Virtex-4 we introduced the I/O delay. This is a similar function. It allows me to programmably come in through this dynamic reconfiguration port and adjust where my input delay tap is. So that is probably the biggest change in the I/O for Spartan, and that will give us the ability to do that gigabit LVDS or do the high-speed memories. When I'm talking about I/Os, we talked a little bit about the transceiver block. This is just a high-level view of what we call the GTP in Virtex-5 or in Spartan-6. And this is the transceiver. It's the same transceiver used or -- I shouldn't say the same. It was ported from Virtex-5 to Spartan-6. So it was ported from a 65-nanometer process to a 45-nanometer process, which is one of the reasons we believe that we will have very good success. It's something that we've been producing now in volume for quite a while, and we're ready to make it available in our Spartan devices. There are a couple of differences that are important for the transceivers. And I don't know how many in here have used the transceivers, but in Virtex-5, those people who have used the gigabit transceivers know that our transceiver pairs are put in a tile. 
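The deserializer arithmetic mentioned earlier in this passage (DDR3 arriving at 400 megahertz, double data rate, deserialized four to one into four bits at 200 megahertz) can be checked with a short sketch. The function names below are illustrative only and are not part of any Xilinx tool or primitive; the LVDS line assumes the 1.0 gigabit-per-second figure quoted for the general purpose I/Os and the one-to-eight ratio available on a differential pair.

```python
def ddr_pin_rate_mbps(clock_mhz):
    """Double data rate: two bits per pin per clock cycle."""
    return 2 * clock_mhz


def fabric_side(pin_rate_mbps, ratio):
    """Return (parallel bits per pin, fabric clock in MHz) for a 1:ratio deserializer."""
    if ratio < 1:
        raise ValueError("deserialization ratio must be at least 1")
    return ratio, pin_rate_mbps / ratio


# DDR3 example from the talk: 400 MHz clock, DDR, single-ended, 1:4.
bits, mhz = fabric_side(ddr_pin_rate_mbps(400.0), 4)
print(f"DDR3 pin: {bits} bits at {mhz:.0f} MHz into the fabric")

# Assumed LVDS case: 1.0 Gb/s on a differential pair, 1:8 (master plus slave).
bits, mhz = fabric_side(1000.0, 8)
print(f"LVDS pair: {bits} bits at {mhz:.0f} MHz into the fabric")
```

The point of the calculation is that the fabric clock drops by the deserialization ratio while the parallel width grows by the same factor, so the total bit rate per pin is unchanged.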
A tile contains two transceivers, and in that tile of two transceivers there is one PLL, and so your two transceivers can work off of that one PLL if their clocking rates are related in some nice integer divider ratios. In Spartan-6 and in Virtex-6 we've changed the structure and we've added a PLL for the TX and a PLL for the RX. Now, the PLL on the RX path is optional, so if you're running PCI Express, you probably don't want to bother to turn it on. It's going to just burn a little extra power. But it does give me the ability to take PCI Express in on one channel, which is running at 2.5 gig, and then run the other channel or the other transceiver at something else, like a video HD-SDI rate. So that's our transceiver. I mentioned before that we've put in a hard memory controller. This hard memory controller is only in Spartan devices, and this memory controller has a controller interface to the memory that can be either 4, 8 or 16 bits. If I configure it as a 16-bit memory interface, that'll be the maximum I can do. On the user side I get six 32-bit programmable ports to the fabric. So if I had a MicroBlaze or a processor, I could have the processor hooked up to one of those ports. The ports can be either write-only, read-only or read/write, and it's a very simple FIFO type interface, so relatively easy to use. And this will make it easier for our customers to hook up to these high-performance memories and get the design up and running quickly. And then the last area I wanted to talk about is twice the capabilities, half the power. And I'm going to start here by looking at the fabric, and I want to contrast -- so this is where Spartan-3 is today. You'll see the 4 LUT followed by a register, and you'll see that Virtex-4 also had that same structure. 
Virtex-5 we went to a 6 LUT followed by a flip-flop, and then there was -- it's actually six inputs and two outputs, and that second output was just available to the fabric. And we've done some studies and we said, hey, for a lot of designs where we want to increase the performance, having that extra flip-flop here is a very easy thing to add, and so we've added that. It also gives us the advantage of when I make this LUT into a distributed memory, I now can put twice as much distributed memory in the same area. So a 32 deep distributed memory and two bits wide is what I can put into that area. Whereas, with Virtex-5 it's 32 by 1. So that's in both Virtex and Spartan. The B-RAM, we've actually made multiple changes there. Probably the biggest one is in Spartan devices, the ratio of logic to B-RAM was relatively low. So we've pretty much doubled the ratio of B-RAM to logic in Spartan-6 over Spartan-3. The other thing we've done is we've taken these 18Kb RAMS and we allow it to be fractured into two independent 9Kb RAMS. And a lot of times people didn't need the full depth of the 18Kb RAM, and so now you can get a better utilization out of that memory. And that's very much similar in Virtex-6, this primitive -- the base primitive is still a 36kb RAM, and it can be fractured into two 18kb RAMS. So that's one of the things. We've also done some enhancements to the block RAMS because we're interested in reducing power in both Virtex and Spartan. When we were designing them, we found some ways inside to reduce the power. And just so you know, in Spartan, the B-RAMS will run somewhere -- or if you use the full pipeline modes, 270, 300 megahertz, somewhere in there. 250 to 300 depending upon your device. And in Virtex that'll be 500 to 600 megahertz if you use the pipelining options. Okay. Clocking. So those of you who have used Virtex-5, this block should look very familiar. It's a PLL and two DCMs. Those are Spartan-3A DCMs coming over. 
So if you have Spartan code, your DCMs will come on over. We've added this PLL. The PLL has a VCO that's going to run somewhere between 500 megahertz and a gigahertz, and then you have five divider taps that you can set coming off of that. And that will go to your clocks. We've done a lot of work on the clocking in order to support the high-speed I/Os. There are some dedicated clocking paths for getting clocks from the PLL out to the I/Os as well. On this slide I just wanted to highlight, you know, what kind of performance boost do we get using Spartan-6 versus Spartan-3. So we just take the MicroBlaze and give you some relative performance numbers as we switch from a Spartan-3 architecture and just move it to Spartan-6. You can see we do get a nice little boost in performance. In general Spartan-6 will be, you know, a little better, a little faster than Spartan-3. I mean, we're not trying to really push the speed aspect on Spartan-6. However, we are doing everything we can to reduce the dynamic and static power, and so you can see we're showing roughly about 50 percent reduction in power. And this slide here, what we did is I took -- or we took the largest Spartan-3A device, it's a 3A -- what we call the DSP device, so it had twice the memory that the standard Spartan devices had. So it was roughly comparable to sort of a midrange Spartan-6 device. And if you just take this 3A device and compare it to here, you'll see that we -- just through process, we're getting roughly about a 50 percent power savings. Additional power savings are available if you move from a 1.2 volt core to a 1.0 volt core. So let me explain what we're doing. As we're designing these devices, we're designing them to do something we call voltage scaling so that the device can be run at either 1.2 volts or 1.0 volts. When we do voltage scaling on the Virtex-6, voltage scaling will allow the core to run at either 1.0 volt or .9. 
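As a rough illustration of why the lower core voltage saves power, the sketch below uses the standard first-order CMOS approximation that dynamic power scales with the square of the supply voltage at a fixed clock frequency and switched capacitance. That model is an assumption brought in for illustration; it is not a figure from the talk, it ignores static (leakage) power, and the function name is hypothetical.

```python
def relative_dynamic_power(v_new, v_ref):
    """Dynamic power at v_new relative to v_ref, assuming P is proportional to V**2
    with clock frequency and switched capacitance held constant."""
    if v_new <= 0 or v_ref <= 0:
        raise ValueError("voltages must be positive")
    return (v_new / v_ref) ** 2


# Core voltage options quoted in the talk.
for family, v_ref, v_new in [("Spartan-6", 1.2, 1.0), ("Virtex-6", 1.0, 0.9)]:
    ratio = relative_dynamic_power(v_new, v_ref)
    print(f"{family}: {v_ref} V -> {v_new} V gives about "
          f"{(1 - ratio) * 100:.0f}% less dynamic power under this model")
```

Under this model the voltage change alone accounts for only part of any quoted saving; process and architecture changes would have to supply the rest.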
And it's just an option that you order and you can get the device in there. Yes? >>: What are your projections for voltage levels for [inaudible]? >> Don Matson: Your question was what are our projections for voltage levels? >>: [inaudible] scale it down to .7 or are you plateauing at .9 or .8? >> Don Matson: The question was -- and I think what you're really asking is when we roll out the next generation beyond here at 32 nanometers, will we be scaling voltage. My expectation is that we would, but I have not -- I have not checked on that. I mean, I could certainly ask and find out. So this just shows you where we're at with our power. I wanted to show you some of the features that are added in power. So these are hardware features that have been added over the years. You know, the ability to stop the clock, the BUFGs have had that ability for quite a while or you could do it another way. Hibernate, the idea that, hey, I'm going to just power off the FPGA, we've added suspend, voltage scaling, and now when we go into suspend, we've added the ability to use some of the pins inside the device to wake up. So not all the pins go to sleep. So I could have an interrupt pin coming into the FPGA to wake it up. I don't have to have an external processor to wake it up. And it's a lower power platform. One of the things I should mention is we've done a lot of work on our software, and, you know, the initial software -- and I think we've been doing -- there are power options in MAP and PAR to reduce power, simply just clean-up routes, you know. Hey, what can I do to clean up this route? And, you know, we didn't get a whole lot of power savings. We did run 11.1 recently on a design, comparing it to 10.1 on a Virtex-5 LX110 device, and what we saw there was 13 percent improvement in performance -- or rather, in power reduction. And we kind of expect somewhere in that 10 percent range can be achieved through software alone. 
So I just wanted to mention that as well. And then the last thing is we're building these devices. People want to be able to test them. You want to get boards, you want to be able to build your system. We're working to not only build the base platform or the devices, we're rolling out boards to targeted areas, we're putting IP on those boards, putting reference designs, making those available to customers. We're trying to do as many things as we can to help you guys be able to do your designs quickly and efficiently. So that said, here's the family of devices, and the LX 16 device is -- we have got some devices back. We have an early sample of that. We have a demo board that we took to ESC Conference back in February. That's the first device. It will be generally available in the July timeframe. And then the next couple of devices are the LX 45 and the LX 45T, which is the August/September timeframe, and then the 150. Those three devices will be the first three devices sampling, and I think all three of those are scheduled to be in production at the end of the year. Yes? >>: [inaudible]. >> Don Matson: Volume prices -- the guy in the back right there, that's Michael Pierce [phonetic]. He's our sales rep, and he can get you some guidelines as to what kind of pricing to expect with volumes. And it's a question of volumes and time. But as an FAE, they don't like me to talk about that. So I haven't mentioned anything about pricing, and I don't know if you've looked, you know, at a Spartan-3. Obviously this Spartan-6 will give us the ability to put a lot more transistors, a lot more logic cells in the same area. So there will be some price reductions compared to Spartan-3. Yes? >>: Do the blocks save me power on memory [inaudible]. >> Don Matson: So I think there's two questions there from Sandra [phonetic]. The first one, hard blocks, do they save you power. 
And the answer is yes, hard blocks always save power over the embedded or over a soft block. Simply because it's dedicated routes, it can be built a lot smaller. And typically we only do hard blocks on pieces of IP that we see that there's significant savings in doing that. I have numbers on how much power a DSP block or I could -- I could do a FIR filter with or without it. The numbers are fairly dramatic. PCI Express would be an interesting one to do. I mean, it would be easy to do a hard versus soft. >>: [inaudible] SDRAM is a lot less hot than [inaudible]. >> Don Matson: So I think your question on -- so on the memory, I think a better way to -- or how I would like to think about it is the memory controller inside there should save you a fair amount of power. We made sure it addresses three particular standards, and I mentioned a couple of them. DDR3, DDR2, and the other one that we get a fair amount of request for is low-powered DDR simply because it's a low power, or mobile DDR is the other name that it's called. And so I think we'll be able to demonstrate significant savings, and when we've got boards, we'll get you some of those numbers. The other question was ICAP, and I'm not exactly sure what the question is. I can guess, and I think I'll take a stab at that. So obviously there's an ICAP in Spartan devices. There has been for a number of years. The area that some people are most interested in ICAP for is reconfiguration. And we've always said don't bother with reconfiguration in Spartan devices because there was a potential memory glitch when doing -- rewriting a logic cell from -- back to its same value. With Spartan-6 there is no memory glitch. The devices are designed for partial reconfiguration. So that is in Spartan-6. Yes? >>: I think you said that there is no hard block for DDR in the Virtex-6, right? >> Don Matson: Yes. >>: So you would still be using MIG? >> Don Matson: You would still be using MIG to do the memory controllers in Virtex. 
And the reason for that is -- with the Spartan class device, I guess, actually, it's easy to say, hey, if I can put a 16-bit-wide DDR3 out there, I probably can cover the memory bandwidth that most applications need. With Virtex devices I need the ability to tailor it a lot more, and so with Virtex devices we give the user the ability to customize it, and the fabric is fast enough to allow you to do that. The last slide here on Spartan -- actually, I've got a couple reference slides I'll throw in as well. Really the last slide here is on the timeline, and I don't have an equivalent slide for Virtex, so I'm going to comment about Virtex because it's almost identical. So documentation -- early access documentation to both Virtex-6 and Spartan-6 was opened up the end of last year, so first of this year. And so if you go out to the web, you go to xilinx.com/6, you'll go to the Spartan-6 Virtex-6 home page, and you'll find there's an eight-page overview brief that pretty much covers all the information that I'm going to be covering today for both Spartan and Virtex. But in the early access document lounges there's well over a thousand pages of documentation on the major features of Spartan-6 and Virtex-6. And so if somebody needs access to it, you know, my contact information was on the first slide. It's don.matson@xilinx.com, and we can work to get you early access documentation. Early access software happens the end of this month. So there will be a limited number of customers with that. What I don't have on here -- well, I guess generally yes. Third quarter, just so everybody's aware, it's actually -- July will be the release. So 1st of July you should be able to target full Spartan-6 and Virtex-6. And as I said before, it's the July timeframe when the first Spartan device, the 16, will be out in general sampling, and it's the May timeframe for the first Virtex-6 device. So that's kind of the timeline we're on. 
I put these next two slides in here. These are just references for people who have used Spartan-3A just to compare Spartan-3A on logic cells to Spartan-6. And so you can see that we've got significantly larger devices, and this just gives people a way to compare. And I also broke down the comparison on this slide and I said compared to number of block RAM bits. So this is just looking at block RAM bits, and so you can see that there's a lot more memory in these devices. Yes? >>: [inaudible] with an embedded flash? Are you going to do something like that for this one? >> Don Matson: Yes. So the question was is we do have some Spartan devices out there, Spartan-3AN, that have an embedded flash available. And will we do something with Spartan-6? I think that the devices have been designed to allow us to put additional devices on there if we wanted. So we could put a flash, we could put a DDR3 out there. Our goal, though, right now is getting Spartan-6 and Virtex-6 released on schedule. And also there's another thing. ISE software is ongoing. There's a lot of work being done there. So that's our primary focus, and we're investigating whether we should. So just showing here a road map, you see that we do have our Spartan-3A family out now. We will be releasing Spartan-6 early -- in the middle part of this year, and then the devices with the transceivers, and you'll see that there's also something coming in 45 nanometers we call Dragonfire. I think there will be an architectural announcement at the end of this month or early next month on that. So with that, I'd like to transition from Spartan to Virtex-6. Are there any other questions on Spartan before I go to Virtex? Okay. So in Virtex land the care-abouts from our customers are a little bit different. Performance is probably the number one or is the number one concern, but power is not too far behind. So much like Spartan, our customers are caring about power. 
They do care about cost, so we've done a number of things in Virtex to make it easier for people to lay out boards. So the Spartan devices, the packages were wire bond packages. In Virtex they are all flip chip packages. We've done an extensive amount of engineering of the package to support the high-speed transceivers. We've also put all of the bypass -- not all of them -- the majority, vast majority of the bypass capacitance you need for the devices on the flip chip devices in Virtex land, and that helps to reduce cost. The other thing we've done is we've done as much as we can to keep the number of power supplies to a minimum. So those are some of the things that we've been doing there. When I talk about Virtex, you know that we've had multiple families in Virtex, and in Virtex-6 we have the LXT family and the SXT family that we've announced, and we will be announcing something later this year and actually sampling something with the high-speed transceivers at the end of the year. So that's where we're at. Let's just talk about performance. Compared to Virtex-5, the fabric should be about one speed grade faster. The transceivers in Virtex-5 were basically 3.2 gig. They're now -- all base transceivers are 6.5 gig, so roughly twice as fast. We've done some things on the I/O to allow us to run at a higher speed. And you can see that we're going to be able to run DDR3 at 1066, and we've done a number of little things in the clocking. It says clocking is 10 percent faster overall, but there's a number of things that have been done in clocking to really help those guys doing high-performance designs. And the DSP, those people using DSP blocks, you'll see that even in all the devices, there's a lot more DSP resources available, and so we'll talk about that. This slide looks familiar so we won't say much about it other than the fabric is the same as the Spartan-6 and it's a derivative from Virtex-5. Block RAMS, much the same story. 
We've done some enhancements to improve the FIFO performance so you know there is a dedicated hard block FIFO in that 36k B-RAM. And so you have one FIFO or you can split it and use it as two independent B-RAMS. And we've also added the in -- we added error correction in. If you use the FIFO as -- or not the FIFO -- if you use the block RAM as a 72-bit-wide data path or 64-bit data path with the 8-bit extra for correction, there is an ECC block available on all of the B-RAMS in Virtex. In addition to that, we've added an extra capability on that ECC logic to allow you to inject errors to test it, verify that it's working. So it's sort of our second generation of ECC logic for the B-RAMS. As I mentioned before, we've put in some new performance paths on the clocks. The other thing we've done is we've added midpoint buffering to reduce global skew. If you've used the regional clock buffers, you know that in Virtex-5 they're limited to 250 megahertz. They're now -- those regional clock buffers have been changed to be differential clock buffers to reduce any jitter or noise that they might see, and their performance has been increased to the 500 megahertz range, and so that has happened as well. We have up to 18 mixed-mode clock managers. The clock managers are all PLL based in here. All pins have the I/O delay. So you do have the ability to still do I/O delay. The refinements in the I/O delay are that we've done a number of things to reduce the power and also to increase the accuracy of the tap delays, and we have reduced the total number of taps. So in Virtex-5 you could have 64 taps, in Virtex-6 you're down to 32 taps, the idea being that for higher speeds I don't need that really long tap delay, and that costs me a lot of power, area and jitter performance. So that's why we've made that change. As we look at the DSP blocks -- so this is showing the DSP slice. As I mentioned before, the DSP slice is similar in Spartan-6 to what you see here. 
So what I'm showing is one slice. You'll see that I have an A and a B input. And I have a preadder coming in here, so I could have a coefficient coming in here, I could have -- for a FIR filter, I could have my coefficients wrapped around, so I come in here, and adding and then multiplying. And you'll see that that multiplier is an 18-by-25 in Spartan -- or in Virtex-6, much like it was in Virtex-5. So that bigger multiplier allows more dynamic range. The accumulator is still a 48-bit accumulator in both Spartan and Virtex. And then the other thing I'm showing is most of our DSP software takes advantage of the fact that I can run a cascade of these really nicely to greatly reduce power by using the direct route and keep performance up in that high range. So it's 500 megahertz for dash 1; 550, dash 2; 600 for dash 3. And then, as I mentioned before, there's roughly twice as many as there was before. Now, all of the Virtex devices have a built-in PCI Express block, and they all have some built-in Ethernet MAC blocks as well. And those Ethernets can be 10, 100, or 1 gig. And the PCI Express block here, since the transceivers support 6.5 gig or up to 6.5, the transceivers are all -- or I should say the PCI Express block can be either Gen 1 or Gen 2, and it can support up to four lanes Gen 1 endpoint, and it has the ability or the hooks necessary to allow you to hook to an endpoint as a downstream device from you. So that capability is new. As I mentioned before, the memory performance has been boosted. So we now support 1066 for DDR3 and we also support 1.4 gig for the LVDS I/O. And the rest of that should look very familiar to those who have done Virtex-5. So sort of as I mentioned before, our transceivers will be GTX on the base devices, and then we'll be introducing a device later called HXT devices with the other transceiver. So much like in Spartan, we have a lot of hard IP in Virtex. 
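The DSP slice structure described above (a preadder feeding an 18-by-25 multiplier, followed by a 48-bit accumulator) can be sketched as a small bit-width model. This is only a behavioral sketch under stated assumptions: the 25-bit preadder width, the wrap-around overflow behavior, and the function names are not details given in the talk. It shows why the preadder helps a symmetric FIR filter: two samples that share a coefficient are added first, so each multiplier covers two taps.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def dsp_slice(acc, a, d, b):
    """One multiply-accumulate step: acc + (a + d) * b.

    Assumed widths: 25-bit preadder result, 18-bit coefficient,
    48-bit accumulator. Overflow wraps around.
    """
    pre = wrap_signed(a + d, 25)
    product = pre * wrap_signed(b, 18)
    return wrap_signed(acc + product, 48)


def symmetric_fir(samples, half_coeffs):
    """Symmetric FIR with 2 * len(half_coeffs) taps.

    Each step models one slice in the cascade: the two samples that share
    a coefficient are pre-added, then multiplied and accumulated.
    """
    taps = 2 * len(half_coeffs)
    outputs = []
    for n in range(taps - 1, len(samples)):
        window = samples[n - taps + 1 : n + 1]
        acc = 0
        for k, coeff in enumerate(half_coeffs):
            acc = dsp_slice(acc, window[k], window[taps - 1 - k], coeff)
        outputs.append(acc)
    return outputs
```

With `half_coeffs = [1, 2]` the filter has the four coefficients 1, 2, 2, 1 but needs only two multiplies per output sample, which is the saving the preadder is there to provide.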
One of the blocks that I hadn't talked about before is there is a system monitor block on all Virtex-5 and Virtex-6 devices that allows you to do -- monitor voltages and temperatures internal to the device and also monitor voltages external to the chip if you want. From a power standpoint, what you can see is both the static and the dynamic power are reduced in Virtex-6 when compared to Virtex-5. And, you know, roughly we say it's going to be about 30 percent lower power if you're running at a 1.0 volt core. If you use the voltage scaling in Virtex, you'll get about 50 percent reduction. Now, one thing you should be aware of is if you look on your charts, you'll see that on the Spartan devices, the speed grades are dash 2 and 3 by default, and there's a dash 1L speed grade, and that dash 1L is -- that's the one that's 15 percent slower than the regular Spartan devices. In Virtex you have a dash 1, 2, and 3, and then there is also a dash 1L, and that dash 1L is roughly the same performance as the dash 1. So, you know, it's somewhere in the performance range of a dash 2 Virtex-5. So that's our story there on power. And we'll just move on to the slide here that shows all of the I/Os here and devices. So one thing to be aware of, as I mentioned, if you look at the number of DSP slices, it's much higher. And if you look at the DSP, you'll see that we have a couple of devices that have a significant amount of DSP elements and memory. So the smaller devices, because the DSP block is relatively small to add, we have a high number of DSP blocks, and as you go up larger we say, hey, if you really are interested in DSP, you want to go this way as opposed to going up on these devices, just so you're aware what's going on there. As I said, all devices here are flip chip and all of them have the I/O or the majority of the capacitance already on the chips. 
We've done some things to improve signal integrity of the packages, and we've been doing our design on the packages with the idea that the transceivers need to be capable of running 10 gig even though we're putting in lower-speed transceivers, and that's guiding our package development. I guess the last thing -- earlier I said that this device, the 240T, is the first device out, and you can see when that device samples, the first package will be the 1156 and the other two packages will follow shortly after. As those packages are available, we're able to at least put something in any footprint that somebody would want, with the exception of our friends who like to do big ASIC prototypes. We built a device specially for them, and that gives you the three quarters of a million flip flops. So that's a big one. That's what I had on Virtex. What I want to do now is just kind of summarize things by just showing you the structure and just kind of show what things are common between Virtex and Spartan and what isn't. So what's common in there is, you know, the LUT/CLB is the same, the block RAM structure. While the size of the block RAM isn't identical, the features are pretty much identical. It makes it easy for us to port. In the same fashion, the DSP slices are very similar. We have a lot of clocking resources in these devices. The I/O structures, as you saw with the I/O -- I-logic, O-logic and I/O delay -- are very similar. We have transceivers on all of them and we have PCI Express. As you look at this, though, you should see some things that aren't the same. And first off you'll see that this is a column-based structure, and so that means -- and I don't know that I show it here. This doesn't show it very well. There are actually three columns of I/O and then the fourth column of the transceivers over here. But this is your I/O ring structure in Spartan land. And that's done so that we can do wire bonding on there. 
As I said before, Spartan devices give you full 3.3-volt compatibility if you need that. And there's just some other minor differences. You know, you have a system monitor in Virtex, not in Spartan, and you have a tri-mode Ethernet MAC in Virtex as well. So that's our difference, and then -- so you can kind of see, you know, depending upon what you're after, you know, for the lowest cost and, you know, moderate performance, Spartan is the right choice. For the higher performance, you know, Virtex is the right choice. So that's pretty much what I have on the hardware. And like most things, I have to have a commercial here at the end. We are releasing our tools, the next version of our tools, the end of this month, so 11.1. I just wanted to mention that for you. We've been doing a lot of work on the incremental flows and partition-based flows with the idea that people need to be able to compile their designs in much faster time so you can get more turns per day, because you look at these devices and you scale what it takes -- might take to do a large Virtex-5 today, you realize that we have to do something different. So there's been a lot of work going on there. As I mentioned before, power, some things in the memory footprint. But I just wanted to let you know the ISE software, which is now called Integrated Design Suite because when we release it, everybody who has an ISE Suite will get -- what we call PlanAhead will be part of it, ChipScope will be part of it. So your only options that you can buy now are you can say, hey, I want to just do logic development or I want to do logic development and embedded development or I want to do logic development, DSP development, or I want all -- everything. But those are your choices. You no longer -- ChipScope is part of the package, I guess is the way to say it. ChipScope, PlanAhead, all those tools are part of it. 
So that's the changes coming, and I want to say thank you for your time, and if you have any questions, let me know. [applause] >> Don Matson: Yes? >>: [inaudible] improvement, I forgot to ask you what the date was for IDS. >> Don Matson: So, yeah. So IDS is out the end of this month. I hate to say the first of next month because April 1st isn't -- it's kind of an odd day to say, but -- so that's when it ships, and we have it available now internally. The big improvement is using -- is spending more time using incremental flows, I think, probably a little better. I don't know if you've tried that. >>: [inaudible] you mentioned a different word. >> Don Matson: So there's two different flows, and maybe we need to come in and talk about that. The partition-based flow says, hey, I've got a design, and if I look at it at the top level, I've got multiple blocks in there, and I want to -- and I want to set those as partitions, and as a partition I'm going to say, hey, I'm going to look at the compiled code, and, of course, I have to compile it. And I don't know -- is everybody using Xilinx compilers or Synplicity? But if you're using Xilinx compilers, then the compilers will keep it separate. Synplicity can do it through MultiPoint. And it looks and says, oh, that block hasn't changed, use the old NCD, use the information in that to keep that place and route the same, and then looks for the areas that are changed. There's a little more problem in that just with the partition flow than in the flow that I called incremental. Incremental says, hey, I'm just doing my design as I've been doing it now, which is basically it's compiled flat. I mean, all modules are compiled at once whether -- and then when I come in at that point, I look at my old, whole database from the previous compile and I compare it to the database I have on the new compile, looking for changes. 
And if I don't see changes, you know, I at least start with that placement, and that reduces the compile times pretty significantly. >>: [inaudible] in this performance improvement generally, is that equally across synthesis [inaudible]. >> Don Matson: Yeah, I should go back to that. You notice I didn't say those words. So, yeah -- so the question has to do with -- let me see if I can go back here -- to the very first bullet on this slide, average 2x faster runtime compiles. I don't know what they mean by average 2x faster. I can demonstrate some pretty good numbers by using incremental compile, but if I don't change my flow and I don't change my settings, I'm not going to see 2x improvement, you know. I might see, you know, 10 percent or something like that. But using incremental flow or partition-based flow, you can see dramatic speed-ups on your compiles. And that's some stuff I can help with. Yes? >>: [inaudible]. >> Don Matson: The question from Sandra [phonetic] was do we do parallel compiles. So this is another interesting topic. So if you look in Xilinx today, the old way of doing design was you ran through NGDBuild, but NGDBuild today, all the synthesis tools understand the underlying architecture, so NGDBuild is almost just really a translation from the synthesis database to the Xilinx database. It hardly does anything. Then the next step was MAP. And what you've seen over time is MAP has moved to a point where more and more time is being spent in MAP because the performance that you can get in PAR is dramatically impacted by what you do in MAP. So a couple things for you to be aware of. In Virtex-5 MAP -timing is the only way it runs. Same thing with Virtex-6. So you don't really have that option. So it's going to spend more time there. Then when you got into PAR, there was this idea that you could place and route it. Well, actually if you run MAP -timing, your design has already been placed and all you're doing is routing. 
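The incremental flow described in the earlier answer (compare the previous compile's database against the new one and start from the old placement wherever nothing changed) can be sketched in a few lines. This is a toy model and not how the Xilinx tools store or compare their databases: the per-block fingerprints, the placement strings, and the names `fingerprint` and `plan_incremental` are all hypothetical.

```python
import hashlib


def fingerprint(netlist_text):
    """Hash a block's netlist so that any change produces a new fingerprint."""
    return hashlib.sha256(netlist_text.encode("utf-8")).hexdigest()


def plan_incremental(previous, current):
    """Decide which blocks can start from their old placement.

    previous: dict of block name -> (fingerprint, placement) from the last run
    current:  dict of block name -> netlist text for the new run
    Returns (reused, to_place): placements carried over unchanged, and a
    sorted list of new or changed blocks that need fresh place and route.
    """
    reused = {}
    to_place = []
    for name, netlist in current.items():
        old = previous.get(name)
        if old is not None and old[0] == fingerprint(netlist):
            reused[name] = old[1]   # unchanged: keep the earlier placement
        else:
            to_place.append(name)   # new or modified: place it again
    return reused, sorted(to_place)


# Hypothetical example: one block unchanged, one edited, one newly added.
previous_run = {
    "uart": (fingerprint("uart v1"), "SLICE_X0Y0"),
    "fir": (fingerprint("fir v1"), "SLICE_X10Y4"),
}
current_run = {"uart": "uart v1", "fir": "fir v2", "dma": "dma v1"}
reused, to_place = plan_incremental(previous_run, current_run)
```

The compile-time saving comes from the `reused` set: the larger the share of the design that lands there, the less work is left for the placer and router.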
So there was this old thing called Multipass Place and Route, and some of you guys may have used it a long time ago. But if you think about it, now if I'm down there in PAR and I spew off a bunch of runs on Multipass Place and Route, I probably don't get that much advantage because the placement, some of the stuff I've done earlier, will impact the performance I get. So we have the ability now to do some exploration up front that says, hey, instead of doing Multipass Place and Route, which can use multiple servers or multiple CPUs, I can use multiple servers at that earlier stage. So that is there. We are working as well on, hey, I've just got a single compile, I know all my options, and I can take advantage of multiple CPUs. I don't know the performance advantage I'd get doing that at this point in time, but we are able to do that, and we are putting more and more resources, because clearly that's the way to attack the problem. Any other questions? >>: [inaudible] begins with F, does that mean Floating Point? >> Don Matson: What is this? >>: [inaudible]. >> Don Matson: Three slides back? Okay. Way back. I think I know what you're talking about. Let me not do it this way. Let's just say this. Your question really was on Floating Point, I think. And we do have support for floating point soft cores today. I don't know if there's plans to make a soft or a hard Floating Point block. It wouldn't surprise me. That's something I can take offline and we can do some checking on. Yes? >>: [inaudible] and such, there was -- one of the points was being able to actually -- having a shorter lead time in terms of [inaudible]. >> Don Matson: Uh-huh. >>: Other than the faster IDS, whatever, ISE tools, what are we doing to speed people up faster? >> Don Matson: That's a good question. 
So what we're doing to help people get there is, if you've looked -- and I know you guys have used some of our Xilinx development boards, and the past ones there was a Virtex board and Spartan boards, and there are a fair number of development boards. What we're doing is we're going to a development board strategy where the base development board has a couple of sockets on it, or connectors, and then we're going to build daughter boards for those connectors. And you saw some slides that show these different reference designs, and our idea is we're doing a lot of verification across different areas. So like video, for instance, we'll build a video daughter card. We'll actually do some development. I have a customer who's doing what I would call video connectivity, and we've got complete reference designs from Virtex-5, and we're continuing on with that strategy of not only are we giving you hardware, but we need to get either IP as reference designs or through Core Gen to our customers to help you do your designs, because there's a lot of common things that you might do that aren't critical. So like my customer who's doing video connectivity, so what they want to do is they want to take an HD-SDI stream coming in, they want to do some of their own processing and MPEG encoding on it. Well, we're not doing the MPEG encoding for them, but as far as they're concerned, they don't care to do the work necessary to take the data from the gigabit transceiver into their FPGA and get it into some data, you know, format into memory. And same thing, they don't want to do it on the other side. So we're providing that design for them. That's what we're talking about. Any other questions? Sure. >>: [inaudible]. >> Don Matson: That announcement is coming. So the question was have we said anything about future embedded hard processors. What I will say is we are committed to future embedded hard processors. 
We will be doing some stuff, and we'll be announcing stuff in the sixth generation architectures later this year on the embedded hard processors, and we are committed to continuing with embedded hard processors. Anything else? Thanks again. [applause]
