Series 7 Overview, part 1
Transcript
1- Hello and welcome to this recorded e-learning about the 7-Series device
architectures. This module is part one of three modules that gives you an
overview of the 7-series device family members.
Please note that, where applicable, we're going to make comparisons between the
7-Series products and the Virtex-6 and Spartan-6 FPGA families.
2- The course objectives <read slide>…
Please note that everything we cover here we are going to be covering in more
detail in later chapters.
3- This is a quick snapshot of the seven series subfamilies. The seven series
includes three subfamilies: Artix-7, Kintex-7, and Virtex-7. One of the most
exciting things about the 7 Series is that it's a truly unified FPGA architecture that
spans multiple FPGA subfamilies. This means that the architectures associated
with the device subfamilies are virtually identical. There are minor customizations
between the device families but the underlying CLB and array architecture is
virtually the same.
The homogeneous nature of the 7 Series has a lot of advantages to designers
because once a designer understands one device family it is easy to use the
other device subfamilies. So customers do not have to learn how to optimize
their design for different device families. This means that design migration is
easier than it ever has been before.
These 7 Series families are still designed to cater to different customer needs,
but still make design migration easy. The Artix-7 device is designed to be the
lowest cost subfamily. Kintex-7 is designed to be the industry's best combination
of price and high performance. Virtex-7 is designed to have the highest system
performance in the FPGA industry.
You can also note from this slide that as you migrate to the higher speed and
density devices, you increase the quantity of I/O pins, quantity of transceivers
available, transceiver performance, available block RAMs, and the amount of DSP
resources available.
As you can see from this chart each of these devices has a range of densities
that do somewhat overlap, but are separated significantly by device density.
Also note that the largest Virtex-7 device, the 2000, has over 2 million logic cells.
This is significantly larger than the largest Virtex-6 device, which had about
750,000 logic cells. That makes the 2000 almost 3 times the size of the largest
Virtex-6 device.
4- Although there are three subfamilies in the 7 Series, each of the subfamilies has
some variation. This slide shows you that the Virtex-7 devices have 3
subfamilies.
The Virtex-7 is designed for general logic purposes. It is our mainstream
device, which means that it has a moderate amount of logic, block RAM, and
slice resources.
The Virtex-7 XT is designed for DSP applications because it has the largest
amount of DSP slice resources and block RAM resources.
The Virtex-7 HXT is best for high-speed serial connectivity, because it has the
most Serial Gigabit Transceivers. This family also has several very high-speed
gigabit transceivers which the other families do not have.
Note that each of the sub-families has serial gigabit transceivers, even though
the Virtex-7 subfamily does not have a T in its name.
5- Each of the 7 Series devices has the same features that you see here. So the
block RAMs in Artix-7 are the same as they are in Kintex-7 and Virtex-7. All of
the slice resources are the same, all of the DSP slice resources are the same, etc.
This is different from the Spartan-6 and Virtex-6 device families, which are
significantly different from each other. By making this strategy change, Xilinx has
made it easier to migrate a design between subfamilies and improved the time to
market for a customer who wishes to do so.
You should note that each device may support different IO standards and that
not all of the Transceivers are identical. This will depend on the subfamily you
choose. We will talk more about this in another recording.
You can also see from this slide that each of these devices places the dedicated
hardware in separate columns with CLB logic dispersed throughout the device.
This has been used by Xilinx in other device families for years.
6- With the development of the 7 Series, there has been a very strong focus on
reducing power consumption. This slide shows you a number of those power
saving features.
First of all, it is important to note that 7-Series is not just a Virtex-6 device with a
die shrink. There are considerable features that have been modified to reduce
power consumption. For example, the 7-Series uses a different gate technology
that reduces the leakage current of the transistors. This translates to a significant
static power reduction.
A lot of these features are designed to reduce power consumption while
maintaining high performance. Virtex-7 performance is designed to be the same
or better than Virtex-6, while providing a substantial savings in power.
The fine-grained clock gating is a functionality that allows you to effectively turn
off a clock domain with a clock enable. This helps save dynamic power
consumption. This feature has been around with the Virtex-6 devices, but now
the implementation tools are making the most of its availability.
Just like with Virtex-6, Virtex-7 also has a lower device core voltage feature
(called –1L) which will save you considerable power at the expense of
performance.
There are also numerous changes to the I/O structures to reduce your I/O power
which is considerable in most applications.
Sorry I don’t have more time to cover all of these features, but if you refer to the
7 Series data sheet, it will describe in more detail all of the device families'
power-saving features. From an engineering standpoint it is quite interesting.
7- All of this means that the 7-Series FPGA supports the goal of 50% lower total
power compared to Virtex-6. A lot of this is accomplished with a 65% lower
static power enabled by the 28 nm high-performance process. But overall you
should see a 50% reduction in your power consumption. This means you can do
the same work in a package with a smaller heat sink, less air flow, or a smaller
power supply. Alternatively, you can use the same package and use a larger
FPGA that will allow you to pack twice as much logic into the device or run at a
higher system speed.
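To put the two percentages from this slide together, here is a rough sketch of the arithmetic. It assumes total power is simply static plus dynamic power, and the 2 W and 8 W figures are made-up values for a hypothetical Virtex-6 design, not numbers from the slide. It shows how much the dynamic power would also have to drop for a 65% static reduction to add up to the 50% total goal.

```python
def required_dynamic_reduction(static_w, dynamic_w,
                               total_reduction=0.50, static_reduction=0.65):
    """Fraction by which dynamic power must fall to reach the total-power goal.

    Assumes total power = static + dynamic, which is a simplification.
    """
    old_total = static_w + dynamic_w
    new_total = old_total * (1.0 - total_reduction)      # 50% lower total
    new_static = static_w * (1.0 - static_reduction)     # 65% lower static
    new_dynamic = new_total - new_static
    return 1.0 - new_dynamic / dynamic_w


if __name__ == "__main__":
    # Hypothetical Virtex-6 design: 2 W static, 8 W dynamic (illustrative only).
    needed = required_dynamic_reduction(2.0, 8.0)
    print(f"Dynamic power must drop by about {needed:.0%}")
```

The point of the sketch is that the static-power saving alone does not deliver the 50% figure for a design dominated by dynamic power; the other features on the previous slide have to contribute too.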
8- In summary…<read slide>
9- If you would like to see what other courses we offer, or what other Free RELs
are available, go to the Xilinx Education link you see here.
<read slide>
But whatever you do, please take a second and let us know what you thought of
this REL. Just click on this icon at the top of this page and tell us what you
think.
My name is Frank Nelson. You have been listening to an introduction to the 7
Series of Xilinx FPGAs. Thank you for listening and thanks for your business.
10- <nothing said>
Series 7 FPGA Overview, part 2
Transcript
1- Hello and welcome to this recorded e-learning about the 7-Series device
architectures. This module is part two of three modules.
Please note that, where applicable, we're going to make comparisons between the
7-Series products and the Virtex-6 and Spartan-6 FPGAs.
2- The course objectives <read slide>…
Please note that everything we cover here we are going to be covering in more
detail in later chapters.
3- Like Virtex-6, the 7-Series is built on the 4th generation of ASMBL architecture
(that is the Advanced Silicon Modular Block architecture). This means the device
is made up of separate columns of different dedicated hardware resources. This
includes clocking resources, DSP, block RAM, and IO resources.
This enables the changes in device densities to be homogeneous. So as the
density increases, the architecture simply grows vertically, not horizontally. This
means that you will not get additional columns of block RAMs and DSP slices as
the density increases, just extra clock regions.
The 7-Series has very similar functionality to the Virtex-6 architecture. This
was deliberate and intended to facilitate design migration from Virtex-6 to the 7
Series. If you are a Spartan-6 user, this may require a little bit of learning effort.
The 7 Series is simply a more advanced architecture than Spartan-6.
4- One of the differences between Virtex-6 and the 7-Series is the way in which
some of the columns of dedicated resources are laid out on the die. For
example, in Virtex-6 all the clocking resources and a couple of the I/O columns
were placed in the middle of the device. The 7-Series does not keep any I/O
columns in the middle of the device; instead they are all placed on the left and
right edge. However, there still is a middle column for access to the clock
routing resources. The CMTs are placed on the left and right edge next
to the IO columns and are tightly bound to the IOs.
You should also note that in some devices the high-speed I/O pins will end up
replacing some of the I/O Banks you might be used to having available. In this
example, the sixth I/O bank we might be expecting to be in the upper right-hand
corner is missing and has been replaced by high-speed I/O pins.
5- 7-Series devices use clock regions, just like the Virtex-6 devices. Each clock
region is 50 CLBs tall, which is a little larger than Virtex-6 devices. Likewise,
each clock region has 50 IOBs associated with it, compared to 40 IOBs with
Virtex-6. Just as before, clock regions span from the middle of the device to the
edge.
The BUFH is designed to distribute clock signals into an individual region. In this
case, it splits the region horizontally and is represented by the gray line.
Regional clock routing resources are set to route in the middle of the clock
region as well.
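As a small illustration of the 50-CLB clock region height, the sketch below maps a CLB row to the clock region row it falls in. The row numbering (starting at 0 from the bottom of the die) and the function names are assumptions made for this example; they are not taken from the slide or from the tools.

```python
CLBS_PER_REGION = 50  # clock region height stated in this module


def clock_region_row(clb_row, clbs_per_region=CLBS_PER_REGION):
    """Return the clock region row holding the given CLB row (0-based, assumed)."""
    return clb_row // clbs_per_region


def regions_needed(clb_rows, clbs_per_region=CLBS_PER_REGION):
    """Number of vertically stacked clock regions needed to cover clb_rows rows."""
    return -(-clb_rows // clbs_per_region)  # integer ceiling division


if __name__ == "__main__":
    print(clock_region_row(49), clock_region_row(50))
    print(regions_needed(350))
```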
6- The CLB structure of the 7-Series is the same as Virtex-6. There are two slices
per CLB. There are slice-M (which includes memory capable LUTs) and slice-L
(which only support general logic and the carry chain). There is no slice-X which
was in the Spartan-6 devices.
The slice-M LUTs can be configured as a 64x1 distributed memory (good for DSP
applications) or a 32-bit shift register.
There are also two FFs per LUT. This is helpful because each 6-input LUT can be
configured as two 5-input LUTs. This lets each 5-input LUT drive a FF or be used
for pipelining an application.
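To make the "one 6-input LUT as two 5-input LUTs" idea concrete, here is a simplified software model. It assumes the usual arrangement in which the two 5-input functions share inputs I0 to I4, one function lives in the lower 32 bits of the 64-bit truth table and the other in the upper 32 bits; the function names are invented for this sketch and it is not a model of the actual silicon.

```python
def lut6(init, inputs):
    """Evaluate a 6-input LUT. init is a 64-bit truth table, inputs is (I0..I5)."""
    index = sum(bit << pos for pos, bit in enumerate(inputs))
    return (init >> index) & 1


def dual_lut5(init, inputs5):
    """Use one LUT6 as two 5-input functions that share I0..I4.

    The first result reads the lower 32 truth-table bits (I5 = 0),
    the second reads the upper 32 bits (I5 = 1).
    """
    low = lut6(init, (*inputs5, 0))
    high = lut6(init, (*inputs5, 1))
    return low, high


def truth_table(func, n):
    """Pack func's outputs for all n-bit input combinations into an integer."""
    return sum(func(*[(i >> k) & 1 for k in range(n)]) << i for i in range(2 ** n))


def and5(a, b, c, d, e):
    return a & b & c & d & e


def xor5(a, b, c, d, e):
    return a ^ b ^ c ^ d ^ e


# Lower half holds a 5-input AND, upper half holds a 5-input XOR.
INIT = (truth_table(xor5, 5) << 32) | truth_table(and5, 5)

if __name__ == "__main__":
    print(dual_lut5(INIT, (1, 1, 1, 1, 1)))
```

Each of the two results could then drive its own flip-flop, which is why having two FFs per LUT is useful.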
7- In terms of functionality, the 7 Series block RAM is the same as the Virtex-6
block RAM. It features independent read and write port widths, Dual port, single
port, and simple dual-port modes, integrated cascade logic, byte-write enable,
optional 64-bit error correction, and integrated FIFO Logic.
The block RAM has been designed for lower static power. Internally, each
36K-bit block is divided into 9K-bit blocks. Unused blocks can be turned off to save
even more power for smaller memories. The performance of both the block RAM
and FIFO Logic is around 600 MHz.
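A rough sketch of what the 9K-bit subdivision means for a small memory follows. It assumes 1 Kbit is 1024 bits and that a memory fills the sub-blocks one after another, which is a simplification; how the tools really map a memory onto the sub-blocks is not covered in this module.

```python
import math

BLOCK_BITS = 36 * 1024      # one block RAM, assuming 1 Kbit = 1024 bits
SUB_BLOCK_BITS = 9 * 1024   # internal sub-block size from this slide
SUB_BLOCKS = BLOCK_BITS // SUB_BLOCK_BITS


def sub_blocks_in_use(memory_bits):
    """Sub-blocks touched by a memory of the given size (linear fill assumed)."""
    if not 0 <= memory_bits <= BLOCK_BITS:
        raise ValueError("memory does not fit in one block RAM")
    return math.ceil(memory_bits / SUB_BLOCK_BITS)


def sub_blocks_idle(memory_bits):
    """Sub-blocks that could be turned off for this memory size."""
    return SUB_BLOCKS - sub_blocks_in_use(memory_bits)


if __name__ == "__main__":
    # A hypothetical 4 Kbit memory leaves most of the block unused.
    print(sub_blocks_in_use(4096), sub_blocks_idle(4096))
```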
8- The DSP slice resources feature a pre-adder that simplifies the design of
symmetric filters. Other DSP features include a 25 x 18 multiplier, an ALU stage
that includes dynamic OPCODES, single instruction multiple data support,
add/subtract and common logic functionality, and a pattern detector on the
output.
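Here is a minimal sketch of why a pre-adder helps with symmetric filters. In a symmetric FIR filter the first and last coefficients are equal, the second and second-to-last are equal, and so on, so the matching samples can be added first and multiplied once. This is a plain software illustration with invented function names, not a model of the DSP slice itself.

```python
def fir_direct(coeffs, window):
    """One output sample of an FIR filter: one multiply per tap."""
    return sum(c * x for c, x in zip(coeffs, window))


def fir_preadd(coeffs, window):
    """Same output for symmetric coeffs, adding mirrored samples before multiplying.

    Returns (output, number_of_multiplies).
    """
    n = len(coeffs)
    half = n // 2
    acc = 0
    multiplies = 0
    for i in range(half):
        # Pre-add the two samples that share a coefficient, then multiply once.
        acc += coeffs[i] * (window[i] + window[n - 1 - i])
        multiplies += 1
    if n % 2:
        # Odd tap count: the middle tap has no mirror partner.
        acc += coeffs[half] * window[half]
        multiplies += 1
    return acc, multiplies


if __name__ == "__main__":
    coeffs = [1, 2, 3, 2, 1]      # symmetric example coefficients
    window = [5, -1, 4, 7, 2]     # arbitrary sample values
    print(fir_direct(coeffs, window), fir_preadd(coeffs, window))
```

For the 5-tap example the pre-add version uses 3 multiplies instead of 5, which is the resource saving the slide refers to.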
Design enhancements continue to reduce power consumption, and the number
of available DSP slices continues to grow.
Power consumption is the lowest of any FPGA solution. 7 Series achieves
low-power consumption without sacrificing performance. 600 Megahertz performance
for any DSP operation means that you could achieve 1.2 TeraMACs in a single
device. The pre-adder can save a significant amount of resources when building
symmetric filter functions, which could mean cost savings if your design can fit
into a smaller part. Also, the flexible structure of the DSP resources allows other
non-DSP logic functions to use them, which saves slice resources further
increasing device utilization for designs that do not contain many DSP functions.
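The 1.2 TeraMAC figure can be sanity-checked with simple arithmetic, assuming each DSP slice completes one multiply-accumulate per clock cycle. The slice count below is derived from the two numbers quoted in the narration; it is not a figure taken from a data sheet.

```python
CLOCK_HZ = 600_000_000            # 600 MHz DSP performance from this slide
TARGET_MACS = 1_200_000_000_000   # 1.2 TeraMACs per second from this slide


def peak_macs_per_second(dsp_slices, clock_hz=CLOCK_HZ):
    """Peak rate if every slice does one MAC per clock (assumption)."""
    return dsp_slices * clock_hz


def slices_for_target(target_macs, clock_hz=CLOCK_HZ):
    """Smallest slice count that reaches target_macs at the given clock."""
    return -(-target_macs // clock_hz)  # integer ceiling division


if __name__ == "__main__":
    print(slices_for_target(TARGET_MACS))
```

In other words, the quoted peak corresponds to about two thousand DSP slices all running at full speed.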
9- The latest advancement in clock management comes in the form of the Mixed
Mode Clock Manager or MMCM. The MMCM is based on a PLL providing low
output jitter, phase shifting, and clock de-skew. Each clock management tile
contains two MMCMs with dedicated cascade connections between the MMCMs.
The benefits of the MMCM are that it has been designed for low static power. It
also has excellent jitter performance.
Another enhancement is the introduction of new high-performance paths that
connect the MMCM outputs directly to the I/O and regional clock networks.
These paths provide low skew without using a global clock buffer. Global clocks
are distributed up and down from the midpoint of each clock region. This
translates to lower skew between columns and increases the maximum global
clock frequency to 800 MHz. All of these enhancements provide low power clock
generation and distribution with performance up to 800 MHz for global clocks
and 500 MHz for regional clocks. These advanced clocking features remove the
need for external clocking components, reducing costs and simplifying board
designs.
10- One of the most significant differences between Virtex-6 and 7-Series is that
there are two distinct types of I/O pins, high range (also called HR) and high
performance (also called HP). High range supports I/O standards up to 3.3
volts. This is significantly different from Virtex-6, which did not support any 3.3v
standards and only supported up to 2.5v. Support for 3.3v IO standards is only
available for I/O banks that are denoted as high range.
High performance banks are limited to 1.8v, but they have more capabilities and
support more advanced IO standards that allow them to interface at higher
system speeds.
These system features are designed to support two diverging needs. High range
pins support legacy applications; high performance pins support the newest and
most demanding I/O standards. Having an architecture that divides the I/O
banks into types allows the device to meet all IO performance expectations.
The Serdes functionality now adds an independent Output Delay element in
some of the I/O pins. There is also some other functionality built into the device
that aids its use with high speed memory controllers.
11- Another interesting thing about the Virtex-7 device is that its largest device is
almost 3x as big as the largest Virtex-6 device. This is enabled by the stacking
of multiple silicon dies on an interposer, where the lower die is designed to route
signals between super logic regions. For example, the largest device is made up
of 4 Super Logic Regions (SLRs), each with 500K logic cells and each is
interconnected by an interposer. This allows the density to increase significantly
without a loss in performance.
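The numbers on this slide can be checked with a few lines of arithmetic. The constants come straight from the narration (4 SLRs of 500K logic cells each, and about 750,000 logic cells for the largest Virtex-6); the function names are just for this sketch.

```python
SLR_COUNT = 4
LOGIC_CELLS_PER_SLR = 500_000
LARGEST_VIRTEX6_CELLS = 750_000   # "about 750,000" per the part 1 narration


def total_logic_cells(slr_count=SLR_COUNT, cells_per_slr=LOGIC_CELLS_PER_SLR):
    """Logic cells in a stacked device made of identical SLRs."""
    return slr_count * cells_per_slr


def size_ratio_vs_virtex6():
    """How many times larger the stacked device is than the largest Virtex-6."""
    return total_logic_cells() / LARGEST_VIRTEX6_CELLS


if __name__ == "__main__":
    print(total_logic_cells(), round(size_ratio_vs_virtex6(), 2))
```

The ratio comes out at roughly 2.7, which matches the "almost 3x" statement.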
12- This slide shows a cross-section of the largest 7 Series FPGA. Each die is placed
on an interposer (in blue), which carries all four FPGA dies on a single mounting. It
is important to note that this connectivity is done at the routing level in order to
get optimum performance. The combined dies are then placed on a substrate,
which is made into a conventional package that can be attached to a board.
This is all hidden from the user and the tools will still treat this device as a single
monolithic FPGA with 2 million logic cells. There are minor routing delays when
logic is split over two dies, but the tools will take this into account during
implementation. This assures that your performance objectives are met and
verified.
13- Summary <read slide>
14- If you would like to see what other courses we offer, or what other Free RELs
are available, go to the Xilinx Education link you see here.
<read slide>
But whatever you do, please take a second and let us know what you thought of
this REL. Just click on this icon at the top of this page and tell us what you
think.
My name is Frank Nelson. You have been listening to an introduction to the 7
Series of Xilinx FPGAs. Thank you for listening and thanks for your business.
15- <nothing said>
Series 7 FPGA Overview, part 3
Transcript
1- Hello and welcome to this recorded e-learning Overview of the 7-Series device
architectures. This module is part three of three modules.
Where applicable, we're going to make comparisons between the 7-Series
products and the Virtex-6 and Spartan-6 FPGAs.
2- The course objectives <read slide>…
Note that everything we cover here we are going to be covering in more detail in
later chapters.
3- 7 Series has all the IP that exists in the Virtex-6 device family, except the
Ethernet MAC hard core. However, you can still
instantiate the core as soft IP. This was done because of declining
demand for Ethernet MAC interconnectivity applications. Instead we are seeing
more applications that demand high-speed transceiver links or PCI express cores.
Almost all the devices in the 7 Series have dedicated serial gigabit transceivers.
There are four types of transceivers that are used throughout the families. So it
is useful to understand which transceivers are used in different technologies.
Each type of transceiver has different capabilities and speed and so you will need
to learn more about each transceiver to design with it. Note that when building
these transceivers the user interface is similar, but not identical. However, with
the use of the Core Generator, customization of your transceiver is relatively
easy.
The GTP transceiver supports speeds up to 3.75 gigabits per second. It is an
ultrahigh volume transceiver and is used in the wire bond packaging provided by
the Artix-7 family. That is important because some customers using a
lower-performance transceiver still wanted a wire bond package, which
is relatively inexpensive.
The GTX transceiver is the most similar to what was provided in the Virtex-6
device family. It operates at up to 12.5 gigabits per second. This is a bit better
performance than what is available in Virtex-6 devices.
The GTH transceivers are similar to what was supported in the latest Virtex-6
devices. They support speeds up to 13.1 gigabits per second and also support up
to 10 gigabits per second protocols with high forward error correction overhead.
At the very high end, the new GTZ transceivers support speeds up to 28 gigabits
per second. The GTZ transceivers are designed to support the next generation
100-400 gigabit per second system line cards.
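As a quick way to keep the four transceiver types straight, the sketch below stores the maximum line rates quoted in this module and picks the lowest-rated type that still covers a required line rate. The selection rule and function name are assumptions for illustration; a real choice also depends on the device family, speed grade, and package.

```python
# Maximum line rates in gigabits per second, as quoted in this module.
MAX_RATE_GBPS = {"GTP": 3.75, "GTX": 12.5, "GTH": 13.1, "GTZ": 28.0}


def slowest_sufficient_transceiver(line_rate_gbps):
    """Return the lowest-rated transceiver type that covers the line rate.

    Returns None when no type listed here is fast enough.
    """
    candidates = [(rate, name) for name, rate in MAX_RATE_GBPS.items()
                  if rate >= line_rate_gbps]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    for rate in (2.5, 10.3125, 13.0, 25.0, 30.0):
        print(rate, slowest_sufficient_transceiver(rate))
```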
4- Another hard block almost all 7 Series devices have is a PCI express block. The
capabilities of this block vary by device subfamily. The PCI express core
supports both endpoint and root functionality in all devices. There are many new
features embedded in the PCI express block that we don’t have time to talk
about here. I would recommend you refer to the PCI Express User Guide to
learn more about its functionality.
5- The Xilinx analog-to-digital converter block is something new and exciting in the
7 Series devices. This is kind of like the system monitor block that is included
with the Virtex-6 devices, but it has significantly different features.
First of all, it is a dual analog-to-digital converter rather than a single channel as
it was in Virtex-6. The XADC is also faster than the System Monitor
block: it performs up to 1 megasample per second (versus the 200K samples
per second of the System Monitor).
In Virtex-6 the system monitor was used to monitor power supplies and other
things on the customer's final board, but with the 7 Series it is used as a true
front-end for medium speed applications. You can actually bring analog
signals directly into the FPGA and route them to the XADC component. This is
particularly helpful for some of the low-end devices since this allows you to build
a full DSP system on a small Artix-7 device, and for relatively low cost. So you
don't have to purchase a separate analog-to-digital converter just for that type of
application.
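To get a feel for what "medium speed" means here, the sketch below applies the standard Nyquist rule (a sampled signal must contain no frequencies above half the sample rate) to the two sample rates quoted above. It assumes a single input uses the full sample rate; sharing a converter between several inputs would lower the rate available to each one.

```python
XADC_SAMPLES_PER_SEC = 1_000_000     # up to 1 megasample per second
SYSMON_SAMPLES_PER_SEC = 200_000     # Virtex-6 System Monitor figure


def nyquist_limit_hz(samples_per_sec):
    """Highest signal frequency that can be represented at this sample rate."""
    return samples_per_sec / 2


def speedup(new_rate=XADC_SAMPLES_PER_SEC, old_rate=SYSMON_SAMPLES_PER_SEC):
    """How many times faster the new converter samples than the old one."""
    return new_rate / old_rate


if __name__ == "__main__":
    print(nyquist_limit_hz(XADC_SAMPLES_PER_SEC), speedup())
```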
6- The 7 Series devices are all based on the same architecture but obviously the
intent is to have different device densities and different price points for different
types of market applications. This slide was made to give you some ideas of
how the devices have been used.
<read slide>
7- As mentioned earlier, the IO banks are not identical in the 7 Series of devices.
Some of the I/O banks are high range, which supports I/O standards up to
3.3 V. Other I/O banks are designed for high performance, which supports I/O
standards up to 1.8 V but at higher speeds.
The mixture of the I/O Banks varies by device family. For example, the Artix-7
family only has high range IOs and there are no high-performance IO pins. This
means that all Artix-7 I/O pins are 3.3v compatible.
The Kintex-7 family has mostly high range I/O banks, but it does have some
high-performance I/O banks.
The Virtex-7 device family has mainly high-performance I/O banks with some
that are high range. So even with Virtex-7 you have some ability to work with
legacy I/O standards. This is quite different compared to Virtex-6 where you
could only work with IO standards that could be powered by 2.5 V. In the
Virtex-7 XT and HT families all IO banks are high-performance.
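The bank mix described on this slide can be captured in a small lookup table. This sketch treats each bank type's limit as a simple voltage ceiling (3.3 V for high range, 1.8 V for high performance) and only records which bank types each family has, not how many; the table layout and function name are assumptions for illustration.

```python
# Highest I/O standard voltage each bank type supports, per this module.
BANK_MAX_VOLTS = {"HR": 3.3, "HP": 1.8}

# Bank types present in each family, as described in the narration.
FAMILY_BANK_TYPES = {
    "Artix-7": {"HR"},
    "Kintex-7": {"HR", "HP"},
    "Virtex-7": {"HR", "HP"},
    "Virtex-7 XT": {"HP"},
    "Virtex-7 HT": {"HP"},
}


def bank_types_supporting(family, io_volts):
    """Bank types in the family whose voltage ceiling covers io_volts."""
    return sorted(bank for bank in FAMILY_BANK_TYPES[family]
                  if BANK_MAX_VOLTS[bank] >= io_volts)


if __name__ == "__main__":
    for family in FAMILY_BANK_TYPES:
        print(family, bank_types_supporting(family, 3.3))
```

An empty result, as for the Virtex-7 XT at 3.3 V, means that a 3.3 V interface would need a different family or external level translation.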
8- This slide shows the performance you can expect for the serial gigabit
transceivers across all the 7 Series families. As you can see it varies significantly
from family to family. The Artix-7 device family only supports the GTP
multi-gigabit transceivers, which only support up to 3.75 gigabits per second in the
fastest speed grade. This is about as fast as you can go with wire bond technology.
In the Kintex-7, Virtex-7, and Virtex-7 XT families there is support for the GTX
transceivers. These transceivers vary in performance based on their speed grade
and their package. There are a couple of new packages offered in the Kintex-7
device family and this will impact the performance of some of the GTX
transceivers.
The Virtex-7 XT provides a mixture of GTX and GTH transceivers. This is
provided with one column of GTX and a separate column of GTH transceivers.
As the chart shows, some speed grades enable the GTX to operate at up to 12.5
gigabits per second. The GTH transceivers can operate at up to 13.1 gigabits
per second.
The Virtex-7 HT device family has a mixture of GTH and GTZ transceivers. The GTZ
transceivers perform at up to 28 gigabits per second and are only available in the
highest speed grade members. As you can see from this chart, the performance
of the GTZ transceivers is still to be characterized. Note that the GTZ
transceivers will not be available in the -1L speed grade (that is the low-power
device offering).
9- There is very distinctive device packaging offered for each of the different family
members. Artix-7 has an ultra low-cost wire bond technology packaging that is
very inexpensive when compared to flip-chip packaging. It was chosen
because the price point for Artix-7 needs to be as low as possible.
One of the challenges of using wire-bond technology with Artix-7 is how to
connect the package to the die, particularly since this limits the performance of
the transceivers. Likewise, parallel I/O performance is limited to just over 1 Gb
per second.
10- The Kintex-7 devices support the conventional flip-chip packaging and a new
low-cost bare die flip-chip package. The bare die packaging is designed to be
very low cost, which is compatible with its role in the 7-Series. The interesting
thing about the bare die packaging is that it will show you the backside of the die
as being part of the package. This allows significant cost savings, but at the
expense of some of the performance.
You should also note that the flip-chip does have some decoupling capacitors on
the substrate, but the bare-die package only has the power supply for the
transceivers bypassed. This is different from the Kintex-7 and Virtex-7 devices
which have more of the necessary power supplies bypassed.
11- The Kintex-7 and all the Virtex-7 device family members use a more conventional
flip-chip packaging. This is similar to the packaging used for Virtex-6. Virtex-7
uses a fourth-generation sparse Chevron pin pattern and supports speeds up to
1.866 Gb per second for parallel I/O (this is designed to suit the memory
controller speeds needed by users) and up to 20 Gb per second for the MGTs.
There is also a fair number of substrate decoupling capacitors used around the
MGT power supplies, block RAM power supplies, and the I/O pre-driver power
supplies. As an FYI, having a separate power supply for the block RAM is a new
requirement for this device family.
12- In summary…<read slide>
13- If you would like to see what other courses we offer, or what other Free RELs
are available, go to the Xilinx Education link you see here.
<read slide>
But whatever you do, please take a second and let us know what you thought of
this REL. Just click on this icon at the top of this page and tell us what you
think.
My name is Frank Nelson. You have been listening to an introduction to the 7
Series of Xilinx FPGAs. Thank you for listening and thanks for your business.
14- <nothing said>