FPGA Presentation

EE446 Embedded Architectures




• An FPGA is primarily a semiconductor device that the user (customer or designer) can configure after the manufacturing process has been completed.
• The term "field-programmable" means the device is programmed by the customer, not the manufacturer.
• It can be programmed from a logic circuit diagram or from source code in VHDL or Verilog.
• It offers partial reconfiguration of a portion of the design.

An FPGA (Field-Programmable Gate Array) is a reprogrammable chip containing hundreds of thousands of logic gates that can be connected together internally to build complex digital circuitry.
3/16/2016





• Performance: real-time analysis of high-rate data streams
• Reliability: deterministic hardware dedicated to every task
• Reconfigurability: lower nonrecurring engineering (NRE) expenses
• Durability: radiation hardening and program integrity
• Development: flexible and rapid prototyping



• FPGAs excel at computing algorithms without data dependencies in parallel.
• A customizable data path and ALU allow very large amounts of data to be transferred and processed within a few clock cycles.
• Despite lower clock frequencies, FPGAs can outperform conventional CPUs on certain data-processing tasks.
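A rough way to see why a lower clock can still win: throughput is roughly operations per cycle times clock frequency. The sketch below only illustrates that arithmetic. The clock rates and operation counts are hypothetical values picked for the example, not measurements of any CPU or FPGA.

```python
def throughput_ops_per_sec(clock_hz: float, ops_per_cycle: int) -> float:
    """Operations completed per second, assuming every unit is busy each cycle."""
    return clock_hz * ops_per_cycle

# Hypothetical numbers, for illustration only.
cpu = throughput_ops_per_sec(clock_hz=3.0e9, ops_per_cycle=4)     # fast clock, few units
fpga = throughput_ops_per_sec(clock_hz=2.0e8, ops_per_cycle=256)  # slow clock, many units

speedup = fpga / cpu
print(f"CPU:  {cpu:.2e} ops/s")
print(f"FPGA: {fpga:.2e} ops/s")
print(f"FPGA/CPU ratio: {speedup:.2f}")
```

The advantage only holds when the work really can be spread across that many parallel units, which is why the claim is limited to certain data-processing tasks.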

Cheap/fast fuse connections (one-time programmable)
◦ small area (can fit lots of them)
◦ low-resistance wires (fast even if in multiple segments)
◦ very high resistance when not connected
◦ small capacitance (wires can be longer)
Antifuse: One-time programmable
Pass transistors (switches)
◦ used to connect wires
◦ bi-directional
EEPROM
SRAM
Multiplexers
◦ used to connect one of a set of possible sources to an input
◦ can be used to implement logic functions
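To make the last point concrete, here is a minimal sketch of a 4-to-1 multiplexer used as a logic function: the function's inputs drive the select lines, and the truth-table column is wired to the data inputs. The helper names `mux4`, `xor_via_mux`, and `and_via_mux` are made up for this illustration.

```python
def mux4(d0, d1, d2, d3, s1, s0):
    """4-to-1 multiplexer: select lines (s1, s0) pick one of the data inputs."""
    return (d0, d1, d2, d3)[(s1 << 1) | s0]

def xor_via_mux(a, b):
    """XOR: data inputs are tied to the truth-table column 0, 1, 1, 0."""
    return mux4(0, 1, 1, 0, a, b)

def and_via_mux(a, b):
    """AND: data inputs are tied to the truth-table column 0, 0, 0, 1."""
    return mux4(0, 0, 0, 1, a, b)

# Print a, b, a XOR b, a AND b for every input combination.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_via_mux(a, b), and_via_mux(a, b))
```

Changing only the constants on the data inputs changes the function, which is the same idea a look-up table uses on a larger scale.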


FPGAs have historically been slower and consumed more energy than ASICs, and offered less functionality.
Thanks to fabrication enhancements and greater R&D investment, the performance gap between FPGAs and ASICs has narrowed substantially.

Advantages of FPGAs over ASICs:
◦ Shorter time to market
◦ Can be reprogrammed in the field to fix bugs
◦ Lower engineering costs
◦ Hardware can be developed on ordinary FPGAs, then frozen into a finalized version that can no longer be modified once the design has been decided





• Power consumption: FPGAs fundamentally use much more power than ASICs
• Price: FPGAs also fundamentally cost more
• Speed: ASICs are still considerably faster than any FPGA, although design techniques can help with this issue
• Density: ASICs can still pack much more logic into a single chip than an FPGA
• IP: modern, complex IP (a complete PCI Express or HyperTransport core, for example) may take up most or all of an FPGA but only 10% of an ASIC

Common FPGA architecture involves:
◦ Configurable Logic Blocks (CLBs)
◦ I/O pads
◦ Routing paths, usually all of the same width (number of wires)
[Figure: standard logic block; logic block pin assignment]


Logic optimization. Performs two-level or
multi-level minimization of the Boolean
equations to optimize area, delay, or a
combination of both.
Technology mapping. Transforms the
Boolean equations into a circuit of FPGA logic
blocks. This step also optimizes the total
number of logic blocks required (area
optimization) or the number of logic blocks in
time-critical paths (delay optimization).


Placement. Selects the specific location for each logic block in the FPGA while trying to minimize the total length of interconnect required.
Routing. Uses the FPGA's available routing resources to connect the logic blocks distributed inside the FPGA by the placement tool, carrying signals from where they are generated to where they are used.
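As a minimal sketch of the routing step, the snippet below runs a breadth-first (Lee-style) maze search on a small grid, where blocked cells stand for routing resources already taken by other nets. This is a generic textbook technique, not the algorithm of any particular vendor tool. The grid model and the function name `maze_route` are assumptions made for illustration.

```python
from collections import deque

def maze_route(grid, src, dst):
    """Breadth-first (Lee-style) search on a grid.

    grid[r][c] == 1 marks a cell already used by another net (blocked).
    Returns the list of cells on a shortest path from src to dst, or None.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}            # visited cells and the cell each was reached from
    frontier = deque([src])
    while frontier:
        cell = frontier.popleft()
        if cell == dst:
            path = []
            while cell is not None:   # walk back to the source
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],   # a row mostly occupied by other nets
    [0, 0, 0, 0],
]
path = maze_route(grid, (0, 0), (2, 0))
print(path)
```

Because the search expands outward one step at a time, the first path that reaches the destination is a shortest one, which matches the placement goal of keeping interconnect short.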

Fuse and anti-fuse
◦ fuse makes or breaks link between two
wires
◦ one-time programmable

Flash
◦ High density
◦ Process issues

RAM-based
◦ memory bit controls a switch that
connects/disconnects two wires
◦ can be programmed and re-programmed
easily (tested at factory)

Logic block - how are functions implemented: fixed functions (manipulate inputs) or programmable?
◦ support complex functions: need fewer blocks, but they are bigger, so fewer of them fit on a chip
◦ support simple functions: need more blocks, but they are smaller, so more of them fit on a chip

Interconnect
◦ how are logic blocks arranged?
◦ how many wires will be needed between them?
◦ are wires evenly distributed across the chip?
◦ programmability slows wires down; are some wires specialized for long distances?
◦ how many inputs/outputs must be routed to/from each logic block?
◦ what utilization are we willing to accept? 50%? 20%? 90%?

CLB - Configurable Logic Block
◦ 5-input, 1-output function
◦ or two 4-input, 1-output functions
◦ optional register on outputs
Built-in fast carry logic
Can be used as memory
Three types of routing
◦ direct
◦ general-purpose
◦ long lines of various lengths
RAM-programmable
◦ can be reconfigured
[Figure: array of CLBs separated by wiring channels and surrounded by IOBs]
The Virtex CLB
Details of One Virtex Slice

Each slice contains two sets of the following:
◦ Four-input LUT
   ▪ any 4-input logic function,
   ▪ or 16-bit x 1 sync RAM (SLICEM only)
   ▪ or 16-bit shift register (SLICEM only)
◦ Carry & Control
   ▪ fast arithmetic logic
   ▪ multiplier logic
   ▪ multiplexer logic
◦ Storage element
   ▪ latch or flip-flop
   ▪ set and reset
   ▪ true or inverted inputs
   ▪ sync. or async. control
[Figure: example uses: a 4-input function; a registered 3-input function; e.g. 9-input parity]
LUT with inputs x1 x2 x3 x4 and output y:

x1 x2 x3 x4 | y
 0  0  0  0 | 1
 0  0  0  1 | 1
 0  0  1  0 | 1
 0  0  1  1 | 1
 0  1  0  0 | 1
 0  1  0  1 | 1
 0  1  1  0 | 1
 0  1  1  1 | 1
 1  0  0  0 | 1
 1  0  0  1 | 1
 1  0  1  0 | 1
 1  0  1  1 | 1
 1  1  0  0 | 0
 1  1  0  1 | 0
 1  1  1  0 | 0
 1  1  1  1 | 0

x1 x2 x3 x4 | y
 0  0  0  0 | 0
 0  0  0  1 | 1
 0  0  1  0 | 0
 0  0  1  1 | 0
 0  1  0  0 | 0
 0  1  0  1 | 1
 0  1  1  0 | 0
 0  1  1  1 | 1
 1  0  0  0 | 0
 1  0  0  1 | 1
 1  0  1  0 | 0
 1  0  1  1 | 0
 1  1  0  0 | 1
 1  1  0  1 | 1
 1  1  1  0 | 0
 1  1  1  1 | 0

• Look-up tables (LUTs) are the primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
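A minimal sketch of why a LUT can implement any 4-input function: the four inputs simply form an address into 16 stored configuration bits. The helper names `make_lut4` and `lut4_read` and the example function are assumptions for illustration. In a real device the 16 bits are written when the configuration is loaded.

```python
def make_lut4(func):
    """Build the 16-entry configuration of a 4-input LUT from a Boolean function."""
    bits = []
    for addr in range(16):
        x1, x2, x3, x4 = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        bits.append(int(func(x1, x2, x3, x4)))
    return bits

def lut4_read(bits, x1, x2, x3, x4):
    """Evaluate the LUT: the inputs form an address into the stored bits."""
    return bits[(x1 << 3) | (x2 << 2) | (x3 << 1) | x4]

# Example function (chosen for illustration): y = NOT (x1 AND x2)
nand_bits = make_lut4(lambda x1, x2, x3, x4: not (x1 and x2))
# A different function needs only different stored bits: 4-input parity.
parity_bits = make_lut4(lambda x1, x2, x3, x4: x1 ^ x2 ^ x3 ^ x4)

print(nand_bits)
print(lut4_read(parity_bits, 1, 0, 1, 1))
```

The lookup hardware is identical for both functions; only the 16 stored bits differ.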
[Figure: one Virtex slice. Two Look-Up Tables (inputs G4-G1 and F4-F1, output O) each feed a Carry & Control Logic block and a storage element (D, Q, CK, EC, S, R). Slice pins shown: COUT, YB, Y, BY, F5IN, SR, XB, X, CIN, CLK, CE.]
Carry & Control Logic in Xilinx FPGAs

x  y | COUT
0  0 | y
0  1 | CIN
1  0 | CIN
1  1 | y

Propagate = x XOR y
Generate = y
Sum = Propagate XOR CIN = x XOR y XOR CIN

[Figure labels: LUT; Carry & Control Logic; hardwired (fast) logic]
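The propagate/generate equations above can be sketched as a chain of one-bit stages. Each stage computes propagate, uses it to choose between the generate signal and the incoming carry, and XORs it with the incoming carry to form the sum. The function names and the 8-bit default width are assumptions for illustration.

```python
def full_adder_slice(x, y, cin):
    """One adder bit following the equations: propagate, generate, sum, carry."""
    propagate = x ^ y                        # Propagate = x XOR y
    generate = y                             # Generate = y
    cout = cin if propagate else generate    # carry multiplexer
    s = propagate ^ cin                      # Sum = Propagate XOR CIN
    return s, cout

def ripple_add(a, b, width=8):
    """Add two unsigned integers bit by bit through the carry chain."""
    carry, total = 0, 0
    for i in range(width):
        s, carry = full_adder_slice((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry      # (sum modulo 2**width, carry out)

print(ripple_add(100, 55))
print(ripple_add(200, 100))
```

The carry passes through one multiplexer per bit, so the carry chain is the path that grows with the adder width.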
Critical Path for an Adder Implemented Using Xilinx Spartan 3 FPGAs


The general architecture of Xilinx FPGAs consists of a two-dimensional array of programmable blocks, called Configurable Logic Blocks (CLBs), with horizontal and vertical routing channels between the rows and columns of CLBs.
Connection boxes
◦ Flexibility of connection, Fc: the number of wiring segments each logic block pin can connect to (Fc = 2 in the figure)
◦ Can A connect to B?
Switch boxes
◦ Fs: for a wiring segment entering the S block, the number of other wiring segments it can be connected to
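As a minimal sketch of what Fs measures, the snippet below models one common textbook topology, a "disjoint" switch box in which track i on one side can connect only to track i on the other three sides, giving Fs = 3. The topology, the function name `disjoint_switch_box`, and the track count are assumptions for illustration. The slides do not say which switch box pattern a given device uses.

```python
SIDES = ("north", "east", "south", "west")

def disjoint_switch_box(num_tracks):
    """Map each incoming segment (side, track) to the segments it can reach.

    Assumed topology: track i connects only to track i on the other three sides.
    """
    return {
        (side, t): [(other, t) for other in SIDES if other != side]
        for side in SIDES
        for t in range(num_tracks)
    }

box = disjoint_switch_box(num_tracks=4)
fs_values = {len(targets) for targets in box.values()}   # Fs seen by each segment
print(fs_values)
print(box[("west", 2)])
```

A larger Fs makes routing easier but costs more programmable switches per box, which adds area and delay.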
Routing using C and S boxes

Maze Router

A* Search Routing

The Pathfinder
In Comparison to the Virtex 2

                                Virtex 5    Virtex 2
CLB array (rows x columns)      160 x 54    80 x 46
Slices                          17,280      13,969
Max distributed RAM (Kb)        1,120       428
Block RAM, 18 Kb blocks         296         -
Block RAM, 36 Kb blocks         148         -
Block RAM, max (Kb)             5,328       2,448
DSP48E slices                   64          -
CMTs                            6           -
PowerPC processor blocks        0           -


I/O blocks provide the interface between package pins and
the internal configurable logic
Most popular and leading-edge I/O standards are supported
by programmable I/O blocks (IOBs)


Configurable Logic Blocks (CLBs) are the basic logic elements for Xilinx® FPGAs, providing combinatorial and synchronous sequential logic as well as distributed memory and shift register capability
Virtex-5 FPGA CLBs are based on real 6-input look-up table technology and provide superior capabilities and performance



Block RAM modules provide flexible 36 Kbit true dual-port RAMs that are cascadable, which allows larger memory blocks to be formed
Virtex-5 FPGA block RAMs include programmable FIFO logic for increased device utilization
Each block RAM can also be configured as two independent 18 Kbit true dual-port RAM blocks for designs needing smaller RAM blocks
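The block RAM figures quoted in the comparison slide are consistent with this description, as the short calculation below shows: 148 blocks of 36 Kb give the 5,328 Kb maximum, and splitting each block in two gives 296 blocks of 18 Kb. The helper `blocks_needed` is a hypothetical capacity-only estimate that ignores the fixed width and depth configurations a real block RAM supports.

```python
BLOCKS_36K = 148          # number of 36 Kb blocks, from the comparison figures
KBITS_PER_BLOCK = 36

total_kbits = BLOCKS_36K * KBITS_PER_BLOCK
blocks_18k = BLOCKS_36K * 2   # each 36 Kb block can act as two 18 Kb blocks

def blocks_needed(depth_words, width_bits, block_kbits=36):
    """Minimum number of cascaded blocks for a memory, by capacity only."""
    total_bits = depth_words * width_bits
    block_bits = block_kbits * 1024
    return -(-total_bits // block_bits)   # ceiling division

print(total_kbits, blocks_18k)
print(blocks_needed(16384, 32))   # a 16K x 32-bit memory
```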


Cascadable embedded DSP48E slices with 25 x 18 two’s
complement multipliers and 48-bit
adder/subtracter/accumulator provide massively parallel DSP
algorithm support
Clock Management Tile (CMT) blocks provide the most
flexible, highest-performance clocking for FPGAs
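To illustrate the multiply-accumulate operation that the DSP48E description refers to, here is a minimal behavioral sketch: a 25-bit by 18-bit two's complement multiply added into a 48-bit accumulator that wraps on overflow. It models only the arithmetic, not the slice's pipeline registers or operating modes. The function names and sample data are assumptions for illustration.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def mac(acc, a, b):
    """One multiply-accumulate step: 25 x 18 signed multiply into a 48-bit sum."""
    product = to_signed(a, 25) * to_signed(b, 18)
    return to_signed(acc + product, 48)

# Dot product of two short vectors (hypothetical sample data).
xs = [3, -5, 7, 1000]
ws = [2, 4, -6, 12]
acc = 0
for x, w in zip(xs, ws):
    acc = mac(acc, x, w)
print(acc)
```

Many such slices working side by side, each handling one tap or one term per cycle, is what the slide means by massively parallel DSP support.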
• 16-character x 2-line LCD
• 256 MB SODIMM
• Compact Flash card: the Xilinx System ACE Compact Flash (CF) configuration controller allows a Type I Compact Flash card to program the FPGA through the JTAG port.
• DIP switches: eight general-purpose (active-High) DIP switches are connected to the user I/O pins of the FPGA.
• LEDs: 15 LEDs are controllable by the FPGA. 8 green LEDs are general-purpose LEDs arranged in a row, 5 green LEDs are positioned next to the pushbuttons, and 2 red LEDs are intended for error conditions but are not limited to that purpose.
• Ethernet port: 10/100/1000 Mb/s
• Audio jacks for microphone, line in, line out, and headphone. Supports stereo 16-bit audio with up to 48 kHz sampling.
• USB: the USB controller provides USB connectivity for the board and supports host and peripheral modes of operation. The USB controller has an internal microprocessor to assist in handling USB commands. The firmware for this processor can be stored in its own dedicated IIC EEPROM or can be downloaded from a host computer via a peripheral connector. The USB controller's serial port is connected to J30 through an RS-232 transceiver to assist with debug.
• JTAG: the JTAG configuration port allows for programming the FPGA along with debugging support.
• ISE Foundation (Project Navigator) is where the FPGA design process starts
• It runs in the background to maintain the operation and flow of the design by managing the chain of tools involved, including but not limited to the Embedded Development Kit (EDK), ChipScope Pro, and AccelDSP
• EDK includes XPS, which can be run independently to begin a project; however, using Project Navigator provides a more organized design process for an embedded system
• XPS (Xilinx Platform Studio) and the XPS SDK (Software Development Kit) are the main components of the EDK
• XPS allows use of the Base System Builder (BSB) if required for development on an existing board, including layout and pin connections
• Given a supported embedded processor development board available from Xilinx, the BSB lets you pick from the peripherals available on that board, automatically match the FPGA pinout to the board, and create a completed platform and test application ready to download and run on the board.
The Base System Builder allows for the selection of the following system attributes:
• Processor type (MicroBlaze or PowerPC, depending on your selected target FPGA device)
• Reference and processor-bus clock frequency (BSB automatically infers and configures a Digital Clock Manager (DCM) primitive when needed)
• Standard processor buses (all peripherals are automatically connected via appropriate buses)
• Debug interface
• Cache configuration
• Memory size and type (both on-chip block RAM and controllers for off-chip memory devices)
• Common peripherals (such as general purpose I/O, Universal Asynchronous Receiver-Transmitter (UART), and timer)
• Automatic selection of the on-board FPGA
• Selection of clock rates supported by the on-board oscillators
• Automatic setting of reset polarity
• Automatic generation of FPGA pinout to match the board connections, for the selected set of peripherals
• Upon completion of BSB, a Microprocessor Hardware Specification (MHS) file is created and loaded into the XPS project
• XPS can then be used to develop the embedded subsystem that was established through the BSB, which acts as a wizard/template for overall board capabilities
• The next step is to define the constraints and remaining details of the system
• When starting from ISE Project Navigator, add the embedded system as a submodule to a top-level Xilinx® ISE® project: declare, instantiate, and interconnect the embedded submodule in your top-level FPGA design