Evolution of Implementation Technologies

Trend toward higher levels of integration:
 Discrete devices: relays, transistors (1940s-50s)
 Discrete logic gates (1950s-60s)
 Integrated circuits (1960s-70s)
e.g. TTL packages: Data Book for 100’s of different parts
Map your circuit to the Data Book parts
 Gate Arrays (IBM 1970s)
“Custom” integrated circuit chips
Design using a library (like TTL)
Transistors are already on the chip
Place and route software puts the chip together automatically
+ Large circuits on a chip
+ Automatic design tools (no tedious custom layout)
- Only good if you want 1000’s of parts
Xilinx FPGAs - 1
Gate Array Technology (IBM - 1970s)
 Simple logic gates
Use transistors to implement combinational and sequential logic
 Interconnect
Wires to connect inputs and outputs to logic blocks
 I/O blocks
Special blocks at periphery for external connections
 Add wires to make connections
Done when chip is fabbed
“mask-programmable”
Construct any circuit
Programmable Logic
 Disadvantages of the Data Book method
Constrained to parts in the Data Book
Parts are necessarily small and standard
Need to stock many different parts
 Programmable logic
Use a single chip (or a small number of chips)
Program it for the circuit you want
No reason for the circuit to be small
Programmable Logic Technologies
 Fuse and anti-fuse
Fuse makes or breaks link between two wires
Typical connections are 50-300 ohm
One-time programmable (testing before programming?)
Very high density
 EPROM and EEPROM
High power consumption
Typical connections are 2K-4K ohm
Fairly high density
 RAM-based
Memory bit controls a switch that connects/disconnects two wires
Typical connections are 0.5K-1K ohm
Can be programmed and re-programmed in the circuit
Low density
Programmable Logic
 Program a connection
Connect two wires
Set a bit to 0 or 1
 Regular structures for two-level logic (1960s-70s)
All rely on two-level logic minimization
PROM connections - permanent
EPROM connections - erase with UV light
EEPROM connections - erase electrically
PROMs
Program connections in the _____________ plane
PLAs
Program the connections in the ____________ plane
PALs
Program the connections in the ____________ plane
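A minimal sketch of a two-level AND-OR array may help make the "planes" concrete. It assumes the commonly described distinction: a PLA lets both the AND plane and the OR plane be programmed, a PAL fixes the OR plane, and a PROM fixes the AND plane as a full decoder. The function name pla_eval and the example programming are hypothetical, chosen only for illustration.

```python
from itertools import product

def pla_eval(and_plane, or_plane, inputs):
    """Evaluate a two-level AND-OR array.

    and_plane: list of product terms; each term maps input index -> required value (0 or 1).
    or_plane:  list of outputs; each output lists the product-term indices it ORs together.
    """
    terms = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    return [int(any(terms[t] for t in out)) for out in or_plane]

# Hypothetical programming for two inputs a, b: out0 = a XOR b, out1 = a AND b.
and_plane = [{0: 1, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 1}]  # a b', a' b, a b
or_plane = [[0, 1], [2]]

for a, b in product((0, 1), repeat=2):
    print(a, b, pla_eval(and_plane, or_plane, (a, b)))
```

Changing and_plane corresponds to programming the AND plane; changing or_plane corresponds to programming the OR plane.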
Making Large Programmable Logic Circuits
 Alternative 1 : “CPLD”
Put a lot of PLDs on a chip
Add wires between them whose connections can be programmed
Use fuse/EEPROM technology
 Alternative 2: “FPGA”
Emulate gate array technology
Hence Field Programmable Gate Array
You need:
A way to implement logic gates
A way to connect them together
Field-Programmable Gate Arrays
 PALs, PLAs = 10 - 100 Gate Equivalents
 Field Programmable Gate Arrays = FPGAs
Altera MAX Family
Actel Programmable Gate Array
Xilinx Logical Cell Array
 100 - 1000(s) of Gate Equivalents!
Field-Programmable Gate Arrays
 Logic blocks
To implement combinational and sequential logic
 Interconnect
Wires to connect inputs and outputs to logic blocks
 I/O blocks
Special logic blocks at periphery of device for external connections
 Key questions:
How to make logic blocks programmable?
How to connect the wires?
Both must be done after the chip has been fabbed
Tradeoffs in FPGAs
 Logic block - how are functions implemented: fixed functions (manipulate inputs) or programmable?
Support complex functions: need fewer blocks, but they are bigger, so fewer of them fit on chip
Support simple functions: need more blocks, but they are smaller, so more of them fit on chip
 Interconnect
How are logic blocks arranged?
How many wires will be needed between them?
Are wires evenly distributed across chip?
Programmability slows wires down: are some wires specialized to long distances?
How many inputs/outputs must be routed to/from each logic block?
What utilization are we willing to accept? 50%? 20%? 90%?
Altera EPLD (Erasable Programmable Logic Devices)
 Historical Perspective
 PALs: same technology as program-once bipolar PROMs
 EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light
 Altera building block = MACROCELL
Figure: Altera macrocell (8 product term AND-OR array + programmable MUXes). Labeled elements: CLK, AND array, Clk MUX, sequential logic block with output Q, output MUX, F/B MUX (programmable feedback), invert control (programmable polarity), I/O pin pad.
Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells
Clk MUX personalized by EPROM bits:
Synchronous mode: flip-flop controlled by global clock signal; local signal computes output enable
Asynchronous mode: flip-flop controlled by locally generated clock signal
Figure: for each mode, a Clk MUX chooses between Global CLK and OE/Local CLK under control of an EPROM cell.
+ Sequential logic: can be D or T, positive or negative edge triggered
+ product term to implement clear function
Altera Multiple Array Matrix (MAX)
AND-OR structures are relatively limited
Cannot share signals/product terms among macrocells
Figure: eight Logic Array Blocks (LAB A through LAB H, similar to macrocells) connected through the PIA
Global routing: Programmable Interconnect Array (PIA)
EPM5128:
8 Fixed Inputs
52 I/O Pins
8 LABs
16 Macrocells/LAB
32 Expanders/LAB
LAB Architecture
Macrocell P-Terms
Expander P-Terms
Expander terms are shared among all macrocells within the LAB
P22V10 PAL
Figure: P22V10 PAL logic diagram with fuse numbering, showing the programmable AND array, the ten output logic macrocells, and the asynchronous reset and synchronous preset lines (to all registers).
Supports large number of product terms per output
Latches and muxes associated with output pins
Actel Programmable Gate Arrays
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse technology: program once
Use anti-fuses to build up long wiring runs from short segments
8-input, single-output combinational logic blocks
FFs constructed from discrete cross-coupled gates
Actel Logic Module
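Basic module is a modified 4:1 multiplexer
Figure: logic module built from 2:1 MUXes, with data inputs D0-D3, select inputs SOA, SOB, S0, S1, and output Y.
Example: implementation of S-R latch
Figure: S-R latch built from 2:1 MUXes with constant "0" and "1" inputs, inputs S and R, and output Q.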
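The multiplexer-based module can be sketched in a few lines. This is an illustration under stated assumptions, not the wiring of the original figure: it assumes two input MUXes (selected by SOA and SOB) feeding an output MUX whose select is S0 OR S1, and it builds a set-dominant S-R latch from two 2:1 MUXes with feedback. The function names are hypothetical.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0

def logic_module(d0, d1, d2, d3, soa, sob, s0, s1):
    """Assumed module structure: two input MUXes feed an output MUX selected by S0 OR S1."""
    upper = mux2(soa, d0, d1)
    lower = mux2(sob, d2, d3)
    return mux2(s0 | s1, upper, lower)

def and_gate(a, b):
    """AND realized by tying module inputs to constants: Y = mux2(a, 0, b)."""
    return logic_module(0, b, 0, 0, soa=a, sob=0, s0=0, s1=0)

def sr_latch_next(s, r, q):
    """One evaluation of a set-dominant S-R latch; q is the fed-back output."""
    hold_or_reset = mux2(r, q, 0)      # R = 1 forces 0, otherwise hold Q
    return mux2(s, hold_or_reset, 1)   # S = 1 forces 1

q = 0
for s, r in [(1, 0), (0, 0), (0, 1), (0, 0)]:
    q = sr_latch_next(s, r, q)
    print(f"S={s} R={r} -> Q={q}")
```

Tying data and select inputs to constants or signals is what turns the one fixed module into different gates, which is the point of the multiplexer-based design.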
Actel Interconnect
Figure: interconnection fabric. Logic modules connect to horizontal tracks and vertical tracks, which are joined by anti-fuses.
Actel Routing Example
Figure: routing from a logic module output to logic module inputs.
Jogs cross an anti-fuse
Minimize the # of jogs for speed-critical circuits
2 - 3 hops for most interconnections
Xilinx Programmable Gate Arrays
 CLB - Configurable Logic Block
5-input, 1-output function
or two 4-input, 1-output functions
optional register on outputs
 Built-in fast carry logic
 Can be used as memory
 Three types of routing
direct
general-purpose
long lines of various lengths
 RAM-programmable
can be reconfigured
Figure: array of CLBs separated by wiring channels, with IOBs around the periphery.
Figure: Xilinx architecture overview. Configurable Logic Blocks (CLBs) connect through programmable interconnect and switch matrices; I/O Blocks (IOBs) contain input and output buffers, D flip-flops, slew rate control, and passive pull-up/pull-down at the pad.
The Xilinx 4000 CLB
Figure: CLB with function generators F (inputs F1-F4), G (inputs G1-G4), and H, control inputs C1-C4 (H1, DIN, S/R, EC), clock K, two D flip-flops with S/R control, and outputs X and Y.
Two 4-input functions, registered output
5-input function, combinational output
CLB Used as RAM
Fast Carry Logic
Xilinx 4000 Interconnect
Switch Matrix
Xilinx 4000 Interconnect Details
Global Signals - Clock, Reset, Control
Xilinx 4000 IOB
Xilinx FPGA Combinational Logic Examples
 Key: General functions are limited to 5 inputs
(4 even better - 1/2 CLB)
No limitation on function complexity
 Example
2-bit comparator:
A B = C D and A B > C D implemented with 1 CLB
(GT) F = A C' + A B D' + B C' D'
(EQ) G = A'B'C'D'+ A'B C'D + A B'C D'+ A B C D
 Can implement some functions of > 5 inputs
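The comparator equations can be checked by tabulating each one into a 16-entry lookup table, which is roughly how a 4-input function generator holds a function. This is only a sketch: make_lut and lut_read are hypothetical helper names, and a Python list stands in for the RAM bits of a function generator.

```python
from itertools import product

def gt(a, b, c, d):
    """(GT) F = A C' + A B D' + B C' D'"""
    nc, nd = 1 - c, 1 - d
    return (a & nc) | (a & b & nd) | (b & nc & nd)

def eq(a, b, c, d):
    """(EQ) G = A'B'C'D' + A'B C'D + A B'C D' + A B C D"""
    na, nb, nc, nd = 1 - a, 1 - b, 1 - c, 1 - d
    return (na & nb & nc & nd) | (na & b & nc & d) | (a & nb & c & nd) | (a & b & c & d)

def make_lut(fn, n):
    """Tabulate fn over all n-bit inputs; the list plays the role of the LUT contents."""
    return [fn(*bits) for bits in product((0, 1), repeat=n)]

def lut_read(lut, *bits):
    """Use the input bits (first bit most significant) as the address into the table."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return lut[index]

gt_lut = make_lut(gt, 4)
eq_lut = make_lut(eq, 4)

# Compare both tables against ordinary integer comparison of AB and CD.
for a, b, c, d in product((0, 1), repeat=4):
    assert lut_read(gt_lut, a, b, c, d) == int(2 * a + b > 2 * c + d)
    assert lut_read(eq_lut, a, b, c, d) == int(2 * a + b == 2 * c + d)
print("GT and EQ each fit in one 16-entry table")
```

Because the table has one entry per input combination, the complexity of the equation does not matter, only the number of inputs.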
Xilinx FPGA Combinational Logic
 Examples
N-input majority function: 1 whenever n/2 or more inputs are 1
N-input parity functions: 5 input/1 CLB; 2 levels yield 25 inputs!
Figure: 5-input majority circuit, 9-input parity logic, and 7-input majority circuit, each built from CLBs.
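The claim that two levels of 5-input parity blocks cover 25 inputs can be sketched as a reduction tree. The code below is an illustration, not a mapping tool: parity5, parity_tree, and majority are hypothetical names, each parity5 call stands for one 5-input function, and majority is written for an odd number of inputs.

```python
from functools import reduce
from itertools import product
from operator import xor

def parity5(bits):
    """One 5-input function: 1 when an odd number of its (at most 5) inputs are 1."""
    assert len(bits) <= 5
    return reduce(xor, bits, 0)

def parity_tree(bits):
    """Reduce the inputs with levels of 5-input parity blocks; returns (parity, levels)."""
    level = list(bits)
    levels = 0
    while len(level) > 1:
        level = [parity5(level[i:i + 5]) for i in range(0, len(level), 5)]
        levels += 1
    return level[0], levels

def majority(bits):
    """1 when more than half of the inputs are 1 (intended for an odd input count)."""
    return int(sum(bits) > len(bits) // 2)

# Exhaustive check for 9 inputs, then the level count for 25 inputs.
assert all(parity_tree(b)[0] == sum(b) % 2 for b in product((0, 1), repeat=9))
print(parity_tree([1] * 25)[1], "levels for 25 inputs")
```

Parity decomposes cleanly because XOR is associative; majority does not, which is why wider majority circuits need more careful partitioning across CLBs.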
Xilinx FPGA Adder Example
 Example
2-bit binary adder - inputs: A1, A0, B1, B0, CIN
outputs: S0, S1, Cout
Figure: 4-bit adder built from full adders (one CLB per bit): 4 CLB delays to final carry out.
Figure: 4-bit adder built from 2 x two-bit adders (3 CLBs each): 2 CLB delays to final carry out.
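A short sketch shows why the two-bit adder suits 5-input function blocks: S0, S1, and Cout are each a function of the same five inputs (A1, A0, B1, B0, Cin). The function names are hypothetical, and integer arithmetic is used to define each output rather than gate-level equations.

```python
def two_bit_adder(a1, a0, b1, b0, cin):
    """Return (s1, s0, cout); each output depends only on the five inputs."""
    total = (2 * a1 + a0) + (2 * b1 + b0) + cin
    return (total >> 1) & 1, total & 1, (total >> 2) & 1

def four_bit_adder(a, b, cin=0):
    """Chain two 2-bit adders, so the carry passes through only two blocks."""
    s1, s0, c2 = two_bit_adder((a >> 1) & 1, a & 1, (b >> 1) & 1, b & 1, cin)
    s3, s2, cout = two_bit_adder((a >> 3) & 1, (a >> 2) & 1,
                                 (b >> 3) & 1, (b >> 2) & 1, c2)
    return (cout << 4) | (s3 << 3) | (s2 << 2) | (s1 << 1) | s0

# Exhaustive check against ordinary addition.
for a in range(16):
    for b in range(16):
        for cin in (0, 1):
            assert four_bit_adder(a, b, cin) == a + b + cin
print("4-bit adder from two 2-bit adders matches a + b + cin")
```

The trade-off is area for speed: three function blocks per two bits instead of one per bit, in exchange for half as many blocks on the carry path.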
Computer-Aided Design
 Can't design FPGAs by hand
Way too much logic to manage, hard to make changes
 Hardware description languages
Specify functionality of logic at a high level
 Validation: high-level simulation to catch specification errors
Verify pin-outs and connections to other system components
Low-level to verify mapping and check performance
 Logic synthesis
Process of compiling HDL program into logic gates and flip-flops
 Technology mapping
Map the logic onto elements available in the implementation
technology (LUTs for Xilinx FPGAs)
CAD Tool Path (cont’d)
 Placement and routing
Assign logic blocks to functions
Make wiring connections
 Timing analysis - verify paths
Determine delays as routed
Look at critical paths and ways to improve
 Partitioning and constraining
If design does not fit or is unroutable as placed, split into multiple chips
If design is too slow, prioritize critical paths, fix placement of cells, etc.
Few tools to help with these tasks exist today
 Generate programming files - bits to be loaded into
chip for configuration
Xilinx CAD Tools
 Verilog (or VHDL) used to specify logic at a high level
Combine with schematics, library components
 Synopsys
Compiles Verilog to logic
Maps logic to the FPGA cells
Optimizes logic
 Xilinx APR - automatic place and route (simulated
annealing)
Provides controllability through constraints
Handles global signals
 Xilinx Xdelay - measure delay properties of mapping and
aid in iteration
 Xilinx XACT - design editor to view final mapping results
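Simulated annealing placement can be illustrated with a toy example. This is a generic sketch and makes no claim about how Xilinx APR is implemented: it assumes two-pin nets, a Manhattan wirelength cost, pairwise swap moves, and a geometric cooling schedule. All names and parameter values are hypothetical.

```python
import math
import random

def wirelength(placement, nets):
    """Sum of Manhattan distances between the two blocks of every net."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in nets)

def anneal(blocks, nets, sites, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Place blocks on sites (one block per site) by annealing over pairwise swaps."""
    rng = random.Random(seed)
    slots = list(sites)
    rng.shuffle(slots)
    placement = dict(zip(blocks, slots))
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        a, b = rng.sample(blocks, 2)
        placement[a], placement[b] = placement[b], placement[a]
        new_cost = wirelength(placement, nets)
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            placement[a], placement[b] = placement[b], placement[a]  # undo the swap
    return best, best_cost

# Nine blocks connected in a chain, placed on a 3 x 3 grid of sites.
blocks = list("ABCDEFGHI")
nets = list(zip(blocks, blocks[1:]))
sites = [(x, y) for x in range(3) for y in range(3)]
best, best_cost = anneal(blocks, nets, sites)
print("best wirelength found:", best_cost)
```

Constraints of the kind mentioned above could be modeled by pinning some blocks to fixed sites or by weighting critical nets more heavily in the cost function.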
Applications of FPGAs
 Implementation of random logic
Easier changes at system-level (one device is modified)
Can eliminate need for full-custom chips
 Prototyping
Ensemble of gate arrays used to emulate a circuit to be
manufactured
Get more/better/faster debugging done than with simulation
 Reconfigurable hardware
One hardware block used to implement more than one function
Functions must be mutually-exclusive in time
Can greatly reduce cost while enhancing flexibility
RAM-based only option
 Special-purpose computation engines
Hardware dedicated to solving one problem (or class of problems)
Accelerators attached to general-purpose computers