Evolution of Implementation Technologies Discrete devices: relays, transistors (1940s-50s) Discrete logic gates (1950s-60s) trend toward Integrated circuits (1960s-70s) of integration higher levels e.g. TTL packages: Data Book for 100’s of different parts Map your circuit to the Data Book parts Gate Arrays (IBM 1970s) “Custom” integrated circuit chips Design using a library (like TTL) Transistors are already on the chip Place and route software puts the chip together automatically + Large circuits on a chip + Automatic design tools (no tedious custom layout) - Only good if you want 1000’s of parts Xilinx FPGAs - 1 Gate Array Technology (IBM - 1970s) Simple logic gates Use transistors to implement combinational and sequential logic Interconnect Wires to connect inputs and outputs to logic blocks I/O blocks Special blocks at periphery for external connections Add wires to make connections Done when chip is fabbed “mask-programmable” Construct any circuit Xilinx FPGAs - 2 Programmable Logic Disadvantages of the Data Book method Constrained to parts in the Data Book Parts are necessarily small and standard Need to stock many different parts Programmable logic Use a single chip (or a small number of chips) Program it for the circuit you want No reason for the circuit to be small Xilinx FPGAs - 3 Programmable Logic Technologies Fuse and anti-fuse Fuse makes or breaks link between two wires Typical connections are 50-300 ohm One-time programmable (testing before programming?) Very high density EPROM and EEPROM High power consumption Typical connections are 2K-4K ohm Fairly high density RAM-based Memory bit controls a switch that connects/disconnects two wires Typical connections are .5K-1K ohm Can be programmed and re-programmed in the circuit Xilinx FPGAs - 4 Low density Programmable Logic Program a connection Connect two wires Set a bit to 0 or 1 Regular structures for two-level logic (1960s-70s) All rely on two-level logic minimization PROM connections - permanent EPROM connections - erase with UV light EEPROM connections - erase electrically PROMs Program connections in the _____________ plane PLAs Program the connections in the ____________ plane PALs Program the connections in the ____________ plane Xilinx FPGAs - 5 Making Large Programmable Logic Circuits Alternative 1 : “CPLD” Put a lot of PLDS on a chip Add wires between them whose connections can be programmed Use fuse/EEPROM technology Alternative 2: “FPGA” Emulate gate array technology Hence Field Programmable Gate Array You need: A way to implement logic gates A way to connect them together Xilinx FPGAs - 6 Field-Programmable Gate Arrays PALs, PLAs = 10 - 100 Gate Equivalents Field Programmable Gate Arrays = FPGAs Altera MAX Family Actel Programmable Gate Array Xilinx Logical Cell Array 100 - 1000(s) of Gate Equivalents! Xilinx FPGAs - 7 Field-Programmable Gate Arrays Logic blocks To implement combinational and sequential logic Interconnect Wires to connect inputs and outputs to logic blocks I/O blocks Special logic blocks at periphery of device for external connections Key questions: How to make logic blocks programmable? How to connect the wires? After the chip has been fabbed Xilinx FPGAs - 8 Tradeoffs in FPGAs Logic block - how are functions implemented: fixed functions (manipulate inputs) or programmable? Support complex functions, need fewer blocks, but they are bigger so less of them on chip Support simple functions, need more blocks, but they are smaller so more of them on chip Interconnect How are logic blocks arranged? How many wires will be needed between them? Are wires evenly distributed across chip? Programmability slows wires down – are some wires specialized to long distances? How many inputs/outputs must be routed to/from each logic block? What utilization are we willing to accept? 50%? 20%? 90%? Xilinx FPGAs - 9 Altera EPLD (Erasable Programmable Logic Devices) Historical Perspective PALs: same technology as programmed once bipolar PROM EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light Altera building block = MACROCELL CLK 8 Product Term AND-OR Array + Programmable MUX's Clk MUX AND ARRAY Output MUX Q I/O Pin Inv ert Control F/B MUX Programmable polarity pad Xilinx FPGAs - 10 Seq. Logic Block Programmable feedback Altera EPLD Altera EPLDs contain 8 to 48 independently programmed macrocells Global CLK Personalized by EPROM bits: Clk MUX Synchronous Mode 1 Flipflop controlled by global clock signal OE/Local CLK Q EPROM Cell Global CLK Clk MUX local signal computes output enable Asynchronous Mode 1 OE/Local CLK Q EPROM Cell Flipflop controlled by locally generated clock signal + Seq Logic: could be D, T positive or negative edge triggered + product term to implement clear function Xilinx FPGAs - 11 Altera Multiple Array Matrix (MAX) AND-OR structures are relatively limited Cannot share signals/product terms among macrocells Logic Array Blocks (similar to macrocells) LAB A LAB H LAB B LAB C LAB G P I A LAB F LAB D LAB E Xilinx FPGAs - 12 Global Routing: Programmable Interconnect Array EPM5128: 8 Fixed Inputs 52 I/O Pins 8 LABs 16 Macrocells/LAB 32 Expanders/LAB LAB Architecture Macrocell P-Terms Expander P-Terms Expander Terms shared among all macrocells within the LAB Xilinx FPGAs - 13 P22V10 PAL INCREMENT 2904 1 0 0 FIRST FUSE NUMBERS 4 8 12 16 20 24 28 32 36 2948 2992 3036 3080 3124 3168 3212 3256 3300 3344 3388 3432 3476 3520 3564 3608 40 ASYNCHRONOUS RESET (TO ALL REGISTERS) 44 88 132 176 220 264 308 352 396 1 1 1 0 AR D Q 23 0 0 0 1 Q 5808 SP P R 1 0 5809 OUTPUT LOGIC MACROCEL L 18 P - 5818 R - 5819 6 440 3652 484 528 572 616 660 704 748 792 836 880 OUTPUT LOGIC MACROCELL 3696 3740 3784 3828 3872 3916 3960 4004 4048 4092 4136 4180 4224 4268 22 P - 5810 R - 5811 2 924 968 1012 1056 1100 1144 1188 1232 1276 1320 1364 1408 1452 P - 5820 R - 5821 4312 OUTPUT LOGIC MACROCELL 4356 4400 4444 4488 4532 4576 4620 4664 4708 4752 4796 4840 21 P - 5812 R - 5813 1496 OUTPUT LOGIC MACROCEL L 16 P - 5822 R - 5823 8 4884 OUTPUT LOGIC MACROCELL 4928 4972 5016 5060 5104 5148 5192 5236 5280 5324 20 P - 5814 R - 5815 4 OUTPUT LOGIC MACROCEL L 15 P - 5824 R - 5825 9 5368 2156 2200 2244 2288 2332 2376 2420 2464 2508 2552 2596 2640 2684 2728 2772 2816 2860 5 17 7 3 1540 1584 1628 1672 1716 1760 1804 1848 1892 1936 1980 2024 2068 2112 OUTPUT LOGIC MACROCEL L OUTPUT LOGIC MACROCELL 5412 5456 5500 5544 5588 5632 5676 5720 OUTPUT LOGIC MACROCEL L 14 P - 5826 R - 5827 19 10 P - 5816 R - 5817 SYNCHRONOUS PRESET (TO ALL REGISTERS) 5764 11 INCREMEN T 13 0 4 8 12 16 20 24 28 32 36 40 Supports large number of product terms per output Xilinxassociated FPGAs - 14 Latches and muxes with output pins Actel Programmable Gate Arrays Rows of programmable logic building blocks + rows of interconnect Anti-fuse Technology: Program Once Use Anti-fuses to build up long wiring runs from short segments 8 input, single output combinational logic blocks Xilinx FPGAs - 15 FFs constructed from discrete cross coupled gates Actel Logic Module SOA S0 Basic Module is a Modified 4:1 Multiplexer S1 D0 2:1 MUX D1 2:1 MUX Y D2 2:1 MUX D3 R "0" SOB Example: Implementation of S-R Latch 2:1 MUX "0" 2:1 MUX "1" 2:1 MUX Xilinx FPGAs - 16 S Q Actel Interconnect Logic Module Horizontal Track Anti-fuse Vertical Track Interconnection Fabric Xilinx FPGAs - 17 Actel Routing Example Logic Module Input Logic Module Logic Module Output Input Jogs cross an anti-fuse minimize the # of jobs for speed critical circuits 2 - 3 hops for most interconnections Xilinx FPGAs - 18 Xilinx Programmable Gate Arrays CLB - Configurable Logic Block Three types of routing RAM-programmable can be reconfigured CLB CLB Wiring Channels CLB IOB direct general-purpose long lines of various lengths IOB IOB Can be used as memory IOB IOB Built-in fast carry logic IOB IOB IOB 5-input, 1 output function or 2 4-input, 1 output functions optional register on outputs Xilinx FPGAs - 19 CLB CLB Slew Rate Control CLB D Q Passive Pull-Up, Pull-Down Output Buffer Switch Matrix Vcc Pad Input Buffer CLB Q CLB Programmable Interconnect C1 C2 C3 C4 S/R Control DIN G Func. Gen. SD F' H' EC RD 1 F4 F3 F2 F1 H Func. Gen. F Func. Gen. Y G' H' S/R Control DIN SD F' D G' Q H' 1 H' K Q D G' F' EC RD X Delay I/O Blocks (IOBs) H1 DIN S/R EC G4 G3 G2 G1 D Configurable Logic Blocks (CLBs) The Xilinx 4000 CLB Xilinx FPGAs - 21 Two 4-input functions, registered output Xilinx FPGAs - 22 5-input function, combinational output Xilinx FPGAs - 23 CLB Used as RAM Xilinx FPGAs - 24 Fast Carry Logic Xilinx FPGAs - 25 Xilinx 4000 Interconnect Xilinx FPGAs - 26 Switch Matrix Xilinx FPGAs - 27 Xilinx 4000 Interconnect Details Xilinx FPGAs - 28 Global Signals - Clock, Reset, Control Xilinx FPGAs - 29 Xilinx 4000 IOB Xilinx FPGAs - 30 Xilinx FPGA Combinational Logic Examples Key: General functions are limited to 5 inputs (4 even better - 1/2 CLB) No limitation on function complexity Example 2-bit comparator: A B = C D and A B > C D implemented with 1 CLB (GT) F = A C' + A B D' + B C' D' (EQ) G = A'B'C'D'+ A'B C'D + A B'C D'+ A B C D Can implement some functions of > 5 input Xilinx FPGAs - 31 Xilinx FPGA Combinational Logic Examples N-input majority function: 1 whenever n/2 or more inputs are 1 N-input parity functions: 5 input/1 CLB; 2 levels yield 25 inputs! 5-input Majority Circuit 9 Input Parity Logic CLB CLB 7-input Majority Circuit CLB CLB CLB CLB Xilinx FPGAs - 32 Xilinx FPGA Adder Example Example 2-bit binary adder - inputs: A1, A0, B1, B0, CIN outputs: S0, S1, Cout A3 B3 A2 CLB Cout B2 A1 CLB S3 A3 B3 A2 B2 CLB A0 CLB S2 C2 B1 C1 Full Adder, 4 CLB delays to final carry out CLB S1 C0 S0 A1 B1 A0 B0 Cin S2 2 x Two-bit Adders (3 CLBs each) yields 2 CLBs to final carry out CLB S0 S3 Cout B0 Cin S1 C2 Xilinx FPGAs - 33 Computer-Aided Design Can't design FPGAs by hand Way too much logic to manage, hard to make changes Hardware description languages Specify functionality of logic at a high level Validation: high-level simulation to catch specification errors Verify pin-outs and connections to other system components Low-level to verify mapping and check performance Logic synthesis Process of compiling HDL program into logic gates and flip-flops Technology mapping Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs) Xilinx FPGAs - 34 CAD Tool Path (cont’d) Placement and routing Assign logic blocks to functions Make wiring connections Timing analysis - verify paths Determine delays as routed Look at critical paths and ways to improve Partitioning and constraining If design does not fit or is unroutable as placed split into multiple chips If design it too slow prioritize critical paths, fix placement of cells, etc. Few tools to help with these tasks exist today Generate programming files - bits to be loaded into chip for configuration Xilinx FPGAs - 35 Xilinx CAD Tools Verilog (or VHDL) use to specify logic at a high-level Combine with schematics, library components Synopsys Compiles Verilog to logic Maps logic to the FPGA cells Optimizes logic Xilinx APR - automatic place and route (simulated annealing) Provides controllability through constraints Handles global signals Xilinx Xdelay - measure delay properties of mapping and aid in iteration Xilinx XACT - design editor to view final mapping results Xilinx FPGAs - 36 Applications of FPGAs Implementation of random logic Easier changes at system-level (one device is modified) Can eliminate need for full-custom chips Prototyping Ensemble of gate arrays used to emulate a circuit to be manufactured Get more/better/faster debugging done than with simulation Reconfigurable hardware One hardware block used to implement more than one function Functions must be mutually-exclusive in time Can greatly reduce cost while enhancing flexibility RAM-based only option Special-purpose computation engines Hardware dedicated to solving one problem (or class of problems) Accelerators attached to general-purpose computers Xilinx FPGAs - 37