Introduction to FPGA Devices & Tools
ECE 545 – Introduction to VHDL
George Mason University

FPGA Devices

World of Integrated Circuits
Integrated Circuits
• Full-Custom ASICs
• Semi-Custom ASICs
• User Programmable
  • PLD: PAL, PLA, PML
  • FPGA: LUT (Look-Up Table), MUX, Gates

Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
• designed all the way from behavioral description to physical layout
FPGA (Field Programmable Gate Array)
• bought off the shelf and reconfigured by designers themselves
• no physical layout design; design ends with a bitstream used to configure a device

What is an FPGA?
[Figure: FPGA layout with Configurable Logic Blocks, Block RAMs, and I/O Blocks]

Which Way to Go?
ASICs
• High performance
• Low power
• Low cost in high volumes
FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability

Other FPGA Advantages
• The manufacturing cycle for an ASIC is very costly and lengthy, and engages lots of manpower
• Mistakes not detected at design time have a large impact on development time and cost
• FPGAs are perfect for rapid prototyping of digital circuits
• Easy upgrades, as in the case of software
• Unique applications
  • reconfigurable computing

Major FPGA Vendors
SRAM-based FPGAs
• Xilinx, Inc.
• Altera Corp.
• Atmel
• Lattice Semiconductor
Flash & antifuse FPGAs
• Actel Corp.
• QuickLogic Corp.

Xilinx
• Primary products: FPGAs and the associated CAD software
  • Programmable Logic Devices
  • ISE Alliance and Foundation Series Design Software
• Main headquarters in San Jose, CA
• Fabless* semiconductor and software company, with fabrication by:
  • UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
  • Seiko Epson (Japan)
  • TSMC (Taiwan)

Xilinx FPGA Families
• Old families
  • XC3000, XC4000, XC5200
  • Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
• High-performance families
  • Virtex (0.22µm)
  • Virtex-E, Virtex-EM (0.18µm)
  • Virtex-II, Virtex-II PRO (0.13µm)
• Low Cost Family
  • Spartan/XL – derived from XC4000
  • Spartan-II – derived from Virtex
  • Spartan-IIE – derived from Virtex-E
  • Spartan-3

Basic Spartan-II FPGA Block Diagram
[Figure: block diagram not reproduced]

CLB Structure
[Figure: a CLB with two slices; each slice has two look-up tables (inputs F1–F4 and G1–G4), carry & control logic (CIN, COUT), and two storage elements (outputs X/XB and Y/YB; controls CLK, CE, SR, BY, F5IN)]
• Each slice has 2 LUT-FF pairs with associated carry logic
• Two 3-state buffers (BUFT) associated with each CLB, accessible by all CLB outputs

CLB Slice Structure
• Each slice contains two sets of the following:
  • Four-input LUT
    • Any 4-input logic function,
    • or 16-bit x 1 sync RAM
    • or 16-bit shift register
  • Carry & Control
    • Fast arithmetic logic
    • Multiplier logic
    • Multiplexer logic
  • Storage element
    • Latch or flip-flop
    • Set and reset
    • True or inverted inputs
    • Sync. or async. control

LUT (Look-Up Table) Functionality
• Look-Up Tables are the primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
Two example LUT contents for inputs x1, x2, x3, x4 and output y (in example 1, y depends only on x1 and x2: y = x1 nand x2):

x1 x2 x3 x4 | y (example 1) | y (example 2)
 0  0  0  0 |       1       |       0
 0  0  0  1 |       1       |       1
 0  0  1  0 |       1       |       0
 0  0  1  1 |       1       |       0
 0  1  0  0 |       1       |       0
 0  1  0  1 |       1       |       1
 0  1  1  0 |       1       |       0
 0  1  1  1 |       1       |       1
 1  0  0  0 |       1       |       0
 1  0  0  1 |       1       |       1
 1  0  1  0 |       1       |       0
 1  0  1  1 |       1       |       0
 1  1  0  0 |       0       |       1
 1  1  0  1 |       0       |       1
 1  1  1  0 |       0       |       0
 1  1  1  1 |       0       |       0

Distributed RAM
• CLB LUT configurable as Distributed RAM
  • A LUT equals 16x1 RAM
  • Implements single and dual ports
  • Cascade LUTs to increase RAM size
• Synchronous write
• Synchronous/Asynchronous read
  • Accompanying flip-flops used for synchronous read
Primitives shown in the figure:
• RAM16X1S (one LUT): D, WE, WCLK, A0–A3, O
• RAM32X1S (two LUTs): D, WE, WCLK, A0–A4, O
• RAM16X2S (two LUTs): D0, D1, WE, WCLK, A0–A3, O0, O1
• RAM16X1D (two LUTs): D, WE, WCLK, A0–A3, SPO, DPRA0–DPRA3, DPO

Shift Register LUT
• Each LUT can be configured as a shift register
  • Serial in, serial out
• Dynamically addressable delay up to 16 cycles
• For programmable pipeline
• Cascade for greater cycle delays
• Use CLB flip-flops to add depth
[Figure: LUT drawn as a chain of flip-flops with inputs IN, CE, CLK, a DEPTH[3:0] selector, and output OUT]

Shift Register
[Figure: two 64-bit data paths; Operation A (4 cycles) followed by Operation B (8 cycles) takes 12 cycles, while Operation C takes 3 cycles, a 9-cycle imbalance]
• Register-rich FPGA
  • Allows for addition of pipeline stages to increase throughput
• Data paths must be balanced to keep desired functionality

Carry & Control Logic
[Figure: slice diagram highlighting the Carry & Control Logic blocks between the look-up tables and the storage elements]

Fast Carry Logic
• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources
[Figure: carry logic routing running from LSB to MSB]

Accessing Carry Logic
• All major synthesis tools can infer carry logic for arithmetic functions
  • Addition (SUM <= A + B)
  • Subtraction (DIFF <= A - B)
  • Comparators (if A < B then…)
  • Counters (count <= count + 1)

Block RAM
[Figure: Spartan-II True Dual-Port Block RAM with Port A and Port B]
• Most efficient memory implementation
  • Dedicated blocks of memory
• Ideal for most memory requirements
  • 4 to 14 memory blocks
  • 4096 bits per block
• Use multiple blocks for larger memories
• Builds both single and true dual-port RAMs

Spartan-II Block RAM Amounts
[Table not reproduced]

Block RAM Port Aspect Ratios

Data width  Organization  Address range
1           4k x 1        0–4095
2           2k x 2        0–2047
4           1k x 4        0–1023
8           512 x 8       0–511
16          256 x 16      0–255

Basic I/O Block Structure
[Figure: IOB with three flip-flops: a three-state FF (Three-State FF Enable, Three-State Control), an output FF (Output FF Enable, Output Path), and an input FF (Input FF Enable, Input Path, with Direct Input and Registered Input); Clock and Set/Reset are shared]

IOB Functionality
• IOB provides the interface between the package pins and CLBs
• Each IOB can work as uni- or bi-directional I/O
• Outputs can be forced into high impedance
• Inputs and outputs can be registered
  • advised for high-performance I/O
• Inputs can be delayed

Routing Resources
[Figure: array of CLBs interconnected through PSMs (Programmable Switch Matrices)]

Spartan-II FPGA Family Members
[Table not reproduced]

Virtex-II 1.5V Architecture
[Figure: Virtex-II layout with Configurable Logic Blocks, I/O Blocks, Block RAMs, and 18 x 18 Multipliers]

Virtex-II 1.5V

Device     CLB Array  Slices  Maximum I/O  BlockRAM (18 kb)  Multiplier Blocks  Distributed RAM bits
XC2V40     8x8        256     88           4                 4                  8,192
XC2V80     16x8       512     120          8                 8                  16,384
XC2V250    24x16      1,536   200          24                24                 49,152
XC2V500    32x24      3,072   264          32                32                 98,304
XC2V1000   40x32      5,120   432          40                40                 163,840
XC2V1500   48x40      7,680   528          48                48                 245,760
XC2V2000   56x48      10,752  624          56                56                 344,064
XC2V3000   64x56      14,336  720          96                96                 458,752
XC2V4000   80x72      23,040  912          120               120                737,280
XC2V6000   96x88      33,792  1,104        144               144                1,081,344
XC2V8000   112x104    46,592  1,108        168               168                1,490,944

Virtex-II Block SelectRAM
• Virtex-II BRAM is 18 kbits
• Additional “parity” bits available in selected configurations

Width  Depth   Address  Data    Parity
1      16,384  [13:0]   [0]     N/A
2      8,192   [12:0]   [1:0]   N/A
4      4,096   [11:0]   [3:0]   N/A
9      2,048   [10:0]   [7:0]   [0]
18     1,024   [9:0]    [15:0]  [1:0]
36     512     [8:0]    [31:0]  [3:0]

Port signals shown in the figure:
• Port A: WEA, ENA, SSRA, CLKA, ADDRA[#:0], DIA[#:0], DIPA[#:0], DOA[#:0], DOPA[#:0]
• Port B: WEB, ENB, RSTB, CLKB, ADDRB[#:0], DIB[#:0], DIPB[#:0], DOB[#:0], DOPB[#:0]

FPGA Nomenclature
[Figure not reproduced]

FPGA Tools

Design process (1)
Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in Experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..
The design flow starts from the text above and proceeds as follows:
• Specification (Lab Experiments)
• VHDL description (Your Source Files)

    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
        port(
            clock, reset, encr_decr: in std_logic;
            data_input: in std_logic_vector(31 downto 0);
            data_output: out std_logic_vector(31 downto 0);
            out_full: in std_logic;
            key_input: in std_logic_vector(31 downto 0);
            key_read: out std_logic
        );
    end RC5_core;

• Functional simulation
• Synthesis
• Post-synthesis simulation

Design process (2)
• Implementation
• Timing simulation
• Configuration
• On chip testing

Design Process control from Active-HDL
[Figure not reproduced]

Simulation Tools
[Figure: vendor logos] Many others…

Synthesis Tools
[Figure: vendor logos] … and others

Logic Synthesis
VHDL description → Circuit netlist

    architecture MLU_DATAFLOW of MLU is
        signal A1: STD_LOGIC;
        signal B1: STD_LOGIC;
        signal Y1: STD_LOGIC;
        signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
    begin
        -- optional inversion of the inputs and of the output
        A1 <= A when (NEG_A='0') else not A;
        B1 <= B when (NEG_B='0') else not B;
        Y  <= Y1 when (NEG_Y='0') else not Y1;
        -- the four candidate logic functions
        MUX_0 <= A1 and B1;
        MUX_1 <= A1 or B1;
        MUX_2 <= A1 xor B1;
        MUX_3 <= A1 xnor B1;
        -- 4-to-1 multiplexer selected by L1 and L0
        with (L1 & L0) select
            Y1 <= MUX_0 when "00",
                  MUX_1 when "01",
                  MUX_2 when "10",
                  MUX_3 when others;
    end MLU_DATAFLOW;

Features of synthesis tools
• Interpret RTL code
• Produce a synthesized circuit netlist in the standard EDIF format
• Give preliminary performance estimates
• Some can display circuit schematics corresponding to the EDIF netlist

Implementation
• After synthesis, the entire implementation process is performed by FPGA vendor tools

Translation
[Figure: inputs and output of the translation step]
• Synthesis → circuit netlist in EDIF (Electronic Design Interchange Format)
• Timing constraints: NCF (Native Constraint File) and UCF (User Constraint File), with the Constraint Editor
• Translation → NGD (Native Generic Database file)

Sample UCF File

    #
    # Constraints generated by Synplify Pro 7.3.3, Build 039R
    #
    # Period Constraints
    #Begin clock constraints
    #End clock constraints
    # Output Constraints
    # Input Constraints
    # Location Constraints
    # End of generated constraints
    NET "clock" LOC = "P88";
    NET "control(0)" LOC = "P50";
    NET "control(1)" LOC = "P48";
    NET "control(2)" LOC = "P42";
    NET "reset" LOC = "P93";
    NET "segments(0)" LOC = "P67";
    NET "segments(1)" LOC = "P39";
    NET "segments(2)" LOC = "P62";
    NET "segments(3)" LOC = "P60";
    NET "segments(4)" LOC = "P46";
    NET "segments(5)" LOC = "P57";
    NET "segments(6)" LOC = "P49";

Pin Assignment
[Figure: the LAB2 design inside the FPGA, with ports CLOCK, CONTROL(0)–CONTROL(2), RESET, and SEGMENTS(0)–SEGMENTS(6) connected to pins P88, P50, P48, P42, P93, P67, P39, P62, P60, P46, P57, and P49 as listed in the sample UCF file]

Parallel Port Interface
[Figure not reproduced]

Constraints Editor
[Figure not reproduced]

Circuit netlist
[Figure not reproduced]

Mapping
[Figure: netlist elements grouped into LUT1–LUT5 and flip-flops FF1, FF2]

Placing
[Figure: mapped logic assigned to CLB slices of the FPGA]

Routing
[Figure: placed slices joined through the FPGA's programmable connections]

Static Timing Analyzer
• Performs static analysis of the circuit performance
• Reports critical paths with all sources of delays
• Determines maximum clock frequency

Static Timing Analysis
• Critical path: the longest path from outputs of registers to inputs of registers
[Figure: in → flip-flop → logic (delay $t_{P,\mathrm{logic}}$) → flip-flop → out, both flip-flops clocked by clk]
$t_{\mathrm{Critical}} = t_{P,\mathrm{FF}} + t_{P,\mathrm{logic}} + t_{S,\mathrm{FF}}$
where $t_{P,\mathrm{FF}}$ is the flip-flop propagation delay, $t_{P,\mathrm{logic}}$ the propagation delay of the logic, and $t_{S,\mathrm{FF}}$ the flip-flop setup time.

Static Timing Analysis
• Min. clock period = length of the critical path
• Max. clock frequency = 1 / min. clock period

Configuration
• Once a design is implemented, you must create a file that the FPGA can understand
  • This file is called a bitstream: a BIT file (.bit extension)
• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information

Resources & Required Reading
Spartan FPGA devices
Xilinx Spartan-II 2.5V FPGA Family: Complete Data Sheet
• Module 1: Introduction & Ordering Information
• Module 2: Functional Description
http://direct.xilinx.com/bvdocs/publications/ds001.pdf

Resources & Required Reading
FPGA Tools
Integrated Interfaces: Active-HDL with Synplify®
http://www.aldec.com/Previews/active_synplify.htm
Integrated Synthesis and Implementation
http://www.aldec.com/Previews/synthesis_implementation.htm

Hands-on Session
Enough Talking, Let’s Get To It
!!Brace Yourselves!!

MLU: Block Diagram
[Figure: inputs A and B are optionally inverted (NEG_A, NEG_B) to give A1 and B1; MUX_0 to MUX_3 carry the four logic functions into inputs IN0–IN3 of MUX_4_1, selected by L1 and L0 (SEL1, SEL0); the multiplexer output Y1 is optionally inverted (NEG_Y) to give Y]

ALU Schematic
[Figure: 4-bit inputs A[3:0] and B[3:0]; an arithmetic multiplexer selected by arith[1:0] and a logic multiplexer selected by logic[1:0]; ar_log chooses between the two results and neg_Y optionally inverts the output Y[3:0]]

arith[1:0]  Operation      logic[1:0]  Operation
0           A + B          0           A and B
1           A - B          1           A or B
2           A <<< 1        2           A xor B
3           A >>> 1        3           A xnor B

Questions?
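
As a starting point for the hands-on session, the ALU schematic could be described in the same dataflow style as the MLU architecture shown earlier. The listing below is only a sketch under stated assumptions, not the course's reference solution. The entity name ALU, the architecture name, and the internal signals arith_out, logic_out, and Y1 are made up for illustration. The operators <<< and >>> are read as 1-bit rotations (as in RC5), ar_log = '0' is assumed to select the arithmetic result, and ieee.numeric_std is used for the addition and subtraction.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; port names follow the labels on the ALU schematic.
entity ALU is
    port (
        A, B   : in  std_logic_vector(3 downto 0);
        arith  : in  std_logic_vector(1 downto 0);  -- selects the arithmetic operation
        logic  : in  std_logic_vector(1 downto 0);  -- selects the logic operation
        ar_log : in  std_logic;                     -- assumed: '0' = arithmetic, '1' = logic
        neg_Y  : in  std_logic;                     -- '1' inverts the result
        Y      : out std_logic_vector(3 downto 0)
    );
end ALU;

architecture ALU_DATAFLOW of ALU is
    signal arith_out, logic_out, Y1 : std_logic_vector(3 downto 0);
begin
    -- Arithmetic multiplexer. Results wrap modulo 16.
    -- Assumption: "<<<" and ">>>" denote rotation by one bit.
    with arith select
        arith_out <= std_logic_vector(unsigned(A) + unsigned(B)) when "00",
                     std_logic_vector(unsigned(A) - unsigned(B)) when "01",
                     A(2 downto 0) & A(3)                        when "10",  -- rotate left
                     A(0) & A(3 downto 1)                        when others; -- rotate right

    -- Logic multiplexer, bitwise on the 4-bit operands.
    with logic select
        logic_out <= A and B  when "00",
                     A or B   when "01",
                     A xor B  when "10",
                     A xnor B when others;

    -- Choose between the two units, then optionally invert the output.
    Y1 <= arith_out when ar_log = '0' else logic_out;
    Y  <= Y1 when neg_Y = '0' else not Y1;
end ALU_DATAFLOW;
```

With A = "1001", B = "0011", arith = "00", ar_log = '0', and neg_Y = '0', this sketch produces Y = "1100" (9 + 3 = 12). Because the adder and subtractor are written with the plain + and - operators, a synthesis tool is free to map them onto the dedicated carry logic described in the Accessing Carry Logic slide.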