Virtex-6 Memory Resources
Basic FPGA Architecture
Xilinx Training

Objectives
After completing this module you will be able to…
  – Fully utilize the Virtex-6 distributed memory, block memory, and FIFO resources
  – Use the Memory Interface Generator (MIG) to build a custom memory controller for your off-chip memory component

Every Design Uses Memory
On-chip block RAM and ROM
  – Frequently used for local data storage, shift registers, finite state machines, and FIFOs
  – Fast on-chip memory
  – Block RAM also includes dedicated FIFO logic
  – The bitstream can preload fixed data into a Virtex-6 ROM or RAM
Distributed memory for smaller applications
External memory for larger data storage
  – Soft memory controllers support the newest standards
  – Supported with the Memory Interface Generator (MIG)

Memory Options
[Figure: memory options around the Virtex-6 FPGA, arranged by granularity and capacity: distributed RAM/SRL32 in the logic, on-chip BRAM, and fast memory interfaces to external DRAM, SRAM, and FLASH/EEPROM]
Distributed RAM/SRL32
  – Very granular, localized memory
  – Minimal impact on logic routing
  – Great for small FIFOs
On-chip BRAM
  – Efficient, on-chip blocks
  – Ideal for mid-sized buffering
  – Dedicated FIFO logic
Fast memory interfaces
  – DRAM: SDRAM, DDR, DDR3, FCRAM, RLDRAM
  – SRAM: Sync SRAM, DDR SRAM, ZBT, QDR
  – FLASH, EEPROM
  – Cost-effective bulk storage
  – Memory controller cores for large memory requirements

Interfacing to External Memories
Supports many standards
  – 1.5 V to 2.5 V
  – Single or double data rate
  – Different protocols
High performance
  – With advanced ChipSync™ technology
    • This is the SelectIO functionality that allows each pin to connect directly to many different I/O standards; in this case, memory-specific standards improve speed
Embedded ECC logic

Virtex-6 Distributed RAM
[Figure: array of slices; some slices provide logic only, others can also act as RAM or shift register]
Distributed LUT memory
  – 64-bit blocks throughout the FPGA, available in 50% of the slices (SLICEM)
  – Single-port, dual-port, multi-port
  – Can be used as a 32-bit shift register
  – Very fast (sub-nanosecond)
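As a rough illustration of the 32-bit shift-register mode mentioned above, the sketch below models one LUT used as a shift register in Python. It is a behavioral sketch only, not Xilinx code: the class name `Srl32`, its methods, and the `init` parameter are hypothetical, and the addressable read of any stage is an assumption based on how LUT-based shift registers generally behave (one bit shifted in per enabled clock, one stage selected by an address).

```python
from collections import deque


class Srl32:
    """Behavioral model of a LUT used as a 32-bit shift register (hypothetical names)."""

    DEPTH = 32

    def __init__(self, init=0):
        # Stage 0 holds the most recently shifted-in bit; stage 31 holds the oldest.
        # Bit i of `init` preloads stage i, standing in for configuration-time contents.
        self.stages = deque(((init >> i) & 1 for i in range(self.DEPTH)),
                            maxlen=self.DEPTH)

    def clock(self, din, ce=True):
        """Model one rising clock edge: shift `din` in when the clock enable is high."""
        if ce:
            # With maxlen set, appendleft drops the oldest bit from the far end.
            self.stages.appendleft(din & 1)

    def read(self, addr):
        """Combinatorial read of the stage selected by `addr` (0..31)."""
        if not 0 <= addr < self.DEPTH:
            raise ValueError("addr must be in 0..31")
        return self.stages[addr]


if __name__ == "__main__":
    srl = Srl32()
    for bit in (1, 0, 1, 1):
        srl.clock(bit)
    # Stage 0 is the newest bit, so the sequence reads back in reverse order.
    print([srl.read(a) for a in range(4)])  # [1, 1, 0, 1]
```

Choosing the read address sets the delay through the register, which is why a single LUT can stand in for a short, variable-length delay line.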
Ideal for small and fast memories
  – Coefficient storage
  – Small data buffers
  – Small state machines
  – Small FIFOs
  – Shift registers

Distributed RAM Features
[Figure: array of slices; some slices provide logic only, others can also act as RAM or shift register]
Distributed LUT memory
  – Can be loaded by configuration
Synchronous (clocked) write operation
  – But asynchronous (combinatorial) read
    • Can be converted to a synchronous read when you use the associated slice flip-flop
Best built with the CORE Generator
  – Automatically builds the input decode and output multiplexer logic
Memory can be initialized with…
  – Text I/O code in VHDL
  – Coefficient file
  – Or an initialization file placed in HDL code if inferring the memory

Block RAM or FIFO
[Figure: 36K memory used as dual-port BRAM or FIFO]
Two independent ports address common data
  – Individual address, clock, write enable, clock enable
  – Independent widths for each port
  – Remember to leave the memory disabled when not writing, to avoid unpredictable results
Integrated control logic for fast and efficient FIFOs
Multiple configuration options
  – True dual-port, simple dual-port, single-port
  – Configurable aspect ratio
600-MHz operation when using the data pipeline option
  – All operations are synchronous; all outputs are latched
  – Data output has an optional internal pipeline register
Integrated cascade logic
  – Widens/deepens FIFOs
  – Creates 64K x 1 from two 32K x 1 block RAMs
Byte-write enable
  – Enhances processor memory interfacing
Integrated 64 / 72-bit Hamming error correction

Segment Block RAMs
Each block RAM can be used as…
  – (1) 36 Kb block RAM
  – OR (1) 36 Kb FIFO
  – OR (2) independent 18 Kb block RAMs
  – OR (1) 18 Kb FIFO + (1) 18 Kb block RAM
[Figure: one block shown as a 36 Kb BRAM or FIFO, and as two 18 Kb halves (18 Kb BRAM / FIFO and 18 Kb BRAM)]

True Dual-Port Block RAM
[Figure: 36 Kb memory array with Port A (Addr A, Wdata A, Rdata A) and Port B (Addr B, Wdata B, Rdata B), 36-bit data buses]
True dual-port flexibility
  – Can perform read and write operations simultaneously and independently on port A and port B
  – Each port has its own clock, enable, write enable
  – Every write also performs a read operation
    • Read before write, write before read, or no change
  – Simultaneous read + write or write + write to the same location can cause data corruption
    • Make sure that address and control signals are stable during operation
BRAM configurations
  – 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36
  – Or two independent: 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18
  – Each port can have its own depth x width
  – But a FIFO always has identical read and write width

Simple Dual-Port Block RAM
[Figure: 36 Kb memory array with a read port (Read Addr, Rdata A) and a write port (Write Addr, Wdata B), 72-bit data buses]
One read port and one write port
  – Natural structure for FIFOs
Allows widest implementation
  – 72-bit data on one or both 36 Kb ports
  – Up to 72-bit read and write in one clock cycle
  – 36-bit width for 18 Kb block RAM
  – Doubles the memory bandwidth per block
Dedicated parity bit with each block RAM
  – Extra bits can be used for any purpose
  – No dedicated parity checking logic

Block RAM Configurations
True dual-port (two fully independent read and write operations)
  – Each 18K: 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18
  – 36K: 32Kx1, 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36
Simple dual-port (1 read & 1 write port; read AND write in 1 cycle)
  – Each 18K: 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18, 512x36
  – 36K: 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36, 512x72
Single-port (1 read & 1 write port; read OR write in 1 cycle)
  – Each 18K: 16Kx1, 8Kx2, 4Kx4, 2Kx9, 1Kx18, 512x36
  – 36K: 16Kx2, 8Kx4, 4Kx9, 2Kx18, 1Kx36, 512x72

Block RAM is Cascadable
[Figure: two block RAMs joined by the built-in cascade logic (DI, A[13:0], A14, WE, DO, Ram_Extension setting)]
Built-in cascade logic for 64Kx1
  – Cascade two vertically adjacent 32Kx1 block RAMs without using external CLB logic or compromising performance
  – Saves resources and improves speed of larger memories
The CORE Generator enables you to easily build with the cascade option, or build even larger arrays
  – 128 Kb, 256 Kb, 512 Kb, 1 Mb, …
  – Using external CLB logic for depth expansion
    • Not quite as fast as cascaded block RAMs
  – Width expansion uses parallel block RAMs
Example: Cascade 8 block RAMs to build a 256-Kb memory

Integrated Error Correction
[Figure: two 18 kbit arrays with 64-bit ECC encode/decode, FIFO logic, and interconnect]
Built-in optional error correction / detection
  – 64-bit ECC (Hamming code, uses all 72 bits)
  – Corrects all single-bit errors
    • On outputs but not in the memory array
  – Detects, but does not correct, all double-bit errors
  – Identifies the address of the error
  – This feature is not included with Spartan-6
New feature
  – Deliberately insert an error for testing purposes
  – Useful when testing external memory interfaces

Byte-Wide Write Enable
Controls which byte is being written
  – One write enable for each byte of data and its parity bit
  – Useful when interfacing with processors
Byte write during write-first mode
  – Bytes not being written will show the existing memory value on the output
  – Output truly reflects the new memory content
  – Writing always implies a read
    • Be careful of reading when writing
[Timing example: address = a0, data = ABCD, we[3:0] = 1001; memory@a0 changes from FFFF to AFFD; output = AFFD]

FIFO
[Figure: FIFO ports: DIN bus, WREN, WRCLK, RDEN, RDCLK, RESET, DOUT bus, FULL, AFULL, EMPTY, AEMPTY, RDERR, WRERR, RDCOUNT<11:0>, WRCOUNT<11:0>]
600-MHz maximum frequency
  – Faster than a soft implementation and saves CLBs
Full featured
  – Synchronous or asynchronous read and write clocks
    • No phase relationship is required
  – Full, empty, programmable almost-full/empty
  – Synchronous FIFO mode eliminates the uncertainty of flag latency inherent in asynchronous operation
  – Optional first-word-fall-through
    • Immediate availability of the first word after empty
FIFO configurations
  – Any 36-Kb block RAM: 8Kx4, 4Kx9, 2Kx18, 1Kx36, 512x72
  – Any 18-Kb block RAM: 4Kx4, 2Kx9, 1Kx18, 512x36
    • Must be same width for read and write

FIFOs are Cascadable
[Figure: width cascade (FIFO #1 and FIFO #2 side by side, each taking part of the DIN/DOUT bus, labeled 512x144 FIFO) and depth cascade (FIFO #1 feeding FIFO #2, labeled 16Kx4 FIFO)]
Flexible FIFO configuration
  – Expand width, depth, or both using CLB logic
  – Build them with the CORE Generator FIFO Generator

Block RAM and FIFO Capacity
Virtex-6 Device   Distr. RAM (Kb)   Block RAM/FIFO (Kb)   Block RAM/FIFO Blocks
LX75T                  1,045               5,616                   156
LX130T                 1,740               9,504                   264
LX195T                 3,040              12,384                   344
LX240T                 3,650              14,976                   416
LX365T                 4,130              14,976                   416
LX550T                 6,200              22,752                   632
SX315T                 5,090              25,344                   704
SX475T                 7,640              38,304                 1,064
LX760                  8,280              25,920                   720

Two Design Flows
For ISE® tool design flow (CORE Generator)
  – Memory Interface Generator (MIG) wizard, within the CORE Generator tool
  – Best for all applications, except embedded applications
  – Simple GUI-driven tool for configuring soft memory controllers
  – Supports all Virtex-6 memory standards (DDR2, DDR3, QDR, RLDRAM)
For EDK design flow (Multi-Port Memory Controller, MPMC)
  – Best for embedded applications
  – Simple GUI-driven tool within the Embedded Development Kit (EDK)
  – Supports the same memory standards

Memory Interface Generator (MIG)
Customize your memory controller and interface design
  – Memory architecture, data rate, bus width
  – CAS latency, burst length
  – I/O bank assignments
  – Generates RTL source code and UCF from hardware-verified IP
Currently supported memory interfaces for Virtex-6
  – DDR2 SDRAM
  – DDR3 SDRAM
  – QDR II+ SRAM
  – RLDRAM II memory

MIG Design Flow
[Flowchart, in the order the steps appear:]
  – Open CORE Generator™
  – Run MIG, choose your memory parameters, and generate the RTL and UCF files
  – Integrate MIG .ucf constraints into the overall design constraints file
  – Import RTL and build options into the ISE project
  – Customize MIG design
  – Integrate/customize MIG memory RTL testbench
  – Perform functional simulation
  – Optional RTL customization
  – Synthesize design
  – Place and route design
  – Timing simulation
  – Verify in hardware

MIG
Customize your memory controller
  – Some options are specific to the memory controller standard
Options for the physical layer and the FPGA controller
  – Debug signals
  – IOB options for power or speed
MIG bank selection options
  – Displays pins required
  – Restricted columns are grayed out
  – Rules are based on IO banking, WASSO, clocking, and size of the memory controller

MIG Output Files
UCF folder
  – Pinout and clocking constraints
  – Batch file (ise_flow.bat) with recommended build options
RTL folder
  – Functional modules (physical layer, user interface, controller, testbench)
  – Unencrypted for ease of customization
Simulation folder
  – HDL simulation files including memory device models
Synthesis files folder

Instruction Reordering
DDR2/DDR3 controller state machine runs at half the memory clock rate
Reordering can double the throughput of the DDR2/DDR3 memory controller (MC)
  – New feature for the Virtex-6 soft memory controller
  – Reordering controller looks ahead several commands
  – Reordering READs avoids the precharge time penalty
  – Execute out-of-order READs to a different bank while performing precharge for the current bank
  – Efficiency is dependent on the application (address / command patterns)
Regrouping READs and WRITEs minimizes bus turnaround
  – Example: Read A - Write B - Read C - Write D
  – Reordered to: Read A - Read C - Write B - Write D

DDR3 Bank Selection Guidelines
DDR3 bank assignments
  – Address/Control pins must be in one of the center IO columns
  – 16-bit requires 1 data bank
  – 32-bit requires 2 data banks
  – 64/72-bit requires 3 data banks
  – Data banks must be +1/-1 rows from the Address/Control bank
MIG has a bank selection GUI that incorporates all the necessary rules
  – Rules are explained in the User Guide
[Figure: 6VLX240T / 6VSX315T bank map; an Address bank is marked in the row containing banks 25 and 35]
Outer Column   Center Columns
     17           27    37
     16           26    36
     15           25    35
     14           24    34
     13           23    33
     12           22    32

Summary
[Figure: FPGA with distributed LUT RAM, high-performance block RAM, and external memory interfacing]
Distributed LUT RAM
  – Fast, localized memories
  – Great for small FIFOs
Block RAM / FIFO
  – Bigger on-chip memories
  – Built-in FIFO and ECC logic
  – Great for mid-sized buffering
Memory controllers can be built with the MIG and the CORE Generator
  – Fast connection to popular standard RAMs
  – Ideal for large memory requirements

Where Can I Learn More?
User Guides
  – Virtex-6 FPGA User Guide
    • Describes the complete FPGA architecture, including distributed memory, block memory, and FIFO resources
  – Virtex-6 FPGA Memory Interface Solutions User Guide, UG406
    • Describes the complete library of memory controllers that can be made
    • Explains the rules behind the IO pin pre-optimization
Xilinx Education Services courses
  – www.xilinx.com/training
    • Designing with the Spartan-6 and Virtex-6 Families course
    • Xilinx tools and architecture courses
    • Hardware description language courses
    • Basic FPGA Architecture, Basic HDL Coding Techniques, and other free videos!

Trademark Information
Xilinx is disclosing this Document and Intellectual Property (hereinafter “the Design”) to you for use in the development of designs to operate on, or interface with Xilinx FPGAs. Except as stated herein, none of the Design may be copied, reproduced, distributed, republished, downloaded, displayed, posted, or transmitted in any form or by any means including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of Xilinx. Any unauthorized use of the Design may violate copyright laws, trademark laws, the laws of privacy and publicity, and communications regulations and statutes. Xilinx does not assume any liability arising out of the application or use of the Design; nor does Xilinx convey any license under its patents, copyrights, or any rights of others.
You are responsible for obtaining any rights you may require for your use or implementation of the Design. Xilinx reserves the right to make changes, at any time, to the Design as deemed desirable in the sole discretion of Xilinx. Xilinx assumes no obligation to correct any errors contained herein or to advise you of any correction if such be made. Xilinx will not assume any liability for the accuracy or correctness of any engineering or technical support or assistance provided to you in connection with the Design. THE DESIGN IS PROVIDED “AS IS" WITH ALL FAULTS, AND THE ENTIRE RISK AS TO ITS FUNCTION AND IMPLEMENTATION IS WITH YOU. YOU ACKNOWLEDGE AND AGREE THAT YOU HAVE NOT RELIED ON ANY ORAL OR WRITTEN INFORMATION OR ADVICE, WHETHER GIVEN BY XILINX, OR ITS AGENTS OR EMPLOYEES. XILINX MAKES NO OTHER WARRANTIES, WHETHER EXPRESS, IMPLIED, OR STATUTORY, REGARDING THE DESIGN, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND NONINFRINGEMENT OF THIRD-PARTY RIGHTS. IN NO EVENT WILL XILINX BE LIABLE FOR ANY CONSEQUENTIAL, INDIRECT, EXEMPLARY, SPECIAL, OR INCIDENTAL DAMAGES, INCLUDING ANY LOST DATA AND LOST PROFITS, ARISING FROM OR RELATING TO YOUR USE OF THE DESIGN, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THE TOTAL CUMULATIVE LIABILITY OF XILINX IN CONNECTION WITH YOUR USE OF THE DESIGN, WHETHER IN CONTRACT OR TORT OR OTHERWISE, WILL IN NO EVENT EXCEED THE AMOUNT OF FEES PAID BY YOU TO XILINX HEREUNDER FOR USE OF THE DESIGN. YOU ACKNOWLEDGE THAT THE FEES, IF ANY, REFLECT THE ALLOCATION OF RISK SET FORTH IN THIS AGREEMENT AND THAT XILINX WOULD NOT MAKE AVAILABLE THE DESIGN TO YOU WITHOUT THESE LIMITATIONS OF LIABILITY. 
The Design is not designed or intended for use in the development of on-line control equipment in hazardous environments requiring fail-safe controls, such as in the operation of nuclear facilities, aircraft navigation or communications systems, air traffic control, life support, or weapons systems (“High-Risk Applications”). Xilinx specifically disclaims any express or implied warranties of fitness for such High-Risk Applications. You represent that use of the Design in such High-Risk Applications is fully at your risk. © 2012 Xilinx, Inc. All rights reserved. XILINX, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx, Inc. All other trademarks are the property of their respective owners.