UG0574: RTG4 FPGA Fabric User Guide
Revision 2

Table of Contents

About this Guide
    Purpose
    Contents
    Additional Documentation
1 Fabric Architecture
    Introduction
    Fabric Resources
    Architecture Overview
        Logic Cluster
        Interface Cluster
        I/O Cluster
        FPGA Routing Architecture
    Fabric Array Coordinate System
2 Large SRAM (LSRAM)
    Introduction
        Features
    LSRAM Resources Table
    Functional Description
        Architecture Overview
        Port List
        Port Descriptions
    Memory Modes
        Dual-Port Mode
        Two-Port Mode
    Operating Modes
        Read Operation
        Write Operation
        ECC
        Reset Operation
        Block Select Operation
        Read Enable
        Collision
3 Micro SRAM (uSRAM)
    Introduction
        Features
    uSRAM Resource Table
    Functional Description
        Architecture Overview
        Port List
        Port Description
    Operating Modes
        Read Operation
        Write Operation
        ECC
        Reset Operation
        Collision
4 uPROM
    Introduction
    Features
    uPROM Resource Table
    Functional Description
        Architecture Overview
        Port List
        Operational Modes
5 Mathblocks
    Introduction
        Features
    Mathblock Resource Table
    Functional Description
        Architecture Overview
    How to Use Mathblocks
        Design Flow
        Mathblock Use Models
        Coding Style Examples
6 I/Os
    Introduction
    Functional Description
        MSIO, MSIOD, and DDRIO
        Transmit Buffer
        Receive Buffer
        Input Programming Delay
        On-Die Termination
    Radiation Hardening
    I/O Banks
    Supported I/O Standards
        Single-Ended Standards
        Voltage-Referenced Standards
        Differential Standards
    I/O Programmable Features
        Programmable Input Delay
        Pre-Emphasis
        Programmable Slew Rate Control
        Programmable Weak Pull-Up/Pull-Down
        Programmable Schmitt Trigger Input and Receiver
        Programmable Output Drive Strength
        Configurable ODT and Driver Impedance
    Cold Sparing
        5 V Input Tolerance and Output Driving Compatibility (only MSIO)
        Temperature Sensing
    I/Os in Shared By Fabric and FDDR
        DDRIOs with FDDR
        DDRIOs with Fabric
        MSIOs/MSIODs with Fabric
    JTAG I/O
    Dedicated I/Os
        Device Reset I/O
        SERDES I/O
    Dedicated Global I/Os
A Glossary
    Acronyms
    Terminology
B List of Changes
C Product Support
    Customer Service
    Customer Technical Support Center
    Technical Support
    Website
    Contacting the Customer Technical Support Center
        Email
        My Cases
        Outside the U.S.
    ITAR Technical Support

About this Guide

Purpose
The RTG4™ field programmable gate array (FPGA) device integrates fourth generation flash-based FPGA fabric with radiation tolerance. The RTG4 architecture has been designed to be radiation tolerant at the silicon level.
The FPGA fabric (the digital logic section of RTG4) is composed of 4-input look-up table (LUT) logic elements and includes embedded memories and mathblocks for digital signal processing (DSP) capabilities. This document describes the RTG4 FPGA fabric architecture, embedded memories, mathblocks, fabric routing, and input/output (I/O).

Contents
This user guide contains the following chapters:
• Chapter 1 - Fabric Architecture
• Chapter 2 - Large SRAM (LSRAM)
• Chapter 3 - Micro SRAM (uSRAM)
• Chapter 4 - uPROM
• Chapter 5 - Mathblocks
• Chapter 6 - I/Os

Additional Documentation
Table 1-1 lists additional documentation that is available for RTG4 FPGAs. Refer to the RTG4 Documentation web page for a complete, up-to-date listing.

Table 1-1 • RTG4 Additional Documents
Document | Description
RTG4 FPGA Product Brief | Provides an overview of the RTG4 FPGA family of devices, features and benefits, and ordering information.
RTG4 FPGA Datasheet | Provides details about RTG4 AC characteristics, DC characteristics, switching characteristics, and general specifications.
RTG4 FPGA Pin Descriptions | Contains RTG4 pin descriptions, bank location diagrams, packaging information, and links to pin assignment tables.
RTG4 FPGA High Speed DDR Interfaces User Guide | Describes the high-speed memory interfaces in the RTG4 FPGA devices. The functionalities of FDDR subsystems and configurations are also described.
RTG4 FPGA High Speed Serial Interfaces User Guide | Provides details about high-speed serial interfaces (SERDES) and integrated functionality support for multiple protocols within the RTG4 FPGA.
RTG4 FPGA Clocking Resources User Guide | Describes the RTG4 FPGA clocking resources, which include the fully SET-hardened fabric global network, clock conditioning circuitry (CCCs) with dedicated radiation-hardened phase-locked loops (PLLs), and a radiation-hardened 50 MHz RC oscillator.
RTG4 FPGA System Controller User Guide | Describes the System Controller that manages programming, initialization, and configuration of the RTG4 FPGA devices, and also the subsystems and interfaces available in the System Controller.
RTG4 FPGA Programming User Guide | Describes the programming modes that the RTG4 FPGAs support and provides details about implementation of programming modes that are validated in the RTG4 devices. The RTG4 device programming security, debugging features and methods are not discussed in this document.
RTG4 Debugging User Guide | -
RTG4 Board Layout User Guide | -
RTG4 Board Design User Guide | -
Libero SoC User Guide | Describes the usage of the Libero® System-on-Chip (SoC) software and the design flow.

1 – Fabric Architecture

Introduction
The RTG4 FPGA fabric comprises an array of flash-based, radiation-tolerant logic elements and embedded hard ASIC blocks such as large static random access memory (LSRAM), micro SRAM (uSRAM) for data storage, and mathblocks for DSP. These elements are arranged in rows inside the fabric and are interconnected by the clustered routing architecture. Each element in the fabric has a distinct logical coordinate value assigned to it. The registers in the embedded hard blocks have an option to mitigate single-event transients (SETs), and the memories have built-in error detection and correction (EDAC) with 1-bit error correction and 2-bit error detection. Because the fabric is flash-based, the RTG4 configuration is nonvolatile, and the logic elements do not need to be reprogrammed at each device power-up. Figure 1-1 shows a simple layout of the RTG4 FPGA fabric architecture.
Three types of resources constitute the major part of the fabric logic elements:
• Logic Element
• Interface Logic Element
• I/O Module

Logic elements: The logic element is the basic element used for implementing combinatorial circuits, arithmetic functions, and sequential circuits inside the fabric. Each logic element consists of a 4-input LUT, a self-corrected triple module redundancy (STMR) flip-flop, and a dedicated carry chain. The STMR flip-flops have an option to mitigate single-event transients.

Interface logic elements: The interface logic element is the logic element that interfaces the embedded hard blocks to the fabric. It makes the embedded hard block accessible through the fabric routing. It is structurally similar to the basic logic element but has no dedicated carry chain. It can be used to implement combinatorial and sequential circuits if the design does not use the associated embedded hard block.

I/O modules: The I/O module forms the digital part of the fabric user I/Os, also called multi-standard inputs/outputs (MSIOs). The I/O module enables the user I/Os to be connected to the fabric routing.

The RTG4 fabric uses a clustered routing architecture to interconnect the various elements inside the fabric. In the clustered architecture, logic elements are grouped together to form clusters. There are three types of clusters in the RTG4 FPGA fabric:
• Logic clusters
• Interface clusters
• I/O clusters

The logic cluster is composed of 12 logic elements, the interface cluster is composed of 12 interface logic elements, and the I/O cluster is composed of 3 I/O modules. I/O clusters are distributed on all four sides of the device, as shown in Figure 1-1 (north, south, east, and west I/O clusters).

Fabric Resources
Table 1-1 shows the fabric resources available on RTG4 devices.
Table 1-1 • Fabric Resources for RTG4 Devices
Fabric Resource | RT4G075 | RT4G150
Logic elements (4-input LUT + TMR/SET FF) | 77,712 | 151,824
LSRAM 24.5 Kbit blocks | 111 | 209
uSRAM 1.5 Kbit blocks | 112 | 210
uPROM | 254 | 381
Mathblocks | 224 | 462
PLLs and CCCs (Rad Tolerant) | 8 | 8

Figure 1-1 • RTG4 Simple Layout
(Figure not reproduced. Labeled blocks: Logic Clusters, LSRAMs, uSRAM, uPROM, Mathblocks, PLL and CCC, SERDES + Hardened IP (PCI Express). Inset: One Logic Cluster, made up of 12 Logic Elements.)

Architecture Overview
The RTG4 FPGA fabric has rows composed of the following:
• Logic cluster
• Interface cluster
• I/O cluster
• LSRAM
• uSRAM
• Mathblocks
• Global clock distribution stripes

Logic Cluster
The logic cluster is a combination of 12 logic elements with a dedicated hardwired carry chain implemented for all 12 logic elements. The logic clusters contain routing MUXes. Each routed signal is driven by a unique logic element output or by a routing MUX. All the logic elements are interconnected with feedback from outputs to inputs. The intra-routing inside the logic clusters has very low propagation delay compared to the routing outside the logic clusters. Each LUT, D-flip-flop, and carry circuit in the logic cluster has an individual X-Y logical coordinate assigned, which makes them independently addressable. Figure 1-2 shows the top-level logic cluster layout diagram.
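As a rough illustration of how a carry chain links the 12 logic elements of a cluster, the following Python sketch adds two 12-bit values using one bit per logic element. It is a functional model only, not the hardware implementation: it assumes each LUT is configured to output A XOR B, forms the sum bit as the LUT output XORed with the carry-in (as this guide describes for arithmetic mode), and ripples the carry in software instead of modeling the look-ahead circuits. All function names are hypothetical.

```python
from typing import List, Tuple

CLUSTER_SIZE = 12  # logic elements per logic cluster, as stated in this guide


def to_bits(value: int, width: int = CLUSTER_SIZE) -> List[int]:
    """Split an integer into `width` bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    """Rebuild an integer from bits given least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def cluster_add(a_bits: List[int], b_bits: List[int],
                carry_in: int = 0) -> Tuple[List[int], int]:
    """Add two 12-bit operands, one bit per logic element.

    Returns the 12 sum bits and the carry leaving the cluster.
    """
    assert len(a_bits) == len(b_bits) == CLUSTER_SIZE
    sums = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        lut_out = a ^ b                      # assumed LUT configuration
        sums.append(lut_out ^ carry)         # sum = LUT output XOR carry-in
        carry = (a & b) | (lut_out & carry)  # carry passed to next element
    return sums, carry


sum_bits, carry_out = cluster_add(to_bits(2748), to_bits(1365))
print(from_bits(sum_bits), carry_out)
```

With these operands the 12 sum bits encode 17 and the carry-out is 1, because 2748 + 1365 = 4113 = 4096 + 17. In a wider adder, that carry-out would feed the carry-in of the next cluster.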
Figure 1-2 • Top-Level Logic Cluster Layout
(Figure not reproduced. Labels: Dedicated Carry Chain, Cluster Carry In, Cluster Carry Out, Logic Elements, Intra-cluster Routing, Routing Muxes, Buffers.)

Logic Element
The logic element is the base element in a logic cluster. It consists of:
• Combinational logic element (CLE) - 4-LUT with carry chain
• Sequential logic element (SLE) - STMR flip-flop

Figure 1-3 shows the functional block diagram of the logic element with a carry chain.

Figure 1-3 • Functional Block Diagram of Logic Element
(Figure not reproduced. Blocks: LUT with Carry Chain, STMR Flip-Flop, Routing MUXes. Signals: A, B, C, D, Cin, Cout, S (SUM), Y, Q, EN, CLK, SL_N, AL_N.)

Combinational Logic Element
Each CLE consists of:
• A 4-input LUT
• A dedicated carry chain based on the carry look-ahead technique

The 4-input LUT can be configured to implement any 4-input combinatorial function or an arithmetic function, where the LUT output is XORed with the carry input (Cin) to generate the sum (S) output. The sum output, S, is typically used as an output for arithmetic functions, but it can also be used as an output for logical functions along with the other output, Y, when the LUT is used to implement combinatorial functions. Each logic element has a dedicated 3-bit look-ahead carry implementation that is used to build a dedicated carry chain between the logic elements when the LUT is used to implement arithmetic operations. Each cluster has one carry initialization bit and four look-ahead circuits. The carry chain has hardwired routing nets running between the logic elements, which reduces the carry propagation delay through the chain and therefore improves performance.

Sequential Logic Element
Each logic element has a SET-mitigated, asynchronous, self-corrected TMR D flip-flop (STMR), which can be used as a sequential logic element. The self-corrected TMR flip-flop can be configured as a register.
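The principle behind triple module redundancy can be sketched in a few lines of Python. The model below is a generic illustration of majority voting over three stored copies, not a description of the actual RTG4 circuit: the class, its methods, and the explicit `self_correct` step are hypothetical, and timing, the SET filter delay, and the load/enable inputs are not modeled.

```python
def majority(a: int, b: int, c: int) -> int:
    """Return the value held by at least two of the three inputs."""
    return (a & b) | (b & c) | (a & c)


class TmrRegister:
    """Behavioral model of a 1-bit register stored in three copies."""

    def __init__(self, value: int = 0) -> None:
        self.copies = [value, value, value]

    def clock(self, d: int) -> None:
        """Load the same data bit into all three copies."""
        self.copies = [d, d, d]

    def upset(self, index: int) -> None:
        """Flip one copy to mimic a single event upset (SEU)."""
        self.copies[index] ^= 1

    @property
    def q(self) -> int:
        """Voted output: correct as long as at most one copy is wrong."""
        return majority(*self.copies)

    def self_correct(self) -> None:
        """Write the voted value back so a flipped copy is repaired."""
        voted = self.q
        self.copies = [voted, voted, voted]


reg = TmrRegister()
reg.clock(1)
reg.upset(0)          # one copy is corrupted
print(reg.q)          # voted output still reflects the loaded value
reg.self_correct()
print(reg.copies)     # all three copies agree again
```

The sketch also shows the limit of the scheme: if two copies are flipped before any correction happens, the voter returns the wrong value, which is why writing the voted value back matters.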
Figure 1-4 shows the functional block diagram of the STMR flip-flop. Each STMR flip-flop has asynchronous majority voter logic that ensures SEU immunity within the timeline of an SET pulse width. Its triple module redundancy mitigates single event upset (SEU) errors. It has hardened asynchronous load (AL_n), synchronous load (SL_n), and clock enable (EN) inputs. AL_n can be used as a single global asynchronous set or reset signal shared by all fabric STMR flip-flops. It sets or resets the register depending on configuration. SL_n can be used as a synchronous set or reset signal for each fabric STMR flip-flop. It sets or resets the register depending on configuration.

The data input of the STMR flip-flop can be fed from the direct input or from the output of the 4-input LUT inside the logic element. The data input (D) has a programmable delay circuit that derives a delayed copy of the data for SET mitigation. The delay value determines the maximum SET glitch width that can be filtered out.

STMR flip-flops support SET-mitigated and non-mitigated modes. The mode is selected in the Libero SoC tool. Refer to the Libero SoC User Guide for details on how to set the mitigation. Non-mitigated timing is significantly faster than mitigated timing. Setting the fabric flip-flops in critical timing paths to non-mitigated mode improves the application speed significantly at the cost of a nominal reduction in radiation tolerance.

Figure 1-4 • Functional Block Diagram of STMR Flip-Flop
(Figure not reproduced. Blocks: Delay, three FFs each with d, clk, control, and q, Control Logic, Majority voter. Signals: d, CLK, SL_n, AL_n, delay_en, delay_sel[], STMR output.)

Interface Cluster
The interface cluster is similar to the logic cluster except that it is a combination of 12 interface logic elements. These clusters are used to interface the inputs and outputs of the embedded hard blocks (LSRAM, uSRAM, mathblocks, and CCCs) to the fabric routing.
Each embedded hard block is spanned by three interface clusters, as shown in Figure 1-5. The interface logic element can be used as a normal logic element (without carry chain) when the design does not use the associated embedded hard block.

Figure 1-5 • IP Interface Cluster
(Figure not reproduced. Labels: Embedded IPs (LSRAMs, uSRAMs, Mathblocks), Clusters Wide, Interface Logic Elements, Interface Logic (LUT, FF), Routing, IP Interface Cluster.)

Interface Logic Element
The embedded hard IP blocks (LSRAM, uSRAM, and mathblocks) contain dedicated interface logic elements. The embedded hard blocks are connected to the fabric routing structure through LUTs and STMR flip-flops on their inputs and outputs, and these together form the interface logic element. Each embedded hard block is associated with 36 interface logic elements. The interface logic element is structurally similar to a logic element: it has a 4-input LUT and an STMR flip-flop, but no dedicated carry chain. Interface logic elements are TMR'd and have the same SET mitigation as SLEs.

If an embedded hard block is used by the target design, the interface logic elements connect the I/Os of the embedded hard block to the fabric routing. If an embedded hard block is not used by the design, its interface logic elements are available as normal logic elements for implementing combinatorial and sequential circuits. These are in addition to the logic elements available in the fabric.

I/O Cluster
I/O clusters are combinations of I/O modules and the associated routing interfaces. Each I/O cluster contains three I/O modules.

I/O Module
The I/O module includes the I/O digital (IOD) circuitry and the associated routing interface. Each user I/O pad is connected to its own dedicated I/O module.
The I/O module interfaces the user I/Os with the fabric routing and enables external signals coming in through the I/Os to reach all the logic elements. The I/O modules also enable the internal signals to reach the I/Os. Figure 1-6 shows the functional diagram of the complete I/O module with the IOD and I/O analog (IOA) sections.

The IOD circuitry consists of the following:
• Input registers: Used to register the inputs received from the I/Os. The input registers allow capturing the input signals and synchronizing them to the design clock.
• Output registers: Used in the I/O modules for registering the output signals at I/Os for better design performance. The output register provides the registered version of the output signals to the I/Os.
• Output enable registers: Act as a control signal for the output, if the I/O is configured as a tristate or bi-directional I/O.
• Routing multiplexers (MUXes): These routing MUXes are used to connect logic elements.

All these registers in the I/O modules are similar to the STMR flip-flop available in the logic element. For a signal bus, these registers ensure that all the signal bus bits are synchronized to the clock signal when sent out through I/Os. For more information on IOA, refer to the "I/Os" chapter.

Figure 1-6 • I/O Module Functional Block Diagram

FPGA Routing Architecture

The RTG4 FPGA fabric has a clustered routing architecture.
Clustering is a hierarchical grouping of fabric resources that allows area-efficient implementation of designs while maintaining optimal performance. It also helps to reduce the run-time of the place-and-route software.

Routing Structure

Each routing interface includes multiple MUXes and routing buffers. Each routed signal is driven by a unique logic element output or routing MUX. The routing of a design is completed automatically by the software, and hence the utilization of the routing resources is completely transparent to the user. The selection among the various routing resources by the place-and-route software is influenced by the design constraints provided. Refer to the RTG4 SmartTime, I/O Editor and ChipPlanner User Guide in the Libero SoC software for details on how to apply constraints. Timing-driven constraints and placement constraints can be used to guide the placement of user logic. Knowledge of the routing architecture and functional modules is required to provide effective design constraints to the software and to achieve an optimal design implementation on the RTG4 fabric.

In the RTG4 device, there are two types of fabric routing:
• Inter-cluster routing
• Intra-cluster routing

Figure 1-7 shows the fabric routing structure for the RTG4 device.

Figure 1-7 • Fabric Routing Structure

Inter-cluster routing spans the clusters and connects them. The inter-cluster routing resource is common to all the clusters inside the fabric and is universal across the clusters. Intra-cluster routing spans the modules that constitute a cluster. Intra-cluster routing varies from cluster to cluster, depending on the functionality of the cluster.
For example, the intra-cluster routing for an interface cluster is different from that of a logic cluster. The differences in the routing of the various interface clusters depend on the embedded hard block to which they interface.

Inter-cluster routing is different from intra-cluster routing. Inter-cluster routing never drives the inputs of the functional modules (logic elements, interface logic elements, or I/O modules) directly, and the outputs of the functional modules do not drive the inter-cluster routing directly. Inter-cluster routing has to pass through the intra-cluster routing to reach the functional modules. This makes RTG4 routing a fully clustered routing architecture.

The global network can also drive intra-cluster routing through special routing MUXes. These global routing MUXes bring in STMR flip-flop control signals such as clock, enable, and sets/resets.

There are a few short routing lines between the adjacent clusters and the inter-cluster and intra-cluster routing MUXes. These short paths provide better performance for the signals routed through them.

Fabric Array Coordinate System

All elements in the RTG4 FPGA fabric have individual logical X-Y coordinates associated with the fabric array coordinate system. These logical coordinates are used by the place-and-route software when implementing the design using the fabric elements. Constraints can be set in the place-and-route software to place design components in specific locations inside the fabric using this coordinate system. Regions can be created inside the fabric and a particular part of the design can be assigned to a region using the floor-planner in Libero SoC. The boundaries of these regions can be specified using the array coordinates. Similarly, the embedded hard blocks are also addressable through the fabric coordinate system.
The array coordinates are measured from the bottom left corner to the top right corner of the FPGA fabric. Table 1-2 provides the array coordinates of logical modules and embedded hard blocks of the RTG4 devices. For more information on how to use array coordinates for region/placement constraints, refer to the Libero SoC User Guide or online help (available in the software) for RTG4 Libero SoC tools.

Figure 1-8 • RT4G150 Fabric Logical Coordinates

Table 1-2 • Fabric Array Coordinate Systems*

Device | Logic Elements (Minimum X, Y; Maximum X, Y) | uSRAM (Bottom, Middle, Top) | LSRAM (Bottom, Middle, Top) | Mathblocks (Bottom, Middle, Top)
RT4G075 | – | – | – | –
RT4G150 | – | – | – | –

Note: *Coordinates will be filled when new devices are added.

2 – Large SRAM (LSRAM)

Introduction

The RTG4 FPGA fabric has embedded 24 Kbit SRAM blocks used for storing data. These LSRAMs are arranged in multiple rows within the FPGA fabric and can be accessed through the fabric routing architecture. The number of available LSRAM blocks depends on the specific RTG4 device, as shown in Table 2-1. For example, the RT4G150 devices have 209 LSRAM blocks, which are spread across three rows inside the fabric.

Features

RTG4 LSRAM blocks have the following features:
• Each LSRAM block can store up to 24,576 bits of data and can be configured in any of the following depth × width combinations: 512 × 36, 1K × 18, 2K × 12, or 2K × 9. Only the ×12 port width accesses the entire address space of 24,576 bits. The ×9, ×18, and ×36 address space is limited to 18,432 bits.
• The registers in the LSRAM block are similar to the STMR flip-flop in the fabric and have an option to mitigate single-event transients.
• Each LSRAM block contains two independent data ports: Port A and Port B.
• The LSRAM block is synchronous for both read and write operations. These operations are triggered on the rising edge of the clock.
• The LSRAM block has built-in error detection and correction (EDAC) with 1-bit error correction and 2-bit error detection for the ×18 and ×36 modes, but not for the ×9 and ×12 modes. EDAC is referred to as ECC in the description, ports, and timing diagrams.
• When ECC is enabled, each port of the LSRAM block can raise flags to indicate single-bit-correct and double-bit-detect.
• LSRAM can be operated in dual-port mode and two-port mode.
• LSRAM supports pipelined read and non-pipelined read (flow-through) operations.
• LSRAM supports three types of write operations:
  – Simple write
  – Feed-Through write (write-bypass write)
  – Read before write
• LSRAM has a read-enable control in both dual-port and two-port modes.
• The address, data, block-port select, write-enable, and read-enable inputs are registered.
• An optional pipeline register with a separate enable and synchronous reset is available at the read-data port to improve the clock-to-out delay.
• A write operation requires one clock cycle.
• A read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data appears in the next cycle.
• Read from both the ports at the same location is allowed.
• Read and write on the same location at the same time is not allowed. The LSRAM has no built-in collision prevention or detection circuit.

LSRAM Resources Table

Table 2-1 shows the LSRAM rows and the 24.5 Kb blocks available in the RTG4 devices.

Table 2-1 • RTG4 LSRAM (24.5 Kb Blocks) Resource Table

Blocks | RT4G075 | RT4G150
LSRAM 24.5 K Blocks | 111 | 209

Note: All numbers given above are per device.
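The capacity figures in the feature list and in Table 2-1 can be checked with a few lines of arithmetic. The following is a minimal sketch; the function and dictionary names are illustrative assumptions, and only the numbers (24,576 bits per block, the four depth × width configurations, and the per-device block counts) come from this guide.

```python
# Sketch only: names below are illustrative, not vendor identifiers.
BLOCK_BITS = 24_576  # bits per LSRAM block (from the feature list)

# depth x width configurations listed in the feature list
CONFIGS = {
    "512x36": (512, 36),
    "1Kx18": (1024, 18),
    "2Kx12": (2048, 12),
    "2Kx9": (2048, 9),
}

# LSRAM blocks per device (Table 2-1)
BLOCKS_PER_DEVICE = {"RT4G075": 111, "RT4G150": 209}


def addressable_bits(depth: int, width: int) -> int:
    """Bits reachable through a port in the given depth x width configuration."""
    return depth * width


def device_lsram_bits(device: str) -> int:
    """Total raw LSRAM bits on a device, counting 24,576 bits per block."""
    return BLOCKS_PER_DEVICE[device] * BLOCK_BITS


if __name__ == "__main__":
    for name, (depth, width) in CONFIGS.items():
        print(f"{name}: {addressable_bits(depth, width)} of {BLOCK_BITS} bits")
    for device in BLOCKS_PER_DEVICE:
        print(f"{device}: {device_lsram_bits(device)} bits total")
```

Only the 2K × 12 configuration reaches all 24,576 bits; the other three configurations each address 18,432 bits, which matches the limit stated in the feature list.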
Functional Description

This section provides the detailed description of the following:
• Architecture Overview
• Port List
• Port Descriptions

Architecture Overview

The RTG4 LSRAM embedded memory includes the RAM1Kx24 macro. Figure 2-1 shows a simplified block diagram of the LSRAM memory block and Table 2-2 provides the port descriptions. Figure 2-1 displays two independent data ports, the pipeline registers for read data delay, and the Feed-Through multiplexers to enable immediate access to the write data.

Figure 2-1 • Simplified Functional Block Diagram for LSRAM

Port List

Table 2-2 • Port List for LSRAM Macro (RAM1KX18)

Port Name | Direction | Type1 | Description | Polarity
PORT A
A_WIDTH[1:0] | Input | Static | Port A Width/depth mode select | –
A_WEN[1:0]2 | Input | Dynamic | Port A Write enable | High
A_REN | Input | Dynamic | Port A Read enable | High
A_ADDR[10:0] | Input | Dynamic | Port A Address input | –
A_DIN[17:0] | Input | Dynamic | Port A Data input | –
A_DOUT[17:0] | Output | Dynamic | Port A Data output | –
A_BLK[2:0] | Input | Dynamic | Port A Block select | High
A_WMODE[1:0] | Input | Static | Port A Feed-Through write select | High
A_CLK | Input | Dynamic | Port A Clock | Rising
A_DOUT_SRST_N | Input | Dynamic | Port A Pipeline Synchronous reset | Low
A_DOUT_EN | Input | Dynamic | Port A Pipeline register enable | High
A_DOUT_BYPASS | Input | Static | Port A output pipeline bypass mode | Active High
A_SB_CORRECT | Output | Dynamic | Port A 1-bit error correction flag | High
A_DB_DETECT | Output | Dynamic | Port A 2-bit error detection flag | High
PORT B
B_WIDTH[1:0] | Input | Static | Port B Width/depth mode select | –
B_WEN[1:0]2 | Input | Dynamic | Port B Write enable | High
B_REN | Input | Dynamic | Port B Read enable | High
B_ADDR[10:0] | Input | Dynamic | Port B Address input | –
B_DIN[17:0] | Input | Dynamic | Port B Data input | –
B_DOUT[17:0] | Output | Dynamic | Port B Data output | –
B_BLK[2:0] | Input | Dynamic | Port B Block select | High
B_WMODE[1:0] | Input | Static | Port B Feed-Through write select | High
B_CLK | Input | Dynamic | Port B Clock | Rising
B_DOUT_SRST_N | Input | Dynamic | Port B Pipeline Synchronous reset | Low
B_DOUT_EN | Input | Dynamic | Port B Pipeline register enable | High
B_DOUT_BYPASS | Input | Static | Port B output pipeline bypass mode | Active High
B_SB_CORRECT | Output | Dynamic | Port B 1-bit error correction flag | High
B_DB_DETECT | Output | Dynamic | Port B 2-bit error detection flag | High
Common Signals
ECC | Input | Static | Error correction code (ECC) enable, turns on the ECC encoders, decoders, and registers | High
ECC_BYPASS | Input | Static | ECC pipeline bypass | High
DELEN | Input | Static | SET mitigation | High
ARST_N | Input | Global | Pipeline registers Asynchronous reset | Active Low
BUSY | Output | Dynamic | Busy signal from SII | Active High
SECURITY | Input | Static | Lock access to SII | Active High

Notes:
1. Static inputs are defined during the design time and need to be tied to 0 or 1.
2. If the LSRAM block is configured in Two-port mode with a write data width of x36 and read data width of x36, both the bits of A_WEN and B_WEN must be tied to logic 1 and must not be dynamically changed.
Port Descriptions

A_WIDTH[1:0] and B_WIDTH[1:0]

These signals are the depth × width mode selections for each port. Table 2-3 shows the depth × width based on the port width selection.

Table 2-3 • Depth/Width Mode Selection

A_WIDTH/B_WIDTH | Depth/Width
00 | 2K × 12, 2K × 9
01 | 1K × 18
1x | 512 × 36 (Two-port)

A_WEN[1:0] and B_WEN[1:0]

These signals are the write enables for each port and select between read and write operations. Table 2-4 shows the operation selected by the port write enable for each depth × width.

Table 2-4 • Read/Write Operation Selection1, 2

Depth x Width | A_WEN/B_WEN | Operation
1K × 18, 2K × 9, 2K × 12 | 00 | Read operation
2K × 12, 2K × 9 | 11 | Write operation
1K × 18 | 01 | Write [8:0]
1K × 18 | 10 | Write [17:9]
1K × 18 | 11 | Write [17:0]
512 × 36 (Two-port write - Port B) | A_WEN[1:0] = 11, B_WEN[1:0] = 11 | Write [35:0]

Notes:
1. In Dual-port mode, every port reads when the corresponding write enable (A_WEN/B_WEN) is 00 and the corresponding port select (A_BLK/B_BLK) is active.
2. In Two-port mode, the read port (Port A) reads in every clock cycle if A_BLK is active.

A_ADDR[10:0] and B_ADDR[10:0]

These signals are the address buses for the two ports. In ×12 mode and ×9 mode, all 11 bits are used to address the 2048 independent locations. In wider modes (×18, ×36), fewer address bits are used. The used address bits are the most significant bits (MSBs). The unused bits are the least significant bits (LSBs) and they must be grounded. Table 2-5 shows the address bus used and unused bits for depth × width selections.
Table 2-5 • Address Bus Used and Unused Bits

Depth x Width | A_ADDR/B_ADDR Used Bits | Unused Bits (to be grounded)
2K × 12, 2K × 9 | [10:0] | None
1K × 18 | [10:1] | [0]
512 × 36 (Two-port) | [10:2] | [1:0]

A_DIN[17:0] and B_DIN[17:0]

These signals are the data input buses for the two ports. In Dual-port mode, the data width can be 9 bits, 12 bits, or 18 bits. In Two-port mode, Port B becomes the write-only port. For a write data width of 36 bits, A_DIN[17:0] becomes write data[35:18] and B_DIN[17:0] becomes write data[17:0]. The used bits for any mode are LSB justified in the data bus and the unused MSBs must be grounded. Table 2-6 shows the data input buses used and unused bits for depth × width selections.

Table 2-6 • Data Input Buses Used and Unused Bits

Depth x Width | A_DIN/B_DIN Used Bits | Unused Bits (to be grounded)
2K × 12 | [11:0] | [17:12]
2K × 9 | [8:0] | [17:9]
1K × 18 | [17:0] | None
512 × 36 (Two-port Write) | A_DIN[17:0] is [35:18], B_DIN[17:0] is [17:0] | None

A_DOUT[17:0] and B_DOUT[17:0]

These signals are the data output buses for the two ports. In Dual-port mode, the data width can be 9 bits, 12 bits, or 18 bits. In Two-port mode, Port A becomes the read-only port and Port B becomes the write-only port. For a read data width of 36 bits, A_DOUT[17:0] becomes read data[35:18] and B_DOUT[17:0] becomes read data[17:0]. The used bits for any mode are LSB justified in the data bus and the unused MSBs must be grounded. Table 2-7 shows the data output buses used and unused bits for depth × width selections.

Table 2-7 • Data Output Buses Used and Unused Bits

Depth x Width | A_DOUT/B_DOUT Used Bits | Unused Bits (to be grounded)
2K × 12 | [11:0] | [17:12]
2K × 9 | [8:0] | [17:9]
1K × 18 | [17:0] | None
512 × 36 (Two-port) | A_DOUT[17:0] is [35:18], B_DOUT[17:0] is [17:0] | None

A_BLK[2:0] and B_BLK[2:0]

These signals are the port select control signals for each port block.
Table 2-8 shows the operations (Read, Write, and No operation) based on the selection of port select control signals.

Table 2-8 • Block Select Control Signals

Port Select Signal | Value | Result
A_BLK[2:0] | 111 | Perform read or write operation on Port A. In 36 width mode, perform a read operation from both ports A and B.
A_BLK[2:0] | 000, 001, 010, 011, 100, 101, 110 | No operation in memory from Port A. Port A read-data will be forced to logic 0. In 36 width mode, the read-data from both ports A and B are forced to 0.
B_BLK[2:0] | 111 | Perform read or write operation on Port B. In 36 width mode, perform a write operation to both ports A and B.
B_BLK[2:0] | 000, 001, 010, 011, 100, 101, 110 | No operation in memory from Port B. Port B read-data is forced to 0, unless it is a 36 width mode and write operation to both ports A and B is gated.

A_WMODE[1:0] and B_WMODE[1:0]

These signals represent the Write mode control signals for Port A and Port B.

Table 2-9 • Depth/Width Mode Selection

A_WMODE / B_WMODE | Write Mode
00 | Simple Write
01 | Feed-Through; write data appears on the corresponding output data port. In Two-port mode, Feed-Through write is not supported.
10 | Read before write mode. In Two-port mode, Read before write mode is not supported.
11 | No operation.

A_CLK and B_CLK

These signals are the synchronous clock inputs for Port A and Port B. All inputs must be set up before the rising edge of the clock. The read or write operation begins with the rising edge.

A_DOUT_SRST_N and B_DOUT_SRST_N

These signals are Active Low, synchronous reset inputs for the output pipeline registers for Port A and Port B. Assertion of these reset signals forces the data output to logic 0. This does not reset the ECC pipeline registers.

A_DOUT_EN and B_DOUT_EN

These signals are Active High enable inputs for the output pipeline registers for Port A and Port B.
• Logic 1: Normal register operation
• Logic 0: Register holds previous data

ECC

This signal is an Active High enable for ECC logic on Port A and Port B.
• Logic 1: ECC logic enable
• Logic 0: ECC logic disable

ECC_BYPASS

The ECC pipeline registers have a bypass mode for slower operations.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation

A_SB_CORRECT, B_SB_CORRECT

These are Error Correction Code flags for Port A and Port B. The flag going High indicates that a single-bit error has been detected by that port and corrected in the data output. This flag also goes High when a double-bit error is detected. Flags for each port are independent of the opposite port, even in x36 width.

A_DB_DETECT, B_DB_DETECT

These are Error Detection Code flags for Port A and Port B. The flag going High indicates that multiple bit errors have been detected by that port, but have not been corrected. Flags for each port are independent of the opposite port, even in x36 width.

DELEN

This signal enables single-event transient mitigation. When this signal is driven High, the delay for the glitch filters is turned ON. LSRAM supports a maximum frequency of up to 250 MHz with the glitch filter and 300 MHz without the glitch filter.

ARST_N

This signal is the global asynchronous reset. When this signal is driven Low, the output registers, outputs, and write enables are reset.

A_REN and B_REN

These are the read enable signals for the A and B ports. If the read enable is Low, the outputs retain their previous values and no dynamic read power is consumed.

A_DOUT_BYPASS and B_DOUT_BYPASS

The pipeline registers have bypass mode inputs for each port.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation

BUSY

This output indicates that the LSRAM is being accessed by the SII.

SECURITY

This is a control signal for security. When this signal is driven High, the entire LSRAM memory gets locked and cannot be accessed by the SII.
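The address alignment in Table 2-5 and the write-enable decoding in Table 2-4 can be sketched in software as follows. This is a hedged illustration rather than vendor code: the helper names `port_address` and `written_bits` are hypothetical, and only the bit positions are taken from the tables.

```python
# Illustrative helper names; only the bit positions come from Tables 2-4 and 2-5.
UNUSED_ADDR_LSBS = {9: 0, 12: 0, 18: 1, 36: 2}  # grounded LSBs per data width


def port_address(word_index: int, width: int) -> int:
    """Place a logical word index on the 11-bit address bus, grounding unused LSBs."""
    shift = UNUSED_ADDR_LSBS[width]
    depth = 2048 >> shift  # 2K, 1K, or 512 words
    if not 0 <= word_index < depth:
        raise ValueError(f"word index {word_index} out of range for x{width}")
    return word_index << shift


def written_bits(wen: int, width: int):
    """Return the (msb, lsb) data range written for WEN[1:0], or None for a read."""
    if wen == 0b00:
        return None  # 00 selects a read in every mode
    if width == 18:
        return {0b01: (8, 0), 0b10: (17, 9), 0b11: (17, 0)}[wen]
    if width in (9, 12):
        if wen != 0b11:
            raise ValueError("x9/x12 modes accept only WEN = 00 or 11")
        return (width - 1, 0)
    raise ValueError("x36 writes tie both A_WEN and B_WEN to 11")
```

For example, word 3 in ×18 mode lands on address value 6 because bit 0 is grounded, and `written_bits(0b10, 18)` selects the upper byte lane [17:9].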
Memory Modes

LSRAM can be configured as a dual-port SRAM or two-port SRAM. The easiest way to configure LSRAM is to use the Libero SoC tool.

Dual-Port Mode

The LSRAM block configured as dual-port SRAM provides a data storage capability of 24 Kbits with two independent access ports: Port A and Port B (Figure 2-2). Read and write operations can be performed from both the ports independently at any location as long as there is no collision (simultaneous access to the same address). In Dual-port mode, the maximum data width is x18 for either port. Each port of the LSRAM can be configured in the following depth × width configurations:
• 1K × 18
• 2K × 9
• 2K × 12

Figure 2-2 shows the data path for the dual-port SRAM (DPSRAM).

Figure 2-2 • Data Path for Dual-Port Mode

Data can be written to either or both ports and can also be read from either or both ports. Each port has its own address, data in, data out, clock, block select, and write enable. The read and write operations are synchronous and require a clock edge.

There is no collision detection or prevention circuit built into LSRAM. Simultaneous write operations from both the ports to the same address location result in data uncertainty. Simultaneous read and write operations from both the ports to the same address location result in correct data being written into the memory but garbage values being read out.

The read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data appears in the next cycle. The write operation requires one clock cycle. Table 2-10 shows the data width configurations that are supported by the LSRAM block configured in Dual-port mode.
Table 2-10 • Data Width Configurations for LSRAM in Dual-Port Mode

Port A Data Width (x number of bits) | Port B Data Width (x number of bits)
x9 | x9, x18
x12 | x12
x18 | x9, x18

Table 2-11 shows the modes of operation and the data input and output pins used for a simultaneous read from Port A and write to Port B. Writing to Port A and reading from Port B at the same time is also valid. Simultaneous write and read is supported except to the same address.

Table 2-11 • Dual-port Mode of Operation

Mode of Operation | Pins Used
R18/W18 | A_DOUT[17:0], B_DIN[17:0]
R18/W9 | A_DOUT[17:0], B_DIN[8:0]
R12/W12 | A_DOUT[11:0], B_DIN[11:0]
R9/W18 | A_DOUT[8:0], B_DIN[17:0]
R9/W9 | A_DOUT[8:0], B_DIN[8:0]

Two-Port Mode

The LSRAM block configured as two-port SRAM provides a data storage of 24 Kbits, with Port A dedicated to read operations and Port B dedicated to write operations (refer to Figure 2-3). In Two-port mode, the data width for the read port (Port A) or the write port (Port B) is x36.

Figure 2-3 • Data Path for Two-Port Mode

When the read port data width is configured as x36:
• Output data pins are borrowed from Port B, with Port A forming the MSB and Port B forming the LSB.
• Input data pins are borrowed from Port A, with Port A forming the MSB and Port B forming the LSB.

The read operation requires one clock cycle in Non-pipelined mode. In Pipelined mode, the output data appears in the next cycle. The write operation requires one clock cycle.

There is no collision detection or prevention circuit built into LSRAM. Simultaneous read operations from Port A and write operations from Port B for the same address location must be avoided.
This situation results in correct values being written into the memory, but garbage values will be read out from the memory. Table 2-12 shows the data width configurations supported by LSRAM configured in Two-port mode.

Table 2-12 • Data Width Configurations for LSRAM in Two-Port Mode

Read Port - Port A (x number of bits) | Write Port - Port B (x number of bits) | Data Input | Data Output | Address
x9 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[8:0] | A_ADDR[10:0], B_ADDR[10:2]
x18 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[17:0] | A_ADDR[10:1], B_ADDR[10:2]
x36 | x9 | A_DIN[8:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:0]
x36 | x18 | B_DIN[17:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:1]
x36 | x36 | A_DIN[17:0], B_DIN[17:0] | A_DOUT[17:0], B_DOUT[17:0] | A_ADDR[10:2], B_ADDR[10:2]

Note: In Two-port mode, if the write data width is x36 and read data width is x36, both the bits of A_WEN and B_WEN have to be tied to logic 1 and must not be dynamically changed.

Operating Modes

Read Operation

LSRAM supports two types of read operations for both Dual-port and Two-port RAM configurations.
• Pipelined read
• Non-pipelined read (Flow-through read)

Table 2-13 shows the settings of read enable, block select, and width for the simple read on Port A. The same settings apply for Port B.

Table 2-13 • Read Enable Settings

A_BLK[2:0] | A_REN | A_WIDTH[1:0] | A_WEN[1:0] | Result
111 | 1 | 00 | 00 | Read the data in x12 or x9 mode. A_DOUT[11:0] is used for ×12 mode and A_DOUT[8:0] is used for ×9 mode data output.
111 | 1 | 01 | 00 | Read the data in x18 mode. A_DOUT[17:0] is used for data output.

Pipelined Read

In a pipelined read operation, the output data is registered at the pipeline registers, and the data is displayed on the corresponding output in the next clock cycle. In Pipeline mode, the pipeline clock input and the LSRAM clock input must be synchronized and fed from a single clock source.
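The difference in read latency between the two read types can be illustrated with a small cycle-level model. This is a rough sketch under simplifying assumptions: the class name and its fields are hypothetical, ECC pipelining and block select are not modeled, and a Low read enable is assumed to simply hold the last array output.

```python
class ReadPortModel:
    """Cycle-level sketch of one read port; not a vendor simulation model."""

    def __init__(self, depth: int = 1024, pipelined: bool = True):
        self.mem = [0] * depth
        self.pipelined = pipelined
        self.array_out = 0  # data read from the array on the latest clock edge
        self.dout = 0       # value visible on the data output bus

    def clock(self, addr: int, ren: int = 1) -> int:
        """Apply one rising clock edge and return the data output after it."""
        previous = self.array_out
        if ren:
            self.array_out = self.mem[addr]
        # Pipelined: the output register captures the previous array output,
        # so read data appears one cycle later. Non-pipelined: flow-through.
        self.dout = previous if self.pipelined else self.array_out
        return self.dout
```

With `pipelined=False`, the value stored at the addressed location is returned by the same `clock()` call that presents the address. With `pipelined=True`, it is returned by the following call.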
Non-pipelined Read

Flow-through mode indicates a non-pipelined read operation where the pipeline registers are bypassed and the data is displayed on the corresponding output in the same clock cycle. During a flow-through read operation, the LSRAM block can generate glitches on the data output buses. Microsemi® recommends using LSRAM with pipeline registers to avoid these read glitches.

Timing Diagram: Flow-Through Read and Pipeline Read

• The addresses (A_ADDR, B_ADDR), BLK enables (A_BLK, B_BLK), and read enables (A_WEN, B_WEN = 0) must be set up before the rising edge of the clock (A_CLK, B_CLK).
• For non-pipelined read operations, data comes on the output bus (A_DOUT, B_DOUT) after a delay of tCLK2Q (read access time without pipeline register) in the same cycle.
• For pipelined read operations, the data is displayed on the output in the next clock cycle.

Figure 2-4 shows the timing diagram for a read operation performed on LSRAM.

Figure 2-4 • Read Operation Timing Waveforms

Table 2-14 shows the read operation timing parameters.
Table 2-14 • Read Operation Timing Parameters

Parameters | Description
tCY | Clock period
tCH | Clock minimum pulse width High
tCL | Clock minimum pulse width Low
tADDRSU | Address setup time
tADDRHD | Address hold time
tBLKSU | Block select setup time (with pipeline register enabled)
tBLKHD | Block select hold time (with pipeline register enabled)
tRDESU | Read enable setup time (A_WEN, B_WEN = 0)
tRDEHD | Read enable hold time (A_WEN, B_WEN = 0)
tCDOUT | Flow-through (non-pipelined mode) read access time
tFDOUT | Pipelined read access time
tCEDOUT | Non-pipelined read access time with non-pipelined ECC
tFEDOUT | Non-pipelined read access time with pipelined ECC

Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Write Operation

LSRAM supports three types of write operations.
• Simple Write for both Dual-port and Two-port memory configurations
• Feed-Through Write (write-bypass write) for Dual-port memory only
• Read before Write for Dual-port memory only

Simple Write

The Simple write mode supports both Dual-port and Two-port memory configurations. It is selected by A_WMODE/B_WMODE equal to 00. In Simple write mode, the data out changes only on a read cycle. As the new data is delayed by one clock cycle, the data out in Simple write mode cannot be read out until the third cycle after the initial write clock cycle and will be delayed an additional clock cycle for the ECC pipeline on the output side. Table 2-15 shows the settings of write enable, read enable, block select, and width for the simple write on Port A. The same settings apply for Port B.

Table 2-15 • Read and Write Enable Settings

A_BLK[2:0] | A_REN | A_WIDTH[1:0] | A_WEN[1:0] | Result
111 | x | 00 | 11 | Write the data in x12 or x9 mode. A_DIN[11:0] is used for x12 mode and A_DIN[8:0] is used for x9 mode input data.
111 | x | 01 | 01 | Write the data in x18 mode. A_DIN[8:0] is used for input data. Invalid for x12/x9 mode.
111 | x | 01 | 10 | Write the data in x18 mode. A_DIN[17:9] is used for input data. Invalid for x12/x9 mode.
111 | x | 01 | 11 | Write the data in x18 mode. A_DIN[17:0] is used for input data.

Feed-Through Write (Write-Bypass Mode)

The Feed-Through write mode is selected when A_WMODE or B_WMODE is equal to "01" and the write enable is High. The Feed-Through write option is not supported when the LSRAM is configured in Two-port mode. In a Feed-Through write operation, the data written into the memory array is displayed immediately on the corresponding data output for Non-pipeline operation. For Pipeline operation, the data output is displayed in the next clock cycle. For a Feed-Through write operation in ECC pipeline mode, the pipeline enable holds the data, which is clocked through the ECC pipe one cycle after the data is written through the memory and available at the ECC.

Read Before Write Mode

The Read before write mode is selected when A_WMODE or B_WMODE is equal to "10" and the write enable is High. The Read before write option is not supported when the LSRAM is configured in Two-port mode. In a Read before write operation, the data output is updated with the content of the write address before the write. During Read before write and Feed-Through write modes, the data out from the address written is available after the third clock cycle: one cycle to register the address, one cycle to write/read the data, and one cycle for the ECC data out pipeline. An additional clock cycle is required if the data out pipeline is also selected. ECC flags are valid only in the same clock cycle as the data out.

Timing Diagram: Simple Write, Feed-Through Write, and Read Before Write

• The addresses (A_ADDR, B_ADDR), BLK enables (A_BLK, B_BLK), and write enables (A_WEN, B_WEN = 1) must be set up before the rising edge of the clock (A_CLK, B_CLK).
• For a Feed-Through write, the written data is displayed on the output (A_DOUT, B_DOUT) after a delay of tCEDOUT in the same clock cycle.
• For a simple write, the written data is displayed on the output only when a read operation is performed on the same address. • If ECC is in Pipeline mode, the actual write to memory is delayed by one clock cycle. In simple write mode, the data out changes only on a read cycle. As the new data is delayed by one clock cycle, the data out in Simple-write mode cannot be read out until the third cycle after the initial write clock cycle and will be delayed an additional clock cycle for the ECC pipeline on the output side. • In RBW and WFT modes, the data out from the address written is also not available until the third clock cycle: one cycle to register the address, one cycle to write/read the data and one cycle for the ECC data out pipeline. The pipeline enable will only hold the data at the output pipeline and not the input data pipeline and will be effective for the clock when the data would be expected to be clocked through the pipeline. • An additional clock cycle is required if the data out pipeline is also selected. ECC flags will only be valid the same clock cycle as the data out. • ECC flags are reset to zero, but are valid only on the same cycle as the corresponding data out. If Pipeline modes are enabled, the ECC flags will be unknown values on subsequent invalid clock cycles until a valid data out clock cycle. • The pipeline enables only hold the data at the output pipelines, including the ECC data out pipeline, but not the input data pipeline. It is effective for the clock when the data is expected to be clocked through the pipeline. For a Feed-Through Write operation in ECC pipeline mode, the pipeline enable will not be captured during the write cycle and will only hold the data when it is expected to be clocked through the ECC pipe, one cycle after the data is written through the memory and available at the ECC. Revision 2 31 Large SRAM (LSRAM) Figure 2-5 shows the timing diagram for a write operation performed on the LSRAM block. 
W&< W&+ W&/ $B&/.%B&/. W$''568 W$''5+' W%/.68 W%/.+' W:(68 W:(+' W'68 W '+' $B$''5>@ %B$''5>@ $B%/.>@ %B%/.>@ $B:(1>@ %B:(1>@ $B',1>@ %B',1>@ ƚKhd͕ƚKhd 9DOLGGDWD $B'287>@ )HHG7KURXJK:ULWH255HDGEHIRUH:ULWH %B'287>@ 3LSHOLQH%\SDVVZLWKRXW(&& ƚ&Khd 9DOLGGDWD $B'287>@ )HHG7KURXJK:ULWH255HDGEHIRUH:ULWH %B'287>@ 3LSHOLQHGZLWKRXW(&& ƚ&Khd $B'287>@ %B'287>@ )HHG7KURXJK:ULWH255HDGEHIRUH:ULWH 3LSHOLQHGDQG:LWK(&&3LSHOLQH%\SDVV $B'287>@ %B'287>@ )HHG7KURXJK:ULWH255HDGEHIRUH:ULWH 3LSHOLQH%\SDVVDQG:LWK(&&3LSHOLQHG $B'287>@ %B'287>@ )HHG7KURXJK:ULWH255HDGEHIRUH:ULWH 3LSHOLQHGDQG:LWK(&&3LSHOLQHG 9DOLGGDWD ƚ&Khd 9DOLGGDWD ƚ&Khd Figure 2-5 • Write Operation Timing Waveforms Table 2-16 shows the write operation timing parameters. Table 2-16 • Write Operation Timing Parameters Parameters Description tCY Clock period tCH Clock minimum pulse width High tCL Clock minimum pulse width Low tADDRSU Address setup time tADDRHD Address hold time tBLKSU Block select setup time (With pipeline register enabled) tBLKHD Block select hold time (With pipeline register enabled) tWESU Write enable setup time (A_WEN, B_WEN =1) tWEHD Write enable hold time (A_WEN, B_WEN =1) tDSU Data setup time tDHD Data setup time tCEDOUT Read access time with non-pipelined Feed-Through write timing, ECC bypass Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values. 32 R e visio n 2 9DOLGGDWD UG0574: RTG4 FPGA Fabric User Guide Table 2-16 • Write Operation Timing Parameters (continued) Parameters Description tCDOUT Read access time with pipelined Feed-Through write timing tFEDOUT Read access time with pipeline bypass and ECC pipeline tFDOUT Read access time with pipeline and ECC pipeline Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values. ECC The LSRAM block has error detection and correction logic circuitry (1-bit error correction, 2-bit error detection) and it is available for the x18 and x36 modes; but not for the x9 and x12 modes. 
Setting the ECC enable (ECC_EN) High turns ON the ECC circuitry and the ECC pipeline stages. Table 2-17 shows the ECC availability for different modes.

Table 2-17 • ECC Available Modes
Port Widths A/B  Write Bypass    Read Before Write  Write Enables (byte write)  Output Pipeline  Pipeline Bypass
x9/x9            No ECC          No ECC             No ECC                      No ECC           No ECC
x9/x18           No ECC          No ECC             No ECC                      No ECC           No ECC
x9/x36           N/A             N/A                No ECC                      No ECC           No ECC
x12/x12          No ECC          No ECC             No ECC                      No ECC           No ECC
x18/x9           No ECC          No ECC             No ECC                      No ECC           No ECC
x18/x18          ECC Available   ECC Available      No ECC                      ECC Available    ECC Available
x18/x36          N/A             N/A                No ECC                      ECC Available    ECC Available
x36/x9           N/A             N/A                No ECC                      No ECC           No ECC
x36/x18          N/A             N/A                No ECC                      ECC Available    ECC Available
x36/x36          N/A             N/A                No ECC                      ECC Available    ECC Available

The ECC encoder provides 24 bits of data for x18 mode or 48 bits of data for x36 mode. The ECC decoder reads the same number of bits (24 or 48) from the array and provides the expected number of corrected bits (18 or 36) on the outputs.
If the ECC has detected an error (A_DB_DETECT, B_DB_DETECT), you need to correct the data in the LSRAM block. Writing the correct data back is called 'Scrubbing'. Scrubbing is not available inside the LSRAM; all scrubbing must be done in the fabric design.
Both the ECC encoder and the ECC decoder contain their own pipeline registers, which add a clock cycle of latency to each of the read and write operations. These pipeline registers may be bypassed for slower operation.
The ECC logic generates two flags per port: an error correction flag (A_SB_CORRECT, B_SB_CORRECT) that is set High when a single bit in a word has been corrected, and an error detection flag (A_DB_DETECT, B_DB_DETECT) that is set High when two or more bit errors in a word have been detected but not corrected. These flags are set to match the output data of the port where the error was detected, even in x36 width.
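The stored-bit figures quoted above (24 bits for x18, 48 bits for x36) imply a fixed number of check bits per word. The following is a minimal sketch of that arithmetic only. The function names are illustrative, and the guide does not state which code the hardware uses, so the Hamming-bound helper is a generic cross-check rather than a description of the LSRAM encoder.

```python
# Stored-bit arithmetic for LSRAM ECC, using only the widths quoted in the text.
ECC_CODEWORD_BITS = {18: 24, 36: 48}  # data width -> bits stored in the array


def check_bits(data_width: int) -> int:
    """Extra bits stored per word when ECC is enabled."""
    return ECC_CODEWORD_BITS[data_width] - data_width


def min_secded_check_bits(data_bits: int) -> int:
    """Smallest check-bit count for a generic SECDED code.

    Hamming bound (2**r >= data_bits + r + 1) plus one overall parity bit.
    This is textbook coding theory, not a statement about the RTG4 encoder.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


for width in (18, 36):
    extra = check_bits(width)
    print(f"x{width}: {extra} check bits, {extra / width:.1%} storage overhead")
```

For x18, the 6 extra bits match the minimum a generic single-error-correct, double-error-detect code needs for 18 data bits.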
On a single-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b0
On a double-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b1

Reset Operation
The global reset signal (ARST_N) is an asynchronous Active Low signal. For normal operation of the LSRAM, this reset signal must be kept High. To reset the LSRAM block, the reset signal must be set Low. When the reset signal is asserted (ARST_N forced Low), the LSRAM block behaves as follows during read and write operations:
1. Read operation: If the reset signal is asserted while a read operation is in progress, the data output port is forced Low after a certain delay. If the reset signal is asserted and then deasserted within the same High or Low clock phase, the data output stays Low until the next cycle. The data output changes its state only if a read operation or a write operation in Bypass mode is performed on the LSRAM block. In a simple write operation, the data output stays Low.
2. Write operation: If the reset signal is asserted during a write operation, corrupted data is written into the memory. Microsemi recommends not asserting the reset signal during a write operation.
All data stored in the array is lost during a global reset. The content of the array must be considered unknown until a valid write operation.

Timing Diagram: Asynchronous Reset Operation
Figure 2-6 shows the timing diagram of an asynchronous reset operation.
[Figure 2-6 waveform: A_CLK/B_CLK with tCY, tCH, and tCL; ARST_N; A_DOUT/B_DOUT with tR2Q.]
Figure 2-6 • Asynchronous Reset Operation

Table 2-18 shows the asynchronous reset timing parameters.

Table 2-18 • Asynchronous Reset Timing Parameters
Parameter    Description
tCY          Clock period
tCH          Clock minimum pulse width High
tCL          Clock minimum pulse width Low
tR2Q         Asynchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Block Select Operation
The block select in LSRAM works like a chip select. When the block select (A_BLK and B_BLK) is High, the LSRAM block is active and read and write operations can be performed. If the block select is Low, the LSRAM does not perform any read or write operations and drives logic 0 on the data output pins until the next read cycle or write operation in Bypass mode. Refer to "A_BLK[2:0] and B_BLK[2:0]" section on page 23.
When the pipeline registers are used, the effect of block select at the output is delayed by one pipeline clock cycle (the pipeline registers are independent of block select).
In Two-port mode, A_BLK[2:0] controls the entire read port and B_BLK[2:0] controls the entire write port; this matters when the port width is x36. In Two-port mode, the block select of Port A can be independent of the block select of Port B.
Figure 2-7 shows the timing diagram for block select inputs for LSRAM.
[Figure 2-7 waveform: A_CLK/B_CLK with tCY; A_BLK/B_BLK with tBLKMPW, tBLKSU, and tBLKHD; A_DOUT/B_DOUT in Non-Pipeline mode going Low after tBLK2Q; A_DOUT/B_DOUT with Pipeline access going Low after tCLK2Q.]
Figure 2-7 • Block Select Timings

Table 2-19 shows the block select timing parameters.
Table 2-19 • Block Selection Timing Parameters
Parameter    Description
tCY          Clock period
tCH          Clock minimum pulse width High
tCL          Clock minimum pulse width Low
tBLKSU       Block select setup time (with pipeline register enabled)
tBLKHD       Block select hold time (with pipeline register enabled)
tBLKMPW      Block select minimum pulse width
tBLK2Q       Block select to out disable time (when pipeline registers are disabled)
tCLK2Q       Read access time without pipeline register
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Figure 2-6 on page 34 shows the timing diagram for asynchronous reset operation performed on LSRAM.

Read Enable
Each port has its own read enable pin. It can be used to conserve power while retaining the previously read data out. When the read enable is set Low, the data outputs retain their previous state and no dynamic read power is consumed on that port. When the read enable is set High, normal read operation resumes. Table 2-20 summarizes the read enable functionality for Port A and Port B.

Table 2-20 • Read Enable Functionality for Port A and Port B
Function          A_WEN/B_WEN  A_REN/B_REN  A_BLK/B_BLK  A_DOUT/B_DOUT  Power        Comment
Deselect LSRAM    x            x            Any 0        All zero       Low          No read or write operations. Refer to "A_BLK[2:0] and B_BLK[2:0]" section on page 23.
Write to Port A   11 or 10     x            111          Previous data  Write power  Simple write mode. A_WEN/B_WEN = 11 is the only valid active write setting for x12/x9.
Read Port A       00           1            111          New data       Read power   Read operation
Standby mode      00           0            111          Previous data  Low          No read or write operations

Collision
Collisions arise between the two ports of the LSRAM block when a read operation is requested from one port and a write operation is requested from the other port simultaneously on the same address location, or when both ports write to the same location at the same time.
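Because the collision conditions above depend only on the operation and address presented on each port in a given cycle, a fabric-side or testbench check can be sketched in a few lines. This is a minimal illustration, not part of the LSRAM itself. The `Op` type and `classify_access` function are hypothetical names, and the sketch assumes both ports use the same width and clock, so that equal addresses mean the same location in the same cycle.

```python
from enum import Enum


class Op(Enum):
    IDLE = "idle"
    READ = "read"
    WRITE = "write"


def classify_access(op_a: Op, addr_a: int, op_b: Op, addr_b: int) -> str:
    """Classify one cycle of dual-port activity.

    Assumption: both ports share one width and clock, so equal addresses
    refer to the same location at the same time.
    """
    if addr_a != addr_b or Op.IDLE in (op_a, op_b):
        return "ok"
    if op_a is Op.WRITE and op_b is Op.WRITE:
        return "write-write collision"
    if Op.WRITE in (op_a, op_b):
        return "read-write collision"
    return "ok"  # simultaneous reads of one location


print(classify_access(Op.READ, 0x10, Op.WRITE, 0x10))
print(classify_access(Op.READ, 0x10, Op.READ, 0x10))
```

A check of this kind belongs in the user design or its simulation environment, since the arbitration has to happen before the accesses reach the memory ports.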
Table 2-21 shows the behavior of the LSRAM block during the various cases of collisions.

Table 2-21 • Collision Scenarios
Operation: Simultaneous read from Port A and Port B at the same location
Description: Operation is allowed without any restrictions and data is available on the output ports after the specified time, as described in the read timing diagrams in Figure 2-4 on page 29.

Operation: Simultaneous read from Port A and write from Port B at the same location
Description: Not allowed. The new data may be written into the address location but the read data out will be a garbage value.

Operation: Simultaneous read from Port B and write from Port A at the same location
Description: Not allowed. The new data may be written into the address location but the read data out will be a garbage value.

Operation: Simultaneous write from Port A and Port B at the same location
Description: Not allowed. If the data to be written is the same on both ports, the data is successfully written. If the data is different, the LSRAM cell has an undetermined state.

Note: There are no collision prevention or detection techniques available in LSRAM. The last three operations mentioned in Table 2-21 are not allowed on LSRAM and must be avoided.

3 – Micro SRAM (uSRAM)

Introduction
The RTG4 FPGA fabric has embedded 1.5 Kbit uSRAM blocks used for storing data. These uSRAMs are arranged in multiple rows within the FPGA fabric and can be accessed through the fabric routing architecture. The number of uSRAM blocks varies between RTG4 devices. Table 3-1 on page 39 shows the number of uSRAM blocks in each RTG4 device.

Features
RTG4 uSRAM blocks have the following features:
• Each uSRAM block stores up to 1.5 Kbits (1,536 bits) of data and can be configured in any of the following depth × width combinations: 64 × 18, 128 × 12, and 128 × 9. Only the x12 port width accesses the entire address space of 1,536 bits. The x9 and x18 address space is limited to 1,152 bits.
• Each uSRAM block has two read data ports (Port A and Port B) and one write data port (Port C).
• Each uSRAM block has built-in EDAC with 1-bit error correction and 2-bit error detection for the x18 mode, but not for the x9 and x12 modes. EDAC is referred to as ECC in the description, ports, and timing diagrams.
• The registers in the uSRAM block are similar to the STMR flip-flops in the fabric and have an option to mitigate single-event transients.
• Read operations can be performed in both Synchronous and Asynchronous modes. The write operation is always performed in Synchronous mode.
• The two read ports have address/block select registers for enabling Synchronous mode operation.
• In Pipelined mode, the two read ports have output registers with independent clocks. These output pipeline registers can also be configured as transparent for Asynchronous mode operation.
• Due to the availability of separate input address and output pipeline registers, read operations through Port A and Port B in uSRAM can be performed in four different modes:
– Synchronous read mode without pipeline registers (Synchronous-Asynchronous mode)
– Synchronous read mode with pipeline registers (Synchronous-Synchronous mode)
– Asynchronous read mode without pipeline registers (Asynchronous-Asynchronous mode)
– Asynchronous read mode with pipeline registers (Asynchronous-Synchronous mode)
• Separate synchronous resets are provided for the input address select registers. These resets can be used to initialize the read ports.
• The output pipeline registers have separate synchronous resets, which provide independent control of these registers.
• uSRAM can operate at up to 300 MHz with SET mitigation disabled and up to 250 MHz with SET mitigation enabled.
• The two read ports are independent of each other, and simultaneous read operations can be performed from both ports at the same address location.
• Simultaneous read and write operations at the same location are not allowed.
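The capacity figures in the first feature (1,536 bits for x12, 1,152 bits for x9 and x18) follow directly from depth × width. A small sketch of that arithmetic, with illustrative names:

```python
USRAM_BITS = 1536  # 1.5 Kbits per uSRAM block

# Depth x width combinations listed in the feature list.
MODES = [(64, 18), (128, 12), (128, 9)]


def addressable_bits(depth: int, width: int) -> int:
    """Bits reachable through a port configured as depth x width."""
    return depth * width


for depth, width in MODES:
    bits = addressable_bits(depth, width)
    print(f"{depth} x {width}: {bits} bits ({bits / USRAM_BITS:.0%} of the block)")
```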
uSRAM Resource Table
Table 3-1 shows the uSRAM blocks available for the RTG4 devices.

Table 3-1 • RTG4 uSRAM (1.5 Kb Blocks) Resource Table
Blocks                   RT4G075  RT4G150
uSRAM 1.5 Kbit Blocks    112      210
Note: All numbers given above are per device.

Functional Description
This section provides a detailed description of the following:
• Architecture Overview
• Port List
• Port Description

Architecture Overview
The RTG4 uSRAM embedded memory includes the RAM64×24 macro available in the Libero SoC software. Figure 3-1 shows a simplified block diagram of the uSRAM memory block with two read data ports, one write data port, and pipeline registers at the read ports. Table 3-2 on page 40 shows the port descriptions.
[Figure 3-1 block diagram: Port A and Port B read decode blocks (A_ADDR/B_ADDR, A_BLK/B_BLK, A_ADDR_EN/B_ADDR_EN, A_ADDR_BYPASS/B_ADDR_BYPASS, A_CLK/B_CLK), Port C write control with ECC logic (C_DIN, C_ADDR, C_BLK, C_WEN, C_CLK, ECC_EN), the memory array, and per-port ECC logic and pipeline registers driving A_DOUT/B_DOUT, A_SB_CORRECT/B_SB_CORRECT, and A_DB_DETECT/B_DB_DETECT (A_DOUT_EN/B_DOUT_EN, ECC_EN).]
Figure 3-1 • Simplified Functional Block Diagram of uSRAM

Port List
Table 3-2 • Port List for uSRAM
Port Name         Direction  Type*    Description                                                              Polarity
Port A
A_ADDR[6:0]       Input      Dynamic  Port A read address input                                                –
A_BLK[1:0]        Input      Dynamic  Port A block select                                                      Active High
A_WIDTH           Input      Static   Port A depth × width mode selection                                      –
A_DOUT[17:0]      Output     Dynamic  Port A read data                                                         –
A_CLK             Input      Dynamic  Port A clock input                                                       Rising
A_DOUT_EN         Input      Dynamic  Port A read-data pipeline register enable                                Active High
A_DOUT_SRST_N     Input      Dynamic  Port A read-data pipeline register synchronous reset                     Active Low
A_DOUT_BYPASS     Input      Static   Port A read-data pipeline register select                                Active High
A_ADDR_BYPASS     Input      Static   Port A read-address register select                                      Active High
A_ADDR_EN         Input      Dynamic  Port A read-address register enable                                      Active High
A_ADDR_SRST_N     Input      Dynamic  Port A read-address register synchronous reset                           Active Low
A_SB_CORRECT      Output     Dynamic  Port A 1-bit error correction flag                                       Active High
A_DB_DETECT       Output     Dynamic  Port A 2-bit error detection flag                                        Active High
Port B
B_ADDR[6:0]       Input      Dynamic  Port B read address input                                                –
B_BLK[1:0]        Input      Dynamic  Port B block select                                                      Active High
B_WIDTH           Input      Static   Port B depth × width mode selection                                      –
B_DOUT[17:0]      Output     Dynamic  Port B read data                                                         –
B_CLK             Input      Dynamic  Port B clock input                                                       Rising
B_DOUT_EN         Input      Dynamic  Port B read-data pipeline register enable                                Active High
B_DOUT_SRST_N     Input      Dynamic  Port B read-data pipeline register synchronous reset                     Active Low
B_DOUT_BYPASS     Input      Static   Port B read-data pipeline register select                                Active High
B_ADDR_BYPASS     Input      Static   Port B read-address register select                                      Active High
B_ADDR_EN         Input      Dynamic  Port B read-address register enable                                      Active High
B_ADDR_SRST_N     Input      Dynamic  Port B read-address register synchronous reset                           Active Low
B_SB_CORRECT      Output     Dynamic  Port B 1-bit error correction flag                                       Active High
B_DB_DETECT       Output     Dynamic  Port B 2-bit error detection flag                                        Active High
Port C
C_ADDR[6:0]       Input      Dynamic  Port C write address input                                               –
C_BLK[1:0]        Input      Dynamic  Port C block select                                                      Active High
C_WIDTH           Input      Static   Port C depth × width mode selection                                      –
C_DIN[17:0]       Input      Dynamic  Port C data input                                                        –
C_CLK             Input      Dynamic  Port C clock input                                                       Rising
C_WEN             Input      Dynamic  Port C write enable                                                      Active High
Common Signals
ECC               Input      Static   ECC enable                                                               Active High
ECC_DOUT_BYPASS   Input      Static   ECC pipeline register select                                             Active High
ARST_N            Input      Global   Read-address and read-data pipeline registers asynchronous reset         Active Low
DELEN             Input      Static   Enable SET mitigation                                                    Active High
BUSY              Output     Dynamic  Busy signal from SII                                                     Active High
SECURITY          Input      Static   Lock access to SII                                                       Active High
Note: *Static inputs are defined at design time and need to be tied to 0 or 1.

Port Description

A_WIDTH, B_WIDTH, and C_WIDTH
These signals are the depth × width mode selections for each port. Table 3-3 shows the depth × width based on the port width selection.

Table 3-3 • Width/Depth Mode Selection
A_WIDTH / B_WIDTH / C_WIDTH  Depth × Width
0                            128 × 12, 128 × 9
1                            64 × 18

A_ADDR[6:0], B_ADDR[6:0], and C_ADDR[6:0]
These signals are the address buses for the three ports (two read and one write). In ×12 mode, all 7 bits are used to address the 128 independent locations (1,536 bits). In the wider ×18 mode, fewer address bits are used. The used address bits are the most significant bits (MSBs). The unused bits are the least significant bits (LSBs) and they must be grounded. Table 3-4 shows the address bus used and unused bits for depth × width selections.
Table 3-4 • Address Bus Used and Unused Bits
Depth × Width  A_ADDR/B_ADDR/C_ADDR Used Bits  Unused Bits (to be grounded)
128 × 9        [6:0]                           None
128 × 12       [6:0]                           None
64 × 18        [6:1]                           [0]

C_DIN[17:0]
This signal is the data input bus for the write port (Port C). The used bits for any mode are LSB justified in the data bus and the unused MSBs must be grounded. Table 3-5 shows the data input bus used and unused bits for depth × width selections.

Table 3-5 • Data Input Buses Used and Unused Bits
Depth × Width  C_DIN Used Bits  Unused Bits (to be grounded)
64 × 18        [17:0]           None
128 × 12       [11:0]           [17:12]
128 × 9        [8:0]            [17:9]

A_DOUT[17:0] and B_DOUT[17:0]
These signals are the data output buses for the two read ports (Port A and Port B). The used bits for any mode are LSB justified in the data bus; the remaining MSBs are unused. Table 3-6 shows the data output bus used and unused bits for different depth × width selections.

Table 3-6 • Data Output Buses Used and Unused Bits
Depth × Width  A_DOUT/B_DOUT Used Bits  Unused Bits
64 × 18        [17:0]                   None
128 × 12       [11:0]                   [17:12]
128 × 9        [8:0]                    [17:9]

A_BLK[1:0], B_BLK[1:0], and C_BLK[1:0]
These signals are the port select control signals for each port. Table 3-7 shows the operations (read, write, and no operation) based on the port select control signals.

Table 3-7 • Port Select Control Signals
Port Select Signal  Value       Operation
A_BLK[1:0]          11          Perform read operation on Port A.
                    00, 01, 10  Port A is not selected and its read data is logic 0.
B_BLK[1:0]          11          Perform read operation on Port B.
                    00, 01, 10  Port B is not selected and its read data is logic 0.
C_BLK[1:0]          11          Perform write operation on Port C.
                    00, 01, 10  Port C is not selected.

A_CLK, B_CLK, C_CLK
These signals are the clocks for Port A, Port B, and Port C. Ensure that all inputs are set up before the first rising clock edge. The read/write operation starts at the rising edge of the corresponding clock signal.
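The bit assignments in Table 3-4 and Table 3-5 and the port select decode in Table 3-7 can be captured as a small behavioral sketch, for example to generate testbench stimulus. The helper names (`addr_bus`, `din_bus`, `port_selected`) are hypothetical and do not correspond to any Libero SoC API. The sketch assumes a word index is placed on the used address bits with the unused LSB driven to 0, as the tables describe.

```python
# width -> (depth, number of unused address LSBs), per Table 3-4
MODES = {9: (128, 0), 12: (128, 0), 18: (64, 1)}


def addr_bus(word_index: int, width: int) -> int:
    """7-bit value to drive on A_ADDR/B_ADDR/C_ADDR for a given word index."""
    depth, unused_lsbs = MODES[width]
    if not 0 <= word_index < depth:
        raise ValueError(f"word index {word_index} out of range for x{width}")
    return word_index << unused_lsbs  # unused LSBs are grounded (0)


def din_bus(value: int, width: int) -> int:
    """18-bit value to drive on C_DIN: data is LSB justified, unused MSBs are 0."""
    if width not in MODES:
        raise ValueError(f"unsupported width x{width}")
    if not 0 <= value < (1 << width):
        raise ValueError(f"value does not fit in {width} bits")
    return value


def port_selected(blk: int) -> bool:
    """A port is active only when both block select bits are 1 (Table 3-7)."""
    return (blk & 0b11) == 0b11


print(f"{addr_bus(63, 18):07b}")  # highest x18 word uses bits [6:1]
print(port_selected(0b10))
```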
C_WEN
This signal is the write enable for Port C. If the C_BLK and C_WEN signals are 1, the write occurs on Port C.

A_ADDR_SRST_N and B_ADDR_SRST_N
These signals are Active Low, synchronous reset inputs for the input address/block select registers of Port A and Port B. Asserting these reset signals forces the address input registers and block select registers to logic 0, which in turn forces the data output to logic 0. When the registers are configured as transparent, these inputs must be tied to logic 1.

A_DOUT_SRST_N and B_DOUT_SRST_N
These signals are Active Low, synchronous reset inputs for the output pipeline registers of Port A and Port B. Asserting these reset signals forces the data output to logic 0. In Non-pipelined mode, tie these inputs to logic 1.

A_ADDR_EN and B_ADDR_EN
These signals are Active High enable inputs for the input address/block select registers of Port A and Port B. When logic 0 is applied to these inputs, the input registers hold the previous input address. When logic 1 is applied to these inputs, the input registers behave as normal D flip-flops. When the registers are configured as transparent, these inputs should be tied to logic 1.

A_DOUT_EN and B_DOUT_EN
These signals are Active High enable inputs for the output pipeline registers of Port A and Port B. When logic 0 is applied to these inputs, the pipeline registers hold the previously read data out. In Non-pipelined mode, tie these inputs to logic 1.

ARST_N
This signal is the global reset. It connects the read-address and read-data pipeline registers to the global asynchronous reset signal.

ECC
This signal is Active High and enables the ECC logic for Port A, Port B, and Port C.
• Logic 1: ECC logic enabled
• Logic 0: ECC logic disabled

ECC_DOUT_BYPASS
The ECC pipeline registers have a Bypass mode for slow operations.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation

DELEN
This signal enables the SET mitigation.
When this signal is driven High, the delay for the SET filters is turned ON. uSRAM supports a maximum frequency of 250 MHz with SET mitigation enabled and 300 MHz with SET mitigation disabled.

A_DOUT_BYPASS and B_DOUT_BYPASS
The output pipeline registers have a bypass mode for each port.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation

A_ADDR_BYPASS and B_ADDR_BYPASS
The input pipeline registers have a bypass mode for each port.
• Logic 0: Pipelined operation
• Logic 1: Non-pipelined operation

A_SB_CORRECT, B_SB_CORRECT
These are the error correction flags for Port A and Port B. When the flag goes High by itself, it indicates that a single-bit error was detected by that port and corrected in the data output. This flag also goes High for a double-bit error.

A_DB_DETECT, B_DB_DETECT
These are the error detection flags for Port A and Port B. When the flag goes High, it indicates that multiple bit errors were detected by that port, but not corrected.

BUSY
This output indicates that the uSRAM is being accessed by the SII.

SECURITY
This control signal, when 1, locks the entire uSRAM memory from being accessed by the SII.

Operating Modes
This section describes the following operating modes:
• Read Operation
• Write Operation
• ECC
• Reset Operation
• Collision

Read Operation
uSRAM blocks are read through two ports: Port A and Port B. There are four modes for read operations:
• Synchronous read mode without pipeline registers (Synchronous-Asynchronous mode)
• Synchronous read mode with pipeline registers (Synchronous-Synchronous mode)
• Asynchronous read mode without pipeline registers (Asynchronous-Asynchronous mode)
• Asynchronous read mode with pipeline registers (Asynchronous-Synchronous mode)

Synchronous Read Mode
Synchronous read mode requires that the input registers for the address and block select inputs are configured in STMR flip-flop mode (A_IN_BYPASS or B_IN_BYPASS = 0).
Similarly, on the output side, the pipeline registers can be configured as registered or asynchronous. When the pipeline registers are enabled, the clock inputs of both the input and output registers must be synchronous to each other and fed from a single clock source. Microsemi recommends configuring the registers as pipeline registers during read operation to avoid glitches on the read output data lines.
In Synchronous read mode, the address (A_ADDR or B_ADDR) and block select (A_BLK or B_BLK) inputs must satisfy the setup and hold timing with respect to the input clocks (A_CLK or B_CLK).

Synchronous Read Mode without Pipeline Registers (Synchronous-Asynchronous Read Mode)
• The input registers are configured in Synchronous read mode.
• The output pipeline registers are configured as transparent.
• This mode is achieved by configuring the following settings:
– A_DOUT_BYPASS = 1 or B_DOUT_BYPASS = 1
– A_IN_BYPASS or B_IN_BYPASS = 0
– A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
– A_DOUT_EN or B_DOUT_EN = 1
– A_BLK = 1, B_BLK = 1
• The output data appears in the same clock cycle in which the address and block select inputs were registered.
• The uSRAM block can generate glitches on the output buses when used without the pipeline registers.

Figure 3-2 shows the timing waveforms for the synchronous-asynchronous read operation without pipeline registers, including the data output behavior when the block select inputs are deasserted (any bit forced to logic 0).
[Figure 3-2 waveform: A_CLK/B_CLK with tCY, tCH, and tCL; A_ADDR/B_ADDR with tADDRSU and tADDRHD; A_BLK/B_BLK with tBLKSU and tBLKHD; A_DOUT/B_DOUT without ECC registers showing tCLK2QH, tCLK2QR, and tBLK2Q; A_DOUT/B_DOUT with ECC registers showing tCLK2QE.]
Figure 3-2 • Synchronous-Asynchronous Read Operation Waveform without Pipeline Registers

Table 3-8 shows the timing parameters for Synchronous read mode without pipeline registers.

Table 3-8 • Timing Parameters for Synchronous-Asynchronous Read Operation
Parameter    Description
tCY          Read clock period
tCH          Read clock minimum pulse width High time
tCL          Read clock minimum pulse width Low time
tADDRSU      Read address setup time in Synchronous mode
tADDRHD      Read address hold time in Synchronous mode
tBLKSU       Read block select setup time (when pipeline registers enabled)
tBLKHD       Read block select hold time (when pipeline registers enabled)
tCLK2QH      Data output read hold time
tCLK2QR      Data output read access time
tBLK2Q       Block select to dout disable/enable time
tCLK2QE      Data output read access time with ECC registers
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Synchronous Read Mode with Pipeline Registers (Synchronous-Synchronous Read Mode)
• The input registers are configured in Synchronous read mode.
• The output pipeline registers are configured as edge-triggered registers (Pipelined mode).
• Pipelined mode is achieved by making the following settings:
– A_DOUT_BYPASS or B_DOUT_BYPASS = 0
– A_IN_BYPASS or B_IN_BYPASS = 0
– A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
– A_DOUT_EN or B_DOUT_EN = 1
– A_BLK = 1, B_BLK = 1
• The input register clock and pipeline register clock must be synchronous to each other; hence they must be sourced from the same clock input.
• The output data appears on the output bus in the next clock cycle.
Figure 3-3 shows the timing waveforms for the synchronous-synchronous read operation with pipeline registers.
[Figure 3-3 waveform: A_CLK/B_CLK with tCY, tCH, and tCL; A_ADDR/B_ADDR with tADDRSU and tADDRHD; A_BLK/B_BLK with tBLKSU and tBLKHD; A_DOUT/B_DOUT without ECC registers and with ECC registers, each showing tCLK2QP.]
Figure 3-3 • Synchronous-Synchronous Read Operation Waveform with Pipeline Registers

Table 3-9 shows the timing parameters for Synchronous read mode with pipeline registers.

Table 3-9 • Timing Parameters for Synchronous-Synchronous Read Operation
Parameter    Description
tCY          Read clock period
tCH          Read clock minimum pulse width High time
tCL          Read clock minimum pulse width Low time
tADDRSU      Read address setup time in Synchronous mode
tADDRHD      Read address hold time in Synchronous mode
tBLKSU       Read block select setup time (when pipeline registers enabled)
tBLKHD       Read block select hold time (when pipeline registers enabled)
tCLK2QP      Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Asynchronous Read Mode
Asynchronous read mode requires that the input registers for the address and block select inputs are configured in asynchronous mode by configuring the following settings:
• A_IN_BYPASS or B_IN_BYPASS = 1
• A_ADDR_SRST_N or B_ADDR_SRST_N = 1
• A_BLK = 1, B_BLK = 1

Asynchronous Read Mode Without Pipeline Registers (Asynchronous-Asynchronous Mode)
• The input registers are configured in Asynchronous read mode.
• The output pipeline registers are configured as transparent (non-pipelined operation).
• The pipeline registers can be made transparent by making the following settings: – A_DOUT_BYPASS or B_DOUT_BYPASS = 1 – A_IN_BYPASS or B_IN_BYPASS = 1 – A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1 – A_DOUT_EN or B_DOUT_EN = 1 • After the input address is provided, the output data is displayed on the output data bus after a tCLK2Q delay (Figure 3-4 on page 49). • The uSRAM block can generate glitches on the data output bus when used without the pipeline register. R e visio n 2 UG0574: RTG4 FPGA Fabric User Guide Figure 3-4 shows the timing diagram for Asynchronous-Asynchronous read mode for uSRAM. W$''5 $B$''5>@ %B$''5>@ $ $ $ W$''5+' W$''568 $B%/. %B%/. W%/.03: W%/.68 W%/.+' 2XWSXWLQWKHDV\QFKURQRXV±DV\QFKURQRXVPRGHZLWKRXW(&&UHJLVWHUV W$''54+ $B'287>@ %B'287>@ ' ' W%/.4 W%/.4 ' W$''545 $B&/. %B&/. W&+ W&/ W&< 2XWSXWLQWKHDV\QFKURQRXV±DV\QFKURQRXVPRGHZLWK(&&UHJLVWHUV W&/.4 $B'287>@ %B'287>@ ' W&/.4 Figure 3-4 • Read Operations with Asynchronous Inputs Without Pipeline Registers Waveform Table 3-10 shows the timing parameter values for the asynchronous read mode without pipeline registers. Table 3-10 • Timing Parameters of the Asynchronous Read Mode Without Pipeline Registers Parameter Description tADDR Address cycle time, 300 MHz (250 MHz with SET mitigation enable) tBLKMPW Block select cycle time tADDR2QH Data output read hold time tADDR2QR Data output read access time tBLK2Q Block select to dout disable/enable time tCY Pipe-line clock period is 300 MHz (250 MHz with SET mitigation enable) tCH Clock high time tCL Clock low time tADDRSU Address setup time tADDRHD Address hold time tBLKSU Block select setup time Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values. 
Table 3-10 • Timing Parameters of the Asynchronous Read Mode Without Pipeline Registers (continued)
Parameter   Description
tBLKHD      Block select hold time
tCLK2Q      Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

Asynchronous Read Mode with Pipeline Registers (Asynchronous-Synchronous Mode)
• The input registers are configured in Asynchronous read mode.
• The output pipeline registers are configured as registers (Pipelined mode).
• Pipelined mode is achieved by configuring the following settings:
– A_DOUT_BYPASS or B_DOUT_BYPASS = 0
– A_IN_BYPASS or B_IN_BYPASS = 1
– A_DOUT_SRST_N = 1 or B_DOUT_SRST_N = 1
– A_DOUT_EN or B_DOUT_EN = 1
– A_BLK = 1, B_BLK = 1
• After the input address is provided, the output data appears on the output data bus after the next rising edge of the pipeline register input clock.

Figure 3-5 shows the timing diagrams for Asynchronous-Synchronous read mode for uSRAM.

[Waveform: A_ADDR/B_ADDR, A_BLK/B_BLK, A_CLK/B_CLK, and A_DOUT/B_DOUT in asynchronous-synchronous mode, without and with ECC registers; annotated with tADDRSU, tADDRHD, tBLKSU, tBLKHD, tCY, tCH, tCL, and tCLK2Q]

Figure 3-5 • Read Operations with Asynchronous Inputs with Pipeline Registers Waveform

Table 3-11 shows the timing parameter values of the asynchronous read mode with pipeline registers.

Table 3-11 • Timing Parameters of the Asynchronous Read Mode with Pipeline Registers
Parameter   Description
tCY         Pipeline clock period, 300 MHz (250 MHz with SET mitigation enabled)
tCH         Clock high time
tCL         Clock low time
tADDRSU     Address setup time
tADDRHD     Address hold time
tBLKSU      Block select setup time
tBLKHD      Block select hold time
tCLK2Q      Pipeline read access time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
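The two asynchronous read configurations above differ only in the output bypass setting. The following Python sketch collects the Port A settings listed in this section and names the resulting mode. The helper read_mode is hypothetical, and the sketch assumes that A_IN_BYPASS = 0 selects registered (synchronous) inputs, which this section does not state.

```python
# Port A settings for asynchronous inputs, as listed in this section.
ASYNC_INPUT = {"A_IN_BYPASS": 1, "A_ADDR_SRST_N": 1, "A_BLK": 1}

# Output-side settings for the two asynchronous read modes.
ASYNC_ASYNC = {**ASYNC_INPUT, "A_DOUT_BYPASS": 1, "A_DOUT_SRST_N": 1, "A_DOUT_EN": 1}
ASYNC_SYNC = {**ASYNC_INPUT, "A_DOUT_BYPASS": 0, "A_DOUT_SRST_N": 1, "A_DOUT_EN": 1}


def read_mode(settings):
    """Hypothetical helper: name the read mode implied by the bypass pins."""
    # Assumption: A_IN_BYPASS = 0 would select registered (synchronous) inputs.
    inputs = "Asynchronous" if settings["A_IN_BYPASS"] == 1 else "Synchronous"
    # A_DOUT_BYPASS = 1 makes the output pipeline register transparent.
    outputs = "Asynchronous" if settings["A_DOUT_BYPASS"] == 1 else "Synchronous"
    return f"{inputs}-{outputs}"


print(read_mode(ASYNC_ASYNC))  # Asynchronous-Asynchronous
print(read_mode(ASYNC_SYNC))   # Asynchronous-Synchronous
```

The same pattern applies to Port B with the B_ prefix.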
Write Operation
• Port C is the only port through which a write operation can be performed on uSRAM.
• The write operation is purely synchronous, and all operations are synchronized to the rising edge of the Port C clock input (C_CLK).
• The write inputs, C_ADDR, C_BLK, C_WEN, and C_DIN, must satisfy the setup and hold timings with respect to the rising edge of the C_CLK input for a successful write operation.
• If all the inputs meet the required timing parameters, the input data is written into uSRAM in one clock cycle.

Figure 3-6 shows the timing waveforms for a Port C write operation.

[Waveform: C_CLK, C_ADDR, C_BLK, C_WEN, C_DIN, and the data written in SRAM without and with ECC registers; annotated with tCY, tCH, tCL, tADDRSU, tADDRHD, tBLKSU, tBLKHD, tWESU, tWEHD, tDINSU, and tDINHD]

Figure 3-6 • Timing Waveforms for the Write Operation

Table 3-12 shows the timing parameters of the write operation.

Table 3-12 • Timing Parameters of the Write Operation
Parameter   Description
tCY         Write clock period
tCH         Write clock minimum pulse width High
tCL         Write clock minimum pulse width Low
tADDRCSU    Write address setup time
tADDRCHD    Write address hold time
tBLKCSU     Write block setup time
tBLKCHD     Write block hold time
tWESU       Write enable setup time
tWEHD       Write enable hold time
tDINSU      Write input data setup time
tDINHD      Write input data hold time
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

ECC
uSRAM has error detection and correction logic (1-bit error correction, 2-bit error detection), which is available in ×18 mode only. Setting the ECC enable input (ECC_EN) High turns on the ECC circuitry and the ECC pipeline stages. The ECC encoder provides 24 bits of data for ×18 mode. The ECC decoder reads 24 bits from the array and provides 18 corrected bits on the output.
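The 24-bit stored width is consistent with a conventional single-error-correcting, double-error-detecting (SECDED) code. This guide does not name the code used, so the Python sketch below assumes an extended Hamming code; the function secded_check_bits is illustrative only.

```python
def secded_check_bits(data_bits):
    """Check bits for an extended Hamming (SECDED) code over data_bits of data."""
    r = 0
    # Hamming bound for single-error correction: 2**r >= data_bits + r + 1.
    while (1 << r) < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


check = secded_check_bits(18)
print(check, 18 + check)  # 6 24
```

Under that assumption, 18 data bits need 6 check bits, which matches the 24 bits the ECC encoder writes to the array.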
If the ECC logic detects an error, you can choose to correct the data in the uSRAM block. Writing the corrected data back is called scrubbing. The uSRAM does not perform scrubbing internally; all scrubbing must be done in the fabric design.

Both the ECC encoder and the ECC decoder contain their own pipeline registers, which add a clock cycle of latency to each of the read and write operations. These pipeline registers may be bypassed for slower operation. If the pipeline modes are enabled, the ECC flags hold unknown values during invalid clock cycles, until a clock cycle with valid output data.

The ECC logic generates two flags per read port: an error correction flag (A_SB_CORRECT, B_SB_CORRECT) that is set High when a single bit in a word is corrected, and an error detection flag (A_DB_DETECT, B_DB_DETECT) that is set High when two or more bit errors in a word are detected but not corrected.

On a single-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b0

On a double-bit error, the status flags are set to:
A/B_SB_CORRECT = 1'b1
A/B_DB_DETECT = 1'b1

Reset Operation
The global reset signal (ARST_N) is an asynchronous Active Low signal. For normal operation of uSRAM, the reset signal must be set to High. To reset the uSRAM block, the reset signal must be set to Low. When reset is asserted (ARST_N forced Low), the uSRAM behaves as follows during read and write operations:
1. Read operation: If reset is asserted while a read operation is in progress, the data output port is forced Low after a propagation delay. If the reset signal is asserted and then deasserted within the same High or Low clock phase, the data output stays Low until the next cycle. The data output changes its state only if a read operation, or a write operation in Bypass mode, is performed on the uSRAM block. In a simple write operation, the data output stays Low.
2.
Write operation: If reset is asserted during a write operation, corrupted data is written into the memory. Microsemi recommends not asserting the reset signal during a write operation.

All data stored in the array is lost during a global reset. The contents of the array must be considered unknown until a valid write operation.

Timing Diagram: Asynchronous Reset Operation
Figure 3-7 shows the asynchronous reset operation.

[Waveform: A_CLK/B_CLK, ARST_N, and A_DOUT/B_DOUT; annotated with tCY, tCH, tCL, and tR2Q]

Figure 3-7 • Asynchronous Reset Operation

Table 3-13 • Asynchronous Reset Timing Parameters
Parameter   Description
tCY         Clock period
tCH         Clock minimum pulse width High
tCL         Clock minimum pulse width Low
tR2Q        Asynchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.

The reset signals (A_ADDR_SRST_N, B_ADDR_SRST_N) are synchronous Active Low signals for the address and block select input registers of Port A and Port B. Asserting these reset signals forces the address and block select input registers to logic 0, which in turn forces the data output to logic 0.

Figure 3-8 shows the timing waveform for synchronous reset.

[Waveform: A_ADDR_CLK/B_ADDR_CLK, A_ADDR_SRST_N/B_ADDR_SRST_N, and A_DOUT/B_DOUT; annotated with tCY, tCLKMPWH, tCLKMPWL, tSRSTSU, tSRSTHD, and tCLK2Q]

Figure 3-8 • Timing Waveforms for Synchronous Reset

Table 3-14 shows the timing parameters of the synchronous reset.

Table 3-14 • Timing Parameters of the Synchronous Reset
Parameter   Description
tCY         Read clock period
tCLKMPWH    Read clock minimum pulse width High
tCLKMPWL    Read clock minimum pulse width Low
tSRSTSU     Read synchronous reset setup time
tSRSTHD     Read synchronous reset hold time
tCLK2Q      Read synchronous reset to output propagation delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
Collision
Collision between ports occurs when read and write operations are requested from two or all three ports at the same time and at the same address location. Table 3-15 shows different scenarios for collision.

Table 3-15 • Collision Scenarios
Operation: Simultaneous read from Port A and read from Port B to the same address location
Comments: Allowed, as the read ports are independent of each other. Both read ports deliver correct read data.

Operation: Simultaneous read from Port A and write to Port C to the same address location
Comments: Collision occurs. The write operation works correctly, but the read operation from Port A generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.

Operation: Simultaneous read from Port B and write to Port C to the same address location
Comments: Collision occurs. The write operation works correctly, but the read operation from Port B generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.

Operation: Simultaneous read from Port A, read from Port B, and write to Port C to the same address location
Comments: Collision occurs. The write operation works correctly, but the read operation from both ports generates ambiguous data output unless the clock cycle is long enough to allow the newly written data to be read.

Note: There is no collision prevention or detection implemented in the uSRAM architecture, so the designer must take measures to avoid the last three scenarios in designs.

4 – uPROM

Introduction
The RTG4 FPGA fabric has embedded micro programmable read-only memory (uPROM) blocks used for storing program data such as initialization data for SERDES, LSRAM, and uSRAM blocks. These uPROMs are arranged in a single row at the bottom of the FPGA fabric and can be accessed through the System Controller or the fabric interface. The number of uPROMs present depends on the device.
Table 4-1 shows the number of uPROMs present in each RTG4 device.

Features
RTG4 uPROM blocks have the following features:
• Each uPROM block stores up to 18,144 bits (504 × 36) of data.
• Write operation (erase/program) is performed at the same time as FPGA programming.
• Only Read operation is supported during normal operation.
• Read operation can be through the System Controller or the fabric interface.
• Read operation is supported at 50 MHz.
• Read operation supports Synchronous operation.
• Each uPROM block has an option to register all inputs and outputs.
• The registers at the read port of the uPROM block are similar to STMR flip-flops and have an option to mitigate single-event transients.

uPROM Resource Table
Table 4-1 shows the uPROM blocks available for RTG4 devices.

Table 4-1 • RTG4 uPROM Resource Table
Blocks         RT4G075   RT4G150
uPROM Blocks   254       381
Note: All numbers given above are per device.

Functional Description
This section provides the detailed description of the following:
• Architecture Overview
• Port List
• Operational Modes

Architecture Overview
The RTG4 uPROM embedded read-only memory includes the uPROM macro available in the Libero SoC software. Figure 4-1 shows a simplified block diagram of the uPROM memory block with one read data port and pipeline registers at the read port. Table 4-2 shows the port descriptions.

[Block diagram: RDEN, CLK, and ADDR inputs to the uPROM interface; read decode; memory array; pipeline register; DATAR and BUSY outputs]

Figure 4-1 • Simplified Functional Block Diagram of uPROM

Port List
Table 4-2 shows the list of ports for uPROM blocks.
Table 4-2 • Port List for uPROM
Port Name     Direction   Type      Polarity      Description
ADDR[13:0]    Input       Dynamic   –             Address input
CLK           Input       Dynamic   Rising        Clock input
RDEN          Input       Dynamic   Active High   Read Enable
DATAR[35:0]   Output      Dynamic   –             Data output
BUSY          Output      Dynamic   Active High   Busy signal from SII

Operational Modes
In the RTG4 uPROM block, the write operation (Program/Erase) is performed during the FPGA program or erase operation. The following two modes of read operation are available to access the uPROM data:
• Mode 1: Read Operation through System Controller
• Mode 2: Read Operation through Fabric Interface

Mode 1: Read Operation through System Controller
During the power-up sequence, the System Controller reads the data from uPROM to initialize the LSRAM, uSRAM, FDDR, or SERDES block registers. Refer to the RTG4 FPGA System Controller User Guide for more information on uPROM access through the System Controller.

Mode 2: Read Operation through Fabric Interface
During normal operation, you can read the uPROM block through the fabric interface using the uPROM macro (to be released) available in the Libero Macro library. The read timing diagram for the uPROM is shown in Figure 4-2.

[Waveform: CLK, ADDR, and DATA without and with the output register; annotated with tCY, tCH, tCL, tADDRSU, tADDRHD, tSU, tCQ, and tDR]

Figure 4-2 • Timing Waveforms for Read Operation

Table 4-3 • Timing Parameters for Synchronous-Synchronous Read Operation
Parameter   Description
tCY         Read clock period (50 MHz)
tCH         Read clock minimum pulse width High time
tCL         Read clock minimum pulse width Low time
tSU         Read setup time
tCQ         Read Clock to Q delay
tDR         Read Data delay
Note: Refer to the RTG4 FPGA Datasheet (to be released) for more information on timing values.
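The capacity and clock figures quoted in this chapter can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers from the Features list, Table 4-1, and Table 4-3; the variable names are illustrative.

```python
WORDS_PER_BLOCK = 504          # words per uPROM block (Features)
BITS_PER_WORD = 36             # bits per word (Features)
BLOCKS = {"RT4G075": 254, "RT4G150": 381}  # uPROM blocks per device (Table 4-1)
READ_CLOCK_HZ = 50_000_000     # read clock frequency (Features, Table 4-3)

# Bits stored in one block: 504 words of 36 bits.
bits_per_block = WORDS_PER_BLOCK * BITS_PER_WORD

# Total uPROM bits per device.
total_bits = {device: count * bits_per_block for device, count in BLOCKS.items()}

# Read clock period (tCY) in nanoseconds at 50 MHz.
t_cy_ns = 1e9 / READ_CLOCK_HZ

print(bits_per_block)  # 18144
print(total_bits)      # {'RT4G075': 4608576, 'RT4G150': 6912864}
print(t_cy_ns)         # 20.0
```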
5 – Mathblocks

Introduction
The RTG4 FPGA device implements a custom 18 × 18 multiply and accumulate block (18 × 18 MACC) for efficient implementation of complex DSP algorithms such as finite impulse response (FIR) filters, infinite impulse response (IIR) filters, and fast Fourier transforms (FFT) for filtering and image processing applications.

The RTG4 mathblock has a built-in multiplier and adder, which minimizes the fabric logic required to implement multiplication, multiply-add, and multiply-accumulate (MACC) functions. Implementing these arithmetic functions in the mathblock results in efficient resource usage and improved performance for DSP applications.

In addition to the basic MACC function, DSP algorithms typically need small amounts of RAM for coefficients and larger RAMs for data storage. RTG4 micro RAMs (uSRAMs) are ideally suited for coefficient storage, while the large RAMs are used for data storage.

The number of available mathblocks varies depending on the size of the device, as shown in Table 5-1.

Features
Each mathblock has the following features:
• High-performance and power-optimized multiplication operations.
• Supports 18 × 18 signed multiplication natively.
• Supports 17 × 17 unsigned multiplications.
• Supports dot product: the multiplier computes (A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9.
• Built-in addition, subtraction, and accumulation units to combine multiplication results efficiently.
• Independent third input C with a data width of 44 bits, completely registered.
• Single-bit input, CARRYIN, from fabric routing.
• Supports both registered and unregistered inputs and outputs.
• All the input and output registers are STMR flip-flops.
• Supports signed and unsigned operations.
• Internal cascade signals (44-bit CDIN and CDOUT) enable cascading of the mathblocks to support larger accumulators, adders, and subtractors without extra logic.
• Supports loopback capability.
• Adder support: (A × B) + C or (A × B) + D or (A × B) + C + D.
• Clock-gated input and output registers for power optimizations.
• Width of adder and accumulator can be extended by implementing extra adders in the FPGA fabric.
• Mathblocks can operate up to 300 MHz with SET mitigation disabled and up to 250 MHz with SET mitigation enabled.
• Supports transparent mode.
• Asynchronous load, limited to reset.
• Global reset can be ignored, if required.
• Mathblock flip-flops always reset during power-up.

Mathblock Resource Table
Table 5-1 lists the mathblocks available for RTG4 devices.

Table 5-1 • RTG4 Mathblocks Resource Table
Blocks                         RT4G075   RT4G150
Mathblocks (18-bit × 18-bit)   224       462
Note: All numbers given above are per device.

Functional Description
This section provides the detailed description of the architecture of the mathblock.

Architecture Overview
RTG4 devices can have one to three rows of mathblocks in the FPGA fabric, as given in Table 5-1. Mathblocks can be accessed through the FPGA routing architecture and cascaded in a chain, starting from the left-most block to the right-most block. Each mathblock consists of the following:
• Multiplier
• Adder or Subtractor
• I/O and Control Registers

Figure 5-1 shows the functional block diagram of the mathblock.

[Block diagram: input registers for A, B, and C with their bypass, enable, and synchronous reset controls; control registers for SUB, ARSHFT17, CDSEL, and FDBKSEL; DOTP, CARRYIN, and OVFL_CARRYOUT_SEL inputs; multiplier and adder with operand D and a right-shift stage]
Figure 5-1 • Functional Block Diagram of the Mathblock

Multiplier
The RTG4 mathblock can be used as a multiplier, which accepts two 18-bit inputs (A and B) and generates a 36-bit output. The mathblock multiplier can be configured in two different operating modes:
• Normal Mode
• DOTP Mode

Normal Mode
In Normal mode, the mathblock implements a single 18 × 18 signed multiplier. The mathblock accepts the inputs A[17:0] and B[17:0] and generates A × B as a 36-bit wide result. Figure 5-2 shows the functional block diagram of the mathblock in Normal mode.

[Block diagram: Normal mode, with inputs A, B, C, D, SUB, and CARRYIN and output P]

Figure 5-2 • Functional Block Diagram of the Mathblock in Normal Mode

DOTP Mode
Dot Product (DOTP) mode has two independent 9-bit × 9-bit multipliers with an adder, and the product sum is stored in the upper 36 bits of the 44-bit register. In DOTP mode, the mathblock implements the following equation:

(A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9    EQ 1

DOTP mode can be used to implement 9 × 9 complex multiplications. Figure 5-3 shows the functional block diagram of the mathblock in DOTP mode.

[Block diagram: Dot Product mode, with the two halves of A and B feeding two multipliers, inputs C, D, SUB, and CARRYIN, and output P]

Figure 5-3 • Functional Block Diagram of the Mathblock in DOTP Mode

Adder or Subtractor
The adder sums the output from the multiplier, the C input, CARRYIN, and the D input. The final output (P) of the adder is ((A[17:0] × B[17:0]) + C[43:0] + D[43:0] + CARRYIN). The mathblock can be configured as a 2-input or 3-input adder.
• As a 2-input adder, the mathblock computes A × B + C or A × B + D.
• As a 3-input adder, the mathblock computes A × B + C + D.
If the adder is configured as a subtractor, the adder output is ((C[43:0] + D[43:0] + CARRYIN) - (A[17:0] × B[17:0])).
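The multiplier and adder equations above can be modeled behaviorally as follows. This Python sketch is not the vendor simulation model: it assumes two's complement operands, assumes that the 9-bit halves in DOTP mode are treated as signed, and assumes that the result wraps to 44 bits. The function names are illustrative.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiplier(a, b, dotp=False):
    """Multiplier output for 18-bit operands a and b."""
    a &= 0x3FFFF
    b &= 0x3FFFF
    if not dotp:
        # Normal mode: single signed 18 x 18 product.
        return to_signed(a, 18) * to_signed(b, 18)
    # DOTP mode: (A[8:0] x B[17:9] + A[17:9] x B[8:0]) x 2^9.
    # Assumption: each 9-bit half is a signed value.
    a_lo, a_hi = to_signed(a & 0x1FF, 9), to_signed(a >> 9, 9)
    b_lo, b_hi = to_signed(b & 0x1FF, 9), to_signed(b >> 9, 9)
    return (a_lo * b_hi + a_hi * b_lo) << 9


def mathblock_p(a, b, c=0, d=0, carryin=0, sub=False, dotp=False):
    """Adder/subtractor output P, wrapped to 44 bits (assumed behavior)."""
    product = multiplier(a, b, dotp)
    addend = to_signed(c, 44) + to_signed(d, 44) + carryin
    total = addend - product if sub else addend + product
    return to_signed(total, 44)


print(mathblock_p(3, 5, c=10, d=2, carryin=1))             # 28
print(mathblock_p(3, 5, c=10, d=2, carryin=1, sub=True))   # -2
print(mathblock_p((2 << 9) | 3, (4 << 9) | 5, dotp=True))  # 11264
```

In the last call, A[17:9] = 2, A[8:0] = 3, B[17:9] = 4, and B[8:0] = 5, so the result is (3 × 4 + 2 × 5) × 2^9.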
I/O and Control Registers
Mathblocks have built-in registers on the data inputs (A, B, C), the data output (P), and the control signals. If required, these registers can be bypassed. All the registers in the mathblock have clock gating capability to reduce power consumption. These register flip-flops are STMR. Mathblocks do not have a pipeline register at the cascade input (CDIN), so pipeline registers can be added from the fabric when multiple mathblocks are cascaded to implement higher bit-width multiplications.

C Input
The C input port allows the formation of many 3-input mathematical functions, such as 3-input addition or 2-input multiplication with an addition. The CARRYIN signal is the carry input of the adder or accumulator. The C input can also be used as a dynamic input to achieve the following:
• Wrapping the cascade chain of mathblocks around from one row to the next row through the fabric.
• Rounding of multiplication outputs.
• Trimming of lower-order bits of the final sum, partial sum, or the product.

Cascaded Input, Output, and Selection
Higher-level DSP functions are supported by cascading individual mathblocks in a row. The two data signals, CDIN[43:0] and CDOUT[43:0], provide the cascading capability with a cascade select input (CDSEL). Table 5-2 shows the selection of CDSEL for propagating CDIN to the D input of the adder.

To cascade mathblocks, the CDOUT of one block must feed the CDIN of another block. CDOUT to CDIN is a hardwired connection between the blocks within a row. Two different rows can be cascaded using the fabric routing between the two rows. Extra pipeline registers may be needed to compensate for the extra delays added by the fabric routing, which in turn increases the latency of the chain. The ability to cascade mathblocks is useful in filter designs.
For example, an FIR filter design can use cascading inputs to arrange a series of input data samples and cascading outputs to arrange a series of partial output results. The ability to cascade provides a high-performance and low-power implementation of DSP filter functions because the general routing in the fabric is not used.

Overflow Output
Each mathblock has an overflow signal, OVFL_CARRYOUT. This signal indicates any overflow from the addition operation performed by the adder. This signal is also used to extend the adder data widths beyond the existing 44 bits using the fabric. The overflow signal is also used for the implementation of saturation capabilities. Saturation refers to catching an overflow condition and replacing the output with either the maximum (most positive) or minimum (most negative) value that can be represented. In RTG4 mathblocks, this capability is implemented using the adder's output sign bit (the MSB, bit [43] of the P output) and the overflow signal.

Shift Input
For multi-precision arithmetic, mathblocks provide a 17-bit arithmetic right-shift, which is controlled by the ARSHFT17 input. Thus, a partial product from one mathblock can be shifted to the right and added to the next partial product computed in an adjacent mathblock. Using this technique, mathblocks can be used to build larger multipliers.

Feedback Select Input
For accumulation operations, the mathblock output needs to loop back to the D input of the adder block. Selection of the D input is controlled by the feedback select (FDBKSEL) input. Table 5-2 shows the selection of FDBKSEL for loopback.

Table 5-2 • Truth Table for Propagating Operand D of the Adder or Accumulator
CDSEL   FDBKSEL   ARSHFT17   Operand D
0       0         0          0
0       0         1          0
1       X         0          CDIN[43:0]
1       X         1          {{17{CDIN[43]}}, CDIN[43:17]}
0       1         0          P[43:0]
0       1         1          {{17{P[43]}}, P[43:17]}

Mathblock Interface to Fabric Routing
Mathblocks can access the fabric routing through interface logic routing clusters.
These clusters are composed of 12 flip-flops and 12 4-input LUTs. When mathblocks are used, these flip-flops and LUTs act as an interface to the fabric routing. When mathblocks are not used, these flip-flops and LUTs can be used as normal flip-flops and LUTs. The interface logic clusters do not have carry chain support.

How to Use Mathblocks
The following sections describe how to use the mathblock in an application:
• Design Flow
• Mathblock Use Models
• Coding Style Examples

Design Flow
Mathblocks can be used in two ways: through inference or by using the mathblock primitive. Inference is done during the synthesis stage of an RTL design. Alternatively, the mathblock primitive is available in the Libero SoC IP catalog as a component that can be used directly in the HDL file or instantiated in SmartDesign.

Using a Mathblock Through Inference
Synplify Pro can infer mathblocks and configure them into the appropriate modes automatically if the RTL contains multiply, multiply-accumulate, multiply-add, or multiply-subtract functions. In this case, the synthesis tool takes care of all the signal connections of the mathblock to the rest of the design and provides the correct values for the static signals to configure the appropriate operational mode. The tool ties unused dynamic input signals to ground and provides default values to unused static signals.

The synthesis tool maps any multiplication function with input widths of three or greater to mathblocks. Multiplication functions with input widths less than three are implemented in FPGA logic by default, but their mapping can be controlled by the synthesis attribute (syn_multstyle). The tool can also cascade multiple mathblocks if the function crosses the limits of a single mathblock. For example, if an RTL function has a 35 × 35 multiplication, the synthesis tool implements this using four mathblocks cascaded in a chain.
It can also place the input and output registers inside the mathblock boundary, provided they are driven by the same clock. If the registers have different clocks, the clock that drives the output register has priority, and all registers driven by that clock are placed into the mathblock. If the outputs are unregistered and the inputs are registered with different clocks, the input registers with the larger input have priority and are placed into the mathblock.

The synthesis tool supports inference of mathblock components across hierarchical boundaries, which means that even if the multipliers, input registers, output registers, and subtractors/adders are present in different hierarchies, they can be placed into the same mathblock.

For more information on mathblock inference by Synplify Pro, refer to the Synopsys application note on inferring Microsemi RTG4 MACC Blocks (to be released).

Using the Mathblock Primitive
The mathblock primitive available in the Libero SoC IP Catalog is called MACC. Figure 5-4 shows the MACC primitive with its input and output ports and the bit width of each port. The MACC primitive can be used in designs through SmartDesign for schematic-based design entry or by directly instantiating the MACC wrapper in an HDL file as a component.

For the MACC primitive, the inputs and outputs must be connected manually to the design signals. Proper values must be provided for the static signals to ensure that the mathblock is configured in the correct operational mode. For example, to configure the mathblock in DOTP mode, the DOTP signal must be tied to logic 1. Unused active high dynamic signals must be connected to ground, unused active low dynamic signals must be set to High, and unused static signals must be in their default state.

Figure 5-4 • Mathblock Macro

Table 5-3 provides the port list and definitions.
Table 5-3 • Mathblock Pin Descriptions
Pin Name   Direction   Type   Polarity   Description
CLK   Input   Dynamic   Rising Edge   Input clock. There is one clock used in the entire mathblock. CLK is the clock for A[18:0], B[18:9], P[43:0], OVFL, SHFTSEL, CDSEL, FDBKSEL, and SUB registers.
ARST_N   Input   Global   Low   Global Asynchronous reset
Port A (to Multiplier)
A[17:0]   Input   Dynamic   –   Input Data
A_EN   Input   Dynamic   High   Enable for data registers. A_EN is for A[17:0]. When not registered, connect A_EN to logic 1.
A_SRST_N   Input   Dynamic   Low   Synchronous reset. A_SRST_N is for A[17:0]. When not registered, connect A_SRST_N to logic 1.
A_BYPASS   Input   Dynamic   Low   Port A register select
Port B (to Multiplier)
B[17:0]   Input   Dynamic   –   Input Data
B_SRST_N   Input   Dynamic   Low   Synchronous reset. B_SRST_N is for B[17:0]. When not registered, connect B_SRST_N to logic 1.
B_EN   Input   Dynamic   High   Enable for data registers. B_EN is for B[17:0]. When not registered, connect B_EN to logic 1.
B_BYPASS   Input   Dynamic   Low   Port B register select
Port C (to Adder)
C[43:0]   Input   Dynamic   –   Input Data
CARRYIN   Input   Dynamic   –   Adder/accumulator's carry input
C_SRST_N   Input   Dynamic   Low   Synchronous reset. C_SRST_N is for C[43:0]. When not registered, connect C_SRST_N to logic 1.
C_EN   Input   Dynamic   High   Enable for data registers. C_EN is for C[43:0]. When not registered, connect C_EN to logic 1.
C_BYPASS   Input   Dynamic   Low   Port C register select
Other Inputs
CDIN[43:0]   Input   Cascade   –   Cascaded input for operand D of the adder/accumulator. The entire CDIN will be driven by another mathblock's CDOUT.
Note: Asynchronous load input has higher priority than the synchronous load input.

Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name   Direction   Type   Polarity   Description
DOTP   Input   Static   High   Dot product mode. When DOTP = 1, the mathblock performs (A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9. When DOTP = 0, the mathblock performs normal 18 × 18 multiplication operations.
SUB   Input   Dynamic   High   Subtract operation. When SUB = 1, perform 2's complement subtraction to get P = C + D + CARRYIN - (A × B). When SUB = 0, perform 2's complement addition to get P = C + D + CARRYIN + (A × B).
SUB_SL_N   Input   Dynamic   Low   Synchronous reset input for SUB input control register.
SUB_EN   Input   Dynamic   High   Enable input for SUB input control register.
SUB_SD   Input   Static   Low   Synchronous load data for the SUB input control register.
SUB_BYPASS   Input   Dynamic   Low   SUB register select
ARSHFT17   Input   Dynamic   High   Arithmetic right-shift for operand D. When asserted, a 17-bit arithmetic right-shift is performed on operand D of the adder/accumulator.
ARSHFT17_SL_N   Input   Dynamic   Low   Synchronous reset input for ARSHFT17 input control register.
ARSHFT17_EN   Input   Dynamic   High   Enable input for ARSHFT17 input control register.
ARSHFT17_SD   Input   Static   Low   Synchronous load data for the ARSHFT17 input control register.
ARSHFT17_BYPASS   Input   Dynamic   Low   ARSHFT17 register select
CDSEL   Input   Dynamic   High   Selects CDIN for operand D of the adder/accumulator input. When CDSEL = 1, CDIN is propagated to operand D. When CDSEL = 0, either logic 0 or feedback from output P is routed to operand D, depending on FDBKSEL.
CDSEL_SL_N   Input   Dynamic   Low   Synchronous reset input for CDSEL input control register.
CDSEL_EN   Input   Dynamic   High   Enable input for CDSEL input control register.
CDSEL_SD   Input   Static   Low   Synchronous load data for the CDSEL input control register.
CDSEL_BYPASS   Input   Dynamic   Low   CDSEL register select
Note: Asynchronous load input has higher priority than the synchronous load input.

Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name   Direction   Type   Polarity   Description
FDBKSEL   Input   Dynamic   High   Selects the feedback from P for operand D of the adder or accumulator. When FDBKSEL = 1, the current value of the result P register is propagated. Ensure P_BYPASS = 0 and CDSEL = 0. When FDBKSEL = 0, logic 0 is propagated.
Ensure CDSEL = 0.
FDBKSEL_SL_N   Input   Dynamic   Low   Synchronous reset input for FDBKSEL input control register.
FDBKSEL_EN   Input   Dynamic   High   Enable input for FDBKSEL input control register.
FDBKSEL_SD   Input   Static   Low   Synchronous load data for the FDBKSEL input control register.
FDBKSEL_BYPASS   Input   Dynamic   Low   FDBKSEL register select
Output Port
P[43:0]   Output   –   –   Result data out.
• Normal mode
P = C + D + CARRYIN + (A × B) when SUB = 0
P = C + D + CARRYIN - (A × B) when SUB = 1
• DOTP mode
P = C + D + CARRYIN + ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) when SUB = 0
P = C + D + CARRYIN - ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) when SUB = 1
OVFL_CARRYOUT   Output   –   –   Overflow output.
• Normal mode
If C + D + CARRYIN +/- (A × B) > (2^43 - 1), then OVFL_CARRYOUT = 1.
If C + D + CARRYIN +/- (A × B) < -(2^43), then OVFL_CARRYOUT = 1.
Otherwise, OVFL_CARRYOUT = 0.
• DOTP mode
If C + D + CARRYIN +/- ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) > (2^43 - 1), then OVFL_CARRYOUT = 1.
If C + D + CARRYIN +/- ((A[8:0] × B[17:9] + A[17:9] × B[8:0]) × 2^9) < -(2^43), then OVFL_CARRYOUT = 1.
Otherwise, OVFL_CARRYOUT = 0.
Note: Asynchronous load input has higher priority than the synchronous load input.

Table 5-3 • Mathblock Pin Descriptions (continued)
Pin Name   Direction   Type   Polarity   Description
OVFL_CARRYOUT_SEL   Input   Static   High   Selects whether the adder generates the overflow bit or an external bit, which is output on the OVFL_CARRYOUT port. The overflow bit indicates an overflow generated in the addition process. The external bit is generated to extend the adder into the fabric. In this case, P[43], C[43], and D[43] do not represent the sign bit. When OVFL_CARRYOUT_SEL = 1, OVFL_CARRYOUT is the external bit for fabric extension. Otherwise, OVFL_CARRYOUT is the overflow output.
CDOUT[43:0]   Output   –   –   Cascade output of result P. CDOUT is the same as P. It is used to drive CDIN of another mathblock.
P_SRST_N   Input
Dynamic Low Synchronous reset input for P and OVFL_CARRYOUT control registers • P_SRST_N P[43:0] is for OVFL_CARRYOUT and When not registered, connect P_SRST_N to logic 1. P_EN [1:0] Input Dynamic High Enable input for P and OVFL_CARRYOUT control registers • P_EN[1] is for OVFL_CARRYOUT and P[43:18] • P_EN[0] is for P[17:0] When not registered, connect P_EN[1:0] to logic 1. In Normal mode, ensure P_EN[1] = P_EN[0]. P_BYPASS Input Dynamic Low Output Port P register select Note: Asynchronous load input has higher priority than the synchronous load input. Revision 2 69 Mathblocks Mathblock Use Models This section describes a few use models for RTG4 mathblocks. Use Model 1: Non-Pipelined Implementation of the 35 × 35 Multiplier 35 × 35 multipliers are useful for applications which require more than 18-bit precision. Non-pipelined implementation is typically used for low speed applications. A 35 × 35 multiplier can be constructed using 4 mathblocks in a single row, connected in a cascade. Figure 5-5 shows a typical implementation of a non-pipelined 35 × 35 multiplier. The inputs are assumed to be A [34:0] and B [34:0] with a product of P [69:0]. $>@ $>@ + 3>@ %>@ %>@ + !! $>@ ^$>@` / 3>@ %>@ %>@ + $>@ $>@ + 8QFRQQHFWHG %>@ ^%>@` / !! $>@ ^$>@` / 3>@ %>@ ^%>@` / Figure 5-5 • Non-Pipelined 35 × 35 Multiplier 70 R e visio n 2 UG0574: RTG4 FPGA Fabric User Guide Use Model 2: Pipelined Implementation of the 35 x 35 Multiplier RTG4 mathblocks have built-in registers on all input and output ports. To implement high-speed multipliers, extra registers are added to the input or output side of the mathblocks to balance the pipeline latency. These extra registers are implemented in the fabric. Figure 5-6 shows a typical 35 × 35 multiplier implementation with fabric pipeline registers. $>@ $>@ + 3>@ %>@ %>@ + !! $>@ ^$>@` / 3>@ %>@ %>@ + $>@ $>@ + 8QFRQQHFWHG %>@ ^%>@` / !! 
$>@ ^$>@` / 3>@ %>@ ^%>@` / )DEULF5HJLVWHUV Figure 5-6 • Pipeline 35 × 35 Multiplier Revision 2 71 Mathblocks Use Model 3: Implementation of 9-Bit Complex Multiplication Complex multiplication implemented using a mathblock in DOTP mode requires additional 2's complement logic in the fabric for negating the Q input. The DOTP implementation in Figure 5-7 shows the optimized way of implementing the 2's complement with minimal logic in the fabric. For two complex numbers X + jY, P + jQ, the complex multiplication is shown in EQ 2: Multiplication Result = Real part + Imaginary Part = (PX - QY) + j (PY + QX) EQ 2 In EQ 2, real part (PX-QY) requires -Q for the multiplication result. This can be computed using the one‘s complement of Q and add the Y using the C input (since -Q = ~Q+1). Imaginary part = P*Y+Q*X EQ 3 Real part = P*X + (~Q)*Y + Y EQ 4 Figure 5-7 shows the implementation of 9 × 9 complex multiplication using a mathblock configured in DOTP mode. ,QSXW $GGHU ,QSXW $GGHU 3<4; ,PDJLQDU\3DUW $/ < %+ 3 'RW3URGXFW 0RGH %/ 4 $+ ; &>@ =HURHV 0DWKEORFN %+ ¶VFRPSOHPHQW /RJLF 4 'RW3URGXFW 0RGH %/ 3 $+ ; &>@ =HURHV &>@ =HURHV &>@ < $/ < 0DWKEORFN Figure 5-7 • 9-Bit Complex Multiplication Using DOTP Mode 72 R e visio n 2 3;4< 5HDO3DUW UG0574: RTG4 FPGA Fabric User Guide Use Model 4: Multi-Threading and Multi-Channeling Mathblocks support a multi-threading option where the same mathblock can be used for performing more than one computation by time multiplexing. Time multiplexing can be done easily for designs with low sample rates. The multi-threading capability, if implemented for a chain of mathblocks, is called multi-channeling. Multichanneling can be used to implement multi-channel FIR filters where the same mathblock chain can be used to process multiple input channels by time multiplexing the mathblock chain. Multi-channel filtering is used in applications such as wireless communications, image processing, and multimedia applications. 
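The time multiplexing described above can be sketched in VHDL as follows. This is a minimal illustration under stated assumptions, not a reference design: the entity name mult_2thread, the port names, and the choice of two threads are hypothetical. The sketch shows only a shared multiplier with fabric multiplexers and fabric result registers; it does not show how the C input takes part, and whether a synthesis tool maps the multiplier and its register into a single mathblock is assumed rather than confirmed.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical example: one multiplier shared by two threads on alternate clocks.
entity mult_2thread is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;            -- synchronous reset, active low
    a0   : in  signed(17 downto 0);  -- thread 0 operands
    b0   : in  signed(17 downto 0);
    a1   : in  signed(17 downto 0);  -- thread 1 operands
    b1   : in  signed(17 downto 0);
    p0   : out signed(35 downto 0);  -- thread 0 product
    p1   : out signed(35 downto 0)   -- thread 1 product
  );
end mult_2thread;

architecture behav of mult_2thread is
  signal sel   : std_logic;          -- thread currently entering the multiplier
  signal sel_d : std_logic;          -- thread whose product is held in prod
  signal a_mux : signed(17 downto 0);
  signal b_mux : signed(17 downto 0);
  signal prod  : signed(35 downto 0);
begin
  -- Fabric multiplexers select the operands of the active thread.
  a_mux <= a0 when sel = '0' else a1;
  b_mux <= b0 when sel = '0' else b1;

  process(clk)
  begin
    if rising_edge(clk) then
      if rstn = '0' then
        sel   <= '0';
        sel_d <= '0';
        prod  <= (others => '0');
        p0    <= (others => '0');
        p1    <= (others => '0');
      else
        sel   <= not sel;            -- alternate threads every clock
        sel_d <= sel;                -- track which thread prod belongs to
        prod  <= a_mux * b_mux;      -- single shared multiplier
        -- Fabric registers capture the result of each thread.
        if sel_d = '0' then
          p0 <= prod;
        else
          p1 <= prod;
        end if;
      end if;
    end if;
  end process;
end behav;
```

In this sketch each thread is sampled on every second clock, so the approach fits only when the sample rate of each thread is at most half the clock rate.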
The mathblock uses its C input for multi-threading and multi-channeling, but fabric registers are also required for the implementation.

Use Model 5: Rounding and Trimming

Rounding
Rounding can be computed by adding a fixed term and a variable term to the input value to be rounded, and then truncating. The fixed term can be fed using the C input of the mathblock, and its value depends on the number of decimal places required after rounding. The variable term is always a single bit in the least-significant position, whose value is determined from the input value based on the type of rounding. Types of rounding are:
• Round to the adjacent even integer: The variable term is determined from the 2^0 bit of the input value.
• Round towards zero: The variable term is determined from the sign bit of the input value. For example, 1.5 rounds to 1 and -1.5 rounds to -1.

Table 5-4 shows examples for 6-bit values including three fraction bits.

Table 5-4 • Rounding Examples

Input Value (Decimal) | Input Value (Binary) | Fixed Term (C-Input) | Round To Even: Variable Term | Round To Even: Sum | Round To Even: Truncated Sum | Round To Even: Decimal | Round Toward Zero: Variable Term | Round Toward Zero: Sum | Round Toward Zero: Truncated Sum | Round Toward Zero: Decimal
2.5 | 010.100 | 0.011 | 000.000 | 010.111 | 010 | 2 | 000.000 | 010.111 | 010 | 2
1.5 | 001.100 | 0.011 | 000.001 | 010.000 | 010 | 2 | 000.000 | 001.111 | 001 | 1
-1.5 | 110.100 | 0.011 | 000.000 | 110.111 | 110 | -2 | 000.001 | 111.000 | 111 | -1
-2.5 | 101.100 | 0.011 | 000.001 | 110.000 | 110 | -2 | 000.001 | 110.000 | 110 | -2

[Figure 5-8: mathblock with inputs A[ ] and B[ ], the fixed term on the C input, the variable term on CARRYIN, and output P[ ]. Bit indices were lost in extraction.]
Figure 5-8 • Rounding Using C-Input and CARRYIN

Trimming
Trimming of the Final Sum: Applications such as IIR and FFT often require rounding and trimming of the final result, for example, the last output of a cascade chain or the final value read from an accumulator. The rounding terms can be added as shown in Figure 5-9, and the final result can be trimmed in the fabric.
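The round-towards-zero scheme described in this use model can be sketched in VHDL as follows. This is an illustrative sketch under stated assumptions: the entity name mult_round_zero and the generic M (the number of least-significant bits trimmed) are hypothetical, and the variable term is taken from the exclusive-OR of the operand sign bits as a prediction of the product sign. Whether a synthesis tool places the fixed term on the C input and the variable term on CARRYIN is an assumption, not something this sketch guarantees.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical example: multiply, round towards zero, then trim M LSBs.
entity mult_round_zero is
  generic (
    M : integer range 1 to 30 := 8   -- assumed number of trimmed bits
  );
  port (
    clk  : in  std_logic;
    in1  : in  signed(17 downto 0);
    in2  : in  signed(17 downto 0);
    out1 : out signed(35 - M downto 0)
  );
end mult_round_zero;

architecture behav of mult_round_zero is
  -- Fixed term 2**(M-1) - 1, the counterpart of 0.011 in Table 5-4.
  constant FIXED_TERM : signed(35 downto 0) := to_signed(2**(M-1) - 1, 36);
begin
  process(clk)
    variable var_term : signed(35 downto 0);
    variable sum      : signed(35 downto 0);
  begin
    if rising_edge(clk) then
      -- Variable term: one bit in the least-significant position,
      -- set when the operand signs differ (negative product expected).
      var_term    := (others => '0');
      var_term(0) := in1(17) xor in2(17);
      -- Add the fixed and variable terms to the product, then truncate.
      sum  := (in1 * in2) + FIXED_TERM + var_term;
      out1 <= sum(35 downto M);
    end if;
  end process;
end behav;
```

With M = 3, the additions follow the Round Toward Zero columns of Table 5-4: a negative value receives 0.011 + 0.001 before truncation, and a positive value receives 0.011 only.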
[Figure 5-9: cascade of mathblocks with A and B inputs, the variable term and the fixed term added in the chain, and final output P.]
Figure 5-9 • Rounding and Trimming of the Final Sum

Trimming of Grouped Sums: When computing very large dot products (for example, a large, fully-enumerated FIR), it is good practice to avoid overflow by breaking the sum into a few groups, trimming the sum for each group, and only then combining the group sums into a final result. The rounding of each group's sum can be done as shown in Figure 5-9. The trimming of each group's sum and the summation of the final result can be done in the fabric. Trimming can be done between the output of each cascade and the final fabric adder.

Trimming of Products: Figure 5-10 shows the implementation of rounding all products towards zero and then trimming the least significant m bits of the product. As long as there are no additive terms other than the products, it is possible to equivalently trim the partial sums instead of the products. Rounding towards zero needs the sign bit of the product (A*B), which can be derived from the sign bits of the incoming factors A and B using an exclusive-OR (EXOR).

[Figure 5-10: mathblocks with inputs A, A[ ], B, B[ ], and C[m], and outputs P[m] and P. Bit indices were lost in extraction.]
Figure 5-10 • Rounding and Trimming of the Final Sum

Coding Style Examples
The following code examples illustrate coding styles from which the synthesis tool can infer and implement RTG4 mathblocks.
Note: Examples are provided only in VHDL. Verilog examples are provided on request.

Example 1: 18 × 18 Signed Multiplication – Non-Registered
The following code is for an 18 × 18-bit signed multiplier. The input and output registers are configured in Transparent mode. The synthesis tool maps the code into one mathblock.
    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.numeric_std.all;

    entity sign18x18_mult is
      port(
        in1  : in  signed(17 downto 0);
        in2  : in  signed(17 downto 0);
        out1 : out signed(35 downto 0)
      );
    end sign18x18_mult;

    architecture behav of sign18x18_mult is
    begin
      -- Combinational 18 x 18 signed multiply
      out1 <= in1 * in2;
    end behav;

Example 2: 18 × 18 Signed Multiplication – Registered
The following code is for an 18 × 18 signed multiplier. The inputs and outputs are registered, with an asynchronous active-low reset signal, as coded below. The synthesis tool maps the code into one mathblock.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.numeric_std.all;

    entity sign18x18_mult_reg is
      port(
        clk  : in  std_logic;
        rstn : in  std_logic;
        in1  : in  signed(17 downto 0);
        in2  : in  signed(17 downto 0);
        out1 : out signed(35 downto 0)
      );
    end sign18x18_mult_reg;

    architecture behav of sign18x18_mult_reg is
      signal in1_reg : signed(17 downto 0);
      signal in2_reg : signed(17 downto 0);
    begin
      process(clk, rstn)
      begin
        if (rstn = '0') then
          -- Active-low reset clears input and output registers
          in1_reg <= (others => '0');
          in2_reg <= (others => '0');
          out1    <= (others => '0');
        elsif (rising_edge(clk)) then
          in1_reg <= in1;
          in2_reg <= in2;
          out1    <= in1_reg * in2_reg;
        end if;
      end process;
    end behav;

Example 3: 17 × 17-Bit Unsigned Multiplier with Different Resets
The following code is for a 17 × 17-bit unsigned multiplier, which has input and output registers with different asynchronous resets. The synthesis tool maps the code into one RTG4 mathblock.
    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;

    entity mult_17x17unsign is
      port(
        clk   : in  std_logic;
        rstn1 : in  std_logic;
        rstn2 : in  std_logic;
        in1   : in  std_logic_vector(16 downto 0);
        in2   : in  std_logic_vector(16 downto 0);
        out1  : out std_logic_vector(33 downto 0)
      );
    end mult_17x17unsign;

    architecture behav of mult_17x17unsign is
      signal in1_reg : std_logic_vector(16 downto 0);
      signal in2_reg : std_logic_vector(16 downto 0);
    begin
      -- Input registers, reset by rstn1
      process(clk, rstn1)
      begin
        if (rstn1 = '0') then
          in1_reg <= (others => '0');
          in2_reg <= (others => '0');
        elsif (rising_edge(clk)) then
          in1_reg <= in1;
          in2_reg <= in2;
        end if;
      end process;

      -- Output register, reset by rstn2
      process(clk, rstn2)
      begin
        if (rstn2 = '0') then
          out1 <= (others => '0');
        elsif (rising_edge(clk)) then
          out1 <= in1_reg * in2_reg;
        end if;
      end process;
    end behav;

Example 4: 17 × 17-Bit Unsigned Multiplier with Different Clocks
This example shows an unsigned multiplier with inputs and outputs that are registered with different clocks: clk1 and clk2. In this case, the synthesis tool places only the output registers and the multiplier into the RTG4 mathblock. The input registers are implemented in the FPGA logic outside the mathblock.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;

    entity mult_17x17unsign is
      port(
        clk1 : in  std_logic;
        clk2 : in  std_logic;
        in1  : in  std_logic_vector(16 downto 0);
        in2  : in  std_logic_vector(16 downto 0);
        out1 : out std_logic_vector(33 downto 0)
      );
    end mult_17x17unsign;

    architecture behav of mult_17x17unsign is
      signal in1_reg : std_logic_vector(16 downto 0);
      signal in2_reg : std_logic_vector(16 downto 0);
    begin
      -- Input registers on clk1 (implemented in fabric logic)
      process(clk1)
      begin
        if (rising_edge(clk1)) then
          in1_reg <= in1;
          in2_reg <= in2;
        end if;
      end process;

      -- Multiplier and output register on clk2
      process(clk2)
      begin
        if (rising_edge(clk2)) then
          out1 <= in1_reg * in2_reg;
        end if;
      end process;
    end behav;

Example 5: Multiplier-Adder
In the code below, the output of a multiplier is added to another input. Inputs and outputs are registered and have enables and synchronous resets. The synthesis tool maps the code into one RTG4 mathblock.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;

    entity mult_add is
      port(
        clk  : in  std_logic;
        rst  : in  std_logic;
        en   : in  std_logic;
        in1  : in  std_logic_vector(16 downto 0);
        in2  : in  std_logic_vector(16 downto 0);
        in3  : in  std_logic_vector(33 downto 0);
        out1 : out std_logic_vector(34 downto 0)
      );
    end mult_add;

    architecture behav of mult_add is
      signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
      signal mult_out         : std_logic_vector(33 downto 0);
    begin
      process(clk)
      begin
        if (rising_edge(clk)) then
          if (rst = '0') then
            -- Synchronous active-low reset
            in1_reg <= (others => '0');
            in2_reg <= (others => '0');
            out1    <= (others => '0');
          elsif (en = '1') then
            in1_reg <= in1;
            in2_reg <= in2;
            -- Extend both terms by one bit to keep the carry
            out1    <= ('0' & mult_out) + ('0' & in3);
          end if;
        end if;
      end process;

      mult_out <= in1_reg * in2_reg;
    end behav;

Example 6: Multiplier-Subtractor
There are two ways to implement multiply-and-subtract logic. The synthesis tool places the logic differently, depending on how it is implemented.
• Subtract the result of the multiplier from an input value (P = Cin – mult_out). The synthesis tool places all logic in the mathblock.
• Subtract a value from the result of the multiplier (P = mult_out – Cin). The synthesis tool places only the multiplier in the mathblock. The subtractor is implemented in FPGA logic outside the mathblock.
– Unsigned MultSub Example (P = Cin – Mult_out): implemented in a single mathblock

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;

    entity mult_sub is
      port(
        clk  : in  std_logic;
        rst  : in  std_logic;
        in1  : in  std_logic_vector(16 downto 0);
        in2  : in  std_logic_vector(16 downto 0);
        in3  : in  std_logic_vector(33 downto 0);
        out1 : out std_logic_vector(33 downto 0)
      );
    end mult_sub;

    architecture behav of mult_sub is
      signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
    begin
      process(clk)
      begin
        if (rising_edge(clk)) then
          if (rst = '0') then
            -- Synchronous active-low reset
            in1_reg <= (others => '0');
            in2_reg <= (others => '0');
            out1    <= (others => '0');
          else
            in1_reg <= in1;
            in2_reg <= in2;
            out1    <= in3 - (in1_reg * in2_reg);
          end if;
        end if;
      end process;
    end behav;

– Unsigned MultSub Example (P = Mult_out – Cin): multiplier implemented in the mathblock and subtractor in FPGA logic

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;

    entity mult_sub is
      port(
        clk  : in  std_logic;
        rst  : in  std_logic;
        in1  : in  std_logic_vector(16 downto 0);
        in2  : in  std_logic_vector(16 downto 0);
        in3  : in  std_logic_vector(33 downto 0);
        out1 : out std_logic_vector(33 downto 0)
      );
    end mult_sub;

    architecture behav of mult_sub is
      signal in1_reg, in2_reg : std_logic_vector(16 downto 0);
    begin
      process(clk)
      begin
        if (rising_edge(clk)) then
          if (rst = '0') then
            -- Synchronous active-low reset
            in1_reg <= (others => '0');
            in2_reg <= (others => '0');
            out1    <= (others => '0');
          else
            in1_reg <= in1;
            in2_reg <= in2;
            out1    <= (in1_reg * in2_reg) - in3;
          end if;
        end if;
      end process;
    end behav;

Example 7: Signed 35 × 35 Multiplication
The following code implements a signed 35 × 35 multiplication function. The synthesis tool uses four cascaded mathblocks to implement this multiplication function.
    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.numeric_std.all;

    entity sign35x35_mult is
      port(
        in1  : in  signed(34 downto 0);
        in2  : in  signed(34 downto 0);
        out1 : out signed(69 downto 0)
      );
    end sign35x35_mult;

    architecture behav of sign35x35_mult is
    begin
      out1 <= in1 * in2;
    end behav;

Example 8: Signed 35 × 35 Multiplication with Two Pipelined Register Stages
The following code implements a signed 35 × 35 multiplication function with two pipelined register stages. The synthesis tool uses four cascaded mathblocks to implement this multiplication function. The synthesis tool first infers pipeline registers at the input and output of the RTG4 mathblock and controls pipeline latency by balancing the number of register stages. To balance the stages, the tool adds registers at the input or output of the mathblock as required, implemented in the fabric logic.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.numeric_std.all;

    entity sign35x35_mult is
      port(
        clk  : in  std_logic;
        rstn : in  std_logic;
        in1  : in  signed(34 downto 0);
        in2  : in  signed(34 downto 0);
        out1 : out signed(69 downto 0)
      );
    end sign35x35_mult;

    architecture behav of sign35x35_mult is
      signal in1_reg : signed(34 downto 0);
      signal in2_reg : signed(34 downto 0);
    begin
      process(rstn, clk)
      begin
        if (rstn = '0') then
          in1_reg <= (others => '0');
          in2_reg <= (others => '0');
          out1    <= (others => '0');
        elsif (rising_edge(clk)) then
          -- Stage 1: input registers; stage 2: output register
          in1_reg <= in1;
          in2_reg <= in2;
          out1    <= in1_reg * in2_reg;
        end if;
      end process;
    end behav;

6 – I/Os

Introduction
RTG4 FPGA devices have different types of I/Os, such as MSIO and MSIOD, double data rate I/Os (DDRIO), and dedicated I/Os based on functional usage. For more information on I/O naming conventions and I/O descriptions, refer to the RTG4 FPGA Pin Description. The MSIO, MSIOD, and DDRIO provide programmable I/O features such as drive strength, slew rate, input delay, weak pull-up, and weak pull-down for several voltages.
The programmable I/O features are explained in detail in the "I/O Programmable Features" section on page 91.
The DDRIO is an MSIO optimized for LPDDR/DDR2/DDR3 performance. RTG4 devices have a DDR subsystem, called FDDR, that is used to control external DDR memory. DDRIOs can be connected to the respective DDR subsystem PHYs or can be used as user I/Os. For more information on the DDR subsystem, refer to the RTG4 High Speed DDR Interfaces User Guide.
The MSIO, MSIOD, and DDRIO can be configured as fabric I/Os, whereas dedicated I/Os are used for a single purpose such as serializer/deserializer (SERDES), device reset, and clock functions. Dedicated I/Os cannot be used by any other circuits.
The MSIO, MSIOD, and DDRIO are configured at power-up by means of fabric-related flash bits, which are used to initialize register blocks. This is done automatically by Libero SoC.

Functional Description
The RTG4 I/Os are classified into the following three categories depending on their functional usage:
• MSIO, MSIOD, and DDRIO
• JTAG I/O
• Dedicated I/Os

MSIO, MSIOD, and DDRIO
Figure 6-1 on page 84 shows the top-level view of the I/O interconnection between fabric logic and FDDR. The DDRIOs are shared between the fabric logic and FDDR. When the FDDR controller is used, the Libero SoC software automatically assigns and configures the FDDR controller signals to the respective DDRIOs. The SPIO_SEL signal (as shown in Figure 6-1 on page 84) determines whether the fabric logic or the FDDR peripheral is connected to the corresponding I/O. This selection is set automatically by the Libero SoC software during programming. When the FDDR controller is not used, the respective DDRIOs are available to fabric logic, as shown in Figure 6-1 on page 84. In the case of MSIO and MSIOD, the I/O is directly connected to fabric logic. For fabric logic, each I/O port of the design must be individually assigned to an I/O in the Libero SoC software.
[Figure 6-1: for each of the P and N sides, the IOD contains a Fabric IOD (user configures in Libero SoC; signals OE, Data_out, Data_in to fabric logic) and an FDDR IOD (Libero SoC configures automatically; connected to the FDDR controller PHY). SPIO_SEL selects between them to drive OE, DO, and DI of the IOA transmitter and receiver, which connect to the PAD. IP buffer and OP buffer disable controls and a differential connection between the P and N sides are also shown.]
Figure 6-1 • I/O Interconnection

An I/O consists of a highly featured bidirectional I/O buffer. The I/O is divided into two main sections, as shown in Figure 6-1:
• Digital: IOD (fabric and FDDR)
• Analog: IOA

The digital (IOD) section generates the output enable (OE), data out (DO), and data in (DIN) signals for both P and N. Refer to the "Fabric Architecture" chapter on page 7 for more details on the IOD.
Each pair of analog (IOA) blocks forms a differential pair, as shown in Figure 6-2 on page 86. The differential pair is used to support differential and pseudo-differential modes of operation. The differential pair is composed of a true and a complement IOA. The true IOA is called P (with positive polarity relative to the DO/DIN data signals of the P cell). The complement IOA is called N (with negative polarity relative to the DO/DIN data signals of the N cell).
The IOA blocks form a ring around the periphery of the device (excluding the SERDES channel edge). On the top and bottom edges of the device, the IOA order starts with P on the left and N on the right. The left and right edges use N on the top and P on the bottom. There is one IOD for each pair of IOAs.
To support a variety of differential standards, the RTG4 uses pairs of regular I/O cells: P and N. These two I/O cells of MSIO, MSIOD, and DDRIO can be configured as separate single-ended I/Os or as one differential I/O pair.
In differential output mode, the output data signal is driven out on both the P cell and the N cell as a differential pair, where the true signal is on the P pad and the complement signal is on the N pad. The P and N output signals are complementary, as required by the DDR1/DDR2/DDR3 standards for CK and DQS signals. The P and N cells are laid out next to each other, as a pair, to minimize the skew between the two output signals of the differential pair.
The analog block (IOA) section has transmitter and receiver buffers for the P and N pair. The main circuits in the IOA are the transmit and receive buffers (as shown in Figure 6-2 on page 86), which support various I/O standards. The IOA contains the following modules:
• Transmit Buffer
• Receive Buffer
• Input Programming Delay
• On-Die Termination

Transmit Buffer
Transmit and receive buffers transfer signals between the FPGA fabric and the IOA, and between FDDR and the IOA. OE_P and OE_N control the direction of the I/O buffers, as shown in Figure 6-2 on page 86. When an I/O is operated as a single-ended I/O, OE_P and OE_N individually control the P and N I/O buffers. When an I/O is operated as a differential I/O, OE_P controls both the P and N I/O buffers. The dynamic OE disables or enables an output buffer for all the standards.

Receive Buffer
The enabling and disabling of the input buffer is controlled automatically by the Libero SoC software. The I/O receiver can operate in four different modes, as shown in Figure 6-2 on page 86. These modes are selected based on flash configuration bits, which are configured during programming, after power-on.
The four modes of the receiver are:
• True differential
• Pseudo-differential
• Single-ended
• Schmitt trigger

In True differential mode, the P and N pad inputs are fed to the comparator, whereas in Pseudo-differential mode, each pad input is compared with an external reference voltage. Figure 6-2 on page 86 shows the detailed IOA structure of an I/O. The I/O input can be configured as a Schmitt trigger receiver or a single-ended receiver. When Schmitt trigger inputs are selected, the input buffers present hysteresis, which filters noise at the receiver and prevents double glitching caused by noisy input edges.

Input Programming Delay
Input delays can be used to improve the hold time of the input register by increasing the delay from the input pin to the input register. Refer to the "I/O Programmable Features" section on page 91 for more information.

On-Die Termination
On-die termination (ODT) improves the signaling environment by reducing the electrical discontinuities introduced with off-die termination, and hence enables reliable operation at higher signaling rates. For more information on the programmed ODT values for DDRIO, MSIO, and MSIOD, refer to the "I/O Programmable Features" section on page 91.
[Figure 6-2: IOA structure for the P and N sides. Recoverable labels: Fabric or FDDR driving DO_P/OE_P and DO_N/OE_N; DDRIO Calibration Block ("Program directly ODT to desired value", "Reference Resistor Value", "DDRIO Pairs Connected to MDDR/FDDR"); VCCIO; programmable slew rate for the 'P' and 'N' drivers; programmable pull-up, pull-down, or both disabled for 'P' and 'N'; ODT and transmitter impedance; TxP and TxN; PAD_P and PAD_N; Receiver P and Receiver N with Single-Ended, Schmitt, Pseudo-Differential, and True Differential paths; Differential ODT (MSIOD only); X_VREF; Voltage Standard Select; Input Programming Delay producing DIN_P_delayed and DIN_N_delayed from DIN_P and DIN_N.]
Figure 6-2 • IOA Architecture

Radiation Hardening
Radiation hardening is the act of making systems resistant to damage or malfunctions caused by ionizing radiation (such as particle radiation and high-energy electromagnetic radiation, which are encountered in space, high-altitude flight, and so on). In this chapter, the term "hardened" means radiation hardened.
RTG4 devices have hardened input buffers for receiving clock inputs or other critical signals. There are 24 primary clock inputs on an RTG4 device. The hardened capability is only available for MSIO and MSIOD receivers. The DDRIO receivers are not hardened, which means they are susceptible to radiation.
The RTG4 hardened receiver uses TMR logic (that is, each receiver block is composed of three receivers with a wire-or at the output). Each RTG4 hardened receiver in MSIO and MSIOD supports the following modes of operation:
1. Single-ended ratio receiver mode (LVTTL/LVCMOS) with programmable ON/OFF
2. Reference receiver mode (SSTL/HSTL)
3. Differential input mode (LVDS/RSDS)
In RTG4 devices, because hardening applies only to the input buffer, an I/O configured in bidirectional mode is not hardened. The hardened input has programmable on-die termination (ON/OFF) and a programmable weak pull-up/pull-down (ON/OFF) per pad.
I/O Banks
I/Os are grouped in banks on the basis of I/O voltage standard. Each I/O bank has dedicated I/O supply and ground voltages, and only standards compatible with the voltage supplied to the bank can be used. The RT4G150-CG1657 device has 10 I/O banks.
Every I/O bank has input and output buffers to support a wide range of standards, which require different VCC voltages and, for voltage-referenced standards, reference voltages (VREF). These voltages are externally supplied and connected to device pins, which serve banks (groups) of I/Os.
This section discusses the RT4G150 device in the CG1657 package. The 10 banks of the RT4G150 device are shown in Figure 6-3. They include three MSIO banks, four MSIOD banks, and two DDRIO banks. Figure 6-3 gives the maximum number of available I/Os for each bank in parentheses. For more information on RTG4 FPGA pin descriptions, supply pins, unused conditions, and packaging details, refer to the RTG4 FPGA Pin Description.

[Figure 6-3: bank map of the RTG4 FPGA in the CG1657 package, showing the JTAG bank, the MSIO banks, the MSIOD banks, the two DDRIO banks (FDDR_W and FDDR_E), and the SERDES and SERDES_PCIE blocks. Bank numbers and I/O counts were lost in extraction.]
Figure 6-3 • RT4G-CG1657 I/O Bank Location and Naming

The MSIOs, MSIODs, and DDRIOs are divided into banks, each of which may be configured to support one of the standards listed in Table 6-2 on page 88.
Table 6-1 shows the organization of I/O banks in the RTG4 devices.

Table 6-1 • The Organization of I/O Banks in RTG4 Devices

I/O Banks | RT4G150-CG1657
Bank 7 | MSIOD: Fabric
Bank 8 | MSIOD: Fabric
Bank 9 | DDRIO: FDDR or fabric
Bank 4 | MSIO: Fabric
Bank 5 | MSIO: Fabric
Bank 6 | MSIO: Fabric
Bank 1 | MSIOD: Fabric
Bank 2 | MSIOD: Fabric
Bank 0 | DDRIO: FDDR or fabric
Bank 3 | MSIO: JTAG

Supported I/O Standards
Table 6-2 shows the supported voltage standards for various I/O types.
Table 6-2 • Supported Voltage Standards

I/O Standards | MSIO | MSIOD | DDRIO
Single-Ended Standard
LVTTL 3.3 V | Yes | – | –
LVCMOS 3.3 V | Yes | – | –
PCI | Yes | – | –
LVCMOS 12 | Yes | Yes | Yes
LVCMOS 15 | Yes | Yes | Yes
LVCMOS 18 | Yes | Yes | Yes
LVCMOS 25 | Yes | Yes | Yes
Voltage-Referenced Standard
SSTL2I | Yes | Yes | Yes (DDR1)
SSTL2II | Yes | Yes | Yes (DDR1)
SSTL18I | Yes | Yes | Yes (DDR2)
SSTL18II | Yes | Yes | Yes (DDR2)
SSTL15I | – | – | Yes (DDR3). Only for I/Os used by FDDR
SSTL15II | – | – | Yes (DDR3). Only for I/Os used by FDDR
HSTL18I | Yes | Yes | Yes
HSTL18II | Yes | Yes | Yes
HSTLI | Yes | Yes | Yes
HSTLII | – | – | Yes
LPDDRI | – | – | Yes
LPDDRII | – | – | Yes
Differential Standard
LVPECL | Yes | – | –
LVDS 33 | Yes | – | –
LVDS | Yes | Yes | –
RSDS | Yes | Yes | –
MINILVDS | Yes | Yes | –
BUSLVDS | Yes | Yes (Only Input) | –
MLVDS | Yes | Yes (Only Input) | –

For I/O pin naming and assignments to specific banks, refer to the RTG4 Pin Descriptions document.

Single-Ended Standards
Single-ended I/O standards use a push-pull CMOS output stage with a voltage referenced to the system ground. The input buffer configuration, output drive, and I/O supply voltage (VCCI) vary among the I/O standards. The advantage of these standards is that a common ground can be used for multiple I/Os. This simplifies board layout and reduces system cost. The reduced slew rate of these I/O standards causes less electromagnetic interference (EMI) on the board. However, these I/Os are not suitable for high frequency (>200 MHz) switching due to noise and high power consumption.

Low Voltage TTL (LVTTL)
This is a general purpose standard (EIA/JESD8-B) for 3.3 V applications. It uses an LVTTL input buffer and a push-pull output buffer. The LVTTL output buffer can have up to eight different programmable drive strengths.
Low Voltage CMOS (LVCMOS)
RTG4 devices provide five different kinds of LVCMOS:
• LVCMOS 3.3 V
• LVCMOS 2.5 V
• LVCMOS 1.8 V
• LVCMOS 1.5 V
• LVCMOS 1.2 V

LVCMOS 3.3 V (only in MSIO) is an extension of the LVCMOS standard (JESD8-B-compliant) used for general purpose 3.3 V applications. LVCMOS 2.5 V is an extension of the LVCMOS standard (JESD8-5-compliant) used for general purpose 2.5 V applications. LVCMOS 1.8 V is an extension of the LVCMOS standard (JESD8-7-compliant) used for general purpose 1.8 V applications. LVCMOS 1.5 V is an extension of the LVCMOS standard (JESD8-11-compliant) used for general purpose 1.5 V applications.
The VCCI values for these standards are 3.3 V, 2.5 V, 1.8 V, 1.5 V, and 1.2 V, respectively. For MSIOs, all these versions use a 3.3 V-tolerant CMOS input buffer and a push-pull output buffer. Similar to LVTTL, the output buffer has up to eight different programmable drive strengths.

3.3 V Peripheral Component Interface (PCI)
This standard specifies support for both 33 MHz and 66 MHz PCI bus applications. It uses an LVTTL input buffer and a push-pull output buffer. With the aid of an external resistor, this I/O standard can be 5 V-compliant.

Voltage-Referenced Standards
I/Os using these standards are referenced to an external reference voltage (VREF).

High-Speed Transceiver Logic (HSTL) Class I
These are general purpose, high-speed 1.5 V bus standards (EIA/JESD8-6) for signaling between integrated circuits. The signaling range is 0 V to 1.5 V, and signals can be either single-ended or differential. HSTL requires a differential amplifier input buffer and a push-pull output buffer. These standards are used in memory bus interfaces with data switching capability of up to 400 MHz. The other advantages of these standards are low power and fewer EMI concerns. HSTL has four classes, of which RTG4 devices support Class I. The reference voltage (VREF) is 0.75 V.
Stub Series Terminated Logic 2.5 V (SSTL2) Class I and II
These are general purpose 2.5 V memory bus standards (JESD8-9) for driving transmission lines, designed specifically for driving the DDR SDRAM modules used in computer memory. SSTL2 requires a differential amplifier input buffer and a push-pull output buffer. The reference voltage (VREF) is 1.25 V.

Stub Series Terminated Logic 1.8 V (SSTL18) Class I and II
These are general purpose 1.8 V memory bus standards (JESD8-15) for driving transmission lines, designed specifically for driving the DDR2 SDRAM modules used in computer memory. SSTL18 requires a differential amplifier input buffer and a push-pull output buffer. The reference voltage (VREF) is 0.9 V.

Differential Standards
These standards require two I/Os per signal (called a signal pair). Logic values are determined by the potential difference between the lines, not with respect to ground. This is why differential drivers and receivers have much better noise immunity than single-ended ones. The differential interface standards offer higher performance and lower power consumption than their single-ended counterparts. Two I/O pins are used for each data transfer channel. Differential standards require resistor termination on both I/Os.

Low Voltage Positive Emitter Coupled Logic
Low voltage positive emitter coupled logic (LVPECL) requires that one data bit is carried through two signal lines; therefore, two pins are needed per input or output. It also requires external resistor termination. The voltage swing between the two signal lines is approximately 850 mV. When the power supply is +3.3 V, it is commonly referred to as LVPECL.

Low Voltage Differential Signal
Low voltage differential signal (LVDS) is a differential I/O standard. As with all differential signaling standards, LVDS requires that one data bit is carried through two signal lines, and it has inherent noise immunity over single-ended I/O standards.
The voltage swing between two signal lines is approximately 350 mV. The external VREF or board termination voltage (VTT) is not required. LVDS requires the use of two pins per input or output. Reduced Swing Differential Signaling Reduced swing differential signaling (RSDS) is a signaling standard that defines the output characteristics of a transmitter and inputs of a receiver along with the protocol for a chip-to-chip interface between flat-panel timing controllers and column drivers. 90 R e visio n 2 UG0574: RTG4 FPGA Fabric User Guide B-LVDS/M-LVDS Bus LVDS (B-LVDS) refers to bus interface circuits based on LVDS technology. Multipoint LVDS (MLVDS) specifications extend the LVDS standard to high-performance multipoint bus applications. Multidrop and multipoint bus configurations may contain any combination of drivers, receivers, and transceivers. The LVDS drivers provide the higher drive current required by B-LVDS and M-LVDS to accommodate the bus loading. The driver requires series terminations for better signal quality and to control the voltage swing. Termination is also required at both ends of the bus, since the driver can be located anywhere on the bus. The RTG4 MSIOD has an internal circuit isolation, and the bus isolation must be implemented in the design external to the FPGA when using M-LVDS. Mini-LVDS A serial, intra-flat panel solution that serves as an interface between the timing control function and an LCD source driver. I/O Programmable Features RTG4 devices support different I/O programmable features for MSIO, MSIOD, and DDRIO. Each I/O pair (P and N) supports the following programmable features: • Programmable Input Delay • Pre-Emphasis • Programmable Slew Rate Control • Programmable Output Drive Strength • Programmable Weak Pull-Up/Pull-Down • Programmable Schmitt Trigger Input and Receiver • Configurable ODT and Driver Impedance These features can be configured using Libero SoC or in a PDC file. 
Refer to the Libero SoC User Guide for more details.

Table 6-3 lists all the features supported for single-ended and differential I/Os. The table covers single-ended transmitter, single-ended receiver, differential transmitter, and differential receiver features, in that order.

Table 6-3 • RTG4 I/O Features

| I/O Features | MSIO | MSIOD | DDRIO |
|---|---|---|---|
| Programmable drive strength | Yes | Yes | Yes |
| Programmable weak pull-up and pull-down | Yes | Yes | Yes |
| Configurable ODT | Yes | Yes | Yes |
| Hot insertion capable | – | – | – |
| LVTTL/LVCMOS 3.3 V outputs compatible with external 5 V TTL inputs | Yes | – | – |
| Pre-emphasis capability | – | Yes | – |
| Programmable slew rate | – | – | Yes |
| 5 V tolerant with minimal use of external circuitry | Yes | Yes | – |
| Schmitt receiver | Yes | Yes | Yes |
| Programmable input delay | Yes | Yes | Yes |
| Programmable weak pull-up and pull-down | Yes | Yes | Yes |
| Configurable ODT | Yes | Yes | Yes |
| Programmable slew rate | – | – | Yes |
| 100 Ω differential ODT | Yes | Yes | – |
| Schmitt receiver | Yes | Yes | Yes |
| Programmable input delay | Yes | Yes | Yes |
| Programmable slew rate | – | – | Yes |

Programmable Input Delay

Each I/O, when configured as an input, can be programmed with different input delays. The input delay is calculated using:

Delay = D + N × 0.1 ns (EQ 1)

where N ranges from 0 to 63 and D is the intrinsic (circuit) delay of the input without additional delay, that is, the delay when N is 0. The total delay ranges from D ns to D + 6.3 ns; for example, N = 10 adds 1.0 ns to the intrinsic delay. The intrinsic delay varies depending on SLOW (SS), MEDIUM (TT), and FAST (FF) slew rates. Hence, there are 64 input delay values, which can be selected and configured using the I/O Constraints Editor of Libero SoC for MSIO, MSIOD, and DDRIO.

Note: Input delays can be used to improve hold time at the input register by increasing the delay from the input pin to the input register.

Pre-Emphasis

The MSIOD has pre-emphasis on the differential output only. The RTG4 MSIOD has LVDS pre-emphasis.

Programmable Slew Rate Control

The MSIO and MSIOD do not support a user-programmable slew rate. However, the MSIO and MSIOD output drive slew rate is managed, to some extent, with staggered output pre-drive stages. Each output buffer has multiple transistors connected in parallel and driven by corresponding pre-driver circuits. A delay circuit staggers the pre-driver turn-on times and thereby controls the overshoot of the switching current.

The DDRIO has two bits of programmable slew control capability on the non-differential drive outputs. The LVCMOS25, LVCMOS18, LVCMOS15, and LVCMOS12 standards support three levels of slew control: Minimum, Medium, and Maximum. The DDRIO output drive slew rate is also managed with staggered output pre-drive stages and by use of an impedance-matched output driver.

Programmable Weak Pull-Up/Pull-Down

All I/O standards support the Pull-up, Pull-down, and None states. The default configuration is None. These states can be configured using the I/O Constraints Editor in the Libero SoC software. Pull-up and Pull-down are mutually exclusive and weakly hold the output at either VDDI or VSS, respectively, through a 10 kΩ resistor.

Table 6-4 • Weak Pull-Up/Pull-Down

| I/O Standard | MSIO | MSIOD | DDRIO |
|---|---|---|---|
| LVTTL33 | None, Down, Up | – | – |
| LVCMOS33 | None, Down, Up | – | – |
| PCI | None, Down, Up | – | – |
| LVCMOS12 | None, Down, Up | None, Down, Up | None, Down, Up |
| LVCMOS15 | None, Down, Up | None, Down, Up | None, Down, Up |
| LVCMOS18 | None, Down, Up | None, Down, Up | None, Down, Up |
| LVCMOS25 | None, Down, Up | None, Down, Up | None, Down, Up |

Programmable Schmitt Trigger Input and Receiver

The MSIO, MSIOD, and DDRIO inputs can be configured as Schmitt trigger receivers. When the Schmitt trigger inputs are enabled, the input buffers present hysteresis, which filters out noise at the receiver and prevents double glitching caused by noise at input edges. This feature can be enabled or disabled by using a physical design constraints (PDC) command or by using the I/O Constraints Editor in the Libero SoC software.
The Schmitt Trigger receiver is disabled by default. Table 6-5 shows the I/O standards that support the Schmitt Receiver option.

Table 6-5 • Schmitt Receiver

| I/O Standard | MSIO | MSIOD | DDRIO |
|---|---|---|---|
| LVTTL33 | Off, On | – | – |
| LVCMOS33 | Off, On | – | – |
| PCI | Off, On | – | – |
| LVCMOS12 | Off, On | Off, On | Off, On |
| LVCMOS15 | Off, On | Off, On | Off, On |
| LVCMOS18 | Off, On | Off, On | Off, On |
| LVCMOS25 | Off, On | Off, On | Off, On |

Programmable Output Drive Strength

The output buffers have programmable current drive, selectable from 2 mA to 16 mA. Programmable values are available only for the LVTTL and LVCMOS standards, as shown in Table 6-6. These values can be programmed using the I/O Constraints Editor in the Libero SoC software for the selected I/O standard.

Table 6-6 • Recommended DDRIO Output Drive Strengths

| I/O Standard | MSIO (mA) | MSIOD (mA) | DDRIO (mA) |
|---|---|---|---|
| LVTTL | 2, 4, 8, 12, 16 | – | – |
| LVCMOS33 | 2, 4, 8, 12, 16 | – | – |
| LVCMOS12 | 2, 4 | 2, 4 | 2, 4, 6 |
| LVCMOS15 | 2, 4, 6, 8 | 2, 4, 6 | 2, 4, 6, 8, 10, 12 |
| LVCMOS18 | 2, 4, 6, 8, 10, 12 | 2, 4, 6, 8, 10 | 2, 4, 6, 8, 10, 12, 16 |
| LVCMOS25 | 2, 4, 6, 8, 12, 16 | 2, 4, 6, 8, 12 | 2, 4, 6, 8, 12, 16 |

Configurable ODT and Driver Impedance

The MSIO, MSIOD, and DDRIOs have an ODT or transmitter impedance feature, which is calibrated depending on the I/O standard. If the impedance feature is enabled, the impedance can be programmed to the desired value in three ways. Figure 6-2 shows the impedance configuration for DDRIO.
• Calibrate the ODT/Driver Impedance with Calibration Block
• Calibrate the ODT/Driver Impedance with Fixed Calibration Codes
• Configure the ODT/Driver Impedance Statically to Desired Value Directly

There are two DDRIO calibration blocks in each RTG4 device. The FDDR has a DDRIO calibration block. Each calibration block calibrates ODT/driver impedance for all 44 DDRIO pairs (P and N).

Calibrate the ODT/Driver Impedance with Calibration Block

The I/O calibration block automatically calibrates the I/O drivers to an external resistor.
The impedance control is used to identify the digital values PCODE<5:0> and NCODE<5:0>. These values are fed to the pull-up/pull-down reference network to match the impedance with an external resistor. Once the impedance matches, the PCODE and NCODE values are latched and sent to the drivers. The calibrated impedance value can be configured statically by enabling ODT_STATIC, or dynamically by enabling ODT_DYN. ODT_STATIC selects the ODT value set in flash configuration bits programmed during power-on, whereas ODT_DYN selects the ODT value provided at run time. Refer to the FDDR I/O Calibration Control register of the "System Register Block" in the RTG4 FPGA High Speed DDR User Guide for enabling the calibration block. Table 6-7 shows the ODT calibrated impedances for the listed I/O standards.

Table 6-7 • ODT Calibrated Impedance

| Driver Mode | Reference Resistor (Ω) | ODT Calibrated Impedance (Ω) |
|---|---|---|
| ODT, DDR3/SSTL 1.5, 1.5 V | 240 | 120, 60, 40, 30, 20 |
| ODT, DDR2/SSTL 1.8, 1.8 V | 150 | 150, 75, 50 |
| ODT, HSTL | 191 | 47.8 |

To calibrate driver or transmitter impedance for an I/O, configure it to the calibrated impedance according to the flash configuration bits for the appropriate I/O standard. Recommended reference resistor values are used for calibration. The calibrated impedance values are shown in Table 6-8.
Table 6-8 • Driver/Transmitter Calibrated Impedance

| Driver Mode | Reference Resistor (Ω) | Transmitter Calibrated Impedance (Ω) |
|---|---|---|
| Transmitter, DDR3 SSTL 1.5 V | 240 | 34, 40 |
| Transmitter, DDR2 SSTL 1.8 V | 150 | 20, 42 |
| Transmitter, DDR1 SSTL 2.5 V | 150 | 20, 42 |
| Transmitter, LPDDR SSTL 1.8 V | 150 | 20, 42 |
| Transmitter, HSTL 1.5 V | 191 | 25.5, 47.8 |
| LVCMOS 1.2 V and 1.5 V | 300 | 75, 66.7, 50 |
| LVCMOS 1.8 V | 150 | 75, 50, 33, 25 |
| LVCMOS 2.5 V | 150 | 75, 50, 33, 25 |

Calibrate the ODT/Driver Impedance with Fixed Calibration Codes

The DDRIO can use fixed impedance calibration for different drive strengths, and these values can be programmed using the I/O Constraints Editor in the Libero SoC software for the selected I/O standard. Refer to the I/O Constraints Editor section in the Libero SoC User Guide. Table 6-6 shows the recommended DDRIO output drive strength values. PCODE<5:0> and NCODE<5:0> are registers accessible through the dedicated APB configuration interface.
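The fixed codes in Table 6-9 are listed per I/O standard and drive strength, but the table gives the same NCODE/PCODE pair for every drive strength of a given standard. The following Python sketch captures that as a lookup and checks that each code fits the six-bit PCODE<5:0>/NCODE<5:0> fields. It is illustrative only: the names `FIXED_CODES` and `fixed_codes` are hypothetical, the table values are assumed to be decimal, and nothing here reflects the actual register layout or the Libero SoC tool flow.

```python
# (NCODE, PCODE) pairs copied from Table 6-9, assumed to be decimal values.
# Table 6-9 repeats the same pair for every drive strength of a standard,
# so one entry per standard family is enough for this illustration.
FIXED_CODES = {
    "SSTL2": (42, 44),     # DDR1 full and half drive
    "SSTL18": (58, 61),    # DDR2 full and half drive
    "LPDDR": (58, 61),
    "HSTL": (53, 56),
    "LVCMOS25": (42, 44),
    "LVCMOS18": (58, 61),
    "LVCMOS15": (53, 56),
    "LVCMOS12": (40, 42),
}

CODE_WIDTH_BITS = 6  # PCODE<5:0> and NCODE<5:0> are six bits wide


def fixed_codes(io_standard):
    """Return (ncode, pcode) for a standard family (hypothetical helper)."""
    ncode, pcode = FIXED_CODES[io_standard]
    limit = (1 << CODE_WIDTH_BITS) - 1
    if not (0 <= ncode <= limit and 0 <= pcode <= limit):
        raise ValueError(f"code out of six-bit range for {io_standard}")
    return ncode, pcode


if __name__ == "__main__":
    for name in FIXED_CODES:
        print(name, fixed_codes(name))
```

Keying the lookup by standard family rather than by drive strength is a simplification that holds only for the values shown in Table 6-9.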
Table 6-9 • PCODE and NCODE Values

| I/O Standard | NCODE | PCODE |
|---|---|---|
| DDR1 Full Drive/SSTL2 II | 42 | 44 |
| DDR1 Half Drive/SSTL2 I | 42 | 44 |
| DDR2 Full Drive/SSTL18 II | 58 | 61 |
| DDR2 Half Drive/SSTL18 I | 58 | 61 |
| LPDDR Full Drive | 58 | 61 |
| LPDDR Half Drive | 58 | 61 |
| HSTL II | 53 | 56 |
| HSTL I | 53 | 56 |
| LVCMOS25 16 mA | 42 | 44 |
| LVCMOS25 12 mA | 42 | 44 |
| LVCMOS25 8 mA | 42 | 44 |
| LVCMOS25 6 mA | 42 | 44 |
| LVCMOS25 4 mA | 42 | 44 |
| LVCMOS25 2 mA | 42 | 44 |
| LVCMOS18 16 mA | 58 | 61 |
| LVCMOS18 12 mA | 58 | 61 |
| LVCMOS18 10 mA | 58 | 61 |
| LVCMOS18 8 mA | 58 | 61 |
| LVCMOS18 6 mA | 58 | 61 |
| LVCMOS18 4 mA | 58 | 61 |
| LVCMOS18 2 mA | 58 | 61 |
| LVCMOS15 12 mA | 53 | 56 |
| LVCMOS15 10 mA | 53 | 56 |
| LVCMOS15 8 mA | 53 | 56 |
| LVCMOS15 6 mA | 53 | 56 |
| LVCMOS15 4 mA | 53 | 56 |
| LVCMOS15 2 mA | 53 | 56 |
| LVCMOS12 6 mA | 40 | 42 |
| LVCMOS12 4 mA | 40 | 42 |
| LVCMOS12 2 mA | 40 | 42 |

Configure the ODT/Driver Impedance Statically to Desired Value Directly

The ODT/driver can be calibrated to a desired value by providing PCODE<5:0> and NCODE<5:0> values directly through the dedicated APB configuration interface FIC2. In this configuration, the supplied values overwrite the existing values. Refer to the FDDR I/O Calibration Control register of the "System Register Block" in the RTG4 FPGA High Speed DDR User Guide for configuring the PCODE and NCODE values. For MSIO and MSIOD, the ODT values shown in Table 6-10 are configured based on the I/O standard.

Table 6-10 • ODT Values

| I/O Standards | MSIO (Ω) | MSIOD (Ω) | DDRIO (Ω) |
|---|---|---|---|
| SSTL18I & SSTL18II (DDR2) | 50, 75, 150 | 50, 75, 150 | 50, 75, 150 |
| SSTL15I & SSTL15II (DDR3) | – | – | 20, 30, 40, 60, 120 |
| HSTL18I & HSTL18II | 50, 75, 150 | 50, 75, 150 | 50, 75, 150 |
| LPDDRI & LPDDRII | – | – | 50, 75, 150 |

Default values set in software (not user-accessible):

| I/O Standards | MSIO (Ω) | MSIOD (Ω) | DDRIO (Ω) |
|---|---|---|---|
| LVDS33 | 100 | – | – |
| LVPECL | 100 | – | – |
| LVDS25 | 100 | 100 | – |
| RSDS | 100 | 100 | – |
| MINILVDS | 100 | 100 | – |
| BUSLVDS | 100 | 100 | – |
| MLVDS | 100 | 100 | – |

I/O External Termination

If ODT is not used, I/O standards require termination for better signal integrity.
Voltage-referenced standards generally have a series (driver) and a parallel (receiver) termination, whereas differential standards have only a parallel (receiver) termination. Table 6-11 shows external termination schemes for the I/O standards supported for DDRIO, MSIO, and MSIOD when the ODT/driver impedance calibration feature is not used.

Table 6-11 • Termination Schemes

| I/O Standard | External Termination Scheme |
|---|---|
| SSTL 1.5 single-ended (Class I & II) | Single-ended SSTL I/O standard termination |
| SSTL 1.8 single-ended (Class I & II) | Single-ended SSTL I/O standard termination |
| SSTL 2 single-ended (Class II) | Single-ended SSTL I/O standard termination |
| HSTL 1.5 single-ended (Class II) | Single-ended HSTL I/O standard termination |
| SSTL 2.5 differential (Class I & II) | Differential SSTL I/O standard termination |
| SSTL 1.8 differential (Class I & II) | Differential SSTL I/O standard termination |
| SSTL 1.5 differential (Class I & II) | Differential SSTL I/O standard termination |
| HSTL 1.5 differential (Class II) | Differential HSTL I/O standard termination |
| LVCMOS 2.5 | No external termination required |
| LVCMOS 1.8 | No external termination required |
| LVCMOS 1.5 | No external termination required |
| LVCMOS 1.2 | No external termination required |
| LVDS | 100 Ω, parallel termination |
| MLVDS | 100 Ω, parallel termination |
| BLVDS | 100 Ω, parallel termination |
| RSDS | 100 Ω, parallel termination |
| Mini LVDS | 100 Ω, parallel termination |
| LVPECL | 100 Ω, parallel termination |

Note: For more information on electrical characteristics, refer to the RTG4 Datasheet (to be released). RTG4 does not support the Bus Keeping feature.

Cold Sparing

In cold sparing applications, voltage can be applied to device I/Os before and during power-up. The RTG4 device supports cold sparing applications that use the following strategy:
• The system board integrates two parallel RTG4 devices with shared or common I/O connections.
• The primary RTG4 device has its core powered and is fully functional until a swap of devices is determined to be necessary.
• The backup RTG4 device has its I/O banks powered, to prevent I/O leakage through the ESD diodes, and its fabric core unpowered. This establishes a low power, protected state for the backup RTG4 device.
• At any point, you can swap devices by powering down the core of the primary RTG4 device, powering up the core of the backup RTG4 device, and letting it go through its configuration sequence.
• Primary and backup devices are identical parts.
• Only one of the two devices may be active at any one time.
• Core VDD high activates the part, and core VDD low deactivates it.
• The VDD of the inactive part must be tied to ground and must not be left floating.

Cold sparing has the following advantages:
• Power-up can be done in any sequence (no power supply sequencing requirement).
• No excess device leakage in the spare device (in this cold sparing method).

Figure 6-4 • Cold Sparing (block diagram: the primary RTG4 device, active, and the backup RTG4 device, spare, each with VDD core power and VDDIO I/O bank power, sharing I/O connections with another chip)

5 V Input Tolerance and Output Driving Compatibility (only MSIO)

5 V Input Tolerance

I/Os can support 5 V inputs when they are configured as LVTTL 3.3 V or LVCMOS 3.3 V and one of the following techniques is used to reduce the voltage at the I/O. There are three recommended solutions for achieving 5 V receiver tolerance. All the solutions meet a common requirement of limiting the voltage at the input to 3.45 V or less. The I/O absolute maximum voltage rating is 3.45 V, and any voltage above 3.45 V may cause long-term gate oxide failures.

Solution 1

The board-level design must ensure that the reflected waveform at the pad does not exceed the limits provided in the recommended operating conditions in the datasheet. This is a requirement to ensure long-term reliability. This scheme also works for a 3.3 V PCI configuration, but the internal diode must not be used for clamping, and the voltage must be limited by two external resistors. Relying on diode clamping would create an excessive pad DC voltage of 3.3 V + 0.7 V = 4 V. This solution requires two board resistors.
Here are some examples of possible resistor values, based on a simplified simulation model with no line effects and 10 Ω transmitter output resistance, where

Rtx_out_high = (VCCI – VOH) / IOH and Rtx_out_low = VOL / IOL (EQ 2)

Example 1 (high speed, high current):
Rtx_out_high = Rtx_out_low = 10 Ω
R1 = 36 Ω (±5%), P(r1)min = 0.069 W
R2 = 82 Ω (±5%), P(r2)min = 0.158 W
Imax_tx = 5.5 V / (82 × 0.95 + 36 × 0.95 + 10) = 45.04 mA
tRISE = tFALL = 0.85 ns at C_pad_load = 10 pF (includes up to 25% safety margin)
tRISE = tFALL = 4 ns at C_pad_load = 50 pF (includes up to 25% safety margin)

Example 2 (low-medium speed, medium current):
Rtx_out_high = Rtx_out_low = 10 Ω
R1 = 220 Ω (±5%), P(r1)min = 0.018 W
R2 = 390 Ω (±5%), P(r2)min = 0.032 W
Imax_tx = 5.5 V / (220 × 0.95 + 390 × 0.95 + 10) = 9.33 mA
tRISE = tFALL = 4 ns at C_pad_load = 10 pF (includes up to 25% safety margin)
tRISE = tFALL = 20 ns at C_pad_load = 50 pF (includes up to 25% safety margin)

Other resistor values are also allowed, as long as the resistors are sized to limit the voltage at the receiving end to 2.5 V < Vin(rx) < 3.6 V when the transmitter sends a logic 1. This range of Vin_dc(rx) must be assured for any combination of transmitter supply (5 V ± 0.5 V), transmitter output resistance, and board resistor tolerances.

Figure 6-5 shows the 5 V input tolerance solution 1.

Figure 6-5 • 5 V Input Tolerance Solution 1 (schematic: two external board resistors, Rext, in front of the LVCMOS I/O; requires two board resistors)

Solution 2

The board-level design must ensure that the reflected waveform at the pad does not exceed the voltage overshoot/undershoot limits provided in the datasheet. This is a requirement to ensure long-term reliability. This scheme also works for a 3.3 V PCI configuration, but the internal diode must not be used for clamping, and the voltage must be limited by the external resistor and Zener diode. Relying on the diode clamping would create an excessive pad DC voltage of 3.3 V + 0.7 V = 4 V.
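The worst-case transmitter current and resistor power figures quoted in the Solution 1 examples can be checked with a short calculation. The Python sketch below applies the Imax_tx expression from the examples (maximum 5.5 V supply, both board resistors at the low end of their ±5% tolerance, 10 Ω transmitter output resistance). It assumes, as an interpretation rather than something the text states, that the P(r)min values are the I²R dissipation in each resistor at that current; this assumption reproduces the quoted values. The function name and parameters are hypothetical.

```python
def solution1_worst_case(r1_ohm, r2_ohm, rtx_ohm=10.0, v_tx_max=5.5, tol=0.05):
    """Return (i_max_A, p_r1_W, p_r2_W) for the two-resistor scheme.

    Uses the expression from the examples:
        Imax_tx = Vmax / (R1 * (1 - tol) + R2 * (1 - tol) + Rtx)
    Power is taken as I^2 * R at minimum resistance (an assumption).
    """
    r1_min = r1_ohm * (1.0 - tol)
    r2_min = r2_ohm * (1.0 - tol)
    i_max = v_tx_max / (r1_min + r2_min + rtx_ohm)
    return i_max, i_max ** 2 * r1_min, i_max ** 2 * r2_min


if __name__ == "__main__":
    for label, r1, r2 in (("Example 1", 36.0, 82.0), ("Example 2", 220.0, 390.0)):
        i_max, p1, p2 = solution1_worst_case(r1, r2)
        print(f"{label}: Imax_tx = {i_max * 1e3:.2f} mA, "
              f"P(r1) = {p1:.3f} W, P(r2) = {p2:.3f} W")
```

The same function can be used to screen other resistor pairs before checking the resulting receiver voltage against the 2.5 V to 3.6 V window given for Solution 1.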
Figure 6-6 • 5 V Input Tolerance Solution 2 (schematic: one external board resistor, Rext, and one Zener diode in front of the LVCMOS I/O; requires one board resistor and one Zener diode)

5 V Output Driving Compatibility

RTG4 I/Os must be set to 3.3 V LVTTL or 3.3 V LVCMOS mode to reliably drive 5 V TTL receivers. It is also critical that there is NO external I/O pull-up resistor to 5 V, since this resistor would pull the I/O pad voltage beyond the 3.6 V absolute maximum value and consequently damage the I/O. When set to 3.3 V LVTTL or 3.3 V LVCMOS mode, the I/Os can directly drive signals into 5 V TTL receivers. The output levels in both modes, VOL = 0.4 V and VOH = 2.4 V, satisfy the VIL = 0.8 V and VIH = 2 V level requirements of 5 V TTL receivers, because VOL is below VIL and VOH is above VIH. Therefore, level 1 and level 0 are recognized correctly by 5 V TTL receivers.

Temperature Sensing

This feature is used as an internal thermometer to provide a way of monitoring the die temperature. It is implemented as a temperature sense diode located in the lower left corner of the device. The temperature sensing diode has one dedicated pin, PTEMP, connected to the anode of the diode. The cathode of the diode is connected to the VSS of the die. The diode is a passive device, and the pins are always attached to the die. The PTEMP pin can be left floating if the feature is not being used. Nothing needs to be programmed in software to enable the temperature sensing feature. To use the temperature sensing diode, it must be calibrated by user software and/or circuits. To measure the temperature, check the voltage drop between PTEMP and VSS.

I/Os Shared by Fabric and FDDR

DDRIOs with FDDR

If FDDR is selected, Libero SoC automatically connects FDDR signals to the DDRIOs. Depending on the memory configuration, only the required DDRIOs are used by Libero SoC. The unused DDRIOs are available to connect to the FPGA fabric.

DDRIOs with Fabric

If FDDR is not selected, DDRIOs are available to the FPGA fabric.
DDRIOs must be configured manually in Libero SoC.

MSIOs/MSIODs with Fabric

There are two macros in silicon, DDR_IN and DDR_OUT, which can be connected to a DDR controller soft IP core in the fabric. You can use the I/O standards of MSIOs and MSIODs for DDR controllers that cannot be supported by the dedicated DDRIO bank. MSIOs/MSIODs are available to the FPGA fabric and must be configured manually in Libero SoC.

JTAG I/O

The system controller implements the functionality of a JTAG slave, with IEEE 1532 support, which also implies IEEE 1149.1 compliance. JTAG communicates with the system controller using a Command register that conveys the JTAG instruction to be executed and a 128-bit data I/O buffer that transfers any associated data. The JTAG pins can be run at any voltage from 1.5 V to 3.3 V (nominal). The I/O voltage of this interface is set by powering the VJTAG power pin with the desired I/O voltage. The core voltage must also be powered for the JTAG state machine to operate, even if the device is in Bypass mode. VJTAG power alone is insufficient. Both VJTAG and the core voltage of the RTG4 part must be supplied to allow JTAG signals to transit the RTG4 device. Isolating the JTAG power supply in a separate I/O bank gives greater flexibility with supply selection and simplifies power supply and PCB design. If the JTAG interface is not used and not planned for use, the VJTAG pin together with the TRSTB pin must be tied to GND.

The TAP controller is a state machine whose transitions are controlled by the TMS signal and which controls the behavior of the JTAG system. The TAP controller uses 8-bit instructions consistent with previous Microsemi product families. There are two types of TAP controllers:
• Fabric TAP
• Auxiliary TAP

Table 6-12 • JTAG Pin Description

| Name | Type | Bus Size | Description |
|---|---|---|---|
| JTAGSEL | In | 1 | JTAG controller selection. Depending on the state of the JTAGSEL pin, an external JTAG controller detects the FPGA fabric TAP or the auxiliary TAP. The JTAGSEL pin must be connected to an external pull-up resistor such that the default configuration selects the FPGA fabric TAP. Logic 1: FPGA fabric TAP selected. Logic 0: AUX TAP selected. |
| TCK | In | 1 | Test Clock. Serial input for JTAG boundary scan, ISP, and UJTAG. The TCK pin does not have an internal pull-up/pull-down resistor. If JTAG is not used, Microsemi recommends tying off TCK to GND or VJTAG through a resistor placed close to the FPGA pin. This prevents JTAG operation in case TMS enters an undesired state. To operate at all VJTAG voltages, the resistor values mentioned in Table 6-13 are recommended. |
| TDI | In | 1 | Test Data Input. Serial input for JTAG boundary scan, ISP, and UJTAG usage. There is an internal weak pull-up resistor (10 kΩ) on the TDI pin. |
| TDO | Out | 1 | Test Data Output. Serial output for JTAG boundary scan, ISP, and UJTAG usage. |
| TMS | In | 1 | Test Mode Select. The TMS pin controls the use of the IEEE 1532 boundary scan pins (TCK, TDI, TDO, and TRSTB). There is an internal weak pull-up resistor (10 kΩ) on the TMS pin. |
| TRSTB | In | 1 | Boundary scan reset pin. The TRSTB pin functions as an active-low input to asynchronously initialize (or reset) the boundary scan circuitry. There is an internal weak pull-up resistor (10 kΩ) on the TRSTB pin. If JTAG is not used, an external pull-down resistor must be included to ensure the TAP is held in Reset mode. The resistor values must be selected from Table 6-13 and must satisfy the parallel resistance value requirement. The values in Table 6-13 correspond to the recommended resistor when a single device is used, and to the equivalent parallel resistance when multiple devices are connected through a JTAG chain. In safety critical applications (Avionics mode), an upset in the JTAG circuit could allow entering an undesired JTAG state. In such cases, Microsemi recommends tying off TRSTB to GND through a resistor placed close to the FPGA pin. This keeps the JTAG circuitry in the Reset state. |
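When several devices share a JTAG chain, the TRSTB description above says the Table 6-13 values apply to the equivalent parallel resistance. The Python sketch below illustrates that check, assuming each device has its own tie-off resistor and that these resistors appear in parallel on the shared net. The names `TIE_OFF_RANGE_OHM`, `parallel_resistance`, and `tie_off_in_range` are hypothetical; the ranges are copied from Table 6-13.

```python
# Recommended tie-off ranges from Table 6-13, in ohms, keyed by VJTAG (volts).
TIE_OFF_RANGE_OHM = {
    "3.3": (200.0, 1000.0),
    "2.5": (200.0, 1000.0),
    "1.8": (500.0, 1000.0),
    "1.5": (500.0, 1000.0),
}


def parallel_resistance(resistors_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def tie_off_in_range(vjtag, resistors_ohm):
    """True if the equivalent parallel tie-off falls in the Table 6-13 range."""
    low, high = TIE_OFF_RANGE_OHM[vjtag]
    return low <= parallel_resistance(resistors_ohm) <= high


if __name__ == "__main__":
    chain = [1000.0, 1000.0, 1000.0]  # three devices, 1 kOhm each (example values)
    print(f"equivalent: {parallel_resistance(chain):.1f} Ohm")
    print("OK at VJTAG 3.3 V:", tie_off_in_range("3.3", chain))
    print("OK at VJTAG 1.8 V:", tie_off_in_range("1.8", chain))
```

In this example, three 1 kΩ resistors give roughly 333 Ω, which lies inside the 3.3 V range but below the 500 Ω minimum listed for 1.8 V, so larger per-device resistors would be needed at the lower VJTAG voltage.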
Table 6-13 • Recommended Tie-Off Values for the TCK and TRSTB Pins

| VJTAG | Tie-Off Resistance (see Notes) |
|---|---|
| VJTAG at 3.3 V | 200 Ω to 1 kΩ |
| VJTAG at 2.5 V | 200 Ω to 1 kΩ |
| VJTAG at 1.8 V | 500 Ω to 1 kΩ |
| VJTAG at 1.5 V | 500 Ω to 1 kΩ |

Notes:
1. The TCK pin can be pulled up or down. If it is pulled up, it should be pulled up to the VJTAG voltage.
2. The TRSTB pin can only be pulled down.
3. Equivalent parallel resistance if more than one device is on the JTAG chain.

Dedicated I/Os

The RTG4 devices have the following dedicated I/Os:
• Device Reset I/O
• SERDES I/O

Device Reset I/O

RTG4 devices have a dedicated reset input. Whenever reset is asserted, the whole chip is reset. The device reset feeds the system controller, which generates the system reset for the reset controller to reset the entire device. Figure 6-7 shows the full chip reset flow from device reset. The Libero SoC tool allows you to configure the reset controller using the System Builder.

Figure 6-7 • Chip Level Resets From Device Reset (block diagram: DEVRST_N feeds the System Controller, which drives the Reset Controller through System Resets to produce the Chip Level Resets)

Port List and I/O Pins

Table 6-14 • Device Reset I/O Pin

| Pin | Type | I/O | Description |
|---|---|---|---|
| DEVRST_N | Analog | Input | Device reset, asserted low, and powered by VPP. |

SERDES I/O

The SERDES I/Os available in RTG4 devices are dedicated to high-speed serial communication protocols. For more information, refer to the SERDES section in the RTG4 FPGA High Speed Serial Interfaces User Guide. The SERDES I/O supports protocols such as PCI Express 2.0, XAUI, serial gigabit media independent interface (SGMII), serial rapid I/O (SRIO), and any user-defined high-speed serial protocol implemented in the fabric. These protocols access the SERDES lanes through the physical media attachment (PMA) and physical coding sublayer (PCS) of the SERDES interface.
The detailed configuration of the SERDES interface for various protocols is explained in the "SERDESIF Block" chapter of the RTG4 FPGA High Speed Serial Interfaces User Guide. This section describes the SERDES I/O pins, SERDES I/O banks, SERDES I/O standards, and board-level design considerations.

SERDES I/O Banks

The SERDES I/Os reside in dedicated I/O banks. The number of SERDES I/Os depends on the device size and pin count. For example, the RT4G150 device has four SERDES_IFs (SERDES_IF0, SERDES_IF1, SERDES_IF2, and SERDES_IF3), which reside on four I/O banks. Refer to the RTG4 FPGA High Speed Serial Interfaces User Guide for details on I/O bank locations and I/O electrical specifications.

SERDES I/O Pins

Each SERDES interface in the RTG4 device has four SERDES I/O data lanes, or 16 SERDES I/Os, available for accessing the SERDES interface (SERDESIF block). Each data lane has two pairs of differential signals: one for transmit data (TxDP, TxDN) and the other for receive data (RxDP, RxDN). Data lanes are multiplexed to support different serial protocols and are scalable to various link widths: ×1, ×2, and ×4. These settings can be configured in the SERDES_IF macro using the Libero SoC design software. Each SERDES_IF has two sets of dedicated power, clock, and reference signals: one set for data lanes 0 and 1, and another for data lanes 2 and 3. For more information on SERDES I/O and power pin names and descriptions, refer to the RTG4 FPGA Pin Descriptions.

Dedicated Global I/Os

Dedicated global I/Os are dual-use I/Os, which can drive the global blocks directly or through clock conditioning circuits (CCC). They can also be used as regular user I/Os. These global I/Os are the primary means of bringing external clock inputs into the RTG4 device. Unused dedicated global I/Os behave similarly to unused regular user I/Os (MSIO, MSIOD, DDRIO). Libero configures unused user I/Os with the input buffer disabled and the output buffer tristated with a weak pull-up.
The RTG4 devices have 36 I/Os dedicated to global clocks. Of these 36, 12 are dedicated to SERDES clocks. GRESET generates a global asynchronous reset signal during power-up or programming, and allows the user to apply an asynchronous reset to the fabric flip-flops globally, if required. For more information on global I/Os, refer to the "Fabric Global Routing Resources" chapter of the RTG4 FPGA Clocking Resources User Guide.

A – Glossary

Acronyms

uSRAM: Micro static random access memory
CCC: Clock conditioning circuits
LSRAM: Large static random access memory
LSB: Least significant bit
ECC: Error correction code
MSB: Most significant bit
STMR: Self-corrected triple module redundancy
DDRIO: Double data rate input output
FDDR: Controller for external DDR memory
IOA: Input output analog
IOD: Input output digital
LPDDR: Low power double data rate memory
ODT: On-die termination
HSTL: High-speed transceiver logic
SSTL: Stub series terminated logic
BLVDS: Bus LVDS
ESD: Electrostatic discharge protection
LPE: Low power exit
LVPECL: Low-voltage positive emitter coupled logic
LVTTL: Low voltage transistor transistor logic
MLVDS: Multipoint LVDS
MSIO: Multi-standard I/O
MVN: MultiView Navigator
RSDS: Reduced swing differential signaling
SERDES: Serializer/deserializer

Terminology

Clusters
Clusters are formed by grouping a certain number of logic elements and interconnecting them. This is related to the clustered routing architecture of the RTG4 FPGA fabric.

Interface Cluster
An interface cluster is formed by grouping 12 interface logic elements.

I/O Cluster
An I/O cluster is formed by grouping either three or four I/O modules.

Interface Logic
A logic element consisting of a 4-input LUT and an STMR flip-flop. This logic element interfaces the hard macros (LSRAMs, uSRAMs, and mathblocks) to fabric routing.
I/O Module
A logic element consisting of flip-flops and routing multiplexers. This logic element interfaces the user I/Os to fabric routing.

Inter-cluster Routing
Inter-cluster routing refers to routing resources between various types of clusters.

Intra-cluster Routing
Intra-cluster routing refers to routing resources existing inside a specific cluster.

Logic Cluster
A logic cluster is formed by grouping 12 logic elements.

Logic Element
The basic logic element in the RTG4 FPGA fabric, consisting of a 4-input LUT, a D-flip-flop, and a dedicated carry chain.

Flow-Through Read
A read operation performed with the output not being registered by the output pipeline registers.

Pipelined Read
A read operation performed with the output being registered by the output pipeline registers.

Simple Write
A write operation in which the data written does not appear on the SRAM output ports.

Feed-Through Write (Write-Bypass Write)
A write operation in which the data written appears on the SRAM output ports immediately in non-pipeline mode and in the next clock cycle in pipeline mode.

Dual-Port Mode
SRAM with two independent ports, through each of which both read and write operations can be performed.

Two-Port Mode
SRAM with two ports, one dedicated to read operations and the other dedicated to write operations.

Multi-Channeling
Multi-threading done for a chain of mathblocks.

Multi-Threading
Using a mathblock to perform more than one computation by time multiplexing it.

Pipelined Operation
The mode of operation where the mathblock output is registered at the pipeline registers.

STMR
Self-corrected triple module redundancy.

Transparent Mode
Non-registered/non-pipelined mode.

Inference
Using RTL to infer mathblocks.

Bus Keeper
Holds the signal on an I/O pin at its last driven state.

Hot Insertion
Capability to connect an I/O to external circuitry even after power-up.

Low Power Exit
Logic for the chip to come out of the low power state.

B – List of Changes

The following table shows important changes made in this document for each revision.

Date                        Changes                                                                            Page
Revision 2 (April 2015)     Updated the document with FTC inputs (SAR 63317).                                  NA
                            Updated "Features" section (SAR 65204).                                            18
                            Updated "Features" section and Table 5-3 (SAR 66075).                              18 and 66
                            Updated "ECC" section (SAR 65236).                                                 52
                            Updated Table 6-3 (SAR 64973).                                                     92
                            Updated Table 6-10 (SAR 64447).                                                    98
                            Updated "Low Voltage CMOS (LVCMOS)" section (SAR 64028).                           89
                            Added "Cold Sparing" and "Dedicated Global I/Os" sections (SAR 64970 and 64971).   100 and 106
Revision 1 (November 2014)  Updated "Programmable Slew Rate Control" section.                                  93
                            Added "uPROM" chapter.                                                             56
                            Replaced GSR_N signal with ARST_N (SAR 66394).                                     NA
                            Initial release.                                                                   NA

C – Product Support

Microsemi SoC Products Group backs its products with various support services, including Customer Service, Customer Technical Support Center, a website, electronic mail, and worldwide sales offices. This appendix contains information about contacting Microsemi SoC Products Group and using these support services.

Customer Service
Contact Customer Service for non-technical product support, such as product pricing, product upgrades, update information, order status, and authorization.
From North America, call 800.262.1060
From the rest of the world, call 650.318.4460
Fax, from anywhere in the world, 408.643.6913

Customer Technical Support Center
Microsemi SoC Products Group staffs its Customer Technical Support Center with highly skilled engineers who can help answer your hardware, software, and design questions about Microsemi SoC Products. The Customer Technical Support Center spends a great deal of time creating application notes, answers to common design cycle questions, documentation of known issues, and various FAQs. Before you contact us, please visit our online resources; it is very likely we have already answered your questions.

Technical Support
For Microsemi SoC Products Support, visit http://www.microsemi.com/products/fpga-soc/designsupport/fpga-soc-support

Website
You can browse a variety of technical and non-technical information on the SoC home page, at www.microsemi.com/soc.

Contacting the Customer Technical Support Center
Highly skilled engineers staff the Technical Support Center. The Technical Support Center can be contacted by email or through the Microsemi SoC Products Group website.

Email
You can communicate your technical questions to our email address and receive answers back by email, fax, or phone. Also, if you have design problems, you can email your design files to receive assistance. We constantly monitor the email account throughout the day. When sending your request to us, please be sure to include your full name, company name, and your contact information for efficient processing of your request.
The technical support email address is soc_tech@microsemi.com.

My Cases
Microsemi SoC Products Group customers may submit and track technical cases online by going to My Cases.

Outside the U.S.
Customers needing assistance outside the US time zones can either contact technical support via email (soc_tech@microsemi.com) or contact a local sales office. Sales office listings can be found at www.microsemi.com/soc/company/contact/default.aspx.

ITAR Technical Support
For technical support on RH and RT FPGAs that are regulated by International Traffic in Arms Regulations (ITAR), contact us via soc_tech_itar@microsemi.com. Alternatively, within My Cases, select Yes in the ITAR drop-down list. For a complete list of ITAR-regulated Microsemi FPGAs, visit the ITAR web page.

Microsemi Corporation (Nasdaq: MSCC) offers a comprehensive portfolio of semiconductor and system solutions for communications, defense & security, aerospace and industrial markets.
Products include high-performance and radiation-hardened analog mixed-signal integrated circuits, FPGAs, SoCs and ASICs; power management products; timing and synchronization devices and precise time solutions, setting the world’s standard for time; voice processing devices; RF solutions; discrete components; security technologies and scalable anti-tamper products; Power-over-Ethernet ICs and midspans; as well as custom design capabilities and services. Microsemi is headquartered in Aliso Viejo, Calif., and has approximately 3,400 employees globally. Learn more at www.microsemi.com.

Microsemi Corporate Headquarters
One Enterprise, Aliso Viejo, CA 92656 USA
Within the USA: +1 (800) 713-4113
Outside the USA: +1 (949) 380-6100
Sales: +1 (949) 380-6136
Fax: +1 (949) 215-4996
E-mail: sales.support@microsemi.com

© 2015 Microsemi Corporation. All rights reserved. Microsemi and the Microsemi logo are trademarks of Microsemi Corporation. All other trademarks and service marks are the property of their respective owners.

Microsemi makes no warranty, representation, or guarantee regarding the information contained herein or the suitability of its products and services for any particular purpose, nor does Microsemi assume any liability whatsoever arising out of the application or use of any product or circuit. The products sold hereunder and any other products sold by Microsemi have been subject to limited testing and should not be used in conjunction with mission-critical equipment or applications. Any performance specifications are believed to be reliable but are not verified, and Buyer must conduct and complete all performance and other testing of the products, alone and together with, or installed in, any end-products. Buyer shall not rely on any data and performance specifications or parameters provided by Microsemi. It is the Buyer's responsibility to independently determine suitability of any products and to test and verify the same.
The information provided by Microsemi hereunder is provided "as is, where is" and with all faults, and the entire risk associated with such information is entirely with the Buyer. Microsemi does not grant, explicitly or implicitly, to any party any patent rights, licenses, or any other IP rights, whether with regard to such information itself or anything described by such information. Information provided in this document is proprietary to Microsemi, and Microsemi reserves the right to make any changes to the information in this document or to any products and services at any time without notice.

50200574-2/04.15