ISU SENIOR DESIGN – TEAM MAY09-04

InfiniBand FPGA Project
Final Report

Matthew Wall, CprE, Team Lead
Ryan Schwarzkopf, EE, Communication Coordinator
Tim Prince, CprE
Alex Burds, CprE
Xiang Li, EE
Rachel Ayoroa, CprE

Troy Benjegerdes, Client
Dr. Morris Chang, Faculty Advisor

May 4, 2009

DISCLAIMER: This document was developed as part of the requirements of an electrical and computer engineering course at Iowa State University, Ames, Iowa. The document does not constitute a professional engineering design or a professional land surveying document. Although the information is intended to be accurate, the associated students, faculty, and Iowa State University make no claims, promises, or guarantees about the accuracy, completeness, quality, or adequacy of the information. Document users shall ensure that any such use does not violate any laws with regard to professional licensing and certification requirements. Such use includes any work resulting from this student-prepared document that is required to be under the responsible charge of a licensed engineer or surveyor. This document is copyrighted by the students who produced the document and the associated faculty advisors. No part may be reproduced without the written permission of the senior design course coordinator.

1 CONTENTS

1 Contents
2 Introduction
  2.1 Problem Statement
  2.2 Approach
3 Project Planning
  3.1 System Description
    3.1.1 Link Layer
    3.1.2 Physical Layer
  3.2 Requirements Analysis
    3.2.1 Functional Requirements
    3.2.2 Non-Functional Requirements
  3.3 Project Resources
    3.3.1 Human Resources
    3.3.2 Other Resources
  3.4 Schedule
    3.4.1 Work Breakdown
    3.4.2 Milestones
  3.5 Deliverables
4 System Design
  4.1.1 System Interface
  4.2 Link Layer Architecture
    4.2.1 Top-level Architecture
    4.2.2 Link State Machine
    4.2.3 Packet Receiver State Machine
    4.2.4 Data Packet Check Machine
    4.2.5 Link Packet Check Machine
    4.2.6 Virtual Lanes
    4.2.7 Virtual Lane Selector
    4.2.8 Link Packet Generator
    4.2.9 Packet Priority Selector
  4.3 Physical Layer Architecture
    4.3.1 Top-level Architecture
    4.3.2 Link/Physical Layer
    4.3.3 RocketIO Interface
5 Test Strategy
  5.1 Test Environment
  5.2 Unit Testing
  5.3 Physical Testing
  5.4 Test Harness
    5.4.1 System Architecture
  5.5 Test Harness Architecture
    5.5.1 PLB Interface
    5.5.2 RX and TX FIFOs
6 Conclusions
  6.1 Accomplishments
    6.1.1 Link Layer
    6.1.2 Physical Layer
    6.1.3 Integration
    6.1.4 Testing
  6.2 Challenges
  6.3 Lessons Learned
  6.4 Future Work
  6.5 Acknowledgements
7 Bibliography

2 INTRODUCTION

The InfiniBand architecture is a multigigabit, low-latency serial communications interconnect developed by the InfiniBand Trade Association, a consortium of IBM, Cisco, Intel, and other corporations. It is used primarily in high-performance computing, and includes facilities for centralized management, data rates as fast as 96 gigabits per second, and quality of service, among others.

2.1 PROBLEM STATEMENT

Currently, only proprietary implementations of InfiniBand hardware exist, limiting users to a handful of vendors and limiting further research into the InfiniBand architecture by independent researchers. Our client, Troy Benjegerdes of the Scalable Computing Laboratory (SCL) at Ames Laboratory, would like to be able to perform experiments on existing InfiniBand hardware, such as injecting errors into the data stream, as well as experiment with the effect of various modifications to portions of the specification.
Figure 1 - An InfiniBand switch panel

2.2 APPROACH

To address the lack of open InfiniBand implementations, this project endeavors to create an open-source hardware implementation of the InfiniBand architecture. Last year's project succeeded in laying the foundation for the implementation of the InfiniBand link layer. This year's project sought to complete the link layer implementation and develop the physical layer.

Figure 2 - An IBM cluster from the Ames Laboratory SCL

3 PROJECT PLANNING

During the previous semester, our primary concern was designing our system and planning the development cycle. This section describes the objectives of our project, our implementation schedule, project resources, and so on.

3.1 SYSTEM DESCRIPTION

Figure 3 depicts the top-level block diagram of our system. This project initially planned to implement a complete InfiniBand switch, which would include the layers in the diagram as well as a third switch-controller layer. As development progressed, the immensity of completing this task became apparent, and our scope was scaled back to include just the link and physical layers.

3.1.1 LINK LAYER

The link layer provides a number of services for the upper layers, including data verification and error detection, flow control, and buffering, among others. The previous project provided the groundwork for this layer, and our project built upon their foundation.

3.1.2 PHYSICAL LAYER

The physical layer is primarily responsible for converting the symbols sent by the link layer into physical signals for communicating with other InfiniBand devices. To that end, it is also responsible for configuring the link between InfiniBand devices, maintaining the link state, detecting symbol errors, and so on. The detailed design of the link and physical layers is discussed in sections 4.2 and 4.3, respectively.
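The division of responsibilities between the two layers can be sketched as a toy model. This is a behavioral illustration in Python rather than the project's VHDL, and every name in it (class names, the delimiter labels, the `transmit` method) is a hypothetical stand-in:

```python
# Behavioral sketch (not the project's VHDL) of the two-layer split
# described above. All names here are hypothetical illustrations.

class PhysicalLayer:
    """Converts link-layer symbols into line signals (stubbed)."""
    def __init__(self):
        self.wire = []  # stands in for the serial link

    def transmit(self, symbol):
        # A real physical layer would 8b/10b-encode and serialize here.
        self.wire.append(symbol)

class LinkLayer:
    """Frames packets with start/end delimiters and hands symbols down."""
    SDP, EGP = "SDP", "EGP"  # start / end packet delimiter labels

    def __init__(self, phy):
        self.phy = phy

    def send_packet(self, payload_bytes):
        self.phy.transmit(self.SDP)
        for b in payload_bytes:
            self.phy.transmit(b)
        self.phy.transmit(self.EGP)

phy = PhysicalLayer()
link = LinkLayer(phy)
link.send_packet([0x10, 0x20])
print(phy.wire)  # ['SDP', 16, 32, 'EGP']
```

The point of the split is the same as in the report: the link layer deals in packets and delimiters, while the physical layer deals in the serial medium.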
Figure 3 – Top-level System Block Diagram (InfiniBand core: link layer and physical layer, connected through the Xilinx RocketIO to the fabric)

3.2 REQUIREMENTS ANALYSIS

One of the first tasks addressed during the planning phase was to perform a requirements analysis of our system. The majority of our requirements focus on following the InfiniBand specifications and ensuring that our end product is as free and open as possible.

3.2.1 FUNCTIONAL REQUIREMENTS

FR01. The system shall provide InfiniBand-compliant input and output signals.
Rationale: This requirement follows from the ultimate objective of this project: to provide a complete hardware implementation of the InfiniBand architecture.

FR02. The system shall be capable of transmitting and receiving packets according to the InfiniBand specifications.
Rationale: See rationale for FR01.

FR03. The system should recognize and handle all errors defined by the InfiniBand specifications.
Rationale: See rationale for FR01.

3.2.2 NON-FUNCTIONAL REQUIREMENTS

NFR01. The system shall be implemented using VHDL.
Rationale: The client requested that the system be written in VHDL. Furthermore, using VHDL greatly simplifies the design of digital circuits as compared to schematic-based design.

NFR02. The system shall be distributed under an open-source license.
Rationale: To facilitate research, we would like the system to be as open as possible. Using an open-source license ensures that the system can be used and modified freely.

NFR03. The system should use only standard VHDL libraries.
Rationale: Using non-standard VHDL libraries has the effect of limiting development to a specific vendor. By using only standard VHDL libraries, we can remain vendor-agnostic and ensure that the system is as open as possible.

NFR04. The system should be compatible with open-source VHDL simulators.
Rationale: By attempting to maintain compatibility with open-source simulators, we can further distance ourselves from being tied to a particular vendor.
3.3 PROJECT RESOURCES

Our project has had access to a number of resources which aided in our development process. Those resources are listed here.

3.3.1 HUMAN RESOURCES

The table below lists all the human resources available to our project.

Name                  Role
Matthew Wall          Project lead, link layer developer
Alex Burds            Link layer developer
Ryan Schwarzkopf      Communications coordinator, tester
Tim Prince            Physical layer developer
Xiang Li              Tester
Rachel Ayoroa         Link layer developer
Troy Benjegerdes      Client
Dr. Morris Chang      Faculty advisor
Jason Boyd            Lab coordinator
Dr. Ahmed Kamal       Senior design instructor
Dr. Phillip Jones     Equipment contributor

3.3.2 OTHER RESOURCES

In addition to our human resources, we also had access to laboratories, equipment, and software provided by the department. The department labs provided us with Xilinx ISE for development and synthesis, and ModelSim PE for simulation testing. In addition to these resources, Dr. Jones graciously provided us with access to a Virtex-5-based Xilinx ML507 development board to replace our original Virtex-II-based boards. The department also allowed us access to a giga-sample oscilloscope and SMA cables and connectors, which proved invaluable for debugging our gigabit transceivers.

3.4 SCHEDULE

In the previous semester, our project identified a possible schedule for development. During this semester, as work progressed, it became apparent that the original schedule was infeasible. To that end, we constructed a new schedule with milestones based on the tasks we identified as remaining and the milestones we had already completed.
Figure 4 - Xilinx ML507 Board

Figure 5 - Work breakdown (top-level tasks: Planning, Documentation, Design, Implementation, and Testing, each broken into subtasks)

3.4.1 WORK BREAKDOWN

When we constructed our original schedule, we broke the work down into a number of tasks. Figure 5 shows how work was broken down into tasks. When the original schedule was dropped, we constructed a new milestone-based schedule, with each milestone building upon the previous. The revised milestone schedule is listed below.

3.4.2 MILESTONES

When the new schedule was created, we identified a series of milestones and approximate feasible dates for the completion of those milestones. The new schedule is given in Figure 6.

1/12/2009 – Implementation begins
4/3/2009 – Layers implemented
4/22/2009 – Layers tested
4/26/2009 – System integration completed
4/26/2009 – Integration testing completed
4/29/2009 – System testing completed
5/4/2009 – Documentation completed

Figure 6 – Schedule milestones

3.5 DELIVERABLES

The deliverables for our project are:
1. A CD containing:
   a. Source code
   b. Compilation instructions
   c. User's manual
2. Code submitted to OpenCores.org and Google Code.

4 SYSTEM DESIGN

The design of our system consists of several major components, shown in Figure 3. The bulk of the project was found in the InfiniBand HDL implementation containing the link and physical layers. These layers make up the IP core described by the InfiniBand specification. Sections 4.2 and 4.3 detail the design and functionality of these layers. The Xilinx FPGA chip is the programmable Virtex-5 device onto which we loaded our VHDL design.
The specific board used for loading our InfiniBand implementation should be easily interchangeable, so that our open-source InfiniBand HDL implementation is usable by many different people on many different boards. The Xilinx RocketIO transceiver is the adapter we used to establish connections between boards programmed with our InfiniBand HDL implementation, or, in our case, to establish a loopback to itself. The RocketIO transceiver has a specific wrapper allowing it to interface with our InfiniBand HDL implementation, but this wrapper could be rewritten to accommodate other transceivers, making this IP core usable by a wider range of developers. The RocketIO transceiver and wrapper are examined in further detail in section 4.3.3. The fabric represents the external connections that exist between this FPGA chip and any other InfiniBand implementations or devices that could conceivably be connected to ours. Within the scope of this project, the fabric was limited to a serial loopback to our Virtex-5 board, since we only had one board to work with at this time.

4.1.1 SYSTEM INTERFACE

The IP core has two external interfaces: the physical interface and the link layer interface. The physical layer interface consists of two low-voltage differential signals (LVDS) which carry the transmit and receive signals from RocketIO. The link layer interface consists of two virtual lanes, numbered VL0 and VL15, which provide packet buffering between the link and upper layers.

4.2 LINK LAYER ARCHITECTURE

As described in section 3.1.1, the link layer is responsible for providing error detection, flow control, buffering, and other services to the upper layers. This section describes how these services are designed and implemented.

4.2.1 TOP-LEVEL ARCHITECTURE

A top-level design of the link layer is shown in Figure 7.
There are eight main components of the link layer:

Link State Machine
Packet Receiver State Machine
Data Packet Check Machine
Link Packet Check Machine
Virtual Lanes
Virtual Lane Selector
Link Packet Generator
Packet Priority Selector

These components are described in detail in the sections that follow.

Figure 7 - Top-level diagram of the Link layer

4.2.2 LINK STATE MACHINE

The link state machine controls which state the link layer is in. In order for either link packets or data packets to be processed, the link state machine must be initialized and pass all checks (enables, triggers, etc.) before processing a packet. Figure 8 shows the state transitions and what functionality is enabled when the link layer is brought to each state. For a complete description of the link state machine, see [1].

4.2.2.1 LINK STATE MACHINE INPUTS

Reset – When asserted, the link layer returns to the initialize state.

RemoteInit – A link packet with the flow control initialize OP code has been received and has passed the checks of the link packet check machine.

CPortState – A value that indicates commands from management to change the port state. There are three valid states.
• Down – The connection between the physical layer and link layer is currently inactive.
• Arm – The connection between the physical layer and link layer has been initialized.
• Active – An ongoing connection between the physical layer and link layer.

ActiveEnable – Prevents the Active state of CPortState from occurring too quickly. ActiveEnable is asserted when a link packet with a normal OP code is received.

ActiveTrigger – Allows the transition between the Arm and Active states. ActiveTrigger only deals with non-VL15 packets that pass the VCRC check.
LinkDownTimeout – A timeout indicating that the link state machine has been down long enough to cause the link to transition to the LinkDownState.

PhyLink – When asserted, notifies the link layer that the physical layer is active. When not asserted, the link layer stays in its initial state.

Figure 8 - Link state machine state diagram

4.2.2.2 LINK STATE MACHINE OUTPUTS

DataPktXmitEnable - Indicates the link layer's action with respect to transmission of non-Subnet Management Packet (SMP) data packets. When true, transmission of non-SMP data packets is enabled. When false, non-SMP data packets submitted to the link layer for transmission are discarded.

DataPktRcvEnable - Indicates the link layer's action with respect to reception of non-SMP data packets from the physical layer. When true, reception of non-SMP data packets is enabled. When false, non-SMP data packets received from the physical layer are discarded.

SMPEnable - Indicates the link layer's action with respect to transmission and reception of subnet management packets. When true, transmission and reception of SMPs are enabled. When false, SMPs submitted to the link layer for transmission or reception are discarded.

LinkPktEnable - Indicates the link layer's action with respect to transmission and reception of link packets. When true, transmission and reception of link packets are enabled. When false, link packets are not generated by the link layer and any link packets received are discarded.

PortState - The value of the PortState component of the PortInfo attribute. Valid values are "Down", "Initialize", "Arm", and "Active".

4.2.3 PACKET RECEIVER STATE MACHINE

The packet receiver state machine determines the state of the incoming data stream from the physical layer. The module determines the type of incoming packet so that the packet check machines and other modules can respond to the packet type.
If an error occurs in the data stream or a stream is marked bad by the physical layer, an error is indicated and the incoming packet is dropped. The state diagram is shown in Figure 9. For a complete description of the packet receiver state machine, see [1].

4.2.3.1 PACKET RECEIVER STATE MACHINE INPUTS

Reset – An internal signal to reset the interface
PhyLink – The physical link status from the physical layer
RcvStream – Input from the physical layer with packet type information
PortState – Value of the PortState component of PortInfo

The machine's states are:

Down – Indicates to the link layer that the physical layer is down
Initialize – Indicates there are packets to be received and initializes all starting values
Arm – Responsible for receiving packet info
Active – Responsible for processing packet info

4.2.4 DATA PACKET CHECK MACHINE

The data packet check machine performs checks on incoming data packets. The first check is the variant and invariant CRC check, to determine whether the data packet maintained data integrity during transmission. The next check determines whether a packet marked for virtual lane 15 is in fact a subnet management packet. The length check determines whether the packet length matches the length field in the header. Finally, the destination local identifier check determines whether the packet was destined for the current port. If any errors are indicated by the module, the system drops the packet. For a complete description of the data packet check machine, see [1].

4.2.5 LINK PACKET CHECK MACHINE

The link packet check machine performs checks on incoming link packets. The first check is the CRC check, to determine whether the link packet maintained data integrity during transmission. The next check is the length check, to see whether the packet is six bytes long. Finally, the virtual lane check determines whether the destination virtual lane of the packet exists on the port. If any errors are indicated by the module, the system drops the packet.
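The three link packet checks above (CRC, length, virtual lane) can be sketched as a small behavioral model. This is Python for illustration only, not the project's VHDL; the CRC function below is a toy stand-in for the real 16-bit link packet CRC, and the field layout is a simplified assumption:

```python
# Behavioral sketch of the link packet checks described above.
# Python model for illustration only (the project itself is VHDL);
# the CRC function and field layout are simplified assumptions.

def toy_crc16(data):
    # Toy stand-in for the real 16-bit link packet CRC.
    crc = 0xFFFF
    for b in data:
        crc = ((crc << 1) ^ b) & 0xFFFF
    return crc

LINK_PACKET_LEN = 6  # link packets are exactly six bytes

def check_link_packet(packet, received_crc, port_vls):
    """Return True only if the packet passes the CRC, length, and VL checks."""
    if toy_crc16(packet) != received_crc:  # CRC check
        return False
    if len(packet) != LINK_PACKET_LEN:     # length check
        return False
    vl = packet[0] >> 4                    # assumed: VL in high nibble of byte 0
    return vl in port_vls                  # does the VL exist on this port?

pkt = [0x00, 0x01, 0x00, 0x10, 0x00, 0x20]
print(check_link_packet(pkt, toy_crc16(pkt), port_vls={0, 15}))  # True
```

A failing check short-circuits to `False`, which corresponds to the module signaling an error and the system dropping the packet.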
For a complete description of the link packet check machine, see [1].

4.2.6 VIRTUAL LANES

A virtual lane (VL) provides buffering for the link layer. There can be up to 16 virtual lanes in an InfiniBand port: 15 data virtual lanes, numbered 0-14, and one reserved for subnet management packets, numbered 15. There must be a minimum of one data virtual lane plus VL15.

A virtual lane consists of a circular buffer with three indexes: read, write, and temp write. The temp write index indicates where bytes from the incoming data stream are written. The write index is set to the temp write index once the incoming packet is confirmed valid by the data packet check machine. If the packet has an error, the temp write index is instead set back to the write index, dropping the packet. When a virtual lane receives a link packet, it is stored in a separate buffer until the packet is validated and can be consumed by the virtual lane. The procedure for using a link packet is described in the next section.

Figure 9 - Packet receiver state machine state diagram

4.2.6.1 FLOW CONTROL

Flow control is implemented in the link layer to ensure that no packets need to be retransmitted due to buffer overflow on the receiving end. VL15 is not subject to flow control. Each virtual lane keeps track of the Adjusted Blocks Received (ABR) and Flow Control Total Blocks Sent (FCTBS). FCTBS is the number of blocks (64 bytes each) sent by the virtual lane since link initialization. ABR is set to the value of FCTBS for each link packet received by the virtual lane, and then incremented by the number of blocks received for each data packet received by the virtual lane. When a link packet is transmitted, the Flow Control Credit Limit (FCCL) is calculated by taking the ABR plus the number of empty blocks left on the virtual lane, up to 2048.

Let CR represent the total blocks sent since link initialization plus the number of blocks in the data packet to be transmitted. Let CL represent the last FCCL received in a link packet. If (CL - CR) modulo 4096 <= 2048, then the data packet may be transmitted by the virtual lane. If not, the packet must wait until the condition becomes true. For example, if the last FCCL received is 100, 90 blocks have been sent since initialization, and the pending packet is 5 blocks, then (100 - 95) modulo 4096 = 5 <= 2048, so the packet may be transmitted. For a complete description of flow control, see [1].

4.2.7 VIRTUAL LANE SELECTOR

The virtual lane selector routes the incoming data stream to the appropriate virtual lane based on the packet header. The InfiniBand specifications dictate that the first four bits of an InfiniBand packet are the virtual lane number. When a packet start delimiter is received from the physical layer, the virtual lane selector reads the virtual lane number from the first byte received and enables writing on the indicated virtual lane until a packet end delimiter is received. If the virtual lane indicated by the incoming packet does not exist on the port, the packet is ignored. Link packets are routed to all data virtual lanes on the port.

4.2.8 LINK PACKET GENERATOR

The link packet generator creates a flow control packet to be sent to the device on the other end of the link. Each data virtual lane has a link packet generator. A link packet is generated whenever indicated by the associated virtual lane, which is at least every 65,536 clock cycles. The generator receives the FCCL and FCTBS fields from the associated virtual lane along with the VL number. The link packet is then appended with a 16-bit CRC. Once a packet is generated, the module indicates to the packet priority selector that a packet is ready to be transmitted. Link packets are not generated if not enabled by the link state machine.

4.2.9 PACKET PRIORITY SELECTOR

The packet priority selector feeds transmitting packets to the physical layer from multiple sources. If no sources have data to transmit, the module remains idle. If one or more sources have data to transmit, the packet priority selector first selects which source to service. Link packets have first priority, followed by subnet management packets, and finally data packets. If more than one virtual lane exists in the architecture, they are serviced in a round-robin manner. Once a source is selected, the module sends a start packet delimiter to the physical layer. The module then sends the source data to the physical layer. The packet priority selector counts the number of bytes sent and stops sending once the count matches the packet length indicated in the packet header. Finally, the module sends a packet end delimiter.

4.3 PHYSICAL LAYER ARCHITECTURE

As described in section 3.1.2, the physical layer is responsible for maintaining the state of the link and converting data from the link layer into physical signals compatible with other InfiniBand devices. This section describes how these services are implemented.

4.3.1 TOP-LEVEL ARCHITECTURE

A top-level diagram of the physical layer is given in Figure 10. There are three main components of the physical layer:

Link/Physical layer
RocketIO wrapper
RocketIO transceiver

These components are described in detail in the sections that follow.

Figure 10 – Top-level diagram of the Physical layer

4.3.2 LINK/PHYSICAL LAYER

The Link/Physical layer constitutes the majority of the logic in the physical layer. It includes the logic for configuring the link between InfiniBand devices and maintaining the link once configured. The internals of the link/physical layer are depicted in Figure 11. The majority of the logic is implemented by the link training state machine (LTSM), which keeps track of the current link status, determines which data should be sent at any given time, and controls which control sequence generators are active.

4.3.2.1 CONTROL SEQUENCE GENERATORS
4.3.2.1 CONTROL SEQUENCE GENERATORS

The control sequence generators are three components which generate the physical control sequences defined by the InfiniBand specifications: the TS1 control sequence, the TS2 control sequence, and the skip sequence. Each is currently implemented by a simple lookup table. A fourth sequence, the heartbeat sequence, is not currently implemented, since we do not implement the high-speed signaling option. For a detailed description of each control sequence, see [1].

Figure 11 - Link/Physical layer internals

4.3.2.2 CONTROL SEQUENCE DETECTORS

For each implemented sequence generator, there is also a corresponding sequence detector. Since our current implementation supports only single data rate communication and 1X links, these components also amount to simple lookup tables that verify that the received sequence matches what is expected.

4.3.2.3 LINK TRAINING STATE MACHINE

The link training state machine is the control center of the link/physical layer. It uses the signals provided by RocketIO and the control sequence detectors to maintain the state of the link. Figure 12 gives a top-level state diagram of the state machine; the states are described in the following sections. For a complete description of the LTSM, see [1].

Figure 12 - Link training state machine state diagram

4.3.2.3.1 POLLING SUPER-STATE

In this super-state, the link/physical layer periodically sends TS1 control sequences to the other endpoint and waits for incoming TS1 sequences. When a TS1 sequence is detected, the LTSM moves on to configuring the link.
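The transitions of Figure 12 can be captured in a small table; the state and event names below paraphrase the diagram labels and are not identifiers from our VHDL.

```python
# (current state, event) -> next state, following the edges of Figure 12
LTSM_TRANSITIONS = {
    ("Polling", "received_ts1"): "Configuration",
    ("Configuration", "config_failed"): "Polling",
    ("Configuration", "config_ok"): "LinkUp",
    ("LinkUp", "major_error"): "Recovery",
    ("LinkUp", "link_initiated_recovery"): "Recovery",
    ("Recovery", "recovery_ok"): "LinkUp",
    ("Recovery", "recovery_failed"): "Polling",
}

def ltsm_step(state: str, event: str) -> str:
    """Advance the LTSM one step; events with no matching edge leave the state unchanged."""
    return LTSM_TRANSITIONS.get((state, event), state)
```

A table-driven state machine like this mirrors how the VHDL case statement is organized: each row corresponds to one labeled arrow in the state diagram, which makes the implementation easy to check against Figure 12.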
4.3.2.3.2 CONFIGURATION SUPER-STATE

In this super-state, the link/physical layer negotiates with the other endpoint to achieve the maximum possible bandwidth. This is done by exchanging TS1 and TS2 control sequences carrying the appropriate information. Once the link is configured and idle data is received, the system moves into the link-up state.

4.3.2.3.3 LINK-UP STATE

In the link-up state, data may be sent over the link by upper layers. Skip sequences are periodically inserted to allow for correction of clock skew, and idle data is sent whenever the link layer is idle.

4.3.2.3.4 RECOVERY STATE

In this state, the link/physical layer attempts to regain connection with the other endpoint and recover from any major errors that may occur. This entails sending TS1 and TS2 control sequences to resynchronize communications and ensure agreement upon the configuration of the link.

4.3.3 ROCKETIO INTERFACE

The RocketIO transceiver is a critical part of the physical layer. It provides the physical layer with facilities for clock recovery, 8b/10b encoding, and gigabit transmission.

4.3.3.1 ROCKETIO TRANSCEIVER

The RocketIO transceiver itself is a physical core available on the Xilinx Virtex line of FPGAs, instantiated in Xilinx ISE using the coregen software. Because RocketIO is a proprietary IP core, we also developed the RocketIO wrapper to avoid tying ourselves to a specific vendor.

4.3.3.2 ROCKETIO WRAPPER

The RocketIO wrapper serves to abstract the proprietary Xilinx modules out of the physical layer, allowing them to be replaced by a similar module from an alternate vendor. Because we lacked access to an Altera development board with a gigabit transceiver this semester, the current iteration of the transceiver interface is likely still very Xilinx specific and will require modification to work with Altera transceivers in the future.
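The link-up transmit behavior of section 4.3.2.3.3 — link-layer data when available, idle symbols otherwise, with periodic skip sequences for clock-skew compensation — can be modeled as a symbol-clock counter. The interval constant below is a placeholder of our choosing; the actual skip interval is mandated by the specification [1].

```python
SKIP_INTERVAL = 4608  # placeholder symbol-clock interval; see [1] for the mandated value

def link_up_tx(link_layer_symbols, n_clocks):
    """Produce n_clocks symbols for transmission in the link-up state: a skip
    sequence on every SKIP_INTERVAL-th clock, link-layer data when available,
    and idle symbols otherwise."""
    it = iter(link_layer_symbols)
    out = []
    for clock in range(1, n_clocks + 1):
        if clock % SKIP_INTERVAL == 0:
            out.append("SKP")             # skip sequence for clock-skew correction
        else:
            out.append(next(it, "IDLE"))  # data if the link layer has any, else idle
    return out
```

Because the skip sequence is inserted on a fixed schedule regardless of whether data is flowing, the receiver can rely on it to periodically re-align its recovered clock.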
5 TEST STRATEGY

This section details our testing and verification strategy. This semester, we focused on testing our designs to ensure that they worked as we expected. In a future project, further verification against the specification will be necessary, so this section also details the test harness, which is designed to simplify the task of verifying our code against the specifications.

5.1 TEST ENVIRONMENT

Our test environment was ModelSim, used to simulate the VHDL modules we coded for our design. For integration testing, we planned to program our layers onto the Virtex-5 FPGA and test the system as a whole on the board. Figure 13 shows an example waveform from the ModelSim simulator, capturing the signals of the physical layer as it moves from the initial state to the link-up state.

5.2 UNIT TESTING

The first step of our test plan was unit testing: simulating each module in ModelSim to ensure that it performed as intended. The emphasis of this step was not to verify each module against the InfiniBand specifications, but rather to ensure that each module worked according to the designer's interpretation of the specification. Verification of the system against the InfiniBand specification will be performed in later phases of testing.

5.3 PHYSICAL TESTING

The second stage of our test plan was testing the physical-to-physical connection. The goal of this testing was to ensure that we could establish and maintain a connection between two physical layers and transfer a data stream between them. We possessed only one Virtex-5 board to program the physical layer on, so we implemented a serial loopback on the same board for testing purposes.
Figure 13 – Example simulation waveform

5.4 TEST HARNESS

The InfiniBand specification is a two-part document consisting of several thousand pages. As one might expect, verifying that a particular implementation meets these specifications can be a difficult, time-consuming, and expensive process. To facilitate full-scale verification of the system, we have developed a test harness. The test harness will allow a future development team to verify and expand upon our implementation in a straightforward manner.

5.4.1 SYSTEM ARCHITECTURE

Figure 14 provides a top-level diagram of the test harness architecture. A PowerPC 440 core on a Virtex-5 FPGA is interfaced to one or more InfiniBand test harnesses over the IBM Processor Local Bus (PLB). Each test harness contains an instance of the InfiniBand link and physical layers, as well as an instance of the Xilinx RocketIO gigabit transceiver IP core.

Figure 14 - Test Harness System

5.5 TEST HARNESS ARCHITECTURE

The test harness itself includes an instance of the InfiniBand link and physical layers, a PLB control interface, and a variety of programmable interconnects to provide the maximum possible level of flexibility for the user. Figure 15 depicts the basic architecture of the test harness. The PLB interface connects the control interface of the test harness to the PLB, the control register allows the user to program the various interconnects and control the operation of the test harness, and the TX and RX FIFOs allow the user to inject and extract data into and out of the data path at various points.

5.5.1 PLB INTERFACE

The PLB interface handles the translation of PLB signals into a format compatible with the test harness interface. The test harness appears to the PLB as three memory-mapped registers.
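A software-side model ties the harness pieces together: three PLB-visible registers, a TX FIFO that injects queued words one per cycle, and an RX FIFO that captures the last 8k words of the tapped bus (section 5.5.2). The register offsets and the exact mapping of the three registers to control, TX, and RX are assumptions for illustration only.

```python
from collections import deque

class TestHarnessModel:
    """Illustrative model of the test harness's PLB-visible state; offsets assumed."""
    CONTROL, TX_FIFO, RX_FIFO = 0x0, 0x4, 0x8
    RX_DEPTH = 8192  # the RX FIFO retains the most recent 8k words of the tapped bus

    def __init__(self):
        self.control = 0          # tap-point selections and control bits
        self.tx_fifo = deque()    # words queued for injection into the data path
        self.rx_fifo = deque(maxlen=self.RX_DEPTH)  # oldest word ejected when full

    def write(self, offset, word):
        if offset == self.CONTROL:
            self.control = word
        elif offset == self.TX_FIFO:
            self.tx_fifo.append(word)

    def read(self, offset):
        if offset == self.CONTROL:
            return self.control
        if offset == self.RX_FIFO:
            return self.rx_fifo.popleft() if self.rx_fifo else 0
        return 0

    def bus_cycle(self, rx_word):
        """One cycle of the tapped bus: capture one word into RX and, if the
        TX FIFO is non-empty, inject one word into the data path."""
        self.rx_fifo.append(rx_word)
        return self.tx_fifo.popleft() if self.tx_fifo else None
```

The bounded `deque` reproduces the capture FIFO's behavior of silently ejecting the oldest word when full, so draining it always yields the most recent history of the connected bus.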
Figure 15 - Test Harness Internals

5.5.2 RX AND TX FIFOS

The RX and TX FIFOs allow the user to inject and extract data to and from various points on the data path. To inject data into a point on the data path, the user first fills the FIFO buffer with the data to be injected and then sets the TX tap point in the control register. The data from the FIFO is injected into the data path at a rate of one word per cycle until the FIFO is empty. Once the FIFO is empty, the TX tap point is automatically reset so that no further data is injected. While data is being read from the FIFO, the user may continue to add data to it. To extract data from the data path, the user configures the RX tap point to the appropriate position. Data from the data path is then written to the RX FIFO at a rate of one word per cycle until the RX FIFO is full, at which point the oldest data in the FIFO is ejected. The end result is that the RX FIFO contains the last 8k values of the connected bus.

6 CONCLUSION

This section details the end result of our efforts, including what is done, what remains to be completed, and the lessons we learned through our work on this project.

6.1 ACCOMPLISHMENTS

This section outlines in detail the progress made towards a working IP core that follows the InfiniBand specifications.

6.1.1 LINK LAYER

Currently, all of the main components of the link layer shown in the link layer block diagram have been implemented, including all of the sub-modules required for each of these main components. Integration of the link layer modules still requires simulation testing and detailed verification against the InfiniBand specifications.

6.1.2 PHYSICAL LAYER

All components of the physical layer have been unit tested successfully. Integration has been completed and tested. We have successfully transferred data at a line rate of 1.5 gigabits per second over RocketIO.
The task of verifying the physical layer against the InfiniBand specifications remains to be completed.

6.1.3 INTEGRATION

The physical layer has been fully integrated and has been successful in maintaining a link and sending data over a serial loopback connection. The link layer still requires the working modules from its block diagram to be integrated, along with testing to verify that the integration was successful. Following completion of the link layer integration, integration of the link layer and the physical layer can then be completed.

6.1.4 TESTING

Testing of the VHDL modules within the ModelSim simulator was completed on the link and physical layers of the design, although some simulation may still be required on the link layer once all modules are fully integrated. The test harness is only partially complete: the link layer has not yet been integrated, and the PLB interface is incomplete. We were able to test a connection between two physical layers by programming the physical layer onto the Virtex-5 FPGA and running a serial loopback connection. We demonstrated that a connection could be established between the physical layers and that a data stream could be transmitted over the connection.

6.2 CHALLENGES

Some of the challenges this project faced included an extensive amount of VHDL coding and a lack of experience and knowledge among some of the group members. This challenge was compounded by the loss of a team member with VHDL experience following the first semester of the project. Another challenge involved the Virtex-II FPGA we originally intended to use. The Virtex-II boards in the lab worked only with computers running an older version of Xilinx ISE, and they used an older version of the RocketIO transceiver. Because of these challenges, we moved to a newer Virtex-5 board.
However, because we only had access to one Virtex-5 board, we were unable to test our project by communicating between two boards; instead, we performed a serial loopback on the board for testing purposes.

6.3 LESSONS LEARNED

By working on this project, our team learned the importance of planning and evaluation to the success of a project. This project was very complex, with a very detailed specification to follow; effective and accurate evaluation of the project's difficulty and complexity would have helped to identify project risk and greatly increased the chances of success. Another important lesson was the importance of decomposing complex and difficult tasks to make them more feasible, and then setting strict deadlines for their completion. Finally, the end-of-semester goal for the project was somewhat ambiguous; a more detailed, specific end goal would have allowed for a less subjective measure of project success.

6.4 FUTURE WORK

Subsequent work required to continue this project includes complete integration of all of the link layer modules, and then integration of the physical layer with the link layer. Once this work has been performed, the next step would be detailed verification of the design against the InfiniBand specifications. Another option for continued work is porting the design to transceivers other than RocketIO. Finally, some work could be done to create an open hardware development platform; currently, we are tied to development boards designed by Xilinx and Altera, which may or may not have all the components necessary for physically implementing a complete InfiniBand endpoint.

6.5 ACKNOWLEDGEMENTS

As a final note, we would like to express our thanks to all those who helped to make our project a success: Troy Benjegerdes and the Ames Laboratory SCL, Dr. Morris Chang, Dr. Phillip Jones, Jason Boyd, Dr. Ahmed Kamal, and the ECpE Department.

7 BIBLIOGRAPHY

[1] InfiniBand Trade Association, InfiniBand Architecture Specification, Volumes 1 and 2, Release 1.2. InfiniBand Trade Association, 2004.

[2] InfiniBand Trade Association. (2009, Jan.) InfiniBand Trade Association Website. [Online]. http://www.infinibandta.org/about/about_infiniband