DOLPHIN DX ADAPTER
User Guide

www.opal-rt.com
1751 Richardson, suite 2525
Montréal (Québec) Canada H3K 1G6
© 2016 All rights reserved
Printed in Canada

Contents

REVISION HISTORY
INTRODUCTION
  REQUIREMENTS AND PLANNING
    Supported hardware
  DECIDE ON DOLPHIN INTERCONNECT TOPOLOGY
DOLPHIN DX ADAPTER INSTALLATION
SOFTWARE INSTALLATION AND CONFIGURATION
  MANUAL INSTALLATION
    Build RPMs
    Pre-compiled Dolphin RPMs – NOT YET SUPPORTED
  AUTOMATIC INSTALLATION
    Requirements
  SOFTWARE REMOVAL
    Dolphin SIA available
    Dolphin SIA not available
  CLUSTER CONFIGURATION
  DOLPHIN DX ADAPTER CONFIGURATION
  CABLING INSTRUCTIONS
    2 nodes cluster
    3 or more nodes cluster
  VERIFYING FUNCTIONALITY AND PERFORMANCE
    Availability of drivers and services
  CABLE CONNECTION TEST
  STATIC INTERCONNECT TEST
  INTERCONNECT PERFORMANCE TEST
TROUBLESHOOTING
TERMINOLOGY

RT-LAB 61850 Driver Test

REVISION HISTORY

Version   Author               Date
1.0       Cristina Olariu      February 3, 2009
1.1       Cristina Olariu      August 4, 2009
1.2       Cristina Olariu      December 10, 2009
1.3       Cristina Olariu      February 26, 2010
1.4       Cristina Olariu      19 March 2010
1.5       Cristina Olariu      December 2, 2010
1.6       François Berthelot   January 10th, 2012

Changes

- Corrections/updates based on observations from prod. group (after a setup of 4 nodes was completed).
- Updates for Opal RedHat.
- Updates concerning the build procedure: export variables that point to the kernel source.
- The SIA for DX boards doesn't support the D352!
- Develop script that automates Dolphin software installation.
- Add details to help set up a 2-node direct connection.

INTRODUCTION

The Dolphin DX board lets a process running on one machine write data directly into the address space of another process running on a remote machine. This is characteristic of SCI technology, and the CPU achieves it through direct load/store operations.

REQUIREMENTS AND PLANNING

SUPPORTED HARDWARE

The Dolphin DX adapter operates in any x86 or x86_64 PC architecture that offers compliant PCI-Express slots.

DECIDE ON DOLPHIN INTERCONNECT TOPOLOGY

- 2-node cluster: two nodes can be connected directly using one or two CX4 cables, with a PCI Express link width of x4 (single cable) or x8 (two cables).
- 3-node cluster: three nodes can be connected directly using a PCI Express link width of x4.
- 1 to 10 nodes: the nodes can be connected to a DXS410 switch (using x4 or x8 PCI Express link widths).

The NodeID assigned to each node is based on the following formula (for a 2-D cluster):

    NodeID = (x + 1) * 4 + y * 64

For example, x = 0, y = 0 gives NodeID 4, and x = 1, y = 0 gives NodeID 8.

Note: link 0 is mapped to the X-dimension of the torus and link 1 is mapped to the Y-dimension.

PHYSICAL NODE PLACEMENT

Ideally, nodes equipped with Dolphin DX adapters should be placed close to each other in order to keep the cable lengths short.

Note: cable lengths of 1 to 3 meters are acceptable.

DOLPHIN DX ADAPTER INSTALLATION

Nodes must be powered down before installing an adapter. Insert the adapter into a free PCI-Express slot that matches the adapter:

- 4x, 8x or 16x PCI-Express slot for DXH510 if it is to be used in x4 mode
- 8x or 16x PCI-Express slot for DXH510 if it is to be used in x8 mode

Make sure you are properly grounded to avoid static discharges that may destroy the hardware.
Once the adapter is properly inserted and secured, you can power up the node again. The adapter slot LEDs will be yellow.

We recommend setting up all nodes identically, using the same slot on identical architectures. We recommend connecting the cables after the software is installed and all adapters are configured.

SOFTWARE INSTALLATION AND CONFIGURATION

This section describes how to install the Dolphin DX package on a cluster. Start with Automatic Installation. For manual installation, start with Manual Installation and then continue with Cluster Configuration.

If this is not a first-time installation, the previous version must be completely removed before attempting to install the new packages. Please refer to the Software Removal section.

MANUAL INSTALLATION

Follow the instructions below on each node.

BUILD RPMS

We assume that you have the latest script release from Dolphin Interconnect and that you want to build Dolphin RPMs for a particular kernel version. Follow the procedure below:

- Transfer the latest SIA (*.sh) in binary mode from \\Mainnas\Logiciels\Dolphin\DXH510PCIExpress\TestLatestRelease\OpalRedHat to the /home/ntuser directory on each node. Transfer both *.sh files individually instead of transferring the .rar file.

- Log onto the node with root privileges and type:

    # cd /home/ntuser

- Prepare the build environment by exporting the following variables:

    # export PATH_LINUX_INCLUDE=/usr/src/kernels/linux-2.6.29.6/include
    # export PATH_LINUX_CONFIG=/usr/src/kernels/linux-2.6.29.6

- Build the binary RPM packages by executing the command:

    # sh DIS_DX_OPAL_SVN-r27662-r3489-r24.sh --build-rpm

  Note: the build-rpm option is preceded by two dashes, with no space between the second dash and build-rpm.

  The build process creates the directories frontend_RPMS, node_RPMS and source_RPMS.
- Install the packages from the directory node_RPMS:

    # cd node_RPMS
    # rm -f Dolphin-SuperSockets-DX*
    # rm -f Dolphin-DX-devel*
    # rpm -ivh *.rpm

- Add the DIS/bin and DIS/sbin directories to the executable search path using the PATH environment variable:

    # cd /root
    # vi .bash_profile

  Add the following line:

    PATH=$PATH:/opt/DIS/bin:/opt/DIS/sbin

PRE-COMPILED DOLPHIN RPMS – NOT YET SUPPORTED

We assume that you have Dolphin RPMs compiled for the exact kernel version you intend to use. Follow the procedure below:

- Transfer the following RPMs in binary mode: Dolphin-DX-3.4.0d-1.i386.rpm and Dolphin-SISCI-DX-3.4.0d-1.i386.rpm.

- Install the RPMs by running the following commands:

    # rpm -ivh Dolphin-DX-3.4.0d-1.i386.rpm
    # rpm -ivh Dolphin-SISCI-DX-3.4.0d-1.i386.rpm

- Add the DIS/bin and DIS/sbin directories to the executable search path using the PATH environment variable:

    # cd /root
    # vi .bash_profile

  Add the following line:

    PATH=$PATH:/opt/DIS/bin:/opt/DIS/sbin

AUTOMATIC INSTALLATION

Opal-RT provides an automatic installation script for the Dolphin software.

REQUIREMENTS:

- The qt-devel RPM must be installed on each node where the build takes place.
- The Dolphin self-installing archive must be in the same directory as the script.
- The configuration file dishosts.conf that describes the cluster to set up must also be in the same directory.

Follow the instructions below on only one node. The script reads the other nodes of the Dolphin interconnect cluster from the configuration file and uses ssh to install and configure them remotely.
- From \\Mainnas\Logiciels\Dolphin\DXH510PCIExpress\TestLatestRelease\OpalRedHat, transfer the Dolphin self-installing archive to a directory of your choice.

- From the same location, transfer the Opal automatic installation script op_dolphin_install.sh to the same directory.

- From \\Mainnas\logiciels\Dolphin\DXH510PCIExpress\DXConfiguration, transfer to the same directory the cluster configuration file dishosts.conf that reflects the size of the cluster you want to set up. Edit it so that each IP address can be used to connect to a node in the cluster.

- Execute op_dolphin_install.sh and follow the instructions on the screen. The script takes three arguments:

  1. The first argument is the name of the Dolphin SIA.
  2. The second argument is the name of the configuration file that describes the cluster.
  3. The third argument is the execution mode. Use "--build" to run a complete installation, or "--configure" to configure or reconfigure the cluster. Use "--configure" only if the Dolphin software is already correctly built and installed on each node.

The only interaction required from the user is to enter the root password once for each node.

Note: the script op_dolphin_install.sh reboots each node in the cluster to complete the configuration process. When the nodes are powered up, they should be ready to communicate over the SCI interconnect. You may continue with Cabling Instructions.

SOFTWARE REMOVAL

DOLPHIN SIA AVAILABLE

To remove all software that has been installed via the SIA while keeping the configuration file, execute the following command:

    # sh DIS_DX_install_3_1_0_2_LINUX.sh --uninstall

This command stops all the drivers if they are in use and removes all software that has been installed by the script.
If you want to remove all software installed by the SIA along with the configuration data and any remnants of previous non-SIA installations, execute the following command:

    # sh DIS_DX_install_3_1_0_2_LINUX.sh --wipe

DOLPHIN SIA NOT AVAILABLE

To remove all software that has been installed, execute the following command:

    # rpm -qa | grep Dolphin | xargs rpm -e

CLUSTER CONFIGURATION

At this point, the Dolphin software is installed. Assuming you have already decided on the cluster topology you want to set up, place the corresponding cluster configuration file dishosts.conf in the /etc/dis directory. The configuration file sets a few global interconnect properties and the position of each node within the chosen interconnect topology.

- Transfer the file dishosts.conf from \\Mainnas\Logiciels\Dolphin\DXH510PCIExpress\DXConfiguration\Xnodes to the /etc/dis directory on each node. X represents the size of the cluster you want to set up (up to 10 nodes for now).

- Open the file dishosts.conf and update the IP addresses.

  Note 1: The exact same dishosts.conf file is to be used on both simulators (dishosts.conf on target 2 is simply a copy of dishosts.conf on target 1).

  Note 2: The IP addresses to use in the dishosts.conf file are the IP addresses of the simulators. For example, if card 1 is inserted in simulator 1 and card 2 is inserted in simulator 2, the IP addresses in the dishosts.conf file should be those of simulator 1 and simulator 2.

- Reload the driver for these changes to take effect:

    # dis_services restart

- Reboot the simulators.

DOLPHIN DX ADAPTER CONFIGURATION

The DX adapters do not store the configuration information permanently; they must be reconfigured after each restart.
The easiest way is to add the following line to the init script /etc/rc.d/rc.local:

    /opt/DIS/sbin/dxconfig -c <card number> -a <adapter number> -n <node ID>

If you have only one DX adapter installed:

- Card number is 1 (indexing starts at 1)
- Adapter number is 0 (indexing starts at 0)
- Node ID follows the formula given in Decide on Dolphin Interconnect Topology.

For a 2-node direct connection, the lines should be:

    Target 1: /opt/DIS/sbin/dxconfig -c 1 -a 0 -n 4
    Target 2: /opt/DIS/sbin/dxconfig -c 1 -a 0 -n 8

CABLING INSTRUCTIONS

We recommend installing the cables while the nodes are powered up.

2 NODES CLUSTER

Two DX adapters can be interconnected using the following instructions:

- Insert one cable from any available port on node 4 to any available port on node 8.
- Observe that the PORT LEDs turn green on both Dolphin Express DXH510 adapters (Active).

3 OR MORE NODES CLUSTER

Three or more DX adapters can be interconnected through a switch, using these instructions:

- Connect any free port of each DXH510 board to a free port on the switch.
- Observe that the PORT LEDs turn green on both the Dolphin Express DXH510 adapters (Active) and the switch.

VERIFYING FUNCTIONALITY AND PERFORMANCE

AVAILABILITY OF DRIVERS AND SERVICES

Query the status of the drivers with:

    # dis_services status

All services must be running. If one of them is not running, use the system log tools to find more information about the problem. For example, inspect the kernel log messages with dmesg.

CABLE CONNECTION TEST

All LEDs on the DX adapters should turn green and emit a steady light (meaning that the link is idle).

STATIC INTERCONNECT TEST

Perform this test using the dxdiag diagnostic tool. The test report must read PASSED.

    # dxdiag -V 9

or, to save the results and errors in a single text file result.txt:

    # dxdiag -V 9 > result.txt 2>&1

Use dxdiag -help to obtain more information about the possible arguments.
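
The saved report can also be checked from a script. The following is a minimal sketch, assuming only that a successful report contains the word PASSED, as required above. The helper name check_dxdiag_report is hypothetical, and the exact layout of the dxdiag output is not described in this guide, so treat the match as a first filter and still read the report itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a saved dxdiag report contains PASSED.
# Assumption: a successful report includes the word "PASSED" somewhere;
# the real report layout is not specified in this guide.
check_dxdiag_report() {
    report="$1"
    # Return 2 when the report file is missing or unreadable.
    if [ ! -r "$report" ]; then
        echo "report not found: $report" >&2
        return 2
    fi
    # Return 0 when the word PASSED appears in the report.
    if grep -q "PASSED" "$report"; then
        echo "dxdiag report: PASSED"
        return 0
    fi
    # Otherwise return 1 and point the user at the file.
    echo "dxdiag report: PASSED not found, inspect $report" >&2
    return 1
}

# Typical use on a node (commented out so the sketch has no side effects):
# dxdiag -V 9 > result.txt 2>&1
# check_dxdiag_report result.txt
```

A non-zero return value means the report should be inspected by hand before moving on to the performance test.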

INTERCONNECT PERFORMANCE TEST

Run scibench2 or scipp between pairs of nodes in the same cluster. For example, for scibench2, run the following command on node 4:

    # scibench2 -server -rn 8

On node 8, run the command:

    # scibench2 -client -rn 4

You may expect a minimum latency of 0.2 µs to write 4 bytes to remote memory.

TROUBLESHOOTING

PROBLEM: The DX adapter was installed on the node but the SCI software doesn't recognize it.

SOLUTION: Power down the node for at least 5 seconds. Then power it up again and retry.

PROBLEM: All cables are connected, all LEDs are green, and all drivers are up; however, some nodes cannot see some others via SCI.

SOLUTION: These symptoms indicate that the cabling is not correct. Re-verify the cabling. Also verify that all nodes run the same version of the Dolphin software:

    # rpm -qa | grep -i Dolphin

PROBLEM: The driver dis_irm refuses to load.

SOLUTION: Run dmesg and check all error messages. If you find the line "Out of vmalloc space", the adapter requires more virtual PCI address space than the installed kernel supports. There are two alternative solutions:

1. When building a small cluster, you may be able to run your application with less SCI address space. If this is the case, you may change the SCI address space size for the adapter by using the command:

    # dxconfig -c 1 -spms 16

2. If reducing the prefetch memory size is not desired, the related resources in the kernel have to be increased. For x86 machines this is achieved by passing the kernel option vmalloc=256m and the parameter uppermem=524288 at boot time.

PROBLEM: Loading an RT-LAB model fails when trying to transfer more than 16 kB.

SOLUTION: Load the RT-LAB model in Exhaustive mode and check the error code. If SCIMapRemoteSegment() fails with the error code 0x40000904, try increasing the size of the memory prefetch.
Run the following command and make these selections:

    # dxconfig
    Set-prefetch-mem
    Choose 128MB
    Q

Reboot the target or reload the Dolphin drivers:

    # dis_services restart

TERMINOLOGY

Term / Description

Adapter
    A DX adapter installed in a PC.

Node
    A PC that has a DX adapter installed and is part of the Dolphin Interconnect Solution.

Cluster
    Two or more nodes interconnected.

Frontend
    A single computer that is running software that monitors and controls the nodes in the cluster. For increased fault tolerance, the frontend should not be part of the Dolphin Express interconnect it controls. Instead, the frontend should communicate with the nodes via Ethernet.

Installation machine
    The installation script is typically executed on the frontend, but can also be executed on another machine that is neither a node nor the frontend, but has network (ssh) access to all nodes and the frontend. This machine is the installation machine.

Self-installing archive (SIA)
    A single executable shell command used to compile and install the Dolphin software stack in all variants (the installation script).

SCI
    Scalable Coherent Interface, a 1995 standard.

SISCI
    The user-level API for creating applications that make direct use of the low-level Dolphin Interconnect Solutions shared memory capabilities.

CONTACT

Opal-RT Corporate Headquarters
1751 Richardson, Suite 2525
Montréal, Québec, Canada H3K 1G6
Tel.: 514-935-2323
Toll free: 1-877-935-2323

Technical Services
www.opal-rt.com/support

Note: While every effort has been made to ensure accuracy in this publication, no responsibility can be accepted for errors or omissions. Data may change, as well as legislation, and you are strongly advised to obtain copies of the most recently issued regulations, standards, and guidelines. This publication is not intended to form the basis of a contract.

01/2011 © Opal-RT Technologies Inc.