IPbus Performance Measurement Ideas for 904 Test Stand in Summer 2013

Tom Williams, version of 2nd May 2013

904 μTCA Test Stand Setup

The physical setup required before the performance measurements can start is summarized below. Based on discussion with Marc Dobson, the PCs will (I think) be connected to the μTCA crate via our own dedicated switch, with an additional connection route between the PCs and the general 904 network for ssh access to the PCs.

[Diagram: PC A and PC B (both SLC6, 64-bit) connect via a dedicated 1 Gb/s switch to the μTCA crate, which contains a Vadatech MCH and 4 GLIB boards; the PCs are also connected to the general 904 network for ssh access.]

We must have the potential to set up a 1 Gb/s cable directly from one of the PCs to the μTCA crate, for debugging if we get unexpected results (e.g. never reaching 1 Gb/s over the network). This is also useful for checking the extra latency introduced by the network switches. Marc Dobson said in our last meeting that this will be possible.

All of the measurements below of course require that the test stand is set up as above, such that:
- All components have power.
- Each board has a unique MAC address "officially" assigned by CMS.
- Each board has a unique IP address. (We will have our own 192.168.xxx.yyy subnet for the boards.) N.B. How will the IP address assignment work? RARP? Will both the software and firmware be available?
- The PCs and boards are connected via Ethernet such that one can send a "ping" between the two PCs, and such that each PC can ping each board. This of course requires that the boards have MAC & IP addresses.
- One of the PCs must have JTAG (or similar) access to the boards, in order to be able to reload firmware onto the boards easily.

Some other important points/principles:
- The basic test executables used should be packaged up in the IPbus software suite release (for the sake of reproducibility).
  o Then just write wrapper scripts that iterate the measurements vs the "x" variable, and parse the output of the PerfTester commands.
  o However, the ControlHub will likely be the component controlling the number of packets in flight (rather than making several different firmware images). Hence, it will have to be re-compiled from the tagged sources for the different measurements vs "nr in-flight", with the appropriate "MAX_IN_FLIGHT" macro redefined each time.
- It would be nice to be using a tagged copy of the firmware by this stage as well.
- We should check at all stages that the behavior of the measured bandwidths/latencies vs "number of transactions" / "number in-flight" is close to the values expected based on the ping latency between the PCs/boards and the sequence diagrams in the spec document.

Definitions

Bandwidth (bits/s) = (number of registers modified × 32) / latency

Latency = the time taken for the uHAL client to perform the IPbus transactions. This should be measured in the uHAL client PC, starting from the first corresponding uhal::read/write call, and stopping when the dispatch() method returns.

The performance measurements are separated into four sets here (Parts A–D); Part E covers system/component robustness tests.

Part A: Simplest topology (1 packet in flight)

Performance for 1 client, talking to 1 board, with 1 packet in flight ... i.e. check that latencies and bandwidths have the expected values in the simplest possible topology.

Measure latency & bandwidth as a function of the number of words read/written/modified, for:
- A sequence of single-register reads/writes @ random addresses
- Block read/write
- (Optional extra) RMW transactions @ random addresses

Repeat in the following configurations:
- Protocol:
  1. Direct UDP
  2. Via ControlHub (both client and ControlHub on the same PC)
  3. Via ControlHub (client and ControlHub on different PCs)
- Client: on PC A & on PC B

Compare the measured latencies & bandwidths with the rough expectation from the sequence diagrams and the ping latency.

N.B. Check that as the number of packets increases, the "via ControlHub" bandwidth asymptotically approaches the "direct UDP" bandwidth.
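The Part A scans could be driven by a small wrapper script of the kind described in the points/principles above: run the packaged test executable for each block depth, parse out the measured latency, and apply the bandwidth definition. A minimal sketch follows; note that the PerfTester.exe command-line flags and the "latency = ... s" output format used here are assumptions for illustration, not the real PerfTester interface, and must be adapted to what the tagged release actually prints.

```python
import re
import subprocess

def bandwidth_bits_per_s(n_words, latency_s):
    # Bandwidth (bits/s) = number of 32-bit registers modified * 32 / latency
    return n_words * 32 / latency_s

# Assumed output format -- adapt to what PerfTester actually prints.
LATENCY_RE = re.compile(r"latency\s*=\s*([0-9.eE+-]+)\s*s")

def parse_latency(output):
    """Extract the measured latency (seconds) from the tool's output."""
    m = LATENCY_RE.search(output)
    return float(m.group(1)) if m else None

def scan_depths(device_uri, depths):
    """Iterate the 'x' variable (block depth) and collect bandwidth vs depth."""
    results = {}
    for n_words in depths:
        # Hypothetical flags -- check the real PerfTester options.
        out = subprocess.run(
            ["PerfTester.exe", "-d", device_uri, "-w", str(n_words)],
            capture_output=True, text=True).stdout
        latency = parse_latency(out)
        if latency is not None:
            results[n_words] = bandwidth_bits_per_s(n_words, latency)
    return results
```

Keeping the parsing in one small function makes it easy to re-point the wrapper at a different output format when the packaged executables change.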
Part B: Simplest topology, with multiple packets in flight

Performance for 1 client, talking to 1 board, with N packets in flight ... i.e. see how high we can push the bandwidth by increasing the number of packets in flight.

Measure bandwidth for block reads/writes as a function of the number of packets in flight, N. Calculate the bandwidth both for all data down the cable (i.e. full IPbus packet + UDP header) and counting only the registers written/read.

Repeat with the following configurations:
- uHAL client / ControlHub running both on the same PC and on different PCs
- Transaction: block reads only & block writes only (the results should be the same)

Is 1 Gb/s ever reached?

N.B. Calculate what fraction of the data travelling along the cable is the values that will be written to the registers.

Use these results to decide what number of packets in flight to use for all future measurements.

Part C: Multiple clients and/or boards

Performance for simultaneous large block reads/writes from n clients, talking to 1 or n boards ...

Questions to answer with these measurements:
- Does the bandwidth of each client get throttled back equally, or do some clients see a higher fraction of the bandwidth than others?
- Does the ControlHub ever limit the bandwidth?

Repeat with the following configurations:
- Setup:
  o 1) all clients talking to a different board; 2) all clients talking to the same board
  o uHAL clients on either the same / a different PC as the ControlHub
- Transaction: block reads only & block writes only

Measure as a function of the number of clients, n:
- Total bandwidth (all clients summed)
- Average bandwidth per client
- Range of bandwidths across the different clients
- ControlHub memory & CPU usage

N.B. Since we will only have 4 boards, we would need to put multiple IPbus endpoints in each board in order to simulate n > 4. However, when doing this, we of course need to check that the block read/write bandwidth to 2 IPbus endpoints on the same board is the same as for 2 IPbus endpoints each on their own board.
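For the Part B N.B. above, the fraction of on-cable data that is register values can be estimated from the per-packet header overheads. The sketch below assumes standard Ethernet II / IPv4 / UDP header sizes and IPbus 2.0-style 32-bit packet and transaction headers with at most 255 words per block transaction; all of these protocol numbers are assumptions that should be checked against the spec document before quoting results.

```python
# Assumed per-packet overheads in bytes -- check against the IPbus spec.
ETH_OVERHEAD = 14 + 4        # Ethernet II header + FCS (ignoring preamble/IFG)
IP_HEADER = 20               # IPv4, no options
UDP_HEADER = 8
IPBUS_PACKET_HEADER = 4      # one 32-bit IPbus packet header
IPBUS_TXN_HEADER = 4         # one 32-bit header per transaction
WORDS_PER_TXN = 255          # assumed max words per block transaction

def payload_fraction(n_words):
    """Fraction of the bytes on the cable that are register values, for a
    single packet carrying n_words 32-bit words of block write data."""
    n_txns = -(-n_words // WORDS_PER_TXN)   # ceiling division
    payload = 4 * n_words
    total = (payload + n_txns * IPBUS_TXN_HEADER + IPBUS_PACKET_HEADER
             + UDP_HEADER + IP_HEADER + ETH_OVERHEAD)
    return payload / total
```

Under these assumptions a full 255-word transaction is roughly 95% payload, so even at a saturated 1 Gb/s link the register-value bandwidth should sit somewhat below the raw line rate.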
Part D: Packet loss & MCH switch congestion

Main aspects to check here:

1. Level of packet loss in real usage
- Aim: check whether we can see any packet loss when the system is put under maximal realistic load.
- Continuous block read from each board in the crate, each to a separate uHAL client. Look at the ControlHub stats command output to count the number of packets lost.
- Are any packets lost? If so, by how much does the bandwidth degrade?
- This should be tested with both types of MCH – i.e. NAT and Vadatech – if possible.
- Repeat with other traffic injected over the switch, in order to replicate the realistic Point 5 situation, where there will be > 1 crate and > 1 ControlHub connected to each switch.

2. Reduction in performance from artificially induced packet loss
- Measure for packet loss on the UDP & TCP sides separately.
- Compare with prediction.

Part E: System/component robustness tests

Possible ideas:
- Emulate the MCH dying during/before IPbus communication with a board.
  o i.e. pull the Ethernet cable out of the MCH?
  o What does the uHAL end-user see?
- Emulate boards dying during/before IPbus communication.
  o i.e. switch off that individual board or cause it to crash somehow.
  o What does the uHAL end-user see?
  o How will MCH death eventually be detected in Point 5? IPMI?
  o Can/should the ControlHub in principle differentiate between "dead MCH" and "dead board" situations – e.g. "no route to host" when trying to send UDP vs. no response to UDP?
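For the "compare with prediction" step of Part D point 2, a first-order prediction can come from a simple stop-and-wait retransmission model: each attempt is lost with independent probability p, and a lost attempt is retried after a fixed timeout. This model (fixed timeout, one packet in flight, independent losses) is an assumption for a back-of-envelope comparison, not a description of the actual ControlHub retry logic.

```python
def expected_packet_time(t0, timeout, loss_prob):
    """Expected time to complete one request/reply exchange, where t0 is
    the loss-free round-trip time and a lost attempt costs `timeout`
    seconds before retry:
        E[T] = (1 - p) * t0 + p * (timeout + E[T])
             = t0 + p * timeout / (1 - p)
    """
    return t0 + loss_prob * timeout / (1.0 - loss_prob)

def throughput_factor(t0, timeout, loss_prob):
    """Predicted bandwidth relative to the loss-free case."""
    return t0 / expected_packet_time(t0, timeout, loss_prob)
```

Because the timeout is typically much larger than t0, even sub-percent loss rates should produce a visible bandwidth drop in this model, which is what the artificially induced loss measurements can confirm or refute.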