THREE SCENARIOS THAT DRIVE ADMINS CRAZY… AND HOW TO HANDLE THEM
Ing. Thomas Mitrovits, MSc
Sr. Systems Engineer

Covered topics
• Contaminated infrastructure
• Two challenges for long-distance considerations (aka frame handling)
• Slow drainer

Contaminated infrastructure

ClearLink Diagnostics: Functional Details
• Test initiator (switch port), test responder (device port)
• The D_Port test consists of the following four steps:
  • Electrical loopback test (E-WRAP)
  • Optical loopback test (O-WRAP)
  • Link traffic test
  • Link latency and distance measurement

Validating Configurations
• Use ClearLink diagnostic port (D_Port) mode to test all 16 Gbps-capable ISLs, ICLs, and Brocade HBA connections
• Complete optical, electrical and link saturation testing to ensure reliable connections
• Pre-test and validate the entire SAN fabric at full line rate and with full FOS features enabled, using the integrated flow generator
• Emulate a 16 Gbps SAN without having any 16 Gbps hosts, targets or SAN testers
[Figure labels: H1, T1, H2, T2]

D-Port Test Results via CLI

    sw0:root> portdporttest --show 10/39
    D-Port Information:
    ===================
    Slot: 10
    Port: 39
    Remote WWNN: 10:00:00:05:33:7e:69:c4
    Remote port: 24
    Mode: Manual
    No. of test frames: 12 Million
    Duration of test (HH:MM): 00:01
    Test frame size: 1024 Bytes
    Payload Pattern: JTSPAT
    FEC (enabled/option/active): Yes/No/No
    CR (enabled/option/active): No/No/No
    Start time: Mon Jan 16 05:57:51 2012
    End time: Mon Jan 16 05:58:56 2012
    Status: FAILED
    ================================================================================
    Test                  Start time  Result  EST(HH:MM:SS)  Comments
    ================================================================================
    Electrical loopback   05:57:52    PASSED  --------       --------
    Optical loopback      05:58:06    PASSED  --------       --------
    Link traffic test     05:58:13    FAILED  --------       See failure report
    ================================================================================
    Roundtrip link latency: 934 nano-seconds
    Estimated cable distance: 1 meters
    Buffers required: 1 (for 1024 byte frames at 16Gbps speed)
    Failure report:
    Errors detected (local):  CRC, Bad_EOF, Enc_out
    Errors detected (remote): CRC, Bad_EOF
    Please use portstatsshow and porterrshow for more details on the above errors.

D-Port test results show pass/fail as well as the reason for failure, to accelerate troubleshooting.

Long Distance (aka Buffer-Credit Handling)

Important Numbers … Numbers … Numbers
• 5 µs latency per km of fiber
• 25 km maximum distance with 16 Gbit/s FC SFPs
• 125 m maximum distance with 16 Gbit/s and OM4 cabling
• 250 m is the length of an FC frame at 16 Gbit/s

1st challenge: the physics
[Figure: Fiber Optics Transmission Window. Attenuation (dB/km) versus wavelength (nm), marked at 850, 1300 and 1550 nm; 1300 nm = 0.5 dB/km, 1550 nm = 0.2 dB/km]

Available SFP+
Optical Small Form-factor Pluggable (SFP+) transceivers are available in short- and long-wavelength types:
• 16G SWL: Brocade 57-0000088-01
• 16G LWL (10 km): Brocade 57-0000089-01
• 16G ELWL (25 km): Brocade 57-1000262-01

Optical cable length for multimode fiber (Fibre Channel), in meters

    Protocol  Encoding  Line rate  OM1 62.5µ      OM2 50µ        OM3 50µ         OM4 50µ
    (FC)                (Gb/sec)   (200 MHz·km)   (500 MHz·km)   (2000 MHz·km)   (4700 MHz·km)
    1G        8b10b     1.0625     300            500            860             -
    2G        8b10b     2.125      150            300            500             -
    4G        8b10b     4.25       70             150            380             400
    8G        8b10b     8.5        21             50             150             200
    10G       64b66b    10.53      33             82             300             300
    16G       64b66b    14.025     10.5           25             100             125

SFP specifications
[Figure: possible budget versus real budget; scale marks at -24 dBm, -20.5 dBm, -15 dBm, -9.5 dBm, -5 dBm and -3 dBm]
Power Budget = (Worst Case Launch Power) – (Worst Case Receiver Sensitivity) + (Connector Attenuation)

FCIP: extension without limits?
• Uses the existing IP wide area network (WAN) infrastructure to connect Fibre Channel SANs.
• No implicit distance limit.
• The TCP connections ensure in-order delivery of FC frames and lossless transmission.
• All Fibre Channel targets and initiators are unaware of the presence of the IP WAN.

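The figures on the Important Numbers slide can be roughly reproduced from the line rates and encodings in the cable-length table. The following is a minimal sketch, not a vendor formula. It assumes a full-size frame of 2148 bytes on the wire (2112-byte payload plus SOF, header, CRC and EOF), uses the 5 µs/km propagation delay quoted on the slide, and ignores fill words between frames. The function names are illustrative only.

```python
# Line rate in Gbaud and encoding efficiency, taken from the cable-length table.
LINE_RATES = {
    "1G": (1.0625, 8 / 10),
    "2G": (2.125, 8 / 10),
    "4G": (4.25, 8 / 10),
    "8G": (8.5, 8 / 10),
    "16G": (14.025, 64 / 66),
}

FRAME_BYTES = 2148        # assumption: full-size frame incl. SOF, header, CRC, EOF
LATENCY_US_PER_KM = 5.0   # from the slide: 5 µs latency per km of fiber


def frame_length_m(speed: str, frame_bytes: int = FRAME_BYTES) -> float:
    """Length of fiber occupied by one frame, in meters."""
    gbaud, efficiency = LINE_RATES[speed]
    data_rate_bps = gbaud * 1e9 * efficiency          # payload bits per second
    serialization_us = frame_bytes * 8 / data_rate_bps * 1e6
    # Distance light travels in the fiber while the frame is being serialized.
    return serialization_us / LATENCY_US_PER_KM * 1000.0


def frames_in_flight(speed: str, distance_km: float,
                     frame_bytes: int = FRAME_BYTES) -> float:
    """Number of back-to-back frames that fit on one direction of the link."""
    return distance_km * 1000.0 / frame_length_m(speed, frame_bytes)


if __name__ == "__main__":
    for speed in LINE_RATES:
        print(f"{speed:>3}: {frame_length_m(speed):7.0f} m per full-size frame")
    print(f"16G over 25 km: {frames_in_flight('16G', 25):.0f} frames in flight")
```

Under these assumptions a full-size frame comes out at about 250 m at 16G and about 4 km at 1G, in line with the slide, and a 25 km link at 16G holds on the order of a hundred frames in one direction. That is why flow control becomes the next challenge.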
2nd challenge: Flow Control

Flow Control: credit exchange at Fabric Login
• Host says, “I can receive 40 frames.”
• Storage says, “I can receive 16 frames.”
• Switch says, “I can receive 8 frames.”

Buffer Credits: credit accounting after Fabric Login
• Switch thinks, “OK, I can send 40 frames that way and 16 frames this way, but I have to think about it.” (credit counts 40 and 16)
• Host thinks, “Good, I can send 8 frames without thinking about it.” (credit count 8)
• Storage thinks, “Good, I can send 8 frames without thinking about it.” (credit count 8)

Buffer Credits
[Animation: Frames 1 to 9, then A to H, travel across a link drawn as eight 1 km segments while ACK 1 to ACK 9 travel back. Final caption: “Buffer credit for Frame 1 can be released now!”]

FCP frame control (i.e. SCSI-FCP Write Command)
[Figure: sequences exchanged between Initiator, Switch and Target]

Lost ack_frames (aka performance)
• What happens if ack_frames get lost?
• If BB=0 (i.e. lost R_RDY frames), the link is reset by sending LinkCreditReset (LR) and LinkCreditResetResponse (LRR).
• Buffer Credit Recovery: automatically recovers flow control buffer credit loss at the VC level, improving availability

FCP frame control cont.
Bandwidth-distance extension
• Data droop
• Remove data droop: adding Buffer-to-Buffer Credits

FCP frame control cont.
Bandwidth-distance extension cont.
• Remove data droop => terminating Buffer-to-Buffer Credits

How “long” is a frame?
Traveling at the speed of light (300,000 km/s in vacuum, approx. 65% of that in fiber), a frame can be very short…
• @ 1G a frame is about 4 km in length
• @ 2G a frame is about 2 km in length
• @ 4G a frame is about 1 km in length
• @ 8G a frame is about 0.5 km in length
• @ 16G a frame is about 250 m in length

How much credit do I need?
Good “rule of thumb”:
    Number of credits needed = 1 + (Link speed in Gb/s × Distance in km) / (Frame size in K)
• Example: 20 km at 1 Gb/s: 1 + (1 × 20) / 2 = 11
• Example: 10 km at 4 Gb/s: 1 + (4 × 10) / 2 = 21
[Figure: percent data rate (10 to 110) versus distance (4 to 148 km); curve labelled “4Gb, 32 credits”]

Performance Optimization on FC Long Distance ISLs
Optimize Performance
• Allow end users to specify either the number of buffers or the average frame size while configuring a long-distance port
• Gives users more control to optimize performance on long-distance links based on the traffic pattern
• Two new options for the portcfglongdistance CLI: one to configure buffers and another to configure frame size, for LD and LS modes
• Before FOS v7.1, users could configure only the “distance” for long-distance static and dynamic modes.
  Buffer estimation was based on distance and link speed, with full-size frame buffers assumed, which can lead to suboptimal buffer allocation
• With FOS v7.1, a user can directly configure the buffers required for a port of a long-distance link
• Users can also specify the average frame size for a long-distance port. With the frame size option, the number of buffers/credits required for the port is calculated automatically

Performance Optimization on FC Long Distance ISLs
Optimize Performance
• Enhancement to display the average buffer usage and average frame size in portbuffershow
• Average buffer usage is the number of buffers used by the port in real time while traffic is in progress
• Provides better insight into the traffic pattern and lets users optimize performance on long-distance links by specifying the average frame size
• A new CLI, portBufferCalc, calculates the number of buffers required per port given the distance, speed and frame size
• If a user does not provide any of the options, the port's current configuration is used to calculate the number of buffers required
• This CLI gives users an estimate of the number of buffers required for a given distance, speed and frame size

Portcfglongdistance Example
• -distance & -framesize

    pluto_134:FID128:root> portcfglongdistance 1/3 LS 1 -distance 100 -framesize 1024
    Reserved Buffers = 806
    Warning: port (3) may be reserving more credits depending on port speed.

    pluto_134:FID128:root> portcfgshow 1/3
    Speed Level:            AUTO(HW)
    Fill Word(On Active)    1(Arbff-Arbff)
    Fill Word(Current)      1(Arbff-Arbff)
    AL_PA Offset 13:        OFF
    Trunk Port              ON
    Long Distance           LS
    VC Link Init            ON
    Desired Distance        100 Km
    Frame Size              1024 Bytes
    Reserved Buffers        806

Portcfglongdistance Example
• -buffers

    pluto_134:FID128:root> portcfglongdistance 1/3 LS 1 -buffers 400
    Reserved Buffers = 406
    Warning: port (3) may be reserving more credits depending on port speed.

    pluto_134:FID128:root> portcfgshow 1/3
    Area Number:            3
    Speed Level:            AUTO(HW)
    Fill Word(On Active)    1(Arbff-Arbff)
    Fill Word(Current)      1(Arbff-Arbff)
    AL_PA Offset 13:        OFF
    Trunk Port              ON
    Long Distance           LS
    VC Link Init            ON
    Desired Buffers         400
    Reserved Buffers        406

Portbuffercalc: New CLI
• This CLI assists in configuring the recommended buffers for a long-distance port.
• It returns the buffers based on the distance/speed/framesize configured.
• CLI:
    portBufferCalc [SlotNumber/]PortNumber <-distance distance> <-speed speed> <-framesize framesize>
• Example:
    pluto_134:FID128:root> portbuffercalc 1/3 -distance 100
    406 buffers required for 100km at 8G and framesize of 2048bytes

Buffer Credits

    Switch or     Total FC ports        User port    Unreserved buffers     Unreserved buffers
    blade model   per switch or blade   group size   with QoS               without QoS
                                                     (per port group)       (per port group)
    6510 switch   48                    48           6752                   7712
    FC16-32       32                    16           5188                   5408
    FC16-48       48                    24           4480                   4960
    FC8-64        *** Extended Fabrics are not supported on this blade ***

Maximum distances (km) that can be configured, assuming 2112-byte frame size

                  2 Gbps   4 Gbps   8 Gbps   10 Gbps   16 Gbps
    6510 switch   6752     3376     1688     1350      844
    FC16-32       5188     2594     1297     1037      648
    FC16-48       4484     2242     1121     896       560
    FC8-64        *** Extended Fabrics are not supported on this blade ***

Bottleneck Detection
[Figure: Servers 1 to 4 connected through FC Switch 1 and FC Switch 2 to Storage A to D]
[Figure: 1 server to 1 storage with bottleneck (Server 2, FC Switch 1, FC Switch 2, Storage B)]
[Figure: bottlenecks marked across Servers 1 to 4, FC Switch 1, FC Switch 2 and Storage A to D]

Bottlenecks in general
• “Bottleneck” is an attribute of the transmit direction of a port
• (The transmit direction of) a port is bottlenecked when the offered load at the port exceeds the throughput at the port
• A port can be a congestion bottleneck or a latency (aka slow-drain) bottleneck
  • Congestion bottleneck: offered load exceeds throughput and throughput is 100%
  • Latency bottleneck: offered load exceeds throughput and throughput is less than 100%

Latency bottlenecks and throughput
• It is a common misconception that a latency bottleneck (“slow drain”) must be doing low throughput
  • A latency bottleneck can have any link utilization level from 0% to just under 100%
  • Not necessarily low utilization/throughput
• Looking at slow drain at high utilization is not very useful
  • Feature is not recommended above 85% link utilization

Handling of trunks
• Congestion bottlenecks
  • Entire capacity and entire utilization of the trunk are considered to determine whether it is congested
  • Reporting and configuration are done on the master only
  • Reporting and configuration follow the master
• Latency bottlenecks
  • Any bottlenecked VC on the trunk makes the trunk a bottleneck
  • Reporting and configuration are done on the master only
  • Reporting and configuration follow the master

RAS: Bottleneck Detection
Maintaining Application Performance
• Identifies and alerts administrators to bottlenecks that can degrade application performance
• Detects bottlenecks caused by slow-drain devices
• Bottleneck detection for E_, EX_, and F_Ports
• Accelerates problem detection and diagnosis to minimize performance degradation
[Figure labels: ISL Congestion; Congestion Bottleneck Monitor, E_Port; Latency Bottleneck Monitor, F_Port; Slow-drain Device; Normal Traffic; Congested Traffic]

Supported Configurations
• Condor/Condor2/GoldenEye/GoldenEye2/Condor3 ports
  • Latency bottleneck detection on Condor/GoldenEye is an approximation of the more exact mechanism available on Condor2/GoldenEye2
  • Does not catch all latency bottlenecks on Condor/GoldenEye
• Runs on all platforms
• Works the same on switch or Access Gateway
• Feature is allowed on a switch F_Port attached to an Access Gateway

License, Conflicts
• No license requirement
• No conflicts with other features

Bottleneck detection
• Users can configure bottleneck detection parameters per switch and per port.
• Users can view bottleneck statistics for a given port (up to 32 ports)
• A bottlenecked port is highlighted in the connectivity map and product tree within 10 seconds of the switch detecting the bottleneck
• Users can see the hosts affected by the bottlenecked port

Bottleneck configuration
• Configure bottleneck parameters

Bottleneck statistics

Topology indications

Show affected hosts

THANK YOU