Slide 1: End-to-End Performance Tuning and Best Practices
Wednesday, September 29, 2015
Moderator: Charlie McMahon, Tulane University
Panelists: Jan Cheetham, University of Wisconsin-Madison; Chris Rapier, Pittsburgh Supercomputing Center; Paul Gessler, University of Idaho; Maureen Dougherty, USC

Slide 2: Paul Gessler, Professor & Director, Northwest Knowledge Network, University of Idaho

Slide 3: Enabling 10 Gbps connections to the Idaho Regional Optical Network
• UI Moscow campus network core
• Northwest Knowledge Network and DMZ
• DOE's Idaho National Lab
• Implemented perfSONAR monitoring across Idaho
• Institute for Biological and Evolutionary Studies

Slides 4–5: (image-only slides; no recoverable text)

Slide 6: Jan Cheetham, Research and Instructional Technologies Consultant, University of Wisconsin-Madison

Slide 7: University of Wisconsin Campus Network (diagram)
• Internet2 Innovation Network, Science DMZ, 100G, perfSONAR
• Connected units include SSEC, IceCube, HEP, Engineering, CHTC, WID, WEI, LOCI, Biotech, and campus network distribution

Slide 8: Diagnosing Network Issues
perfSONAR helped uncover problems with:
• TCP window sizes on transfers to San Diego
• An optical fiber cut affecting a latency-sensitive link between SSEC and NOAA
• A line card failure resulting in dropped packets on a research partner's (WID) LAN
• Transfers from internal data stores to distributed compute resources (HTCondor pools)

Slide 9: Dealing with Firewalls
Can't use a firewall:
• Security baseline for research computing
Must be behind a firewall:
• Upgrade the firewall to a high-speed backplane to allow 10G throughput to campus, in preparation for the campus network upgrade
• Plan to use SDN to shunt some traffic (identified uses within our security policy)

Slide 10: Challenges
• 100 GE line card failure (pursuing buffer overflow)
• Separating spiky research traffic from the rest of the campus network traffic
• Distributed campus: getting the word out so everyone can take advantage
• Limitations of internal network environments for researchers
• Storage bottleneck

Slide 11: Chris Rapier, Senior Research Programmer, Pittsburgh Supercomputing Center

Slide 12: XSight & Web10G
Goal: use the metrics provided by Web10G to enhance workflows through early identification of pathological flows.
• A distributed set of Web10G-enabled listeners on Data Transfer Nodes across multiple domains
• Gather data on all flows of interest and collate it in a centralized DB
• Analyze the data to find marginal and failing flows
• Provide the NOC with actionable data in near real time

Slide 13: Implementation
• Listener: a C application that periodically polls all TCP flows and applies a rule set to select flows of interest (a rough sketch of this pipeline follows Slide 14 below)
• Database: InfluxDB, a time-series DB
• Analysis engine: currently applies a heuristic approach; development of models is in progress
• UI: a web-based logical map that lets engineers drill down to failing flows and display the collected metrics

Slide 14: Results
• Analysis engine and UI are still in development
• Looking for partners for listener deployment (including NOCs)
• Six months left under the EAGER grant; will be seeking to renew the grant
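The listener described on Slides 12–13 is a C application built against Web10G's per-connection TCP instrumentation, which is not reproduced here. As a rough, hypothetical illustration of the same poll / filter / store pipeline, the Python sketch below samples per-flow TCP statistics from iproute2's `ss -tin` output (a stand-in for Web10G), flags flows whose total retransmission count crosses an arbitrary threshold, and writes each sample to InfluxDB through its HTTP line-protocol endpoint. The InfluxDB URL, database name, measurement name, threshold, and polling interval are all assumptions made for the example.

```python
#!/usr/bin/env python3
"""Toy stand-in for the XSight listener: poll TCP flow stats, apply a
simple rule, and push samples to InfluxDB (line protocol over HTTP).
All settings here are hypothetical; the real listener is a C/Web10G app."""

import re
import subprocess
import time
import urllib.request

INFLUX_WRITE_URL = "http://localhost:8086/write?db=flows"  # assumed local InfluxDB
RETRANS_THRESHOLD = 100   # flag a flow after this many retransmits (arbitrary)
POLL_INTERVAL = 10        # seconds between polls

def poll_flows():
    """Return (peer, rtt_ms, retrans) tuples parsed from `ss -tin` output."""
    out = subprocess.run(["ss", "-tin"], capture_output=True, text=True).stdout
    lines = out.splitlines()[1:]          # drop the header line
    flows = []
    # `ss -i` prints a connection line followed by an indented info line.
    for conn, info in zip(lines[0::2], lines[1::2]):
        peer = conn.split()[-1]           # remote "addr:port"
        rtt = re.search(r"rtt:([\d.]+)/", info)
        retrans = re.search(r"retrans:\d+/(\d+)", info)
        flows.append((peer,
                      float(rtt.group(1)) if rtt else 0.0,
                      int(retrans.group(1)) if retrans else 0))
    return flows

def write_points(flows):
    """POST one line-protocol point per flow to InfluxDB."""
    body = "\n".join(
        f"tcp_flow,peer={peer.replace(':', '_')} rtt={rtt},retrans={retrans}i"
        for peer, rtt, retrans in flows)
    if body:
        req = urllib.request.Request(INFLUX_WRITE_URL, data=body.encode())
        urllib.request.urlopen(req)

if __name__ == "__main__":
    while True:
        flows = poll_flows()
        for peer, rtt, retrans in flows:
            if retrans > RETRANS_THRESHOLD:
                print(f"possible pathological flow to {peer}: "
                      f"rtt={rtt} ms, retransmits={retrans}")
        write_points(flows)
        time.sleep(POLL_INTERVAL)
```

The real XSight rule set, schema, and polling cadence are those described on Slide 13; this sketch only mirrors the shape of the pipeline so the division of labor between listener, database, and analysis engine is concrete.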
Slide 15: Maureen Dougherty, Director, Center for High-Performance Computing, USC

Slide 16: Trojan Express Network II
Goal: develop a next-generation research network in parallel with the production network to address increasing research data transfer demands.
• Leverage the existing 100G Science DMZ
• Instead of expensive routers, use cheaper high-end network switches
• Use OpenFlow running on a server to control the switch
• perfSONAR systems for metrics and monitoring

Slide 17: Trojan Express Network Buildout (diagram)

Slide 18: Collaborative Bandwidth Tests
• 72.5 ms round trip between USC and Clemson
• 100 Gbps shared link
• 12-machine OrangeFS cluster at USC, each machine directly connected to a Brocade switch at 10 Gbps
• 12 clients at Clemson
• USC ran nuttcp sessions between pairs of USC and Clemson hosts
• Clemson ran file copies to the USC OrangeFS cluster

Slide 19: Linux Network Configuration
Bandwidth-delay product: 72.5 ms × 10 Gbit/s = 90,625,000 bytes (about 90 MB); see the worked sketch after Slide 21.
• net.core.rmem_max = 96468992
• net.core.wmem_max = 96468992
• net.ipv4.tcp_rmem = 4096 87380 96468992
• net.ipv4.tcp_wmem = 4096 65536 96468992
• net.ipv4.tcp_congestion_control = yeah (YeAH-TCP)
• Jumbo frames enabled (MTU 9000)

Slide 20: nuttcp Bandwidth Test
• Peak transfer of 72 Gb/s with 9 nodes

Slide 21: Contact Information
• Charlie McMahon, Tulane University, cpm@tulane.edu
• Jan Cheetham, University of Wisconsin-Madison, jan.cheetham@wisc.edu
• Chris Rapier, Pittsburgh Supercomputing Center, rapier@psc.edu
• Paul Gessler, University of Idaho, paulg@uidaho.edu
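To make the arithmetic on Slide 19 explicit, the short sketch below (plain Python, assuming nothing beyond the numbers on the slides) recomputes the bandwidth-delay product from the 72.5 ms round-trip time and the 10 Gbit/s per-host link rate, checks it against the 96468992-byte socket-buffer ceiling USC configured, and prints the corresponding sysctl lines from the slide.

```python
#!/usr/bin/env python3
"""Recompute the bandwidth-delay product from Slide 19 and emit the
matching TCP buffer sysctl settings (values taken from the slides)."""

RTT_S = 0.0725            # 72.5 ms USC <-> Clemson round trip
LINK_BPS = 10e9           # 10 Gbit/s per host interface

# BDP = bandwidth (bits/s) * RTT (s), converted to bytes.
bdp_bytes = int(LINK_BPS * RTT_S / 8)
print(f"bandwidth-delay product: {bdp_bytes:,} bytes "
      f"(~{bdp_bytes / 1e6:.1f} MB)")          # 90,625,000 bytes

# Slide 19 sizes the socket buffer ceilings just above the BDP (92 MiB).
MAX_BUF = 96468992
assert MAX_BUF >= bdp_bytes

print(f"net.core.rmem_max = {MAX_BUF}")
print(f"net.core.wmem_max = {MAX_BUF}")
print(f"net.ipv4.tcp_rmem = 4096 87380 {MAX_BUF}")
print(f"net.ipv4.tcp_wmem = 4096 65536 {MAX_BUF}")
```

With buffers sized above the BDP, a single tuned stream can in principle keep a 10 Gbps path full over the 72.5 ms round trip; the 72 Gb/s peak on Slide 20 is the aggregate across nine node pairs, each limited by its 10 Gbps connection to the switch.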