Slide 1: UH SWARM: Dense perfSONAR Deployment With Small, Inexpensive Devices
Alan Whinery, U. Hawaii ITS
September 29, 2015

Slide 2: The Whole Small perfNode Thing
- At a $2,000 to $5,000 price point, a typical perfSONAR node gets deployed at REN leaves and branches.
- At a $50 price point, you can buy 40 to 100 times as many nodes for the same amount.
- Some efforts focus on pS-node equivalence and Intel compatibility (~$200 price point); deployment tends to be relatively sparse.
- Others focus on the value of smaller ($50) nodes.

Slide 3: The Whole Small perfNode Thing
- perfSONAR developer fork for small devices: https://github.com/perfsonar/project/wiki/perfSONAR-EndpointNode-Project ($150 - $250 price range, focus on Intel architecture)
- RNP Brasil: MonIPE, http://monipe.rnp.br/ (the same type of ARM nodes as our $50 price range)
- UH SWARM: our thing (BeagleBones, Raspberry Pis, etc.; ARM, ~$50)

Slide 4: The Swarm
- Wrote a paragraph into our CC-NIE campus networking proposal about making use of the recent availability of ~$50 computers to "sense" the network, using elements of perfSONAR.
- That funded a project to deploy 100 nodes on one campus over 2 years, exploiting the ~$50 price point to deploy many nodes on campus as a dense mesh.

Slide 5: Goals/Challenges
- Finding nodes to buy in the face of market exhaustion
- Getting node-deployment work-flow down to nil
- Getting recoveries of off-line nodes to a minimum
- Tracking assets and reliability, generating metrics
- Evaluating capabilities of the whole set-up
- Developing a test program for many nodes
- Slicing/dicing data to see what it has to tell us
- Developing visualizations and distillations to put tools in the hands of network maintainers, merging into the pS Toolkit

Slide 6: Devices We Have
- Raspberry Pi: famous, $50, medium performance, file system on SD card, 100 Mb Ethernet, USB 2.0
- BeagleBone Black: $50, more performance, file system on internal flash and/or SD card, 100 Mb Ethernet, USB 2.0
- Honorable mention, CuBox i4: $147, more performance, file system on SD, GigE, WiFi, USB 2.0
- MiraBox: $149, most performance, file system on SD, dual GigE, WiFi, USB 3.0

Slide 7: Reliability
- Raspberry Pi (July 2014): UH ITS owns 47; 1 has failed; 22 SD-card hard failures; 10 file-system failures
- BeagleBone Black Rev. A/C (December 2013/April 2015): UH ITS owns 60; 1 has corrupted firmware; of the nodes in production, one had to be power-cycled, once
- CuBox: one deployed, 6 months of service, zero problems (using the SD card from the OEM)
- MiraBox: promising, dual GigE (~$150), wimpy kernel

Slide 8: SD Cards
- Dane-Elec 8 GB Class 4: 10 cards, 2 failures in light duty
- SanDisk Ultra 8 GB Class 10: 10 cards, 0 hard failures, 3 file systems corrupted in 42k card-hours
- Kingston 8 GB Class 10: 10 cards, 0 hard failures, 7 file systems corrupted in 42k card-hours
- Kingston 4 GB Class 4: 20 hard failures in less than 20k card-hours (100% failed across 6 weeks, < 1,000-hour MTBF)
- SanDisk Ultra 8 GB Class 10: most recent batch of replacements
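Not from the slides: a minimal sketch of the arithmetic behind the Kingston 4 GB MTBF figure above, assuming the cards ran continuously. Twenty cards that all failed within roughly six weeks give about 20,000 device-hours and 20 failures, which is how an MTBF under 1,000 hours falls out. The helper function is illustrative only.

```python
def mtbf_hours(device_count, failures, weeks_in_service):
    """Crude MTBF estimate: total powered-on device-hours divided by observed failures."""
    hours_per_week = 7 * 24
    device_hours = device_count * weeks_in_service * hours_per_week
    return device_hours / failures

# Kingston 4 GB Class 4 figures from the SD Cards slide:
# 20 cards in continuous duty, all 20 failed within about 6 weeks.
print(mtbf_hours(device_count=20, failures=20, weeks_in_service=6))
# ~1008 hours as an upper bound; because every card failed at or before the
# six-week mark, the actual MTBF is below 1,000 hours, as the slide states.
```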
Slide 9: Year 1
- Tried 10 BeagleBones, liked them, and a few Raspberry Pis.
- The market vacuum around the release of BBB Rev. C made BBBs impossible to obtain, so we bought 43 Raspberries.
- Although we are going with the BeagleBone Black for the completion, we could make Raspberries work if necessary.
- Bought 2 Dell rack servers as test facilitators and data archives.

Slide 10: 2nd Year Completion
- 50 BeagleBone Black Rev. C (4 GB internal flash):
  - BBB internal flash is more reliable than SD
  - Internal flash plus SD card enables separating system and data partitions
  - Better 100 Mb Ethernet performance
- 5 Raspberry Pi 2 Model B
- As the number deployed approaches 100, we will be placing nodes in new/special roles.
- Correlating topology from Netdot and MRTG graphs for context

Slide 11: Management
- Puppet / The Foreman: https://puppetlabs.com/ , http://theforeman.org/
- Easy to push changes and updates out to the swarm.
- Also easy to push errors out to the swarm and end up needing 50 SSH sessions.
- Work-flow: try to minimize per-node actions and attended setup.
  - RPi: ua-netinstall with tweaks for Puppetization
  - BBB: custom SD card that auto-images the internal flash
- Make individual nodes as interchangeable as possible (if you have a choice, use one type of device).

Slide 12: Characteristics of Dense Sensor Deployment Within an Enterprise
- A "sensor" is less complicated than a perfSONAR Toolkit node; a central perfSONAR buoy/MA orchestrates.
- Having many observations makes the loss of a single one less important.
- You can correlate topology and test results to "triangulate" on the source of a problem.
- It takes planning to avoid affecting user traffic; the strategy is to "be" user traffic.
- The pS Toolkit as built isn't really made for 100 nodes.

Slide 13: Test Programs: powstream (owamp)
- powstream puts 10 packets per second on the wire, 24 hours a day (there has been discussion about increasing the rate).
- To some extent, apparently stochastic/probabilistic loss resembles stochastic/probabilistic loss at much higher rates; in other words, the probabilistic loss that powstream encounters is probably the minimum of what a throughput test will encounter.

Slide 14: Sidebar: Regular Global perfSONAR powstream
- Log-scaled loss mapped to a color gradient: an early idea about how to unify many graphs in front of your cerebral cortex.
- Black = 0% loss; green -> yellow -> red gradient = low -> medium -> higher loss; gray = no data.
- Log scaling avoids hiding low loss.
- Time runs left to right; each 10-pixel row is one path.
- In our campus network, everything was always black (no appreciable loss).

Slide 15: Test Programs: powstream (owamp)
- powstream runs from the pS Toolkit node to/from each sensor node.
- Really, really, really boring at first glance: all loss appears to be about zero.
- There are always one or two paths losing a packet per day (1 in 864,000).
- Standard deviation in latency is somewhat interesting; it may reflect queuing, and flares in latency standard deviation may precede loss events.
- Longitudinal analysis reveals damaging loss rates that would otherwise be invisible.
- Higher packet rates might expose low loss probabilities in a shorter time.

[Figure slides: powstream results, 30 nodes, in/out.]
[Slide reproduced from Mathis, Semke, Mahdavi, and Ott, "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", ACM SIGCOMM Computer Communication Review, Vol. 27, No. 3, July 1997; used with permission.]

Slide 19: Speed Limits You Can't See
For a 45-millisecond RTT, the typical minimum to get onto the continental US from Hawaii (both limit columns assume 45 ms RTT):

  Loss rate   powstream packets lost/day (10 pps)   TCP AIMD coastal limit @1460 MSS (Mbit/s)   TCP AIMD coastal limit @8960 MSS (Mbit/s)
  1.82E-05    15.75                                 42.56                                       261.18
  2.25E-06    1.94                                  121.11                                      743.23
  1.87E-06    1.62                                  132.76                                      814.72
  9.38E-07    0.81                                  187.58                                      1151.16
  6.05E-07    0.52                                  233.55                                      1433.28
  5.93E-07    0.51                                  236.03                                      1448.52
  3.35E-07    0.29                                  314.03                                      1927.21
  2.51E-07    0.22                                  362.49                                      2224.57
  1.74E-07    0.15                                  435.64                                      2673.49
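The limit columns above follow from the Mathis et al. macroscopic model, rate < (MSS/RTT) * C / sqrt(p). The slide does not state the constant used; C = 0.7, the conservative end of the 0.7 to 1.3 range discussed in the paper, reproduces the printed numbers, so the sketch below assumes that value.

```python
import math

def mathis_limit_mbps(mss_bytes, rtt_s, loss_prob, c=0.7):
    """Macroscopic TCP limit (Mathis et al., 1997): rate < (MSS/RTT) * C / sqrt(p), in Mbit/s."""
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_prob) / 1e6

def powstream_losses_per_day(loss_prob, packets_per_second=10):
    """Expected powstream packets lost per day at the given send rate."""
    return loss_prob * packets_per_second * 86400

# First row of the Slide 19 table: observed loss probability 1.82e-5, 45 ms RTT.
p = 1.82e-5
print(powstream_losses_per_day(p))          # ~15.7 packets/day
print(mathis_limit_mbps(1460, 0.045, p))    # ~42.6 Mbit/s at a 1460-byte MSS
print(mathis_limit_mbps(8960, 0.045, p))    # ~261 Mbit/s at an 8960-byte MSS
```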
Slide 20: Test Programs: 50-Node Full-Mesh TCP Throughput (<= 100 Mbps)
- RPi and BBB throughput tests resemble real-life user flows, unlike a high-performance iperf tester, which "punches the network in the face".
- I run a 50x50 full-mesh iperf matrix (2,450 tests) in about 7 hours, using 5-second tests.
- Full-mesh traceroute is collected concurrently.
- By scoring every hop encountered on the average performance of the paths it appears in, a "per-hop confidence" can be derived (see the scoring sketch after Slide 26).
- Using multi-rate UDP versus TCP is worth investigating.

Slide 22: The Matrix
[Figure: full-mesh iperf3 throughput matrix; one axis labeled "Sources".]
- Cut-out view of iperf3 tests to/from a chosen node: this row/column represents all tests to/from that chosen node.
- Leaves one wondering what the correlation is between the pink squares showing retransmissions.

Slide 23: Correlating Full-Mesh Throughput and Traceroute Results for Fault Isolation
[Figure: graph of per-hop "confidence", with colored links where retransmissions were observed (names/addresses obfuscated).]
- The graph shows hops involved in inbound throughput testing between a chosen node and all partners.
- Each oval represents an IP interface as reported in traceroute output.
- Graph rendered from test data with GraphViz (graphviz.org).

Slide 25: Data Archiving
- perfSONAR MA
- Exposing some ways in which MA handling of long-term, diverse data could be optimized
- Correlating such things as early/late "bathtub curve" failures per equipment life cycle (see the Wikipedia article on the bathtub curve)
- Trending probabilistic loss by months/years
- Etc.

Slide 26: Ongoing perfSONAR Toolkit Integration
- Not so much new development as making some pieces fit together.
- Correlation of other sources to zero in on a fault: NetDot, flows/MRTG.
- Ancillary programs: log collection (honeypot-ish info), name-resolution tests, v6/v4 precedence.
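The slides describe scoring every hop on the average performance of the paths it appears in, but do not spell out the computation. The sketch below is one plausible reading, assuming the achieved iperf3 throughput is averaged over every tested path whose traceroute contains the hop, so hops seen only on slow paths sink to the bottom. The data layout and addresses are made up for illustration.

```python
from collections import defaultdict

def per_hop_confidence(results):
    """results: iterable of (hops, throughput_mbps) pairs, where `hops` is the list of
    IP interfaces traceroute reported for a source/destination pair and
    `throughput_mbps` is the iperf3 result for the same pair.
    Returns each hop's average throughput over the paths it appears in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hops, mbps in results:
        for hop in hops:
            totals[hop] += mbps
            counts[hop] += 1
    return {hop: totals[hop] / counts[hop] for hop in totals}

# Toy example with made-up addresses: hops seen only on the low-throughput
# paths (10.0.3.1, 10.0.4.1) end up with the lowest confidence scores.
results = [
    (["10.0.0.1", "10.0.1.1"], 94.0),
    (["10.0.0.1", "10.0.2.1"], 93.5),
    (["10.0.0.1", "10.0.3.1"], 11.2),
    (["10.0.0.1", "10.0.3.1", "10.0.4.1"], 9.8),
]
for hop, score in sorted(per_hop_confidence(results).items(), key=lambda kv: kv[1]):
    print(f"{hop}: {score:.1f} Mbit/s")
```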