New Directions in Traffic Measurement and Accounting
Focusing on the Elephants, Ignoring the Mice
Cristian Estan and George Varghese
University of California, San Diego
SIGCOMM 2002

Talk outline
• Problem definition
• Sample and hold
• Multistage filters
• Validation, measurements
• Conclusions

Traffic analysis today
[Figure: Fast link → Router (measurement module) → sampled packets, large raw data → Workstation (collection and analysis software, offline analysis) → concise analysis results]

Our research agenda
[Figure: Fast link → Router (measurement module, real-time analysis) → concise analysis results]
• Is it doable?
• Is it better?

What is traffic analysis used for?
• Network planning: need to know traffic between pairs of networks (traffic matrix)
• Accounting: usage based billing
• Detecting DoS attacks: flood attacks
• Application characterization: breaking up the traffic based on port numbers
• …

Common abstractions
• Packets are grouped together into streams based on header fields
    Traffic matrix – by source and destination AS
    DoS attacks – by destination IP address
• Measuring large streams (this paper)
• Estimating the number of active streams (poster)
• …

Why is measuring streams hard?
• Cheap memories (DRAM) are too slow to count all packets
• Fast memories (SRAM) are too small to keep counters for all streams
• Opportunity: elephants matter, mice don’t
• Problem: usually we don’t know in advance which streams are large

Problem definition
• Given a fixed definition for streams, measure large streams accurately
    Large = above 1% of link capacity over a 1 minute interval
• Assumptions
    Mice don’t matter
    Accuracy of results important

How does sample and hold work?
[Animation, first frame: a packet is sampled and an entry for its stream is inserted into stream memory (stream1: 1)]
[Animation, continued: a later packet of stream1 updates the existing entry (stream1: 2); then a packet of another stream is sampled and a new entry is inserted (stream1: 2, stream2: 1)]

Why is sample & hold better?
[Figure: panels "Sample and hold" and "Ordinary sampling", with the uncertain portion of each estimate labeled "uncertainty"]

How much better is it?
• Comparing the relative error of the estimate for a stream at 1/F of the link bandwidth
• Memory limited to M entries

Measure            Ordinary sampling    Sample and hold
Error              √(F/M)               F/M
Memory accesses    1/S                  1

Multistage filters
Characteristics:
• No large stream is ever omitted
• Very few entries are used by small streams
• Better performance but implementation and tuning is more complex

How do multistage filters work?
[Animation: the stream of each packet is hashed (Hash(Pink), Hash(Green)) to a counter in an array of counters; collisions are OK; when a counter reaches the threshold, the stream is inserted into stream memory (stream1: 1, later also stream2: 1)]
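The filtering step in the animation above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `MultistageFilter`, its parameters, the salted BLAKE2 hash used to get one hash function per stage, and the choice to count bytes and to skip the filter for streams already in stream memory are all hypothetical.

```python
import hashlib


class MultistageFilter:
    """Sketch of a parallel multistage filter (names and hashing are assumptions)."""

    def __init__(self, num_stages, counters_per_stage, threshold):
        self.threshold = threshold
        # One array of counters per stage.
        self.stages = [[0] * counters_per_stage for _ in range(num_stages)]
        # stream id -> bytes counted since the stream was inserted
        self.stream_memory = {}

    def _index(self, stage, stream_id):
        # Assumed hash: salt the stream id with the stage number so that
        # each stage uses a different hash function.
        data = f"{stage}:{stream_id}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % len(self.stages[stage])

    def process(self, stream_id, size):
        # Streams that already have an entry are counted exactly.
        if stream_id in self.stream_memory:
            self.stream_memory[stream_id] += size
            return
        # Otherwise add the packet to one counter in every stage.
        passed = True
        for stage, counters in enumerate(self.stages):
            i = self._index(stage, stream_id)
            counters[i] += size
            if counters[i] < self.threshold:
                passed = False
        # Insert only if the counters in all stages reached the threshold.
        if passed:
            self.stream_memory[stream_id] = size


mf = MultistageFilter(num_stages=4, counters_per_stage=64, threshold=1000)
for n in range(90):
    mf.process(f"mouse{n}", 10)      # 900 bytes of small streams in total
for _ in range(20):
    mf.process("elephant", 100)      # one stream of 2000 bytes
print(sorted(mf.stream_memory))
```

In this run the small streams together stay below the threshold, so none of them can pass, while the large stream must pass because each of its counters holds at least its own traffic. Collisions can only push counters up, which is why a large stream is never omitted.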
[Figure, continued: the filter with two stages of counters (Stage 1, Stage 2) in front of stream memory (stream1: 1)]

Conservative update
[Animation in three frames: counters in each stage, with gray = all prior packets; some counter increments are labeled "Redundant"]

Validation
• Analytical evaluation
• Comparison of analytical results to measured performance
• Comparison of full measurement devices using different algorithms

On traces, algorithms much better than analysis predicts
[Figure: percentage of small streams passing the filter (log scale) versus number of stages; curves: Theory, Zipf, Actual, Conservative update]

Measurement results
• Setup: OC48 trace, 100,000 TCP flows, 5 second intervals
    ordinary sampling - unlimited memory, sampling 1 in 16
    our algorithms - 1Mbit, adapting parameters to keep it around 90% full
• Large streams (above 0.1%):
    ordinary sampling has an error of 9%
    sample and hold 0.075%, multistage filter 0.037%

Our contributions
• Abstraction: Real-time packet analysis abstractions can help systematize router implementations. While the notion of elephants and mice is inherent in earlier work, we abstracted measurement of large streams - it can be used by many applications.

Our contributions (2)
• Algorithms: Sample and hold is a simple and efficient algorithm for identifying and measuring large streams. Multistage filters with conservative update perform better but are more complex. Both can be used for real-time as well as offline analysis.

Our contributions (3)
• Validation:
    Theoretical results that make no assumptions on traffic distribution
    Simulations on traces are orders of magnitude better
    Preliminary hardware design (John Huber) indicates feasibility at OC192 speeds

Thank you!
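As a supplement to the conservative update slides, the rule can be sketched on the counters that one packet hashes to, one per stage. This follows the usual description of conservative update (raise the smallest counter by the packet size and only lift the others up to that new value if they are below it); the function names and the example values are illustrative assumptions.

```python
def plain_update(values, size):
    """Ordinary update: every stage counter grows by the packet size."""
    return [v + size for v in values]


def conservative_update(values, size):
    """Conservative update: counters already above the new minimum stay put."""
    # The smallest counter bounds how much traffic the stream can have sent.
    new_min = min(values) + size
    # Increments beyond new_min add no information, so they are skipped.
    return [max(v, new_min) for v in values]


counters = [30, 50, 100]                  # assumed values, one per stage
print(plain_update(counters, 10))         # [40, 60, 110]
print(conservative_update(counters, 10))  # [40, 50, 100]
```

The increments that the plain update applies to the second and third counters are the kind of update the slides label as redundant: they make it easier for small streams sharing those counters to pass the filter without improving the bound on this stream.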
Optimizations to sample and hold
• Preserving entries: Keep large entries from one measurement interval to the next
    Reduces error by a factor of 6
• Early removal: Quickly remove entries that do not accumulate much traffic
    Reduces memory usage by 25%

Optimizations to multistage filters
• Preserving entries: Keep large entries from one measurement interval to the next
    Reduces error by a factor of 5
• Shielding: Large streams identified in previous intervals don’t pass through the filter
    Reduces memory usage by up to 70%
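To make sample and hold and the "preserving entries" optimization concrete, here is a minimal Python sketch. It rests on several assumptions that the slides do not specify: packets are sampled with a fixed per-packet probability, counters hold bytes, and the class name `SampleAndHold`, the method `end_interval` and its `keep_threshold` parameter are hypothetical.

```python
import random


class SampleAndHold:
    """Sketch of sample and hold with entry preservation (assumed details)."""

    def __init__(self, sample_prob, seed=None):
        self.sample_prob = sample_prob    # assumed per-packet sampling probability
        self.rng = random.Random(seed)
        self.memory = {}                  # stream id -> bytes counted

    def process(self, stream_id, size):
        if stream_id in self.memory:
            # Hold: once a stream has an entry, every later packet is counted.
            self.memory[stream_id] += size
        elif self.rng.random() < self.sample_prob:
            # Sample: create an entry starting from this packet.
            self.memory[stream_id] = size

    def end_interval(self, keep_threshold):
        """Report this interval and preserve large entries for the next one."""
        report = dict(self.memory)
        # Preserved entries restart at zero, so a large stream is counted
        # from the first packet of the next interval.
        self.memory = {s: 0 for s, c in report.items() if c >= keep_threshold}
        return report


sh = SampleAndHold(sample_prob=0.05, seed=7)
for _ in range(1000):
    sh.process("elephant", 100)
    sh.process(f"mouse{sh.rng.randrange(500)}", 100)
report = sh.end_interval(keep_threshold=50_000)
print(report.get("elephant"), sorted(sh.memory))
```

Because preserved streams already have an entry when the next interval starts, they no longer lose the traffic sent before their first sampled packet, which is the source of error that this optimization targets.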