Measuring a (MapReduce) Data Center
Srikanth Kandula, Sudipta Sengupta, Albert Greenberg, Parveen Patel, Ronnie Chaiken

Typical Data Center Network
[Figure: tree topology; IP routers at the root, aggregation switches in the middle, top-of-rack (ToR) switches and servers at the leaves]
• ToR switch: 24 or 48 ports, 1 Gbps to each server, 10 Gbps up; ~$7K
• Aggregation switch: modular chassis + up to 10 blades, >140 10G ports; $150K-$200K
• Less bandwidth up the hierarchy
• Clunky routing; both are targets of recent proposals, e.g., VL2, BCube, FatTree, Portland, DCell

Goal
What does traffic in a data center look like?
• A realistic model of data center traffic
• Compare proposals
How to measure a data center?
• (Macro) Who talks to whom? Congestion and its impact
• (Micro) Flow details: sizes, durations, inter-arrivals, flux

How to Measure?
[Figure: measurement pipeline; servers under ToR and aggregation switches and a router, feeding a distributed FS + MapReduce scripts]
1. SNMP reports: per-port in/out octet counters, sampled every few minutes; miss server- or flow-level info (a counter-to-utilization sketch follows the deck)
2. Packet traces: not native on most switches; hard to set up (port spans)
3. Sampled NetFlow: trades CPU overhead on the switch for detailed traces
Instead, use the end hosts to share the load; they are auto-managed already.
Measured 1,500 servers for several months.

Who Talks To Whom?
[Figure: server-to-server traffic heat map, "from" vs. "to"; color scale spans 0, 0.2 Kbps, 20 Kbps, 3 Mbps, 0.4 Gbps, 1 Gbps]
Two patterns dominate:
• Most of the communication happens within racks
• Scatter, gather

Flows are small, short-lived, and turn over quickly
• 80% of bytes are in flows < 200 MB
• 50% of bytes are in flows < 25 s
• Median flow inter-arrival at a ToR = 10⁻² s
(a byte-weighted CDF sketch follows the deck)
which leads to:
• Traffic engineering schemes should react faster; there are few elephants
• Traffic is localized: additional bandwidth alleviates hotspots

Congestion and Its Impact
• Are links busy? Often!
[Figure: CDF of the contiguous duration (seconds) of >70% link utilization]
• Who are the culprits? Apps (Extract, Reduce)
• Are apps impacted? Marginally

Measurement Alternatives
Link utilizations (e.g., from SNMP) → server-to-server traffic matrix, via tomography
+ makes do with easier-to-measure data
- under-constrained problem; heuristics:
  a) gravity (a sketch follows the deck)
  b) max sparse
  c) tomography + job information
[Figure: percentile rank of tomogravity estimation error for 75% of volume, 0% to 800%; curves for Tomogravity, Max Sparse, and Tomography + job info]

A First Look at Traffic in a (MapReduce) Data Center
Some insights:
• Traffic stays mostly within high-bandwidth regions
• Flows are small, short-lived, and turn over quickly
• The network is highly utilized often, with moderate impact on apps
Measuring at the end hosts is feasible, and necessary (?)
→ a model for data center traffic
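
The SNMP option on the "How to Measure?" slide reduces to periodic per-port octet counters. A minimal sketch of the arithmetic involved, assuming a 64-bit counter, a 300 s poll interval, and a 1 Gbps port; the counter values are invented, not figures from the study:

```python
def utilization(prev_octets, curr_octets, interval_s, link_bps,
                counter_bits=64):
    """Fraction of link capacity used between two counter polls."""
    # Modular subtraction handles the counter wrapping around.
    delta_octets = (curr_octets - prev_octets) % (1 << counter_bits)
    return (delta_octets * 8) / (interval_s * link_bps)

# Two polls of an octet counter, 300 s apart, on a 1 Gbps server port.
u = utilization(prev_octets=10_000_000_000,
                curr_octets=14_500_000_000,
                interval_s=300,
                link_bps=1_000_000_000)
print(f"utilization = {u:.1%}")  # 12.0%
```

This also shows the limitation the slide flags: a multi-minute average per port says nothing about which servers or flows produced the bytes.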
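
Statistics like "80% of bytes are in flows < 200 MB" are byte-weighted, not flow-weighted. A small sketch of the distinction, with made-up flow sizes:

```python
def bytes_below(flow_sizes, threshold):
    """Fraction of all bytes carried by flows smaller than threshold."""
    return sum(s for s in flow_sizes if s < threshold) / sum(flow_sizes)

def flows_below(flow_sizes, threshold):
    """Fraction of flows smaller than threshold."""
    return sum(s < threshold for s in flow_sizes) / len(flow_sizes)

sizes = [10e6, 20e6, 50e6, 150e6, 500e6, 2e9]  # invented sizes, in bytes
print(flows_below(sizes, 200e6))  # ~0.67: most *flows* are small...
print(bytes_below(sizes, 200e6))  # ~0.08: ...yet may carry few bytes
```

In the measured cluster even the byte-weighted distribution skews small, which is why the deck argues traffic engineering must react fast: there are few long-lived elephants to pin down.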
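
Of the tomography heuristics listed under "Measurement Alternatives", gravity is the simplest: given only per-server totals, assume server i sends to server j in proportion to j's share of all received bytes. A minimal sketch with invented totals (the deck does not give this code, only names the heuristic):

```python
def gravity_tm(sent, received):
    """Gravity estimate: T[i][j] = sent[i] * received[j] / total."""
    total = sum(received)  # equals sum(sent) if counters are consistent
    return [[s * r / total for r in received] for s in sent]

# Bytes sent / received by three servers over one interval (invented).
sent     = [300.0, 100.0, 600.0]
received = [500.0, 400.0, 100.0]
for row in gravity_tm(sent, received):
    print([round(x) for x in row])
# [150, 120, 30]
# [50, 40, 10]
# [300, 240, 60]
```

The estimate matches every row and column sum but spreads traffic indifferently across the matrix; rack-local, scatter/gather traffic of the kind the deck measures violates that assumption, consistent with the large tomogravity errors in the final figure.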