c-Through: Part-time Optics in Data Centers
Guohui Wang, David G. Andersen, Michael Kaminsky, Konstantina Papagiannaki
Defences: Hyma Chilukuri, Chunjing Xiao

Current solutions for increasing data center network bandwidth
The basic problem: traditional tree-structured Ethernet networks are heavily over-subscribed when large amounts of data are shuffled across server racks.
Existing alternatives: FatTree, BCube
1. Hard to construct
2. Hard to expand

An alternative: hybrid packet/circuit switched data center network
Goal of this work:
– Feasibility: software design that enables efficient use of optical circuits
– Applicability: application performance over a hybrid network

Optical Circuit Switch
[Figure: optical circuit switch. Labels: Input 1, Output 1, Output 2, Lenses, Fixed Mirror, Mirrors on Motors, Rotate Mirror, Glass Fiber Bundle. Source: Nathan Farrington, SIGCOMM, 2010-09-02]
• Does not decode packets
• Needs time to reconfigure

Optical circuit switching vs. electrical packet switching

                        Electrical packet switching    Optical circuit switching
Switching technology    Store and forward              Circuit switching
Switching capacity      16x40Gbps at high end,         320x100Gbps on market,
                        e.g. Cisco CRS-1               e.g. Calient FiberConnect
Switching time          Packet granularity             Less than 10ms
Switching traffic       For bursty, uniform traffic    For stable, pair-wise traffic

Hybrid packet/circuit switched network architecture
Electrical packet-switched network for low-latency delivery
Optical circuit-switched network for high-capacity transfer
Optical paths are provisioned rack-to-rack
– A simple and cost-effective choice
– Aggregate traffic on a per-rack basis to better utilize optical circuits

Design requirements
[Figure label: Traffic demands]
Control plane:
– Traffic demand estimation
– Optical circuit configuration
Data plane:
– Dynamic traffic de-multiplexing
– Optimizing circuit utilization (optional)

c-Through (a specific design)
No modification to applications and switches
Leverage end hosts for traffic management
Centralized control for circuit configuration

c-Through - traffic demand estimation and traffic batching
[Figure labels: Applications, Socket buffers, Per-rack traffic demand vector]
Transparent to applications.
Accomplishes two requirements:
– Traffic demand estimation
– Pre-batch data to improve optical circuit utilization

c-Through - optical circuit configuration
[Figure labels: Traffic demand, Controller, configuration]
Use Edmonds’ algorithm to compute the optimal configuration
Many ways to reduce the control traffic overhead

c-Through - traffic de-multiplexing
[Figure labels: traffic, circuit configuration, Traffic de-multiplexer, VLAN #1, VLAN #2]
VLAN-based network isolation:
– No need to modify switches
– Avoid the instability caused by circuit reconfiguration
Traffic control on hosts:
– Controller informs hosts about the circuit configuration
– End hosts tag packets accordingly

Testbed setup
16 servers with 1Gbps NICs
Emulate a hybrid network on a 48-port Ethernet switch
[Figure labels: Ethernet switch, 100Mbps links, 4Gbps links, Emulated optical circuit switch]
Optical circuit emulation
– Optical paths are available only when hosts are notified
– During reconfiguration, no host can use optical paths
– 10 ms reconfiguration delay

Evaluation
Basic system performance:
– Can TCP exploit dynamic bandwidth quickly? Yes
– Does traffic control on servers bring significant overhead? No
– Does buffering unfairly increase the delay of small flows? No
Application performance:
– Bulk transfer (VM migration)? Yes
– Loosely synchronized all-to-all communication (MapReduce)? Yes
– Tightly synchronized all-to-all communication (MPI-FFT)? Yes

TCP can exploit dynamic bandwidth quickly
Throughput reaches its peak within 10 ms

Traffic control on servers brings little overhead
Although the optical management system adds an output scheduler in the server kernel, it does not significantly affect TCP or UDP throughput.

Application performance
Three different benchmark applications

VM migration application (1)

VM migration application (2)

MapReduce (1)

MapReduce (2)
Yahoo Gridmix benchmark
3 runs of 100 mixed jobs such as web query, web scan and sorting
200GB of uncompressed data, 50GB of compressed data

MPI FFT (1)

MPI FFT (2)

Summary
Hybrid packet/circuit switched data center network
c-Through demonstrates its feasibility
Good performance even for applications with all-to-all traffic
Future directions to explore:
– The scaling property of hybrid data center networks
– Making applications circuit-aware
– Power-efficient data centers with optical circuits

Pictures from Internet websites.
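
The c-Through slides describe a control loop in three steps: estimate per-rack demand from socket buffers, let a central controller pick rack-to-rack circuits, and have hosts tag packets for the optical or electrical VLAN. The Python sketch below only illustrates that flow and is not the authors' implementation. All function names, the VLAN IDs, and the input format are hypothetical. The sketch assumes a circuit serves both directions, so demand in both directions is summed. It also swaps in an exhaustive search for Edmonds' algorithm: the search finds the same maximum-weight matching but is practical only for a handful of racks.

```python
from functools import lru_cache

# Hypothetical VLAN IDs; the slides only name them VLAN #1 and VLAN #2.
ELECTRICAL_VLAN = 1
OPTICAL_VLAN = 2


def rack_demand(socket_backlog, host_rack):
    """Aggregate buffered bytes per host pair into per-rack-pair demand.

    socket_backlog: dict (src_host, dst_host) -> bytes waiting in socket buffers
    host_rack:      dict host -> integer rack id
    Assumption: a circuit is bidirectional, so both directions are summed.
    """
    demand = {}
    for (src, dst), nbytes in socket_backlog.items():
        a, b = host_rack[src], host_rack[dst]
        if a == b:
            continue  # intra-rack traffic never crosses the optical switch
        key = (min(a, b), max(a, b))
        demand[key] = demand.get(key, 0) + nbytes
    return demand


def best_circuits(racks, demand):
    """Exact maximum-weight matching by exhaustive search over rack subsets.

    Stand-in for Edmonds' algorithm; cost grows exponentially with rack count.
    Returns (total_demand_served, tuple_of_rack_pairs).
    """
    racks = sorted(racks)
    n = len(racks)

    @lru_cache(maxsize=None)
    def solve(free_mask):
        if free_mask == 0:
            return 0, ()
        i = (free_mask & -free_mask).bit_length() - 1  # lowest unmatched rack
        rest = free_mask & ~(1 << i)
        best = solve(rest)  # option: rack i gets no circuit this round
        for j in range(i + 1, n):
            if not rest & (1 << j):
                continue
            weight = demand.get((racks[i], racks[j]), 0)
            if weight == 0:
                continue  # no point spending a circuit on zero demand
            sub_weight, sub_pairs = solve(rest & ~(1 << j))
            total = sub_weight + weight
            if total > best[0]:
                best = (total, ((racks[i], racks[j]),) + sub_pairs)
        return best

    return solve((1 << n) - 1)


def vlan_for(src_rack, dst_rack, circuits):
    """Pick the VLAN tag a host would apply for traffic to dst_rack."""
    pair = (min(src_rack, dst_rack), max(src_rack, dst_rack))
    return OPTICAL_VLAN if pair in circuits else ELECTRICAL_VLAN


if __name__ == "__main__":
    hosts = {"h0": 0, "h1": 1, "h2": 2, "h3": 3}
    backlog = {("h0", "h1"): 100, ("h1", "h0"): 50, ("h2", "h3"): 80,
               ("h0", "h2"): 200, ("h1", "h3"): 10}
    total, circuits = best_circuits(hosts.values(), rack_demand(backlog, hosts))
    print(total, circuits, vlan_for(0, 1, circuits), vlan_for(0, 2, circuits))
```

With the sample backlog, the search connects racks 0-1 and 2-3, serving 230 bytes of demand. A greedy choice would take the single heaviest pair 0-2 first and end up serving only 210, which is why a matching algorithm is used rather than picking the largest demands one at a time.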