Connection Conditioning: Architecture-Independent Support for Simple, Robust Servers
KyoungSoo Park, Vivek S. Pai
Princeton University

Server Performance Is Great
- Most Web servers generally perform well
- Many contributors
  - Moore’s law – processors & memory
  - OS – timing wheels, hashed PCB, zero copy
  - Load balancers – clusters, data centers
- Poor performance outside server software
  - CGIs, database access, network access, etc.

What About Research In Server Software Architectures?
- 10 years of progress
  - NCSA/Apache (just processes)
  - Events (Harvest, Zeus)
  - Events & helpers (Flash)
  - Events & threads (JAWS, Haboob)
  - Events & threads & compilers (Knot)
- Has it mattered?

Netcraft Web Server Survey
  [Figure: server share chart; series: Apache, NCSA, Microsoft]

That’s Not Quite Fair
- Robustness/scalability work
  - Handling large #’s connections
  - Tolerating long delays
  - Detecting/mitigating attacks
- Generally in context of event-driven servers
- Users love multi-process servers
  - Easier to add features, modules
  - Apache very successful

Should We Care?
- Keep going this route
  - Becomes a boutique research area
  - Discover/invent the next Apache
  - Some servers still benefit
- Go with the flow
  - Bring research benefits to Apache
  - Focus on what matters for most people
  - New constraints, new research

Why Push Comes to Shove
- Walmart Linux machine: $400 total
  - Microtel AMD 1.5 GHz
  - 1 GB memory
  - Intel GigE
- HP DL320: List under $3000
  - Intel 2.8 GHz dual-core
  - 2 GB memory
  - Built-in GigE
- 100 Mbps WAN: $30,000/month

New Approach
- Can we make servers simpler & robust-ier
  - Easier to program, defend, share
  - Possibly slower, but that’s OK
  - Programming-style (architecture) neutral
- Old idea: Unix pipes
  - Good for text processing
  - Bad for servers?

Connection Conditioning

Salient Features
- Filters are separate processes
  - Internally: threads, processes, events, ???
- Communicate via Unix domain sockets
  - Allows passing socket/request bundle
- Server sees TCP sockets
  - Responses via client socket
  - No outbound overheads
- Filters tied to protocol, not # clients

Why Filters?
- Reuse across apps/protocols
  - Beck attack: Apache 98, Flash 02, thttpd ??
- Another layer of defense
  - Works before application API (or even no API)
- Decoupled from application structure
  - Can codify best practices
  - Simplifies re-use
- But not a panacea

How Many Filters?
- In general, for most servers: four
  1. Manage connections – wait for request
     - Probably event-driven
  2. Separate multiple requests, re-present
     - Eases persistent connection use
  3. Reject malformed requests
  4. Prioritize

Connection Conditioning Library
- 9 functions, 2 “nonstandard”
  - Most are cc_accept, cc_read, etc.
  - Trivial to modify existing servers
- “cc_close” specifies local or global
- One new function, cc_createlsock
  - Offloads the socket/bind/listen process
  - Easier than doing it transparently
- Library is 89 lines

Modify or Start Fresh
- Modify existing servers
  - Apache: < 50 lines (of 6000+)
  - Flash: < 30 lines (of 2500+)
- New CC-aware server: 80 lines
- Filters
  - Framework: 152 lines
  - Connections: 98 lines
  - Persistence: 76 lines
  - Priority: 59 lines

CCServer Operation & Rationale
- Straightforward server
  - One request at a time
  - No caching
  - Open file, send it if small, else fork+send
  - Rely on filters for heavy lifting
- Model for simple servers
  - Small footprint environments
  - Sensors
- Not going to replace Apache

Performance Tests
- Every paper needs performance tests
- Single-file tests
  - Some idea of baseline performance
- File mix
  - File set from SpecWeb99
- Chained filters
  - Throughput & latency of multiple filters

Single File Test Request Rates
  [Figure: Requests/sec vs. File Size (KBytes); series: Flash, Apache, Haboob]

Single File Test Request Rate
  [Figure: Requests/sec vs. File Size (KBytes); series: Flash, CC-Flash, CCServer, Apache, Haboob, CC-Apache]

Single File Test Throughput
  [Figure: Bandwidth (Mbps) vs. File Size (KBytes); series: Flash, CCServer, CC-Flash, Apache, CC-Apache, Haboob]

File Mix Throughput (Microtel)
  [Figure: Throughput (Mbps) vs. Workload Data Set Size (100 MB, 500 MB, 1500 MB, 3000 MB); series: Flash, CC-Flash, CCServer, Apache, CC-Apache, Haboob]

File Mix Throughput (HP)
  [Figure: Throughput (Mbps) vs. Workload Data Set Size (500 MB, 1000 MB, 2500 MB, 4000 MB); series: Flash, CC-Flash, CCServer, Apache, CC-Apache, Haboob]

Chained Filters Latency
  [Figure: Latency (microsec) vs. # Filters in Chain (1 to 10); series: Microtel, HP Single, HP Dual; annotations: 94 us, 34 us]

Chained Filters Throughput
  [Figure: Throughput (Mbps) vs. # Filters in Chain (1 to 10); series: HP Dual, HP Single, Microtel]

Robustness Tests
- Incomplete connections
  - Client opens socket, but no request
  - Very low-rate DoS if server limits connections
- Quality of service
  - High-priority client mixed with low-priority
  - High-rate DoS, but filterable
- Persistent connection test
  - Connection persists, but no follow-up request
  - Low-rate DoS, easy to mask

Incomplete Connections
  [Figure: Requests/sec vs. # Incomplete Requests (0 to 30000); series: Flash, Haboob, Apache]

Incomplete Connections
  1. Much higher rate to DoS
  2. Brings event-driven benefits to Apache
  3. Implement policies across connections
  [Figure: Requests/sec vs. # Incomplete Requests (0 to 30000); series: CCServer, CC-Flash, CC-Apache, Flash, Haboob, Apache]

Quality of Service
  [Figure: Response Time (ms) vs. # of Clients (0 to 300); series: Apache, CC-Apache]

Idle Persistent Connections
  1. Persistent connections become cheap
  2. Easier to provide client benefits
  3. Lazy closing, better performance
  [Figure: Throughput (Mbps) vs. # Idle Persistent Connections (0 to 300); series: Apache, CC-Apache]

Summary: Connection Conditioning
- Applying Unix pipes to servers allows
  - Decompose processing into filters
  - Compose filters as needed
  - Design filters as you like
  - Protect existing & new servers
- At a modest performance cost
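
Sketch: passing a socket/request bundle between stages

The Salient Features slide says filters communicate via Unix domain sockets, pass a socket/request bundle, and let the server respond directly on the client socket. The slides do not show the signatures of the cc_* library calls, so the sketch below does not use them. It is a minimal illustration of the underlying mechanism (descriptor passing over a Unix domain socket) using Python's standard socket.send_fds and socket.recv_fds (Python 3.9+, Unix only). The names filter_forward, server_receive, read_request, and demo are hypothetical, and both stages run in one process here only to keep the example short.

```python
import socket


def read_request(sock, limit=4096):
    """Read until the blank line that ends a simple HTTP request header."""
    data = b""
    while b"\r\n\r\n" not in data and len(data) < limit:
        chunk = sock.recv(limit - len(data))
        if not chunk:
            break
        data += chunk
    return data


def filter_forward(chan, client_sock, request):
    """Filter side (hypothetical): pass the client socket plus its request."""
    # send_fds attaches the descriptor to the message as SCM_RIGHTS data.
    socket.send_fds(chan, [request], [client_sock.fileno()])
    # The receiver gets its own descriptor, so the filter can drop its copy.
    client_sock.close()


def server_receive(chan, max_request=4096):
    """Server side (hypothetical): get the request and the passed socket."""
    request, fds, _flags, _addr = socket.recv_fds(chan, max_request, 1)
    client_sock = socket.socket(fileno=fds[0])  # wrap the fd as a socket
    return request, client_sock


def demo():
    # Unix domain channel standing in for the filter-to-server link.
    filter_end, server_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    # A client connects and sends one request.
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")

    # The filter accepts, waits for the full request, then hands both on.
    accepted, _peer = listener.accept()
    filter_forward(filter_end, accepted, read_request(accepted))

    # The server answers on the client's TCP socket, bypassing the filter.
    request_seen, conn = server_receive(server_end)
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
    conn.close()

    reply = b""
    while chunk := client.recv(4096):
        reply += chunk

    for s in (client, listener, filter_end, server_end):
        s.close()
    return request_seen, reply


if __name__ == "__main__":
    print(demo())
```

Because the response is written on the passed descriptor, the filter sits only on the inbound path, which is one way to read the "No outbound overheads" bullet. A real deployment would run each filter as a separate process, as the slides describe.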