A Flexible and Efficient API for a Customizable Proxy Cache

Vivek S. Pai, Alan L. Cox,
Vijay S. Pai, and Willy Zwaenepoel
iMimic Networking, Inc.
http://www.imimic.com
Motivation
More features moving into proxy caches
– The ubiquitous layer 7 device
– Filtering, reporting, CDN support, transformation
– Lots of this being done one-off, ad hoc
– Can’t know everything at deployment
Some approaches for generalization
– ICAP/OPES, proprietary mechanisms
– But design considerations shifting
Goal: new approach for modern environments
2
Contributions
Designed event-friendly proxy API
Implemented on iMimic DataReactor cache
Imposes negligible performance overhead
Demo modules
– High performance
– Low interference
3
Outline
Background
API Design
API Functions
Implementation and Performance
Conclusions
4
Proxy Cache Concepts
clients
WAN
proxy
cache
LAN
origin
servers
5
Why Program a Proxy?
It’s at the right point in network
– Sees all client-side and server-side HTTP traffic
– Can react to both LAN and WAN conditions
Already examines layer 7
Groundwork in place for value-adds
– Content filtering, access control, etc.
6
Enabling Technologies
Moore’s Law
– CPU speeds outstripping all other components
– Lots of cycles to burn…
Proxy software
– Increasing efficiency in managing connections,
disk storage, etc.
Commodity OS/hardware improvements
– No longer need specialized systems to run
efficient proxy caches
7
Commodity System Improvements
1997: Appliances 4x faster than software
running on a 2-processor UltraSparc
[Source: Danzig, “NetCache Architecture and
Deployment”]
1st NLANR cacheoff (April ’99): gap only 2.5x
– 600 req/sec (Peregrine) vs. 1500 (InfoLibria)
2nd cacheoff (Jan ’00): gap only 1.7x
– 1450 req/sec (iMimic) vs. 2400 (Compaq)
3rd cacheoff (Oct ’00): gap only 15%
– 2083 req/sec (Microsoft) vs. 2400 (Compaq)
4th cacheoff (Dec ’01): commodity system best
– Performance record: 2700 req/sec (Cintel/iMimic)
12
How free is the CPU?
Stratacache Dart-10, with Nokia phone
120 req/sec (7 Mbps) with 300 MHz CPU
– CPU mostly idle; performance disk-limited
13
Previous Customization Approaches
Write your own proxy or modify Squid
– Huge code, changes likely to conflict with updates
ICAP: TCP-based offload
– Proxy redirects requests/responses to a separate
server for modification
Filter-style processes
– Plugins where proxy designers anticipated a need
(e.g., content filtering)
Kernel modules
– Difficult programming model, but needed for
kernel-integrated proxies
15
Reasons for a New Approach
Scalability needed to > 10,000 flows
– Filter processes may not scale
Limitations of ICAP-style offloading
– Offloading small requests adds latency
– Need for separate ICAP server with own CPU
Programmers want flexibility
– Program in C using standard OS and libraries
– Avoid problems from later code conflicts
16
Design of the Proxy API
Event-aware
– Modules notified as requests/responses arrive
– Maps well to implementation of modern proxies
HTTP-Complete
– Capture all key interactions in HTTP request-response protocol for full flexibility
Support various programming models
– Events, threads, processes
– Communication via function call or socket
17
HTTP Data Flows
[Figure: HTTP data flows among Client, Proxy Cache, Server, and Storage System. Labeled flows: Requests, Responses, Cache Misses, Cache Hits, New Content, Cached Content.]
18
HTTP Data Flows and the API
[Figure: the HTTP data-flow diagram among Client, Proxy Cache, Server, and Storage System, with five points marked “modify”.]
19
HTTP Request-Response Structure
Request:
  Requested URL
  Request header line 1
  Request header line 2
  ...
  Request header line N
  <blank terminating line>
  Optional request “body,” used in POST requests for forms, etc.
Response:
  Response Status Code
  Response header line 1
  Response header line 2
  ...
  Response header line N
  <blank terminating line>
  Actual response “body,” containing HTML file, image binary data, etc.
Header block: special first line followed by more detail about request/response
Body data: follows the blank terminating line
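The header/body layout above can be made concrete with a small C sketch that locates the blank terminating line and the special first line of a message. Both helper names (`find_body_offset`, `first_line_length`) are hypothetical and are not part of the proxy API; the sketch also assumes CRLF line endings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the DataReactor API): returns the
 * offset of the first body byte, i.e. the byte just past the blank line
 * that terminates the header block. Returns -1 if the blank line has not
 * arrived yet. Assumes CRLF line endings. */
static long find_body_offset(const char *msg, size_t len)
{
    assert(msg != NULL);
    for (size_t i = 0; i + 3 < len; i++) {
        if (memcmp(msg + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}

/* Hypothetical helper: length of the special first line (requested URL
 * line or status line), excluding its CRLF. Returns -1 if no complete
 * line is present. */
static long first_line_length(const char *msg, size_t len)
{
    assert(msg != NULL);
    for (size_t i = 0; i + 1 < len; i++) {
        if (msg[i] == '\r' && msg[i + 1] == '\n')
            return (long)i;
    }
    return -1;
}
```

A caller that receives data in pieces would keep appending to its buffer and retry while `find_body_offset` returns -1; once it returns an offset, everything before it is the header block and everything after it is body data.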
20
Design of API Notifications
typedef struct DR_FuncPtrs {
  DR_InitFunc *dfp_init;              // on module load
  DR_ReconfigureFunc *dfp_reconfig;   // on config change
  DR_FiniFunc *dfp_fini;              // on module unload
  DR_ReqHeaderFunc *dfp_reqHeader;    // when req hdr done
  DR_ReqBodyFunc *dfp_reqBody;        // on each piece of req body
  DR_ReqOutFunc *dfp_reqOut;          // before req to remote srv
  DR_DNSResolvFunc *dfp_dnsResolv;    // when DNS resolution needed
  DR_RespHeaderFunc *dfp_respHeader;  // when resp hdr done
  DR_RespBodyFunc *dfp_respBody;      // on each piece of resp body
  DR_RespReturnFunc *dfp_respReturn;  // when resp returned to clt
  DR_TransferLogFunc *dfp_logging;    // log entry after req done
  DR_OpaqueFreeFunc *dfp_opaqueFree;  // when each resp completes
  DR_TimerFunc *dfp_timer;            // periodic maintenance
  int dfp_timerFreq;                  // timer period (sec)
} DR_FuncPtrs;
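To illustrate how a module might use such a notification table, here is a minimal sketch of a request-counting module. The slide does not show the callback signatures, so every signature below is an assumption, and the table is cut down to three callbacks plus the timer period so that the example compiles on its own. The module name, the `notify_req_header` dispatcher, and the convention that an unset (NULL) callback means "not interested" are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations (assumed signatures, reduced table). */
typedef int  DR_InitFunc(void);
typedef int  DR_ReqHeaderFunc(const char *hdr, size_t len);
typedef void DR_TimerFunc(void);

typedef struct DR_FuncPtrs {
    DR_InitFunc      *dfp_init;       /* on module load */
    DR_ReqHeaderFunc *dfp_reqHeader;  /* when req hdr done */
    DR_TimerFunc     *dfp_timer;      /* periodic maintenance */
    int               dfp_timerFreq;  /* timer period (sec) */
} DR_FuncPtrs;

static unsigned long requests_seen;   /* module-private state */

static int count_init(void)
{
    requests_seen = 0;
    return 0;
}

/* Called once per request, after the full header block has arrived. */
static int count_req_header(const char *hdr, size_t len)
{
    assert(hdr != NULL);
    (void)len;
    requests_seen++;
    return 0;
}

/* Periodic maintenance: start a new counting interval. */
static void count_timer(void)
{
    requests_seen = 0;
}

/* The module fills in only the notifications it cares about. */
static const DR_FuncPtrs count_module = {
    .dfp_init      = count_init,
    .dfp_reqHeader = count_req_header,
    .dfp_timer     = count_timer,
    .dfp_timerFreq = 60,
};

/* Stand-in for the proxy side: skip modules with no handler installed. */
static int notify_req_header(const DR_FuncPtrs *m, const char *hdr, size_t len)
{
    assert(m != NULL);
    return m->dfp_reqHeader ? m->dfp_reqHeader(hdr, len) : 0;
}
```

The point of the table-of-function-pointers design is that the proxy's event loop makes one indirect call per notification, so a module that installs no handler for an event costs only a pointer test.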
21
API Functions
Content Adaptation
Content Management
Customized Administration
Utility Functions
23
Content Adaptation
Functions to allow modules to inspect and
modify requests and replies through cache
[Figure: data-flow diagram among Client, Proxy Cache, Server, and Storage System, with five points marked “modify”.]
24
Content Adaptation (cont’d)
Example uses
– Integration into a CDN based on URL rewriting
– Transcoding for mobile devices
Special features of cache integration
– Store modified content
– Return multiple versions using HTTP Vary header
25
Content Management
Fine-grained control over cacheability
– Content-freshness modification/eviction
– Content preloading
– Content querying
Example uses
– News CDN needs new home page on major event
– Premium services
26
Customized Administration
Notifications on logging
Example uses
– Aggregation at network operation centers
– Detection of high error rates indicates bad links
27
Utility Functions
Interfaces to underlying OS event-notification
– Module may register or clear interest on FD events
– API will automatically call back module
– Independent of underlying OS mechanisms (e.g.,
poll, select, /dev/poll, kevent)
Configuration options processing
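The FD-event interface described above can be pictured with the following sketch, which layers a register/clear/callback interface over `poll`. This is not the DataReactor implementation: all names (`register_read_interest`, `clear_interest`, `dispatch_once`, `count_bytes_cb`) are hypothetical, only read interest is handled, and `poll` is chosen here simply because it is widely available.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 64

typedef void fd_callback(int fd, void *arg);  /* hypothetical callback type */

static struct pollfd fds[MAX_FDS];
static fd_callback *cbs[MAX_FDS];
static void *cb_args[MAX_FDS];
static int nfds_used;

/* Register interest in readability of fd. Returns 0, or -1 if full. */
static int register_read_interest(int fd, fd_callback *cb, void *arg)
{
    assert(cb != NULL);
    if (nfds_used == MAX_FDS)
        return -1;
    fds[nfds_used].fd = fd;
    fds[nfds_used].events = POLLIN;
    fds[nfds_used].revents = 0;
    cbs[nfds_used] = cb;
    cb_args[nfds_used] = arg;
    nfds_used++;
    return 0;
}

/* Clear interest in fd by moving the last slot into the vacated one. */
static int clear_interest(int fd)
{
    for (int i = 0; i < nfds_used; i++) {
        if (fds[i].fd == fd) {
            nfds_used--;
            fds[i] = fds[nfds_used];
            cbs[i] = cbs[nfds_used];
            cb_args[i] = cb_args[nfds_used];
            return 0;
        }
    }
    return -1;
}

/* One pass of the event loop: wait up to timeout_ms, then call back every
 * ready descriptor. Returns the number of callbacks made, 0 on timeout,
 * or -1 on poll error. Slots are visited from last to first so that a
 * callback may clear interest without causing a slot to be skipped. */
static int dispatch_once(int timeout_ms)
{
    int ready = poll(fds, (nfds_t)nfds_used, timeout_ms);
    if (ready <= 0)
        return ready;
    int called = 0;
    for (int i = nfds_used - 1; i >= 0; i--) {
        if (fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            fds[i].revents = 0;
            cbs[i](fds[i].fd, cb_args[i]);
            called++;
        }
    }
    return called;
}

/* Example callback: read what is available and add the byte count to the
 * size_t that arg points to. */
static void count_bytes_cb(int fd, void *arg)
{
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        *(size_t *)arg += (size_t)n;
}
```

Because modules see only the register/clear/callback interface, the body of `dispatch_once` could be swapped for `select`, `/dev/poll`, or `kevent` without touching module code, which is the independence the slide refers to.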
28
Implementation in DataReactor
Commercial proxy server
– Portable across architectures (x86, Alpha, Sparc) and OSes (FreeBSD, Linux, Solaris)
– Fast (exposes overheads)
– Independently measured at Proxy Cache-Offs
(alone or via OEMs)
Support requires < 1000 lines of code
Implementation < 6 person-months
30
Sample Modules
Ad Remover
– Matches ad patterns in Hostname, URI
Dynamic Compressor
– Uses zlib to compress, store, & serve object
Image Transcoder
– Color stripping via NetPBM & ijpeg helpers
Text Injector
– Finds <head> tag, asks helper what to insert
Content Manager
– Local telnet, then query, fetch, inject, evict objects
ICAP client
– Implements ICAP 1.0 draft to use external server
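As a rough illustration of the Ad Remover's idea of matching ad patterns against the hostname and URI, the sketch below uses plain substring matching. The pattern lists and function names are invented for this example; the module's real patterns and matching rules are not given in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative patterns only (assumed, not taken from the module). */
static const char *const host_patterns[] = { "ads.", "adserver." };
static const char *const uri_patterns[]  = { "/ads/", "/banners/" };

/* Returns 1 if s contains any of the n patterns, else 0. */
static int matches_any(const char *s, const char *const *pats, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strstr(s, pats[i]) != NULL)
            return 1;
    }
    return 0;
}

/* Returns 1 if the request looks like an ad fetch, else 0. */
static int is_ad_request(const char *hostname, const char *uri)
{
    assert(hostname != NULL && uri != NULL);
    return matches_any(hostname, host_patterns,
                       sizeof host_patterns / sizeof host_patterns[0])
        || matches_any(uri, uri_patterns,
                       sizeof uri_patterns / sizeof uri_patterns[0]);
}
```

Substring matching is deliberately crude: a pattern such as "ads." also matches a hostname like downloads.example.com, so a practical filter would anchor patterns at label or path-segment boundaries.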
31
Web Surfing Now
32
Web Surfing Without Ads
33
Sample Module Implementation
Module Name              Total Lines   Code Lines   Semicolons   # API call sites
Ad Remover               175           115          51           4
Compressor               387           280          126          11
Transcoder + helper      391 + 166     309 + 118    148 + 54     10
Text Injector + helper   473 + 56      367 + 32     170 + 8      12
Manager                  675           556          289          56
ICAP Client              1024          719          321          15
34
Measurement
Polygraph and PolyMix-3, Measurement Factory
– De facto standard for proxy testing
Scales with load
– Number of clients
– Number of servers
– Data set size
– Working set size
Very long test time
– Fill phase (~14 hours)
– Test phase (~10 hours)
35
PolyGraph Test Phases
[Figure: test timeline; x-axis: Time (hours), 0 to 30; labeled phases: Fill Phase, 1st Load Phase, 2nd Load Phase.]
36
PolyGraph Hit Rates
[Figure: hit rates; series: Cacheable, Offered, Actual.]
37
Our Test Environment
Proxy - 1.4GHz Athlon, 2GB memory
5 SCSI disks, GigE, FreeBSD
Harness
– 10 Polygraph client/server machines
– Target load: 1450 reqs/sec
– 16000 simultaneous connections
Pmix-3: Modified Polymix-3
– Single fill phase for all tests
– Load phase time cut in half
– Slight increase in hit rate
38
API Performance
Configuration    Throughput (req/sec)   Response Time (ms)   Miss Time (ms)   Hit Time (ms)   Hit Ratio (%)
Baseline         1452.87                1248.99              2742.53          19.82           57.81
API Enabled      1452.75                1248.95              2743.18          19.86           57.81
Empty Callback   1452.89                1251.25              2744.33          20.87           57.76
Add Headers      1452.62                1251.98              2745.07          20.85           57.74
Body + Headers   1452.84                1250.14              2746.98          22.10           57.85
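To put the table's differences in relative terms, the overhead of a configuration can be expressed as a percentage change from the baseline. The helper below is only a worked-arithmetic aid with a hypothetical name; it is not part of the measurement harness.

```c
#include <assert.h>

/* Percentage change of a measured value relative to a baseline.
 * Positive means the measured value is larger (slower, for times). */
static double percent_change(double baseline, double measured)
{
    assert(baseline > 0.0);
    return (measured - baseline) / baseline * 100.0;
}
```

For the Body + Headers row, `percent_change(1248.99, 1250.14)` gives roughly 0.09% for mean response time, while `percent_change(19.82, 22.10)` gives roughly 11.5% for hit time. The hit-time increase is 2.28 ms in absolute terms, and throughput stays at the 1450 req/sec target in every row.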
39
Module Performance
Module              Throughput (req/sec)   Response Time (ms)   Miss Time (ms)   Hit Time (ms)   Hit Ratio (%)
Baseline            1452.87                1248.99              2742.53          19.82           57.81
Ad Remover          1452.72                1248.87              2743.55          20.42           57.81
Images 25 Trans/s   1452.65                1256.60              2753.47          23.21           57.74
Images Max Trans    1452.73                1277.76              2778.09          43.30           57.80
Max Trans Nice 19   1452.68                1250.69              2744.60          20.15           57.78
Compress 75 obj/s   1452.73                1252.24              2745.63          23.44           57.81
Compress 95 obj/s   1452.88                1258.34              2752.63          28.69           57.78
40
Summary
Proxy CPUs increasingly idle
Commodity OSes are suitable choices
High-concurrency servers needed
Customizable, efficient event-friendly API
Implemented with low overhead
Sample results, deployments promising
42
Ongoing Work
CoDeeN – a CDN system on PlanetLab
– Uses a customized version of DataReactor
– Being built at Princeton
– Prototype: 1 week reading + 1 week reading
– Currently: ~42 nodes (one per site)
Lessons
– API easy enough for busy grad students
– Logging infrastructure would be nice
– Want to mask non-HTTP failures
43
Questions?
vivek@imimic.com
iMimic Networking, Inc.
http://www.imimic.com/
Cacheoff-3 Hit Times
45
Cacheoff-3 Miss Times
46
Cacheoff-3 Improvements
47
Cacheoff-3 Price/Performance
48
CacheOff-3 Results
49
CacheOff-3 Results
50
Cacheoff-4 Hit Times
51
Cacheoff-4 Miss Times
52
CacheOff-4 Results
53