Web Caching on Smartphones:
Ideal vs. Reality
Feng Qian¹, Kee Shen Quah¹, Junxian Huang¹, Jeffrey Erman²
Alexandre Gerber², Z. Morley Mao¹, Subhabrata Sen², Oliver Spatscheck²
¹ University of Michigan
² AT&T Labs - Research
June 27 2012
2
Mobile Traffic: An Explosive Growth
Year                                             2011   2012   2013   2014   2015   2016
Global Mobile Data Traffic per Month (10^6 TB)   0.6    1.3    2.4    4.2    6.9    10.8
Avg. Smartphone Traffic per Month (MB)           150    -      -      -      -      2576
(Avg. smartphone traffic: 1600% increase)
Source: Cisco Visual Networking Index (VNI) Global Mobile Data Traffic Forecast, 2011-2016
• Deployment of cellular infrastructures: much slower
– Spectrum shortage and economic issues
– Cellular infrastructure spending in 2011 was expected
to rise only 6.7% over 2010
3
Web Caching on Cellular Devices
• The big picture: traffic redundancy elimination
• The first network-wide study of redundant transfers
caused by inefficient HTTP caching on cellular devices
– HTTP: The dominant app-layer protocol for ~20 years
– Caching: Huge benefits, but complex
– Caching on cellular devices:
Reduces redundant data transferred over the RAN
Improves performance due to reduced latency
Cuts cellular bills for customers
4
Background: Caching in HTTP 1.1
• Use Expiration and Revalidation to ensure caching consistency
• Before expiration: the client can safely assume the
freshness of the cached file
• After expiration: the client must send a revalidation message
to the server to query the freshness of the cache entry
• Well-known protocol for 20 years: what is the state of the art
in the context of cellular devices?
[Figure: example exchange. The server's response carries Last-Modified and
Expires headers; after the entry expires, the client sends If-Modified-Since
and the server may reply 304 Not Modified. Example timestamps: Feb 1, Feb 10,
Feb 12 and Feb 15 2012, 15:00:00.]
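The expiration and revalidation rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the logic of any particular handset library: the function names are made up for the example, only the Expires header is considered (no max-age and no heuristic freshness), and the header values in the usage note are illustrative.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(cached_headers, now):
    """True if the cached copy may be served without contacting the server."""
    expires = cached_headers.get("Expires")
    if expires is None:
        # Simplification for this sketch: no Expires header means not fresh.
        return False
    return now < parsedate_to_datetime(expires)


def revalidation_headers(cached_headers):
    """Headers for the conditional GET sent once the cached copy has expired."""
    conditional = {}
    if "Last-Modified" in cached_headers:
        conditional["If-Modified-Since"] = cached_headers["Last-Modified"]
    return conditional


def body_after_revalidation(status, cached_body, response_body):
    """A 304 reply carries no body: the cached copy is still valid."""
    return cached_body if status == 304 else response_body
```

For example, with `{"Last-Modified": "Wed, 01 Feb 2012 15:00:00 GMT", "Expires": "Fri, 10 Feb 2012 15:00:00 GMT"}`, a request on Feb 5 is served from the cache, while a request on Feb 11 must first send `If-Modified-Since` with the stored Last-Modified value.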
5
Measurement Goal
• Goal: understand the state-of-the-art in
HTTP caching on cellular devices
• What to study: redundant transfers caused by
inefficient HTTP caching
– They account for 20% of the total HTTP traffic volume!
• Potential cause: HTTP implementation related
– Caching logic (client/server) not following HTTP spec
– Limited cache size
– Non-persistent cache
• Potential cause: application semantics related
– Server conservatively sets headers to make files uncacheable
or expire too soon
6
Measurement Data
Name                  ISP                                 UMICH
Collection period     May 20 2011 (24 hours)              May to Oct 12 2011 (5 months)
Collection location   Commercial cellular core network    Directly on user handsets
Data format           695 million records of HTTP         Full packet trace with payload
                      transactions                        of all traffic
Traffic volume        24.3 TB                             118 GB
Dataset size          271 GB                              119 GB
# Users               About 2.9 million                   20 U of Michigan students
Platforms             Multiple (mainly iOS and Android)   Android 2.2
[Figure: User interface for the data collector/uploader software]
7
Methodology
• A simulator strictly follows HTTP/1.1 caching logic (RFC 2616)
– Expiration and freshness calculation mechanism
– Non-cacheable objects
– Partial caching due to byte-range requests and broken connections
– LRU cache replacement algorithm, and more …
• Feed each user’s HTTP transactions to the cache simulator
• Redundant transfers are accurately identified in the
simulation process
• HTTP caching is not simple: 2K C++ LoC even for the
simulation core
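To illustrate one piece of such a simulator, here is a rough Python sketch of a byte-capped LRU cache replayed over one user's transactions. It is not the authors' C++ code: the class and function names are hypothetical, and expiration, cacheability and partial caching are deliberately left out, so every repeated URL that is still cached counts as a saving.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-capped cache that evicts the least recently used entry first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()  # url -> size in bytes, oldest first

    def get(self, url):
        if url not in self.entries:
            return False
        self.entries.move_to_end(url)  # mark as most recently used
        return True

    def put(self, url, size):
        if url in self.entries:
            self.used -= self.entries.pop(url)
        if size > self.capacity:
            return  # the object can never fit in this cache
        while self.used + size > self.capacity:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        self.entries[url] = size
        self.used += size


def saved_bytes(transactions, capacity_bytes):
    """Bytes that would not be transferred if repeated URLs were cache hits."""
    cache = LRUCache(capacity_bytes)
    saved = 0
    for url, size in transactions:
        if cache.get(url):
            saved += size
        else:
            cache.put(url, size)
    return saved
```

Running `saved_bytes` over the same trace with different capacities gives a simple way to see how the savings shrink as the cache gets smaller.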
8
Cacheability and Redundancy
• File cacheability: for both datasets
– Most bytes (70% to 78%) and most files (66% to 72%) are cacheable.
• Traffic Redundancy (assuming unlimited cache size)
Dataset   % Redundancy (HTTP only)   % Redundancy (HTTP + non-HTTP)
ISP       17.7%                      N/A
UMICH     20.3%                      17.3%
Note: under-estimation due to HTTPS and app-semantic-related redundancy
• Root causes of redundant transfers (within all HTTP traffic)
Origin of redundancy                                              Issue    ISP     UMICH
1. Handset issues a request before local copies expire            Client   15.9%   16.3%
2. Handset does not revalidate after local copies expire          Client   1.8%    4.0%
   (the file unchanged)
3. Server does not recognize revalidation after local copies      Server   <0.1%   <0.1%
   expire (the file unchanged)
9
Limited Cache Size and
Non-persistent cache
• Which factor has the main responsibility for redundancy?
– Problematic caching logic
– Limited cache size: going from an unlimited cache to 4 MB reduces
HTTP traffic savings from 17% to 13%. The benefits are significant
even for a small cache.
– Non-persistent cache: 59% of consecutive cache hits occur within
1 min. It is unlikely that the handset is rebooted during such a
short interval.
• How large does the cache need to be?
– A cache of 50 MB achieves 90% of the gain (w.r.t. traffic reduction)
compared to an unlimited cache
[Figure: Dist. of intervals between consecutive cache hits on the same
entry (ISP trace)]
10
Quantifying the Resource
Impact of Redundant Traffic
• In cellular networks, we also care about cellular resources
• Use our trace-driven RRC state machine simulator with a
handset radio power model [Qian et al., MobiSys 11]
– Applied to only cellular traffic within UMICH dataset
• Three important metrics characterizing cellular resource
consumption:
– D: radio resource consumption
– S: signaling load
– E: handset radio energy consumption
• Compute the impact: ΔE = (E_0 - E_R) / E_0
– ΔE: radio energy impact of redundant transfers (a positive value)
– E_0: radio energy consumption in original traces
– E_R: radio energy consumption in modified traces with redundant
transfers removed
11
Quantifying the Resource
Impact of Redundant Traffic
              ΔS Signaling    ΔE Radio         ΔD Radio
              Load Impact     Energy Impact    Resource Impact
HTTP only     27%             26%              27%
All traffic   6%              7%               9%
• When redundant and other traffic coexist, only eliminating
redundant traffic may not reduce resource consumption
– As long as one of the concurrent transfers exists, the radio
is on (i.e., consuming resources)
• Non-HTTP traffic plays a role (push notification and chatting)
– Traffic volume: small (1%); resource impact: high (18%)
– Resource release is controlled by fixed inactivity timers
– Sending small data incurs high resource overhead
12
Testing HTTP Libraries and Browsers
• Verify measurement findings by testing popular
HTTP libraries and browsers on real handsets
• Design 13 controlled tests to cover all important
aspects of caching implementation
Feature tests (is it well-supported?)
1. Basic caching
2. Revalidation
3. Various non-caching directives
4. Various expiration directives
5. URL with query strings
6. Partial caching
7. Redirection caching
Attribute tests (infer the parameters)
1. Shared or non-shared?
2. Persistent or non-persistent?
3. Cache entry size limit
4. Total cache size
5. Cache entry replacement policy
6. Heuristic freshness lifetime
• Revisit: which factor has the main responsibility for redundancy?
– Problematic caching logic
– Limited cache size
– Non-persistent cache
13
Testing HTTP Libraries and Browsers
• Basic caching test
– Handset requests a small cacheable file f
– Server transfers f with a proper Expires directive
– Client requests f again before it expires
– PASS iff the 2nd request does not incur any network traffic
• Cache size test: perform binary search
• Cache replacement policy test: try popular
algorithms (LRU, LFU, FIFO)
• See paper for all 13 tests
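The slide only says that the cache size test uses binary search; one way this could look is sketched below. The probe `still_cached(n)` is a hypothetical callback: in a real test it would fill the cache with n bytes of cacheable files and report whether the first file is still served locally. The sketch assumes the probe is monotonic (true up to the cache size, false beyond it) and true at the lower bound.

```python
def find_cache_size(still_cached, lo=0, hi=1 << 30):
    """Largest n in [lo, hi] for which still_cached(n) is True.

    Assumes still_cached(lo) is True and that the probe flips from
    True to False exactly once as n grows.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if still_cached(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Stand-in probe for a cache that holds at most 8 MB (illustrative only).
EIGHT_MB = 8 * 1024 * 1024
estimated = find_cache_size(lambda n: n <= EIGHT_MB)
```

With a 1 GB upper bound, about 30 probes are enough to pin the size down to the byte, which matters on a real handset where each probe involves actual downloads.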
14
Test Results
Smartphone HTTP library              OS version        Support    Caching Enabled
                                                       Caching?   by Default?
java.net.URLConnection               Android 2.3       No         No
java.net.HttpURLConnection           Android 2.3       No         No
org.apache.http.client.HttpClient    Android 2.3       No         No
android.webkit.WebView               Android 2.3       Yes        No
android.net.http.HttpResponseCache   Android 4.0.2     Partially  No
Three20 (Version 1.0.6.2)            iOS 4.3.4         No         No
NSURLRequest                         iOS 5.0.1         Partially  No
ASIHTTPRequest (Version 1.8.1)       iOS 4.3.4         Partially  No
Android Browser                      Android 2.3       Partially  Yes
iPhone Browser                       iOS 4.3.4/5.0.1   Partially  Yes
Chrome Browser                       Android 4.0.2     Yes        Yes
Implementation issues of caching
• 4 out of 8 libraries do not support caching at all.
• For both browsers, when loading the same URL back-to-back, the
second request is treated as a full reload from the remote server
• Android browser uses a small cache of 8MB
• Partial caching is not supported
• Some do not properly handle Pragma:no-cache or
Cache-Control:no-cache.
• …
A huge gap between protocol specification and implementation,
leading to significant redundancy of network traffic.
15
Summary
• The first network-wide study of cellular HTTP
caching
• Redundant transfers are prevalent
– 18% (ISP) and 20% (UMICH) of HTTP traffic volume
– 17% of overall traffic volume (UMICH)
– 6%~9% of cellular resource consumption (UMICH)
– The root cause: problematic caching logic on
handsets
– Validated by caching tests of popular libraries and
browsers
Backup Slides
17
Diversity Among Applications
• Identifying smartphone applications
– ISP: by user-agent fields in HTTP requests
– UMICH: by the captured packet-process correspondence
• Diversity among top apps
– HTTP redundancy ratios range from 0.0% to 100.0%
• Validate apps with high redundancy ratios (> 90%)
– Analyze locally collected tcpdump traces
– They do not cache HTTP responses
• Some apps have negligible redundant transfers
– Almost all bytes are not cacheable
e.g., all requests are HTTP POST instead of HTTP GET
18
The Cache Simulator (Simplified Version)
foreach HTTP transaction r
    if (file is not storable) then
        assign_label(r, NOT_STORABLE);
        continue;
    else if (cache entry not exists) then
        assign_label(r, CACHE_ENTRY_NOT_EXIST);
    else if (cache entry not expired) then
        assign_label(r, NOT_EXPIRED_DUP);
        continue;
    else if (file changed) then
        assign_label(r, FILE_CHANGED);
    else if (HTTP 304 used) then
        assign_label(r, HTTP_304);
    else if (revalidation not performed) then
        assign_label(r, EXPIRED_DUP);
    else
        assign_label(r, EXPIRED_DUP_SVR);
    update_cache_entry(r);
endfor
The simulation algorithm:
• Performs fine-grained caching simulation on a per-user basis
• Assigns to each HTTP transaction a label indicating its caching
status.
• Red labels (NOT_EXPIRED_DUP, EXPIRED_DUP, EXPIRED_DUP_SVR)
correspond to duplicated transfers.
Label meanings:
– NOT_STORABLE: The file contains "Cache-Control: no-store". It cannot
be cached.
– CACHE_ENTRY_NOT_EXIST: Cache miss.
– NOT_EXPIRED_DUP: Duplicated transfer: a request is issued before the
file expires.
– FILE_CHANGED: The file has changed after the cache entry expires.
– HTTP_304: The file has not changed after the cache entry expires, and
a cache revalidation is properly performed.
– EXPIRED_DUP: Duplicated transfer: the file has not changed after the
cache entry expires, but the handset does not perform cache
revalidation.
– EXPIRED_DUP_SVR: Duplicated transfer: the file has not changed after
the cache entry expires, but the server does not properly recognize
the cache revalidation.
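As a rough runnable counterpart to the simplified pseudocode on this slide, the labeling loop might look as follows in Python. The `Transaction` fields are assumptions made for the example (a real trace would need header parsing and content hashing to fill them in), and the cache here is unbounded.

```python
from dataclasses import dataclass, replace


@dataclass
class Transaction:
    url: str
    storable: bool             # False e.g. for "Cache-Control: no-store"
    time: float                # request time in seconds
    expires_at: float          # expiration time carried by this response
    content_id: str            # stands in for a hash of the response body
    status: int = 200          # 304 if the server answered Not Modified
    conditional: bool = False  # True if the request carried a revalidation header


def label_transactions(transactions):
    """Assign one caching label per transaction, following the slide's order."""
    cache = {}  # url -> last stored Transaction (unlimited size in this sketch)
    labels = []
    for r in transactions:
        entry = cache.get(r.url)
        if not r.storable:
            labels.append("NOT_STORABLE")
            continue
        if entry is None:
            labels.append("CACHE_ENTRY_NOT_EXIST")
        elif r.time < entry.expires_at:
            labels.append("NOT_EXPIRED_DUP")
            continue
        elif r.status != 304 and r.content_id != entry.content_id:
            labels.append("FILE_CHANGED")
        elif r.status == 304:
            labels.append("HTTP_304")
        elif not r.conditional:
            labels.append("EXPIRED_DUP")
        else:
            labels.append("EXPIRED_DUP_SVR")
        if r.status == 304 and entry is not None:
            # A 304 has no body: keep the cached content, refresh the expiry.
            cache[r.url] = replace(entry, expires_at=r.expires_at)
        else:
            cache[r.url] = r
    return labels
```

Summing the bytes of transactions labeled NOT_EXPIRED_DUP, EXPIRED_DUP and EXPIRED_DUP_SVR would then give the redundant volume for that user.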
Background: Radio Resource
Management in Cellular Networks
• RRC (Radio Resource Control) state machine [3GPP TS 25.331]
– State promotions have promotion delay
– State demotions incur tail times
RRC State    Channel                 Radio Power
IDLE         Not allocated           Almost zero
CELL_FACH    Shared, Low Speed       Low
CELL_DCH     Dedicated, High Speed   High
[Figure: UMTS RRC State Machine for a large US 3G carrier. Promotions are
labeled "Delay: 2s" and "Delay: 1.5s"; demotions are labeled "Tail Time".]
Background: Radio Resource
Management in Cellular Networks
[Figure: example timeline. Promo Delay: 2 sec; DCH Tail: 5 sec; FACH Tail: 12 sec]
Tail Time: waiting for inactivity timers to expire
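The effect of tail times on resource consumption can be illustrated with a toy model. This is a strong simplification, not the authors' RRC simulator: it only counts how long the radio stays out of IDLE, treats the DCH and FACH tails as a single 17-second inactivity window (5 s + 12 s, as in the timeline above), and ignores promotion delays and power differences between states. The function names are made up for the example.

```python
DCH_TAIL = 5.0    # seconds, from the timeline on this slide
FACH_TAIL = 12.0  # seconds, from the timeline on this slide


def radio_on_time(packet_times, tail=DCH_TAIL + FACH_TAIL):
    """Total seconds the radio is out of IDLE for the given packet times."""
    total = 0.0
    start = None     # start of the current radio-on period
    on_until = None  # when the inactivity timers would return the radio to IDLE
    for t in sorted(packet_times):
        if on_until is None or t > on_until:
            if on_until is not None:
                total += on_until - start  # close the previous period
            start = t
        on_until = t + tail  # every packet restarts the inactivity timers
    if on_until is not None:
        total += on_until - start
    return total


def relative_impact(original, reduced):
    """(original - reduced) / original, the same form as the ΔE formula."""
    return (original - reduced) / original
```

In this model a single small packet costs 17 seconds of radio-on time, and removing a transfer at t=2 from packets at [0, 2, 4] saves nothing because the neighboring traffic keeps the radio on anyway, which mirrors the observation that eliminating redundant traffic alone may not reduce resource consumption.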