The Power of Prediction: Cloud Bandwidth and Cost Reduction

Eyal Zohar and Israel Cidon, Technion
Osnat (Ossi) Mokryn, Tel-Aviv College
Traffic Redundancy Elimination (TRE)
Traffic redundancy stems from downloading the same or similar information items.
We found around 70% redundancy in end-client traffic, compared with past traffic and local files.
SIGCOMM 2011
TRE Importance
Moving to the cloud => higher end-to-end (e2e) traffic.
Cloud users pay for the traffic they actually use => an incentive to use TRE.
[Figure: the cloud user's application, the cloud provider, and the end-user; labels: Pay for Use, Cloud Traffic, TRE.]
How TRE Works
The server parses the outgoing stream into content-based chunks and signs each chunk with SHA-1.
[Figure: a rolling hash over the byte stream selects Anchors 1-4, which delimit Chunks 1-3; each chunk gets a SHA-1 signature (Sign. 1-3). Insertion example: new bytes inserted into Chunk 2 yield Chunk 1, Chunk 2', Chunk 3.]
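The chunk-and-sign step on this slide can be sketched as follows. This is only a minimal illustration, not the presented implementation: the polynomial rolling hash, the 48-byte window, the 13-bit anchor mask, and the names `chunk_stream` and `sign_chunks` are all assumptions made for the example.

```python
import hashlib

WINDOW = 48              # bytes covered by the rolling hash (assumed value)
ANCHOR_MASK = 0x1FFF     # anchor when the low 13 hash bits are zero (assumed value)
BASE = 257               # polynomial base (assumed value)
MOD = (1 << 61) - 1      # large prime modulus (assumed value)


def chunk_stream(data: bytes) -> list:
    """Split data into content-based chunks; a chunk ends at each anchor."""
    chunks = []
    start = 0
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that slides out of the window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        # An anchor depends only on the last WINDOW bytes of content.
        if i + 1 >= WINDOW and (h & ANCHOR_MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last anchor
    return chunks


def sign_chunks(chunks: list) -> list:
    """Return the SHA-1 signature (hex) of every chunk."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in chunks]
```

Because an anchor here depends only on the bytes inside the window, inserting bytes into one chunk leaves the anchors further along the stream unchanged, which is the behavior the insertion example illustrates.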
Problems in Existing Solutions
In the cloud environment:
1. High processing costs in the cloud.
2. Scalability: the server must remember each client.
3. Elasticity: unaware of data from other sources.
4. Do not handle long-term repeats (days/weeks).
[Figure: a receiver connected to Server 1 and Server 2.]
Our Solution: PACK (Predictive ACK)
Redundancy detection is done by the client.
Repeats appear in chains.
The client tries to match incoming chunks with a previously received chain or a local file.
It sends the server predictions of the future data.
PACK: The Client Prediction
[Figure: received stream chunks (Chunk 1-3), each with a SHA-1 signature (Sign. 1-3), stored as a chain of chunks.]
Each prediction contains:
1. TCP seq.: no server parsing is needed.
2. Hint (the chunk's last byte): spares unnecessary SHA-1 computations.
3. SHA-1 signature.
[Figure: prediction fields: TCP seq., chunk last-byte hint, SHA-1.]
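The client side of a prediction can be sketched as below. The sketch assumes a simple in-memory chunk store keyed by SHA-1; the types `StoredChunk` and `Prediction`, the functions `store_chain` and `predict`, the explicit `length` field, and the `max_predictions` limit are hypothetical names and choices, not details taken from the slides.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StoredChunk:
    length: int                # chunk size in bytes
    last_byte: int             # cheap hint for the server
    next_sig: Optional[bytes]  # signature of the chunk that followed last time


@dataclass
class Prediction:
    tcp_seq: int   # TCP sequence number where the predicted chunk would start
    length: int    # assumed field: lets the server locate the chunk's last byte
    hint: int      # last byte of the predicted chunk
    sha1: bytes    # SHA-1 signature of the predicted chunk


def store_chain(store: Dict[bytes, StoredChunk], chunks: List[bytes]) -> None:
    """Record a received sequence of non-empty chunks as a chain."""
    sigs = [hashlib.sha1(c).digest() for c in chunks]
    for i, (sig, chunk) in enumerate(zip(sigs, chunks)):
        nxt = sigs[i + 1] if i + 1 < len(sigs) else None
        store[sig] = StoredChunk(len(chunk), chunk[-1], nxt)


def predict(store: Dict[bytes, StoredChunk], received_chunk: bytes,
            next_seq: int, max_predictions: int = 2) -> List[Prediction]:
    """After a chunk arrives, predict the chunks expected to follow it."""
    predictions = []
    meta = store.get(hashlib.sha1(received_chunk).digest())
    sig = meta.next_sig if meta else None
    while sig is not None and len(predictions) < max_predictions:
        nxt = store[sig]
        predictions.append(Prediction(next_seq, nxt.length, nxt.last_byte, sig))
        next_seq = (next_seq + nxt.length) % 2**32  # TCP sequence numbers wrap
        sig = nxt.next_sig
    return predictions
```

A chunk that is not in the store, or that ended its chain last time, yields no predictions, so the client stays silent when it has nothing to match.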
PACK: Server Operation
The server compares the hint with the last byte of the data it is asked to sign.
Only upon a hint match does it perform the expensive SHA-1.
PACK saves the cloud's computational effort in the absence of redundancy.
First receiver-based TRE: the server does not parse. It
signs with >99% confidence.
[Figure: the client, holding a chain of chunks 1-3 in local storage, and the server exchange a prediction for chunks 2 and 3 ("2,3?") and its confirmation ("2,3V").]
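The hint-then-SHA-1 order described on this slide can be sketched as a single server-side check. The function name `server_check`, the `base_seq` bookkeeping, and the explicit `length` argument are assumptions for illustration only.

```python
import hashlib


def server_check(outgoing: bytes, base_seq: int, tcp_seq: int,
                 length: int, hint: int, sha1: bytes) -> bool:
    """Return True if a predicted chunk matches the data about to be sent.

    outgoing  -- bytes waiting in the send buffer
    base_seq  -- TCP sequence number of outgoing[0] (assumed bookkeeping)
    tcp_seq, length, hint, sha1 -- fields of the client's prediction
    """
    start = tcp_seq - base_seq
    end = start + length
    if start < 0 or length <= 0 or end > len(outgoing):
        return False  # prediction does not cover buffered data
    if outgoing[end - 1] != hint:
        return False  # cheap rejection: last byte differs, SHA-1 is skipped
    # Hint matched, so the expensive signature is worth computing.
    return hashlib.sha1(outgoing[start:end]).digest() == sha1
```

When there is no redundancy, almost every prediction is rejected at the one-byte comparison, so the server rarely pays for a SHA-1 computation. On a match, the chunk's bytes need not be sent again.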
PACK Benefits
Minimizes processing costs induced by TRE.
– Signs with SHA-1 only in the presence of redundancy.
Receiver-based end-to-end TRE => suitable for cloud
server elasticity and client mobility.
– Does not require the server to continuously maintain
clients’ status.
Server Effort Experiment
Several data-sets in 3 modes: baseline no-TRE,
PACK and a sender-based TRE.
25%-30% redundancy is common to many data-sets.
[Figure: Single Server Cloud Operational Cost (100% = without TRE system; axis 0%-140%) versus Redundancy Elimination Ratio (0%-50%); legend: EndRE-like Sender-based, PACK.]
YouTube Redundancy
Traces of 40k clients, captured at an ISP.
Found 30% end-to-end (personal) redundancy.
[Figure: All YouTube Traffic (Gbps, 0.0-3.0) and PACK TRE (Removed Redundancy, 0%-35%) over Time (24 hours).]
Long-Term TRE
Social network: eliminated 30% with a one-hour cache and 75% with a long-term cache.
[Figure: Average Redundancy of Daily Traffic (0%-80%) versus Days Since Start (1-33), for cache lifetimes of 1 Hour, 24 Hours, and Unlimited.]
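One way to measure redundancy under a bounded cache lifetime is sketched below. The trace format (timestamp in seconds, chunk signature, chunk size), the function name `redundancy_ratio`, and the policy of refreshing a chunk's age on every sighting are assumptions for the example; the slide does not describe how its measurement was made.

```python
from typing import Hashable, Iterable, Optional, Tuple


def redundancy_ratio(trace: Iterable[Tuple[float, Hashable, int]],
                     cache_lifetime: Optional[float] = None) -> float:
    """Fraction of bytes already seen within cache_lifetime seconds.

    trace          -- (timestamp, chunk signature, chunk size) in time order
    cache_lifetime -- None means an unlimited (long-term) cache
    """
    last_seen = {}
    total = redundant = 0
    for timestamp, sig, size in trace:
        total += size
        previous = last_seen.get(sig)
        if previous is not None and (
            cache_lifetime is None or timestamp - previous <= cache_lifetime
        ):
            redundant += size  # chunk is still cached, so it counts as a repeat
        last_seen[sig] = timestamp
    return redundant / total if total else 0.0
```

With such a function, repeats that recur after days are counted only when the cache lifetime is long enough, which is the effect the comparison of cache lifetimes on this slide highlights.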
Cloud Email Redundancy
Gmail account with 1,000 Inbox messages.
Found 32% static redundancy (higher when
messages are read multiple times).
[Figure: Traffic Volume Per Month (MB, 0-300), split into Redundant and Non-redundant, by Month (Jan-Dec).]
Implementation
Linux with Netfilter Queue, 25k lines of C and
Java, available for download.
Receiver-sender protocol is embedded in the
TCP Options field.
Transparent use at both sides.
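To illustrate how a prediction could fit in the TCP Options field, here is a purely hypothetical encoding. The slides do not give the wire format, so the field layout, the 2-byte chunk length, the use of experimental option kind 253, and the names `encode_prediction` and `decode_prediction` are all assumptions.

```python
import struct

EXPERIMENTAL_KIND = 253  # TCP option kind reserved for experiments (RFC 4727)
# Hypothetical layout: kind, option length, TCP seq, chunk length, hint, SHA-1.
_FORMAT = "!BBIHB20s"
OPTION_SIZE = struct.calcsize(_FORMAT)  # 29 bytes, within the 40-byte option space


def encode_prediction(tcp_seq: int, length: int, hint: int, sha1: bytes) -> bytes:
    """Pack one prediction as a single TCP option (hypothetical format)."""
    if len(sha1) != 20:
        raise ValueError("SHA-1 digest must be 20 bytes")
    return struct.pack(_FORMAT, EXPERIMENTAL_KIND, OPTION_SIZE,
                       tcp_seq, length, hint, sha1)


def decode_prediction(option: bytes):
    """Unpack an option produced by encode_prediction."""
    kind, size, tcp_seq, length, hint, sha1 = struct.unpack(_FORMAT, option)
    if kind != EXPERIMENTAL_KIND or size != OPTION_SIZE:
        raise ValueError("not a prediction option")
    return tcp_seq, length, hint, sha1
```

Under these assumptions only one prediction fits per segment, since a second 29-byte option would exceed the 40 bytes available for TCP options.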
Processing Effort in the Client
Laptop experiment: PACK-related CPU consumption is
~4% when playing HD video (9 Mbps with 30%
redundancy).
Smartphone experiment: PACK consumes ~3% of the battery power when processing 1 GB of video (avg. monthly data plan).
Virtual traffic saves
the client the need
to chunk or sign.
New Chunking Algorithm
Most existing solutions use Rabin fingerprinting.
[Figure: a 64-bit value with Mask = 00 00 8A 31 10 58 30 80, computed over a sliding window of bytes n, n-1, ..., n-47.]
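The figure shows only the 64-bit mask and a 48-byte window. A minimal sketch consistent with it is given below, assuming a shift-and-XOR update in which each incoming byte is XORed into a 64-bit value that is shifted left by one bit per byte; the update rule and the name `find_anchors` are assumptions, not details shown on the slide.

```python
MASK = 0x00008A3110583080   # mask from the slide; 13 bits are set
WINDOW = 48                 # bytes n .. n-47 influence the masked bits
UINT64 = 0xFFFFFFFFFFFFFFFF


def find_anchors(stream: bytes) -> list:
    """Return the end offsets at which an anchor is found (assumed scheme)."""
    anchors = []
    longval = 0
    for i, byte in enumerate(stream):
        # Shift left by one bit (dropping the top bit), then XOR in the new byte.
        longval = ((longval << 1) & UINT64) ^ byte
        # Only test once a full window of bytes has been processed.
        if i + 1 >= WINDOW and (longval & MASK) == MASK:
            anchors.append(i + 1)
    return anchors
```

With this update, byte n-k affects bits k through k+7, and the highest mask bit is bit 47, so the test depends on exactly the last 48 bytes. If the masked bits behave uniformly, 13 set bits give an anchor about once every 2^13 bytes, that is, chunks of roughly 8 KB on average. The loop uses only a shift, an XOR, and an AND per byte.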
Summary
Current TRE solutions may not reduce cloud cost.
PACK is the first receiver-based TRE – leverages the
power of prediction.
Minimizes processing costs induced by TRE.
Suitable for cloud server migration and client mobility.
Implementation is available for download.