DNS based Request Routing - Interactive Computing Lab

Content Distribution
March 8, 2012
2: Application Layer
Review: P2P architecture
- No always-on server
- Arbitrary end systems communicate directly, peer to peer
- Peers are intermittently connected and change IP addresses
File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file of size F from one server to N peers?
- u_s: server upload bandwidth
- u_i: peer i upload bandwidth
- d_i: peer i download bandwidth
[Figure: a server (upload u_s) holding a file of size F, and N peers (u_1/d_1, u_2/d_2, ..., u_N/d_N), all attached to a network with abundant bandwidth]
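The slide only poses the question. A minimal sketch of the usual textbook lower bounds is given here, assuming a fluid model in which the network core is never the bottleneck. The function names and the numbers in the usage lines are hypothetical and are not taken from the slides.

```python
def client_server_time(F, u_s, d):
    """Lower bound on client-server distribution time.

    The server must upload N copies (N*F/u_s), and the slowest peer
    needs F/min(d) to download its copy.
    """
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """Lower bound on P2P distribution time.

    The server uploads at least one copy (F/u_s), the slowest peer needs
    F/min(d), and N*F bits in total must be uploaded at the aggregate
    rate u_s + sum(u).
    """
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


# Hypothetical numbers: F in Mbit, bandwidths in Mbit/s.
F, u_s = 1000.0, 100.0
for N in (10, 100, 1000):
    u = [10.0] * N   # every peer uploads at 10 Mbit/s
    d = [50.0] * N   # every peer downloads at 50 Mbit/s
    print(N, client_server_time(F, u_s, d), p2p_time(F, u_s, u, d))
```

The client-server bound grows linearly with N, while the P2P bound grows more slowly because each new peer also adds upload capacity.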
P2P content distribution issues
- Issues
  - Group management and data search
  - Reliable and efficient file exchange
  - Security/privacy/anonymity/trust
- Approaches for group management and data search (i.e., who has what?)
  - Centralized (e.g., BitTorrent tracker)
  - Unstructured (e.g., Gnutella)
  - Structured (Distributed Hash Tables [DHT])
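To make the "structured" approach concrete, here is a minimal sketch of how a DHT-style lookup can answer "who has what?" without a central tracker. It uses a simple consistent-hashing ring. The class, the peer names, and the 16-bit identifier space are illustrative assumptions, not the design of any particular DHT.

```python
import bisect
import hashlib


def ring_id(name: str, bits: int = 16) -> int:
    """Map a peer name or content key to a point on the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Each key is owned by the first peer at or after it on the ring."""

    def __init__(self, peers):
        self._points = sorted((ring_id(p), p) for p in peers)
        self._ids = [i for i, _ in self._points]

    def lookup(self, key: str) -> str:
        # bisect_left finds the successor; the modulo wraps past the last peer.
        idx = bisect.bisect_left(self._ids, ring_id(key)) % len(self._ids)
        return self._points[idx][1]


peers = ["peer-a", "peer-b", "peer-c"]  # hypothetical peer names
ring = Ring(peers)
print(ring.lookup("ubuntu.iso"))
```

Every peer that knows the membership computes the same owner for a key, and a departure of a peer that does not own the key leaves that key's owner unchanged.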

Contents
- P2P architecture and benefits
- P2P content distribution
- Content distribution network (CDN)
Why Content Networks?
- More hops between client and Web server means more congestion!
- The same data flows repeatedly over the links between clients and the Web server
[Figure: clients C1, C2, C3, and C4 reach server S through IP routers]
Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt
Why Content Networks?
- The origin server becomes a bottleneck as the number of users grows
- Flash crowds (for instance, Sept. 11)
- The content distribution problem: arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
Example: Web Server Farm
- Simple solution to the content distribution problem: deploy a large group of servers
- Arbitrate client requests to servers using an “intelligent” L4-L7 switch
- Pretty widely used today
[Figure: requests from grad.umd.edu and ren.cis.udel.edu arrive at an L4-L7 switch, which hands each one to one of three copies of www.cnn.com (Copy 1, Copy 2, Copy 3)]
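The slide does not say how the switch arbitrates. The sketch below shows three common dispatch policies as an illustration: round robin, a hash over transport-level fields (L4), and a decision based on the request path (L7). The server names, the "/video/" rule, and the function names are hypothetical.

```python
import itertools
import zlib

SERVERS = ["copy1", "copy2", "copy3"]  # hypothetical names for the three copies


class RoundRobin:
    """Hand out servers in a fixed rotation, ignoring the request."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


def pick_by_client(client_ip: str, client_port: int) -> str:
    """L4-style: hash transport-level fields so one connection's packets
    always reach the same server."""
    key = f"{client_ip}:{client_port}".encode("utf-8")
    return SERVERS[zlib.crc32(key) % len(SERVERS)]


def pick_by_url(path: str) -> str:
    """L7-style: inspect the request; here, an assumed rule sends video
    to a dedicated server and hashes everything else over the rest."""
    if path.startswith("/video/"):
        return "copy3"
    return SERVERS[zlib.crc32(path.encode("utf-8")) % 2]


rr = RoundRobin(SERVERS)
print([rr.pick() for _ in range(4)])
print(pick_by_client("10.0.0.1", 5000), pick_by_url("/video/clip.mp4"))
```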
Example: Caching Proxy
- Motivated mainly by ISP business interests: it reduces the bandwidth the ISP consumes from the Internet
- Reduced network traffic
- Reduced user-perceived latency
[Figure: clients ren.cis.udel.edu and merlot.cis.udel.edu sit inside an ISP. An interceptor diverts TCP port 80 traffic to the proxy, while other traffic goes straight to the Internet, where www.cnn.com resides]
But on Sept. 11, 2001
[Figure: the Web server www.cnn.com holds new content ("WTC News!"). Two groups of 1,000,000 other hosts send requests, and the user mslab.kaist.ac.kr, behind an ISP caching proxy, receives old content. Legend: congestion/bottleneck; caching proxy]
Problems with discussed approaches: server farms and caching proxies
- Server farms do nothing about problems due to network congestion
- Caching proxies serve only their own clients, not all users on the Internet
- Content providers (say, Web servers) cannot rely on the existence and correct implementation of caching proxies
- Accounting issues with caching proxies
  - For instance, www.cnn.com needs to know the number of hits to a webpage for the advertisements displayed on it
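Again on Sept. 11, 2001 with CDN
[Figure: new content ("WTC News!") from the Web server www.cnn.com reaches surrogates in WA, CA, MI, IL, MA, FL, NY, and DE over a distribution infrastructure. The user mslab.kaist.ac.kr sends a request and receives new content; two groups of 1,000,000 other users are served as well. Legend: distribution infrastructure; surrogate]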
Web replication - CDNs
- Overlay network that distributes content from origin servers to users
- Avoids large amounts of the same data repeatedly traversing potentially congested links on the Internet
- Reduces Web server load
- Reduces user-perceived latency
- Tries to route around congested networks
CDN vs. Caching Proxies
- Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
- Caches are reactive; CDNs are proactive
- Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients
- CDNs give content providers control over the content; caching proxies do not
CDN Architecture
[Figure: components shown are the origin server; the CDN, containing the request routing infrastructure, the distribution & accounting infrastructure, and two surrogates; and two clients]
CDN Organization
- Limelight/Google: place CDN servers near a small number of ISP core networks
- Akamai: place CDN servers deep inside the sites of a large number of ISP networks
- Nano Data Center (NaDa): home gateways (STBs/modems) act as CDN servers (peer-to-peer delivery among NaDa servers)
- P2P software (BitTorrent, PPLive, etc.)
[Figure: network tiers from the core network (core router) through the metro/edge network (edge router) to the access network (OLT/ONT, DSLAM/modem), with a digital media delivery platform and NaDa marked on the diagram]
CDN Components
- Distribution infrastructure:
  - Moving or replicating content from the content source (origin server, content provider) to surrogates
- Request routing infrastructure:
  - Steering or directing a content request from a client to a suitable surrogate
- Content delivery infrastructure:
  - Delivering content to clients from surrogates
- Accounting infrastructure:
  - Logging and reporting of distribution and delivery activities
Server Interaction with CDN
1. The origin server (www.cnn.com) pushes new content to the CDN, OR the CDN pulls content from the origin server
2. The origin server requests logs and other accounting info from the CDN, OR the CDN provides logs and other accounting info to the origin server
[Figure: the origin server exchanges (1) content with the CDN's distribution infrastructure and (2) logs with the CDN's accounting infrastructure]
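The push and pull alternatives in step 1, together with the hit counts needed for step 2, can be sketched as follows. This is only an illustration: the class and method names are hypothetical and in-memory dictionaries stand in for real storage and transport.

```python
class Origin:
    """Origin server holding the authoritative copy of each object."""

    def __init__(self):
        self.content = {}  # path -> body

    def fetch(self, path):
        return self.content[path]


class Surrogate:
    def __init__(self, origin):
        self.origin = origin
        self.store = {}  # locally replicated objects
        self.hits = {}   # per-path request counts, kept for accounting

    def receive_push(self, path, body):
        """Push: the origin replicates content before any client asks."""
        self.store[path] = body

    def serve(self, path):
        """Pull: on a miss, fetch from the origin, then serve locally."""
        if path not in self.store:
            self.store[path] = self.origin.fetch(path)
        self.hits[path] = self.hits.get(path, 0) + 1
        return self.store[path]


def publish(origin, surrogates, path, body):
    """Store new content at the origin and push it to the given surrogates."""
    origin.content[path] = body
    for s in surrogates:
        s.receive_push(path, body)


origin = Origin()
ny, ca = Surrogate(origin), Surrogate(origin)
publish(origin, [ny], "/sept11", "WTC News!")  # pushed to NY only
print(ny.serve("/sept11"), ca.serve("/sept11"), ca.hits)
```

Here the NY surrogate already has the object when the first request arrives, while the CA surrogate pulls it on its first miss. Either way, the `hits` table is what the accounting infrastructure would report back to the origin.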
Client Interaction with CDN
1. Client to the request routing infrastructure: "Hi! I need www.cnn.com/sept11"
2. Request routing infrastructure to the client: "Go to surrogate newyork.cnn.akamai.com"
3. Client to the NY surrogate (newyork.cnn.akamai.com): "Hi! I need content /sept11"
Q: How did the CDN choose the New York surrogate over the California surrogate (california.cnn.akamai.com)?
Request Routing Techniques
- Request routing techniques use a set of metrics to direct users to the “best” surrogate
- Proprietary, but the underlying techniques are known:
  - DNS based request routing
  - Content modification (URL rewriting)
  - Anycast based (how common is anycast?)
  - URL based request routing
  - Transport layer request routing
  - Combination of multiple mechanisms
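As an illustration of content modification (URL rewriting), the sketch below rewrites the URLs of embedded objects in a page so that clients fetch them from a surrogate, while the page itself still comes from the origin. The host names are taken from the slides' example; the regular expression, the list of file extensions, and the function name are assumptions made for this sketch.

```python
import re

ORIGIN_HOST = "www.cnn.com"
CDN_HOST = "newyork.cnn.akamai.com"

# Match src/href attributes that point at embedded objects on the origin.
# The extension list is an assumption about what counts as "embedded".
_EMBEDDED = re.compile(
    r'(src|href)="http://' + re.escape(ORIGIN_HOST)
    + r'(/[^"]*\.(?:jpg|png|gif|css|js))"'
)


def rewrite(html: str, cdn_host: str = CDN_HOST) -> str:
    """Point embedded-object URLs at the surrogate; leave page links alone."""
    return _EMBEDDED.sub(
        lambda m: f'{m.group(1)}="http://{cdn_host}{m.group(2)}"', html
    )


page = (
    '<a href="http://www.cnn.com/sept11.html">story</a>'
    '<img src="http://www.cnn.com/img/wtc.jpg">'
)
print(rewrite(page))
```

The link to the story is left untouched, so the origin still sees the page request; only the image is redirected to the surrogate.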

DNS based Request-Routing
- Common due to the ubiquity of DNS as a directory service
- A specialized DNS server is inserted into the DNS resolution process
- This DNS server can return a different set of A, NS, or CNAME records based on policies/metrics
DNS based Request-Routing
[Figure: client test.nyu.edu (128.4.30.15) resolves www.cnn.com through its local DNS server dns.nyu.edu (128.4.4.12). (1) The DNS query for www.cnn.com reaches the Akamai DNS, which sits in front of surrogates newyork.cnn.akamai.com (145.155.10.15) and california.cnn.akamai.com (58.15.100.152). The DNS response is A 145.155.10.15 (newyork.cnn.akamai.com)]
Q: How does the Akamai DNS know which surrogate is closest?
DNS based Request-Routing
[Figure: elements shown are client test.nyu.edu (128.4.30.15), its local DNS server dns.nyu.edu (128.4.4.12), the DNS query for www.cnn.com, and the Akamai DNS with the Akamai CDN's two surrogates]
DNS based Request-Routing
[Figure: the client (76.43.35.53) uses client DNS 76.43.32.4, which queries the Akamai DNS for www.cnn.com. The address of the requesting DNS (76.43.32.4) is passed to surrogates 145.155.10.15 and 58.15.100.152. Two measurements toward the requesting DNS are shown: available bandwidth = 10 kbps, RTT = 10 ms; and available bandwidth = 5 kbps, RTT = 100 ms. The Akamai DNS selects surrogate 145.155.10.15 and answers: www.cnn.com A 145.155.10.15, TTL = 10s]
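A minimal sketch of the selection step in this example follows. It assumes that the 10 ms / 10 kbps measurement belongs to surrogate 145.155.10.15; the extracted figure does not state this explicitly, but that is the surrogate returned. The "lowest RTT, then highest bandwidth" policy and all names are hypothetical, since real CDN policies are proprietary.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """One surrogate's measurement toward the requesting DNS server."""
    surrogate_ip: str
    rtt_ms: float
    bandwidth_kbps: float


def choose_surrogate(probes):
    """Assumed policy: lowest RTT wins, ties broken by higher bandwidth."""
    return min(probes, key=lambda p: (p.rtt_ms, -p.bandwidth_kbps))


def answer(name, probes, ttl_s=10):
    """Build the A record the specialized DNS server would return."""
    best = choose_surrogate(probes)
    return {"name": name, "type": "A", "address": best.surrogate_ip, "ttl": ttl_s}


probes = [
    Probe("145.155.10.15", rtt_ms=10, bandwidth_kbps=10),   # assumed pairing
    Probe("58.15.100.152", rtt_ms=100, bandwidth_kbps=5),   # assumed pairing
]
print(answer("www.cnn.com", probes))
```

The short TTL (10 s on the slide) limits how long the client DNS may cache the answer, so the choice can be revised as measurements change.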
DNS based Request Routing: Discussion
- Originator problem: the client may be far removed from the client DNS
- Client DNS masking problem: virtually all DNS servers, except for root DNS servers, honor requests for recursion
  - Q: Which DNS server resolves a request for test.nyu.edu?
  - Q: Which DNS server performs the last recursion of the DNS request?
- Hidden load factor: a single DNS resolution may result in drastically different load on the selected surrogate, which complicates load balancing of requests and predicting the load on surrogates
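The hidden load factor can be illustrated with a small calculation: the CDN's DNS hands out one answer per resolver, yet the number of clients behind each resolver differs widely. The resolver populations and surrogate labels below are made-up numbers for illustration only.

```python
def surrogate_load(resolver_clients, assignments):
    """Sum the clients that follow each resolver's cached answer.

    resolver_clients: resolver name -> number of clients behind it
    assignments:      resolver name -> surrogate named in its DNS answer
    """
    load = {}
    for resolver, surrogate in assignments.items():
        load[surrogate] = load.get(surrogate, 0) + resolver_clients[resolver]
    return load


# Hypothetical populations: a campus resolver versus a home resolver.
resolver_clients = {"dns.nyu.edu": 5000, "home-resolver": 3}
# The CDN's DNS answered each resolver exactly once.
assignments = {"dns.nyu.edu": "surrogate-ny", "home-resolver": "surrogate-ca"}
print(surrogate_load(resolver_clients, assignments))
```

Both surrogates received the same number of DNS resolutions, yet their resulting client load differs by three orders of magnitude, and the CDN's DNS cannot see this from the queries alone.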
Summary
- P2P architecture and its benefits
- P2P content distribution
  - BitTorrent, Skype
- Content distribution network (CDN)
  - DNS-based request routing