TransparentInternet - Computer Science Division

Towards a Transparent Internet
Yan Chen, EECS Department, Northwestern University
Ehab Al-Shaer, School of Computer Science, DePaul University
Richard Yang, Dept of Computer Science, Yale University
Short Bios
• Yan Chen
– Assistant professor at Northwestern Univ.
– DOE CAREER Award in 2005
– Microsoft Trustworthy Computing Award in 2004 & 2005
– TPC co-chair of IWQoS 07, PC for Infocom, Mobicom,
etc.
• Ehab Al-Shaer:
– Associate professor and Director of MNLAB
– Very actively involved in the area of network operation
and management for more than 10 years.
– TPC co-chair of IM’07, the premier network management
conference. PC for INFOCOM, ICNP, IM/NOMS,
ASIACCS
Motivations
• The Internet has evolved to become an uncooperative, ossified network of networks
– Network has to be treated as a blackbox
» Performance of even neighboring networks is opaque
» Inter-domain routing based on policies but not performance
» Have to resort to overlay networks which are suboptimal
– Diagnosis and fault location extremely hard
• Network config management error-prone &
expensive
– Reactive configurations: tune after deployment
– Vulnerable: manually handled and subject to conflicts
– Imperative & fragmented: need to access several
specific devices in order to implement a service goal
Proposed Solution: Transparent
Internet
• Every network shares its measurement and
management information with other networks when
necessary (glass box)
– Performance: delay, loss, available bandwidth, etc.
» Can be at link-level
– Management info
» Configurations: QoS setting, traffic policing, firewalls, etc.
– Traffic info: traffic matrices, traffic characteristics
• Information sharing through
– As part of the inter-domain protocols: Transparent
Gateway Protocols (TGP)
– Other applications: leverage DHT
Analogy to the Airline Alliance
• When airlines compose multi-leg flights, they
need more than just route info
– Type of aircraft, # of vacancies, probability of
punctuality, etc.
• Such an open model is mutually beneficial
– Provide the best flight composition for clients
– Similarly, open network model can provide best
communications for applications
Objectives I
Provides a completely transparent view of the Internet to
networks and applications
• Diagnosis & troubleshooting become much easier
– No more Internet tomography needed
• Flexible inter-domain routing
– Not just based on policy or # of AS/hops
– Flexible metrics based on bandwidth, latency, etc.
• Global traffic engineering
– Each AS performs its own local traffic engineering
– Provide AS path-level routing guide
• Unified framework that applications query (push/pull) info
as needed
– Streaming media, content distribution
– Anomaly/security applications
Objectives II
Provides autonomic, provable and proactive
configuration management
• Proactive verification: configuration verified and
translated to different vendor-specific devices
• Proactive validation: test the configuration changes on
archived network traffic without interrupting the
operational network
• Autonomic configuration: from high-level “management
objectives” to configuration parameters
– Configurations are auto-tuned dynamically to achieve the
“objectives”
Auto-tuning & Proactivity
[Figure: configuration life-cycle diagram with stages labeled Defining, Verifying, Validation, Deploying, Evaluation/Prediction, and Optimizing]
Flexible Inter-domain Routing
• Multiple routing paths with TGP
– Incorporate measurement info into AS paths
– Bandwidth-intensive and latency-sensitive applications
can take different AS paths.
• Challenge: inter-domain routing based on
bandwidth without making reservation
• Solution: Discretize the bandwidth for good
tradeoff b/t adaptation and stability
– Though stability is a classical problem, not unique to
TGP
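The discretization idea above can be sketched as follows. This is a minimal illustration, not the TGP design: the level boundaries, the `LinkAdvertiser` class, and the path record fields (`as_path`, `bw_level`, `latency_ms`) are all assumptions made for the example. A link re-advertises its bandwidth only when the measured value crosses into a different coarse level, and applications pick among advertised AS paths by the metric they care about.

```python
from bisect import bisect_right

# Hypothetical level boundaries in Mbps: level 0 is < 10, level 3 is >= 1000.
LEVEL_BOUNDS_MBPS = (10, 100, 1000)


def bandwidth_level(available_mbps):
    """Map a measured available bandwidth to a coarse level index."""
    return bisect_right(LEVEL_BOUNDS_MBPS, available_mbps)


class LinkAdvertiser:
    """Emits an update only when the discretized level changes."""

    def __init__(self):
        self.level = None

    def observe(self, available_mbps):
        """Return the new level if it must be advertised, else None."""
        new_level = bandwidth_level(available_mbps)
        if new_level == self.level:
            return None  # small fluctuations inside a level stay silent
        self.level = new_level
        return new_level


def pick_path(paths, app_class):
    """Choose an AS path by application class (assumed labels).

    "bandwidth": highest bottleneck level, then fewest AS hops.
    anything else: lowest latency, then fewest AS hops.
    """
    if app_class == "bandwidth":
        return max(paths, key=lambda p: (p["bw_level"], -len(p["as_path"])))
    return min(paths, key=lambda p: (p["latency_ms"], len(p["as_path"])))
```

Coarser levels mean fewer routing updates (stability) at the price of slower reaction to real changes (adaptation). A measurement hovering around a boundary can still flap between two levels; this sketch does not add hysteresis.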
Global Traffic Engineering (TE)
• For the current Internet, TE is executed in each
AS -- thus only local optimum is achieved
– Allowing the network to handle all possible traffic
patterns within the network's ingress-egress capacity
constraints (e.g. two-phase routing)
• With global information, we can potentially
achieve global optimum (or Nash equilibrium)
– Each AS is a selfish individual
– A center (or each AS) infers the Nash equilibrium
– Each AS can try the Nash equilibrium, or attempt to
benefit itself based on the inferred Nash equilibrium
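A toy comparison of local and global decisions, under stated assumptions. The candidate paths, the unit link capacity, and the bottleneck cost model are hypothetical and are not taken from the slides' figure. Each source first uses its locally preferred path; a brute-force search then finds the assignment with the lowest maximum link utilization, and a best-response loop lets each selfish source switch paths until no one can improve, which is one simple way to reach a Nash equilibrium in this small game.

```python
from itertools import product

# Hypothetical candidate AS paths per source, in local preference order.
CANDIDATE_PATHS = {
    "AS4": [("AS4", "AS2", "AS1"), ("AS4", "AS5", "AS3", "AS1")],
    "AS5": [("AS5", "AS2", "AS1"), ("AS5", "AS3", "AS1")],
}
DEMAND_GBPS = {"AS4": 1.0, "AS5": 1.0}
LINK_CAPACITY_GBPS = 1.0  # assumption: every inter-AS link has this capacity


def link_loads(choice):
    """Total traffic placed on each directed inter-AS link."""
    load = {}
    for src, path in choice.items():
        for link in zip(path, path[1:]):
            load[link] = load.get(link, 0.0) + DEMAND_GBPS[src]
    return load


def max_utilization(choice):
    return max(link_loads(choice).values()) / LINK_CAPACITY_GBPS


def bottleneck_for(src, choice):
    """Cost seen by one source: the most loaded link on its own path."""
    load = link_loads(choice)
    path = choice[src]
    return max(load[link] for link in zip(path, path[1:]))


def local_choice():
    """Each source uses its locally preferred path, ignoring the others."""
    return {src: paths[0] for src, paths in CANDIDATE_PATHS.items()}


def global_choice():
    """Brute-force search for the assignment minimizing max utilization."""
    sources = list(CANDIDATE_PATHS)
    combos = product(*(CANDIDATE_PATHS[s] for s in sources))
    return min((dict(zip(sources, c)) for c in combos), key=max_utilization)


def best_response_equilibrium(start, max_rounds=100):
    """Selfish sources switch paths in turn until nobody can improve."""
    choice = dict(start)
    for _ in range(max_rounds):
        changed = False
        for src, paths in CANDIDATE_PATHS.items():
            for path in paths:
                trial = {**choice, src: path}
                if bottleneck_for(src, trial) < bottleneck_for(src, choice):
                    choice, changed = trial, True
        if not changed:
            break
    return choice
```

In this toy instance the local choice sends both demands over the AS2 to AS1 link (utilization 2.0), while both the global search and the best-response loop end at utilization 1.0. Real inter-domain TE would have far too many combinations for brute force; the search here only makes the objective explicit.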
Example of Benefit of Global TE
[Figure: AS-level topology of AS 1 through AS 5 with two 1G traffic demands destined to AS 1]
• Without Global TE
[Figure: the same topology and demands, routed without global TE]
• With Global TE
[Figure: the same topology and demands, routed with global TE]
Unified Transparency Framework
for Various Functionality
• Sharing of anomaly/security-related
measurement
– Various characteristics of traffic: heavy hitter,
heavy changes, histogram, etc.
– Self-diagnosis to survivability
• Adaptations
– Routing adaptations at router level or application
level
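As a rough illustration of the traffic characteristics named above, the sketch below computes heavy hitters and heavy changes from per-flow byte counts. The function names, the fraction-of-total threshold, and the absolute-delta threshold are assumptions for the example; the slides do not specify how these statistics are defined or computed.

```python
def heavy_hitters(byte_counts, fraction=0.1):
    """Keys whose volume is at least `fraction` of the total volume."""
    total = sum(byte_counts.values())
    if total == 0:
        return set()
    return {key for key, vol in byte_counts.items() if vol >= fraction * total}


def heavy_changes(previous, current, min_delta):
    """Keys whose volume changed by at least `min_delta` between intervals."""
    keys = set(previous) | set(current)
    return {
        key
        for key in keys
        if abs(current.get(key, 0) - previous.get(key, 0)) >= min_delta
    }
```

Exact per-flow counters like these do not scale to backbone traffic; a deployed system would more likely share compact summaries, which this sketch does not attempt to model.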
Practical Issues and Solutions
• Incentives for information sharing
– Mandatory for next-generation Internet ?
– Alliance model for incremental growth
• Security/cheating: Trust but verify
– Trust most of the info shared but periodically verify
» Much easier than the current Internet tomography unless
many ASes collude
– Verification part of the protocol
» Some fields in the packet headers designed for that
purpose
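The "trust but verify" policy can be sketched as a periodic spot check. Everything here is an assumption for illustration: the report keys, the `probe` callback that returns an independent measurement, the sampling fraction, and the relative tolerance. The slides only state that most shared information is trusted and verified periodically.

```python
import random


def within_tolerance(reported, measured, rel_tol=0.2):
    """True if the reported value is within rel_tol of the measurement."""
    return abs(reported - measured) <= rel_tol * max(abs(measured), 1e-9)


def audit(reports, probe, sample_fraction=0.1, rel_tol=0.2, rng=None):
    """Verify a random sample of shared metrics against own measurements.

    reports: dict mapping a key such as (AS name, metric name) to the
             value the remote AS shared.
    probe:   callable returning an independent measurement for a key.
    Returns the sampled keys whose reported value looks inconsistent.
    """
    if not reports:
        return []
    rng = rng or random.Random()
    k = min(len(reports), max(1, round(sample_fraction * len(reports))))
    sampled = rng.sample(sorted(reports), k)
    return [
        key for key in sampled
        if not within_tolerance(reports[key], probe(key), rel_tol)
    ]
```

Sampling keeps verification cheap compared with measuring everything, and randomizing the sample makes it harder for a single AS to predict which claims will be checked.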
Summary
• Transparent Internet turns black-box networks into a “glass box”
• Enable/improve many functionalities
– Diagnosis and troubleshooting over the global Internet
– Flexible inter-domain routing
– Global traffic engineering
– Provable and proactive configuration management to
verify, validate and self-tune configuration without
interrupting the main operational network
Backup Materials
Summary
• Configuring the current Internet is highly
complex, unprovable and passive
• Our approach provides a fully proactive/autonomic
configuration architecture to verify, validate and
self-tune configuration without interrupting the
main operational network
• Our architecture uses a high-level goal-oriented
policy refinement approach
Objectives (cont.)
Provides a provable and proactive configuration for NGI
• Automated Configuration Management: from high-level “management objectives”
to configuration parameters
– Allow “Sami” to access all web servers except the ones in the “accounting” department
– QoS configuration is highly complex: FQ, shapers, RED classes
• Correct and Seamless network-wide configuration
– Creating a unified configuration representation and verifying the mapping
– Conflict detection and resolution
• Set-and-Test configuration validation framework
– Especially important for mission-critical networks
– Very useful for delay-sensitive applications
• Self-managed Internet
– Autonomic Configuration: configurations are auto-tuned dynamically to
achieve the “objectives”
– Proactive Configuration: configurations are auto-steered dynamically to avoid
predicted problems
• Searchable MIBs Configuration
– Using MIB objects tagged with metadata and semantic-web techniques to provide (1)
“multi-view” configuration management and (2) MIB information fusion
Technical Approach: Major security policy verification components include
(1) Policy modeling: BDD representation for all network security policies
(2) Consistency checking of global network policies
(3) Goal-oriented verification: verifying certain user-defined service properties
(4) Policy aggregation and translation to high-level-language and distribution
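A small sketch of steps (2) and (3), with a deliberate simplification: instead of BDDs it enumerates a tiny, hypothetical packet space by brute force, which only works at toy scale. The user names, server names, rule list, and first-match semantics are assumptions. The goal checked is the “Sami” objective quoted earlier: access to all web servers except those in the accounting department.

```python
from itertools import product

USERS = ["sami", "other"]  # hypothetical users
SERVERS = {  # hypothetical web servers and their departments
    "web-eng": "engineering",
    "web-sales": "sales",
    "web-acct": "accounting",
}

# First-match rule list: (user or "*", department or "*", action).
RULES = [
    ("sami", "accounting", "deny"),
    ("sami", "*", "allow"),
    ("sami", "sales", "deny"),  # intentionally shadowed by the rule above
    ("*", "*", "deny"),
]


def matches(rule, user, dept):
    r_user, r_dept, _action = rule
    return r_user in ("*", user) and r_dept in ("*", dept)


def decide(rules, user, server):
    """Action of the first matching rule; default deny."""
    dept = SERVERS[server]
    for rule in rules:
        if matches(rule, user, dept):
            return rule[2]
    return "deny"


def shadowed_rules(rules):
    """Indices of rules that are never the first match for any packet."""
    hit = set()
    for user, server in product(USERS, SERVERS):
        dept = SERVERS[server]
        for index, rule in enumerate(rules):
            if matches(rule, user, dept):
                hit.add(index)
                break
    return [i for i in range(len(rules)) if i not in hit]


def goal_holds(rules):
    """Sami reaches every web server except those in accounting."""
    return all(
        decide(rules, "sami", server)
        == ("deny" if dept == "accounting" else "allow")
        for server, dept in SERVERS.items()
    )
```

Here `shadowed_rules(RULES)` reports the third rule as a conflict, while `goal_holds(RULES)` confirms the objective is still met. A BDD encoding would express the same checks symbolically so they remain tractable for real address and port ranges.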
Configuration Definition and Verification Architecture
[Figure: architecture diagram. Labels: High Level Security Definition Language ("what"); Global Configuration Query Language; Goal-oriented Security Policy; Policy Tactics ("how"); per-device policies (FW, Router, IPSec), each with a UPR, aggregated into a Global UPR; Consistency Check; Goal Oriented Verification; Policy Segmentation; Policy Translation & Distribution to FW, Router, IPSec, IDS/IPS, and Access Point; Policy translation validation]
(3) Autonomic Programmable Network Control
[Figure: control loop over AS1 through AS5. Labels: Policies; PSA Model (PSA = problem-symptom-action); Symptoms; Problem Reasoning; Diagnose; Action selection; Actions; Evaluation; Feedback (Symptoms); Reporting/Visualization Tools]
Measurement Info to Share
• Basic metrics
– Delay, loss rate, capacity, available bandwidth
– Demand (or traffic volume) and application types
• Intra-AS Measurement Info
– Link-level info
» Queried only when necessary
– Aggregated Info
» OD flow level info
» Path segment b/t entry and exit points in each AS
• Inter-AS Measurement Info
– General AS relationship
– AS-level topology
– Inter-AS link metrics
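One way the shared metrics could be represented and combined is sketched below. The `Metrics` record and its field names are hypothetical, not a TGP message format. The composition rule is the usual one for hop-by-hop metrics: delays add, delivery probabilities multiply (assuming independent loss per segment), and available bandwidth is the minimum along the path.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Metrics:
    """Basic metrics for one path segment or inter-AS link (assumed units)."""
    delay_ms: float
    loss_rate: float          # fraction of packets lost, 0.0 to 1.0
    available_bw_mbps: float


def compose(segments):
    """Combine per-segment metrics into end-to-end metrics for a path."""
    if not segments:
        raise ValueError("a path needs at least one segment")
    return Metrics(
        delay_ms=sum(s.delay_ms for s in segments),
        # Loss is combined through delivery probability, assuming
        # losses on different segments are independent.
        loss_rate=1.0 - prod(1.0 - s.loss_rate for s in segments),
        available_bw_mbps=min(s.available_bw_mbps for s in segments),
    )
```

With aggregated per-AS path segments plus inter-AS link metrics, a consumer could estimate an AS path's performance without querying link-level detail, which matches the "queried only when necessary" note above.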
Transparent Internet Architecture
• Combined w/ routing info and exported to neighboring ASes through TGP
• Provide globally retrievable Management Information Base (MIB) with DHT
• Network link-level monitoring
Methodology
[Figure: iterative cycle of algorithm design, analytical evaluation, realistic simulation, and PlanetLab tests]
• Network topology
• Web workload
• Network end-to-end latency measurement
TGP MIB Dissemination Architecture
• Leverage Distributed Hash Table - Tapestry for
– Distributed, scalable location with guaranteed success
– Search with locality
[Figure: dissemination architecture. Data plane labels: data source, Web server, SCAN server, client, replica, cache, "always update", "adaptive coherence", Dynamic Replication/Update and Replica Management, Replica Location. Network plane labels: DHT mesh, Overlay Network Monitoring]
Adaptive Overlay Streaming Media
[Figure: overlay streaming testbed spanning Stanford, UC Berkeley, UC San Diego, and HP Labs, with a client, a server, an overlay relay node, and an overlay network operation center. Steps shown:]
1. Setup connection
2. Register trigger
3. Network congestion / failure
4. Detect congestion / failure
5. Alert + new overlay path
6. Setup new path
7. Skip-free streaming media recovery
• Implemented with Winamp client and SHOUTcast server
• Congestion introduced with a Packet Shaper
• Skip-free playback: server buffering and rewinding
• Total adaptation time < 4 seconds
Existing CDNs Fail to Address these Challenges
• No coherence for dynamic content
• Unscalable network monitoring: O(M × N), where M = # of client groups and N = # of servers
• Non-cooperative replication is inefficient
Problem Formulation
• Subject to certain total replication cost (e.g., # of URL replicas)
• Find a scalable, adaptive replication strategy to reduce avg access cost
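To make the formulation concrete, here is a greedy heuristic under stated assumptions. The slides do not say which strategy is used; greedy selection is swapped in only as an illustration. The latency table, the server names, and the use of "number of replicas" as the cost budget are hypothetical.

```python
def average_cost(latency, replicas):
    """Mean over client groups of the latency to the nearest replica."""
    nearest = [min(row[s] for s in replicas) for row in latency.values()]
    return sum(nearest) / len(nearest)


def greedy_placement(latency, origin, budget):
    """Add up to `budget` replicas, each time picking the server that
    lowers the average access cost the most.

    latency: dict client group -> dict server -> latency (ms).
    origin:  server that always holds the content.
    """
    replicas = [origin]
    candidates = set(next(iter(latency.values()))) - {origin}
    for _ in range(budget):
        if not candidates:
            break
        best = min(
            sorted(candidates),  # sorted for a deterministic tie-break
            key=lambda s: average_cost(latency, replicas + [s]),
        )
        if average_cost(latency, replicas + [best]) >= average_cost(latency, replicas):
            break  # no remaining candidate improves the cost
        replicas.append(best)
        candidates.remove(best)
    return replicas
```

Greedy placement is not guaranteed to be optimal and needs the full client-to-server latency table, which is exactly the O(M × N) monitoring cost noted earlier; it is shown only to clarify the objective and the budget constraint.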