yao_thesisdefense - Northwestern University

Internet Networking and
Application Troubleshooting
Yao Zhao
EECS Department
Northwestern University
1
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
– VScope, Lend, FAD and SPA
• Application Layer Troubleshooting
– Rake
• Conclusions and Future Work
2
Motivation
“When something breaks in the Internet, the
Internet's very decentralized structure makes it
hard to figure out what went wrong and even
harder to assign responsibility.”
- “Looking Over the Fence at Networks: A
Neighbor's View of Networking Research”, by
Committees on Research Horizons in
networking, National Research Council, 2001.
3
Troubleshooting Philosophy
• Entity Oriented Troubleshooting
– Monitor each entity separately
• E.g. Router packet drop rates, queue size and
other SNMP counters
• E.g. Machine CPU load, I/O intensity, network
utilization and other performance counters
– Potential problems
• Not all entities can be monitored
• Inferring entity performance from the counters may
be challenging
4
Troubleshooting Philosophy
• Entity Oriented Troubleshooting
• Task Based Troubleshooting
– Use task performance to infer entity
performance
• E.g. Infer link-level loss rates from Internet path
loss rates
– Advantage
• Work with limited monitor points (e.g. end hosts)
• Focus on target performance directly
5
Thesis Statements
• We design troubleshooting systems that
monitor and diagnose Internet
distributed systems at both the network layer
and the application layer, using the task-based
troubleshooting philosophy.
6
Publications
• Papers
– Y. Zhao, Y. Chen, S. Ratnasamy, Load balanced and Efficient Hierarchical Data-Centric Storage in Sensor Networks, in the Proc. of SECON 2008
– Y. Gao, Y. Zhao, R. Schweller, S. Venkataraman, Y. Chen, D. Song, and M. Kao,
Detecting Stealthy Spreaders Using Online Outdegree Histograms, in the Proc.
of IWQoS, 2007
– Y. Zhao and Y. Chen, A Suite of Schemes for User-level Network Diagnosis
without Infrastructure, in the Proc. of IEEE INFOCOM, 2007
– P. Narayana, R. Chen, Y. Zhao, Y. Chen, Z. Fu, and H. Zhou, Automatic
Vulnerability Checking of IEEE 802.16 WiMAX Protocols through TLA+, in Proc.
of NPSec, 2006
– Y. Zhao, Y. Chen, and D. Bindel, Towards Unbiased End-to-End Network
Diagnosis, in Proc. of ACM SIGCOMM 2006
– Y. Zhao, Q. Zhang, B. Li, Y. Chen and W. Zhu, Hop ID based Routing in Mobile
Ad Hoc Networks, in Proceedings of ICNP, 2005
• Patents
– E. C. Gillum, Q. Ke, Y. Xie, F. Yu and Y. Zhao, Graph Based Bot-User Detection,
being filed through Microsoft Corporation, MS docket number 324953.01.
– J. Wang, Y. Chen, D. Pei, Y. Zhao, and Z. Zhu, Towards Efficient Large-Scale
Network Monitoring and Diagnosis Under Operational Constraints, being filed
through AT&T, docket number 1209-144.
7
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
– VScope, Lend, FAD and SPA
• Application Layer Troubleshooting
– Rake
• Conclusions and Future Work
8
Motivation
[Figure: troubleshooting components (Model, Monitoring, Diagnosis) shown against the protocol layers (Application, Transport, Network, Data Link)]
9
Components in Network
Troubleshooting
• Model
– Defines the extrinsic observations and
intrinsic faulty problems as well as the
relationship between them
• Monitoring
– Collect the observations
• Diagnosis
– Identify the faulty location and find out the root
cause
10
Thesis Research Topics
[Figure: thesis projects (LEND, FAD and SPA, VScope, Rake) placed by component (Model, Monitoring, Diagnosis) and protocol layer (Application, Transport, Network, Data Link)]
11
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
– VScope, Lend, FAD and SPA
• Application Layer Troubleshooting
– Rake
• Conclusions and Future Work
12
Network Layer Troubleshooting
• LEND [Sigcomm06]
– Tomography-based diagnosis with the fewest statistical
assumptions
• FAD & SPA [Infocom07]
– On-demand loss rate diagnosis without
infrastructure
• VScope [Patent]
– Experimental design for ISP VPN network
monitoring and diagnosis
13
LEND
• Basic Assumptions
– End-to-end measurement can infer the end-to-end
properties accurately
– Link level properties are independent
• Problem Formulation
– Given end-to-end measurements, what is the finest
granularity of link properties we can achieve under
the basic assumptions?
[Figure: diagnosis granularity versus assumptions; labels: better accuracy, basic assumptions, more and stronger statistical assumptions, virtual link]
14
LEND
• Contributions
– Define the minimal identifiable unit under basic
assumptions (MILS)
– Prove that only E2E paths are MILS with a directed
graph topology (e.g., the Internet)
– Propose the good path algorithm (incorporating
measurement path properties) for finer MILS
[Figure: diagnosis granularity versus assumptions; labels: better accuracy, basic assumptions, more and stronger statistical assumptions, virtual link]
15
FAD & SPA
• Motivation
– How do end users, with no special privileges,
identify packet loss inside the network with
one or two computers?
• Conclusions
– We proposed three user-level loss rate
diagnosis approaches
– Combining our approaches with Tulip
[SOSP03] works much better than any single
approach
16
VScope Motivation
• Two Important Services Provided by ISP
– Internet access service
– VPN service
• Monitoring and Diagnosis on ISP
Networks
– Ensure Service Level Agreement (SLA)
– Help Network Operations
17
Problem Definition (1)
• Challenges in ISP Network Monitoring and
Diagnosis
– Operational constraints on monitors and links
• A monitor can measure a certain number of paths at a time
• The measurement traffic through a link cannot exceed a
threshold (e.g. 1% of the link bandwidth)
• Path and monitor selection constraints
– Monitor installation is costly
– Real-time diagnosis
– Special star-like topology features of ISP networks
• Access links should be monitored
• The backbone topology extended with access links
(backboneExt) is large and star-like
18
Problem Definition (2)
• Monitor Setup Phase
– From the monitor candidates, select the minimal
number of monitors that, in the measurement phase,
can measure a path set covering all links in
the network under the given measurement constraints
– NP-hard even without considering constraints
• Monitoring and Fault Diagnosis Phase
– When faulty paths are discovered in the path
monitoring phase, how can we quickly select additional
paths to measure, under the operational constraints,
so that the faulty link(s) can be accurately
identified?
19
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
– VScope, Lend, FAD and SPA
• Application Layer Troubleshooting
– Rake
• Conclusions and Future Work
20
Rake: Semantic Assisted Large
Distributed System Diagnosis
• Motivation
• Related Work
• Rake
• Evaluation
• Conclusions
21
Motivation
• Large distributed systems
involve hundreds or thousands
of nodes
– E.g. search system, CDN
• Host-based monitoring cannot
infer the performance or detect
bugs
– Hard to translate OS-level info
(such as CPU load) into application
performance
– Application log may not be enough
• Task-based approach is
adopted in many diagnosis
systems
– WAP5, Magpie, Sherlock
[Figure: search system architecture with Load Balancer, Web Servers, Dispatchers, Aggregator and Index Servers]
22
Task-based Approaches
• The Critical Problem – Message Linking
– Link the messages in a task together into a
path or tree
23
Example of Message Linking in Search System
[Figure: search system (Load Balancer, Web Servers, Dispatchers, Aggregator, Index Servers); message labels: URL, search keyword, doc ID]
24
Task-based Approaches
• The Critical Problem – Message Linking
– Link the messages in a task together into a path or tree
• Black-box approaches
– Do not need to instrument the application or to understand its
internal structure or semantics
– Time correlation to link messages
• Project 5, WAP5, Sherlock
• White-box approaches
– Extract application-level data, which requires instrumenting the
application and possibly understanding the application's source
code
– Insert a unique ID into messages in a task
• X-Trace, Pinpoint
25
Problems of Black-Box
• Time Correlation
– Affected by cross traffic
26
Related Work
[Figure: related work positioned by application knowledge (black-box, grey-box, white-box) and by invasiveness, from non-invasive to invasive (network sniffing, interposition, application or OS logs, source code modification). Systems shown: Project 5, Sherlock, WAP5, Footprint, Rake, Magpie, X-Trace and Pinpoint]
27
Rake
• Key Observations
– Generally no unique ID linking the messages
associated with the same request
– Polymorphic IDs exist at different stages of
the request
• Semantic Assisted
– Use the semantics of the system to identify
polymorphic IDs and link messages
28
Message Linking Example
[Figure: search system (Load Balancer, Web Servers, Dispatchers, Aggregator, Index Servers); message labels: URL, search keyword, doc ID]
29
Questions on Semantics
• What Are the Necessary Semantics?
– In the worst case, re-implement the application
• How Does Rake Use the Semantics?
– Naïve design is to implement Rake for each
application with specific application semantics
• How Efficient Is Rake with Semantics?
– Can message linking be accurate?
– What’s the computational complexity of Rake?
30
Necessary Semantics
• Intra-node linking
– The system semantics
• Inter-node linking
– The protocol semantics
[Figure: a node with messages P, Q, R and S]
31
Utilize Semantics in Rake
• Implementing a Different Rake for Each
Application Is Time Consuming
– Lesson learned from implementing two versions
of Rake for CoralCDN and IRC
• Design Rake to take general semantics
– A unified infrastructure
– Provide a simple language for users to supply
semantics
32
Example of Rake Language (IRC)
<?xml version="1.0" encoding="ISO-8859-1"?>
<Rake>
  <Message name="IRC PRIVMSG">
    <Signature>
      <Protocol> TCP </Protocol>
      <Port> 6667 </Port>
    </Signature>
    <Link_ID>
      <Type> Regular expression </Type>
      <Pattern> PRIVMSG\s+(.*) </Pattern>
    </Link_ID>
    <Follow_ID id="0">
      <Type> Same as Link ID </Type>
    </Follow_ID>
    <Query_ID>
      <Type> No Return ID </Type>
    </Query_ID>
  </Message>
</Rake>
[Figure: messages P, Q, R and S, annotated with the matches Follow_ID = Link_ID and Query_ID = Response_ID]
33
Signature
• Signature to Classify Messages
– <Signature>
• <Protocol> TCP </Protocol>
• <Port> 6667 </Port>
– </Signature>
• Formats of Signatures
– Socket information
• Protocol, port
– Expression for TCP/IP header
• udp[10] & 128 == 0
– Regular expression
– User defined function
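A minimal Python sketch of how a signature of this kind could classify a captured packet. The `Packet` record, the `matches_signature` helper and the sample payloads are hypothetical and are not part of Rake; only the protocol, the port and the `udp[10] & 128 == 0` test come from the slide. Byte 10 counted from the start of the UDP header is the first DNS flags byte, whose top bit is 0 for a query.

```python
from dataclasses import dataclass


@dataclass
class Packet:              # hypothetical capture record, not a Rake type
    protocol: str          # "TCP" or "UDP"
    port: int              # server-side port
    l4_bytes: bytes        # transport header followed by payload


def matches_signature(pkt, protocol, port, byte_test=None):
    """Return True if pkt matches the socket info and the optional header test."""
    if pkt.protocol != protocol or pkt.port != port:
        return False
    return byte_test is None or byte_test(pkt.l4_bytes)


def is_dns_query(l4_bytes):
    # Python equivalent of the expression udp[10] & 128 == 0.
    return len(l4_bytes) > 10 and (l4_bytes[10] & 128) == 0


udp_header = bytes(8)                                      # dummy 8-byte UDP header
dns_query = udp_header + bytes([0x12, 0x34, 0x01, 0x00])   # QR bit clear
dns_reply = udp_header + bytes([0x12, 0x34, 0x81, 0x80])   # QR bit set

query_pkt = Packet("UDP", 53, dns_query)
reply_pkt = Packet("UDP", 53, dns_reply)
irc_pkt = Packet("TCP", 6667, bytes(20) + b"PRIVMSG #chan :hi")

print(matches_signature(query_pkt, "UDP", 53, is_dns_query))
print(matches_signature(reply_pkt, "UDP", 53, is_dns_query))
```

A regular expression or user-defined function would slot in the same way as `byte_test`, as another predicate over the packet bytes.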
34
Link_ID and Follow_ID
• Follow_IDs
– The IDs that will appear in the messages triggered by this
message
– One message may have multiple Follow_IDs for
triggering multiple messages
• Link_ID
– The ID of the current message
– Matched against Follow_IDs previously seen
• Linking of Link_ID and Follow_ID
– Mainly for intra-node message linking
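The Link_ID and Follow_ID matching can be sketched in Python as follows. This is an illustration under assumptions, not Rake's implementation: the `link_messages` function, the message names and the sample trace are hypothetical. The regular expression is the Link_ID pattern from the IRC example, and the Follow_ID is taken to be the same as the Link_ID.

```python
import re

PRIVMSG = re.compile(r"PRIVMSG\s+(.*)")   # Link_ID pattern from the IRC example


def link_messages(messages):
    """Link each message to the earlier message whose Follow_ID equals its Link_ID.

    messages: list of (name, payload) in arrival order at one node.
    Returns a list of (parent_name, child_name) edges.
    """
    pending = {}   # Follow_ID -> name of the message that announced it
    edges = []
    for name, payload in messages:
        m = PRIVMSG.search(payload)
        if m is None:
            continue                       # pattern did not match this message
        link_id = m.group(1)
        if link_id in pending:
            edges.append((pending[link_id], name))
        # Follow_ID is "Same as Link ID": triggered messages carry the same ID.
        pending[link_id] = name
    return edges


trace = [
    ("in-1", "PRIVMSG #rake :hello"),
    ("in-2", "PRIVMSG #rake :status?"),
    ("out-1", "PRIVMSG #rake :hello"),     # relayed copy of in-1
    ("out-2", "PRIVMSG #rake :status?"),   # relayed copy of in-2
]
edges = link_messages(trace)
print(edges)
```

Because the link is made on message content, the two interleaved requests are separated correctly even though their timing overlaps.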
35
Query_ID and Response_ID
• Query_IDs
– The communication is in query/response style, e.g.
RPC calls and DNS query/response.
– The IDs that will appear in the response messages to this
message
• Response_ID
– The ID of the current message, matched against Query_IDs
previously seen
– By default requires the query and response to use the
same socket
• Linking of Query_ID and Response_ID
– Mainly for inter-node message linking
36
Complicated Semantics
• The process of generating IDs may be
complicated
– XML and regular expressions are not good at
complex computations
– So let users provide their own functions
• Users provide shared/dynamic libraries
• Specify the functions for IDs in XML
• Implementation uses Libtool to load user-defined
functions at runtime
37
Example for DNS
<?xml version="1.0" encoding="ISO-8859-1"?>
<Rake>
  <Message name="DNS Query">
    <Signature>
      <Protocol> UDP </Protocol>
      <Port> 53 </Port>
      <Expression> udp[10] &amp; 128 == 0 </Expression>
    </Signature>
    <Link_ID>
      <!-- Extract the queried host -->
      <Type> User Function </Type>
      <Libray> dns.so </Libray>
      <Function> Link_ID </Function>
    </Link_ID>
    <Follow_ID id="0">
      <Type> Link_ID </Type>
    </Follow_ID>
    <Query_ID>
      <Type> Link_ID </Type>
    </Query_ID>
  </Message>
  ……………………………..
38
Accuracy Analysis
• One-to-one ID Transformation
– Examples
• In search, URL -> Keywords -> Canonical format
• In CoralCDN, URL -> SHA-1 hash value
– Ideally no error if requests are distinct
• Request ambiguity
– Search keywords
• Microsoft search data
• Fewer than 1% of messages are duplicated within 1 s
– Web URL
• Two real HTTP traces
• Fewer than 1% of messages are duplicated within 1 s
– Chat messages
• No duplication with timestamps
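One way such a duplication figure could be measured on a trace is sketched below in Python. The slide does not define duplication precisely, so the rule used here is an assumption: a message counts as duplicated when the same ID was seen at most `window` seconds earlier. The function name and the sample records are hypothetical.

```python
def duplicate_fraction(records, window=1.0):
    """Fraction of messages whose ID was already seen within `window` seconds.

    records: list of (timestamp_seconds, message_id) sorted by timestamp.
    """
    last_seen = {}   # message_id -> timestamp of its most recent occurrence
    duplicates = 0
    for ts, msg_id in records:
        prev = last_seen.get(msg_id)
        if prev is not None and ts - prev <= window:
            duplicates += 1
        last_seen[msg_id] = ts
    return duplicates / len(records) if records else 0.0


sample = [(0.0, "a"), (0.4, "b"), (0.9, "a"), (2.5, "a"), (2.6, "c")]
print(duplicate_fraction(sample))
```

In the sample only the second "a" falls within one second of an earlier copy, so one message out of five is counted.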
39
Potential Applications
• Search
– Verified by a contact at Microsoft
• CDN
– CoralCDN is studied and evaluated
• Chat System
– IRC is tested
• Distributed File System
– Hadoop DFS is tested
40
Evaluation
• Application
– CoralCDN
– Deployed on PlanetLab
• Experiment
– Employ PlanetLab hosts as web clients
– Retrieve URLs from real traces at different
frequencies
• Metrics
– Linking accuracy (false positive, false negative)
– Diagnosis ability
• Compared Approach
– WAP5
41
CoralCDN Task Tree
42
Message Linking Accuracy
• Rake Linking Accuracy is 100% for
CoralCDN
– SHA-1 hash provides an almost one-to-one URL
to HashID mapping
– The cache mechanism
• If the same URL is received twice, the second request is
blocked until the first one has retrieved the
webpage
• Use Rake Linking as Ground Truth to
Evaluate WAP5
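A small Python sketch of how linking accuracy could be scored against a ground-truth set of links. The normalization is an assumption, since the slides do not state it: both rates are expressed as a percentage of the number of true links, which lets the false positive percentage exceed 100. The function name and the sample link sets are hypothetical.

```python
def linking_error_rates(truth, inferred):
    """Return (false_negative_pct, false_positive_pct) relative to the true links."""
    truth, inferred = set(truth), set(inferred)
    if not truth:
        raise ValueError("ground truth must contain at least one link")
    fn = len(truth - inferred) / len(truth) * 100      # true links that were missed
    fp = len(inferred - truth) / len(truth) * 100      # inferred links that are wrong
    return fn, fp


truth = {("m1", "m2"), ("m2", "m3"), ("m4", "m5"), ("m5", "m6")}
inferred = {("m1", "m2"), ("m2", "m6"), ("m4", "m5"), ("m4", "m3"), ("m5", "m3")}
fn_pct, fp_pct = linking_error_rates(truth, inferred)
print(fn_pct, fp_pct)
```

Here the ground-truth links would come from Rake and the inferred links from the approach under evaluation.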
43
Message Linking Accuracy (1)
[Figure: WAP5 false negative percentage (y-axis, 0 to 80%) versus request rate (x-axis: 33, 53, 69, 93, 118)]
The higher the request rate, the lower the accuracy of WAP5.
44
Message Linking Accuracy (2)
[Figure: WAP5 false positive percentage (y-axis, 0 to 200%) versus request rate (x-axis: 33, 53, 69, 93, 118)]
The higher the request rate, the lower the accuracy of WAP5.
45
Diagnosis Ability
• Controlled Experiments
– Inject junk CPU-intensive processes
– Calculate the packet processing time using WAP5 and Rake
[Figure: packet processing time computed by Rake and by WAP5 for koala_CPU_10 through koala_CPU_100; y-axis from 0 to 0.16]
Rake can identify the slow machine, while WAP5 fails.
46
Discussion
• Implementation Experience
– How hard is it for a user to provide the semantics?
• CoralCDN – 1 week source code study
• DNS – a couple of hours
• Hadoop DFS – 1 week source code study
• Inter-process Communication
• Encryption
– Dynamic library interposition
47
Conclusions of Rake
• Feasibility
– Rake works for many popular applications in different
categories
• Ease of Use
– Rake allows users to write semantics in XML
– Necessary semantics are easy to obtain, in our
experience
• Accuracy
– Much more accurate than black-box approaches and
probably matches white-box approaches
48
Outline
• Motivation
• Dissertation Overview
• Network Layer Troubleshooting
– VScope, Lend, FAD and SPA
• Application Layer Troubleshooting
– Rake
• Conclusions and Future Work
49
Conclusions and Future Work
• Demonstrate Task-based Troubleshooting Is
Promising
– Network layer troubleshooting
• VScope, LEND, FAD and SPA
– Application layer troubleshooting
• Rake
• Future Work
– Extend Rake in diagnosis
• Timeline for Thesis Writing
– From present to Feb. 1
50
Q & A?
Thanks!
51
52
Backup
53
Monitor Setup Phase
• Single-round Monitoring
– Measure all the target paths simultaneously
– The basic approach, adopted by most monitoring
experimental design papers
• Multi-round Monitoring
– Measure the target paths in different time periods
(rounds)
• Tradeoff between time and link/node constraints
– Multi-round monitoring is necessary and efficient for
two reasons
• Existence of operational constraints
• Star-like topology
54
Single-Round Monitor Selection
• Pure Greedy Algorithm
– Select monitors one by one and every time
select the monitor that can measure most
uncovered links under the constraints
• Calculating the gain of adding a new monitor is a
variant of the Maximum k-Coverage problem
– Simple and locally optimal
• Greedy Assisted Linear Programming
based algorithm
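The pure greedy selection can be sketched in Python as below. This is a simplified illustration, not the VScope algorithm: the operational constraints are ignored, so the gain of a monitor is simply the number of uncovered links it can measure. The `coverage` map, the monitor names and the link names are hypothetical.

```python
def greedy_monitors(coverage, links):
    """Pick monitors one by one, each time taking the one covering most uncovered links.

    coverage: dict monitor -> set of links it can measure (constraints ignored here).
    """
    uncovered = set(links)
    chosen = []
    while uncovered:
        # Ties are broken by monitor name so the result is deterministic.
        best = max(sorted(coverage), key=lambda m: len(coverage[m] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:
            raise ValueError(f"links cannot be covered: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gain
    return chosen


coverage = {
    "m1": {"a", "b", "c"},
    "m2": {"c", "d"},
    "m3": {"d", "e", "f"},
    "m4": {"a", "f"},
}
chosen = greedy_monitors(coverage, "abcdef")
print(chosen)
```

With per-monitor and per-link constraints, computing the gain itself becomes a coverage problem, which is the harder case the slide refers to.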
55
Greedy Assisted Linear
Programming based algorithm
• Formulate Integer Linear Programming First
– ILP is an NP-hard problem
• Relaxation to Linear Programming
– Change every {0,1}-variable to a continuous variable
between 0 and 1
• Randomized Rounding
– Solve the linear program in polynomial time
– Round the solutions within [0, 1] back to {0,1}-integers
with certain probabilities
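The rounding step can be illustrated with a small Python sketch. It assumes the fractional solution has already been produced by an LP solver, and it repeats the rounding until every link is covered, which is a common textbook variant rather than something the slides specify. The function name, the toy instance and the fixed random seed are hypothetical.

```python
import random


def randomized_rounding(fractional, coverage, links, rng, max_rounds=50):
    """Round a fractional LP solution to a 0/1 monitor selection that covers all links.

    fractional: dict monitor -> LP value in [0, 1] (assumed to come from an LP solver).
    coverage:   dict monitor -> set of links it can measure.
    """
    links = set(links)
    chosen = set()
    for _ in range(max_rounds):
        for monitor, value in fractional.items():
            if rng.random() < value:       # keep the monitor with probability `value`
                chosen.add(monitor)
        covered = set().union(*(coverage[m] for m in chosen))
        if links <= covered:
            return chosen
    raise RuntimeError("no covering selection found; increase max_rounds")


coverage = {"m1": {"a", "b"}, "m2": {"b", "c"}, "m3": {"a", "c"}}
fractional = {"m1": 0.5, "m2": 0.5, "m3": 0.5}   # LP optimum of this toy instance
selection = randomized_rounding(fractional, coverage, "abc", random.Random(0))
print(sorted(selection))
```

The toy instance shows why rounding is needed: the LP optimum uses 1.5 monitors in total, while any integral cover needs at least two.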
56
Multi-round Monitor Selection
• Star-like Topology and Operational Constraints
Make Single-round Monitor Selection Inefficient
– Multi-round monitoring vs. reducing measurement
frequency
• Algorithms for Multi-round Monitor Selection
– Multiply the constraints by the number of rounds and
run single-round monitor selection
– Schedule the paths to measure in different rounds
• Greedy scheduling
• Random scheduling
• Linear programming based scheduling
57
Path Measurement Scheduling
• Greedy algorithm
– Minimize link utilization in every step
• Random algorithm
– Randomly schedule paths independently
– Run random algorithm multiple times to get
the best one
• Linear Programming based algorithm with
random rounding
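A Python sketch of the greedy scheduling idea, under simplifying assumptions: every path sends one unit of probe traffic on each link it traverses, every path traverses at least one link, and each path is placed in the round where it raises the peak link load the least. The function name and the star-like sample topology are hypothetical.

```python
def greedy_schedule(paths, num_rounds):
    """Assign each path to the round where it raises the peak link load the least.

    paths: dict path name -> non-empty list of links it traverses.
    Returns (assignment dict path -> round index, peak per-link load over all rounds).
    """
    load = [dict() for _ in range(num_rounds)]      # per round: link -> probe count
    assignment = {}
    # Longer paths first; ties keep their original order.
    for name, links in sorted(paths.items(), key=lambda kv: -len(kv[1])):
        def peak_after(r):
            return max(load[r].get(link, 0) + 1 for link in links)
        best = min(range(num_rounds), key=peak_after)
        for link in links:
            load[best][link] = load[best].get(link, 0) + 1
        assignment[name] = best
    peak = max(max(r.values(), default=0) for r in load)
    return assignment, peak


paths = {
    "p1": ["acc1", "core"],
    "p2": ["acc2", "core"],
    "p3": ["acc3", "core"],
    "p4": ["acc4", "core"],
}
assignment, peak = greedy_schedule(paths, 2)
print(assignment, peak)
```

In this star-like example all four paths share the core link, so spreading them over two rounds halves the load that a single round would place on it.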
58
Monitoring and Diagnosis
• Path Monitoring and Faulty Path Discovery
• Faulty Link Diagnosis
– Select and measure paths that favor
the diagnosis of the potentially faulty links
[Figure: VScope flowchart. VScope Setup: monitor selection and deployment. VScope Operation: iterative continuous path monitoring; when faulty paths are found (Y), link diagnosis runs, otherwise (N) monitoring continues]
59
Background and Related Work
• Network Layer Diagnosis
– Linear algebraic model
– Monitoring experimental design
– Diagnosis algorithms
• Application Layer Diagnosis
– Sherlock: enterprise network service
diagnosis
60
Linear Algebraic Model
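Path loss rate $p_i$, link loss rate $l_j$:
$$1 - p_1 = (1 - l_1)(1 - l_2)$$
$$\log(1 - p_1) = \log(1 - l_1) + \log(1 - l_2) = \begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} \log(1 - l_1) \\ \log(1 - l_2) \\ \log(1 - l_3) \end{bmatrix}$$
[Figure: example topology with nodes A, B, C and D, links 1, 2 and 3, and paths p1 and p2]
With $x_j = \log(1 - l_j)$ and $b_i = \log(1 - p_i)$, the measured paths give the linear system $Gx = b$, usually an underconstrained system:
$$\begin{bmatrix} 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b_1, \qquad \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$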
61
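A short numeric illustration of the linear algebraic model, written in Python. The two-path, three-link routing matrix `G` and the loss rates are illustrative assumptions, not measurements. The sketch applies the log transform to independent link loss rates and shows that two different link-level assignments produce identical path loss rates, which is what makes the system underconstrained.

```python
import math

# Illustrative routing matrix: rows are paths, columns are links.
G = [[1, 1, 0],      # path 1 traverses links 1 and 2
     [1, 1, 1]]      # path 2 traverses links 1, 2 and 3


def path_loss(link_loss, G):
    """Path loss rates implied by independent link loss rates: 1 - p = prod(1 - l)."""
    x = [math.log(1 - l) for l in link_loss]                  # x_j = log(1 - l_j)
    b = [sum(g * xj for g, xj in zip(row, x)) for row in G]   # b = G x
    return [1 - math.exp(bi) for bi in b]                     # p_i = 1 - exp(b_i)


case_a = path_loss([0.10, 0.00, 0.05], G)   # loss on link 1
case_b = path_loss([0.00, 0.10, 0.05], G)   # same loss moved to link 2
print(case_a, case_b)
```

Links 1 and 2 always appear together in these paths, so end-to-end measurements alone cannot tell which of the two is lossy.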
Monitoring Experimental Design
• Monitor Placement Problem
– Select the fewest monitors that can measure a set of
paths covering all the links [Infocom03]
• Path Selection Problem
– Selection of the basis of the path matrix
[Sigcomm04]
– SVD based path selection [Infocom05]
– Bayesian experimental design [Sigmetrics06]
• Network Layer Diagnosis
62
Network Layer Diagnosis
• Internet Tomography
– Temporal-correlation-based algorithms
• Unbiased if multicast is supported
– Statistical algorithms
• Introduce additional statistical assumptions or
optimization goals
63