On the efficient detection of
elephant flows in aggregated
network traffic
Javier Rivillo Lopez
Jose Alberto Hernandez
Iain W. Phillips
Networks and Control Group
Research School of Informatics
Loughborough University
J.Rivillo-Lopez@lboro.ac.uk
J.A.Hernandez@lboro.ac.uk
I.W.Phillips@lboro.ac.uk
LCS, 2005
Outline
• Motivation
• Flow analysis
• Detection method
• Experiments and results
• Conclusions
Motivation (I):
Definitions
• Flow: unidirectional set of packets of the same
transport protocol sharing the same source and
destination IP addresses and ports.
• Elephant flow: stream of packets which
contribute to network load substantially more
than the rest of the flows.
– A threshold must be defined by the network
managers/administrators.
– The threshold value depends on the network size.
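The two definitions above can be sketched in Python. This is only an illustration: the packet tuple layout, the names FlowKey, bytes_per_flow and elephant_flows, and the choice of a byte-count threshold are assumptions, since the slides do not fix the unit of the threshold.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Unidirectional flow identifier: protocol plus source/destination IP and port."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def bytes_per_flow(packets):
    """Sum packet sizes per flow.

    Each packet is assumed to be a tuple
    (protocol, src_ip, src_port, dst_ip, dst_port, size_bytes).
    """
    totals = defaultdict(int)
    for proto, src_ip, src_port, dst_ip, dst_port, size in packets:
        totals[FlowKey(proto, src_ip, src_port, dst_ip, dst_port)] += size
    return dict(totals)


def elephant_flows(totals, threshold_bytes):
    """Flows whose byte count reaches the administrator-chosen threshold (assumed unit: bytes)."""
    return {key for key, total in totals.items() if total >= threshold_bytes}
```

Because a flow is unidirectional, the reverse direction of the same connection gets its own key and is counted separately.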
Motivation (II):
Elephant and mice phenomenon
• Usually a few flows carry most of the data
• Trace example from NLANR router. In future, MASTS.
• 0.1% of the flows carry nearly 83% of the total traffic
Figure 1: A flow aggregation view of network traffic
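A figure such as "0.1% of the flows carry nearly 83% of the total traffic" can be obtained from per-flow byte counts roughly as follows. The function name top_share and the rounding rule (at least one flow, fraction rounded up) are assumptions made for this sketch, not details taken from the trace analysis.

```python
import math


def top_share(flow_bytes, fraction):
    """Share of total bytes carried by the largest `fraction` of flows."""
    sizes = sorted(flow_bytes, reverse=True)
    if not sizes:
        return 0.0
    # Assumption: round the flow count up and always keep at least one flow.
    k = max(1, math.ceil(fraction * len(sizes)))
    return sum(sizes[:k]) / sum(sizes)
```

For example, top_share([80, 10, 5, 5], 0.25) gives 0.8: the single largest of four flows carries 80% of the bytes.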
Motivation (III):
Elephant and mice phenomenon
• Elephant flows
– Low-priority applications
– Large data transfer transactions and peer-to-peer file
sharing
• Mice flows
– Sensitive to delay, jitter and high loss rates.
– Voice over IP, online gaming, small http requests
• Under this phenomenon, Internet's best effort
delivery is not suitable.
• The performance of the network can be
improved by detecting the elephant flows and
applying traffic engineering solutions
Flow analysis (II):
Flow duration
Figure 2: Flow duration histogram
• Most of the mice flows (92%) have very short duration (< 2 sec)
• Most elephants are long duration flows: heavy tail behaviour.
Flow analysis (III):
Mean interarrival time
Figure 3: Flow mean interarrival histogram
• Elephants have very low average packet interarrival time.
• So, a flow with high average packet rate and long duration is very
likely to be an elephant.
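The two per-flow quantities used in the histograms, duration and mean packet interarrival time, can be computed from a flow's packet timestamps as in this minimal sketch. The function name flow_stats and the use of seconds are assumptions.

```python
def flow_stats(timestamps):
    """Return (duration, mean_interarrival) for one flow.

    `timestamps` is a non-empty list of packet arrival times in seconds.
    The mean interarrival time is None for a single-packet flow,
    because there is no gap to measure.
    """
    ts = sorted(timestamps)
    duration = ts[-1] - ts[0]
    if len(ts) < 2:
        return duration, None
    # The sum of consecutive gaps equals last - first, so the mean gap
    # is the duration divided by the number of gaps.
    return duration, duration / (len(ts) - 1)
```

A flow with a long duration and a small mean interarrival time is the elephant candidate described above.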
Detection method (I):
Sampling
• Requirement: Low computational cost
• Continuous monitoring not suitable:
– Requires huge amount of resources.
– Not scalable.
• Sampling is required.
• Random sampling not suitable because we lose
information about the packet interarrival and timing.
• Solution: Windowing.
– Example: monitoring the network 20ms every 2 seconds.
• Monitoring 1% of the time, Sampling factor = 2sec/20ms = 100
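The windowing example can be checked with a few lines of Python. This sketch assumes integer millisecond timestamps and a window placed at the start of each period; the slides do not say where in the period the window sits, and the function names are illustrative.

```python
def in_window(t_ms, w_ms=20, period_ms=2000):
    """True if a packet arriving at t_ms falls inside a monitoring window.

    Assumption: each window covers the first w_ms of every period.
    """
    return t_ms % period_ms < w_ms


def monitored_fraction(w_ms=20, period_ms=2000):
    """Fraction of time the network is monitored (0.01, i.e. 1%, for the defaults)."""
    return w_ms / period_ms


def sampling_factor(w_ms=20, period_ms=2000):
    """Period length divided by window length (100 for the defaults)."""
    return period_ms / w_ms
```

Unlike random packet sampling, every packet inside a window is kept, so interarrival times within the window are preserved.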
Detection method (II):
Elephant detection algorithm
• Objective: identify flows with low packet interarrival
time (high packet rate) and long duration: Elephants.
• Step 1. A flow has high packet rate when it has at least
Np packets in a sampling window.
• Step 2. A flow is considered elephant when it has been
identified as high packet rated flow in Nw different
sampling windows.
• Parameters:
– Np, Nw, w (sampling window length) and T (sampling period)
• In future, the algorithm will be adaptive: The
parameters will be calculated automatically by the
system.
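Steps 1 and 2 can be sketched as an offline pass over a packet list. This is an illustrative reading of the slides rather than the authors' implementation: the input format (timestamp in milliseconds, flow identifier), the window placement at the start of each period, and the function name detect_elephants are all assumptions.

```python
from collections import Counter, defaultdict


def detect_elephants(packets, Np=2, Nw=2, w_ms=20, T_ms=2000):
    """Return the set of flows classified as elephants.

    packets: iterable of (timestamp_ms, flow_id) pairs.
    Np: minimum packets in one window for a flow to count as high rate.
    Nw: number of windows in which the flow must be high rate.
    w_ms, T_ms: sampling window length and sampling period.
    """
    # Count packets per flow inside each sampling window only.
    per_window = defaultdict(Counter)
    for t_ms, flow in packets:
        if t_ms % T_ms < w_ms:  # assumption: window at the start of each period
            per_window[t_ms // T_ms][flow] += 1

    # Step 1: mark a flow as high rate in every window where it has >= Np packets.
    high_rate_windows = Counter()
    for counts in per_window.values():
        for flow, n in counts.items():
            if n >= Np:
                high_rate_windows[flow] += 1

    # Step 2: a flow is an elephant once it was high rate in >= Nw windows.
    return {flow for flow, hits in high_rate_windows.items() if hits >= Nw}


packets = [
    (0, "A"), (5, "A"), (2001, "A"), (2010, "A"),  # high rate in two windows
    (1, "B"), (3, "B"), (500, "B"),                # high rate in one window only
    (0, "C"), (2000, "C"), (4000, "C"),            # one packet per window
]
print(detect_elephants(packets))  # {'A'}
```

A live monitor would keep only the per-flow hit counters between windows rather than storing packets; the batch form is used here to keep the two steps visible.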
Experiments and Results
• Results with w=20ms and T=2sec:
Figure 4: Flows identified as elephant traffic
• Np=2, Nw=2: 80% of the elephant flows are correctly identified
(they carry 89% of the total traffic), and 0.12% of the mice flows are
misidentified as elephants.
• Increasing Np and Nw gives higher Precision but lower Recall.
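Precision and Recall are used here in their standard sense: Precision is the fraction of detected flows that are true elephants, and Recall is the fraction of true elephants that are detected. A minimal sketch, assuming both sets of flow identifiers are available (the function name and the convention of returning 0.0 for an empty set are assumptions):

```python
def precision_recall(detected, true_elephants):
    """Return (precision, recall) for a set of detected flows.

    detected: set of flow ids the detector labelled as elephants.
    true_elephants: set of flow ids that really are elephants.
    """
    true_positives = len(detected & true_elephants)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(true_elephants) if true_elephants else 0.0
    return precision, recall
```

Raising Np or Nw makes the detector stricter, so fewer mice are mislabelled (higher Precision) while more true elephants are missed (lower Recall).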
Conclusions
• Identifying elephant flows for traffic engineering solutions
can improve the network performance.
• The properties of elephant and mice flows have been
obtained studying real traffic data.
• The long tail behaviour and high packet transmission
rate shown by the elephants have been used in the
elephant detection method explained.
• This scalable and low computational cost method uses
high sampling rate for early detection of elephant flows.
• We have shown in the results that it is a valid method
and its parameters may be adjusted for a tradeoff
between Precision and Recall in identifying the elephant
flows.
THANK YOU!