Delay correlation

UNSW
School of Electrical Engineering and Telecommunications
Opportunistic Flow-Level Latency
Estimation
Using Consistent NetFlow
Group 4: Garnsey, Dennis
Kang, Kang
Liu, Weiming
Xu, Yang
Lin, Shijie
Chen, Zhouyuan
Summary
This paper presents a study in the use of time-stamps
in NetFlow to estimate network latency and discusses
ways to retrofit latency measurements to existing
networks using NetFlow.
Some of the techniques covered include
• Hash-based sampling to provide consistent
NetFlow
• Opportunistic Latency Estimation – using smaller
flows to estimate the average latency and standard
deviation of longer flows
• NetFlow is used for Fault, Performance and
Security Management
What is latency and why is it important?
• Delay in data propagation introduced by links
and network devices
• Caused by
– The speed of light
– Switching and processing
– Queuing and shaping
• Typical latencies
– Within Sydney: 10 milliseconds
– Within Australia: 30-100 milliseconds
– Australia to the US: 200 milliseconds
From http://www.akamai.com/html/technology/dataviz2.html
In 2009, Google announced that the next major update to the PageRank algorithm (search result indexing) would start taking page load (response) time into account.
What is a NetFlow record and why is it useful?
• NetFlow was originally a routing technology
– Routers swap the destination MAC address on packets and forward them to the egress port
– NetFlow cached the destination MAC and egress port to speed up routing
– Superseded by other routing technologies (hardware rather than CPU based)
– Still needed when the routing decision requires the CPU (e.g. ACLs)
• NetFlow records are still in the router
• Exporting them to a collector provides valuable information about network traffic patterns
• Contains network and application info – otherwise need RMON
NetFlow V5 flow record fields (in record order)
Source IP address
Destination IP address
IP address of next hop router
SNMP index of input interface
SNMP index of output interface
Packets in the flow
Bytes in the packets of the flow
SysUptime at start of flow
SysUptime when the last packet of the flow was received
Source Port
Destination Port
Unused (zero)
TCP flags
IP protocol type
ToS
Source AS
Destination AS
Src. Mask
Dst. Mask
Unused (zero)
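The field list above maps onto a fixed 48-byte record in network byte order. A minimal sketch of decoding one record follows; the names `V5_RECORD` and `parse_v5_record` and the dictionary keys are illustrative assumptions, not part of any NetFlow library.

```python
import ipaddress
import struct

# One NetFlow v5 flow record: 48 bytes, big-endian (network byte order).
# Layout follows the field list above; the two pad fields are the
# "Unused (zero)" entries.
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5_record(data: bytes) -> dict:
    """Decode a single 48-byte v5 flow record into a dict (hypothetical helper)."""
    (src, dst, nexthop, in_if, out_if, packets, octets, first, last,
     sport, dport, _pad1, tcp_flags, proto, tos,
     src_as, dst_as, src_mask, dst_mask, _pad2) = V5_RECORD.unpack(data[:48])
    return {
        "src_ip": str(ipaddress.IPv4Address(src)),
        "dst_ip": str(ipaddress.IPv4Address(dst)),
        "next_hop": str(ipaddress.IPv4Address(nexthop)),
        "input_if": in_if,
        "output_if": out_if,
        "packets": packets,
        "bytes": octets,
        "first_ms": first,            # SysUptime at start of flow
        "last_ms": last,              # SysUptime at last packet of flow
        "duration_ms": last - first,  # ignores 32-bit counter wrap
        "src_port": sport,
        "dst_port": dport,
        "tcp_flags": tcp_flags,
        "protocol": proto,
        "tos": tos,
        "src_as": src_as,
        "dst_as": dst_as,
        "src_mask": src_mask,
        "dst_mask": dst_mask,
    }
```

The two SysUptime fields are the timestamps that the latency estimation in this talk relies on.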
What is consistent NetFlow?
• Problem
– How do we correlate a NetFlow record from one
router with a NetFlow record from another router?
• Issues
– Time synchronisation of routers may not be accurate
so time stamps don't match
– Packet loss
– Cache expiry due to load
– Different sampling across routers - may be random
• Solution
– Hash-based sampling
Consistent NetFlow
• Sample packets at every link
– Pseudo random sampling (e.g., 1-out-of-100)
– Compute a hash over the invariant fields (same on each hop) of the packet
– Packet is selected for reporting if the hash falls within a given range
– All routers use the same hash, input fields and selection range
» Result is consistent flow selection
• Details of consistent sampling
– x: subset of invariant bits in the packet
– Hash function: h(x) = x mod A
– Sample if h(x) < r, where r/A is a thinning factor
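The rule h(x) = x mod A with selection when h(x) < r can be sketched in a few lines. The values of `A` and `R` below and the function name `is_sampled` are assumptions chosen for illustration; real deployments would pick the invariant bytes and hash parameters per their own configuration.

```python
A = 1009  # modulus (hypothetical value)
R = 10    # selection threshold; R / A is the thinning factor (about 1%)


def is_sampled(invariant: bytes, modulus: int = A, threshold: int = R) -> bool:
    """Return True if the packet is selected.

    `invariant` holds the packet bytes that do not change from hop to hop
    (which bytes to use is an assumption left to the caller).
    """
    x = int.from_bytes(invariant, "big")
    return x % modulus < threshold


# Two routers applying the same rule to the same packets pick the same subset.
packets = [i.to_bytes(4, "big") for i in range(100_000)]
router1 = [p for p in packets if is_sampled(p)]
router2 = [p for p in packets if is_sampled(p)]
```

Because the decision depends only on packet content, no coordination between routers is needed beyond agreeing on the hash, the input fields and the range.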
People and Standards
• Nick Duffield (AT&T Labs)
– 2000-2002 – Trajectory Sampling (hash based sampling)
– 2009 Co-author RFC 5474 PSAMP
– 2012 Co-author Opportunistic Flow-Level Latency Estimation
Using Consistent Netflow
• PSAMP/IPFIX (some overlap, but complementary)
– IPFIX is the standardisation track for NetFlow export
• Describes how IP flow information is to be formatted and transferred from an exporter to a collector
– PSAMP is the standardisation track for packet sampling
• Allows network elements to select subsets of packets by statistical and other methods, and to export a stream of reports on the selected packets to a Collector
“PSAMP selection operations include random selection, deterministic
selection (Filtering), and deterministic approximations to random selection
(Hash-based Selection).”
- RFC 5474
Opportunistic latency estimation
• NetFlow records have timestamps – start of flow
and end of flow
• Can we estimate average latency and standard
deviation from flow time-stamps?
• Opportunistic – measure latency of shorter flows
which occur during the same time frame as a
longer flow, and interpolate the packet delay
within the longer flow
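One simplified reading of the opportunistic idea is sketched below: delay samples obtained from other flows are pooled if they fall inside the lifetime of the longer flow, and their mean and standard deviation are reported. This is an illustration under stated assumptions, not the estimator defined in the paper; the function name and the `(time, delay)` sample format are hypothetical.

```python
from statistics import mean, pstdev


def estimate_flow_latency(flow_start, flow_end, samples):
    """Estimate (mean, std) of latency for a flow active on [flow_start, flow_end].

    `samples` is a list of (time, delay) pairs measured on other flows,
    for example from their start-of-flow and end-of-flow timestamps seen
    at two routers. Returns None if no sample overlaps the flow.
    """
    delays = [d for t, d in samples if flow_start <= t <= flow_end]
    if not delays:
        return None
    return mean(delays), pstdev(delays)


# Delay samples (seconds) taken from short flows; the last one falls
# outside the long flow's lifetime and is ignored.
samples = [(1.0, 0.010), (2.0, 0.020), (3.0, 0.030), (9.0, 0.500)]
result = estimate_flow_latency(0.5, 3.5, samples)
```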
Basic knowledge
• Prerequisite:
There are two basic assumptions this approach relies on:
1. Time Synchronization: the fundamental requirement for enabling accurate one-way delay measurements.
2. Packet Forwarding Order: the stream of packets follows a serial order (FIFO).
• Flow correlation:
Two approaches to associate flow records:
Mapping Packet Label to a Timestamp
Timing Checks to Eliminate Inconsistencies
• Delay correlation:
Central premise of delay correlation:
When two packets traverse a link closely separated in time, the queuing delays they experience are positively correlated.
Latency Estimation
Variance and Its Estimation
Interpolation of Packet Delays
Formula annotations: the delay we estimate; the closest delay in the past; the delay difference of two known packets; the time difference of two known packets.
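The annotations above are consistent with straight-line interpolation between two packets whose delays are known: start from the closest delay in the past and add the delay slope (delay difference over time difference) multiplied by the elapsed time. The sketch below assumes that reading; the function and parameter names are illustrative.

```python
def interpolate_delay(t, t_prev, d_prev, t_next, d_next):
    """Linearly interpolate the delay of a packet seen at time t.

    (t_prev, d_prev): closest packet in the past with a known delay.
    (t_next, d_next): next packet with a known delay.
    """
    if t_next == t_prev:
        return d_prev  # no time difference, so nothing to interpolate
    slope = (d_next - d_prev) / (t_next - t_prev)
    return d_prev + slope * (t - t_prev)


# A packet halfway between two known packets gets the midpoint delay.
estimate = interpolate_delay(1.5, 1.0, 0.010, 2.0, 0.020)
```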
EVALUATION
• Estimator Accuracy
– Comparison to Active Probes
– Accuracy With Respect to Flow Duration
– Comparison to Interpolation and Trajectory Sampling
Sampling and Loss Rate Variation
Two variables control the effective number of sampled packets: the packet sampling rate and the loss rate.
1. Impact of Sampling Rate
Relative errors decrease as the packet sampling rate increases.
Sampling and Loss Rate Variation
2. Impact of Packet Loss Rate
The relative error falls as the packet loss rate increases when using the Multiflow estimator with the WISC traces.
The three traces WISC-R1, -R2 and -R3 have small (0.01%), medium (0.12%), and high (4.59%) packet loss rates, respectively.
Accuracy of Standard Deviation Estimates
An increase in the packet loss rate reduces the relative error of the standard deviation of flow-level latency. However, the Endpoint estimator cannot be trusted as much as Multiflow because of its poor accuracy.
The WISC traces show the same trend, but the improvement in accuracy of the standard deviation estimates across traces is smaller than the improvement in mean estimation accuracy.
Conclusion
• Problem being solved
– NetFlow time stamps should be usable as data for measuring network latency
• Proposal
– Use hash-based sampling for consistent NetFlow
– Opportunistic Latency Estimation using time stamps from shorter flows to estimate the average and standard deviation of latency of longer flows
• Experimental evaluation
– Uses real and synthetic data and real and theoretical delay modeling
– Check accuracy of hash-based NetFlow
– Check accuracy of estimators – endpoint, multiflow and hybrid
– Compare with real data and alternative estimators (trajectory sampling)
Results
• Estimator accuracy
– The multiflow estimator was more accurate than either the endpoint
estimator or active probes for packet sampling.
– For flow sampling the endpoint estimator is more accurate.
– Over a range of flow sizes, endpoint performs better up to about size
3-4 and then accuracy decreases.
– The flow-sampled data above contained a large number of small flows, which is why the endpoint estimator was more accurate.
Criticism
• NF records are useful for network management, but problems that are not
addressed here are
– NF is resource intensive
– Resources used by NF could be needed for data traffic
• Some network management systems (Riverbed, Tenable) correlate NF records, though not in the deterministic manner proposed here.
• Given that this approach still relies on sampling, NF will not replace Wireshark and network sniffing as the network management tool for packet capture, and SNMP will still be used for lower-level utilization reporting. NF sits midway between the two.
• PSAMP is not commercially available yet (to my knowledge), so this
approach is still evolving
• The utility of per-flow average delays and standard deviations is not clear
and may not be known until it becomes commercially available (if ever).
Supplementary slides
Introduction
gathering information from the network
Network Monitoring
Applications: SNMP tcp conn entries on end hosts
IP Layer: NetFlow tables in CPU
Switch Layer: SNMP packet counters on interfaces

SNMP Interface Counters
Key Fields     Packet 2
Interface      E1
Packets In     36787
Packets Out    47856
Bytes In       786302
Bytes Out      789309

SNMP tcpConnEntry
tcpConnState   tcpConnLocAddr   tcpConnLocPort   tcpConnRmtAddr   tcpConnRmtPort
estab          167.8.15.92      227              176.15.53.216    228
estab          167.8.15.92      235              176.15.53.216    240
closing        167.8.15.92      236              178.67.124.15    196
estab          167.8.15.92      244              181.33.16.4      227

NetFlow Records
Key Fields          Packet 2
Source IP           167.8.15.92
Destination IP      176.15.53.216
Source port         227
Destination port    228
Layer 3 Protocol    TCP - 6
TOS Byte            0
Input Interface     Ethernet 0

Source IP      Dest IP          Dest I/F   Protocol   TOS   …   Pkts
167.8.15.92    176.15.53.216    E1         6          0     …   11000
167.8.15.92    176.15.53.216    E1         6          0     …   11000
NetFlow
• NetFlow n-tuple may include
– Flow usage counters
– Start time and end time
– Interfaces used
– QoS flags
– IP addresses
– Application ports
– Routing information
NetFlow V5 Header
Bit positions across each 32-bit row: 0, 8, 16, 24, 31
Bytes 0-3: NetFlow Version | Flow Record Count (1-30)
Bytes 4-7: SysUptime (time since the export device booted)
Bytes 8-11: Current count of seconds since 0000 UTC 1970
Bytes 12-15: Residual nanoseconds since 0000 UTC 1970
Bytes 16-19: Sequence counter of total flows seen
Bytes 20-23: engine_type | engine_id | Unused (zero)
Format of NetFlow V.5 Header
http://www.plixer.com/support/netflow_v5.html
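The 24-byte header described above can be decoded with a single fixed-layout unpack. This is a minimal sketch; `V5_HEADER`, `parse_v5_header` and the dictionary keys are assumed names. The last two bytes are treated as unused, as on the slide, although Cisco's documentation labels them as a sampling interval.

```python
import struct

# NetFlow v5 export header: 24 bytes, big-endian.
# version, count, SysUptime (ms), unix seconds, residual nanoseconds,
# flow sequence, engine_type, engine_id, final 16 bits (unused on the slide).
V5_HEADER = struct.Struct("!HHIIIIBBH")


def parse_v5_header(data: bytes) -> dict:
    """Decode a v5 export header (hypothetical helper)."""
    (version, count, sys_uptime_ms, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, _unused) = V5_HEADER.unpack(data[:24])
    if version != 5:
        raise ValueError(f"not a NetFlow v5 packet (version={version})")
    return {
        "version": version,
        "count": count,                  # flow records in this packet (1-30)
        "sys_uptime_ms": sys_uptime_ms,
        "unix_secs": unix_secs,
        "unix_nsecs": unix_nsecs,
        "flow_sequence": flow_sequence,
        "engine_type": engine_type,
        "engine_id": engine_id,
    }
```

The `count` field tells the collector how many 48-byte flow records follow the header in the same export packet.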
NetFlow V5 Flow Record
Bit positions across each 32-bit row: 0, 8, 16, 24, 31
Bytes 0-3: Source IP address
Bytes 4-7: Destination IP address
Bytes 8-11: IP address of next hop router
Bytes 12-15: SNMP index of input interface | SNMP index of output interface
Bytes 16-19: Packets in the flow
Bytes 20-23: Bytes in the packets of the flow
Bytes 24-27: SysUptime at start of flow
Bytes 28-31: SysUptime when the last packet of the flow was received
Bytes 32-35: Source Port | Destination Port
Bytes 36-39: Unused (zero) | TCP flags | IP protocol type | ToS
Bytes 40-43: Source AS | Destination AS
Bytes 44-47: Src. Mask | Dst. Mask | Unused (zero)
Format of NetFlow V.5 Flow Record
See http://www.plixer.com/support/netflow_v5.html
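The flow timestamps in a v5 record are SysUptime values (milliseconds since the exporter booted), so comparing records from two routers requires converting them to wall-clock time using the export header. A minimal sketch, assuming the header fields are already decoded; the function name and argument names are illustrative.

```python
UPTIME_MODULUS = 2 ** 32  # SysUptime is a 32-bit millisecond counter


def flow_time_abs(unix_secs, unix_nsecs, sys_uptime_ms, flow_uptime_ms):
    """Convert a flow's SysUptime timestamp to seconds since 0000 UTC 1970.

    unix_secs, unix_nsecs, sys_uptime_ms come from the export header and
    describe the same instant; flow_uptime_ms is the record's start-of-flow
    or last-packet SysUptime.
    """
    export_time = unix_secs + unix_nsecs / 1e9
    # Age of the event relative to export time, tolerant of counter wrap.
    age_ms = (sys_uptime_ms - flow_uptime_ms) % UPTIME_MODULUS
    return export_time - age_ms / 1000.0


# A flow that started 10 s before the export packet was generated.
start = flow_time_abs(1_000_000, 0, 100_000, 90_000)
```

The result is only as good as the exporter's clock, which is why time synchronisation between routers is listed as a prerequisite.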
NetFlow V9 Template
• The distinguishing feature of the NetFlow Version 9 format is that it is template based. Templates provide an extensible design to the record format to allow future enhancements to NetFlow services without requiring changes to the basic flow-record format.

Export packet layout:
Packet Header | Template FlowSet | Data FlowSet | Data FlowSet | ... | Template FlowSet | Data FlowSet | ...

Template FlowSet layout (each entry occupies bits 0-15):
flowset_id = 0
length
template_id
field_count
field_1_type
field_1_length
field_2_type
field_2_length
field_3_type
field_3_length
...
field_N_type
field_N_length
template_id
field_count
field_1_type
field_1_length
...
field_N_type
field_N_length
Format of NetFlow V.9 Template
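Since every entry in the template layout above is a 16-bit value, a Template FlowSet can be walked in pairs. The sketch below assumes the FlowSet bytes have already been separated from the export packet; `parse_template_flowset` is a hypothetical helper name.

```python
import struct


def parse_template_flowset(data: bytes) -> dict:
    """Return {template_id: [(field_type, field_length), ...]}.

    `data` starts at the flowset_id of a v9 Template FlowSet.
    """
    flowset_id, length = struct.unpack_from("!HH", data, 0)
    if flowset_id != 0:
        raise ValueError("not a Template FlowSet (flowset_id must be 0)")
    templates = {}
    offset = 4
    # Each template needs at least template_id and field_count (4 bytes).
    while offset + 4 <= length:
        template_id, field_count = struct.unpack_from("!HH", data, offset)
        offset += 4
        fields = []
        for _ in range(field_count):
            field_type, field_length = struct.unpack_from("!HH", data, offset)
            fields.append((field_type, field_length))
            offset += 4
        templates[template_id] = fields
    return templates


# One template (id 256) with IN_BYTES, IN_PKTS, LAST_SWITCHED, FIRST_SWITCHED.
fields = [(1, 4), (2, 4), (21, 4), (22, 4)]
body = struct.pack("!HH", 256, len(fields))
body += b"".join(struct.pack("!HH", t, n) for t, n in fields)
flowset = struct.pack("!HH", 0, 4 + len(body)) + body
templates = parse_template_flowset(flowset)
```

A collector keeps the decoded templates and uses them to interpret later Data FlowSets that carry the matching template ID.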
NetFlow V9 Header
Bit positions across each 32-bit row: 0, 8, 16, 24, 31
Bytes 0-3: NetFlow Version | Flow Record Count (1-30)
Bytes 4-7: SysUptime (time since the export device booted)
Bytes 8-11: Current count of seconds since 0000 UTC 1970
Bytes 12-15: Sequence counter of all export packets sent by the export device. Note: This is a change from the Version 5 and Version 8 headers, where this number represented “total flows.”
Bytes 16-19: A 32-bit value that is used to guarantee uniqueness for all flows exported from a particular device. (The Source ID field is the equivalent of the engine type and engine ID fields found in the NetFlow Version 5 and Version 8 headers).
Format of NetFlow V.9 Header
From http://www.plixer.com/support/netflow_v9.html
NetFlow V9 Flow Record
• 87 fields possible - too many to fit on slide
Field Type | Value | Length (bytes) | Description
IN_BYTES | 1 | N (default is 4) | Incoming counter with length N x 8 bits for number of bytes associated with an IP Flow.
IN_PKTS | 2 | N (default is 4) | Incoming counter with length N x 8 bits for the number of packets associated with an IP Flow
FLOWS | 3 | N | Number of flows that were aggregated; default for N is 4
... | ... | ... | ...
LAST_SWITCHED | 21 | 4 | System uptime at which the last packet of this flow was switched
FIRST_SWITCHED | 22 | 4 | System uptime at which the first packet of this flow was switched
... | ... | ... | ...
Partial format of NetFlow V.9 Flow Record
From http://www.plixer.com/support/netflow_v9.html
Router 1 Record / Router 2 Record (identical field lists)
source IP address
destination IP address
source TCP/UDP application port
destination TCP/UDP application port
next hop router IP address
input physical interface index
output physical interface index
packet count for this flow
byte count for this flow
start of flow timestamp
end of flow timestamp
IP Protocol (for example, TCP=6; UDP=17)
Type of Service (ToS) byte
TCP Flags (cumulative OR of TCP flags)
source AS number
destination AS number
source subnet mask
destination subnet mask
flags (indicates, among other things, which flows are invalid)
shortcut router IP address
Diagram labels between the two records: Correlate, Synchronise, Consistent NetFlow
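Correlating the two records amounts to matching them on fields that are the same at both routers and then comparing their timestamps. The sketch below assumes the 5-tuple as the matching key and timestamps already converted to a common clock; `FlowRecord`, `flow_key` and `correlate` are illustrative names, and the negative-delay filter is a simple stand-in for the timing checks mentioned earlier.

```python
from typing import NamedTuple


class FlowRecord(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    start: float  # start of flow timestamp, seconds on a common clock
    end: float    # end of flow timestamp, seconds on a common clock


def flow_key(record: FlowRecord) -> tuple:
    """Fields expected to be identical at both routers (assumed: the 5-tuple)."""
    return record[:5]


def correlate(router1, router2):
    """Return {flow_key: (first_packet_delay, last_packet_delay)}."""
    downstream = {flow_key(r): r for r in router2}
    delays = {}
    for r1 in router1:
        r2 = downstream.get(flow_key(r1))
        if r2 is None:
            continue  # flow not reported by the second router
        first_delay = r2.start - r1.start
        last_delay = r2.end - r1.end
        if first_delay < 0 or last_delay < 0:
            continue  # inconsistent timestamps, discard the pair
        delays[flow_key(r1)] = (first_delay, last_delay)
    return delays
```

With consistent (hash-based) sampling both routers report the same first and last sampled packets of a flow, which is what makes the two timestamp differences meaningful as delay samples.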