Cristian Estan, George Varghese, Mike Fisk
Computer Science and Engineering Department,
University of California, San Diego
Bitmap algorithms for flow counting – Internet Measurement Conference, October 2003
Why count flows?
• Detect port/IP scans
• Identify DoS attacks
• Estimate spreading rate of a worm
• Packet scheduling
Dave Plonka’s FlowScan
Existing flow counting solutions
[Diagram labels: fast link, router (memory), NetFlow data, network, server (analysis), traffic reports, Network Operations Center. Also labeled: memory size & bandwidth, network bandwidth.]
Motivating question
• Can we count flows at line speeds at the router?
– Wrong solution: counters
– Naïve solution: use hash tables (like NetFlow)
– Our approach: use bitmaps
Bitmap counting algorithms
• A family of algorithms that can be used as building blocks in various systems
• Algorithms can be adapted to the application
• Low memory use and low per-packet processing cost
• Generalize flows to distinct header patterns
– Count flows or source addresses to detect an attack
– Count destination address+port pairs to detect a scan
Talk structure
• Per packet processing for bitmap algorithms
• Computing flow count estimates from bitmaps
• Variance analysis of estimates
• Derived algorithms
• Related work
• Measurements
• Conclusions
Bitmap counting – direct bitmap
Set bits in the bitmap using a hash of the flow ID of each incoming packet
HASH( green )=10001001
Different flows have different hash values
HASH( blue )=00100100
Packets from the same flow always hash to the same bit
HASH( green )=10001001
Collisions are OK; the estimates compensate for them
HASH( violet )=10010101
HASH( orange )=11110011
HASH( pink )=11100000
As the bitmap fills up, estimates get inaccurate
HASH( yellow )=01100011
Solution: use more bits
HASH( green )=10001001
Problem: memory scales with the number of flows
HASH( blue )=00100100
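The direct bitmap steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name, the SHA-256 based hash, and the one-byte-per-bit storage are assumptions made for readability. The estimate uses the standard linear-counting form b ln(b / z), where b is the bitmap size and z the number of zero bits.

```python
import hashlib
import math


class DirectBitmap:
    """Direct bitmap sketch: each flow ID hashes to one bit of the bitmap."""

    def __init__(self, num_bits: int) -> None:
        self.num_bits = num_bits
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity only

    def _position(self, flow_id: str) -> int:
        # Hypothetical hash choice: first 8 bytes of SHA-256, reduced mod b.
        digest = hashlib.sha256(flow_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, flow_id: str) -> None:
        # Per-packet work: one hash and one bit write.
        self.bits[self._position(flow_id)] = 1

    def estimate(self) -> float:
        zeros = self.num_bits - sum(self.bits)
        if zeros == 0:
            return float("inf")  # bitmap saturated: no usable estimate
        # Linear counting: compensates for flows that collided on a bit.
        return self.num_bits * math.log(self.num_bits / zeros)
```

Packets of the same flow set the same bit, so repeated packets do not change the estimate. Once almost every bit is set, the logarithm becomes very sensitive to single bits, which is the inaccuracy the slide refers to.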
Bitmap counting – virtual bitmap
Solution:
a) store only a portion of the bitmap
b) multiply the estimate by a scaling factor
HASH( pink )=11100000
Problem: estimate inaccurate when few flows active
HASH( yellow )=01100011
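A hedged sketch of the virtual bitmap idea: hash each flow into a large conceptual bitmap, keep only the first part of it in memory, and scale the estimate up by the ratio of the two sizes. The names `physical_bits` and `virtual_bits` and the hash function are assumptions of this example, not terms from the talk.

```python
import hashlib
import math


def hash64(key: str) -> int:
    # Hypothetical hash choice: first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class VirtualBitmap:
    """Stores only `physical_bits` out of a conceptual `virtual_bits` bitmap."""

    def __init__(self, physical_bits: int, virtual_bits: int) -> None:
        if physical_bits > virtual_bits:
            raise ValueError("the stored portion cannot exceed the virtual bitmap")
        self.physical_bits = physical_bits
        self.virtual_bits = virtual_bits
        self.bits = bytearray(physical_bits)

    def add(self, flow_id: str) -> None:
        pos = hash64(flow_id) % self.virtual_bits
        if pos < self.physical_bits:  # flows outside the stored portion are ignored
            self.bits[pos] = 1

    def estimate(self) -> float:
        zeros = self.physical_bits - sum(self.bits)
        if zeros == 0:
            return float("inf")  # stored portion saturated
        sampled = self.physical_bits * math.log(self.physical_bits / zeros)
        # Scaling factor: inverse of the fraction of the hash space stored.
        return sampled * self.virtual_bits / self.physical_bits
```

With only a handful of active flows, few or none of them land in the stored portion, so the scaled estimate is coarse. That is the weakness named on the slide.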
Bitmap counting – multiple bitmaps
Solution: use many bitmaps, each accurate for a different range
HASH( pink )=11100000
HASH( yellow )=01100011
Use this bitmap to estimate number of flows
Bitmap counting – multiresolution bitmap
[Figure: bitmaps combined with OR]
Problem: must update up to three bitmaps per packet
Solution: combine bitmaps into one
HASH( pink )=11100000
HASH( yellow )=01100011
Basic estimates
Direct bitmap
Virtual bitmap
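For reference, the linear-counting estimators usually associated with these two structures can be written as below. The symbols are introduced here as assumptions: b is the number of bits stored, z the number of bits still zero at the end of the measurement interval, and α the fraction of the hash space covered by the virtual bitmap.

```latex
\hat{n}_{\text{direct}} = b \,\ln\!\left(\frac{b}{z}\right),
\qquad
\hat{n}_{\text{virtual}} = \frac{1}{\alpha}\; b \,\ln\!\left(\frac{b}{z}\right)
```

The virtual bitmap estimate is the direct bitmap estimate for the stored portion, multiplied by the scaling factor 1/α.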
Multiresolution bitmap estimate
Find most accurate component
Estimate number of flows hashing to it
Apply scaling factor
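The three steps above could be sketched as follows. Several choices here are assumptions made for illustration and are not stated in the talk: each component covers half the hash space of the previous one, all components have the same size, "most accurate" means the first component whose fill is at most a hypothetical `max_fill` threshold, and the flows counted are those hashing to that component or to any finer one.

```python
import hashlib
import math


def hash64(key: str) -> int:
    # Hypothetical hash choice: first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class MultiresolutionBitmap:
    def __init__(self, num_components: int, bits_per_component: int,
                 max_fill: float = 0.7) -> None:
        self.c = num_components
        self.b = bits_per_component
        self.max_fill = max_fill  # assumed threshold for "not too full"
        self.components = [bytearray(self.b) for _ in range(self.c)]

    def add(self, flow_id: str) -> None:
        h = hash64(flow_id)
        # Low hash bits pick the component: 1/2 of flows go to component 0,
        # 1/4 to component 1, and so on; the last one takes the remainder.
        comp = 0
        while comp < self.c - 1 and (h >> comp) & 1:
            comp += 1
        # Higher hash bits pick the bit, so only one bit is written per packet.
        self.components[comp][(h >> self.c) % self.b] = 1

    def estimate(self) -> float:
        # Step 1: find the coarsest component that is not too full.
        base = self.c - 1
        for i, comp in enumerate(self.components):
            if sum(comp) <= self.max_fill * self.b:
                base = i
                break
        # Step 2: estimate the flows hashing to it and to the finer components.
        sampled = 0.0
        for comp in self.components[base:]:
            zeros = max(self.b - sum(comp), 1)  # clamp to keep the log finite
            sampled += self.b * math.log(self.b / zeros)
        # Step 3: components base..last cover 2**-base of the hash space.
        return sampled * 2 ** base
```

Because the component is chosen by the hash itself, the same structure stays usable from a few hundred flows up to roughly `bits_per_component * 2 ** num_components` flows.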
Relative error in estimates
Direct bitmap
Virtual bitmap
Multiresolution bitmap
Error of virtual bitmap
[Plot; x-axis: flow density (flows/bit)]
Memory requirements
Direct bitmap: < N / ln(Nε² + 1)
Virtual bitmap: 1.5441 / ε²
Multiresolution bitmap: 0.9186 ln(Nε²) / ε² + const.
100 million flows, error 1%
Hash table*              1.21 Gbytes
Direct bitmap            1.29 Mbytes
Virtual bitmap*          1.88 Kbytes
Multiresolution bitmap   10.33 Kbytes
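The three bitmap rows can be reproduced from the memory formulas with a short calculation. The sketch assumes the formulas give sizes in bits, that the multiresolution bitmap's additive constant is negligible, and that the table uses binary units (1 Kbyte = 1024 bytes). None of this is stated on the slides, but under these assumptions the numbers match the table. The function names are illustrative.

```python
import math


def direct_bitmap_bits(n: float, eps: float) -> float:
    """Upper bound on direct bitmap size: N / ln(N eps^2 + 1)."""
    return n / math.log(n * eps ** 2 + 1)


def virtual_bitmap_bits(eps: float) -> float:
    """Virtual bitmap size: 1.5441 / eps^2, independent of N."""
    return 1.5441 / eps ** 2


def multires_bitmap_bits(n: float, eps: float) -> float:
    """Multiresolution bitmap size, with the additive constant omitted."""
    return 0.9186 * math.log(n * eps ** 2) / eps ** 2


def bits_to_kbytes(bits: float) -> float:
    return bits / 8 / 1024


n, eps = 100_000_000, 0.01
direct_mbytes = bits_to_kbytes(direct_bitmap_bits(n, eps)) / 1024
virtual_kbytes = bits_to_kbytes(virtual_bitmap_bits(eps))
multires_kbytes = bits_to_kbytes(multires_bitmap_bits(n, eps))

print(f"Direct bitmap:          {direct_mbytes:.2f} Mbytes")    # 1.29
print(f"Virtual bitmap:         {virtual_kbytes:.2f} Kbytes")   # 1.88
print(f"Multiresolution bitmap: {multires_kbytes:.2f} Kbytes")  # 10.33
```

Only the direct bitmap grows roughly linearly with N. The virtual bitmap does not depend on N at all, and the multiresolution bitmap grows with ln N.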
Triggered bitmap
• Need multiple instances of counting algorithm (e.g. port scan detection)
• Many instances count few flows
• Triggered bitmap
– Allocate small direct bitmap to new sources
– If number of bits set exceeds trigger value, allocate large multiresolution bitmap
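A rough sketch of the triggered bitmap, under stated assumptions. The sizes, the trigger value, and the way the two estimates are combined are choices of this example, not details from the talk. Here, once the large bitmap is allocated, it only counts flows that hash to bits still unset in the small bitmap, and its estimate is scaled up accordingly so that flows seen before the trigger are not counted twice.

```python
import hashlib
import math


def hash64(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def linear_count(num_bits: int, set_bits: int) -> float:
    zeros = max(num_bits - set_bits, 1)  # clamp to keep the log finite
    return num_bits * math.log(num_bits / zeros)


class MultiresBitmap:
    """Compact multiresolution bitmap (halving components, equal sizes)."""

    def __init__(self, components: int = 10, bits: int = 256,
                 max_fill: float = 0.7) -> None:
        self.c, self.b, self.max_fill = components, bits, max_fill
        self.comps = [bytearray(bits) for _ in range(components)]

    def add(self, flow_id: str) -> None:
        h = hash64("large:" + flow_id)  # salted: independent of the small bitmap
        comp = 0
        while comp < self.c - 1 and (h >> comp) & 1:
            comp += 1
        self.comps[comp][(h >> self.c) % self.b] = 1

    def estimate(self) -> float:
        base = self.c - 1
        for i, comp in enumerate(self.comps):
            if sum(comp) <= self.max_fill * self.b:
                base = i
                break
        sampled = sum(linear_count(self.b, sum(c)) for c in self.comps[base:])
        return sampled * 2 ** base


class TriggeredBitmap:
    def __init__(self, small_bits: int = 16, trigger: int = 8) -> None:
        self.small_bits, self.trigger = small_bits, trigger
        self.small = {}  # source -> set of bit positions set in its small bitmap
        self.large = {}  # source -> MultiresBitmap, allocated only on demand

    def add(self, source: str, flow_id: str) -> None:
        bits = self.small.setdefault(source, set())
        pos = hash64("small:" + flow_id) % self.small_bits
        if source in self.large:
            # Small bitmap is frozen; only flows on unset bits reach the large one.
            if pos not in bits:
                self.large[source].add(flow_id)
            return
        bits.add(pos)
        if len(bits) > self.trigger:
            self.large[source] = MultiresBitmap()

    def estimate(self, source: str) -> float:
        bits = self.small.get(source, set())
        total = linear_count(self.small_bits, len(bits))
        if source in self.large:
            unset = self.small_bits - len(bits)
            # The large bitmap saw only the unset/small_bits share of new flows.
            total += self.large[source].estimate() * self.small_bits / unset
        return total
```

Sources that contact only a few destinations never leave the small bitmap, which is where the memory saving comes from when most instances count few flows.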
Related work
• Flajolet, Martin (1985): probabilistic counting
– Memory use similar to multiresolution bitmap
• Whang et al. (1990) introduce direct bitmap
• You, Chang (1996) use virtual bitmap
• Chaudhuri, Motwani, Narasayya (1998)
– Counting flows without bias impossible from sampled data
• Duffield, Lund, Thorup (2002)
– Accurate solutions based on counting TCP SYN flags
Multiresolution bitmap vs. probabilistic counting
[Plot; x-axis: number of flows (log scale)]
Scan detection memory usage
Interval length   Snort (naïve)   Probabilistic counting   Triggered bitmap
12 seconds        1.94 M          2.42 M                   0.37 M
600 seconds       49.60 M         22.34 M                  5.59 M
A family of counting algorithms
Setting               Algorithm                Applications
General counting      Multiresolution bitmap   Track infections
Narrow range          Virtual bitmap           Triggers (e.g. DoS)
Small counts common   Triggered bitmap         Port scans
Stationarity          Adaptive bitmap          Measurement
Add and delete        Increment-decrement      Scheduling
Bitmap counting algorithms
• A family of algorithms that can be used as building blocks in various systems
• Algorithms can be adapted to the application
• Low memory use and low per-packet processing cost
– With 2 Kbytes, error around 1%
The end
Bitmap algorithms will be available at: http://ial.ucsd.edu/bitmaps/
Any questions?
Acknowledgements: Vern Paxson, David Moore, Philippe Flajolet, Marianne Durand, Alex Snoeren, K Claffy, Stefan Savage, Florin Baboescu, NIST, NSF
Adaptive bitmap
• A virtual bitmap measures the number of flows accurately if the range is known in advance
• Often the number of flows does not change rapidly
• Measurements are repeated
• Can use the previous measurement to tune the virtual bitmap
• Combine a large virtual bitmap with a small multiresolution bitmap used for tuning
Adaptive bitmap accuracy
[Plot; x-axis: number of flows (log scale)]
With 2 kilobytes of memory
         Adaptive bitmap (min / avg / max)   Probabilistic counting (min / avg / max)
Trace1   -4.4% / 1.1% / 4.7%                 -9.5% / 2.8% / 13.3%
Trace2   -1.9% / 0.7% / 2.0%                 -6.9% / 2.8% / 7.6%
Trace3   -1.8% / 0.6% / 1.8%                 2.4% / 10.2% / 17.7%
Increment-decrement algorithms
• Active flow defined as flow with packets in queue
• Must support additions and deletions
• Replace bits of bitmap with counters
– Increment when packet arrives
– Decrement when packet leaves
– Estimate number of flows based on zero counters
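The increment-decrement idea can be sketched as a direct bitmap whose bits are replaced by counters. The class and method names, the hash function, and the use of the linear-counting formula on the zero counters are assumptions of this example.

```python
import hashlib
import math


class CountingBitmap:
    """Counters instead of bits, so flows can leave as well as arrive."""

    def __init__(self, num_counters: int) -> None:
        self.num_counters = num_counters
        self.counters = [0] * num_counters

    def _position(self, flow_id: str) -> int:
        # Hypothetical hash choice: first 8 bytes of SHA-256, reduced mod b.
        digest = hashlib.sha256(flow_id.encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_counters

    def packet_arrives(self, flow_id: str) -> None:
        self.counters[self._position(flow_id)] += 1

    def packet_leaves(self, flow_id: str) -> None:
        pos = self._position(flow_id)
        if self.counters[pos] == 0:
            raise ValueError("packet leaves without a matching arrival")
        self.counters[pos] -= 1

    def estimate(self) -> float:
        # A counter is zero exactly when no queued packet hashes to it.
        zeros = self.counters.count(0)
        if zeros == 0:
            return float("inf")  # every counter in use: no usable estimate
        return self.num_counters * math.log(self.num_counters / zeros)
```

A counter returns to zero only when every packet that hashed to it has left the queue, so the zero counters play the same role as the zero bits of a direct bitmap.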