Network Performance Management S. Keshav C/NRG

(with Rosen Sharma, Andy Choi, Wilson
Huang, Lili Qiu, Russell Schwager, Rachit
Siamwalla, Jia Wang, and Yin Zhang)
Motivation
 Networks are increasing in breadth….
– greater density of connections
– PCs come with built-in networking
– ADSL and cable modems
– wireless networking
 as well as in depth
– variety of qualities, policies, and media
The current situation
 Loss of productivity from
– slow file access
– web site disconnection
– slow access to a web site
– no one knows exactly why!
 Greater breadth and depth
=> even more dependency on the network
=> even more problems
Is QoS enough?
 Lots of research in the area of QoS
– RSVP, differentiated services, etc. provide a good
overall user experience, one stream at a time
– Is QoS all there is to a good user experience?
 An incorrect reservation => poor service for
one stream
 A misconfigured router => complete loss of
service to one or more ports!
Aha!
 User experience is affected more by
‘mundane’ network management than by
‘exotic’ QoS research
 This motivates our entire research effort
Why networks fail
 Link or router failure
 Transient overload
 Unanticipated increase in load
 Misconfiguration
Increasingly hard to detect
Need Better Network
Management
 Current approaches
– GUI-centric
– lots of flashing lights, but no intelligence
 Can detect failures but...
– ad hoc capacity planning
– ad hoc configuration
• no way of testing other than “just try it!”
 Can’t manage network performance
Performance management
Topology discovery
Configure new hardware (simulation)
Collect statistics (monitoring)
Fix problems (AI and simulation)
Identify problems (display and simulation)
Discovery: Project Octopus
Temporary Set
Heuristic
Permanent Set
Techniques
 DNS-ls
 SNMP
 Random probe
 Traceroute
 Directed broadcast ping
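The slides name a temporary set, a heuristic, and a permanent set, but not how they interact. A minimal sketch, assuming candidates wait in the temporary set until a probe confirms them and that confirmed addresses suggest new candidates. The `discover` function and the `is_alive` and `neighbors` callables are hypothetical stand-ins for the techniques listed above, not Octopus code:

```python
from collections import deque
from typing import Callable, Iterable, Set


def discover(seeds: Iterable[str],
             is_alive: Callable[[str], bool],
             neighbors: Callable[[str], Iterable[str]]) -> Set[str]:
    """Promote addresses from a temporary set to a permanent set.

    is_alive  -- hypothetical probe (stands in for ping or an SNMP query)
    neighbors -- hypothetical expansion step (stands in for traceroute hops
                 or a DNS listing)
    """
    temporary = deque(seeds)        # candidates not yet checked
    seen = set(temporary)           # everything ever queued, to avoid loops
    permanent: Set[str] = set()     # addresses confirmed by a probe
    while temporary:
        addr = temporary.popleft()
        if not is_alive(addr):      # heuristic check failed: drop candidate
            continue
        permanent.add(addr)
        for nxt in neighbors(addr):
            if nxt not in seen:
                seen.add(nxt)
                temporary.append(nxt)
    return permanent


# Toy topology used only to exercise the loop (made-up addresses).
FAKE_LINKS = {
    "10.0.0.1": ["10.0.1.1", "10.0.2.1"],
    "10.0.1.1": ["10.0.0.1", "10.0.3.1"],
    "10.0.2.1": [],
}
FAKE_DOWN = {"10.0.3.1"}

found = discover(["10.0.0.1"],
                 lambda a: a not in FAKE_DOWN,
                 lambda a: FAKE_LINKS.get(a, []))
print(sorted(found))
```

Passing the probes in as callables keeps the loop testable without touching a real network.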
Results
 Have automatically discovered entire CS
department topology
 As well as entire Stanford topology (> 220
subnets)
 Cornell topology is being discovered as we
speak!
– info being shared with CIT
Monitoring
 A Perl script queries a router over SNMP
for a set of MIB entries.
 The MIB entries are stored in an input file.
 The values gathered from the router are
stored in a file.
 The script works on both UNIX and
WinNT.
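A rough sketch of the polling flow described above: read the MIB entries from an input file, query the router, and append the values to an output file. The original script is written in Perl; Python is used here only for illustration. The file layout, the `poll_once` name, and the `snmp_get` callable are all assumptions, and no real SNMP library is invoked:

```python
import time
from typing import Callable, Dict, List, Optional


def load_mib_entries(path: str) -> List[str]:
    """Read one MIB entry per line, skipping blank lines and # comments."""
    with open(path) as f:
        lines = [ln.strip() for ln in f]
    return [ln for ln in lines if ln and not ln.startswith("#")]


def poll_once(router: str,
              entries: List[str],
              snmp_get: Callable[[str, str], str],
              out_path: str,
              now: Optional[float] = None) -> Dict[str, str]:
    """Query every entry once and append tab-separated rows to out_path.

    snmp_get(router, entry) is a hypothetical callable that performs the
    actual SNMP GET; it is injected so the sketch runs without a network.
    """
    timestamp = int(time.time() if now is None else now)
    values = {entry: snmp_get(router, entry) for entry in entries}
    with open(out_path, "a") as out:
        for name, value in values.items():
            out.write(f"{timestamp}\t{router}\t{name}\t{value}\n")
    return values
```

A scheduler such as cron, or a loop with a sleep, would call `poll_once` at the chosen interval.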
Monitoring (contd.)
 Other Perl scripts parse the data and
convert it to other formats.
 Currently supported formats:
– HTML - The data is presented in a table format
in HTML.
– gnuplot graphs - The data can be graphed or
saved in PBM format
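As an illustration of the HTML output format, here is a small sketch that renders collected rows as an HTML table. The original converters are Perl scripts; the function name and the column layout below are assumptions made for the example:

```python
from html import escape
from typing import Iterable, Sequence


def to_html_table(headers: Sequence[object],
                  rows: Iterable[Sequence[object]]) -> str:
    """Render a header row plus data rows as a single HTML table string."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


# Hypothetical sample: one timestamped reading of one MIB entry.
print(to_html_table(["time", "ifInOctets"], [[100, 42]]))
```

Escaping every cell keeps values that happen to contain `<` or `&` from breaking the page.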
A Case Study: CSGate2
 From 2/19/98 to 2/23/98, the router CSGate2 was probed every
5 minutes, recording various statistics on the data coming into
and going out of the router.
Figure: incoming bytes at CSGate2
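Byte counts polled every 5 minutes are usually cumulative counters, so a graph of incoming traffic needs the difference between successive samples. A minimal sketch, assuming a 32-bit counter that wraps to zero (as the standard `ifInOctets` object does); the slides do not say which MIB objects were actually recorded, and a router reboot between samples is not handled:

```python
from typing import List, Tuple

COUNTER32_MODULUS = 2 ** 32  # a 32-bit SNMP counter wraps at this value


def byte_rates(samples: List[Tuple[int, int]],
               modulus: int = COUNTER32_MODULUS) -> List[Tuple[int, float]]:
    """Turn (unix_time, counter_value) samples into (time, bytes/second).

    Samples must be in increasing time order. The modulo handles at most
    one counter wrap between consecutive samples.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = (c1 - c0) % modulus
        rates.append((t1, delta / (t1 - t0)))
    return rates


# Made-up readings taken 300 seconds (5 minutes) apart.
print(byte_rates([(0, 1000), (300, 4000), (600, 4000)]))
```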
Display goals
 We want to display multiple views
 Views should be dynamic
 Should allow expansion and contraction
 Rapid creation of user interface
 Reusability of GUI components
Solution: Script Java
 Component-based system
 Reusable manageable components
 Can build large manageable applications
 Sharing over the web
 Record and playback
Architecture
 Use JavaScript/Visual Basic as the scripting
language
 Use Java to write components
 Create an adapter hierarchy for the current
AWT components
Script Java
Objects
Communication
Abstraction
Data Model
 multicast channels
 linearized data
 HTML pages
 Java structures
intelligence
 protection by
namespace
structures
 java  perl 
javascript
Advantages
 Allows us to glue components using a
scripting language, allowing rapid
prototyping and development
 New components can be easily integrated
 For large applications, a lot of the
complexity and chaos can be taken out of
scripting
Advantages(cont.)
 JavaScript can be streamed from the server,
allowing for presentations and sharing
 Dynamic HTML
– layers are windows
– these windows render HTML
Storage goals
 We need to store topology and monitoring
results somewhere
 Database: too structured and too much
overhead
 File system: not enough semantics
 Idea: treat URL as a file system link and
HTML tags as associated semantics
WebFS
 HTML tags allow arbitrary semantic
abstractions
 Manipulate these abstractions to present a
virtualized file system
 grep -headings *.html
 sed '/<annot tag=foo>/jdbc("tags.db", "foo")/'
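The `grep -headings` idea above treats HTML tags as queryable semantics. One way to picture it is a small extractor that returns only the heading text of a page. This is a sketch using Python's standard `html.parser`, not the WebFS implementation; the class and function names are made up for the example:

```python
from html.parser import HTMLParser
from typing import List, Optional


class HeadingGrep(HTMLParser):
    """Collect the text found inside <h1> ... <h6> elements."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.headings: List[str] = []
        self._current: Optional[List[str]] = None  # text of the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current is not None:
            self.headings.append("".join(self._current).strip())
            self._current = None


def grep_headings(html_text: str) -> List[str]:
    """Return the heading strings of one HTML document, in page order."""
    parser = HeadingGrep()
    parser.feed(html_text)
    parser.close()
    return parser.headings


print(grep_headings("<h1>Topology</h1><p>x</p><h2>CS <b>dept</b></h2>"))
```

Applying `grep_headings` to every `*.html` file in a directory would approximate the command shown on the slide.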
The magic bullet: simulation
 Realistic simulation where networking
subsystem interacts with other parts of
kernel
 Fast simulation for large networks ( > 1000
hosts)
 Hide the simulated network behind an abstraction:
same API as system calls
Figure: a FreeBSD machine, where user-space programs (gated, Telnetd, ping) trap into the kernel (sockets, network stack), beside a simulated machine, where the same programs exchange msgs with a kernel wrapper around the kernel core.
 Task based approach
– a trap sends a message
to kernel
– an upcall is a
message from kernel
 All components of
simulated machine
live in the same process
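The task-based approach can be pictured as a single-threaded message loop: a trap becomes a message queued for the kernel, and the kernel replies by queueing a message for the application. The sketch below is an illustrative toy, not the simulator's code; the `SimulatedMachine` class, its methods, and the toy `send` call are all assumptions:

```python
from collections import deque


class SimulatedMachine:
    """Applications and kernel share one process and one message queue."""

    def __init__(self):
        self.queue = deque()   # pending messages, handled one at a time
        self.log = []          # order in which messages were processed

    def trap(self, app, call, *args):
        """Application to kernel: queue a message instead of a real trap."""
        self.queue.append(("trap", app, call, args))

    def upcall(self, app, event, *args):
        """Kernel to application: queue a message for the application."""
        self.queue.append(("upcall", app, event, args))

    def kernel_handle(self, app, call, args):
        """Toy kernel: acknowledge every 'send' with a 'sent' upcall."""
        if call == "send":
            self.upcall(app, "sent", len(args[0]))

    def run(self):
        """Drain the queue; a single thread means no locking is needed."""
        while self.queue:
            kind, app, name, args = self.queue.popleft()
            self.log.append((kind, app, name))
            if kind == "trap":
                self.kernel_handle(app, name, args)


machine = SimulatedMachine()
machine.trap("ping", "send", b"hello")
machine.run()
print(machine.log)
```

Because every message is handled to completion before the next one is taken off the queue, no two handlers ever run concurrently.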
Simulated link
More on simulated machine
 Capture network-related system calls, with
automatic file descriptor re-mapping.
 Virtual file system root
 Single-thread kernel, therefore no need for
locking
Simulated network
Figure: a simulated machine (Telnetd, gated, ping, and kernel core, exchanging msgs).
Integrating with real network
 Use U-Net to interact
with external device
 Router has the illusion
of being in a physical
network
 Test equipment before
actual deployment
Figure: U-Net and a physical router.
Tradeoffs
 Balance between realism and speed
– Using FreeBSD as the basis for realistic simulation
– Using session-level simulation for speed
 Ease of porting applications
Open issues
 Fault identification
– Bayesian networks?
– Ensemble of experts?
– Other AI approaches?
 How to do session-level simulation?
 Configuring real systems
– IP9000