IP Performance Measurements using Surveyor

Matt Zekauskas
matt@advanced.org
Guy Almes, Sunil Kalidindi
August, 1998
ISMA 98
Outline
• Background
• Surveyor infrastructure
• Reporting and analysis
• Status
I. Background
• Internet topology is increasingly
complex
• Commonly used measurement tools
(like ping and traceroute) are
inadequate
• Result: users don’t understand the
Internet’s performance and reliability
IP Performance Metrics
• IETF IPPM effort
– Framework RFC
– One-way delay and packet loss drafts
– Others: connectivity, bulk transfer, delay variation (DV)
• Surveyor: implementation of one-way
delay and packet loss metrics
Motivation for measuring
delay
• Minimum of delay:
transmission/propagation delay
• Variation of delay: queuing delay
• Large delay makes sustaining high-bandwidth flows harder
• Erratic variation in delay makes real-time apps harder
Uses
• Problem determination
• Engineering (trends, loads)
• Feedback to advanced applications
(e.g., Tele-Immersion, CMU’s Odyssey)
• Monitor QoS
One-way versus round trip
• Paths are asymmetric
• Even when paths are symmetric,
forward and reverse paths may have
radically different performance due to
asymmetric queuing
II. The Surveyor Infrastructure
• Measurement machines at campuses
and at other interesting places along
paths (e.g., gigaPoPs, interconnects)
• GPS to synchronize clocks
• Centralized database to store
measurement data
• Web based reporting and analysis tools
Measurement Machines
• Dell 400 MHz Pentium Pro
• 128 MBytes RAM; 2 GBytes disk
• BSDI Unix
• TrueTime GPS card and antenna
• Network Interface (10/100bT, FDDI)
• Special driver for the GPS card
Measurement Technology
• Active tests of one-way delay and loss
– Measurement daemon
– Test packets time-stamped with GPS time
– Back-to-back calibration: 95% of
measurements within ± 50 µs
– Measurements centrally managed
• Truer-time daemon to watch clocks
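The one-way delay and loss computation described above can be sketched as follows. This is a minimal illustration, not Surveyor's measurement daemon: the function name `one_way_delays`, the use of sequence numbers to match packets, and the 10-second loss cutoff are all assumptions made for the example. It relies on the sender and receiver clocks being synchronized, which is the role GPS plays in the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff: a packet not seen within this time counts as lost.
# The slides do not state the value Surveyor uses.
LOSS_THRESHOLD_S = 10.0


def one_way_delays(
    sent: Dict[int, float],
    received: Dict[int, float],
    loss_threshold_s: float = LOSS_THRESHOLD_S,
) -> Tuple[List[Tuple[int, float]], List[int]]:
    """Return (delays, lost) for one direction of one path.

    sent maps sequence number -> send timestamp (seconds, sender clock).
    received maps sequence number -> receive timestamp (seconds, receiver clock).
    Both clocks are assumed to be synchronized (for example, by GPS).
    """
    delays: List[Tuple[int, float]] = []
    lost: List[int] = []
    for seq in sorted(sent):
        t_recv = received.get(seq)
        if t_recv is None or t_recv - sent[seq] > loss_threshold_s:
            # Never arrived, or arrived too late to be useful.
            lost.append(seq)
        else:
            delays.append((seq, t_recv - sent[seq]))
    return delays, lost
```

Because the delay is a difference of timestamps taken on two different machines, any clock offset between them appears directly as measurement error, which is why the calibration figure matters.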
Ongoing Tests
Ongoing Tests - Delay
• Type-P
– 12-byte UDP payload, 40 bytes total with UDP and IP headers
– Port “random” per session
• Scheduled using a Poisson Process
– average rate: 2 per second
• “Mostly” full mesh
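Poisson scheduling means the gaps between test packets are drawn independently from an exponential distribution whose mean is the inverse of the average rate. A minimal sketch, assuming a hypothetical helper `poisson_send_times` and a fixed seed for reproducibility (neither is from the slides):

```python
import random
from typing import List


def poisson_send_times(rate_per_s: float, duration_s: float, seed: int = 0) -> List[float]:
    """Return send times (seconds from start) for a Poisson process.

    Inter-packet gaps are exponentially distributed with mean 1 / rate_per_s.
    """
    rng = random.Random(seed)
    times: List[float] = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


if __name__ == "__main__":
    # Average rate of 2 packets per second, as on the slide, for one minute.
    schedule = poisson_send_times(rate_per_s=2.0, duration_s=60.0)
    print(len(schedule), "packets scheduled")
```

Randomized spacing of this kind avoids synchronizing the probes with periodic behavior in the network, which a fixed-interval schedule could either miss or over-sample.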
Ongoing Tests - Routing
• Traceroute to same sites as One-Way
delay
• Scheduled with Poisson process
– average rate: one every 10 minutes
Collecting Results
Central Database Machine
• SGI Origin 200
• 2 processors, 256MB
• 327GB Fibrechannel-attached RAID
for data storage
(DataDirect Networks EV-1000)
Central Database Machine
• Collects performance data from the
measurement machines [ssh, pull]
• Stores the data in a home-grown
database
• Serves data and summaries to reporting
and analysis tools [http]
Current Surveyor Deployment
• 28 machines, 623 paths
– CSG Schools
– Tele-Immersion Labs
– National Labs
– NASA Ames
– CA*net2 Ottawa site
– Auckland, NZ
– …others
Surveyor Map (N. America)
III. Reporting and analysis
tools
• Web based Tools
• Daily summary reports
• Integration with traceroute
measurements
Daily summary reports
• Take a 24-hour sample for a given path
• Divide it into one-minute sub-samples
• For each one-minute sub-sample:
– Minimum delay (blue)
– 50th percentile (green)
– 90th percentile (red)
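The per-minute summary above can be sketched as follows. This is an illustration under stated assumptions, not Surveyor's reporting code: the names `percentile` and `minute_summaries` are hypothetical, percentiles use the nearest-rank definition (one of several in common use), and lost packets are assumed to have been filtered out beforehand.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def percentile(sorted_vals: List[float], p: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def minute_summaries(
    samples: Iterable[Tuple[float, float]],
) -> Dict[int, Tuple[float, float, float]]:
    """Group (send_time_s, delay_s) samples into one-minute sub-samples.

    Returns minute index -> (minimum, 50th percentile, 90th percentile).
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for t, delay in samples:
        buckets[int(t // 60)].append(delay)

    summary: Dict[int, Tuple[float, float, float]] = {}
    for minute, vals in sorted(buckets.items()):
        vals.sort()
        summary[minute] = (vals[0], percentile(vals, 50), percentile(vals, 90))
    return summary
```

Plotting the three values per minute over 24 hours gives the three curves of the daily report: the minimum tracks transmission and propagation delay, while the gap between it and the percentiles reflects queuing.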
Example daily reports
• Advanced Network & Services and
University of Chicago
– path is symmetric
– asymmetric queuing
Examples (continued)
• Advanced Network & Services and
University of Pennsylvania
– path asymmetric
Examples (continued)
• CMU to Brown University
Examples - Route Change
• Advanced Network & Services to
Penn State University
• Route change switched providers, and
removed one provider from the path
Examples - Auckland
• University of Auckland, NZ to
University of Washington, Seattle
• Asymmetric queuing, congested trans-Pacific path
IV. Status
• Deployment rate: 1/week
• Planned: Abilene backbone
– probe at each backbone router
– experiment with piecewise delay
Full Mesh of End-to-end Paths
O(N^2) paths
Paths with Exchange Points
O(X^2 + N) paths
[Figure: Abilene topology showing router nodes, gigaPoPs, and universities]
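The scaling argument behind the two path counts can be made concrete with a small calculation. This is a sketch under assumptions not spelled out in the slides: N is taken to be the number of end sites, X the number of exchange points (or backbone nodes), each path is counted once per direction, and each end site is measured only to and from a single exchange point. Both function names are hypothetical.

```python
def full_mesh_paths(n_sites: int) -> int:
    """One-way paths when every site measures to every other site: O(N^2)."""
    return n_sites * (n_sites - 1)


def exchange_point_paths(n_sites: int, n_exchanges: int) -> int:
    """One-way paths when only exchange points form a full mesh: O(X^2 + N).

    Each end site is measured to and from one exchange point (2 * N paths),
    and the exchange points are measured against each other (X * (X - 1)).
    """
    return n_exchanges * (n_exchanges - 1) + 2 * n_sites


if __name__ == "__main__":
    print(full_mesh_paths(28))           # 756
    print(exchange_point_paths(28, 4))   # 68
```

With the 28 machines in the current deployment, a complete mesh would have 756 one-way paths, so the 623 paths reported are consistent with the "mostly" full mesh described earlier. End-to-end delay would then be estimated by combining the piecewise measurements along a path.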
Near-term improvements
• Improve measurement software
– time stamping in-kernel: to scale without
losing accuracy
• New and improved analyses
– real-time display tools
– flag interesting paths
– trends …
– improved data export to other sites
Summary
• One-way Delay and Loss are
– practical
– useful
• Surveyor infrastructure growing
• Now focus on analysis and applications
More info
• Surveyor project info
– http://www.advanced.org/surveyor/
– Email: mm-info@advanced.org
• Access to plots
– Email me - matt@advanced.org
• IETF IPPM WG
– http://www.advanced.org/IPPM/