(Bae) CASE STUDY: Wireshark in the Large Enterprise

Wireshark in the Large Enterprise
June 16, 2010
Hansang Bae
Senior Vice President | Citi (f.k.a. Citigroup)
Email: hansang@gmail.com
PLEASE REFER TO THE “ANSWERSHEET.DOCX” FILE FOR ADDITIONAL INFORMATION ABOUT THIS PRESENTATION.
THESE SESSIONS WILL BE AVAILABLE ON YOUTUBE: HTTP://WWW.YOUTUBE.COM/USER/HANSANGB
SHARKFEST ‘10
Stanford University
June 14-17, 2010
SHARKFEST ‘10 | Stanford University | June 14 –17, 2010
Please Let TCP Do Its Job.
Problem: Application developers escalate an issue with
slow file (MQ) transfers.
Troubleshooting Steps:
1. What should you rule out immediately?
2. What affects throughput and why?
3. Look for patterns and ask the right questions. Quick examination
would reveal what? Doesn’t it look normal? Can you spot the issue
quickly? Were you guys paying attention yesterday?!?
4. Use the graphing tools. A picture is worth a thousand words.
5. Set up your Wireshark environment in a standard way. Use
Configuration Manager to help you.
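For step 2, one relationship worth keeping in mind is that a single TCP connection cannot deliver more than one window of data per round trip. A minimal sketch of that ceiling follows; the window size, RTT and link speed are illustrative assumptions, not values from the case-study trace.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection throughput: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of the given speed full."""
    return link_bps * rtt_seconds / 8


# Illustrative values only (assumptions, not taken from the trace).
window = 65535   # largest window possible without window scaling
rtt = 0.080      # 80 ms round-trip time
link = 100e6     # 100 Mbit/s path

ceiling = max_tcp_throughput_bps(window, rtt)
needed = bandwidth_delay_product_bytes(link, rtt)
print(f"throughput ceiling: {ceiling / 1e6:.2f} Mbit/s")
print(f"window needed to fill the link: {needed:.0f} bytes")
```

If the ceiling is well below the link speed, the window (not the network) is the first thing to examine.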
Don’t Jump to Conclusions!
Another application development team escalates a
“slowness” problem.
Troubleshooting Steps:
1. Trust But Verify (tcp.analysis.flags)
2. Look for telltale signs of problems.
3. Who’s sending and who’s receiving? Besides looking at
the name of the file... can you figure it out?
4. Apply Occam’s Razor when solving problems.
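One way to answer step 3 without relying on the file name is to total the TCP payload bytes in each direction: the side carrying the bulk of the payload is the sender. The sketch below assumes the packets have already been exported as `(source, destination, payload length)` tuples; that tuple layout and the addresses are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

# (source, destination, TCP payload length in bytes) - a hypothetical export format
Packet = Tuple[str, str, int]


def payload_by_direction(packets: Iterable[Packet]) -> Counter:
    """Sum TCP payload bytes for each (source, destination) direction."""
    totals: Counter = Counter()
    for src, dst, payload_len in packets:
        totals[(src, dst)] += payload_len
    return totals


def bulk_sender(packets: Iterable[Packet]) -> str:
    """Return the source address of the direction carrying the most payload.

    Raises IndexError if the packet list is empty.
    """
    totals = payload_by_direction(packets)
    (src, _dst), _total = totals.most_common(1)[0]
    return src


# Made-up sample: one side sends data, the other only sends bare ACKs.
sample = [
    ("10.0.0.1", "10.0.0.2", 1460),
    ("10.0.0.2", "10.0.0.1", 0),
    ("10.0.0.1", "10.0.0.2", 1460),
    ("10.0.0.2", "10.0.0.1", 0),
    ("10.0.0.1", "10.0.0.2", 512),
]
print(bulk_sender(sample))
```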
Another (unusual) Hidden Danger!
Application testing with an external vendor doesn’t
work. It tested fine with internal (intranet) resources.
Troubleshooting Steps:
1. If it works internally but not with an external vendor (reachable via
the Internet), what device should you suspect? Learn to Divide and
Conquer – the power of binary search!
2. Have “High Bandwidth Conversations” with qualified peers.
3. Look out for “Defaults.” HSB’ism: Defaults are the guardian angels
for the clueless!
4. Another case of “a picture is worth a thousand words.”
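The divide-and-conquer idea in step 1 can be sketched as a binary search over capture points ordered from the internal host toward the vendor. This is only an illustration: the `tap-N` names and the `looks_good` check are hypothetical, and the sketch assumes traffic looks good at every point before the fault and bad at every point after it.

```python
from typing import Callable, List, Sequence


def first_bad_point(points: Sequence[str], looks_good: Callable[[str], bool]) -> str:
    """Return the first capture point where traffic no longer looks good.

    Assumes points are ordered along the path, the check is good for a
    prefix of the path and bad for the rest, and the last point is bad.
    """
    lo, hi = 0, len(points) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if looks_good(points[mid]):
            lo = mid + 1   # fault lies further along the path
        else:
            hi = mid       # fault is here or earlier
    return points[lo]


# Hypothetical path with six capture points; the fault appears at tap-4.
path = ["tap-1", "tap-2", "tap-3", "tap-4", "tap-5", "tap-6"]
checked: List[str] = []


def looks_good(point: str) -> bool:
    checked.append(point)            # record which points we had to capture at
    return path.index(point) < 3     # tap-1..tap-3 are clean in this example


print(first_bad_point(path, looks_good), "after", len(checked), "captures")
```

Halving the path each time means a handful of captures is enough even on a long path, instead of one capture per device.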
Odd Numbers are Evil? Really?
Software Update System is slow in delivering packages
to staging servers. It impacts 300,000+ users!
Troubleshooting Steps:
1. Usual Suspects (duplex, window size, packet loss, and LFN, i.e. Long Fat Network)
2. Use the information in the trace to eliminate some of the “usual
suspects.” Not all inefficiencies come into play. Does Window come
into play here?
3. Do I need to see the SYN/SYN+ACK to see what environment this is?
What other options are there?
4. Use Time Reference marks liberally.
5. A case of “too much of a good thing.”
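The slide does not spell out why odd numbers are a problem. One well-known way an odd count hurts TCP is the delayed ACK: many receivers acknowledge every second full-size segment, so a write that ends on an odd segment can leave the last segment waiting for the delayed-ACK timer. The sketch below is a rough model under that assumption; the 200 ms timer, the MSS of 1460 bytes and the write sizes are illustrative values, and it assumes the sender waits for the ACK before the next write.

```python
import math


def segments_per_write(write_bytes: int, mss: int = 1460) -> int:
    """Number of TCP segments needed to carry one application write."""
    return math.ceil(write_bytes / mss)


def delayed_ack_penalty_seconds(
    total_bytes: int,
    write_bytes: int,
    mss: int = 1460,
    delayed_ack_timer: float = 0.2,  # assumed timer value; varies by stack
) -> float:
    """Rough added delay if every odd-segment write waits for a delayed ACK.

    Simplification: every write is treated as full size, and an even
    segment count is assumed to incur no delayed-ACK wait at all.
    """
    if segments_per_write(write_bytes, mss) % 2 == 0:
        return 0.0
    writes = math.ceil(total_bytes / write_bytes)
    return writes * delayed_ack_timer


# Illustrative comparison: a 10 MB transfer with two different write sizes.
for size in (4096, 2920):
    segs = segments_per_write(size)
    penalty = delayed_ack_penalty_seconds(10_000_000, size)
    print(f"{size}-byte writes: {segs} segments each, ~{penalty:.1f} s of added wait")
```

Whether this is the mechanism at work in this case has to be confirmed from the trace, for example by looking for regular gaps that match the delayed-ACK timer.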
Another Zebra Case!
Users are calling into the helpdesk because the Citrix
sessions are dying.
Main Concept:
1. Applications traversing the Internet play by a different set of
rules/standards. Packet loss is a way of life.
2. Do you **REALLY** know TCP?
3. Did you pick up on why the 500ms delay is significant?
4. What is Fast Retransmit and how is it different from “regular”
Retransmission?
5. Learn the art of spotting something unusual. But first, you need to
understand “what’s unusual.”
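For point 4, the difference can be sketched from the sender's side: Fast Retransmit is triggered by the third duplicate ACK (RFC 5681), while a "regular" retransmission happens only when the retransmission timer expires. The following is a simplified model with made-up ACK numbers; it counts repeated ACK numbers only and ignores the other conditions a real stack checks (no payload in the ACK, unchanged window, outstanding data).

```python
from typing import Iterable, List, Optional

DUP_ACK_THRESHOLD = 3  # RFC 5681: retransmit on the third duplicate ACK


def fast_retransmit_points(ack_numbers: Iterable[int]) -> List[int]:
    """Return the ACK numbers at which a sender would fast-retransmit."""
    triggered: List[int] = []
    last_ack: Optional[int] = None
    dup_count = 0
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                # Third duplicate: resend the segment starting at this ACK number
                # without waiting for the retransmission timer.
                triggered.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return triggered


# Made-up ACK stream: the receiver keeps asking for byte 2460.
acks = [1000, 2460, 2460, 2460, 2460, 8300]
print(fast_retransmit_points(acks))
```

A lost segment that never draws three duplicate ACKs (for example, the last segment of a burst) can only be recovered by the timer, which is why the two kinds of retransmission look so different on a time axis.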
WAN Optimization
After upgrading WAN optimization appliances, tellers started
reporting intermittent printing issues. Transient problems like
these are the toughest to resolve. What was the time to
resolution? Three days, thanks to packet captures.
Main Concept:
1. The last change was an OS upgrade on the WAN optimization appliance, so
start there.
2. Capturing at the right capture points is critical. Why?
3. Is it worth looking at TCP Session #2?
4. What should you compare? What can you compare?
5. Sake Blok’s session last year on SSL decryption was VERY helpful!