Deep Start: A Hybrid Strategy for Automated Performance Problem Searches
Philip C. Roth
pcroth@cs.wisc.edu
Computer Sciences Department
University of Wisconsin
1210 W. Dayton St.
Madison, WI 53706-1685
USA
Paradyn/Condor Week (12 March 2001, Madison, WI)
Performance Consultant
• Paradyn’s automated bottleneck diagnosis component
• Search-based
•General to very specific experiments
• Experimental data collected using dynamic instrumentation
• Automates process that experienced programmer would use
Deep Start Search Strategy
• Goal: more scalable automated searches
• Idea: search “closer” to actual bottlenecks
• Hybrid approach
•Automated search using dynamic instrumentation
•Stack sampling
• Benefits
•Efficiency: find bottlenecks more quickly
•Effectiveness: find bottlenecks hidden from current search strategy
Performance Consultant
• Hypotheses: reasons why application may be performing poorly
[Figure: hypothesis hierarchy; nodes: TopLevelHypothesis, ExcessiveSyncWaitingTime, ExcessiveIOBlockingTime, CPUbound, TooManySmallIOOps]
Performance Consultant
• Resources: locations where application may be performing poorly
Performance Consultant
• Focus: tuple of resources, one from each hierarchy
• Names a set of application resources
• Example: </Code/setup.c,/Machine/lc05.cs.wisc.edu,/SyncObject>
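A focus can be modeled as a plain tuple with one resource path per hierarchy. The sketch below is illustrative only: the CHILDREN table, the resource /Code/main.c, and the refinements helper are hypothetical names introduced here, not part of Paradyn. It assumes a focus is refined by moving a single component one level down its hierarchy.

```python
from typing import Dict, List, Tuple

Focus = Tuple[str, ...]  # one resource path per hierarchy

# Hypothetical child table (an assumption for illustration); a real tool
# would discover resources at run time. /Code/main.c is made up.
CHILDREN: Dict[str, List[str]] = {
    "/Code": ["/Code/setup.c", "/Code/main.c"],
    "/Machine": ["/Machine/lc05.cs.wisc.edu"],
    "/SyncObject": [],
}


def refinements(focus: Focus) -> List[Focus]:
    """Return every focus reachable by moving one component one level down."""
    result: List[Focus] = []
    for i, resource in enumerate(focus):
        for child in CHILDREN.get(resource, []):
            result.append(focus[:i] + (child,) + focus[i + 1:])
    return result


# The whole-program focus names the root of every hierarchy.
whole_program: Focus = ("/Code", "/Machine", "/SyncObject")
```

Under these assumptions, the focus shown on the slide is reached from whole_program in two refinement steps: first along Code, then along Machine.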
Current Search Strategy
• First determine why application is performing poorly
• Search through hypotheses at whole program focus
• Then find where application is performing poorly
• Refine focus as much as possible
• Code follows call graph
• Others follow resource hierarchy structure
• One step at a time
• Prune search path when experiment metric is below threshold
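The refine-and-prune loop described above can be sketched as follows. This is a minimal illustration, not the Performance Consultant's implementation: the breadth-first order, the search function, and the toy call graph and metric values in the usage note are all assumptions.

```python
from collections import deque
from typing import Callable, Dict, List


def search(root: str,
           children: Dict[str, List[str]],
           metric: Callable[[str], float],
           threshold: float) -> List[str]:
    """Refine one step at a time; prune any node whose metric is below threshold.

    Returns the nodes whose experiment tested true, in the order tested.
    """
    found: List[str] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if metric(node) < threshold:
            continue  # prune: nothing below this node is examined
        found.append(node)
        queue.extend(children.get(node, []))  # refine one level further
    return found
```

For a made-up call graph {"main": ["init", "solve"], "solve": ["exchange", "compute"]} with metric values main 0.9, init 0.05, solve 0.8, exchange 0.1, compute 0.6 and a threshold of 0.2, the search visits main, prunes init, refines solve, prunes exchange, and ends at compute. Note that compute is reached only after every ancestor has been tested, which is the cost Deep Start aims to reduce.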
Deep Start Search Strategy
• Goal: start searching “closer” to actual bottlenecks
• Approach: use stack samples gathered as a side effect of dynamic instrumentation as hints
• Sampling augments current strategy
•Bottlenecks absent from the samples are still found by the regular search
Deep Start Example
[Figures: search example shown across four slides; surviving labels: “Deep starter”, “Adding deep starter was worthwhile!”]
Stack Sampling
• Paradyn daemons perform stack walk whenever they insert dynamic instrumentation
• Daemons now save stack samples
• Samples delivered in batches to Performance Consultant
Choosing Deep Starters
• Function count graph
•Example stack samples: ABCD, AECD, AFD, AFG
•Resulting counts: A:4, B:1, C:2, D:3, E:1, F:2, G:1
Choosing Deep Starters
• Consider functions whose count is above threshold
• Percentage of total samples seen
• Choose deepest function in above-threshold subgraphs
[Figure: function count graph with Threshold = 3]
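The selection rule can be sketched in a few lines. This is a hedged illustration rather than Paradyn's code: build_count_graph and choose_deep_starters are hypothetical names, each stack sample is written as a string of one-letter function names ordered from outermost to innermost frame, and the threshold is treated as a raw count instead of a percentage. The slides say "above threshold" while showing Threshold = 3 next to D:3; the sketch assumes a count equal to the threshold qualifies, which is a guess. "Deepest" is interpreted as a qualifying function that calls no other qualifying function.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def build_count_graph(samples: List[str]) -> Tuple[Counter, Dict[str, Set[str]]]:
    """Count how often each function appears and record caller -> callee edges."""
    counts: Counter = Counter()
    edges: Dict[str, Set[str]] = {}
    for stack in samples:
        for depth, func in enumerate(stack):
            counts[func] += 1
            edges.setdefault(func, set())
            if depth + 1 < len(stack):
                edges[func].add(stack[depth + 1])
    return counts, edges


def choose_deep_starters(samples: List[str], threshold: int) -> List[str]:
    """Pick qualifying functions that have no qualifying callee.

    Assumption: a count equal to the threshold qualifies (>=).
    """
    counts, edges = build_count_graph(samples)
    hot = {func for func, count in counts.items() if count >= threshold}
    return sorted(func for func in hot if not (edges[func] & hot))


samples = ["ABCD", "AECD", "AFD", "AFG"]
counts, _ = build_count_graph(samples)
```

With the four samples from the slides, counts reproduces the graph labels (A:4, B:1, C:2, D:3, E:1, F:2, G:1). Because a function is counted once per frame, a recursive function appearing twice in one stack would be counted twice; whether that is desirable is left open here.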
Adding Deep Starters
• Look for deep starters each time the Performance Consultant (PC) refines along a search path that has already searched the Code hierarchy
• Search from deep starter at high priority
•Focuses attention near likely bottlenecks
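One way to picture "search from deep starter at high priority" is a two-level priority queue of pending experiments. The sketch below is an assumption about how such scheduling could look, not a description of Paradyn's internals; SearchQueue, HIGH, and NORMAL are hypothetical names.

```python
import heapq
import itertools
from typing import List, Tuple

HIGH, NORMAL = 0, 1  # the smaller number is served first


class SearchQueue:
    """Pending experiments; deep starters jump ahead of ordinary refinements."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Tie-breaker that preserves insertion order within one priority level.
        self._order = itertools.count()

    def add(self, experiment: str, deep_starter: bool = False) -> None:
        priority = HIGH if deep_starter else NORMAL
        heapq.heappush(self._heap, (priority, next(self._order), experiment))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

If ordinary refinements B and E are queued first and deep starter D is added afterwards, D is still popped first, followed by B and E in their original order.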
Results
• Applications
•Sequential circuit layout
• Sun Ultra 10 (SPARC)
• Solaris 7
•Parallel global circulation simulation
•Parallel quantum chromodynamics simulation
• Eight-node x86 cluster, 100 Mb/s Ethernet switch
• Linux 2.2.17, high-resolution timer patch
• MPICH 1.2
Bubba
• sequential circuit layout
[Figure: Bottlenecks Found (0% to 100%) vs. Time (0 to 250 sec); series: Deep Start, Call Graph]
Su3_rmd
• SU(3) lattice gauge theory simulation
[Figure: Bottlenecks Found (0% to 100%) vs. Time (0 to 300 sec); series: Deep Start, Call Graph]
Om3
• global ocean general circulation model
[Figure: Bottlenecks Found (0% to 100%) vs. Time (0 to 300 sec); series: Deep Start, Call Graph]
Future Work
• Bidirectional Deep Start
•Search upward from deep starter as well as downward
•Takes further advantage of stack samples as “hot paths”
• “Priming the pump”
•Sampling period at the start of a Performance Consultant search
•Avoids making early deep starter decisions based on too few samples
Future Work
• Minimize cost of finding deep starters
•With probability p at each refinement
•Every nth refinement
•Every nth sample
•Push model
• Make context-sensitive decisions to support deep starters in non-Code hierarchies
•Identify functions in stack samples
Conclusions
• Deep Start strategy is more efficient and effective than current Performance Consultant search strategy
• Hybrid strategy takes advantage of strengths of dynamic instrumentation and sampling
• Technique applicable to other types of hints
• Better scalability for automated searches
Demo Wednesday!