CACHETOR: Detecting Cacheable Data to Remove Bloat

Khanh Nguyen
Guoqing Xu
UC Irvine
USA
Introduction
Bloat: excessive work to accomplish simple tasks
• Modern software suffers from bloat [Xu et al., FoSER 2010]
• It is difficult for compilers to remove the penalty
• One pattern: repeated computations that have the same inputs and produce the same outputs
• 4 out of 18 best practices (IBM's)* are to reuse data
* www.ibm.com/software/webservers/appserv/ws_bestpractices.pdf
Khanh Nguyen - UC Irvine
Example
Before: every element is converted, although most inputs are 1.0.
float[] fValues = {0.0f, 2.3f, 1.0f, 1.0f, 1.0f, 3.4f, 1.0f, 1.0f, . . . , 1.0f};
int[] iValues = new int[fValues.length];
for (int i = 0; i < fValues.length; i++) {
    iValues[i] = Float.floatToIntBits(fValues[i]);
}
After: the result for the frequent input 1.0 is computed once and reused.
int cached_result = Float.floatToIntBits(1.0f);
for (int i = 0; i < fValues.length; i++) {
    if (fValues[i] == 1.0f)
        iValues[i] = cached_result;
    else
        iValues[i] = Float.floatToIntBits(fValues[i]);
}
{adapted from sunflow, an open-source image rendering system}
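A runnable version of the example above might look like the following. It is only a sketch for checking that the cached loop yields the same bits as the original loop; the class and method names (CachedConversion, convert, convertCached) are hypothetical and do not come from sunflow or Cachetor.

```java
import java.util.Arrays;

/** Hypothetical runnable version of the slide's example; not taken from sunflow. */
final class CachedConversion {

    /** Baseline: converts every element, even when the same input repeats. */
    static int[] convert(float[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Float.floatToIntBits(values[i]);
        }
        return out;
    }

    /**
     * Cached variant: the result for one frequent input is computed once.
     * Comparing with == is safe for 1.0f, but not for 0.0f, because
     * -0.0f == 0.0f holds although the two have different bit patterns.
     */
    static int[] convertCached(float[] values, float frequent) {
        int cachedResult = Float.floatToIntBits(frequent);
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            if (values[i] == frequent) {
                out[i] = cachedResult;
            } else {
                out[i] = Float.floatToIntBits(values[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] fValues = {0.0f, 2.3f, 1.0f, 1.0f, 3.4f, 1.0f};
        // Both loops should produce identical bit patterns for this input.
        System.out.println(Arrays.equals(convert(fValues), convertCached(fValues, 1.0f)));
    }
}
```

The comment on convertCached points at one reason floating-point values need care when deciding that two results are "identical".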
The Big Picture
[Figure: Dynamic Dependence Analysis produces a Dependence Profile/Graph, which is used by three detectors: I-Cachetor, D-Cachetor, and M-Cachetor]
Cachetor
• Introduction
• Scalable algorithms for the dependence analysis
• 3 detectors
• Evaluations
In Theory / In Practice
[Figure: Cachetor positioned between value profiling and dynamic slicing. Labels: Full Value Profiling, Full Dynamic Slicing, Abstract Value Profiling, Abstract Dynamic Slicing]
Overview
• Combine value profiling and dynamic slicing in a mutually beneficial and scalable manner
   o Distinct values are used to abstract instruction instances
• Result: an abstract dependence graph
   o Nodes: abstract representations of runtime instances
   o Edges: dependence relationships between nodes
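The abstract dependence graph described above could be represented roughly as follows. This is a minimal sketch, not Cachetor's implementation: the class AbstractDependenceGraph, the choice of node key (instruction id plus a value class), and all method names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: instances sharing (instruction, value class) collapse into one node. */
final class AbstractDependenceGraph {

    /** One node stands for all runtime instances of an instruction in the same value class. */
    static final class Node {
        final int instructionId;
        final int valueClass;
        long frequency; // number of runtime instances merged into this node

        Node(int instructionId, int valueClass) {
            this.instructionId = instructionId;
            this.valueClass = valueClass;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node n = (Node) o;
            return instructionId == n.instructionId && valueClass == n.valueClass;
        }

        @Override
        public int hashCode() {
            return 31 * instructionId + valueClass;
        }
    }

    private final Map<Node, Node> nodes = new HashMap<>();
    private final Map<Node, Set<Node>> edges = new HashMap<>();

    /** Records one runtime instance and returns the abstract node it maps to. */
    Node record(int instructionId, int valueClass) {
        Node node = nodes.computeIfAbsent(new Node(instructionId, valueClass), k -> k);
        node.frequency++;
        return node;
    }

    /** Adds a dependence edge: `user` consumed a value produced by `def`. */
    void addDependence(Node user, Node def) {
        edges.computeIfAbsent(user, k -> new HashSet<>()).add(def);
    }

    int nodeCount() {
        return nodes.size();
    }

    int edgeCount() {
        int count = 0;
        for (Set<Node> targets : edges.values()) count += targets.size();
        return count;
    }
}
```

Because nodes are keyed by the abstraction rather than by individual instances, the graph size is bounded by the number of instructions times the number of value classes, not by the length of the execution.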
Equivalence Class
[Figure: function f1 partitions the runtime instances of an instruction i into equivalence classes e1, …, en]
Equivalence Class
f1(inst. instance) = value created
[Figure: f1 maps instruction instances to the values they create; a second function f2 maps the created values to a small set of classes. Candidates for f2: Top-N? Hashing? The final figure uses hashing.]
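The two-level abstraction in these figures (f1 maps an instance to the value it creates, f2 hashes that value into a bounded set of classes) can be sketched as below. The class ValueAbstraction, the modulo-based hash, and the number of slots are assumptions for illustration; the slides do not specify the hash function Cachetor uses.

```java
/** Hypothetical sketch of the value abstraction; the hash choice is an assumption. */
final class ValueAbstraction {

    /** f2 (assumed form): maps a created value to one of `slots` classes by hashing. */
    static int valueClass(long value, int slots) {
        return Math.floorMod(Long.hashCode(value), slots);
    }

    /** Per-instruction profile: how many instances fell into each value class. */
    static long[] profile(long[] createdValues, int slots) {
        long[] counts = new long[slots];
        for (long v : createdValues) {
            counts[valueClass(v, slots)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Values 3 and 6 are distinct but share a class when slots == 3.
        long[] created = {3, 3, 3, 6, 6, 1, 4, 7};
        long[] counts = profile(created, 3);
        for (int slot = 0; slot < counts.length; slot++) {
            System.out.println("class " + slot + ": " + counts[slot]);
        }
    }
}
```

Distinct values that collide in the same class look identical to the analysis, which is one way hashing can introduce false positives.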
Another Abstraction Level
• Context sensitive:
   o To distinguish entities based on the calling context
   o To improve the tool's precision
• Please refer to our paper for details
Cacheability
• Quantitative measurement indicating how likely a program entity will keep producing/containing identical values
• Compute cacheability for 3 kinds of program entities:
   o Instruction
   o Data structure
   o Method call
• Rank and report top entities
I-Cachetor
• Detect instructions that create identical values
• Compute cacheability for each static instruction (Inst.CM)
• Cacheability: [figure; labels 0, 1, 2, 3]
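To make the idea of a per-instruction cacheability score concrete, here is a small sketch. The metric used (the share of instances that fall into the most frequent value class) is an assumption chosen for illustration, not the Inst.CM formula from the paper; the class InstructionCacheability and its methods are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical scoring sketch; the metric is NOT the paper's Inst.CM definition. */
final class InstructionCacheability {

    /** Assumed metric: share of instances in the most frequent value class, in [0, 1]. */
    static double instCM(long[] classCounts) {
        long total = 0;
        long max = 0;
        for (long c : classCounts) {
            total += c;
            max = Math.max(max, c);
        }
        return total == 0 ? 0.0 : (double) max / total;
    }

    /** Returns instruction ids ordered from the highest to the lowest score. */
    static List<Integer> rank(Map<Integer, long[]> profiles) {
        List<Integer> ids = new ArrayList<>(profiles.keySet());
        ids.sort(Comparator.comparingDouble((Integer id) -> instCM(profiles.get(id))).reversed());
        return ids;
    }
}
```

A score near 1 under this assumed metric means an instruction almost always produces a value in the same class, which makes it a candidate to inspect first.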
D-Cachetor: Overview
• 2 steps:
   o Step 1: detect cacheable individual objects
   o Step 2: detect cacheable data structure
• Compute cacheability for each allocation site node
D-Cachetor: Step 1
• Compute cacheability for each object (Obj.CM), not considering reference relationships
• Focus: instructions that write primitive-typed fields
[Figure: object allocated at a = new A(), followed over time t by field writes a.f = b<2,3>, a.g = c<3,3>, a.h = d<5,7>, …, a.… = …]
D-Cachetor: Step 2
• Group objects using the reference relationships
• Compute DataStructureCM
• Focus: instructions that write reference-typed fields
• Add only objects whose Obj.CM is within a range
[Figure: object graph rooted at ds = new DS()2 with nodes a = new A()4, c = new C()2, b = new B()6, d = new D()7]
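Step 2 can be pictured as a traversal over reference edges that keeps only objects whose Obj.CM lies in a given range. The sketch below rests on several assumptions that the slides do not confirm: the range is passed in as fixed bounds, objects outside the range are neither counted nor traversed further, and DataStructureCM is taken to be the average Obj.CM of the kept objects. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of grouping objects into a data structure and scoring it. */
final class DataStructureCacheability {

    /** A heap object with its own score and its reference-typed fields. */
    static final class Obj {
        final String name;
        final double objCM;
        final List<Obj> fields = new ArrayList<>();

        Obj(String name, double objCM) {
            this.name = name;
            this.objCM = objCM;
        }
    }

    /** Assumed aggregation: average Obj.CM over reachable objects within [lo, hi]. */
    static double dataStructureCM(Obj root, double lo, double hi) {
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> work = new ArrayDeque<>();
        work.push(root);
        double sum = 0.0;
        int kept = 0;
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (!seen.add(o)) continue;                 // already visited (handles cycles)
            if (o.objCM < lo || o.objCM > hi) continue; // outside the range: not added
            sum += o.objCM;
            kept++;
            for (Obj f : o.fields) work.push(f);
        }
        return kept == 0 ? 0.0 : sum / kept;
    }
}
```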
M-Cachetor
• Detect method calls that have the same inputs and produce the same outputs
• Compute CallSiteCM
• For each call site c: a = f( ), CallSiteCM is:
   o If a is primitive: CallSiteCM = Inst.CM of c
   o If a is reference: CallSiteCM = the average of DataStructureCM of all data structures rooted at a
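The two-case rule for CallSiteCM can be written down directly. The sketch below assumes that Inst.CM and the DataStructureCM values have already been computed elsewhere; the class and parameter names are hypothetical, and returning 0.0 when no data structure was observed is an added assumption.

```java
import java.util.List;

/** Hypothetical sketch of the CallSiteCM rule for a call site c: a = f(). */
final class CallSiteCacheability {

    /**
     * @param returnsPrimitive  whether the returned value a is primitive-typed
     * @param instCM            Inst.CM of the call instruction c
     * @param dataStructureCMs  DataStructureCM of each data structure rooted at a
     */
    static double callSiteCM(boolean returnsPrimitive, double instCM, List<Double> dataStructureCMs) {
        if (returnsPrimitive) {
            return instCM; // primitive result: use the instruction's own score
        }
        if (dataStructureCMs.isEmpty()) {
            return 0.0; // assumption: nothing observed, so nothing to cache
        }
        double sum = 0.0;
        for (double cm : dataStructureCMs) {
            sum += cm;
        }
        return sum / dataStructureCMs.size(); // reference result: average over data structures
    }
}
```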
Implementation
• Jikes RVM 3.1.1
• Optimizing-compiler-only mode
• Context-sensitive
• Evaluated on 14 benchmarks from DaCapo & Java Grande
Overheads
[Figure: per-benchmark time and space overheads; time axis 0 to 600X, space axis 0 to 10X]
Geo. Mean = 201.96X (Time), 1.98X (Space)
Case Studies
Program      Time Reduction   Space Reduction   GC runs Reduction   GC time Reduction
montecarlo   12.1%            98.7%             70.0%               89.2%
raytracer    19.1%            1.2%              33.3%               30.2%
euler        20.5%            0.4%              40.0%               44.8%
bloat        13.1%            12.6%             -7.3%               -4.0%
xalan        5.2%             0.1%              -0.7%               -1.1%
False Positives
Program      D-Cachetor   M-Cachetor
montecarlo   2            6
raytracer    3            4
euler        1            7
bloat        1            4
xalan        4            5
Numbers of false positives identified among the top 20 items in the reports of D-Cachetor and M-Cachetor.
False Positives Sources
• Handling of floating point values
• Context-sensitive reporting
• Missing the actual values
• Hashing-induced false positives
Conclusions
• Cachetor: a novel tool that supports detection of cacheable data to improve performance
• Scalable combination of value profiling and dynamic slicing
• 3 detectors that can detect cacheable:
   o Instructions
   o Data structures
   o Method calls
• Large optimization opportunities can be found from Cachetor's reports
THANK YOU!
Questions - Comments?
What happened in montecarlo?
public void runSerial() {
    results = new Vector(nRunsMC);
    // Now do the computation.
    PriceStock ps;
    for (int iRun = 0; iRun < nRunsMC; iRun++) {
        ps = new PriceStock();
        ps.setInitAllTasks(initAllTasks);
        // Before: ps.setTask(tasks.elementAt(iRun));
        // After: pass the run index and seed directly instead of reading a stored task.
        ps.setTask(iRun, (long) iRun * 11);
        ps.run();
        results.addElement(ps.getResult());
    }
}
{Calculate the result on the fly}
private void processSerial() {
    processResults();
}
// Each stored task is fully determined by its index i:
// header "MC run " + i and seed (long) i * 11.
private void initTasks(int nRunsMC) {
    tasks = new Vector(nRunsMC);
    for (int i = 0; i < nRunsMC; i++) {
        String header = "MC run " + String.valueOf(i);
        ToTask task = new ToTask(header, (long) i * 11);
        tasks.addElement((Object) task);
    }
}