A Framework for Source-Code-Level Interprocedural Dataflow Analysis of AspectJ Software
Guoqing Xu and Atanas Rountev
Ohio State University
Supported by NSF under CAREER grant CCF-0546040
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University

Outline
Motivation
Program representation
- Control flow representation
- Data flow representation
Proof-of-concept analyses
- Object effect analysis
- Dependence-analysis-based slicing
Experimental evaluation

Interprocedural Dataflow Analysis
Interprocedural dataflow analysis is important
- For various software engineering and compiler construction tasks
- e.g. performance optimization, static software verification, testing, software understanding and evolution
Powerful analyses for AspectJ are needed
- AOP is becoming increasingly popular
- Need a good program representation
- Need new algorithms

Program Representation for AspectJ
Properties of a good representation
- Should be easy to use for various clients
  - Adapt existing Java analysis algorithms
- Should provide clean separation between the base code and aspects
  - Automated reasoning about the interaction between aspects and base code
Advantages of source-code-level over bytecode-level analysis
- Produces more relevant results
- Provides clean separation of base code and aspects
- Faster running time

Proposed representation
Has both properties
- Take the large body of existing analysis algorithms for Java, and adapt them easily to AspectJ
- Define new analysis algorithms specifically for AspectJ features
Control flow representation [ICSE’07]
- Complex interactions of advices
- Dynamic advices
Data flow representation [this paper]
- Using calls and returns along chains of nested advice invocations
- Expose the decision-making
  data for dynamic advices

Running Example
    class Point {
      int x;
      void setX(int x) { this.x = x; }
      static void main(String[] a) {
        Point p = new Point();
        p.setX(10);
      }
    }
    aspect BoundPoint {
      pointcut setterX(Point p) :
        call(void Point.setX(*)) && target(p);
      … // advices
    }

Advices in BoundPoint:
    /* before1 */
    before(Point p, int x) : setterX(p) {…}
    /* around1 */
    void around(Point p, int x) : setterX(p) { … proceed(p,x); … }
    /* before2 */
    before(Point p) : setterX(p) { … }
    /* after1 */
    after(Point p) : setterX(p) { … }

Control-Flow Representation
Interprocedural Control Flow Graph (ICFG)
- Java-like representation for “normal” calls
  - No join points
- New representation for interactions at join points
  - Interaction graph
Multiple advices applicable at a join point
- Calls to represent nesting relationships
- Use ph_* placeholder methods to represent calls to proceed
Dynamic advices
- Use a ph_decision placeholder decision node to guard call sites of dynamic advices

[Figure: interaction graph for the running example, with nodes ph_root, before1, around1, ph_proceed1, before2, after1, and Point.setX (p.setX), each with entry and exit nodes; the legend distinguishes CFG edges from call edges.]

Handling of Dynamic Advices
[Figure: ph_decision nodes with T and F branches; the nodes shown include before1, around1, and ph_proceed1.]

Data-flow Representation for the Interaction Graph (IG)
Declarations for placeholder methods
- Create a formal parameter for (1) the receiver object and (2) each formal parameter of the crosscut method
- e.g.
  for shadow p.setX(6), the entry method is declared as
    void ph_root(Point arg0, int arg1)
Declarations for non-around advice
- The original list of formal parameters is used directly
- e.g. void before1(int arg0, Point arg1)
Call sites for advices are built with appropriate actual parameters
- void ph_root(Point arg0, int arg1) { before1(arg1, arg0); }

Handling of Around Advice
Handling of around advice is complicated
- It must have all the necessary parameters of the shadow call site
- Parameters needed by an around advice can be different for different shadows
abc solution: replicate the body of an around advice for each shadow that the around advice matches
Our solution: construct a globally valid list that includes parameters required at all matching shadows

Example
    setX(Point arg0, int arg1)
    setY(Point arg2, float arg3)
    setZ(double arg4, Point arg5)
    around(Point p):call(void Point.set*(*)) && target(p)
    around(Point arg0, int arg1, Point arg2, float arg3, double arg4, Point arg5, int dv)

Handling of Around Advice
Call to around advice
- Non-trivial actual parameters are passed at the call site only for the formals corresponding to the currently active shadow
- A unique shadow ID is passed for dv
- e.g.
    setX(Point arg0, int arg1)
    setY(Point arg2, float arg3)
    setZ(double arg4, Point arg5)
    around(Point p):call(void Point.set*(*)) && target(p)
    around(Point arg0, int arg1, Point arg2, float arg3, double arg4, Point arg5, int dv)
    for p.setX(..), around(arg0, arg1, null, 0, 0, null, 0)
    for p.setY(..), around(null, 0, arg2, arg3, 0, null, 1)

Handling of Around Advice (Cont.)
Call to a placeholder method within an around advice
- Use dv to select calls to placeholder methods
- e.g.
    void around(Point arg0, int arg1, Point arg2, float arg3,
                double arg4, Point arg5, int dv) {
      switch (dv) {
        // corresponding to shadow setX
        case 0: ph_proceed0(arg0, arg1); break;
        // corresponding to shadow setY
        case 1: ph_proceed1(arg2, arg3); break;
        // corresponding to shadow setZ
        case 2: ph_proceed2(arg4, arg5);
      }
    }

Data with ph_decision Nodes
For a dynamic advice, the data that contributes to dynamic decision making is associated with the ph_decision node that guards the call to the advice
- E.g. before(Point p, int i):if(i > 0) && args(i) && target(p)
  - Both p and i are associated with the ph_decision node

So What?
Data flow is explicit with parameter passing
Advice nesting relationship is clear
Ready for data-flow analysis

Object Effect Analysis
For each object passed from the base code into an advice, compute a state machine encoding all events that must happen on the object
- The state machine is represented by a regular expression
- Can be used directly for program verification (e.g. checking typestate-based properties)
- E.g.
    (reset | ε) ((setX | ε) | (getX ((setX | ε) | (setRectangular | ε))))
    ((wx wy | ε)) (((wx | ε) | (rx ((wx | ε) | ((wx wy) | ε)))))
- At the core of this analysis is a must-alias analysis
A typical example of interprocedural analysis for AspectJ
- Need to track the interprocedural flow of data
- Need to be aware of the separation between base code and aspects

Object Effect Analysis (Cont.)
Data flow problem
- We define a lattice element for each reference-typed formal parameter of a ph_root method
- E.g. for void ph_root(Point arg0, int arg1), the lattice is {l_arg0, ⊥, ⊤}
- Transfer functions:
    v1 = v2:      f_n(S) = S[v1 ↦ S(v2)]
    v1 = v2.fld:  f_n(S) = S[v1 ↦ ⊥]
    v1.fld = v2:  f_n(S) = S
    v1 = new X:   f_n(S) = S[v1 ↦ ⊥]
    other nodes:  f_n(S) = S
- Compute the meet-over-all-valid-paths solution MVP_n

Analysis Algorithm
Phase 1: Relate formals to variables
- For each variable in a method, compute the set of formal parameters that the variable may alias
- Traverse the call graph bottom-up to compute a summary function that relates the return value of a method to its parameter(s)
Phase 2: Propagation of lattice elements
- Traverse the call graph top-down, starting from each ph_root method
- For each variable, replace the formal parameters associated with it with the corresponding lattice element(s)
Phase 3: Effect graph construction
- Prune the ICFG by removing nodes that do not have a lattice element
- Compute the SCCs of the graph, and traverse the SCC-DAGs bottom-up

Dependence-based Slicing
Another typical example of interprocedural dataflow analysis
Given existing slicing algorithms for Java, adapting them to AspectJ is simple
- Variables associated with ph_decision nodes are considered used
- The slice for a call to proceed in an around advice includes the slices computed for the group of calls to ph_proceed methods in the advice

Experimental Evaluation
Comparison of ICFG and SDG sizes between source-code analysis and bytecode analysis
Comparison of effect analysis results between must-alias analysis and may-alias analysis
Comparison of slice relevance between source-code slicing and bytecode slicing
Implementation based on the abc
compiler
- Between static weaving and advice weaving
- Intraprocedural representation based on Jimple from abc

Numbers of Edges in ICFG and SDG
[Figure: numbers of SDG edges (▲) and ICFG edges (■).]
- # ICFG edges: 2X smaller
- # SDG edges: 3X smaller

Object Effect Analysis
[Figure: Must/May per benchmark for bean/v4, tracing, telecom/v7, quicksort/v2, nullcheck/v2, nullcheck/v5, lod/v1, lod/v2, dcm/v2, and spacewar/v1.]
The analysis has a short running time
- E.g. less than 3 sec for the largest program

Slicing Relevance Ratio and Time
[Figures: slicing relevance ratio and slicing time per benchmark, comparing Source and Bytecode.]

Conclusions
We propose a program representation for AspectJ software
- A control flow representation and a data flow representation
Two proof-of-concept analyses
- Both are IDE (Interprocedural Distributive Environment) problems, which require context-sensitivity and flow-sensitivity
- Representative of (1) a large class of existing Java analysis algorithms and (2) potentially new algorithms designed specifically for AspectJ
Experimental results
- Source-code-level analysis is superior to bytecode-level analysis
- The program representation is easy to use, and existing algorithms are easy to adapt to it
- New algorithms can be easily designed
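
Illustration: transfer functions as code
The transfer functions of the object effect analysis data flow problem can be sketched in Java as below. This is a minimal illustration, not the authors' implementation. The class name, the method names, the string encoding of lattice values, the reading of ⊤ as "no information yet" and ⊥ as "not known to must-alias a tracked formal", and the flat-lattice meet are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; every name in this class is hypothetical. */
class EffectTransferSketch {
    // Assumed encoding: TOP = no information yet, BOTTOM = not known to
    // must-alias a tracked formal. Any other string names a lattice
    // element such as "l_arg0".
    static final String TOP = "TOP";
    static final String BOTTOM = "BOTTOM";

    /** Assumed flat-lattice meet: TOP is the identity, unequal values fall to BOTTOM. */
    static String meet(String a, String b) {
        if (a.equals(TOP)) return b;
        if (b.equals(TOP)) return a;
        return a.equals(b) ? a : BOTTOM;
    }

    /** v1 = v2: the new state maps v1 to S(v2). */
    static Map<String, String> assign(Map<String, String> s, String v1, String v2) {
        Map<String, String> out = new HashMap<>(s);
        out.put(v1, s.getOrDefault(v2, TOP)); // assumption: unseen variables start at TOP
        return out;
    }

    /** v1 = v2.fld and v1 = new X: the new state maps v1 to BOTTOM. */
    static Map<String, String> kill(Map<String, String> s, String v1) {
        Map<String, String> out = new HashMap<>(s);
        out.put(v1, BOTTOM);
        return out;
    }

    /** v1.fld = v2 and all other nodes: the state is unchanged. */
    static Map<String, String> identity(Map<String, String> s) {
        return s;
    }

    public static void main(String[] args) {
        // Entry of void ph_root(Point arg0, int arg1): arg0 carries l_arg0.
        Map<String, String> s = new HashMap<>();
        s.put("arg0", "l_arg0");

        s = assign(s, "p", "arg0"); // p = arg0
        s = assign(s, "q", "p");    // q = p
        s = kill(s, "p");           // p = new Point()

        System.out.println("p: " + s.get("p"));
        System.out.println("q: " + s.get("q"));
        System.out.println("meet(p, q): " + meet(s.get("p"), s.get("q")));
    }
}
```

In this sketch q keeps l_arg0 after p is reassigned, because each transfer function updates only the variable on the left-hand side of the statement.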