Getting Started in Program Analysis Research

Outline
Background and useful skills – Ana
Using and developing analysis – Mary Lou
Identifying and building infrastructure – Lori
Evaluating your analysis – Ana

Ana Milanova
I am from Bulgaria
National High School for Math and Science
American University in Bulgaria, 1997
– I have a degree in Business Administration
Rutgers University, PhD in CS, 2003
Now Assistant Professor at RPI
Research: program analysis for software tools
Family
– Husband Tony
– Katarina, 5 and Petar, 2

Program Analysis: Useful Background and Skills

Program Analysis
Static program analysis
– Analyzes the source code of the program
– Reasons about run-time behavior properties without running the program
  E.g., “The object values that flow to reference variable x are only of classes A and B, but not C.”
Static analyses are conservative: they consider all possible run-time behaviors of the program

Program Analysis
Dynamic program analysis
– Analyzes a set of program executions
– Reasons about run-time behavior properties over observed executions
  E.g., “The object values that flowed to reference variable x during observed executions were only of classes A and B, but not C.”
Dynamic analyses are incomplete: they consider only behaviors over particular executions
Goal: combine with static analysis

Uses of Static Program Analysis
Compilers – traditional application domain
– Enables optimizing transformations
Software engineering tools
– Static debugging, verification, security
  Uncover difficult errors and security flaws
– Testing
  Evaluate and improve test suites
– Software understanding
  Calling structure
  Complex dependences
  Change impacts

Uses of Program Analysis
Analysis for compiler optimization is different from analysis for software tools
Different requirements, different success criteria (more later…)

Static Analysis Methodologies
Data-flow analysis
Constraint-based program analysis
Abstract interpretation
Type and effect systems
Model checking

Example: Data-flow Analysis
Flow facts
– Information that we are propagating
– E.g., set of definitions {(i,1), (i,4), (i,6)…}
Transfer functions
– The effect of a statement on the incoming flow facts
– E.g., statement i=11 at 6 “kills” the incoming definition (i,4), and “generates” definition (i,6)
[Figure: control-flow graph annotated with sets of reaching definitions; statements and annotations as extracted]
  1. i=11; read x,y   {(i,1)}
  2. if x<y           {(i,1)} {(i,1)}
  3. p(i)
  4. i=j+5            {(i,4)}
  5. p(i)             {(i,4)}
  6. i=11             {(i,1)}
  7. i=i*i            {(i,6)}

Theory
Data-flow frameworks
– Control-flow graph CFG
– Space of flow facts L
– Space of transfer functions F
– Certain properties of L and F allow a general solution procedure
Fixed-point iteration
– Termination: the iterative computation terminates
– Safety (correctness, soundness): the solution is conservative
For most problems the analysis produces “noise”

Theory and Practice
Analysis cost – how much time, memory
Analysis precision – how much noise
– a.m(): A more precise analysis a: {B}, and a less precise analysis a: {A,B,C}
Typically, there is a tradeoff between cost and precision!
In practice, we need to analyze very large programs, 100K LOC, even 1M LOC

Theory and Practice
Approximations - introduce noise
– make the CFG “smaller”
– make the set of flow facts “smaller”
– make the transfer functions converge faster
Approximations are necessary
– But be careful: different approximations for different analyses

Standard Approximations
Flow-sensitive vs. flow-insensitive
  Statement     Flow-sensitive solution after the statement
  x = true;     x: {true}
  y = false;    x: {true}, y: {false}
  x = y;        x: {false}, y: {false}
  Flow-insensitive solution for the whole program: x: {true,false}, y: {false}

Standard Approximations
Context-sensitive vs.
context-insensitive
  A(bool X) { this.f = X; }
  a = new A(true);
  b = new A(false);
Context-sensitive: a.f = true, b.f = false, so a.f: {true}, b.f: {false}
Context-insensitive (merged flow): a.f = true/false, b.f = true/false, so a.f: {true,false}, b.f: {true,false}

Useful Background and Skills
Higher-level undergraduate or graduate courses on:
– Programming Languages, Compilers, Algorithms, Logic, Software Engineering, Architecture
Analytical and programming skills
Step 1: Design a program analysis algorithm
  Understand your target language (e.g., Java and C++, C)
Step 2: Implement the analysis algorithm
  Understand the language(s) of the infrastructure
Step 3: Evaluate the analysis algorithm

Useful Resources
Books (my personal list)
– “Compilers: Principles, Techniques and Tools” by Aho, Sethi, Ullman, Ch. 10
  An introduction to data-flow analysis
– “Principles of Program Analysis” by Nielson, Nielson, Hankin
  An excellent reference for advanced students
– “Model Checking” by Clarke, Grumberg, Peled
Course material on the web
– Classes taught by professors
– My class (there are better ones, of course): www.cs.rpi.edu/~milanova/csci6961/lectures/

Using and Developing Program Analysis
Mary Lou Soffa
University of Virginia

About Mary Lou Soffa
Confused about what I wanted to be
Ph.D. programs:
– Mathematics, Sociology; Philosophy; Environmental Acoustics: disenchanted
– Found what I really loved – computer science
After 25+ years at Pitt, moved to UVA
– Small farm – grow “crops”; love my tractor
– Passion – increasing the participation of women and minorities in computer science
– Professional achievement – 24 Ph.D. students; ½ are women.
Program analysis
How to apply program analysis in your research
What are the questions, and what do you have to do?

Solve a problem
Program behavior: static or dynamic
Determine information needed
What parts of program are involved
Develop appropriate representation
Develop analysis
Develop algorithm

Have a goal – program code
Problem
– Improve performance
– Understand program
– Find errors
– Locate cause of errors
Need to collect information about the program that helps you infer properties of the program
Static or dynamic code

Determine information needed
What questions are you asking?
What do you need to gather to answer the questions?
– Examples:
  Statements needed to compute an expression
  Values that are always constant at a particular program point
  Locations of dead statements
  Branches that are correlated

Example: redundancy
Remove redundancies with the goal of improving performance
– Redundant expressions
– Redundant loads
– Redundant stores
– Dead code
Static
Remove redundant expressions from program representation

Redundant expressions
Does the value need to be computed for correct semantics?
  X := A * B
  F := C + E
  C := C + 1
  If (cond) then
    R := A * B; S := C + E
  Else
    X := A * B; A := 6
  End if
  G := A * B

What parts of program involved
Given the information you need, what parts of the program are involved?
Examples:
– branches and statements that change values in conditional
– all possible execution paths
– Array definitions and uses
– Types
– Loops

Example: Redundant expressions
Expressions
Definitions
Control flow among definitions and expressions
Program paths

Program representation
Program representation that enables collection of information
Granularity
– Source, intermediate, binary
Issues: how to get representation from another representation

Example: redundant expressions
Want to know how expressions flow
Is the value of an expression the same as when the expression is used again?
Need control flow graph with statements in nodes – intermediate level
  X := A + B

Available Expressions
[Figure: control flow graph]
  Entry block:  X := A * B; F := C + E; C := C + 1
  Then branch:  R := A * B; S := C + E
  Else branch:  X := A * B; A := 6
  Join block:   G := A * B

Formulate analysis over representation
How to gather information from representation
How many analyses
Direction of flow of analysis
Along all paths or any path
Local solution
Global solution

Example: Redundant expressions
Local - basic block – single entry/exit
– What expressions are generated
– What expressions are “killed” by a definition
Global
Flow over flow graph
– Forward flow
– Must be true on all paths

Redundant Expressions
[Figure: the same control flow graph, with available-expression sets on its edges]
  Sets annotated in the figure: {A * B} (three edges), {A * B, C + E} (one edge)

Develop analyses
Data flow equations – use data flow framework
Algorithm
Preciseness
Expense

Data flow equations
  Gen(B)  = all expressions generated in B
  Kill(B) = all definitions in B; each kills the incoming available expressions that use the defined variable
  Out(B)  = Gen(B) ∪ (In(B) - Kill(B))
  In(B)   = ∩ Out(j), over all predecessors j of B

Dynamic Optimization
Static optimizations
– Apply before execution
Dynamic optimizations
Apply during execution – redundant expressions
Binary code
Program traces
   1. A = 4
   2. T1 = A*B
   3. L1: T2 = T1/C
   4. If T2 < W go to L2
   5. M = T1 * K
   6. T3 = M + 1
   7. L2: H = I
   8. M = T3 - H
   9. If T3 > 0 go to L3
  10. Go to L1
  11. L3: halt

[Figure: control flow graph over basic blocks B1 to B6]
  B1: 1. A = 4
      2. T1 = A*B
  B2: 3. L1: T2 = T1/C
      4. if T2 < W go to L2
  B3: 5. M = T1*K
      6. T3 = M + 1
  B4: 7. L2: H = I
      8. M = T3 - H
      9. If T3 > 0 go to L3
  B5: 10. go to L1
  B6: 11. L3: halt

Program Trace
Binary code
  A = 4
  T1 = A*B
  T2 = T1/C
  If T2 !< W jump out
  H = I
  M = T3 - H
  If T3 > 0 go to L3
  T2 = T1/C
  If T2 !< W jump out
  M = T1 * K
  T3 = M + 1
  H = I
  M = T3 - H
  halt

Dynamic optimization
Note: Single entry; multiple exits
No loops
Need to
  Representation – bring up a level from binary code
  Applying optimizations
Not as complicated
But, cannot tolerate much overhead
– Phases in static
– Developed algorithm that can apply multiple optimizations
– Demand driven
– Limit study of dynamic optimizations

Conclusion
Need analysis in many different applications
– Virtual execution environments
  Multicore
  Wireless sensor networks
– Testing
  Testing for wireless sensor networks
  Testing for security

Identifying and Building Infrastructure

Lori’s Journey
Science/Math love: Started in chemistry at liberal arts college.
Field Trip and first cs course -> CS major.
Advisor’s strong push for grad school -> U Pitt.
Took compilers course from Mary Lou -> PhD in compiler optimization.
Big year: 10/85-married Mark. 1/86-started at Rice. 4/86-PhD
Family: The yankees returned north 3 years later!
University of Delaware: 15+ yrs. Visiting, Assistant, Associate, Full
Family: Lauren (HS senior), Lindsay (16 and driving), Matt (11)
Support: Mark, Mark, Mark,… Mary Lou, Errol, Sandee, CRA-W
Currently: software tools, testing, compiler optimization

Identifying and Building Infrastructure for Analysis Research
What kinds of infrastructure do you need?
How to identify and build infrastructure
Examples

What kinds of infrastructure do you need?
Analysis Research and Evaluation
– Analysis Framework Software
– Hardware
– Workloads
– People
– Labspace

Identifying Analysis Framework Software
Determine Goals
  - Short term
  - Long term
Specify Requirements
  - Needed
  - Desired (Prioritized)
Search for Possibilities
  - Peers/Experts
  - Technical papers
  - Internet search
Try Them Out
  - Install + Run Tests
  - Read docs
  - Examine code
  - Try small task
Weigh Choices
  - Meet Requirements?
  - Ease of Use/Change?...

Example: Identifying Analysis Framework Software
Determine Goals
  Evaluate new analysis on Java
  On its own and in client tool
Specify Requirements
  - Needed: call graph, cfg, chg
  Realistic environment/apps
  Easy to extend/build client tools
Search for Possibilities
  - Common environment is IDE, Java. Eclipse platform
Try Them Out
  - Install + explore
  - Write a small plugin
  - Use call graph, chg, cfg for small task
Weigh Choices
  - Learning curve vs Available analyses, realism

Implementing Your Analysis
Once you have decided on an infrastructure:
– Think Reuse!! Think modularity!!
– Think prototype, but extensible and scalable
– Test, test, test - try to be systematic
– Debug – not easy

Example: Implementing My NL Analysis
Build small modular components -> reuse
– Analyzing method signatures to extract NL
– Building program representation for NL
– Traversing program rep
– Building program rep for IR
Design reps to avoid loss of info -> reuse
– Id’s and their roles and locations in code
– Verb, Direct object rep -> extensible

Managing the Evolving Software Infrastructure
Managing change over time and people
– CVS, subversion
Tracking tasks, bugs, deadlines/goals
– TRAC, bugzilla, gforge
Maintaining documentation
– JavaDocs, Doxygen
Testing, testing, testing
– Unit, system, regression -- test suites
Sounds like software engineering…

Selecting Appropriate Hardware
Determine Goals
  - Short term
  - Long term
Specify Requirements
  - Needed
  - Desired (Prioritized)
Search for Possibilities
  - Peers/Experts
  - System Staff
Weigh Choices
  - Meet Requirements?
  - Costs within budget?
  - Need to ask for money?

Gathering Good Workloads
Kind of Evaluation Desired
  Controlled Experiment
  Case Studies
Representative, Synthesized, Benchmarks
Try to reduce threats to validity of experiments:
  - varied/similar
  - domain
  - size
  - complexity/form
  - known and available to others

Example: Gathering Good Workloads
Kind of Evaluation Desired
Research Questions:
  - How effective is our FindConcept Tool versus other code search tools? (versus lexical search and IR) (precision and recall)
  - How does the human effort compare?
Case Studies
Representative
Try to reduce threats to validity of experiments:
  - varied/similar
  - domain
  - size
  - complexity/form
  - known/available to others
Sourceforge:
  - very large
  - many cvs updates (active)
  - varied in domain

Identifying Strong Students
Teach a compiler or program analysis course regularly
Identify students from the course
Ideal = Creative
  + quick to understand analysis
  + good problem solver
  + hard working
  + good coder
  + good communicator
  + good writer
  + shows initiative and interest in analysis
Some training will be required. Start Small. Create a Pipeline.

Building a Working Lab Space
Needs:
  - one workspace/computer/storage per grad student
  - room for growth and undergrad researchers
  - current technology – minimize old machines – maintenance?
  - lab printer
  - lab library of research-oriented background books
Make it somewhere students want to work:
  - posters/pictures/plants
  - open and pleasant – microwave, fridge, coffeepot…?
  - all needed resources/supplies easily available
  - conference room for larger research meetings

Static Program Analysis: Evaluating Your Analysis

A Typical Program Analysis Research Project
Step 1: Design your analysis
– Reason about safety
– Reason about complexity in terms of program size
Step 2: Implement your analysis
– Hard!
– Complex and difficult to test, debug and verify – a real problem
Step 3: EVALUATE!

Evaluation of a Compiler Analysis
Strict requirements for the analysis
– Safety is crucial!
  An unsafe analysis may miss an execution path, and result in a change of the original program
– Analysis time (and space)
  Constrained by normal compilation time
Objective success criteria
– Show improvement in execution time
– Show reduction in memory footprint

Evaluation of a Compiler Analysis
Established benchmarks
– E.g., the SPEC JVM98
  General evaluation of Java compilers
– E.g., the DaCapo benchmark suite
  Memory intensive Java applications
Ideally you would say something like this: “our analysis increases compilation time by at most 10%, and results in speed-up of 10-16% on the SPEC JVM98 benchmarks”.

Evaluation of an Analysis for a Software Tool
Requirements for the analysis - not so strict
– Relaxing safety is OK!
– Analysis time (space) is not so crucial
  Developers would definitely wait if the analysis finds “difficult” bugs such as data races and memory leaks
Success criteria - not so objective
– Precision – low noise
– Practicality – practical time/space requirements, works on 100K LOC
– Usability of tool
– Bugs found – absolutely sure

Evaluation of an Analysis for a Software Tool
Precision is CRUCIAL – noise is really bad!
– E.g., there are 10 buffer overflow bugs in program P
– Safe analysis A issues 1000 warnings, 10 are real and 990 are false positives
– Unsafe analysis B issues 13 warnings, 8 are real and 5 are false positives
– Analysis B is much more useful than analysis A!
Absolute precision – done more and more often
– Choose a subset of analyzed programs
– Manually find the real solution
– Compare with analysis solution
  Precision – how much noise is there?
  Recall (if the analysis is unsafe) – how much did it miss?
– E.g., a.m(): The real solution a: {B}, a safe analysis solution a: {A,B,C}. Precision - 67% noise!
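The numbers on the slide above can be checked with a small calculation. The following is a minimal sketch, not part of the original talk: the function name precision_recall and the placeholder bug and warning names are assumptions made for illustration. Precision is taken as the fraction of reported items that are real, recall as the fraction of real items that were reported, and "noise" as one minus precision.

```python
def precision_recall(reported, real):
    """Return (precision, recall) of a set of reported items against the real ones."""
    true_positives = len(reported & real)
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(real) if real else 1.0
    return precision, recall


# Buffer-overflow example: 10 real bugs in program P (names are placeholders).
real_bugs = {f"bug{i}" for i in range(10)}

# Safe analysis A: 1000 warnings, 10 real and 990 false positives.
warnings_a = real_bugs | {f"false_a{i}" for i in range(990)}

# Unsafe analysis B: 13 warnings, 8 real and 5 false positives.
warnings_b = {f"bug{i}" for i in range(8)} | {f"false_b{i}" for i in range(5)}

precision_a, recall_a = precision_recall(warnings_a, real_bugs)
precision_b, recall_b = precision_recall(warnings_b, real_bugs)

# Receiver-class example for a.m(): real solution {B}, safe analysis solution {A, B, C}.
precision_am, recall_am = precision_recall({"A", "B", "C"}, {"B"})
noise_am = 1 - precision_am

print(f"A: precision {precision_a:.1%}, recall {recall_a:.0%}")
print(f"B: precision {precision_b:.1%}, recall {recall_b:.0%}")
print(f"a.m(): noise {noise_am:.0%}")
```

Under these definitions, analysis A has 1% precision with full recall, while analysis B has roughly 62% precision and 80% recall, which is the sense in which B is more useful to a developer even though it misses two bugs.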
Evaluation of an Analysis for a Software Tool
Finding a benchmark set
– Depends on analysis application
– Large programs
– Diverse programs, as many as is feasible
– Publicly available: sourceforge.org
– Look at benchmark suites in published work!
Ideally, you will have a large set of diverse programs, and will show acceptable absolute precision (low false positive rate) and practical cost

Comparison with Existing Analysis
Well-known program analysis problems
– “Haven’t we solved that problem yet?”
– E.g., Points-to analysis
Design a new analysis A
Compare with best known analysis B
– Show improvement in one or more of: analysis cost, analysis precision

What Not to Do
Propose a new analysis without any evaluation
– E.g., “We describe this new great points-to analysis.”
Design your own metric, different from established metrics
– E.g., “We propose a novel points-to analysis A and points-to analysis A’ which improves on A. Therefore, both A and A’ are great.”
Use non-standard benchmark
– Report on a subset: the ones for which the analysis works

Questions

An Example: Devirtualization in Object-oriented Programs
Polymorphism and dynamic dispatch
  class A { void m() { … } }
  class B extends A { void m() { … } }
  class C extends A { void m() { … } }
– Virtual call: a.m() is dispatched at run-time, based on the class of the receiver, A, B or C
– Powerful: enables modern software engineering
– But costly: 13% of time spent in virtual dispatch
Analysis: “only B objects ever flow to a”
Optimization: virtual call a.m() => direct call to B.m()

Uses of Static Program Analysis
Software engineering tools
– Static debugging, verification, security
  Uncover difficult errors and security flaws
– Testing
  Evaluate and improve test suites
– Software understanding
  Calling structure
  Complex dependences
  Change impacts
Many (unexplored) areas of application

Static Debugging
Analyze the program and look for bugs
– Memory and pointer bugs: memory leaks, null pointer dereferences, double frees,
  buffer overflows, etc.
– Concurrency bugs: races, deadlocks
– Issue warnings
Microsoft:
– PREFix and PREfast tools in use since 2000
– Many new tools developed
IBM:
– Tools for static debugging of production J2EE
– Tools for security auditing of J2EE

Software Testing
Coverage-based testing
– Improve test quality with good “coverage”
– E.g., cover all possible receiver classes at virtual calls
Step 1: analyze the tested code
– What are all possible receiver classes at virtual calls?
  a.m(): Analysis: “only B objects ever flow to a”
Step 2: insert instrumentation
Step 3: run tests and report coverage
– What were the receiver classes actually observed while running the tests?
[Figure: the results of Step 1 and Step 3 are compared]

Software Understanding
Navigate through calling structure
[Figure: calling structure with X.n() and B.m()]
Reason about (im)mutability
– Powerful, central to imperative programming
– Many real bugs are due to unintended mutability
  Q1: is a method A.m(…) side-effect free?
  Q2: can a private field in a class A be mutated by untrusted clients of A (i.e., classes that use A)?
Reason about other quality attributes
Find code related to a change, etc.
Reverse engineering

Program Representations
  if (x<y) then z=1; else z=2;
Control Flow Graph
– Linear
– 3-address statements
– Flow of control
[Figure: node "if x<y" with a T edge to "z=1" and an F edge to "z=2"]
Syntax Tree
– Tree
– Parse tree of the program
[Figure: root "If-then-else" with children Expr (x<y), Stmt (z=1), Stmt (z=2)]
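To make the control flow graph representation concrete, here is a minimal sketch that encodes the if-then-else example as a small CFG and runs a reaching-definitions analysis over it by fixed-point iteration, in the spirit of the data-flow framework from the first part of the talk. The encoding is an assumption made for illustration, not the representation of any particular infrastructure: the node numbering, the added "join" node, and the names nodes, succs and reaching_definitions are all hypothetical.

```python
from collections import deque

# Hypothetical CFG encoding: node id -> (label, variable defined there or None).
nodes = {
    1: ("if x<y", None),
    2: ("z=1", "z"),   # T branch
    3: ("z=2", "z"),   # F branch
    4: ("join", None),  # added join node after the if-then-else
}
succs = {1: [2, 3], 2: [4], 3: [4], 4: []}


def reaching_definitions(nodes, succs):
    """Compute, per node, the set of definitions (var, node) that reach its exit."""
    preds = {n: [] for n in nodes}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    out = {n: frozenset() for n in nodes}
    worklist = deque(nodes)
    while worklist:
        n = worklist.popleft()
        # Merge: a definition reaches n if it reaches the exit of any predecessor.
        in_n = frozenset().union(*(out[p] for p in preds[n]))
        var = nodes[n][1]
        if var is None:
            new_out = in_n
        else:
            # Transfer function: kill earlier definitions of var, generate (var, n).
            new_out = frozenset(d for d in in_n if d[0] != var) | {(var, n)}
        if new_out != out[n]:
            out[n] = new_out
            worklist.extend(succs[n])  # successors must be re-examined
    return out


out_sets = reaching_definitions(nodes, succs)
for n in sorted(out_sets):
    print(n, nodes[n][0], sorted(out_sets[n]))
```

The iteration terminates because each set can only grow and there are finitely many definitions. At the join node both definitions of z reach, which reflects the conservative "any path" merge used for this particular analysis; an available-expressions analysis would intersect instead.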