A Path-Sensitively Sliced Control Flow Graph Joxan Jaffar Vijayaraghavan Murali National University of Singapore 2 Motivation • Static slicing – projecting program behavior on a target • Pre-processing step for analysis, program comprehension etc. • Can we go beyond static slicing? if (c) p = 1; else p = 0; x = 0; if (p > 0) x = 1; P if (x == 0) z = 1; TARGET(z) P’ • No static slicing is effective on P • But it is equivalent to P’ wrt target z • Analysis of P’ is clearly easier • Novel concept introduced: Tree Slicing if (!c) z = 1; 3 Motivation if (c) p = 1; else p = 0; x = 0; if (p > 0) x = 1; if (x == 0) z = 1; TARGET(z) if (c) { } else { z = 1; } Post processing if (!c) z = 1; Path-sensitive expansion Slicing if (c) { p = 1; x = 0; x = 1; } else { p = 0; x = 0; z = 1; } TARGET(z) 4 Why does this work? • Slicing a program fragment is more effective when there is path-sensitivity • In general symbolic execution displays path-sensitivity as it unfolds the path leading to the fragment • But path-sensitivity may not always be useful! Example… 5 When does this not work? if (c) p = 1; else p = 0; Program Fragment where all values of z are unaffacted by p TARGET(z) if (c) { P P’ p = 1; } else { p = 0; } TARGET(z) • P’ is effectively twice the size of P but there is no benefit from the duplication due to path-sensitivity • Worse: full path-sensitivity is intractable! 6 What is the solution? • Some form of merging • At every merge point: original CFG • Not at all: full SE tree (intractable) • Exactly when path-sensitivity is no longer useful for slicing • Path-Sensitively Sliced CFG (PSSCFG) 7 Methodology • Symbolically execute the program generating SE tree and computing dependencies for slicing • Merge states if it does not cause loss of slicing (dependency) information • How to detect this? • Merging conditions… c1 c2 8 Merging Conditions • Interpolant Ψ • Logical formula stored at each sub-tree that succinctly captures all infeasible paths in it • Typically a generalization of the context 𝑐1 c1 Ψ • Witness paths 𝜔 • Set of paths in the sub-tree that contribute to its dependency information • Typically only a few paths in the sub-tree 𝜔 9 Merging Conditions • Theorem: If a new context 𝑐2 implies the interpolant 𝛹 and all witness paths 𝜔 are feasible under 𝑐2 then by exploring 𝑐2 again we would obtain exactly the same slicing information as before c1 Ψ c2 c2⇒ Ψ • Path-sensitivity under 𝑐2 is no longer useful for slicing 𝜔 • Merge 𝑐2 with 𝑐1 𝜔 10 Tree Slicing • Once SE tree is generated, apply slicing on the tree itself instead of on program statements • Rules are similar to traditional slicing using dependency sets • Must also consider infeasible paths and contexts in the tree (example soon) • Advantage: the same program fragment can be sliced from one part of the tree but not another • Not applicable to static slicing! 11 Example: generate SE tree ¬𝑐 𝑐 flag=1 x=2 flag=1 {x,d} ¬𝑑 𝑑 y=5 y=4 𝑐 ∧ 𝑓𝑙𝑎𝑔 = 1 ∧ 𝑥 = 2 ∧ ¬𝑑 ∧ 𝑦 = 5 ⇒ 𝑓𝑙𝑎𝑔 = 1 flag=1 {x,y} 𝑓𝑙𝑎𝑔 ¬𝑓𝑙𝑎𝑔 z=y+x z=y+x flag=0 x=2 flag=0 {x} y=4 y=5 flag=0 {x} 𝑓𝑙𝑎𝑔 ¬𝑓𝑙𝑎𝑔 z=x+1 z=x+1 12 Example: apply Tree Slicing ¬𝑐 𝑐 flag=1 x=2 flag=1 {x,d} ¬𝑑 𝑑 y=4 flag=1 {x,y} 𝑓𝑙𝑎𝑔 ¬𝑓𝑙𝑎𝑔 z=y+x y=5 flag=0 x=2 flag=0 {x} y=4 flag=0 {x} 𝑓𝑙𝑎𝑔 ¬𝑓𝑙𝑎𝑔 z=x+1 y=5 13 Example: “decompile” to C 𝑐 ¬𝑐 x=2 x=2 𝑑 y=4 z=y+x ¬𝑑 y=5 z=x+1 14 Example: benefits of PSSCFG P P’ • Faster verification and analysis • Less # of paths/variables • Less constraint solving for concolic testing • on P, always generates values for c and d • on P’, generates d only if c was non-zero • Completely off-the- shelf transformation! 15 Experiments • Implemented on TRACER symbolic execution framework • Manageable blow-up and generation time of PSSCFG • Compared with static slice from Frama-C 16 Experiments • Off-the-shelf concolic tester CREST gains 3.1 times speed-up (24 hrs vs 8 hrs) 17 Experiments • Off-the-shelf verifiers gain 1.5 to 5.8 times speedup • ARMC unable to terminate on original program but able to on transformed program 18 Related work • Jaffar, Murali, Navas & Santosa. Path-Sensitive Backward Slicing. In SAS’12 • Daoudi et al. Consus: A Scalable Approach to Conditioned Slicing. WCRE, 2002. • Full path enumeration • Snelting et al. Combining Slicing and Constraint Solving for Validation of Measurement Software. SAS ’96. • Less precise: Conservatively includes a statement if there is at least one feasible path to the criterion point • Balakrishnan et al. Path-sensitive analysis through infeasible-path detection and syntactic language refinement. SAS’08 • “Offline” transformation for external consumption, only removes infeasible paths (no slicing) • Jhala & Majumdar. Path Slicing. PLDI’05 • Reduces length of CE in CEGAR • Does not work with full CFG (no consideration of merges) • State-merging methods (Kuznetzov’12, McMillan’06 …) • Do not (need to) guarantee lossless-ness of slicing information 19 Future Work • Program analysis without targets • No benefit from slicing • But still, detect infeasible paths – benefit path-insensitive analyses • Alias analysis: 10% reduction in (spurious) aliasing information • Program execution • Partial evaluation – different versions for each “target” computation • Reduction in number of steps (exe time) 20 Something for everyone... Whenever you have to process (analyze, verify, test, execute…) any program wrt some target, use the PSSCFG and you are likely to get benefit! Thank You! Questions?