Pointer and Alias Analysis Kinds of alias info Aliases: two expressions that denote same mutable memory location Points-to analysis • at each program point, calculate set of p→x bindings, if p points to x • two related problems: Introduced through • pointers • may points-to: p may point to x • call-by-reference • must points-to: p must point to x • array indexing Storage shape analysis • C unions, Fortran common, equivalence • at each program point, calculate an abstract description of the structure of pointers etc. Applications of alias analysis: • improved side-effect analysis: if assign to one expression, what other expressions are modified? Alias-pair analysis • if certain modified or not modified, not a problem • at each program point, calculate set of (expr1,expr2) pairs, if expr1 and expr2 reference the same memory • if uncertain, things can get ugly • may and must alias-pair versions • eliminate redundant loads/stores & dead stores (CSE & dead assign elim, for pointer ops) • automatic parallelization of code manipulating data structures Points-to analysis is simple Storage shape analysis more abstract • ... Craig Chambers Alias-pairs analysis more general than points-to analysis, but more complicated 133 CSE 501 Craig Chambers 134 An intraprocedural points-to analysis May-point-to scalars At each program point, calculate set of p→x bindings, if p points to x Domain: Pow(Var × Var) CSE 501 Flow functions: Outline: p := &x • define may version first, then consider must version MAY-PTsucc = MAY-PTpred - {p→*} ∪ {p→x} • develop algorithm in increasing stages of complexity • pointers only to scalars p := q • add pointers to pointers MAY-PTsucc • add pointers to dynamically-allocated storage • add pointers to array elements = MAY-PTpred - {p→*} ∪ {p→t | q→t ∈ MAY-PTpred} Meet function: union Craig Chambers 135 CSE 501 Craig Chambers 136 CSE 501 Must-point-to Using alias info How to define must-point-to analysis? E.g. reaching definitions Option 1: analogous to may-point-to, but as must problem At each program point, calculate set of s:x bindings, if x might get its definition from stmt s • e.g. intersection is meet operation Simple flow functions: Option 2: interpretation of may-point-to results s: *p := x • if p may point to only x, then p must point to x: MUST-PT = { p→x | p→x ∈ MAY-PT and p→y ∈ MAY-PT ⇒ y=x} RDsucc = RDpred ∪ {*:z | p→z ∈ MUST-PTpred} {s:z | p→z ∈ MAY-PTpred} • what if p points to nil? p assigned an integer? s: x := *p RDsucc = RDpred ∪ Craig Chambers 137 CSE 501 Reaching “right hand sides” {*:x} {s:x} Craig Chambers 138 CSE 501 Example A variation on reaching definitions that passes definitions through copies ➀ x := 3 s:x in set if x might get its definition from rhs of stmt s, skipping through uninteresting copies and pointer loads where possible Can use reaching right-hand sides to construct def/use chains that skip through copies, e.g. for better constant propagation ➁ p := &x ➂ y := 5 ➃ q := &y ➄ q := &x Flow functions: s: x := y RDsucc = RDpred ∪ ➅ *p := 7 {*:x} {s’:x | s’:y ∈ RDpred} ➆ z := *q ➇ *q := 4 ➈ w := *p s: x := *p RDsucc = RDpred ∪ Craig Chambers {*:x} {s’:x | p→z ∈ MAY-PTpred ∧ s’:z ∈ RDpred} 139 CSE 501 Craig Chambers 140 CSE 501 Adding pointers to pointers Example Flow functions: int x, y, z; p := *q MAY-PTsucc x := 5 int *p, *q; = MAY-PTpred - {p→*} ∪ {p→t | q→r ∈ MAY-PTpred ∧ r→t ∈ MAY-PTpred} y := 6 int **R; p := &x q := &y R := &p *p := q MAY-PTsucc = MAY-PTpred - {r→* | p→r ∈ MUST-PTpred} ∪ {r→t | p→r ∈ MAY-PTpred ∧ q→t ∈ MAY-PTpred } *R := q *q := 7 q x := p p := 8 := *R *q := 9 z Craig Chambers 141 CSE 501 Adding pointers to dynamically-allocated memory := *p Craig Chambers 142 CSE 501 Example p := new T ➀ lst := new Cons Issue: each execution creates a new location ➁ p := lst Idea: generate new var to stand for new location • make Var domain unbounded • newvar: return next unused element of Var ➂ t := new Cons Flow function: ➃ *p := t s: p := new T MAY-PTsucc Craig Chambers = MAY-PTpred 143 ➄ p := t - {p→*} ∪ {p→newvar} CSE 501 Craig Chambers 144 CSE 501 A monotonic, finite approximation Adding pointers to array elements Can’t create a new variable each time analyze statement Array index expressions can generate aliases • lattice is infinitely tall if Var domain is infinite! • not a monotonic flow function! a[i] aliases b[j] if: • a aliases b and i equals j One solution: create a special summary node for each new stmt • a and b overlap, and ... Domain = Pow((Var+Stmt) × (Var+Stmt)) Can have pointers to array elements: p := &a[i] s: p := new T MAY-PTsucc = MAY-PTpred - {p→*} ∪ {p→locs} Can have pointer arithmetic, for array addressing: p := &a[0]; ...; p++ Alternatives: • summary node for each type T How to model arrays? • k-limited summary • maintain distinct nodes up to k links removed from root vars, then merge together • ... Craig Chambers 145 CSE 501 • could treat whole array as big monolithic location • could try to reason about array index expressions ⇒ array dependence analysis (later) Craig Chambers 146 CSE 501