Type Inference II
David Walker
COS 441

Type Inference

Goal: Given an unannotated program, find its type or report that it does not type check.

Overview:
  - generate type constraints (equations) from unannotated programs
  - solve the equations

Constraint Generation

Typing rules for annotated programs:

    G |-- e : t

  Note: given G and e, there is at most one type t.

Typing rules for unannotated programs:

    G |-- u ==> e : t, q

  Note: given G and u, there may be many possible types for u;
  the many possibilities are represented using type schemes.
  u has a type t if q has a solution.

  Remember: fun f (x) = f x

Rule comparison

Implicit equality: easy to implement because t can’t contain variables.

    G |-- e1 : bool    G |-- e2 : t    G |-- e3 : t
    ------------------------------------------------
    G |-- if e1 then e2 else e3 : t

Equality: harder to implement because t2, t3 can contain variables that may be further constrained elsewhere.

    G |-- u1 ==> e1 : t1, q1    G |-- u2 ==> e2 : t2, q2    G |-- u3 ==> e3 : t3, q3
    ---------------------------------------------------------------------------------
    G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3 : a,
          q1 U q2 U q3 U {t1 = bool, a = t2, a = t3}

Non-local Constraints

Does this type check?

    fun f (x) = fun g (y) = (if true then x else y, ...)
It depends:

    fun f (x) = fun g (y) = (if true then x else y, x + y)
    fun f (x) = fun g (y) = (if true then x else y, x + (if y then 3 else 4))

Non-local Constraints

But remember, it was easy to check when types were declared in advance:

    fun f (x:int) = fun g (y:int) = (if true then x else y, x + y)
    fun f (x:int) = fun g (y:bool) = (if true then x else y, x + (if y then 3 else 4))

Solving Constraints

A solution to a system of type constraints is a substitution S.
  - S |= q iff applying S makes the left- and right-hand sides of each equation in q equal

A solution S is better than T if it is “more general”.
  - intuition: S makes fewer or less-defined substitutions (leaves more variables alone)
  - T <= S if and only if T = U o S for some U

Most General Solutions

S is the principal (most general) solution of a constraint q if
  - S |= q (it is a solution)
  - if T |= q then T <= S (it is the most general one)

Lemma: If q has a solution, then it has a most general one.

We care about principal solutions since they will give us the most general types for terms.

Principal Solutions

Principal solutions give rise to the most general reconstruction of typing information for a term:
  - fun f(x:a):a = x is a most general reconstruction
  - fun f(x:int):int = x is not

Unification

Unification: An algorithm that provides the principal solution to a set of constraints (if one exists).
  - If a solution exists, it will be principal.

Unification

Unification systematically simplifies a set of constraints, yielding a substitution.
  - during simplification, we maintain (S,q)
      - S is the solution so far
      - q are the constraints left to simplify
  - Starting state of unification process: (I,q)
      - the identity substitution I is most general
  - Final state of unification process: (S, { })

Unification Machine

We can specify unification as a transition system: (S,q) -> (S’,q’)

Base types & simple variables:

    ----------------------------
    (S,{int=int} U q) -> (S, q)

    ------------------------------
    (S,{bool=bool} U q) -> (S, q)
    ------------------------
    (S,{a=a} U q) -> (S, q)

Unification Machine

Functions:

    ---------------------------------------------------------------------
    (S,{s11 -> s12 = s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

Variable definitions:

    ------------------------------------- (a not in FV(s))
    (S,{a=s} U q) -> ([a=s] o S, [s/a]q)

    ------------------------------------- (a not in FV(s))
    (S,{s=a} U q) -> ([a=s] o S, [s/a]q)

Occurs Check

What is the solution to {a = a -> a}?
  - There is none! (Remember your homework)
  - The occurs check detects this situation:

    ------------------------------------- (a not in FV(s))   <-- occurs check
    (S,{s=a} U q) -> ([a=s] o S, [s/a]q)

Irreducible States

Recall: final states have the form (S, { })

Stuck states (S,q) are such that every equation in q has the form:
  - int = bool
  - s1 -> s2 = s (s is a base type, so not a function type)
  - a = s (s contains a and is not a itself)
  - or is symmetric to one of the above

Stuck states arise when constraints are unsolvable.

Termination

We want unification to terminate (to give us a type reconstruction algorithm).
In other words, we want to show that there is no infinite sequence of states

    (S1,q1) -> (S2,q2) -> ...

Termination

We associate an ordering with constraints: q < q’ if and only if
  - q contains fewer variables than q’, or
  - q contains the same number of variables as q’ but fewer type constructors (i.e., fewer occurrences of int, bool, or “->”)

This is a lexicographic ordering.
  - There is no infinite decreasing sequence of constraints.
  - To prove termination, we must demonstrate that every step of the algorithm reduces the size of q according to this ordering.

Termination

Lemma: Every step reduces the size of q.

Proof: By cases (i.e., induction) on the definition of the reduction relation.
    ----------------------------
    (S,{int=int} U q) -> (S, q)

    ---------------------------------------------------------------------
    (S,{s11 -> s12 = s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

    ------------------------------
    (S,{bool=bool} U q) -> (S, q)

    ------------------------
    (S,{a=a} U q) -> (S, q)

    ------------------------------------- (a not in FV(s))
    (S,{a=s} U q) -> ([a=s] o S, [s/a]q)

Complete Solutions

A complete solution for (S,q) is a substitution T such that
  1. T <= S
  2. T |= q
intuition: T extends S and solves q

A principal solution T for (S,q) is complete for (S,q) and
  3. for all T’ such that 1. and 2. hold, T’ <= T

Properties of Solutions

Lemma 1: Every final state (S, { }) has a complete solution.
  - It is S:
      - S <= S
      - S |= { }

Properties of Solutions

Lemma 2: No stuck state has a complete solution (or any solution at all).
  - it is impossible for a substitution to make the necessary equations equal:
      - int ≠ bool
      - int ≠ t1 -> t2
      - ...

Properties of Solutions

Lemma 3: If (S,q) -> (S’,q’) then
  - T is complete for (S,q) iff T is complete for (S’,q’)
  - T is principal for (S,q) iff T is principal for (S’,q’)
  - in the forward direction, this is the preservation theorem for the unification machine!

Summary: Unification

By termination, (I,q) ->* (S,q’) where (S,q’) is irreducible. Moreover:

If q’ = { } then
  - (S,q’) is final (by definition)
  - S is a principal solution for q. Why?
      - S is principal for (S,q’) (by lemma 1)
      - S is principal for (I,q) (by lemma 3)
      - Since S is principal for (I,q) and every substitution T satisfies T <= I, S is a principal solution for q.

Summary: Unification (cont.)

... Moreover:

If q’ is not { } (and (I,q) ->* (S,q’) where (S,q’) is irreducible) then
  - (S,q’) is stuck. Consequently (by lemma 2), (S,q’) has no complete solution.
  - By lemma 3, even (I,q) has no complete solution, and therefore q has no solution at all.

Summary: Type Inference

Type inference algorithm.
Given a context G and an untyped term u:
  - Find e, t, q such that G |-- u ==> e : t, q
  - Find the principal solution S of q via unification
  - Apply S to e, i.e., our solution is S(e)
  - if no solution exists, there is no reconstruction
  - S(e) contains schematic type variables a, b, c, etc. that may be instantiated with any type
  - Since S is principal, S(e) characterizes all reconstructions.
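
Sketch: the unification machine in code

The transition rules of the unification machine can be sketched as a short Python loop over a state (S, q). This is a minimal illustration under assumptions that are not part of the lecture: base types are encoded as the strings "int" and "bool", and the names Var, Fun, free_vars, subst and unify are hypothetical. The sketch reports failure as soon as it meets one unsolvable equation rather than waiting for every remaining equation to be stuck; this should give the same answer, since such an equation cannot be repaired by later substitutions.

```python
from dataclasses import dataclass

# Assumed encoding (not from the slides): base types are the strings
# "int" and "bool"; Var is a type variable; Fun(arg, res) is arg -> res.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fun:
    arg: object
    res: object

def free_vars(t):
    """FV(t): the set of type-variable names occurring in t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Fun):
        return free_vars(t.arg) | free_vars(t.res)
    return set()

def subst(name, s, t):
    """[s/name]t: replace every occurrence of variable `name` in t by s."""
    if isinstance(t, Var):
        return s if t.name == name else t
    if isinstance(t, Fun):
        return Fun(subst(name, s, t.arg), subst(name, s, t.res))
    return t

def unify(q):
    """Run (I, q) ->* (S, { }). Return S as a dict, or None if unsolvable."""
    S = {}                      # the identity substitution I
    q = list(q)                 # equations are (left, right) pairs
    while q:
        l, r = q.pop()
        if l == r and not isinstance(l, Fun):
            continue            # rules int=int, bool=bool, a=a
        if isinstance(l, Fun) and isinstance(r, Fun):
            # function rule: split into argument and result equations
            q += [(l.arg, r.arg), (l.res, r.res)]
        elif isinstance(l, Var) or isinstance(r, Var):
            a, s = (l, r) if isinstance(l, Var) else (r, l)
            if a.name in free_vars(s):
                return None     # occurs check fails, e.g. a = a -> a
            # new state: ([a=s] o S, [s/a]q)
            S = {n: subst(a.name, s, t) for n, t in S.items()}
            S[a.name] = s
            q = [(subst(a.name, s, x), subst(a.name, s, y)) for x, y in q]
        else:
            return None         # int = bool, or function type = base type
    return S

a, b, c = Var("a"), Var("b"), Var("c")

# Constraints for (if true then x else y, x + y) with x : b and y : c
ok = unify([("bool", "bool"), (a, b), (a, c), (b, "int"), (c, "int")])

# Constraints for (if true then x else y, x + (if y then 3 else 4))
bad = unify([("bool", "bool"), (a, b), (a, c), (b, "int"), (c, "bool")])

# The occurs-check example {a = a -> a}
loop = unify([(a, Fun(a, a))])
```

Here ok maps a, b and c to int, while bad and loop are None because the second program forces int = bool and the third equation fails the occurs check.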