Notes by Ruggero Morselli

CMSC 631 – Fall 2002
Ruggero Morselli
CMSC 631 – PAPER REVIEW
Jens Palsberg
Type-Based Analysis and Applications, PASTE 2001
Program Analysis is a general term that covers several problems. Examples include
alias analysis (determine whether two program expressions can both refer to the same
memory location), liveness analysis (determine whether the value of a variable at a certain
point of the program may be read before it gets overwritten or destroyed), and method
reachability (determine which procedures a polymorphic procedure call can actually
invoke at run time), just to mention a few. Most of these problems are relevant to
optimizing compilation, but there are other applications, such as program understanding
and correctness checking.
Type-Based Program Analysis is the collective name of the techniques for solving these
problems that exploit the type derivation of the program, assuming that it is written in a
language with some concept of static types and that the program type checks. This paper
discusses the advantages of type-based program analysis, mentioning several examples.
The first sample problem is flow analysis for lambda-calculus: given an expression e in
the lambda-calculus language, determine, for each subexpression e’ of e, which functions
f, whose definitions appear in e, e’ can evaluate to. The paper briefly recalls 0-CFA, a
non-type-based technique for this analysis. Then it discusses three different type-based
analysis techniques for the same problem.
1. Types as discriminators: a simple type system is defined for the lambda calculus.
Then the type inference algorithm is run on the input expression. Each e’ can
evaluate to any function f that has the same type as e’.
2. Type and effect system: augment the simple type system with qualified function
types. Each function type is qualified with the set of function values it can possibly
take. Type inference then produces the result of the analysis directly, in the form of
these qualifiers.
3. Sparse flow graph: this method does not use the type system directly. It runs an
algorithm similar to 0-CFA that apparently ignores types. The advantage is
that, if the expression type checks against the original type system and the size of
the types of the subexpressions is O(1) in the size of the program, then this
algorithm produces the output graph in linear, rather than cubic, time.
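The first technique, types as discriminators, can be illustrated with a minimal sketch. This is not code from the paper: the term classes (Var, Lam, App), the function labels, and the sample program are all hypothetical, and explicit parameter type annotations stand in for the type inference step.

```python
from dataclasses import dataclass

# Simple types: a base type is a string such as "int"; a function type is
# the tuple ("->", argument_type, result_type).
INT_TO_INT = ("->", "int", "int")

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    label: str          # hypothetical label naming this function definition
    param: str
    param_type: object  # annotation used here instead of type inference
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def type_of(e, env):
    """Return the simple type of e under env (a dict from names to types)."""
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Lam):
        body_type = type_of(e.body, {**env, e.param: e.param_type})
        return ("->", e.param_type, body_type)
    if isinstance(e, App):
        fn_type = type_of(e.fn, env)
        arg_type = type_of(e.arg, env)
        if not isinstance(fn_type, tuple) or fn_type[1] != arg_type:
            raise TypeError("ill-typed application")
        return fn_type[2]
    raise ValueError("unknown expression")

def lambdas_by_type(e, env, table):
    """Record the label of every lambda occurring in e under its type."""
    if isinstance(e, Lam):
        table.setdefault(type_of(e, env), set()).add(e.label)
        lambdas_by_type(e.body, {**env, e.param: e.param_type}, table)
    elif isinstance(e, App):
        lambdas_by_type(e.fn, env, table)
        lambdas_by_type(e.arg, env, table)
    return table

def may_evaluate_to(sub, env, table):
    """Labels of the functions sub may evaluate to, judged by type alone."""
    return table.get(type_of(sub, env), set())

# Sample program: (\g. (\f. f n) id) const, with n : int free.
env = {"n": "int"}
ident = Lam("id", "x", "int", Var("x"))
const = Lam("const", "y", "int", Var("n"))
apply_ = Lam("apply", "f", INT_TO_INT, App(Var("f"), Var("n")))
program = App(Lam("drop", "g", INT_TO_INT, App(apply_, ident)), const)

table = lambdas_by_type(program, env, {})
flows_to_f = may_evaluate_to(Var("f"), {**env, "f": INT_TO_INT}, table)
```

In this sample, f is only ever bound to id at run time, yet the analysis also reports const because it has the same type. This shows the imprecision that the type and effect system is meant to reduce.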
A second example of analysis problem is method reachability, which is useful to
determine which function calls can be inlined. In most languages, a polymorphic call is
made through a pointer to an object and the type of the pointer is known. The type of the
object can be only a subtype of the type of the pointer. This can be used to efficiently rule
out most of the functions of the program as possible targets of the call. Further type
information can also be used, such as checking for the presence of upcasts from a type T
to a supertype T1, to determine whether a pointer of type T1 can actually refer to an
object of type T.
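The subtype-based pruning of call targets can be sketched as follows. The class hierarchy and method names are hypothetical, single inheritance is assumed, and the upcast refinement is not modelled.

```python
# Hypothetical hierarchy: class name -> (superclass or None, methods defined there).
HIERARCHY = {
    "Shape":  (None,    {"area", "name"}),
    "Circle": ("Shape", {"area"}),
    "Square": ("Shape", {"area"}),
    "Logger": (None,    {"name"}),
}

def is_subtype(cls, ancestor):
    """True if cls equals ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = HIERARCHY[cls][0]
    return False

def resolve(cls, method):
    """Class whose definition of method an object of class cls would run."""
    while cls is not None:
        parent, methods = HIERARCHY[cls]
        if method in methods:
            return cls
        cls = parent
    return None

def possible_targets(static_type, method):
    """Methods a call through a pointer of type static_type may invoke."""
    targets = set()
    for cls in HIERARCHY:
        # Only subtypes of the pointer's static type can be the object's type.
        if is_subtype(cls, static_type):
            owner = resolve(cls, method)
            if owner is not None:
                targets.add((owner, method))
    return targets

area_targets = possible_targets("Shape", "area")
name_targets = possible_targets("Shape", "name")
```

Here a call to name through a Shape pointer has a single possible target, so it is a candidate for inlining, and Logger.name is ruled out on type grounds alone.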
Another example is alias analysis. The same techniques seen for method reachability can
be used to determine whether two pointers can actually refer to the same memory location.
Additional type-based tricks include observing that two expressions p.f and q.g can never
be aliases if f and g are different fields, because, at most, they are two distinct fields
of the same object.
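As a rough sketch of these two checks combined, assuming single inheritance and a hypothetical class hierarchy (none of the names below come from the paper):

```python
# Hypothetical hierarchy: class name -> superclass (None for a root class).
SUPER = {"Node": None, "ListNode": "Node", "TreeNode": "Node", "Point": None}

def is_subtype(cls, ancestor):
    """True if cls equals ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPER[cls]
    return False

def may_point_to_same_object(type_p, type_q):
    # Under single inheritance, one object can be referenced through both
    # pointer types only if one declared type is a subtype of the other.
    return is_subtype(type_p, type_q) or is_subtype(type_q, type_p)

def may_alias(type_p, field_f, type_q, field_g):
    """Conservative answer to: can p.f and q.g denote the same location?"""
    if field_f != field_g:
        return False  # distinct fields are distinct locations
    return may_point_to_same_object(type_p, type_q)
```

A True result only means that aliasing cannot be excluded by types. A False result is the useful one, since it lets a compiler treat the two accesses as independent.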
Other examples mentioned in the paper include confinement analysis, which uses qualified
types to determine whether the objects of a Java class with default (package) access can
actually be accessed by code in a different package; this is useful for program
understanding. Another is implementing region-based memory management for variables
in Standard ML, where the places in the program at which these regions can be allocated
and deallocated are determined statically.
Advantages of type-based analysis? First of all, it tends to be far more efficient than
analyses that ignore types. The sparse flow graph, which runs in linear rather than cubic
time under the conditions mentioned above, is just one example.
Another important point is that type-based analyses are easier to understand and therefore
simpler to design. Also it is well understood how to prove the correctness of a type
system, and this often automatically verifies the correctness of the analysis.
In conclusion, types are very useful for deducing properties of a program, which in turn
supports optimizing compilation and program understanding. This also underlines the
advantage of strongly, statically typed languages like Java over weakly typed
languages like C or dynamically typed languages like Lisp.