Type Checking/Polymorphism

CS 2104: Type Checking / Polymorphism
Reading: Chapters 3.1, 4.3, 4.4.1
Dr. Abhik Roychoudhury
Adapted from Goh Aik Hui’s lecture notes.
Overview
• Types
    - Motivation for typed languages
• Issues in Type Checking
    - How to type check?
    - How to cater for Polymorphism
    - Type Equivalence
    - When to type check?
    - Strong and Weak typed languages

1. Motivation for typed languages

Untyped Languages: perform any operation on any data.
• Example: Assembly
    movi 5 r0      // Move integer 5 (2's complement) to r0
    addf 3.6 r0    // Treat bit representation in r0 as a floating point
                   // representation and add 3.6 to it.
• Result? You can be sure that r0 does not contain 8.6!
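
The effect of the two instructions can be imitated in Python. This is only a sketch: it assumes a 32-bit little-endian register and IEEE 754 single precision, neither of which the slide specifies, and the final addition is done in Python's double precision.

```python
import struct

# Store integer 5 as a 32-bit two's-complement word (like: movi 5 r0)
word = struct.pack("<i", 5)
# Reinterpret the same four bytes as a single-precision float
as_float = struct.unpack("<f", word)[0]
# Add 3.6 to the reinterpreted value (like: addf 3.6 r0)
result = as_float + 3.6

print(as_float)   # a tiny denormal value of about 7e-45, not 5.0
print(result)     # approximately 3.6, not 8.6
```

The bit pattern of the integer 5 denotes a number very close to zero when read as a float, so the sum stays at about 3.6.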


(+) Flexibility: "I can do anything I want to and you can't stop me"
(–) Ease of error checking is lost (programs are prone to errors, especially huge ones): "I am human, my brain is limited, I can't remember and monitor everything."
Typed Languages:

A type represents a set of values. Programs / procedures /
operators are functions from an input type to an output type.

Type Checking is the activity of ensuring that the operands / arguments of an operator / procedure are of compatible type. This is done by using a set of rules for associating a type with every expression in the language. (These rules are known as the type system.)


A type error results when an operator is applied to an operand of
inappropriate/incompatible type.

Output of a type system:
• There are type errors (wrt the type system) => Program is NOT type-safe.
• There are no type errors (wrt the type system) => Program is type-safe.
When the type checker (TC) says that there are type errors:
• Program really has errors: usually true.
• Program really does not have errors: possible. TC errs on the conservative side.
When TC says that there are no type errors: ?????
The program MAY still have errors:
1. It may still have type errors due to unsafe features of a language. This is due to bad type system design.
2. It may have logic errors. This serves to show that type errors are but one of the many kinds of errors you encounter.
Typed Languages
(+) Error detection
(+) Program documentation
(–) Loss of flexibility (but it's OK, I don't lose much freedom anyway since I don't usually program in that way in the first place. I gain more than what I lose).
2. Issues in Type Checking
• How to type-check?
• How to cater for polymorphism?
• What is your definition of "compatible type"?
• When to perform type checking?
• Is your language strongly or weakly typed?
2.1 How to type-check?
Definition:
• Type statements are of the form:
    <expr> : <type>
  meaning that an expression <expr> 'is-of-the-type' (the ':' symbol) <type>.
• Examples:
    3 : int
    3+4 : int
    3.14 : real
    "abc" : String
    while (x < 5) {x++;} : Stmt
Definition:
• Type rules are of the form:

    e1 : t1    e2 : t2    …    en : tn
    ---------------------------------- (rule name)
            f e1 e2 … en : t

  where each ei : ti is a type statement, n ≥ 0.
• The rule is interpreted as "IF e1 is of type t1 and … and en is of type tn THEN f e1 e2 … en is of type t."
• This is similar to the rules we saw when we studied semantics.
Examples of type rules:
• Rule for constants:

    -------    -------    -------
    1 : int    2 : int    3 : int

• Rule for addition:

    E1 : int    E2 : int
    -------------------- (+)
       E1 + E2 : int

• Rule for boolean comparison:

    E1 : int    E2 : int
    -------------------- (==)
      E1 == E2 : Bool

• Rule for assignment statement:

    x : T    E : T
    -------------- (:=)
    x := E; : Stmt

• Rule for if-statement:

    E1 : Bool    S1 : Stmt    S2 : Stmt
    ----------------------------------- (if)
      if (E1) {S1} else {S2} : Stmt
Given the program:
    int x;
    …
    x := x+1;
    …
…and given the rules for constants, (+), (==), (:=) and (if).

A program/expression is type-safe if we can construct a derivation tree to give a type for that program/expression.

                 x : int    1 : int
                 ------------------ (+)
    x : int          x+1 : int
    ------------------------------- (:=)
           x:=x+1; : Stmt
Given the program:
    int x; float y;
    …
    if (x == 3) {
        y := x;
    } else {
        x := x+1;
    }
    …
…and given the rules for constants, (+), (==), (:=) and (if).

A program/expression is type-safe if we can construct a derivation tree to give a type for that program/expression. Follow the rules! Try to build the tree.

    x : int    3 : int
    ------------------ (==)
       x==3 : Bool

            ???
    ------------------ (:=)
       y:=x; : Stmt

  (The rule (:=) needs y and x to have the same type T, but y is declared float and x is declared int.)

                 x : int    1 : int
                 ------------------ (+)
    x : int          x+1 : int
    ------------------------------- (:=)
           x:=x+1; : Stmt

    x==3 : Bool    y:=x; : Stmt    x:=x+1; : Stmt
    --------------------------------------------- (if)
      if (x==3) {y:=x;} else {x:=x+1;} : Stmt

Cannot build tree => Not type safe
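
Building the derivation tree bottom-up can be sketched as a small recursive checker. The tuple encoding of programs, the names type_of and TypeCheckError, and the environment env are assumptions made for this illustration; they are not part of the lecture.

```python
# Hypothetical encoding: ("num", 3), ("var", "x"), ("+", e1, e2),
# ("==", e1, e2), (":=", "x", e), ("if", cond, s1, s2)

class TypeCheckError(Exception):
    pass

def type_of(node, env):
    """Return the type derived for node, or raise if no rule applies."""
    kind = node[0]
    if kind == "num":                       # rule for constants
        return "int"
    if kind == "var":                       # declared type of the variable
        return env[node[1]]
    if kind in ("+", "=="):                 # rules (+) and (==)
        t1, t2 = type_of(node[1], env), type_of(node[2], env)
        if t1 == "int" and t2 == "int":
            return "int" if kind == "+" else "Bool"
        raise TypeCheckError(f"{kind} needs int operands, got {t1} and {t2}")
    if kind == ":=":                        # rule (:=): one type T for both sides
        tx, te = env[node[1]], type_of(node[2], env)
        if tx == te:
            return "Stmt"
        raise TypeCheckError(f"cannot assign {te} to {node[1]} : {tx}")
    if kind == "if":                        # rule (if)
        tc = type_of(node[1], env)
        ts1, ts2 = type_of(node[2], env), type_of(node[3], env)
        if tc == "Bool" and ts1 == "Stmt" and ts2 == "Stmt":
            return "Stmt"
        raise TypeCheckError("if needs a Bool condition and Stmt branches")
    raise TypeCheckError(f"no rule applies to {kind}")

env = {"x": "int", "y": "float"}            # int x; float y;
incr = (":=", "x", ("+", ("var", "x"), ("num", 1)))   # x := x+1;
bad = ("if", ("==", ("var", "x"), ("num", 3)),
       (":=", "y", ("var", "x")),           # y := x;
       incr)

print(type_of(incr, env))                   # Stmt
try:
    type_of(bad, env)
except TypeCheckError as err:
    print("not type-safe:", err)            # not type-safe: cannot assign int to y : float
```

Each branch of type_of corresponds to one rule, and a raised TypeCheckError corresponds to a tree that cannot be completed.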
2.2 How to cater for Polymorphism

• Polymorphism = poly (many) + morph (form)
• Polymorphism is the ability of a data object to take on or assume many different forms.
• Polymorphism can be categorised into 2 types:
    - Ad-hoc Polymorphism (discussed now)
    - Universal Polymorphism
        · Parametric (discussed with Functional Programming)
        · Inclusion (discussed later in the lecture)
Cardelli and Wegner's classification (1985)

    Polymorphism
        Ad-Hoc
            Coercion
            Overloading
        Universal
            Parametric
            Inclusion

• Ad-Hoc polymorphism is obtained when a function works, or appears to work, on several different types (which may not exhibit a common structure) and may behave in unrelated ways for each type.
• Universal polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure.
2.2 Polymorphism – Coercion
COERCION
A coercion is an operation that converts the type of an expression to another type. It is done automatically by the language compiler.
(If the programmer manually forces a type conversion, it's called casting.)

    E : int
    --------- (Int-Float Coercion)
    E : float

    int x; float y;
    ...
    y := x;
    ...
Example of the use of COERCION
    int x; float y;
    …
    if (x == 3) {
        y := x;
    } else {
        x := x+1;
    }
    …
Add in a new rule to the rules for constants, (+), (==), (:=) and (if):

    E : int
    --------- (Int-Float Coercion)
    E : float

The derivation tree:

    x : int    3 : int
    ------------------ (==)
       x==3 : Bool

                 x : int
                 --------- (Coercion)
    y : float    x : float
    ---------------------- (:=)
        y:=x; : Stmt

                 x : int    1 : int
                 ------------------ (+)
    x : int          x+1 : int
    ------------------------------- (:=)
           x:=x+1; : Stmt

    x==3 : Bool    y:=x; : Stmt    x:=x+1; : Stmt
    --------------------------------------------- (if)
      if (x==3) {y:=x;} else {x:=x+1;} : Stmt
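
The effect of adding the Int-Float Coercion rule on the (:=) rule can be sketched in a few lines. The function name assignable and its coerce flag are hypothetical, chosen only for this illustration.

```python
def assignable(var_type, expr_type, coerce=False):
    """Can rule (:=) be applied to x := E with x : var_type and E : expr_type?"""
    if var_type == expr_type:            # plain (:=): both premises share one type T
        return True
    # Int-Float Coercion: E : int may also be given the type float
    return coerce and expr_type == "int" and var_type == "float"

print(assignable("float", "int"))               # False: y := x is rejected
print(assignable("float", "int", coerce=True))  # True: x is coerced to float
print(assignable("int", "float", coerce=True))  # False: no float-to-int rule was added
```

Only the int-to-float direction is accepted, because that is the only coercion rule that was added.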
Coercion
    Widening
    Narrowing
• Widening coercion converts a value to a type that can include (at least approximations of) all of the values of the original type. Example: int to float.
• Narrowing coercion converts a value to a type that cannot store (even approximations of) all of the values of the original type. Example: float to int.
• Widening is safe most of the time. It can be unsafe in certain cases.
• Narrowing is unsafe. Information is lost during conversion of type.
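
One case where widening from int to float is unsafe is a large integer that has no exact floating-point representation. The sketch below assumes a 32-bit IEEE 754 float, which the slides do not specify; to_float32 is a hypothetical helper.

```python
import struct

def to_float32(n):
    """Round the integer n to the nearest 32-bit float, as an int-to-float coercion would."""
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]

print(to_float32(5))           # 5.0: small integers survive exactly
print(to_float32(16777217))    # 16777216.0: 2**24 + 1 is only approximated
```

The result is still an approximation of the original value, which is why the coercion counts as widening, but the exact integer is lost.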
Coercions
(+) Increase flexibility in programming
• Example:
    float x,y,z;
    int a,b,c;
  If I have no coercions, and I intend to add y and a and store in x, then writing…
    x = y + ((float) a);
  …is too much of a hassle.
• Therefore coercion can be useful.
(–) Decrease reliability (error detection)
• Example:
    float x,y,z;
    int a,b,c;
  If I have coercions and I intend to add x and y and store in z, but I accidentally write…
    z = x + a;
  …then my error will go undetected because the compiler will simply coerce the a to a float.
• Therefore coercion can lead to problems.
2.2 Polymorphism – Overloading
OVERLOADING
An overloaded operation has different meanings, and different types, in different contexts.

    E1 : int    E2 : int
    -------------------- (+-int)
       E1 + E2 : int

    E1 : float    E2 : float
    ------------------------ (+-float)
        E1 + E2 : float
Example of the use of Overloading
    int x,y,z; float a,b,c;
    …
    if (x == 3) {
        x := y + z;
    } else {
        a := b + c;
    }
    …
Add in a new rule to the rules for constants, (+), (==), (:=) and (if):

    E1 : float    E2 : float
    ------------------------ (+-float)
        E1 + E2 : float

The derivation tree:

    x : int    3 : int
    ------------------ (==)
       x==3 : Bool

                 y : int    z : int
                 ------------------ (+)
    x : int          y+z : int
    ------------------------------- (:=)
           x:=y+z; : Stmt

                   b : float    c : float
                   ---------------------- (+-float)
    a : float           b+c : float
    ------------------------------------- (:=)
             a:=b+c; : Stmt

    x==3 : Bool    x:=y+z; : Stmt    a:=b+c; : Stmt
    ----------------------------------------------- (if)
       if (x==3) {x:=y+z;} else {a:=b+c;} : Stmt
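
Choosing between the two rules for + amounts to a lookup on the operand types. The table ADD_RULES and the function type_of_add are hypothetical names for this sketch, which assumes no coercion rule is present.

```python
# One entry per typing rule for +: operand types -> result type
ADD_RULES = {
    ("int", "int"): "int",          # rule (+-int)
    ("float", "float"): "float",    # rule (+-float)
}

def type_of_add(t1, t2):
    """Pick the rule for E1 + E2 from the operand types; None means no rule applies."""
    return ADD_RULES.get((t1, t2))

print(type_of_add("int", "int"))      # int
print(type_of_add("float", "float"))  # float
print(type_of_add("int", "float"))    # None: mixed operands would need a coercion
```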
Overloading
(+) Increase flexibility in programming
• Examples are when the user wants to use an operator to express similar ideas.
• Example:
    int a,b,c;
    int p[10], q[10], r[10];
    int x[10][10], y[10][10], z[10][10];
    a = b * c; // integer multiplication
    p = a * q; // Scalar multiplication
    x = y * z; // Matrix multiplication
• Therefore overloading is good.
Overloading
(–) Decrease reliability (error detection)
• Examples are when the user intends to use the operator in one context, but accidentally uses it in another.
• Example: In many languages, the minus sign is overloaded to both unary and binary uses.
    x = z-y and x = -y
  will both compile. What if I intend to do the first, but accidentally leave out the 'z'?
• Similarly, in C, we can have a situation when
    x = z&y and x = &y
  will both compile. Is overloading good?
Overloading
(–) Decrease reliability (error detection)
• Even for common operations, overloading may not be good.
• Example:
    int sum, count;
    float average;
    ...
    average = sum / count;
• Since sum and count are integers, integer division is performed first before the result is coerced to float.
• That's why Pascal has div for integer division and / for floating point division.
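
Python, like Pascal, keeps the two divisions apart (// and /), which makes the difference easy to see. The values 7 and 2 are made up for the illustration, and total stands in for the slide's sum.

```python
total, count = 7, 2

# What the C fragment computes: integer division first, then coercion to float
average_int_div = float(total // count)
# What a separate floating-point division operator computes
average_float_div = total / count

print(average_int_div)     # 3.0
print(average_float_div)   # 3.5
```

For negative operands Python's // rounds down while C's integer division truncates toward zero, so the sketch mirrors C only for non-negative values.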
2.2 Inclusion Polymorphism
• Q: Is the subclass regarded as a subtype of the parent class?
• Yes: Inclusion Polymorphism (Sub-typing)
    class A {…}
    class B extends A {…}
  Note that B ⊆ A (Inclusion)
    A a = new B();
    A a = new A();
  Polymorphism: a may refer to an A or to a B.
• Some people call it the IS-A relationship between parent and derived class.
    "class Table extends Furniture"
    Table IS-A Furniture.
    Table ⊆ Furniture
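
The IS-A relationship can be tried out in a short sketch. Table and Furniture come from the slide; the method describe and the function show are hypothetical, and Python does not enforce the Furniture annotation at run time.

```python
class Furniture:
    def describe(self):
        return "a piece of furniture"

class Table(Furniture):              # Table IS-A Furniture
    def describe(self):
        return "a table"

def show(item: Furniture) -> str:
    # item is declared as Furniture but may refer to an instance of any subclass
    return item.describe()

print(issubclass(Table, Furniture))  # True
print(show(Furniture()))             # a piece of furniture
print(show(Table()))                 # a table
```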
• Variables are polymorphic, since they can refer to the declared class and to subclasses too.
• Requirement (Do you know why?):
    - Subclass must INHERIT EVERYTHING from the base class.
    - Subclass must NOT MODIFY ACCESS CONTROL of the base class methods/data.
• That's why the C++ Inclusion Polymorphism definition adds a 'public' to the derived class, since a private derived class modifies access control of base class methods/data.
2.3 Type Equivalence
    type                    // type definitions
      Queue = array [1..10] of integer;
      Stack = array [1..10] of integer;
      Tree  = Stack;
    var                     // variable declarations
      a : Queue;
      b : Stack;
      c : Tree;
      d : array [1..10] of integer;
    begin
      a := b; // Is this allowed?
              // Meaning to say "Are a and b
              // the same type?"
      a := c; // Is this allowed?
      a := d; // Is this allowed?
      b := c; // Is this allowed?
    end.
If you had said "yes" to most of it, chances are that you are adopting structural equivalence. If you had said "no" most of the time, then it is likely you are adopting name equivalence.
Difference between type names and anonymous type names.
• The type of a variable is described either:
    - through a type name: (1) a name defined using a type definition command (eg. 'type' for Pascal, 'typedef' for C), or (2) one of the primitive numeric types (eg. int, float);
    - or directly through a type constructor (eg. array-of, record-of, pointer-to). In this case, the variable has an anonymous type name.
Example
    type                    // type definitions
      Q = array [1..10] of integer;
      S = array [1..10] of integer;
      T = S;
    var                     // variable declarations
      a : Q;
      b : S;
      c : T;
      d : array [1..10] of integer;
    begin
      a := b; // Is this allowed?
              // Meaning to say "Are a and b
              // the same type?"
      a := c; // Is this allowed?
      a := d; // Is this allowed?
      b := c; // Is this allowed?
    end.
• Q, S, T are type names.
• d has a type, but d does not have a type name.
• When are two types equivalent (≡)?
    Rule 1: For any type name T, T ≡ T.
    Rule 2: If C is a type constructor and T1 ≡ T2, then C T1 ≡ C T2.
    Rule 3: If it is declared that type name = T, then name ≡ T.
    Rule 4 (Symmetry): If T1 ≡ T2, then T2 ≡ T1.
    Rule 5 (Transitivity): If T1 ≡ T2 and T2 ≡ T3, then T1 ≡ T3.
• What rules do you want to use?
• Structural Equivalence will use all the rules (Rules 1 to 5) to check for type equivalence.
• (Pure) Name Equivalence will use only the first rule. Unless the two variables have the same type name, they will be treated as different types.
• Declarative Equivalence will leave out the second rule.
Example
    type                    // type definitions
      Q = array [1..10] of integer;
      S = array [1..10] of integer;
      T = S;
    var                     // variable declarations
      a,x : Q;
      b : S;
      c : T;
      d : array [1..10] of integer;
      e : array [1..10] of integer;
    begin
      a := x; // Is this allowed?
              // Meaning to say "Are a and x
              // the same type?"
      a := b; // Is this allowed?
      a := c; // Is this allowed?
      a := d; // Is this allowed?
      b := c; // Is this allowed?
      d := e; // Is this allowed?
    end.

    R1: For any type name T, T ≡ T.
    R2: If C is a type constructor and T1 ≡ T2, then C T1 ≡ C T2.
    R3: If it is declared that type name = T, then name ≡ T.
    R4 (Symmetry): If T1 ≡ T2, then T2 ≡ T1.
    R5 (Transitivity): If T1 ≡ T2 and T2 ≡ T3, then T1 ≡ T3.

Allowed under Structural (SE), Name (NE) and Declarative (DE) Equivalence?

    Assignment   SE    NE    DE
    a := x       yes   yes   yes
    a := b       yes   no    no
    a := c       yes   no    no
    a := d       yes   no    no
    b := c       yes   no    yes
    d := e       yes   no    no
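
The three notions of equivalence can be compared on this example with a small checker. This is a sketch under one reading of the rules: each written occurrence of an array type is a distinct object, and declarative equivalence follows declarations but never compares constructors structurally. The class Array, the tables DEFS and VARS, and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True, eq=False)   # eq=False: each written occurrence stays distinct
class Array:
    lo: int
    hi: int
    elem: object                    # a type name (str) or another Array

# type Q = array [1..10] of integer; S = array [1..10] of integer; T = S;
DEFS = {"Q": Array(1, 10, "integer"), "S": Array(1, 10, "integer"), "T": "S"}
# var a,x : Q; b : S; c : T; d and e each have their own anonymous array type
VARS = {"a": "Q", "x": "Q", "b": "S", "c": "T",
        "d": Array(1, 10, "integer"), "e": Array(1, 10, "integer")}

def resolve(t):
    """Follow declarations (R3) until a primitive name or a constructor is reached."""
    while isinstance(t, str) and t in DEFS:
        t = DEFS[t]
    return t

def structural(t1, t2):             # all five rules; index bounds must match too
    t1, t2 = resolve(t1), resolve(t2)
    if isinstance(t1, Array) and isinstance(t2, Array):
        return (t1.lo, t1.hi) == (t2.lo, t2.hi) and structural(t1.elem, t2.elem)
    return t1 == t2

def name_equiv(t1, t2):             # R1 only: identical type names
    return isinstance(t1, str) and t1 == t2

def declarative(t1, t2):            # R1, R3, R4, R5: R2 is left out
    r1, r2 = resolve(t1), resolve(t2)
    return r1 is r2 or (isinstance(r1, str) and r1 == r2)

for lhs, rhs in [("a", "x"), ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]:
    t1, t2 = VARS[lhs], VARS[rhs]
    print(f"{lhs} := {rhs}", structural(t1, t2), name_equiv(t1, t2), declarative(t1, t2))
```

Each printed line gives the SE, NE and DE answers for one assignment, with True standing for "yes".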
Name Equivalence
• Easy to implement checking, since we need only compare the name.
• Very restrictive, inflexible.
    type idxtype = 1..100;
    var count : integer;
        index : idxtype;
Structure Equivalence
• Harder to implement since entire structures must be compared. Other issues to consider: eg. arrays with same sizes but different subscripts. Are they the same type? (Similar for records and enumerations.)
• More flexible, yet the flexibility can be bad too.
    type celsius = real;
         fahrenheit = real;
    var x : celsius;
        y : fahrenheit;
    ...x := y; // Allowed?
• Different languages adopt different rules. And the rules may change for one language (people can change their minds too!)
• Pascal
    - Before 1982: unknown.
    - ISO 1982: Declarative Equivalence.
    - ISO 1990: Structural Equivalence.
• C: Structural Equivalence, except for structs and unions, for which C uses declarative equivalence. If the two structs are in different files, then C goes back to structural equivalence.
• C++: Name Equivalence
• Haskell/SML: Structural Equivalence.
2.4 When to perform Type Checking?
When is the variable bound to the type?
• Compile-Time (Static Type Binding)
• Run-Time (Dynamic Type Binding)
When can I type check?
• Static type binding: In theory, you can choose to type check at compile time or run-time. In practice, languages try to do it as much statically as possible. Eg. SML, Pascal
• Dynamic type binding: No choice but to do dynamic type checking. Eg. JavaScript, APL
• Static Type Checking: done at compile time.
    (+) Done only once
    (+) Earlier detection of errors
    (–) Less program flexibility (fewer shortcuts and tricks)
• Dynamic Type Checking: done at run time.
    (–) Done many times
    (–) Late detection of errors
    (–) More memory needed, since we need to maintain type information of all the current values in their respective memory cells.
    (–) Slows down overall execution time, since extra code is inserted into the program to detect type errors.
    (+) Program flexibility (allows you to 'hack' dirty code).
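
Python is itself dynamically type-checked, so late detection of errors can be observed directly. The function describe and its arguments are made up for this sketch.

```python
def describe(count, verbose):
    if verbose:
        # int + str is a type error, but it is detected only when this line runs
        return count + " items"
    return count

print(describe(3, False))        # 3: the faulty branch never runs, so no error is reported
try:
    describe(3, True)
except TypeError as err:
    print("run-time type error:", err)
```

A static type checker would reject the addition before the program ever runs, whichever branch is taken.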
2.5 Strong Type Systems
• A programming language is defined to be strongly typed if type errors are always detected STATICALLY.
• A language with a strong-type system only allows type-safe programs to be successfully compiled into executables. (Otherwise, the language is said to have a weak type system.)
• Programs of strong-type systems are guaranteed to be executed without type error. (The only error left to contend with is logic error.)
    Language       Strongly typed?   Why?
    Fortran        No                Allows variable of one type to refer to value of another type through EQUIVALENCE keyword.
    Ada            No                Library function UNCHECKED_CONVERSION suspends type checking.
    Modula-3       No                Same as Ada through use of keyword LOOPHOLE.
    C, C++         No                1. Forced conversion of type through type casting. 2. Union types can compromise type safety.
    Java           No                Type casting.
    Pascal         Almost            Variant records can compromise type safety.
    SML, Haskell   Yes               All variables have STATIC TYPE BINDING.
2.5 Weak-Type Systems: Variant Recs
• Variant Records in C (via the union keyword) compromise type safety.
    ...
    typedef union { int X;
                    float Y;
                    char Z[4]; } B;
    ...
    B P;
• The variant parts all have overlapping (same) L-value!!!
• Problems can occur. What happens to the code below?
    P.X = 142;
    printf("%o\n", P.Z[3]);
• All 3 data objects have the same L-value and occupy the same storage.
• No enforcement of type checking.
• Poor language and type system design
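
The overlap can be reproduced from Python with the standard ctypes module, which lays out a Union the way a C compiler would. In this sketch Z is declared as unsigned bytes rather than char so that the values print as numbers; that change, and the 4-byte int, are assumptions of the illustration.

```python
import ctypes

class B(ctypes.Union):
    # All three fields start at the same address, as in the C union B
    _fields_ = [("X", ctypes.c_int),
                ("Y", ctypes.c_float),
                ("Z", ctypes.c_ubyte * 4)]

P = B()
P.X = 142
print(list(P.Z))   # the four bytes of the integer 142; their order depends on endianness
print(P.Y)         # the same bits read as a float: a tiny value, not 142.0
```

Nothing stops the program from writing through X and reading through Y or Z, which is exactly the loophole described above.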
• Variant Records in Pascal try to overcome C's deficiency. They have a tagged union type.
    type whichtype = (inttype, realtype);
    type uniontype = record
      case V : whichtype of
        inttype  : (X: integer);
        realtype : (Y: real);
    end;
• But the compiler usually doesn't check the consistency between the variant and the tag. So we can 'subvert' the tagged field:
    var P: uniontype;
    P.V := inttype;
    P.X := 142;
    P.V := realtype; // type safety compromised