CS 2104 Prog. Lang. Concepts
Dr. Abhik Roychoudhury
School of Computing

Today's lecture
- Finish the discussion on type checking initiated in the last lecture
- Discuss the role of expressions in programming languages

Issues in Type Checking
- How to type-check?
- How to cater for polymorphism?
- What is your definition of "compatible type"?
- When to perform type checking?
- Is your language strongly or weakly typed?

2.4 When to perform Type Checking?
When is the variable bound to the type?
- Compile time (static type binding), e.g. SML, Pascal
  When can I type check? In theory, you can choose to type check at compile time or at run time. In practice, languages try to do as much of it statically as possible.
- Run time (dynamic type binding), e.g. JavaScript, APL
  When can I type check? No choice but to do dynamic type checking.

Static type checking: done at compile time.
(+) Done only once
(+) Earlier detection of errors
(-) Less program flexibility (fewer shortcuts and tricks)

Dynamic type checking: done at run time.
(-) Done many times
(-) Late detection of errors
(-) More memory needed, since we need to maintain type information for all the current values in their respective memory cells.
(-) Slows down overall execution, since extra code is inserted into the program to detect type errors.
(+) Program flexibility (allows you to 'hack' dirty code)

2.5 Strong Type Systems
- A programming language is defined to be strongly typed if type errors are always detected STATICALLY.
- A language with a strong type system only allows type-safe programs to be successfully compiled into executables. (Otherwise, the language is said to have a weak type system.)
- Programs in a strongly typed language are guaranteed to execute without type errors.
  (The only errors left to contend with are logic errors.)

2.5 Strong Type Systems: is the language strongly typed?
- Fortran: No. Allows a variable of one type to refer to a value of another type through the EQUIVALENCE keyword.
- Ada: No. The library function UNCHECKED_CONVERSION suspends type checking.
- Modula-3: No. Same as Ada, through use of the keyword LOOPHOLE.
- C, C++: No. (1) Forced conversion of type through type casting. (2) Union types can compromise type safety.
- Java: No. Type casting.
- Pascal: Almost. Variant records can compromise type safety.
- SML, Haskell: Yes. All variables have STATIC TYPE BINDING.

2.5 Weak Type Systems: Variant Records
Variant records in C (via the union keyword) compromise type safety.

    ...
    typedef union { int X; float Y; char Z[4]; } B;
    ...
    B P;

The variant parts all have overlapping (the same) L-value! Problems can occur. What happens in the code below?

    P.X = 142;
    printf("%d\n", P.Z[3]);

- All 3 data objects have the same L-value and occupy the same storage.
- No enforcement of type checking.
- Poor language and type system design.

Variant records in Pascal try to overcome C's deficiency. They have a tagged union type.

    type whichtype = (inttype, realtype, stringtype);
    type uniontype = record
      case V : whichtype of
        inttype    : (X: integer);
        realtype   : (Y: real);
        stringtype : (Z:…)
    end

But the compiler usually doesn't check the consistency between the variant and the tag. So we can 'subvert' the tagged field:

    var P: uniontype;
    P.V := inttype;
    P.X := 142;
    P.V := realtype;   { type safety compromised }

CS 2104 - Prog. Lang. Concepts: Expressions
Lecturer: Dr. Abhik Roychoudhury
School of Computing

What is an expression?
- The notion of value is central to programming.
- The allowed set of possible values of a variable denotes its type.
- Program variables get instantiated to values at run time.
  - Integer variables to integer values
  - String variables to arrays of characters, etc.
- We previously called values r-values, to distinguish values from addresses.
With this perspective, we could define an expression simply as:
An expression is a formal description of a value.

Examples
    2
    2*5
    F(4) + 2*5              // Need to define function F
    A<B
    A < B \/ C = D          // A,B,C,D are variables
    P(A, B) \/ Q(C, D)      // P,Q are predicates
                            // more on this later in the course

Expression evaluation
- Evaluating an expression is reducing it to a simpler but equivalent expression, e.g. evaluate 2+3 to 5.
- Each reduction step in evaluation involves taking a function application and replacing it by its equivalent value.
- E.g. to evaluate 2+3 we replace the application of the + function by the value resulting from addition.
- A reduction step is represented by =>, e.g. 2+3 => 5

Prefix, Infix, Postfix

    Notation   Position of function     Examples
    Prefix     Left of argument(s)      sqrt(16), f(3,4)
    Infix      Between two arguments    3 f 4, 3 + 4
    Postfix    Right of arguments       16 sqrt, 3 4 f

Postfix notation
- Widely used in real compilers for programming languages.
- The programmer's expression is first converted to an equivalent postfix expression.
- We can then very easily generate code for evaluating the postfix expression.
- Evaluation of postfix expressions has a very direct correspondence with manipulation of stacks.

Postfix evaluation - Example
Expression: 3 5 + 8 6 - *

    Code      Stack contents (after the step)
    (start)   <>
    push 3    <3>
    push 5    <3,5>
    add       <8>
    push 8    <8,8>
    push 6    <8,8,6>
    sub       <8,2>
    mul       <16>

Postfix Evaluation - Algorithm
1. Start with the leftmost element in the expression.
2. If it is not a function, push it onto the stack, move right in the expression, and repeat step 2.
3. If it is a function of k arguments:
   (a) Take the k elements at the top of the stack
   (b) Apply the function to these k elements
   (c) Replace the k elements by the result of the function application
   (d) Move right in the expression and repeat step 2.
4. When the end of the expression is reached, the value of the expression is at the top of the stack.

Function Definitions
- So far, our example expressions have shown only built-in arithmetic functions, e.g. +, *
- The notion of expressions is of course not restricted to the use of built-in functions.
- There are several ways of defining a function:
  - As an equation
- Let us now study this in more depth.

Functions as Equations
- Defined as: f(x, y) = x + y * y
- Standard format used in mathematics. The function is defined by the above equation.
- The equation itself has no value. Only the expression on the r.h.s. of the equality has a value.
- If f/2 appears in an expression such as 1 + 2 * f(x, y), then the occurrence of f/2 must be reduced to the value represented by the r.h.s. of the equation of f.

Evaluating functions as equations
Assume f(x1,…,xn) = e. An application of function f/n in an expression e1 is reduced as follows:
1. Match actual arguments with formal arguments
2. Substitute actual arguments in e
3. Evaluate the result
4. Replace the occurrence of f/n in e1 with the result

Example: 1 + 2 * f(2 + 1, 1 + 6) where f(x, y) = x + y * y
    Matching:     x is 2 + 1, y is 1 + 6
    Substitution: ((2 + 1) + (1 + 6) * (1 + 6))
    Evaluate:     f(2 + 1, 1 + 6) = 52
    Replace:      1 + 2 * 52

Side Effects
- A function has a side effect if it changes its parameter(s) and/or global variables.
- Example: A + f(A). Suppose execution of f changes the value of A.
- Consider the following code:

    A = 10
    B = A + f(A)

- Suppose f reduces its argument by 1, and returns this value.
  - Left-to-right evaluation: 10 + 9 = 19
  - Right-to-left evaluation: 9 + 9 = 18

Side Effects on Global Variables

    int A = 5;
    int fun1() {
        A = 17;
        return 3;
    }
    void fun2() {
        A = A + fun1();
    }

What is the value of the global variable A in fun2? It depends on the order of expression evaluation: if A is read before fun1 is called, A becomes 5 + 3 = 8; if fun1 is called first, A becomes 17 + 3 = 20.

Two solutions
- Disallow expressions with functional side effects.
  - Need extra parameters for the functions.
- Allow only one evaluation order (say left-to-right).
  - Will disallow compiler optimizations which re-order operands.
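The dependence on evaluation order in the A + f(A) example can be sketched in Python. This is a minimal illustration, not code from the lecture: the names `Cell`, `f`, `left_to_right`, and `right_to_left` are hypothetical, and a mutable cell stands in for the variable A because a Python integer cannot be changed in place by the function it is passed to.

```python
class Cell:
    """A mutable variable, so that a function can change its argument."""

    def __init__(self, value):
        self.value = value


def f(cell):
    """Reduce the argument by 1 and return the new value (a side effect)."""
    cell.value -= 1
    return cell.value


def left_to_right(cell):
    """Evaluate A + f(A) by reading A first, then calling f."""
    left = cell.value
    right = f(cell)
    return left + right


def right_to_left(cell):
    """Evaluate A + f(A) by calling f first, then reading the changed A."""
    right = f(cell)
    left = cell.value
    return left + right


print(left_to_right(Cell(10)))  # 10 + 9
print(right_to_left(Cell(10)))  # 9 + 9
```

Both functions compute the same source expression; only the order in which the two operands are evaluated differs, which is exactly the freedom a compiler loses if the language fixes one evaluation order.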
Relational Expressions
- Use relational operators and operands of various types
- Evaluate to some boolean representation
- Operator symbols used vary somewhat among languages (!=, /=, .NE., <>, #)

Boolean Expressions
- Operands are boolean and the result is boolean
- Operators:

    FORTRAN 77   FORTRAN 90   C     Ada
    .AND.        and          &&    and
    .OR.         or           ||    or
    .NOT.        not          !     not
                                    xor

- C has no boolean type; it uses the int type with 0 for false and nonzero for true

Operator Precedence
Precedence of all operators (highest first):
- Pascal: not, unary -
          *, /, div, mod, and
          +, -, or
          relops
- Ada:    **
          *, /, mod, rem
          unary -, not
          +, -, &
          relops
          and, or, xor
- C, C++, and Java have over 50 operators and 17 different levels of precedence

Short Circuit Evaluation
- Evaluating an expression without evaluating all the operands, e.g.

    (a > b) and (c > 5)

  If we know that a > b is false, then there is no need to determine whether (c > 5) is true.
- Pascal: does not use short-circuit evaluation. Problem: table look-up

    index := 1;
    while (index <= length) and (LIST[index] <> value) do
        index := index + 1

  If value is not in LIST, then the loop eventually evaluates LIST[length + 1], an out-of-range subscript, because both operands are always evaluated.
- C, C++, and Java: use short-circuit evaluation for the usual Boolean operators (&& and ||), but also provide bitwise Boolean operators that are not short circuit (& and |)
- Ada: the programmer can specify either (short-circuit is specified with and then and or else)
- FORTRAN 77: short circuit, but any side-affected place must be set to undefined
- Short-circuit evaluation exposes the potential problem of side effects in expressions, e.g.

    (a > b) || (b++ / 3)

Programs as expressions
- The general notion of expression is so powerful and fundamental that:
  - Any program could be seen as an expression
  - Any computation could be seen as expression evaluation
- In the IMP language (our toy imperative language) we discussed both arithmetic and boolean expressions, where
  - Arithmetic expressions evaluate to numbers
  - Boolean expressions evaluate to true/false
- We will now walk through different varieties of expressions, where program statements (e.g. if-then-else) will also be viewed as expressions.

1. Constants
- Semantically, any expression whose name directly denotes its value, e.g. 2, true, 'cs2104'
- Constants can be of various types: integer, boolean, string
- Evaluating a constant will not lead to any further simplification.

2. Function Applications
- Application of a function to one or more arguments, e.g. 2 + 3, true /\ true
- Evaluation will involve using the definition of a function, in this case the built-in function +
- Other functions can be assumed to be given in an equational form, as discussed before.

3. Assignment statement
- Assigns a value v to a variable X.
- Can be seen as an expression with operand X returning v.
- However, such expressions have side effects.
- Problem of evaluation order if an assignment expression is part of a bigger expression:

    A + (A := C)

4. Conditional Expressions
- Of the form: if B then E1 else E2
- What is the value of if B then E1 else E2?
  - Evaluate B
  - If B evaluates to true, then the value of E1 is the value of the conditional expression
  - If B evaluates to false, then the value of E2 is the value of the conditional expression
- We could arbitrarily nest if-then-else to simulate other imperative program constructs such as switch statements.

5. Let expressions
- Example: let square(x) = x*x in square(square(2))
- Of the form: let function_definition in sub_expression
- The function definition defines a function f in equational form.
- The sub-expression contains function applications of f.
- We assume that the definition of f is non-recursive.

5. Let expressions (continued)
- Evaluation proceeds by replacing applications of f in the sub-expression with the definition of f.
- Example:

    let square(x) = x*x in square(square(2))
    => square(2) * square(2)
    => 2 * 2 * 2 * 2
    = 16

- Let expressions allow for function definitions. Their evaluation is the same as macro-expansion.

Nested Let
- In fact the let expression can be more general. A let can define variables and/or functions.
- Since let is itself an expression, it can be nested.
- Example:

    let x = 1 let y = x + 1 let f(z) = z + 2 in x*y*f(3)
    => let y = 2 let f(z) = z + 2 in 1*y*f(3)
    => let f(z) = z + 2 in 1*2*f(3)
    => 1*2*(3+2)
    = 10

6. Recursive Let

    letrec fact(n) = if zero(n) then 1 else n * fact(n-1) in fact(10)

- Extension of let allowing recursive definitions.
- Evaluation of fact(10) proceeds recursively.
- Letrec expressions can again be nested.
- Using letrec we can model recursion and iteration.

Summary
- An expression is a representation of a value. Evaluation of an expression returns that value.
- Arithmetic and boolean expressions are very common in programming languages.
- Evaluation of expressions may involve side effects.
- Evaluation of an expression may not require all its operands to be evaluated.
- Any program can be seen as an expression. Any computation can be seen as expression evaluation.
- We will discuss much more about expressions when we study functional programming (next lecture).
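The let and letrec expressions from this lecture can be sketched as a tiny evaluator in Python. This is only an illustration under stated assumptions: the tuple-based encoding of expressions and the names `evaluate`, `letvar`, `letfun`, `SQUARE`, `NESTED`, and `FACT` are invented here, and the sketch looks names up in an environment of bindings rather than performing the macro-expansion described in the slides. Both approaches give the same values for these examples.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Reduce an expression (a nested tuple) to its value in environment env."""
    kind = expr[0]
    if kind == "num":                       # constant
        return expr[1]
    if kind == "var":                       # variable reference
        return env[expr[1]]
    if kind == "op":                        # built-in function application
        _, sym, left, right = expr
        return OPS[sym](evaluate(left, env), evaluate(right, env))
    if kind == "zero":                      # the zero(n) test used by fact
        return evaluate(expr[1], env) == 0
    if kind == "if":                        # only one branch is evaluated
        _, cond, then_e, else_e = expr
        return evaluate(then_e if evaluate(cond, env) else else_e, env)
    if kind == "letvar":                    # let x = e in body
        _, name, bound, body = expr
        return evaluate(body, {**env, name: evaluate(bound, env)})
    if kind in ("letfun", "letrec"):        # let f(params) = fbody in body
        _, name, params, fbody, body = expr
        closure = {"params": params, "body": fbody, "env": env}
        new_env = {**env, name: closure}
        if kind == "letrec":
            closure["env"] = new_env        # the function can see itself
        return evaluate(body, new_env)
    if kind == "call":                      # application of a defined function
        _, name, args = expr
        fn = env[name]
        values = [evaluate(a, env) for a in args]
        call_env = {**fn["env"], **dict(zip(fn["params"], values))}
        return evaluate(fn["body"], call_env)
    raise ValueError(f"unknown expression kind: {kind}")


x, y, z, n = ("var", "x"), ("var", "y"), ("var", "z"), ("var", "n")

# let square(x) = x*x in square(square(2))
SQUARE = ("letfun", "square", ["x"], ("op", "*", x, x),
          ("call", "square", [("call", "square", [("num", 2)])]))

# let x = 1 let y = x + 1 let f(z) = z + 2 in x*y*f(3)
F_BODY = ("op", "+", z, ("num", 2))
PRODUCT = ("op", "*", ("op", "*", x, y), ("call", "f", [("num", 3)]))
NESTED = ("letvar", "x", ("num", 1),
          ("letvar", "y", ("op", "+", x, ("num", 1)),
           ("letfun", "f", ["z"], F_BODY, PRODUCT)))

# letrec fact(n) = if zero(n) then 1 else n * fact(n-1) in fact(10)
REC_CALL = ("call", "fact", [("op", "-", n, ("num", 1))])
FACT_BODY = ("if", ("zero", n), ("num", 1), ("op", "*", n, REC_CALL))
FACT = ("letrec", "fact", ["n"], FACT_BODY, ("call", "fact", [("num", 10)]))

print(evaluate(SQUARE, {}), evaluate(NESTED, {}), evaluate(FACT, {}))
```

In this sketch the only difference between `letfun` and `letrec` is whether the function body can see its own name, which mirrors the lecture's assumption that a plain let definition is non-recursive.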