Chapter 5
Names, Bindings, Type Checking, and Scopes
ISBN 0-321-33025-0

Chapter 5 Topics
• Introduction
• Names
• Variables
• The Concept of Binding
• Type Checking
• Strong Typing
• Type Compatibility
• Scope and Lifetime
• Referencing Environments
• Named Constants

Imperative Languages
• Imperative languages are abstractions of the von Neumann architecture
  – Memory
    • stores both instructions and data
  – Processor
    • provides operations for modifying the contents of the memory

Memory Cells and Variables
• The abstractions in a language for the memory cells of the machine are variables.
  – In some cases, the characteristics of the abstractions are very close to the characteristics of the cells;
    • an example of this is an integer variable, which is usually represented directly in one or more bytes of memory.
  – In other cases, the abstractions are far removed from the organization of the hardware memory,
    • as with a three-dimensional array, which requires a software mapping function to support the abstraction.

Attributes of Variables
• A variable can be characterized by a collection of properties, or attributes.
• The most important of these attributes is type, a fundamental concept in programming languages.

Design Considerations of Data Type
• The design of the data types of a language requires that a variety of issues be considered.
• Among the most important of these issues are
  – the scope of variables
  – the lifetime of variables and
  – type equivalence.
• Related to the first two are the issues of
  – type checking and
  – initialization.

C-based Languages
• In this book, the author uses the phrase C-based languages to refer to C, C++, Java, and C#.

Name
• Names are one of the fundamental attributes of variables, but they have broader use than simply for variables.
• A name is a string of characters used to identify some entity in a program.
Other Usage of the Name Attribute of Variables
• Names are also associated with
  – labels
  – subprograms
  – formal parameters and
  – other program constructs.
• The term identifier is often used interchangeably with name.

Design Issues for Names
  – Are names case sensitive?
  – Are special words reserved words or keywords?

Length of Names
• Length
  – If too short, they cannot be connotative
  – Length examples:
    • FORTRAN I: maximum 6
    • C89:
      – no length limitation on its internal names
        » Only the first 31 are significant
      – external names (defined outside functions and handled by linkers)
        » are restricted to 6 characters.
    • C# and Java: no limit, and all are significant
    • C++: no limit, but implementers often impose one
      – They do this so the symbol table in which identifiers are stored during compilation need not be too large, and also to simplify the maintenance of that table.

Name Forms
• Names in most programming languages have the same form: a letter followed by a string consisting of letters, digits, and underscore characters (_).
  – In the 1970s and 1980s, underscore characters were widely used to form names.
    • E.g. my_stack
  – Nowadays, in the C-based languages, underscore form names are largely replaced by camel notation.
    • E.g. myStack

Embedded Spaces in Names
• In versions of Fortran prior to Fortran 90, names could have embedded spaces, which were ignored.
  – For example, the following two names were equivalent:
      Sum Of Salaries
      SumOfSalaries

Case Sensitivity
• In many languages, notably the C-based languages, uppercase and lowercase letters in names are distinct
  – For example, the following three names are distinct in C++: rose, ROSE, and Rose.

Drawbacks of Case Sensitivity
• detrimental to readability
  – Names that look very similar in fact denote different entities.
  – Case sensitivity violates the design principle
    • that language constructs that look the same should have the same meaning.
• detrimental to writability
  – The need to remember specific case usage makes it more difficult to write correct programs.

Special Words
• Special words in programming languages are used
  – to make programs more readable by naming actions to be performed.
  – to separate the syntactic entities of programs.
• In most languages, special words are classified as reserved words, but in some they are only keywords.
  – P.S.: In program code examples in this book, special words are presented in boldface.

Keywords
• A keyword is a word of a programming language that is special only in certain contexts.

Example of Keywords
• Fortran is one of the languages whose special words are keywords.
  – In Fortran, the word Real, when found at the beginning of a statement and followed by a name, is considered a keyword that indicates the statement is a declarative statement.
  – However, if the word Real is followed by the assignment operator, it is considered a variable name.
  – These two uses are illustrated in the following:
      Real Apple
      Real = 3.4
• Fortran compilers and Fortran program readers must recognize the difference between names and special words by context.

Reserved Words
• A reserved word is a special word of a programming language that can NOT be used as a name.

Advantages of Reserved Words
• As a language design choice, reserved words are better than keywords because the ability to redefine keywords can lead to readability problems.

Drawback Example of Keywords
• In Fortran, one could have the statements
      Integer Real
      Real Integer
  which declare the program variable Real to be of Integer type and the variable Integer to be of Real type.
• In addition to the strange appearance of these declaration statements, the appearance of Real and Integer as variable names elsewhere in the program could be misleading to program readers.

Variables
• A program variable is an abstraction of a computer memory cell or collection of cells.
• Variables can be characterized as a sextuple of attributes:
  – Name
  – Address
  – Value
  – Type
  – Lifetime
  – Scope

Benefits of Using Variables
• One of the major adjustments from machine languages to assembly languages was to replace absolute numeric memory addresses with names, making programs far more readable and thus easier to write and maintain.
• That step also provided an escape from the problem of manual absolute addressing, because the translator that converted the names to actual addresses also chose those addresses.

Address
• The address of a variable is the memory address with which it is associated.
• In many languages, it is possible for the same variable name to be associated with different addresses
  – at different places and
  – at different times in the program.

How Are Parameters and Local Variables Represented in an Object File?
A C function:
    abc(int aa)
    { int bb;
      bb=aa;
      :
      :
    }
Equivalent assembly code:
    abc:
      function prologue
      *(%ebp-4)=*(%ebp+8)
      function epilogue
Stack frame of abc (higher addresses first):
    aa                      (at %ebp+8)
    return address
    previous frame pointer  (ebp points here)
    bb                      (at %ebp-4)
P.S.: The function prologue and function epilogue are added by the compiler.

The Same Names in Different Functions Are Associated with Different Addresses
• A program can have two subprograms, sub1 and sub2, each of which defines a variable that uses the same name, say sum.
• Because these two variables are independent of each other, a reference to sum in sub1 is unrelated to a reference to sum in sub2.

The Same Names in Different Executions May Be Associated with Different Addresses
• If a subprogram has a local variable that is allocated from the run-time stack when the subprogram is called, different calls may result in that variable having different addresses.
  – These are in a sense different instantiations of the same variable.
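The point about different instantiations can be observed directly. The following is a minimal C++ sketch, not taken from the slides: the function names record_addresses and instantiations_differ are hypothetical, and the code simply records the address of the same local variable in two nested calls of one recursive function.

```cpp
#include <cstdint>

// Hypothetical helper: stores the address of its own local variable for two
// nesting depths (0 and 1) of the same recursive function.
void record_addresses(int depth, std::uintptr_t out[2]) {
    int local = depth;  // stack-dynamic: allocated anew for every call
    out[depth] = reinterpret_cast<std::uintptr_t>(&local);
    if (depth == 0) {
        record_addresses(1, out);  // the outer 'local' is still alive here
    }
}

// Returns true when the two instantiations of 'local' had different addresses.
bool instantiations_differ() {
    std::uintptr_t addr[2] = {0, 0};
    record_addresses(0, addr);
    return addr[0] != addr[1];
}
```

Both instantiations of local are alive at the same time, so they must occupy different cells; calling instantiations_differ() from a test or from main is expected to yield true.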
Memory Allocation of Local Variables
Code:
    G(int a)
    { int i;
      H(3);
      add_g: i++;
    }

    H(int b)
    { char c[100];
      int i;
      while((c[i++]=getch())!=EOF)
      { }
    }
Stack layout while H executes (high address at the top, low address at the bottom):
    G's stack frame:  i
    H's stack frame:  b
                      return address add_g
                      address of G's frame pointer
                      c[99]
                      ...
                      c[0]
                      i

L-value
• The address of a variable is sometimes called its L-value, because that is what is required when a variable appears on the left side of an assignment statement.

Aliases
• It is possible to have multiple variables that have the same address.
• When more than one variable name can be used to access a single memory location, the names are called aliases.

Disadvantages of Aliases
• Aliasing is a hindrance to readability because it allows a variable to have its value changed by an assignment to a different variable.
  – For example, if variables total and sum are aliases, any change to total also changes sum and vice versa.
  – A reader of the program must always remember that total and sum are different names for the same memory cell.
  – Because there can be any number of aliases in a program, this is very difficult in practice.
• Aliasing also makes program verification more difficult.

Ways to Create Aliases
• Aliases can be created in programs in several different ways.
• C and C++: union types.
• Two pointer variables are aliases when they point to the same memory location.
• Reference variables
• When a C++ pointer is set to point at a variable, the pointer, when dereferenced, and the variable's name are aliases.

Type
• The type of a variable determines
  – the range of values the variable can store and
  – the set of operations that are defined for values of the type.
• For example, the type int in Java specifies
  – a value range of -2147483648 to 2147483647 and
  – arithmetic operations for addition, subtraction, multiplication, division, and modulus.

Value
• The value of a variable is the contents of the memory cell or cells associated with the variable.
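The reference and pointer cases listed under Ways to Create Aliases can be sketched in a few lines of C++. This is an illustrative sketch only; the function name alias_demo is hypothetical, while the variable names total and sum follow the slide on the disadvantages of aliases.

```cpp
// Returns the value read through 'sum' after writing through its aliases.
int alias_demo() {
    int sum = 10;
    int& total = sum;  // reference variable: 'total' and 'sum' are aliases
    int* p = &sum;     // '*p' is a third name for the same memory cell
    total = 25;        // also changes sum
    *p += 5;           // changes the cell seen by both sum and total
    return sum;        // 30
}
```

A reader who looks only at the assignments to total and *p would not expect sum to change, which is the readability problem the slides describe.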
Abstract Cells
• It is convenient to think of computer memory in terms of abstract cells, rather than physical cells.
• The physical cells, or individually addressable units, of most contemporary computer memories are byte-sized, with a byte usually being eight bits in length.
  – This size is too small for most program variables.
• We define an abstract memory cell to have the size required by the variable with which it is associated.

Example
• Although floating-point values may occupy four physical bytes in a particular implementation of a particular language, we think of a floating-point value as occupying a single abstract memory cell.
• We consider the value of each simple nonstructured type to occupy a single abstract cell.
• Henceforth, when we use the term memory cell, we mean abstract memory cell.

r-value
• A variable's value is sometimes called its r-value because it is what is required when the variable is used on the right side of an assignment statement.
• To access the r-value, the L-value must be determined first.
  – Such determinations are not always simple.
    • For example, scoping rules can greatly complicate matters, as is discussed in Section 5.8.

Binding and Binding Time
• In a general sense, a binding is an association, such as
  – between an attribute and an entity or
  – between an operation and a symbol.
• The time at which a binding takes place is called binding time.

Possible Binding Time
• Bindings can take place at:
  – language design time,
  – language implementation time,
  – compile time,
  – link time,
  – load time,
  – run time.

Static Binding
• A binding is static if
  – it first occurs before run time and
  – remains unchanged throughout program execution.

Dynamic Binding
• If a binding
  – first occurs during run time or
  – can change in the course of program execution,
  it is called dynamic.

Type Binding
• Before a variable can be referenced in a program, it must be bound to a data type.
• Two important aspects of type bindings are
  – how the type is specified
  – when the binding takes place
• Types can be specified statically through some form of
  – explicit declaration
  – implicit declaration
• Both explicit and implicit declarations create static bindings to types.

Explicit Declarations
• An explicit declaration is a statement in a program that lists variable names and specifies that they are a particular type.
  – Most programming languages designed since the mid-1960s require explicit declarations of ALL variables.
    • Perl, JavaScript, Ruby, and ML are some exceptions.

Implicit Declarations
• An implicit declaration is a means of associating variables with types through default conventions instead of declaration statements.
  – In this case, the FIRST appearance of a variable name in a program constitutes its implicit declaration.
  – Several widely used languages whose initial designs were done before the late 1960s, notably Fortran, PL/I, and BASIC, have implicit declarations.

Implicit Declaration Example
• In Fortran, an identifier that appears in a program that is not explicitly declared is implicitly declared according to the following convention:
  – If the identifier begins with one of the letters I, J, K, L, M, or N, or their lowercase versions, it is implicitly declared to be Integer type.
  – In all other cases, it is implicitly declared to be Real type.

Drawbacks of Implicit Declarations
• Although they are a minor convenience to programmers, implicit declarations can be detrimental to reliability because they prevent the compilation process from detecting some typographical and programmer errors.
  – For example, in Fortran, variables that are accidentally left undeclared by the programmer are given default types and unexpected attributes, which could cause subtle errors that are difficult to diagnose.
Disable Implicit Declarations in Fortran
• Many Fortran programmers now include the declaration
      Implicit none
  in their programs.
• This declaration instructs the compiler not to implicitly declare any variables.

Method to Avoid Implicit Declarations
• Some of the problems with implicit declarations can be avoided by requiring names for specific types to begin with particular special characters.
• For example, in Perl,
  – any name that begins with $ is a scalar, which can store either a string or a numeric value.
  – any name beginning with @ is an array
  – any name beginning with % is a hash
  – The above rules create different name spaces for variables of different types. In this scenario, the names @apple and %apple are unrelated, because each is from a different name space.
  – Furthermore, a program reader always knows the type of a variable when reading its name.

Declarations and Definitions
• In C and C++, one must sometimes distinguish between declarations and definitions.
• Declarations specify types and other attributes but do not cause allocation of storage.
• Definitions specify attributes and cause storage allocation.

Number of Declarations and Definitions
• For a specific name, a C program can have ANY number of compatible declarations, but only a SINGLE definition.

Purpose of Variable Declarations
• One purpose of variable declarations in C is to provide the type of a variable defined external to a function but used in the function.
• Such a declaration tells the compiler the type of the variable and that it is defined elsewhere.

Function Definition and Function Prototype
• The idea in the previous slides carries over to functions in C and C++, where prototypes declare names and interfaces, but not the code of functions.
• Function definitions, on the other hand, are complete.
Example
file2.c
    int a=100;              /*variable definition*/
    int bar(int y)          /*function definition*/
    { int x;
      x=y;
      return(x);
    }
file1.c
    #include<stdio.h>
    extern int a;           /*variable declaration*/
    extern int bar(int);    /*function prototype*/
    int main(void)
    { printf("a=%d\n",a);
      printf("bar(3)=%d\n",bar(3));
      return 0;
    }

Compilation Steps
1. gcc -c file1.c -> file1.o
2. gcc -c file2.c -> file2.o
3. gcc file1.o file2.o -> a.out

Dynamic Type Binding
• With dynamic type binding:
  – the type is not specified by a declaration statement, nor can it be determined by the spelling of its name.
  – the variable is bound to a type when it is assigned a value in an assignment statement.
• When the assignment statement is executed, the variable being assigned is bound to the type of the value of the expression on the right side of the assignment.

The Primary Advantage of Dynamic Variable Type Binding
• A great deal of programming flexibility.

Creation of Generic Programs
• A program to process a list of data in a language that uses dynamic type binding can be written as a generic program, meaning that it is capable of dealing with data of any numeric type.
• Whatever type data is input will be acceptable, because the variables in which the data is to be stored can be bound to the correct type when the data is assigned to the variables after input.
• By contrast, because of static binding of types, one cannot write a C++ or Java program to process a list of data without knowing the type of that data.

Example of Dynamic Binding
• In PHP and JavaScript, the binding of a variable to a type is dynamic.
  – For example, a JavaScript script may contain the following statement:
      list = [10.2, 3.5]
    Regardless of the previous type of the variable named list, this assignment causes it to become a single-dimensioned array of numeric elements of length 2.
  – If the statement
      list = 47
    followed the assignment above, list would become a numeric scalar variable.
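C++ binds types statically, so the JavaScript behavior above cannot be reproduced directly. As a rough sketch of what a run-time system has to do, the following C++17 code emulates a dynamically typed variable with std::variant, which keeps a run-time tag next to the value. The names DynamicValue, current_type, and rebinding_demo are hypothetical and only serve the illustration.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a dynamically typed variable: the variant stores
// the current value together with a run-time tag saying which type it holds.
using DynamicValue = std::variant<double, std::vector<double>>;

// Describes the type currently bound to the variable.
std::string current_type(const DynamicValue& v) {
    return std::holds_alternative<double>(v) ? "number" : "array";
}

// Mirrors the two assignments from the slide: list = [10.2, 3.5]; list = 47
std::string rebinding_demo() {
    DynamicValue list = std::vector<double>{10.2, 3.5};
    std::string trace = current_type(list);  // "array"
    list = 47.0;                             // the assignment rebinds the type
    trace += "," + current_type(list);       // "array,number"
    return trace;
}
```

The tag inside the variant plays the role of a per-variable type descriptor that is consulted at run time; in a language with dynamic type binding that bookkeeping is done for every variable by the language implementation.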
Dynamic Binding Is Less Reliable in Error Detection
• Dynamic type binding causes programs to be less reliable, because the error detection capability of the compiler is diminished relative to a compiler for a language with static type bindings.

Dynamic Binding Results in Weak Type-related Error Detection
• Dynamic type binding allows any variable to be assigned a value of any type.
• Incorrect types of right sides of assignments are not detected as errors; rather, the type of the left side is simply changed to the incorrect type.

Example of Drawbacks of Dynamic Type Binding
• Suppose
  – that in a particular JavaScript program, i and x are currently storing scalar numeric values, and y is currently storing an array.
  – that the program needs the assignment statement
      i = x;
    but because of a keying error, it has the assignment statement
      i = y;
• In JavaScript (or any other language that uses dynamic type binding), no error is detected in this statement by the interpreter; i is simply changed to an array. But later uses of i will expect it to be a scalar, and correct results will be impossible.
• In a language with static type binding, the compiler would detect the error in the assignment i = y, and the program would not get to execution.

Disadvantages of Dynamic Binding in Terms of Cost
• Perhaps the greatest disadvantage of dynamic type binding is cost.
• The cost of implementing dynamic attribute binding is considerable, particularly in execution time.
• Type checking must be done at run time.
• Furthermore, every variable must have a run-time descriptor associated with it to maintain the current type.
• The storage used for the value of a variable must be of varying size, because values of different types require different amounts of storage.

Implementation Concerns
• Languages that have dynamic type binding for variables are usually implemented using pure interpreters rather than compilers.
• Present-day computers do not have instructions whose operand types are not known at compile time.
  – Therefore, a compiler cannot build machine instructions for the expression A + B if the types of A and B are not known at compile time.
• Pure interpretation typically takes at least ten times as long as executing equivalent machine code.

Type Inference
• ML is a programming language that supports both functional and imperative programming (Milner et al., 1990).
• ML employs an interesting type inference mechanism, in which the types of most expressions can be determined without requiring the programmer to specify the types of the variables.

General Syntax of an ML Function
      fun function_name(formal parameters) = expression;
• The value of the expression is returned by the function.

Example (1)
• The function declaration
      fun circumf(r) = 3.14159 * r * r;
  specifies a function that takes a floating-point argument (real in ML) and produces a floating-point result.
• The types are inferred from the type of the constant in the expression.

Example (2)
• Likewise, in the function
      fun times10(x) = 10 * x;
  the argument and functional value are inferred to be of type int.

Example (3)
• Consider the following ML function:
      fun square(x) = x * x;
  – ML determines the type of both the parameter and the return value from the * operator in the function definition. Because this is an arithmetic operator, the type of the parameter and the function are assumed to be numeric.
  – In ML, the default numeric type is int. So, it is inferred that the type of the parameter and the return value of square is int.

Example (4)
• If square were called with a floating-point value, as in
      square(2.75);
  it would cause an error, because ML does not coerce real values to int type.
Example (5)
• If we wanted square to accept real parameters, it could be rewritten as
      fun square(x) : real = x * x;
• Because ML does not allow overloaded functions, this version could not coexist with the earlier int version.

Allocation and Deallocation of Memory Cells
• The memory cell to which a variable is bound somehow must be taken from a pool of available memory. This process is called allocation.
• Deallocation is the process of placing a memory cell that has been unbound from a variable back into the pool of available memory.

The Lifetime of a Variable
• The lifetime of a variable is the time during which the variable is bound to a specific memory location.
• So the lifetime of a variable begins when it is bound to a specific cell and ends when it is unbound from that cell.

Categories of Scalar Variables
• It is convenient to separate scalar (unstructured) variables into four categories, according to their lifetimes:
  – static
  – stack-dynamic
  – explicit heap-dynamic
  – implicit heap-dynamic

Static Variables
• Static variables are those that
  – are bound to memory cells before program execution begins and
  – remain bound to those same memory cells until program execution terminates.

Applications of Static Variables
• Globally accessible variables are often used throughout the execution of a program, thus making it necessary to have them bound to the same storage during that execution.
• Sometimes it is convenient to have variables that are declared in subprograms be history-sensitive, that is, have them retain values between separate executions of the subprogram.
  – This is a characteristic of a variable that is statically bound to storage.

Advantages of Static Variables
• Another advantage of static variables is efficiency.
  – All addressing of static variables can be direct.
    • Other kinds of variables often require indirect addressing, which is slower.
  – No run-time overhead is incurred for allocation and deallocation of static variables, although this time is often negligible.

Disadvantages of Static Variables
• reduced flexibility
  – in a language that has only variables that are statically bound to storage, recursive subprograms cannot be supported.
• storage cannot be shared among variables
  – For example,
    • Suppose a program has two subprograms, both of which require large unrelated arrays.
    • Further suppose that the two subprograms are never active at the same time.
    • If the arrays are static, they cannot share the same storage.

Example
• C and C++ allow programmers to include the static specifier on a variable definition in a function, making the variables it defines static.

Stack-Dynamic Variables
• Stack-dynamic variables are those
  – whose storage bindings are created when their definition statements are elaborated but
  – whose types are statically bound.

Elaboration of the Definition Statements of Stack-Dynamic Variables
• Elaboration of such a definition refers to the storage allocation and binding process indicated by the definition.
• Elaboration takes place when execution reaches the code to which the definition is attached (a subprogram or a block).
• Elaboration occurs during run time.

Memory Allocation of Stack-Dynamic Variables Occurs during Run Time
Code:
    G(int a)
    { int i;
      H(3);
      add_g: i++;
    }

    H(int b)
    { char c[100];
      int i;
      while((c[i++]=getch())!=EOF)
      { }
    }
Stack layout while H executes (high address at the top, low address at the bottom):
    G's stack frame:  i
    H's stack frame:  b
                      return address add_g
                      address of G's frame pointer
                      c[99]
                      ...
                      c[0]
                      i

Example
• The variable definitions that appear at the beginning of a Java method are elaborated when the method is called.
• The variables defined by those definitions are deallocated when the method completes its execution.

The Location That Stores Stack-Dynamic Variables
• As their name indicates, stack-dynamic variables are allocated from the run-time stack.
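The difference between a static local variable (created with the static specifier mentioned above) and a stack-dynamic local variable shows up as soon as a function is called twice. The following minimal C++ sketch uses the hypothetical function names count_calls_static and count_calls_local.

```cpp
// History-sensitive: 'calls' is bound to one memory cell for the whole
// program run and keeps its value between calls.
int count_calls_static() {
    static int calls = 0;
    return ++calls;
}

// Not history-sensitive: 'calls' is stack-dynamic, so it is allocated and
// initialized again each time the function is called.
int count_calls_local() {
    int calls = 0;
    return ++calls;
}
```

Two successive calls of count_calls_static() return 1 and then 2, whereas count_calls_local() returns 1 every time.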
Storage Binding of a Variable May Occur before Its Declaration
• Some languages, for example C and Java, allow variable definitions to occur anywhere a statement can appear.
• In some implementations of these languages, all of the stack-dynamic variables defined in a function or method (not including those declared in nested blocks) may be bound to storage at the beginning of execution of the function or method, even though the definitions of some of these variables do not appear at the beginning.

Stack-Dynamic Variables and Recursive Programs
• To be useful, at least in most cases, recursive subprograms require some form of dynamic local storage so that each active copy of the recursive subprogram has its own version of the local variables.
• These needs are conveniently met by stack-dynamic variables.

Memory Sharing
• The introduction of stack-dynamic variables allows
  – all subprograms to share the same memory space for their locals.

Disadvantages of Stack-Dynamic Variables
• the run-time overhead of allocation and deallocation.
  – However, the overhead is not significant, because all of the stack-dynamic variables that are defined at the beginning of a subprogram are allocated and deallocated together.
• slower accesses
  – Indirect addressing is required
• subprograms cannot be history sensitive.

Examples of Stack-Dynamic Variables
• In Java, C++ and C#, local variables defined in methods are by default stack-dynamic.
• In Pascal and Ada, all non-heap variables defined in subprograms are stack-dynamic.

Explicit Heap-Dynamic Variables
• Explicit heap-dynamic variables are nameless (abstract) memory cells that are allocated and deallocated by explicit run-time instructions specified by the programmer.

Reference Explicit Heap-Dynamic Variables
• Explicit heap-dynamic variables, which are allocated from and deallocated to the heap, can only be referenced through pointers or reference variables.
  – The pointer or reference variable that is used to access an explicit heap-dynamic variable is created as any other scalar variable.

Properties of a Heap
• The heap is a collection of storage cells that is highly disorganized because of the unpredictability of its use.

Creating an Explicit Heap-Dynamic Variable
• An explicit heap-dynamic variable is created:
  – by an operator (for example, in Ada and C++) or
  – by a call to a system subprogram provided for that purpose (for example, malloc() in C).

Allocation Operator in C++
• In C++, the allocation operator, named new, uses a type name as its operand.
• When executed, an explicit heap-dynamic variable of the operand type is created and a pointer to it is returned.
  – Because an explicit heap-dynamic variable is bound to a type at compile time, that binding is static.
  – However, such variables are bound to storage at the time they are created, which is during run time.

Deleting Heap-Dynamic Variables
• In addition to a subprogram or operator for creating explicit heap-dynamic variables, some languages include a means of destroying them.

Example of Explicit Heap-dynamic Variables
What follows is a C++ code segment:
    int *intnode;         // create a pointer
    ...
    intnode = new int;    // create the heap-dynamic variable
    delete intnode;       // deallocate the heap-dynamic variable
                          // to which intnode points
• In this example, an explicit heap-dynamic variable of int type is created by the new operator.
• This variable can then be referenced through the pointer, intnode.
• Later, the variable is deallocated by the delete operator.

Java Objects
• In Java, all data except the primitive scalars are objects.
  – e.g.
      class Circle { … }
      Circle cir=new Circle();
• Java objects are explicit heap-dynamic and are accessed through reference variables.
• Java has no way of explicitly destroying a heap-dynamic variable; rather, implicit garbage collection is used.
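Returning to the C++ new and delete operators shown above, the following is a minimal sketch of a singly linked list in which every node is an explicit heap-dynamic variable reached only through pointers. The names Node, push_front, sum_list, and destroy_list are hypothetical and chosen for this illustration.

```cpp
// A node of a singly linked list; each node is an explicit heap-dynamic
// variable and has no name of its own.
struct Node {
    int value;
    Node* next;
};

// Allocates a new node on the heap and links it in front of 'head'.
Node* push_front(Node* head, int value) {
    return new Node{value, head};
}

// Sums the values by following the pointers from node to node.
int sum_list(const Node* head) {
    int total = 0;
    for (const Node* p = head; p != nullptr; p = p->next) {
        total += p->value;
    }
    return total;
}

// Deallocates every node explicitly; returns how many nodes were freed.
int destroy_list(Node* head) {
    int freed = 0;
    while (head != nullptr) {
        Node* next = head->next;  // save the link before the node is freed
        delete head;
        head = next;
        ++freed;
    }
    return freed;
}
```

A typical use starts from an empty list (a null pointer), calls push_front a few times, and finally calls destroy_list; forgetting that last step leaks the nodes, which is one of the difficulties of explicit deallocation.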
Applications of Explicit Heap-Dynamic Variables
• Explicit heap-dynamic variables are often used for dynamic structures, such as linked lists and trees, that need to grow and/or shrink during execution.
• Such structures can be built conveniently using pointers or references and explicit heap-dynamic variables.

Disadvantages of Explicit Heap-Dynamic Variables
• the difficulty of using pointer and reference variables correctly.
• the cost of
  – references to the variables
  – allocations and
  – deallocations.
• the complexity of storage management implementation.

Implicit Heap-dynamic Variables
• Implicit heap-dynamic variables are bound to heap storage only when they are assigned values.
• In fact, all their attributes are bound every time they are assigned.

Example
• For example, a JavaScript script may contain the following statement to assign a value to the implicit heap-dynamic variable list:
      list = [10.2, 3.5]
  Regardless of the previous type of the variable named list, this assignment causes it to become a single-dimensioned array of numeric elements of length 2.
  – If the statement
      list = 47
    followed the assignment above, list would become a numeric scalar variable.

Advantages
• The advantage of such variables is that they have the highest degree of flexibility, allowing highly generic code to be written.

Disadvantages
• the run-time overhead of maintaining all the dynamic attributes, which could include array subscript types and ranges, among others.
• the loss of some error detection by the compiler, as discussed in Section 5.4.2.2.

Type Checking

Generalize the Concepts of Functions and Assignment Statements
• Subprograms are thought of as operators
  – their parameters are their operands.
• The assignment symbol is thought of as a binary operator
  – with its target variable and its expression being the operands.
Type Checking
• Type checking is the activity of ensuring that the operands of an operator are of COMPATIBLE types.

Compatible Types
• A compatible type is one that
  – either is legal for the operator or
  – is allowed under language rules to be implicitly converted by compiler-generated code (or the interpreter) to a legal type.
• This automatic conversion is called a coercion.

Example
• If an int variable and a float variable are added in Java, the value of the int variable is coerced to float and a floating-point add is done.

Type Errors
• A type error is the application of an operator to an operand of an inappropriate type.

Example of Type Errors
• In the original version of C, if an int value was passed to a function that expected a float value, a type error would occur (because compilers for that language did not check the types of parameters).
  – Integer Signed Attacks

Example of Integer Signed Attacks
    void *memcpy(void *dest, const void *src, size_t n);
P.S.: size_t is an unsigned integer type.

    static char data[256];
    void *store_data(char *buf, int len)
    { if (len > 256)                   /* a negative len passes this check */
        return NULL;                   /* reject oversized requests */
      return memcpy(data, buf, len);   /* len is converted to size_t here */
    }
P.S.: memcpy requires an unsigned integer for the length parameter; therefore, the signed variable len is converted to an unsigned integer. A negative value loses its sign and wraps around to a very large positive number, causing memcpy() to read past the bounds of buf.

Static Type Checking
• If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.

Dynamic Type Checking
• Dynamic type binding requires type checking at run time, which is called dynamic type checking.
  – Some languages, such as JavaScript and PHP, because of their dynamic type binding, allow only dynamic type checking.
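Returning to the store_data example above, the conversion that makes a negative length dangerous, and one possible guard against it, can be sketched in C++ as follows. The names as_size, store_data_checked, and g_data are hypothetical, and returning a null pointer on rejection is an assumption of this sketch rather than something the slides prescribe.

```cpp
#include <cstddef>
#include <cstring>

static char g_data[256];  // hypothetical counterpart of the slide's buffer

// Shows what a signed length becomes once it is converted to memcpy's
// unsigned size_t parameter.
std::size_t as_size(int len) {
    return static_cast<std::size_t>(len);
}

// Sketch of a guarded variant: negative lengths are rejected before the
// conversion to size_t can turn them into huge positive values.
void* store_data_checked(const char* buf, int len) {
    if (len < 0 || len > 256) {
        return nullptr;  // assumption: a null pointer signals rejection
    }
    return std::memcpy(g_data, buf, static_cast<std::size_t>(len));
}
```

For example, as_size(-1) yields the largest value a size_t can hold, which is far beyond the 256-byte buffer, while store_data_checked("abc", -1) refuses the request instead of copying.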
1-113 Pros and Cons of Static Type Checking • It is better to detect errors at compile time than at run time, because the earlier correction is usually less costly. • The penalty for static checking is reduced programmer flexibility. 1-114 Type Checking for Memory Cells That Can Store Values of Different Types • Type checking is complicated when a language allows a memory cell to store values of different types at different times during execution. – Such memory cells can be created with • Ada variant records • Fortran EQUIVALENCE and • C and C++ unions. 1-115 Type Checking for Variables That Can Store Values of Different Types Must Be Dynamic • For variables that can store values of different types, type checking, if done, MUST be dynamic and requires the run-time system to maintain the type of the current value of such memory cells. • So even though all variables are statically bound to types in languages such as C++, not all type errors can be detected by static type checking. – For example, the type of a statically bound C++ variable may be a union.
char c;
union sign
{
    int first;
    char second;
} number;
number.first = 12;
c = number.second;
– Static type checking accepts both assignments, because the declared types on each side match (for example, int ≡ int for number.first = 12), even though number.second reads a cell that currently holds an int.
1-116 Strongly Typed Programming Languages • A programming language is strongly typed if type errors are always detected. • This requires that the types of all operands can be determined either at compile time or at run time. 1-117 The Importance of Strongly Typed Languages • The importance of strong typing lies in its ability to detect ALL misuses of variables that result in type errors. • A strongly typed language also allows the detection, at run time, of uses of values of the incorrect type in variables that can store values of more than one type.
1-118 Fortran 95 Is Not Strongly Typed • In Fortran 95, the use of EQUIVALENCE between variables of different types allows a variable of one type to refer to a value of a different type, without the system being able to check the type of the value when one of the equivalenced variables is referenced or assigned. 1-119 Explanation: Fortran 95 Is Not Strongly Typed
Integer A
Real R
Equivalence (A,R)
A=123
• A and R name the same memory cell, which now holds the integer 123. • The stored 123 is not a real number; hence, a type error occurs when R is referenced, and the system cannot detect it.
1-120 C and C++ Are Not Strongly Typed Languages • C and C++ are not strongly typed languages because – both allow functions for which parameters are not type checked. – Furthermore, the union types of these languages are not type checked. 1-121 Coercion Rules vs. Type Checking • The coercion rules of a language have an important effect on the value of type checking. – For example, • Expressions are strongly typed in Java. • However, an arithmetic operator with one floating-point operand and one integer operand is legal. • The value of the integer operand is coerced to floating-point, and a floating-point operation takes place. • Even though this is what the programmer usually intends, the coercion also results in a loss of part of the reason for strong typing – error detection (see next slide). 1-122 The Value of Strong Typing Is Weakened by Coercion • Suppose a program written in a strongly typed language had the int variables a and b and the float variable d. • Now, if a programmer meant to type a + b, but mistakenly typed a + d, the error would not be detected by the compiler. The value of a would simply be coerced to float. • This coercion weakens the value of strong typing. 1-123 Coercion Reduces Reliability • Languages with a great deal of coercion, like Fortran, C, and C++, are significantly less reliable than those with little coercion, such as Ada.
• Java and C# have half as many assignment type coercions as C++, so their error detection is better than that of C++, but still not nearly as effective as that of Ada. 1-124