Names, Binding, Type Checking and Scopes

 In an imperative language, a variable is an abstraction of a memory cell.
 The following are the fundamental semantic issues of variables in an imperative language:
1. Attributes of variables.
2. Issues of aliases.
3. Binding and binding times for variable attributes.
4. Classification of variables into categories (according to their binding times).
5. Type checking, strong typing, and type compatibility rules.
6. Scoping rules for names: static and dynamic.
7. Referencing environment of a statement.
8. Named constants and variable initialization techniques.
Attributes of Variables
 In an imperative language, a variable is characterized by the following attributes:
1. Name: how is it referred to?
2. Address: where is the corresponding memory cell located in memory?
3. Value: what is the value stored at the corresponding memory location?
4. Type: what is the type of values that can be stored at the corresponding memory location?
5. Lifetime: for how long is the corresponding memory cell kept in memory?
6. Scope: where in the program can a reference be made to the variable?
Names
 Are also called identifiers.
 Are also associated with labels, subprograms, formal parameters, and any other programming language entity.
 Design issues:
 Maximum length:
- FORTRAN I: 6 characters
- FORTRAN 90 and ANSI C: 31 characters
- Ada: no maximum length
- C++: limit set by implementations
 Connectors:
- Underscore character: C, C++, Java
- Dash character: COBOL
- Pascal, Modula-2, and FORTRAN 77 do not allow connectors
 Case sensitivity:
- Uppercase and lowercase characters are different in C, C++, and Java.
- Names in many other languages are not case sensitive. FORTRAN 90 and COBOL allow lowercase characters, which are internally translated to uppercase.
 Special words:
- Keywords: words with assigned meanings in certain contexts.
Examples: if, while, else in C/C++.
- Reserved words: words that cannot be used as user-defined names.
In C/C++ all keywords are reserved words.
In FORTRAN, the keywords INTEGER and REAL are not reserved words, so both of the following declarations are legal:
INTEGER REAL (declares an integer variable named REAL)
REAL INTEGER (declares a real variable named INTEGER)
In Ada, Integer and Float are predefined names, not reserved words.
- Standard identifiers: names that are defined in libraries and/or other program units and made visible to programmers in C/C++, Ada, and Java. These names can be redefined by the programmer. Examples: printf and scanf in C.
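Address of a Variable
 The address of a variable is the address of the corresponding memory cell.
 It is sometimes referred to as the l-value of the variable, whereas its contents are referred to as its r-value.
Example:
num = num + 5;
The num on the left of = is used for its l-value (the address that is stored into); the num on the right is used for its r-value (the current contents).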
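The distinction between the address and the contents of a variable can be sketched in C++ as follows. The function name add_five_keeps_address is hypothetical and exists only for this illustration.

```cpp
// Applies `num = num + 5` and reports whether the address of num is unchanged.
bool add_five_keeps_address(int& num) {
    const int* before = &num;  // the address (l-value side) of num
    num = num + 5;             // right-hand num is read (r-value), left-hand num is written (l-value)
    return before == &num;     // only the contents changed, not the address
}
```

Calling the function on a variable that holds 3 leaves 8 in the same memory cell.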
Aliases
 If two variable names can be used to access the same memory location, they are called aliases.
 Aliases are harmful to readability.
 How are aliases created?

Pointers (see Example 1)

Reference variables (see Example 2)

Pascal variant records

C/C++ unions (to be discussed later)

FORTRAN EQUIVALENCE

Through parameters (to be discussed later).
Example 1: (pointers in C/C++)
#include <iostream>
using namespace std;
int main( )
{
    int num = 5,
        *pt;
    pt = &num;    // pt holds the address of num, so *pt is an alias of num
    *pt += 3;
    cout << endl << "num=" << num << "\t*pt=" << *pt;
    num += 2;
    cout << endl << "num=" << num << "\t*pt=" << *pt;
    return 0;
}
Output:
num=8    *pt=8
num=10   *pt=10
Example 2: (reference variables in C++)
#include <iostream>
using namespace std;
int main( )
{
    int robert = 5;
    int &bob = robert;    // bob is a reference to (an alias of) robert
    bob ++;
    cout << endl << "robert=" << robert << "\tbob=" << bob;
    robert += 2;
    cout << endl << "robert=" << robert << "\tbob=" << bob;
    return 0;
}
Output:
robert=6    bob=6
robert=8    bob=8
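Aliases created through parameters, listed above as a topic for later, can be sketched in the same way. The function bump below is a hypothetical example, not taken from the textbook: when the same variable is passed for both reference parameters, the two parameter names refer to one memory location.

```cpp
// Adds delta to total, then returns the sum of both parameters.
int bump(int& total, const int& delta) {
    total += delta;        // if delta aliases total, delta changes here as well
    return total + delta;
}
```

With two distinct variables holding 5, the call returns 15. With the same variable passed twice, it returns 20, which is hard to see from the body of bump alone; this is the sense in which aliases harm readability.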
Type
 The type of a variable determines the range of values of that variable and the set of operations that are
defined for those values.
 In the case of floating-point, it also determines the precision.
Value
 The value of a variable is the contents of the memory location associated with that variable.
 It is also called the r-value of the variable.
Primitive data types
Signed integers are represented in two's complement.
Data types:
C/C++/C#              Java
signed char           byte
short int             short
int                   int
long int              long
Unsigned integers are represented as sequences of bits.
Data types:
C/C++/C#              Java
char (1 byte)         char (2 bytes)
unsigned char
unsigned short int
unsigned int
unsigned long int
The char type is also used to represent a single character.
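A minimal sketch of the two's complement representation, using the fixed-width 8-bit types from <cstdint>. The helper names are hypothetical. The conversion from an out-of-range unsigned value back to a signed type is assumed to wrap around, which is guaranteed only since C++20 but is the behavior of common compilers.

```cpp
#include <cstdint>

// Returns the raw 8-bit pattern that represents a signed value.
std::uint8_t bits_of(std::int8_t v) {
    return static_cast<std::uint8_t>(v);
}

// Negation in two's complement: invert all bits, then add one.
std::int8_t negate_twos_complement(std::int8_t v) {
    std::uint8_t raw = static_cast<std::uint8_t>(v);
    raw = static_cast<std::uint8_t>(~raw + 1u);
    return static_cast<std::int8_t>(raw);   // assumption: wrap-around conversion
}
```

Here bits_of(-1) yields the pattern 0xFF (all bits set), and negating 5 by inverting and adding one yields -5.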
Floating-point Types
- are used to approximate real numbers.
- are represented as fractions and exponents (refer to Figure 6.1, page 252).
- the precision is the number of bits in the fractional part of the value that are accurate.
Data types:
C/C++/C#              Java
float                 float       (single precision)
double                double      (double precision)
long double
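The fact that floating-point values only approximate real numbers can be sketched as follows. The names are hypothetical, and the result of sum_is_exact assumes that double uses the IEEE 754 binary64 format, as it does on most current platforms.

```cpp
#include <limits>

// Number of binary digits in the fractional part (significand) of each type.
constexpr int float_bits  = std::numeric_limits<float>::digits;
constexpr int double_bits = std::numeric_limits<double>::digits;

// 0.1, 0.2, and 0.3 have no exact binary representation,
// so the rounded sum need not equal the rounded 0.3.
bool sum_is_exact() {
    double a = 0.1, b = 0.2;
    return a + b == 0.3;
}
```

Under the stated assumption, sum_is_exact() returns false, which is why floating-point values are usually compared within a tolerance rather than with ==.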
Decimal
- Decimal data types store a fixed number of decimal digits, with a decimal point at a fixed position in the value.
- They are stored in memory in binary-coded decimal (BCD).
- They are available in COBOL, PL/I, and C#.
Boolean types
- only two elements in the range of values (true and false).
- Usually represented using a byte.
Data types:
C++/C#                Java
bool                  boolean
Character Types
Characters are stored in computers in numeric forms:
- ASCII code (128 characters) in most programming languages
- Unicode (16-bit code) in Java, JavaScript, and C#.
Character String Type
 Values consist of a sequence of characters.
 Is a primitive data type in FORTRAN 95 and BASIC with the following operations: assignment, relational operators, catenation, and substring reference.
 Is not a primitive data type in C, C++, C#, or Java:
C
o Use char arrays to store null-terminated character strings.
o Use standard library functions (strcpy( ), strcat( ), strlen( ), strcmp( ), etc.) to provide operations on strings.
C++
o Use char arrays as in C.
o Use the string class from the standard library.
Java
o Use the String class for constant strings.
o Use the StringBuffer class for variables whose values are changeable.
String Length

Static length strings: the length of the string is set when the string is created. This is the case for the Java String class, the C++ string class, and the .NET class library available in C#.

Limited dynamic length strings: strings have a varying length up to a fixed maximum set by the variable definition. This is the case in C/C++, where the end of a string is marked by the null character, so a run-time descriptor is not needed.

Dynamic length strings: strings have a varying length with no maximum. This is the case in JavaScript and Perl.
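A limited dynamic length string in the C style can be sketched as follows. The function name and the buffer size of 16 are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// The array fixes the maximum length (15 characters plus the null character);
// the current length is found by scanning for the null character.
std::size_t c_string_demo() {
    char name[16] = "Ada";       // current length 3
    std::strcat(name, " 95");    // still fits: current length becomes 6
    return std::strlen(name);
}
```

No descriptor stores the length: strlen counts characters until it reaches the null character, and nothing prevents strcat from writing past the declared maximum, so the programmer must check that the result fits.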
Implementation of character string types - compile/run time descriptors (refer to page 258)
Binding
 Binding is an association, such as between an attribute and an entity, or between an operation and a symbol.
Examples:
The association of a variable with a memory location.
The association of the symbol + with addition.
 Binding time is the time at which a binding takes place.
Binding Times
1. Language design time: Example, the symbol + is bound to addition of integers/real numbers.
2. Language implementation/compiler design time: Example, the range of possible values is associated with a data type.
3. Compile time: Example, a variable is bound to a data type.
4. Linkage time: Example, a call to a library function is bound to that function's code, or a reference to an external global variable is bound to that variable.
5. Load time: Example, a global variable is bound to a memory location.
6. Execution time: Example, the binding of a local variable to a memory location.

A binding is static if it occurs before run time and remains unchanged throughout the program execution.
Example: binding of a data type to a variable.
A binding is dynamic if it occurs during the execution of the program or can change during the execution of the program.
Example: binding of local variables to memory locations.
Type Bindings

How is the type of a variable specified?

When does binding take place?
Static type binding:
 A type is bound to a variable by using a declaration statement:
Explicit declaration:
Example: int num1, num2; in C/C++.
Implicit declaration:
Example: in FORTRAN, an identifier that is not explicitly declared is an integer if it starts with I, J, K, L, M, or N; otherwise it is a real.
Other languages with implicit declarations include PL/I, BASIC, and Perl.
Note: distinguish between the declaration of a variable and the definition of a variable in C/C++:
Examples:
int num;           // definition of variable num
extern int num;    // declaration of variable num
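The difference between a declaration and a definition can be sketched in a single file as follows. The names counter and read_counter are hypothetical.

```cpp
extern int counter;      // declaration: gives the name and type, allocates no storage

int read_counter() {     // counter may be used here because it has been declared
    return counter;
}

int counter = 42;        // definition: allocates storage (and here also initializes it)
```

In a multi-file program the extern declaration would typically appear in a header, and the definition in exactly one source file; the linker then binds every reference to that single variable.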
Dynamic Type Binding (APL and SNOBOL)
 A variable is bound to a type when it is assigned a value (in an assignment statement).
 This provides programming flexibility.
Example:
in APL or SNOBOL, a variable LIST could be bound to a list of floating-point values, a list of integer values, and an integer value in the same program:
LIST ← 10.2 7.5 -10.4 23.5
LIST ← 12 25 46
LIST ← 45
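C++ binds types statically, but the effect of the LIST example can be imitated with std::variant (C++17 is assumed). This is only a rough analogy: the set of possible types must still be fixed at compile time. The alias Dynamic and the function type_name are hypothetical.

```cpp
#include <string>
#include <variant>
#include <vector>

// Stand-in for a dynamically typed variable: a list of reals,
// a list of integers, or a single integer.
using Dynamic = std::variant<std::vector<double>, std::vector<int>, int>;

// Reports the type currently bound to the variable.
std::string type_name(const Dynamic& v) {
    if (std::holds_alternative<std::vector<double>>(v)) return "list of reals";
    if (std::holds_alternative<std::vector<int>>(v))    return "list of integers";
    return "integer";
}
```

Assigning a std::vector<double>, then a std::vector<int>, then the integer 45 to one Dynamic variable changes the type that type_name reports after each assignment, just as each assignment to LIST rebinds its type.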
Type Inference (ML, Miranda, and Haskell)
 Types are determined from the context in which the variable is referenced.
Example:
Declaration of a function in ML:
fun circleArea( r ) = 3.1415 * r * r ;
specifies a function that takes a real argument and returns a real result.
fun times10( r ) = 10 * r ;
specifies a function that takes an integer argument and returns an integer result.
fun square( r ) : int = r * r ;     (* the result type is explicitly specified *)
fun square( r ) = r * (r : int);    (* the type of r is explicitly specified *)
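C++ offers a limited form of type inference through auto. The sketch below mirrors the ML functions above and assumes C++14 or later; unlike ML, C++ still requires the parameter types to be written, and only the return types are deduced.

```cpp
#include <type_traits>

auto circle_area(double r) { return 3.1415 * r * r; }  // return type deduced as double
auto times10(int r)        { return 10 * r; }          // return type deduced as int

// Compile-time checks of the deduced types.
static_assert(std::is_same<decltype(circle_area(1.0)), double>::value, "deduced double");
static_assert(std::is_same<decltype(times10(1)), int>::value, "deduced int");
```

The deduction happens entirely at compile time, so this is still static type binding.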
Storage Bindings
 Allocation: getting a memory cell/location from some pool of available memory cells.
 Deallocation: putting a memory cell back into the pool.
 The lifetime of a variable is the time during which it is bound to a memory cell.
 A memory cell may be allocated in six different places in memory as follows:
Environment Variables and command line arguments
Stack
Heap
Uninitialized Data Segment
Initialized Data Segment
Text
Categories of Variables by Lifetimes
 Static variable
is bound to a memory location (in the initialized/uninitialized data segment) before the execution of the program begins and remains bound to the same memory location throughout the execution of the program.
Examples: global and static local variables in C/C++; all variables in FORTRAN 77.
Advantages: efficiency (direct addressing, no allocation overhead); support for history-sensitive subprograms.
Disadvantage: lack of flexibility (a language with only static variables cannot support recursion).
 Stack-dynamic variable
is bound to a memory location (in the stack) when the block in which the variable is defined is entered, and is deallocated when the block is exited.
Examples: local variables in C/C++.
Advantages: allows recursion; conserves storage.
Disadvantages: overhead of allocation and deallocation; subprograms cannot be history sensitive; inefficient references (indirect addressing).
 Explicit heap-dynamic variable
a memory cell is allocated and deallocated (in the heap) by explicit statements written by the programmer, which take effect during the execution of the program.
Examples (in C++):
int *pt;
pt = new int;            // allocate a memory location to hold an integer value and assign its address to pointer pt
.
.
.
delete pt;               // deallocate the memory location
int *table;
table = new int [10];    /* allocate 10 consecutive memory locations to hold integer values and assign the address of the first to pointer table */
.
.
.
delete [ ] table;        // deallocate the memory locations
Advantage: provides for dynamic storage management.
Disadvantage: inefficient (cost of heap management and of indirect references) and unreliable (pointers are difficult to use correctly).
 Implicit heap-dynamic variable
allocation and deallocation of a memory location are caused by assignment statements.
Example: in APL we have the following:
LIST ← 10.2 -34.6 51.3    (allocation of three memory cells)
LIST ← 47                 (allocation of one memory cell)
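The trade-off between the static and stack-dynamic categories described above can be sketched in C++ as follows. The function names are hypothetical.

```cpp
// Static local: bound to storage once, keeps its value between calls (history sensitive).
int next_with_static() {
    static int calls = 0;
    return ++calls;
}

// Stack-dynamic local: a fresh cell is allocated and initialized on every call.
int next_with_local() {
    int calls = 0;
    return ++calls;
}

// Recursion relies on stack-dynamic storage: each activation has its own n.
int factorial(int n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}
```

Successive calls to next_with_static() return 1, 2, 3, and so on, while next_with_local() returns 1 every time. If n in factorial were static, all activations would share a single cell and the recursion would not work.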
Exercise
1. Indicate where each memory location of the following program is allocated, when it is allocated, and when it is deallocated.
2. Trace the execution of the following program and show the contents of all memory locations created during the execution of the program (make up addresses when one is needed). Assume that the program is executed as follows:
Program 25 12
#include <cstdlib>
#include <iostream>
using namespace std;
int num1 = 10, num2;
int procedure (int *, int &, int);
int main( int argn, char * argv[ ])
{
    int result1, result2, val1, val2, *pt;
    if (argn != 3)
        return 0;
    val1 = atoi (argv[ 1 ]);
    val2 = atoi (argv[ 2 ]);
    pt = new int;
    *pt = 17;
    result1 = procedure (pt, val1, val2);
    num1 += num2;
    cout << "\nnum1 =" << num1 << "\tnum2=" << num2 << "\tval1=" << val1
         << "\tval2 =" << val2 << "\tresult1=" << result1 << "\t*pt=" << *pt;
    val1 -= val2;
    *pt += 3;
    result2 = procedure ( &val2, val1, val1 - val2);
    cout << "\nnum1 =" << num1 << "\tnum2=" << num2 << "\tval1=" << val1
         << "\tval2 =" << val2 << "\tresult1=" << result1 << "\tresult2=" << result2
         << "\t*pt=" << *pt;
    return 0;
}
int procedure (int *p, int &n1, int n2)
{
    static int flag = 0;
    int n = 5;
    num2 = 2 * num1;
    flag ++;
    n ++;
    *p += 3;
    n1 += 4;
    n2 += n;
    return(*p + n1 + n2 + flag );
}
Scope of a Variable
 A program block is a section of a program code where variables can be defined and used.
 Program blocks are specified as follows:

Using the left brace { and the right brace } in C/C++, Java, C#:
{. . .}

Using the keywords BEGIN and END in Pascal:
BEGIN . . . END;

Using the declare clause in Ada:
declare
   Temp : Integer;
begin
   Temp := First;
   First := Second;
   Second := Temp;
end;
Note that in Ada, the declare clause is not needed if no variables are defined in the block.
 Automatic variables defined in a program block are referred to as its local variables.
 Memory locations are allocated for the local variables of a program block (in the stack) when the execution of the program enters that program block and are deallocated when it exits that program block.
 Program blocks may be nested in most programming languages; however, procedures/functions may not be nested in C/C++, Java, or C#. They may be nested in Ada, JavaScript, and Pascal.
 The scope of a variable is the range of statements in which that variable can be referenced (or is visible).
 The non-local variables of a program unit (procedure or block) are the variables that are visible in it but not declared there.
 Two methods are used to determine the scope of non-local variables (how references to names are associated with variables): static scoping and dynamic scoping.
Static Scoping (C/C++, Java, Ada, C#, Perl)
 Is the method of binding names to variables as follows:
1. First look in the program unit (procedure or block) in which the reference is found. If a declaration is
found, bind that name to the variable.
2. Otherwise, look in the immediate outer unit.
3. Continue until a declaration is found.
Examples: page 220
Notes:
- The for structure in C++, Java, and C# allows variable definitions in its initialization expression. The scope of these variables is restricted to the for structure.
- In C++, the scope operator is used to refer to a global variable (::num) when there is a local variable with the same name in a program unit.
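The static scoping rules and the two notes above can be sketched in C++ as follows. The function scope_demo and its variables are hypothetical.

```cpp
int num = 1;                      // global num

int scope_demo() {
    int num = 10;                 // hides the global num inside this function
    int total = 0;
    {
        int num = 100;            // hides the function-level num inside this block
        total += num;             // adds 100: the nearest enclosing declaration wins
    }
    total += num;                 // adds 10: the block's num is no longer visible
    total += ::num;               // adds 1: the scope operator reaches the global num
    for (int i = 0; i < 3; ++i)   // i is visible only inside the for structure
        total += i;               // adds 0 + 1 + 2
    return total;
}
```

Each reference is resolved by looking outward through the enclosing blocks in the program text, so the result (114) can be determined without running the program.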
Dynamic Scoping (APL, SNOBOL 4, Perl, COMMON LISP)
 Is the method of binding names to variables as follows:
1. First look in the procedure in which the reference is found. If a declaration is found, bind that name to the variable.
2. Otherwise, look in the subprogram that called this procedure.
3. Continue back through the chain of calls until a declaration is found.
Example: page 228.
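C++ itself is statically scoped, but the dynamic scoping lookup described above can be emulated with an explicit run-time list of bindings. This is a sketch under stated assumptions, not the textbook's example: the names bindings, lookup, big, sub1, and sub2 are hypothetical, and -1 is an arbitrary marker for an undeclared name.

```cpp
#include <string>
#include <utility>
#include <vector>

// One (name, value) entry per active declaration: pushed when a subprogram
// declares the name, popped when that subprogram returns.
std::vector<std::pair<std::string, int>> bindings;

// Dynamic lookup: search from the most recent activation back to the oldest.
int lookup(const std::string& name) {
    for (auto it = bindings.rbegin(); it != bindings.rend(); ++it)
        if (it->first == name) return it->second;
    return -1;  // assumption: -1 means "not declared anywhere in the call chain"
}

int sub2() { return lookup("x"); }    // x is non-local to sub2

int sub1() {                          // declares its own x, then calls sub2
    bindings.emplace_back("x", 5);
    int result = sub2();
    bindings.pop_back();
    return result;
}

int big() {                           // declares x, then calls sub1 and sub2
    bindings.emplace_back("x", 1);
    int via_sub1 = sub1();            // sub2 finds the x declared in sub1
    int direct   = sub2();            // sub2 finds the x declared in big
    bindings.pop_back();
    return via_sub1 * 10 + direct;
}
```

The same reference to x in sub2 is bound to different variables depending on which subprogram called it, so the binding cannot be determined from the program text alone.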
Referencing Environments
 The referencing environment of a statement is the collection of all the names that are visible in that
statement.
Examples: pages 230, 231, and 232.
Named Constants
 A named constant is a variable that is bound to a value only at the time it is bound to a memory location; the value cannot be changed afterwards.
 Advantages: makes the program more readable and easier to modify.
 The value is specified as follows:
- A constant in Pascal.
- A constant-valued expression in Modula-2 and FORTRAN.
- Any kind of expression in Ada, C++, and Java.
Examples:
In C++,
const int val = 2 * width + 5;
In Java,
final int len = 100;
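The C++ example above can be placed in context as follows. The function padded and the array buffer are hypothetical; constexpr (C++11) is shown as the compile-time counterpart of a const that is bound to a run-time value.

```cpp
// width is known only at run time, so val is bound to its value
// when its storage is allocated on entry to the function.
int padded(int width) {
    const int val = 2 * width + 5;
    // val = 0;                  // would not compile: val cannot be reassigned
    return val;
}

constexpr int len = 100;         // value fixed at compile time
int buffer[len];                 // so it can be used as an array size
```

Changing the size of buffer requires editing only the definition of len, which is the maintainability advantage noted above.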
Variable Initialization
 The binding of a variable to a value at the time it is bound to a memory location is called initialization.
 Initialization is often done in the declaration statement.
Examples:
In Ada,
SUM : FLOAT := 0.0;
In C++,
int num = 50;
Exercise:
5, 6, 7, 8, 9, 10, 11, 12, pages 236-240.