CS461 Programming Languages

Lectures week 3
FORTRAN (FORmula TRANslating system)
mid 50’s: IBM John Backus – algebraic language translator
efficiency the big issue -> impact on design
1954: work began; the design was made up as they went along
1957: released
FORTRAN 0 through FORTRAN IV (66), then FORTRAN 77 (ANSI) -> source of many ideas for later languages
Characteristic features
- set of fixed fields
- implicit typing: names beginning with I-N are integer, all others real
- arithmetic IF provides a three-way branch: -, 0, +
- DO statement (Fortran IV and before: counted repetition, increment only, limits could not be expressions)
- FORMAT statements gave control over I/O and introduced H for (Hollerith) character strings
- Commenting
For Historical Reasons look at early versions
FORTRAN0 – lacked subprograms (proc/fns/subroutines)
FORTRAN1 – similar to pseudocodes, had subprogram facilities (but communicated by using parameters
or shared data areas called COMMON blocks, aka GLOBAL variables)
In general 2 parts to a program:
- declarative (describes data areas, lengths, init values)
- imperative (commands)
FORTRAN PROGRAM
- declarative part: nonexecutable (compile time)
- imperative part: executable (run time)
Bindings
Declarations include bindings and initializations:
1. allocate area of memory of a size
2. bind (attach) a symbolic name to area of memory
3. initialize contents of memory
Example:
DIMENSION DTA (900)
1. allocates 900 words
2. binds the name DTA to the location
3. DATA DTA/900*0.0/ initializes all 900 words to 0.0
(initialization is not required in Fortran)
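The three steps of a declaration can be sketched in Python. This is a toy model under stated assumptions, not how any real compiler works: memory is a plain list of words, and `memory`, `symbol_table`, and `dimension` are hypothetical names introduced only for illustration.

```python
# Toy model of what DIMENSION plus DATA accomplish.
# `memory`, `symbol_table`, and `dimension` are illustrative names only.

memory = []          # the machine's words, modeled as a Python list
symbol_table = {}    # symbolic name -> (start address, size in words)

def dimension(name, size, init=None):
    """Allocate `size` words, bind `name` to them, optionally initialize."""
    start = len(memory)                  # 1. allocate: next free address
    memory.extend([None] * size)         #    None marks uninitialized words
    symbol_table[name] = (start, size)   # 2. bind the name to the area
    if init is not None:                 # 3. initialize (optional in Fortran)
        for addr in range(start, start + size):
            memory[addr] = init
    return start

dimension("DTA", 900, init=0.0)          # DIMENSION DTA(900) with a DATA statement
dimension("COORD", 100)                  # a second array, left uninitialized
```

The uninitialized array shows why step 3 matters: its words hold whatever happens to be there (modeled here as `None`).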
Imperatives are computational (arith, move), control-flow (IF, GOTO, DO-LOOP), or I/O(READ, PRINT)
Fortran's primary computational imperative: assignment
Example: AVG=SUM/FLOAT(N)
Stages to Run program and Bindings
1. compilation
2. linking
3. loading
4. execution
1. Compilation
Fortran subprograms -> relocatable object code
statements -> instructions of computer
subprograms reside in memory w/ other subprograms not yet compiled
=> impossible to determine at compile time the location in memory where a subprogram will go
Therefore addresses of variables and statements are not known
So binding occurs later, during loading!
2. Linking - incorporate libraries and subprograms already compiled
3. Loading
- program placed in computer memory
- go from relocatable code (.OBJ) to absolute format (.EXE)
- bind all code and data references to addresses of locations
4. Execution
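A minimal sketch of load-time binding, assuming a made-up instruction format: each relocatable instruction carries an offset from the start of its subprogram, and the loader adds the base address chosen at load time. The tuple format, the opcodes, and the `load` helper are hypothetical.

```python
# Toy loader: relocatable code uses offsets from the start of the subprogram;
# loading at a base address turns them into absolute addresses.
# The instruction format and all names here are hypothetical.

relocatable = [
    ("LOAD",  0),   # operand = offset of SUM
    ("DIV",   1),   # operand = offset of the floated N
    ("STORE", 2),   # operand = offset of AVG
]

def load(code, base):
    """Bind every relocatable operand to an absolute address."""
    return [(op, base + offset) for op, offset in code]

# The base is only known once the loader has placed the subprogram in memory.
absolute = load(relocatable, base=1000)
```

The same relocatable code can be loaded at any base, which is why the compiler does not need to know where other subprograms will end up.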
Compilation: 3 phases
Determines efficiency of final program
1. syntactic analysis (lexical analysis and parser) - classify statements, extract parts
2. optimization - produce code as good as could be produced by an experienced programmer
3. code synthesis (relocatable format)
Design: Data Structures
- suggested by math: scalars and vectors (arrays)
Scalars
Primary primitives: numeric scalars (distinct values, ordering)
Fortran II (60s)
INTEGERs – indexing and counting
Floating point – evaluation of math and physical formulas
Double precision
Complex (scientific calculations)
Logicals
Integers (32 bit word)
s b30 b29 … b2 b1 b0 (s = sign bit)
Operations: +, -, *, /, tests for 0, tests for sign
Floating point: e.g. -1.5 x 10^3, a coefficient and a power of 10
Operations: +, -, *, /, comp, abs, exp (library)
sm sc c7 … c0 m21 m20 … m1 m0
value = m x 2^c where m is mantissa and c is characteristic
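The mantissa/characteristic split can be seen with Python's standard `math.frexp`. Note the assumption: Python floats are 64-bit IEEE doubles, not the 32-bit word layout shown above, so this only illustrates the m x 2^c idea, not the exact bit fields.

```python
import math

x = -1.5e3                      # the value -1.5 x 10^3 from the notes
m, c = math.frexp(x)            # splits x so that x == m * 2**c, 0.5 <= |m| < 1
reconstructed = m * 2 ** c      # equivalently math.ldexp(m, c)
```

Inside the machine the number is held as a binary fraction and a power of 2, even though we write it with a power of 10.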
NOTE: Arithmetic operators overloaded (VIOLATES ORTHOGONALITY PRINCIPLE)
Mixing types in expressions is delicate: computer numbers are not related the same way math numbers are
Compiler resolves overloading by looking at context to determine which machine instructions to generate
Early Fortran did NOT allow implicit conversion; explicit conversion was required: X+FLOAT(I) or I=IFIX(X)
Later versions allowed implicit coercion.
Characters
Integer Type OVERWORKED
Integer could represent integers and char-strings
Hollerith constant – type integer (early form of char string)
Example: 6HCARMEL -> “CARMEL”
Character strings not first class in FORTRAN
Can’t use in all ways we want -> VIOLATES REGULARITY PRINCIPLE
Also no Hollerith variable
No string comparisons
Weak typing creates a loop-hole (VIOLATES SECURITY PRINCIPLE)
Permits reading character data into integer/real variables
Permits Hollerith constants to be used as parameters where integers are expected
Fortran 77 HAS CHARACTER data type.
ARRAY (data constructor)
Example : DIMENSION DTA(100), COORD(10,10)
Fortran does not require initialization
Dimensions – integer, limited to 3 (7 in FORTRAN 77) (VIOLATES 0-1-INFINITY PRINCIPLE and
REGULARITY PRINCIPLE)
Array implementation will be skipped; read MacLennan's book if you are interested.
Array Subscripts
Had to fit a form for optimization purposes.
Examples: I+1 allowed, 1+I not allowed
Subscript forms: c, v, v+c or v-c, c*v, c*v+c or c*v-c (c a constant, v a variable)
VIOLATES REGULARITY PRINCIPLE
Name Structures
- organize names in program
- declarations (binding constructs)
Example: INTEGER I, J, K
- 1 word allocated to each
- names bound to addresses
- initialization in DATA statement
- information put into a ‘symbol table’
EXAMPLE (symbol table entry)
name    type       location
I       integer    0245
Declarations are non-executable: they provide information to the compiler, linker and loader
Static allocation done before execution and doesn’t change during execution
In FORTRAN, all subprograms before invocation have locations allocated
In Pascal and C++ - allocate memory dynamically
The optional declarations in FORTRAN are dangerous!
- False economy VIOLATES SECURITY PRINCIPLE!
- Leads to obscure names being chosen, such as KOUNT, ISUM, XLENGTH
- And what about typos? COUNT = COUMT + 1
COUMT is implicitly declared, and its value is ?
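A small Python sketch of the I-N rule and of how a typo slips through. `implicit_type` and the two name lists are hypothetical helpers for illustration; a real compiler does this inside its symbol table.

```python
def implicit_type(name):
    """Type that the I-N rule gives an undeclared name."""
    return "INTEGER" if name[0].upper() in "IJKLMN" else "REAL"

seen_so_far = {"COUNT"}                  # names used earlier in the program
statement_names = ["COUNT", "COUMT"]     # names appearing in COUNT = COUMT + 1

# Names the compiler silently creates instead of reporting an error.
silently_created = {
    name: implicit_type(name)
    for name in statement_names
    if name not in seen_so_far
}
```

Here `silently_created` ends up holding the misspelled COUMT as a brand-new REAL variable, with no diagnostic. Required declarations would have turned the typo into a compile-time error.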
Environments determine meanings (Concept of SCOPE important for midterm!)
Context of statement based on environment
Set of definitions visible to a statement or construct
Environment determines visibility of bindings
In FORTRAN
- subprograms are separately compilable
- variable names are local in scope
- a subprogram sees its parameters
- a subprogram sees COMMON blocks (global), but each subprogram must include an identical declaration of
the COMMON block. (What if all specs don't agree? No Error! VIOLATES SECURITY PRINCIPLE)
- subprogram names are GLOBAL
- no nested or hidden subprograms: all at same level, all visible to all
VIOLATES INFORMATION HIDING PRINCIPLE
Example
SUBROUTINE A
COMMON/SYMTAB/NAMES(100), LOC(100), TYPE(100), DIMS(100)
…
END
SUBROUTINE B
COMMON/SYMTAB/NAMES(100), LOC(100), TYPE(100), DIMS(100)
…
END
What if LOC and TYPE were switched in one of the declarations? LOC holds integers and TYPE holds reals (implicit typing). PROBLEM!
But it won't be caught! (VIOLATES SECURITY PRINCIPLE)
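The effect of mismatched COMMON declarations can be imitated in Python with the standard `struct` module. Assumptions for this sketch: 4-byte words, IEEE single-precision reals, and arrays shortened to 2 elements each; none of that comes from the notes. One subroutine writes the block as integers followed by reals, the other reads the same bytes as reals followed by integers.

```python
import struct

# Subroutine A's view of the block: LOC(2) integers, then TYPE(2) reals.
block = struct.pack("<2i2f", 245, 246, 1.5, 2.5)

# Subroutine B declares TYPE before LOC, so it reads the same bytes as
# two reals followed by two integers. Nothing checks that the views agree.
type_b0, type_b1, loc_b0, loc_b1 = struct.unpack("<2f2i", block)
```

B's `LOC` values are the raw bit patterns of A's reals, and B's `TYPE` values are A's integers misread as reals. No error is raised at any stage.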
FORTRAN VIOLATES SYNTACTIC CONSISTENCY PRINCIPLE
Which states that things which look similar should be similar and things which look different should be
different.
Example: ‘**’ for exponent, but leave one * out and you have legal multiplication.
FORTRAN has weak typing (so does C/C++):
Integer variables can contain addresses and chars
Should have a LABEL type to hold addresses
VIOLATES DEFENSE IN DEPTH PRINCIPLE
Which states if an error gets through one line of defense (such as syntactic checking by compiler), then it
should be caught by next line of defense (type checking)
HARDEST PROBLEM IN LANGUAGE DESIGN: identifying interaction of features
Example: how does syntax of GOTO’s work with overloading of integer type (where integer can contain
addresses).
Control, data, name, syntactic structures
Control structures govern flow of control
If
Early Fortran Example: IF (e) n1, n2, n3
Evaluates expression e, then branches to n1, n2, or n3 depending on whether the result is -, 0, or +
3-way branching is unusual, inspired by IBM 709 assembly language
Difficult to keep the meaning of the 3 labels straight
NOW Fortran: IF (X .EQ. A(I)) K = I - 1
NOTE: VIOLATES SYNTACTIC CONSISTENCY PRIN.
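The three-way branch of the early IF can be modeled as a function that picks a label. `arithmetic_if` and the labels 10, 20, 30 are made up for this sketch.

```python
def arithmetic_if(e, n1, n2, n3):
    """Return the label that IF (e) n1, n2, n3 would branch to."""
    if e < 0:
        return n1
    if e == 0:
        return n2
    return n3

# IF (X - A) 10, 20, 30 doubles as a three-way comparison of X and A.
label_less = arithmetic_if(3 - 5, 10, 20, 30)
label_equal = arithmetic_if(5 - 5, 10, 20, 30)
label_greater = arithmetic_if(7 - 5, 10, 20, 30)
```

The reader has to remember which position means which sign, which is the "hard to keep straight" problem noted above.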
GOTO
Early Fortran: GOTO workhorse of control flow
Example:
      IF (e) GOTO 100
      …case for False
      GOTO 200
100   …case for True
200   rest of code
This is just an If/Then/Else! Can you tell?!?
Example:
100   …code…
      IF (e) GOTO 100
This is a Repeat/Until loop
Example:
100   IF (e) GOTO 200
      …code…
      GOTO 100
200   …rest of code…
This is a while loop!
Other examples: Computed GOTOs and Assigned GOTOs (like switch statements in C++)
When we see an IF statement, it is hard to tell whether it is an IF, an IF-ELSE, or a leading or trailing decision loop. Difficult
to identify control structures. With GOTO it is even possible to write mid-decision loops! VIOLATES
STRUCTURE PRINCIPLE.
GOTO is a 2-edged sword: primitive but powerful control structure
Understandability is sacrificed
STRUCTURE PRINCIPLE states that the static structure of a program should correspond in a simple way with
the dynamic structure of the corresponding computations. (It should be possible to visualize behavior by looking
at the written form.)
DO-LOOP
- higher level control structure
- definite loop
Example: DO 100 I=1, N
A(I) = A(I) * 2
100 CONTINUE
We state what we want, rather than how (init, incr, test, branch) – SUPPORTS AUTOMATION
PRINCIPLE and ABSTRACTION PRINCIPLE
- can be nested
- can be highly optimized (loop control variable, initial and final values all stated explicitly along with extent of loop)
PRESERVATION OF INFORMATION PRINCIPLE – the language should allow the representation of
information that user might know and that the compiler might need.
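The "what" versus "how" contrast for the DO-loop can be sketched in Python. Both functions double every element, as in the A(I) = A(I) * 2 example. The function names are hypothetical, and indexing is 0-based here whereas FORTRAN's I starts at 1.

```python
def double_spelled_out(a):
    """The 'how': explicit init, test, body, increment, branch back."""
    n = len(a)
    i = 0                 # init
    while i < n:          # test, and branch out when done
        a[i] = a[i] * 2   # body
        i += 1            # increment
    return a

def double_definite_loop(a):
    """The 'what': loop variable and limits stated in one place."""
    for i in range(len(a)):
        a[i] = a[i] * 2
    return a
```

In the second form the control variable, its limits, and the extent of the loop are all visible at the top, which is the information a compiler needs in order to optimize.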
DO-WHILE (VAX FORTRAN)
DO WHILE (condition)
….body…
END DO
SUBPROGRAMS
- late addition
- libraries: early FORTRAN had them
- user-defined subprograms: early FORTRAN didn't have them
- remedied in FORTRAN II (w/ subroutines and functions)
YOUR FORTRAN assignment uses 2 subroutines (BISECT & PRINT) and 2 functions (F & G)
Subprograms define Procedural Abstraction (SUPPORT ABSTRACTION PRINCIPLE)
- fragments of code that occur more than once w/ different variables
Example: SUBROUTINE name (formal parameters)
…body…
RETURN
END
RETURN statements are allowed anywhere in a subprogram
Invoke by CALL statement
CALL name (actual parameters)
When the CALL is executed, actuals are bound to formals; this binding occurs at run time
Parameters
- usually passed by reference (like Pascal VAR parameters, C++ pass by reference with &)
- FORTRAN allows parameters to be used for input, output or both
Pass by reference
- an output parameter needs the address of the variable, so the formal parameter is bound to the address of the actual
- efficient
- an output parameter is actually input-output
- pass by reference can be dangerous (side effects) because an output parameter is actually input-output
- an input variable can be changed this way
Example:
SUBROUTINE SWITCH (N)
N=3
RETURN
END
CALL SWITCH(I)
puts 3 in I
CALL SWITCH(2)
puts 3 in the ‘literal table’ (constants portion of memory) where 2 is stored.
Effect of I = 2+2 => will give you I = 6
VIOLATES SECURITY PRINCIPLE!
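The literal-table accident can be reproduced in a toy Python model. Assumptions: each variable or constant lives in a mutable `Cell`, and the compiler keeps one shared cell per distinct constant. `Cell`, `literal_table`, and `literal` are hypothetical names.

```python
class Cell:
    """One word of memory; passing a Cell models passing an address."""
    def __init__(self, value):
        self.value = value

literal_table = {}   # constant as written in the source -> its shared cell

def literal(n):
    """Return the single shared cell that holds the constant n."""
    if n not in literal_table:
        literal_table[n] = Cell(n)
    return literal_table[n]

def switch(n):
    """SUBROUTINE SWITCH(N): N = 3, with N passed by reference."""
    n.value = 3

i = Cell(0)
switch(i)                    # CALL SWITCH(I): puts 3 in I
switch(literal(2))           # CALL SWITCH(2): overwrites the constant's cell

# I = 2+2 now adds the overwritten cell to itself.
result = literal(2).value + literal(2).value
```

Every later use of the constant 2 picks up the overwritten cell, so `result` is 6 rather than 4.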
Pass-by-value-Result
Value of actual is copied to formal at invocation and result is copied back to actual at exit.
Both operations are done by the caller; the compiler can omit the 2nd operation if the actual parameter is a constant or an expression
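A minimal sketch of pass-by-value-result, assuming the caller does both copies as described. The subroutine body is modeled as a function that returns the final value of its formal; the names are illustrative.

```python
def switch(formal):
    """Body of SUBROUTINE SWITCH(N): N = 3. Returns the formal's final value."""
    formal = 3
    return formal

# CALL SWITCH(I): the caller copies I in, then copies the result back to I.
i = 0
i = switch(i)

# CALL SWITCH(2): the actual is a constant, so the caller omits the copy back.
two = 2
switch(two)             # result discarded
total = two + two       # the constant is still 2
```

Compared with pass by reference, the constant stays intact because the callee only ever works on a copy.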
Activation Records
- investigate way subprograms implemented
- save state of caller (contents of variables, registers, IP)
- way of knowing where subprogram returns to
In nonrecursive FORTRAN: 1 activation record per subprogram
- when a subprogram is invoked, the actual parameters are placed in a location where the callee knows to find them
-> the callee's activation record
- to make the return possible, transmit to the callee a pointer to the caller's activation record
- store that pointer in the callee's activation record
- this pointer is the dynamic link
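A toy Python model of one statically allocated activation record per subprogram. The record fields, the `call` and `ret` helpers, and the argument values are hypothetical; subprogram names stand in for pointers to records.

```python
# One statically allocated record per subprogram (no recursion).
records = {
    "MAIN":   {"params": None, "dynamic_link": None},
    "BISECT": {"params": None, "dynamic_link": None},
    "F":      {"params": None, "dynamic_link": None},
}

def call(caller, callee, actuals):
    """Place the actuals and the dynamic link in the callee's record."""
    record = records[callee]
    record["params"] = actuals
    record["dynamic_link"] = caller   # stands in for a pointer to the caller's record
    return callee                     # control is now in the callee

def ret(callee):
    """Follow the dynamic link back to the caller."""
    return records[callee]["dynamic_link"]

current = call("MAIN", "BISECT", (0.0, 1.0))   # made-up arguments
current = call(current, "F", (0.5,))
current = ret(current)                         # back in BISECT
back_in = current
current = ret(current)                         # back in MAIN
```

Because each subprogram has exactly one record, a recursive call would overwrite the record's parameters and dynamic link, which is why this scheme only works without recursion.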
Formatting
As in the pseudocodes: fixed-format lexical conventions, with particular columns dedicated to particular purposes
FORTRAN ignored blanks – all of them!
Example: DIMENSION IN DATA (10000), RESULT(8000)
DIMENSIONINDATA(10000),RESULT(8000)
DIMENSI…
Causes problems with compilers and humans
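Example: DO 20 I = 1. 100 is the same as DO20I=1.100
Looks like DO 20 I = 1, 100
but is an assignment <- autodeclaration of DO20I as a float
This error is often said to have cost an American Venus probe (usually identified as Mariner 1, not Viking), though the attribution is disputed.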
VIOLATES PRINCIPLE OF DEFENSE IN DEPTH (implicit declaration missed it too)
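A rough sketch of why ignoring blanks makes the period/comma slip invisible to the compiler. This is not a real FORTRAN lexer: `classify` and the two regular expressions are simplifications invented for illustration, covering only DO statements and simple assignments.

```python
import re

DO_LOOP = re.compile(r"^DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)$")
ASSIGNMENT = re.compile(r"^([A-Z][A-Z0-9]*)=(.+)$")

def classify(statement):
    """Classify a statement the way a blank-ignoring compiler might."""
    text = statement.replace(" ", "")      # all blanks are discarded first
    match = DO_LOOP.match(text)
    if match:
        return ("DO",) + match.groups()
    match = ASSIGNMENT.match(text)
    if match:
        return ("ASSIGN",) + match.groups()
    return ("UNKNOWN", text)

intended = classify("DO 20 I = 1, 100")    # comma: a DO loop
typo = classify("DO 20 I = 1. 100")        # period: an assignment to DO20I
```

With the comma the statement is recognized as a loop over I; with the period it becomes a legal assignment of 1.100 to a new variable DO20I, and neither syntax checking nor declarations catch it.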
Lack of reserved words = Mistake!
Example: DIMENSION IF(100)
Now: IF (I-1) 1, 2, 3 (arithmetic IF) is easily confused with IF (I-1) = 1 2 3 (assigns 123 to an array element)
Compiler was a nightmare to write!
Syntax of algebraic notation
(-B + SQRT(B**2 - 4*A*C))/(2*A)
Arithmetic operators have precedence
1. exponentiation (**)
2. multiplication and division (*, /)
3. addition and subtraction (+, -)
Languages differ on unary operators: -b or +b could be at the highest level or at level 3.
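The same formula written in Python, which happens to share the `**` operator and the three precedence levels above. `larger_root` is a hypothetical helper, and the sketch assumes a nonnegative discriminant.

```python
import math

def larger_root(a, b, c):
    """(-B + SQRT(B**2 - 4*A*C))/(2*A), relying on operator precedence."""
    return (-b + math.sqrt(b**2 - 4*a*c)) / (2*a)

root = larger_root(1.0, -3.0, 2.0)     # x**2 - 3x + 2 has roots 1 and 2

# Python places unary minus below exponentiation, so this is -(2**2).
unary_example = -2**2
```

Without precedence rules, B**2 - 4*A*C would need explicit parentheses around every subterm. The unary minus line shows one of the choices on which languages differ.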
No nesting of statements except in the DO loop; Fortran 77 allows more!
Linear syntactic organization.