COMP421-week9-10-11

advertisement
COMP-421 Compiler Design
Presented by
Dr Ioanna Dionysiou
Lecture Outline
[ALSU07] Chapter 6 Intermediate Code
Representation
– Section 6.2
[ALSU07] Chapter 7 Run-Time Environments
– Sections 7.1, 7.2, 7.3
Copyright (c) 2011 Ioanna Dionysiou
Chapter 6 Lecture Outline
Intermediate Code Representation
– 3-address code essentials
– 3-address code implementation techniques
Copyright (c) 2011 Ioanna Dionysiou
Intermediate Code Representation
Syntax tree
Parsing
Type-checking
Intermediate
Form
Generation
Assumption: Source code is parsed and type-checked
Copyright (c) 2011 Ioanna Dionysiou
Intermediate Languages
So far, we have seen one kind of intermediate
representation
– Syntax trees (2nd week of classes)
• Natural hierarchical structure of program
A new kind
– 3-address code
• Abstract form
Copyright (c) 2011 Ioanna Dionysiou
3-address code
It is a sequence of statements of the general
form
x := y op z
Arithmetic operator
Logical operator
Names
Constants
temporaries
Copyright (c) 2011 Ioanna Dionysiou
3-address code
/* source language expression */
/* 3-address statements */
x+y*z
t1 := y * z
t2 := x + t1
Where t1, t2 are compilergenerated temporary names
Variable names can appear
directly in 3-address
statements
Copyright (c) 2011 Ioanna Dionysiou
3-address code
/* source language expression */
/* 3-address statements */
a := b * -c + b * -c
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
:=
+
a
*
b
*
unimus
c
b
unimus
In other words, for each
interior node in the syntax
tree we create a new
temporary variable
c
Copyright (c) 2011 Ioanna Dionysiou
Types of 3-address statements
– Assignment statement
x := y op z
– Assignment instruction
x := op z
– Copy statement
x := y
– Unconditional jump
Procedural call
p(x1,x2,…,xn)
param x1
param x2
param xn
call p, n
Indexed assignments
x := y[i] , x[i] := y
goto L
– Conditional jump
if x relop y goto L
Address and pointer
assignments
x := &y, x := *y, *x := y
Copyright (c) 2011 Ioanna Dionysiou
Implementation of 3-address
3-address statement is an abstract form of
intermediate code
Need a concrete form
– Records with fields for operator and operands
• Quadruples
• Triples
• Indirect triples
Copyright (c) 2011 Ioanna Dionysiou
Quadruples
It is a record structure with 4 fields
•
•
•
•
Operator
Argument 1
Argument 2
Result
a:= b * -c + b * -c
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
op
arg1
unimus
c
*
b
unimus
c
*
b
t3
t4
+
t2
t4
t5
:=
t5
Copyright (c) 2011 Ioanna Dionysiou
arg2
result
t1
t1
t2
t3
a
Triples
It is a record structure with 3 fields
• Operator
• Argument 1 (pointer to symbol table or triple)
• Argument 2 (pointer to symbol table or triple)
– Idea
– avoid entering temporary names to symbol table
– Refer to a temp value by the position of that statement that
computes it
a:= b * -c + b * -c
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
op
arg1
unimus
c
*
b
unimus
c
*
b
(2)
+
(1)
(3)
assign
a
(4)
Copyright (c) 2011 Ioanna Dionysiou
arg2
(0)
Indirect Triples
It is a record structure with 3 fields
• Operator
• Argument 1 (pointer to symbol table or triple)
• Argument 2 (pointer to symbol table or triple)
– Idea
• List pointers to triples instead listing triples
statement
op
arg1
(0)
(14)
unimus
c
(1)
(15)
*
b
(2)
(16)
unimus
c
(3)
(17)
*
b
(16)
(4)
(18)
+
(15)
(17)
(5)
(19)
assign
a
(18)
Copyright (c) 2011 Ioanna Dionysiou
arg2
(14)
In-class Exercise
Show the 3-address code for the following
statements:
a := 5 + 6
c := a + 3
Try to optimize the 3-address code generated
Copyright (c) 2011 Ioanna Dionysiou
Chapter 7 Lecture Outline
Chapter 7
– Source language issues
• Activation trees, control stacks, scope of declaration,
bindings
– Storage Organization
• Activation records
– Storage Allocation Strategies
• Static allocation, stack allocation, heap allocation
Copyright (c) 2011 Ioanna Dionysiou
Source Language Issues
Before considering code generation
– Need to relate static program source to actions
that must take place at runtime to implement this
program
• Relationship between names and data objects
• Allocation and deallocation of data objects is managed
by runtime support package
– Routines loaded with the generated target code
• Depending on the language, the runtime environment
may be
– Fully static, stack-based or dynamic
– Type of environment determines the need to use stack, heap
or both
Copyright (c) 2011 Ioanna Dionysiou
Program and Procedures
Procedure definition
procedure name
associated with
procedure body
Procedure call
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
Formal parameters
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Actual parameters
Copyright (c) 2011 Ioanna Dionysiou
Program and Procedures
Procedure Activation
An execution of a
procedure body
Lifetime of Procedure
Activation
Sequence of steps between
the first and last statements
in the execution of the
procedure body, including
time spent executing other
procedures called by p
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
Formal parameters
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Actual parameters
Copyright (c) 2011 Ioanna Dionysiou
Flow of control
Assumptions about the flow of control among
procedures during program execution
– Control flows sequentially
– Execution of a procedure starts at the beginning
of procedure body and eventually returns control
to the point immediately following the place where
the procedure was called
• In this case we can use trees to illustrate the flow of
control in a program
• goto statement (transfers control to another point)
Copyright (c) 2011 Ioanna Dionysiou
Procedure Activation
Each time control enters a procedure q from
procedure p, it eventually returns to p (if no
errors occur)
– Activation of procedure p
– Activation of procedure q
– Return to activation of procedure p
Consider
– lifetime(p) and lifetime(q)
• Lifetime(p) and lifetime(q) are non-overlapping
• Lifetime(p) and lifetime(q) are nested
– See example above
Copyright (c) 2011 Ioanna Dionysiou
Recursive Procedures
A recursive procedure is a procedure that
calls itself
– A new activation begins before an earlier
activation of the same procedure has ended
– Several activations for procedure p exist at time t
A recursive procedure is a procedure p that
calls another procedure q, which in turns calls
procedure p
– Procedure p
Procedure q
• Call to procedure q
Copyright (c) 2011 Ioanna Dionysiou
Call to procedure p
Activation tree
We can use a tree (activation tree) to depict
the way control enters and leaves activations
How to build an activation tree
– Each node represents an activation of a
procedure
• The root represents the activation of the main program
• The node for a is the parent of node b iff control flows
from activation a to b
• The node for a is to the left of the node for b iff the
lifetime(a) occurs before lifetime(b)
Copyright (c) 2011 Ioanna Dionysiou
Activation Tree for a program
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Can you build the activation tree
for this program?
Use
s - activation of main program sort
r - activation of readarray
p - activation of partition
q(x,y) - activation of quicksort
Start with
s as the root of the tree
r and q(1,9) are its children, etc…
Copyright (c) 2011 Ioanna Dionysiou
Activation Tree for a program
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
s
r
q(1,9)
p(1,9)
q(1,3)
p(1,3)
p(2,3)
q(1,0)
q(5,9)
q(2,3) p(5,9)
q(2,1)
Copyright (c) 2011 Ioanna Dionysiou
q(3,3) p(7,9)
q(5,5)
q(7,9)
q(7,7)
q(9,9)
Control Stack and Activation Tree
Flow of control in a program
– Depth-first traversal of the activation tree
– Use a stack called control stack to keep track of
the live procedure activations
• Push node for an activation onto the stack as the
activation begins
• Pop node for an activation off the stack when the
activation ends
• Control Stack contents
– Node n is at top of the control stack
– Stack contains the nodes along the path from n to the root
Copyright (c) 2011 Ioanna Dionysiou
Activation Tree for a program
s
r
p(1,9)
p(1,3)
r, p(1,9), p(1,3) and q(1,0)
have executed to completion
(dashed lines)
q(1,9)
q(1,3)
q(1,0)
q(2,3)
Stack contains
q(2,3)
q(1,3)
q(1,9)
s
Copyright (c) 2011 Ioanna Dionysiou
top
Scope of declaration
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Scope of variables depends
on the language scope rules
(1) Local
(2) Nonlocal
Same name, different scope
Copyright (c) 2011 Ioanna Dionysiou
Bindings of names
Environment
– Function that maps a name to a storage location
• f: name  storage location
State
– Function that maps a storage location to a value
• g: storage location  value
g( f(name) ) = value
Copyright (c) 2011 Ioanna Dionysiou
Bindings of names
Consider storage location 100 is associated with
variable x and it currently holds value 0
– Environment
• x is bound to storage location 100
– State
• Value held is 0
Consider assignment statement x:=10
– Environment
• x is still bound to storage location 100
– State
• Value held is 10
Copyright (c) 2011 Ioanna Dionysiou
Bindings of names
A binding is the dynamic counterpart of a declaration
– Pascal local variable name in a procedure is bound to a different
storage location in each activation of a procedure
STATIC NOTION
DYNAMIC COUNTERPART
Definition of procedure
Activations of the procedure
Declaration of name
Bindings of the name
Scope of declaration
Lifetime of binding
Copyright (c) 2011 Ioanna Dionysiou
Lecture Outline
Source language issues
– Activation trees, control stacks, scope of
declaration, bindings
Storage Organization
– Activation records
Storage Allocation Strategies
– Static allocation, stack allocation, heap allocation
Copyright (c) 2011 Ioanna Dionysiou
Storage Organization
Suppose that the compiler obtains memory
from the OS so that it can execute the
compiled program
– Program gets loaded on a newly created process
This runtime storage must hold
– Generated target code
– Data objects
– A counterpart of the control stack to keep track of
procedure activations
Copyright (c) 2011 Ioanna Dionysiou
Runtime Memory
code for function 1
code for function 2
PASCAL and C use extensions of the control
stack to manage activations of procedures
...
code for function n
global / static area
stack
free space
Stack contains information about register values,
value of program counter and data objects whose
lifetimes are contained in that of an activation
Heap holds all other information. For example,
activations that cannot be represented as a tree.
By convention, stack grows down and the top of
the stack is drawn towards the bottom of this slide
(value of top is usually kept in a register)
heap
Copyright (c) 2011 Ioanna Dionysiou
Runtime Memory
code for function 1
code for function 1
code for function 2
code for function 2
...
...
code for function n
code for function n
global / static area
global / static area
stack
stack
Stack grows
free space
free space
heap
heap
Copyright (c) 2011 Ioanna Dionysiou
Activation Record
Information needed by a single
execution of a procedure is
managed using an activation
record or frame
– Not all compilers use all of the
fields
– Pascal and C push activation
record on the runtime stack
when procedure is called and
pop the activation record off the
stack when control returns to
the caller
Copyright (c) 2011 Ioanna Dionysiou
returned value
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
Activation Record
1)
Temporary values
e.g. those arising in the evaluation of
expressions
2)
actual parameters
Local data
Data that is local to an execution of the
procedure
3)
returned value
Saved machine status
State of the machine info before procedure
is called. Values of program counter and
machine registers that have to be restored
when control returns from the procedure
optional control link
optional access link
saved machine status
local data
temporaries
Copyright (c) 2011 Ioanna Dionysiou
Activation Record
4)
Access Link
refer to non-local data held in other activation records
5)
Control link
points to the activation record of the caller
6)
Actual parameters
actual parameters
optional control link
used by the calling procedure to supply parameters to the
called procedure
(in practice these are passed in registers)
7)
returned value
Returned value
optional access link
saved machine status
used by the called procedure to return a value to the
calling procedure
(in practice it is returned in a register)
local data
temporaries
Copyright (c) 2011 Ioanna Dionysiou
Local data
The field for local data is set when
declarations in a procedure are examined
during compile time
– Variable-length data is not stored here
– Keep a count of the memory locations that have
been allocated so far
– Determine a relative address (offset) of the
storage for a local with respect to some position
(e.g. beginning of the frame)
• Multibyte objects are stored in consecutive bytes and
given the address of the first byte
Copyright (c) 2011 Ioanna Dionysiou
Lecture Outline
Source language issues
– Activation trees, control stacks, scope of
declaration, bindings
Storage Organization
– Activation records
Storage Allocation Strategies
– Static allocation, stack allocation, heap allocation
Copyright (c) 2011 Ioanna Dionysiou
Storage-allocation strategies
There are three storage allocation strategies:
– Static allocation
– Stack-based allocation
– Dynamic (or heap-based) allocation
Copyright (c) 2011 Ioanna Dionysiou
Static Allocation
In a static environment (Fortran 77) there are
a number of restrictions:
– Size of data objects are known at compile time
– No recursive procedures
– No dynamic memory allocation
Only one copy of each procedure activation
record exists at time t
– We can allocate storage at compile time
• Bindings do not change at runtime
• Every time a procedure is called, the same bindings
occur
Copyright (c) 2011 Ioanna Dionysiou
Static Allocation
int i = 10;
code for function 1
int f(int j)
{
int k;
int m;
…
}
main()
{
int k;
f(k);
}
code main()
code for function 2
...
code f()
i (int)
main()
Activation
record
code for function n
global / static area
stack
k (int)
free space
f()
Activation
record
k (int)
m (int)
Copyright (c) 2011 Ioanna Dionysiou
heap
Stack-based Allocation
In a stack-based allocation, the previous
restrictions are lifted (Pascal, C, etc)
– procedures are allowed to be called recursively
• Need to hold multiple activation records for the same
procedure
• Created as required and placed on the stack
– Each record will maintain a pointer to the record that activated
it
– On completion, the current record will be deleted from the
stack and control is passed to the calling record
– Dynamic memory allocation is allowed
– Pointers to data locations are allowed
Copyright (c) 2011 Ioanna Dionysiou
Stack-based Allocation
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Position in
Activation Tree
s
Copyright (c) 2011 Ioanna Dionysiou
Activation Records
On Stack
s
a (array)
Stack-based Allocation
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Position in
Activation Tree
Activation Records
On Stack
s
s
r
Copyright (c) 2011 Ioanna Dionysiou
a (array)
r
i (integer)
Stack-based Allocation
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Position in
Activation Tree
Activation Records
On Stack
s
s
r
q(1,9)
Copyright (c) 2011 Ioanna Dionysiou
a (array)
q(1,9)
i (integer)
Stack-based Allocation
PROGRAM sort(input,output);
VAR a : array[0..10] of Integer;
PROCEDURE readarray;
VAR i : Integer;
BEGIN
for i:= 1 to 9 do read(a[i]);
END;
FUNCTION partition(y,z : Integer): Integer;
VAR i,j,x,v : Integer;
BEGIN
…
END;
PROCEDURE quicksort(m,n : Integer);
VAR i : Integer;
BEGIN
if (n > m) then BEGIN
i := partition(m,n);
quicksort(m, i-1);
quicksort(i+1,n)
END
END;
BEGIN /* of main */
a[0] := -9999; a[10] := 9999;
readarray;
quicksort(1,9)
END.
Position in
Activation Tree
Activation Records
On Stack
s
s
r
q(1,9)
p(1,9) q(1,3)
Copyright (c) 2011 Ioanna Dionysiou
a (array)
q(1,9)
i (integer)
q(1,3)
i (integer)
Calling Sequences
Procedure calls are
implemented by generating
calling sequences in the
target code
– Call sequence: allocates
activation record and enters
information into fields
– Return sequence: restores the
state of the machine so that
the calling procedure can
continue execution
Copyright (c) 2011 Ioanna Dionysiou
returned value
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
Calling Sequences
returned value
Why placing returned value and
actual parameters next to the
activation record of the caller?
– Caller can access these values
using offsets from its own activation
record
– No need to know the middle part of
the callee’s activation record
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
returned value
actual parameters
optional control link
optional access link
saved machine status
local data
temporaries
Copyright (c) 2011 Ioanna Dionysiou
caller
callee
Calling Sequences
returned value
How do we calculate offset?
actual parameters
– Maintain a register that points to
the end of the machine status
field in an activation record
– Top_sp is known to the caller, so
it can be responsible for setting it
before control flows to the called
procedure
– Callee can access its temporaites
and local data using offsets from
top_sp
optional control link
optional access link
saved machine status
local data
temporaries
returned value
actual parameters
optional control link
optional access link
top_sp
saved machine status
local data
temporaries
Copyright (c) 2011 Ioanna Dionysiou
caller
callee
Call Sequence
The caller evaluates actuals
The caller
– stores a return address and the old value of
top_sp into the callee’s activation record
– increments top_sp; that is moved past the caller’s
local data and temporaries and the callee’s
parameter and status fields
The callee
– saves register values and other status information
The callee
– initializes its local data and begins execution
Copyright (c) 2011 Ioanna Dionysiou
Return Sequence
The callee places a return value next to the
activation record of the caller
Using the information in the status field, the
callee
– restores top_sp and other registers
– braches to a return address in the caller’s code
Although top_sp has been decremented, the
caller can copy the returned value into its own
activation record and use it to evaluate an
expression
Copyright (c) 2011 Ioanna Dionysiou
Dangling References
Whenever storage is deallocated, the problem
of dangling references arises
– Occurs when there is a reference to storage that
has been deallocated
– Logical error
• Mysterious bugs can appear
Copyright (c) 2011 Ioanna Dionysiou
Dangling References
int *dangle()
{
int i = 23;
return &i;
}
What’s the problem?
main()
{
int *p;
p = dangle();
}
Copyright (c) 2011 Ioanna Dionysiou
Dangling References
Local variable i only exists in
dangle()
int *dangle()
{
int i = 23;
return &i;
}
main()
{
int *p;
p = dangle();
}
When procedure completes
execution and control is
transferred to main(), the space for
i does not exist anymore (pop
activation record for dangle off the
stack)
Pointer p is a dangling reference
Copyright (c) 2011 Ioanna Dionysiou
Dynamic Allocation
Returning the address of a local variable is defined to be a
logical error (e.g. in C)
In a dynamic environment there is no such restriction
– All variables and activation records must be maintained for as long as
there are references to them
• Callee outlives the caller
– It is also possible to return pointers to local functions
– Must deallocate space when procedures and variables are no longer
needed (garbage collection)
Copyright (c) 2011 Ioanna Dionysiou
Dynamic Allocation
Use a heap to maintain these records
– Also called free store
– Heap management is challenging
Copyright (c) 2011 Ioanna Dionysiou
Download