• Review: Syntax directed translation.

– Translation is done according to the parse tree.
• Each production (when used in the parsing) is a substructure of the parse tree.
• Attributes are associated with grammar symbols
– Each grammar symbol represents a construct of the
program.
– Attributes represent the results of translation for the
construct.
» E.g.: translation = constructing the syntax tree; use a tree
attribute with each symbol.
» E.g.: translation = calculating the result of the expression; use
a val attribute to represent the result.
• Semantic rules tell what to do (how to compute the
related attributes) when the substructure is found.
• Two types of attributes:
– Synthesized attribute:
• Associated with the left hand side symbol of a
production.
• the value depends on the attributes associated with
the symbols in the right hand side of the production
(attributes of its children nodes in the parse tree).
– Inherited attribute:
• Associated with a symbol on the right hand side of a
production.
• The value depends on the attributes of its parent or
sibling nodes in the parse tree.
• Two ways to define the translation:
– Syntax directed definition.
• Just define the attributes and semantic rules without
specifying the order in which to evaluate the rules.
– The order is implicit in the rules
• To realize a general syntax directed definition, the
compiler needs to conceptually do the following:
– Build the parse tree -> topologically sort the nodes based
on the implicit order -> evaluate the attributes.
– Doing all of this explicitly is not efficient.
• Some special definitions can be implemented
efficiently without actually building the parse tree.
– S-attributed definitions.
– L-attributed definitions.
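The general procedure above (build the tree, topologically sort, evaluate) can be sketched in Python. This is only an illustration under stated assumptions: the Node class and the hand-built, simplified tree for 3 * 5 + 4 are hypothetical, and the ordering comes from graphlib.TopologicalSorter in the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Node:
    """Tree node with one synthesized attribute (hypothetical helper)."""

    def __init__(self, label, children=(), lexval=None):
        self.label = label
        self.children = list(children)
        self.lexval = lexval
        self.val = None  # filled in by evaluate()


def all_nodes(root):
    """Yield every node of the tree."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def evaluate(root):
    nodes = list(all_nodes(root))
    by_id = {id(n): n for n in nodes}
    # Dependency graph: a node's val depends on the val of its children.
    graph = {id(n): {id(c) for c in n.children} for n in nodes}
    # static_order() lists a node only after everything it depends on.
    for key in TopologicalSorter(graph).static_order():
        node = by_id[key]
        if node.label == "digit":
            node.val = node.lexval
        elif node.label == "+":
            node.val = node.children[0].val + node.children[1].val
        elif node.label == "*":
            node.val = node.children[0].val * node.children[1].val
    return root.val


# Simplified tree for 3 * 5 + 4 (operator and digit nodes only).
product = Node("*", [Node("digit", lexval=3), Node("digit", lexval=5)])
tree = Node("+", [product, Node("digit", lexval=4)])
result = evaluate(tree)
print(result)
```

With only synthesized attributes the topological order is just a bottom-up traversal, which is why S-attributed definitions can skip the explicit sort.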
• Two ways to define the translation (continued):
– Syntax directed translation scheme.
• Not only defines the attributes and the semantic
rules, but also specifies the order in which the
semantic rules should be applied.
• Realizing an S-attributed definition in an LR
parser:
– Extend the stack to have an additional field
(val) for the S-attribute.
Parser stack (top marks the top of the stack):
          State      val
          …          …
          (X, sx)    X.x
          (Y, sy)    Y.y
top ->    (Z, sz)    Z.z
          …          …
• Realizing an S-attributed definition in an LR
parser (example 5.17 at page 296):
Syntax directed definition:
L -> E n {print(E.val)}
E -> E1 + T {E.val = E1.val + T.val}
E -> T {E.val = T.val}
T -> T1 * F {T.val = T1.val * F.val}
T -> F {T.val = F.val}
F -> ( E ) {F.val = E.val}
F -> digit {F.val = digit.lexval}
Implementation with the val field of the parser stack:
L -> E n {print(val[top]);}
E -> E1 + T {val[top-2] = val[top-2] + val[top];}
E -> T
T -> T1 * F {val[top-2] = val[top-2] * val[top];}
T -> F
F -> ( E ) {val[top-2] = val[top-1];}
F -> digit
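The stack actions above can be tried out with a small Python sketch. It is not an LR parser: the shift/reduce moves are written out by hand (a real parser would derive them from its tables), and the function name run and the move encoding are assumptions of this example. Because the sketch also keeps a val slot for the token n, it reads E.val from one slot below the top when reducing by L -> E n.

```python
def run(moves):
    """Replay shift/reduce moves, keeping only the val field of the stack."""
    val = []      # val[-1] plays the role of val[top]
    printed = []  # values the L production would print
    for kind, arg in moves:
        if kind == "shift":
            # A digit carries its lexval; other tokens carry None.
            val.append(arg)
        elif arg in ("F -> digit", "T -> F", "E -> T"):
            pass  # the value is already in the right slot
        elif arg == "T -> T * F":
            val[-3:] = [val[-3] * val[-1]]
        elif arg == "E -> E + T":
            val[-3:] = [val[-3] + val[-1]]
        elif arg == "F -> ( E )":
            val[-3:] = [val[-2]]
        elif arg == "L -> E n":
            printed.append(val[-2])  # E.val sits below the slot for n
            val[-2:] = [None]
        else:
            raise ValueError(f"unknown production: {arg}")
    return printed


# Moves an LR parser would make on the input 3 * 5 + 4 n.
moves = [
    ("shift", 3), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("shift", None),                       # *
    ("shift", 5), ("reduce", "F -> digit"), ("reduce", "T -> T * F"),
    ("reduce", "E -> T"),
    ("shift", None),                       # +
    ("shift", 4), ("reduce", "F -> digit"), ("reduce", "T -> F"),
    ("reduce", "E -> E + T"),
    ("shift", None),                       # n
    ("reduce", "L -> E n"),
]
output = run(moves)
print(output)
```

Each reduction replaces the slots of the right hand side with one slot for the left hand side, so no parse tree is ever built.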
• YACC allows only synthesized attributes
– It can also handle special types of L-attributes
• An attribute can depend on the attributes of the
siblings to its left.
– Those attributes are already on the stack. To access
them, use $i with i <= 0. See the example yacc_inherit.y
– Using this is somewhat tricky: you need to make sure the
context of a production (what lies below it on the stack) is
exactly the same wherever the production is used.
» Markers are needed in many cases.
– Or pass the attributes through global variables. This is also
tricky.
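The idea behind $i with i <= 0 can be sketched in Python: at a reduction, the slot just below the handle plays the role of $0. The grammar D -> T L, L -> L , id | id, T -> int | real is assumed here for illustration and is not taken from yacc_inherit.y; the moves are written out by hand and the function name declare is hypothetical.

```python
def declare(moves):
    """Replay moves for D -> T L, L -> L , id | id, T -> int | real."""
    val = []    # value field of the parser stack
    types = {}  # name -> type, filled in by the L productions
    for kind, arg in moves:
        if kind == "shift":
            val.append(arg)
            continue
        lhs, rhs_len = arg
        start = len(val) - rhs_len
        below = val[start - 1] if start > 0 else None  # like $0 in yacc
        handle = val[start:]                           # like $1 .. $n
        if lhs == "T":
            result = handle[0]         # T.type is the keyword itself
        elif lhs == "L":
            types[handle[-1]] = below  # the id gets the type found at $0
            result = None
        else:                          # D -> T L
            result = None
        val[start:] = [result]
    return types


# Moves for the input: real p , q , r
moves = [
    ("shift", "real"), ("reduce", ("T", 1)),
    ("shift", "p"), ("reduce", ("L", 1)),
    ("shift", ","), ("shift", "q"), ("reduce", ("L", 3)),
    ("shift", ","), ("shift", "r"), ("reduce", ("L", 3)),
    ("reduce", ("D", 2)),
]
types = declare(moves)
print(types)
```

This works only because T.type is always directly below the handle of either L production, which is the "same context" requirement in the bullets above.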
• Static checking and symbol tables
• Chapter 6, Section 7.6 and Section 8.2
• Static checking: check whether the program follows
both the syntactic and semantic conventions at
compile time (versus dynamic checking -- check at
run time).
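• Examples of static checking
– Type checks: operands and assignments must have compatible types.
    int a, b[10], c;
    …
    a = b + c;    /* error: b is an array, not an int */
– Flow of control checks: control must have somewhere legal to go.
    main() {
        int i;
        ….
        i++;
        break;    /* error: no enclosing loop or switch */
    }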
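• Examples of static checking (continued)
– Uniqueness check: a name is declared at most once in a scope.
    main() {
        int i, j;
        double i, j;    /* error: i and j are declared twice */
        ….
    }
– Defined before use: a name must be declared before it is used.
    main() {
        int i;
        i1 = 0;    /* error: i1 is not declared */
        ….
    }
– Name related check: the same name must appear in matching places.
    LOOPA:
    LOOP
        EXIT WHEN I=N
        I=I+1;
    END LOOP LOOPB;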
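The uniqueness and defined-before-use checks can be sketched in a few lines of Python. The statement encoding ("declare" and "use" tuples) and the function name check are assumptions made for this example; a compiler would do the same bookkeeping while walking the parse tree.

```python
def check(statements):
    """Return error messages for a flat list of statements.

    Each statement is ("declare", name, type) or ("use", name).
    """
    declared = {}
    errors = []
    for stmt in statements:
        if stmt[0] == "declare":
            _, name, typ = stmt
            if name in declared:
                errors.append(f"{name} is declared more than once")
            else:
                declared[name] = typ
        else:
            _, name = stmt
            if name not in declared:
                errors.append(f"{name} is used before it is declared")
    return errors


# Mirrors the two C examples: duplicate i, j and the undeclared i1.
program = [
    ("declare", "i", "int"), ("declare", "j", "int"),
    ("declare", "i", "double"), ("declare", "j", "double"),
    ("use", "i1"),
]
errors = check(program)
for message in errors:
    print(message)
```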
– Some checks can only be done at run time:
• array bounds checking in Java:
a[i] = 0;
– To perform static checks, semantic information
must be recorded somewhere -- the symbol table.
• The grammar specifies the syntax; additional (semantic)
information, sometimes called attributes, must be
recorded in the symbol table for all identifiers.
• Typically, the attributes in a symbol table entry include
the type and the offset (where in memory can I find this
variable?).
– struct {int id; int type; int offset;} stentry;
• Organization of a symbol table:
– Basic requirement: must be able to find the information
associated with a symbol (identifier) quickly.
– Example organizations: array, linked list, hash table.
– Provides two functions: enter(table, name, type, offset)
and lookup(name).
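A hash table organization can be sketched with a Python dict, which gives fast lookup by name. This is a minimal sketch, not a prescribed layout: the Entry class is hypothetical, and lookup takes the table as an explicit first argument, unlike the lookup(name) form on the slide.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One symbol table entry (hypothetical layout for this sketch)."""
    name: str
    type: str
    offset: int


def enter(table, name, type_, offset):
    """Record a name with its type and offset in the table."""
    table[name] = Entry(name, type_, offset)


def lookup(table, name):
    """Return the entry for name, or None if it was never entered."""
    return table.get(name)


table = {}  # a dict is a hash table keyed by identifier
enter(table, "j", "real", 40)
enter(table, "k", "integer", 48)
found = lookup(table, "j")
missing = lookup(table, "zz")
print(found, missing)
```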
– Dealing with nested scope:
    program sort(input, output)
        var a: array [0..10] of integer;
            x: integer;
        procedure readarray
            var x : real;
            begin …. x …. end
        procedure quicksort(i, j)
            begin … x … end
        begin … x … end

    main() {
        int a, b;
        a = 0;
        {
            int a;
            a = 1;
        }
        printf("a = %d\n", a);
    }
– How to organize the symbol table?
– How to do lookup and enter?
• One symbol table for each scope (procedure, blocks)?
• Maintain a stack of symbol tables for lookup/enter
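One way to answer the questions above is a stack of per-scope tables: enter writes to the innermost table and lookup searches from the innermost table outward. The ScopeStack class below is a hypothetical sketch of that idea; the example mirrors the C program with the two declarations of a.

```python
class ScopeStack:
    """Stack of symbol tables, one per open scope (hypothetical sketch)."""

    def __init__(self):
        self.tables = [{}]  # start with the outermost scope

    def push_scope(self):
        self.tables.append({})

    def pop_scope(self):
        self.tables.pop()

    def enter(self, name, info):
        # New names always go into the innermost scope.
        self.tables[-1][name] = info

    def lookup(self, name):
        # Search from the innermost scope outward.
        for table in reversed(self.tables):
            if name in table:
                return table[name]
        return None


scopes = ScopeStack()
scopes.enter("a", "outer a")
scopes.enter("b", "outer b")
scopes.push_scope()            # the inner { ... } block
scopes.enter("a", "inner a")
inside = scopes.lookup("a")    # the inner declaration hides the outer one
inside_b = scopes.lookup("b")  # b is found in the enclosing scope
scopes.pop_scope()
outside = scopes.lookup("a")   # back to the outer declaration
print(inside, inside_b, outside)
```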
• Symbol tables for sort:
    Symbol table for sort (header, link to enclosing table: nil)
        a ...
        x ...
        readarray   -> symbol table for readarray
        quicksort   -> symbol table for quicksort
    Symbol table for readarray (header)
        x ….
    Symbol table for quicksort (header)
• How does the compiler create the symbol table?
– First let us consider the simple case: no nested scope, everything
is entered into one symbol table, table, by using
• enter(table, id, type, offset)
– Grammar:
P -> D
D -> D ; D
D -> id : T
T -> integer
T -> real
T -> array [num] of T
T -> ^T
– Example declarations and the resulting symbol table entries:
I : array [10] of integer;
j : real;
k : integer

name   type                 offset
I      array(10, integer)   0
j      real                 40
k      integer              48
P -> {offset = 0;} D
D -> D ; D
D -> id : T {enter(table, id.name, T.type, offset);
offset = offset + T.width;}
T -> integer {T.type = integer; T.width = 4;}
T -> real {T.type = real; T.width = 8;}
T -> array [num] of T1 {T.type = array(num.val, T1.type);
T.width = num.val * T1.width;}
T -> ^T1 {T.type = pointer(T1.type); T.width = 4;}
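The offset computation in these rules can be checked with a short Python sketch. It works on already-parsed declarations rather than inside a parser; the tuple encoding of types and the function names type_of and process are assumptions of this example. The widths (4 for integer and pointer, 8 for real) are the ones used in the rules above.

```python
def type_of(spec):
    """Return (type expression, width) for a type spec.

    spec is "integer", "real", ("array", num, spec) or ("pointer", spec).
    """
    if spec == "integer":
        return "integer", 4
    if spec == "real":
        return "real", 8
    if isinstance(spec, tuple) and spec[0] == "array":
        _, num, elem = spec
        elem_type, elem_width = type_of(elem)
        return f"array({num}, {elem_type})", num * elem_width
    if isinstance(spec, tuple) and spec[0] == "pointer":
        elem_type, _ = type_of(spec[1])
        return f"pointer({elem_type})", 4
    raise ValueError(f"unknown type spec: {spec!r}")


def process(declarations):
    """Enter each (name, spec) pair and advance the running offset."""
    table = []
    offset = 0                                  # P -> {offset = 0;} D
    for name, spec in declarations:
        typ, width = type_of(spec)
        table.append((name, typ, offset))       # enter(table, ...)
        offset = offset + width
    return table, offset


decls = [("I", ("array", 10, "integer")), ("j", "real"), ("k", "integer")]
table, total = process(decls)
for row in table:
    print(row)
```

The three rows reproduce the offsets 0, 40 and 48 from the example declarations.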
– Now consider the case when you have nested
procedures (blocks can be considered as special
procedures)
• must maintain a stack of symbol tables, creating a new one when
entering a new procedure
• must reset offset when entering a new procedure (a stack of
offsets)
• Let us also compute the total size of a table
– Grammar:
P -> D
D -> D ; D
D -> id : T
D -> proc id ; D ; S
T -> integer | real | array [num] of T | ^T
• mktable(previous): make a new table and properly set all its links
(previous is the enclosing table) and related information.
• enter(table, name, type, offset): enter a name with its type and
offset into the table.
• addwidth(table, width): record the total memory needed by the
entries of the symbol table.
• enterproc(table, name, newtable): enter the procedure name
with its symbol table into the old table.
– Grammar with semantic actions:
P -> {t = mktable(nil); push(t, tblptr); push(0, offset);} D
{addwidth(top(tblptr), top(offset));}
D -> D ; D
D -> id : T {enter(top(tblptr), id.name, T.type, top(offset));
top(offset) = top(offset) + T.width;}
D -> proc id ; {t = mktable(top(tblptr)); push(t, tblptr); push(0, offset);}
D ; S {t = top(tblptr); addwidth(t, top(offset));
pop(tblptr); pop(offset); enterproc(top(tblptr), id.name, t);}
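The interplay of the two stacks can be sketched in Python. The four helpers follow the names used above, but their bodies, the Table class, the pre-parsed declaration tuples and the restriction to integer and real are assumptions of this example, not the actual implementation.

```python
class Table:
    """Symbol table with a link to the enclosing table (hypothetical)."""

    def __init__(self, previous):
        self.previous = previous
        self.entries = {}
        self.width = 0


def mktable(previous):
    return Table(previous)


def enter(table, name, typ, offset):
    table.entries[name] = (typ, offset)


def addwidth(table, width):
    table.width = width


def enterproc(table, name, newtable):
    table.entries[name] = ("proc", newtable)


WIDTH = {"integer": 4, "real": 8}


def declare(decls, tblptr, offset):
    for decl in decls:
        if decl[0] == "var":                      # D -> id : T
            _, name, typ = decl
            enter(tblptr[-1], name, typ, offset[-1])
            offset[-1] += WIDTH[typ]
        else:                                     # D -> proc id ; D ; S
            _, name, nested = decl
            tblptr.append(mktable(tblptr[-1]))
            offset.append(0)
            declare(nested, tblptr, offset)
            t = tblptr[-1]
            addwidth(t, offset[-1])
            tblptr.pop()
            offset.pop()
            enterproc(tblptr[-1], name, t)


def program(decls):
    tblptr, offset = [mktable(None)], [0]         # P -> {...} D
    declare(decls, tblptr, offset)
    addwidth(tblptr[-1], offset[-1])
    return tblptr[-1]


root = program([
    ("var", "a", "integer"),
    ("var", "x", "integer"),
    ("proc", "readarray", [("var", "x", "real")]),
    ("proc", "quicksort", [("var", "k", "integer"), ("var", "v", "integer")]),
])
readarray = root.entries["readarray"][1]
print(root.width, readarray.width, readarray.entries)
```

Each procedure gets its own offset counter starting at 0, so the x in readarray does not disturb the offsets in the enclosing table.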
• Dealing with structures (records):
– T -> record D end
– Make a new symbol table for all the fields in the record.
T -> record
{
t=mktable(nil);
push(t, tblptr);
push(0, offset);
}
D end
{
T.type = record(top(tblptr));
T.width = top(offset);
pop(tblptr);
pop(offset);
}
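The record rule can be sketched in Python as a recursive function: each record gets a fresh field table and a fresh offset starting at 0, and the final offset becomes the width of the record. Recursion stands in for the tblptr and offset stacks here; the tuple encoding and the name type_and_width are assumptions of this example, with only integer (4) and real (8) as base types.

```python
WIDTH = {"integer": 4, "real": 8}


def type_and_width(spec):
    """Return (type, width) for a spec.

    spec is "integer", "real" or ("record", [(field, spec), ...]).
    """
    if isinstance(spec, str):
        return spec, WIDTH[spec]
    _, fields = spec
    field_table = {}  # plays the role of t = mktable(nil)
    offset = 0        # plays the role of push(0, offset)
    for name, field_spec in fields:
        typ, width = type_and_width(field_spec)
        field_table[name] = (typ, offset)
        offset += width
    # T.type = record(table); T.width = final offset
    return ("record", field_table), offset


point = ("record", [("a", "real"), ("b", "integer")])
spec = ("record", [("x", "integer"), ("y", "real"), ("inner", point)])
rec_type, rec_width = type_and_width(spec)
fields = rec_type[1]
print(rec_width, fields["y"], fields["inner"][1])
```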
Question:
How does allowing variable declarations anywhere in a program (as
in C++ and Java) affect the maintenance of the symbol tables?