Synthesis and the Parse Tree.

© Allan C. Milne
Abertay University
v14.6.16
Agenda.
• Synthesis.
• Parse Trees.
• An Example.
• Representing The Tree.
Synthesis.
• The final phase of the compiler is to generate
an artefact.
• Requires synthesis of contextual information
on the syntactic structure of the program and
the meaning of user-defined names.
• This information is exposed to this synthesis
phase via
– the parse tree for the syntactic structure; and
– the symbol table for the identifier context.
What do We Know So Far?
• How to perform parsing.
• How to create a symbol table.
• Sometimes synthesis can be performed as
the parse proceeds.
• If so, then no parse tree is required and
appropriate artefact generation functions can
be called directly from the action code
associated with productions in the Yacc
script.
What Might We Need To Know?
• However, synthesis often requires knowledge
of the entire program structure before artefact
generation can proceed.
– This structure is represented by the parse tree.
• This latter approach requires us to know
– how to represent a parse tree;
– how to build the tree; and
– how to then process the tree.
The Parse Tree.
• This is a tree data structure representing the
syntactic structure of the input program.
• It effectively represents the derivation
sequence of the program.
• The root of the tree is the starter symbol.
• The branches are the elements of the production
being applied.
• Non-terminal elements have, in turn, their own
branches.
• The leaves are the terminal tokens of the source
program.
( ant, dog )
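[Figure: parse tree for ( ant, dog ). The root is <AnimalList>; the interior nodes are <MoreAnimals> and two <Animal> nodes; the leaves are the tokens "(", tANT, ",", tDOG and ")".]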
( ant, dog )
• In representing a parse tree we often omit the
terminal symbols that are ‘noise’ (syntactic
sugar).
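[Figure: the parse tree for ( ant, dog ) again, with root <AnimalList>, interior nodes <MoreAnimals> and two <Animal> nodes, and leaves tANT and tDOG; the punctuation tokens "(", "," and ")" are the noise terminals.]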
GenVal Examples.
• For GenVal, the parse tree does not need
to reflect the <Declarations> part of a script.
– The <Declarations> part processing constructs the symbol
table as the parse proceeds.
– The compiler does not need to refer back to this parse, only
to the symbol table.
– We can therefore start the tree from <StatementSequence>.
• Terminal keywords and punctuation will not be
represented in the parse tree except where
significant.
Generate 2 integer values from 0 to limit*6
<StatementSequence>
  <Statement>
    <Generator>
      <Expression>
        tNUMBER (2)
      <Type>
        tINTEGER
      <Range>
        <Expression>
          tNUMBER (0)
        <Expression>
          <Expression>
            tIDENTIFIER “Limit”
          *
          <Expression>
            tNUMBER (6)
Representing The Parse Tree.
• Use a child/sibling pointer model of the tree.
• A node represents a non-terminal or terminal
symbol.
• The child pointer of a non-terminal node
points to the first node of a list of nodes
representing the elements making up the
production of the non-terminal.
• The sibling pointer of any node points to the
node representing the next element of the
production being applied.
The Parse Tree Node.
• Represented by a struct containing
– the type of the node;
– the value associated with the node (only valid
for terminal nodes);
– the pointers associated with the child/sibling tree
structure:
  - a pointer to the start node of the sub-branch
    defining the structure (for a non-terminal);
  - a pointer to the next sibling node of the
    production for the parent node.
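struct treeNode {
    int type;                     /* which terminal or non-terminal this node is */
    union data {                  /* only valid for terminal nodes */
        double dblValue;
        char *strValue;
    } value;
    struct treeNode *structure;   /* first node of this non-terminal's production */
    struct treeNode *next;        /* next sibling in the parent's production */
};
typedef struct treeNode parseNode;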
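One way nodes of this shape might be created and linked is sketched below. The helpers newNode and addChild and the type codes NT_EXPRESSION and T_NUMBER are hypothetical names introduced for illustration; they do not come from the slides.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Node layout as defined on the slide. */
struct treeNode {
    int type;
    union data {
        double dblValue;
        char *strValue;
    } value;
    struct treeNode *structure;   /* first child */
    struct treeNode *next;        /* next sibling */
};
typedef struct treeNode parseNode;

/* Hypothetical node type codes (assumed, not from the slides). */
enum { NT_EXPRESSION = 1, T_NUMBER = 2 };

/* Hypothetical helper: allocate a node with no children and no siblings. */
parseNode *newNode(int type)
{
    parseNode *n = malloc(sizeof *n);
    if (n == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    n->type = type;
    n->value.dblValue = 0.0;
    n->structure = NULL;
    n->next = NULL;
    return n;
}

/* Hypothetical helper: append child as the last element of the
   production hanging off parent. The first child goes in
   parent->structure; later children are chained through next. */
void addChild(parseNode *parent, parseNode *child)
{
    assert(parent != NULL && child != NULL);
    if (parent->structure == NULL) {
        parent->structure = child;
        return;
    }
    parseNode *last = parent->structure;
    while (last->next != NULL)
        last = last->next;
    last->next = child;
}
```

Calling addChild(expr, a) and then addChild(expr, b) leaves expr->structure pointing at a and a->next pointing at b, which is the child/sibling arrangement described above. Walking to the end of the sibling list on every append keeps the sketch short; a builder that appends often might keep a tail pointer instead.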
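So We Have …
• Type: NTStatementSequence; Value: (empty); Structure: → NTStatement; Next: null
• Type: NTStatement; Value: (empty); Structure: → NTGenerator; Next: null
• Type: NTGenerator; Value: (empty); Structure: → NTExpression; Next: null
• Type: NTExpression; Value: (empty); Structure: → (continued); Next: → NTType
• Type: NTType; Value: (empty); Structure: → (continued); Next: → NTRange
• Type: NTRange; Value: (empty); Structure: → (continued); Next: null
… continued
• Type: NTExpression; Value: (empty); Structure: → NTNumber; Next: → NTType
• Type: NTNumber; Value: 2; Structure: null; Next: null
• Type: NTType; Value: (empty); Structure: → NTInteger; Next: → NTRange
• Type: NTInteger; Value: (empty); Structure: null; Next: null
• Type: NTRange; Value: (empty); Structure: → … exercise for the student; Next: null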
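The nodes listed above can be assembled and then processed with a simple recursive walk. The following is a minimal sketch under stated assumptions: the numeric values of the node type codes, the typeNames table, and the helpers makeNode, buildExample, printTree and countNodes are all hypothetical. The children of NTRange are deliberately left unbuilt, since they are the exercise.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct treeNode {
    int type;
    union data {
        double dblValue;
        char *strValue;
    } value;
    struct treeNode *structure;   /* first child */
    struct treeNode *next;        /* next sibling */
};
typedef struct treeNode parseNode;

/* Type names follow the diagram; the numeric values are assumed. */
enum nodeType {
    NTStatementSequence, NTStatement, NTGenerator, NTExpression,
    NTType, NTRange, NTNumber, NTInteger, NUM_TYPES
};

static const char *typeNames[NUM_TYPES] = {
    "NTStatementSequence", "NTStatement", "NTGenerator", "NTExpression",
    "NTType", "NTRange", "NTNumber", "NTInteger"
};

/* Hypothetical helper: allocate a node with the given child and sibling. */
static parseNode *makeNode(int type, parseNode *structure, parseNode *next)
{
    parseNode *n = malloc(sizeof *n);
    if (n == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    n->type = type;
    n->value.dblValue = 0.0;
    n->structure = structure;
    n->next = next;
    return n;
}

/* Build the part of the tree shown in the diagram, leaves first. */
parseNode *buildExample(void)
{
    parseNode *number = makeNode(NTNumber, NULL, NULL);
    number->value.dblValue = 2.0;
    parseNode *integer = makeNode(NTInteger, NULL, NULL);

    /* Siblings are linked right to left: Expression -> Type -> Range. */
    parseNode *range = makeNode(NTRange, NULL, NULL);  /* children: the exercise */
    parseNode *type = makeNode(NTType, integer, range);
    parseNode *count = makeNode(NTExpression, number, type);

    parseNode *generator = makeNode(NTGenerator, count, NULL);
    parseNode *statement = makeNode(NTStatement, generator, NULL);
    return makeNode(NTStatementSequence, statement, NULL);
}

/* Pre-order walk: print a node, then its children, then its siblings. */
void printTree(const parseNode *node, int depth)
{
    for (; node != NULL; node = node->next) {
        assert(node->type >= 0 && node->type < NUM_TYPES);
        printf("%*s%s", depth * 2, "", typeNames[node->type]);
        if (node->type == NTNumber)
            printf(" (%g)", node->value.dblValue);
        printf("\n");
        printTree(node->structure, depth + 1);
    }
}

/* Count every node reachable through structure and next pointers. */
int countNodes(const parseNode *node)
{
    int total = 0;
    for (; node != NULL; node = node->next)
        total += 1 + countNodes(node->structure);
    return total;
}
```

Calling printTree(buildExample(), 0) prints one node per line, indented two spaces per level, starting at NTStatementSequence and ending at NTRange. The same loop-over-siblings, recurse-into-structure pattern is what an artefact generator would follow, with the printf calls replaced by generation code.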