
Set 6
Debugging with Lex & Yacc
DEBUGGING YOUR GRAMMAR WITH YACC
Assume your Yacc definition file is called “Test”.
1. OBTAINING THE PARSING MACHINE
Employ: yacc -v Test
A representation of the parsing machine
will be written to the file y.output.
2. OBTAINING A TRACE OF THE YACC PARSE
(a) In your main program in Test, employ:
yydebug = 1;
.
.
yyparse();
(b) Now employ:
yacc -t Test
Note: The yacc options -d, -v, -t can be combined,
e.g. you can employ: yacc -dvt Test
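In step (a), yydebug need not be hard-coded to 1. A minimal sketch, assuming you want to choose tracing when the compiler is started: the helper name trace_requested and the flag "--trace" are hypothetical, not part of Yacc. The parser must still be generated with yacc -t for the setting to have any effect.

```c
#include <string.h>

/* Returns 1 if the (hypothetical) flag "--trace" appears among the
 * command-line arguments, and 0 otherwise. */
int trace_requested(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--trace") == 0)
            return 1;
    }
    return 0;
}

/* Intended use inside the main program of Test, where yydebug and
 * yyparse come from the parser that Yacc generates:
 *
 *     yydebug = trace_requested(argc, argv);
 *     yyparse();
 */
```

With this in place the same executable can be run with or without a trace, instead of being rebuilt each time.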
SELECTING WHAT TO TRACE
1. You can switch tracing on or off by setting yydebug to 1 or
0 respectively.
2. One way to control which portion(s) of the source are traced
is to include among your productions for a statement:
statement : switchon ';' {yydebug = 1;}
          | switchoff ';' {yydebug = 0;}
          ;
Here switchon and switchoff are assumed to be tokens recognized
by your Lex definitions.
You can then insert these statements into your source to
bracket the portions that you wish to trace.
RECOVERING FROM SOURCE PROGRAM ERRORS
When the parser generated by Yacc reaches a
state of the parsing machine that has no default
reduction and no transition defined for the next
input symbol from the source program, it calls
yyerror with the string argument “syntax error”.
But the user’s source program may have many
errors, and our compiler should attempt to
report as many as possible at each run.
Accordingly, we need a way of bypassing the
statement containing the error, so that we can
continue to parse the rest of the source
program in order to detect additional errors.
This is accomplished using the reserved token
name “error” within the grammar.
If the Yacc definition file contains the “production”:
AA : BB error CC
or AA : error CC
the C-program produced by Yacc will do the
following on detecting a syntax error in the
source:
(a) It will call yyerror to report “syntax error”.
(b) It will pop items off both the state-number and symbol
stacks, until it has popped a state which is a
BB-successor of other states, or, if BB is omitted, a
state which is an error-successor of other states.
Note 1. If we were using a symbol stack as we
first did in class, the above would be equivalent
to popping items from both stacks until a “BB”
or “error” was popped from the symbol stack.
Note 2. No state can be a successor of other
states with respect to more than one symbol.
(c) It will now perform an AA-transition from the
state which is now at the top of the stack
(d) It will then start inputting symbols from the
source and discarding them until it has read and
discarded the symbol CC.
At this stage, it will resume parsing the source
in the usual manner.
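The recovery steps can be mimicked in plain C. The sketch below is not the code that Yacc generates: the function names, the has_error_shift table and the integer token codes are all assumptions made for illustration. It follows the usual description of Yacc's mechanism, in which states are popped until the one on top of the stack has a transition on “error”, and input is then discarded up to and including the synchronizing symbol CC. For a single-token CC this should leave the parser in the same configuration as steps (b) to (d).

```c
/* Pop states until the state on top of the stack can shift `error`.
 * stack[0..height-1] holds state numbers, bottom first.
 * has_error_shift[s] is nonzero if state s has a transition on `error`.
 * Returns the new stack height, or 0 if no state qualifies
 * (a real parser would then give up on the parse). */
int pop_to_error_state(const int *stack, int height,
                       const int *has_error_shift)
{
    while (height > 0 && !has_error_shift[stack[height - 1]])
        height--;
    return height;
}

/* Discard input tokens, starting at index pos, until the synchronizing
 * token sync has been read and discarded.
 * Returns the index of the first token after it, or n if the input
 * ran out before sync was seen. */
int skip_past(const int *tokens, int n, int pos, int sync)
{
    while (pos < n && tokens[pos] != sync)
        pos++;
    return pos < n ? pos + 1 : n;
}
```

For instance, with the stack holding states 0, 2, 4, 5 and only state 2 able to shift “error”, pop_to_error_state leaves two states on the stack. With the remaining input x y ; z and ';' as the synchronizing token, skip_past resumes parsing at z.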
MOST COMMON USE
The most common use of the “error” symbol is
in conjunction with productions for lists of
statements, e.g.:
statement_list : statement_list statement
               | statement
               | error ';'
               ;
Productions of this kind occur in many places,
including “if” statements, “while” statements etc.
Assuming that all statements end in a
semicolon, this in effect:
(a) pops off all the state numbers involved in
reading the current statement up to the point
where a syntax error was detected, and then
(b) discards all further source symbols until after
a semicolon is read.
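The effect of the error ';' production can be imitated by a small hand-written checker. This is only an analogue of the technique (skip to the next semicolon and carry on), not a Yacc parser. The one-character token codes, the statement form i = n ; and the function names are assumptions chosen to keep the sketch short.

```c
#include <stdio.h>

/* Each character stands for one token:
 *   'i' identifier, 'n' number, '=' and ';' themselves.
 * A statement has the fixed form:  i = n ;
 * On success *p is advanced past the statement and 1 is returned.
 * On a syntax error *p is left at the offending token and 0 is returned. */
static int parse_statement(const char **p)
{
    static const char pattern[] = "i=n;";
    for (int k = 0; pattern[k] != '\0'; k++) {
        if (**p != pattern[k])
            return 0;
        (*p)++;
    }
    return 1;
}

/* Parses a whole statement list and returns the number of syntax errors.
 * After each error, all tokens up to and including the next ';' are
 * discarded, so later statements are still checked. */
int count_syntax_errors(const char *tokens)
{
    int errors = 0;
    const char *p = tokens;
    while (*p != '\0') {
        if (parse_statement(&p))
            continue;
        errors++;
        fprintf(stderr, "syntax error at token %d\n", (int)(p - tokens));
        while (*p != '\0' && *p != ';')
            p++;                /* discard the rest of the statement */
        if (*p == ';')
            p++;                /* discard the semicolon as well */
    }
    return errors;
}
```

Called on the token string "i=n;in;i=n;=n;", count_syntax_errors returns 2: both faulty statements are reported in a single run, which is the purpose of the recovery.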