Set 6 Debugging with Lex & Yacc

DEBUGGING YOUR GRAMMAR WITH YACC

Assume your Yacc definition file is called “Test”.

1. OBTAINING THE PARSING MACHINE

Run:

    yacc -v Test

A representation of the parsing machine will be written to the file y.output.

2. OBTAINING A TRACE OF THE YACC PARSE

(a) In the main program in Test, set yydebug before calling the parser:

    yydebug = 1;
    . .
    yyparse();

(b) Now run:

    yacc -t Test

Note: The yacc options -d, -v and -t can be combined, e.g. you can run:

    yacc -dvt Test

SELECTING WHAT TO TRACE

1. You can switch tracing on or off by setting yydebug to 1 or 0 respectively.

2. To control which portions of the source program are traced, include the following among your productions for a statement:

    statement : switchon ';'    { yydebug = 1; }
              | switchoff ';'   { yydebug = 0; }
              ;

You can then insert these statements into your source program to bracket the portions that you wish to trace.

RECOVERING FROM SOURCE PROGRAM ERRORS

When Yacc reaches a state of the parsing machine that has no default reduction and no transition defined for the next input symbol from the source program, it calls yyerror with the string argument “syntax error”.

But the user’s source program may contain many errors, and our compiler should attempt to report as many as possible on each run. Accordingly, we need a way of bypassing the statement containing the error, so that we can continue parsing the rest of the source program and detect further errors.

This is accomplished using the reserved symbol “error” within the grammar. If the Yacc definition file contains the “production”:

    AA : BB error CC
or
    AA : error CC

the C program produced by Yacc will do the following on detecting a syntax error in the source:

(a) It will call yyerror to report “syntax error”.

(b) It will pop items off both the state-number stack and the symbol stack until it has popped a state which is a BB-successor of other states or, if BB is omitted, a state which is an error-successor of other states.

    Note 1. If we were using the symbol stack as we first did in class, this would be equivalent to popping items from both stacks until a “BB” or “error” was popped from the symbol stack.

    Note 2. No state can be a successor of other states with respect to more than one symbol.

(c) It will now perform an AA-transition from the state which is now at the top of the stack.

(d) It will then read symbols from the source and discard them until it has read and discarded the symbol CC. At this stage, it resumes parsing the source in the usual manner.

MOST COMMON USE

The most common use of the “error” symbol is in conjunction with productions for lists of statements, e.g.:

    statement_list : statement_list statement
                   | statement
                   | error ';'
                   ;

Productions of this kind occur in many places, including “if” statements, “while” statements etc. Assuming that all statements end in a semicolon, this in effect:

(a) pops off all the state numbers involved in reading the current statement up to the point where a syntax error was detected, and then

(b) discards all further source symbols until after a semicolon is read.
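
ILLUSTRATION: SKIPPING TO THE NEXT SEMICOLON

The effect of the error ';' production can be imitated by hand to see why it lets one run report several errors. The C sketch below is not the code that Yacc generates and does not model the state stack. It is a small hand-written loop over an array of tokens, under the assumption of a toy language whose only statement has the form ID '=' NUM ';'. The token codes and the function names parse_statement and parse_statement_list are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token codes for a toy language (an assumption of this
   sketch; they are not produced by any real Lex specification). */
enum token { T_ID, T_ASSIGN, T_NUM, T_SEMI, T_END };

/* Try to match one statement of the form  ID '=' NUM ';'  starting at *pos.
   Returns 1 and leaves *pos just past the ';' on success. Returns 0 on a
   syntax error, leaving *pos at the offending token. */
static int parse_statement(const enum token *toks, size_t *pos)
{
    static const enum token shape[] = { T_ID, T_ASSIGN, T_NUM, T_SEMI };
    for (size_t i = 0; i < sizeof shape / sizeof shape[0]; i++) {
        if (toks[*pos] != shape[i])
            return 0;               /* T_END never matches, so we stop there */
        (*pos)++;
    }
    return 1;
}

/* Parse a T_END-terminated token array. On each syntax error, report it,
   abandon the current statement, and discard input up to and including
   the next ';', in the spirit of  statement_list : error ';' .
   Returns the number of errors reported. */
int parse_statement_list(const enum token *toks)
{
    assert(toks != NULL);
    size_t pos = 0;
    int errors = 0;

    while (toks[pos] != T_END) {
        if (parse_statement(toks, &pos))
            continue;               /* statement accepted, go on to the next */

        errors++;
        fprintf(stderr, "syntax error near token %zu\n", pos);

        /* Discard further source symbols until a semicolon has been read. */
        while (toks[pos] != T_END && toks[pos] != T_SEMI)
            pos++;
        if (toks[pos] == T_SEMI)
            pos++;
    }
    return errors;
}
```

Calling parse_statement_list on a token array containing two malformed statements separated by a correct one reports both errors in a single pass, which is the behaviour the error ';' production is meant to give. A parser without the skipping loop would have to stop at the first error.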