Yacc Example: Calculator - College of Engineering and

Compiler Design
Yacc Example
"Yet Another Compiler Compiler"
Kanat Bolazar
Lex and Yacc
• Two classical tools for compilers:
– Lex: A Lexical Analyzer Generator
– Yacc: “Yet Another Compiler Compiler” (Parser Generator)
• Lex creates programs that scan your tokens one by one.
• Yacc takes a grammar (sentence structure) and generates a
parser.
[Diagram: Lex takes the lexical rules and produces yylex(); Yacc takes the grammar rules and produces yyparse(). Input -> yylex() -> yyparse() -> Parsed Input.]
Lex and Yacc
• Lex and Yacc generate C code for your analyzer & parser.
[Diagram: Lex turns the lexical rules into C code for the lexical analyzer (tokenizer), yylex(). Yacc turns the grammar rules into C code for the parser, yyparse(). Input (char stream) -> yylex() -> token stream -> yyparse() -> Parsed Input.]
Flex, Yacc, Bison, Byacc
• Often, instead of the standard Lex and Yacc, Flex and Bison
are used:
– Flex: A fast lexical analyzer
– (GNU) Bison: A drop-in replacement for (backwards compatible
with) Yacc
• Byacc is the Berkeley implementation of Yacc (so it is Yacc).
• Resources:
http://en.wikipedia.org/wiki/Flex_lexical_analyser
http://en.wikipedia.org/wiki/GNU_Bison
• The Lex & Yacc Page (manuals, links):
http://dinosaur.compilertools.net/
Yacc: A Standard Parser Generator
• Yacc is not a new tool, and yet, it is still used in many projects.
• Yacc syntax is similar to Lex/Flex at the top level.
• Lex/Flex rules were regular expression – action pairs.
• Yacc rules are grammar rule – action pairs.
declarations
%%
rules
%%
programs
Yacc Examples: Calculator
• A standard Yacc example is the int-valued calculator.
• Appendix A of the Yacc manual at the Lex and Yacc Page shows such a calculator.
• We'll examine this example in parts.
• Let's start with four operations:
E -> E + E
   | E - E
   | E * E
   | E / E
• Note that this grammar is ambiguous because 2 + 5 * 7 could be parsed by grouping 2 + 5 first or 5 * 7 first, and the two groupings give different results.
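To make the ambiguity concrete, here is a minimal C sketch that evaluates both possible parse trees for 2 + 5 * 7. The function names are hypothetical and exist only for this illustration; they are not part of the Yacc example.

```c
#include <assert.h>

/* Tree 1: E + E is grouped first, then multiplied: (a + b) * c */
int eval_add_first(int a, int b, int c) {
    return (a + b) * c;
}

/* Tree 2: E * E is grouped first, then added: a + (b * c) */
int eval_mul_first(int a, int b, int c) {
    return a + (b * c);
}

/* Confirms that the two trees disagree for the input 2 + 5 * 7. */
void check_ambiguity(void) {
    assert(eval_add_first(2, 5, 7) == 49);
    assert(eval_mul_first(2, 5, 7) == 37);
}
```

The second tree matches ordinary arithmetic, so the grammar needs extra information (precedence and associativity) to pick it consistently.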
Yacc Calculator Example: Declarations
%{
/* Directly included C code */
# include <stdio.h>
# include <ctype.h>

int regs[26];
int base;
%}

/* list is our start symbol; a list of one-line
   statements / expressions. */
%start list

/* DIGIT & LETTER are tokens;
   (other tokens use ASCII codes, as in '+', '=', etc) */
%token DIGIT LETTER

/* Precedence and associativity (left) of operators:
   +, - have lowest precedence
   *, / have higher precedence */
%left '+' '-'
%left '*' '/' '%'
%left UMINUS /* precedence for unary minus */
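The effect of the %left lines above can be imitated by hand. The sketch below uses precedence climbing, which is a different technique from the table-driven parser Yacc generates, but it applies the same three precedence levels and left associativity. All names (binary_prec, parse_expr, eval, and so on) are hypothetical, numbers are read as plain decimal, and division by zero is not guarded.

```c
#include <assert.h>
#include <ctype.h>

/* Precedence levels mirroring the %left lines: larger binds tighter. */
static int binary_prec(char op) {
    switch (op) {
    case '+': case '-':           return 1;  /* %left '+' '-'     */
    case '*': case '/': case '%': return 2;  /* %left '*' '/' '%' */
    default:                      return 0;  /* not a binary operator */
    }
}
#define UMINUS_PREC 3                        /* %left UMINUS */

static const char *cursor;                   /* current input position */

static void skip_blanks(void) {
    while (*cursor == ' ') cursor++;
}

static int parse_expr(int min_prec);

/* Operand: unary minus, parenthesized expression, or decimal number. */
static int parse_operand(void) {
    int v = 0;
    skip_blanks();
    if (*cursor == '-') {                    /* unary minus binds tightest */
        cursor++;
        return -parse_expr(UMINUS_PREC);
    }
    if (*cursor == '(') {
        cursor++;
        v = parse_expr(1);
        skip_blanks();
        assert(*cursor == ')');
        cursor++;
        return v;
    }
    assert(isdigit((unsigned char)*cursor));
    while (isdigit((unsigned char)*cursor))
        v = v * 10 + (*cursor++ - '0');
    return v;
}

/* Consume operators whose precedence is at least min_prec. */
static int parse_expr(int min_prec) {
    int lhs = parse_operand();
    for (;;) {
        skip_blanks();
        char op = *cursor;
        int prec = binary_prec(op);
        if (prec == 0 || prec < min_prec) return lhs;
        cursor++;
        int rhs = parse_expr(prec + 1);      /* prec + 1 makes it left associative */
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs /= rhs; break;
        case '%': lhs %= rhs; break;
        }
    }
}

/* Evaluate a whole expression string. */
int eval(const char *text) {
    cursor = text;
    return parse_expr(1);
}
```

With these levels, eval("2 + 5 * 7") yields 37 and eval("10 - 4 - 3") yields 3, which is the grouping the %left declarations ask Yacc to choose.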
Yacc Calculator Example: Rules
%% /* begin rules section */

/* list: a list of one-line statements / expressions.
   Error handling allows a statement to be corrupt,
   but list continues with next statement. */
list : /* empty */
     | list stat '\n'
     | list error '\n'
         { yyerrok; }
     ;

/* statement: expression to calculate, or assignment */
stat : expr
         { printf( "%d\n", $1 ); }
     | LETTER '=' expr
         { regs[$1] = $3; }
     ;

/* number: made up of digits (tokenizer should handle this,
   but this is a simple example). */
number : DIGIT
         { $$ = $1; base = ($1==0) ? 8 : 10; }
       | number DIGIT
         { $$ = base * $1 + $2; }
       ;
Yacc Calculator Example: Rules, cont'd
expr : '(' expr ')'
         { $$ = $2; }
     | expr '+' expr
         { $$ = $1 + $3; }
     | expr '-' expr
         { $$ = $1 - $3; }
     | expr '*' expr
         { $$ = $1 * $3; }
     | expr '/' expr
         { $$ = $1 / $3; }
     | '-' expr %prec UMINUS   /* Unary minus */
         { $$ = - $2; }
     | LETTER                  /* Letter: Register/var */
         { $$ = regs[$1]; }
     | number
     ;
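The LETTER rules treat the 26 lower-case letters as integer registers: an assignment stores into regs, and a LETTER inside an expression reads it back. The sketch below shows that behavior in plain C. assign_reg and read_reg are hypothetical helpers, and they take the character itself, whereas in the grammar $1 is already the index 0 through 25.

```c
#include <assert.h>

int regs[26];   /* one register per letter; globals start at zero */

/* stat : LETTER '=' expr   { regs[$1] = $3; } */
void assign_reg(char letter, int value) {
    assert(letter >= 'a' && letter <= 'z');
    regs[letter - 'a'] = value;
}

/* expr : LETTER            { $$ = regs[$1]; } */
int read_reg(char letter) {
    assert(letter >= 'a' && letter <= 'z');
    return regs[letter - 'a'];
}
```

A register that was never assigned reads as 0, because regs is a global array.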
Yacc Calculator Example: Programs (C Code)
%% /* start of programs */

int yylex(void) {
    /* lexical analysis routine */
    /* returns LETTER for a lower case letter, yylval = 0 through 25 */
    /* returns DIGIT for a digit, yylval = 0 through 9 */
    /* all other characters are returned immediately */
    int c;

    while( (c=getchar()) == ' ' ) { /* skip blanks */ }

    /* c is now nonblank */
    if( islower( c ) ) {
        yylval = c - 'a';
        return( LETTER );
    }
    if( isdigit( c ) ) {
        yylval = c - '0';
        return( DIGIT );
    }
    return( c );
}
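To try the tokenizer logic without generating a parser, the same steps can be run over a string. This is a sketch under stated assumptions: the token codes 257 and 258 are placeholders (Yacc assigns the real values), and scan_from, next_token, scan_pos and token_value are hypothetical names, with token_value standing in for yylval.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { DIGIT = 257, LETTER = 258 };  /* assumed codes; Yacc defines the real ones */

const char *scan_pos;   /* remaining text to scan (replaces getchar()) */
int token_value;        /* stands in for yylval */

/* Point the scanner at a new input string. */
void scan_from(const char *text) {
    assert(text != NULL);
    scan_pos = text;
}

/* Same decisions as yylex(), reading from scan_pos. */
int next_token(void) {
    int c;

    while ((c = (unsigned char)*scan_pos) == ' ')
        scan_pos++;                  /* skip blanks */

    if (c == '\0')
        return 0;                    /* 0 tells a Yacc parser the input has ended */
    scan_pos++;

    if (islower(c)) {
        token_value = c - 'a';       /* 0 through 25 */
        return LETTER;
    }
    if (isdigit(c)) {
        token_value = c - '0';       /* 0 through 9 */
        return DIGIT;
    }
    return c;                        /* operators, '=', '\n', ... */
}
```

Scanning "a = 7\n" this way produces LETTER (value 0), '=', DIGIT (value 7), '\n', and then 0.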