Copyright 1992 Carnegie Mellon University. All rights Reserved. $Disclaimer:

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose is hereby granted without fee,
provided that the above copyright notice appear in all copies and that
both that copyright notice, this permission notice, and the following
disclaimer appear in supporting documentation, and that the names of
IBM, Carnegie Mellon University, and other copyright holders, not be
used in advertising or publicity pertaining to distribution of the
software without specific, written prior permission.
IBM, CARNEGIE MELLON UNIVERSITY, AND THE OTHER COPYRIGHT HOLDERS
DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT
SHALL IBM, CARNEGIE MELLON UNIVERSITY, OR ANY OTHER COPYRIGHT HOLDER
BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
OF THIS SOFTWARE.
$
parse
The parse object class provides a convenient method for parsing text
according to some grammar. The grammar is represented by tables and the
parser is in static code, so there are no name conflicts and multiple
parsers can be employed with impunity. (However, only one parser can be
used from any given .c file.)
Tables are generated with a version of the bison package from the Free
Software Foundation. The parser is newly written code specific to the
Andrew User Interface System.
Source Code
An application which wants to use a grammar prepares the grammar in a
file with extension .y using all the conventions of yacc with a few
extensions noted below. Suppose the input file is sample.y. Processing
by bison (utilizing the -n and -d options) will yield two files of
interest: sample.tab.c and sample.act.
The application utilizing the parser is a .c file, say sample.c.
The includes section of sample.c has at least these items,
in the given order:
    #include <yystype.h>      /* for YYSTYPE */
    #include <parse.ih>       /* parse object */
    #include <sample.tab.c>   /* parse tables */
    #include <parsedesc.h>    /* declare parse_description */
    #include <parsepre.h>     /* begin function 'reduceActions' */
    #include <sample.act>     /* body of function 'reduceActions' */
    #include <parsepost.h>    /* end of function 'reduceActions' */
The first of these, yystype.h, may have any name and is only needed
if the grammar employs the <type> construct; the file must #define
YYSTYPE to be suitable as a type in declarations. (See RESTRICTION
below.)
The inclusion of sample.tab.c defines various identifiers and tables;
their names begin with yy and YY, but they are all static. The inclusion
of parsedesc.h declares the variable parse_description, which contains
pointers to the tables included from sample.tab.c. The parse_description
value is passed to the parse object and to the lexical analyzer.
The sample.act file contains the semantics portions of the grammar rules.
Together, parsepre.h, sample.act, and parsepost.h define a function
reduceActions(). It is passed to the parser and called for each
grammar reduction.
RESTRICTION - The value stack in parse_Run is a stack of (void *).
The YYSTYPE value should be no bigger than the size of such a value.
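One way a client might guard against this restriction is to pair its YYSTYPE definition with a size check. The following is a minimal sketch, not part of the parse package: the struct name node, the helper names, and the use of the C11 _Static_assert keyword are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical contents of yystype.h: YYSTYPE is a pointer to a
   client-defined node, so it occupies one (void *) stack slot. */
struct node;                        /* illustrative client type */
#define YYSTYPE struct node *

/* Compile-time check (C11): the build fails if YYSTYPE outgrows a slot. */
_Static_assert(sizeof(YYSTYPE) <= sizeof(void *),
               "YYSTYPE must fit in a (void *) value-stack slot");

/* Returns nonzero when a value of the given size fits in one slot. */
static int fits_in_value_slot(size_t size)
{
    return size <= sizeof(void *);
}

/* Runtime counterpart of the check, for compilers without _Static_assert. */
static void check_yystype_size(void)
{
    assert(fits_in_value_slot(sizeof(YYSTYPE)));
}
```

Larger semantic values can still be used by allocating them separately and letting YYSTYPE be a pointer to them, as in this sketch.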
During initialization, the application must create a parser object and
pass it various information:
    struct parse *parseobject;
    struct lexan *lexanobject;
    ...
    lexanobject = lexan_Create(&lexan_description);
        /* see lexan.doc for further initialization,
           including setting the lexanobject to refer to input text */
    parseobject = parse_Create(
            &parse_description,   /* declared in parsedesc.h */
            lexanobject,          /* created just above */
            reduceActions,        /* function defined during includes */
            rock,                 /* any void* value */
            error);               /* function called for errors */
The reduceActions parameter may be NULL, in which case no semantics are
performed. Usually the value is created with the parsepre/act/parsepost
sandwich indicated above.
The error parameter may be NULL, in which case the error message is
printed with no indication of where it is in the source stream. If a
function is provided, it will be called for errors with this calling
sequence:
    error(parseobject, severity, message)
where the message is a character string and severity is one of these
constants declared in parse.ih:
    parse_WARNING   /* processing continues */
    parse_SERIOUS   /* compile ceases, but continue error check */
    parse_SYNTAX    /* like parse_SERIOUS, but was a syntax error */
    parse_FATAL     /* cannot even continue error checking */
If the severity value is or'ed with parse_FREEMSG, the error routine is
expected to free() the message.
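An error function following this calling sequence might look like the sketch below. It is only an illustration: the numeric values given to the severity constants and to parse_FREEMSG are assumptions standing in for the real declarations in parse.ih, and the names sample_error and error_count are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for declarations that really come from parse.ih.
   The numeric values are assumptions chosen for this sketch. */
struct parse;                       /* opaque parser object */
enum {
    parse_WARNING = 0,
    parse_SERIOUS = 1,
    parse_SYNTAX  = 2,
    parse_FATAL   = 3,
    parse_FREEMSG = 0x100           /* flag or'ed into the severity */
};

static int error_count = 0;         /* errors at parse_SERIOUS or worse */

/* Called as error(parseobject, severity, message). */
static void sample_error(struct parse *p, int severity, char *message)
{
    static const char *const label[] = {
        "warning", "error", "syntax error", "fatal error"
    };
    int level = severity & ~parse_FREEMSG;  /* strip the flag bit */

    (void)p;                        /* parser object unused in this sketch */
    assert(level >= parse_WARNING && level <= parse_FATAL);
    fprintf(stderr, "%s: %s\n", label[level], message);
    if (level != parse_WARNING)
        error_count++;
    if (severity & parse_FREEMSG)
        free(message);              /* the caller handed over ownership */
}
```

The handler masks off the flag before interpreting the severity, and frees the message only when the flag is present, so static strings passed without parse_FREEMSG are left alone.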
After the parser has been initialized and the lexan object set to the
source text, the entire parse is performed with the single call:
    parse_Run(parseobject)
The value returned indicates success or various degrees of failure.
Note on stack size
The stacks are initially allocated at 2000 elements, but can grow if
needed. Use left recursion in grammars to avoid requiring great stack
depth. Note that stack depth reflects the amount of information a program
reader needs to interpret the program, and a grammar requiring a large
stack is too complex.
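The effect of recursion direction on stack depth can be seen with a small counting model. This is a sketch of generic shift-reduce behavior, not the code of parse_Run; it counts grammar symbols on the stack while parsing a list of n items, and the function names are hypothetical.

```c
#include <assert.h>

/* Peak stack depth for n items under the left-recursive rules
       list : item | list item ;
   Each item can be reduced into the list as soon as it is shifted. */
static int depth_left_recursive(int n)
{
    int depth = 0, peak = 0, i;

    assert(n >= 1);
    for (i = 0; i < n; i++) {
        depth++;                    /* shift item */
        if (depth > peak)
            peak = depth;
        if (i > 0)
            depth--;                /* reduce list item -> list (2 become 1) */
        /* for i == 0: reduce item -> list (1 becomes 1, depth unchanged) */
    }
    return peak;
}

/* Peak stack depth for n items under the right-recursive rules
       list : item | item list ;
   No reduction is possible until the last item has been shifted. */
static int depth_right_recursive(int n)
{
    int depth = 0, peak = 0, i;

    assert(n >= 1);
    for (i = 0; i < n; i++) {
        depth++;                    /* shift item, nothing to reduce yet */
        if (depth > peak)
            peak = depth;
    }
    /* reduce the last item -> list (depth unchanged), then unwind */
    while (depth > 1)
        depth--;                    /* reduce item list -> list */
    return peak;
}
```

In this model the left-recursive form never holds more than two symbols, while the right-recursive form holds all n items at once, so a long list could exceed the initial allocation.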
The value stack holds pointers to objects as returned by the lexer and
set in the action routines. The client is responsible for the memory
occupied by the pointees of this stack. If the parser terminates early
for a syntax error or ABORT, these values can be deleted by supplying a
KillVal function.
    SetKillVal(parser, f)
will establish f as the killval function. After a syntax error and
before discarding the stack, the killval function is called for each
value on the stack. The function is also called as states are popped for
error recovery. The call to killval is
    (self->killval)(self, value-pointer-from-stack)
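A killval function for values allocated with malloc() could be as simple as the sketch below. The struct parse shown here is a stand-in holding only what the example needs; its field layout, and the names free_value, discard_stack, and killed, are assumptions rather than the definitions in parse.ih.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the real parser object. The fields and their layout
   are assumptions made for this sketch. */
struct parse {
    void (*killval)(struct parse *self, void *value);
    void *stack[8];                 /* value stack of (void *) */
    int depth;                      /* number of values on the stack */
};

static int killed = 0;              /* counts values released */

/* Client-supplied killval function: releases one stack value. */
static void free_value(struct parse *self, void *value)
{
    (void)self;
    free(value);                    /* free(NULL) is harmless */
    killed++;
}

/* Mimics the behavior described in the text after a syntax error:
   call killval for each value, then leave the stack empty. */
static void discard_stack(struct parse *self)
{
    assert(self != NULL);
    while (self->depth > 0) {
        void *value = self->stack[--self->depth];
        if (self->killval != NULL)
            (self->killval)(self, value);
    }
}
```

If values on the stack can be shared or are owned elsewhere, the killval function would need a matching ownership rule (for example a reference count) instead of an unconditional free().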
Compilation
In the Imakefile, the bison processing is done via a rule like
    sample.tab.c: sample.gra
        ExecuteFromDESTDIR(bison -n sample.gra)
If sample.c is an object itself, the Imakefile might be as simple as:
    IHFILES = sample.ih
    DOFILES = sample.do
    NormalObjectRule()
    NormalATKRule()
    sample.tab.c: sample.gra
        ExecuteFromDESTDIR(bison -n sample.gra)
    InstallClassFiles($(DOBJS), $(IHFILES))
In some applications, a single parse object can be reused for all
compilations. Usually the only values differing between one use and the
next are the lexical analyzer and the rock value. To provide for
recursive compilations, these can be stored in local variables and
restored after the parse:
    struct parse *parser;

    /* initialization */
    ... parser = parse_Create(...);

    compilesomething(self)
        struct something *self;
    {
        struct lexan *savelexan;
        void *saverock;
        struct tlex *tlex;
        ...
        /* remember the values in use by any enclosing compilation */
        savelexan = parse_GetLex(parser);
        saverock = parse_GetRock(parser);
        ...
        tlex = tlex_Create (...);
        ...
        parse_SetLex(parser, tlex);
        parse_SetRock(parser, self /* (for instance) */ );
        parse_Run(parser);
        ...
        /* restore them so the enclosing compilation can continue */
        parse_SetLex(parser, savelexan);
        parse_SetRock(parser, saverock);
        ...
    }