Contents

The Parser Generator
  Parsers and Grammars
    What Does a Parser Do?
    Lexical Analysis: The Scanner
    The Components of a Parser
    Grammars
  How to specify grammars
    BNF notational conventions
  The Different Types of Parsers
  The Parser Generator
    Files containing the parser generator
    How to Compile and Use the Parser Generator
  A Simple Example Using the Parser Generator
    Clarifying the underlying concepts
  The Complications: Precedence, Associativity and Ambiguity
  Input to Parser Generator
  Creating a Simple Calculator
    Converting to Parser Generator Syntax
      Identifying User-Defined Domains/Productions
      Substituting Parser Generator Syntax
      Developing Priority Groups
      Separating Productions into Priority Groups
      Ordering the Priority Groups
      Ordering Productions within Each Priority Group
      Identifying Right-Associative Productions
      Adding Separator Information to List Productions
      Describing the Output Terms
      Preserving Token Locations
      Generating the Parser
    Using the Generated Parser
      Lexical Analysis
      Parsing
    Using the Parser's Output
      Filling in Holes
      Improving the User Interface
        False Alarms
        Changing syntax_error/2
        Changing evaluate_expression/1
        Additional Clauses for expect/3
        Extra Tokens
        The Upgraded Program
      The Scanner Revisited
      Calculator Wrap-up
      Extending the Calculator Project
  Bootstrapping the Parser Generator

The Parser Generator

This document demonstrates how to use the parser generator to take a grammar specified by you and automatically create a parser for that grammar. The document is organized in five sections. Section One gives an overview of parsers, grammars and parser generators, and how they work; it is intended to introduce these topics to those with a limited knowledge of parsing. Section Two discusses how to use the parser generator, illustrating the development process by means of a simple example. Section Three explains in detail how to define languages by grammars acceptable to the parser generator. Using that information, Section Four then takes you on a tour through the development process of a small compiler, which takes advantage of the more advanced features of generated parsers, such as error handling. Finally, for the very hardy, Section Five discusses how to recreate the parser generator itself using "bootstrapping."

Parsers and Grammars

What Does a Parser Do?

A parser is a program that is able to recognize the underlying structure in a source text. For example, a Pascal compiler uses a parser to recognize the if, while, repeat and case statements and the procedures, functions and expressions in a Pascal source file. Parsers are used not only with programming languages but also with command interpreters to interpret user input for various types of programs -- expert system shells and natural-language interfaces, for example.

A parser translates source text into a format that is suitable for the next phase of a compiler, interpreter or other program. For example, using the output from a parser, the next phase of a compiler would be to generate the corresponding object code.
Before the real process of parsing can begin, however, there is a preliminary stage in which the source text must be broken up into tokens: logical units that mean something in the language under consideration. This stage is carried out by a part of the parser called a scanner (sometimes referred to as the lexical analyzer).

Lexical Analysis: The Scanner

The scanner is perhaps the simplest part of any compiler. It reads the source input a character at a time, searching for the next token. For example, given:

    NewX = OldY + 1;

the scanner in the PDC Prolog compiler would break this string up into the meaningful tokens:

    NewX, =, OldY, +, 1, ;

A scanner can also attach some attributes to each token. For example, the tokens NewX and OldY in the above Prolog statement are represented as tokens of type "identifier," with their names as attributes: id("NewX"), id("OldY").

The file XSCAN.PRO contains a simple scanner written in PDC Prolog. Note that much of the work is done by the standard predicate fronttoken, which itself represents a scanner in that it breaks strings up into potential tokens; the rest of the program serves to classify or remap those tokens into the ones defined for a given language.

    DOMAINS
      TOK  = id(STRING); int(INTEGER); plus; minus; mult;
             if_; then; else; write_; lpar; rpar
      TOKL = TOK*

    PREDICATES
      tokl(STRING, TOKL)
      str_tok(STRING, TOK)

    CLAUSES
      tokl("", []) :- !.
      tokl(STR, [TOK|TOKL]) :-
        fronttoken(STR, STRTOK, RESTSTR),
        str_tok(STRTOK, TOK),
        tokl(RESTSTR, TOKL).

      str_tok("+", plus) :- !.
      str_tok("-", minus) :- !.
      str_tok("*", mult) :- !.
      str_tok("(", lpar) :- !.
      str_tok(")", rpar) :- !.
      str_tok("if", if_) :- !.
      str_tok("then", then) :- !.
      str_tok("else", else) :- !.
      str_tok("write", write_) :- !.
      str_tok(INTSTR, int(INTEGER)) :-
        str_int(INTSTR, INTEGER), !.
      str_tok(ID, id(ID)) :-
        isname(ID), !.

Visual Prolog 4.0, Beta 2.0, 6/17/2019 Page 3
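For readers more comfortable tracing the idea in a conventional language, here is a minimal Python sketch of the same classify-and-remap scheme. This is illustrative only, not part of the toolkit: tokenize and KEYWORDS are hypothetical names, and a regular expression stands in crudely for fronttoken.

```python
# A minimal sketch of XSCAN.PRO's approach in Python (illustrative only):
# split the input into candidate tokens, then classify each one.
import re

KEYWORDS = {"+": "plus", "-": "minus", "*": "mult", "(": "lpar", ")": "rpar",
            "if": "if_", "then": "then", "else": "else", "write": "write_"}

def tokenize(text):
    """Mirror of tokl/str_tok: classify each candidate token."""
    tokens = []
    # crude stand-in for fronttoken: names, integers, or single symbols
    for tok in re.findall(r"[A-Za-z_]\w*|\d+|\S", text):
        if tok in KEYWORDS:
            tokens.append(KEYWORDS[tok])          # keyword or symbol
        elif tok.isdigit():
            tokens.append(("int", int(tok)))      # corresponds to int(INTEGER)
        else:
            tokens.append(("id", tok))            # corresponds to id(STRING)
    return tokens
```

For example, tokenize("if a then write(6)") yields the list ['if_', ('id', 'a'), 'then', 'write_', 'lpar', ('int', 6), 'rpar'], the same shape as the Prolog result shown next.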
To see what it does, supply the goal:

    tokl("if a then write(6)", TOKL)

The result will be:

    TOKL = [if_, id("a"), then, write_, lpar, int(6), rpar]

Notice that if, represented as a scanner token, is if_, and that write becomes write_. This is to avoid confusion between Prolog keywords and scanner tokens. Likewise, the tokenized forms of ( and ) are lpar and rpar, respectively. Also, integer values in the source text are given in terms of the functor int() and identifiers in terms of id().

More specialized requirements for a scanner might involve a different algorithm for grouping sequences of characters into tokens than that embodied in fronttoken, though fronttoken will suffice for the scanners illustrated in this document.

The Components of a Parser

In translating a source language, a parser usually performs the following sequence of steps:

1. Asks the scanner for the next token in the input source.
2. Checks that the token corresponds to one of the allowed patterns as specified by the language's grammar; that is, performs basic error checking.
3. Imposes on the token, in relation to previous and/or succeeding tokens, a structure that can be used by subsequent phases of the compiler, interpreter or other program.

Grammars

The structures to be recognized by a parser are normally described by a grammar. The advantages of describing the input for a parser by a grammar include:

- A grammar gives a precise, readily comprehensible specification of the input -- the language to be parsed.
- Creating an efficient parser for certain kinds of well-designed grammars can easily be automated.
- A grammar helps determine the structure of the program that should handle the input for the parser. In our case, there is a close connection between the grammar and the domain definitions for the Prolog program that should handle the input.

We are all familiar with English grammar from school.
Thus, "My dog eat meat" is considered bad grammar because the form of the verb "to eat" that goes with "my dog" should be "eats," not "eat." There are many different kinds of grammars, including those for French, Swedish, German, Arabic and even PDC Prolog. It is a language's grammar that specifies which tokens and which sequences of tokens legally belong to that language. What follows is a brief introduction to how grammars are specified in computing.

How to specify grammars

There are various ways of describing a grammar. For input to the parser generator, we use a type of grammar specification very close to what is known as BNF (Backus-Naur Form). The precise differences between BNF grammars and parser-generator grammars will be explained later.

To understand a BNF grammar, let's first define some terms. A grammar generally involves four entities: terminals, non-terminals, a start symbol, and production rules.

Terminals are the basic symbols from which sentences are created in a language. The word token, as used above, is a synonym for terminal. For example, in the sentence

    mary likes big cars

the terminals are mary, likes, big, and cars.

Non-terminals are symbols that describe the general types of object allowed in a grammar. (In the English language, for example, these would correspond to the types "verbs," "adjectives," "nouns," etc., as opposed to any actual words.)

Using terminals and non-terminals, we then list the allowed production rules for forming valid sentences in the language specified by the grammar. Production rules define the ways in which grammatical structures can be built from one another and from terminals. The syntax for a production rule is:

    <name-of-non-terminal> ::= a sequence of non-terminals and/or terminals

where the ::= symbol is read as "is defined to be".
For example, given the following production rule:

    <SENTENCE> ::= <SUBJECT> likes <OBJECT>

there is one terminal, likes, and the non-terminals are <SENTENCE>, <SUBJECT> and <OBJECT>. If <SUBJECT> and <OBJECT> are now defined as follows:

    <SUBJECT> ::= mary | john
    <OBJECT>  ::= dogs | cats

(where the vertical bar ( | ) should be read as "or", so that a <SUBJECT> is one of the words mary or john), then all possible <SENTENCE>s are as follows:

    mary likes dogs
    mary likes cats
    john likes dogs
    john likes cats

One non-terminal symbol in a grammar is always selected as the start symbol. As its name implies, the start symbol is where a parser begins when determining how to parse its source input. It is also frequently used to denote the language being defined. If our example above:

    <SENTENCE> ::= <SUBJECT> likes <OBJECT>
    <SUBJECT>  ::= mary | john
    <OBJECT>   ::= dogs | cats

were regarded as a complete grammar, then we would take <SENTENCE> to be its start symbol and read this as a grammar for sentences.

BNF notational conventions

Some notational conventions to keep in mind when using BNF are:

- Non-terminals are surrounded by < and > to easily distinguish them from terminal symbols. An example of a non-terminal is <STATEMENT>.
- The asterisk (*) is used in production rules to indicate the possibility of zero or more instances of the non-terminal or terminal symbol. For example, a language could be defined as a series of zero or more statements: <LANGUAGE> ::= <STATEMENT>*
- The plus sign (+) is used in production rules to indicate the possibility of one or more instances of the non-terminal or terminal symbol. For example, a number in the above language could be defined as <NUMBER> ::= <DIGIT>+, indicating that a number is made up of one or more digits.
- The | mark is used to indicate "or" in a production rule. For example, <LETTER-OR-DIGIT> ::= <LETTER> | <DIGIT>
- A non-terminal surrounded by [ and ] in a production rule may be used zero or one times. That is, it is optional.
For example:

    <SIGNED-NUMBER> ::= [<SIGN>] <NUMBER>

Thus, the grammar

    <SENTENCE>  ::= <SUBJECT> likes <OBJECT> | does <SUBJECT> like <OBJECT>
    <SUBJECT>   ::= john | mary
    <OBJECT>    ::= <ADJECTIVE> <NOUN>
    <ADJECTIVE> ::= big | medium | small
    <NOUN>      ::= books | cars

defines a total of five production rules: one defining the non-terminal <SENTENCE> and then four defining each of the remaining non-terminals <SUBJECT>, <OBJECT>, <ADJECTIVE> and <NOUN>. (Notice that the non-terminal <OBJECT> is itself defined in terms of other non-terminals.) The words likes, does, like, john, mary, big, medium, small, cars and books are the terminal symbols of the language. In other words, they are the symbols from which all valid sentences are made up according to the production rules. It follows that:

    mary likes big cars

is a valid sentence in the language described by this grammar because of the following chain of productions:

1. A <SENTENCE> takes the form <SUBJECT> likes <OBJECT>.
2. mary is a <SUBJECT>.
3. An <OBJECT> takes the form <ADJECTIVE> <NOUN>.
4. big is an <ADJECTIVE>.
5. cars is a <NOUN>.

Hence, big cars is an <OBJECT> and mary likes big cars is a <SENTENCE>.

The Different Types of Parsers

There are two main classes of parsers: top-down and bottom-up. Given a sentence to parse, top-down parsers start with the most general production in the grammar (the start symbol) and try to produce a list of productions that generate that sentence (exactly as we did when trying to prove that mary likes big cars was valid in our introductory example). On the other hand, bottom-up parsers start with a given sentence (i.e., a list of terminal symbols) and try to work backwards to prove that it is an instance of one of the grammar's production rules. Because of this, a bottom-up parser usually requires access to a parsing table.
A parsing table for a grammar is a table in which the entries are production rules from that grammar and the columns and rows are indexed by terminals and non-terminals. The table simply records which production rule the parser should use when a given terminal symbol is next considered in the parser's attempt to verify the input sentence.

When working through the given sentence, a bottom-up parser looks ahead a certain amount. Otherwise, it wouldn't be able to determine, for example, whether a sentence that begins with if is going to be of the form if...then... or of the form if...then...else...

Most parser generators produce a bottom-up parser, because it's relatively easy to generate a parsing table to drive a bottom-up parser for almost any reasonable grammar. On the other hand, if a top-down parser is to be constructed that is sufficiently general so that each of the parsers produced by the parser generator can be a special case of it, then the initial, general parser must be able to backtrack -- a feature that comes free with Prolog.

The parser generator produces top-down parsers for several reasons, the main one being that it is then easier to build structures during the parsing process. Further, the difference-list technique in Prolog makes it relatively easy to construct a very efficient top-down parser with an arbitrary look-ahead length combined with backtracking. As you will discover, the resulting parsing speed of a generated parser can be very fast, particularly if you code the accompanying scanner (lexical analyzer) in C or assembler.

The Parser Generator

A parser generator is a program that generates a parser from the specification of a grammar. Thus, from a BNF-like grammar, our parser generator produces a Prolog program capable of recognizing sentences conforming to that grammar. A parser generator makes it easier to write parsers.
Even with Prolog, which was originally designed with natural-language parsing in mind, it is still a relatively complicated task to write a parser on your own. Since not everyone has the time to develop expertise in writing parsers (as well as in everything else they must know about in computing), a parser generator allows even non-experts to construct parsers.

Files containing the parser generator

The parser generator is built from a set of files, all of which are placed in the VPI\PROGRAMS\PARSER subdirectory.

How to Compile and Use the Parser Generator

Before using the parser generator, it is necessary to compile it to an .EXE file. The first step is to load the file containing your prepared parser definition (or to type it in directly using the edit option). Activating the main menu entry Generate parser will then generate the required parser, which will consist of two parts, each placed in a separate file. If the name of the file containing the parser generator input is PASCAL.GRM, the generated parser's domain declarations will be in the file PASCAL.DOM, and the predicate declarations and clauses will be generated in the file PASCAL.PAR. The generated parser can then be viewed by selecting Display parser from the main menu.

A Simple Example Using the Parser Generator

There are many small but important details to be explained before considering a realistic example of parser generator use. These will be discussed in later sections, but in this section we will give an extremely simple example just to familiarize you with the mechanics of getting a parser generated: which files you should create and which files are automatically generated.

There are two important things to keep in mind while reading the following sections. First, for every suitable grammar given as input, the parser generator produces a Prolog program that is a parser for that grammar.
Second, in every case you must then write the accompanying scanner that tokenizes the source language input so that it is suitable for use by the automatically generated parser. Since we have provided examples of such scanners in the files XCALC.SCA, XCALC2.SCA and XCALCULA.SCA, it should generally require only minor modifications to make (a copy of) one of these work with a different grammar.

Our previous example of a sentence grammar, recast in parser generator format, might look like this:

    productions
      SENTENCE = SUBJECT likes OBJECT -> likes(SUBJECT, OBJECT),
                 does SUBJECT like OBJECT -> does_like(SUBJECT, OBJECT)
      SUBJECT = john -> john,
                mary -> mary
      OBJECT = ADJECTIVE NOUN -> attribute(ADJECTIVE, NOUN)
      ADJECTIVE = big -> big,
                  red -> red
      NOUN = books -> books,
             cars -> cars

which, if input as SENTENCE.GRM, would generate a file SENTENCE.DOM that looks like:

    /****************************************************************
      DOMAIN DEFINITIONS
    ****************************************************************/

    DOMAINS
      SENTENCE = likes(SUBJECT,OBJECT); does_like(SUBJECT,OBJECT)
      SUBJECT = john(); mary()
      OBJECT = attribute(ADJECTIVE,NOUN)
      ADJECTIVE = big(); red()
      NOUN = books(); cars()
      TOK = likes(); does(); like(); john(); mary(); big(); red();
            books(); cars(); nil

and a file SENTENCE.PAR, which looks like:

    /***************************************************************
      PARSING PREDICATES
    ***************************************************************/

    PREDICATES
      s_sentence(TOKL, TOKL, SENTENCE)
      s_subject(TOKL, TOKL, SUBJECT)
      s_object(TOKL, TOKL, OBJECT)
      s_adjective(TOKL, TOKL, ADJECTIVE)
      s_noun(TOKL, TOKL, NOUN)

    CLAUSES
      s_sentence(LL1, LL0, likes(SUBJECT,OBJECT)) :-
        s_subject(LL1, LL2, SUBJECT),
        expect(t(likes,_), LL2, LL3),
        s_object(LL3, LL0, OBJECT), !.
      s_sentence([t(does,_)|LL1], LL0, does_like(SUBJECT,OBJECT)) :- !,
        s_subject(LL1, LL2, SUBJECT),
        expect(t(like,_), LL2, LL3),
        s_object(LL3, LL0, OBJECT).
      s_sentence(LL, _, _) :- syntax_error(sentence, LL), fail.
      s_subject([t(john,_)|LL], LL, john) :- !.
      s_subject([t(mary,_)|LL], LL, mary) :- !.
      s_subject(LL, _, _) :- syntax_error(subject, LL), fail.

      s_object(LL1, LL0, attribute(ADJECTIVE,NOUN)) :-
        s_adjective(LL1, LL2, ADJECTIVE),
        s_noun(LL2, LL0, NOUN), !.

      s_adjective([t(big,_)|LL], LL, big) :- !.
      s_adjective([t(red,_)|LL], LL, red) :- !.
      s_adjective(LL, _, _) :- syntax_error(adjective, LL), fail.

      s_noun([t(books,_)|LL], LL, books) :- !.
      s_noun([t(cars,_)|LL], LL, cars) :- !.
      s_noun(LL, _, _) :- syntax_error(noun, LL), fail.

If you then took a suitable scanner, defined to produce lists of the tokens defined above (see the TOK domain above), the following sequence, entered as a goal:

    tokl("does john like red cars", TOKLIST),
    s_sentence(TOKLIST, [], RESULT)

would bind RESULT to does_like(john, attribute(red, cars)). The issues of what happens from there on, and whether or not that is a correct or sufficient "interpretation" of the parsed text, are best addressed in the context of a particular application, though clearly the choices made in the grammar specification will contribute to the structure of the overall solution.

Clarifying the underlying concepts

While reading the detailed description of the parser generator in subsequent sections, it is important to remember that, for every grammar given as input, the parser generator produces a program similar in basic structure to the above sentence parser. The actual parser generated is more complicated only because of considerations of priority, associativity, efficiency and the detection of errors. The next section examines a more complete example embracing these concepts, but first, in this section, we will discuss the grammar used to specify input for the parser generator in more detail, relate it to BNF grammars and take a closer look at the "environment" of the generated parser.
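The control flow of the generated sentence parser can also be mirrored in Python. This is an illustrative analogue, not parser generator output: each s_* function takes a token list and returns the built structure together with the unconsumed tokens, playing the role of the (LL1, LL0) difference-list pair in the Prolog clauses.

```python
# Illustrative Python analogue of the generated SENTENCE parser:
# each function returns (structure, remaining tokens).

def s_subject(toks):
    if toks and toks[0] in ("john", "mary"):
        return toks[0], toks[1:]
    raise SyntaxError("subject expected at: %r" % toks)

def s_adjective(toks):
    if toks and toks[0] in ("big", "red"):
        return toks[0], toks[1:]
    raise SyntaxError("adjective expected at: %r" % toks)

def s_noun(toks):
    if toks and toks[0] in ("books", "cars"):
        return toks[0], toks[1:]
    raise SyntaxError("noun expected at: %r" % toks)

def expect(tok, toks):
    # mirrors expect/3: consume one required token or fail
    if toks and toks[0] == tok:
        return toks[1:]
    raise SyntaxError("%r expected at: %r" % (tok, toks))

def s_object(toks):
    adj, rest = s_adjective(toks)
    noun, rest = s_noun(rest)
    return ("attribute", adj, noun), rest

def s_sentence(toks):
    if toks and toks[0] == "does":            # does SUBJECT like OBJECT
        subj, rest = s_subject(toks[1:])
        rest = expect("like", rest)
        obj, rest = s_object(rest)
        return ("does_like", subj, obj), rest
    subj, rest = s_subject(toks)              # SUBJECT likes OBJECT
    rest = expect("likes", rest)
    obj, rest = s_object(rest)
    return ("likes", subj, obj), rest
```

Calling s_sentence("does john like red cars".split()) yields the structure ('does_like', 'john', ('attribute', 'red', 'cars')) with no tokens left over, matching the RESULT term built by the Prolog version.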
With these concepts clarified, you will be ready to expand the example programs and eventually design your own parser.

The Complications: Precedence, Associativity and Ambiguity

There are three particular issues that have to be addressed when dealing with a realistic grammar. These are:

Precedence: Addition, multiplication and exponentiation are normally given different priorities, so that multiplication binds tighter than addition. These different priorities are sometimes reflected in the grammar.

Associativity: In an expression like 2+3+4, the addition operator is usually assumed to be left-associative, which means that the operation 2+3 should be carried out first, before the addition of 4. In the expression 2^3^4, however, the operation 3^4 should be carried out first; i.e., ^ is right-associative.

Ambiguity: There is not always just one way to combine the input to a parser into a grammatical structure. The if...then...else sentence is the most often cited example. Consider the construction:

    if a then if b then sent1 else sent2

The grammar does not specify whether this construction should be interpreted as:

    if a then (if b then sent1 else sent2)

or as:

    if a then (if b then sent1) else sent2

In the parser generator, these problems have been solved by placing the following constraints on the way in which the source language grammar should be specified:

Precedence: Productions for one production name should be separated into groups having the same priority. Thus, plus (+) and minus (-) should share the same group, and multiplication (*) and division (/) should both belong to another group. Groups are separated by the symbol --.

Associativity: Operators are by default left-associative. If they are to be right-associative, then the relevant productions must be preceded by the keyword rightassoc.
Ambiguity: This is solved by ordering the productions in such a way that a parser will choose productions appearing earlier in the production list in preference to those appearing later. In the above example this means that, should the second of the two interpretations be the one required, the if sentence with an else should come before the if sentence without an else.

Input to Parser Generator

A formal specification of the format you must use when inputting information to the parser generator is given at the end of this section. The formal specification is in BNF form, so refer back to "Backus-Naur Form" on page 186 if you have any questions about how BNF works. Note the distinction between the format, or syntax, of the parser generator's input and its formal description: BNF notation is used to describe a language that itself closely conforms to BNF notation.

First, let's follow through an informal specification in words. Input for the parser generator consists of three different kinds of sections, each of which is optional:

1. Specification of user-defined productions.
2. Specification of user-defined domains.
3. Production rules.

If you wish to manually code some parsing predicates, they should be declared to the parser generator at the beginning of your grammar definition, after the keywords userdefined productions. Similarly, user-defined domains should be declared after the keywords userdefined domains and before the productions. Following the userdefined sections, there can be any number of production rule sections, but the keyword productions must precede each such section.

A production begins with the name of that production in upper-case letters. There are two kinds of productions: list productions and general productions. A list production is a production name followed by an asterisk or a plus sign.
An asterisk means zero or more occurrences of that grammatical object can appear in a valid sentence; a plus means one or more. A list production can then contain an optional separator specification. When no separator is given, the list simply doesn't have any separator symbols. For example, you can declare that an expression list is a possibly empty list of expressions separated by commas as follows:

    EXPLIST = EXP* separator comma

This specification generates the following PDC Prolog domain declaration in the domain definition file of the parser generator output (we'll give more details about the generated parser later):

    DOMAINS
      EXPLIST = EXP*

This domain declaration is independent of whether an asterisk or a plus is used and whether a separator is used or not.

A general production consists of one or more groups of individual productions, each group with the same priority. The priority groups are separated by two minus signs (--), while the productions in each group are separated by a comma (,). An individual production can be preceded by the keyword rightassoc if it is to be right-associative. It consists of a (possibly empty) list of grammatical tokens; these are either names of other productions or scanner tokens. A scanner token can have an optional parameter list for those cases where the scanner places some attributes in the token -- for example, the value of an integer or the name of an identifier. After the list of grammatical tokens comes an arrow (->), which is followed by a specification of the Prolog term that should be built from this production. This term can have zero or more arguments, the arguments being the names of productions or PDC Prolog terms. Productions may also specify the source-text locations of one or more tokens, using the :CURSOR keyword.
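The behavior a list production declares (e.g., EXPLIST = EXP* separator comma) can be sketched in Python as follows. This is illustrative only; parse_list and parse_int are hypothetical helpers, not part of the parser generator.

```python
# Sketch of how a parser consumes a possibly-empty, separated list,
# as declared by a list production like  EXPLIST = EXP* separator comma.

def parse_int(toks):
    """Stand-in element parser: return (value, remaining tokens) or None."""
    if toks and toks[0].isdigit():
        return int(toks[0]), toks[1:]
    return None

def parse_list(toks, parse_elem, separator=","):
    """Zero or more elements separated by `separator` (the * form)."""
    items = []
    first = parse_elem(toks)
    if first is None:
        return items, toks               # the * form allows an empty list
    value, toks = first
    items.append(value)
    while toks and toks[0] == separator:
        nxt = parse_elem(toks[1:])       # a separator must be followed by an element
        if nxt is None:
            raise SyntaxError("element expected after separator")
        value, toks = nxt
        items.append(value)
    return items, toks
```

With a plus-form production the only change would be to fail instead of returning an empty list when the first element is missing; the generated domain declaration (EXPLIST = EXP*) is the same either way, as noted above.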
The production names in the list of grammatical tokens must be in the same order and are assumed to be the same as the production names used in the corresponding Prolog term: the first production name in the grammatical token list must correspond to the first name in the term, and so on. If two corresponding names are not equal, a conversion between the two names is inserted in the parser. For example:

    MYPRODNAME = upper(STRING) -> MYPRODNAME

generates the following parse predicate:

    s_myprodname([t(upper(STRING),_)|LL], LL, MYPRODNAME) :- !,
      STRING = MYPRODNAME.

The BNF grammar in the following table shows how input for the parser generator must be specified.

    <parser generator input> ::= userdefined productions <production names>
                               | userdefined domains <production names>
                               | productions <productions>
    <associativity> ::= rightassoc
    <asterisk or plus> ::= * | +
    <general production> ::= <priority group> {-- <priority group>}*
    <grammatical tokens> ::= <grammatical token>*
    <grammatical token> ::= <production name> | <scanner token> [:CURSOR]
    <individual production> ::= [<associativity>] <grammatical tokens> -> <term>
    <lower case string> ::= {a | b | c | d | e | f | g | h | i | j | k | l | m |
                             n | o | p | q | r | s | t | u | v | w | x | y | z}+
    <priority group> ::= <individual production> {, <individual production>}*
    <production> ::= <production name> = <production body>
    <productions> ::= <production>+
    <production body> ::= <general production>
                        | <production name> <asterisk or plus> [<separator>]
    <production name> ::= <upper case string>
    <production names> ::= <production name> {, <production name>}*
    <scanner token> ::= <lower case string> (<production names>)
                      | <lower case string>
    <separator> ::= <string>
    <term> ::= <lower case string>
             | <lower case string> (<production names>)
             | <production name>
    <upper case string> ::= {A | B | C | D | E | F | G | H | I | J | K | L | M |
                             N | O | P | Q | R | S | T | U | V | W | X | Y | Z}+

    Figure 1: BNF specification of the Grammar

Notice that there are several important differences between the standard BNF syntax described at the beginning of the document and the syntax to which the parser generator's input must conform:

- The < and > characters that normally surround a non-terminal are not allowed, and the non-terminals are capitalized.
- Instead of the ::= operator to indicate the body of a production, the parser generator uses =.
- The parser generator provides support for priority groups and right associativity.
- The parser generator requires that each individual production be accompanied by a description of the Prolog term that is to be constructed when that production is recognized.

The next sections focus on two complete demonstrations using the parser generator. Throughout these examples we'll be discussing issues generic to all programs developed with the assistance of this tool.

Creating a Simple Calculator

Our first objective will be to build a program that accepts equations like

    (20 * (12 + 14.5)) ^ 40

and returns the result. To start with, we'll need a grammar. Using BNF syntax, let's describe what input to our "calculator" will look like. (Note that everything to the right of each percent sign (%) is a comment and would not be part of conventional BNF syntax.)

    <expr> ::= ( <expr> )          % parenthetical expressions
             | <expr> * <expr>     % multiplication
             | <expr> + <expr>     % addition
             | <expr> ^ <expr>     % exponentiation
             | <expr> - <expr>     % subtraction
             | <expr> / <expr>     % division
             | <number>            % floating-point/integer constants

Two things to note about this grammar:

- The recursive use of <expr> provides support for expressions within an expression.
- Developing production rules for <number> is not necessary in our case because PDC Prolog contains built-ins that recognize numeric constants.

Converting to Parser Generator Syntax

Our next task is to convert from the BNF representation to that of the parser generator.
This is a fairly straightforward process involving seven steps:

1. Identifying user-defined domains and productions.
2. Substituting parser generator syntax.
3. Developing the priority groups.
4. Identifying the right-associative productions (if any).
5. Adding optional separator info to list productions.
6. Describing, for each non-list production, the Prolog term that is to be constructed when that production is found in the source text.
7. Identifying tokens whose location should be preserved in the output term.

As you'll see, not all of these steps are needed for our calculator example.

Identifying User-Defined Domains/Productions

In special situations, it might become necessary to define your own domains and parsing predicates. For these situations, two special sections are supported: userdefined domains and userdefined productions. After the appropriate section heading, list the names (separated by commas if there is more than one) of the user-defined domains/productions.

    userdefined domains
        DOMAIN1, DOMAIN2

    userdefined productions
        PRODUCTION1, PRODUCTION2

These sections must appear first in the parser generator input. Note that no userdefined sections are required for our calculator's grammar.

Substituting Parser Generator Syntax

The next step in converting a BNF to the syntax of the parser generator involves making a few cosmetic changes to the current form of the grammar. This is a six-step process:

1. Remove the < and > characters that surround the non-terminals and shift the name of each non-terminal to upper case.
2. Substitute a unique lower-case name for each of the terminals (+, -, ^, etc., in our example).
3. Represent as a standard domain surrounded by a functor those non-terminals that you'd rather treat as terminals (perhaps because it would be more convenient to recognize them at the lexical analysis stage); <number> in our calculator's grammar is one such non-terminal.
4.
Replace all ::= with =.
5. Separate every production with a comma (,) instead of a bar (|).
6. Head the section with the word "productions".

For our calculator's BNF, the result of this step would be:

    productions
        EXPR = lpar EXPR rpar,
               EXPR mult EXPR,
               EXPR plus EXPR,
               EXPR power EXPR,
               EXPR minus EXPR,
               EXPR div EXPR,
               number(REAL)

Developing Priority Groups

This step involves grouping/ordering of the production rules and is performed in three passes:

1. Separating the productions into priority groups.
2. Ordering the priority groups.
3. Ordering the productions within each priority group.

Separating Productions into Priority Groups

To be consistent with conventional rules of precedence, certain operators should be given an identical evaluation priority (addition and subtraction are at one priority, multiplication and division are at another, etc.). Production rules must accordingly be grouped with others having the same priority. For our calculator's grammar, four priority groups should be fairly apparent. (The double hyphen (--) is the parser generator's priority group separator; productions within each group are separated by a comma.)

    productions
        EXPR = lpar EXPR rpar, number(REAL) --
               EXPR mult EXPR, EXPR div EXPR --
               EXPR plus EXPR, EXPR minus EXPR --
               EXPR power EXPR

Ordering the Priority Groups

Once the priority groups are established, they must be ordered so that the group having the loosest binding appears first in the parser generator input. This ordering will provide guidance when evaluating expressions, so that, instead of a simple left-to-right evaluation, those operations that bind most tightly will be performed before the others. For example, in the expression 20+30*40-50, the multiplication (an operation that binds more tightly than addition or subtraction) should be performed first.
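To see what this buys us, consider the shape of the parse tree for 20+30*40-50. The following sketch assumes the output terms (add, subtract, multiply, number) that we choose later in this section; with correct precedence, the multiplication ends up nested deepest in the tree, so it is evaluated first:

```prolog
% 20+30*40-50 parses as (20 + (30*40)) - 50, yielding the term:
subtract(
    add(number(20),
        multiply(number(30), number(40))),
    number(50))
```

A tree-walking evaluator visits the innermost subterms first, so the multiplication is carried out before the addition, which in turn is carried out before the subtraction.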
The priority groups for our calculator's grammar should therefore be ordered as follows:

    productions
        EXPR = EXPR plus EXPR, EXPR minus EXPR --
               EXPR mult EXPR, EXPR div EXPR --
               EXPR power EXPR --
               lpar EXPR rpar, number(REAL)

Ordering Productions within Each Priority Group

In situations where more than one production (within a given priority group) could match the same source text, a parser produced by the parser generator will assume that the first applicable production is the correct one. For example, suppose we converted the following BNF (plus the productions for recognizing a <condition>) into a valid parser generator grammar

    <statement> ::= if <condition> then <statement>
                  | if <condition> then <statement> else <statement>

and then generated a parser from this specification. The parser would interpret

    if A < 5 then
        if A > 1 then
            print "A < 5 and A > 1"
        else
            print "A >= 5"

to mean

    if A < 5 then
        ( if A > 1 then
              print "A < 5 and A > 1"
          else
              print "A >= 5" )

because the if/then production appears before the if/then/else production (in the parser generator input). If the order of the two production rules were reversed, the interpretation would be

    if A < 5 then
        ( if A > 1 then
              print "A < 5 and A > 1" )
    else
        print "A >= 5"

Note that these issues do not apply to our calculator's grammar.

Identifying Right-Associative Productions

Normally, evaluations involving operators of like priority are performed on a left-to-right basis. For example, in the expression 20+30-40, the addition should be carried out before the subtraction. Productions where evaluation begins on the left are said to be "left associative" (the parser generator's default). But for some operations, left associativity yields an incorrect result. For example, evaluating the expression 20 ^ 30 ^ 40 should begin on the right, working towards the left. As you would suspect, productions requiring this right-to-left evaluation are termed "right associative."
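In terms of output terms, the two associativities produce differently shaped trees. The sketch below assumes the power and number output terms chosen later for our calculator:

```prolog
% Right-associative: 20 ^ (30 ^ 40) -- correct for exponentiation.
power(number(20), power(number(30), number(40)))

% Left-associative (the default): (20 ^ 30) ^ 40 -- wrong here.
% power(power(number(20), number(30)), number(40))
```

A tree-walking evaluator works on the innermost subterm first, so the right-associative term evaluates 30 ^ 40 before applying 20 to the result, as exponentiation requires.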
In our calculator's grammar, only the production that handles exponentiation is right associative.

    productions
        EXPR = EXPR plus EXPR, EXPR minus EXPR --
               EXPR mult EXPR, EXPR div EXPR --
               rightassoc EXPR power EXPR --
               lpar EXPR rpar, number(REAL)

Adding Separator Information to List Productions

In addition to the + (one or more) and * (zero or more) BNF operators, the parser generator grammar provides for the specification of an optional separator token.

    FIELD_LIST = FIELD* separator semicolon

Note: In all list productions, the list element must itself be a production (a non-terminal); the only token that may appear in a list production is the separator token. Our calculator's grammar doesn't have any list productions.

Describing the Output Terms

It's during this step that we specify how the parser should identify which production was applied. For example, if we provide our calculator with the equation

    5 + 7.3

we want our calculator's parser to respond with the term

    add(number(5), number(7.3))

to indicate that it found the addition operator separating two numbers (and what the actual values of the numbers were). The productions, and each one's associated output term (those we've chosen to use), for our calculator program are as follows:

    productions
        EXPR = EXPR plus EXPR -> add(EXPR, EXPR),
               EXPR minus EXPR -> subtract(EXPR, EXPR) --
               EXPR div EXPR -> divide(EXPR, EXPR),
               EXPR mult EXPR -> multiply(EXPR, EXPR) --
               rightassoc EXPR power EXPR -> power(EXPR, EXPR) --
               lpar EXPR rpar -> parenthetical(EXPR),
               number(REAL) -> number(REAL)

This is also the final form of our parser generator input (contained in XCALC.GRM).

Preserving Token Locations

Some applications require references back to the source text after the scanning/parsing stage. In these situations, the grammar can specify that one or more token locations are stored in a production's output term.
This is done with the :CURSOR keyword:

    NUMBER = number(REAL):CURSOR -> number(REAL, CURSOR)

or

    RANGE = from int(INTEGER):CURSOR to int(INTEGER):CURSOR
            -> range(INTEGER, CURSOR, INTEGER, CURSOR)

Our calculator grammar does not require use of the :CURSOR keyword.

Generating the Parser

If we have made no syntax errors in our grammar and we instruct the parser generator to generate a parser from it, two files will automatically be produced: one containing the parsing predicates (with a .PAR extension), and the other containing some of the domains needed by the parsing predicates (with a .DOM extension). In the case of our calculator program, since the input grammar is contained in the file XCALC.GRM, the generated parsing predicates will be placed in XCALC.PAR and the domains in XCALC.DOM. Once you've compiled the parser generator, run PARSER, load XCALC.GRM as the input grammar, and instruct the program to generate the parser. The following two files should be created:

    /***********************************************************
                      PARSING PREDICATES
    ***********************************************************/

    PREDICATES
        s_expr(TOKL, TOKL, EXPR)
        s_expr1(TOKL, TOKL, EXPR)
        s_expr2(TOKL, TOKL, EXPR)
        s_expr3(TOKL, TOKL, EXPR)
        s_expr4(TOKL, TOKL, EXPR)
        s_expr5(TOKL, TOKL, EXPR, EXPR)
        s_expr6(TOKL, TOKL, EXPR, EXPR)
        s_expr7(TOKL, TOKL, EXPR, EXPR)

    CLAUSES
        s_expr(LL1, LL0, EXPR) :-
            s_expr1(LL1, LL0, EXPR).

        s_expr1(LL1, LL0, EXPR_) :-
            s_expr2(LL1, LL2, EXPR),
            s_expr5(LL2, LL0, EXPR, EXPR_).

        s_expr2(LL1, LL0, EXPR_) :-
            s_expr3(LL1, LL2, EXPR),
            s_expr6(LL2, LL0, EXPR, EXPR_).

        s_expr3(LL1, LL0, EXPR_) :-
            s_expr4(LL1, LL2, EXPR),
            s_expr7(LL2, LL0, EXPR, EXPR_).

        s_expr4([t(lpar,_) | LL1], LL0, parenthetical(EXPR)) :- !,
            s_expr(LL1, LL2, EXPR),
            expect(t(rpar,_), LL2, LL0).
        s_expr4([t(number(REAL),_) | LL], LL, number(REAL)) :- !.
        s_expr4(LL, _, _) :-
            syntax_error(expr4, LL),
            fail.
        s_expr5([t(plus,_) | LL1], LL0, EXPR, EXPR_) :- !,
            s_expr2(LL1, LL2, EXPR1),
            s_expr5(LL2, LL0, add(EXPR, EXPR1), EXPR_).
        s_expr5([t(minus,_) | LL1], LL0, EXPR, EXPR_) :- !,
            s_expr2(LL1, LL2, EXPR1),
            s_expr5(LL2, LL0, subtract(EXPR, EXPR1), EXPR_).
        s_expr5(LL, LL, EXPR, EXPR).

        s_expr6([t(mult,_) | LL1], LL0, EXPR, EXPR_) :- !,
            s_expr3(LL1, LL2, EXPR1),
            s_expr6(LL2, LL0, multiply(EXPR, EXPR1), EXPR_).
        s_expr6([t(div,_) | LL1], LL0, EXPR, EXPR_) :- !,
            s_expr3(LL1, LL2, EXPR1),
            s_expr6(LL2, LL0, divide(EXPR, EXPR1), EXPR_).
        s_expr6(LL, LL, EXPR, EXPR).

        s_expr7([t(power,_) | LL1], LL0, EXPR, power(EXPR, EXPR1)) :- !,
            s_expr3(LL1, LL0, EXPR1).
        s_expr7(LL, LL, EXPR, EXPR).

and XCALC.DOM:

    /***********************************************************
                      DOMAIN DEFINITIONS
    ***********************************************************/

    DOMAINS
        EXPR = add(EXPR, EXPR); subtract(EXPR, EXPR);
               multiply(EXPR, EXPR); divide(EXPR, EXPR);
               power(EXPR, EXPR); parenthetical(EXPR);
               number(REAL)
        TOK  = plus(); minus(); mult(); div(); power();
               lpar(); rpar(); number(REAL); nil

Using the Generated Parser

Now that we've generated the calculator's parser, it's time to consider the overall structure of our program. We want to read equations from the user (entered from the keyboard) and evaluate each one, so we use the following predicate as our primary control structure.

    user_interface :-
        write("\n\n\n\nEnter expression (Esc to quit): "),
        readln(EXPR),        % Fails on Esc.
        !,
        evaluate_expression(EXPR),
        user_interface.
    user_interface.

Evaluating each expression turns out to be a three-step process:

1. Converting the expression into a list of tokens, the data structure required as input to the parser.
2. Passing the list of tokens to the parser.
3. Performing the actual calculation, using the term returned from the parser.

We'll use the following as our evaluation predicate.
    evaluate_expression(EXPR) :-
        tokenize(EXPR, TOKENS),
        write("\nTokens -> ", TOKENS),
        parse(TOKENS, TERM),
        write("\n\nTerm ---> ", TERM),
        !,
        calculate(TERM, RESULT),
        write("\n\nAnswer -> ", RESULT).
    evaluate_expression(_) :-
        sound(30, 300),
        write("\n\n<<Illegal Expression>>").

It's important to note that tokenize/2 is designed to fail if the user entered unrecognized tokens, and parse/2 will fail if the tokens don't comprise a valid equation (per our calculator's grammar).

Lexical Analysis

Lexical analysis, or scanning, refers to the phase of identifying the tokens in the input string. Our calculator source code employs tokenize/2 for this purpose:

    tokenize(EXPRESSION, TOKENS) :-
        scan(0, EXPRESSION, TOKENS).

It calls scan/3 to actually convert the expression (a string) into the list of tokens. To maintain compatibility with the parsing predicates, the returned list must be of the domain TOKL, which is defined as:

    DOMAINS
        CURSOR    = INTEGER
        CURSORTOK = t(TOK, CURSOR)
        TOKL      = CURSORTOK*

You may have noticed the TOK domain as defined in XCALC.DOM. It contains a list of all of the tokens our scanner will need to recognize. CURSOR is used to keep track of where each is found in the input expression. As an example, if the input to our calculator was

    20 * 30 ^ 40

the scanner should return the list

    [t(number(20),0), t(mult,2), t(number(30),4), t(power,6), t(number(40),8)]

The following definition of scan/3 (contained in XCALC.SCA) should suit our needs nicely.

    DOMAINS
        NUMBER_OF_EXTRA_CHARACTERS = INTEGER
        NUMBER_OF_SPACES = INTEGER

    PREDICATES
        is_a_space(CHAR)
        scan(CURSOR, SOURCE, TOKL)
        skip_spaces(SOURCE, SOURCE, NUMBER_OF_SPACES)
        string_token(STRING, TOK)

    CLAUSES
        is_a_space(' ').
        is_a_space('\t').
        is_a_space('\n').

        scan(STARTING_POSITION, SOURCE, [t(TOKEN, LOCATION_OF_TOKEN) | TAIL]) :-
            skip_spaces(SOURCE, NEW_SOURCE, NUMBER_OF_SPACES),
            LOCATION_OF_TOKEN = STARTING_POSITION + NUMBER_OF_SPACES,
            fronttoken(NEW_SOURCE, FRONTTOKEN, REST),
            !,
            % Make the case of each token unimportant.
            upper_lower(FRONTTOKEN, LOWER_CASE_FRONTTOKEN),
            string_token(LOWER_CASE_FRONTTOKEN, TOKEN),
            str_len(FRONTTOKEN, LENGTH_OF_FRONTTOKEN),
            NEW_STARTING_POSITION = LOCATION_OF_TOKEN + LENGTH_OF_FRONTTOKEN,
            scan(NEW_STARTING_POSITION, REST, TAIL).
        scan(_, _, []).

        skip_spaces(SOURCE, NEW_SOURCE, NUMBER_OF_SPACES) :-
            frontchar(SOURCE, CHAR, SOURCE1),
            is_a_space(CHAR),
            !,
            skip_spaces(SOURCE1, NEW_SOURCE, NUMBER_OF_SPACES_IN_SOURCE1),
            NUMBER_OF_SPACES = NUMBER_OF_SPACES_IN_SOURCE1 + 1.
        skip_spaces(SOURCE, SOURCE, 0).

        string_token("/", div) :- !.
        string_token("(", lpar) :- !.
        string_token("-", minus) :- !.
        string_token("*", mult) :- !.
        string_token(STRING, number(INTEGER)) :- str_int(STRING, INTEGER), !.
        string_token(STRING, number(REAL)) :- str_real(STRING, REAL), !.
        string_token("+", plus) :- !.
        string_token("^", power) :- !.
        string_token(")", rpar) :- !.

Parsing

Parsing basically involves calling the highest-level parsing predicate (the one corresponding to the start production), handing it the tokens returned by the lexical analysis pass. The predicate naming convention used by the parser generator is simply the letter s, followed by an underscore, followed by the name of a non-terminal. The start production for our calculator grammar is EXPR (actually, EXPR is the only non-terminal we defined), so the predicate we call to handle our parsing is named s_expr.

    parse(TOKENS, TERM) :-
        s_expr(TOKENS, UNUSED_TOKENS, TERM),
        UNUSED_TOKENS = [].

Notice that s_expr returns a list of unused tokens as its second parameter. This is because the parsers produced by the parser generator perform their magic via a technique known as "parsing with difference lists." In this strategy, each parsing predicate takes something off the front of the input list and returns the remainder of the list (the unused portion). This remainder is handed as input to the next parsing predicate, which returns its remainder, and so on.
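The strategy can be sketched in a few lines. The following is a minimal illustration only, not part of the generated XCALC.PAR code; the predicates s_digit and s_sum are hypothetical names invented for this sketch:

```prolog
% Each parsing predicate consumes tokens from the front of the first
% list and returns the unused remainder in its second argument.
s_digit([t(number(N), _) | Rest], Rest, number(N)).

% Recognize "<number> + <number>": consume a number, then a plus
% token, then another number; whatever follows is handed back unused.
s_sum(LL1, LL0, add(A, B)) :-
    s_digit(LL1, LL2, A),
    LL2 = [t(plus, _) | LL3],
    s_digit(LL3, LL0, B).
```

Called on the token list for "5 + 7 * 2", s_sum would bind its output term to add(number(5), number(7)) and return the tokens for "* 2" as the unused remainder.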
When the list is exhausted, a successful parse is normally assumed to have occurred. Hence parse/2's test to ensure that UNUSED_TOKENS is empty: if there are unused tokens when s_expr returns, the parse failed.

Using the Parser's Output

As mentioned earlier, the parsers produced by the parser generator return a term (a tree) that represents the structure of the input expression. For example, if the input to our calculator's parser was the tokenized form of

    20 * 30 ^ 40

then the returned term would be

    multiply(number(20), power(number(30), number(40)))

So to obtain the solution to the user's equation, we need a predicate that knows how to traverse these trees, performing the appropriate calculations along the way. The following seven clauses (one for each production in the calculator's grammar) will do the job for us.

    calculate(add(EXPR1, EXPR2), RESULT) :-
        calculate(EXPR1, RESULT1),
        calculate(EXPR2, RESULT2),
        RESULT = RESULT1 + RESULT2.
    calculate(subtract(EXPR1, EXPR2), RESULT) :-
        calculate(EXPR1, RESULT1),
        calculate(EXPR2, RESULT2),
        RESULT = RESULT1 - RESULT2.
    calculate(multiply(EXPR1, EXPR2), RESULT) :-
        calculate(EXPR1, RESULT1),
        calculate(EXPR2, RESULT2),
        RESULT = RESULT1 * RESULT2.
    calculate(divide(EXPR1, EXPR2), RESULT) :-
        calculate(EXPR1, RESULT1),
        calculate(EXPR2, RESULT2),
        RESULT = RESULT1 / RESULT2.
    calculate(power(EXPR1, EXPR2), RESULT) :-
        calculate(EXPR1, RESULT1),
        calculate(EXPR2, RESULT2),
        RESULT = exp(ln(RESULT1) * RESULT2).
    calculate(parenthetical(EXPR), RESULT) :-
        calculate(EXPR, RESULT).
    calculate(number(RESULT), RESULT).

Filling in Holes

All of the various pieces of this program come together in the main program file (XCALC.PRO).

    check_determ

    CONSTANTS
        dialog_window = 1

    DOMAINS
        CURSOR    = INTEGER
        CURSORTOK = t(TOK, CURSOR)
        MESSAGE   = STRING
        RESULT    = REAL
        SOURCE    = STRING
        TOKL      = CURSORTOK*

    include "xdemos\\xcalc.dom"    % Parser domains (from parser generator).
    PREDICATES
        expect(CURSORTOK, TOKL, TOKL)
        syntax_error(MESSAGE, TOKL)

    % Contained Predicates
    % --------------------
    include "parser\\windows.pro"   % window_...
    include "xdemos\\xcalc.sca"     % scan/3
    include "xdemos\\xcalc.par"     % s_expr/3
    include "xdemos\\xcalc.run"     % calculate/2
    include "xdemos\\xcalc.ui"      % user_interface/0

    CLAUSES
        expect(TOK, [TOK | L], L).
        syntax_error(_, _).

    GOAL
        window_Make(dialog_window, white_on_blue, white_on_black,
                    " Simple Calculator ", full),
        window_ChangeFrame(bottom, center,
                    " Legal operators include *, /, +, -, ^, and parentheses. ",
                    blue_on_white),
        user_interface,
        removewindow.

The only important aspects of this program that we haven't yet discussed are the predicates expect/3 and syntax_error/2. A version of these two must be supplied by any application using a parser-generator-produced parser.

The predicate expect/3 is called when a grammatical entity should be followed by a token (of the TOK domain). Its three parameters are the expected token (input), the list of tokens currently being processed (input), and the tail of the above list (output). The predicate expect/3 should fail if the expected token is not the first token in the list (of tokens). Otherwise, it should return the (possibly empty) tail of the list.

The predicate syntax_error/2 is called when a production is determined not to match the particular list of tokens. Its two parameters are the name of the production that didn't match (input) and the list of tokens being processed (input).

Note that our calculator program uses the absolute minimum definitions for these predicates. In the next section, we'll look at more elaborate versions of these two, as used in a better error-reporting scheme. But before we go on, try compiling XCALC.PRO, entering calculations and observing the output. It's important that you have a good understanding of what we've done thus far before going on to the more advanced topics.
Improving the User Interface

Currently, when the calculator program can't make sense out of the user's input, only the message "<<Illegal Expression>>" is provided as feedback. It would be nice if the program could instead provide a more descriptive error message and even go so far as to "point" (visually) at the problem. As it turns out, making the changes necessary to accomplish this feat is a simple and straightforward task. The next few sections detail this procedure.

False Alarms

As the parser works its way down through the Prolog clauses that implement the production rules, some productions will naturally fail to match the input expression. When this happens, the parser makes a call to syntax_error/2. But, since the first production tried is not always the correct one, this may not actually be an error (a real syntax error occurs only if no matching production can be found). Therefore, in the event of an unsuccessful parse, we need a strategy for deciding which of these "false alarms" to identify as the syntax error.

One such strategy is simply to keep track of the error that occurred at the deepest level in the source text. Each time syntax_error/2 is called, compare the cursor position of the current token with the cursor position of the last syntax error. If the newest error occurred further into the source text than the previous error, save it as the deepest level achieved. Then, should the parse fail, display the currently saved error information as the syntax error. The declaration

    DATABASE
        determ error(MESSAGE, CURSOR)

identifies a good storage place for this information. We'll use error/2 in the upcoming changes to the calculator program.

Changing syntax_error/2

So our first order of business is to replace our minimal version of syntax_error/2

    syntax_error(_, _).

with one that really does something.
    syntax_error(PROD, [t(_, CURSOR) | _]) :-
        is_a_deeper_error(CURSOR),
        format(MESSAGE, "Syntax error in %.", PROD),
        replace_error(MESSAGE, CURSOR).
    syntax_error(PROD, []) :-
        format(MESSAGE, "Syntax error in % (premature end of text).", PROD),
        replace_error(MESSAGE, end_of_text).

The first clause implements our "deepest error" strategy with the assistance of a couple of utility predicates.

    is_a_deeper_error(CURSOR) :-
        error(_, OLD_CURSOR),
        !,
        CURSOR > OLD_CURSOR.
    is_a_deeper_error(_).

    replace_error(MESSAGE, CURSOR) :-
        retractall(error(_, _)),
        assert(error(MESSAGE, CURSOR)).

The second syntax_error clause handles an unexpected end of source text. (The identifier end_of_text is a special one we use to denote the cursor position of the end of the input expression.) Note that there is no need to check whether this is the deepest error thus far; you can't get any deeper than the end of text!

Changing evaluate_expression/1

To report the deepest error found (upon an unsuccessful parse), we need to change the second clause of evaluate_expression/1 from

    evaluate_expression(_) :-
        sound(30, 300),
        write("\n\n<<Illegal Expression>>").

to one which will display this additional information.

    evaluate_expression(EXPR) :-
        sound(30, 300),
        write("\n\n<<Illegal Expression>>"),
        retract(error(MESSAGE, CURSOR)),    % Get error info.
        nl, nl,
        write(MESSAGE),
        % Don't point if error is regarding end-of-text.
        point_to_error_if_appropriate(CURSOR, EXPR).

The actual pointing at the problem will be handled by

    point_to_error_if_appropriate(end_of_text, _) :- !.   % Don't if end-of-text.
    point_to_error_if_appropriate(CURSOR, EXPR) :-
        str_len(FILLER, CURSOR),
        writef("\n\n%s%c", FILLER, down_arrow),
        nl,
        write(EXPR).

where we've decided (arbitrarily) not to point if it's a problem with the end of text.

Additional Clauses for expect/3

Now that we've taken care of syntax errors, there are similar changes we could make to expect/3 that will also help pinpoint the problem in the source text.
Namely, adding the following two alternative clauses

    expect(t(TOKEN, _), [t(_, CURSOR) | _], _) :-
        is_a_deeper_error(CURSOR),
        string_token(STRING, TOKEN),
        format(MESSAGE, "Expected %.", STRING),
        replace_error(MESSAGE, CURSOR),
        fail.
    expect(t(TOKEN, _), [], _) :-
        string_token(STRING, TOKEN),
        format(MESSAGE, "Expected % (premature end of text).", STRING),
        replace_error(MESSAGE, end_of_text),
        fail.

after the present one

    expect(TOK, [TOK | L], L) :- !.

will flag an error when the expected token is not found where anticipated. Notice the calling of string_token/2 (defined in XCALC2.SCA) to convert the expected token to its raw form.

Extra Tokens

Instead of the parse simply failing when our top-level production predicate returns a non-empty list of tokens

    parse(TOKENS, TERM) :-
        s_expr(TOKENS, UNUSED_TOKENS, TERM),
        UNUSED_TOKENS = [].

it would be nice to specifically report the problem. This is easily handled by rewriting this predicate as

    parse(TOKENS, TERM) :-
        s_expr(TOKENS, UNUSED_TOKENS, TERM),
        insure_list_is_empty(UNUSED_TOKENS).

where insure_list_is_empty/1 is written as

    insure_list_is_empty([]) :- !.
    insure_list_is_empty([t(_, CURSOR) | _]) :-
        is_a_deeper_error(CURSOR),
        replace_error("Unexpected token.", CURSOR),
        fail.

The Upgraded Program

All of these changes are contained in the files XCALC2.PRO and XCALC2.UI. First, XCALC2.PRO:

    check_determ

    CONSTANTS
        dialog_window = 1

    DOMAINS
        CURSOR    = INTEGER
        CURSORTOK = t(TOK, CURSOR)
        MESSAGE   = STRING
        RESULT    = REAL
        SOURCE    = STRING
        TOKL      = CURSORTOK*

    include "xdemos\\xcalc.dom"    % Parser domains (from parser generator).

    DATABASE
        determ error(MESSAGE, CURSOR)

    PREDICATES
        expect(CURSORTOK, TOKL, TOKL)
        is_a_deeper_error(CURSOR)
        replace_error(MESSAGE, CURSOR)
        syntax_error(MESSAGE, TOKL)

    % Contained Predicates
    % --------------------
    include "parser\\windows.pro"   % window_...
    include "xdemos\\xcalc2.sca"    % scan/3
    include "xdemos\\xcalc.par"     % s_expr/3
    include "xdemos\\xcalc.run"     % calculate/2
    include "xdemos\\xcalc2.ui"     % user_interface/0

    CLAUSES
        expect(TOK, [TOK | L], L) :- !.
        expect(t(TOKEN, _), [t(_, CURSOR) | _], _) :-
            is_a_deeper_error(CURSOR),
            string_token(STRING, TOKEN),
            format(MESSAGE, "Expected %.", STRING),
            replace_error(MESSAGE, CURSOR),
            fail.
        expect(t(TOKEN, _), [], _) :-
            string_token(STRING, TOKEN),
            format(MESSAGE, "Expected % (premature end of text).", STRING),
            replace_error(MESSAGE, end_of_text),
            fail.

        is_a_deeper_error(CURSOR) :-
            error(_, OLD_CURSOR),
            !,
            CURSOR > OLD_CURSOR.
        is_a_deeper_error(_).

        replace_error(MESSAGE, CURSOR) :-
            retractall(error(_, _)),
            assert(error(MESSAGE, CURSOR)).

        syntax_error(PROD, [t(_, CURSOR) | _]) :-
            is_a_deeper_error(CURSOR),
            format(MESSAGE, "Syntax error in %.", PROD),
            replace_error(MESSAGE, CURSOR).
        syntax_error(PROD, []) :-
            format(MESSAGE, "Syntax error in % (premature end of text).", PROD),
            replace_error(MESSAGE, end_of_text).

    GOAL
        window_Make(dialog_window, white_on_blue, white_on_black,
                    " Simple Calculator (with improved error reporting) ", full),
        window_ChangeFrame(bottom, center,
                    " Legal operators include *, /, +, -, ^, and parentheses. ",
                    blue_on_white),
        user_interface,
        removewindow.

XCALC2.UI defines the user interface:

    CONSTANTS
        down_arrow  = 25
        end_of_text = -1

    PREDICATES
        evaluate_expression(STRING)
        insure_list_is_empty(TOKL)
        parse(TOKL, EXPR)
        point_to_error_if_appropriate(INTEGER, STRING)
        tokenize(STRING, TOKL)
        user_interface

    CLAUSES
        evaluate_expression(EXPR) :-
            tokenize(EXPR, TOKENS),
            write("\nTokens -> ", TOKENS),
            parse(TOKENS, TERM),
            write("\n\nTerm ---> ", TERM),
            !,
            calculate(TERM, RESULT),
            write("\n\nAnswer -> ", RESULT).
        evaluate_expression(EXPR) :-
            sound(30, 300),
            write("\n\n<<Illegal Expression>>"),
            retract(error(MESSAGE, CURSOR)),    % Get error info.
            nl, nl,
            write(MESSAGE),
            % Don't point if error is regarding end-of-text.
            point_to_error_if_appropriate(CURSOR, EXPR).

        insure_list_is_empty([]) :- !.
        insure_list_is_empty([t(_, CURSOR) | _]) :-
            is_a_deeper_error(CURSOR),
            replace_error("Unexpected token.", CURSOR),
            fail.

        parse(TOKENS, TERM) :-
            s_expr(TOKENS, UNUSED_TOKENS, TERM),
            insure_list_is_empty(UNUSED_TOKENS).

        point_to_error_if_appropriate(end_of_text, _) :- !.   % Don't if end-of-text.
        point_to_error_if_appropriate(CURSOR, EXPR) :-
            str_len(FILLER, CURSOR),
            writef("\n\n%s%c", FILLER, down_arrow),
            nl,
            write(EXPR).

        tokenize(EXPR, TOKENS) :- scan(0, EXPR, TOKENS).

        user_interface :-
            write("\n\n\n\nEnter expression (Esc to quit): "),
            readln(EXPR),    % Fails on Esc.
            !,
            evaluate_expression(EXPR),
            user_interface.
        user_interface.

The Scanner Revisited

That's it for the parsing side of things, but a minor change to the scanner will also allow us to identify errors which occur during lexical analysis. Basically, what we want to do is report unrecognized tokens. In the case of our calculator, this involves identifying tokens which won't unify with any of the string_token/2 clauses. The predicate get_token/3 will work for this purpose.

    get_token(FRONTTOKEN, TOKEN, _) :-
        string_token(FRONTTOKEN, TOKEN),
        !.
    get_token(_, _, CURSOR) :-
        is_a_deeper_error(CURSOR),
        replace_error("Illegal token.", CURSOR),
        fail.

We'll need to change the first scan/3 clause to call get_token instead of string_token, making the new definition:

    scan(STARTING_POSITION, SOURCE, [t(TOKEN, CURSOR) | TAIL]) :-
        skip_spaces(SOURCE, NEW_SOURCE, NUMBER_OF_SPACES),
        CURSOR = STARTING_POSITION + NUMBER_OF_SPACES,
        fronttoken(NEW_SOURCE, FRONTTOKEN, REST),
        !,
        % Make the case of each token unimportant.
        upper_lower(FRONTTOKEN, LOWER_CASE_FRONTTOKEN),
        get_token(LOWER_CASE_FRONTTOKEN, TOKEN, CURSOR),
        str_len(FRONTTOKEN, LENGTH_OF_FRONTTOKEN),
        NEW_STARTING_POSITION = CURSOR + LENGTH_OF_FRONTTOKEN,
        scan(NEW_STARTING_POSITION, REST, TAIL).
    scan(_, _, []).
The new scanner is contained in XCALC2.SCA:

    DOMAINS
        NUMBER_OF_EXTRA_CHARACTERS = INTEGER
        NUMBER_OF_SPACES = INTEGER

    PREDICATES
        get_token(STRING, TOK, CURSOR)
        is_a_space(CHAR)
        scan(CURSOR, SOURCE, TOKL)
        skip_spaces(SOURCE, SOURCE, NUMBER_OF_SPACES)
        string_token(STRING, TOK)

    CLAUSES
        get_token(FRONTTOKEN, TOKEN, _) :-
            string_token(FRONTTOKEN, TOKEN),
            !.
        get_token(_, _, CURSOR) :-
            is_a_deeper_error(CURSOR),
            replace_error("Illegal token.", CURSOR),
            fail.

        is_a_space(' ').
        is_a_space('\t').
        is_a_space('\n').

        scan(STARTING_POSITION, SOURCE, [t(TOKEN, CURSOR) | TAIL]) :-
            skip_spaces(SOURCE, NEW_SOURCE, NUMBER_OF_SPACES),
            CURSOR = STARTING_POSITION + NUMBER_OF_SPACES,
            fronttoken(NEW_SOURCE, FRONTTOKEN, REST),
            !,
            % Make the case of each token unimportant.
            upper_lower(FRONTTOKEN, LOWER_CASE_FRONTTOKEN),
            get_token(LOWER_CASE_FRONTTOKEN, TOKEN, CURSOR),
            str_len(FRONTTOKEN, LENGTH_OF_FRONTTOKEN),
            NEW_STARTING_POSITION = CURSOR + LENGTH_OF_FRONTTOKEN,
            scan(NEW_STARTING_POSITION, REST, TAIL).
        scan(_, _, []).

        skip_spaces(SOURCE, NEW_SOURCE, NUMBER_OF_SPACES) :-
            frontchar(SOURCE, CHAR, SOURCE1),
            is_a_space(CHAR),
            !,
            skip_spaces(SOURCE1, NEW_SOURCE, SPACES_IN_SOURCE1),
            NUMBER_OF_SPACES = SPACES_IN_SOURCE1 + 1.
        skip_spaces(SOURCE, SOURCE, 0).

        string_token("/", div) :- !.
        string_token("(", lpar) :- !.
        string_token("-", minus) :- !.
        string_token("*", mult) :- !.
        string_token(STRING, number(INTEGER)) :- str_int(STRING, INTEGER), !.
        string_token(STRING, number(REAL)) :- str_real(STRING, REAL), !.
        string_token("+", plus) :- !.
        string_token("^", power) :- !.
        string_token(")", rpar) :- !.

Calculator Wrap-up

Try running XCALC2.PRO and supplying it with your favorite poorly formed expressions. You might be surprised at how well this "deeper error" strategy works in this context. In the next section, we'll take a huge leap forward as we consider the implementation of an actual compiler for a simple programming language.
Extending the Calculator Project

Now let's expand our calculator grammar into that of an actual programming language, build an interpreter for it, and even include the ability to compile and link our language's source code (producing executable files). The language, which we'll call Calcula, will be designed to accept programs like

    clear
    write "Process up to what number? "
    read MAX_VALUE
    nl
    write "How many exponents? "
    read NUMBER_OF_EXPONENTS
    for VALUE = 1 to MAX_VALUE do
        nl
        for EXPONENT = 1 to NUMBER_OF_EXPONENTS - 1 do
            write VALUE, "^", EXPONENT
            write " = ", VALUE^EXPONENT, ", "
        enddo
        write VALUE, "^", NUMBER_OF_EXPONENTS
        write " = ", VALUE^NUMBER_OF_EXPONENTS
    enddo

So let's start with the following as our Calcula BNF:

    <program> ::= <stmnt>+           % A program has one or more statements.

    <cond> ::= <expr> <oper> <expr>

    <expr> ::= ( <expr> )            % parenthetical expressions
             | <expr> * <expr>       % multiplication
             | <expr> + <expr>       % addition
             | <expr> ^ <expr>       % exponentiation
             | <expr> - <expr>       % subtraction
             | <expr> / <expr>       % division
             | <number>              % floating-point/integer constants

    <oper> ::= =                     % equal to
             | <>                    % not equal to
             | <                     % less than
             | >                     % greater than
             | <=                    % less than or equal to
             | >=                    % greater than or equal to

    <output term>  ::= <expr> | "<string>"
    <output terms> ::= <output term> {, <output term>}*  % Output terms are an
                                                         % output term possibly
                                                         % followed by a comma and
                                                         % another output term.
<stmnt> ::= <var> = <expr>                                % Bind a variable to an expression
          | clear                                         % Clear screen
          | while <cond> do <program> enddo               % do while loop
          | for <var> = <expr> to <expr> do <program> enddo   % for loop
          | if <cond> then <program> endif                % if/then
          | if <cond> then <program> else <program> endif % if/then/else
          | input <vars>                                  % Read from user
          | memory                                        % Display memory info
          | nl                                            % Start a new line
          | output <output terms>                         % Write to screen
          | time                                          % Display current time

<var> ::= <name>

<vars> ::= <var> {, <var>}             % Variables are a variable possibly
                                       % followed by a comma and another
                                       % variable.

Note that <number>, <string> and <name> are not defined; they are already recognized by the PDC Prolog system. After applying the previously outlined BNF conversion process, we might end up with the following parser generator grammar:

userdefined domains
    OP

productions
    PROGRAM = STMNT+

    COND = EXPR OPER EXPR                  -> condition(EXPR,OPER,EXPR)

    EXPR = EXPR plus EXPR                  -> add(EXPR,EXPR),
           EXPR minus EXPR                 -> subtract(EXPR,EXPR) -
           EXPR mult EXPR                  -> multiply(EXPR,EXPR),
           EXPR div EXPR                   -> divide(EXPR,EXPR) -
           rightassoc
           EXPR power EXPR                 -> power(EXPR,EXPR) -
           lpar EXPR rpar                  -> parenthetical(EXPR),
           number(REAL)                    -> number(REAL),
           VAR                             -> var(VAR)

    OPER = op(OP):CURSOR                   -> operator(OP,CURSOR)

    OUTPUT_TERM = EXPR                     -> expr(EXPR),
                  str(STRING)              -> str(STRING)

    OUTPUT_TERMS = OUTPUT_TERM+ separator comma

    STMNT = VAR OPER EXPR                  -> bind(VAR,OPER,EXPR),
            clear                          -> clear,
            while COND do PROGRAM enddo    -> do_while(COND,PROGRAM),
            for VAR OPER EXPR to_ EXPR do PROGRAM enddo
                                           -> for(VAR,OPER,EXPR,EXPR,PROGRAM),
            if_ COND then PROGRAM endif    -> if_then(COND,PROGRAM),
            if_ COND then PROGRAM else PROGRAM endif
                                           -> if_then_else(COND,PROGRAM,PROGRAM),
            input VARS                     -> input(VARS),
            memory                         -> memory,
            nl                             -> nl,
            output OUTPUT_TERMS            -> output(OUTPUT_TERMS),
            time                           -> time

    VAR = name(STRING):CURSOR              -> var(STRING,CURSOR)

    VARS = VAR+ separator comma

Of particular interest in this grammar are: We'll be defining our
own OP domain (used in the OPER production). Unlike the calculator's grammar (where our start production was EXPR), Calcula's top-level production is PROGRAM. A PROGRAM is a list of STMNTs (one or more, with no separator between them), where a STMNT takes one of the 11 identified forms. The terms produced when the productions OPER and ***MISSING TEXT (FBB)*** are found will contain the location of the identified token. The token if_ is used (instead of if) to prevent confusion with Prolog syntax.

After supplying the above grammar as input to the parser generator, the following two files result: XCALCULA.DOM and XCALCULA.PAR.

/***********************************************************
   XCALCULA.DOM -- DOMAIN DEFINITIONS
***********************************************************/

DOMAINS
    PROGRAM = STMNT*
    COND = condition(EXPR,OPER,EXPR)
    EXPR = add(EXPR,EXPR); subtract(EXPR,EXPR); multiply(EXPR,EXPR);
           divide(EXPR,EXPR); power(EXPR,EXPR); parenthetical(EXPR);
           number(REAL); var(VAR)
    OPER = operator(OP,CURSOR)
    OUTPUT_TERM = expr(EXPR); str(STRING)
    OUTPUT_TERMS = OUTPUT_TERM*
    STMNT = bind(VAR,OPER,EXPR); clear(); do_while(COND,PROGRAM);
            for(VAR,OPER,EXPR,EXPR,PROGRAM); if_then(COND,PROGRAM);
            if_then_else(COND,PROGRAM,PROGRAM); input(VARS); memory();
            nl(); output(OUTPUT_TERMS); time()
    VAR = var(STRING,CURSOR)
    VARS = VAR*
    TOK = comma(); plus(); minus(); mult(); div(); power(); lpar(); rpar();
          number(REAL); op(OP); str(STRING); clear(); while(); do(); enddo();
          for(); to_(); if_(); then(); endif(); else(); input(); memory();
          nl(); output(); time(); name(STRING); nil

/***********************************************************
   XCALCULA.PAR -- PARSING PREDICATES
***********************************************************/

PREDICATES
    s_program(TOKL,TOKL,PROGRAM)
    s_program1(TOKL,TOKL,PROGRAM)
    s_output_terms(TOKL,TOKL,OUTPUT_TERMS)
    s_output_terms1(TOKL,TOKL,OUTPUT_TERMS)
    s_vars(TOKL,TOKL,VARS)
    s_vars1(TOKL,TOKL,VARS)
    s_cond(TOKL,TOKL,COND)
    s_expr(TOKL,TOKL,EXPR)
    s_expr1(TOKL,TOKL,EXPR)
    s_expr2(TOKL,TOKL,EXPR)
    s_expr3(TOKL,TOKL,EXPR)
    s_expr4(TOKL,TOKL,EXPR)
    s_expr5(TOKL,TOKL,EXPR,EXPR)
    s_expr6(TOKL,TOKL,EXPR,EXPR)
    s_expr7(TOKL,TOKL,EXPR,EXPR)
    s_oper(TOKL,TOKL,OPER)
    s_output_term(TOKL,TOKL,OUTPUT_TERM)
    s_stmnt(TOKL,TOKL,STMNT)
    s_var(TOKL,TOKL,VAR)

CLAUSES
    s_cond(LL1,LLO,condition(EXPR,OPER,EXPR1)):-
        s_expr(LL1,LL2,EXPR),
        s_oper(LL2,LL3,OPER),
        s_expr(LL3,LLO,EXPR1),!.

    s_expr(LL1,LLO,EXPR):-
        s_expr1(LL1,LLO,EXPR).

    s_expr1(LL1,LLO,EXPR_):-
        s_expr2(LL1,LL2,EXPR),
        s_expr5(LL2,LLO,EXPR,EXPR_).

    s_expr2(LL1,LLO,EXPR_):-
        s_expr3(LL1,LL2,EXPR),
        s_expr6(LL2,LLO,EXPR,EXPR_).

    s_expr3(LL1,LLO,EXPR_):-
        s_expr4(LL1,LL2,EXPR),
        s_expr7(LL2,LLO,EXPR,EXPR_).

    s_expr4([t(lpar,_) | LL1],LLO,parenthetical(EXPR)):-!,
        s_expr(LL1,LL2,EXPR),
        expect(t(rpar,_),LL2,LLO).
    s_expr4([t(number(REAL),_) | LL],LL,number(REAL)):-!.
    s_expr4(LL1,LLO,var(VAR)):-
        s_var(LL1,LLO,VAR),!.
    s_expr4(LL,_,_):-syntax_error(expr4,LL), fail.

    s_oper([t(op(OP),CURSOR) | LL],LL,operator(OP,CURSOR)):-!.
    s_oper(LL,_,_):-syntax_error(oper,LL), fail.

    s_output_term(LL1,LLO,expr(EXPR)):-
        s_expr(LL1,LLO,EXPR),!.
    s_output_term([t(str(STRING),_) | LL],LL,str(STRING)):-!.
    s_output_term(LL,_,_):-syntax_error(output_term,LL), fail.

    s_stmnt(LL1,LLO,bind(VAR,OPER,EXPR)):-
        s_var(LL1,LL2,VAR),
        s_oper(LL2,LL3,OPER),
        s_expr(LL3,LLO,EXPR),!.
    s_stmnt([t(clear,_) | LL],LL,clear):-!.
    s_stmnt([t(while,_) | LL1],LLO,do_while(COND,PROGRAM)):-!,
        s_cond(LL1,LL2,COND),
        expect(t(do,_),LL2,LL3),
        s_program(LL3,LL4,PROGRAM),
        expect(t(enddo,_),LL4,LLO).
    s_stmnt([t(for,_) | LL1],LLO,for(VAR,OPER,EXPR,EXPR1,PROGRAM)):-!,
        s_var(LL1,LL2,VAR),
        s_oper(LL2,LL3,OPER),
        s_expr(LL3,LL4,EXPR),
        expect(t(to_,_),LL4,LL5),
        s_expr(LL5,LL6,EXPR1),
        expect(t(do,_),LL6,LL7),
        s_program(LL7,LL8,PROGRAM),
        expect(t(enddo,_),LL8,LLO).
    s_stmnt([t(if_,_) | LL1],LLO,if_then(COND,PROGRAM)):-
        s_cond(LL1,LL2,COND),
        expect(t(then,_),LL2,LL3),
        s_program(LL3,LL4,PROGRAM),
        expect(t(endif,_),LL4,LLO),!.
    s_stmnt([t(if_,_) | LL1],LLO,if_then_else(COND,PROGRAM,PROGRAM1)):-!,
        s_cond(LL1,LL2,COND),
        expect(t(then,_),LL2,LL3),
        s_program(LL3,LL4,PROGRAM),
        expect(t(else,_),LL4,LL5),
        s_program(LL5,LL6,PROGRAM1),
        expect(t(endif,_),LL6,LLO).
    s_stmnt([t(input,_) | LL1],LLO,input(VARS)):-!,
        s_vars(LL1,LLO,VARS).
    s_stmnt([t(memory,_) | LL],LL,memory):-!.
    s_stmnt([t(nl,_) | LL],LL,nl):-!.
    s_stmnt([t(output,_) | LL1],LLO,output(OUTPUT_TERMS)):-!,
        s_output_terms(LL1,LLO,OUTPUT_TERMS).
    s_stmnt([t(time,_) | LL],LL,time):-!.
    s_stmnt(LL,_,_):-syntax_error(stmnt,LL), fail.

    s_var([t(name(STRING),CURSOR) | LL],LL,var(STRING,CURSOR)):-!.
    s_var(LL,_,_):-syntax_error(var,LL), fail.

    s_expr5([t(plus,_) | LL1],LLO,EXPR,EXPR_):-!,
        s_expr2(LL1,LL2,EXPR1),
        s_expr5(LL2,LLO,add(EXPR,EXPR1),EXPR_).
    s_expr5([t(minus,_) | LL1],LLO,EXPR,EXPR_):-!,
        s_expr2(LL1,LL2,EXPR1),
        s_expr5(LL2,LLO,subtract(EXPR,EXPR1),EXPR_).
    s_expr5(LL,LL,EXPR,EXPR).

    s_expr6([t(mult,_) | LL1],LLO,EXPR,EXPR_):-!,
        s_expr3(LL1,LL2,EXPR1),
        s_expr6(LL2,LLO,multiply(EXPR,EXPR1),EXPR_).
    s_expr6([t(div,_) | LL1],LLO,EXPR,EXPR_):-!,
        s_expr3(LL1,LL2,EXPR1),
        s_expr6(LL2,LLO,divide(EXPR,EXPR1),EXPR_).
    s_expr6(LL,LL,EXPR,EXPR).

    s_expr7([t(power,_) | LL1],LLO,EXPR,power(EXPR,EXPR1)):-!,
        s_expr3(LL1,LLO,EXPR1).
    s_expr7(LL,LL,EXPR,EXPR).

    s_program(LL1,LLO,[STMNT | PROGRAM]):-
        s_stmnt(LL1,LL2,STMNT),
        s_program1(LL2,LLO,PROGRAM).
    s_program1(LL1,LLO,[STMNT | PROGRAM]):-
        s_stmnt(LL1,LL2,STMNT),!,
        s_program1(LL2,LLO,PROGRAM).
    s_program1(LL,LL,[]).

    s_output_terms(LL1,LLO,[OUTPUT_TERM | OUTPUT_TERMS]):-
        s_output_term(LL1,LL2,OUTPUT_TERM),
        s_output_terms1(LL2,LLO,OUTPUT_TERMS).
    s_output_terms1([t(comma,_) | LL1],LL2,OUTPUT_TERMS):-!,
        s_output_terms(LL1,LL2,OUTPUT_TERMS).
    s_output_terms1(LL,LL,[]).
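    /* An illustrative hand-worked check of the expression clauses above
       (a sketch added for exposition; not part of the generated
       XCALCULA.PAR file): given the token list for 2 + 3 * 4,
           [t(number(2),0), t(plus,1), t(number(3),2),
            t(mult,3), t(number(4),4)]
       s_expr/3 yields
           EXPR = add(number(2), multiply(number(3), number(4)))
       because s_expr5 (plus/minus) parses each operand with s_expr2,
       which runs s_expr6 (mult/div) first, so multiplication is bound
       before the surrounding addition; power, handled by s_expr7,
       binds tighter still, and to the right. */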
    s_vars(LL1,LLO,[VAR | VARS]):-
        s_var(LL1,LL2,VAR),
        s_vars1(LL2,LLO,VARS).
    s_vars1([t(comma,_) | LL1],LL2,VARS):-!,
        s_vars(LL1,LL2,VARS).
    s_vars1(LL,LL,[]).

The details of the system supporting the generated Calcula parser are contained in XCALCULA.PRO and subordinate files. Highlights of this system include:

- Involvement of the PDC Prolog editor, utilizing the MSG and POS parameters to identify errors.
- The production, after each successful execution of the Calcula source, of XTEMP.PRO, a PDC Prolog source file which, when compiled under the PDC Prolog compiler, provides emulation of the Calcula source. Upon exiting the Calcula system, the user is given the opportunity to compile XTEMP.PRO, thereby creating a stand-alone, executable version of a Calcula program.
- Reuse of the "deeper error" strategy (described previously in the XCALC discussion).
- Synonyms for several of the Calcula grammar's terminals (for example, you may use output, write, show or display as the initial terminal in an output statement).

Note: If you are interested in compiling XTEMP.PRO (the PDC Prolog code produced for each successfully executing Calcula source program) to an .EXE file, the following guidelines should be observed:

- To obtain sufficient heap to run the Prolog compiler, compile XCALCULA.PRO to .EXE, exit the Prolog environment, and then run XCALCULA outside of the integrated development environment.
- Increase the PDC Prolog compiler's stack size (via the S option on the command line) for long Calcula programs.

Bootstrapping the Parser Generator

The parser generator was itself constructed by "bootstrapping," and you can repeat the work yourself. In essence, this simply refers to the process of using the parser generator to specify and generate a parser for itself, that is, for the parser generator specification language. All you need to do is treat the grammar given in Figure 8.1 as the grammar of a source language for which you want to generate a parser.
Hence, we need to augment this grammar, as we did with that for simple expressions in Calcula. The result is shown in Figure 8.3 below. It assumes the following tokenized forms for terminal symbols in the parser generator's grammar:

Terminal Symbol     Tokenized Form
->                  arrow
=                   equal
*                   star
+                   plus
userdefined         userdefined_
predicates          predicates_
domains             domains_
-                   priorsep
,                   comma
separator           separator_
:                   colon
(                   lpar
)                   rpar

Figure 2: Tokenized Terminal Symbols

productions
    PARSER = SECTION*

    SECTION = userdefined_ predicates_ PRODNAMES -> userpreds_(PRODNAMES),
              userdefined_ domains_ PRODNAMES    -> userdoms_(PRODNAMES),
              productions_ PRODUCTIONS           -> productions_(PRODUCTIONS)

    PRODUCTIONS = PRODUCTION+

    PRODUCTION = PRODNAME equal PRODBODY -> p(PRODNAME,PRODBODY)

    PRODBODY = upper(STRING):CURSOR STAR_PLUS SEPARATOR
                              -> list(PRODNAME,CURSOR,STAR_PLUS,SEPARATOR),
               PRODGROUPS     -> groups(PRODGROUPS)

    STAR_PLUS = star -> star,
                plus -> plus

    SEPARATOR = separator_ id(STRING) -> sep(STRING),
                                      -> none

    PRODGROUPS = PRIORGROUP+ separator priorsep

    PRIORGROUP = SINGPROD+ separator comma

    SINGPROD = ASSOC GRAMTOKL arrow:CURSOR TERM
                              -> prod(ASSOC,GRAMTOKL,CURSOR,TERM)

    ASSOC = rightassoc_ -> right,
                        -> left

    GRAMTOKL = GRAMTOK*

    GRAMTOK = upper(STRING):CURSOR -> prodname(PRODNAME,CURSOR),
              TOKK CURSORDEMAND    -> tok(TOKK,CURSORDEMAND)

    CURSORDEMAND = colon PRODNAME -> curdemand(PRODNAME),
                                  -> none

    TOKK = id(STRING):CURSOR lpar PRODNAMES rpar -> cmp(STRING,CURSOR,PRODNAMES),
           id(STRING):CURSOR                     -> name(STRING,CURSOR)

    TERM = upper(STRING):CURSOR                  -> dom(PRODNAME,CURSOR),
           id(STRING):CURSOR lpar PRODNAMES rpar -> term(STRING,CURSOR,PRODNAMES),
           id(STRING):CURSOR                     -> name(STRING,CURSOR)

    PRODNAME = upper(STRING) -> PRODNAME

    PRODNAMES = PRODNAME+ separator comma

Figure 3: Grammar for bootstrapping the parser generator
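To see how this grammar consumes its own specification language, it helps to hand-parse a single production line. The following is an illustrative sketch only; the term shapes are read off the productions above, and CURSOR stands for whatever token position the generated parser would record:

```prolog
/* Hand-parsing the input line
       VARS = VAR+ separator comma
   with the bootstrapping grammar: PRODNAME matches the upper-case
   name VARS, equal matches "=", and PRODBODY takes its list form,
   so the whole line reduces to the term
       p("VARS", list("VAR", CURSOR, plus, sep("comma")))
   where plus records the "+" repetition marker and sep("comma")
   records the separator clause. */
```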