CSE305: Attribute Grammars
Lukasz Ziarek
lziarek@buffalo.edu

Announcements
• Update to HW1 directions
  • Clarification on what to turn in
  • Summarizes Piazza discussions / topics
  • Highlighted to make it easy to spot
• Reading
  • Chapter 4 and Attribute Grammar Reference Section 3.1 (13 pages) (find it under general resources)
• Slight change to TA office hours this week
  • Manjusha: Wed (Feb. 10) 1-2:30 pm in Davis 302

Challenge problem revisited (again)
• Idea: chain functions together
• Problems that remain
  • Deletion
  • Update
• Let's start by re-examining add

Putting it all together
    val map = new()
    val map' = add("hi", 1, map)
    val map'' = add("hello", 2, map')
• map is a function that always returns "not found"
• map' is a function that checks if the argument given is equal to "hi"; if yes it returns 1, if not it calls map

Putting it all together
• Given the bindings above, map'' behaves like:
    fun map''(x) = if x = "hello" then 2 else map'(x)
• Unfolding map' as well:
    fun map''(x) = if x = "hello" then 2
                   else if x = "hi" then 1
                   else map(x)
• And unfolding map (the empty map):
    fun map''(x) = if x = "hello" then 2
                   else if x = "hi" then 1
                   else raise or print "not found"

Update "hi" to map to 3
• What happens if we add another mapping for "hi"?
    val map = new()
    val map' = add("hi", 1, map)
    val map'' = add("hello", 2, map')
    val map''' = add("hi", 3, map'')

Putting it all together – lookup("hi")
    fun map'''(x) = if x = "hi" then 3
                    else if x = "hello" then 2
                    else if x = "hi" then 1
                    else raise or print "not found"
• The comparisons are ordered: the first "hi" branch returns 3, so the older binding (the second "hi" branch) is never reached

Delete
• Delete works on the same principle as update
• Instead of adding a new binding, raise an exception that the item is not found
• Since the functions we synthesize have no handlers, the exception propagates out of the map

Delete
• What happens if we "add" another mapping for "hi" that just raises "not found"?
    val map = new()
    val map' = add("hi", 1, map)
    val map'' = add("hello", 2, map')
    val map''' = delete("hi", map'')

Putting it all together – delete("hi")
    fun map'''(x) = if x = "hi" then raise or print "not found"
                    else if x = "hello" then 2
                    else if x = "hi" then 1
                    else raise or print "not found"
• Again the comparisons are ordered: the first "hi" branch raises, so the old binding for "hi" is never reached

Complexity
• Creation of a new map: O(1) (function creation)
• Adding an element: O(1) (function composition)
• Updating an element: O(1) (defined in terms of add)
• Deleting an element: O(1) (function composition)
• Lookup: O(n)
  • But what is n?
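Before moving on to attribute grammars, here is a minimal SML sketch tying the whole challenge-problem construction together. It assumes string keys, int values, and an exception (called NotFound here) standing in for "raise or print 'not found'"; those representation choices and the binding names m0..m4 are assumptions made for the sketch, not part of the slides.

  (* A map is just a function from keys to values; a missing key is
     signaled by raising NotFound. *)
  exception NotFound

  (* new: the empty map rejects every key *)
  fun new () : string -> int = fn _ => raise NotFound

  (* add: the new map checks k first, otherwise defers to the old map m *)
  fun add (k : string, v : int, m) = fn x => if x = k then v else m x

  (* delete: shadow k with a binding that always fails, so any older
     binding for k can never be reached *)
  fun delete (k : string, m : string -> int) =
    fn x => if x = k then raise NotFound else m x

  (* lookup is just function application *)
  fun lookup (k, m) = m k

  (* the running example (map, map', map'', map''' on the slides) *)
  val m0 = new ()
  val m1 = add ("hi", 1, m0)
  val m2 = add ("hello", 2, m1)
  val m3 = add ("hi", 3, m2)      (* update: shadows the old "hi" binding *)
  val m4 = delete ("hi", m3)      (* lookup ("hi", m4) now raises NotFound *)
  val two = lookup ("hello", m4)  (* evaluates to 2 *)

Each add or delete wraps one more comparison around the existing chain, so a lookup walks at most one comparison per operation performed since new(); that chain length is presumably the n the complexity question is asking about.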
Attributes
• Synthesized attributes
  • Pass information up the parse tree
• Inherited attributes
  • Pass information down the parse tree or from left siblings to right siblings
• Attribute values assumed to be available from the context
• Attribute values computed using the semantic rules provided
• The constraints on the attribute evaluation rules permit top-down left-to-right (one-pass) traversal of the parse tree to compute the meaning

Attributes (continued)
How are attribute values computed?
• If all attributes were inherited, the tree could be decorated in top-down order
• If all attributes were synthesized, the tree could be decorated in bottom-up order
• In many cases, both kinds of attributes are used, and it is some combination of top-down and bottom-up that must be used

Information Flow
[diagram omitted: inherited attribute values are available from the surrounding context; synthesized attribute values are computed from below]

Inherited Attributes: Declaration and Use
    { int i, j, k;  i := i + j + j; }
• Grammar:
    <stmts> -> <stmts> ; <stmt>
    <stmt> -> <assign-stm> | <decl>
    <assign-stm> -> <var> := <expr>
• Attribute rules:
    <var>.env := <assign-stm>.env
    <expr>.env := <assign-stm>.env

Inherited and Synthesized Attributes
• Coercion code generation
  • 5.0 + 2 requires coerce_int_to_real
• Determination of un-initialized variables
• Determination of reachable non-terminals

Static Semantics (coercion insertion)
    E -> n | m       E.type := int
    E -> x | y       E.type := real
    E -> E1 + E2     if E1.type = E2.type then E.type := E1.type else E.type := real
    E -> E1 * E2     if E1.type = E2.type then E.type := E1.type else E.type := real

Static Semantics (type checking if)
    E -> n | m                   E.type := int
    E -> p | q                   E.type := bool
    E -> if E0 then E1 else E2   if (E0.type = bool) and (E1.type = E2.type)
                                 then E.type := E1.type
                                 else type error

An Extended Example
Distinct identifiers in a straight-line program.
• BNF:
    <exp> ::= <var> | <exp> + <exp>
    <stm> ::= <var> := <exp> | <stm> ; <stm>
• Attributes:
    <var>: id
    <exp>: ids
    <stm>: ids, num
• Semantics specified in terms of sets (of identifiers)

Semantic rules:
    <exp> ::= <var>
      <exp>.ids = { <var>.id }
    <exp> ::= <exp1> + <exp2>
      <exp>.ids = <exp1>.ids U <exp2>.ids
    <stm> ::= <var> := <exp>
      <stm>.ids = { <var>.id } U <exp>.ids
      <stm>.num = | <stm>.ids |
    <stm> ::= <stm1> ; <stm2>
      <stm>.ids = <stm1>.ids U <stm2>.ids
      <stm>.num = | <stm>.ids |

Alternate approach: use lists
• Attributes:
  • envi: list of vars in preceding context
  • envo: list of vars for following context
  • dnum: number of new variables
• Rule:
    <exp> ::= <var>
      <exp>.envo = if member(<var>.id, <exp>.envi)
                   then <exp>.envi
                   else add(<var>.id, <exp>.envi)

Attribute Computation Rules
    <exp> ::= <exp1> + <exp2>
      <exp1>.envi = <exp>.envi
      <exp2>.envi = <exp1>.envo
      <exp>.envo  = <exp2>.envo
      <exp>.dnum  = length(<exp>.envo)
(each of <exp>, <exp1>, <exp2> carries the attributes envi, envo, dnum)

Extending our examples
• What information did we compute?
  • Number of distinct variables
  • List (set) of variables for each expression
• Very first example computed "environments"
  • mapping between id and an expression

Putting it all together – static semantics
• BNF -- power to express structure
  • Semantics that are structural
    • Precedence
    • Associativity
• EBNF -- power to express computation over structure
  • Static semantics
  • Type checking
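To make the extended example concrete, here is a minimal SML sketch that evaluates the synthesized ids and num attributes by a bottom-up walk over an abstract syntax tree for the straight-line language. The datatype, the choice of duplicate-free string lists to represent sets, and the helper names are assumptions made for the sketch, not notation from the lecture.

  (* AST for the straight-line language:
       <exp> ::= <var> | <exp> + <exp>
       <stm> ::= <var> := <exp> | <stm> ; <stm>        *)
  datatype exp = Var of string
               | Plus of exp * exp
  datatype stm = Assign of string * exp
               | Seq of stm * stm

  (* sets of identifiers as duplicate-free string lists (an assumption) *)
  fun union (xs, ys) =
    xs @ List.filter (fn y => not (List.exists (fn x => x = y) xs)) ys

  (* synthesized attribute <exp>.ids *)
  fun expIds (Var v)         = [v]
    | expIds (Plus (e1, e2)) = union (expIds e1, expIds e2)

  (* synthesized attributes <stm>.ids and <stm>.num *)
  fun stmIds (Assign (v, e)) = union ([v], expIds e)
    | stmIds (Seq (s1, s2))  = union (stmIds s1, stmIds s2)
  fun stmNum s = length (stmIds s)

  (* i := i + j ; k := i  has 3 distinct identifiers *)
  val example = Seq (Assign ("i", Plus (Var "i", Var "j")),
                     Assign ("k", Var "i"))
  val n = stmNum example      (* evaluates to 3 *)

Because ids and num are purely synthesized, a single bottom-up traversal suffices, matching the earlier observation that an all-synthesized grammar can be decorated in bottom-up order.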
Dynamic Semantics
• No single widely acceptable notation or formalism for describing semantics
• The general approach to defining the semantics of any language L is to specify a general mechanism to translate any sentence in L into a set of sentences in another language or system that we take to be well defined
• Here are three approaches we'll briefly look at:
  • Operational semantics
  • Axiomatic semantics
  • Denotational semantics

Operational Semantics
• Idea: describe the meaning of a program in language L by specifying how statements affect the state of a machine (simulated or actual) when executed
• The change in the state of the machine (memory, registers, stack, heap, etc.) defines the meaning of the statement
• Similar in spirit to the notion of a Turing Machine and also used informally to explain higher-level constructs in terms of simpler ones

Axiomatic Semantics
• Based on formal logic (first order predicate calculus)
• Original purpose: formal program verification
• Approach: define axioms and inference rules in logic for each statement type in the language (to allow transformations of expressions to other expressions)
• The expressions are called assertions and are either:
  • Preconditions: an assertion before a statement states the relationships and constraints among variables that are true at that point in execution
  • Postconditions: an assertion following a statement
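As a small illustration (not from the slides), the standard axiomatic rule for assignment computes a precondition from a desired postcondition by substitution:

    { Q with E substituted for x }   x := E   { Q }

For example, to guarantee the postcondition { x = 4 } after the statement x := x + 1, substitute x + 1 for x in the postcondition: the required precondition is { x + 1 = 4 }, i.e. { x = 3 }.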
CSE305: Lexers and Parsers
Lukasz Ziarek
lziarek@buffalo.edu

Expression Compilation Example
[diagram omitted: lexical analyzer → tokenized expression → parser; note the implicit type conversion (why?)]

[diagram omitted: nested sets of programs, from S* down through regular (lexer), context-free (parser), and context-sensitive (type-checker) to "correct" programs (no run-time errors)]

Introduction
• Language implementation systems must analyze source code, regardless of the specific implementation approach
• Nearly all syntax analysis is based on a formal description of the syntax of the source language (BNF)

Syntax Analysis
• The syntax analysis portion of a language processor nearly always consists of two parts:
  • A low-level part called a lexical analyzer (mathematically, a finite automaton based on a regular grammar)
  • A high-level part called a syntax analyzer, or parser (mathematically, a push-down automaton based on a context-free grammar, or BNF)

Advantages of Using BNF to Describe Syntax
• Provides a clear and concise syntax description
• The parser can be based directly on the BNF
• Parsers based on BNF are easy to maintain

Reasons to Separate Lexical and Syntax Analysis
• Simplicity: less complex approaches can be used for lexical analysis; separating them simplifies the parser
• Efficiency: separation allows optimization of the lexical analyzer
• Portability: parts of the lexical analyzer may not be portable, but the parser always is portable

Lexical Analysis
• A lexical analyzer is a pattern matcher for character strings
• A lexical analyzer is a "front-end" for the parser
• Identifies substrings of the source program that belong together: lexemes
• Lexemes match a character pattern, which is associated with a lexical category called a token
  • sum is a lexeme; its token may be IDENT

Lexical Analysis (continued)
• The lexical analyzer is usually a function that is called by the parser when it needs the next token
• Three approaches to building a lexical analyzer:
  • Write a formal description of the tokens and use a software tool that constructs a table-driven lexical analyzer from such a description
  • Design a state diagram that describes the tokens and write a program that implements the state diagram
  • Design a state diagram that describes the tokens and hand-construct a table-driven implementation of the state diagram

State Diagram Design
• A naïve state diagram would have a transition from every state on every character in the source language - such a diagram would be very large!

Lexical Analysis (continued)
• In many cases, transitions can be combined to simplify the state diagram
  • When recognizing an identifier, all uppercase and lowercase letters are equivalent
    • Use a character class that includes all letters
  • When recognizing an integer literal, all digits are equivalent - use a digit class

Lexical Analysis (continued)
• Reserved words and identifiers can be recognized together (rather than having a part of the diagram for each reserved word)
  • Use a table lookup to determine whether a possible identifier is in fact a reserved word

Lexical Analysis (continued)
• Convenient utility subprograms:
  • getChar - gets the next character of input, puts it in nextChar, determines its class and puts the class in charClass
  • addChar - puts the character from nextChar into the place the lexeme is being accumulated, lexeme
  • lookup - determines whether the string in lexeme is a reserved word (returns a code)

State Diagram
[state diagram figure omitted]
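A minimal SML sketch of the hand-coded, state-diagram style of lexer described above: character classes collapse the transitions, and identifiers and reserved words are recognized by the same code and then separated by a table lookup. The token names, the tiny reserved-word table, and the use of a char list in place of getChar/addChar buffering are assumptions made for the sketch, not the course's reference implementation; for brevity it scans the whole input at once rather than returning one token per call.

  (* token categories produced by the lexer (names are illustrative) *)
  datatype token = IDENT of string       (* identifiers *)
                 | RESERVED of string    (* reserved words *)
                 | INT_LIT of int        (* integer literals *)
                 | OP of char            (* single-character operators *)

  (* reserved-word table and lookup: identifiers and reserved words are
     recognized by the same states, then separated by a lookup *)
  val reserved = ["if", "then", "else", "while"]
  fun lookup s =
    if List.exists (fn r => r = s) reserved then RESERVED s else IDENT s

  (* split off the longest prefix of the input satisfying character class p;
     this plays the role of repeated getChar/addChar calls *)
  fun splitWhile p [] = ([], [])
    | splitWhile p (c :: cs) =
        if p c then let val (pre, rest) = splitWhile p cs in (c :: pre, rest) end
        else ([], c :: cs)

  (* scan the input (a char list) into a token list; the letter and digit
     character classes are what keep the state diagram small *)
  fun scan [] = []
    | scan (c :: cs) =
        if Char.isSpace c then scan cs
        else if Char.isAlpha c then
          let val (name, rest) =
                splitWhile (fn d => Char.isAlpha d orelse Char.isDigit d) (c :: cs)
          in lookup (String.implode name) :: scan rest end
        else if Char.isDigit c then
          let val (digits, rest) = splitWhile Char.isDigit (c :: cs)
          in INT_LIT (valOf (Int.fromString (String.implode digits))) :: scan rest end
        else OP c :: scan cs

  (* scan (String.explode "if count1 then 42 + x") evaluates to
     [RESERVED "if", IDENT "count1", RESERVED "then",
      INT_LIT 42, OP #"+", IDENT "x"] *)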