Lesson 11
CDT301 – Compiler Theory, Spring 2011
Teacher: Linus Källberg

Outline
• Syntax-directed specifications of language semantics
  – Syntax-directed definitions
  – Syntax-directed translation schemes
• Semantic analysis
  – Focus on type analysis

SYNTAX-DIRECTED SPECIFICATIONS OF LANGUAGE SEMANTICS

Overview of syntax-directed specifications of semantics
• Semantics can be expressed by:
  – Attaching attributes to grammar symbols
  – Specifying how to compute those attributes:
    • By adding semantic rules, we get a syntax-directed definition (SDD)
    • By adding semantic actions, we get a syntax-directed translation scheme (SDT)

Examples of attributes: values of evaluated subtrees
Grammar:
  E → E + T
  E → T
  T → num
String: 3+4
Parse tree (children indented under their parent):
  E  (E.val = 7)
    E  (E.val = 3)
      T  (T.val = 3)
        num  (num.val = 3)
    +
    T
      num  (num.val = 4)

Examples of attributes: (line, column) in source file
Grammar:
  E → E + T
  E → T
  T → num
String: 3❏+❏4 (❏ denotes a space)
Parse tree (children indented under their parent):
  E  (E.coord = (1,1) to (1,6))
    E  (E.coord = (1,1) to (1,2))
      T  (T.coord = (1,1) to (1,2))
        num  (num.coord = (1,1) to (1,2))
    +  (+.coord = (1,3) to (1,4))
    T
      num  (num.coord = (1,5) to (1,6))

Examples of attributes: data types
Grammar:
  E → E + T
  E → T
  T → num
String: 3+4
Parse tree (children indented under their parent):
  E  (E.type = int)
    E  (E.type = int)
      T  (T.type = int)
        num  (num.type = int)
    +
    T
      num  (num.type = int)

SDDs vs. SDTs
SDDs
• Do not specify any evaluation order for the semantic rules
• The rules appear at the end of the production bodies
• The semantic rules may only have controlled side effects
• Mostly useful for specification
SDTs
• Explicitly specify an evaluation order for the semantic actions
• The actions may appear anywhere in the production bodies
• The semantic actions may be arbitrary code fragments
• Mostly useful for implementation

Syntax-directed definitions
• SDD of values of evaluated subtrees:
  Production      Rules
  E → E1 + T      E.val = E1.val + T.val
  E → T           E.val = T.val
  T → num         T.val = num.val
• Each node has an attribute val holding the value of its evaluated subtree

Types of attributes
• An attribute at a parse tree node N may be of two kinds:
  – Synthesized
    • Computed in terms of attributes at the children of N and/or other attributes at N
  – Inherited
    • Assigned to N by a semantic rule at the parent of N

Annotated parse tree for “3 + 4”
  E.val = 7
    E.val = 3
      T.val = 3
        num.val = 3
    +
    T.val = 4
      num.val = 4

Inherited attributes
• After eliminating left recursion from the previous grammar:
  Production      Rules
  E → T E'        E'.inh = T.val
                  E.val = E'.syn
  E' → + T E'1    E'1.inh = E'.inh + T.val
                  E'.syn = E'1.syn
  E' → ε          E'.syn = E'.inh
  T → num         T.val = num.val

Annotated parse tree for “3 + 4”
  E.val = 7
    T.val = 3
      num.val = 3
    E'  (E'.inh = 3, E'.syn = 7)
      +
      T.val = 4
        num.val = 4
      E'  (E'.inh = 7, E'.syn = 7)
        ε

Classes of SDDs
• S-attributed SDDs:
  – Only synthesized attributes
• L-attributed SDDs:
  – Inherited attributes depend only on attributes of symbols to the left in the production body and on inherited attributes of the head
  – Attributes at a node N can also depend on other attributes at N, if it does not introduce dependency cycles
• [Figure: the classes S and L]
• In the first lab, you used an L-attributed SDD
• In the third lab, you will use an S-attributed SDD

Example of non-L-attributed SDD
  Production      Semantic rules
  A → B C         A.syn = B.syn
                  B.inh = f(C.syn, A.syn)

Example SDD
• Grammar for mathematical functions:
  E → E + E
  E → E * E
  E → num | x | ( E )
• Goal: specify an SDD for the translation of an expression into its derivative (as a string), recalling the rules:
  x' = 1
  n' = 0
  (f * g)' = f' * g + f * g'
  (f + g)' = f' + g'

Example SDD
  Production      Rules
  E → E1 + E2     E.expr = E1.expr || "+" || E2.expr
                  E.der = "(" || E1.der || "+" || E2.der || ")"
  E → E1 * E2     E.expr = E1.expr || "*" || E2.expr
                  E.der = "(" || E1.der || "*" || E2.expr || "+" || E1.expr || "*" || E2.der || ")"
  E → x           E.expr = "x"
                  E.der = "1"
  E → num         E.expr = num.val
                  E.der = "0"
  E → ( E1 )      E.expr = "(" || E1.expr || ")"
                  E.der = E1.der
Where || is the string concatenation operator

Exercise (1)
Draw the parse tree for the string 2*(x + x). Then decorate it using the SDD on the previous slide.

Exercise (2)
Complete the following SDD of values of evaluated subtrees.
  Production      Rules
  E → T E'        E'.inh = T.val
                  E.val = E'.syn
  E' → + T E'1    E'1.inh = E'.inh + T.val
                  E'.syn = E'1.syn
  E' → ε          E'.syn = E'.inh
  T → F T'        ?
  T' → * F T'1    ?
  T' → ε          ?
  F → num         F.val = num.val
  F → ( E )       ?

SEMANTIC ANALYSIS

Semantic analysis
• Why?
  – Not all errors are lexical or syntactical
  – Needed to generate correct code
• When?
  – Can be done during parsing (semantic actions)
  – Easier in separate passes on some intermediate program representation

Semantic analysis – name analysis
• Examples of name analyses from trac42:
  – Is a referenced variable declared?
  – Is a variable uniquely declared in the scope?
  – Is a called function declared/defined?

  void my_func(void)
  {
      int x = y + 23 * my_fnuc();
      char x = 'x';
  }

Semantic analysis – type analysis
• Examples of type analyses from trac42:
  – Is an operator applied to operands of the right types?
  – Is a function called with the right number of arguments?
  – Are the function arguments of the right types?

  int my_func(int arg)
  {
      int x = arg * "bla bla bla";
      int y = my_func(23, 78);
      return my_func(37.8);
  }

More examples of semantic analyses
• Is a “break” statement enclosed in a loop or a switch? (C, C++, C#, Java…)
• From Ada: The beginning and end of blocks should be tagged with the same name

Static vs. dynamic checks
• Static checks are done during compilation
  – Static type checks require type specifications by the programmer or type inference by the compiler
• Dynamic checks are done at runtime
  – Dynamic type checks require type information to be carried with data objects. Examples:
    • A "type" member in structs
    • Vpointers and vtables in OO languages
• Trac42 needs only static checks

Type analysis
• Type information is gathered from:
  – Declarations of variables and functions
      char str[256];
      int some_func(float arg);
  – Format of constants
      x = 15.7f;
      c = 'a';
      z = 98ul;
• A type system specifies how to assign type attributes to program parts
  – More on this in the next lecture

What is a type?
• Specification of:
  – The size needed to store a data object
  – How to interpret the stored data
• Examples:
  – Unsigned short: needs 16 bits and is interpreted as a number from 0 to 65535
  – Signed short: needs 16 bits and is interpreted as a number from -32768 to 32767

Using types for code generation
• Unsigned operands:
    unsigned x;
    unsigned y = 24;
    unsigned z = 6;
    x = y / z;
  Generated code: … divl -8(%ebp) …
• Signed operands:
    int x;
    int y = 24;
    int z = 6;
    x = y / z;
  Generated code: … idivl -8(%ebp) …

Type conversions
• Change the type of a data object
• Explicit:
    int a = 12;
    float b = (float) a;
    void* p_v = (void*) &a;
    int* p_i = (int*) p_v;
• Implicit (coercions), inferred by the compiler:
    char a = 12;
    int b = a;
    float x = 17;
• Trac42 does not have type conversions

Representing types
• Type expressions:
  – Basic types, e.g., int, char, float
    • void – No value
    • error – Erroneous type
  – Type names
  – Type constructors. Examples:
    • int* p;                     pointer(int)
    • int a[27];                  array(27, int)
    • int f(char a, float b);     char × float → int

Representing types – type constructors
• pointer(T) – Pointer to an object of type T
• array(I, T) – Array of I elements of type T
• T1 × T2 – Product of types T1 and T2
• Records. Similar to products, but include the member names.
  – Example: struct A { int a; char b; };
    record((a × int) × (b × char))
• T1 → T2 – Function taking the type T1 as argument and returning T2
  – T1 is often a product type

Tree representation of type expressions
• Example:
  – int* my_func(char a, char b);
  – Type of my_func: char × char → pointer(int)
  – As a tree (children indented under their parent):
      →
        ×
          char
          char
        pointer
          int

Exercise (3)
Write the type expression and draw it as a tree for the function
  char** your_func(int a, char* b, float c);

Conclusion
• SDDs and SDTs are two similar ways to attach semantics to a grammar
• Attributes on grammar symbols can be synthesized or inherited
• Data types are needed to guard against errors and to generate correct code
• Types can be represented as type expressions

Next time
• More type analysis
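
Supplementary sketch for the slide "Inherited attributes": one common way to evaluate that L-attributed SDD is a recursive-descent parser in which an inherited attribute becomes a function parameter and a synthesized attribute becomes a return value. The code below is a minimal illustration under that assumption, not the course's lab code; the names tokenize, Parser, and evaluate are hypothetical.

```python
import re


def tokenize(text):
    """Hypothetical lexer: keeps integer literals and '+', ignores everything else."""
    return re.findall(r"\d+|\+", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def parse_E(self):
        # E -> T E'        E'.inh = T.val ; E.val = E'.syn
        t_val = self.parse_T()
        return self.parse_E_prime(t_val)

    def parse_E_prime(self, inh):
        # The parameter 'inh' plays the role of E'.inh,
        # the return value plays the role of E'.syn.
        if self.peek() == "+":
            # E' -> + T E'1    E'1.inh = E'.inh + T.val ; E'.syn = E'1.syn
            self.pos += 1
            t_val = self.parse_T()
            return self.parse_E_prime(inh + t_val)
        # E' -> ε          E'.syn = E'.inh
        return inh

    def parse_T(self):
        # T -> num         T.val = num.val
        tok = self.peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError("expected a number")
        self.pos += 1
        return int(tok)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.parse_E()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return value


print(evaluate("3 + 4"))
```

For the string 3 + 4, the calls mirror the annotated parse tree: parse_E_prime is first entered with inh = 3 and then, after the +, with inh = 7, which the ε production passes back up as the synthesized value.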
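
Supplementary sketch for the slide "Example SDD" (derivatives): since that SDD uses only synthesized attributes, both expr and der can be computed in a single bottom-up walk over the parse tree. The tuple encoding of tree nodes and the function name attributes are assumptions made for this illustration; the input is a different expression from the one in Exercise (1).

```python
def attributes(node):
    """Return the pair (expr, der) for a parse tree node.

    Nodes are hypothetical tuples:
      ("num", "3"), ("x",), ("paren", e), ("+", e1, e2), ("*", e1, e2)
    """
    kind = node[0]
    if kind == "num":
        # E -> num      E.expr = num.val ; E.der = "0"
        return node[1], "0"
    if kind == "x":
        # E -> x        E.expr = "x" ; E.der = "1"
        return "x", "1"
    if kind == "paren":
        # E -> ( E1 )   E.expr = "(" || E1.expr || ")" ; E.der = E1.der
        expr1, der1 = attributes(node[1])
        return "(" + expr1 + ")", der1
    expr1, der1 = attributes(node[1])
    expr2, der2 = attributes(node[2])
    if kind == "+":
        # E -> E1 + E2
        return expr1 + "+" + expr2, "(" + der1 + "+" + der2 + ")"
    if kind == "*":
        # E -> E1 * E2
        return (expr1 + "*" + expr2,
                "(" + der1 + "*" + expr2 + "+" + expr1 + "*" + der2 + ")")
    raise ValueError("unknown node kind: " + kind)


# Parse tree for x*x+3, with * binding tighter than +.
tree = ("+", ("*", ("x",), ("x",)), ("num", "3"))
expr, der = attributes(tree)
print(expr)
print(der)
```

The resulting derivative string is not simplified; the SDD only specifies a translation, so terms such as 1*x and +0 remain in the output.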
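
Supplementary sketch for the slides on type analysis: a type attribute can be synthesized bottom-up in the same way as val, with the special type error marking subtrees that fail a check. The node encoding, the type name "string", and the rule that + and * accept only int operands are assumptions made for this illustration; they are not taken from the trac42 definition.

```python
def type_of(node, env):
    """Synthesize the type attribute of an expression node.

    env maps declared variable names to their types (hypothetical symbol table).
    """
    kind = node[0]
    if kind == "int_const":
        return "int"
    if kind == "string_const":
        return "string"
    if kind == "var":
        # Name analysis: an undeclared variable gets the type "error".
        return env.get(node[1], "error")
    if kind in ("+", "*"):
        left = type_of(node[1], env)
        right = type_of(node[2], env)
        # Assumed rule: arithmetic operators require two int operands.
        return "int" if left == "int" and right == "int" else "error"
    raise ValueError("unknown node kind: " + kind)


env = {"arg": "int"}
ok = ("*", ("var", "arg"), ("int_const", 23))
bad = ("*", ("var", "arg"), ("string_const", "bla bla bla"))
undeclared = ("+", ("var", "y"), ("int_const", 23))

print(type_of(ok, env))
print(type_of(bad, env))
print(type_of(undeclared, env))
```

The second and third expressions correspond to the kinds of mistakes shown in the my_func examples: an operator applied to an operand of the wrong type, and a reference to an undeclared variable.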
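
Supplementary sketch for the slides "Representing types" and "Tree representation of type expressions": a type expression can be stored as a small tree with one node class per constructor. The class names and the show function below are hypothetical and chosen for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str  # e.g. "int", "char", "float", "void", "error"


@dataclass(frozen=True)
class Pointer:
    target: object  # pointer(T)


@dataclass(frozen=True)
class Array:
    size: int  # array(I, T)
    element: object


@dataclass(frozen=True)
class Product:
    left: object  # T1 × T2
    right: object


@dataclass(frozen=True)
class Function:
    params: object  # T1 → T2
    result: object


def show(t):
    """Render a type expression tree in the notation used on the slides."""
    if isinstance(t, Basic):
        return t.name
    if isinstance(t, Pointer):
        return f"pointer({show(t.target)})"
    if isinstance(t, Array):
        return f"array({t.size}, {show(t.element)})"
    if isinstance(t, Product):
        return f"{show(t.left)} × {show(t.right)}"
    if isinstance(t, Function):
        return f"{show(t.params)} → {show(t.result)}"
    raise TypeError(f"not a type expression: {t!r}")


INT = Basic("int")
CHAR = Basic("char")

# int* my_func(char a, char b);
my_func_type = Function(Product(CHAR, CHAR), Pointer(INT))
print(show(my_func_type))
```

Because the dataclasses compare field by field, two independently built trees such as Pointer(INT) and Pointer(Basic("int")) compare equal, which gives a simple structural test for whether two type expressions are the same.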