CS4XX - INTRODUCTION TO COMPILER THEORY
FINAL EXAM
Total Points: 100

1) Give a regular expression for each of the regular sets described below. (10 pts)

a) All strings of lower-case letters that either begin or end in an a. Some example strings in the language: a, accc, abax, abaxa. Note: You may make a regular definition for lower-case letters.

Solution: a<lower>* | <lower>*a
where <lower> denotes the lower-case letters from a to z.

b) All strings of a's and b's that contain no three consecutive b's. Some example strings in the language: abab, abbaaa, eps (the empty string), baabb.

Solution: (ε|b|bb)(abb|ab|a)*

2) Show that the following grammar is ambiguous. (10 pts)

A --> A x B | x
B --> x B | x

Solution: Consider the string xxxxx.

Parse Tree 1:

            A
          / | \
         A  x  B
       / | \   |
      A  x  B  x
      |     |
      x     x

Parse Tree 2:

         A
       / | \
      A  x  B
      |    / \
      x   x   B
             / \
            x   B
                |
                x

Since the above grammar produces two parse trees for the same string, it is ambiguous.

3) The following is a version of an expression grammar. Do the steps to make an LL(1) parse table (shown above for problem 1). If there are conflicts, show all the values that can go in a spot in the table. (10 pts)

E --> E + E | E - E | id

a. Transformed grammar without left recursion (if necessary).

E --> id Q
Q --> + E Q | - E Q | <eps>

b. Left-factored grammar (starting with the grammar from the first part) (if necessary).

No left factoring is necessary.

4) Briefly discuss the potential advantages and disadvantages of bottom-up versus top-down parser generators. (10 pts)

Solution:

Bottom-up
- Harder to debug.
+ More expressive: the grammar does not need to be modified (as much), which results in a more intuitive parse tree structure.

Top-down
+ Easier to debug and more intuitive.
+ Can be implemented manually.
- Less expressive: may require extensive changes to the grammar to make it parsable, which results in a less intuitive parse tree structure.

5) Give a regular expression and a CFG that accept the same infinite language of your choice.
(10 pts)

Solution: There are many possible answers, but a simple one is:

regular expression: a*
context-free grammar: S -> Sa | epsilon

A common mistake was writing a grammar that had no strings in the language at all because it lacked a base case where a non-terminal was replaced by only terminals (e.g., the grammar S -> Sa).

6) Give two implementations for deciding whether an NFA accepts an input string. (10 pts)

Solution: There are at least three possible implementations:

1. Convert the NFA to an equivalent DFA and run the DFA on the input.
2. Simulate the execution of the NFA, making an arbitrary choice at each non-deterministic step and backtracking to undo choices that do not lead to acceptance of the input string.
3. Simulate the execution of the NFA, keeping track at each step of the set of states the NFA could be in.

7) Consider the following regular expression over the alphabet {a,b}: b*a | bb (30 pts)

a) Use Thompson's construction to make an NFA from the regular expression (show it as a state diagram). NOTE: do not build an ad-hoc NFA: the point is to use Thompson's construction.

Solution: Thompson's construction. The edges of the state diagram are listed below as transitions (start state 1, accepting state 10):

1 --ε--> 2        1 --ε--> 7
2 --ε--> 3        2 --ε--> 5
3 --b--> 4
4 --ε--> 3        4 --ε--> 5
5 --a--> 6
6 --ε--> 10
7 --b--> 8
8 --b--> 9
9 --ε--> 10

b) Use subset construction to create a DFA equivalent to the NFA you gave for part a). Show your work. Show it as a state table, using the sets from the NFA as the names for the new states, as we did in examples in lecture.
Solution:

Start state: [1]
ε-closure[1] = [1 2 3 5 7]

mov([1 2 3 5 7], a) = [6]        ε-closure[6] = [6 10] (final state)
mov([1 2 3 5 7], b) = [4 8]      ε-closure[4 8] = [3 4 5 8]

mov([3 4 5 8], a) = [6]          ε-closure[6] = [6 10] (final state)
mov([3 4 5 8], b) = [4 9]        ε-closure[4 9] = [3 4 5 9 10] (final state, since it contains 10)

mov([3 4 5 9 10], a) = [6]       ε-closure[6] = [6 10] (final state)
mov([3 4 5 9 10], b) = [4]       ε-closure[4] = [3 4 5]

mov([3 4 5], a) = [6]            ε-closure[6] = [6 10] (final state)
mov([3 4 5], b) = [4]            ε-closure[4] = [3 4 5]

State   NFA states       a        b
L       [1 2 3 5 7]      [6 10]   [3 4 5 8]
M*      [6 10]           -        -
N       [3 4 5 8]        [6 10]   [3 4 5 9 10]
O*      [3 4 5 9 10]     [6 10]   [3 4 5]
P       [3 4 5]          [6 10]   [3 4 5]

* indicates a final state, i.e. a set that contains the NFA's accepting state 10. O must be final as well as M: the string bb is in the language and takes the DFA from L through N to O.

c) Also show it as a state diagram, using capital letters as the state names. Call the start state L, the next one M, etc. (to avoid confusion with the alphabet used in the DFA). Show the correspondence between the letters L, M, etc. and the sets of states used in the table. Do some hand runs of strings through the state diagram to verify for yourself that it recognizes the language described by the original regular expression (you don't have to write anything for this last bit). I.e., this is a way to check that your answer to this problem is correct.

STATE DIAGRAM (start state L; final states M and O), with edges listed as transitions. The correspondence between letters and NFA state sets is the one given in the table for part b).

L --a--> M        L --b--> N
N --a--> M        N --b--> O
O --a--> M        O --b--> P
P --a--> M        P --b--> P

8) Given the following grammar:

module ::= statement
statement ::= PRINT expression_list
expression_list ::= expression | expression COMMA expression_list
expression ::= INT | MINUS expression | expression PLUS expression

Draw the parse tree for the following program (10 pts)

Solution:
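Cross-check for problem 7(b): the hand computation of ε-closures and moves can be verified mechanically. The Python sketch below is not part of the exam. It encodes the NFA's transitions as they can be read off the closure and move steps in that solution, and the names NFA, eps_closure, move, subset_construction and accepts are illustrative choices. It assumes start state 1 and accepting state 10.

```python
from collections import deque

EPS = ""  # label used here for epsilon moves (an encoding choice, not from the exam)

# NFA for b*a | bb, read off the closure/move steps of problem 7(b).
# Assumed: start state 1, accepting state 10.
NFA = {
    (1, EPS): {2, 7},
    (2, EPS): {3, 5},
    (3, "b"): {4},
    (4, EPS): {3, 5},
    (5, "a"): {6},
    (6, EPS): {10},
    (7, "b"): {8},
    (8, "b"): {9},
    (9, EPS): {10},
}
START, ACCEPT = 1, 10


def eps_closure(states):
    """All NFA states reachable from `states` using only epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in NFA.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(states, symbol):
    """NFA states reachable from `states` by one transition on `symbol`."""
    return {t for s in states for t in NFA.get((s, symbol), ())}


def subset_construction(alphabet="ab"):
    """Return (start, transitions, accepting) of the equivalent DFA."""
    start = eps_closure({START})
    table, accepting = {}, set()
    queue, seen = deque([start]), {start}
    while queue:
        current = queue.popleft()
        if ACCEPT in current:
            accepting.add(current)  # a DFA state is final if it contains 10
        for sym in alphabet:
            target = eps_closure(move(current, sym))
            if not target:
                continue  # no transition: the "-" entries in the state table
            table[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, table, accepting


def accepts(text):
    """Run the constructed DFA on `text`."""
    start, table, accepting = subset_construction()
    state = start
    for ch in text:
        state = table.get((state, ch))
        if state is None:
            return False
    return state in accepting
```

With this encoding, calling accepts on strings such as a, bba and bb versus bbb or the empty string is a quick way to do the hand runs suggested in part c).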