Lecture # 4
Chapter 1 (Left over Topics)
Chapter 3 (continue)

Left over Topics of Chapter 1
• What is the Analysis/Synthesis Model of Compilation?
• Symbol Table Management
• Error Detection and Reporting
• What is meant by grouping of compilation phases into Front End and Back End?
• What is meant by Single / Multiple Passes?
• What are the Compiler Construction Tools available?

The Analysis-Synthesis Model of Compilation
• There are two parts to compilation:
– Analysis determines the operations implied by the source program, which are recorded in a tree structure
– Synthesis takes the tree structure and translates the operations therein into the target program

Another way..
• Analysis: breaks the source program into constituent pieces and creates an intermediate representation
• Synthesis: constructs the target program from the intermediate representation
• The first three phases, namely Lexical Analysis, Syntax Analysis and Semantic Analysis, form the analysis part
• The last three phases (Intermediate Code Generation, Code Optimization and Code Generation) form the synthesis part

Symbol Table Management
• An essential function of a compiler is to record the identifiers used in the source program and to collect information about the various attributes of each identifier
• A symbol table is a data structure containing an entry for each identifier, with fields for the attributes of the identifier

Error Detection and Reporting
• Each phase of the compiler can encounter errors.
• After detecting an error, the compiler must deal with it so that compilation can proceed.
• The lexical analyzer detects errors where the characters do not form a token
• Errors where the token stream violates the syntax of the language are detected by syntax analysis
• If the compiler tries to add two variables, one of which is the name of a function and the other an array, then Semantic Analysis will report an error

Section 1.5: The Grouping of Phases
• Compiler phases are grouped into front and back ends:
– Front end: analysis (machine independent)
– Back end: synthesis (machine dependent)
• The front end focuses on understanding the source program, and the back end focuses on mapping programs to the target machine.

Compiler Passes
• Compiler passes:
– A collection of phases is done only once (single pass) or multiple times (multi pass)
• Single pass: usually requires everything to be defined before being used in the source program
• Multi pass: the compiler may have to keep the entire program representation in memory

Section 1.6: Compiler-Construction Tools
• Software development tools are available to implement one or more compiler phases
– Scanner generators (Lex and Flex)
– Parser generators (Yacc and Bison)
– Syntax-directed translation engines
– Automatic code generators
– Data-flow engines
For further details this webpage would be sufficient: http://dinosaur.compilertools.net/
COP5621 Fall 2009

ANTLR 3.x Project for Compiler Construction
• This project is built using Eclipse; the source code and all the class files are available in Java. This helps students create a compiler project on the fly.
• Its C# libraries are also available.
• I will try to hold a lab session to discuss it
• Its tutorials and videos are available at the following address: http://www.vimeo.com/groups/29150/videos

Recap of the last lecture
• Difference:
Skeletal Source Program → Preprocessor → Source Program → Compiler → Target Assembly Program → Assembler → Relocatable Object Code → Linker → Absolute Machine Code
(Libraries and Relocatable Object Files are additional inputs to the Linker)

Recap
We discussed:
• What are Regular Expressions? How to write them?
• RE → NFA (Thompson’s construction)
• NFA → DFA (Subset construction)

The Subset Construction Algorithm
Initially, ε-closure(s0) is the only state in Dstates and it is unmarked
while there is an unmarked state T in Dstates do
    mark T
    for each input symbol a do
        U := ε-closure(move(T,a))
        if U is not in Dstates then
            add U as an unmarked state to Dstates
        end if
        Dtran[T,a] := U
    end do
end do

Subset Construction Example
[Figure: NFA over states 0 to 8 with accepting actions a1, a2, a3, and the DFA over states A to F obtained from it]
Dstates
A = {0,1,3,7}
B = {2,4,7}
C = {8}
D = {7}
E = {5,8}
F = {6,8}

Today’s Lecture
• How can we minimize a DFA? (Hopcroft’s Algorithm)

Section 3.9: Minimization of DFA
• What do we want to achieve?

Hopcroft’s Algorithm Pg 142
• Input: A DFA M with set of states S, set of inputs Σ, a transition function defined for all states and inputs, start state s0 and set of accepting states F
• Output: A DFA M’ accepting the same language as M and having as few states as possible

Algorithm 3.6
• Method:
Step 1: Construct an initial partition P of the states with two groups: the accepting states (F) and the non-accepting states (S-F)
Step 2: Apply the following procedure (Construction of Pnew) to construct a new partition (Pnew)

Procedure for Pnew construction
• For each group G of P, partition G into subgroups such that two states s and t are in the same subgroup if and only if, for all input symbols a, states s and t have transitions on a to states in the same group of P
• Replace G in Pnew by the set of all subgroups formed

Algorithm 3.6 (continued)
• Step 3: If Pnew = P, proceed to step 4. Otherwise repeat step 2 with P = Pnew
• Step 4: Choose one state in each group of P as the representative of that group; the representatives are the states of M’
• Step 5: If M’ has a dead state or an unreachable state, remove those states (a dead state is a non-accepting state that has transitions to itself on all inputs; an unreachable state is any state not reachable from the start state)
• Step 6: Complete

Example # 1
• The DFA for (a|b)*abb

Example # 1 (Applying Minimization)
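
Steps 1 to 4 of the minimization method can be tried out on this example with a short Python sketch. The transition table below is an assumption: the slide's figure is not reproduced in the text, so the sketch uses the five-state DFA (states A to E, start A, accepting E) that subset construction commonly yields for (a|b)*abb. The names minimize and accepts are hypothetical helpers, the loop is a plain partition refinement rather than an optimized worklist version, and Step 5 (removing dead and unreachable states) is left out.

```python
def minimize(states, alphabet, delta, start, accepting):
    """Steps 1 to 4: refine a partition of the states, then pick representatives.

    Assumes a complete DFA: delta[(state, symbol)] exists for every pair.
    """
    # Step 1: two initial groups, accepting (F) and non-accepting (S - F).
    groups = (frozenset(accepting), frozenset(states - accepting))
    partition = [g for g in groups if g]

    while True:
        # Index of the group each state currently belongs to.
        group_of = {s: i for i, g in enumerate(partition) for s in g}
        new_partition = []
        for group in partition:
            # Step 2: two states stay together only if, for every input
            # symbol, they move into the same group of the current partition.
            buckets = {}
            for s in group:
                signature = tuple(group_of[delta[(s, a)]] for a in alphabet)
                buckets.setdefault(signature, set()).add(s)
            new_partition.extend(frozenset(b) for b in buckets.values())
        # Step 3: stop once refinement no longer splits any group.
        if set(new_partition) == set(partition):
            break
        partition = new_partition

    # Step 4: one representative per group (the start state represents its group).
    rep = {}
    for group in partition:
        chosen = start if start in group else min(group)
        for s in group:
            rep[s] = chosen
    m_states = set(rep.values())
    m_delta = {(r, a): rep[delta[(r, a)]] for r in m_states for a in alphabet}
    m_accepting = {rep[s] for s in accepting}
    return m_states, m_delta, rep[start], m_accepting


def accepts(delta, start, accepting, word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting


# Assumed DFA for (a|b)*abb (not taken from the slide's figure).
states = {"A", "B", "C", "D", "E"}
alphabet = ("a", "b")
delta = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

m_states, m_delta, m_start, m_accepting = minimize(
    states, alphabet, delta, start="A", accepting={"E"}
)
print(sorted(m_states))
```

With this table the refinement goes {E} and {A,B,C,D}, then {A,B,C} and {D} separate, then {A,C} and {B} separate, and the next round changes nothing. States A and C end up in one group, so M’ has four states instead of five.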