TDDA69 Data and Program Structure
Parsing
Cyrille Berger

List of lectures
1 Introduction and Functional Programming
2 Imperative Programming and Data Structures
3 Parsing
4 Evaluation
5 Object Oriented Programming
6 Macros and ...
7 Virtual Machines and Bytecode
8 Garbage Collection and Native Code
9 Concurrent Computing
10 Declarative Programming
11 Logic
12 Summary

Lecture content
- Parsing a program
- Direct evaluation
- Grammars
- Context Free Grammar
- Abstract Syntax Tree

How is a program interpreted?
(Figure: pipeline with the stages Source code, Parser, Parse tree, Abstract Syntax Tree, Tree visitor, Generator, Source code, Bytecode, Virtual Machine, Assembler, Assembly, ..., Operating System, CPU.)

Parsing a program

Lexer and Parser
- A token is a word or an atomic element, e.g. if, 1, +, ...
- A lexer converts a sequence of characters into a sequence of tokens.
- A parser combines tokens into a data structure.

Lexer for a calculator
Tokens: '+', '-', '/', '*', '(', ')', '[0-9]+', '[0-9]+.[0-9]+'

    # A program is a two-element list: [source text, index of the last character read].
    # Example from the slide: my_program = ['102.0-3.1*', ...

    def next_char(program):
        # Advance the index and return the next character, or None at the end.
        if at_end(program):
            return None
        program[1] = program[1] + 1
        return program[0][program[1]]

    def at_end(program):
        return len(program[0]) <= program[1] + 1

    def next_token(program):
        c = next_char(program)
        # Skip whitespace.
        while c and c.isspace():
            c = next_char(program)
        if not c:
            return None
        if c in ['+', '*', '/', '-', '(', ')']:
            return c
        if not c.isdigit():
            return None
        # Accumulate the digits (and '.') of a number.
        num = ''
        while True:
            num += c
            c = next_char(program)
            if c is None:
                return float(num)
            if not c.isdigit() and c != '.':
                # Put back the character that does not belong to the number.
                program[1] = program[1] - 1
                return float(num)

Direct evaluation

Reverse Polish Notation Calculator (1/2)
- If the token is a number, push it on the stack.
- If the token is an operator, remove the top two elements of the stack and push the result.

Reverse Polish Notation Calculator (2/2)
Evaluate: '1 4 5 * 2 3 + / +'
Lexer: ['1', '4', '5', '*', '2', '3', '+', '/', '+']
Parser: 1 4 5 * 2 3 + / +
Current token: '<eof>'
Current stack: 5

Reverse Polish Notation Calculator (3/3)

    def pop2(stack):
        # Return (second from top, top), removing both from the stack.
        return (stack.pop(-2), stack.pop(-1))

    def rpn_calc(program):
        program = [program, -1]
        stack = []
        while True:
            t = next_token(program)
            if t is None:
                return stack[0]
            elif t == '-':
                (a, b) = pop2(stack)
                stack.append(a - b)
            elif t == '+':
                (a, b) = pop2(stack)
                stack.append(a + b)
            elif t == '*':
                (a, b) = pop2(stack)
                stack.append(a * b)
            elif t == '/':
                (a, b) = pop2(stack)
                stack.append(a / b)
            else:
                stack.append(t)

Calculator
Evaluate: 1+4*5/(2+3)

    def calc(program):
        program = [program, -1]
        return calc_rec(program)

    # Note: operators are applied right to left with no precedence,
    # so 2*3+1 is evaluated as 2*(3+1).
    def calc_rec(program):
        t = next_token(program)
        if t == ')':
            raise Exception(program, t)
        if t == '(':
            # The recursive call stops at the matching ')' and consumes it,
            # so no extra token must be skipped here.
            a = calc_rec(program)
        else:
            a = t
        t = next_token(program)
        if t is None or t == ')':
            return a
        elif t == '-':
            return a - calc_rec(program)
        elif t == '+':
            return a + calc_rec(program)
        elif t == '*':
            return a * calc_rec(program)
        elif t == '/':
            return a / calc_rec(program)
        else:
            raise Exception(program, t)

Let's introduce loops...
What if we want to evaluate this program:

    for(i = 0; i < 10; ++i)
    {
      print(i)
    }

Grammars

Recursive structure
Programming languages have recursive structures.
- STATEMENT can be:
    if EXPRESSION then STATEMENT else STATEMENT
    while EXPRESSION do STATEMENT end
    STATEMENT; STATEMENT; ...
    EXPRESSION
    ...
- EXPRESSION can be:
    EXPRESSION + EXPRESSION
    EXPRESSION * EXPRESSION
    EXPRESSION(EXPRESSION)
    ...

Formal grammar
A formal grammar is a natural notation for a recursive structure.
A formal grammar is defined by G = (V, Σ, R, S)
- Σ is a set of terminals: keywords, ...
- V is a set of non-terminals: STATEMENT, ...
- S ∈ V is the start symbol
- R is the set of production rules:
    (W1, ..., Wm) X (Z1, ..., Zp) → Y1, ..., Yn
    with X ∈ V and Yi, Wj, Zk ∈ Σ ∪ V ∪ {ε}

Example of formal grammar
For an alphabet made of two letters, Σ = {'a', 'b'}
Production rules:
    1. S → aSb
    2. S → ba
Example of use:
    S → aSb → aaSbb → ...

Chomsky hierarchy
- Type 0: unrestricted grammars, which generate all languages recognizable by a Turing machine
- Type 1: context-sensitive grammars, the rules have the form:
    αXβ → αγ*β where α, β ∈ Σ ∪ V ∪ {ε}, γ ∈ Σ ∪ V and X ∈ V
- Type 2: context-free grammars, the rules have the form:
    X → γ* where γ ∈ Σ ∪ V ∪ {ε}
- Type 3: regular grammars, the rules have the form:
    X → γ, X → Yγ, X → ε where X, Y ∈ V and γ ∈ Σ

Context Free Grammar
A context-free grammar is defined by G = (V, Σ, R, S)
- Σ is a set of terminals: keywords, literals...
- V is a set of non-terminals: STATEMENT, ...
- S ∈ V is the start symbol
- R is the set of production rules:
    X → Y1, ..., Yn
    with X ∈ V and Yi ∈ Σ ∪ V ∪ {ε}

Derivation
- A derivation is a sequence of productions:
    S → ... → ... → ... → ...
- A derivation can be drawn as a tree:
  - The start symbol is the tree's root.
  - For a production X → Y1 ... Yn, Y1 ... Yn are the children of X.
(Figure: node X with children Y1 ... Yn.)

Context Free Grammar for a Calculator
- Terminals Σ = {'+', '*', '/', '-', '(', ')', '0', ..., '9'}
- Non-terminals: EXPR, LITERAL
- Rules:
    EXPR → EXPR '+' EXPR
         | EXPR '-' EXPR
         | EXPR '/' EXPR
         | EXPR '*' EXPR
         | '(' EXPR ')'
         | LITERAL
    LITERAL → LITERAL '0'
    ...
    LITERAL → LITERAL '9'
- The start symbol is EXPR

Derivation of calculator
Grammar:
    E → E {'+' | '-' | '*' | '/'} E
      | '(' E ')'
      | [0-9]+
Derive: 1+4*5/(2+3)
(Figure: parse tree of 1+4*5/(2+3).)

Parse tree
- A parse tree has terminals at the leaves and non-terminals at the interior nodes.
- An in-order traversal of the leaves gives the original input.
- The parse tree shows the association of operations.

Parse tree uniqueness
Grammar:
    E → E {'+' | '-' | '*' | '/'} E
      | '(' E ')'
      | [0-9]+
Derive: 1+4*5/(2+3)
(Figure: parse trees for this derivation.)
Why is that not unique?

Ambiguity
This string has two parse trees: 2*3+1
(Figure: the two parse trees of 2*3+1.)

Precedence in Context Free Grammar
- Terminals Σ = {'+', '*', '/', '-', '(', ')', '0', ..., '9'}
- Non-terminals: EXPR, FACTOR, TERM, LITERAL
- Rules:
    EXPR → EXPR '+' EXPR
         | EXPR '-' EXPR
         | TERM
    TERM → TERM '/' TERM
         | TERM '*' TERM
         | FACTOR
    FACTOR → '(' EXPR ')'
           | LITERAL
    LITERAL → LITERAL '0'
    ...
    LITERAL → LITERAL '9'
(Figure: parse tree of 2*3+1 under this grammar.)

Ambiguity
This string has two parse trees: 2-3-1
(Figure: the two parse trees of 2-3-1.)

Associativity
- Terminals Σ = {'+', '*', '/', '-', '(', ')', '0', ..., '9'}
- Non-terminals: EXPR, FACTOR, TERM, LITERAL
- Rules:
    EXPR → EXPR '+' TERM
         | EXPR '-' TERM
         | TERM
    TERM → TERM '/' FACTOR
         | TERM '*' FACTOR
         | FACTOR
    FACTOR → '(' EXPR ')'
           | LITERAL
    LITERAL → LITERAL '0'
    ...
    LITERAL → LITERAL '9'
(Figure: parse tree of 2-3-1 under this grammar.)

Parser generator
- Should you write the parser by hand?
- From a Context-Free Grammar definition, generate a parser: Yacc, Bison, Antlr4...

Abstract Syntax Tree

What is an Abstract Syntax Tree?
- A parser traces the derivation of a sequence of tokens.
- We need a structural representation of a program, such as a parse tree.
- An Abstract Syntax Tree is a parse tree with less detail.

Abstract Syntax Tree vs Parse Tree (1/3)
(Figure: parse tree of 1+4*5/(2+3).)
- Do we really need all those nodes?
- Can we remove the 'E' nodes and the parentheses?

Abstract Syntax Tree vs Parse Tree (2/3)
(Figure: the parse tree of 1+4*5/(2+3) next to the corresponding abstract syntax tree.)

Abstract Syntax Tree vs Parse Tree (3/3)
- They both capture the nesting structure.
- They are ...
- ASTs are more compact and easier to use.

Lisp (1/2)
Representing a tree as an array:
- The first element is the value.
- The second element is a list of children.
    ['+' [1 ['*' [4 ['/' [5 ['+' [2 3]]]]]]]]
In Lisp:
    (+ 1 (* 4 (/ 5 (+ 2 3))))

Lisp (2/2)
The syntax of Lisp is:
    EXPR → '(' operator EXPR ... ')'
Operators:
- define, which defines a new function:
    (define (<name> <formal parameters>) <body>)
- cond, which returns the result paired with the first expression that is true:
    (cond (expr result) (expr result) ... (T default_result))

Conclusion
- Token, Lexer and Parser
- How to write a parser and evaluator for a calculator
- Formal Grammars
- Abstract Syntax Tree
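
As an illustration of how the grammar from the Associativity slide relates to the array representation of an Abstract Syntax Tree, here is a minimal sketch in Python. It is not the lecture's implementation. The names `tokenize`, `Parser`, `parse` and `evaluate` are hypothetical, the tokenizer uses a regular expression instead of the character-by-character lexer shown earlier, and the left-recursive rules (EXPR → EXPR '+' TERM, ...) are rewritten as loops, a common hand-written parsing technique that the slides do not cover.

```python
import re

# Hypothetical tokenizer: numbers become floats, operators stay strings.
TOKEN_RE = re.compile(r"\s*(\d+\.\d+|\d+|[-+*/()])")
OPERATORS = {'+', '-', '*', '/', '(', ')'}

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tok = m.group(1)
        tokens.append(tok if tok in OPERATORS else float(tok))
        pos = m.end()
    return tokens

class Parser:
    """Builds an AST of the form [operator, [left, right]]; leaves are floats."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # EXPR -> EXPR ('+' | '-') TERM | TERM, with the left recursion
        # written as a loop so that the result groups to the left.
        node = self.term()
        while self.peek() in ('+', '-'):
            op = self.advance()
            node = [op, [node, self.term()]]
        return node

    def term(self):
        # TERM -> TERM ('*' | '/') FACTOR | FACTOR
        node = self.factor()
        while self.peek() in ('*', '/'):
            op = self.advance()
            node = [op, [node, self.factor()]]
        return node

    def factor(self):
        # FACTOR -> '(' EXPR ')' | LITERAL
        tok = self.advance()
        if tok == '(':
            node = self.expr()
            if self.advance() != ')':
                raise SyntaxError("expected ')'")
            return node
        if isinstance(tok, float):
            return tok
        raise SyntaxError("unexpected token %r" % (tok,))

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return tree

def evaluate(node):
    # A small tree visitor: leaves are numbers, inner nodes are [op, [l, r]].
    if isinstance(node, float):
        return node
    op, (left, right) = node
    a, b = evaluate(left), evaluate(right)
    if op == '+':
        return a + b
    if op == '-':
        return a - b
    if op == '*':
        return a * b
    return a / b

print(parse("2-3-1"))                  # ['-', [['-', [2.0, 3.0]], 1.0]]
print(evaluate(parse("2*3+1")))        # 7.0
print(evaluate(parse("1+4*5/(2+3)")))  # 5.0
```

Because each grammar level gets its own method, '*' and '/' bind tighter than '+' and '-', and the loops make 2-3-1 group as (2-3)-1, which is the single parse tree the Associativity grammar allows.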