CSE 305 Introduction to Programming Languages
Lecture 11 – Implementation of Lexer/Parser and Scripting
CSE @ SUNY-Buffalo
Zhi Yang
Courtesy of Professor P.N.Hilfinger
Courtesy of Yao-Yuan Chuang
Courtesy of Dr. Sagiv

Notice Board
• First, homework4 is due on July 4, 2013 (Thursday).
• Second, homework5 will be posted.

Our objective
• The first objective of our class is to comprehend a new programming language within a very short time period. Because this ability shortens your learning curve, you will be able to work with the language with real insight.
• The second objective is to engineer your own language!

Review what we've learnt and look ahead
[Figure: timeline running from number systems (Egyptian number system, complement numbers, abacus) and gate systems (including the different underlying devices), through 1st-generation machine code, 2nd-generation assembly code (e.g., MIPS), and 3rd-generation languages (e.g., Fortran, macro functions). Course topics shown alongside: Regular Expression, Lexer, Parser, Push Down Automata, Context-Free Grammar, Type Checking, Lambda Calculus, Basic Calculation System, Compiler System, Virtual Machine, Theory.]

A family tree of languages
[Figure: family tree containing Cobol, Fortran, BASIC, Algol 60, LISP, PL/1, Simula, ML, Algol 68, C, Pascal, Perl, C++, Modula 3, Dylan, Ada, Java, C#, Scheme, Smalltalk, Ruby, Python, Haskell, Prolog, JavaScript.]

The Front End
stream of characters → Lexer → stream of tokens → Parser → abstract syntax → Type Checker
• Lexical Analysis: Create sequence of tokens from characters
• Syntax Analysis: Create abstract syntax tree from sequence of tokens
• Type Checking: Check program for well-formedness constraints

Lexical Analysis
• Lexical Analysis: Breaks stream of ASCII characters (source) into tokens
• Token: An atomic unit of program syntax
  – i.e., a word as opposed to a sentence
• Tokens and their types:

  Characters Recognized    Type      Token
  foo, x, listcount        ID        ID(foo), ID(x), ...
  10.45, 3.14, -2.1        REAL      REAL(10.45), REAL(3.14), ...
  ;                        SEMI      SEMI
  (                        LPAREN    LPAREN
  50, 100                  NUM       NUM(50), NUM(100)
  if                       IF        IF

Ambiguous Token Rule Sets
• We resolve ambiguities using two conventions:
  – Longest match: The regular expression that matches the longest string takes precedence.
  – Rule Priority: The regular expressions identifying tokens are written down in sequence. If two regular expressions match the same (longest) string, the first regular expression in the sequence takes precedence.

Ambiguous Token Rule Sets
• Example:
  – Identifier tokens: [a-z][a-z0-9]*
  – Sample keyword tokens: if, then, ...
• How do we tokenize:
  – foobar ==> ID(foobar) or ID(foo) ID(bar)
    • use longest match to disambiguate
  – if ==> ID(if) or IF
    • keyword rules have higher priority than the identifier rule

Lexer Implementation
Implementation Options:
1. Write Lexer from scratch
  – Boring and error-prone
2. Use Lexical Analyzer Generator
  – Quick and easy
ml-lex is a lexical analyzer generator for ML. lex and flex are lexical analyzer generators for C.

Where are we?
Previously, we discussed what it means to have a derivation (CFG) of a sentence according to a grammar, and how one can use a derivation (once found) to guide the application (PDA) of semantic actions that compute a semantic value (i.e., translation) of a sentence (where "sentence" can mean an entire program).
Parsers fall into two classes, based on which non-terminal is expanded:
  top-down (leftmost derivation)
  bottom-up (rightmost derivation, in reverse)

Look ahead? What does it mean?
There are algorithms that can be used to parse the language defined by an arbitrary CFG. However, in the worst case, the algorithms take O(n^3) time, where n is the number of tokens. That is too slow!
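Returning briefly to the two lexer conventions described above (longest match, then rule priority), the following is a minimal sketch of a hand-written tokenizer in Python. The rule list, the token names beyond those in the slides, and the function name tokenize are assumptions made for illustration; a generated lexer (ml-lex, lex, flex) would normally compile the rules into a single automaton rather than try each regular expression in turn.

```python
import re

# Token rules in priority order: keyword rules come before the identifier rule.
# This particular rule set is an illustrative assumption, not a full language.
RULES = [
    ("IF",     r"if"),
    ("THEN",   r"then"),
    ("ID",     r"[a-z][a-z0-9]*"),
    ("REAL",   r"-?[0-9]+\.[0-9]+"),
    ("NUM",    r"[0-9]+"),
    ("SEMI",   r";"),
    ("LPAREN", r"\("),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def tokenize(source):
    """Split source into (type, text) pairs: longest match first, then rule priority."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        best_name, best_text = None, ""
        for name, regex in COMPILED:
            match = regex.match(source, pos)
            # A strictly longer match wins; on a tie the earlier rule is kept.
            if match and len(match.group()) > len(best_text):
                best_name, best_text = name, match.group()
        if best_name is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        tokens.append((best_name, best_text))
        pos += len(best_text)
    return tokens


print(tokenize("if foobar"))        # keyword rule beats ID on a tie
print(tokenize("ifx ( 50 ; 3.14"))  # longest match makes "ifx" an ID
```

Because the comparison is strict, "if" stays an IF token (both IF and ID match two characters, and IF is listed first), while "ifx" becomes an ID because the identifier rule matches more characters.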
LL(1)
^^ ^
|| |___ one token of look-ahead
||_____ do a leftmost derivation
|______ scan the input left-to-right

LALR(1)
^ ^^ ^
| || |__ one token of look-ahead
| ||____ do a rightmost derivation in reverse
| |_____ scan the input left-to-right
|_______ LA means "look-ahead"; this has nothing to do with the number of tokens the parser can look at before it chooses what to do. It is a technical term that only means something when you study how LR parsers work.

What is LL/LR parser
• If we constrain the grammar somewhat, we can always parse in linear time. This is good!
• Linear-time parsing
  – LL parsers
    • Recognize LL grammars
    • Use a top-down strategy
  – LR parsers
    • Recognize LR grammars
    • Use a bottom-up strategy
• LL(n): Left to right, Leftmost derivation, look ahead at most n symbols.
• LR(n): Left to right, Rightmost derivation (in reverse), look ahead at most n symbols.

LL(1) Grammars
A context-free grammar whose Predict sets are always disjoint (for the same non-terminal) is said to be LL(1).
LL(1) grammars are ideally suited for top-down parsing because it is always possible to correctly predict the expansion of any non-terminal. No backup is ever needed.
In short, for each rule A → X1 X2 ... Xn we want the set of terminals that can appear first when that rule is applied (its Predict set).

Example

  Production    Predict Set
  S → A a       {b, d, a}
  A → B D       {b, d, a}
  B → b         {b}
  B → λ         {d, a}
  D → d         {d}
  D → λ         {a}

Since the predict sets of both B productions and both D productions are disjoint, this grammar is LL(1).

Formally, let
  First(X1...Xn) = {a in Vt | A → X1...Xn ⇒* a...}
  Follow(A) = {a in Vt | S ⇒+ ...Aa...}
  Predict(A → X1...Xn) =
    If X1...Xn ⇒* λ
    Then First(X1...Xn) ∪ Follow(A)
    Else First(X1...Xn)

If some CFG, G, has the property that for all pairs of distinct productions with the same lefthand side,
  A → X1...Xn and A → Y1...Ym
it is the case that
  Predict(A → X1...Xn) ∩ Predict(A → Y1...Ym) = ∅
then G is LL(1).
LL(1) grammars are easy to parse in a top-down manner since predictions are always correct.
© CS 536 Fall 2012
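The First, Follow, and Predict definitions above can be checked mechanically. The sketch below, in Python, computes the Predict sets for the small example grammar (S → A a, A → B D, B → b | λ, D → d | λ) by fixed-point iteration. The data layout and every function name are assumptions chosen for illustration, and no end-of-input marker is added to Follow(S) because the example grammar does not use one.

```python
# Example grammar: (lefthand side, righthand side); an empty list stands for λ.
GRAMMAR = [
    ("S", ["A", "a"]),
    ("A", ["B", "D"]),
    ("B", ["b"]),
    ("B", []),
    ("D", ["d"]),
    ("D", []),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def compute_nullable():
    """Non-terminals that can derive λ."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def first_of(symbols, first, nullable):
    """Return (First set of the symbol string, whether the string can derive λ)."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:      # a terminal ends the scan
            result.add(sym)
            return result, False
        result |= first[sym]
        if sym not in nullable:
            return result, False
    return result, True


def compute_predict_sets():
    nullable = compute_nullable()
    first = {nt: set() for nt in NONTERMINALS}
    follow = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:                       # First sets, to a fixed point
        changed = False
        for lhs, rhs in GRAMMAR:
            new, _ = first_of(rhs, first, nullable)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    changed = True
    while changed:                       # Follow sets, to a fixed point
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                new, rest_nullable = first_of(rhs[i + 1:], first, nullable)
                if rest_nullable:
                    new = new | follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    predict = {}
    for number, (lhs, rhs) in enumerate(GRAMMAR, start=1):
        new, derives_lambda = first_of(rhs, first, nullable)
        predict[number] = (new | follow[lhs]) if derives_lambda else new
    return predict


def is_ll1(predict):
    """True if productions sharing a lefthand side have disjoint Predict sets."""
    seen = {}
    for number, (lhs, _) in enumerate(GRAMMAR, start=1):
        if seen.setdefault(lhs, set()) & predict[number]:
            return False
        seen[lhs] |= predict[number]
    return True


predict = compute_predict_sets()
for number, (lhs, rhs) in enumerate(GRAMMAR, start=1):
    print(lhs, "->", " ".join(rhs) or "λ", sorted(predict[number]))
print("LL(1):", is_ll1(predict))
```

The computed sets agree with the Predict Set column of the example table, and the disjointness check is the LL(1) condition stated above.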
Recursive Descent Parsers
An early implementation of top-down (LL(1)) parsing was recursive descent.
A parser was organized as a set of parsing procedures, one for each non-terminal. Each parsing procedure was responsible for parsing a sequence of tokens derivable from its non-terminal.
For example, a parsing procedure, A, when called, would call the scanner and match a token sequence derivable from A.
Starting with the start symbol's parsing procedure, we would then match the entire input, which must be derivable from the start symbol.
This approach is called recursive descent because the parsing procedures were typically recursive, and they descended down the input's parse tree (as top-down parsers always do).

Building A Recursive Descent Parser
We start with a procedure Match, that matches the current input token against a predicted token:

  void Match(Terminal a) {
    if (a == currentToken)
      currentToken = Scanner();
    else SyntaxError();
  }

To build a parsing procedure for a non-terminal A, we look at all productions with A on the lefthand side:
  A → X1...Xn | A → Y1...Ym | ...
We use predict sets to decide which production to match (LL(1) grammars always have disjoint predict sets).
We match a production's righthand side by calling Match to match terminals, and calling parsing procedures to match non-terminals.
The general form of a parsing procedure for
  A → X1...Xn | A → Y1...Ym | ...
is

  void A() {
    if (currentToken in Predict(A→X1...Xn))
      for (i=1; i<=n; i++)
        if (X[i] is a terminal)
          Match(X[i]);
        else X[i]();
    else if (currentToken in Predict(A→Y1...Ym))
      for (i=1; i<=m; i++)
        if (Y[i] is a terminal)
          Match(Y[i]);
        else Y[i]();
    else /* handle the remaining A productions, or report a syntax error */
  }

LL(1) Parse Tables
An LL(1) parse table, T, is a two-dimensional array. Entries in T are production numbers or blank (error) entries.
T is indexed by:
• A, a non-terminal. A is the non-terminal we want to expand.
• CT, the current token that is to be matched.
• T[A][CT] = A → X1...Xn if CT is in Predict(A → X1...Xn)
• T[A][CT] = error if CT predicts no production with A as its lefthand side

CSX-lite Example

     Production                      Predict Set
  1  Prog → { Stmts } Eof            {
  2  Stmts → Stmt Stmts              id if
  3  Stmts → λ                       }
  4  Stmt → id = Expr ;              id
  5  Stmt → if ( Expr ) Stmt         if
  6  Expr → id Etail                 id
  7  Etail → + Expr                  +
  8  Etail → - Expr                  -
  9  Etail → λ                       ) ;

LL(1) Parse Table

        {   }   if  (   )   id  =   +   -   ;   eof
Prog    1
Stmts       3   2           2
Stmt            5           4
Expr                        6
Etail                   9           7   8   9

Example of LL(1) Parsing
We'll parse
  { a = b + c; } Eof
We start by placing Prog (the start symbol) on the parse stack.

Example of LL(1) Parsing (step 1)

  Parse Stack                    Remaining Input
  Prog                           { a = b + c; } Eof
  { Stmts } Eof                  { a = b + c; } Eof
  Stmts } Eof                    a = b + c; } Eof
  Stmt Stmts } Eof               a = b + c; } Eof
  id = Expr ; Stmts } Eof        a = b + c; } Eof

Example of LL(1) Parsing (step 2)

  Parse Stack                    Remaining Input
  = Expr ; Stmts } Eof           = b + c; } Eof
  Expr ; Stmts } Eof             b + c; } Eof
  id Etail ; Stmts } Eof         b + c; } Eof
  Etail ; Stmts } Eof            + c; } Eof
  + Expr ; Stmts } Eof           + c; } Eof

Example of LL(1) Parsing (step 3)

  Parse Stack                    Remaining Input
  Expr ; Stmts } Eof             c; } Eof
  id Etail ; Stmts } Eof         c; } Eof
  Etail ; Stmts } Eof            ; } Eof
  ; Stmts } Eof                  ; } Eof

Example of LL(1) Parsing (step 4)

  Parse Stack                    Remaining Input
  Stmts } Eof                    } Eof
  } Eof                          } Eof
  Eof                            Eof

Done! All input matched.
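The parse-stack trace above can be reproduced with a small table-driven driver. This is a sketch in Python under stated assumptions: the input is already tokenized, identifiers such as a, b, and c have been reduced to the token id, and the names PRODUCTIONS, TABLE, and parse are illustrative rather than part of any existing tool.

```python
# CSX-lite productions, numbered as in the example; an empty list stands for λ.
PRODUCTIONS = {
    1: ("Prog",  ["{", "Stmts", "}", "Eof"]),
    2: ("Stmts", ["Stmt", "Stmts"]),
    3: ("Stmts", []),
    4: ("Stmt",  ["id", "=", "Expr", ";"]),
    5: ("Stmt",  ["if", "(", "Expr", ")", "Stmt"]),
    6: ("Expr",  ["id", "Etail"]),
    7: ("Etail", ["+", "Expr"]),
    8: ("Etail", ["-", "Expr"]),
    9: ("Etail", []),
}
# T[A][CT]: only non-error entries are stored; a missing key is an error entry.
TABLE = {
    ("Prog", "{"): 1,
    ("Stmts", "id"): 2, ("Stmts", "if"): 2, ("Stmts", "}"): 3,
    ("Stmt", "id"): 4, ("Stmt", "if"): 5,
    ("Expr", "id"): 6,
    ("Etail", "+"): 7, ("Etail", "-"): 8, ("Etail", ")"): 9, ("Etail", ";"): 9,
}
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS.values()}


def parse(tokens):
    """Return the production numbers applied, or raise SyntaxError at the first illegal token."""
    stack = ["Prog"]          # the top of the stack is the end of the list
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        current = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            number = TABLE.get((top, current))
            if number is None:
                raise SyntaxError(f"no production for {top} on token {current!r}")
            used.append(number)
            # Push the righthand side reversed so its first symbol ends up on top.
            stack.extend(reversed(PRODUCTIONS[number][1]))
        elif top == current:
            pos += 1          # terminal on the stack matches the current token
        else:
            raise SyntaxError(f"expected {top!r} but found {current!r}")
    if pos != len(tokens):
        raise SyntaxError("input remains after the parse stack emptied")
    return used


# { a = b + c; } Eof, with identifiers already reduced to the token "id".
print(parse(["{", "id", "=", "id", "+", "id", ";", "}", "Eof"]))
```

Calling parse on the token list for { b + c = a; } Eof raises SyntaxError as soon as the stack expects = but the current token is +, which is the behavior the next slides describe.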
Syntax Errors in LL(1) Parsing
In LL(1) parsing, syntax errors are automatically detected as soon as the first illegal token is seen.
How? When an illegal token is seen by the parser, either it fetches an error entry from the LL(1) parse table or it fails to match an expected token.
Let's see how the following illegal CSX-lite program is parsed:
  { b + c = a; } Eof
(Where should the first syntax error be detected?)

Example

  Parse Stack                    Remaining Input
  Prog                           { b + c = a; } Eof
  { Stmts } Eof                  { b + c = a; } Eof
  Stmts } Eof                    b + c = a; } Eof
  Stmt Stmts } Eof               b + c = a; } Eof
  id = Expr ; Stmts } Eof        b + c = a; } Eof
  = Expr ; Stmts } Eof           + c = a; } Eof

Current token (+) fails to match expected token (=)!

XPath
• A Language for Locating Nodes in XML Documents
• XPath expressions are written in a syntax that resembles paths in file systems
• The list of nodes located by an XPath expression is called a Nodelist
• XPath is used in XSL and in XQuery (a query language for XML)
• W3Schools has an XPath tutorial
• XPath includes
  – Axis navigation
  – Conditions
  – Functions

XML Schema
• An XML Schema describes the structure of an XML document.
• In this tutorial you will learn how to create XML Schemas, why XML Schemas are more powerful than DTDs, and how to use XML Schema in your application.
•
  <?xml version="1.0"?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="note">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="to" type="xs:string"/>
          <xs:element name="from" type="xs:string"/>
          <xs:element name="heading" type="xs:string"/>
          <xs:element name="body" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>

An XML document (catalog.xml)

  <?xml version="1.0" encoding="ISO-8859-1"?>
  <catalog>
    <cd country="UK">
      <title>Dark Side of the Moon</title>
      <artist>Pink Floyd</artist>
      <price>10.90</price>
    </cd>
    <cd country="UK">
      <title>Space Oddity</title>
      <artist>David Bowie</artist>
      <price>9.90</price>
    </cd>
    <cd country="USA">
      <title>Aretha: Lady Soul</title>
      <artist>Aretha Franklin</artist>
      <price>9.90</price>
    </cd>
  </catalog>

catalog.xml
[Figure: tree view of catalog.xml. The root catalog has three cd children; each cd has a country attribute (UK, UK, USA) and title, artist, and price children. The same tree is repeated on the slides that follow, with the nodes selected by each expression highlighted.]

The Main Idea in the Syntax of XPath Expressions
• / at the beginning of an XPath expression represents the root of the document
• / between element names represents a parent-child relationship
• // represents an ancestor-descendant relationship
• @ marks an attribute
• [condition] specifies a condition

catalog.xml
  /catalog
Getting the root element of the document

catalog.xml
  /catalog/cd
Finding child nodes

catalog.xml
  /catalog/cd/price
Finding descendant nodes

catalog.xml
  /catalog/cd[price<10]
Condition on elements

catalog.xml
  /catalog//title
  //title
// represents any directed path in the document

catalog.xml
  /catalog/cd/*
* represents any element name in the document

catalog.xml
What will the following expressions return?
(* represents any element name in the document.)

catalog.xml
  /catalog/cd[1]
  /catalog/cd[last()]
Position based condition

catalog.xml
  /catalog/cd[@country="UK"]
@ marks attributes

catalog.xml
  /catalog/cd/@country
@ marks attributes

Relative Navigation Using Axes
• Starts with the current node and not with the root (/)
• A . marks the current node (e.g., ./title)
• A .. marks the parent node (e.g., title/../*)
• There are also other axes, e.g., child, descendant, ancestor, parent, following-sibling, etc.
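Several of the catalog.xml expressions above can be tried with Python's standard library. This is a minimal sketch, not a full XPath engine: xml.etree.ElementTree supports only a subset of XPath, its paths are written relative to an element rather than starting with /, and it has no numeric comparison predicate, so the cd[price<10] condition is applied in plain Python here. The variable and helper names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

CATALOG_XML = """<catalog>
  <cd country="UK">
    <title>Dark Side of the Moon</title>
    <artist>Pink Floyd</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Space Oddity</title>
    <artist>David Bowie</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Aretha: Lady Soul</title>
    <artist>Aretha Franklin</artist>
    <price>9.90</price>
  </cd>
</catalog>"""

root = ET.fromstring(CATALOG_XML)      # root is the <catalog> element


def titles(cds):
    """Titles of the given cd elements, in document order."""
    return [cd.findtext("title") for cd in cds]


# //title : every title element below the root
all_titles = [t.text for t in root.findall(".//title")]

# /catalog/cd[@country="UK"] : condition on an attribute
uk_cds = root.findall("cd[@country='UK']")

# /catalog/cd[1] and /catalog/cd[last()] : position based conditions
first_cd = root.findall("cd[1]")
last_cd = root.findall("cd[last()]")

# /catalog/cd/@country : ElementTree returns elements, so read the attribute from each
countries = [cd.get("country") for cd in root.findall("cd")]

# /catalog/cd[price<10] : filtered in Python because ElementTree lacks this predicate
cheap_cds = [cd for cd in root.findall("cd") if float(cd.findtext("price")) < 10]

print(all_titles)
print(titles(uk_cds))
print(titles(first_cd), titles(last_cd))
print(countries)
print(titles(cheap_cds))
```

A library with complete XPath 1.0 support would accept the slide expressions verbatim; the subset used here was chosen only so that the example runs without third-party packages.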
Functions
• Many functions are included in XPath
• Some examples:
  – count() – returns the number of nodes in a nodelist
  – last() – returns the position of the last node in a nodelist
  – name() – returns the name of a node
  – position() – returns the position of the node in the nodelist

Additional Examples of XPath Expressions
These examples use element names that are not necessarily from the XML document that was shown previously

Examples of XPath Expressions
• para – Selects the para children elements of the context node
• * – Selects all element children of the context node
• text() – Selects all text node children of the context node
• @name – Selects the name attribute of the context node

More Examples of XPath Expressions
• @* – Selects all the attributes of the context node
• para[1] – Selects the first para child of the context node
• para[last()] – Selects the last para child of the context node
• */para – Selects all para grandchildren of the context node

More Examples of XPath Expressions
• /doc/chapter[5]/section[2] – Selects the second section of the fifth chapter of the doc
• chapter//para – Selects the para element descendants of the chapter element children of the context node
• //para – Selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

More Examples of XPath Expressions
• //olist/item – Selects all the item elements that have an olist parent and are in the same document as the context node
• . – Selects the context node
• .//para – Selects the para descendants of the context node
• .. – Selects the parent of the context node

More Examples of XPath Expressions
• ../@lang – Selects the lang attribute of the parent of the context node
• para[@type="warning"] – Selects the para children of the context node that have a type attribute with value warning
• chapter[title] – Selects the chapter children of the context node that have one or more title children

More Examples of XPath Expressions
• para[@type="warning"][5] – Selects the fifth para child among the children of the context node that have a type attribute with value warning
• para[5][@type="warning"] – Selects the fifth para child of the context node if that child has a type attribute with value warning

More Examples of XPath Expressions
• chapter[title="Introduction"] – Selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
• employee[@secretary and @assistant] – Selects employee children of the context node that have both a secretary attribute and an assistant attribute

More Examples of XPath Expressions
• /university/department/course – This XPath expression matches any path that starts at the root, which is a university element, passes through a department element and ends in a course element
• ./department/course[@year=2002] – This XPath expression matches any path that starts at the current element, continues to a child which is a department element and ends at a course element with a year attribute that is equal to 2002

Location Paths
• The previous examples are abbreviations of location paths
  – See XPath tutorial in W3Schools
  – For example, // is short for /descendant-or-self::node()/
    • //para is short for /descendant-or-self::node()/child::para

What Is A Scripting Language
• Modern scripting languages have two principal sets of ancestors.
  – command interpreters or shells of traditional batch and terminal (command-line) computing
    • IBM's JCL, MS-DOS command interpreter, Unix sh and csh
  – various tools for text processing and report generation
    • IBM's RPG, and Unix's sed and awk.
• From these evolved
  – Rexx, IBM's Restructured Extended Executor, which dates from 1979
  – Perl, originally devised by Larry Wall in the late 1980s, and now the most widely used general purpose scripting language.
  – Other general purpose scripting languages include Tcl ("tickle"), Python, Ruby, VBScript (for Windows) and AppleScript (for the Mac)

What Is A Scripting Language
• Scripting on Microsoft platforms
  – As in several other aspects of computing, Microsoft tends to rely on internally developed technology in the area of scripting languages
  – Most scripting applications are based on VBScript, a dialect of Visual Basic
  – Microsoft has also developed a very general scripting interface (Windows Script) that is implemented uniformly by the operating system, the web server, and the Internet Explorer browser
  – A Windows Script implementation of JScript, the company's version of JavaScript, comes pre-installed on Windows machines, but languages like Perl and Python can be installed as well, and used to drive the same interface.
  – Many other Microsoft applications use VBScript as an extension language, but for these the implementation framework (Visual Basic for Applications [VBA]) does not make it easy to use other languages instead

What Is A Scripting Language
• Scripting on Microsoft platforms
  – Given Microsoft's share of the desktop computing market, VBScript is one of the most widely used scripting languages
    • It is almost never used on other platforms
  – Perl, Tcl, Python, PHP, and others see significant use on Windows
    • For server-side web scripting, PHP currently predominates: as of February 2005, some 69% of the 59 million Internet web sites surveyed by Netcraft LTD were running the open source Apache web server, and of them most of the ones with active content were using PHP
    • Microsoft's Internet Information Server (IIS) was second to Apache, with 21% of the sites, and many of those had PHP installed as well.
    • For client-side scripting, where Internet Explorer controls about 70% of the browser market, most web site administrators need their content to be visible to the other 30%
    • Explorer supports JavaScript (JScript), but other browsers do not support VBScript

Shell Scripts
• A shell script is just a file containing shell commands, but with a few extras:
  – The first line of a shell script should be a comment of the following form:
      #!/bin/sh
    for a Bourne shell script. Bourne shell scripts are the most common, since C Shell scripts have buggy features.
  – A shell script must be readable and executable.
      chmod +rx scriptname
  – As with any command, a shell script has to be in your path to be executed.
    • If . is not in your PATH, you must specify ./scriptname instead of just scriptname

My First Script
• I want to type "ls -a" in a Unix terminal
• Is it the same thing if I write it in a script (a shell file)?

  #!/bin/sh
  ls -a

Yes!

What is Shell?
• The shell is the interface between the end user and the Linux system, similar to the command prompt in Windows
• On many Linux systems, bash is installed as /bin/sh
• Check the version

    % /bin/sh --version

[Figure: the kernel at the center, surrounded by csh, bash, the X Window System, and other programs]

Pipe and Redirection
• Redirection (< or >)

    % ls -l > lsoutput.txt                  (save output to lsoutput.txt)
    % ps >> lsoutput.txt                    (append to lsoutput.txt)
    % more < killout.txt                    (use killout.txt as input to more)
    % kill -1 1234 > killouterr.txt 2>&1    (redirect output and errors to the same file)
    % kill -1 1234 > /dev/null 2>&1         (discard output and errors)

  There must be no space between 2 and >&1; otherwise the 2 is passed to the command as an argument.
• Pipe (|)
– Processes are executed concurrently

    % ps | sort | more
    % ps -xo comm | sort | uniq | grep -v sh | more
    % cat mydata.txt | sort | uniq > mydata.txt    (generates an empty file!)

  The last command empties the file because the shell truncates mydata.txt for output before cat has read it.

Shell as a Language
• We can write a script containing many shell commands
• Interactive program:
– grep files with the POSIX string and print them

    % for file in *
    > do
    >   if grep -l POSIX $file
    >   then
    >     more $file
    >   fi
    > done
    posix
    There is a file with POSIX in it

– * is a wildcard

    % more `grep -l POSIX *`
    % more $(grep -l POSIX *)
    % grep -l POSIX * | more

Writing a Script
• Use a text editor to create the "first" file

    #!/bin/sh
    # first
    # this file looks for the files containing POSIX
    # and prints their names
    for file in *
    do
      if grep -q POSIX $file
      then
        echo $file
      fi
    done
    exit 0        (exit code; 0 means successful)

    % /bin/sh first
    % chmod +x first
    % ./first     (plain "first" works only if . is included in PATH)

Syntax
• Variables
• Conditions
• Control
• Lists
• Functions
• Shell commands
• Command results
• Here documents

Variables
• Variables do not need to be declared before use; names are case-sensitive (e.g. foo, FOO, and Foo are different variables)
• Prefix the name with $ to read the stored value

    % salutation=Hello
    % echo $salutation
    Hello
    % salutation=7+5
    % echo $salutation
    7+5
    % salutation="yes dear"
    % echo $salutation
    yes dear
    % read salutation
    Hola!
    % echo $salutation
    Hola!
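The rules on the Variables slide can be collected in one small non-interactive script. This is only a minimal sketch, assuming a POSIX-compatible sh; all variable names are made up for the illustration.

```shell
#!/bin/sh
# Illustrative names only; none of these variables come from a real program.

# Assignment: no spaces around '=' and no declaration beforehand.
salutation=Hello

# Names are case-sensitive, so these are three different variables.
foo=one
FOO=two
Foo=three

# The shell stores text; it does not evaluate 7+5 on assignment.
sum=7+5

# A value that contains spaces has to be quoted.
greeting="yes dear"

# Reading a value back requires the $ prefix.
echo "$salutation"
echo "$foo $FOO $Foo"
echo "$sum"
echo "$greeting"
```

Run with sh, this should print four lines: Hello, one two three, 7+5, and yes dear. Writing `salutation = Hello` with spaces would instead try to run a command called salutation.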
Quoting
• Edit a "vartest.sh" file

    #!/bin/sh
    myvar="Hi there"
    echo $myvar
    echo "$myvar"
    echo '$myvar'
    echo \$myvar
    echo Enter some text
    read myvar
    echo '$myvar' now equals $myvar
    exit 0

Output

    Hi there
    Hi there
    $myvar
    $myvar
    Enter some text
    Hello world
    $myvar now equals Hello world

Environment Variables
• $HOME    home directory
• $PATH    search path for commands
• $PS1     primary prompt (normally %)
• $PS2     secondary prompt (normally >)
• $$       process id of the script
• $#       number of input parameters
• $0       name of the script file
• $IFS     field separation characters (white space)
• Use env to check the values

Parameter

    % IFS=''
    % set foo bar bam
    % echo "$@"
    foo bar bam          (IFS doesn't matter for $@)
    % echo "$*"
    foobarbam            (parameters are joined with the first character of IFS, here nothing)
    % unset IFS
    % echo "$*"
    foo bar bam

Parameter
Edit file "try_var"

    #!/bin/sh
    salutation="Hello"
    echo $salutation
    echo "The program $0 is now running"
    echo "The parameter list was $*"
    echo "The second parameter was $2"
    echo "The first parameter was $1"
    echo "The user's home directory is $HOME"
    echo "Please enter a new greeting"
    read salutation
    echo $salutation
    echo "The script is now complete"
    exit 0

    % ./try_var foo bar baz
    Hello
    The program ./try_var is now running
    The parameter list was foo bar baz
    The second parameter was bar
    The first parameter was foo
    The user's home directory is /home/ychuang
    Please enter a new greeting
    Hola
    Hola
    The script is now complete

Condition
• test or [

    if test -f fred.c
    then
      ...
    fi

    if [ -f fred.c ]
    then
      ...
    fi

    if [ -f fred.c ]; then
      ...
    fi

  Note: [ and ] need a space on each side.

• Arithmetic comparison

    expression1 -eq expression2    equal
    expression1 -ne expression2    not equal
    expression1 -gt expression2    greater than
    expression1 -ge expression2    greater than or equal
    expression1 -lt expression2    less than
    expression1 -le expression2    less than or equal
    ! expression                   negation

• String comparison

    string1 = string2
    string1 != string2
    -n string                      (if not empty string)
    -z string                      (if empty string)

• File tests

    -d file    if directory
    -e file    if exists
    -f file    if regular file
    -g file    if set-group-id
    -r file    if readable
    -s file    if size > 0
    -u file    if set-user-id
    -w file    if writable
    -x file    if executable

Control Structure
Syntax

    if condition
    then
      statements
    else
      statements
    fi

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    if [ $timeofday = "yes" ]; then
      echo "Good morning"
    else
      echo "Good afternoon"
    fi
    exit 0

Output

    Is it morning? Please answer yes or no
    yes
    Good morning

Condition Structure

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    if [ $timeofday = "yes" ]; then
      echo "Good morning"
    elif [ $timeofday = "no" ]; then
      echo "Good afternoon"
    else
      echo "Sorry, $timeofday not recognized. Enter yes or no"
      exit 1
    fi
    exit 0

Condition Structure
If the user just presses Enter, the unquoted $timeofday above expands to nothing, the test becomes [ = "yes" ], and the shell reports an error. Quoting the variable keeps the comparison valid, so an empty answer reaches the else branch:

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    if [ "$timeofday" = "yes" ]; then
      echo "Good morning"
    elif [ "$timeofday" = "no" ]; then
      echo "Good afternoon"
    else
      echo "Sorry, $timeofday not recognized. Enter yes or no"
      exit 1
    fi
    exit 0

Loop Structure
Syntax

    for variable in values
    do
      statements
    done

    #!/bin/sh
    for foo in bar fud 43
    do
      echo $foo
    done
    exit 0

Output

    bar
    fud
    43

How to output "bar fud 43" on one line? Try changing the loop to for foo in "bar fud 43". The quotes make the whole string, spaces included, a single value.

Loop Structure
• Use wildcard *

    #!/bin/sh
    for file in $(ls f*.sh); do
      lpr $file
    done
    exit 0

  Prints all f*.sh files.

Loop Structure
Syntax

    while condition
    do
      statements
    done

Syntax

    until condition
    do
      statements
    done

Note: the condition is the reverse of while; the loop runs until the condition becomes true.

How to rewrite the previous sample?

    #!/bin/sh
    for foo in 1 2 3 4 5 6 7 8 9 10
    do
      echo "here we go again"
    done
    exit 0

    #!/bin/sh
    foo=1
    while [ "$foo" -le 10 ]
    do
      echo "here we go again"
      foo=$(($foo+1))
    done
    exit 0

Case Statement
Syntax

    case variable in
      pattern [ | pattern ] ...) statements;;
      pattern [ | pattern ] ...) statements;;
      ...
    esac

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    case "$timeofday" in
      yes) echo "Good Morning";;
      y)   echo "Good Morning";;
      no)  echo "Good Afternoon";;
      n)   echo "Good Afternoon";;
      * )  echo "Sorry, answer not recognized";;
    esac
    exit 0

Case Statement
• A much cleaner version

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    case "$timeofday" in
      yes | y | Yes | YES ) echo "Good Morning";;
      n* | N* )             echo "Good Afternoon";;
      * )                   echo "Sorry, answer not recognized";;
    esac
    exit 0

But this has a problem: if we enter "never", it matches the n* pattern and prints "Good Afternoon".

Case Statement

    #!/bin/sh
    echo "Is it morning? Please answer yes or no"
    read timeofday
    case "$timeofday" in
      yes | y | Yes | YES )
        echo "Good Morning"
        echo "Up bright and early this morning"
        ;;
      [nN]*)
        echo "Good Afternoon"
        ;;
      *)
        echo "Sorry, answer not recognized"
        echo "Please answer yes or no"
        exit 1
        ;;
    esac
    exit 0

List
• AND (&&)

    statement1 && statement2 && statement3 ...

    #!/bin/sh
    touch file_one        (create the file if it does not exist)
    rm -f file_two        (remove a file)
    if [ -f file_one ] && echo "Hello" && [ -f file_two ] && echo " there"
    then
      echo "in if"
    else
      echo "in else"
    fi
    exit 0

Output

    Hello
    in else

List
• OR (||)

    statement1 || statement2 || statement3 ...

    #!/bin/sh
    rm -f file_one
    if [ -f file_one ] || echo "Hello" || echo " there"
    then
      echo "in if"
    else
      echo "in else"
    fi
    exit 0

Output

    Hello
    in if

The file test fails, echo "Hello" succeeds, so the list as a whole is true and the remaining echo is never run.

Statement Block
• Use multiple statements in a place where only one is allowed

    get_confirm && {
      grep -v "$cdcatnum" $tracks_file > $temp_file
      cat $temp_file > $tracks_file
      echo
      add_record_tracks
    }

Function
• You can define functions for "structured" scripts

    function_name() {
      statements
    }

    #!/bin/sh
    foo() {
      echo "Function foo is executing"
    }
    echo "script starting"
    foo
    echo "script ended"
    exit 0

Output

    script starting
    Function foo is executing
    script ended

You need to define a function before using it. While a function is running, the parameters $*, $@, $#, $1, $2 are replaced by the function's own parameters; they return to their previous values after the function finishes.

Function
• Define a local variable
What is the output?
Check the scope of the variables.

    #!/bin/sh
    sample_text="global variable"
    foo() {
      local sample_text="local variable"
      echo "Function foo is executing"
      echo $sample_text
    }
    echo "script starting"
    echo $sample_text
    foo
    echo "script ended"
    echo $sample_text
    exit 0

Function
• Use return to pass a result

    #!/bin/sh
    yes_or_no() {
      echo "Is your name $*?"
      while true
      do
        echo -n "Enter yes or no: "
        read x
        case "$x" in
          y | yes ) return 0;;
          n | no )  return 1;;
          * )       echo "Answer yes or no";;
        esac
      done
    }
    echo "Original parameters are $*"
    if yes_or_no "$1"
    then
      echo "Hi $1, nice name"
    else
      echo "Never mind"
    fi
    exit 0

Output

    % ./my_name John Chuang
    Original parameters are John Chuang
    Is your name John?
    Enter yes or no: yes
    Hi John, nice name

Command
• External: separate programs, as used interactively at the prompt
• Internal: built into the shell
• break: leave the enclosing loop

    #!/bin/sh
    rm -rf fred*
    echo > fred1
    echo > fred2
    mkdir fred3
    echo > fred4
    for file in fred*
    do
      if [ -d "$file" ]; then
        break
      fi
    done
    echo first directory starting fred was $file
    rm -rf fred*
    exit 0

Command
• : is the null command; it does nothing and is treated as true

    #!/bin/sh
    rm -f fred
    if [ -f fred ]; then
      :
    else
      echo file fred did not exist
    fi
    exit 0

Command
• continue: skip to the next iteration of the loop

    #!/bin/sh
    rm -rf fred*
    echo > fred1
    echo > fred2
    mkdir fred3
    echo > fred4
    for file in fred*
    do
      if [ -d "$file" ]; then
        echo "skipping directory $file"
        continue
      fi
      echo file is $file
    done
    rm -rf fred*
    exit 0

Command
• . ./shell_script: execute shell_script in the current shell, so its variable settings remain in effect

classic_set

    #!/bin/sh
    version=classic
    PATH=/usr/local/old_bin:/usr/bin:/bin:.
    PS1="classic> "

latest_set

    #!/bin/sh
    version=latest
    PATH=/usr/local/new_bin:/usr/bin:/bin:.
    PS1=" latest version> "

    % . ./classic_set
    classic> echo $version
    classic
    classic> . latest_set
     latest version> echo $version
    latest
     latest version>

Command
• echo: print a string
• -n    do not output the trailing newline
• -e    enable interpretation of backslash escapes
– \0NNN  the character whose ASCII code is NNN (octal)
– \\     backslash
– \a     alert
– \b     backspace
– \c     suppress trailing newline
– \f     form feed
– \n     newline
– \r     carriage return
– \t     horizontal tab
– \v     vertical tab

Try these

    % echo -n "string to \n output"
    % echo -e "string to \n output"

Command
• eval: evaluate the arguments as a command, similar to an extra $

    % foo=10
    % x=foo
    % y='$'$x
    % echo $y
    $foo

    % foo=10
    % x=foo
    % eval y='$'$x
    % echo $y
    10

Command
• exit n: end the script with exit code n
• 0 means success
• 1 to 125 are error codes available to scripts
• 126 means the file was not executable
• 127 means no such command
• 128 and above mean a signal occurred

    #!/bin/sh
    if [ -f .profile ]; then
      exit 0
    fi
    exit 1

Or

    % [ -f .profile ] && exit 0 || exit 1

Command
• export: make a variable available to the programs started from this shell

This is "export2"

    #!/bin/sh
    echo "$foo"
    echo "$bar"

This is "export1"

    #!/bin/sh
    foo="The first meta-syntactic variable"
    export bar="The second meta-syntactic variable"
    export2

Output is

    % export1

    The second meta-syntactic variable
    %

The first output line is blank because foo was not exported.

Command
• expr: evaluate expressions

    % x=`expr $x + 1`      (assign the result of $x + 1 to x)

  Also can be written as

    % x=$(expr $x + 1)

    expr1 | expr2   (or)        expr1 != expr2
    expr1 & expr2   (and)       expr1 + expr2
    expr1 = expr2               expr1 - expr2
    expr1 > expr2               expr1 * expr2
    expr1 >= expr2              expr1 / expr2
    expr1 < expr2               expr1 % expr2   (modulo)
    expr1 <= expr2

Command
• printf: format and print data
• Escape sequences
– \\  backslash
– \a  beep sound
– \b  backspace
– \f  form feed
– \n  newline
– \r  carriage return
– \t  tab
– \v  vertical tab
• Conversion specifiers
– %d  decimal
– %c  character
– %s  string
– %%  print %

    % printf "%s\n" hello
    hello
    % printf "%s %d\t%s" "Hi There" 15 people
    Hi There 15     people

Command
• return: return a value from a function
• set: set the parameter variables

    #!/bin/sh
    echo the date is $(date)
    set $(date)
    echo The month is $2
    exit 0

Command
• shift: move every parameter down by one, $2 to $1, $3 to $2, and
so on

    #!/bin/sh
    while [ "$1" != "" ]; do
      echo "$1"
      shift
    done
    exit 0

Command
• trap: specify the action to take after receiving a signal

    trap command signal

• Signal       Meaning
  HUP (1)      hang up
  INT (2)      interrupt (Ctrl + C)
  QUIT (3)     quit (Ctrl + \)
  ABRT (6)     abort
  ALRM (14)    alarm
  TERM (15)    terminate

Command

    #!/bin/sh
    trap 'rm -f /tmp/my_tmp_file_$$' INT
    echo creating file /tmp/my_tmp_file_$$
    date > /tmp/my_tmp_file_$$
    echo "press interrupt (CTRL-C) to interrupt …"
    while [ -f /tmp/my_tmp_file_$$ ]; do
      echo File exists
      sleep 1
    done
    echo The file no longer exists
    trap - INT
    echo creating file /tmp/my_tmp_file_$$
    date > /tmp/my_tmp_file_$$
    echo "press interrupt (CTRL-C) to interrupt …"
    while [ -f /tmp/my_tmp_file_$$ ]; do
      echo File exists
      sleep 1
    done
    echo we never get here
    exit 0

Command
Output

    creating file /tmp/my_tmp_file_141
    press interrupt (CTRL-C) to interrupt …
    File exists
    File exists
    File exists
    File exists
    The file no longer exists
    creating file /tmp/my_tmp_file_141
    press interrupt (CTRL-C) to interrupt …
    File exists
    File exists
    File exists
    File exists

The first Ctrl-C runs the trap, which removes the file and lets the loop end. After trap - INT restores the default action, the second Ctrl-C terminates the script, so the final echo is never reached.

Command
• unset: remove a variable or function (give the name without $)

    #!/bin/sh
    foo="Hello World"
    echo $foo
    unset foo
    echo $foo

Pattern Matching
• find: search for files in a directory hierarchy

    find [path] [options] [tests] [actions]

options

    -depth             process a directory's contents before the directory itself
    -follow            follow symbolic links
    -maxdepth N        search at most N levels of directories
    -mount             do not search directories on other file systems

tests

    -atime N           accessed N days ago
    -mtime N           modified N days ago
    -name pattern      name of the file matches pattern
    -newer otherfile   newer than otherfile
    -type X            file type X
    -user username     belongs to username

Pattern Matching
operators

    !     -not    reverse the test
    -a    -and    both tests must be true
    -o    -or     either test must be true

actions

    -exec command      execute command
    -ok command        confirm, then execute command
    -print             print the file name
    -ls                ls -dils

Find files newer than while2, then print them

    % find . -newer while2 -print

Pattern Matching
Find files newer than while2, then print only regular files

    % find . -newer while2 -type f -print

Find files that either start with _ or are newer than while2

    % find . \( -name "_*" -or -newer while2 \) -type f -print

Find files newer than while2, then list them

    % find . -newer while2 -type f -exec ls -l {} \;

Pattern Matching
• grep: print lines matching a pattern (the name comes from the ed command g/re/p, "globally search for a regular expression and print")

    grep [options] PATTERN [FILES]

options

    -c    print a count of matching lines instead of the lines
    -E    interpret PATTERN as an extended regular expression
    -h    suppress the prefixing of filenames
    -i    ignore case
    -l    suppress normal output; list only the names of files with matches
    -v    invert the sense of matching

    % grep in words.txt
    % grep -c in words.txt words2.txt
    % grep -c -v in words.txt words2.txt

Regular Expressions
• A regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules.
• Syntax
– ^      Matches the start of the line
– $      Matches the end of the line
– .      Matches any single character
– []     Matches a single character that is contained within the brackets
– [^]    Matches a single character that is not contained within the brackets
– ()     Defines a "marked subexpression"
– {x,y}  Matches the last "block" at least x and not more than y times

Regular Expressions
• Examples:
– ".at" matches any three-character string like hat, cat or bat
– "[hc]at" matches hat and cat
– "[^b]at" matches all the matched strings from the regex ".at" except bat
– "^[hc]at" matches hat and cat but only at the beginning of a line
– "[hc]at$" matches hat and cat but only at the end of a line

Regular Expressions

    POSIX class    similar to          meaning
    [:upper:]      [A-Z]               uppercase letters
    [:lower:]      [a-z]               lowercase letters
    [:alpha:]      [A-Za-z]            upper- and lowercase letters
    [:alnum:]      [A-Za-z0-9]         digits, upper- and lowercase letters
    [:digit:]      [0-9]               digits
    [:xdigit:]     [0-9A-Fa-f]         hexadecimal digits
    [:punct:]      [.,!?:...]          punctuation
    [:blank:]      [ \t]               space and TAB characters only
    [:space:]      [ \t\n\r\f\v]       blank (whitespace) characters
    [:cntrl:]                          control characters
    [:graph:]      [^ \t\n\r\f\v]      printed characters
    [:print:]      [^\t\n\r\f\v]       printed characters and space

• Example: [[:upper:]ab] should only match the uppercase letters and lowercase 'a' and 'b'.

Regular Expressions
• POSIX modern (extended) regular expressions
• The more modern "extended" regular expressions can often be used with modern Unix utilities by including the command line flag "-E".
• +      Match one or more times
• ?      Match at most once
• *      Match zero or more times
• {n}    Match n times
• {n,}   Match n or more times
• {n,m}  Match n to m times

Regular Expressions
• Search for lines ending with "e"

    % grep 'e$' words2.txt

• Search for words ending with "a"

    % grep 'a[[:blank:]]' words2.txt

• Search for three-letter words starting with "Th"

    % grep 'Th.[[:blank:]]' words2.txt

• Search for lines with 10 consecutive lowercase characters

    % grep -E '[a-z]{10}' words2.txt

Command
• $(command) executes command in a script and substitutes its output
• The old format used ` (backquote), but it can be confused with ' (single quote)

    #!/bin/sh
    echo The current directory is $PWD
    echo the current users are $(who)

Arithmetic Expansion
• Use $((…)) instead of expr to evaluate an arithmetic expression

    #!/bin/sh
    x=0
    while [ "$x" -ne 10 ]; do
      echo $x
      x=$(($x+1))
    done
    exit 0

Parameter Expansion
• Parameter assignment

    foo=fred
    echo $foo

    ${param:-default}   use default if param is unset or null
    ${#param}           length of param
    ${param%word}       remove smallest suffix pattern
    ${param%%word}      remove largest suffix pattern
    ${param#word}       remove smallest prefix pattern
    ${param##word}      remove largest prefix pattern

    #!/bin/sh
    for i in 1 2
    do
      my_secret_process $i_tmp
    done

  Gives the result "my_secret_process: too few arguments", because the shell looks up a variable named i_tmp, which is empty. Braces mark where the name ends:

    #!/bin/sh
    for i in 1 2
    do
      my_secret_process ${i}_tmp
    done

Parameter Expansion

    #!/bin/sh
    unset foo
    echo ${foo:-bar}
    foo=fud
    echo ${foo:-bar}
    foo=/usr/bin/X11/startx
    echo ${foo#*/}
    echo ${foo##*/}
    bar=/usr/local/etc/local/networks
    echo ${bar%local*}
    echo ${bar%%local*}
    exit 0

Output

    bar
    fud
    usr/bin/X11/startx
    startx
    /usr/local/etc/
    /usr/

Here Documents
• A here document is a special-purpose code block that starts with <<

    #!/bin/sh
    cat <<!FUNKY!
    hello
    this is a here
    document
    !FUNKY!
    exit 0

    #!/bin/sh
    ed a_text_file <<HERE
    3
    d
    .,\$s/is/was/
    w
    q
    HERE
    exit 0

a_text_file

    That is line 1
    That is line 2
    That is line 3
    That is line 4

Output

    That is line 1
    That is line 2
    That was line 4

Debug

    Command line     set option                     Effect
    sh -n <script>   set -o noexec    or set -n     check syntax only
    sh -v <script>   set -o verbose   or set -v     echo commands before running them
    sh -x <script>   set -o xtrace    or set -x     echo commands after processing them
    sh -u <script>   set -o nounset   or set -u     give an error if an undefined variable is used

• Turn tracing on and off inside a script

    set -o xtrace
    set +o xtrace

• Print a critical variable when the script exits

    trap 'echo Exiting: critical variable = $critical_variable' EXIT
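The backslash in front of $ in the ed example is needed because the shell expands variables inside a here document whose delimiter is unquoted. The following minimal sketch, assuming a POSIX-compatible sh, contrasts an unquoted and a quoted delimiter and then shows the nounset option from the Debug slide. The variable names (name, expanded, literal, status, undefined_variable) are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Illustrative sketch; all variable names here are hypothetical.
name=world

# Unquoted delimiter: $name is expanded inside the here document.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the text is passed through literally, so no
# backslash is needed in front of the dollar sign.
literal=$(cat <<'EOF'
hello $name
EOF
)

echo "$expanded"
echo "$literal"

# nounset turns a reference to an undefined variable into an error.
# Running the check in a subshell keeps the failure from ending this script.
if ( set -o nounset; : "$undefined_variable" ) 2>/dev/null; then
  status=ok
else
  status=error
fi
echo "nounset check: $status"
```

The first echo should print hello world and the second hello $name. Quoting the delimiter is usually simpler than escaping every $ when the here document holds editor or sed commands.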