CSE 305 Introduction to Programming Languages
Lecture 11 – Implementation of Lexer/Parser and Scripting
CSE @ SUNY-Buffalo
Zhi Yang
Courtesy of Professor P.N. Hilfinger
Courtesy of Yao-Yuan Chuang
Courtesy of Dr. Sagiv

Notice Board
•  First, homework4 is due on July 4, 2013 (Thursday).
•  Second, homework5 will be posted.

Our objective
•  The first objective of our class is to comprehend a new programming language within a very short time period. Because this ability shortens your learning curve, you will be able to work with the language with real insight.
•  The second objective is to engineer your own language!

Review what we've learnt and see future
(Diagram labels, in the order they were extracted:)
eg: Egyptian Number System; Complement Number
eg: Abacus Number System
eg: Gate system, including different underlying devices
1st Generation language: Machine Code
eg: MIPS
2nd Generation language: Assembly Code
eg: Fortran
Regular Expression
What's next?
3rd Generation Language: Macro function
Basic Calculation System, Lexer, Compiler System, Virtual Machine, Parser, Push Down Automata, Type Checking, Context-Free Grammar, Lambda Calculus Theory

A family tree of languages
Cobol, <Fortran>, BASIC, Algol 60, <LISP>, PL/1, Simula, <ML>, Algol 68, C, Pascal, <Perl>, <C++>, Modula 3, Dylan, Ada, <Java>, <C#>, <Scheme>, <Smalltalk>, <Ruby>, <Python>, <Haskell>, <Prolog>, <JavaScript>

The Front End
stream of characters → Lexer → stream of tokens → Parser → abstract syntax → Type Checker
•  Lexical Analysis: Create sequence of tokens from characters
•  Syntax Analysis: Create abstract syntax tree from sequence of tokens
•  Type Checking: Check program for well-formedness constraints

Lexical Analysis
•  Lexical Analysis: Breaks stream of ASCII characters (source) into tokens
•  Token: An atomic unit of program syntax
   –  i.e., a word as opposed to a sentence
•  Tokens and their types:

Characters Recognized     Type      Token
foo, x, listcount         ID        ID(foo), ID(x), ...
10.45, 3.14, -2.1         REAL      REAL(10.45), REAL(3.14), ...
;                         SEMI      SEMI
(                         LPAREN    LPAREN
50, 100                   NUM       NUM(50), NUM(100)
if                        IF        IF

Ambiguous Token Rule Sets
•  We resolve ambiguities using two conventions:
   –  Longest match: The regular expression that matches the longest string takes precedence.
   –  Rule Priority: The regular expressions identifying tokens are written down in sequence. If two regular expressions match the same (longest) string, the first regular expression in the sequence takes precedence.

Ambiguous Token Rule Sets
•  Example:
   –  Identifier tokens: [a-z] [a-z0-9]*
   –  Sample keyword tokens: if, then, ...
•  How do we tokenize:
   –  foobar ==> ID(foobar) or ID(foo) ID(bar)
      •  use longest match to disambiguate
   –  if ==> ID(if) or IF
      •  keyword rules have higher priority than the identifier rule

Lexer Implementation
Implementation Options:
1.  Write Lexer from scratch
    –  Boring and error-prone
2.  Use Lexical Analyzer Generator
    –  Quick and easy
ml-lex is a lexical analyzer generator for ML. lex and flex are lexical analyzer generators for C.

Where are we?
Previously, we discussed what it means to have a derivation (CFG) of a sentence according to a grammar, and how one can use a derivation (once found) to guide the application (PDA) of semantic actions that compute a semantic value (i.e., translation) of a sentence (where "sentence" can mean an entire program).
Two classes based on which non-terminal is examined:
•  top-down (leftmost derivation)
•  bottom-up (rightmost derivation)

Look ahead? What does it mean?
There are algorithms that can be used to parse the language defined by an arbitrary CFG. However, in the worst case, the algorithms take O(n^3) time, where n is the number of tokens. That is too slow!

LL(1)
•  first L: scan the input left-to-right
•  second L: do a leftmost derivation
•  (1): one token of look-ahead

LALR(1)
•  LA: means "look-ahead"; this has nothing to do with the number of tokens the parser can look at before it chooses what to do. It is a technical term that only means something when you study how LR parsers work.
•  L: scan the input left-to-right
•  R: do a rightmost derivation in reverse
•  (1): one token of look-ahead

What is LL/LR parser
•  If we constrain the grammar somewhat, we can always parse in linear time. This is good!
•  Linear-time parsing
   –  LL parsers
      •  Recognize LL grammar
      •  Use a top-down strategy
   –  LR parsers
      •  Recognize LR grammar
      •  Use a bottom-up strategy
•  LL(n): Left to right, Leftmost derivation, look ahead at most n symbols.
•  LR(n): Left to right, Rightmost derivation (in reverse), look ahead at most n symbols.

LL(1) Grammars
A context-free grammar whose Predict sets are always disjoint (for the same non-terminal) is said to be LL(1). LL(1) grammars are ideally suited for top-down parsing because it is always possible to correctly predict the expansion of any non-terminal. No backup is ever needed.
In short, Predict(A → X1 X2 ... Xn) is the set of terminals that can be the next input token when A → X1 X2 ... Xn is the correct rule to apply.

Example

Production      Predict Set
S → A a         {b, d, a}
A → B D         {b, d, a}
B → b           {b}
B → λ           {d, a}
D → d           {d}
D → λ           {a}

Since the predict sets of both B productions and both D productions are disjoint, this grammar is LL(1).

Formally, let
    First(X1...Xn) = {a in Vt | X1...Xn ⇒* a...}
    Follow(A) = {a in Vt | S ⇒+ ...Aa...}

    Predict(A → X1...Xn) =
        If X1...Xn ⇒* λ
        Then First(X1...Xn) U Follow(A)
        Else First(X1...Xn)

If some CFG, G, has the property that for all pairs of distinct productions with the same lefthand side,
    A → X1...Xn and A → Y1...Ym
it is the case that
    Predict(A → X1...Xn) ∩ Predict(A → Y1...Ym) = φ
then G is LL(1).

LL(1) grammars are easy to parse in a top-down manner since predictions are always correct.

© CS 536 Fall 2012
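As a rough illustration of the definitions of First, Follow, and Predict, the Python sketch below computes all three for the example grammar (S → A a, A → B D, B → b | λ, D → d | λ) by iterating to a fixed point, then checks the disjointness condition. The function names (first_of, analyze, is_ll1) and the grammar encoding are assumptions made for this sketch, not part of the lecture; an empty list stands for a λ righthand side.

```python
# Example grammar; an empty righthand side stands for λ.
GRAMMAR = [
    ("S", ["A", "a"]),
    ("A", ["B", "D"]),
    ("B", ["b"]),
    ("B", []),
    ("D", ["d"]),
    ("D", []),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of(symbols, first, nullable):
    """Return (First set of the symbol string, whether it can derive λ)."""
    result = set()
    for sym in symbols:
        if sym not in NONTERMINALS:       # a terminal starts the string
            result.add(sym)
            return result, False
        result |= first[sym]
        if sym not in nullable:
            return result, False
    return result, True                   # every symbol could vanish


def analyze(grammar):
    """Compute First, Follow, and one Predict set per production."""
    first = {n: set() for n in NONTERMINALS}
    follow = {n: set() for n in NONTERMINALS}
    nullable = set()
    changed = True
    while changed:                        # repeat until nothing grows
        changed = False
        for lhs, rhs in grammar:
            f, derives_empty = first_of(rhs, first, nullable)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if derives_empty and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    f, rest_empty = first_of(rhs[i + 1:], first, nullable)
                    new = f | (follow[lhs] if rest_empty else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    predict = []
    for lhs, rhs in grammar:
        f, derives_empty = first_of(rhs, first, nullable)
        predict.append((f | follow[lhs]) if derives_empty else f)
    return first, follow, predict


def is_ll1(grammar, predict):
    """True if productions sharing a lefthand side have disjoint Predict sets."""
    for i, (lhs_i, _) in enumerate(grammar):
        for j in range(i + 1, len(grammar)):
            if grammar[j][0] == lhs_i and predict[i] & predict[j]:
                return False
    return True


first, follow, predict = analyze(GRAMMAR)
for (lhs, rhs), p in zip(GRAMMAR, predict):
    print(lhs, "->", " ".join(rhs) or "λ", sorted(p))
print("LL(1):", is_ll1(GRAMMAR, predict))
```

The sketch follows the definition of Follow given here, which uses no end-of-input marker, so Follow(S) stays empty for this grammar.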
Recursive Descent Parsers
An early implementation of top-down (LL(1)) parsing was recursive descent. A parser was organized as a set of parsing procedures, one for each non-terminal. Each parsing procedure was responsible for parsing a sequence of tokens derivable from its non-terminal.
For example, a parsing procedure, A, when called, would call the scanner and match a token sequence derivable from A.
Starting with the start symbol's parsing procedure, we would then match the entire input, which must be derivable from the start symbol.
This approach is called recursive descent because the parsing procedures were typically recursive, and they descended down the input's parse tree (as top-down parsers always do).

Building A Recursive Descent Parser
We start with a procedure Match, that matches the current input token against a predicted token:

void Match(Terminal a) {
    if (a == currentToken)
        currentToken = Scanner();
    else SyntaxError();
}

To build a parsing procedure for a non-terminal A, we look at all productions with A on the lefthand side:
    A → X1...Xn | A → Y1...Ym | ...
We use predict sets to decide which production to match (LL(1) grammars always have disjoint predict sets). We match a production's righthand side by calling Match to match terminals, and calling parsing procedures to match non-terminals.

The general form of a parsing procedure for
    A → X1...Xn | A → Y1...Ym | ...
is

void A() {
    if (currentToken in Predict(A→X1...Xn))
        for (i=1; i<=n; i++)
            if (X[i] is a terminal)
                Match(X[i]);
            else X[i]();
    else
    if (currentToken in Predict(A→Y1...Ym))
        for (i=1; i<=m; i++)
            if (Y[i] is a terminal)
                Match(Y[i]);
            else Y[i]();
    else
        /* handle other A → ... productions the same way */
}
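To make the general form concrete, here is a minimal recursive descent sketch in Python for the small example grammar S → A a, A → B D, B → b | λ, D → d | λ. It is only an illustration under stated assumptions: the Parser class, the ParseError exception, the "$" end marker, and the use of single characters as tokens are all choices made for this sketch, and the predict sets are hard-coded from the example table rather than computed.

```python
class ParseError(Exception):
    """Raised when the current token cannot be matched or predicted."""


class Parser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker, not part of the grammar.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, terminal):
        # Same role as Match(): consume the token or report an error.
        if self.current == terminal:
            self.pos += 1
        else:
            raise ParseError(f"expected {terminal!r}, saw {self.current!r}")

    def S(self):
        if self.current in {"b", "d", "a"}:      # Predict(S -> A a)
            self.A()
            self.match("a")
        else:
            raise ParseError(f"S cannot start with {self.current!r}")

    def A(self):
        if self.current in {"b", "d", "a"}:      # Predict(A -> B D)
            self.B()
            self.D()
        else:
            raise ParseError(f"A cannot start with {self.current!r}")

    def B(self):
        if self.current == "b":                  # Predict(B -> b)
            self.match("b")
        elif self.current in {"d", "a"}:         # Predict(B -> λ)
            pass
        else:
            raise ParseError(f"B cannot start with {self.current!r}")

    def D(self):
        if self.current == "d":                  # Predict(D -> d)
            self.match("d")
        elif self.current == "a":                # Predict(D -> λ)
            pass
        else:
            raise ParseError(f"D cannot start with {self.current!r}")


def accepts(text):
    """True if text (one character per token) is derivable from S."""
    parser = Parser(text)
    try:
        parser.S()
        parser.match("$")                        # all input must be consumed
        return True
    except ParseError:
        return False


for sample in ["a", "ba", "da", "bda", "db", "bb"]:
    print(repr(sample), accepts(sample))
```

Each non-terminal becomes one method, and each branch tests the current token against the predict set of one production, so no backtracking is needed.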
LL(1) Parse Tables
An LL(1) parse table, T, is a two-dimensional array. Entries in T are production numbers or blank (error) entries.
T is indexed by:
•  A, a non-terminal. A is the non-terminal we want to expand.
•  CT, the current token that is to be matched.
•  T[A][CT] = A → X1...Xn
       if CT is in Predict(A → X1...Xn)
   T[A][CT] = error
       if CT predicts no production with A as its lefthand side
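The rule for filling T can be sketched in a few lines of Python. The sketch below assumes the predict sets are already known (here, those of the earlier example grammar S → A a, A → B D, B → b | λ, D → d | λ), numbers the productions from 1, and stores the table as a dictionary keyed by (non-terminal, token). A missing key plays the role of a blank (error) entry. The name build_table is hypothetical.

```python
def build_table(productions, predict_sets):
    """Map (non-terminal, token) to a production number, starting at 1."""
    table = {}
    numbered = enumerate(zip(productions, predict_sets), start=1)
    for number, ((lhs, _rhs), predict) in numbered:
        for token in predict:
            if (lhs, token) in table:
                # Two productions predicted by one token: not LL(1).
                raise ValueError(f"conflict at T[{lhs}][{token}]")
            table[(lhs, token)] = number
    return table


PRODUCTIONS = [
    ("S", ["A", "a"]),   # 1
    ("A", ["B", "D"]),   # 2
    ("B", ["b"]),        # 3
    ("B", []),           # 4 (λ)
    ("D", ["d"]),        # 5
    ("D", []),           # 6 (λ)
]
PREDICT = [{"b", "d", "a"}, {"b", "d", "a"}, {"b"}, {"d", "a"}, {"d"}, {"a"}]

T = build_table(PRODUCTIONS, PREDICT)
print(T[("B", "d")])          # production chosen for B when the token is d
print(T.get(("D", "b")))      # None marks a blank (error) entry
```

Raising an error on a duplicate entry doubles as an LL(1) check, since a conflict means two predict sets for the same non-terminal overlap.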
CSX-lite Example

     Production                   Predict Set
1    Prog → { Stmts } Eof         {
2    Stmts → Stmt Stmts           id if
3    Stmts → λ                    }
4    Stmt → id = Expr ;           id
5    Stmt → if ( Expr ) Stmt      if
6    Expr → id Etail              id
7    Etail → + Expr               +
8    Etail → - Expr               -
9    Etail → λ                    ) ;

LL(1) Parse Table

        {    }    if   (    )    id   =    +    -    ;    eof
Prog    1
Stmts        3    2              2
Stmt              5              4
Expr                             6
Etail                       9              7    8    9
Example of LL(1) Parsing
We'll parse
    { a = b + c; } Eof
We start by placing Prog (the start symbol) on the parse stack. At each step, a non-terminal on top of the stack is replaced by the righthand side that the parse table selects for the current token, and a terminal on top of the stack is matched against the current token. The top of the stack is shown leftmost.

Parse Stack                     Remaining Input
Prog                            { a = b + c; } Eof
{ Stmts } Eof                   { a = b + c; } Eof
Stmts } Eof                     a = b + c; } Eof
Stmt Stmts } Eof                a = b + c; } Eof
id = Expr ; Stmts } Eof         a = b + c; } Eof
= Expr ; Stmts } Eof            = b + c; } Eof
Expr ; Stmts } Eof              b + c; } Eof
id Etail ; Stmts } Eof          b + c; } Eof
Etail ; Stmts } Eof             + c; } Eof
+ Expr ; Stmts } Eof            + c; } Eof
Expr ; Stmts } Eof              c; } Eof
id Etail ; Stmts } Eof          c; } Eof
Etail ; Stmts } Eof             ; } Eof
; Stmts } Eof                   ; } Eof
Stmts } Eof                     } Eof
} Eof                           } Eof
Eof                             Eof
Done! All input matched
Syntax Errors in LL(1) Parsing
In LL(1) parsing, syntax errors are automatically detected as soon as the first illegal token is seen.
How? When an illegal token is seen by the parser, either it fetches an error entry from the LL(1) parse table or it fails to match an expected token.
Let's see how the following illegal CSX-lite program is parsed:
    { b + c = a; } Eof
(Where should the first syntax error be detected?)

Example

Parse Stack                     Remaining Input
Prog                            { b + c = a; } Eof
{ Stmts } Eof                   { b + c = a; } Eof
Stmts } Eof                     b + c = a; } Eof
Stmt Stmts } Eof                b + c = a; } Eof
id = Expr ; Stmts } Eof         b + c = a; } Eof
= Expr ; Stmts } Eof            + c = a; } Eof
Current token (+) fails to match expected token (=)!
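Both CSX-lite traces can be reproduced with a small table-driven driver. The Python sketch below hard-codes the nine CSX-lite productions and the parse table entries, treats every identifier as the token id, and keeps the parse stack in a list whose last element is the top. The names parse and ParseError, and the choice to return the list of production numbers applied, are assumptions of this sketch rather than anything prescribed by the lecture.

```python
class ParseError(Exception):
    """Raised at the first illegal token."""


PRODUCTIONS = {
    1: ("Prog", ["{", "Stmts", "}", "Eof"]),
    2: ("Stmts", ["Stmt", "Stmts"]),
    3: ("Stmts", []),
    4: ("Stmt", ["id", "=", "Expr", ";"]),
    5: ("Stmt", ["if", "(", "Expr", ")", "Stmt"]),
    6: ("Expr", ["id", "Etail"]),
    7: ("Etail", ["+", "Expr"]),
    8: ("Etail", ["-", "Expr"]),
    9: ("Etail", []),
}

# T[A][CT]; a missing key is a blank (error) entry.
TABLE = {
    ("Prog", "{"): 1,
    ("Stmts", "id"): 2, ("Stmts", "if"): 2, ("Stmts", "}"): 3,
    ("Stmt", "id"): 4, ("Stmt", "if"): 5,
    ("Expr", "id"): 6,
    ("Etail", "+"): 7, ("Etail", "-"): 8,
    ("Etail", ")"): 9, ("Etail", ";"): 9,
}

NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS.values()}


def parse(tokens):
    """Return the production numbers applied, in order."""
    stack = ["Prog"]                  # top of stack is the end of the list
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        current = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            number = TABLE.get((top, current))
            if number is None:        # error entry in the parse table
                raise ParseError(f"no production for {top} on {current!r}")
            applied.append(number)
            # Push the righthand side so its first symbol ends up on top.
            stack.extend(reversed(PRODUCTIONS[number][1]))
        elif top == current:          # terminal matches the current token
            pos += 1
        else:
            raise ParseError(
                f"current token {current!r} fails to match "
                f"expected token {top!r}"
            )
    if pos != len(tokens):
        raise ParseError("input remains after the stack emptied")
    return applied


good = ["{", "id", "=", "id", "+", "id", ";", "}", "Eof"]   # { a = b + c; } Eof
bad = ["{", "id", "+", "id", "=", "id", ";", "}", "Eof"]    # { b + c = a; } Eof

print(parse(good))
try:
    parse(bad)
except ParseError as err:
    print("syntax error:", err)
```

For the illegal program the driver stops at the same point as the hand trace: with = on top of the stack and + as the current token.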
XPath
•  A Language for Locating Nodes in XML Documents
•  XPath expressions are written in a syntax that resembles
paths in file systems
•  The list of nodes located by an XPath expression is
called a Nodelist
•  XPath is used in XSL and in XQuery (a query language
for XML)
•  W3Schools has an XPath tutorial
•  XPath includes
–  Axis navigation
–  Conditions
–  Functions
XML Schema
•  An XML Schema describes the structure of an XML document.
•  In this tutorial you will learn how to create XML Schemas, why XML Schemas are more powerful than DTDs, and how to use XML Schema in your application.

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

An XML document

<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="UK">
    <title>Dark Side of the Moon</title>
    <artist>Pink Floyd</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Space Oddity</title>
    <artist>David Bowie</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Aretha: Lady Soul</title>
    <artist>Aretha Franklin</artist>
    <price>9.90</price>
  </cd>
</catalog>
catalog.xml (tree view)
The catalog root has three cd children. Each cd has a country attribute (UK, UK, USA) and title, artist, and price children.
The Main Idea in the Syntax of XPath Expressions
•  / at the beginning of an XPath expression represents the root of the document
•  / between element names represents a parent-child relationship
•  // represents an ancestor-descendant relationship
•  @ marks an attribute
•  [condition] specifies a condition
XPath expressions on catalog.xml
•  /catalog
   –  Getting the root element of the document
•  /catalog/cd
   –  Finding child nodes
•  /catalog/cd/price
   –  Finding descendant nodes
•  /catalog/cd[price<10]
   –  Condition on elements
•  /catalog//title
•  //title
   –  // represents any directed path in the document
•  /catalog/cd/*
   –  * represents any element name in the document
•  /catalog/cd[1]
•  /catalog/cd[last()]
   –  Position based condition
•  /catalog/cd[@country="UK"]
   –  @ marks attributes
•  /catalog/cd/@country
   –  @ marks attributes
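Several of these expressions can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. In the sketch below the paths are written relative to the catalog root element (so cd stands in for /catalog/cd), and two expressions that the subset does not cover, the numeric comparison [price<10] and the attribute step /@country, are emulated with ordinary Python. This is an approximation for illustration, not a full XPath engine.

```python
import xml.etree.ElementTree as ET

CATALOG = """\
<catalog>
  <cd country="UK">
    <title>Dark Side of the Moon</title>
    <artist>Pink Floyd</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Space Oddity</title>
    <artist>David Bowie</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Aretha: Lady Soul</title>
    <artist>Aretha Franklin</artist>
    <price>9.90</price>
  </cd>
</catalog>
"""

root = ET.fromstring(CATALOG)          # root is the <catalog> element

# /catalog/cd/price : child steps
prices = [p.text for p in root.findall("cd/price")]

# //title : any depth below the root
titles = [t.text for t in root.findall(".//title")]

# /catalog/cd[1] and /catalog/cd[last()] : position based conditions
first_cd = root.find("cd[1]").findtext("title")
last_cd = root.find("cd[last()]").findtext("title")

# /catalog/cd[@country="UK"] : condition on an attribute
uk_titles = [cd.findtext("title") for cd in root.findall("cd[@country='UK']")]

# /catalog/cd/@country : emulated by reading the attribute directly
countries = [cd.get("country") for cd in root.findall("cd")]

# /catalog/cd[price<10] : emulated with a Python filter
cheap = [cd.findtext("title") for cd in root.findall("cd")
         if float(cd.findtext("price")) < 10]

print(prices, titles, first_cd, last_cd, uk_titles, countries, cheap, sep="\n")
```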
Relative Navigation Using Axes
•  Starts with the current node and not with the root (/)
•  A . marks the current node (e.g., ./title)
•  A .. marks the parent node (e.g., title/../*)
•  There are also other axes, e.g., child, descendant, ancestor, parent, following-sibling, etc.
Functions
•  Many functions are included in XPath
•  Some examples:
   –  count() – returns the number of nodes in a nodelist
   –  last() – returns the position of the last node in the current nodelist (i.e., the size of the nodelist)
   –  name() – returns the name of a node
   –  position() – returns the position of the node in the nodelist
Additional Examples of
XPath Expressions
These examples use element names
that are not necessarily from the XML
document that was shown previously
Examples of XPath Expressions
•  para
–  Selects the para children elements of the context
node
•  *
–  Selects all element children of the context node
•  text()
–  Selects all text node children of the context node
•  @name
–  Selects the name attribute of the context node
More Examples of
XPath Expressions
•  @*
   –  Selects all the attributes of the context node
•  para[1]
   –  Selects the first para child of the context node
•  para[last()]
   –  Selects the last para child of the context node
•  */para
   –  Selects all para grandchildren of the context node
More Examples of
XPath Expressions
•  /doc/chapter[5]/section[2]
–  Selects the second section of the fifth chapter of the
doc
•  chapter//para
–  Selects the para element descendants of the chapter
element children of the context node
•  //para
–  Selects all the para descendants of the document root
and thus selects all para elements in the same
document as the context node
More Examples of
XPath Expressions
•  //olist/item
–  Selects all the item elements that have an olist parent
and are in the same document as the context node
•  .
–  Selects the context node
•  .//para
–  Selects the para descendants of the context node
•  ..
–  Selects the parent of the context node
More Examples of
XPath Expressions
•  ../@lang
   –  Selects the lang attribute of the parent of the context node
•  para[@type="warning"]
   –  Selects the para children of the context node that have a type attribute with value warning
•  chapter[title]
   –  Selects the chapter children of the context node that have one or more title children

More Examples of
XPath Expressions
•  para[@type="warning"][5]
   –  Selects the fifth para child among the children of the context node that have a type attribute with value warning
•  para[5][@type="warning"]
   –  Selects the fifth para child of the context node if that child has a type attribute with value warning
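The difference between the last two expressions is only the order in which the predicates are applied. The Python sketch below models that order on a plain list standing in for the para children of a hypothetical context node. It does not call an XPath engine, and the sample data and function names are invented for illustration.

```python
# Seven para children of a made-up context node; only the type matters here.
TYPES = ["warning", "note", "warning", "warning", "note", "warning", "warning"]
PARAS = [{"index": i, "type": t} for i, t in enumerate(TYPES, start=1)]


def filter_then_position(nodes, type_value, k):
    """Model of para[@type="..."][k]: filter first, then take the k-th survivor."""
    matching = [n for n in nodes if n["type"] == type_value]
    return matching[k - 1:k]


def position_then_filter(nodes, type_value, k):
    """Model of para[k][@type="..."]: take the k-th child, keep it if it matches."""
    return [n for n in nodes[k - 1:k] if n["type"] == type_value]


# The fifth warning is the seventh child overall.
print(filter_then_position(PARAS, "warning", 5))
# The fifth child is a note, so nothing is selected.
print(position_then_filter(PARAS, "warning", 5))
```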
More Examples of
XPath Expressions
•  chapter[title="Introduction"]
   –  Selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
•  employee[@secretary and @assistant]
   –  Selects employee children of the context node that have both a secretary attribute and an assistant attribute

More Examples of
XPath Expressions
•  /university/department/course
   –  This XPath expression matches any path that starts at the root, which is a university element, passes through a department element and ends in a course element
•  ./department/course[@year=2002]
   –  This XPath expression matches any path that starts at the current element, continues to a child which is a department element and ends at a course element with a year attribute that is equal to 2002
Location Paths
•  The previous examples are abbreviations of location paths
   –  See XPath tutorial in W3Schools
   –  For example, // is short for /descendant-or-self::node()/.
•  //para is short for
   /descendant-or-self::node()/child::para
What Is A Scrip0ng Language •  Modern scrip0ng languages have two principal sets of ancestors. –  command interpreters or shells of tradi0onal batch and terminal (command-­‐line) compu0ng •  IBM s JCL, MS-­‐DOS command interpreter, Unix sh and csh –  various tools for text processing and report genera0on •  IBM s RPG, and Unix s sed and awk. •  From these evolved –  Rexx, IBM s Restructured Extended Executor, which dates from 1979 –  Perl, originally devised by Larry Wall in the late 1980s, and now the most widelyused general purpose scrip0ng language. –  Other general purpose scrip0ng languages include Tcl ( 0ckle ), Python, Ruby, VBScript (for Windows) and AppleScript (for the Mac) What Is A Scrip0ng Language •  Scrip0ng on Microsov pla€orms –  As in several other aspects of compu0ng, Microsov tends to rely on internally developed technology in the area of scrip0ng languages –  Most scrip0ng applica0ons are based on VBScript -­‐ dialect of Visual Basic –  Microsov has also developed a very general scrip0ng interface (Windows Script) that is implemented uniformly by the opera0ng system, the web server, and the Internet Explorer browser –  A Windows Script implementa0on of JScript, the company s version of JavaScript, comes pre-­‐installed on Windows machines, but languages like Perl and Python can be installed as well, and used to drive the same interface. 
–  Many other Microsov applica0ons use VBScript as an extension language, but for these the implementa0on framework (Visual Basic for Applica0ons [VBA]) does not make it easy to use other languages instead What Is A Scrip0ng Language •  Scrip0ng on Microsov pla€orms –  Given Microsov s share of the desktop compu0ng market, VBScript is one of the most widely used scrip0ng languages •  It is almost never used on other pla€orms –  Perl, Tcl, Python, PHP, and others see significant use on Windows •  For server-­‐side web scrip0ng, PHP currently predominates: as of February 2005, some 69% of the 59 million Internet web sites surveyed by Netcrav LTD were running the open source Apache web server, and of them most of the ones with ac0ve content were using PHP •  Microsov s Internet Informa0on Server (IIS) was second to Apache, with 21% of the sites, and many of those had PHP installed as well. •  For client-­‐side scrip0ng, where Internet Explorer controls about 70% of the browser market, most web site administrators need their content to be visible to the other 30% •  Explorer supports JavaScript (JScript), but other browsers do not support VBScript Shell Scripts •  A shell script is just a file containing shell commands, but with a few extras: –  The first line of a shell script should be a comment of the following form: ! !
#!/bin/sh
for a Bourne shell script. Bourne shell scripts are the most common, since C Shell scripts have buggy features. –  A shell script must be readable and executable. ! !
chmod +rx scriptname
   –  As with any command, a shell script has to be in your path to be executed.
      •  If . is not in your PATH, you must specify ./scriptname instead of just scriptname

My First Script
•  I want to type "ls -a" in a Unix terminal
•  Is it the same thing if I write a script or a shell file?
#!/bin/sh
ls -a
•  Yes!

What is Shell?
•  The shell is the interface between the end user and the Linux system, similar to the command prompt in Windows
•  On many Linux systems bash is installed as /bin/sh
•  Check the version
% /bin/sh --version
(Figure: the kernel surrounded by csh, bash, the X window system, and other programs)

Pipe and Redirection
•  Redirection (< or >)
% ls -l > lsoutput.txt (save output to lsoutput.txt)
% ps >> lsoutput.txt (append to lsoutput.txt)
% more < killout.txt (use killout.txt as standard input to more)
% kill -l 1234 > killouterr.txt 2>&1 (redirect standard output and standard error to the same file)
% kill -l 1234 > /dev/null 2>&1 (discard standard output and standard error)
•  Pipe (|)
   –  Processes are executed concurrently
% ps | sort | more
% ps -xo comm | sort | uniq | grep -v sh | more
% cat mydata.txt | sort | uniq > mydata.txt
(generates an empty file, because the shell truncates mydata.txt before cat reads it!)
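The empty-file pitfall above is usually avoided by writing to a temporary file first. A minimal sketch, assuming mktemp is available on the system; the helper name dedupe_in_place is hypothetical and not part of the slides:

```shell
#!/bin/sh
# dedupe_in_place FILE: hypothetical helper that sorts FILE and drops
# duplicate lines without truncating FILE before it has been read.
dedupe_in_place() {
    tmp=$(mktemp) || return 1
    # The pipeline writes to a scratch file, so FILE is still intact
    # while sort is reading it.
    if sort "$1" | uniq > "$tmp"; then
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}
```

Calling dedupe_in_place mydata.txt then replaces the contents of mydata.txt with its sorted, de-duplicated lines.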
Shell as a Language
•  We can write a script containing many shell commands
•  Interactive program:
   –  grep files for the string POSIX and print them
% for file in *
> do
> if grep -l POSIX $file
> then
> more $file
> fi
> done
Posix
There is a file with POSIX in it
   –  * is a wildcard
% more `grep -l POSIX *`
% more $(grep -l POSIX *)
% grep -l POSIX * | more
Writing a Script
•  Use a text editor to create the file first
#!/bin/sh
# first
# this file looks for the files containing POSIX
# and prints their names
for file in *
do
    if grep -q POSIX "$file"
    then
        echo "$file"
    fi
done
exit 0
•  exit sets the exit code; 0 means success
% /bin/sh first
% chmod +x first
% ./first (plain first works only if . is included in PATH)

Syntax
•  Variables
•  Conditions
•  Control
•  Lists
•  Functions
•  Shell Commands
•  Result
•  Document

Variables
•  Variables do not need to be declared before use; names are case-sensitive (e.g. foo, FOO, and Foo are different variables)
•  Prefix the name with $ to read the stored value
% salutation=Hello
% echo $salutation
Hello
% salutation=7+5
% echo $salutation
7+5
% salutation="yes dear"
% echo $salutation
yes dear
% read salutation
Hola!
% echo $salutation
Hola!
Quoting
•  Edit a vartest.sh file
#!/bin/sh
myvar="Hi there"
echo $myvar
echo "$myvar"
echo '$myvar'
echo \$myvar
echo Enter some text
read myvar
echo '$myvar' now equals $myvar
exit 0

Output
Hi there
Hi there
$myvar
$myvar
Enter some text
Hello world
$myvar now equals Hello world
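The four quoting forms can be compared side by side. The following is a small sketch rather than part of the original slides; the function name show_quotes and the value with three spaces are chosen only to make word splitting visible:

```shell
#!/bin/sh
# show_quotes: hypothetical demo of how each quoting form treats a variable.
show_quotes() {
    myvar="Hi   there"   # the value contains three spaces
    echo $myvar          # unquoted: word splitting collapses the spaces
    echo "$myvar"        # double quotes: expanded, spaces preserved
    echo '$myvar'        # single quotes: no expansion at all
    echo \$myvar         # backslash: the $ is taken literally
}
```

Only the double-quoted form keeps the value exactly as it was stored, which is why variables are normally written as "$myvar" in tests and command arguments.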
Environment Variables
•  $HOME   home directory
•  $PATH   path
•  $PS1    (normally %)
•  $PS2    (normally >)
•  $$      process id of the script
•  $#      number of input parameters
•  $0      name of the script file
•  $IFS    separation character (white space)
•  Use env to check the value

Parameter
% IFS=''
% set foo bar bam
% echo $@
foo bar bam
% echo $*
foo bar bam
% unset IFS
% echo $*
foo bar bam
•  IFS does not matter here

Edit file try_var
#!/bin/sh
salutation="Hello"
echo $salutation
echo "The program $0 is now running"
echo "The parameter list was $*"
echo "The second parameter was $2"
echo "The first parameter was $1"
echo "The user's home directory is $HOME"
echo "Please enter a new greeting"
read salutation
echo $salutation
echo "The script is now complete"
exit 0
% ./try_var foo bar baz
Hello
The program ./try_var is now running
The parameter list was foo bar baz
The second parameter was bar
The first parameter was foo
The user's home directory is /home/ychuang
Please enter a new greeting
Hola
Hola
The script is now complete
Condition
•  test or [
•  Note: the arguments of [ must be separated by spaces
if test -f fred.c
then
...
fi

if [ -f fred.c ]
then
...
fi

if [ -f fred.c ]; then
...
fi

Arithmetic comparison
expression1 -eq expression2
expression1 -ne expression2
expression1 -gt expression2
expression1 -ge expression2
expression1 -lt expression2
expression1 -le expression2
! expression

String comparison
string1 = string2
string1 != string2
-n string   (if not empty string)
-z string   (if empty string)

File conditions
-d file   if directory
-e file   if exist
-f file   if file
-g file   if set-group-id
-r file   if readable
-s file   if size > 0
-u file   if set-user-id
-w file   if writable
-x file   if executable
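The file conditions above can be combined in an if / elif chain. A minimal sketch, with a hypothetical function name describe_path that does not appear in the slides:

```shell
#!/bin/sh
# describe_path PATH: hypothetical helper that classifies PATH using
# the -d, -f, -s and -e file tests.
describe_path() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    elif [ ! -e "$1" ]; then
        echo "missing"
    else
        echo "other"
    fi
}
```

The argument is quoted in every test so that names containing spaces, or an empty argument, do not break the [ command.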
Control Structure
Syntax
if condition
then
    statement
else
    statement
fi

#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
if [ $timeofday = yes ]; then
    echo "Good morning"
else
    echo "Good afternoon"
fi
exit 0

Output
Is it morning? Please answer yes or no
yes
Good morning

Condition Structure
#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
if [ $timeofday = yes ]; then
    echo "Good morning"
elif [ $timeofday = no ]; then
    echo "Good afternoon"
else
    echo "Sorry, $timeofday not recognized. Enter yes or no"
    exit 1
fi
exit 0
Condition Structure
#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
if [ "$timeofday" = "yes" ]; then
    echo "Good morning"
elif [ "$timeofday" = "no" ]; then
    echo "Good afternoon"
else
    echo "Sorry, $timeofday not recognized. Enter yes or no"
    exit 1
fi
exit 0
•  If the user just presses Enter, $timeofday is empty, and an unquoted test such as [ $timeofday = yes ] turns into [ = yes ], which is an error. Quoting the variable keeps the test valid.

Loop Structure
Syntax
for variable in values
do
    statement
done

#!/bin/sh
for foo in bar fud 43
do
    echo $foo
done
exit 0

Output
bar
fud
43
•  How to output bar fud 43 on one line? Try changing the loop to: for foo in "bar fud 43"
•  The quotes keep the spaces inside a single value
Loop Structure
•  Use wildcard *
#!/bin/sh
for file in $(ls f*.sh); do
    lpr $file
done
exit 0
•  Prints all f*.sh files

Loop Structure
Syntax
while condition
do
    statement
done

Syntax
until condition
do
    statement
done
•  Note: the condition is the reverse of while; until keeps looping while the condition is false

How to rewrite the previous sample?
#!/bin/sh
for foo in 1 2 3 4 5 6 7 8 9 10
do
    echo "here we go again"
done
exit 0

#!/bin/sh
foo=1
while [ "$foo" -le 10 ]
do
    echo "here we go again"
    foo=$(($foo+1))
done
exit 0
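The same ten-iteration loop can also be written with until, which reverses the condition. A short sketch; the function name count_until is hypothetical:

```shell
#!/bin/sh
# count_until: hypothetical until version of the "here we go again" loop.
count_until() {
    foo=1
    # until runs the body as long as the condition is false,
    # so "-gt 10" here plays the role of "-le 10" in the while loop.
    until [ "$foo" -gt 10 ]; do
        echo "here we go again"
        foo=$((foo + 1))
    done
}
```

Both forms print the line ten times; the choice between while and until is mostly a matter of which condition reads more naturally.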
Case Statement
Syntax
case variable in
    pattern [ | pattern ] …) statement;;
    pattern [ | pattern ] …) statement;;
    …
esac

#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
case "$timeofday" in
    yes) echo "Good Morning" ;;
    y)   echo "Good Morning" ;;
    no)  echo "Good Afternoon" ;;
    n)   echo "Good Afternoon" ;;
    * )  echo "Sorry, answer not recognized" ;;
esac
exit 0

Case Statement
•  A much cleaner version
#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
case "$timeofday" in
    yes | y | Yes | YES ) echo "Good Morning" ;;
    n* | N* )             echo "Good Afternoon" ;;
    * )                   echo "Sorry, answer not recognized" ;;
esac
exit 0
•  But this has a problem: if we enter never, it matches the n* case and prints Good Afternoon

Case Statement
#!/bin/sh
echo "Is it morning? Please answer yes or no"
read timeofday
case "$timeofday" in
    yes | y | Yes | YES )
        echo "Good Morning"
        echo "Up bright and early this morning"
        ;;
    [nN]*)
        echo "Good Afternoon"
        ;;
    *)
        echo "Sorry, answer not recognized"
        echo "Please answer yes or no"
        exit 1
        ;;
esac
exit 0
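One way to avoid accepting never is to spell the patterns out with bracket expressions instead of a trailing *. This is only a sketch under that assumption; the function name classify_answer and the words it prints are hypothetical:

```shell
#!/bin/sh
# classify_answer WORD: hypothetical helper that accepts y/yes and n/no
# in any letter case and rejects everything else, including "never".
classify_answer() {
    case "$1" in
        [yY] | [yY][eE][sS]) echo "morning" ;;
        [nN] | [nN][oO])     echo "afternoon" ;;
        *)                   echo "unknown"; return 1 ;;
    esac
}
```

Here classify_answer never falls through to the * branch, because no pattern ends in a wildcard that could absorb the extra letters.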
List
•  AND (&&)
statement1 && statement2 && statement3 …
#!/bin/sh
touch file_one        (create the file if it does not exist)
rm -f file_two        (remove a file)
if [ -f file_one ] && echo Hello && [ -f file_two ] && echo there
then
    echo in if
else
    echo in else
fi
exit 0

Output
Hello
in else

List
•  OR (||)
statement1 || statement2 || statement3 …
#!/bin/sh
rm -f file_one
if [ -f file_one ] || echo Hello || echo there
then
    echo in if
else
    echo in else
fi
exit 0

Output
Hello
in if
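AND and OR lists are often used on their own as compact guards, without a surrounding if. A minimal sketch; the helper names ensure_dir and report_file are hypothetical:

```shell
#!/bin/sh
# ensure_dir DIR: OR list, mkdir runs only when the test fails.
ensure_dir() {
    [ -d "$1" ] || mkdir -p "$1"
}

# report_file FILE: AND list, echo runs only when the test succeeds.
report_file() {
    [ -f "$1" ] && echo "$1 exists"
}
```

In both helpers the exit status of the list is the status of the last command that actually ran, so report_file returns non-zero when the file is absent.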
Statement Block
•  Use multiple statements in the same place
get_confirm && {
    grep -v "$cdcatnum" $tracks_file > $temp_file
    cat $temp_file > $tracks_file
    echo
    add_record_tracks
}

Function
•  You can define functions for structured scripts
function_name() {
    statements
}

#!/bin/sh
foo() {
    echo "Function foo is executing"
}
echo "script starting"
foo
echo "script ended"
exit 0

Output
script starting
Function foo is executing
script ended
•  You need to define a function before using it
•  The parameters $*, $@, $#, $1, $2 are replaced by the function's own arguments while the function is called, and return to their previous values after the function is finished

Function
•  Define a local variable
•  Output? Check the scope of the variables
#!/bin/sh
sample_text="global variable"
foo() {
    local sample_text="local variable"
    echo "Function foo is executing"
    echo $sample_text
}
echo "script starting"
echo $sample_text
foo
echo "script ended"
echo $sample_text
exit 0
Function
•  Use return to pass a result
#!/bin/sh
yes_or_no() {
    echo "Is your name $* ?"
    while true
    do
        echo -n "Enter yes or no: "
        read x
        case "$x" in
            y | yes ) return 0;;
            n | no )  return 1;;
            * )       echo "Answer yes or no"
        esac
    done
}
echo "Original parameters are $*"
if yes_or_no "$1"
then
    echo "Hi $1, nice name"
else
    echo "Never mind"
fi
exit 0

Output
% ./my_name John Chuang
Original parameters are John Chuang
Is your name John ?
Enter yes or no: yes
Hi John, nice name
Command
•  External: use interactively
•  Internal: only in script
•  break leaves the enclosing loop
#!/bin/sh
rm -rf fred*
echo > fred1
echo > fred2
mkdir fred3
echo > fred4
for file in fred*
do
    if [ -d "$file" ]; then
        break
    fi
done
echo "first directory starting fred was $file"
rm -rf fred*
exit 0

Command
•  : is the null command; the shell treats it as true
#!/bin/sh
rm -f fred
if [ -f fred ]; then
    :
else
    echo "file fred did not exist"
fi
exit 0

Command
•  continue skips to the next iteration of the loop
#!/bin/sh
rm -rf fred*
echo > fred1
echo > fred2
mkdir fred3
echo > fred4
for file in fred*
do
    if [ -d "$file" ]; then
        echo "skipping directory $file"
        continue
    fi
    echo "file is $file"
done
rm -rf fred*
exit 0
Command
•  . ./shell_script executes shell_script in the current shell
classic_set
#!/bin/sh
version=classic
PATH=/usr/local/old_bin:/usr/bin:/bin:.
PS1="classic> "
latest_set
#!/bin/sh
version=latest
PATH=/usr/local/new_bin:/usr/bin:/bin:.
PS1="latest version> "

% . ./classic_set
classic> echo $version
classic
classic> . ./latest_set
latest version> echo $version
latest
latest version>

Command
•  echo prints a string
•  -n do not output the trailing newline
•  -e enable interpretation of backslash escapes
   –  \0NNN the character whose ASCII code is NNN
   –  \\ backslash
   –  \a alert
   –  \b backspace
   –  \c suppress trailing newline
   –  \f form feed
   –  \n newline
   –  \r carriage return
   –  \t horizontal tab
   –  \v vertical tab
•  Try these
% echo -n "string to \n output"
% echo -e "string to \n output"
Command
•  eval evaluates its arguments as a command, similar to an extra $
% foo=10
% x=foo
% y='$'$x
% echo $y
Output is $foo

% foo=10
% x=foo
% eval y='$'$x
% echo $y
Output is 10

Command
•  exit n ends the script with exit code n
•  0 means success
•  1 to 255 means a specific error code
•  126 means the file was not executable
•  127 means no such command
•  128 or above means a signal was received
#!/bin/sh
if [ -f .profile ]; then
    exit 0
fi
exit 1
Or
% [ -f .profile ] && exit 0 || exit 1
Command
•  export makes a variable available to child processes
This is export2
#!/bin/sh
echo "$foo"
echo "$bar"
This is export1
#!/bin/sh
foo="The first meta-syntactic variable"
export bar="The second meta-syntactic variable"
export2
Output is
% export1

The second meta-syntactic variable
%
Command
•  expr evaluates its arguments as an expression
% x=`expr $x + 1` (assign the result of expr $x + 1 to x)
Also can be written as
% x=$(expr $x + 1)
expr1 | expr2    (or)
expr1 & expr2    (and)
expr1 = expr2
expr1 > expr2
expr1 >= expr2
expr1 < expr2
expr1 <= expr2
expr1 != expr2
expr1 + expr2
expr1 - expr2
expr1 * expr2
expr1 / expr2
expr1 % expr2    (modulo)

Command
•  printf formats and prints data
•  Escape sequences
   –  \\ backslash
   –  \a beep sound
   –  \b backspace
   –  \f form feed
   –  \n newline
   –  \r carriage return
   –  \t tab
   –  \v vertical tab
•  Conversion specifiers
   –  %d decimal
   –  %c character
   –  %s string
   –  %% print %
% printf "%s\n" hello
hello
% printf "%s %d\t%s" "Hi There" 15 people
Hi There 15     people
Command
•  return returns a value from a function
•  set sets the parameter variables
#!/bin/sh
echo "the date is $(date)"
set $(date)
echo "The month is $2"
exit 0

Command
•  shift shifts the parameters once: $2 to $1, $3 to $2, and so on
#!/bin/sh
while [ "$1" != "" ]; do
    echo "$1"
    shift
done
exit 0
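A loop that compares $1 with the empty string stops early if one of the arguments is itself empty. A hedged alternative is to test the argument count $# instead; the function name print_args is hypothetical:

```shell
#!/bin/sh
# print_args ARG...: hypothetical helper that prints every argument on its
# own line, consuming them one at a time with shift.
print_args() {
    # $# drops by one on each shift, so the loop ends exactly when
    # no arguments remain, even if some of them are empty strings.
    while [ "$#" -gt 0 ]; do
        echo "$1"
        shift
    done
}
```

For example, print_args a "" "b c" prints three lines, the second of which is empty.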
Command
•  trap specifies an action to take after receiving a signal
trap command signal
•  signal       explanation
   HUP (1)      hung up
   INT (2)      interrupt (Ctrl + C)
   QUIT (3)     quit (Ctrl + \)
   ABRT (6)     abort
   ALRM (14)    alarm
   TERM (15)    terminate

Command
#!/bin/sh
trap 'rm -f /tmp/my_tmp_file_$$' INT
echo creating file /tmp/my_tmp_file_$$
date > /tmp/my_tmp_file_$$
echo "press interrupt (CTRL-C) to interrupt …"
while [ -f /tmp/my_tmp_file_$$ ]; do
    echo File exists
    sleep 1
done
echo The file no longer exists
trap - INT
echo creating file /tmp/my_tmp_file_$$
date > /tmp/my_tmp_file_$$
echo "press interrupt (CTRL-C) to interrupt …"
while [ -f /tmp/my_tmp_file_$$ ]; do
    echo File exists
    sleep 1
done
echo we never get there
exit 0

Command
Output
creating file /tmp/my_tmp_file_141
press interrupt (CTRL-C) to interrupt …
File exists
File exists
File exists
File exists
The file no longer exists
creating file /tmp/my_tmp_file_141
press interrupt (CTRL-C) to interrupt …
File exists
File exists
File exists
File exists
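A common use of trap is cleaning up scratch files no matter how the script ends. The sketch below assumes mktemp is available and uses the EXIT pseudo-signal; the function name with_scratch_file is hypothetical:

```shell
#!/bin/sh
# with_scratch_file: hypothetical demo that creates a scratch file, prints
# its name, and lets an EXIT trap remove it when the subshell finishes.
with_scratch_file() {
    (
        tmp=$(mktemp) || exit 1
        # The trap is set inside the subshell, so it fires when the
        # subshell exits, not when the calling script exits.
        trap 'rm -f "$tmp"' EXIT
        date > "$tmp"
        echo "$tmp"
    )
}
```

After with_scratch_file returns, the printed path no longer exists, because the trap ran as the subshell exited.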
Command
•  unset removes a variable or function
#!/bin/sh
foo="Hello World"
echo $foo
unset foo
echo $foo
Pattern Matching
•  find searches for files in a directory hierarchy
find [path] [options] [tests] [actions]
options
   -depth             process a directory's contents before the directory itself
   -follow            follow symbolic links
   -maxdepth N        search at most N levels of directories
   -mount             do not descend into directories on other file systems
tests
   -atime N           accessed N days ago
   -mtime N           modified N days ago
   -newer otherfile   newer than the file otherfile
   -type X            file type X
   -user username     belongs to username

Pattern Matching
operators
   !     -not         reverse the test
   -a    -and         both tests must be true
   -o    -or          either test must be true
actions
   -exec command      execute command
   -ok command        confirm and execute command
   -print             print the file name
   -ls                ls -dils
Find files newer than while2, then print
% find . -newer while2 -print

Pattern Matching
Find files newer than while2, then print only files
% find . -newer while2 -type f -print
Find files that are either newer than while2 or start with _
% find . \( -name "_*" -or -newer while2 \) -type f -print
Find files newer than while2, then list files
% find . -newer while2 -type f -exec ls -l {} \;

Pattern Matching
•  grep prints lines matching a pattern (General Regular Expression Parser)
grep [options] PATTERN [FILES]
options
   -c    print a count of matching lines instead of the lines
   -E    interpret PATTERN as an extended regular expression
   -h    suppress the prefixing of filenames
   -i    ignore case
   -l    list the names of files with matching lines; suppress normal output
   -v    invert the sense of matching
% grep in words.txt
% grep -c in words.txt words2.txt
% grep -c -v in words.txt words2.txt
Regular Expressions
•  A regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules.
•  Syntax
   –  ^ Matches the start of the line
   –  $ Matches the end of the line
   –  . Matches any single character
   –  [] Matches a single character that is contained within the brackets
   –  [^] Matches a single character that is not contained within the brackets
   –  () Defines a "marked subexpression"
   –  {x,y} Matches the last "block" at least x and not more than y times

Regular Expressions
•  Examples:
   –  ".at" matches any three-character string like hat, cat or bat
   –  "[hc]at" matches hat and cat
   –  "[^b]at" matches all the matched strings from the regex ".at" except bat
   –  "^[hc]at" matches hat and cat but only at the beginning of a line
   –  "[hc]at$" matches hat and cat but only at the end of a line

Regular Expressions
•  POSIX class   similar to        meaning
•  [:upper:]     [A-Z]             uppercase letters
•  [:lower:]     [a-z]             lowercase letters
•  [:alpha:]     [A-Za-z]          upper- and lowercase letters
•  [:alnum:]     [A-Za-z0-9]       digits, upper- and lowercase letters
•  [:digit:]     [0-9]             digits
•  [:xdigit:]    [0-9A-Fa-f]       hexadecimal digits
•  [:punct:]     [.,!?:...]        punctuation
•  [:blank:]     [ \t]             space and TAB characters only
•  [:space:]     [ \t\n\r\f\v]     blank (whitespace) characters
•  [:cntrl:]                       control characters
•  [:graph:]     [^ \t\n\r\f\v]    printed characters
•  [:print:]     [^\t\n\r\f\v]     printed characters and space
•  Example: [[:upper:]ab] matches only the uppercase letters and lowercase 'a' and 'b'.

Regular Expressions
•  POSIX modern (extended) regular expressions
•  The more modern "extended" regular expressions can often be used with modern Unix utilities by including the command line flag "-E".
•  + Match one or more times
•  ? Match at most once
•  * Match zero or more times
•  {n} Match n times
•  {n,} Match n or more times
•  {n,m} Match n to m times

Regular Expressions
•  Search for lines ending with e
% grep e$ words2.txt
•  Search for words ending with a
% grep a[[:blank:]] words2.txt
•  Search for three-letter words starting with Th
% grep Th.[[:blank:]] words2.txt
•  Search for lines with a run of 10 lowercase characters
% grep -E "[a-z]{10}" words2.txt

Command
•  $(command) executes command in a script and substitutes its output
•  The old format used backquotes (`command`), which are easy to confuse with other quote characters
#!/bin/sh
echo The current directory is $PWD
echo the current users are $(who)
Arithmetic Expansion
•  Use $((…)) instead of expr to evaluate an arithmetic expression
#!/bin/sh
x=0
while [ "$x" -ne 10 ]; do
    echo $x
    x=$(($x+1))
done
exit 0
Parameter Expansion
•  Parameter assignment
foo=fred
echo $foo
   ${param:-default}   set default if null
   ${#param}           length of param
   ${param%word}       remove smallest suffix pattern
   ${param%%word}      remove largest suffix pattern
   ${param#word}       remove smallest prefix pattern
   ${param##word}      remove largest prefix pattern
#!/bin/sh
for i in 1 2
do
    my_secret_process $i_tmp
done
Gives result
my_secret_process: too few arguments
(the shell looks for a variable named i_tmp, which is not set)
#!/bin/sh
for i in 1 2
do
    my_secret_process ${i}_tmp
done

Parameter Expansion
#!/bin/sh
unset foo
echo ${foo:-bar}
foo=fud
echo ${foo:-bar}
foo=/usr/bin/X11/startx
echo ${foo#*/}
echo ${foo##*/}
bar=/usr/local/etc/local/networks
echo ${bar%local*}
echo ${bar%%local*}
exit 0
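The prefix and suffix forms are often used to split a path into its directory and file name. A brief sketch; the function names base_name, dir_name, and greet are hypothetical and only illustrate the expansions:

```shell
#!/bin/sh
# base_name PATH: remove the largest prefix ending in "/" (keeps the file name).
base_name() { echo "${1##*/}"; }

# dir_name PATH: remove the smallest suffix starting with "/" (keeps the directory).
dir_name() { echo "${1%/*}"; }

# greet [NAME]: fall back to "world" when NAME is unset or empty.
greet() { echo "Hello, ${1:-world}"; }
```

For the path /usr/bin/X11/startx, base_name gives startx and dir_name gives /usr/bin/X11.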
Output
bar
fud
usr/bin/X11/startx
startx
/usr/local/etc/
/usr/

Here Documents
•  A here document is a special-purpose code block that starts with <<
#!/bin/sh
cat <<!FUNKY!
hello
this is a here
document
!FUNKY!
exit 0

#!/bin/sh
ed a_text_file <<HERE
3
d
.,\$s/is/was/
w
q
HERE
exit 0

a_text_file
That is line 1
That is line 2
That is line 3
That is line 4

Output
That is line 1
That is line 2
That was line 4

Debug
•  sh -n <script>   set -o noexec, set -n     check syntax only
•  sh -v <script>   set -o verbose, set -v    echo commands before running them
•  sh -x <script>   set -o xtrace, set -x     echo commands after processing them
•                   set -o nounset, set -u    gives an error if an undefined variable is used
set -o xtrace
set +o xtrace
trap 'echo Exiting: critical variable = $critical_variable' EXIT