Context-Free Grammars

CDT314
FABER
Formal Languages, Automata
and Models of Computation
Lecture 7
School of Innovation, Design and Engineering
Mälardalen University
2012
Content
Midterm results
Regular vs. Non-regular Languages
Context-Free Languages
Context-Free Grammars
Derivation Trees. Ambiguity
Applications
Push-Down Automata, PDA
Midterm 1 Solution
http://www.idt.mdh.se/kurser/cd5560/12_11/examination/Duggor/MIDTERM1-20121127-Solution.pdf
A comment on Midterm 1
The Pumping Lemma for Regular Languages
The Pumping Lemma cannot be used to prove that a language is regular!
It states a property that every regular language has, so it can only be used to show that a language is not regular.
An analogy: if something is a square, it always has four edges (a property of squares).
But having proved that something has four edges does not necessarily mean that the object is a square.
http://www2.mat.ua.pt/rosalia/cadeiras/TC/pump.pdf
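Time to take the next step: beyond Regular Languages
[Figure: nested language classes]
Examples of non-regular languages: $\{a^n b^l c^{n+l} : n, l \ge 0\}$, $\{a^{n!} : n \ge 0\}$
Context-Free Languages (contain the Regular Languages): $\{a^n b^n\}$, $\{ww^R\}$
Regular Languages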
Automata theory: formal languages and formal grammars
Grammar | Languages | Automaton | Production rules
Type-0 | Recursively enumerable | Turing machine | No restrictions
Type-1 | Context-sensitive | Linear-bounded non-deterministic Turing machine | αAβ → αγβ
Type-2 | Context-free | Non-deterministic pushdown automaton | A → γ
Type-3 | Regular | Finite state automaton | A → a and A → aB
Context-Free Languages
Based on C Busch, RPI, Models of Computation
Context-Free Languages
Context-Free Grammars
Pushdown Automata
Context-Free Grammars
Grammar
Formal Definition
$G = (V, T, S, P)$
V: Set of variables
T: Set of terminal symbols
S: Start variable
P: Set of production rules
Repetition: Regular Grammars
Grammar $G = (V, T, S, P)$
V: Variables
T: Terminal symbols
S: Start variable
Right or Left Linear Grammars. Productions of the form:
A → xB (right linear) or A → Bx (left linear), and C → x,
where x is a string of terminals
Definition: Context-Free Grammars
Grammar $G = (V, T, S, P)$
V: Variables
T: Terminal symbols
S: Start variable
Productions of the form:
A → x
where x is a string of variables and terminals
Regular vs. Context-free Grammar
A regular grammar is either right or left linear, whereas in a context-free* grammar the right-hand side of a production may be any combination of terminals and non-terminals.
Hence regular grammars are a subset of context-free grammars.
A grammar generating palindromes is not regular:
S → ABA
A → something
B → something
*The name context-free grammar is explained by the property that productions are independent of the surrounding symbols. There are also context-sensitive grammars, where productions depend on the context (the symbols that surround variables).
Example 1:
A context-free grammar G
S → aSb
S → λ
A derivation
S ⇒ aSb ⇒ aaSbb ⇒ aabb
Another derivation
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
$L(G) = \{a^n b^n : n \ge 0\}$
With a read as ( and b read as ), these are nested parentheses: ( ( ( ( ) ) ) )
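The derivations of Example 1 can be reproduced mechanically. The sketch below is only an illustration: the function name derive_anbn is made up here, and the empty string stands for λ.

```python
def derive_anbn(n):
    """Derive a^n b^n: apply S -> aSb n times, then S -> lambda."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb", 1)   # production S -> aSb
        steps.append(form)
    form = form.replace("S", "", 1)          # production S -> lambda
    steps.append(form)
    return steps


print(" => ".join(derive_anbn(2)))
```

Each element of the returned list is one sentential form of the derivation, ending with the derived string.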
Example 2:
A context-free grammar G
S → aSa
S → bSb
S → λ
A derivation
S ⇒ aSa ⇒ abSba ⇒ abba
Another derivation
S ⇒ aSa ⇒ aaSaa ⇒ aaaSaaa ⇒ aaabSbaaa ⇒ aaabbaaa
$L(G) = \{ww^R : w \in \{a, b\}^*\}$
Example 3:
A context-free grammar G
S → aSb
S → SS
S → λ
A derivation
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ ab
Another derivation
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ abaSb ⇒ abab
$L(G) = \{w : n_a(w) = n_b(w), \text{ and } n_a(v) \ge n_b(v) \text{ in any prefix } v\}$
With a read as ( and b read as ), these are balanced parentheses: ( ) ( ( ( ) ) ) ( ( ) )
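The description of L(G) in Example 3 translates directly into a membership test. This is a minimal sketch of that condition, not a parser for the grammar; the name in_language is made up for the illustration.

```python
def in_language(w):
    """Check n_a(w) == n_b(w) and n_a(v) >= n_b(v) for every prefix v of w."""
    balance = 0                      # n_a(v) - n_b(v) for the prefix read so far
    for symbol in w:
        if symbol == "a":
            balance += 1
        elif symbol == "b":
            balance -= 1
        else:
            return False             # not a string over {a, b}
        if balance < 0:
            return False             # some prefix has more b's than a's
    return balance == 0


print(in_language("abab"), in_language("ba"))
```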
Example 4:
The language $L = \{a^n b^m : n \ne m\}$ is context-free.
For the case n > m:
S → AS1
S1 → aS1b | λ
A → aA | a
For the case n < m:
S → S1B
S1 → aS1b | λ
B → bB | b
The grammar for the language $L = \{a^n b^m : n \ne m\}$ is:
S → AS1 | S1B
S1 → aS1b | λ
A → aA | a
B → bB | b
Definition: Context-Free Languages
A language L is context-free if and only if there is a context-free grammar G with $L = L(G)$
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ
Leftmost derivation (the number of the production applied is shown at each step)
S ⇒(1) AB ⇒(2) aaAB ⇒(3) aaB ⇒(4) aaBb ⇒(5) aab
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ
Rightmost derivation (the number of the production applied is shown at each step)
S ⇒(1) AB ⇒(4) ABb ⇒(5) Ab ⇒(2) aaAb ⇒(3) aab
S → aAB
A → bBb
B → A | λ
Leftmost derivation
S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB ⇒ abbbbB ⇒ abbbb
S → aAB
A → bBb
B → A | λ
Rightmost derivation
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb ⇒ abbBbb ⇒ abbbb
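Leftmost and rightmost derivations differ only in which variable is rewritten at each step. The sketch below replays a sequence of production numbers on the numbered grammar from the Derivation Order slides. The function name derive and the encoding (single upper-case letters as variables, the empty string for λ) are assumptions made for the illustration.

```python
def derive(start, rules, steps, leftmost=True):
    """Apply numbered productions to the leftmost or rightmost variable.

    rules is a list of (variable, replacement) pairs, numbered from 1.
    Variables are single upper-case letters; "" stands for lambda.
    """
    form = start
    forms = [form]
    for index in steps:
        variable, replacement = rules[index - 1]
        positions = [i for i, c in enumerate(form) if c.isupper()]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != variable:
            raise ValueError(f"production {index} does not apply to {form!r}")
        form = form[:pos] + replacement + form[pos + 1:]
        forms.append(form)
    return forms


RULES = [("S", "AB"), ("A", "aaA"), ("A", ""), ("B", "Bb"), ("B", "")]
print(" => ".join(derive("S", RULES, [1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(derive("S", RULES, [1, 4, 5, 2, 3], leftmost=False)))
```

Both calls end in aab; only the order in which the productions are applied differs.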
Derivation Trees
A derivation can be represented in tree form
S → AB
A → aaA | λ
B → Bb | λ
The tree grows by one production with each step of the derivation
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation Tree
          S
        /   \
       A     B
     / | \   | \
    a  a  A  B  b
          |  |
          λ  λ
yield (leaves read from left to right): aaλλb = aab
Partial Derivation Trees
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB
Partial derivation tree
      S
     / \
    A   B
S ⇒ AB ⇒ aaAB
Partial derivation tree
        S
      /   \
     A     B
   / | \
  a  a  A
yield: aaAB (a sentential form)
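The yield of a (partial) derivation tree is the concatenation of its leaves from left to right. The sketch below assumes a tree is stored as a (label, children) pair and that the empty string stands for λ; the names TREE, PARTIAL, leaf and tree_yield are made up for the illustration.

```python
LAMBDA = ""


def leaf(symbol):
    """A leaf is a node without children."""
    return (symbol, [])


# Derivation tree of aab for S -> AB, A -> aaA | lambda, B -> Bb | lambda
TREE = ("S", [
    ("A", [leaf("a"), leaf("a"), ("A", [leaf(LAMBDA)])]),
    ("B", [("B", [leaf(LAMBDA)]), leaf("b")]),
])

# Partial derivation tree after S => AB => aaAB
PARTIAL = ("S", [
    ("A", [leaf("a"), leaf("a"), leaf("A")]),
    leaf("B"),
])


def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    label, children = node
    if not children:
        return label
    return "".join(tree_yield(child) for child in children)


print(tree_yield(TREE), tree_yield(PARTIAL))
```

For the complete tree the yield is a string of the language; for the partial tree it is a sentential form that still contains variables.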
Sometimes, derivation order doesn’t matter
Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab
Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
The same derivation tree
          S
        /   \
       A     B
     / | \   | \
    a  a  A  B  b
          |  |
          λ  λ
Ambiguity
E → E + E | E * E | (E) | a
(* denotes multiplication)
String: a + a * a
Leftmost derivation
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree
      E
    / | \
   E  +  E
   |   / | \
   a  E  *  E
      |     |
      a     a
E → E + E | E * E | (E) | a
String: a + a * a
Another leftmost derivation
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Derivation tree
          E
        / | \
       E  *  E
     / | \   |
    E  +  E  a
    |     |
    a     a
The string a + a * a has two derivation trees: one whose root applies E → E + E and one whose root applies E → E * E.
The grammar
E → E + E | E * E | (E) | a
is ambiguous!
The grammar
E → E + E | E * E | (E) | a
is ambiguous, as the string a + a * a has two leftmost derivations:
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
Definition
A context-free grammar G is ambiguous if some string $w \in L(G)$ has two or more derivation trees (equivalently, two or more leftmost derivations, or two or more rightmost derivations).
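For a small grammar the definition can be checked by counting derivation trees. The sketch below is written specifically for E → E + E | E * E | (E) | a and takes the input without spaces; the function name count_trees is made up for the illustration. A count above 1 for some string shows that the grammar is ambiguous.

```python
from functools import lru_cache


def count_trees(s):
    """Count the derivation trees of s in E -> E + E | E * E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of derivation trees with root E and yield s[i:j]
        if j - i == 1:
            return 1 if s[i] == "a" else 0              # E -> a
        total = 0
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)                # E -> (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+*":                            # E -> E + E or E -> E * E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s)) if s else 0


print(count_trees("a+a*a"))
```

The two trees counted for a+a*a correspond to splitting the string at + or at *.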
Why do we care about ambiguity?
Take a = 2 in the string a + a * a:
2 + 2 * 2
[Figure: the same two derivation trees, with every leaf a replaced by 2]
Why do we care about ambiguity?
The two derivation trees of 2 + 2 * 2 give different values:
Tree whose root applies E → E + E: 2 + (2 * 2) = 6
Tree whose root applies E → E * E: (2 + 2) * 2 = 8
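The two values can be checked by evaluating both trees bottom-up. In this sketch a tree is a nested (left, operator, right) tuple with integers at the leaves; the encoding and the names are assumptions made for the illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate a tree of nested (left, operator, right) tuples bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


plus_at_root = (2, "+", (2, "*", 2))     # root applies E -> E + E
times_at_root = ((2, "+", 2), "*", 2)    # root applies E -> E * E
print(evaluate(plus_at_root), evaluate(times_at_root))
```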
Correct result:
2 + 2 * 2 = 6
[Figure: the derivation tree whose root applies E → E + E; 2 * 2 = 4 is computed first, then 2 + 4 = 6]
Ambiguity is bad
for programming languages
We want to remove ambiguity!
We fix the ambiguous grammar
E → E + E | E * E | (E) | a
by introducing new variables T and F that encode grouping (precedence); parentheses () still allow explicit grouping.
Non-ambiguous grammar
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
String: a + a * a
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F ⇒ a + F * F ⇒ a + a * F ⇒ a + a * a
Derivation tree
        E
      / | \
     E  +  T
     |   / | \
     T  T  *  F
     |  |     |
     F  F     a
     |  |
     a  a
Unique derivation tree
a + a * a
[Figure: the only derivation tree of a + a * a in the non-ambiguous grammar; + is applied at the root and * below it]
The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous.
Every string $w \in L(G)$ has a unique derivation tree.
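Because every string has a single derivation tree, an evaluator can follow the grammar directly. The sketch below is one possible recursive-descent evaluator, with two assumptions that are not in the slides: the terminal a is replaced by non-negative integers, and the left-recursive productions E → E + T and T → T * F are handled with loops. The name evaluate_expression is made up for the illustration.

```python
def evaluate_expression(text):
    """Evaluate text following E -> E + T | T, T -> T * F | F, F -> (E) | number."""
    for symbol in "()+*":
        text = text.replace(symbol, f" {symbol} ")
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_e():                     # E -> T (+ T)*
        value = parse_t()
        while peek() == "+":
            advance()
            value += parse_t()
        return value

    def parse_t():                     # T -> F (* F)*
        value = parse_f()
        while peek() == "*":
            advance()
            value *= parse_f()
        return value

    def parse_f():                     # F -> (E) | number
        token = peek()
        if token == "(":
            advance()
            value = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token is not None and token.isdigit():
            advance()
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    result = parse_e()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate_expression("2 + 2 * 2"))
```

Multiplication is handled one level below addition, which is how the grammar gives * the higher precedence.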
Inherent Ambiguity
Some context-free languages have only ambiguous grammars!
Example:
$L = \{a^n b^n c^m\} \cup \{a^n b^m c^m\}$
S → S1 | S2
S1 → S1c | A
A → aAb | λ
S2 → aS2 | B
B → bBc | λ
The string $a^n b^n c^n$ has two derivation trees: one that starts with S → S1 and one that starts with S → S2
n l n l
{a b c
: n, l  0}{a : n  0}
n!
Non-regular languages
Context-Free Languages
n n
R
{a b }
{ww }
Regular Languages
61
Applications:
Compilers
Program (input to the compiler)
v = 5;
if (v>5)
  x = 12 + v;
while (x !=3) {
  x = x - 3;
  v = 10;
}
......
Machine Code (output of the compiler)
Add v,v,0
cmp v,5
jmplt ELSE
THEN:
add x, 12,v
ELSE:
WHILE:
cmp x,3
...
Compiler
input program → Lexical analyzer → parser → output machine code
A parser “knows” the grammar
of the programming language
Parser
PROGRAM → STMT_LIST
STMT_LIST → STMT; STMT_LIST | STMT;
STMT → EXPR | IF_STMT | WHILE_STMT
  | { STMT_LIST }
EXPR → EXPR + EXPR | EXPR - EXPR | ID
IF_STMT → if (EXPR) then STMT
  | if (EXPR) then STMT else STMT
WHILE_STMT → while (EXPR) do STMT
The parser finds the derivation of a particular input
Input: 10 + 2 * 5
Grammar known to the parser:
E → E + E
  | E * E
  | INT
Derivation:
E ⇒ E + E
  ⇒ E + E * E
  ⇒ 10 + E * E
  ⇒ 10 + 2 * E
  ⇒ 10 + 2 * 5
derivation
E ⇒ E + E ⇒ E + E * E ⇒ 10 + E * E ⇒ 10 + 2 * E ⇒ 10 + 2 * 5
derivation tree
      E
    / | \
   E  +  E
   |   / | \
  10  E  *  E
      |     |
      2     5
From the derivation tree of 10 + 2 * 5 to machine code
mult a, 2, 5
add b, 10, a
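The step from derivation tree to machine code can be sketched as a post-order walk over the tree. This is a simplified illustration, not how a real compiler is organised: the tree is a nested (left, operator, right) tuple, and the temporary names a, b, c, ... as well as the function name generate are assumptions made here.

```python
def generate(tree):
    """Emit one instruction per operator node, operands first (post-order)."""
    code = []
    names = iter("abcdefghijklmnopqrstuvwxyz")    # hypothetical temporaries
    mnemonics = {"+": "add", "*": "mult"}

    def visit(node):
        if not isinstance(node, tuple):
            return str(node)                      # an INT leaf is used directly
        left, op, right = node
        left_operand = visit(left)
        right_operand = visit(right)
        target = next(names)
        code.append(f"{mnemonics[op]} {target}, {left_operand}, {right_operand}")
        return target

    visit(tree)
    return code


for line in generate((10, "+", (2, "*", 5))):
    print(line)
```

The multiplication sits deeper in the tree, so its instruction is emitted before the addition that uses its result.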
Parsing examples
Parser
input string → Parser (knows the grammar) → derivation
Example:
Parser with the grammar
S → SS
S → aSb
S → bSa
S → λ
input: aabb
derivation: ?
Exhaustive Search
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 1: all possible derivations of length 1
S ⇒ SS
S ⇒ aSb
S ⇒ bSa
S ⇒ λ
Of these, S ⇒ bSa and S ⇒ λ cannot lead to aabb, so only S ⇒ SS and S ⇒ aSb are kept.
Phase 2
S → SS | aSb | bSa | λ
Find derivation of aabb
Extending the Phase 1 derivation S ⇒ SS:
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ bSaS
S ⇒ SS ⇒ S
Extending the Phase 1 derivation S ⇒ aSb:
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb
S ⇒ aSb ⇒ abSab
S ⇒ aSb ⇒ ab
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 2 (derivations kept)
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ S
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb
Phase 3
S ⇒ aSb ⇒ aaSbb ⇒ aabb
Final result of exhaustive search (top-down parsing)
Parser with the grammar
S → SS
S → aSb
S → bSa
S → λ
input: aabb
derivation: S ⇒ aSb ⇒ aaSbb ⇒ aabb
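The phases of the exhaustive search can be sketched as a breadth-first search over leftmost derivations. The pruning rules and the max_phases limit below are assumptions added for the illustration; the limit is needed because the search does not stop by itself when the input is not in the language. Variables are single upper-case letters and the empty string stands for λ.

```python
def exhaustive_parse(target, rules, start="S", max_phases=8):
    """Search phase by phase for a leftmost derivation of target.

    Returns the derivation as a list of sentential forms, or None if none
    is found within max_phases.
    """
    def viable(form):
        variables = [i for i, c in enumerate(form) if c.isupper()]
        if not variables:
            return False                       # all terminals, but not the target
        if len(form) - len(variables) > len(target):
            return False                       # already too many terminals
        return target.startswith(form[:variables[0]])

    frontier = [[start]]
    for _ in range(max_phases):
        next_frontier = []
        for derivation in frontier:
            form = derivation[-1]
            pos = next(i for i, c in enumerate(form) if c.isupper())
            for rhs in rules[form[pos]]:
                new_form = form[:pos] + rhs + form[pos + 1:]
                if new_form == target:
                    return derivation + [new_form]
                if viable(new_form):
                    next_frontier.append(derivation + [new_form])
        frontier = next_frontier
    return None


RULES = {"S": ["SS", "aSb", "bSa", ""]}
print(" => ".join(exhaustive_parse("aabb", RULES)))
```

With this pruning, the forms kept after Phase 1 and Phase 2 for the input aabb are the ones listed on the slides, and the derivation is found in Phase 3.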
Another use of context free grammars: Context Free Art
http://www.contextfreeart.org/index.html
Context Free Art
Context-Free Languages
Context-Free Grammars
Pushdown Automata (stack + automaton)
Pushdown Automata
PDAs
Pushdown Automaton - PDA
Input String
Stack
States
The Stack
A PDA can write symbols on a stack
and read them later on.
POP: reading a symbol
PUSH: writing a symbol
Example stack, top first: y, x, z
All access to the stack is only at the top!
(The stack top is written leftmost in the string, e.g. yxz)
A stack is valuable as it can hold an unlimited amount of information.
The stack allows pushdown automata to recognize some non-regular languages.
The States
A transition from q1 to q2 is labelled a, b / c:
a: input symbol
b: pop (reading) the old stack symbol
c: push (writing) the new stack symbol
Replace: transition from q1 to q2 labelled a, b / c
The input symbol a is read.
stack (top first): b h e $ becomes c h e $
(An alternative is to start and finish with empty stack)
Push: transition from q1 to q2 labelled a, λ / c
stack (top first): b h e $ becomes c b h e $
Pop: transition from q1 to q2 labelled a, b / λ
stack (top first): b h e $ becomes h e $
No Change: transition from q1 to q2 labelled a, λ / λ
stack (top first): b h e $ stays b h e $
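The four cases (replace, push, pop, no change) are one operation with different choices of the popped and pushed symbols. A minimal sketch, assuming the stack is written as a string with the top leftmost and the empty string stands for λ; the name apply_transition is made up for the illustration.

```python
def apply_transition(stack, pop, push):
    """Apply the stack part of a transition labelled 'a, pop / push'.

    Returns the new stack, or None if pop is not on top of the stack.
    """
    if not stack.startswith(pop):
        return None
    return push + stack[len(pop):]


print(apply_transition("bhe$", "b", "c"))   # replace
print(apply_transition("bhe$", "", "c"))    # push
print(apply_transition("bhe$", "b", ""))    # pop
print(apply_transition("bhe$", "", ""))     # no change
```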
Formal Definition
A Pushdown Automaton is defined as a 7-tuple
$M = (Q, \Sigma, \Gamma, \delta, q_0, z, F)$
Q: States
Σ: Input alphabet
Γ: Stack alphabet
δ: Transition function
q0: Start state
z: Stack start symbol
F: Final states
Example 3.7 Salling:
A PDA for simple nested parenthesis strings
Input: ( ( ( ) ) )
States: s and q (the diagram also carries the labels start and end)
Transitions:
(, λ / ( as a loop on s: push ( for each ( read
), ( / λ from s to q: pop ( for the first ) read
), ( / λ as a loop on q: pop ( for each further ) read
Time 0 to Time 7: the stack is empty at Time 0, grows to ( ( ( while the three opening parentheses are read, and is empty again at Time 7, after the three closing parentheses have been read.
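The run of this PDA can be simulated in a few lines. The sketch below assumes that a string is accepted when the input is exhausted in state q with an empty stack; the slides do not state the acceptance condition, so this is an assumption, as is the name accepts_nested.

```python
def accepts_nested(text):
    """Simulate a two-state PDA for simple nested parenthesis strings."""
    state = "s"
    stack = []                                  # top of the stack is the end of the list
    for symbol in text:
        if state == "s" and symbol == "(":
            stack.append("(")                   # (, lambda / (  : push
        elif symbol == ")" and stack:
            stack.pop()                         # ), ( / lambda  : pop
            state = "q"
        else:
            return False                        # no transition applies
    return state == "q" and not stack           # assumed acceptance condition


print(accepts_nested("((()))"), accepts_nested("(()"))
```

Once the automaton is in state q it no longer accepts an opening parenthesis, so strings such as ()() are rejected.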