The CYK Algorithm - Department of Computer Science

The CYK Algorithm
Presented by
Aalapee Patel
Tyler Ondracek
CS6800 Spring 2014
A Membership Problem
• Determine whether a given string is a member of the language defined by a context-free grammar.
• Given a context-free grammar G and a string w
– G = (V, Σ, P, S) where
» V is a finite set of variables
» Σ (the alphabet) is a finite set of terminal symbols
» P is a finite set of rules
» S is the start symbol (a distinguished element of V)
Is w in the language of G?
CYK Algorithm
• Developed by J. Cocke, D. Younger, and T. Kasami to answer the membership problem
• The input grammar must be in Chomsky normal form, where every rule has one of these forms:
– A → BC
– A → a
– S → λ
where B, C ∈ V – {S} and a ∈ Σ
• Uses bottom-up parsing
• Uses dynamic programming (a table-filling algorithm)
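The Chomsky normal form requirement can be checked mechanically. The sketch below is only an illustration: the rule representation as (head, body) pairs, the convention that uppercase letters are variables and lowercase letters are terminals, the empty tuple standing for λ, and the name is_cnf are all assumptions, not part of the original slides. The sample grammar is the one given on a later slide.

```python
def is_cnf(rules, start="S"):
    """Return True if every rule is A -> BC, A -> a, or S -> lambda.

    rules: iterable of (head, body) pairs; body is a tuple of symbols.
    Assumed convention: uppercase = variable, lowercase = terminal.
    """
    for head, body in rules:
        if len(body) == 0:
            # Only the start symbol may derive the empty string.
            if head != start:
                return False
        elif len(body) == 1:
            # A single right-hand symbol must be a terminal.
            if not body[0].islower():
                return False
        elif len(body) == 2:
            # Two right-hand symbols must be variables other than the start symbol.
            if not all(s.isupper() and s != start for s in body):
                return False
        else:
            return False
    return True


grammar = [
    ("S", ("A", "B")), ("S", ("B", "C")),
    ("A", ("B", "A")), ("A", ("a",)),
    ("B", ("C", "C")), ("B", ("b",)),
    ("C", ("A", "B")), ("C", ("a",)),
]
print(is_cnf(grammar))                     # True
print(is_cnf(grammar + [("A", ("B",))]))   # False: A -> B is a unit rule
```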
CYK Basic Idea
Let u = x1x2…xn be the string to be tested for membership, and let xi,j denote the substring xi…xj
• Step 1: for each substring xi,i of u of length 1, find the set of variables A with a rule A → xi,i
• Step 2: for each substring xi,i+1 of u of length 2, find the set of variables A that derive it (A ⇒* xi,i+1)
⋮
• Step n: for the whole string x1,n, find the set of variables A that derive it (A ⇒* x1,n)
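The order in which the steps visit substrings can be sketched in a few lines of Python. The function name fill_order is hypothetical, and the sample string is the four symbols shown under the example table later in the deck; indices are 1-based and inclusive to match the xi,j notation.

```python
def fill_order(n):
    """Yield (i, j) pairs, 1-based and inclusive, in the order the steps visit them."""
    for length in range(1, n + 1):           # Step 1, Step 2, ..., Step n
        for i in range(1, n - length + 2):   # start position of the substring
            j = i + length - 1               # end position of the substring
            yield i, j


u = "cbba"
for i, j in fill_order(len(u)):
    # u[i - 1:j] converts the 1-based inclusive pair to a Python slice.
    print(f"x{i},{j} = {u[i - 1:j]}")
```

All substrings of length 1 come first, then length 2, and so on, so every shorter substring is already processed when a longer one needs it.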
The Diagonal Table Approach
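Formula for filling the table (the pairs of cells combined to fill Wi,j):
Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)

W1,4
W1,3      W2,4
W1,2      W2,3      W3,4
W1,1      W2,2      W3,3      W4,4
w1        w2        w3        w4

• Fill the table bottom-up using the above formula, where Wi,j is the set of variables that derive the substring wi…wj. A variable A is added to Wi,j when the grammar has a rule A → BC with B in the first cell of a pair and C in the second.
• The top cell (i.e. W1,4 for a string of length 4) determines if the string w is in L(G)
• If it contains the start symbol (S) then w is in L(G)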
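The pairs named by the formula can be enumerated directly. This is a small sketch under the slides' 1-based indexing; the function name split_pairs is an assumption made for illustration.

```python
def split_pairs(i, j):
    """Return the pairs of cells combined to fill W[i, j] (1-based, i < j)."""
    # k is the last position of the left part; the right part starts at k + 1.
    return [((i, k), (k + 1, j)) for k in range(i, j)]


print(split_pairs(1, 4))
# [((1, 1), (2, 4)), ((1, 2), (3, 4)), ((1, 3), (4, 4))]
```

Every pair refers to two cells for strictly shorter substrings, which is why the table can be filled row by row from the bottom.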
CYK Table Filling Example
W1,4
W1,3      W2,4
W1,2      W2,3      W3,4
{A, C}    {B}       {B}       {A}
c         b         b         a

The first row is not filled using the formula from the previous slide. Each cell Wi,i simply holds the variables that have a rule producing the symbol wi.
CYK Table Filling Example
W1,4
W1,3      W2,4
{S, C}    W2,3      W3,4
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W1,2 = (W1,1 , W2,2) = {A, C}{B} = {AB, CB}
Which rules produce AB or CB? The variables with such rules, S and C, are entered in W1,2.
CYK Table Filling Example
W1,4
W1,3      W2,4
{S, C}    ∅         W3,4
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W2,3 = (W2,2 , W3,3) = {B}{B} = {BB}
Which rules produce BB? None, so W2,3 = ∅.
CYK Table Filling Example
W1,4
W1,3      W2,4
{S, C}    ∅         {C}
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W3,4 = (W3,3 , W4,4) = {B}{A} = {BA}
Which rules produce BA? The variable with such a rule, C, is entered in W3,4.
CYK Table Filling Example
W1,4
{C}       W2,4
{S, C}    ∅         {C}
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W1,3 = (W1,1 , W2,3), (W1,2 , W3,3) = {A, C}∅ ∪ {S, C}{B} = {SB, CB}
The pair with the empty cell W2,3 contributes nothing, because no variable pair can be formed from it.
Which rules produce SB or CB? The variable with such a rule, C, is entered in W1,3.
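The effect of an empty cell on a split can be checked with a short Python sketch. The helper name concat and the use of one-letter strings for variables are assumptions for illustration; the cell contents are the ones shown in the table for W1,1, W2,3, W1,2 and W3,3.

```python
from itertools import product


def concat(left, right):
    """All two-variable strings BC with B taken from `left` and C from `right`."""
    return {b + c for b, c in product(left, right)}


w11, w23 = {"A", "C"}, set()     # W1,1 and W2,3 (W2,3 is empty)
w12, w33 = {"S", "C"}, {"B"}     # W1,2 and W3,3

candidates = concat(w11, w23) | concat(w12, w33)
print(concat(w11, w23))          # set()
print(sorted(candidates))        # ['CB', 'SB']
```

Pairing any set with the empty set gives the empty set, so only the second split supplies candidates for W1,3.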
CYK Table Filling Example
W1,4
{C}       {B}
{S, C}    ∅         {C}
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W2,4 = (W2,2 , W3,4), (W2,3 , W4,4) = {B}{C} ∪ ∅{A} = {BC}
The pair with the empty cell W2,3 contributes nothing.
Which rules produce BC? The variable with such a rule, B, is entered in W2,4.
CYK Table Filling Example
{S, A, C}
{C}       {B}
{S, C}    ∅         {C}
{A, C}    {B}       {B}       {A}
c         b         b         a

Wi,j: (Wi,i , Wi+1,j), (Wi,i+1 , Wi+2,j), …, (Wi,j-1 , Wj,j)
W1,4 = (W1,1 , W2,4), (W1,2 , W3,4), (W1,3 , W4,4) = {A, C}{B} ∪ {S, C}{C} ∪ {C}{A} = {AB, CB, SC, CC, CA}
Which rules produce AB, CB, SC, CC or CA? The variables with such rules, S, A and C, are entered in W1,4.
CYK Table Filling Example
{S, A, C}
{C}       {B}
{S, C}    ∅         {C}
{A, C}    {B}       {B}       {A}
c         b         b         a

The string w is in the language
• W1,n, which here is W1,4, contains the start symbol S
CYK Table Filling Example
W1,4
W1,3      W2,4
W1,2      W2,3      W3,4
W1,1      W2,2      W3,3      W4,4
c         b         a         a

The first row is not filled using the formula from the earlier slide. Each cell Wi,i simply holds the variables that have a rule producing the symbol wi.
CYK Table Filling Example
S → AB | BC
A → BA | a
B → CC | b
C → AB | a

W1,4
W1,3      W2,4
W1,2      W2,3      W3,4
W1,1      W2,2      W3,3      W4,4
a         a         b         a

The first row is not filled using the formula from the earlier slide. Each cell Wi,i simply holds the variables that have a rule producing the symbol wi.
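The whole table-filling procedure can be sketched in Python for the grammar S → AB | BC, A → BA | a, B → CC | b, C → AB | a. This is a minimal illustration, not a definitive implementation: the function name cyk_table, the two dictionaries used to encode the rules, and the test string aaba are assumptions, and the sketch expects a non-empty string.

```python
def cyk_table(word, unit_rules, pair_rules):
    """Fill the CYK table for a non-empty word.

    unit_rules: dict terminal -> set of variables A with a rule A -> terminal
    pair_rules: dict (B, C)   -> set of variables A with a rule A -> BC
    Returns a dict mapping (i, j), 1-based and inclusive, to a set of variables.
    """
    n = len(word)
    table = {}
    # First row: variables that produce each symbol directly.
    for i in range(1, n + 1):
        table[i, i] = set(unit_rules.get(word[i - 1], set()))
    # Remaining rows, shortest substrings first.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):  # split into (i, k) and (k + 1, j)
                for b in table[i, k]:
                    for c in table[k + 1, j]:
                        cell |= pair_rules.get((b, c), set())
            table[i, j] = cell
    return table


unit_rules = {"a": {"A", "C"}, "b": {"B"}}
pair_rules = {
    ("A", "B"): {"S", "C"},
    ("B", "C"): {"S"},
    ("B", "A"): {"A"},
    ("C", "C"): {"B"},
}

word = "aaba"  # assumed test string
table = cyk_table(word, unit_rules, pair_rules)
n = len(word)
for length in range(n, 0, -1):  # print the top row first, as on the slides
    row = [table[i, i + length - 1] for i in range(1, n - length + 2)]
    print("  ".join("{" + ", ".join(sorted(cell)) + "}" for cell in row))
print("accepted" if "S" in table[1, n] else "rejected")
```

For aaba the top cell contains S, so the sketch reports the string as accepted. Keying pair_rules by the right-hand side makes the question "which rules produce BC?" a single dictionary lookup.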
References
• David Rodriguez-Velazquez, "The CYK Algorithm", 2009 course website
• Savitha Parur Venkitachalam, "Membership problem CYK Algorithm", 2013 course presentation
• http://en.wikipedia.org/wiki/CYK_algorithm
• Thomas A. Sudkamp, Languages and Machines: An Introduction to the Theory of Computer Science