The CYK Algorithm
Presented by Aalapee Patel and Tyler Ondracek
CS6800, Spring 2014

A Membership Problem
• Determine whether a given string is a member of the language defined by a context-free grammar.
• Given a context-free grammar G and a string w
  – G = (V, Σ, P, S) where
    » V is a finite set of variables
    » Σ (the alphabet) is a finite set of terminal symbols
    » P is a finite set of rules
    » S is the start symbol (a distinguished element of V)
  – Is w in the language of G?

CYK Algorithm
• Developed by J. Cocke, D. Younger, and T. Kasami to answer the membership problem
• The input grammar must be in Chomsky Normal Form, so every rule has one of the forms
  – $A \to BC$
  – $A \to a$
  – $S \to \lambda$
  where $B, C \in V - \{S\}$ and $a \in \Sigma$
• Uses bottom-up parsing
• Uses dynamic programming (a table-filling algorithm)

CYK Basic Idea
Let $u = x_1 x_2 \ldots x_n$ be the string to be tested for membership, and write $x_{i,j}$ for the substring $x_i \ldots x_j$.
• Step 1: For each substring $x_{i,i}$ of length 1, find the set of variables A with a rule $A \to x_{i,i}$
• Step 2: For each substring $x_{i,i+1}$ of length 2, find the set of variables A that derive it, $A \Rightarrow^{*} x_{i,i+1}$
  :
• Step n: For the whole string $x_{1,n}$, find the set of variables A that derive it, $A \Rightarrow^{*} x_{1,n}$

The Diagonal Table Approach
Formula for filling the table:

$$W_{i,j} = (W_{i,i}, W_{i+1,j}),\ (W_{i,i+1}, W_{i+2,j}),\ \ldots,\ (W_{i,j-1}, W_{j,j})$$

Each pair $(X, Y)$ stands for the concatenations $BC$ with $B \in X$ and $C \in Y$; $W_{i,j}$ collects every variable A that has a rule $A \to BC$ for one of them.

                      $W_{1,4}$
                $W_{1,3}$   $W_{2,4}$
          $W_{1,2}$   $W_{2,3}$   $W_{3,4}$
    $W_{1,1}$   $W_{2,2}$   $W_{3,3}$   $W_{4,4}$
      $w_1$       $w_2$       $w_3$       $w_4$

• Fill the table from the bottom row upward using the formula above; each cell $W_{i,j}$ holds the set of variables that derive the substring from position i to position j
• The top cell ($W_{1,4}$ for a string of length 4) determines whether w is in L(G)
• If it contains the start symbol S, then w is in L(G)

CYK Table Filling Example
String: c b b a

    {A, C}     {B}       {B}       {A}
      c         b         b         a

The first row is not filled using the formula on the previous slide. Each cell simply holds the variables that have a rule producing that terminal symbol.

$W_{1,2} = (W_{1,1}, W_{2,2}) = \{A, C\}\{B\} = \{AB, CB\}$. Which rules produce AB or CB? Those of S and C, so $W_{1,2} = \{S, C\}$.

         {S, C}
    {A, C}     {B}       {B}       {A}
      c         b         b         a
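The table-filling formula pairs each shorter prefix cell with the matching suffix cell. As a minimal sketch of that pairing, the hypothetical helper `splits` below (a name introduced here for illustration, not taken from the slides) lists the cell pairs that are combined for a given $W_{i,j}$, using the slides' 1-based indices.

```python
def splits(i, j):
    """Return the cell pairs ((i, k), (k + 1, j)) combined to fill W[i, j].

    Indices are 1-based, as on the slides. `splits` is an illustrative
    name and is not defined in the presentation.
    """
    return [((i, k), (k + 1, j)) for k in range(i, j)]


# W[1,2] has a single split point; W[1,4] has three.
print(splits(1, 2))  # [((1, 1), (2, 2))]
print(splits(1, 4))  # [((1, 1), (2, 4)), ((1, 2), (3, 4)), ((1, 3), (4, 4))]
```

A cell for a substring of length m therefore combines m - 1 pairs, which is why the work per cell grows as the table is filled upward.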
$W_{2,3} = (W_{2,2}, W_{3,3}) = \{B\}\{B\} = \{BB\}$. Which rules produce BB? None, so $W_{2,3} = \emptyset$.

         {S, C}      ∅
    {A, C}     {B}       {B}       {A}
      c         b         b         a

$W_{3,4} = (W_{3,3}, W_{4,4}) = \{B\}\{A\} = \{BA\}$. Which rules produce BA? That of C, so $W_{3,4} = \{C\}$.

         {S, C}      ∅        {C}
    {A, C}     {B}       {B}       {A}
      c         b         b         a

$W_{1,3} = (W_{1,1}, W_{2,3}),\ (W_{1,2}, W_{3,3}) = \{A, C\}\emptyset \cup \{S, C\}\{B\} = \{SB, CB\}$. Pairing with the empty cell $W_{2,3}$ contributes no concatenations. Which rules produce SB or CB? That of C, so $W_{1,3} = \{C\}$.

               {C}
         {S, C}      ∅        {C}
    {A, C}     {B}       {B}       {A}
      c         b         b         a

$W_{2,4} = (W_{2,2}, W_{3,4}),\ (W_{2,3}, W_{4,4}) = \{B\}\{C\} \cup \emptyset\{A\} = \{BC\}$. Again the empty cell contributes nothing. Which rules produce BC? That of B, so $W_{2,4} = \{B\}$.

               {C}       {B}
         {S, C}      ∅        {C}
    {A, C}     {B}       {B}       {A}
      c         b         b         a

$W_{1,4} = (W_{1,1}, W_{2,4}),\ (W_{1,2}, W_{3,4}),\ (W_{1,3}, W_{4,4}) = \{A, C\}\{B\} \cup \{S, C\}\{C\} \cup \{C\}\{A\} = \{AB, CB, SC, CC, CA\}$. Which rules produce AB, CB, SC, CC or CA? Those of S, A and C, so $W_{1,4} = \{S, A, C\}$.
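The candidate rule bodies in each step are just the pairwise concatenations of two cells. The sketch below reproduces the candidates for $W_{1,4}$ from the cell contents shown on the slides; `concat` is a hypothetical helper name introduced for illustration, and it assumes every variable is a single character. It also shows that an empty cell contributes no candidates.

```python
def concat(left, right):
    """Pair every variable in `left` with every variable in `right`.

    Illustrative helper (not from the slides); assumes each variable
    is a single character so that "A" + "B" reads as the rule body AB.
    """
    return {b + c for b in left for c in right}


# Cell contents taken from the worked example for the string c b b a.
w11, w24 = {"A", "C"}, {"B"}
w12, w34 = {"S", "C"}, {"C"}
w13, w44 = {"C"}, {"A"}

candidates_w14 = concat(w11, w24) | concat(w12, w34) | concat(w13, w44)
print(sorted(candidates_w14))  # ['AB', 'CA', 'CB', 'CC', 'SC']

# An empty cell yields no concatenations at all.
print(concat({"A", "C"}, set()))  # set()
```

Only the candidates are computed here; deciding which variables produce them still requires the grammar's rules.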
                 {S, A, C}
               {C}       {B}
         {S, C}      ∅        {C}
    {A, C}     {B}       {B}       {A}
      c         b         b         a

The string w is in the language
• $W_{1,n}$, which here is $W_{1,4}$, contains the start symbol S

CYK Table Filling Example
String: c b a a

                      $W_{1,4}$
                $W_{1,3}$   $W_{2,4}$
          $W_{1,2}$   $W_{2,3}$   $W_{3,4}$
    $W_{1,1}$   $W_{2,2}$   $W_{3,3}$   $W_{4,4}$
        c           b           a           a

The first row is not filled using the formula on the previous slides. Each cell simply holds the variables that have a rule producing that terminal symbol.

CYK Table Filling Example
Grammar:
  S → AB | BC
  A → BA | a
  B → CC | b
  C → AB | a
String: a a b a

                      $W_{1,4}$
                $W_{1,3}$   $W_{2,4}$
          $W_{1,2}$   $W_{2,3}$   $W_{3,4}$
    $W_{1,1}$   $W_{2,2}$   $W_{3,3}$   $W_{4,4}$
        a           a           b           a
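The whole table-filling procedure can be sketched in a few lines. This is an illustrative implementation, not the presenters' code: the function names `cyk_table` and `is_member` and the dictionary encoding of the rules are assumptions made here. It uses the grammar S → AB | BC, A → BA | a, B → CC | b, C → AB | a from the slides, and the test string `aaba` is an assumption read from the last example table.

```python
from itertools import product

# Grammar from the slides (already in Chomsky Normal Form), encoded as
# lookup tables: rule body -> set of variables that produce it.
BINARY_RULES = {
    ("A", "B"): {"S", "C"},  # S -> AB, C -> AB
    ("B", "C"): {"S"},       # S -> BC
    ("B", "A"): {"A"},       # A -> BA
    ("C", "C"): {"B"},       # B -> CC
}
TERMINAL_RULES = {"a": {"A", "C"}, "b": {"B"}}


def cyk_table(word, binary_rules, terminal_rules):
    """Fill the CYK table; table[i, j] is W[i, j] with 1-based indices."""
    n = len(word)
    table = {}
    # First row: variables with a rule producing each terminal symbol.
    for i in range(1, n + 1):
        table[i, i] = set(terminal_rules.get(word[i - 1], set()))
    # Higher rows: combine every split (W[i, k], W[k + 1, j]).
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for pair in product(table[i, k], table[k + 1, j]):
                    cell |= binary_rules.get(pair, set())
            table[i, j] = cell
    return table


def is_member(word, start="S"):
    """Return True if `word` is derivable from `start` in the grammar above."""
    if not word:
        return False  # this grammar has no rule S -> lambda
    table = cyk_table(word, BINARY_RULES, TERMINAL_RULES)
    return start in table[1, len(word)]


table = cyk_table("aaba", BINARY_RULES, TERMINAL_RULES)  # assumed test string
print(sorted(table[1, 4]))  # ['A', 'C', 'S']
print(is_member("aaba"))    # True
print(is_member("bb"))      # False
```

Storing the rules keyed by their bodies makes the question "which rules produce BC?" a single dictionary lookup. The three nested loops over length, start position, and split point give the usual cubic running time in the length of the string.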