1 Language Theory and Infinite Graphs

Colin Stirling
School of Informatics
University of Edinburgh

cps marktoberdorf today

2 Example

[Figure: transition graphs with vertices A, AA, AAA, ..., C, CA and X, XX, XXX, ..., YX, YXX, and ε, with edges labelled a and b.]

[Figure: tableau over the goals C ≐ YX, AA ≐ XX, A ≐ X, ε ≐ ε, CA ≐ YXX and CX ≐ YXX, built with the rules UNF, BAL(L) and CUT.]

3 Decidability

Sketched decidability proof of language equivalence for simple grammars using a tableau proof system

Propositions
• Every tableau is finite
• α ∼ β iff the tableau with root α ≐ β is successful

Main point: given α ≐ β, every goal in the tableau has a bounded size (in terms of |α|, |β| and the size of the PDG)

4 Comments

What are the key features that underpin decidability?
• Bisimulation equivalence is a congruence with respect to stack prefixing
    if L(α) = L(β) then L(δα) = L(δβ)
  Congruence allows us to tear apart a configuration α′α and replace its tail with a potentially equivalent configuration β, giving α′β
• Bisimulation equivalence supports cancellation of postfixed stacks
    if L(αδ) = L(βδ) then L(α) = L(β)
  Cancellation, as used in CUT, has the consequence that goals become small

5 Back to DPDA

• How can we tear apart DPDA (stable) configurations pα and replace parts with parts? What is a part of a configuration? A configuration has state as well as stack
• L(pα) = L(qβ) does not, in general, imply L(pδα) = L(qδβ)
• L(pαδ) = L(qβδ) does not, in general, imply L(pα) = L(qβ)

6 Even worse

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]

• A main attack on the decision problem in the 1960/70s examined differences between stack lengths and potentially equivalent configurations, which eventually resulted in a proof of decidability for real-time DPDAs
• However, L(pY^n X) = L(pY^m X) for every m and n

7 However

Many of these problems disappear with pushdown grammars.
Despite nondeterminism, stacking is a congruence with respect to pre and postfixing for language and bisimulation equivalence, and both equivalences support pre and postfixed cancellation
1. L(α) = L(β) iff L(αδ) = L(βδ)
2. L(α) = L(β) iff L(δα) = L(δβ)
3. α ∼ β iff αδ ∼ βδ
4. α ∼ β iff δα ∼ δβ

8 Therefore ...

• This suggests that we should try to remain with PDGs
• Deterministic PDGs are too restricted
• Arbitrary PDGs are too rich (as they generate all the non-empty context-free languages)
• What is needed is a mechanism for constraining nondeterminism in a PDG

9 Textbook transformation of PDAs into PDGs

• From PDA to language equivalent PDGs (with ε-transitions)
• Introduce stack symbols [pXq] so that L([pXq]) = {w : pX −w→ qε}
• Convert basic transitions into sets of transitions:
    pX −a→ qε is {[pXq] −a→ ε}
    pX −a→ rY is {[pXq] −a→ [rYq]}
    pX −a→ rYZ is {[pXq] −a→ [rYp′][p′Zq] : p′ ∈ P}
• w ∈ L(pX1 . . . Xn) iff w ∈ L([pX1q1][q1X2q2] . . . [qn−1Xnqn]) for some qi

10 Example

[Figure: graph of a PDA with configurations pX, pXX, pXXX, ... joined by a-edges, b-edges from these into qε, qX, qXX, ... and into rε, rX, rXX, ..., and ε-edges among the r-configurations.]

• [pXq] −a→ [pXp][pXq] . . .
• [pXq] −a→ [pXq][qXq] . . .
• Introducing extra nondeterminism

11 For DPDA

• Transform into normal form PDG without ε-transitions
• We only convert transitions for a ∈ A
• A stack symbol [pXq] is an ε-symbol if pX −ε→ qε ∈ T. All ε-symbols are erased from the right hand side of any transition in the SDG
• Next, the SDG is normalised by removing all unnormed stack symbols

12 Example

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]
[pXr] −a→ [pXr]
[pXr] −c→ [pXr]
[pXp] −a→ [pXp]
[pXp] −b→ ε
[pXp] −c→ [pXp]
[pYp] −a→ ε
[pYp] −c→ [pYp][pYp]
[pYp] −c→ [pYr][rYp]
[pYr] −b→ ε
[pYr] −c→ [pYp][pYr]
[pYr] −c→ [pYr][rYr]

There are two ε-stack symbols, [rXp] and [rYr]

13 Example

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]

[pXp] −a→ [pXp]
[pXp] −b→ ε
[pXp] −c→ [pXp]
[pYp] −a→ ε
[pYp] −c→ [pYp][pYp]
[pYr] −b→ ε
[pYr] −c→ [pYr]
[pYr] −c→ [pYp][pYr]

Extra nondeterminism: consider G([pYr])

14 Main problem

• Transformation does not preserve bisimulation equivalence
• How to overcome this?
• Need to preserve structure

15 Sum configurations

• Convert transitions into transitions over SUMS:
    pX −a→ qε is [pXq] −a→ ε
    pX −a→ rY is [pXq] −a→ [rYq]
    pX −a→ rYZ is [pXq] −a→ Σ{[rYp′][p′Zq] : p′ ∈ P}
• Convert pX1 . . . Xn to Σ{[pX1q1][q1X2q2] . . . [qn−1Xnqn] : qi ∈ P}
• Language of a sum is union of languages of summands
• For DPDA determinize transitions for sum configurations: preserve structure

16 Example

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]

[pXp] −a→ [pXp]
[pXp] −b→ ε
[pXp] −c→ [pXp]
[pYp] −a→ ε
[pYp] −c→ [pYp][pYp]
[pYr] −b→ ε
[pYr] −c→ [pYr]
[pYr] −c→ [pYp][pYr]

• Let A = [pXp], B = [pYp], C = [pYr]. So, pYX is BA + C

17 Determinized transitions

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]

A −a→ A
A −b→ ε
A −c→ A
B −a→ ε
B −c→ BB
C −b→ ε
C −c→ BC + C

• Let A = [pXp], B = [pYp], C = [pYr]. So, pYX is BA + C

18 Strict deterministic grammars

Due to Harrison and Havel in 1970s

Add an extra component to a PDG
• An equivalence relation ≡ on S (stack symbols)

≡ partitions the stack symbols S into disjoint subsets S1, . . .
, Sk: for each i, and each pair of stack symbols X, Y ∈ Si, X ≡ Y

INTUITION FROM DPDA: X ≡ Y iff X = [pZq] and Y = [pZr]

19 Sequences

Extend ≡ on S to a relation on S∗

α ≡ β iff
• α = β, or
• there is a δ ∈ S∗ with α = δXα′, β = δYβ′, X ≡ Y and X ≠ Y

For instance, if Y ≡ Z then XYα ≡ XZ for any α

≡ on stacks is not an equivalence relation

20 Properties

• αβ ≡ α iff β = ε
• α ≡ β iff δα ≡ δβ
• If α ≡ β and γ ≡ δ then αγ ≡ βδ
• If α ≡ β and α ≠ β then αγ ≡ βδ
• If αγ ≡ βδ and |α| = |β| then α ≡ β

21 Strict deterministic

≡ on S is strict when
1. if X −a→ α ∈ T, Y −a→ β ∈ T and X ≡ Y then α ≡ β
2. if X −a→ α ∈ T, Y −a→ α ∈ T and X ≡ Y then X = Y

A PDG with ≡ is strict deterministic, an SDG, if its relation ≡ is strict

22 Example

X −a→ X
X −b→ ε
X −c→ X
Y −a→ ε
Y −c→ YY
Z −b→ ε
Z −c→ Z
Z −c→ YZ

[Figure: graphs of X and of Z, YZ, YYZ, ... with edges labelled a, b and c, and b-edges leading to ε.]

Strict: partition {{X}, {Y, Z}}

Y −c→ YY, Z −c→ Z, Z −c→ YZ and YY ≡ Z, YY ≡ YZ and Z ≡ YZ

23 Constrained nondeterminism

• If each set in the partition is a singleton then the SDG is deterministic, a simple grammar, and α ≡ β iff α = β
• In general, constrained nondeterminism:
    X −a→ ε ∈ T, X ≡ Y and X ≠ Y implies no a-transition Y −a→ β ∈ T
  Suppose Y −a→ β. By condition 1 of being strict ε ≡ β, so β = ε. By condition 2 of being strict X = Y, which is a contradiction.

24 Key property

Strictness conditions generalise to words and stacks
1. if α −w→ α′, β −w→ β′ and α ≡ β then α′ ≡ β′
2. if α −w→ α′, β −w→ α′ and α ≡ β then α = β

Proof in the notes page 25

25 Corollaries

• If α ≡ β and w ∈ L(α), then for all words v and a ∈ A, wav ∉ L(β)
• If α ≡ β and α ≠ β then L(α) ∩ L(β) = ∅

DPDA intuition: properties hold for [pXq], [pXr]

26 Sum configurations

• Extend configuration to sets of sequences of stack symbols, {α1, . . .
, αn}
• Write it as a sum α1 + . . . + αn without repeated elements
• Degenerate case is the empty sum, ∅
• L(α1 + . . . + αn) = ⋃{L(αi) : 1 ≤ i ≤ n}
• L(∅) = ∅

27 Admissible sums

• β1 + . . . + βn is admissible if βi ≡ βj for each i and j
• Admissible configurations are preserved by transitions
  If α1 + . . . + αn is admissible then for all w the sum configuration
    {β_i1 : α1 −w→ β_i1} ∪ . . . ∪ {β_in : αn −w→ β_in}
  is admissible
• DPDA intuition: {[pX1q1] . . . [qn−1Xnqn] : qi ∈ P} is admissible

28 Example

[Figure: graphs of X and of Z, YZ, YYZ, ... with edges labelled a, b and c, and b-edges leading to ε.]

Strict: partition {{X}, {Y, Z}}

Admissible: XX, ZZZ + ZZY, YX + Z, Z + YZ, Z + YZ + YYZ

Not admissible: X + Z, Y + ε

29 Determinization

• T is changed to Td. For each stack symbol X and a ∈ A, the family of a-transitions X −a→ α1, . . . , X −a→ αn ∈ T is determinized:
    X −a→ α1 + . . . + αn ∈ Td
• For each X and a, there is a unique transition X −a→ Σαi ∈ Td, as the sum could be ∅

30 Determinization cont.

• Prefix rule for generating transitions is generalised
  If Xi −a→ α^i_1 + . . . + α^i_{n_i} for each i : 1 ≤ i ≤ m then
    X1β1 + . . . + Xmβm −a→ α^1_1 β1 + . . . + α^1_{n_1} β1 + . . . + α^m_1 βm + . . . + α^m_{n_m} βm
• Determinized graph Gd(α1 + . . . + αn) using Td and the extended prefix rule

31 Example

S = {X, Y, Z}, A = {a, b, c} and the partition of S is {{X}, {Y, Z}}. And Td is

X −a→ X    Y −a→ ε     Z −a→ ∅
X −b→ ε    Y −b→ ∅     Z −b→ ε
X −c→ X    Y −c→ YY    Z −c→ YZ + Z

Gd(YX + Z) is

[Figure: graph with vertices X, YX + Z, YYX + YZ + Z, ... and ε, with edges labelled a, b and c.]

32 Which is ∼

[Figure: infinite graph of the configurations pX, pYX, pYYX, pYYYX, ... with edges labelled a, b and c, and b-edges leading to pε.]
33 Characterising DPDA

• A DPDA can be converted into a determinized SDG, and configuration pα is converted into sum configuration sum(pα) of the SDG where Gc(pα) ∼ Gd(sum(pα))
• (Conversely, a determinized SDG can be transformed into a DPDA)
• The DPDA equivalence problem is equivalent to deciding whether two admissible configurations of a determinized SDG are bisimulation equivalent

34 Notation

• E, F, G, . . . range over admissible configurations
• + can be extended: if E and F are admissible, E ∪ F is admissible, and E and F are disjoint, E ∩ F = ∅, then E + F is the admissible configuration E ∪ F
• Also use sequential composition: if E and F are admissible, then EF is the admissible configuration {βγ : β ∈ E and γ ∈ F}
• E = E1G1 + . . . + EnGn is a head/tail form if the head E1 + . . . + En is admissible, at least one Ei ≠ ∅, and each tail Gi ≠ ∅

If we write E = E1G1 + . . . + EnGn then E is a head/tail form

35 Congruence

• Decision question: E ∼ F?
• Assume E = E1G1 + . . . + EnGn
  – If each Hi ≠ ∅ and each Ei ≠ ε and for each j such that Ej ≠ ∅, Hj ∼_m Gj, then E ∼_{m+1} E1H1 + . . . + EnHn
  – If each Hi ≠ ∅ and for each j such that Ej ≠ ∅, Hj ∼ Gj, then E ∼ E1H1 + . . . + EnHn
• We can tear apart configurations and replace parts with equivalent parts

36 Notation

• For X ∈ S, w(X) is the unique shortest word v ∈ A+ such that X −v→ ε
• w(E) is the unique shortest word for configuration E
• M is the maximum norm of the SDG
• “E following u”, written E · u, is the unique F such that E −u→ F

37 Decision procedure

• Given by a deterministic tableau proof system where goals are reduced to subgoals
• Goals and subgoals have the form E ≐ F: is E ∼ F?
• Rules are (generalisations of) unfold, UNF, and balance rules, BAL(L) and BAL(R). (CUT free)
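
Illustration (not from the slides): the notation "E following u" and w(E) can be made concrete on the running example with S = {X, Y, Z}, A = {a, b, c} and Td as in the determinization example. The sketch below is only a possible encoding under stated assumptions: a sum configuration is a frozenset of strings, every stack symbol is a single character, ε is the empty string, and the names TD, step, following and shortest_word are hypothetical. The breadth-first search used for w(E) is one simple way to find a shortest word; it is not claimed to be how the decision procedure computes it, and it assumes the configuration is normed so that the search terminates.

```python
from collections import deque

EPS = ""  # the empty stack, written ε on the slides

# Determinized transitions Td of the running example.
# Each right-hand side is a sum, encoded as a frozenset of stack strings;
# the empty frozenset is the empty sum ∅.
TD = {
    ("X", "a"): frozenset({"X"}),
    ("X", "b"): frozenset({EPS}),
    ("X", "c"): frozenset({"X"}),
    ("Y", "a"): frozenset({EPS}),
    ("Y", "b"): frozenset(),
    ("Y", "c"): frozenset({"YY"}),
    ("Z", "a"): frozenset(),
    ("Z", "b"): frozenset({EPS}),
    ("Z", "c"): frozenset({"YZ", "Z"}),
}


def step(config, letter):
    """Extended prefix rule: rewrite the head symbol of every summand."""
    result = set()
    for stack in config:
        if stack == EPS:
            continue  # ε has no transitions
        head, tail = stack[0], stack[1:]
        for alpha in TD[(head, letter)]:
            result.add(alpha + tail)
    return frozenset(result)


def following(config, word):
    """E · u: the unique configuration reached from E by reading the word u."""
    for letter in word:
        config = step(config, letter)
    return config


def shortest_word(config, alphabet="abc"):
    """A shortest word leading from config to ε, or None if there is none.

    Breadth-first search over configurations; assumes config is normed.
    """
    target = frozenset({EPS})
    queue = deque([(config, "")])
    seen = {config}
    while queue:
        current, word = queue.popleft()
        if current == target:
            return word
        for letter in alphabet:
            nxt = step(current, letter)
            if nxt and nxt not in seen:  # skip the empty sum ∅
                seen.add(nxt)
                queue.append((nxt, word + letter))
    return None
```

With this encoding, following(frozenset({"YX", "Z"}), "c") yields the set for YYX + YZ + Z, matching the c-edge out of YX + Z in Gd(YX + Z), and shortest_word(frozenset({"YX"})) returns "ab".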