Fachbereich Mathematik Kord Eickmeyer Summer Term 2015 Exercise Sheet no. 1 – Formal Foundations of Computer Science (E1.1) [warm-up: bracketing and regular expressions] Give all α ∈ REG({0, 1}) which, after removing all parentheses, lead to 0+10∗. Which is the one we mean by 0+10∗? What are the languages L(α) for these α? Solution. Possible bracketings are: (i) (0+1)(0∗) (ii) ((0+1)0)∗ (iii) (0+((10)∗)) (iv) (0+(1(0∗))) (v) (0+(10))∗ Of these, (0+(1(0∗))) is what we usually mean. (E1.2) [a sample DFA] Consider the following DFA over the alphabet Σ = {0, 1}: 1 0 q0 1 q1 0 (i) Write this automaton formally as a tuple (Q, Σ, δ, q0 , F ), i.e. give the sets Q and F , the initial state q0 and the transition function δ. (ii) Describe in words the language recognised by this automaton. (iii) Give a regular expression describing the same language. Solution. (i) A = (Q, Σ, δ, q0 , F ) with Q := {q0 , q1 }, Σ := {0, 1}, 0 1 δ := q0 q0 q1 q1 q0 q1 F := {q1 }. (ii) The automaton accepts exactly those words whose last letter is a 1. (iii) L(A) = L((0+1)∗1). (E1.3) [regular languages] For each of these languages, give a regular expression Ei and a deterministic finite automaton Ai such that Li = L(Ei ) = L(Ai ): L1 := {w ∈ {a, b, c}∗ |w|a ≥ 1 and |w|b ≥ 2}, L2 := {w ∈ {0, 1}∗ there is at most one pair of adjacent 1s in w}, L3 := {w ∈ {0, 1}∗ every pair of adjacent 0s appears before any pair of adjacent 1s}. Solution. Let α := (a+b)∗. Then L1 = L(αaαbαbα + αbαaαbα + αbαbαaα). An automaton for L1 is a a a, b ≥1, 0 b ≥1, 1 b a a 0, 0 ≥1, ≥2 b b 0, 1 a b 0, ≥2 Here, a state label 0, ≥2 means that words for which the automaton is in this state have 0 a’s and at least two b’s. (E1.4) [converting NFAs to DFAs] Sketch a transition diagram for the NFA A := ({p, q, r, s, t}, {0, 1}, δ, p, {s, t}), with δ as below. Convert it into a DFA, sketch the resulting transition diagram. Informally describe the language it accepts. δ := p q r s t 0 {p, q} {r, s} {p, r} ∅ ∅ 1 {p} {t} {t} ∅ ∅ (E1.5) [syntactic sugar for regular expressions] In applications, regular expressions are often allowed to contain additional forms of subexpressions. Let Σ be a finite alphabet and define the set EREG(Σ) of extended regular expressions as • REG(Σ) ⊆ EREG(Σ), • E+ ∈ EREG(Σ) for every E ∈ EREG(Σ), and • E{m, n} ∈ EREG(Σ) for every E ∈ EREG(Σ) and m ≤ n ∈ N. The inteded meaning of the additional constructs E+ and E{m, n} is that L(E+) := {w1 , . . . , wk k ≥ 1 and w1 , . . . , wk , ∈ L(E)}, L(E{m, n}) := {w1 , . . . , wk m ≤ k ≤ n and w1 , . . . , wk ∈ L(E)}. Show that these extensions do not increase the class of of definable languages: For every α ∈ EREG(Σ) there is an α0 ∈ REG(Σ) such that L(α) = L(α0 ). (E1.6) [outlook: languages which are not regular] Consider the alphabet Σ = {(, ), a, ◦} and the language L ⊆ Σ∗ defined as the smallest language for which • a∈L • for every α1 , . . . , αk ∈ L also (α1 ◦ · · · ◦ αk ) ∈ L. In other words, L contains all correctly bracketed expressions: L = {a, (a ◦ a), (a ◦ a ◦ a), (a ◦ (a ◦ a)), ((a ◦ a) ◦ a), . . .}. We will see later that L is not a regular language. Though we lack the tools to prove this now, it is instructive to try constructing a regular expression or an automaton for L to see where you get stuck. If we restrict the nesting depth in the bracketed expressions, the resulting languages do become regular: Define L0 := {a}, Ln+1 := Ln ∪ {(α1 ◦ . . . ◦ αk ) | α1 , . . . , αk ∈ Ln }. Then each Ln is regular. For each n ≥ 0, give an automaton A with L(A) = Ln .