LING/C SC/PSYC 438/538

Lecture 11
Sandiway Fong
Administrivia
• Homework 3 graded
Last Time
1. Introduced Regular Languages
– can be generated by regular expressions
– or Finite State Automata (FSA)
– or regular grammars (not yet introduced)
2. Deterministic and non-deterministic FSA
3. DFSA can be easily encoded in Perl:
– hash table for the transition function
– foreach loop over a string (character by character)
– conditional to check for end state
4. NDFSA can be converted into DFSA
– example of the set of states construction
– Practice: ungraded homework exercise
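The Perl encoding in item 3 can be sketched roughly as follows. The machine (a DFSA for a+b+ with states s, x, y) and the sub name `accepts` are made up for illustration and are not taken from the lecture code.

```perl
use strict;
use warnings;

# Hypothetical DFSA for a+b+: states s, x, y; start state s; end state y.
# The transition function is a hash table keyed on state, then on character.
my %delta = (
    s => { a => 'x' },
    x => { a => 'x', b => 'y' },
    y => { b => 'y' },
);
my %final = ( y => 1 );

sub accepts {
    my ($string) = @_;
    my $state = 's';
    foreach my $c (split //, $string) {                # character by character
        return 0 unless exists $delta{$state}{$c};     # no transition: reject
        $state = $delta{$state}{$c};
    }
    return exists $final{$state} ? 1 : 0;              # conditional: end state?
}

foreach my $w ('ab', 'aabbb', 'a', 'ba', '') {
    printf "%-7s %s\n", "<$w>", accepts($w) ? 'accept' : 'reject';
}
```

Because the machine is deterministic, a single scalar `$state` is enough to track where it is.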
Ungraded Homework Exercise
• Do not submit; do the following exercise to check your
understanding
– apply the set-of-states construction technique to the two
machines on the ε-transition slide (repeated below)
– self-check your answer:
• verify in each case that the machine produced is deterministic and
accurately simulates its ε-transition counterpart
[Figure: the two ε-transition machines, each with transitions labeled a, b, and ε]
Ungraded Homework Exercise Review
• Converting a NDFSA into a DFSA
NDFSA (note: this machine with an ε-transition is non-deterministic):
[Figure: states 1, 2, 3; transitions labeled a, b, ε]
DFSA (note: this machine is deterministic):
[Figure: states {1,3}, {2}, {3}; transitions labeled a, b]
Ungraded Homework Exercise Review
• Converting a NDFSA into a DFSA
NDFSA (note: this machine with an ε-transition is non-deterministic):
[Figure: states 1, 2, 3; transitions labeled a, b, ε]
DFSA (note: this machine is deterministic):
[Figure: states {1,2}, {2}, {3}; transitions labeled a, b, b]
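A minimal Perl sketch of the set-of-states construction, for self-checking. The arcs of the two machines are an assumption (one reading of the diagrams: a from 1 to 2, b from 2 to 3, with the ε-transition from 1 to 3 in the first machine and from 1 to 2 in the second). The names `closure`, `determinize`, and the key `eps` are hypothetical.

```perl
use strict;
use warnings;

# Assumed arcs (a reading of the slide diagrams); 'eps' marks an ε-transition.
my %machine1 = ( 1 => { a => [2], eps => [3] }, 2 => { b => [3] }, 3 => {} );
my %machine2 = ( 1 => { a => [2], eps => [2] }, 2 => { b => [3] }, 3 => {} );

# ε-closure: every state reachable from @states using only ε-transitions.
sub closure {
    my ($nfa, @states) = @_;
    my %seen = map { $_ => 1 } @states;
    my @todo = @states;
    while (defined(my $q = shift @todo)) {
        foreach my $r (@{ $nfa->{$q}{eps} || [] }) {
            push @todo, $r unless $seen{$r}++;
        }
    }
    return sort keys %seen;
}

# Each DFSA state is a set of NDFSA states, named by joining its members ("1,3").
sub determinize {
    my ($nfa, $start) = @_;
    my $start_name = join ',', closure($nfa, $start);
    my %dfa;
    my @todo = ($start_name);
    while (defined(my $name = shift @todo)) {
        next if exists $dfa{$name};            # already expanded
        $dfa{$name} = {};
        my %targets;                           # symbol => set of states reached
        foreach my $q (split /,/, $name) {
            foreach my $sym (grep { $_ ne 'eps' } keys %{ $nfa->{$q} }) {
                $targets{$sym}{$_} = 1 for @{ $nfa->{$q}{$sym} };
            }
        }
        foreach my $sym (sort keys %targets) {
            my $next = join ',', closure($nfa, keys %{ $targets{$sym} });
            $dfa{$name}{$sym} = $next;
            push @todo, $next;
        }
    }
    return ($start_name, \%dfa);
}

foreach my $nfa (\%machine1, \%machine2) {
    my ($start, $dfa) = determinize($nfa, 1);
    printf "start: {%s}\n", $start;
    foreach my $from (sort keys %$dfa) {
        foreach my $sym (sort keys %{ $dfa->{$from} }) {
            printf "{%s} --%s--> {%s}\n", $from, $sym, $dfa->{$from}{$sym};
        }
    }
}
```

Under these assumed arcs, the start states come out as {1,3} and {1,2}, which agrees with the deterministic machines on the review slides.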
Last Time
Regular Languages
• Three formalisms
– All formally equivalent (no difference in expressive power)
– i.e. if you can encode it using a RE, you can do it using a FSA or regular grammar,
and so on …
[Figure: Regular Languages shown as the common ground of Regular Expressions, FSA, and Regular Grammars; Perl regular expressions extend beyond it ("stuff out here")]
talk more about formal equivalence later today…
Perl Regular Expressions
• Perl regexes can include backreferences to groupings (i.e. \1, etc.)
– backreferences give Perl regexes expressive power beyond regular
languages:
• the set of prime numbers is not a regular language
Lprime = {2, 3, 5, 7, 11, 13, 17, 19, 23, ...}
(can be proved using the Pumping Lemma for regular languages, later)
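One well-known way to see this extra power is a primality test written with a backreference. The sketch below assumes numbers are written in unary (5 is the string "11111"), which the slide does not state. The sub name `is_prime` is hypothetical.

```perl
use strict;
use warnings;

# Assumption: n is encoded in unary as a string of n 1s.
# /^(11+)\1+$/ matches when the string splits into two or more equal
# blocks of length >= 2, i.e. exactly when n is composite.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    my $unary = '1' x $n;
    return $unary =~ /^(11+)\1+$/ ? 0 : 1;
}

print join(' ', grep { is_prime($_) } 1 .. 23), "\n";
```

Running it prints 2 3 5 7 11 13 17 19 23, the members of Lprime listed above. The group (11+) guesses a divisor and \1+ repeats it, which is the memory an FSA lacks.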
Backreferences and FSA
• Deep question:
– why are backreferences impossible in FSA?
Example:
Suppose you wanted a machine that accepted /(a+b+)\1/
One idea: link two copies of the machine together
Doesn’t work! Why?
[Figure: two linked copies of the a+b+ machine; states s, x, y, x2, y2; transitions labeled a and b]
• Perl implementation:
– how to modify it to get the backreference effect?
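A small Perl comparison hints at one answer. It assumes the two linked copies behave like the plain regex a+b+a+b+ (a reading of the diagram, not stated on the slide). The ^ and $ anchors and both sub names are additions for this sketch.

```perl
use strict;
use warnings;

# Assumed behavior of the two linked copies: a+b+ followed by a+b+.
sub linked_copies {
    my ($s) = @_;
    return $s =~ /^a+b+a+b+$/ ? 1 : 0;
}

# Backreference version: the second half must repeat the first half exactly.
sub backref_copy {
    my ($s) = @_;
    return $s =~ /^(a+b+)\1$/ ? 1 : 0;
}

foreach my $s ('abab', 'aabaab', 'aabab', 'abbab') {
    printf "%-7s linked copies: %-3s  backreference: %s\n",
        $s,
        linked_copies($s) ? 'yes' : 'no',
        backref_copy($s)  ? 'yes' : 'no';
}
```

Strings such as aabab get through the linked copies but not the backreference. The second copy has no record of how many a's and b's the first copy read.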
Regular Languages and FSA
• Formal (constructive) set-theoretic definition of a regular language
• Correspondence between REs and Regular Languages
• concatenation (juxtaposition)
• union (| also [ ])
• Kleene closure (*), where x+ = xx*
• Note:
• backreferences are memory devices and thus are too powerful
• e.g. L = {ww} and prime number testing (earlier slides)
Regular Languages and FSA
• Other closure properties:
• Not true higher up: e.g. context-free grammars as we’ll see later
Equivalence: FSA and Regexs
Textbook gives one direction only
• Case by case:
a) Empty string
b) Empty set
c) Any character from the alphabet
Equivalence: FSA and Regexs
• Concatenation:
– Link final state of FSA1 to initial state of FSA2 using an empty transition
Note: empty transition can be eliminated using the set of states construction
(see earlier slides in this lecture)
Equivalence: FSA and Regexs
• Kleene closure:
– repetition operator: zero or more times
– use empty transitions for loopback and bypass
Equivalence: FSA and Regexs
• Union: aka disjunction
– Non-deterministically run both FSAs at the same time, accept if either one accepts
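The three constructions (concatenation, Kleene closure, union) can be sketched in Perl with explicit empty transitions. This is an illustration under assumptions, not the textbook's construction verbatim. Each machine is assumed to have one start and one final state, an undef label stands for an empty transition, and all sub names are hypothetical.

```perl
use strict;
use warnings;

my $next_id = 0;                          # source of fresh state numbers
sub new_state { return $next_id++ }

# Single character c: start --c--> final. A transition is [from, label, to].
sub char_fsa {
    my ($c) = @_;
    my ($s, $f) = (new_state(), new_state());
    return { start => $s, final => $f, trans => [ [ $s, $c, $f ] ] };
}

# Concatenation: empty transition from the final state of FSA1 to the start of FSA2.
sub concat_fsa {
    my ($m1, $m2) = @_;
    return { start => $m1->{start}, final => $m2->{final},
             trans => [ @{ $m1->{trans} }, @{ $m2->{trans} },
                        [ $m1->{final}, undef, $m2->{start} ] ] };
}

# Union: new start state with empty transitions into both machines,
# and empty transitions from both final states into a new final state.
sub union_fsa {
    my ($m1, $m2) = @_;
    my ($s, $f) = (new_state(), new_state());
    return { start => $s, final => $f,
             trans => [ @{ $m1->{trans} }, @{ $m2->{trans} },
                        [ $s, undef, $m1->{start} ], [ $s, undef, $m2->{start} ],
                        [ $m1->{final}, undef, $f ], [ $m2->{final}, undef, $f ] ] };
}

# Kleene closure: empty transitions for loopback and bypass.
sub star_fsa {
    my ($m) = @_;
    my ($s, $f) = (new_state(), new_state());
    return { start => $s, final => $f,
             trans => [ @{ $m->{trans} },
                        [ $s, undef, $m->{start} ], [ $m->{final}, undef, $f ],
                        [ $m->{final}, undef, $m->{start} ],    # loopback
                        [ $s, undef, $f ] ] };                  # bypass
}

# Follow empty transitions from a set of states until nothing new is added.
sub closure {
    my ($m, @states) = @_;
    my %seen = map { $_ => 1 } @states;
    my @todo = @states;
    while (defined(my $q = shift @todo)) {
        foreach my $t (@{ $m->{trans} }) {
            my ($from, $label, $to) = @$t;
            next unless $from == $q && !defined $label;
            push @todo, $to unless $seen{$to}++;
        }
    }
    return keys %seen;
}

# Non-deterministic run: track the set of states the machine could be in.
sub accepts {
    my ($m, $string) = @_;
    my @current = closure($m, $m->{start});
    foreach my $c (split //, $string) {
        my %in = map { $_ => 1 } @current;
        my @moved = map  { $_->[2] }
                    grep { $in{ $_->[0] } && defined $_->[1] && $_->[1] eq $c }
                    @{ $m->{trans} };
        @current = closure($m, @moved);
    }
    my $hits = grep { $_ == $m->{final} } @current;
    return $hits ? 1 : 0;
}

# (a|b)*c assembled from the three constructions
my $m = concat_fsa(star_fsa(union_fsa(char_fsa('a'), char_fsa('b'))), char_fsa('c'));
foreach my $w ('c', 'abbac', 'ab', 'ca') {
    printf "%-6s %s\n", $w, accepts($m, $w) ? 'accept' : 'reject';
}
```

The simulation keeps a set of current states instead of a single state. This is the same idea the set-of-states construction compiles ahead of time.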
Regular Languages and FSA
• Other closure properties:
Let’s consider building the FSA machinery for each of these guys in turn…
Regular Expressions from FSA
Textbook Exercise: find a RE for
Examples (* denotes string not in the language):
*ab
*ba
bab
λ (empty string)
bb
*baba
babab
Regular Expressions from FSA
• Draw a FSA and convert it to a RE:
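[Figure (PowerPoint animation): FSA with states 1, 2, 3, 4 and transitions labeled a, b, and ε, reduced step by step; intermediate edge labels include b*, b, and (ab+)+]
Resulting RE: b+(ab+)* | ε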
Regular Expressions from FSA
• Perl implementation:
my $s = "ab ba bab bb baba babab";
while ($s =~ /\b(b+(ab+)*)\b/g) {
    print "<$1> match!\n";
}
Note: doesn’t include the empty string case
Note: /../g global flag for multiple matches
• Output:
perl test.perl
<bab> match!
<bb> match!
<babab> match!
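The empty string case can be covered by testing each word on its own against the full RE b+(ab+)* | ε. The sketch below replaces \b with ^ and $ so that the empty string can be tested too. The sub name `in_language` and the word list (taken from the textbook examples) are choices made for this illustration.

```perl
use strict;
use warnings;

# Whole-string match against b+(ab+)* | ε, written as an optional group.
sub in_language {
    my ($w) = @_;
    return $w =~ /^(?:b+(?:ab+)*)?$/ ? 1 : 0;
}

foreach my $w ('ab', 'ba', 'bab', '', 'bb', 'baba', 'babab') {
    printf "<%s> %s\n", $w, in_language($w) ? 'match!' : 'no match';
}
```

With this version the empty string is reported as a match, alongside bab, bb, and babab.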