Language Recognition


INFO 2950

Prof. Carla Gomes gomes@cs.cornell.edu

Module

Modeling Computation:

Language Recognition

Rosen, Chapter 12.4

What sets can be recognized by finite-state automata?

Regular Sets


Regular Sets

Definition: A regular set is a set that can be generated starting from the empty set ∅, the empty string λ, and single elements of the alphabet, using concatenations, unions, and Kleene closures in arbitrary order.

We will give a more precise definition after we define a regular expression.

Regular Expressions

Definition: The regular expressions over a set I are defined recursively by:

– ∅ (the empty set) is a regular expression,

– λ (the set containing the empty string) is a regular expression,

– x is a regular expression for all x ∈ I,

– (AB), (A ∪ B), and A* are regular expressions if A and B are regular expressions.

Definition: A regular set is a set represented by a regular expression.
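The recursive definition translates directly into a recursive procedure. The sketch below is illustrative only: the tuple encoding of expressions and the function name `strings` are assumptions made for this example, not notation from the lecture, and symbols are assumed to be single characters. Because A* is usually infinite, the function only returns the strings up to a chosen length.

```python
def strings(expr, max_len):
    """Return the strings of length <= max_len in the set represented by expr.

    Expressions are nested tuples (a hypothetical encoding):
    ("empty",), ("lambda",), ("sym", x), ("cat", A, B), ("union", A, B), ("star", A).
    """
    kind = expr[0]
    if kind == "empty":      # ∅, the empty set
        return set()
    if kind == "lambda":     # λ, the set containing the empty string
        return {""}
    if kind == "sym":        # a single symbol x in I
        return {expr[1]} if max_len >= 1 else set()
    if kind == "union":      # (A ∪ B)
        return strings(expr[1], max_len) | strings(expr[2], max_len)
    if kind == "cat":        # (AB)
        left = strings(expr[1], max_len)
        right = strings(expr[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if kind == "star":       # A*
        base = strings(expr[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            # Append one more copy of A to the newest strings; keep only new results.
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")


# 0(0 ∪ 1)*1 written in the tuple encoding
zero, one = ("sym", "0"), ("sym", "1")
expr = ("cat", zero, ("cat", ("star", ("union", zero, one)), one))
print(sorted(strings(expr, 3)))
```

Each branch corresponds to one clause of the definition, so the regular set of an expression is built from the regular sets of its parts.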

Regular Expression Example

Examples: 001*, 1(0 ∪ … (0 ∪ 1)*11, and AB*C are regular expressions.

The regular set defined by the regular expression 01* is the set of strings consisting of a 0 followed by zero or more 1s.

The regular set defined by (10)* is the set of strings consisting of zero or more copies of 10.

The regular set defined by 0(0 ∪ 1)*1 is the set of all binary strings beginning with 0 and ending with 1.

The regular set defined by (0 ∪ 1)1(0 ∪ 1) is the set of strings {010, 011, 110, 111}.
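These examples can be checked mechanically. The sketch below uses Python's `re` module, which writes union as `|` rather than ∪. `re.fullmatch` is used so that the whole string, not just a substring, must belong to the set. The helper name `language` and the length bounds are illustrative choices, not part of the lecture.

```python
import re
from itertools import product


def language(pattern, max_len, alphabet="01"):
    """Return all strings over `alphabet` of length <= max_len matched by `pattern`."""
    regex = re.compile(pattern)
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return [w for w in words if regex.fullmatch(w)]


print(language(r"01*", 3))          # a 0 followed by zero or more 1s
print(language(r"(10)*", 4))        # zero or more copies of 10
print(language(r"0(0|1)*1", 3))     # begins with 0 and ends with 1
print(language(r"(0|1)1(0|1)", 3))  # the four strings with a 1 in the middle
```

Listing the short members of each set is a quick sanity check on an informal description, though it cannot prove the description right for all lengths.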

What are the strings represented by

10*

A 1 followed by any number of 0s (including no 0s)

(10)*

Any number of copies of 10 (including null string)

0 ∪ 01

The string 0 or the string 01

0(0 ∪ 1)*

Any string beginning with 0

(0*1)*

Any string not ending with a 0 (including null string)


Find a regular expression

The set of bit strings with even length

(00 ∪ 01 ∪ 10 ∪ 11)*

The set of bit strings ending with a 0 and not containing 11

Nonempty concatenations of 0 and 10 (the null string is excluded)

(0 ∪ 10)*(0 ∪ 10)


The set of bit strings containing an odd number of 0s

At least one 0

Zero or more 1s, followed by a 0, followed by zero or more 1s, then any number of further pairs of 0s

1*01*(01*01*)*
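A proposed regular expression can be tested against the English description by brute force over all short bit strings. The sketch below does this for the three answers above, writing ∪ as `|` in Python's `re` syntax. The names `bit_strings`, `CASES`, and `mismatches`, and the bound of 10, are assumptions made for this example. Agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product


def bit_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


# Each entry pairs a pattern with a membership test written from the description.
CASES = {
    "even length": (r"(00|01|10|11)*", lambda w: len(w) % 2 == 0),
    "ends with 0, no 11": (r"(0|10)*(0|10)",
                           lambda w: w.endswith("0") and "11" not in w),
    "odd number of 0s": (r"1*01*(01*01*)*", lambda w: w.count("0") % 2 == 1),
}


def mismatches(pattern, predicate, max_len=10):
    """Return the bit strings on which the pattern and the predicate disagree."""
    regex = re.compile(pattern)
    return [w for w in bit_strings(max_len)
            if bool(regex.fullmatch(w)) != predicate(w)]


for name, (pattern, predicate) in CASES.items():
    print(name, mismatches(pattern, predicate))
```

An empty list for a case means the expression and the description agree on every bit string up to the chosen length.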


Regular Expression Applications

Regular expressions are actually used quite often in computer science.

For instance, if you are editing a file with vi , and want to see if it contains the string blah followed by a number followed by any character followed by the letter Q , you can use the regular expression blah[0-9][0-9]*.Q

This works because vi uses regular expressions for searching.
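The same pattern can be tried outside the editor. The sketch below assumes Python's `re` module, whose syntax agrees with vi for the constructs used here (a character class, `*`, and `.`). The function name `contains_match` and the sample lines are made up for illustration.

```python
import re

# The pattern from the text, unchanged: "blah", one or more digits,
# any single character, then the letter Q.
PATTERN = re.compile(r"blah[0-9][0-9]*.Q")


def contains_match(line):
    """Return True if some substring of `line` matches the pattern."""
    return PATTERN.search(line) is not None


for line in ["xx blah42-Q yy", "blah7Q", "blah77Q", "blahQ"]:
    print(repr(line), contains_match(line))
```

Note that `blah7Q` does not match, because after the required digit there must still be one more character before the Q, whereas `blah77Q` does match, with the second 7 playing the role of "any character".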

Regular Expression    Regular Grammar

a*                    S → λ | aS

(a+b)*                S → λ | aS | bS

a* + b*               S → λ | A | B,  A → a | aA,  B → b | bB

a*b                   S → b | aS

ba*                   S → bA,  A → λ | aA

(ab)*                 S → λ | abS

EXAMPLE 1

Consider the language {a^m b^n | m, n ∈ N}, which is represented by the regular expression a*b*.

A regular grammar for this language can be written as follows:

S → λ | aS | B

B → b | bB.
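To see that this grammar and the expression a*b* describe the same strings, one can enumerate everything the grammar derives up to a given length and compare. The sketch below is an illustrative breadth-first search over sentential forms. The dictionary encoding of the productions (with "" standing for λ) and the function name `generate` are assumptions of this example.

```python
import re
from collections import deque

# Productions of the grammar from Example 1; "" stands for the empty string λ.
PRODUCTIONS = {
    "S": ["", "aS", "B"],
    "B": ["b", "bB"],
}


def generate(max_len):
    """Return all terminal strings of length <= max_len derivable from S."""
    results = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, or None if the form is all terminals.
        idx = next((i for i, ch in enumerate(form) if ch in PRODUCTIONS), None)
        if idx is None:
            results.add(form)
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            # Prune forms that already contain too many terminal symbols.
            terminals = sum(1 for ch in new_form if ch not in PRODUCTIONS)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


derived = generate(5)
print(sorted(derived, key=lambda w: (len(w), w)))
print(all(re.fullmatch(r"a*b*", w) for w in derived))
```

Every derived string matches a*b*, and counting the strings a^m b^n with m + n at most 5 gives the same number that the search finds.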

Grammars, Expressions, and Automata

• Consider the set

A = {binary strings which start with 0 and end with 1}

We saw that A is recognized by a finite-state automaton.

A is generated by the grammar with V = {S, A, 0, 1}, T = {0, 1}, and P = {S → 0A, A → 0A, A → 1A, A → 1}.

We also saw that A is defined by the regular expression 0(0 ∪ 1)*1.

• This is no coincidence, as we will see next.
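The agreement between the automaton view and the expression view of A can be checked on short strings. The automaton in this sketch is one possible deterministic machine for A, written for this example. Its state names (`start`, `last0`, `last1`, `dead`) are assumptions and need not match the automaton drawn in the lecture.

```python
import re
from itertools import product

# Transition table of a deterministic automaton for A (illustrative state names).
TRANSITIONS = {
    ("start", "0"): "last0", ("start", "1"): "dead",
    ("last0", "0"): "last0", ("last0", "1"): "last1",
    ("last1", "0"): "last0", ("last1", "1"): "last1",
    ("dead", "0"): "dead",   ("dead", "1"): "dead",
}
ACCEPTING = {"last1"}  # the string began with 0 and the last symbol read was 1


def dfa_accepts(word):
    """Run the automaton on `word` and report whether it ends in an accepting state."""
    state = "start"
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


def disagreements(max_len):
    """Bit strings on which the automaton and the expression 0(0|1)*1 differ."""
    regex = re.compile(r"0(0|1)*1")
    words = ("".join(p) for n in range(max_len + 1)
             for p in product("01", repeat=n))
    return [w for w in words if dfa_accepts(w) != bool(regex.fullmatch(w))]


print(disagreements(8))
```

An empty result means the two descriptions accept exactly the same bit strings up to length 8.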

Three Equivalent Representations

Regular expressions

Finite automata

Each can describe the others

Regular languages

Kleene’s Theorem:

For every regular expression, there is a deterministic finite-state automaton that defines the same language, and vice versa.

Grammars, Expressions, and Automata

Theorem: Let L be a language. The following three statements are equivalent:

– L is a regular set (that is, L is represented by a regular expression)

– L is a regular language (that is, L is generated by a regular grammar)

– L is recognized by a finite-state automaton

• Put another way, L is a regular set if and only if L is a regular language if and only if L is recognized by a finite-state automaton.

• In other words, regular sets, regular languages, and languages recognized by finite-state automata are all the same thing.
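One direction of this equivalence can be sketched in code: a regular grammar can be read as a nondeterministic finite-state automaton whose states are the nonterminals. The example uses the grammar S → 0A, A → 0A, A → 1A, A → 1 given earlier for the strings that start with 0 and end with 1. The extra accepting state `F` and the tuple encoding are assumptions of this sketch, and the simple construction shown relies on this grammar having no λ-productions.

```python
from itertools import product

# Each production is (nonterminal, terminal, next nonterminal or None).
# None marks a production of the form A -> a, which ends the derivation.
PRODUCTIONS = [
    ("S", "0", "A"),
    ("A", "0", "A"),
    ("A", "1", "A"),
    ("A", "1", None),
]
FINAL = "F"  # hypothetical extra accepting state, reached by productions A -> a


def accepts(word, start="S"):
    """Simulate the nondeterministic automaton read off from PRODUCTIONS."""
    current = {start}  # the set of states the automaton could be in
    for symbol in word:
        current = {
            (nxt if nxt is not None else FINAL)
            for (src, term, nxt) in PRODUCTIONS
            if src in current and term == symbol
        }
    return FINAL in current


words = ["".join(p) for n in range(7) for p in product("01", repeat=n)]
print(all(accepts(w) == (w.startswith("0") and w.endswith("1")) for w in words))
```

Tracking a set of possible states is what makes the simulation deterministic. The same idea underlies the standard subset construction.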

Example

Example: What language does the following finite-state automaton recognize?

Complex Example Continued

• If we start by going to state S1, we can recognize 000, 0110, 011100, 0111110, 011111100, 00100, 0010100, 01110110, 01110100, …

• It is not easy to see the pattern right away, but notice that they

Start with 0

Can have any number of instances of 111 or 01 interleaved

Can then have either 00 or 110

Can end with any number of 1s.

• These are all of the form 0(111 ∪ 01)*(00 ∪ 110)1*

• But we can also start by going to S6

Complex Example Continued

• If we start by going to S6, we notice that the strings

Start with 1

Have any number of occurrences of 01

Have a 1

End with as many 0s as we want

• These are of the form

1(01)*10*

• Thus, we can recognize

(0(111 ∪ 01)*(00 ∪ 110)1*) ∪ (1(01)*10*)

Limitations

Problem: Find a finite-state automaton that recognizes the following language

L = {0^n 1^n | n = 0, 1, 2, …}

• Solution: It cannot be done.

• Proof: Take an advanced course.

• Can you describe L with a regular expression?

• Can you give a regular grammar that generates L ?

• Can you give any grammar that generates L ?
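For the last question, one grammar that generates L is S → 0S1 | λ. It is context-free but not regular, because the production 0S1 places terminals on both sides of the nonterminal. The sketch below follows that grammar directly. The function name `in_L` is an illustrative choice. The recursion depth grows with n, which hints at why a machine with a fixed number of states cannot do the same job.

```python
def in_L(word):
    """Return True if `word` has the form 0^n 1^n, following S -> 0S1 | λ."""
    if word == "":
        return True                  # S -> λ
    if len(word) >= 2 and word[0] == "0" and word[-1] == "1":
        return in_L(word[1:-1])      # S -> 0S1: strip one 0 and one 1, then recurse
    return False


for w in ["", "01", "0011", "001", "0101", "10"]:
    print(repr(w), in_L(w))
```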

Models of computing

Machine                    Language class

DFA                        Regular languages

Pushdown automata          Context-free

Bounded Turing machines    Context-sensitive

Turing machines            Phrase-structure


Summary

• Hopefully it is clear that although finite-state machines and finite-state automata are useful models of computation, they have serious limitations.

• Are there more powerful ways to model computation?

• The answer is: Yes.

• Some more powerful models include

Pushdown automata

Linear bounded automata

Turing machines

Quantum computation models
