August 25

Class 1
What is this course all about?
We will look at languages in the abstract sense: alphabets, grammar rules, and the machines that
recognize the languages. There are four classes of languages usually studied in formal
language theory: regular (Type 3), context-free (Type 2), context-sensitive (Type 1), and
recursively enumerable or unrestricted (Type 0). Each language type has a different set of
rules that its grammars must obey, and each is recognized by a different kind of machine.
We will focus primarily on the regular and context-free languages.
We also consider more abstract concepts such as decidability: given a particular question, is
it possible to give a correct answer for every instance of the problem? In other words, if we
have an algorithm to solve the problem, that algorithm must terminate and return a correct answer
for every possible instance of the problem. The classic undecidable problem is the halting problem.
(This is typically proven for a Turing machine: given a Turing machine M and an input string w,
does M halt when run on w?) Basically, the problem asks: given a program and an input for the
program, will the program halt when run on that input? (This would be a handy tool to have, since
you could use it to avoid infinite loops in a program.) We could think of this as a function called
HALT that takes a program and its input and decides (correctly) whether the program will halt on
that input.
Let P be a program that contains the function HALT. Like HALT, P takes as input a program
and that program’s input. Here’s how P operates:
Program P (program, input, output)
    read program and its input
    answer ← HALT(program, input)
    if the answer is “no”, then stop
    else {answer is “yes”}
        i ← 2
        while i < 5
            i ← i + 0    {i never changes, so the loop never ends}
        end
Now, suppose the input to program P is a copy of P itself, together with an input for that copy.
Consider what happens. If HALT says “yes”, it means P doesn’t go into an infinite loop, but the way
P is designed, this is exactly the situation in which P goes into an infinite loop. If HALT says
“no”, it means P runs forever, but in that case P stops immediately. Either way, HALT’s answer is
wrong. Basically, what we have is that program P goes into an infinite loop if and only if P halts,
which is a contradiction.
Thus, no such algorithm can exist. In other words, it is not decidable whether a program will
halt on a given input. That is, there are some questions that cannot be answered by a computer. This is the
basic idea behind decidability. We’ll see more of this later when we ask questions like “Is the
complement of a context-free language also context-free?”
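
The construction of P can be sketched in Python. This is only an illustration under stated assumptions: `halts` stands for any candidate implementation of HALT (the argument shows no correct one exists), `p_behavior` and `candidate_is_wrong_about_p` are hypothetical helper names, and the infinite loop is modeled by returning the label "loops forever" instead of actually looping, so that the sketch itself terminates.

```python
def p_behavior(halts, program, program_input):
    """Model what program P does, given a candidate HALT function.

    `halts(program, program_input)` is an assumed decider that returns
    True ("yes, it halts") or False ("no, it runs forever").
    """
    answer = halts(program, program_input)
    if not answer:
        return "halts"          # HALT said "no", so P stops right away
    return "loops forever"      # HALT said "yes", so P enters its endless loop


def candidate_is_wrong_about_p(halts):
    """Check whether a candidate decider mispredicts P run on a copy of P."""
    predicted_to_halt = halts("P", "P")
    actually_halts = p_behavior(halts, "P", "P") == "halts"
    return predicted_to_halt != actually_halts


# Two trivial candidate deciders (illustrative only); both fail on P.
always_yes = lambda program, program_input: True
always_no = lambda program, program_input: False
print(candidate_is_wrong_about_p(always_yes))  # True
print(candidate_is_wrong_about_p(always_no))   # True
```

Whatever a deterministic candidate answers for P, the modeled behavior of P is the opposite, which is the contradiction the argument relies on.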
Now, let’s define some terms we will be using throughout the semester.
In order to define a language we need to begin with an alphabet, a finite set of symbols
denoted by Σ. From the individual symbols, strings (finite sequences of symbols) are
constructed. A language is any set of strings over an alphabet. The length of a string w,
denoted |w|, is the number of symbols in it. The empty string (length 0) is denoted by ε. Other
terms we use are concatenation (“hooking” two strings together), reversal (reversing the order of
the symbols in the string), substring (a contiguous sequence of symbols from a string), prefix (a
substring that begins with the first symbol, if nonempty), and suffix (a substring that ends with
the last symbol, if nonempty).
For example, consider the strings x and y below:
x = abba
y = bbba
They can be concatenated in two ways: xy = abbabbba, while yx = bbbaabba.
The reversal of y, denoted y^R, is abbb.
The prefixes of x are ε, a, ab, abb, and abba.
The suffixes of y are ε, a, ba, bba, and bbba.
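
The same operations can be tried out with Python strings, where the empty string is written `""`. This is a minimal sketch; the helper names `prefixes`, `suffixes`, and `substrings` are made up for illustration and are not part of the course material.

```python
def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w):
    """All suffixes of w, from the empty string up to w itself."""
    return [w[len(w) - i:] for i in range(len(w) + 1)]


def substrings(w):
    """All distinct contiguous substrings of w, including the empty string."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


x = "abba"
y = "bbba"

print(x + y)        # concatenation xy: abbabbba
print(y + x)        # concatenation yx: bbbaabba
print(y[::-1])      # reversal of y: abbb
print(len(x))       # length |x|: 4
print(prefixes(x))  # ['', 'a', 'ab', 'abb', 'abba']
print(suffixes(y))  # ['', 'a', 'ba', 'bba', 'bbba']
```

Every prefix and every suffix of a string is also one of its substrings, which can be checked with `set(prefixes(x)) <= substrings(x)`.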
The notation Σ* is used for the set of all finite-length strings w (including the empty string)
that can be obtained by using the symbols in the alphabet Σ. Note that while all strings in Σ*
have finite length, the set itself is infinite (for any nonempty alphabet).
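
One way to see that the set of all strings over an alphabet is infinite even though each string is finite is to enumerate it by length. The generator below is a sketch that assumes a nonempty alphabet of single-character symbols; the name `all_strings` is chosen here for illustration. It never finishes on its own, so only a finite slice is taken.

```python
from itertools import count, islice, product


def all_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    symbols = sorted(alphabet)
    for length in count(0):                      # lengths 0, 1, 2, ...
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


first_seven = list(islice(all_strings({"a", "b"}), 7))
print(first_seven)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Any particular string of length n shows up after finitely many steps, but the enumeration as a whole never ends.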