Background Information

This is a quick recap of material you learned in prerequisite courses that will be used in
EECS 510.
Mathematical Preliminaries
It is assumed that you know and understand the following set terminology, operations
and properties:
Union ($\cup$), intersection ($\cap$), difference ($-$), complement. Universal set, null set, subset
($\subseteq$), proper subset ($\subset$), Cartesian product ($\times$), and powerset ($2^S$). Note: this notation
differs from that in 210.
DeMorgan’s Laws
$\overline{S_1 \cup S_2} = \overline{S_1} \cap \overline{S_2}$ and $\overline{S_1 \cap S_2} = \overline{S_1} \cup \overline{S_2}$
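As a quick sanity check of De Morgan's laws, here is a minimal Python sketch. The universal set U and the sets S1 and S2 are arbitrary values chosen for illustration, and complement is a hypothetical helper rather than anything from the text.

```python
# Illustrative check of De Morgan's laws on small finite sets.
# U, S1 and S2 are arbitrary example values chosen for this sketch.
U = set(range(10))          # universal set
S1 = {1, 2, 3, 4}
S2 = {3, 4, 5, 6}

def complement(s, universe=U):
    """Complement of s relative to the universal set."""
    return universe - s

# complement(S1 union S2) == complement(S1) intersect complement(S2)
law1 = complement(S1 | S2) == complement(S1) & complement(S2)
# complement(S1 intersect S2) == complement(S1) union complement(S2)
law2 = complement(S1 & S2) == complement(S1) | complement(S2)
print(law1, law2)
```

A check on one pair of sets is only evidence, of course; the laws themselves are proved by showing each side is a subset of the other.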
The natural numbers (N) are the elements of the set {0, 1, 2, …}.
A set is countable if it can be matched one-to-one with N or a subset of N.
Functions and Relations
A function is a rule that assigns to elements of one set a unique element of another set.
If $f : S_1 \to S_2$ denotes a function, then the domain of f is a subset of $S_1$ and $S_2$ is its
codomain. The range of a function is the set of elements from $S_2$ that are images of
elements in its domain. If the domain of f is all of $S_1$ then f is called a total function. Otherwise,
it’s a partial function.
For the big O definitions, we typically use the natural numbers or positive integers as
the domain. We say that f has order at most g if there is a positive constant c such that
$f(n) \le c \cdot g(n)$ for all n greater than a constant $N_0$. We use $f(n) \in O(g(n))$ to denote this.
If $|f(n)| \ge c \cdot |g(n)|$ we say f has order at least g, written $f(n) \in \Omega(g(n))$. Finally, if
there are constants $c_1$ and $c_2$ such that $c_1 \cdot |g(n)| \le |f(n)| \le c_2 \cdot |g(n)|$ for n sufficiently
large then f and g have the same order of magnitude, denoted $f(n) \in \Theta(g(n))$.
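A short worked example may help fix the three definitions. The functions below are chosen only for illustration and do not come from the text; the constants c and $N_0$ are one valid choice among many.

```latex
\begin{align*}
f(n) &= 3n^2 + 5n, \qquad g(n) = n^2 \\
5n &\le n^2 \ \text{ for all } n \ge 5
  \quad\Longrightarrow\quad f(n) \le 4n^2 = 4\,g(n)
  \qquad (c = 4,\ N_0 = 5) \\
f(n) &\ge 3n^2 = 3\,g(n) \ \text{ for all } n \ge 0
  \qquad (c = 3) \\
3\,|g(n)| &\le |f(n)| \le 4\,|g(n)| \ \text{ for all } n \ge 5
  \qquad (c_1 = 3,\ c_2 = 4)
\end{align*}
```

So this f has order at most g, order at least g, and therefore the same order of magnitude as g.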
There are three properties of relations on a set that you should know: reflexive,
symmetric and transitive. A relation with all three properties is an equivalence relation.
Note especially the symbol $\equiv$, which is used to indicate that two quantities are equivalent.
Graphs and Trees
A graph $G = (V, E)$ consists of two finite sets, the set $V = \{v_1, v_2, \dots, v_n\}$ of vertices
(nodes) and the set $E = \{e_1, e_2, \dots, e_m\}$ of edges. Each edge connects a pair of vertices
from V. In most instances in this course we’ll be using directed graphs, so an edge $e = (v_j, v_k)$
goes out of $v_j$ and into $v_k$. Labels may be placed on the vertices and the edges of a
graph. We’ll use circles for vertices rather than dots. For example, the parity checking
finite automaton is a directed graph.
Here is some additional graph terminology:
adjacent: two vertices are adjacent if there is an edge between them
incident: edge e is incident on vertex v if one endpoint of e is vertex v
degree of a vertex: the number of edges incident on it
indegree and outdegree: the number of edges into or out of a vertex, respectively
walk: a sequence of edges in which consecutive edges share an endpoint
path: a walk in which no edge is repeated
simple path: a path in which no vertex is repeated
cycle: a path whose endpoints are the same
This implies a simple cycle, i.e. all vertices in the cycle are different.
loop: an edge from a vertex to itself
On page 9 in the text there is an algorithm for finding all simple paths between a pair of
vertices. This algorithm is exhaustive in that it begins with paths of length 1, extends
them to paths of length 2 as long as the simple condition is not violated and continues in
this way until all possible paths have been examined. This is clearly inefficient, but in
this course we don’t concern ourselves with efficiency. Instead, we are only concerned
with whether or not an algorithm exists to solve a problem.
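The exhaustive idea described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm as printed in the text: the graph is assumed to be a dictionary of adjacency lists, start and goal are assumed to be different vertices, and the names all_simple_paths and graph are hypothetical.

```python
def all_simple_paths(graph, start, goal):
    """Exhaustively find all simple paths from start to goal.

    graph maps each vertex to a list of vertices it has an edge to
    (a representation assumed for this sketch). Paths are stored as
    vertex lists and grown one edge at a time.
    """
    found = []
    # Paths of length 1 leaving start; a loop on start would repeat
    # a vertex, so it is skipped.
    frontier = [[start, v] for v in graph.get(start, []) if v != start]
    while frontier:
        extended = []
        for path in frontier:
            last = path[-1]
            if last == goal:
                found.append(path)       # reached the goal; stop extending
                continue
            for v in graph.get(last, []):
                if v not in path:        # keep the path simple
                    extended.append(path + [v])
        frontier = extended              # all paths one edge longer
    return found

# Example directed graph, made up for illustration.
graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["d"], "d": ["a"]}
paths = all_simple_paths(graph, "a", "d")
print(paths)
```

The loop must stop because a simple path can never contain more vertices than the graph has, which is the only property the course needs: the algorithm exists and terminates.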
A tree is a special kind of graph: it is a directed graph with no cycles and one distinct
vertex called the root with the property that there is exactly one path from the root to
every other vertex in the tree. (In fact, there must be exactly one path between any pair
of vertices in the corresponding undirected tree.) Most trees we study in the course are
parse trees, also called derivation trees. Because there is exactly one path from the root to
every other vertex, the indegree of each node except the root is 1, and the indegree of the
root is 0. Here are some common terms associated with trees:
leaf: a node with outdegree 0
parent of a node: the vertex at the other end of the edge into that node
child or children of a node: the node(s) to which the outgoing edges lead
level of a vertex: the root is at level 0 and all other nodes are on a level whose number is
the length of the path from the root to that vertex
height of a tree: the highest level number, or the length of the longest path from the root
to a leaf
Occasionally we use ordered trees in which all nodes at each level are ordered. For
example, the nodes at level k might have values a, a+1, …, a+j-1 if there are j nodes on that
level.
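The tree terms above (leaf, level, height) can be made concrete with a small Python sketch. The child-map representation, the function name levels_and_height and the example tree are all assumptions made for illustration.

```python
def levels_and_height(children, root):
    """Return (level_of, height) for a tree given as a child map.

    children maps each node to the list of its children; leaves may be
    omitted or map to an empty list (an assumed representation).
    """
    level_of = {root: 0}                 # the root is at level 0
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            level_of[child] = level_of[node] + 1   # one edge further from the root
            stack.append(child)
    height = max(level_of.values())      # the highest level number
    return level_of, height

# A small made-up tree: S has children a, S1 and b; S1 has one child, lam.
children = {"S": ["a", "S1", "b"], "S1": ["lam"]}
level_of, height = levels_and_height(children, "S")
leaves = [n for n in level_of if not children.get(n)]   # outdegree 0
print(level_of, height, leaves)
```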
Proof techniques
The most commonly used proof technique in this course is mathematical induction. We
also look at biconditionals or if and only if statements. For example, we might want to
prove that a particular grammar generates a certain language or that two definitions are
equivalent. You are assumed to be familiar with the following proof methods:
deductive or direct ($p \Rightarrow q$): start with p and use a correct argument to arrive at q
(Be sure to look at the alternate ways to express the conditional)
indirect or contrapositive: assume q is false and use that to prove that p is
also false, i.e. $\neg q \Rightarrow \neg p$
proof by contradiction: similar to a contrapositive proof in that we assume q is false
and argue until we obtain a contradiction of a known fact, not necessarily p. Common
contradiction proofs include proving the square root of 2 is irrational, proving
$f(x) \notin O(g(x))$, and showing there is no largest prime number.
When we want to prove that two grammars generate the same language, or that a
particular grammar generates a certain language, we are proving that two sets are
equal. Proving that two sets are equal is done by showing each is a subset of the other.
Mathematical Induction:
Induction is closely related to recursion in that we are frequently describing recursively
defined sets and proving properties of the elements in that set. For example, consider
the following recursive definition of a language L over alphabet $\Sigma = \{a, b\}$, where
$\lambda$ denotes the empty string:
1. $\lambda \in L$
2. If $w \in L$ then $awb$ is also in L
3. All elements of L are obtained from using the rules above.
If you examine this definition a bit it should be clear that $L = \{a^n b^n \mid n \ge 0\}$, but to prove it
we need to use mathematical induction. The induction step is based on line 2 of the
definition or the corresponding part of a grammar G for this language:
$S \to aSb \mid \lambda$
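Before proving anything, it can be reassuring to apply the recursive definition mechanically and compare the result with the claimed closed form. The sketch below does that for a few rounds; build_L and rounds are hypothetical names, and the empty string is represented by Python's "".

```python
def build_L(rounds):
    """Apply the recursive definition a fixed number of times.

    Rule 1 puts the empty string in L; rule 2 wraps every known
    string w as a + w + b.
    """
    L = {""}                                   # rule 1
    for _ in range(rounds):
        L |= {"a" + w + "b" for w in L}        # rule 2
    return L

rounds = 5
generated = build_L(rounds)
# Claimed closed form, restricted to n <= rounds.
expected = {"a" * n + "b" * n for n in range(rounds + 1)}
print(generated == expected)
```

Agreement for small n is only evidence; the induction argument is what establishes the claim for every n.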
The basic approach to induction is as follows:
1. Prove the basis holds, i.e. prove $P_0$ or $P_1$ is true
2. Assume for some $k \ge 1$, $P_1, P_2, \dots, P_k$ are all true
3. Show that the truth of $P_1, P_2, \dots, P_k$ implies the truth of $P_{k+1}$
In this case the induction will be on the strings produced by the grammar G, i.e. show
that if S derives a string w then w is in L, and if $x \in L$ then S derives x. Here’s a proof
sketch. Note that this differs from the one in the text, since here the induction is on the
number of steps in the derivation and the one in the book inducts on the length of the
string.
Basis: in a one step derivation we use $S \Rightarrow \lambda$ and $\lambda \in L$ ($\lambda = a^0 b^0$).
Hypothesis: Assume that if string w is obtained from S by using a k step derivation then $w = a^{k-1} b^{k-1}$.
Induction step: Suppose w is obtained from S in k + 1 steps. Then, the first step must be
$S \Rightarrow aSb$. The remainder of the string is derived from S using k steps and we know
from the hypothesis that in k steps, S derives $a^{k-1} b^{k-1}$. Putting this together with the first
step we have the derivation $S \Rightarrow aSb \overset{k}{\Rightarrow} a a^{k-1} b^{k-1} b = a^k b^k$, and this string is in L.
Thus, we have shown that the language produced by the grammar (L(G)) is a subset of L.
Now, let’s show that $L \subseteq L(G)$.
Basis: $\lambda \in L$ and $S \Rightarrow \lambda$ is a derivation for it.
Hypothesis: Assume that for some $k \ge 0$, if $w = a^k b^k$ then $S \overset{*}{\Rightarrow} w$ (using 0 or more steps).
Induction step: Consider the string $x = a^{k+1} b^{k+1}$. We now must find a derivation for x.
Start with $S \Rightarrow aSb$. By hypothesis $S \overset{*}{\Rightarrow} a^k b^k$, so we have the following derivation of x:
$S \Rightarrow aSb \overset{*}{\Rightarrow} a(a^k b^k)b = a^{k+1} b^{k+1} = x$. Therefore $L \subseteq L(G)$, so the two sets are equal.
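The first half of the proof inducts on the number of derivation steps, and that claim is easy to test for small cases. The sketch below enumerates every derivation of exactly k steps for the grammar with productions S to aSb and S to the empty string; derive is a hypothetical helper, and the empty string is written as "".

```python
def derive(steps):
    """Terminal strings derivable from S in exactly `steps` steps."""
    forms = {"S"}                                   # sentential forms so far
    for _ in range(steps):
        nxt = set()
        for form in forms:
            if "S" in form:                         # finished strings cannot be extended
                nxt.add(form.replace("S", "aSb", 1))    # apply S -> aSb
                nxt.add(form.replace("S", "", 1))       # apply S -> empty string
        forms = nxt
    return {f for f in forms if "S" not in f}

# The claim in the proof: a k step derivation yields a^(k-1) b^(k-1).
for k in range(1, 6):
    assert derive(k) == {"a" * (k - 1) + "b" * (k - 1)}
print(derive(3))
```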
If we want strings of the form $a^n b^n$ where $n \ge 1$ we just need to replace the
production $S \to \lambda$ by $S \to ab$.
Look through the induction proofs in the book that a binary tree of height n has at most
$2^n$ leaves and example 1.8 on page 17.
Now, let’s return to grammars. Be sure you understand examples 1.11, 1.12 and 1.13
in the text.
Let’s look at problem 11c in section 1.2:
Find a grammar over $\Sigma = \{a, b\}$ that generates all strings with no more than three a’s.
Note this means we may have 0, 1, 2 or 3 a’s. Here’s one grammar that will work:
$S \to bS \mid aA \mid \lambda$    (no a’s if the $\lambda$ production is used here)
$A \to bA \mid aB \mid \lambda$    (1 a if the $\lambda$ production is used here)
$B \to bB \mid aC \mid \lambda$    (2 a’s if the $\lambda$ production is used here)
$C \to bC \mid \lambda$    (3 a’s if the $\lambda$ production is used here)
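One way to gain confidence in a grammar like this is a brute-force comparison on short strings. The sketch below is such a check, not a proof. It assumes the grammar is right-linear with upper-case variables and lower-case terminals; GRAMMAR, generated_strings and max_len are hypothetical names, and the empty string is written as "".

```python
from itertools import product

# Productions of the grammar above; "" stands for the empty string.
GRAMMAR = {
    "S": ["bS", "aA", ""],
    "A": ["bA", "aB", ""],
    "B": ["bB", "aC", ""],
    "C": ["bC", ""],
}

def generated_strings(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    Assumes each right-hand side is a string of terminals optionally
    followed by a single variable.
    """
    results, forms = set(), [start]
    while forms:
        form = forms.pop()
        prefix, var = form[:-1], form[-1]        # forms always end in a variable
        for rhs in grammar[var]:
            new = prefix + rhs
            if new == "" or not new[-1].isupper():
                results.add(new)                 # no variable left: a finished string
            elif len(new) - 1 <= max_len:
                forms.append(new)                # keep expanding the trailing variable
    return {w for w in results if len(w) <= max_len}

max_len = 6
# Every string over {a, b} up to max_len with at most three a's.
expected = {"".join(p) for n in range(max_len + 1)
            for p in product("ab", repeat=n) if p.count("a") <= 3}
print(generated_strings(GRAMMAR, "S", max_len) == expected)
```

The variables act as a counter: S, A, B and C record that 0, 1, 2 or 3 a's have been generated so far, which is why C has no production containing an a.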
Let’s return to one of the problems you worked on in class.
Let L be the set of all strings of odd length ending in b. In order to generate such
strings, basically what we need to do is guarantee that when the derivation stops the string
has odd length and ends in b. Letting S denote the start symbol, one rule we need is
$S \to b$ since that is the shortest odd length string we can obtain. There are many
equivalent grammars; here’s one:
$S \to aA \mid bA \mid b$
$A \to aS \mid bS$
Example derivation: $S \Rightarrow aA \Rightarrow aaS \Rightarrow aabA \Rightarrow aabaS \Rightarrow aabab$
The two variables are used so that we can guarantee that the string we end up with has
odd length. This follows from the observation that the only way to terminate a derivation
is to use the rule $S \to b$, and S only appears after an even number of terminals has been generated.
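The grammar for odd length strings ending in b can be checked the same way, this time by testing membership directly instead of enumerating. This is a minimal sketch that assumes every right-hand side is one terminal optionally followed by one variable; GRAMMAR and derives are hypothetical names.

```python
from itertools import product

# Productions of the grammar above.
GRAMMAR = {"S": ["aA", "bA", "b"], "A": ["aS", "bS"]}

def derives(var, w, grammar=GRAMMAR):
    """True if variable var derives the terminal string w."""
    for rhs in grammar[var]:
        if len(rhs) == 1:                    # production var -> terminal
            if w == rhs:
                return True
        elif w[:1] == rhs[0] and derives(rhs[1], w[1:], grammar):
            return True                      # consume one terminal, continue with the variable
    return False

# Compare with the definition of L on every string of length < 8.
for n in range(8):
    for p in product("ab", repeat=n):
        w = "".join(p)
        in_L = len(w) % 2 == 1 and w.endswith("b")
        assert derives("S", w) == in_L
print(derives("S", "aabab"))
```

Each recursive call consumes exactly one symbol of w, so the function always terminates.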