Background information

This is a quick recap of material you learned in prerequisite courses that will be used in EECS 510.

Mathematical Preliminaries

It is assumed that you know and understand the following set terminology, operations and properties:

Union (∪), intersection (∩), difference (-), complement.
Universal set, null set, subset (⊆), proper subset (⊂), Cartesian product (×), and powerset (2^S). Note: this notation differs from that in 210.
DeMorgan's Laws, writing S′ for the complement of S: (S1 ∪ S2)′ = S1′ ∩ S2′ and (S1 ∩ S2)′ = S1′ ∪ S2′.

The natural numbers (N) are the elements of the set {0, 1, 2, …}. A set is countable if it can be matched one-to-one with N or a subset of N.

Functions and Relations

A function is a rule that assigns to elements of one set a unique element of another set. If f : S1 → S2 denotes a function, then the domain of f is a subset of S1 and set S2 is its codomain. The range of a function is the set of elements from S2 that are images of elements in S1. If the domain of f is all of S1 then f is called a total function. Otherwise, it's a partial function.

For the big O definitions, we typically use the natural numbers or positive integers as the domain. We say that f has order at most g if there is a positive constant c such that f(n) ≤ c•g(n) for all n greater than some constant N0. We use f(n) ∈ O(g(n)) to denote this. If |f(n)| ≥ c•|g(n)| for all sufficiently large n, we say f has order at least g, written f(n) ∈ Ω(g(n)). Finally, if there are constants c1 and c2 such that c1•|g(n)| ≤ |f(n)| ≤ c2•|g(n)| for n sufficiently large, then f and g have the same order of magnitude, denoted f(n) ∈ Θ(g(n)).

There are three properties of relations on a set that you should know: reflexive, symmetric and transitive. You should also know the symbol ≡, especially as it is used to indicate that two quantities are equivalent.

Graphs and Trees

A graph G = (V, E) consists of two finite sets, the set V = {v1, v2, …, vn} of vertices (nodes) and the set E = {e1, e2, …, em} of edges. Each edge connects a pair of vertices from V. In most instances in this course we'll be using directed graphs.
In a directed graph, an edge e = (vj, vk) goes out of vj and into vk. Labels may be placed on the vertices and the edges of a graph. We'll use circles for vertices rather than dots. For example, the parity checking finite automaton is a directed graph.

Here is some additional graph terminology:

adjacent: two vertices are adjacent if there is an edge between them
incident: edge e is incident on vertex v if one endpoint of e is vertex v
degree of a vertex: the number of edges incident on it
indegree and outdegree: the number of edges into or out of a vertex, respectively
walk: a sequence of edges in which consecutive edges share an endpoint
path: a walk in which no edge is repeated
simple path: a path in which no vertex is repeated
cycle: a path whose endpoints are the same. Here this means a simple cycle, i.e. all vertices in the cycle other than the common endpoint are different
loop: an edge from a vertex to itself

On page 9 in the text there is an algorithm for finding all simple paths between a pair of vertices. This algorithm is exhaustive in that it begins with paths of length 1, extends them to paths of length 2 as long as the simple condition is not violated, and continues in this way until all possible paths have been examined. This is clearly inefficient, but in this course we don't concern ourselves with efficiency. Instead, we are only concerned with whether or not an algorithm exists to solve a problem.

A tree is a special kind of graph: it is a directed graph with no cycles and one distinct vertex called the root, with the property that there is exactly one path from the root to every other vertex in the tree. (In fact, there must be exactly one path between any pair of vertices in the corresponding undirected tree.) Most trees we study in the course are parse trees, also called derivation trees. Because there is exactly one path from the root to every other vertex, the root has indegree 0 and every other node has indegree 1.
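The exhaustive search for simple paths described earlier in this section can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the text's algorithm verbatim: the function name all_simple_paths, the representation of the graph as a list of (tail, head) pairs, and the requirement that the two vertices be distinct are all choices made here.

```python
def all_simple_paths(edges, start, goal):
    """Enumerate every simple path from start to goal in a directed graph.

    edges is an iterable of (tail, head) pairs; start and goal are
    assumed to be different vertices (an assumption of this sketch).
    """
    # Adjacency list: vertex -> vertices reachable by a single edge.
    successors = {}
    for tail, head in edges:
        successors.setdefault(tail, []).append(head)

    found = []
    # Begin with all paths of length 1 leaving start (loops are skipped).
    frontier = [[start, head] for head in successors.get(start, [])
                if head != start]
    while frontier:
        next_frontier = []
        for path in frontier:
            last = path[-1]
            if last == goal:
                found.append(path)
                continue
            # Extend by one edge, but only if the path stays simple.
            for head in successors.get(last, []):
                if head not in path:
                    next_frontier.append(path + [head])
        frontier = next_frontier
    return found


example_edges = [(1, 2), (2, 3), (1, 3), (3, 1), (2, 2)]
paths = all_simple_paths(example_edges, 1, 3)
```

The loop must stop because a simple path cannot contain more vertices than the graph has, so the frontier eventually becomes empty. As the notes say, the point is that such an algorithm exists, not that it is efficient.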
Here are some common terms associated with trees:

leaf: a node with outdegree 0
parent of a node: the vertex at the other end of the edge into that node
child or children of a node: the node(s) to which the outgoing edges lead
level of a vertex: the root is at level 0 and every other node is on the level whose number is the length of the path from the root to that vertex
height of a tree: the highest level number, i.e. the length of the longest path from the root to a leaf

Occasionally we use ordered trees in which all nodes at each level are ordered. For example, the nodes at level k might have values a, a+1, …, a+j-1 if there are j nodes on that level.

Proof techniques

The most commonly used proof technique in this course is mathematical induction. We also look at biconditionals, or if and only if statements. For example, we might want to prove that a particular grammar generates a certain language or that two definitions are equivalent. You are assumed to be familiar with the following proof methods:

deductive or direct, p → q: start with p and use a correct argument to arrive at q. (Be sure to look at the alternate ways to express the conditional.)
indirect or contrapositive: assume q is false and use that to prove that p is also false, i.e. ¬q → ¬p.
proof by contradiction: similar to a contrapositive proof in that we assume q is false and argue until we obtain a contradiction of a known fact, not necessarily p. Common contradiction proofs include proving the square root of 2 is irrational, proving f(x) ∉ O(g(x)), and showing there is no largest prime number.

When we want to prove that two grammars generate the same language, or that a particular grammar generates a certain language, we are proving that two sets are equal. Proving that two sets are equal is done by showing each is a subset of the other.

Mathematical Induction: Induction is closely related to recursion in that we are frequently describing recursively defined sets and proving properties of the elements in those sets.
For example, consider the following recursive definition of a language L over alphabet Σ = {a, b}:

1. λ ∈ L
2. If w ∈ L then awb is also in L
3. All elements of L are obtained from λ using the rules above.

If you examine this definition a bit it should be clear that L = {a^n b^n | n ≥ 0}, but to prove it we need to use mathematical induction. The induction step is based on line 2 of the definition or the corresponding part of a grammar G for this language:

S → aSb | λ

The basic approach to induction is as follows:

1. Prove the basis holds, i.e. prove P0 or P1 is true
2. Assume for some k ≥ 1 that P1, P2, …, Pk are all true
3. Show that the truth of P1, P2, …, Pk implies the truth of P(k+1)

In this case the induction will be on the strings produced by the grammar G, i.e. show that if S derives a string w then w is in L, and if x ∈ L then S derives x. Here's a proof sketch. Note that this differs from the one in the text: here the induction is on the number of steps in the derivation, while the one in the book inducts on the length of the string.

Basis: in a one step derivation we use S → λ, and λ ∈ L (λ = a^0 b^0).
Hypothesis: Assume that if string w is obtained from S by a k step derivation then w = a^(k-1) b^(k-1).
Induction step: Suppose w is obtained from S in k + 1 steps. Then the first step must be S ⇒ aSb. The S in aSb is then rewritten to a terminal string in the remaining k steps, and we know from the hypothesis that in k steps S derives a^(k-1) b^(k-1). Putting this together with the first step we have the derivation S ⇒ aSb ⇒^k a a^(k-1) b^(k-1) b = a^k b^k, and this string is in L.

Thus, we have shown that the language produced by the grammar, L(G), is a subset of L.

Now, let's show that L ⊆ L(G).

Basis: λ ∈ L and S ⇒ λ is a derivation for it.
Hypothesis: Assume that for some k ≥ 0, if w = a^k b^k then S ⇒* w (using 0 or more steps).
Induction step: Consider the string x = a^(k+1) b^(k+1). We now must find a derivation for x. Start with S ⇒ aSb. By hypothesis S ⇒* a^k b^k, so we have the following derivation of x: S ⇒ aSb ⇒* a(a^k b^k)b = a^(k+1) b^(k+1) = x.

Therefore L ⊆ L(G), so the two sets are equal.
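The claim in the first half of the proof, that a k step derivation from S yields a^(k-1) b^(k-1), can be spot-checked for small k with a short Python sketch. The helper names derive and in_L are hypothetical, introduced only for this illustration, and checking a few cases is of course not a substitute for the induction argument.

```python
def derive(steps):
    """Terminal string produced by a derivation of `steps` steps (steps >= 1)
    in the grammar S -> aSb | lambda."""
    sentential = "S"
    # The first steps - 1 steps must all use S -> aSb ...
    for _ in range(steps - 1):
        sentential = sentential.replace("S", "aSb")
    # ... and the final step uses S -> lambda to remove the variable.
    return sentential.replace("S", "")


def in_L(w):
    """Membership test for L = {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n


# Check the induction claim for derivations of 1 to 5 steps.
for k in range(1, 6):
    w = derive(k)
    assert w == "a" * (k - 1) + "b" * (k - 1)
    assert in_L(w)
```

Because the grammar has only one variable and only one production that keeps it, a derivation of a given number of steps is unique, which is why derive needs no search.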
If we want strings of the form a^n b^n where n ≥ 1, we just need to replace the production S → λ by S → ab.

Look through the induction proofs in the book that a binary tree of height n has at most 2^n leaves, and example 1.8 on page 17.

Now, let's return to grammars. Be sure you understand examples 1.11, 1.12 and 1.13 in the text. Let's look at problem 11c in section 1.2: find a grammar over Σ = {a, b} that generates all strings with no more than three a's. Note this means we may have 0, 1, 2 or 3 a's. Here's one grammar that will work:

S → bS | aA | λ    (no a's if the production S → λ is used)
A → bA | aB | λ    (1 a if the production A → λ is used here)
B → bB | aC | λ    (2 a's if the production B → λ is used here)
C → bC | λ         (3 a's if the production C → λ is used here)

Let's return to one of the problems you worked on in class. Let L be the set of all strings of odd length ending in b. To generate such strings, what we need to do is guarantee that when the derivation stops the string has odd length and ends in b. Letting S denote the start symbol, one rule we need is S → b, since b is the shortest string in L. There are many equivalent grammars; here's one:

S → aA | bA | b
A → aS | bS

Example derivation: S ⇒ aA ⇒ aaS ⇒ aabA ⇒ aabaS ⇒ aabab

The two variables are used so that we can guarantee that the string we end up with has odd length. S only ever appears after an even number of terminals and A after an odd number, and the only way to terminate a derivation is to use the rule S → b, so the final string has odd length and ends in b.
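Both grammars above can be checked against their intended languages for short strings with a small Python sketch. It relies on an assumption that holds for these two grammars but not in general: every production has at most one variable, and that variable is the last symbol. The names generated, all_strings, ODD_ENDING_IN_B and AT_MOST_THREE_AS are hypothetical, and the empty string stands in for λ.

```python
from itertools import product

# Productions written as variable -> list of right-hand sides ("" is lambda).
ODD_ENDING_IN_B = {"S": ["aA", "bA", "b"], "A": ["aS", "bS"]}
AT_MOST_THREE_AS = {
    "S": ["bS", "aA", ""],
    "A": ["bA", "aB", ""],
    "B": ["bB", "aC", ""],
    "C": ["bC", ""],
}


def generated(productions, max_len, start="S"):
    """All terminal strings of length <= max_len derivable from start.

    Assumes each sentential form has exactly one variable, in last position.
    """
    results = set()
    forms = [start]
    while forms:
        next_forms = []
        for form in forms:
            prefix, variable = form[:-1], form[-1]
            for rhs in productions[variable]:
                new = prefix + rhs
                if new and new[-1] in productions:
                    # Still contains a variable: keep it only while the
                    # terminals produced so far fit within max_len.
                    if len(new) - 1 <= max_len:
                        next_forms.append(new)
                elif len(new) <= max_len:
                    results.add(new)
        forms = next_forms
    return results


def all_strings(max_len):
    """Every string over {a, b} of length 0 through max_len."""
    return {"".join(t) for n in range(max_len + 1)
            for t in product("ab", repeat=n)}


odd_b = {w for w in all_strings(7) if len(w) % 2 == 1 and w.endswith("b")}
few_as = {w for w in all_strings(6) if w.count("a") <= 3}
assert generated(ODD_ENDING_IN_B, 7) == odd_b
assert generated(AT_MOST_THREE_AS, 6) == few_as
```

Agreement up to a fixed length is only evidence that a grammar is right; showing that the two sets are equal in general still requires the subset-in-both-directions argument described earlier.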