Introduction to Mathematical Proofs
Wai Yai Pong
November 4, 2015

Contents
1 Dive into proofs
1.1 Why do proofs and how to do them?
1.2 Exercises
1.3 Mathematical Induction
1.4 Exercises
2 Some Set Theory
2.1 Sets and subsets
2.2 Operations on Sets
2.2.1 Exercises
2.3 Relations and Functions
2.3.1 Cardinality
2.3.2 Exercises
2.4 Equivalence Relations
2.5 Orderings
2.5.1 Embeddings

Chapter 1
Dive into proofs

1.1 Why do proofs and how to do them?

The way to demonstrate the validity of a mathematical statement is to give a proof, much as performing an experiment is the way to demonstrate the validity of a claim in the physical sciences. A math student should be able to write proofs just as a chemistry student should be able to carry out chemistry experiments; it is absolutely essential.

Writing proofs is much like writing poems. It consists of two parts: 1) an idea about the objects that you want to write about, and 2) expressing your ideas in the “form” of a poem. Say you want to write a poem about mountains. First you have to have some “feelings” towards the mountains that you want to write about. Then you decide on the format in which you want to express these ideas.
For instance, there is a format for Chinese poems in which a poem has 4 verses and each verse consists of 7 words. Thus, you have to say whatever you want to say about the mountains in 28 words. There are additional rules too: the last words of the 1st and the 2nd verses rhyme, and so do the last words of the 3rd and the 4th verses, etc.

If you want to prove a mathematical statement, first you have to have some idea of why the statement is true. Then you express your ideas according to the “rules” of logic. It is the second part of proof-writing that can be “taught”. The first part can only be “learned”.

So how does one learn to write proofs? Well, the analogy continues: how does one learn poem-writing (or song-writing or painting)? If you ask a poet, the answer will be: read a lot of poems first! The underlying message is that one should memorize a bunch of poems first (and understand the “messages” carried by those poems)! If someone claims to be a poet but can hardly recite any poem, will you believe the claim?

With this philosophy in mind, let’s dive into proof-writing. Here is our first statement.

Statement 1. The sum of two even numbers is even.

Well, you would say this is obviously true... Not so fast: to the medieval mind, it was obvious that heavier objects fall faster than lighter ones, but Galileo showed otherwise. So to make sure our statement is correct, let’s give a proof of it. Once you start thinking of giving an explanation, you will realize that there are many things to clarify. First, what is meant by a number? A real number, an integer, or what? And what are they? Also, what is the sum of two numbers? What does it mean for a number to be even? To make all of this rigorous would take us too long, so let us deal with these issues later. For now, let us understand that the numbers are integers, and we assume the basic properties of their addition and multiplication¹.
So the only concept yet to be explained is “even”: An integer is even if it is divisible by two. This is fine, but if we say so then we have to explain what is meant by “divisible” (remember we only assume the basics about addition and multiplication). So let us rephrase the sentence using multiplication:

An integer n is even if n = 2m for some integer m.

The last statement is an example of a definition in mathematics. Just like a definition in a dictionary, it declares the meaning of a term, in this case the term “even number”, using other terms that have already been defined or assumed².

¹ For those who know, we are assuming that the integers form a ring under addition and multiplication.
² Obviously, this process cannot go on forever, so there must be some “basic” undefined terms that we have to start with. In this case, it will be the integers.

Proof of Statement 1. Suppose a, b are two even numbers. By definition, a = 2m and b = 2n for some integers m and n. Thus, a + b = 2m + 2n = 2(m + n) is an even number.

A few remarks about the proof are in order:

1. This is NOT a formal proof (we will explain later).
2. We use a few properties of the integers in the proof without explicitly stating them: we use the fact that the sum of two integers is an integer, which is needed to conclude that m + n in the proof is an integer. Also, we use the fact that multiplication of integers distributes over addition. In other words, we use without proof that k(m + n) = km + kn for any integers k, m and n.
3. This is an example of a constructive proof: we show that a + b is even by explicitly constructing an integer, namely m + n, such that a + b is twice that integer. In other words, we prove the existence of something by giving an explicit way of constructing it.

A closer examination of Statement 1 reveals its logical structure. Formally, it can be stated as

∀a, b ∈ Z ((∃n, m ∈ Z, a = 2n ∧ b = 2m) ⟹ ∃k ∈ Z, a + b = 2k).

1. The symbol ∀ reads “for all” and the symbol ∃ reads “there exist(s)”. They are the quantifiers. In fact, to be strict, the ∀a, b ∈ Z that appears at the front of this sentence should be written as ∀a ∈ Z ∀b ∈ Z.
2. The symbols ∧ and ⟹ are logical connectives. They stand for logical “and” and “imply”, respectively. Logical connectives combine two statements to form another one. Another commonly used connective is ∨, which stands for the logical “or”. We will have a closer look at them later.
3. The formula (∃n, m ∈ Z, a = 2n ∧ b = 2m) ⟹ ∃k ∈ Z, a + b = 2k is an implication, i.e. a formula of the form P ⟹ Q. The variables n, m and k are bound in this formula since they are within the scope of some quantifiers. The variables a and b are free. Let ϕ(a, b) denote this formula; then the original one becomes ∀a, b ∈ Z, ϕ(a, b). The variables a, b are thus quantified out and become bound. A formula without any free variables is called a sentence.
4. The contrapositive of the implication ϕ ⟹ ψ is the implication ¬ψ ⟹ ¬ϕ. They are logically equivalent. The contrapositive of the above implication is:

¬(∃k ∈ Z, a + b = 2k) ⟹ ¬(∃m, n ∈ Z, a = 2n ∧ b = 2m).

If we translate it back into English, it says: if a + b is not even then it is not the case that both a and b are even. This is clearly equivalent to the original one even though it sounds a little awkward. Instead of saying it is not the case that both a and b are even, we can say that either a or b is not even. So intuitively ¬(ϕ ∧ ψ) is equivalent to ¬ϕ ∨ ¬ψ. And now if we declare that an integer is odd if it is not even, then we can restate the above as: If a + b is odd then a or b is odd. Finally, if you quantify a and b out, it reads: If the sum of two integers is odd then at least one of them is odd. As you know, they cannot both be odd, but one cannot deduce that from the above statement.

Let us make a few things precise before moving on.
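The two logical equivalences used above (an implication versus its contrapositive, and ¬(ϕ ∧ ψ) versus ¬ϕ ∨ ¬ψ) can be spot-checked by brute force. The following is only a sanity check under stated assumptions, not a proof: the helper names `implies` and `is_even` and the range bound `BOUND` are illustrative choices that do not come from the text.

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q

# Truth-table checks of the two equivalences discussed in the text.
for p, q in product([False, True], repeat=2):
    # An implication has the same truth value as its contrapositive.
    assert implies(p, q) == implies(not q, not p)
    # not (p and q) has the same truth value as (not p) or (not q).
    assert (not (p and q)) == ((not p) or (not q))

def is_even(n):
    """n is even if n = 2m for some integer m."""
    return n % 2 == 0

# Spot-check Statement 1 and its contrapositive on a finite range.
BOUND = 20  # arbitrary illustrative bound
sample = range(-BOUND, BOUND + 1)
statement_holds = all(
    implies(is_even(a) and is_even(b), is_even(a + b))
    for a in sample for b in sample
)
contrapositive_holds = all(
    implies(not is_even(a + b), (not is_even(a)) or (not is_even(b)))
    for a in sample for b in sample
)
print(statement_holds, contrapositive_holds)
```

A finite check like this can only fail to find a counterexample; the proof of Statement 1 is what covers all integers.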
We use N to denote the set of natural numbers, i.e. N = {1, 2, 3, ...}, and Z the set of integers. We assume the basic rules of addition and multiplication of integers³: for any a, b, c ∈ Z,

1. (a + b) + c = a + (b + c).
2. a + b = b + a.
3. a + 0 = a.
4. a + d = 0 for some d ∈ Z.
5. a(bc) = (ab)c.
6. ab = ba.
7. a · 1 = a.
8. a(b + c) = ab + ac.

³ In other words, we assume that (Z, 0, 1, +, ·) forms a commutative ring with 1.

Definition 1.1.1. For a, b ∈ Z, we say that a divides b, written as a | b, if there exists c ∈ Z such that ac = b. In this case, we also say that a is a divisor (or factor) of b and b is a multiple of a.

Note that every integer divides 0, but 0 divides only 0. The divisors of 1 are called units. In Z, 1 and −1 are the only units. A divisor of an integer n is proper if it is neither a unit nor a unit times n.

Definition 1.1.2. Let a and b be integers, not both 0. The greatest common divisor of a and b, denoted by gcd(a, b), is the largest integer that divides both a and b. We define gcd(0, 0) to be 0. Let a and b be non-zero integers. The least common multiple of a and b, denoted by lcm(a, b), is the smallest positive multiple of both a and b. In the case that either a or b is zero, lcm(a, b) is defined to be 0.

Definition 1.1.3. A prime number is an integer bigger than 1 which has no positive factors other than 1 and itself. Two integers are relatively prime (or coprime) if their greatest common divisor is 1.

The second statement that we are going to prove is:

Proposition 1.1.4. √2 is irrational.

Note that here we introduce the word “Proposition”. In contemporary mathematics, a Proposition is a statement of interest, and the word Theorem is usually reserved for propositions that are important for the discourse. The word Lemma signifies a proposition that is going to be used in proving other propositions.

To prove Proposition 1.1.4, we first need to clarify the word “irrational”.
Let us say that a number is rational if it is the ratio of two integers and is irrational if it is not rational. The problem with this definition is again: what “numbers” are we talking about? The definition attempts to define rational numbers as a subset of some set of numbers. Since the number system should at least contain √2 in order for the statement to make sense, let us take the set of numbers to be the set of real numbers R. The second question, equally important, is: what is √2? Answer: the positive real number whose square is 2. There are two assumptions here in order for this definition to make sense: 1) the existence of a positive real number whose square is 2 and 2) the uniqueness of such a number. We will not be able to justify either of them here since we have yet to define the real numbers! In any case, let us now make things official:

Definition 1.1.5. A real number r is rational if r = m/n for some m, n ∈ Z with n ≠ 0. A real number is irrational if it is not rational. We use Q to denote the set of rational numbers.

With these definitions and assumptions sorted out, let’s think about why √2 is irrational and how we can prove it. The proposition requires us to show that no matter what integers r and s (with s ≠ 0) are chosen, the ratio r/s cannot be √2; in other words, (r/s)² cannot be 2. There are infinitely many integer pairs to check, so how can we achieve that? This time we use a technique called proof by contradiction: We can prove that P implies Q by assuming P and ¬Q (the negation of Q) and trying to deduce a contradiction from them. If we succeed in doing so, then we have shown that P ∧ ¬Q implies a false statement, and that can only be the case if P ∧ ¬Q itself is false. In other words, ¬(P ∧ ¬Q) is true, but that is logically equivalent to ¬P ∨ Q, which is also equivalent to P ⟹ Q.

Proof of Proposition 1.1.4. Suppose √2 = r/s for some integers r, s. By canceling the common factors of r and s, we can assume r and s are coprime.
After squaring both sides of the equation and clearing the denominator, we get 2s² = r². In particular, r² is even and so is r (see Exercise 5). Thus r = 2k for some integer k, and so 2s² = r² = (2k)² = 4k². That means s² = 2k². By the same argument, we conclude that s is even as well. This is a contradiction, since r and s are coprime and hence cannot both be even.

The next statement that we are going to examine is also a classic.

Theorem 1.1.6. There are infinitely many primes.

Our first proof goes back to Euclid some 2300 years ago! One way of proving the statement is as follows: for any given number n, construct a prime number that is greater than n. However, in this case, such a constructive proof turns out to be extremely difficult. To date, finding ways to produce large primes is still one of the most active areas of number theory research. So how did Euclid prove the infinitude of primes some two millennia ago? The answer is again proof by contradiction. But first we need a fundamental lemma:

Lemma 1.1.7. Every integer larger than 1 has a prime factor.

Proof. Suppose not. Then the collection N of integers bigger than 1 that have no prime factors is non-empty. Its least element, say m₀, must not be a prime; otherwise it would be its own prime factor, contradicting the definition of N. So m₀ must have a factor m₁ with 1 < m₁ < m₀. Since m₀ is the least element of N and m₁ is less than m₀, m₁ cannot be in N. Now since m₁ is an integer bigger than 1, the fact that it is not in N implies that it must have a prime factor. But any factor of m₁ is also a factor of m₀ (see Exercise 2). So m₀ has a prime divisor, a contradiction.

Let A be a set of integers. We say an integer k is a lower bound of A if k ≤ a for every a ∈ A. We say that A is bounded below if A has a lower bound. Note that a lower bound of A need not be in A; for example, 0 is a lower bound of the set N of natural numbers but 0 ∉ N.
A least element of A is an element m ∈ A such that m ≤ a for every a ∈ A. If m₁, m₂ are both least elements of A, then by the definition, we must have m₁ ≤ m₂ and m₂ ≤ m₁. Consequently, m₁ = m₂. In other words, if A has a least element, then it is the (unique) least element of A. Note, however, that A need not have a least element, for instance if A is the set of negative integers. One thing is clear though: if A is non-empty and finite, then we can find the least element of A by comparing its elements one by one⁴. More generally, we have the following:

⁴ We need the finiteness of A to guarantee that this process will terminate.

Proposition 1.1.8. Let W be a set of integers. If W is non-empty and bounded below then the least element of W exists.

Proof. Since W is bounded below, it has a lower bound, say m. Since W is non-empty, some integer w belongs to W. By the definition of lower bound, m ≤ w and so w belongs to the set F = {n ∈ W : m ≤ n ≤ w}. Since F is a finite non-empty set, its least element exists; let us call it w₀. We claim that w₀ is also the least element of W. First, since every element of F is an element of W, in particular w₀ ∈ W. Now all that remains to show is that no integer smaller than w₀ can be in W. Since m is a lower bound of W, no number smaller than m can be in W. Thus, we only need to show that an integer a cannot be in W if m ≤ a < w₀. Suppose to the contrary that a ∈ W; then, since m ≤ a < w₀ ≤ w, a would be in F by the very definition of F. But this contradicts the fact that w₀ is the least element of F.

Proposition 1.1.8 is a fundamental property of the ordering on subsets of integers that are bounded below (N, in particular). In fact, we have used it implicitly in our proof of Lemma 1.1.7 to guarantee the existence of m₀, and we are going to use it again to prove the principle of mathematical induction in Section 1.3. Now let us return to the proof of Theorem 1.1.6.

Proof.
Suppose on the contrary that there are only finitely many primes, say p₁, ..., pₙ. Consider the number N = (p₁ ··· pₙ) + 1. Clearly N > 1 and so it must have a prime factor, say q, by Lemma 1.1.7. On the one hand, q cannot be any of the pᵢ since N is a multiple of q but not a multiple of any of the pᵢ’s. On the other hand, q must be one of the pᵢ’s since q is a prime, so we get a contradiction.

Euclid’s proof is an example of a non-constructive proof. He showed that there are infinitely many prime numbers without constructing even a single prime! In a sense, proving an existence statement with a non-constructive proof is not “as good as” proving it with a constructive one. But in many cases, non-constructive proofs are the only proofs available.

Let us now give a different proof of Euclid’s result. Certainly, it is not precise what we mean by a “different” proof. This usually means a proof involving a different “idea”. By this standard the following proof is not really different from Euclid’s proof.

Second proof of Theorem 1.1.6. For each n, the number n! + 1 must have a prime factor according to Lemma 1.1.7, and any prime factor of n! + 1 must be larger than n. Thus we have shown that there are arbitrarily large primes and so there must be infinitely many primes.

1.2 Exercises

1. Mimicking the proof of Statement 1, show that the sum of two multiples of 3 is again a multiple of 3. More generally, show that for any integer k, the sum of two multiples of k remains a multiple of k.
2. Show that for any integers a, b and c, if a divides b and b divides c then a divides c.
3. Division algorithm. Show that for any a, b ∈ Z with b > 0, there exist unique integers q and r with 0 ≤ r < b such that a = bq + r.
4. Greatest common divisor. Show that
(a) For any integers a and b, there is a unique non-negative integer d such that the set {am + bn : m, n ∈ Z} is the set of multiples of d.
(b) Show that the d in Part (4a) is a common divisor of a and b; and
(c) show that if d′ is a common divisor of a and b then d′ divides d.
5. Show that
(a) The product of two odd numbers is odd.
(b) Deduce that for any integer r, if r² is even then r is even.
(c) Let p be a prime; show that if p divides ab then p divides a or p divides b.
(d) Deduce that if a² is divisible by a prime p, then so is a.
6. Mimicking the proof of Proposition 1.1.4, show that
(a) √3 is irrational.
(b) in general, √p is irrational for any prime p.
(c) more generally, √n is irrational for any square-free n. (An integer n > 1 is square-free if no square other than 1 divides n.)
(d) Deduce that √n is irrational if n is not a perfect square.
7. Fundamental Theorem of Arithmetic.
(a) Existence of Prime Factorization. Show that every natural number bigger than 1 is a product of primes.
(b) Uniqueness of Prime Factorization. Show that if p₁, ..., pₙ, q₁, ..., qₘ are primes and p₁ ··· pₙ = q₁ ··· qₘ, then n = m and, up to permutation, pᵢ = qᵢ for each i.
8. Show that there are infinitely many primes of the form 4n − 1.

Hints for Exercises

3. Consider the set of differences of a and the multiples of b, i.e. the set {a − bn : n ∈ Z}. Argue that this set must contain some non-negative integers and hence its least non-negative member exists. Show that it is the remainder of a divided by b.
4. If a = b = 0, then d is clearly 0 as well. So we can assume either a or b is not zero. If the set of integral combinations of a and b is going to be the set of multiples of some d > 0, then clearly d must be the least positive element of the set. Thus argue that some integral combination of a and b is positive and so there must be a least one.
5. For Part (a), use the division algorithm to argue that an odd number must be of the form 2k + 1. For Part (c), show that if p does not divide a then it must divide b.
Since p is a prime, if it does not divide a, then a and p must be relatively prime. Now use Exercise 4.
6. For Part (c): Suppose n is square-free. Since n > 1, it has a prime factor, say p. Suppose √n = r/s for some integers r and s. By canceling the gcd of r and s, we can assume that r and s are relatively prime. Obtain a contradiction by arguing (as in Part (b)) that p divides both r and s.
7. For Part (a), mimic the proof of Lemma 1.1.7. For Part (b), use Exercise 5(c).

1.3 Mathematical Induction

Let P(n) be a statement about the natural numbers, for example: The sum of the first n odd numbers is n². Since the n-th odd number is 2n − 1, the statement P(n) asserts that

1 + 3 + ... + (2n − 1) = n².

One can think of P(n) as a family of statements indexed by the natural numbers:

P(1) is the statement 1 = 1²
P(2) is the statement 1 + 3 = 2²
P(3) is the statement 1 + 3 + 5 = 3², etc.

All instances of P(n) listed above are true! Hmm... maybe all of them are true! If that’s the case, how are we going to show that? Well, as mortals we certainly can’t verify them one by one... Luckily, there is a powerful way, called mathematical induction, of proving a family of statements like this. As you will see, it is actually Proposition 1.1.8 in disguise.

Theorem 1.3.1 (Principle of Mathematical Induction). Let W be a set of integers such that
1. w₀ ∈ W for some integer w₀; and
2. for any k ≥ w₀, the implication k ∈ W ⟹ k + 1 ∈ W is true,
then W ⊇ {n ∈ Z : n ≥ w₀}.

Here is how Theorem 1.3.1 can help us to prove a statement like “the sum of the first n odd numbers is n²”. Let W = {n ∈ N : P(n) is true}. In other words, let W be the set of witnesses of the statement. To show that P(n) is true for all natural numbers n means to show that the witness set contains the set N = {n ∈ Z : n ≥ 1}. So according to Theorem 1.3.1, we need to show two things, namely
1. 1 ∈ W, i.e. P(1) is true; and
2. the implication: for any k ≥ 1, if P(k) is true, then so is P(k + 1).

Let’s see how it applies to our example:

Proposition 1.3.2. For any n ∈ N, 1 + 3 + ... + (2n − 1) = n².

Proof. When n = 1, the left side of the equation is 1 and the right side is 1², which is again 1. So the statement is true for n = 1. Suppose the statement is true for some k ≥ 1, i.e.

1 + 3 + ... + (2k − 1) = k²    (1.1)

then

1 + 3 + ... + (2k − 1) + (2(k + 1) − 1) = k² + (2k + 2 − 1) = k² + 2k + 1 = (k + 1)².

So we establish the proposition by mathematical induction.

There is some common terminology for induction proofs:
1. The statement P(w₀) is called the base case. In the example above, the base case is 1 = 1².
2. The induction step is the proof of the implication (2) in Theorem 1.3.1 within the induction proof. In the example above, it is the proof of the implication: for all k ≥ 1, if 1 + 3 + ... + (2k − 1) = k², then 1 + 3 + ... + (2(k + 1) − 1) = (k + 1)².
3. The premise P(k) in the implication P(k) ⟹ P(k + 1) is called the induction hypothesis. In our case, the induction hypothesis is 1 + 3 + ... + (2k − 1) = k².

Mathematical induction is a framework for proving statements indexed by a well-ordered set of integers (more about well-orders later). In our proof-poem analogy, it plays the same role as the format of a certain kind of poem. But in mathematics, we can even explain why the proof should take that form. So let us demonstrate that Theorem 1.3.1 is just Proposition 1.1.8 in disguise.

Proof of Theorem 1.3.1. Suppose on the contrary that W does not contain the set {n ∈ Z : n ≥ w₀}. So the set C = {m ∈ Z : m ≥ w₀, m ∉ W} is non-empty. Clearly C is bounded below by w₀, so according to Proposition 1.1.8, the least element of C, say m₀, exists. By the definition of C, m₀ ≥ w₀ and m₀ ∉ W. Since we assume w₀ ∈ W, we have m₀ ≠ w₀. Therefore, we must have m₀ > w₀. That means m₀ − 1 ≥ w₀.
Note that m₀ − 1 cannot be in C since m₀ is the least element of C. Thus we conclude that m₀ − 1 ∈ W. But then according to the implication (2), m₀ = (m₀ − 1) + 1 ∈ W, a contradiction.

Next we introduce a variant of mathematical induction, called strong induction. Contrary to what its name suggests, it is not stronger than mathematical induction itself since they imply each other.

Theorem 1.3.3 (Strong Induction Principle). Let W be a set of integers such that
1. w₀ ∈ W for some integer w₀; and
2. for any k ≥ w₀, the implication (∀ w₀ ≤ m ≤ k, m ∈ W) ⟹ k + 1 ∈ W is true,
then W ⊇ {n ∈ Z : n ≥ w₀}.

It is clear that if (2) in Theorem 1.3.1 is true then (2) in Theorem 1.3.3 is true; therefore, the Principle of Mathematical Induction does follow from the Strong Induction Principle. Now we show that the converse is also true:

Proof of Theorem 1.3.3. Suppose W satisfies both Conditions (1) and (2) of the theorem. Let P(k) be the statement ∀ w₀ ≤ m ≤ k, m ∈ W. Clearly if P(k) is true for all k ≥ w₀, then W contains the set {n ∈ Z : n ≥ w₀}. By assumption (1), w₀ ∈ W, so P(w₀) is true. So by mathematical induction (i.e. Theorem 1.3.1), it remains to show that for every k ≥ w₀, P(k) ⟹ P(k + 1). But it follows from Condition (2) of the theorem and P(k) that k + 1 ∈ W. Then P(k) together with k + 1 ∈ W simply means m ∈ W for all w₀ ≤ m ≤ k + 1, which is nothing but P(k + 1). Therefore, we have shown the implication P(k) ⟹ P(k + 1) for any k ≥ w₀ and hence finished the proof.

Now that we have seen that the Principle of Mathematical Induction (PMI) and the Strong Induction Principle (SMI) are logically equivalent, a natural question is: why consider two equivalent forms of induction? It turns out that sometimes SMI is more flexible to apply. To illustrate this point, let us reprove Lemma 1.1.7 using strong induction.

Proof. We want to show that every integer larger than 1 has a prime factor. Clearly, 2 has a prime factor, since it is itself a prime. Let k ≥ 2 and suppose every number from 2 to k has a prime factor.
So either k + 1 is a prime, in which case k + 1 has a prime factor, namely itself; or else k + 1 has a proper factor, i.e. there is some m with 1 < m < k + 1 that divides k + 1. It then follows from the induction hypothesis that m has a prime factor, say p. Since m divides k + 1, p divides k + 1 as well. In either case, we have shown that k + 1 must have a prime factor and so we are done.

1.4 Exercises

1. This is almost everyone’s first example of Mathematical Induction. Prove that for any natural number n,

1 + 2 + ... + n = n(n + 1)/2.

Numbers of this form are called the triangular numbers. The first few triangular numbers are 1, 3, 6, 10, .... The expression on the right side of the equation can be interpreted as the number of ways to choose two things out of n + 1 things, read as “n + 1 choose 2” and often denoted by \binom{n+1}{2}. The story is that Gauss discovered this formula at a very young age (some say around 8 years old).
2. Prove that for each natural number n ≥ 2,

1/(1·2) + 1/(2·3) + ··· + 1/((n − 1)·n) = (n − 1)/n.

3. Prove that for all n ≥ 2,

(1 − 1/4)(1 − 1/9) ··· (1 − 1/n²) = (n + 1)/(2n).

The next few questions are more challenging.

4. Suppose n lines are drawn on a plane in general position, i.e. no two lines are parallel and no 3 lines intersect at a point. Into how many regions do they separate the plane? (Try it for small n, and guess a formula. Then prove your formula by induction.)
5. (Triminoes) This one is another classic.
(a) Show that a 2ⁿ × 2ⁿ checkerboard with any one square deleted can be tiled by “L”-shaped tiles each covering 3 squares (called triminoes).
(b) Prove that for n ≥ 1, 4ⁿ − 1 is divisible by 3 directly, without resorting to Part (5a).
6. Put 2n dots on a circle and color them in any way so that n of them are red and n of them are blue. Show that one can always start from one of these dots and go clockwise around the circle so that the number of red dots visited is at least the number of blue dots visited at any moment in this journey.
7. The Fibonacci sequence Fₙ is defined recursively as follows: F₁ = F₂ = 1 and Fₙ = Fₙ₋₁ + Fₙ₋₂ for all n ≥ 3. Show that F₃ₖ is even for every k ≥ 1.

Chapter 2
Some Set Theory

2.1 Sets and subsets

We treat sets as a language for our discussion and will take an informal approach to set theory. To us a set is simply a collection of objects¹. We specify the objects of interest between a pair of curly brackets. For example, {0, 1, 2} is a set consisting of the three symbols 0, 1, and 2. We write a ∈ X to indicate that a is an element (or a member) of X and write a ∉ X to indicate otherwise. So 1 ∈ {0, 1, 2} but 4 ∉ {0, 1, 2}.

A set is finite if it has finitely many elements. The number of elements of a finite set X, denoted by |X|, is called the size (or cardinality) of X. A set is infinite if it is not finite. Later on, we discuss the cardinality of infinite sets, but until then, when we talk about the size of a set, we assume the set is finite.

We cannot write down all members of an infinite set, so to specify an infinite set we either assume we know what the elements are, for example, the set of natural numbers N = {1, 2, 3, ...}; or specify the elements by a property from a given set. For example, the set of even natural numbers can be given as {n ∈ N : n = 2k for some k ∈ N}.

A set X is a subset of a set Y, written as X ⊆ Y, if every element of X is an element of Y. We refer to ⊆ as (set) inclusion. It is clear that if X ⊆ Y and Y ⊆ Z, then X ⊆ Z. We say that X is a proper subset of Y, written as X ⊊ Y, if X ⊆ Y but Y ⊈ X.

We say that two sets are equal if they are subsets of each other. There are two important consequences of this declaration. First, there is only one set with no elements; we call it the empty set (or the null set) and denote it by ∅. Second, the order in which we list the elements of a set does not matter. For example, {5, 2, 3, 7} and {2, 3, 5, 7} are the same set. It is the set of all prime numbers less than 10.
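The definitions of inclusion, proper inclusion and equality of sets can be tried out on small finite sets. The sketch below uses Python's built-in `set` type as a stand-in for finite sets; the helper names (`is_subset`, `is_proper_subset`, `sets_equal`) and the finite stand-in `N_20` for N are illustrative assumptions, not notation from the text.

```python
# Finite sets modelled with Python's built-in set type (an illustrative choice).

def is_subset(x, y):
    """X ⊆ Y: every element of X is an element of Y."""
    return all(a in y for a in x)

def is_proper_subset(x, y):
    """X ⊊ Y: X ⊆ Y but Y ⊈ X."""
    return is_subset(x, y) and not is_subset(y, x)

def sets_equal(x, y):
    """Two sets are equal if they are subsets of each other."""
    return is_subset(x, y) and is_subset(y, x)

# The order in which elements are listed does not matter.
primes_below_10 = {5, 2, 3, 7}
same_set = sets_equal(primes_below_10, {2, 3, 5, 7})

# Set-builder notation over a finite stand-in for N (the bound 20 is arbitrary).
N_20 = set(range(1, 21))
evens = {n for n in N_20 if any(n == 2 * k for k in N_20)}

print(1 in {0, 1, 2}, 4 in {0, 1, 2})   # membership tests
print(same_set)                          # listing order is irrelevant
print(is_proper_subset(evens, N_20))     # the evens form a proper subset
print(sets_equal(set(), set()))          # any two empty sets are equal
```

Python already provides `<=`, `<` and `==` on sets with the same meanings; the helpers are spelled out only to mirror the definitions.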
We write

P(X) = {A : A ⊆ X}    (2.1)

for the set of all subsets of X. We call it the power set of X. Since ∅ and X are always subsets of X, P(X) has at least two elements unless X = ∅.

¹ See Exercise 15 for a problem with such a naïve approach.

Example 2.1.1. Let A = {a}. Then P(A) = {∅, {a}}. Let B = {a, b}; then a subset of B is either a subset of A (if it does not contain b) or the union of a subset of A with {b}. So P(B) = {∅, {a}, {b}, {a, b}}.

2.2 Operations on Sets

Given sets X and Y, the set X \ Y consists of the elements of X that are not in Y. We call X \ Y the complement of Y in X. In practice, it is often convenient to fix a set that contains all the objects of interest in our discourse. We call such a set a universe. For example, if number theory is the topic, then Z, the set of integers, is a natural choice of universe. Certainly, at times we may find it necessary to enlarge the universe to some larger set, say to Q, the set of rational numbers. For computer scientists, perhaps the set of all binary strings is a good choice of universe. So in a sense, there is nothing universal about the “universe”.

We write ¬X for the complement² of X in some fixed universe (that contains X). Clearly, the empty set and the universe are complements of each other. Moreover, we have

Proposition 2.2.1.
1. ¬¬X = X.
2. X ⊆ Y if and only if ¬Y ⊆ ¬X.

The intersection of X and Y, denoted by X ∩ Y, is the set whose elements are elements of both X and Y. Two sets are disjoint if they do not have any element in common; in other words, their intersection is empty. The union of X and Y, denoted by X ∪ Y, is the set³ whose elements are either elements of X or elements of Y. We say that X ∪ Y is a disjoint union if X and Y are disjoint. Clearly |X ∪ Y| = |X| + |Y| if the union is a disjoint union.

Example 2.2.2. Let X = {m, a, t, h, e} and Y = {m, a, t, i, c, s}. Then X ∩ Y = {m, a, t} and X ∪ Y = {m, a, t, h, e, i, c, s}.
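Examples 2.1.1 and 2.2.2 can be reproduced on a computer. The following is a minimal sketch, assuming Python's `set` and `frozenset` as models of finite sets and the lower-case English letters as the universe; the function names `power_set` and `complement` are illustrative and do not come from the text.

```python
from itertools import combinations
from string import ascii_lowercase

def power_set(x):
    """P(X): the set of all subsets of X, each stored as a frozenset."""
    items = list(x)
    return {frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)}

# Example 2.1.1: the power sets of {a} and {a, b}.
P_A = power_set({"a"})
P_B = power_set({"a", "b"})

# Example 2.2.2, with the lower-case letters as an assumed universe.
U = set(ascii_lowercase)
X = set("mathe")    # {m, a, t, h, e}
Y = set("matics")   # {m, a, t, i, c, s}

def complement(s, universe=U):
    """¬S: the complement of S in the fixed universe."""
    return universe - s

intersection = X & Y
union = X | Y

# Proposition 2.2.1 checked on this one example (not a proof).
double_complement_ok = complement(complement(X)) == X
order_reversal_ok = (intersection <= X) == (complement(X) <= complement(intersection))

# X ∪ Y is the disjoint union of X \ Y and Y, so the sizes add.
disjoint_sizes_ok = len(union) == len(X - Y) + len(Y)

print(sorted(intersection), sorted(union))
print(len(P_A), len(P_B))
```

Subsets are stored as `frozenset` because Python sets can only contain hashable elements, so a set of ordinary `set` objects is not allowed.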
Alternatively, X ∩ Y can be defined as the common subset of X and Y that contains every common subset of X and Y. Similarly, X ∪ Y is the set that contains both X and Y and is contained in any set that has both X and Y as subsets.

² ∼X is another popular notation for the complement of X with respect to some universe.
³ Union is different from the constructions that we have seen before: it does not specify a set as a subset of another set.

Proposition 2.2.3. For any sets X, Y and Z,
1. X ∩ Y = Y ∩ X, X ∪ Y = Y ∪ X.
2. X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z, X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z.
3. If X ⊆ Y, then Z ∩ X ⊆ Z ∩ Y.
4. X ⊆ Y if and only if X ∩ ¬Y = ∅.

We give a proof of the following statements to illustrate how to apply the results in Propositions 2.2.1 and 2.2.3. As an exercise, try identifying which statements in these two propositions are used in the proofs.

Theorem 2.2.4 (De Morgan’s Law). For sets X and Y,
1. ¬(X ∩ Y) = ¬X ∪ ¬Y.
2. ¬(X ∪ Y) = ¬X ∩ ¬Y.

Proof. Since X, Y ⊆ X ∪ Y, we have ¬(X ∪ Y) ⊆ ¬X, ¬Y and hence ¬(X ∪ Y) ⊆ ¬X ∩ ¬Y. On the other hand, ¬X ∩ ¬Y ⊆ ¬X, ¬Y. Therefore, ¬(¬X ∩ ¬Y) contains both X and Y and hence contains their union. But that means ¬X ∩ ¬Y ⊆ ¬(X ∪ Y). This concludes the proof of (2). The first statement follows from the second by replacing X and Y, respectively, by their complements and then taking complements of both sides.

As an application of De Morgan’s Law, let us show that

Proposition 2.2.5. For any sets X, Y and Z, X ∩ Y ⊆ Z if and only if X ⊆ ¬Y ∪ Z.

Proof. X ∩ Y ⊆ Z if and only if (X ∩ Y) ∩ ¬Z = X ∩ (Y ∩ ¬Z) = ∅, if and only if X ⊆ ¬(Y ∩ ¬Z). By De Morgan’s Law, the last set is ¬Y ∪ Z.

We leave the proofs of the following statements as an exercise.

Theorem 2.2.6 (Distributive Laws). For any sets X, Y and Z,
1. X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z).
2. X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z).

The (Cartesian) product X × Y is the set of ordered pairs {(x, y) : x ∈ X, y ∈ Y}. If X = Y, we often write X² for X × X.
For finite sets X and Y, it is easy to see that |X × Y| = |X||Y|. In particular, if one of them is empty then so is their product.

Example 2.2.7. Let X = {0, 1} and Y = {a, b, c}. Then

X² = {(0, 0), (0, 1), (1, 0), (1, 1)},
X × Y = {(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)}.

2.2.1 Exercises

1. Let A = {d, i, s, c, r, e, t}, B = {m, a, t, h, e, i, c, s} and let the universe U be the set of lower case English letters. Write down the following sets:
(a) A ∪ B
(b) A ∩ B
(c) A \ B
(d) B \ A
(e) ¬A ∩ ¬B
(f) ¬(A ∪ B)
The answers for Parts (1e) and (1f) are the same, as guaranteed by De Morgan’s Law.

2. Repeat Exercise (1) for U = {n ∈ Z : 0 ≤ n ≤ 20}, where A is the set of multiples of 3 in U and B is the set of multiples of 5 in U.

3. For n ∈ N, let C_n be the subset of N consisting of the multiples of n.
(a) Find k such that C_k = C_2 ∩ C_3.
(b) Find k such that C_k = C_12 ∩ C_18.
(c) In general, for any m, n ∈ N, show that there is a k ∈ N such that C_k = C_n ∩ C_m.

4. (Symmetric Difference.) The symmetric difference of two sets X and Y, denoted by X∆Y, is defined to be the set (X \ Y) ∪ (Y \ X). Show that X∆Y = (X ∪ Y) \ (X ∩ Y). Deduce that X = Y if and only if X∆Y = ∅.

5. If X∆Y = X∆Z, is it necessary that Y = Z? If the answer is “yes”, give a proof. If the answer is “no”, give an example to show why it is not true.

6. What can be said about two sets X and Y if X \ Y = Y \ X?

7. Suppose X and Y are finite sets. Show that
(a) |X| = |X \ Y| + |X ∩ Y|.
(b) |X ∪ Y| = |X| + |Y| − |X ∩ Y|.

8. (Ordered Pairs.) Define (a, b) as the set {{a}, {a, b}}. Show that (a, b) = (c, d) if and only if a = c and b = d.

9. Show that for X, Y nonempty, X × Y = Y × X only if X = Y.

10. Write down P(∅) and P(P(∅)).

11. (Power Set.) Let X = {a, b, c}.
(a) Write down P(X). What is the size of P(X)?
(b) In general, what is the size of the power set of a set of size n?

12. Let X = {0, 1} and Y = {a, b}.
(a) What is the size of P(X × Y)?
(b) Write down the set of all size 2 subsets of X × Y, i.e. the set {A ⊆ X × Y : |A| = 2}.

13. A partition of X is a subset of P(X) consisting of pairwise disjoint non-empty subsets whose union is X.
(a) Find the set of all partitions of {a, b, c}.
(b) How many partitions are there of a set of size n?

14. Prove the Distributive Laws (2.2.6).

15. (Russell’s Paradox) Care must be taken when we define a set by a property (a formula expressible using the ∈ relation). For example, consider the set

U = {X : X ∉ X}.

Intuitively, U is the set of all sets that do not contain themselves as elements. Does U belong to itself? Show that there is a contradiction either way.

2.3 Relations and Functions

A relation from X to Y is simply a subset of X × Y. Let R be a relation from X to Y and S be a relation from Y to Z. We denote by S ∘ R the composition of S and R, which is the following relation from X to Z:

{(x, z) ∈ X × Z : there exists y ∈ Y such that (x, y) ∈ R and (y, z) ∈ S}.

Composition of relations is associative [4], that is, S ∘ (R ∘ T) = (S ∘ R) ∘ T. The inverse of a relation R ⊆ X × Y, denoted by R⁻¹, is the relation

{(y, x) ∈ Y × X : (x, y) ∈ R}.

The inverse of R⁻¹ is clearly R itself. Note also that (S ∘ R)⁻¹ = R⁻¹ ∘ S⁻¹.

A function from X to Y is a triple (f, X, Y) where f ⊆ X × Y is a relation from X to Y such that for any x ∈ X there is a unique y ∈ Y such that (x, y) ∈ f. We call X the domain, Y the co-domain and f the graph of the function (f, X, Y). We often refer to a function by its graph and, instead of (f, X, Y), we write f : X → Y; instead of (x, y) ∈ f, we write f(x) = y. The range of f, denoted by Rng f, is the set of y ∈ Y such that y = f(x) for some x ∈ X. In other words,

Rng f = {y ∈ Y : y = f(x) for some x ∈ X}.

We often write Rng f as f(X). Let Z be a subset of Y. The inverse image of Z under f is the following subset of X:

f⁻¹(Z) = {x ∈ X : f(x) ∈ Z}.
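The definitions above translate almost word for word into code. The following Python sketch represents a relation as a set of pairs; the function names (compose, inverse, is_function, inverse_image, rng) are our own, and the data are those of Example 2.3.1, which appears later in this section.

```python
def compose(S, R):
    """S ∘ R = {(x, z) : (x, y) ∈ R and (y, z) ∈ S for some y}."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def inverse(R):
    """R⁻¹ = {(y, x) : (x, y) ∈ R}."""
    return {(y, x) for (x, y) in R}

def is_function(f, X, Y):
    """True if f ⊆ X × Y relates each x ∈ X to exactly one y ∈ Y."""
    if not all(x in X and y in Y for (x, y) in f):
        return False
    return all(sum(1 for (a, _) in f if a == x) == 1 for x in X)

def rng(f):
    """Rng f, the set of second coordinates of f."""
    return {y for (_, y) in f}

def inverse_image(f, Z):
    """f⁻¹(Z) = {x : f(x) ∈ Z}."""
    return {x for (x, y) in f if y in Z}

# Data of Example 2.3.1.
X = {"a", "b", "c"}
Y = {0, 1, 2}
Z = {"α", "β", "γ"}
R = {("a", 1), ("b", 1), ("c", 1), ("c", 2)}
S = {(0, "α"), (1, "β"), (2, "α")}

S_after_R = compose(S, R)
```

Representing a function by its graph, as the text does, means that is_function needs the domain and co-domain as extra arguments: the set of pairs alone does not determine them.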
If Z = {z} is a singleton, we write f⁻¹(z) instead of f⁻¹({z}).

First, let us give two unorthodox examples. A moment of thought should convince you that “boyfriends” is only a relation but not a function, since one may have more than one boyfriend. However, (biological) “father” is not only a relation but a function as well [5].

Example 2.3.1. Let X = {a, b, c}, Y = {0, 1, 2} and Z = {α, β, γ}. Let R = {(a, 1), (b, 1), (c, 1), (c, 2)} and S = {(0, α), (1, β), (2, α)}. Then

S ∘ R = {(a, β), (b, β), (c, β), (c, α)},
R⁻¹ = {(1, a), (1, b), (1, c), (2, c)},
S⁻¹ = {(α, 0), (α, 2), (β, 1)}.

Neither R nor R⁻¹ is a function. S is a function but S⁻¹ is not a function. Thus the inverse of a function in general need not be a function.

[4] A non-associative product that you may be familiar with is the cross product of vectors.
[5] As genetics progresses, this may change in the future.

Example 2.3.2. For a real number r, let ⌊r⌋ be the greatest integer not exceeding r. We call this integer the floor of r. We call the function from R to R sending each real number to its floor the floor function.

Let f be a function from A to B. We say that f is 1-to-1 (or injective, or an injection) if f(a) = f(a′) implies a = a′, for every a, a′ ∈ A. In other words, no two distinct elements of A map to the same element of B under f. Put yet another way, f⁻¹(b) is a singleton for every b ∈ Rng f. We say that f is onto (or surjective, or a surjection) if for every b ∈ B there is an a ∈ A such that f(a) = b. In other words, every element of B is the image of some element of A. Put yet another way, f is onto means that its range and its co-domain coincide. A function that is both 1-to-1 and onto is bijective (or a 1-to-1 correspondence, or a bijection).

Example 2.3.3.
1. The function S in Example 2.3.1 is neither 1-to-1 nor onto.
2. The range of the floor function (Example 2.3.2) is Z, the set of integers. Therefore, it is not surjective.
The inverse image of an integer n is the half-open interval [n, n + 1), hence the floor function is not 1-to-1 either.

Example 2.3.4.
1. Let X be a subset of Y. The map ι : X → Y sending each x ∈ X to itself is an injection.
2. For any non-empty sets X and Y, the map π_X sending (x, y) ∈ X × Y to x and the map π_Y sending (x, y) ∈ X × Y to y are clearly both surjective. We call π_X and π_Y the canonical projections of the product X × Y.

Example 2.3.5. A permutation of a set X is a bijection from X to itself. Let S(X) denote the set of permutations of X. The identity map of X to itself, id_X, is a permutation of X. The composition of two permutations is a permutation (Exercise 2). Moreover, the inverse of a permutation of X is a permutation (Exercise 4). Since function composition is associative (Exercise 1), S(X) together with function composition forms a group. We call it the symmetric group on X. In particular, we write S_n instead of S(X) if X is finite with n elements, and we write, for example,

( 1 2 3 )
( 2 1 3 )

for the element of S_3 that maps 1 ↦ 2, 2 ↦ 1 and 3 ↦ 3.

Example 2.3.6. There is a 1-to-1 correspondence between P(X) and the set of functions from X to 2 = {0, 1}. Namely, to each A ⊆ X we associate its membership function χ_A : X → 2,

χ_A(x) = 0 if x ∉ A,
χ_A(x) = 1 if x ∈ A.

Conversely, to each function f : X → 2, we associate the subset f⁻¹(1) ∈ P(X). One checks readily that these two maps are inverses of each other. It follows easily from this observation that, for finite X, |P(X)| is 2^|X|.

We use the notation Y^X to denote the set of functions from X to Y. This seemingly odd notation is, in fact, rather natural: for non-empty finite sets X and Y, |Y^X| = |Y|^|X|.

2.3.1 Cardinality

The cardinality of a finite set is simply the number of elements that it has. Does it make sense to talk about sizes of infinite sets? After all, don't they all have the same size, namely infinite?
It turns out, perhaps surprisingly the first time you hear it, that there are different levels of “infinity”. We encourage the reader to learn more about cardinalities by reading a set theory book. For us, we will distinguish two types of “infinity”, namely countably infinite and uncountably infinite. Let us begin with the following definition:

Definition 2.3.7. Two sets X and Y (possibly infinite) have the same cardinality, denoted by |X| = |Y|, if there is a 1-to-1 correspondence between them. We say that a set is countably infinite if it has the same cardinality as the set of natural numbers N. A set is countable if it is either finite or countably infinite. A set is uncountable if it is not countable.

Since the composition of two bijections is a bijection, if |X| = |Y| and |Y| = |Z| then |X| = |Z|.

Example 2.3.8. The set of non-negative integers ω := N ∪ {0} is countably infinite since n ↦ n + 1 is a bijection from ω to N.

Example 2.3.9. The set of even numbers E is countably infinite since n ↦ 2n is a bijection from N to E. The set of odd numbers O is also countably infinite since m ↦ m − 1 is a bijection between E and O.

Definition 2.3.10. We write |X| ≤ |Y| if there is an injection from X to Y. In this case, we say that the cardinality of X is at most the cardinality of Y. We write |X| < |Y| if |X| ≤ |Y| but |X| ≠ |Y|.

If X is infinite then X \ {x}, where x ∈ X, is still infinite. From this, it follows easily that there is an injection from N to X for any infinite set X. Thus |N| ≤ |X| for any infinite X. In other words, |N| is the least infinite “size”.

Theorem 2.3.11 (Cantor, Bernstein, Schroeder). If |X| ≤ |Y| and |Y| ≤ |X| then |X| = |Y|.

Example 2.3.12. Suppose X is an infinite subset of N. Then the natural inclusion of X into N is injective, so |X| ≤ |N|. On the other hand, since X is infinite, |N| ≤ |X|. Therefore, by Theorem 2.3.11, we conclude that X is countably infinite.
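A computer cannot inspect an infinite set, but the bijections of Examples 2.3.8 and 2.3.9 can at least be spot-checked on initial segments. The Python sketch below does so under two assumptions of ours: N = {1, 2, 3, . . .}, in line with ω := N ∪ {0}, and E and O are the positive even and odd numbers. The cutoff value 50 and the helper is_bijection_on are hypothetical choices made for the illustration.

```python
def is_bijection_on(f, domain, codomain):
    """Check that f maps the finite set `domain` bijectively onto `codomain`."""
    images = [f(x) for x in domain]
    injective = len(set(images)) == len(images)
    surjective = set(images) == set(codomain)
    return injective and surjective

CUTOFF = 50  # hypothetical size of the initial segments

omega_seg = range(0, CUTOFF)             # {0, ..., 49}, a segment of ω
naturals_seg = range(1, CUTOFF + 1)      # {1, ..., 50}, a segment of N
evens_seg = range(2, 2 * CUTOFF + 1, 2)  # {2, 4, ..., 100}, a segment of E
odds_seg = range(1, 2 * CUTOFF, 2)       # {1, 3, ..., 99}, a segment of O

shift_ok = is_bijection_on(lambda n: n + 1, omega_seg, naturals_seg)
double_ok = is_bijection_on(lambda n: 2 * n, naturals_seg, evens_seg)
pred_ok = is_bijection_on(lambda m: m - 1, evens_seg, odds_seg)
```

Each flag only confirms that the map restricts to a bijection between the two finite segments; the statement about the infinite sets still requires the short argument given in the examples.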
It is clear that there is a surjection from N to any non-empty countable set X. Conversely, if there is a surjection, say f, from N to X, then sending each x ∈ X to its least preimage is an injection from X into N. Thus we have shown that

Proposition 2.3.13. A non-empty set X is countable if and only if there is a surjection from N to X.

Proposition 2.3.14. The product of two countable sets is countable.

Proof. We may assume that X and Y are non-empty, since otherwise X × Y is empty. Suppose f is a surjection from N to X and g is a surjection from N to Y. Then the map f × g sending (i, j) to (f(i), g(j)) is a surjection from N² to X × Y. So it remains to show that N² is countable. But this follows immediately by counting N² “diagonally”:

(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), . . .

Proposition 2.3.15. The union of countably many countable sets is countable.

Proof. Suppose X_n (n ∈ N) are countable sets, which we may assume to be non-empty. Let f_n be a surjection from N to X_n. Then the map (i, j) ↦ f_i(j) is a surjection from the countable set N² to the union ⋃_{n∈N} X_n, and hence the union is countable.

Example 2.3.16. The set of integers is the union of three countable sets, namely N, {0} and −N, and so is countable according to Proposition 2.3.15.

Example 2.3.17. The set of rational numbers Q is the union of

Q_n = {m/n : m ∈ Z}    (n ∈ N)

and {0}. Each Q_n is in bijection with Z and hence countable. Therefore, by Proposition 2.3.15, Q itself is countable.

Theorem 2.3.18 (Cantor). For any set X, |X| < |P(X)|.

Proof. Mapping x to the subset {x} of X is clearly an injective map from X to P(X). Therefore, |X| ≤ |P(X)|. Since there is a bijection between P(X) and 2^X, it remains to show that there is no bijection between X and 2^X. Suppose on the contrary that x ↦ f_x is a bijection between X and 2^X. Then the map g defined by g(x) = 1 − f_x(x) is a map from X to 2 that differs from each f_x at x, contradicting the surjectivity of x ↦ f_x.

2.3.2 Exercises

1. (Associativity of relation composition) Suppose R ⊆ X × Y, S ⊆ Y × Z, T ⊆ W × X.
Show that S ∘ (R ∘ T) and (S ∘ R) ∘ T are the same relation from W to Z.

2. (Composition of functions) Show that
(a) the composition of two injections is an injection.
(b) the composition of two surjections is a surjection.
(c) Deduce that the composition of two bijections is a bijection.

3. Compute R⁻¹ ∘ S⁻¹ in Example 2.3.1. Verify that R⁻¹ ∘ S⁻¹ = (S ∘ R)⁻¹.

4. (Inverse of a function) Suppose f is a function from X to Y. Show that its inverse relation f⁻¹ is a function from Y to X if and only if f is bijective.

5. Let X, Y be sets.
(a) Show that if X = ∅ then Y^X = {∅}.
(b) Show that if Y = ∅ and X ≠ ∅, then Y^X = ∅.

6. Let X = {a, b, c}, Y = {0, 1, 2, 3} and Z = {α, β}.
(a) Give a function f : X → Y and a function g : Y → Z such that g ∘ f is onto but f is NOT 1-to-1. Represent your functions by pictures.
(b) Write down the function g ∘ f that you got in Part (6a) as a set of ordered pairs.

7. Given X = {p, q, r}, Y = {0, 1, 2, 3}.
(a) Give a function f : X → Y which is NOT injective.
(b) Give a function g : Y → X which is NOT surjective.
(c) Write the composition g ∘ f as a set of ordered pairs.

8. Let f be a function from X to Y. Prove or give a counterexample to the following statements:
(a) If A ⊆ X, f⁻¹(f(A)) = A.
(b) If B ⊆ Y, f(f⁻¹(B)) = B.

9. Let f be a function from X to Y. Prove or give a counterexample to the following statements:
(a) If A ⊆ X, f(X \ A) = f(X) \ f(A).
(b) If C ⊆ Y, f⁻¹(Y \ C) = f⁻¹(Y) \ f⁻¹(C).
(c) If A, B ⊆ X, f(A ∩ B) = f(A) ∩ f(B).
(d) If A, B ⊆ X, f(A ∪ B) = f(A) ∪ f(B).
(e) If C, D ⊆ Y, f⁻¹(C ∩ D) = f⁻¹(C) ∩ f⁻¹(D).
(f) If C, D ⊆ Y, f⁻¹(C ∪ D) = f⁻¹(C) ∪ f⁻¹(D).

10. Let f, g and h be functions. Prove or give a counterexample to the following statements:
(a) If h ∘ f = h ∘ g and h is 1-to-1, then f = g.
(b) If h ∘ f = h ∘ g and h is onto, then f = g.
(c) If f ∘ h = g ∘ h and h is 1-to-1, then f = g.
(d) If f ∘ h = g ∘ h and h is onto, then f = g.

11. Let σ, τ ∈ S_4 be

σ = ( 1 2 3 4 )        τ = ( 1 2 3 4 )
    ( 2 3 4 1 ),           ( 2 4 1 3 )

(a) Compute σ ∘ τ.
(b) Compute τ ∘ σ.
(c) Compute τ⁻¹.
(d) Compute σ³ (i.e. σ ∘ σ ∘ σ).

12. Show that the product of finitely many countable sets is countable.

13. Give an explicit formula for the map from N to N² given in the proof of Proposition 2.3.14.

2.4 Equivalence Relations

Equality must be among the first few relations that we encounter in our lives. However, a moment of thought should convince you that the daily usage of the word “equality” has more to do with justice than mathematics. In fact, the more you think about equality the more peculiar it seems. What does it mean for two things to be equal? The answer given by set theory is that two sets are equal if and only if they have the same elements. Certainly, you will then ask what elements are, or how we can tell whether two elements are the same. Now you may appreciate why people tried to build everything out of nothing (the empty set).

On the other hand, is equality what we really care about? Probably not, at least not most of the time. Every cell phone manufactured has a unique serial number. But wouldn’t it be strange if you went into a store and asked to buy a phone by its serial number? Also think about the profile of your ideal partner that you put on a dating site... Does it really specify “the one”? Trying to answer these questions should convince you that, most of the time, we only care about things up to some sort of equivalence.

So what are the essential properties of equality? First, everything should be equal to itself, right? Second, if a = b then b = a. Finally, if a = b and b = c, then a should be equal to c. Each of these properties is important on its own, so we will give each of them a name. A relation R on a set X (i.e. R ⊆ X × X, and we write aRb for (a, b) ∈ R) is
• reflexive if for any a ∈ X, aRa.
• symmetric if for any a, b ∈ X, aRb implies bRa.
• transitive if for any a, b, c ∈ X, aRb and bRc imply aRc.

An equivalence relation on X is a reflexive, symmetric, transitive relation on X. Let E be an equivalence relation on X. We say that two elements x, y ∈ X are equivalent under E (or E-equivalent) if xEy.

Example 2.4.1.
1. On any set X, equality = is an equivalence relation. It is the finest [6] equivalence relation on X. That means every equivalence relation E on X contains equality.
2. On any set X, X × X is an equivalence relation (any two elements of X are equivalent with respect to this relation). It is clearly the coarsest (equivalence) relation on X.
3. Let X = {a, b, c}. Then R = {(a, a), (b, b), (c, c), (a, b), (b, a)} is an equivalence relation. But S = R ∪ {(a, c), (c, a)} is not: even though S is both reflexive and symmetric, it fails to be transitive.
4. Given a natural number n, two integers a and b are equivalent mod n, written a ≡ b (mod n), if a and b have the same remainder when divided by n. In other words, a ≡ b (mod n) if (b − a) is divisible by n. This is an equivalence relation on Z.

[6] If R and S are two relations on the same set X, we say that R is finer than S if R ⊆ S.

Given an equivalence relation E on X, the E-equivalence class of an element x ∈ X, denoted by [x]_E, is the subset of X consisting of the elements that are equivalent to x (under E):

[x]_E = {y ∈ X : xEy}.

So an equivalence class of E is simply a subset of X consisting of all elements of X that are E-equivalent to each other. Also, it is clear that xEy exactly means [x]_E = [y]_E.

Example 2.4.2. Referring to Example 2.4.1:
1. The equivalence classes of = are the singletons {x} (x ∈ X). In other words, for each x ∈ X, [x]_= is {x}.
2. The only equivalence class of X × X is X itself. So for any x, y ∈ X, [x]_{X×X} = [y]_{X×X} = X. In other words, everything is equivalent to everything else.
3. There are n equivalence classes of ≡ (mod n).
To be even more concrete, let’s take n = 3. Then all multiples of 3 are equivalent, i.e. [0] = {0, ±3, ±6, . . .} is an equivalence class, while [1] and [2] are the other two equivalence classes of this relation.

As you can see, in each case the set of equivalence classes forms a partition of X. This is not a coincidence.

Proposition 2.4.3. Let E be an equivalence relation on X. Then X/E := {[x]_E : x ∈ X} is a partition of X.

Proof. Since x ∈ [x]_E for each x ∈ X, the union of the elements of X/E is X. Suppose [x] ∩ [y] ≠ ∅. Pick a z in the intersection. Thus zEx and zEy. By symmetry and transitivity, xEy, but that means [x] = [y]. In other words, two distinct equivalence classes must be disjoint. This completes the proof.

Conversely, given a partition P of X, one gets an equivalence relation E_P by declaring two elements of X to be E_P-equivalent if they are in the same member of P. An example should clarify this concept.

Example 2.4.4. Consider the partition P = {{a, c}, {b}, {d, e, f}} of the set X = {a, b, c, d, e, f}. Then E_P is the equivalence relation

{(x, x) : x ∈ X} ∪ {(a, c), (c, a), (d, e), (e, d), (d, f), (f, d), (e, f), (f, e)}.

It is often useful to identify elements that are equivalent (that is why we call them equivalent to begin with). Hence it is instructive to think of the elements of X/E as individual entities themselves (instead of subsets of X). The difference here is philosophical rather than mathematical. We speak of X/E as the quotient of X by E. The surjection π_E : X → X/E, sending x to its equivalence class [x]_E, is called the canonical map (or canonical projection). Conversely, given a surjection π : X → Y with domain X, the set P_π = {π⁻¹(y) : y ∈ Y} of inverse images of elements of Y under π forms a partition of X, and we write E_π for the equivalence relation corresponding to the partition P_π.

Example 2.4.5.
Let X = {a, b, c, d, e, f} and Y = {0, 1, 2}, and let π be the surjection defined by π(a) = π(c) = 0, π(b) = 1 and π(d) = π(e) = π(f) = 2. Then one checks readily that E_π is the equivalence relation in Example 2.4.4.

To summarize, we have described a 1-to-1 correspondence between the following three kinds of objects:
• Equivalence relations on X.
• Partitions of X.
• Surjections with domain X.

Exercises

1. For each of the following conditions, give a relation that satisfies them.
(a) reflexive, symmetric but not transitive.
(b) reflexive, transitive but not symmetric.
(c) symmetric, transitive but not reflexive.

2. For each of the following relations, decide whether it is an equivalence relation. If not, state which of the three properties of an equivalence relation it fails to possess.

3. Suppose E and E′ are equivalence relations on the same set X. We say that E′ refines E if aE′b implies aEb for any a, b ∈ X. Consider the partition P = {{c, a}, {d}, {e, f, b}} of X.
(a) Write down X.
(b) Write down E_P, the equivalence relation associated to P, as a set of ordered pairs.
(c) Give the partition of an equivalence relation on X that refines E_P but is neither the equality on X nor E_P itself.

4. (Bell Numbers) Let B_n be the number of partitions (equivalently, the number of equivalence relations) on a set of size n.
(a) What should B_0 be?
(b) Write down all partitions of the set {a, b, c}. Hence find B_3.
(c) Find B_4.

2.5 Orderings

An order relation is another type of relation frequently encountered in real life. Think about ranking the “cuteness” of the guys (or girls) at a party. You may decide that John is cuter than Joe and Ryan is cuter than John, so you know Ryan is cuter than Joe (be consistent with yourself!). However, Dave is different: you can’t quite compare him to the other guys.
Yep, this is essentially what an order is about, except that in real life we often compare only a certain aspect of an object, so it is hardly ever the case that the relation we have in mind is anti-symmetric (which we will define next).

A relation is a partial order (or simply an order) if it is reflexive, transitive and anti-symmetric. We have already defined reflexive and transitive. A relation R on X is anti-symmetric if for any a, b ∈ X, aRb and bRa together imply a = b. Here is where the “cuteness” analogy breaks down. Two guys that are equally cute to you need not be the same person [7].

[7] A relation that is reflexive and transitive is called a pre-order.

From now on, we will use a more suggestive symbol ⪯ to denote a general partial order. A poset is a pair (X, ⪯) where X is a set and ⪯ is a partial order on X. We often simply say that X is a poset if the partial order is understood. We say that a and b are ⪯-comparable (or simply comparable) if either a ⪯ b or b ⪯ a. A subset C of X is a chain if any two elements of C are ⪯-comparable. Clearly, a subset of a chain is also a chain. A subset A of X is an anti-chain if no two distinct elements of A are comparable. Again, it is clear that every subset of an anti-chain is an anti-chain. Also, every singleton {x} (x ∈ X) is both a chain and an anti-chain. We say that ⪯ is a total order (or linear order) on X if X itself is a chain. In that case, we say that (X, ⪯) (or simply X) is a totally (or linearly) ordered set.

Example 2.5.1.
1. Equality is a partial order. It is not an interesting one since no two distinct elements are comparable.
2. The usual order ≤ on R is a total order.
3. Divisibility is a partial order on N. More precisely, for natural numbers a and b, let a ≤_d b mean “a divides b”. One checks easily that ≤_d is a partial order on N. However, it is not a total order since, for example, 3 and 5 are not comparable with respect to ≤_d.
4. For any set X, the inclusion ⊆ is a partial order on ℘(X).
It is not a total order unless |X| ≤ 1. When X is finite of size n, we identify the poset (℘(X), ⊆) with the Boolean algebra on n atoms, denoted by B_n. Note that |B_n| = 2^n.
5. Let Σ be a non-empty set. Let Σ∗ be the set of strings (i.e. finite sequences) over Σ. We say that a string u is a prefix (or initial segment) of v ∈ Σ∗, denoted by u ⊑ v, if v = uw for some string w. It is easy to see that ⊑ is a partial order on Σ∗.
6. Suppose (Σ, ≤) is a totally ordered set. The lexicographic order is a total order on the set of sequences of Σ. For sequences α, β of Σ, α ≺ β if α_i < β_i, where i is the first index at which α and β disagree.

Let Y be a subset of a poset (X, ⪯). Then
• An element a of Y is the greatest (resp. least) element of Y if for any y ∈ Y, y ⪯ a (resp. a ⪯ y).
• An element a of Y is a maximal (resp. minimal) element of Y if for any y ∈ Y, a ⪯ y (resp. y ⪯ a) implies a = y.
• An element b of X is an upper (resp. a lower) bound of Y if y ⪯ b (resp. b ⪯ y) for every y ∈ Y.
• An element s of X is a supremum (resp. an infimum) of Y if s is the least (resp. greatest) element of the set of upper (resp. lower) bounds of Y. We use sup Y (resp. inf Y) to denote the supremum (resp. infimum) of Y.

A few remarks are in order.
1. Y can have at most one greatest (resp. least) element. In other words, Y may not have a greatest (resp. least) element, but if it has one, it has only one. This is because if a, a′ ∈ Y are greatest elements of Y, then first of all both of them are in Y. Since a is a greatest element of Y, a′ ⪯ a. Likewise, since a′ is also a greatest element of Y, a ⪯ a′. Thus a = a′ by the anti-symmetry of ⪯. Consequently, sup Y (resp. inf Y), as the least (resp. greatest) element of the set of upper (resp. lower) bounds of Y, must be unique if it exists.
2. Y may have more than one maximal (minimal) element.
3.
In a totally ordered set, since any two elements are comparable, the concepts of maximal (minimal) element and greatest (least) element coincide.
4. The greatest (resp. least) element of Y, if it exists, is a maximal (resp. minimal) element of Y as well, but not vice versa.
5. An upper (lower) bound of Y need not be in Y.
6. A subset Y of X need not have a supremum (infimum). However, as the least (greatest) element of the set of upper (lower) bounds of Y, sup Y (inf Y) must be unique if it exists.
7. Also, if the greatest (least) element of Y exists, then it must also be the supremum (infimum) of Y, but not vice versa.
8. The concepts of supremum and infimum may sound a bit foreign at first. However, many familiar concepts in mathematics are just suprema and infima in disguise. For example,
(a) If the poset is (N, |), then an upper bound of a subset Y is just a common multiple of the members of Y. Hence sup Y is simply the l.c.m. (least common multiple) of the elements of Y. Dually, inf Y is the g.c.d. (greatest common divisor) of the elements of Y.
(b) Next let us consider the poset (℘(X), ⊆). An upper bound of a subset Y of ℘(X) (i.e. Y is a collection of subsets of X) is simply a subset of X that contains each of the members of Y. Hence sup Y is simply the union of the elements of Y. Likewise, inf Y is simply the intersection of the members of Y.

An example should make these concepts clear.

Example 2.5.2. Consider the poset X with the following Hasse diagram.

[Figure: Hasse diagram of the poset X on the elements a, b, c, d, e, f, g, h, i, j.]

1. The subset {c} of X is a chain (as we have mentioned, any singleton is a chain). So is the subset {a, c}. However, the subset {a, d} is not a chain since a and d are not comparable.
2. The chains of X are elements of ℘(X), so they are partially ordered by inclusion ⊆. In this sense, a maximal chain is a chain that is not properly contained in any other chain. For example, {a, c, f, h, j} is a maximal chain but so is {a, c, e, h, j}.
On the other hand, {c, g, j} is not a maximal chain because the chain {a, c, g, i, j} contains it properly.
3. Let Y = {a, b, c}. Then the minimal elements of Y are a and b. The maximal elements of Y are b and c. Note that b is both maximal and minimal. Y has neither a greatest element nor a least element.
4. The set of lower bounds of Y is L = {a}. The set of upper bounds of Y is U = {e, f, h, i, j}. Clearly a itself is the greatest element of {a}, so inf Y = a. On the other hand, since U does not have a least element, sup Y does not exist.
5. Now consider the subset Y = {e, f}. Then the set of upper bounds of Y is {h, j}, hence sup Y = h. The set of lower bounds of Y is {a, b, c}, and since it has no greatest element, inf Y does not exist.
6. A trickier question is: what is sup ∅? Since any element of X is an upper bound of ∅, the set of upper bounds of ∅ is just X itself. Therefore sup ∅ would be the least element of X, if any. Since the X in our example has no least element, sup ∅ does not exist in this case. By the same token, inf ∅ should be the greatest element of X, if any, and so for this particular example, inf ∅ = j.

A partial order ⪯ on X is a well order on X if every non-empty subset of X has a least element. Note that this condition already implies that ⪯ is a total order on X, since for any two elements a, b ∈ X, the subset {a, b} has a least element; in particular, a and b must be ⪯-comparable. Clearly, a finite total order must be a well order.

Zorn’s lemma is the assertion that if every chain in a partially ordered set has an upper bound then the partially ordered set has a maximal element. It is equivalent to the Axiom of Choice and to the well-ordering principle. The first one asserts that the product of a non-empty family of non-empty sets is non-empty, and the second one asserts that every non-empty set can be well-ordered. The equivalences between them are non-trivial; unfortunately we won’t have time to go over them. See, for example, Halmos’s classic Naive Set Theory if you are interested.

Example 2.5.3.
1. The usual order ≤ on N is a well order. However, the usual order on R is not: for example, the non-empty subset (0, 1) has no least element.
2. The lexicographical order on English words is another example of a well order.
3. Consider (N, ≤_d). Then 1 is the least element. If you consider the subset N \ {1}, then the primes are minimal (but not least) elements.
4. X is the greatest element of the poset (℘(X), ⊆). If X is finite, then the singletons are the minimal elements of the subset S := ℘(X) \ {X, ∅}, while the subsets of size |X| − 1 are the maximal elements. X is an upper bound of S (note that X ∉ S) while ∅ is a lower bound of S.
5. Let X = {a, b, . . . , j}, and consider the partial order ≤ on X given by

a ≤ d ≤ f, g ≤ h ≤ i;    b, c ≤ e;    j

(here d ≤ f, g means that d ≤ f and d ≤ g, and j is comparable only to itself). Draw its Hasse diagram! The maximal elements of X are i, e and j, and the minimal elements are a, b, c and j. Note that j is both a maximal and a minimal element of X. The subset {i, h, g, d, a} is a chain and so is {i, h, f, d, a}. (How many chains of X are there?) Both are maximal chains of X (if we order the chains of (X, ≤) by inclusion). The set {f, g, e, j} is a maximal anti-chain of X. Let Y be the subset {h, f, g, d} of X. Then h is the greatest element of Y and d is the least element of Y. The upper bounds of Y are i and h, with h being the least upper bound. The lower bounds are a and d, with d being the greatest lower bound. The subset {b, c} has an upper bound e but no lower bounds. The subset {f, b} has neither upper bounds nor lower bounds.

2.5.1 Embeddings

Let (X, ⪯_X) and (Y, ⪯_Y) be two posets. A map ϕ : X → Y is order preserving if x ⪯_X x′ implies ϕ(x) ⪯_Y ϕ(x′). An embedding is a 1-to-1 map ϕ such that x ⪯_X x′ if and only if ϕ(x) ⪯_Y ϕ(x′). An isomorphism is a surjective embedding.

Example 2.5.4.
1.
Suppose |X| ≥ 2. Then any constant map from X to Y is an order preserving map but not an embedding (since it is not 1-to-1).
2. A bijection from an anti-chain (of size > 1) to a chain is order preserving but not an embedding.
3. Let X be a poset. One can check that the map sending x ∈ X to the set

↓x := {y ∈ X : y ⪯ x}

is an embedding of X into (℘(X), ⊆). We call it the down-set embedding of X. In particular, every poset of size n embeds into B_n, the Boolean algebra of size 2^n.

Exercises

1. Using the fact that there are infinitely many primes, show that every finite Boolean algebra can be embedded into (N, |). Deduce that (N, |) is universal for finite posets, in the sense that every finite poset can be embedded into (N, |).

2. Show that every partial order on a finite set X can be extended to a total order on X. In fact, by Zorn’s Lemma, one can show that the same holds for infinite X.

3. Consider the posets P1 and P2.

[Figure: Hasse diagrams of the posets P1 and P2; the node labels that appear are α, β, γ and t, u, v, w, x, y, z.]

(a) Give two different embeddings from P1 to P2.
(b) How many embeddings are there from P1 to P2?