Introduction to Mathematical Proofs
Wai Yai Pong
November 4, 2015
Contents
1 Dive into proofs
1.1 Why do proofs and how to do them?
1.2 Exercises
1.3 Mathematical Induction
1.4 Exercises
2 Some Set Theory
2.1 Sets and subsets
2.2 Operations on Sets
2.2.1 Exercises
2.3 Relations and Functions
2.3.1 Cardinality
2.3.2 Exercises
2.4 Equivalence Relations
2.5 Orderings
2.5.1 Embeddings
Chapter 1
Dive into proofs
1.1 Why do proofs and how to do them?
The way to demonstrate the validity of a mathematical statement is to give a
proof. This is much like how performing an experiment is the way to demonstrate
the validity of a claim in physical science. A math student should be able to
write proofs, much like a chemistry student should be able to carry out
chemistry experiments: it is absolutely essential.
Writing proofs is much like writing poems. It consists of two parts: 1) having
an idea about the objects that you want to write about, and 2) expressing your
ideas in the “form” of a poem. Say you want to write a poem about mountains. First
you have to have some “feelings” towards the mountains that you want to write
about. Then you decide on the format in which you want to express these ideas.
For instance, there is a format for Chinese poems in which a poem has 4 verses
and each verse consists of 7 words. Thus, you have to accomplish whatever
you want to say about the mountains in 28 words. There are additional rules
too: for example, the last words of the 1st and the 2nd verses rhyme, and so do
the last words of the 3rd and the 4th verses.
If you want to prove a mathematical statement, first you have to have some
ideas on why the statement is true. Then you express your ideas according to
the “rules” of logic. It is the second part of proof-writing that can be “taught”.
The first part can only be “learned”. So how does one learn to write proofs? Well,
the analogy continues: how does one learn poem-writing (or song-writing or
painting)? If you ask a poet, the answer will be: read a lot of poems first! The
underlying message is that one should memorize a bunch of poems first (and
understand the “messages” carried by those poems)! If someone claims to be a poet
but can hardly recite any poem, will you believe the claim?
With this philosophy in mind, let’s dive into proof-writing. Here is our first
statement.
Statement 1. The sum of two even numbers is even.
Well, you would say this is obviously true. . . Not so quick: to the medieval
mind, it was obvious that heavier objects fall faster than lighter objects, but
Galileo showed otherwise. So to make sure our statement is correct, let’s give a
proof of it. Once you start thinking of giving an explanation, you will realize
that there are many things to clarify. First, what is meant by a number? A real
number, an integer or what? And what are they? Also, what is the sum of two
numbers? What does it mean for a number to be even? To make all of this rigorous
would take us too long. So let us deal with these issues later. For now, let us agree that
the numbers are integers and we assume the basic properties of their addition
and multiplication¹. So the only concept yet to be explained is “even”:
An integer is even if it is divisible by two.
This is fine, but if we say so then we have to explain what is meant by “divisible”
(remember we only assume the basics about addition and multiplication). So
let us rephrase the sentence using multiplication:
An integer n is even if n = 2m for some integer m.
The last statement is an example of a definition in mathematics. Just like a definition in a dictionary, it declares the meaning of a term, in this case the term
“even number”, using other terms that have already been defined or assumed².
Proof of Statement 1. Suppose a, b are two even numbers. By definition, a = 2m
and b = 2n for some integers m and n. Thus,
a + b = 2m + 2n = 2(m + n)
is an even number.
A few remarks about the proof are in order:
1. This is NOT a formal proof (we will explain later).
2. We use a few properties of integers in the proof without explicitly
stating them: we use the fact that the sum of two integers is an integer,
which is needed to conclude that m + n in the proof is an integer. Also, we
use the fact that multiplication of integers distributes over addition. In other
words, we use without proof that k(m + n) = km + kn for any integers
k, m and n.
3. This is an example of a constructive proof: we show that a + b is even by
explicitly constructing an integer, namely m + n, such that a + b is twice
that integer. In other words, we prove the existence of something by giving an
explicit way of constructing it.
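The constructive flavour of the proof can be mirrored in a few lines of Python. The sketch below is only an illustration, not part of the proof; the helper names `even_witness` and `sum_witness` are made up for this example.

```python
def even_witness(n: int) -> int:
    """Return the integer m with n == 2 * m; fail if n is not even."""
    if n % 2 != 0:
        raise ValueError(f"{n} is not even")
    return n // 2


def sum_witness(a: int, b: int) -> int:
    """Given even a and b, construct k with a + b == 2 * k, as in the proof."""
    m = even_witness(a)   # a = 2m
    n = even_witness(b)   # b = 2n
    return m + n          # a + b = 2(m + n)


k = sum_witness(6, -10)
print(k, 6 + (-10) == 2 * k)   # prints: -2 True
```

The function returns the witness m + n itself, not just the answer "yes, the sum is even". That is what makes the argument constructive.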
A closer examination of Statement 1 reveals its logical structure. Formally, it
can be stated as
∀ a, b ∈ Z ((∃n, m ∈ Z, a = 2n ∧ b = 2m) ⟹ (∃k ∈ Z, a + b = 2k)).
¹ For those who know, we are assuming the integers form a ring under addition and multiplication.
² Obviously, this process cannot go on forever, so there must be some “basic” undefined terms
that we have to start with. In this case, it will be the integers.
1. The symbol ∀ reads “for all” and the symbol ∃ reads “there exist(s)”.
They are the quantifiers. Strictly speaking, the ∀ a, b ∈ Z at the front
of this sentence should be written as ∀ a ∈ Z ∀ b ∈ Z.
2. The symbols ∧ and ⟹ are logical connectives. They stand for logical “and” and “imply”, respectively. Logical connectives combine two
statements to form another one. Another commonly used connective is
∨, which stands for the logical “or”. We will have a closer look at them
later.
3. The formula (∃n, m ∈ Z, a = 2n ∧ b = 2m) ⟹ (∃k ∈ Z, a + b = 2k)
is an implication, i.e. a formula of the form P ⟹ Q. The variables
n, m and k are bound in this formula since they are within the scope of
some quantifiers. The variables a and b are free. Let ϕ(a, b) denote this
formula; then the original one becomes ∀ a, b ∈ Z, ϕ(a, b). The variables
a and b are thus quantified out and become bound. A formula without any
free variables is called a sentence.
4. The contrapositive of the implication ϕ ⟹ ψ is the implication ¬ψ ⟹
¬ϕ. They are logically equivalent. The contrapositive of the above implication is:
¬(∃k ∈ Z, a + b = 2k) ⟹ ¬(∃m, n ∈ Z, a = 2n ∧ b = 2m)
If we translate it back into English, it says:
if a + b is not even then it is not the case that both a and b are even,
which is clearly equivalent to the original one even though it sounds a
little bit awkward. Instead of saying it is not the case that both a and b are
even, we can say that either a or b is not even. So intuitively ¬(ϕ ∧ ψ) is
equivalent to ¬ϕ ∨ ¬ψ. And now if we actually declare that an integer is
odd if it is not even, then we can restate the above as:
If a + b is odd then a or b is odd.
Finally, if you quantify a and b out, it reads
If the sum of two integers is odd then at least one of them is odd.
As you know, they cannot both be odd, but one cannot deduce that from
the above statement.
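Both equivalences used in this remark involve only two statements, so they can be checked by running through all truth assignments. The following sketch does so; the helper names `implies` and `is_even` are ours. It also tests the restated claim on a small range of integers, which is of course a sanity check and not a proof.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of P => Q, i.e. (not P) or Q."""
    return (not p) or q


# Check the contrapositive and De Morgan equivalences on all assignments.
for p, q in product([False, True], repeat=2):
    assert implies(p, q) == implies(not q, not p)
    assert (not (p and q)) == ((not p) or (not q))


def is_even(n: int) -> bool:
    return n % 2 == 0


# "If a + b is odd then a or b is odd", tested for -20 <= a, b <= 20.
for a, b in product(range(-20, 21), repeat=2):
    assert implies(not is_even(a + b), (not is_even(a)) or (not is_even(b)))

print("all checks passed")
```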
Let us make a few things precise before moving on. We use N to denote the set
of natural numbers, i.e.
N = {1, 2, 3, . . .}
and Z the set of integers. We assume the basic rules of addition and multiplication of integers³: for any a, b, c ∈ Z,
1. ( a + b) + c = a + (b + c).
2. a + b = b + a.
3. a + 0 = a.
4. a + d = 0 for some d ∈ Z.
5. a(bc) = ( ab)c.
6. ab = ba.
7. a · 1 = a.
8. a(b + c) = ab + ac.
Definition 1.1.1. For a, b ∈ Z, we say that a divides b, written as a | b if there
exists c ∈ Z such that ac = b. In this case, we also say that a is a divisor (or
factor) of b and b is a multiple of a.
Note that every integer divides 0 but 0 divides only 0. The divisors of
1 are called units. In Z, 1 and −1 are the only units. A divisor of an integer n
is proper if it is neither a unit nor a unit times n.
Definition 1.1.2. Let a and b be integers, not both 0. The greatest common divisor
of a and b, denoted by gcd(a, b), is the largest integer that divides both a and b.
We define gcd(0, 0) to be 0.
Let a and b be non-zero integers. The least common multiple of a and b,
denoted by lcm(a, b), is the smallest positive common multiple of a and b. In the
case that either a or b is zero, lcm(a, b) is defined to be 0.
Definition 1.1.3. A prime number is an integer bigger than 1 which has no positive factors other than 1 and itself. Two integers are relatively prime (or coprime) if their greatest common divisor is 1.
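To get a feel for Definitions 1.1.1 to 1.1.3, one can transcribe them almost word for word into Python. The sketch below favours closeness to the definitions over speed (it searches by brute force), and the function names are our own choice.

```python
def divides(a: int, b: int) -> bool:
    """a | b: there is an integer c with a * c == b."""
    if a == 0:
        return b == 0              # 0 * c is always 0
    return b % a == 0


def gcd(a: int, b: int) -> int:
    """Largest integer dividing both a and b; gcd(0, 0) is 0 by definition."""
    if a == 0 and b == 0:
        return 0
    for d in range(max(abs(a), abs(b)), 0, -1):
        if divides(d, a) and divides(d, b):
            return d               # reached at the latest when d == 1


def lcm(a: int, b: int) -> int:
    """Smallest positive common multiple; 0 if a or b is 0."""
    if a == 0 or b == 0:
        return 0
    m = 1
    while not (divides(a, m) and divides(b, m)):
        m += 1                     # stops at the latest when m == |a * b|
    return m


def is_prime(n: int) -> bool:
    """n > 1 with no positive factors other than 1 and n."""
    return n > 1 and not any(divides(d, n) for d in range(2, n))


print(gcd(12, 18), lcm(4, 6), [n for n in range(1, 20) if is_prime(n)])
# prints: 6 12 [2, 3, 5, 7, 11, 13, 17, 19]
```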
³ In other words, we assume (Z, 0, 1, +, ·) forms a commutative ring with 1.
The second statement that we are going to prove is:
Proposition 1.1.4. √2 is irrational.
Note here we introduce the word “Proposition”. In contemporary mathematics, a Proposition is a statement of interest and the word Theorem usually is
reserved for propositions that are important for the discourse. The word Lemma
signifies a proposition that is going to be used in proving other propositions.
To prove Proposition 1.1.4, we first need to clarify the word “irrational”. Let
us say that a number is rational if it is the ratio of two integers and is irrational
if it is not rational. The problem with this definition is, again: what “numbers”
are we talking about? The definition attempts to define rational numbers as a
subset of some set of numbers. Since the number system should at least
contain √2 in order for the statement to make sense, let us take the set of
numbers to be the set of real numbers R. The second question, equally important,
is: what is √2? Answer: the positive real number whose square is 2. There
are two assumptions here in order for this definition to make sense: 1) the
existence of a positive real number whose square is 2 and 2) the uniqueness of
such a number. We will not be able to justify either of them here since we have
yet to define the real numbers! In any case, let us now make things official:
Definition 1.1.5. A real number r is rational if r = m/n for some m, n ∈ Z with
n ≠ 0. A real number is irrational if it is not rational.
We use Q to denote the set of rational numbers. With these definitions and
assumptions sorted out, let’s think about why √2 is irrational and how we can
prove it. The proposition requires us to show that no matter what integers r
and s (with s ≠ 0) are chosen, the ratio r/s cannot be √2; in other words, (r/s)^2
cannot be 2. There are infinitely many integer pairs to check; how can we achieve
that? This time we use a technique called proof by contradiction: we can prove
P implies Q by assuming P and ¬Q (the negation of Q) and trying to deduce a
contradiction from them. If we succeed in doing so, then we have shown that P ∧ ¬Q
implies a false statement, and that can only be the case if P ∧ ¬Q itself is false.
In other words, ¬(P ∧ ¬Q) is true, but that is logically equivalent to ¬P ∨ Q,
which is also equivalent to P ⟹ Q.
Proof of Proposition 1.1.4. Suppose √2 = r/s for some integers r, s with s ≠ 0. By
canceling the common factors of r and s, we can assume r and s are coprime. After
squaring both sides of the equation and clearing the denominator, we get 2s^2 = r^2.
In particular, r^2 is even and so is r (see Exercise 5). Thus r = 2k for some integer
k, and so 2s^2 = r^2 = (2k)^2 = 4k^2. That means s^2 = 2k^2. By the same argument,
we conclude that s is even as well. This is a contradiction, since r and s are
coprime and hence cannot both be even.
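A computer cannot replace this proof, since there are infinitely many pairs (r, s) to rule out, but it can at least fail to find a counterexample. The sketch below searches for integers with r^2 = 2s^2 among all denominators up to a bound; the function name and the bound are arbitrary choices made for this illustration.

```python
from math import isqrt


def sqrt2_candidates(max_s: int) -> list[tuple[int, int]]:
    """All pairs (r, s) with 1 <= s <= max_s, r >= 0 and r*r == 2*s*s."""
    hits = []
    for s in range(1, max_s + 1):
        r = isqrt(2 * s * s)       # integer part of the real square root
        if r * r == 2 * s * s:     # only r = isqrt(...) could possibly work
            hits.append((r, s))
    return hits


# The proposition predicts that this list is empty for every bound.
print(sqrt2_candidates(10_000))
```

An empty result for one bound says nothing about larger denominators. Only the proof by contradiction covers all of them at once.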
The next statement that we are going to examine is also a classic.
Theorem 1.1.6. There are infinitely many primes.
Our first proof goes back to Euclid some 2300 years ago! One way of proving the statement is as follows: for any given number n, construct a prime
number that is greater than n. However, in this case, such a constructive proof
turns out to be extremely difficult. To date, finding ways to produce large
primes is still one of the most active areas in number theory research. So how
did Euclid prove the infinitude of primes some two millennia ago? The answer is again proof by contradiction. But first we need a fundamental lemma:
Lemma 1.1.7. Every integer larger than 1 has a prime factor.
Proof. Suppose not. Then the collection N of integers bigger than 1 that have
no prime factors is non-empty. Its least element, say m_0, must not be a prime;
otherwise it would be its own prime factor, contradicting the definition of N. So
m_0 must have a factor m_1 with 1 < m_1 < m_0. Since m_0 is the least element of N
and m_1 is less than m_0, m_1 cannot be in N. Since m_1 is an integer bigger than
1, the fact that it is not in N implies it must have a prime factor. But any
factor of m_1 is also a factor of m_0 (see Exercise 2). So m_0 has a prime divisor, a
contradiction.
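The lemma only asserts that a prime factor exists, but a closely related observation tells us where to find one: the least divisor d > 1 of n is itself prime, because a proper factor of d would be a smaller divisor of n. The sketch below is a companion to the lemma and not a transcription of its proof; the function names are ours.

```python
def least_factor(n: int) -> int:
    """Least divisor d > 1 of an integer n > 1."""
    if n <= 1:
        raise ValueError("n must be larger than 1")
    d = 2
    while n % d != 0:      # terminates because d = n divides n
        d += 1
    return d


def is_prime(n: int) -> bool:
    return n > 1 and all(n % d != 0 for d in range(2, n))


# For every n in a small range, the least factor is a prime factor of n.
for n in range(2, 500):
    p = least_factor(n)
    assert n % p == 0 and is_prime(p)

print(least_factor(91), least_factor(97))   # prints: 7 97
```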
Let A be a set of integers. We say an integer k is a lower bound of A if k ≤ a
for every a ∈ A. We say that A is bounded below if A has a lower bound. Note
that a lower bound of A need not be in A; for example, 0 is a lower bound of
N but 0 ∉ N. A least element of A is an element m ∈ A such that m ≤ a
for every a ∈ A. If m_1, m_2 are both least elements of A, then by the definition,
we must have m_1 ≤ m_2 and m_2 ≤ m_1. Consequently, m_1 = m_2. In other
words, if A has a least element, then it is the (unique) least element of A. Note,
however, that A need not have a least element, for instance if A is the set of
negative integers. One thing is clear though: if A is non-empty and finite, then
we can find the least element of A by comparing its elements one by one⁴.
⁴ We need the finiteness of A to guarantee that this process will terminate.
More generally, we have the following:
Proposition 1.1.8. Let W be a set of integers. If W is non-empty and bounded below
then the least element of W exists.
Proof. Since W is bounded below, it has a lower bound, say m. Since W is non-empty, some integer w belongs to W. By definition of lower bound, m ≤ w and
so w belongs to the set
F = { n ∈ W : m ≤ n ≤ w }.
As a finite non-empty set, F has a least element; let us call it w_0. We
claim that w_0 is also the least element of W. First, since every element of F is
an element of W, in particular w_0 ∈ W. Now all that remains to show is that
no integer smaller than w_0 can be in W. Since m is a lower bound of W, no
number smaller than m can be in W. Thus, we only need to show that an
integer a cannot be in W if m ≤ a < w_0. Suppose on the contrary that a ∈ W. Then
a would be in F by the very definition of F (note that a < w_0 ≤ w). But this
contradicts the fact that w_0 is the least element of F.
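The proof is effectively a search procedure: once a lower bound m and some element w of W are known, only the finite window from m to w has to be scanned. Here is a minimal sketch of that idea, assuming W is given by a membership test `in_W`; that name, the function name and the example set are all hypothetical.

```python
from typing import Callable


def least_element(in_W: Callable[[int], bool], m: int, w: int) -> int:
    """Least element of W, given a lower bound m of W and some w in W."""
    if m > w or not in_W(w):
        raise ValueError("need m <= w and w in W")
    for n in range(m, w + 1):      # scan the finite window F from the bottom
        if in_W(n):
            return n
    raise AssertionError("unreachable: w itself lies in the window")


# Example: W = multiples of 7 that are at least -20.
# -100 is a lower bound of W and 35 belongs to W.
print(least_element(lambda n: n % 7 == 0 and n >= -20, -100, 35))   # prints: -14
```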
Proposition 1.1.8 is a fundamental property of the ordering on subsets of
integers that are bounded below (N, in particular). In fact, we have used it
implicitly in our proof of Lemma 1.1.7 to guarantee the existence of m0 and
we are going to use it again to prove the principle of mathematical induction in
Section 1.3. Now let us return to the proof of Theorem 1.1.6.
Proof. Suppose on the contrary that there are only finitely many primes, say
p_1, . . . , p_n. Consider the number
N = (p_1 · · · p_n) + 1.
Clearly N > 1 and so it must have a prime factor, say q, by Lemma 1.1.7. On
the one hand, q cannot be any of the p_i, since N is a multiple of q but not a
multiple of any of the p_i’s (dividing N by p_i leaves remainder 1). On the other
hand, q must be one of the p_i’s since q is a prime,
so we get a contradiction.
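Although the proof is by contradiction, its central step can be tried out on any finite list of primes: multiply them, add 1, and look at a prime factor of the result. The sketch below does this; `least_prime_factor` and `euclid_witness` are names invented for the illustration, and trial division is just one simple way to find a factor.

```python
from math import prod


def least_prime_factor(n: int) -> int:
    """Least prime factor of n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n                        # no divisor up to sqrt(n): n is prime


def euclid_witness(primes: list[int]) -> int:
    """A prime factor of (product of the given primes) + 1."""
    N = prod(primes) + 1
    q = least_prime_factor(N)
    assert q not in primes          # N leaves remainder 1 modulo each of them
    return q


print(euclid_witness([2, 3, 5]))              # 31 is prime, prints 31
print(euclid_witness([2, 3, 5, 7, 11, 13]))   # 30031 = 59 * 509, prints 59
```

The second example shows that N itself need not be prime. The argument only needs some prime factor of N.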
Euclid’s proof is an example of a non-constructive proof. He showed that there
are infinitely many prime numbers without constructing even a single prime!
In a sense, proving an existence statement with a non-constructive proof is not
“as good as” a constructive one. But in many cases, non-constructive proofs
are the only proofs available.
Let us now give a different proof of Euclid’s result. Admittedly, what we mean by a “different” proof is not precise. This usually means a proof
involving a different “idea”. By this standard the following proof
is not really different from Euclid’s proof.
Second proof of Theorem 1.1.6. For each n, the number n! + 1 must have a prime
factor according to Lemma 1.1.7, and any factor of n! + 1 other than 1 must be
larger than n. Thus we have shown that there are arbitrarily large primes and so
there must be infinitely many primes.
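The claim that every prime factor of n! + 1 exceeds n is easy to observe for small n. The following sketch prints the least prime factor of n! + 1 for n up to 12; the helper `least_prime_factor` is our own and uses plain trial division.

```python
from math import factorial


def least_prime_factor(n: int) -> int:
    """Least prime factor of n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


for n in range(1, 13):
    q = least_prime_factor(factorial(n) + 1)
    assert q > n                    # every d with 2 <= d <= n leaves remainder 1
    print(n, q)
```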
1.2 Exercises
1. Mimicking the proof of Statement 1, show that the sum of two multiples of 3
is again a multiple of 3. More generally, show that for any integer k, the
sum of two multiples of k remains a multiple of k.
2. Show that for any integers a, b and c, if a divides b and b divides c then a
divides c.
3. Division algorithm. Show that for any a, b ∈ Z with b > 0, there exist
unique integers q and r with 0 ≤ r < b such that
a = bq + r.
4. Greatest common divisor. Show that
(a) for any integers a and b, there is a unique non-negative integer d
such that the set
{am + bn : m, n ∈ Z}
is the set of multiples of d;
(b) the d in Part (4a) is a common divisor of a and b; and
(c) if d′ is a common divisor of a and b then d′ divides d.
5. Show that
(a) The product of two odd numbers is odd.
(b) Deduce that for any integer r, if r^2 is even then r is even.
(c) Let p be a prime. Show that if p divides ab then p divides a or p
divides b.
(d) Deduce that if a^2 is divisible by a prime p, then so is a.
6. Mimicking the proof of Proposition 1.1.4, show that
(a) √3 is irrational;
(b) in general, √p is irrational for any prime p;
(c) more generally, √n is irrational for any square-free n (an integer
n > 1 is square-free if no square other than 1 divides n).
(d) Deduce that √n is irrational if the natural number n is not a perfect square.
7. Fundamental Theorem of Arithmetic
(a) Existence of Prime Factorization. Show that every natural number bigger than 1 is a product of primes.
(b) Uniqueness of Prime Factorization. Show that if p_1, . . . , p_n, q_1, . . . , q_m
are primes and that
p_1 · · · p_n = q_1 · · · q_m,
then n = m and, up to permutation, p_i = q_i for each i.
8. Show that there are infinitely many primes of the form 4n − 1.
Hints for Exercises
3 Consider the set of differences of a and the multiples of b, i.e. the set
{a − bn : n ∈ Z}.
Argue that this set must contain some non-negative integers and hence
its least non-negative member exists. Show that it is the remainder of a
divided by b.
4 If a = b = 0, then d is clearly 0 as well. So we can assume either a or b
is not zero. If the set of integral combinations of a and b is going to be the
set of multiples of some d > 0, then clearly d must be the least positive
element of the set. Thus argue that some integral combination of a and b
is positive and so there must be a least one.
5 For Part (a), use the division algorithm to argue that an odd number must
be of the form 2k + 1. For Part (c), show that if p does not divide a then
it must divide b. Since p is a prime, if it does not divide a, then a and p
must be relatively prime. Now use Q.4.
6 For Part (c), suppose n is square-free. Since n > 1, it has a prime factor,
say p. Suppose √n = r/s for some integers r and s. By canceling the gcd
of r and s, we can assume that r and s are relatively prime. Obtain a
contradiction by arguing (as in Part (b)) that p divides both r and s.
7 For Part (a), mimic the proof of Lemma 1.1.7. For Part (b) use Q.5(c).
1.3 Mathematical Induction
Let P(n) be a statement about the natural numbers, for example:
The sum of the first n odd numbers is n^2.
Since the n-th odd number is 2n − 1, the statement P(n) asserts that
1 + 3 + . . . + (2n − 1) = n^2.
One can think of P(n) as a family of statements indexed by the natural numbers:
P(1) is the statement 1 = 1^2
P(2) is the statement 1 + 3 = 2^2
P(3) is the statement 1 + 3 + 5 = 3^2, etc.
All instances of P(n) listed above are true! Hmm. . . maybe all of them are true!
If that’s the case, how are we going to show that? Well, as mortals we certainly
can’t verify them one by one. . . Luckily, there is a powerful way, called mathematical induction, of proving a family of statements like this. As you will see, it
is actually Proposition 1.1.8 in disguise.
Theorem 1.3.1 (Principle of Mathematical Induction). Let W be a set of integers
such that
1. w_0 ∈ W for some integer w_0; and
2. for any k ≥ w_0, the implication k ∈ W ⟹ k + 1 ∈ W is true.
Then W ⊇ {n ∈ Z : n ≥ w_0}.
Here is how Theorem 1.3.1 can help us to prove a statement like “the sum
of the first n odd numbers is n^2”. Let
W = {n ∈ N : P(n) is true}.
In other words, let W be the set of witnesses of the statement. To show that P(n)
is true for all natural numbers n means to show that the witness set contains the
set N = {n ∈ Z : n ≥ 1}. So according to Theorem 1.3.1, we need to show two
things, namely
1. 1 ∈ W, i.e. P(1) is true; and
2. the implication: for any k ≥ 1, if P(k) is true, then so is P(k + 1).
Let’s see how it applies to our example:
Proposition 1.3.2. For any n ∈ N, 1 + 3 + . . . + (2n − 1) = n^2.
Proof. When n = 1, the left side of the equation is 1 and the right side is 1^2, which
is again 1. So the statement is true for n = 1. Suppose the statement is true for
some k ≥ 1, i.e.
1 + 3 + . . . + (2k − 1) = k^2. (1.1)
Then
1 + 3 + . . . + (2k − 1) + (2(k + 1) − 1) = k^2 + (2k + 2 − 1)
= k^2 + 2k + 1
= (k + 1)^2.
So we establish the proposition by mathematical induction.
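The two ingredients of this proof can be spot-checked numerically. The sketch below checks the base case, the algebra of the induction step, and the identity itself for the first thousand values of n; the function name `sum_first_odds` is ours. Such a check covers finitely many cases only, while the induction proof covers them all.

```python
def sum_first_odds(n: int) -> int:
    """1 + 3 + ... + (2n - 1), computed term by term."""
    return sum(2 * i - 1 for i in range(1, n + 1))


# Base case P(1).
assert sum_first_odds(1) == 1 ** 2

for k in range(1, 1000):
    # Algebra of the induction step: k^2 + (2(k + 1) - 1) = (k + 1)^2.
    assert k ** 2 + (2 * (k + 1) - 1) == (k + 1) ** 2
    # The statement P(k) itself.
    assert sum_first_odds(k) == k ** 2

print(sum_first_odds(10))   # prints: 100
```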
There is some common terminology for induction proofs:
1. The statement P(w_0) is called the base case. In the example above, the base
case is 1 = 1^2.
2. The induction step is the proof of the implication (2) in Theorem 1.3.1
within the induction proof. In the example above, it is the proof of the
implication: for all k ≥ 1, if
1 + 3 + . . . + (2k − 1) = k^2,
then
1 + 3 + . . . + (2(k + 1) − 1) = (k + 1)^2.
3. The premise P(k) in the implication P(k) ⟹ P(k + 1) is called the
induction hypothesis. In our case, the induction hypothesis is
1 + 3 + . . . + (2k − 1) = k^2.
Mathematical induction is a framework for proving statements indexed by
a well-ordered set of integers (more about well-order later). In our proof-poem
analogy, it plays the same role as the format of a certain kind of poem. But in
mathematics, we can even explain why the proof should take that form. So let
us demonstrate that Theorem 1.3.1 is just Proposition 1.1.8 in disguise.
Proof of Theorem 1.3.1. Suppose on the contrary that W does not contain the set
{n ∈ Z : n ≥ w_0}. So the set C = {m ∈ Z : m ≥ w_0, m ∉ W} is non-empty.
Clearly C is bounded below by w_0, so according to Proposition 1.1.8, the least
element, say m_0, of C exists. By the definition of C, m_0 ≥ w_0 and m_0 ∉ W.
Since we assume w_0 ∈ W, m_0 ≠ w_0. Therefore, we must have m_0 > w_0.
That means m_0 − 1 ≥ w_0. Note that m_0 − 1 cannot be in C since m_0 is the least
element of C. Thus we conclude that m_0 − 1 ∈ W. But then according to the
implication (2), m_0 = (m_0 − 1) + 1 ∈ W, a contradiction.
Next we introduce a variant of mathematical induction, called strong
induction. Contrary to what its name suggests, it is not stronger than mathematical induction itself since they imply each other.
Theorem 1.3.3 (Strong Induction Principle). Let W be a set of integers such that
1. w_0 ∈ W for some integer w_0; and
2. for any k ≥ w_0, the implication (∀ w_0 ≤ m ≤ k, m ∈ W) ⟹ k + 1 ∈ W is
true.
Then W ⊇ {n ∈ Z : n ≥ w_0}.
It is clear that if (2) in Theorem 1.3.1 is true then (2) in Theorem 1.3.3 is true. Therefore, the Principle of Mathematical Induction does follow from the Strong Induction Principle. Now we show that the converse is also true:
Proof of Theorem 1.3.3. Suppose W satisfies both Conditions (1) and (2) of the
theorem. Let P(k) be the statement
∀ w_0 ≤ m ≤ k, m ∈ W.
Clearly if P(k) is true for all k ≥ w_0, then W contains the set {n ∈ Z : n ≥ w_0}.
By assumption (1), w_0 ∈ W, i.e. P(w_0) is true. So by mathematical induction
(i.e. Theorem 1.3.1), it remains to show that for every k ≥ w_0, P(k) ⟹ P(k + 1).
It follows from Condition (2) of the theorem and P(k) that k + 1 ∈ W. But then P(k)
together with k + 1 ∈ W simply means m ∈ W for all w_0 ≤ m ≤ k + 1, which is
nothing but P(k + 1). Therefore, we have shown the implication P(k) ⟹ P(k + 1)
for any k ≥ w_0 and hence finished the proof.
Now that we see that the Principle of Mathematical Induction (PMI) and the Strong Induction Principle (SMI) are logically equivalent, a natural question would be: why consider two equivalent forms of induction? It turns out
that sometimes SMI is more flexible to apply. To illustrate this point, let us reprove Lemma 1.1.7 using strong induction.
Proof. We want to show that every integer larger than 1 has a prime factor.
Clearly, 2 has a prime factor, since it is itself a prime. Let k ≥ 2 and suppose
every number from 2 to k has a prime factor. Either k + 1 is a prime, in which
case k + 1 has a prime factor, namely itself; or else k + 1 has a proper factor,
i.e. there is some m with 1 < m < k + 1 that divides k + 1. It then follows from
the induction hypothesis that m has a prime factor, say p. Since p divides m and
m divides k + 1, p divides k + 1 as well. In either case, we have shown that k + 1
must have a prime factor and so we are done.
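Strong induction corresponds closely to recursion in which a call on n may recurse on any smaller value, not just on n − 1. The sketch below turns the proof into a recursive function; the name `prime_factor` is ours, and the recursive call plays the role of the induction hypothesis.

```python
def prime_factor(n: int) -> int:
    """Return some prime factor of an integer n > 1, following the proof."""
    if n < 2:
        raise ValueError("n must be larger than 1")
    for m in range(2, n):
        if n % m == 0:
            # n has a proper factor m with 1 < m < n: by the "induction
            # hypothesis" m has a prime factor, which also divides n.
            return prime_factor(m)
    return n                        # no proper factor: n itself is prime


print(prime_factor(91), prime_factor(97), prime_factor(2))   # prints: 7 97 2
```

The recursion terminates because every recursive call is made on a strictly smaller integer that is still at least 2.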
1.4 Exercises
1. This is almost everyone’s first example of Mathematical Induction. Prove
that for any natural number n,
1 + 2 + . . . + n = n(n + 1)/2.
Numbers of this form are called the triangular numbers. The first few
triangular numbers are 1, 3, 6, 10, . . .. The expression on the right side of
the equation can be interpreted as the number of ways to choose two
things out of n + 1 things, read as “n + 1 choose 2”, and is often denoted
by \binom{n+1}{2}.
The story is that Gauss discovered this formula at a very young age (some
say around 8 years old).
2. Prove that for each natural number n ≥ 2,
1/(1·2) + 1/(2·3) + · · · + 1/((n − 1)·n) = (n − 1)/n.
3. Prove that for all n ≥ 2,
(1 − 1/4) · (1 − 1/9) · · · (1 − 1/n^2) = (n + 1)/(2n).
The next few questions are more challenging.
4. Suppose n lines are drawn on a plane in general position, i.e. no two
lines are parallel and no 3 lines intersect at a point. How many regions
do they separate the plane into? (Try it for small n, and guess a formula.
Then prove your formula by induction.)
5. (Triminoes) This one is another classic.
(a) Show that a 2^n × 2^n checkerboard with any one square deleted can
be tiled by “L”-shaped tiles each covering 3 squares (called triminoes).
(b) Prove that for n ≥ 1, 4^n − 1 is divisible by 3 directly, without resorting to Part (5a).
6. Put 2n dots on a circle and color them in any way so that n of them are red
and n of them are blue. Show that one can always start from one of these
dots and go clockwise around the circle so that the number of red dots
visited is at least the number of blue dots visited at any moment in this
journey.
7. The Fibonacci sequence F_n is defined recursively as follows: F_1 = F_2 = 1
and F_n = F_{n−1} + F_{n−2} for all n ≥ 3. Show that F_{3k} is even for every k ≥ 1.
Chapter 2
Some Set Theory
2.1 Sets and subsets
We treat sets as a language for our discussion and will take an informal approach to set theory. To us a set is simply a collection of objects¹. We specify
the objects of interest in between a pair of curly brackets. For example,
{0, 1, 2}
is a set consisting of the three symbols 0, 1, and 2. We write
a ∈ X
to indicate that a is an element (or a member) of the set X and write a ∉ X to
indicate otherwise. So 1 ∈ {0, 1, 2} but 4 ∉ {0, 1, 2}. A set is finite if it has
finitely many elements. The number of elements of a finite set X, denoted by
|X|, is called the size (or cardinality) of X. A set is infinite if it is not finite.
Later on, we discuss the cardinality of infinite sets, but until then when we talk
about the size of a set, we assume the set is finite.
We cannot write down all members of an infinite set, so to specify an infinite
set we either assume we know what the elements are, for example, the set of
natural numbers
N = {1, 2, 3, . . . };
or specify the elements by a property from a given set. For example, the set of
even natural numbers can be given as
{n ∈ N : n = 2k for some k ∈ N}.
A set X is a subset of a set Y, written as X ⊆ Y, if every element of X is
an element of Y. We refer to ⊆ as (set) inclusion. It is clear that if X ⊆ Y and
Y ⊆ Z, then X ⊆ Z. We say that X is a proper subset of Y, written as X ⊊ Y,
if X ⊆ Y but Y ⊈ X. We say that two sets are equal if they are subsets of each
other. There are two important consequences of this declaration. First, there is
only one set with no elements; we call it the empty set (or the null set) and
denote it by ∅. Second, the order in which we list the elements of a set does
not matter. For example, {5, 2, 3, 7} and {2, 3, 5, 7} are the same set. It is the set
of all prime numbers less than 10.
We write
P(X) = {A : A ⊆ X} (2.1)
for the set of all subsets of X. We call it the power set of X. Since ∅ and X are
always subsets of X, P(X) has at least two elements unless X = ∅.
¹ See Exercise 15 for a problem with such a naïve approach.
Example 2.1.1. Let A = {a}. Then P(A) = {∅, {a}}. Let B = {a, b}. Then a
subset of B is either a subset of A (if it does not contain b) or the union of a
subset of A with {b}. So P(B) = {∅, {a}, {b}, {a, b}}.
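The reasoning in this example is recursive: the subsets of B are the subsets of A together with the subsets of A enlarged by b. A minimal Python sketch of that recursion follows; the function name `power_set` is ours, and `frozenset` is used because ordinary Python sets cannot be elements of a set.

```python
def power_set(elements: list) -> set[frozenset]:
    """Set of all subsets of the given (distinct) elements."""
    if not elements:
        return {frozenset()}                 # P(empty set) = {empty set}
    *rest, last = elements
    smaller = power_set(rest)                # subsets that do not contain `last`
    return smaller | {s | {last} for s in smaller}


for subset in sorted(power_set(["a", "b"]), key=lambda s: (len(s), sorted(s))):
    print(set(subset) or "∅")

print(len(power_set(list(range(5)))))        # prints: 32
```

Each step doubles the number of subsets, which suggests that a set with n elements has 2^n subsets.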
2.2 Operations on Sets
Given sets X and Y, the set X \ Y consists of the elements of X that are not in
Y. We call X \ Y the complement of Y in X. In practice, it is often convenient
to fix a set that contains all the objects of interest in our discourse. We call
such a set a universe. For example, if number theory is the topic, then Z the
set of integers is a natural choice of the universe. Certainly, at times we may
find it necessary to enlarge the universe to some larger set, say to Q the set of
rational numbers. For computer scientists, perhaps the set of all binary strings
is a good choice of the universe. So in a sense, there is nothing universal about
the “universe”. We write ¬X for the complement² of X in some fixed universe
(that contains X). Clearly, the empty set and the universe are complements of
each other. Moreover, we have
Proposition 2.2.1.
1. ¬¬ X = X
2. X ⊆ Y if and only if ¬Y ⊆ ¬ X.
The intersection of X and Y, denoted by X ∩ Y, is the set whose elements
are elements of both X and Y. Two sets are disjoint if they do not have any
element in common; in other words, their intersection is empty. The union of
X and Y, denoted by X ∪ Y, is the set³ whose elements are either elements of
X or elements of Y. We say that X ∪ Y is a disjoint union if X and Y are
disjoint. Clearly |X ∪ Y| = |X| + |Y| if the union is a disjoint union.
Example 2.2.2. Let X = {m, a, t, h, e} and Y = {m, a, t, i, c, s}. Then X ∩ Y =
{m, a, t} and X ∪ Y = {m, a, t, h, e, i, c, s}.
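Python’s built-in `set` type supports these operations directly, so the example can be replayed as follows. The variable names are ours, and `sorted` is used only to print the elements in a fixed order.

```python
X = set("mathe")     # {m, a, t, h, e}
Y = set("matics")    # {m, a, t, i, c, s}

print(sorted(X & Y))   # intersection, prints ['a', 'm', 't']
print(sorted(X | Y))   # union, prints ['a', 'c', 'e', 'h', 'i', 'm', 's', 't']
print(sorted(X - Y))   # complement of Y in X, prints ['e', 'h']

# X \ Y and Y are disjoint, so the size of their union is the sum of the sizes.
H = X - Y
assert H.isdisjoint(Y)
assert len(H | Y) == len(H) + len(Y)
```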
Alternatively, X ∩ Y can be defined as the common subset of X and Y that contains
every common subset of X and Y. Similarly, X ∪ Y is the set containing
both X and Y that is contained in any set that has both X and Y as subsets.
² ∼X is another popular notation for the complement of X with respect to some universe.
³ Union is different from the constructions that we have seen before. It does not specify a set as
a subset of another set.
Proposition 2.2.3. For any sets X, Y and Z,
1. X ∩ Y = Y ∩ X, X ∪ Y = Y ∪ X.
2. X ∩ (Y ∩ Z ) = ( X ∩ Y ) ∩ Z, X ∪ (Y ∪ Z ) = ( X ∪ Y ) ∪ Z.
3. If X ⊆ Y, then Z ∩ X ⊆ Z ∩ Y.
4. X ⊆ Y if and only if X ∩ ¬Y = ∅.
We give a proof of the following statements to illustrate how to apply the
results in Proposition 2.2.1 and 2.2.3. As an exercise, try identifying which
statements in these two propositions are used in the proofs.
Theorem 2.2.4 (De Morgan’s Law). For sets X and Y
1. ¬( X ∩ Y ) = ¬ X ∪ ¬Y.
2. ¬( X ∪ Y ) = ¬ X ∩ ¬Y.
Proof. Since X, Y ⊆ X ∪ Y, we have ¬(X ∪ Y) ⊆ ¬X, ¬Y and hence ¬(X ∪ Y) ⊆ ¬X ∩
¬Y. On the other hand, ¬X ∩ ¬Y ⊆ ¬X, ¬Y. Therefore, ¬(¬X ∩ ¬Y) contains
both X and Y and hence contains their union. But that means ¬X ∩ ¬Y ⊆
¬(X ∪ Y). This concludes the proof of (2). The first statement follows from
the second by replacing X and Y, respectively, by their complements and then taking complements of both sides.
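A concrete check of Theorem 2.2.4 and of Proposition 2.2.1(1) may help. This is only a sketch for one choice of universe and sets; it illustrates the laws and does not replace the proof. The universe of lower case letters and the helper name complement are our assumptions.

```python
import string

U = set(string.ascii_lowercase)    # the universe: lower case English letters

def complement(S):
    """Complement of S relative to the fixed universe U."""
    return U - S

X = set("mathe")
Y = set("matics")

law_1 = complement(X & Y) == complement(X) | complement(Y)
law_2 = complement(X | Y) == complement(X) & complement(Y)
double_complement = complement(complement(X)) == X
```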
As an application of De Morgan’s Law, let us show that,
Proposition 2.2.5. For any sets X, Y and Z, X ∩ Y ⊆ Z if and only if X ⊆ ¬Y ∪ Z.
Proof. X ∩ Y ⊆ Z if and only if ( X ∩ Y ) ∩ ¬ Z = X ∩ (Y ∩ ¬ Z ) = ∅ if and only
if X ⊆ ¬(Y ∩ ¬ Z ). By De Morgan’s Law, the last set is ¬Y ∪ Z.
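For a small universe, Proposition 2.2.5 can also be confirmed by brute force. The sketch below assumes the universe U = {0, 1, 2, 3} and tests all 16³ triples of subsets; the function names are ours.

```python
from itertools import combinations

U = frozenset(range(4))            # a small universe, chosen for illustration

def subsets(S):
    """All subsets of S, as frozensets."""
    items = sorted(S)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def check_prop_2_2_5():
    """Test X ∩ Y ⊆ Z  <=>  X ⊆ ¬Y ∪ Z for every triple of subsets of U."""
    P = subsets(U)
    for X in P:
        for Y in P:
            for Z in P:
                lhs = (X & Y) <= Z
                rhs = X <= ((U - Y) | Z)
                if lhs != rhs:
                    return False
    return True
```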
We leave the proofs of the following statements as an exercise.
Theorem 2.2.6 (Distributive Laws). For any sets X, Y and Z
1. X ∩ (Y ∪ Z ) = ( X ∩ Y ) ∪ ( X ∩ Z ).
2. X ∪ (Y ∩ Z ) = ( X ∪ Y ) ∩ ( X ∪ Z ).
The (Cartesian) Product X × Y is the set of ordered pairs
{( x, y) : x ∈ X, y ∈ Y }.
If X = Y, we often write X² for X × X. For finite sets X and Y, it is easy to
see that | X × Y | = | X ||Y |. In particular, if one of them is empty then so is their
product.
Example 2.2.7. Let X = {0, 1} and Y = { a, b, c}, then
X² = {(0, 0), (0, 1), (1, 0), (1, 1)}
X × Y = {(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)}.
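The products in Example 2.2.7 can be generated with itertools.product from the Python standard library. This is a sketch; the variable names are ours.

```python
from itertools import product

X = {0, 1}
Y = {"a", "b", "c"}

X_squared = set(product(X, repeat=2))    # X × X
X_times_Y = set(product(X, Y))           # X × Y
empty_product = set(product(X, set()))   # a product with an empty factor
```

The sizes agree with |X × Y| = |X||Y|, and the product with the empty set is empty.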
2.2.1 Exercises
1. Let A = {d, i, s, c, r, e, t}, B = {m, a, t, h, e, i, c, s} and the universe U be the
set of lower case English letters. Write down the following sets:
(a) A ∪ B
(b) A ∩ B
(c) A \ B
(d) B \ A
(e) ¬A ∩ ¬B
(f) ¬(A ∪ B)
Answers for Parts (1e) and (1f) are the same, as guaranteed by De Morgan’s
Law.
2. Repeat Exercise (1) for U = {n ∈ Z : 0 ≤ n ≤ 20}, A is the set of multiples of 3 in U and B is the set of multiples of 5 in U.
3. For n ∈ N, let Cn be the subset of N consisting of the multiples of n.
(a) Find k such that Ck = C2 ∩ C3 .
(b) Find k such that Ck = C12 ∩ C18 .
(c) In general, for any m, n ∈ N, show that there is a k ∈ N such that
Ck = Cn ∩ Cm .
4. (Symmetric Difference.) The symmetric difference of two sets X and Y,
denoted by X∆Y is defined to be the set ( X \ Y ) ∪ (Y \ X ). Show that
X∆Y = (X ∪ Y) \ (X ∩ Y). Deduce that X = Y if and only if X∆Y = ∅.
5. If X∆Y = X∆Z, is it necessarily true that Y = Z? If the answer is “yes”,
give a proof. If the answer is “no”, give an example to show why it is not
true.
6. What can be said about two sets X and Y if X \ Y = Y \ X?
7. Suppose X and Y are finite sets. Show that
(a) | X | = | X \ Y | + | X ∩ Y |.
(b) | X ∪ Y | = | X | + |Y | − | X ∩ Y |.
8. (Ordered Pairs.) Define ( a, b) as the set {{ a}, { a, b}}. Show that ( a, b) =
(c, d) if and only if a = c and b = d.
9. Show that for X, Y nonempty, X × Y = Y × X only if X = Y.
10. Write down P(∅) and P( P(∅)).
11. (Power Set.) Let X = { a, b, c}
(a) Write down P( X ). What is the size of P( X )?
(b) In general, what is the size of the power set of a set of size n?
12. Let X = {0, 1} and Y = { a, b}.
(a) What is the size of P( X × Y )?
(b) Write down the set of all size 2 subsets of X × Y, i.e. the set
{ A ⊆ X × Y : | A | = 2}.
13. A partition of X is a subset of P( X ) consisting of pairwise disjoint subsets
whose union is X.
(a) Find the set of all partitions of { a, b, c}.
(b) How many partitions are there of a set of size n?
14. Prove the Distributive Laws (2.2.6).
15. (Russell’s Paradox) Care must be taken when we define a set by a property (a formula expressible using the ∈ relation). For example, consider
the set
U = {X : X ∉ X}.
Intuitively, U is the set of all sets that do not contain themselves as elements.
Does U belong to itself? Show that there is a contradiction either way.
2.3 Relations and Functions
A relation from X to Y is simply a subset of X × Y. Let R be a relation from X to
Y and S be a relation from Y to Z. Denote by S ◦ R the composition of S and R,
which is the following relation from X to Z:
{(x, z) ∈ X × Z : there exists y ∈ Y such that (x, y) ∈ R and (y, z) ∈ S}
Composition of relations is associative⁴, that is, S ◦ (R ◦ T) = (S ◦ R) ◦ T. The
inverse of a relation R ⊆ X × Y, denoted by R⁻¹, is the relation
{(y, x) ∈ Y × X : (x, y) ∈ R}
The inverse of R⁻¹ is clearly R itself. Note also that (S ◦ R)⁻¹ = R⁻¹ ◦ S⁻¹.
A function from X to Y is a triple ( f , X, Y ) where
f ⊆ X × Y is a relation from X to Y such that for any x ∈ X there is
a unique y ∈ Y such that ( x, y) ∈ f .
We call X the domain and Y the co-domain and f the graph of the function
( f , X, Y ). We often refer to a function by its graph and instead of ( f , X, Y ), we
write f : X → Y; instead of ( x, y) ∈ f , we write f ( x ) = y. The range of f ,
denoted by Rng f , is the set of y ∈ Y such that y = f ( x ) for some x ∈ X. In
other words,
Rng f = {y ∈ Y : y = f ( x ) for some x ∈ X }
We often write Rng f as f(X). Let Z be a subset of Y. The inverse image of Z
under f is the following subset of X:
f⁻¹(Z) = {x ∈ X : f(x) ∈ Z}.
If Z = {z} is a singleton, we write f⁻¹(z) instead of f⁻¹({z}).
First, let us give two unorthodox examples. A moment’s thought should
convince you that “boyfriends” is only a relation but not a function since one
may have more than one boyfriend. However, (biological) “father” is not only
a relation but a function as well⁵.
Example 2.3.1. Let X = { a, b, c}, Y = {0, 1, 2} and Z = {α, β, γ}.
And R = {( a, 1), (b, 1), (c, 1), (c, 2)} and S = {(0, α), (1, β), (2, α)}. Then
S ◦ R = {( a, β), (b, β), (c, β), (c, α)}
R⁻¹ = {(1, a), (1, b), (1, c), (2, c)}
S⁻¹ = {(α, 0), (α, 2), (β, 1)}.
⁴ A non-associative product that you may be familiar with is the cross product of vectors.
⁵ As genetics progresses, this may change in the future.
Neither R nor R⁻¹ is a function. S is a function but S⁻¹ is not a function. Thus
the inverse of a function in general need not be a function.
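The computations in Example 2.3.1 can be reproduced by representing a relation as a Python set of pairs. This is a sketch under our own naming (compose, inverse, is_function); the Greek letters are spelled out as strings.

```python
def compose(S, R):
    """S ∘ R: all (x, z) such that (x, y) is in R and (y, z) is in S for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def inverse(R):
    """The inverse relation: every pair reversed."""
    return {(y, x) for (x, y) in R}

def is_function(R, domain):
    """True if every x in domain is related to exactly one y."""
    return all(len({y for (x2, y) in R if x2 == x}) == 1 for x in domain)

X = {"a", "b", "c"}
Y = {0, 1, 2}
Z = {"alpha", "beta", "gamma"}
R = {("a", 1), ("b", 1), ("c", 1), ("c", 2)}
S = {(0, "alpha"), (1, "beta"), (2, "alpha")}

S_after_R = compose(S, R)
```

Here is_function(R, X) fails because c is related to both 1 and 2, and is_function(inverse(S), Z) fails because alpha has two preimages and gamma has none.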
Example 2.3.2. For a real number r, let ⌊r⌋ be the greatest integer not exceeding
r. We call this integer the floor of r. We call the function from R to R sending
each real number to its floor the floor function.
Let f be a function from A to B. We say that f is 1-to-1 (or injective, or
an injection) if f(a) = f(a′) implies a = a′, for every a, a′ ∈ A. In other
words, no two distinct elements of A map to the same element of B by f. Yet
in other words, f⁻¹(b) is a singleton for every b ∈ Rng f. We say that f is onto
(or surjective or a surjection) if for every b ∈ B there is an a ∈ A such that
f ( a) = b. In other words, every element of B is the image of some element of
A. Yet in other words, f is onto means its range and its co-domain coincide. A
function that is both 1-1 and onto is bijective (or a 1-to-1 correspondence or a
bijection).
Example 2.3.3.
1. The function S in Example 2.3.1 is neither 1-1 nor onto.
2. The range of the floor function (Example 2.3.2) is Z, the set of integers.
Therefore, it is not surjective. The inverse image of an integer n is the
half-open interval [n, n + 1) hence the floor function is not 1-to-1 either.
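For functions on finite sets, injectivity and surjectivity can be tested directly from the definitions. The sketch below uses our own helper names; since the real line cannot be enumerated, the floor function is only examined on a small sample of reals, which is enough to exhibit a failure of injectivity.

```python
import math

def is_injective(f, domain):
    """No two elements of the domain share an image."""
    images = [f(a) for a in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, codomain):
    """The range of f equals the co-domain."""
    return {f(a) for a in domain} == set(codomain)

def is_bijective(f, domain, codomain):
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

# The function S of Example 2.3.1, stored as a dictionary.
S = {0: "alpha", 1: "beta", 2: "alpha"}
Z = {"alpha", "beta", "gamma"}

# A finite sample of reals in [0, 2); 0.0, 0.25 and 0.5 all have floor 0.
sample = [0.0, 0.25, 0.5, 1.0, 1.75]
floor_injective_on_sample = is_injective(math.floor, sample)
```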
Example 2.3.4.
1. Let X be a subset of Y, the map ιYX : X → Y sending each x ∈ X to itself
is an injection.
2. For any non-empty sets X and Y, the map π X sending ( x, y) ∈ X × Y to
x and the map πY sending ( x, y) ∈ X × Y to y are clearly both surjective.
We call π X and πY the canonical projections of the product X × Y.
Example 2.3.5. A permutation of a set X is a bijection from X to itself. Let S( X )
denote the set of permutations of X. The identity map of X to itself, idX , is
a permutation of X. The composition of two permutations is a permutation
(Exercise 2). Moreover, the inverse of a permutation of X is a permutation (Exercise 4). Since function composition is associative (Exercise 1), S(X) together
with function composition forms a group. We call it the symmetric group on X.
In particular, we write Sn instead of S(X) if X is finite with n elements, and we
write, for example,
    ( 1 2 3 )
    ( 2 1 3 )
for the element of S3 that maps 1 ↦ 2, 2 ↦ 1 and 3 ↦ 3.
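Permutations of a finite set are easy to experiment with when stored as dictionaries. In this sketch, swap is the element of S3 written out in Example 2.3.5, while cycle is a second permutation that we introduce only for illustration; compose and inverse are our own helper names.

```python
def compose(s, t):
    """The permutation s ∘ t: apply t first, then s."""
    return {x: s[t[x]] for x in t}

def inverse(s):
    """The inverse permutation: swap keys and values."""
    return {y: x for x, y in s.items()}

identity = {1: 1, 2: 2, 3: 3}
swap = {1: 2, 2: 1, 3: 3}     # 1 ↦ 2, 2 ↦ 1, 3 ↦ 3
cycle = {1: 2, 2: 3, 3: 1}    # illustrative: 1 ↦ 2, 2 ↦ 3, 3 ↦ 1

swap_then_cycle = compose(cycle, swap)
cycle_then_swap = compose(swap, cycle)
```

The two compositions differ, so the symmetric group S3 is not commutative.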
Example 2.3.6. There is a 1-to-1 correspondence between P( X ) and the set of
functions from X to 2 = {0, 1}. Namely, to each A ⊆ X we associate its membership function χ_A : X → 2,
    χ_A(x) = 0 if x ∉ A,
    χ_A(x) = 1 if x ∈ A.
Conversely, to each function f : X → 2, we associate the subset f⁻¹(1) ∈
P(X). One checks readily that these two maps are inverses of each other. It
follows easily from this observation that |P(X)| is 2^|X|.
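The correspondence of Example 2.3.6 can be verified on a small set. In this sketch a function X → 2 is stored as a tuple of 0/1 values indexed in the same order as the list X; the set X = {a, b, c} and all function names are our choices.

```python
from itertools import combinations, product

X = ["a", "b", "c"]

def power_set(xs):
    """All subsets of xs, as frozensets."""
    return [frozenset(c)
            for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

def chi(A):
    """Membership function of A, as a tuple of 0/1 values indexed like X."""
    return tuple(1 if x in A else 0 for x in X)

def preimage_of_one(f):
    """The subset f^{-1}(1) for a function f given as a 0/1 tuple."""
    return frozenset(x for x, bit in zip(X, f) if bit == 1)

subsets = power_set(X)
functions = list(product((0, 1), repeat=len(X)))
```

Going from a subset to its membership function and back returns the original subset, and likewise in the other direction, so the two collections have the same size.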
We use the notation Y^X to denote the set of functions from X to Y. This
seemingly odd notation is, in fact, rather natural: for non-empty finite sets X
and Y, |Y^X| = |Y|^|X|.
2.3.1 Cardinality
The cardinality of a finite set is simply the number of elements that it has. Does
it make sense to talk about sizes of infinite sets? After all, don’t they all have the
same size, namely infinite? It turns out, perhaps surprisingly the first time
you hear it, that there are different levels of “infinity”. We encourage the reader
to learn more about cardinalities by reading a set theory book. For us, we will
distinguish two types of “infinity”, namely countably infinite and uncountably
infinite. Let us begin with the following definition:
Definition 2.3.7. Two sets X and Y (possibly infinite) have the same cardinality,
denoted by | X | = |Y | if there is a 1-to-1 correspondence between them. We say
that a set is countably infinite if it has the same cardinality as the set of natural
numbers N. A set is countable if it is either finite or countably infinite. A set is
uncountable if it is not countable.
Since the composition of two bijections is a bijection, if |X| = |Y| and |Y| =
|Z| then |X| = |Z|.
Example 2.3.8. The set of non-negative integers ω := N ∪ {0} is countably
infinite since n ↦ n + 1 is a bijection from ω to N.
Example 2.3.9. The set of even numbers E is countably infinite since n ↦ 2n is
a bijection from N to E. The set of odd numbers O is also countably infinite
since m ↦ m − 1 is a bijection between E and O.
Definition 2.3.10. We write | X | ≤ |Y | if there is an injection from X to Y. In this
case, we say that the cardinality of X is at most the cardinality of Y. We write
|X| < |Y| if |X| ≤ |Y| but |X| ≠ |Y|.
If X is infinite then X \ { x } where x ∈ X is still infinite. From this, it follows
easily that there is an injection from N to X for any infinite set X. Thus |N| ≤
| X | for any infinite X. In other words, |N| is the least infinite “size”.
Theorem 2.3.11 (Cantor, Bernstein, Schroeder). If | X | ≤ |Y | and |Y | ≤ | X | then
| X | = |Y | .
Example 2.3.12. Suppose X is an infinite subset of N. Then the natural inclusion
of X into N is injective so |X| ≤ |N|. On the other hand, since X is infinite,
|N| ≤ |X|. Therefore, by Theorem 2.3.11, we conclude that X is countably infinite.
It is clear that there is a surjection from N to any non-empty countable set X. Conversely, if there is
a surjection, say f, from N to X, then sending each x ∈ X to its least preimage
is an injection from X into N, so X is countable (compare Example 2.3.12). Thus we have shown that
Proposition 2.3.13. A non-empty set X is countable if and only if there is a surjection from N to X.
Proposition 2.3.14. The product of two countable sets is countable.
Proof. Suppose f is a surjection from N to X and g is a surjection from N to
Y. Then the map f × g sending (i, j) to (f(i), g(j)) is a surjection from N² to
X × Y. So it remains to show that N² is countable. But this follows immediately
by counting N² “diagonally”:
(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), . . .
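The diagonal counting can be written as a generator that walks through the pairs (i, j) with i + j = 2, then i + j = 3, and so on. This is a sketch; the name diagonal_pairs is ours, and the generator produces the list rather than a closed formula for the n-th pair.

```python
from itertools import islice

def diagonal_pairs():
    """Yield the pairs of positive integers, one diagonal after another."""
    total = 2                        # i + j is constant along each diagonal
    while True:
        for i in range(1, total):
            yield (i, total - i)
        total += 1

first_six = list(islice(diagonal_pairs(), 6))
```

Every pair (i, j) appears exactly once, on the diagonal numbered i + j, which is why the enumeration is a bijection from N to N².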
Proposition 2.3.15. The union of countably many countable sets is countable.
Proof. Suppose X_n (n ∈ N) are countable sets. Let f_n be a surjection from N to
X_n. Then the map (i, j) ↦ f_i(j) is a surjection from the countable set N² to the
union ⋃_{n∈N} X_n, and hence the union is countable.
Example 2.3.16. The set of integers is the union of three countable sets, namely
N, {0} and −N, and so is countable according to Proposition 2.3.15.
Example 2.3.17. The set of rational numbers Q is the union of the sets
Q_n = {m/n : m ∈ Z} (n ∈ N)
and {0}. Each Q_n is in bijection with Z and hence countable. Therefore, by
Proposition 2.3.15, Q itself is countable.
Theorem 2.3.18 (Cantor). For any set X, | X | < | P( X )|.
Proof. Mapping x to the subset {x} of X is clearly an injective map from X to P(X). Therefore,
|X| ≤ |P(X)|. Since there is a bijection between P(X) and 2^X, it remains to
show that there is no bijection between X and 2^X. Suppose on the contrary that
x ↦ f_x is a bijection between X and 2^X. Then the map g defined by g(x) =
1 − f_x(x) is a map from X to 2 that differs from each f_x at x, contradicting
the fact that x ↦ f_x is surjective.
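The diagonal construction in the proof can be tried exhaustively on a small finite set. The sketch below assumes X = {0, 1, 2}, stores a function X → 2 as a 0/1 tuple, and stores an assignment x ↦ f_x as a tuple of such tuples; it then checks that the diagonal function g is missed by every one of the 8³ possible assignments. All names are ours.

```python
from itertools import product

X = [0, 1, 2]                                       # elements double as indices
two_to_X = list(product((0, 1), repeat=len(X)))     # all functions X -> 2

def diagonal(family):
    """Given family[x] = f_x, return g with g(x) = 1 - f_x(x)."""
    return tuple(1 - family[x][x] for x in X)

def no_assignment_is_onto():
    """Check that g is never among the f_x, for every assignment x -> f_x."""
    for family in product(two_to_X, repeat=len(X)):
        if diagonal(family) in family:
            return False
    return True
```

The finite case is of course already settled by counting (3 < 8); the point of the sketch is that the same function g works as a witness without any counting, which is what the proof needs for infinite X.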
2.3.2 Exercises
1. (Associativity of relation composition) Suppose R ⊆ X × Y, S ⊆ Y × Z,
T ⊆ W × X. Show that S ◦ (R ◦ T) and (S ◦ R) ◦ T are the same
relation from W to Z.
2. (Composition of functions) Show that
(a) the composition of two injections is an injection.
(b) the composition of two surjections is a surjection.
(c) Deduce that the composition of two bijections is a bijection.
3. Compute R⁻¹ ◦ S⁻¹ in Example 2.3.1. Verify that R⁻¹ ◦ S⁻¹ = (S ◦ R)⁻¹.
4. (Inverse of a function) Suppose f is a function from X to Y. Show that its
inverse relation f⁻¹ is a function from Y to X if and only if f is bijective.
5. Let X, Y be sets.
(a) Show that if X = ∅ then Y^X = {∅}.
(b) Show that if Y = ∅ and X ≠ ∅, then Y^X = ∅.
6. Let X = { a, b, c}, Y = {0, 1, 2, 3} and Z = {α, β}
(a) Give a function f : X → Y and a function g : Y → Z such that g ◦ f
is onto but f is NOT 1-to-1. Represent your functions by pictures.
(b) Write down the function g ◦ f that you got in Part (6a) as a set of
ordered pairs.
7. Given X = { p, q, r }, Y = {0, 1, 2, 3}.
(a) Give a function f : X → Y which is NOT injective.
(b) Give a function g : Y → X which is NOT surjective.
(c) Write the composition g ◦ f as a set of ordered pairs.
8. Let f be a function from X to Y. Prove or give a counterexample to the
following statements:
(a) If A ⊆ X, f⁻¹(f(A)) = A.
(b) If B ⊆ Y, f(f⁻¹(B)) = B.
9. Let f be a function from X to Y. Prove or give a counterexample to the
following statements:
(a) If A ⊆ X, f ( X \ A) = f ( X ) \ f ( A).
(b) If C ⊆ Y, f⁻¹(Y \ C) = f⁻¹(Y) \ f⁻¹(C).
(c) If A, B ⊆ X, f(A ∩ B) = f(A) ∩ f(B).
(d) If A, B ⊆ X, f(A ∪ B) = f(A) ∪ f(B).
(e) If C, D ⊆ Y, f⁻¹(C ∩ D) = f⁻¹(C) ∩ f⁻¹(D).
(f) If C, D ⊆ Y, f⁻¹(C ∪ D) = f⁻¹(C) ∪ f⁻¹(D).
10. Let f , g and h be functions. Prove or give a counterexample to the following statements:
(a) If h ◦ f = h ◦ g and h is 1-to-1, then f = g.
(b) If h ◦ f = h ◦ g and h is onto, then f = g.
(c) If f ◦ h = g ◦ h and h is 1-to-1, then f = g.
(d) If f ◦ h = g ◦ h and h is onto, then f = g.
11. Let σ, τ ∈ S4 be
    σ = ( 1 2 3 4 )        τ = ( 1 2 3 4 )
        ( 2 3 4 1 ),           ( 2 4 1 3 )
(a) Compute σ ◦ τ.
(b) Compute τ ◦ σ.
(c) Compute τ⁻¹.
(d) Compute σ³ (i.e. σ ◦ σ ◦ σ).
12. Show that the product of finitely many countable sets is countable.
13. Give an explicit formula for the map from N to N2 given in the proof of
Proposition 2.3.14.
2.4 Equivalence Relations
Equality must be among the first few relations that we encounter in our lives.
However, a moment’s thought should convince you that the daily usage of
the word “equality” has more to do with justice than mathematics. In fact, the
more you think about equality the more peculiar it seems. What does it mean for two
things to be equal? The answer given by set theory is that two sets are equal if
and only if they have the same elements. Certainly, then you will ask what
elements are, or how we can tell whether two elements are the same. Now
you may appreciate why people tried to build everything out of nothing (the
empty set).
On the other hand, is equality what we really care about? Probably not, at
least, not most of the time. Every cell phone manufactured has a unique
serial number. But wouldn’t it be strange if you went into a store and asked to
buy a phone by its serial number? Also think about the profile for your ideal
partner that you put on a dating site... Does it really specify “the one”? Trying
to answer these questions should convince you that, most of the time, we only
care about things up to some sort of equivalence.
So what are the essential properties of equality? First, everything should be
equal to itself, right? Second, if a = b then b = a. Finally if a = b and b = c,
then a should be equal to c. Each of these properties is important on its own,
so we will give each of them a name. A relation R on a set X (i.e. R ⊆ X × X
and we write aRb for ( a, b) ∈ R) is
• reflexive if for any a ∈ X, aRa.
• symmetric if for any a, b ∈ X, aRb implies bRa.
• transitive if for any a, b, c ∈ X, aRb and bRc implies aRc.
An equivalence relation on X is a reflexive, symmetric, transitive relation on
X. Let E be an equivalence relation on X. We say that two elements x, y ∈ X
are equivalent under E (or E-equivalent) if xEy.
Example 2.4.1.
1. On any set X, equality = is an equivalence relation. It is the finest⁶ equivalence relation on X. That means every equivalence relation E on X contains equality.
⁶ If R and S are two relations on the same set X, we say that R is finer than S if R ⊆ S.
2. On any set X, X × X is an equivalence relation (any two elements of X are
equivalent with respect to this relation). It is clearly the coarsest (equivalence) relation on X.
3. Let X = {a, b, c}. Then R = {(a, a), (b, b), (c, c), (a, b), (b, a)} is an equivalence relation. But S = R ∪ {(a, c), (c, a)} is not: even though S is both
reflexive and symmetric, it fails to be transitive, since bSa and aSc but (b, c) ∉ S.
4. Given a natural number n, two integers a and b are equivalent mod n, written
a ≡ b (mod n), if a and b have the same remainder when divided by n. In
other words, a ≡ b (mod n) if b − a is divisible by n. It is an equivalence
relation on Z.
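The three defining properties translate directly into tests on a finite relation stored as a set of pairs. This sketch uses our own function names; congruence mod n is only checked on a finite window of integers, which is an assumption made so that the relation can be enumerated.

```python
def is_reflexive(R, X):
    return all((a, a) in R for a in X)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

def is_equivalence(R, X):
    return is_reflexive(R, X) and is_symmetric(R) and is_transitive(R)

# Example 2.4.1(3)
X = {"a", "b", "c"}
R = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b"), ("b", "a")}
S = R | {("a", "c"), ("c", "a")}

def congruent_mod(n, window):
    """Congruence mod n, restricted to a finite window of integers."""
    return {(a, b) for a in window for b in window if (b - a) % n == 0}

window = range(-6, 7)
mod_3 = congruent_mod(3, window)
```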
Given an equivalence relation E on X, the E-equivalence class of an element x ∈ X, denoted by [x]_E, is the subset of X consisting of the elements that are
equivalent to x (under E):
[x]_E = {y ∈ X : xEy}
So an equivalence class of E is simply a subset of X consisting of all elements of
X that are E-equivalent to each other. Also, it is clear that xEy exactly means
[x]_E = [y]_E.
Example 2.4.2. Referring to Example 2.4.1
1. The equivalence classes of = are { x } (x ∈ X). In other words, for each
x ∈ X, [ x ]= is { x }.
2. The only equivalence class of X × X is X itself. So for any x, y ∈ X, [x]_{X×X} = [y]_{X×X} = X. In other
words, everything is equivalent to everything else.
3. There are n equivalence classes of ≡ (mod n). To be even more concrete,
let’s take n = 3. Then all multiples of 3 are equivalent, i.e. [0] =
{0, ±3, ±6, . . .} is an equivalence class, while [1] and [2] are the other
two equivalence classes of this relation.
As you can see in each case, the set of equivalence classes forms a partition
of X. This is not a coincidence.
Proposition 2.4.3. Let E be an equivalence relation on X. Then
X/E := {[ x ] E : x ∈ X }
is a partition of X.
Proof. Since x ∈ [x]_E for each x ∈ X, the union of the elements of X/E is X. Suppose
[x] ∩ [y] ≠ ∅. Pick a z in the intersection. Thus zEx and zEy. By symmetry
and transitivity, xEy, but that means [x] = [y]. In other words, two distinct
equivalence classes must be disjoint. This completes the proof.
Conversely, given a partition P of X, one gets an equivalence relation E_P by
declaring two elements of X to be E_P-equivalent if they are in the same member
of P. An example should clarify this concept.
Example 2.4.4. Consider the partition P = {{ a, c}, {b}, {d, e, f }} of the set X =
{ a, b, c, d, e, f }. Then EP is the equivalence relation:
{( x, x ) : x ∈ X } ∪ {( a, c), (c, a), (d, e), (e, d), (d, f ), ( f , d), (e, f ), ( f , e)}
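The passage from a partition to its equivalence relation, and back to the set of equivalence classes, can be sketched as follows for Example 2.4.4. The function names relation_from_partition and classes are ours.

```python
def relation_from_partition(P):
    """E_P: x and y are related when they lie in the same block of P."""
    return {(x, y) for block in P for x in block for y in block}

def classes(E, X):
    """The quotient X/E: the set of equivalence classes, as frozensets."""
    return {frozenset(y for y in X if (x, y) in E) for x in X}

X = set("abcdef")
P = {frozenset("ac"), frozenset("b"), frozenset("def")}

E_P = relation_from_partition(P)
quotient = classes(E_P, X)
```

E_P has 14 pairs: the 6 pairs (x, x) together with the 8 pairs listed in the example, and taking equivalence classes recovers the partition P.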
It is often useful to identify elements that are equivalent (that is why
we call them equivalent to begin with). Hence it is instructive to think of
the elements of X/E as individual entities themselves (instead of subsets of
X). The difference here is philosophical rather than mathematical. We speak of
X/E as the quotient of X by E. The surjection π_E : X → X/E sending x to
its equivalence class [x]_E is called the canonical map (or canonical projection).
Conversely, given a surjection π : X → Y with domain X, the set
P_π = {π⁻¹(y) : y ∈ Y}
of inverse images of elements of Y under π forms a partition of X, and we write
E_π for the equivalence relation corresponding to the partition P_π.
Example 2.4.5. Let X = { a, b, c, d, e, f } and Y = {0, 1, 2} and π be the surjection
defined by π(a) = π(c) = 0, π(b) = 1 and π(d) = π(e) = π(f) = 2. Then one
checks readily that E_π is the equivalence relation in Example 2.4.4.
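The partition attached to a surjection is obtained by grouping elements according to their image. In this sketch π is stored as a dictionary, assuming that it sends a and c to 0, b to 1, and d, e and f to 2 (the reading under which Example 2.4.5 matches Example 2.4.4); fibers is our own name.

```python
def fibers(pi, X):
    """The partition of X into the inverse images of single elements under pi."""
    groups = {}
    for x in X:
        groups.setdefault(pi[x], set()).add(x)
    return {frozenset(block) for block in groups.values()}

pi = {"a": 0, "c": 0, "b": 1, "d": 2, "e": 2, "f": 2}
P_pi = fibers(pi, pi.keys())
```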
To summarize, we have described a 1-to-1 correspondence between the following three kinds of objects:
• Equivalence relations on X.
• Partitions of X.
• Surjections with domain X.
Exercises
1. For each of the following sets of conditions, give a relation that satisfies them.
(a) reflexive, symmetric but not transitive.
(b) reflexive, transitive but not symmetric.
(c) symmetric, transitive but not reflexive.
2. For each of the following relations, decide whether it is an equivalence
relation. If not, state which of the three properties of an equivalence
relation it fails to possess.
3. Suppose E and E′ are equivalence relations on the same set X. We say
that E′ refines E if aE′b implies aEb for any a, b ∈ X.
Consider the partition P = {{c, a}, {d}, {e, f , b}} of X.
(a) Write down X.
(b) Write down E_P, the equivalence relation associated with P, as a set of
ordered pairs.
(c) Give the partition of an equivalence relation on X that refines E_P but
is neither the equality on X nor E_P itself.
4. (Bell Numbers) Let Bn be the number of partitions (equivalently the
number of equivalence relations) on a set of size n.
(a) What should be B0 ?
(b) Write down all partitions of the set { a, b, c}. Hence find B3 .
(c) Find B4 .
2.5 Orderings
An order relation is another type of relation frequently encountered in real life.
Think about ranking the “cuteness” of the guys (or girls) at a party. You may
decide that John is cuter than Joe and Ryan is cuter than John, so you know Ryan is
cuter than Joe (be consistent with yourself!). However, Dave is different, you
can’t quite compare him to the other guys. Yep, this is essentially what an order
is about except in real life, we often compare only a certain aspect of an object
so it is hardly the case that the relation that we have in mind is anti-symmetric
(which we will define next).
A relation is a partial order (or simply an order) if it is reflexive, transitive
and anti-symmetric. We have already defined reflexive and transitive. A relation R on X is anti-symmetric if, for any a, b ∈ X, aRb and bRa together imply
a = b. Here is where the “cuteness” analogy breaks down: two guys who
are equally cute to you need not be the same person⁷.
From now on, we will use a more suggestive symbol ⪯ to denote a general
partial order. A poset is a pair (X, ⪯) where X is a set and ⪯ is a partial order
on X. We often simply say that X is a poset if the partial order is understood.
We say that a and b are ⪯-comparable (or simply comparable) if either a ⪯ b
or b ⪯ a. A subset C of X is a chain if any two elements of C are ⪯-comparable.
Clearly, a subset of a chain is also a chain. A subset A of X is an anti-chain if
no two distinct elements of A are comparable. Again, it is clear that every subset of an
anti-chain is an anti-chain. Also, every singleton {x} (x ∈ X) is both a chain
and an anti-chain. We say that ⪯ is a total order (or linear order) on X if X itself is
a chain. In that case, we say that (X, ⪯) (or simply X) is a totally (or linearly)
ordered set.
Example 2.5.1.
1. Equality is a partial order. It is not an interesting one since no two distinct
elements are comparable.
2. The usual order ≤ on R is a total order.
3. Divisibility is a partial order on N. More precisely, for natural numbers
a and b, let a ≤d b mean “a divides b”. One checks easily that ≤d is a
partial order on N. However, it is not a total order since, for example, 3
and 5 are not comparable with respect to ≤d .
⁷ A relation that is reflexive and transitive is called a pre-order.
4. For any set X, the inclusion ⊆ is a partial order on ℘(X). It is not a total
order unless |X| ≤ 1. When X is finite of size n, we identify the poset
(℘(X), ⊆) with the Boolean algebra on n atoms, denoted by Bn. Note
that |Bn| = 2^n.
5. Let Σ be a non-empty set. Let Σ∗ be the set of strings (i.e. finite sequences)
over Σ. We say that a string u is a prefix (or initial segment) of v ∈ Σ∗,
denoted by u ⊑ v, if v = uw for some string w. It is easy to see that ⊑ is a
partial order on Σ∗.
6. Suppose (Σ, ≤) is a totally ordered set. The lexicographic order is a
total order on the set of sequences of Σ. For distinct sequences α, β of Σ, α ≺ β if
α_i < β_i where i is the first index where α and β disagree.
Let Y be a subset of a poset (X, ⪯). Then
• An element a of Y is the greatest (resp. least) element of Y if for any
y ∈ Y, y ⪯ a (resp. a ⪯ y).
• An element a of Y is a maximal (resp. minimal) element of Y if for any
y ∈ Y, a ⪯ y (resp. y ⪯ a) implies a = y.
• An element b of X is an upper (resp. a lower) bound of Y if y ⪯ b
(resp. b ⪯ y) for every y ∈ Y.
• An element s of X is a supremum (resp. an infimum) of Y if s is the
least (resp. greatest) element of the set of upper (resp. lower) bounds of
Y. We use sup Y (resp. inf Y) to denote the supremum (resp. infimum) of
Y.
A few remarks are in order.
1. Y can have at most one greatest (resp. least) element. In other words, Y
may not have a greatest (resp. least) element, but if it has one, it can have only
one. This is because if a, a′ ∈ Y are greatest elements of Y, then
a′ ⪯ a since a is a greatest element of Y. Likewise,
since a′ is also a greatest element of Y, a ⪯ a′. Thus a = a′ by the
anti-symmetry requirement on ⪯. Consequently, sup Y (resp. inf Y), as
the least (resp. greatest) element of the set of upper (resp. lower) bounds
of Y, if it exists, must be unique.
2. Y may have more than one maximal (minimal) element.
3. In a totally ordered set, since any two elements are comparable, the concept of maximal (minimal) element and greatest (least) element coincide.
4. The greatest (resp. least) element of Y, if it exists, is a maximal (resp. minimal) element of Y as well, but not vice versa.
5. An upper (lower) bound of Y need not be in Y.
6. A subset Y of X need not have a supremum (infimum). However, as the
least (greatest) element of the set of upper (lower) bounds of Y, sup Y (inf Y), if
it exists, must be unique.
7. Also if the greatest (least) element of Y exists, then it must also be the
supremum (infimum) of Y but not vice versa.
8. The concept of supremum and infimum may sound a bit foreign at first.
However, many familiar concepts in mathematics are just supremum and
infimum in disguise. For example,
(a) If the poset is (N, |), then an upper bound of a subset Y is just a common
multiple of the members of Y. Hence sup Y is simply the l.c.m.
(least common multiple) of the elements of Y. Dually, inf Y is the
g.c.d. (greatest common divisor) of the elements of Y.
(b) Next let us consider the poset (℘( X ), ⊆). An upper bound of a subset Y of ℘( X ) (i.e. Y is a collection of subsets of X) is simply a subset
of X that contains each of the members of Y. Hence sup Y is simply
the union of the elements of Y. Likewise, inf Y is simply the intersection of the members of Y.
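Remark 8(a) can be illustrated by computing bounds directly from the definitions. Because all of N cannot be enumerated, the sketch below works in the finite poset ({1, ..., 60}, |), an assumption chosen so that the least common multiple of the sample set Y = {4, 6, 10} still lies in the poset. All function names are ours.

```python
def divides(a, b):
    """The order a | b on positive integers."""
    return b % a == 0

def upper_bounds(Y, X, leq):
    return {b for b in X if all(leq(y, b) for y in Y)}

def lower_bounds(Y, X, leq):
    return {b for b in X if all(leq(b, y) for y in Y)}

def least(S, leq):
    """The least element of S, or None if S has none."""
    for a in S:
        if all(leq(a, s) for s in S):
            return a
    return None

def greatest(S, leq):
    """The greatest element of S, or None if S has none."""
    for a in S:
        if all(leq(s, a) for s in S):
            return a
    return None

def maximal(S, leq):
    """All maximal elements of S."""
    return {a for a in S if all(a == s or not leq(a, s) for s in S)}

X = set(range(1, 61))
Y = {4, 6, 10}

sup_Y = least(upper_bounds(Y, X, divides), divides)       # the l.c.m.
inf_Y = greatest(lower_bounds(Y, X, divides), divides)    # the g.c.d.
```

Here Y has three maximal elements but no greatest element, while sup Y and inf Y exist and lie outside Y, in line with remarks 2, 4 and 5.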
An example should make these concepts clear
Example 2.5.2. Consider the poset X with the following Hasse diagram.
[Hasse diagram on the elements a, ..., j, drawn in rows from top to bottom: j; h, i; e, f, g; b, c, d; a.]
1. The subset {c} of X is a chain (as we have mentioned, any singleton is a
chain). So is the subset {a, c}. However, the subset {a, d} is not a chain
since a, d are not comparable.
2. Chains of X are elements of ℘(X), so they are partially ordered by inclusion ⊆. In this sense, a maximal chain is a chain that is not properly
contained in any other chain. For example, {a, c, f, h, j} is a maximal chain, and
so is {a, c, e, h, j}. On the other hand, {c, g, j} is not a maximal chain because the chain {a, c, g, i, j} contains it properly.
3. Let Y = { a, b, c} then the minimal elements of Y are a and b. The maximal
elements of Y are b and c. Note that b is both maximal and minimal. Y
has neither a greatest element nor a least element.
4. The set of lower bounds of Y is L = { a}. The set of upper bounds of
Y is U = {e, f , h, i, j}. Clearly a itself is the greatest element of { a} so
inf Y = a. On the other hand, since U does not have the least element, so
sup Y does not exist.
5. Now consider the subset Y = {e, f }. Then the set of upper bounds of Y is
{h, j}, hence sup Y = h. The set of lower bounds of Y is {a, b, c} and since
it has no greatest element, inf Y does not exist.
6. A trickier question is what is sup ∅? Since any element of X is an upper
bound of ∅, the set of upper bounds of ∅ is just X itself. Therefore
sup ∅ would be the least element of X, if any. Since the X in our
example has no least element, sup ∅ does not exist in this case.
By the same token, inf ∅ should be the greatest element of X, if any, and
so for this particular example, inf ∅ = j.
A partial order ⪯ on X is a well order on X if every non-empty subset of
X has a least element. Note that this condition already implies that ⪯ is a total
order on X, since for any two elements a, b ∈ X, the subset {a, b} has a least
element; in particular, a and b must be ⪯-comparable. Clearly, a finite total
order must be a well order.
Zorn’s lemma is the assertion that if every chain in a partially ordered set
has an upper bound then the partially ordered set has a maximal element. It
is equivalent to the Axiom of Choice and the well-ordering principle. The first one
asserts that the product of a non-empty family of non-empty sets is non-empty
and the second one asserts that every non-empty set can be well-ordered. The
equivalence between them is non-trivial; unfortunately, we won’t have time
to go over it. See, for example, Halmos’s classic Naive Set Theory if you are
interested.
Example 2.5.3.
1. The usual order ≤ on N is a well order. However, the usual order on R
is not, for example the non-empty subset (0, 1) has no least element.
2. The lexicographical order on English words is another example of a well order.
3. Consider (N, ≤d ), 1 is the least element. If you consider the subset N \
{1}, then the primes are minimal (but not least) elements.
4. X is the greatest element of the poset (℘(X), ⊆). If X is finite, then the
singletons are minimal elements of the subset S := ℘(X) \ {X, ∅} while
the subsets of size |X| − 1 are the maximal elements. X is an upper bound
of S (note that X ∉ S) while ∅ is a lower bound of S.
5. Let X = { a, b, . . . , j}. And consider the partial order ≤ on X
a ≤ d ≤ f , g ≤ h ≤ i, b, c ≤ e, j
Draw its Hasse diagram! The maximal elements of X are i, e and j and
the minimal elements are a, b, c and j. Note that j is both a maximal
and a minimal element of X. The subset {i, h, g, d, a} is a chain and so
is {i, h, f, d, a}. (How many chains of X are there?) They are maximal chains of X (if we order the chains of (X, ≤) by inclusion). The set
{f, g, e, j} is a maximal anti-chain of X.
Let Y be the subset { h, f , g, d} of X then h is the greatest element of Y
and d is the least element of Y. The upper bounds of Y are i and h with h
being the least upper bound. The lower bounds are a and d with d being
the greatest lower bound. The subset {b, c} has an upper bound e but
no lower bounds. The subset { f , b} has neither upper bounds nor lower
bounds.
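The claims in item 5 can be checked mechanically. The following is a minimal sketch in Python, assuming the order is the reflexive-transitive closure of the covering pairs read off from the display above; the names `COVERS`, `closure`, `maximal`, `upper_bounds` and so on are illustrative choices, not notation from these notes.

```python
from itertools import product

# Assumed covering pairs (x, y), meaning "x lies directly below y".
# j appears in no pair: it is comparable only to itself.
ELEMENTS = "abcdefghij"
COVERS = [("a", "d"), ("d", "f"), ("d", "g"), ("f", "h"), ("g", "h"),
          ("h", "i"), ("b", "e"), ("c", "e")]


def closure(elements, covers):
    """Reflexive-transitive closure: the set of pairs (x, y) with x <= y."""
    leq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for (x, y), (u, v) in product(list(leq), repeat=2):
            if y == u and (x, v) not in leq:
                leq.add((x, v))
                changed = True
    return leq


LEQ = closure(ELEMENTS, COVERS)


def maximal(subset):
    """Elements of subset with nothing strictly above them in subset."""
    return {x for x in subset
            if not any((x, y) in LEQ and x != y for y in subset)}


def minimal(subset):
    """Elements of subset with nothing strictly below them in subset."""
    return {x for x in subset
            if not any((y, x) in LEQ and x != y for y in subset)}


def upper_bounds(subset):
    """Elements of the whole poset lying above every element of subset."""
    return {u for u in ELEMENTS if all((s, u) in LEQ for s in subset)}


def lower_bounds(subset):
    """Elements of the whole poset lying below every element of subset."""
    return {l for l in ELEMENTS if all((l, s) in LEQ for s in subset)}


def is_chain(subset):
    """True if every two elements of subset are comparable."""
    return all((x, y) in LEQ or (y, x) in LEQ for x in subset for y in subset)


def is_antichain(subset):
    """True if no two distinct elements of subset are comparable."""
    return all(x == y or (x, y) not in LEQ for x in subset for y in subset)


print(sorted(maximal(ELEMENTS)))       # ['e', 'i', 'j']
print(sorted(minimal(ELEMENTS)))       # ['a', 'b', 'c', 'j']
print(sorted(upper_bounds("hfgd")))    # ['h', 'i']
print(sorted(lower_bounds("hfgd")))    # ['a', 'd']
print(is_chain("ihgda"), is_antichain("fgej"))  # True True
```

The brute-force closure is quadratic in the number of pairs per pass, which is fine for a ten-element poset; it is meant as a checking aid for the example, not as an efficient algorithm.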
2.5.1 Embeddings
Let (X, ⪯_X) and (Y, ⪯_Y) be two posets. A map ϕ : X → Y is order preserving if
x ⪯_X x′ implies ϕ(x) ⪯_Y ϕ(x′). An embedding is a 1-to-1 map ϕ : X → Y such that
x ⪯_X x′ if and only if ϕ(x) ⪯_Y ϕ(x′). An isomorphism is a surjective embedding.
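For finite posets these definitions can be tested by brute force. The sketch below is only an illustration under assumptions of my own: a map is represented as a Python dict, an order as a two-argument predicate, and the function names `is_order_preserving` and `is_embedding` are hypothetical. It uses the divisors of 6 ordered by divisibility, mapped by the identity into the integers with the usual order.

```python
from itertools import product


def is_order_preserving(phi, X, leq_X, leq_Y):
    """True if x <=_X x' implies phi(x) <=_Y phi(x') for all x, x' in X."""
    return all(leq_Y(phi[x], phi[y])
               for x, y in product(X, repeat=2) if leq_X(x, y))


def is_embedding(phi, X, leq_X, leq_Y):
    """True if phi is 1-to-1 and x <=_X x' holds exactly when phi(x) <=_Y phi(x')."""
    injective = len({phi[x] for x in X}) == len(X)
    both_ways = all(leq_X(x, y) == leq_Y(phi[x], phi[y])
                    for x, y in product(X, repeat=2))
    return injective and both_ways


def divides(a, b):
    """Divisibility order on positive integers."""
    return b % a == 0


def at_most(a, b):
    """Usual order on integers."""
    return a <= b


X = [1, 2, 3, 6]
identity = {x: x for x in X}

# The identity is order preserving from divisibility to the usual order...
print(is_order_preserving(identity, X, divides, at_most))  # True
# ...but not an embedding: 2 <= 3 although 2 does not divide 3.
print(is_embedding(identity, X, divides, at_most))         # False
# From a poset to itself the identity is of course an embedding.
print(is_embedding(identity, X, divides, divides))         # True
```

The example shows that a 1-to-1 order preserving map need not be an embedding: the "only if" direction of the definition is a genuine extra requirement.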
Example 2.5.4.
1. Suppose |X| ≥ 2. Then any constant map from X to Y is order preserving but not an embedding (since it is not 1-to-1).
2. A bijection from an anti-chain (of size > 1) to a chain is order preserving
but not an embedding.
3. Let (X, ⪯) be a poset. One can check that the map sending x ∈ X to the
set
↓x := {y ∈ X : y ⪯ x}
is an embedding of X into (℘(X), ⊆); we call it the down-set embedding of X.
In particular, every poset of size n embeds into B_n, the Boolean algebra of size 2^n.
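The down-set embedding of item 3 can be verified on a small case. The following sketch assumes, purely for illustration, the poset of divisors of 12 ordered by divisibility; the names `down_set` and `divides` are hypothetical helpers, and down-sets are stored as `frozenset` values so that `<=` means set inclusion.

```python
def divides(a, b):
    """Divisibility order on positive integers."""
    return b % a == 0


def down_set(x, X, leq):
    """The set of all y in X with y <= x."""
    return frozenset(y for y in X if leq(y, x))


X = [1, 2, 3, 4, 6, 12]
down = {x: down_set(x, X, divides) for x in X}

# x <= x' in the poset exactly when the down-set of x is contained
# in the down-set of x'.
reflects_order = all(divides(x, y) == (down[x] <= down[y])
                     for x in X for y in X)
# Distinct elements have distinct down-sets.
injective = len(set(down.values())) == len(X)

print(sorted(down[4]), sorted(down[6]))  # [1, 2, 4] [1, 2, 3, 6]
print(reflects_order, injective)         # True True
```

Since every down-set is a subset of X, the image lies in the power set of a 6-element set, which matches the remark that a poset of size n embeds into a Boolean algebra of size 2^n.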
Exercises
1. Using the fact that there are infinitely many primes, show that every
finite Boolean algebra can be embedded into (N, |). Deduce that (N, |)
is universal for finite posets, in the sense that every finite poset can be
embedded into (N, |).
2. Show that every partial order on a finite set X can be extended to a total
order on X. In fact, by Zorn’s Lemma, one can show that the same holds
for infinite X.
3. Consider the posets P1 and P2.
[Figure: Hasse diagrams of P1 and P2 on the points α, β, γ, t, u, v, w, x, y, z.]
(a) Give two different embeddings from P1 to P2.
(b) How many embeddings are there from P1 to P2?